
Commit 9a4bf8e

docs: add verified models info.

1 parent: 25cf2a6

File tree

7 files changed (+35, -15 lines)

LLama/runtimes/libllama-cuda11.dll (4.5 KB; binary file not shown)
LLama/runtimes/libllama-cuda11.so (-3.52 KB; binary file not shown)
LLama/runtimes/libllama-cuda12.dll (11.5 KB; binary file not shown)
LLama/runtimes/libllama-cuda12.so (12.6 KB; binary file not shown)
LLama/runtimes/libllama.dll (-7 KB; binary file not shown)
LLama/runtimes/libllama.so (8.23 KB; binary file not shown)

README.md

Lines changed: 35 additions & 15 deletions
@@ -11,13 +11,13 @@
 
 
 The C#/.NET binding of [llama.cpp](https://github.com/ggerganov/llama.cpp). It provides APIs to run inference with LLaMa models and deploy them in native environments or on the Web. It works on
-both Windows and Linux and does NOT require compiling llama.cpp yourself.
+both Windows and Linux and does NOT require compiling llama.cpp yourself. Its performance is close to that of llama.cpp.
 
-- Load and inference LLaMa models
-- Simple APIs for chat session
-- Quantize the model in C#/.NET
+- LLaMa model inference
+- APIs for chat sessions
+- Model quantization
+- Embedding generation, tokenization and detokenization
 - ASP.NET core integration
-- Native UI integration
 
 ## Installation
 

@@ -35,18 +35,23 @@ LLamaSharp.Backend.Cuda11
 LLamaSharp.Backend.Cuda12
 ```
 
-The latest version of `LLamaSharp` and `LLamaSharp.Backend` may not always be the same. `LLamaSharp.Backend` follows [llama.cpp](https://github.com/ggerganov/llama.cpp) because sometimes its
-breaking changes make some model weights invalid. If you are not sure which version of the backend to install, just install the latest one.
+Here's the mapping between versions and the corresponding model samples provided by `LLamaSharp`. If you're not sure which models are available for a version, please try our sample models.
 
-Note that version v0.2.1 has a package named `LLamaSharp.Cpu`. After v0.2.2 it will be dropped.
+| LLamaSharp.Backend | LLamaSharp | Verified Model Resources | llama.cpp commit id |
+| - | - | -- | - |
+| - | v0.2.0 | This version is not recommended. | - |
+| - | v0.2.1 | [WizardLM](https://huggingface.co/TheBloke/wizardLM-7B-GGML/tree/previous_llama), [Vicuna (filenames with "old")](https://huggingface.co/eachadea/ggml-vicuna-13b-1.1/tree/main) | - |
+| v0.2.2 | v0.2.2, v0.2.3 | [WizardLM](https://huggingface.co/TheBloke/wizardLM-7B-GGML/tree/previous_llama_ggmlv2), [Vicuna (filenames without "old")](https://huggingface.co/eachadea/ggml-vicuna-13b-1.1/tree/main) | 63d2046 |
+| v0.3.0 | v0.3.0 | [LLamaSharpSamples v0.3.0](https://huggingface.co/AsakusaRinne/LLamaSharpSamples/tree/v0.3.0), [WizardLM](https://huggingface.co/TheBloke/wizardLM-7B-GGML/tree/main) | 7e4ea5b |
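To make the version pairing concrete, here is a minimal, illustrative sketch of installing a matching pair with the .NET CLI. The `0.3.0` pairing is taken from the table above, and the `LLamaSharp.Backend.Cpu` package name is an assumption by analogy with the Cuda11/Cuda12 packages; verify both against the table and the NuGet listing before installing.

```shell
# Illustrative sketch: install LLamaSharp plus ONE matching backend package.
# Versions follow the v0.3.0 row of the mapping table; the CPU backend
# package name is assumed by analogy with the Cuda11/Cuda12 ones.
dotnet add package LLamaSharp --version 0.3.0
dotnet add package LLamaSharp.Backend.Cpu --version 0.3.0
```

On a CUDA machine, swap the last line for `LLamaSharp.Backend.Cuda11` or `LLamaSharp.Backend.Cuda12`.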

 We publish the backend with cpu, cuda11 and cuda12 because they are the most popular ones. If none of them matches, please compile [llama.cpp](https://github.com/ggerganov/llama.cpp)
 from source and put the `libllama` library under your project's output path. When building from source, please add `-DBUILD_SHARED_LIBS=ON` to enable shared library generation.
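The build-from-source path above can be sketched as follows; `-DBUILD_SHARED_LIBS=ON` is the flag named in the README, while the clone location and the destination path are illustrative assumptions:

```shell
# Sketch: build llama.cpp as a shared library and copy libllama into the
# project's output path (the destination below is an example, not a fixed path).
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build -DBUILD_SHARED_LIBS=ON
cmake --build build --config Release
# Then copy the produced libllama.(so|dll) next to your app, e.g.:
# cp build/libllama.so /path/to/YourApp/bin/Debug/net6.0/
```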

 ## FAQ
 
-1. GPU out of memory: v0.2.3 puts all layers onto the GPU by default. If the memory use exceeds the capacity of your GPU, please set `n_gpu_layers` to a smaller number.
-2. Unsupported model: `llama.cpp` is under quick development and often has breaking changes. Please check the release date of the model and find a suitable version of LLamaSharp to install.
+1. GPU out of memory: Please try setting `n_gpu_layers` to a smaller number.
+2. Unsupported model: `llama.cpp` is under quick development and often has breaking changes. Please check the release date of the model and find a suitable version of LLamaSharp to install, or use the models we provide [on huggingface](https://huggingface.co/AsakusaRinne/LLamaSharpSamples).
+
 
 ## Simple Benchmark
 

@@ -112,30 +117,35 @@ For more usages, please refer to [Examples](./LLama.Examples).
 
 We provide the ASP.NET core integration [here](./LLama.WebAPI). Since the API is currently not stable, please clone the repo to use it. In the future we'll publish it on NuGet.
 
+Since we are short of hands, if you're familiar with ASP.NET core, we'd appreciate your help with upgrading the Web API integration.
+
 ## Demo
 
 ![demo-console](Assets/console_demo.gif)
 
 ## Roadmap
 
-✅ LLaMa model inference.
+✅ LLaMa model inference
 
-✅ Embeddings generation.
+✅ Embeddings generation, tokenization and detokenization
 
-✅ Chat session.
+✅ Chat session
 
 ✅ Quantization
 
+✅ State saving and loading
+
 ✅ ASP.NET core Integration
 
-🔳 UI Integration
+🔳 MAUI Integration
 
 🔳 Follow up llama.cpp and improve performance
 
 ## Assets
 
-The model weights are too large to be included in the repository. However, some resources can be found below:
+Some extra model resources can be found below:
 
+- [Quantized models provided by the LLamaSharp authors](https://huggingface.co/AsakusaRinne/LLamaSharpSamples)
 - [eachadea/ggml-vicuna-13b-1.1](https://huggingface.co/eachadea/ggml-vicuna-13b-1.1/tree/main)
 - [TheBloke/wizardLM-7B-GGML](https://huggingface.co/TheBloke/wizardLM-7B-GGML)
 - Magnet: [magnet:?xt=urn:btih:b8287ebfa04f879b048d4d4404108cf3e8014352&dn=LLaMA](magnet:?xt=urn:btih:b8287ebfa04f879b048d4d4404108cf3e8014352&dn=LLaMA)
@@ -149,6 +159,16 @@ The prompts could be found below:
 - [awesome-chatgpt-prompts](https://github.com/f/awesome-chatgpt-prompts)
 - [awesome-chatgpt-prompts-zh](https://github.com/PlexPt/awesome-chatgpt-prompts-zh) (Chinese)
 
+## Contributing
+
+Any contribution is welcome! You can do any of the following to help us make `LLamaSharp` better:
+
+- Append a model link that works with a specific version. (This is very important!)
+- Star and share `LLamaSharp` to let others know about it.
+- Add a feature or fix a bug.
+- Help develop the Web API and UI integrations.
+- Just open an issue about a problem you've met!
+
 ## Contact us
 
 Join our chat on [Discord](https://discord.gg/quBc2jrz).

0 commit comments
