Commit 02150d1

paulyu12 authored and yangxiaojun0126 committed
[DOC] update doc: LoRA with ACLGraph (vllm-project#2430)
### What this PR does / why we need it?

Update the doc to guide users to run LoRA with ACLGraph.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

No.

- vLLM version: v0.10.0
- vLLM main: vllm-project/vllm@de7b67a

Signed-off-by: paulyu12 <[email protected]>
1 parent 0654e6c commit 02150d1

File tree

2 files changed: +19 −4 lines changed

docs/source/installation.md

Lines changed: 1 addition & 1 deletion

@@ -171,7 +171,7 @@ vllm-ascend will build custom ops by default. If you don't want to build it, set
  ```{note}
  If you are building from v0.7.3-dev and intend to use sleep mode feature, you should set `COMPILE_CUSTOM_KERNELS=1` manually.
- To build custom ops, gcc/g++ higher than 8 and c++ 17 or higher is required. If you're using `pip install -e .` and encourage a torch-npu version conflict, please install with `pip install --no-build-isolation -e .` to build on system env.
+ To build custom ops, gcc/g++ higher than 8 and C++ 17 or higher are required. If you're using `pip install -e .` and encounter a torch-npu version conflict, please install with `pip install --no-build-isolation -e .` to build on the system environment.
  If you encounter other problems during compiling, it is probably because an unexpected compiler is being used; you may export `CXX_COMPILER` and `C_COMPILER` in your environment to specify your g++ and gcc locations before compiling.
  ```

Lines changed: 18 additions & 3 deletions

@@ -1,8 +1,23 @@
  # LoRA Adapters Guide
+
  ## Overview
  Like vLLM, vllm-ascend supports LoRA as well. The usage and more details can be found in the [vLLM official document](https://docs.vllm.ai/en/latest/features/lora.html).
- You can also refer to [this](https://docs.vllm.ai/en/latest/models/supported_models.html#list-of-text-only-language-models) to find which models support LoRA in vLLM.
+ You can refer to [Supported Models](https://docs.vllm.ai/en/latest/models/supported_models.html#list-of-text-only-language-models) to find which models support LoRA in vLLM.
- ## Tips
- If you fail to run vllm-ascend with LoRA, you may follow [this instruction](https://vllm-ascend.readthedocs.io/en/latest/user_guide/feature_guide/graph_mode.html#fallback-to-eager-mode) to disable graph mode and try again.
+ You can now run LoRA with ACLGraph mode. Please refer to the [Graph Mode Guide](./graph_mode.md) for better LoRA performance.
+
+ ## Example
+ We show a simple LoRA example here, which enables ACLGraph mode by default.
+
+ ```shell
+ vllm serve meta-llama/Llama-2-7b \
+   --enable-lora \
+   --lora-modules '{"name": "sql-lora", "path": "/path/to/lora", "base_model_name": "meta-llama/Llama-2-7b"}'
+ ```
+
+ ## Custom LoRA Operators
+
+ We have implemented LoRA-related AscendC operators, such as `bgmv_shrink`, `bgmv_expand`, `sgmv_shrink` and `sgmv_expand`. You can find them under the `csrc/kernels` directory of the [vllm-ascend repo](https://github.com/vllm-project/vllm-ascend.git).
+
+ When you install vllm and vllm-ascend, the operators mentioned above are compiled and installed automatically. If you don't want to use the AscendC operators when running vllm-ascend, set `COMPILE_CUSTOM_KERNELS=0` and reinstall vllm-ascend. For more instructions on installation and compilation, refer to the [installation guide](../../installation.md).
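The `--lora-modules` value in the example above is a JSON object. As a small illustration that is not part of the commit (the helper name and the adapter path are hypothetical), the argument can be built programmatically to sidestep shell-quoting mistakes:

```python
import json
import shlex

# Hypothetical helper: render a --lora-modules argument for the serve command.
# Field names mirror the example in the diff above; the path is a placeholder.
def lora_modules_arg(name: str, path: str, base_model_name: str) -> str:
    payload = {"name": name, "path": path, "base_model_name": base_model_name}
    # shlex.quote wraps the JSON in single quotes so the shell passes it intact.
    return "--lora-modules " + shlex.quote(json.dumps(payload))

arg = lora_modules_arg("sql-lora", "/path/to/lora", "meta-llama/Llama-2-7b")
print(arg)
```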
