Changes from 9 commits
6 changes: 5 additions & 1 deletion .gitignore
@@ -11,4 +11,8 @@ scripts/*.ps1
scripts/*.sh
**/dist
**/build
*.log
*.log
benchmark/
modelTest/
nc_workspace/
debug_openai_history.txt
11 changes: 11 additions & 0 deletions README.md
@@ -35,6 +35,7 @@ Run local LLMs on iGPU, APU and CPU (AMD , Intel, and Qualcomm (Coming Soon)). E
* Onnxruntime CPU Models [Link](./docs/model/onnxruntime_cpu_models.md)
* Ipex-LLM Models [Link](./docs/model/ipex_models.md)
* OpenVINO-LLM Models [Link](./docs/model/openvino_models.md)
* NPU-LLM Models [Link](./docs/model/npu_models.md)

## Getting Started

@@ -56,12 +57,14 @@ Run local LLMs on iGPU, APU and CPU (AMD , Intel, and Qualcomm (Coming Soon)). E
- **CUDA:** `$env:ELLM_TARGET_DEVICE='cuda'; pip install -e .[cuda]`
- **IPEX:** `$env:ELLM_TARGET_DEVICE='ipex'; python setup.py develop`
- **OpenVINO:** `$env:ELLM_TARGET_DEVICE='openvino'; pip install -e .[openvino]`
- **NPU:** `$env:ELLM_TARGET_DEVICE='npu'; pip install -e .[npu]`
- **With Web UI**:
- **DirectML:** `$env:ELLM_TARGET_DEVICE='directml'; pip install -e .[directml,webui]`
- **CPU:** `$env:ELLM_TARGET_DEVICE='cpu'; pip install -e .[cpu,webui]`
- **CUDA:** `$env:ELLM_TARGET_DEVICE='cuda'; pip install -e .[cuda,webui]`
- **IPEX:** `$env:ELLM_TARGET_DEVICE='ipex'; python setup.py develop; pip install -r requirements-webui.txt`
- **OpenVINO:** `$env:ELLM_TARGET_DEVICE='openvino'; pip install -e .[openvino,webui]`
- **NPU:** `$env:ELLM_TARGET_DEVICE='npu'; pip install -e .[npu,webui]`

- **Linux**

@@ -77,12 +80,14 @@ Run local LLMs on iGPU, APU and CPU (AMD , Intel, and Qualcomm (Coming Soon)). E
- **CUDA:** `ELLM_TARGET_DEVICE='cuda' pip install -e .[cuda]`
- **IPEX:** `ELLM_TARGET_DEVICE='ipex' python setup.py develop`
- **OpenVINO:** `ELLM_TARGET_DEVICE='openvino' pip install -e .[openvino]`
- **NPU:** `ELLM_TARGET_DEVICE='npu' pip install -e .[npu]`
- **With Web UI**:
- **DirectML:** `ELLM_TARGET_DEVICE='directml' pip install -e .[directml,webui]`
- **CPU:** `ELLM_TARGET_DEVICE='cpu' pip install -e .[cpu,webui]`
- **CUDA:** `ELLM_TARGET_DEVICE='cuda' pip install -e .[cuda,webui]`
- **IPEX:** `ELLM_TARGET_DEVICE='ipex' python setup.py develop; pip install -r requirements-webui.txt`
- **OpenVINO:** `ELLM_TARGET_DEVICE='openvino' pip install -e .[openvino,webui]`
- **NPU:** `ELLM_TARGET_DEVICE='npu' pip install -e .[npu,webui]`

### Launch OpenAI API Compatible Server

@@ -142,6 +147,9 @@ It is an interface that allows you to download and deploy OpenAI API compatible

# OpenVINO
ellm_server --model_path '.\meta-llama_Meta-Llama-3.1-8B-Instruct\' --backend 'openvino' --device 'gpu' --port 5555 --served_model_name 'meta-llama_Meta/Llama-3.1-8B-Instruct'

# NPU
ellm_server --model_path 'microsoft/Phi-3-mini-4k-instruct' --backend 'npu' --device 'npu' --port 5555 --served_model_name 'microsoft/Phi-3-mini-4k-instruct'
```
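Once the server is up, any OpenAI-compatible client can talk to it. Below is a minimal sketch of a chat-completions request; the port and model name follow the NPU launch command above, and the actual HTTP call is left commented out so the snippet runs without a live server:

```python
import json
from urllib import request

URL = "http://localhost:5555/v1/chat/completions"  # matches --port 5555 above
payload = {
    "model": "microsoft/Phi-3-mini-4k-instruct",   # matches --served_model_name
    "messages": [{"role": "user", "content": "Say hello in one sentence."}],
    "max_tokens": 64,
}
body = json.dumps(payload).encode()
req = request.Request(URL, data=body, headers={"Content-Type": "application/json"})
# With the server running, uncomment to send the request:
# with request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
print(payload["model"])
```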

## Prebuilt OpenAI API Compatible Windows Executable (Alpha)
@@ -161,6 +169,9 @@ _Powershell/Terminal Usage (Use it like `ellm_server`)_:

# OpenVINO
.\ellm_api_server.exe --model_path '.\meta-llama_Meta-Llama-3.1-8B-Instruct\' --backend 'openvino' --device 'gpu' --port 5555 --served_model_name 'meta-llama_Meta/Llama-3.1-8B-Instruct'

# NPU
.\ellm_api_server.exe --model_path 'microsoft/Phi-3-mini-4k-instruct' --backend 'npu' --device 'npu' --port 5555 --served_model_name 'microsoft/Phi-3-mini-4k-instruct'
```

## Acknowledgements
15 changes: 15 additions & 0 deletions docs/model/npu_models.md
@@ -0,0 +1,15 @@
# Model Powered by NPU-LLM

## Verified Models
Verified models can be found in the EmbeddedLLM NPU-LLM model collection:
* EmbeddedLLM NPU-LLM Model collections: [link](https://huggingface.co/collections/EmbeddedLLM/npu-llm-66d692817e6c9509bb8ead58)

| Model | Model Link |
| --- | --- |
| Phi-3-mini-4k-instruct | [link](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) |
| Phi-3-mini-128k-instruct | [link](https://huggingface.co/microsoft/Phi-3-mini-128k-instruct) |
| Phi-3-medium-4k-instruct | [link](https://huggingface.co/microsoft/Phi-3-medium-4k-instruct) |
| Phi-3-medium-128k-instruct | [link](https://huggingface.co/microsoft/Phi-3-medium-128k-instruct) |

## Contribution
We welcome contributions to the verified model list.
3 changes: 3 additions & 0 deletions requirements-npu.txt
@@ -0,0 +1,3 @@
intel-npu-acceleration-library
torch>=2.4
transformers>=4.42
2 changes: 1 addition & 1 deletion requirements-webui.txt
@@ -1 +1 @@
gradio~=4.36.1
gradio~=4.43.0
9 changes: 9 additions & 0 deletions setup.py
@@ -54,6 +54,10 @@ def _is_openvino() -> bool:
return ELLM_TARGET_DEVICE == "openvino"


def _is_npu() -> bool:
return ELLM_TARGET_DEVICE == "npu"


class ELLMInstallCommand(install):
def run(self):
install.run(self)
@@ -186,6 +190,8 @@ def get_requirements() -> List[str]:
requirements = _read_requirements("requirements-ipex.txt")
elif _is_openvino():
requirements = _read_requirements("requirements-openvino.txt")
elif _is_npu():
requirements = _read_requirements("requirements-npu.txt")
else:
raise ValueError("Unsupported platform, please use CUDA, ROCm, Neuron, or CPU.")
return requirements
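The requirements selection in `get_requirements()` is a dispatch on `ELLM_TARGET_DEVICE`, which this hunk extends with the `npu` branch. A condensed sketch of the same pattern (the table form and function name are illustrative, not the actual implementation; file names mirror the repo's):

```python
def requirements_file(target_device: str) -> str:
    # One requirements file per build target, mirroring the
    # chain of _is_*() checks in setup.py's get_requirements().
    mapping = {
        "directml": "requirements-directml.txt",
        "cpu": "requirements-cpu.txt",
        "cuda": "requirements-cuda.txt",
        "ipex": "requirements-ipex.txt",
        "openvino": "requirements-openvino.txt",
        "npu": "requirements-npu.txt",
    }
    if target_device not in mapping:
        raise ValueError(f"Unsupported target device: {target_device!r}")
    return mapping[target_device]

print(requirements_file("npu"))  # requirements-npu.txt
```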
@@ -204,6 +210,8 @@ def get_ellm_version() -> str:
version += "+ipex"
elif _is_openvino():
version += "+openvino"
elif _is_npu():
version += "+npu"
else:
raise RuntimeError("Unknown runtime environment")
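The version-suffix logic follows the same branch structure: each target appends a local-version tag to the package version. A minimal sketch covering only the branches visible in this hunk (the helper name is an illustration):

```python
def tag_version(base: str, target_device: str) -> str:
    # Append a local-version suffix for the build target,
    # as get_ellm_version() does for ipex/openvino/npu.
    suffixes = {"ipex": "+ipex", "openvino": "+openvino", "npu": "+npu"}
    if target_device not in suffixes:
        raise RuntimeError("Unknown runtime environment")
    return base + suffixes[target_device]

print(tag_version("0.1.0", "npu"))  # 0.1.0+npu
```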

@@ -256,6 +264,7 @@ def get_ellm_version() -> str:
"cuda": ["onnxruntime-genai-cuda==0.3.0rc2"],
"ipex": [],
"openvino": [],
"npu": [],
},
dependency_links=dependency_links,
entry_points={