Commit a121d8d

Update 2025-10-25 22:18:31
1 parent cd4ef5b commit a121d8d

104 files changed (+6648, -6451 lines)


README.html

Lines changed: 0 additions & 2 deletions
@@ -176,8 +176,6 @@
 <li class="toctree-l1"><a class="reference internal" href="basic_usage/sampling_params.html">Sampling Parameters</a></li>
 <li class="toctree-l1"><a class="reference internal" href="basic_usage/deepseek.html">DeepSeek Usage</a></li>
 <li class="toctree-l1"><a class="reference internal" href="basic_usage/deepseek_v32.html">DeepSeek V3.2 Usage</a></li>
-
-
 <li class="toctree-l1"><a class="reference internal" href="basic_usage/gpt_oss.html">GPT OSS Usage</a></li>
 <li class="toctree-l1"><a class="reference internal" href="basic_usage/llama4.html">Llama4 Usage</a></li>
 <li class="toctree-l1"><a class="reference internal" href="basic_usage/qwen3.html">Qwen3-Next Usage</a></li>

_sources/advanced_features/lora.ipynb

Lines changed: 207 additions & 214 deletions
Large diffs are not rendered by default.

_sources/advanced_features/separate_reasoning.ipynb

Lines changed: 123 additions & 117 deletions
Large diffs are not rendered by default.

_sources/advanced_features/speculative_decoding.ipynb

Lines changed: 514 additions & 301 deletions
Large diffs are not rendered by default.

_sources/advanced_features/structured_outputs.ipynb

Lines changed: 172 additions & 172 deletions
Large diffs are not rendered by default.

_sources/advanced_features/structured_outputs_for_reasoning_models.ipynb

Lines changed: 200 additions & 220 deletions
Large diffs are not rendered by default.

_sources/advanced_features/tool_parser.ipynb

Lines changed: 226 additions & 208 deletions
Large diffs are not rendered by default.

_sources/advanced_features/vlm_query.ipynb

Lines changed: 241 additions & 264 deletions
Large diffs are not rendered by default.

_sources/basic_usage/deepseek_v32.md

Lines changed: 3 additions & 3 deletions
@@ -60,7 +60,7 @@ python -m sglang.launch_server --model deepseek-ai/DeepSeek-V3.2-Exp --tp 8 --ep
 - B200: `flashmla_kv` prefill attention, `flashmla_kv` decode attention, `fp8_e4m3` kv cache dtype.
 - Currently we don't enable `prefill=flashmla_sparse` with `decode=flashmla_kv` due to latency caused by kv cache quantization operations. In the future we might shift to this setting after attention/quantization kernels are optimized.
 
-### Multi-token Prediction
+## Multi-token Prediction
 SGLang implements Multi-Token Prediction (MTP) for DeepSeek V3.2 based on [EAGLE speculative decoding](https://docs.sglang.ai/advanced_features/speculative_decoding.html#EAGLE-Decoding). With this optimization, the decoding speed can be improved significantly on small batch sizes. Please look at [this PR](https://github.com/sgl-project/sglang/pull/11652) for more information.
 
 Example usage:
@@ -71,10 +71,10 @@ python -m sglang.launch_server --model deepseek-ai/DeepSeek-V3.2-Exp --tp 8 --dp
 - The default value of `--max-running-requests` is set to `48` for MTP. For larger batch sizes, this value should be increased beyond the default value.
 
 
-# Function Calling and Reasoning Parser
+## Function Calling and Reasoning Parser
 The usage of function calling and reasoning parser is the same as DeepSeek V3.1. Please refer to [Reasoning Parser](https://docs.sglang.ai/advanced_features/separate_reasoning.html) and [Tool Parser](https://docs.sglang.ai/advanced_features/tool_parser.html) documents.
 
-# PD Disaggregation
+## PD Disaggregation
 
 Prefill Command:
 ```bash

_sources/basic_usage/native_api.ipynb

Lines changed: 284 additions & 314 deletions
Large diffs are not rendered by default.
