Commit 13f6bcd (1 parent: 708630f)

Commit message: minor

Signed-off-by: Kinjal Patel <[email protected]>

2 files changed: +3 −0 lines changed

examples/vllm_serve/README.md
Lines changed: 1 addition & 0 deletions

@@ -89,3 +89,4 @@ torch.distributed.barrier()
 
 1. AWQ is not yet supported in vLLM.
 2. PTQ/QAT checkpoint doesn't work with KV Cache quantization enabled.
+3. Mixed precision checkpoint doesn't work currently.

modelopt/torch/export/unified_export_hf.py
Lines changed: 2 additions & 0 deletions

@@ -582,6 +582,8 @@ def export_hf_checkpoint(
         dtype: the weights data type to export the unquantized layers or the default model data type if None.
         export_dir: the target export path.
         save_modelopt_state: whether to save the modelopt state_dict.
+        export_bf16_weights_amax: whether to export the bf16 weights and amax values separately. This can be used for
+            vLLM fakequant serving.
     """
     export_dir = Path(export_dir)
     export_dir.mkdir(parents=True, exist_ok=True)
