
Conversation


@vidyasiv vidyasiv commented Oct 20, 2025

Command

PT_HPU_LAZY_MODE=0 ./calibrate_model.sh \
-m <>/llama4/Llama-4-Maverick-17B-128E-Instruct \
-d <>/mlperf_inference/llama2/processed-data.pkl \
-o /eager_output  -b 128 -t 8 -l 4096

Without this fix:

1/4 Preparing calibration dataset
Calling add_step_closure function does not have any effect. It's lazy mode only functionality. (warning logged once)
Calling mark_step function does not have any effect. It's lazy mode only functionality. (warning logged once)
Calling iter_mark_step function does not have any effect. It's lazy mode only functionality. (warning logged once)
Loading source dataset: /mnt/weka/data/mlperf_inference/llama2/processed-data.pkl
Creating calibration dataset...
Traceback (most recent call last):
  File "/root/work/vllm-gaudi/calibration/step-1-prepare-calibration-dataset.py", line 93, in <module>
    main(args)
  File "/root/work/vllm-gaudi/calibration/step-1-prepare-calibration-dataset.py", line 61, in main
    tmp_input = tokenizer.apply_chat_template(tmp_conversation,
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'bool' object has no attribute 'apply_chat_template'

PR carried over from vllm-fork: HabanaAI/vllm-hpu-extension#341

Explanation:
Per the HF documentation, the Llama4 AutoTokenizer is intended for Llama4 text-only use; for multimodal cases, AutoProcessor should be used instead.

I noticed that omitting use_fast=False (or explicitly setting use_fast=True) in AutoTokenizer.from_pretrained() gets past the error.
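
For reference, a minimal sketch of the workaround, assuming the calibration script loads the tokenizer roughly like this (the model path and conversation below are placeholders, not the exact code in step-1-prepare-calibration-dataset.py):

```python
from transformers import AutoTokenizer

model_path = "<path>/Llama-4-Maverick-17B-128E-Instruct"  # placeholder path

# Failing variant (per the traceback above, this returned a bool for this model):
# tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=False)

# Workaround: omit use_fast=False (or set use_fast=True explicitly)
tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=True)

conversation = [{"role": "user", "content": "example calibration prompt"}]
prompt = tokenizer.apply_chat_template(conversation,
                                       tokenize=False,
                                       add_generation_prompt=True)
```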

@github-actions

🚧 CI Blocked

The main CI workflow was not started for the following reason:

This is a Draft PR. Please mark it as 'Ready for Review' to trigger the CI.

@github-actions

✅ CI Passed

All checks passed successfully against the following vllm commit:
1c691f4a714981bd90ce536cbd00041d3e0aa7bb
