Fix e2e test failure for RTX6000 Pro (NVIDIA#6420)

farazkh80 · Ria Jain · commit fb436fd548ef · 2025-08-07T16:13:15.000-07:00
Signed-off-by: list &lt;58580514+farazkh80@users.noreply.github.com&gt;
Signed-off-by: Faraz &lt;58580514+farazkh80@users.noreply.github.com&gt;
diff --git a/tests/integration/defs/test_e2e.py b/tests/integration/defs/test_e2e.py
@@ -1628,6 +1628,8 @@ def test_ptp_quickstart_advanced(llm_root, llm_venv, model_name, model_path):
             ]
             if "Qwen3" in model_name:
                 cmds.append(f"--kv_cache_fraction=0.6")
+            if "Llama3.1-70B" in model_name:
+                cmds.append(f"--max_num_tokens=1024")
             llm_venv.run_cmd(cmds, stdout=running_log)
             if model_name in mapping:
                 _check_mem_usage(running_log, [mapping[model_name], 0, 0, 0])
diff --git a/tests/integration/test_lists/waives.txt b/tests/integration/test_lists/waives.txt
@@ -396,8 +396,6 @@ test_e2e.py::test_trtllm_bench_llmapi_launch[trt_backend-llama-v3-llama3-8b] SKI
 stress_test/stress_test.py::test_run_stress_test[llama-v3-8b-instruct-hf_tp1-stress_time_300s_timeout_450s-GUARANTEED_NO_EVICT-pytorch-stress-test] SKIP (https://nvbugs/5375646)
 full:GH200/disaggregated/test_disaggregated.py::test_disaggregated_deepseek_v3_lite_fp8_attention_dp_one[DeepSeek-V3-Lite-fp8] SKIP (https://nvbugs/5375966)
 accuracy/test_llm_api_pytorch.py::TestGemma3_1BInstruct::test_auto_dtype SKIP (https://nvbugs/5375620)
-test_e2e.py::test_ptp_quickstart_advanced[Mixtral-8x7B-NVFP4-nvfp4-quantized/Mixtral-8x7B-Instruct-v0.1] SKIP (https://nvbugs/5377465)
-test_e2e.py::test_ptp_quickstart_advanced[Llama3.1-70B-FP8-llama-3.1-model/Llama-3.1-70B-Instruct-FP8] SKIP (https://nvbugs/5377465)
 test_e2e.py::test_ptp_quickstart_advanced_8gpus[Llama3.1-405B-FP8-llama-3.1-model/Llama-3.1-405B-Instruct-FP8] SKIP (https://nvbugs/5380570)
 test_e2e.py::test_ptp_quickstart_advanced_8gpus[Nemotron-Ultra-253B-nemotron-nas/Llama-3_1-Nemotron-Ultra-253B-v1] SKIP (https://nvbugs/5380570)
 examples/test_multimodal.py::test_llm_fp8_multimodal_general[fp8-fp8-cnn_dailymail-Qwen2-VL-7B-Instruct-pp:1-tp:1-bfloat16-bs:1-cpp_e2e:False] SKIP (https://nvbugs/5385987)

Original file line number	Diff line number	Diff line change
`@@ -1628,6 +1628,8 @@ def test_ptp_quickstart_advanced(llm_root, llm_venv, model_name, model_path):`
`1628`	`1628`	`]`
`1629`	`1629`	`if "Qwen3" in model_name:`
`1630`	`1630`	`cmds.append(f"--kv_cache_fraction=0.6")`
	`1631`	`+ if "Llama3.1-70B" in model_name:`
	`1632`	`+ cmds.append(f"--max_num_tokens=1024")`
`1631`	`1633`	`llm_venv.run_cmd(cmds, stdout=running_log)`
`1632`	`1634`	`if model_name in mapping:`
`1633`	`1635`	`_check_mem_usage(running_log, [mapping[model_name], 0, 0, 0])`