Sg/bamba bench #139

suyoggupta · 2025-09-24T20:40:01Z

bench.yaml

# Compilation backend
compile_backend: torch-cudagraph
# Runtime engine
runtime: trtllm
# Model loading
skip_loading_weights: false
# Fraction of free memory to use for kv-caches
free_mem_ratio: 0.9
# CUDA Graph optimization
cuda_graph_batch_sizes: [64, 32, 16, 8, 4, 2, 1]
# Attention backend
attn_backend: flashinfer
# Sequence configuration
max_batch_size: 64
model_kwargs:
  torch_dtype: bfloat16
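
A quick sanity check for the settings above, as a minimal sketch. The helper name `check_bench_config` and both rules it applies (a memory ratio in (0, 1], and no CUDA graph batch size above `max_batch_size`) are assumptions made for illustration, not checks the library is known to enforce. The values are copied into a dict so the snippet has no YAML dependency.

```python
# Values copied from bench.yaml above, written as a dict to avoid a YAML dependency.
config = {
    "compile_backend": "torch-cudagraph",
    "runtime": "trtllm",
    "skip_loading_weights": False,
    "free_mem_ratio": 0.9,
    "cuda_graph_batch_sizes": [64, 32, 16, 8, 4, 2, 1],
    "attn_backend": "flashinfer",
    "max_batch_size": 64,
    "model_kwargs": {"torch_dtype": "bfloat16"},
}


def check_bench_config(cfg: dict) -> list[str]:
    """Return human-readable problems; an empty list means the checks passed.

    Hypothetical helper: both rules below are assumptions, not library behavior.
    """
    problems = []
    ratio = cfg["free_mem_ratio"]
    if not 0.0 < ratio <= 1.0:
        problems.append(f"free_mem_ratio={ratio} is outside (0, 1]")
    too_big = [b for b in cfg["cuda_graph_batch_sizes"] if b > cfg["max_batch_size"]]
    if too_big:
        problems.append(
            f"cuda_graph_batch_sizes {too_big} exceed max_batch_size={cfg['max_batch_size']}"
        )
    return problems


problems = check_bench_config(config)
```

With the values from this PR, `problems` is empty. Lowering `max_batch_size` to 32 while keeping 64 in `cuda_graph_batch_sizes` would be flagged.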

Signed-off-by: Chenghao Zhang <[email protected]>

Signed-off-by: Suyog Gupta <[email protected]>

lucaslie · 2025-09-24T20:55:15Z

tensorrt_llm/_torch/auto_deploy/custom_ops/attention_interface.py

+            if tnsr_device.numel() < tnsr_host.numel():
+                print("WARNING: tnsr_device.numel() < tnsr_like.numel()")
+                print(f"{name=}, {tnsr_device.numel()=}, {tnsr_host.numel()=}")
+                tnsr_device.resize_(tnsr_host.numel())


Curious: where is this necessary?

suyoggupta · 2025-09-24T20:55:46Z

@lucaslie: FYI, this is the WAR (workaround) I need to get resize functionality working again on the feature branch. Without it, llama3.1 + cache_resize is broken.
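
For readers following the thread, the pattern in the diff can be sketched in isolation: grow a flat destination buffer in place when the source has more elements, then copy. This is a minimal sketch on CPU tensors. The helper name `copy_into_buffer` is hypothetical and this is not the code from `attention_interface.py`. Note that `resize_` may reallocate the underlying storage when it grows a tensor, so anything holding the old address (for example a captured CUDA graph) may need separate care.

```python
import torch


def copy_into_buffer(buf: torch.Tensor, src: torch.Tensor) -> torch.Tensor:
    """Copy `src` into the front of the flat buffer `buf`, growing `buf` if needed.

    Hypothetical helper illustrating the workaround discussed above.
    """
    n = src.numel()
    if buf.numel() < n:
        # Grow in place, as in the diff; the newly added tail is uninitialized
        # until the copy below overwrites it.
        buf.resize_(n)
    buf[:n].copy_(src.reshape(-1), non_blocking=True)
    return buf


buf = torch.zeros(4, dtype=torch.int32)   # stands in for the device tensor
src = torch.arange(6, dtype=torch.int32)  # stands in for the larger host tensor
copy_into_buffer(buf, src)
```

After the call, `buf` holds six elements equal to `src`. When `src` is smaller than `buf`, only the leading slice is overwritten and the buffer keeps its size.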

nvchenghaoz and others added 8 commits September 22, 2025 14:16

Fix the bamba unit test

22ade41

Signed-off-by: Chenghao Zhang <[email protected]>

none: Add triton backend for ssm_transform and cuda backend for conv

2344404

Signed-off-by: Chenghao Zhang <[email protected]>

Fully Use the TRT LLM kernels

1bbcf19

Signed-off-by: Chenghao Zhang <[email protected]>

Add fake version for ssm transform op

65083c2

Signed-off-by: Chenghao Zhang <[email protected]>

Fix the datatype error in fake op

8cfb07b

Signed-off-by: Chenghao Zhang <[email protected]>

Fix the conv test error

f6c7aec

Signed-off-by: Chenghao Zhang <[email protected]>

Fix the triton ssm error

08aada6

Signed-off-by: Chenghao Zhang <[email protected]>

WARs to get bamba + bench working

2f7a17b

Signed-off-by: Suyog Gupta <[email protected]>

lucaslie reviewed Sep 24, 2025

suyoggupta commented Sep 24, 2025

3 participants
