Conversation

@oandreeva-nv (Contributor) commented Sep 17, 2025

Overview:

Fixes DIS-507

Details:

Where should the reviewer start?

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

  • closes GitHub issue: #xxx

Summary by CodeRabbit

  • New Features

    • Configure memory layout per device, host, and disk via Python and builder APIs.
    • Inter-layout transfers supported (contiguous ↔ layer-separated) with validation and debug assertions.
    • Layout verification utilities and reports, including address/size checks and optional checksums.
    • Pre-transfer compatibility checks and a new integrity verification API for post-transfer validation.
  • Bug Fixes

    • Corrected block view sizing for fully contiguous layouts with outer dimensions.
  • Tests

    • Extensive coverage for inter-layout copies (CPU/GPU), alignment scenarios, and GDS-compatible disk operations.

oandreeva-nv and others added 4 commits September 15, 2025 10:08

copy-pr-bot bot commented Sep 17, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Signed-off-by: Olga Andreeva <[email protected]>
@oandreeva-nv marked this pull request as ready for review September 17, 2025 17:36
@oandreeva-nv requested a review from a team as a code owner September 17, 2025 17:36

coderabbitai bot commented Sep 17, 2025

Walkthrough

Introduces per-component layout configuration (device/host/disk) across Python bindings, vLLM/TRTLLM connectors, and worker/config. Adds inter-layout transfer support (FC↔LS) for CUDA and NIXL paths, pre-transfer compatibility checks, and post-transfer integrity verification. Expands layout module with verification utilities and adjusts memory region sizing. Updates tests for GDS alignment and cross-layout cases.
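
For orientation, the offset arithmetic that separates the two layouts can be sketched in a few lines. This is illustrative only: the parameter names mirror the config fields discussed below, but the real implementation lives in lib/llm/src/block_manager/layout.rs and may differ in detail.

    // FullyContiguous (FC): every (layer, outer) region of a block is packed
    // back to back, ordered by (layer, outer).
    fn fc_region_offset(
        block: usize, layer: usize, outer: usize,
        num_layers: usize, outer_dim: usize, region_bytes: usize,
    ) -> usize {
        ((block * num_layers + layer) * outer_dim + outer) * region_bytes
    }

    // LayerSeparate (LS): each (layer, outer) pair owns its own allocation, so a
    // region's offset inside that allocation depends only on the block index.
    fn ls_region_offset(block: usize, region_bytes: usize) -> usize {
        block * region_bytes
    }

    fn main() {
        let (num_layers, outer_dim, region_bytes) = (32, 2, 4096);
        // In FC, block 0's second region sits right after its first.
        assert_eq!(fc_region_offset(0, 0, 1, num_layers, outer_dim, region_bytes), 4096);
        // In LS, block 3 sits at the same offset in every layer's allocation.
        assert_eq!(ls_region_offset(3, region_bytes), 3 * 4096);
    }

An FC↔LS transfer is then a per-(layer, outer) copy loop that walks one side contiguously, which is what the CUDA and NIXL paths reviewed below implement.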

Changes

Cohort / File(s) / Summary
Python bindings: layout args + enum
lib/bindings/python/rust/llm/block_manager/distributed/worker.rs
Adds PyLayoutType enum; maps to internal LayoutType; extends KvbmWorker::new with optional device/host/disk layout args; builder wiring with defaults to FullyContiguous (see the sketch after this table).
vLLM/TRTLLM connectors: builder updates
lib/bindings/python/rust/llm/block_manager/vllm/connector/trtllm_worker.rs, lib/bindings/python/rust/llm/block_manager/vllm/connector/worker.rs
Replaces boolean is_fully_contiguous_layout with device/host/disk layout setters; sets FullyContiguous in TRTLLM connector; vLLM connector sets LS on device, FC on host/disk.
Worker/config: per-edge layout inference
lib/llm/src/block_manager/distributed/worker.rs
Removes boolean flag; adds device_layout_type, host_layout_type, disk_layout_type; infers device layout via match on type; allocates host/disk per specified types.
Block view sizing (FC path)
lib/llm/src/block_manager/block/data/local.rs
For FC blocks, multiplies size by num_outer_dims in BlockView/BlockViewMut creation.
CUDA transfers: inter-layout copy
lib/llm/src/block_manager/block/transfer/cuda.rs
Implements FC→LS and LS→FC copy loops with per-layer/outer iteration and assertions; keeps FC→FC memcpy; adds layout transfer tests and debug assertions.
NIXL transfers: inter-layout mapping and sizing
lib/llm/src/block_manager/block/transfer/nixl.rs
Adds FC→LS descriptor mapping with early return; adds early return for FC→FC; in general path uses min(src,dst) per-layer copy_size with warning.
Distributed transfer: validation & integrity API
lib/llm/src/block_manager/distributed/transfer.rs
Adds pre-transfer layout compatibility check; introduces public verify_transfer_integrity with byte-wise compare and optional pattern checks; adds logging.
Layout module: APIs, sizing, utils exposure
lib/llm/src/block_manager/layout.rs
Removes pub mod distributed; adds pub mod utils; changes FC memory_region_size to outer_dim stride; adds verification methods (verify_memory_regions, expected_memory_address, verify_memory_region) for FC and LS; adds LS alignment verification; introduces crate-private LayerSeparateConfig.
Layout utils: worker verification utilities
lib/llm/src/block_manager/layout/utils.rs
Adds validate_power_of_2 returning validator::ValidationError; introduces public worker_verification module with WorkerLayoutVerifier, RegionVerificationResult, LayoutVerificationStats, and verify_layout_compatibility.
Offload tests: GDS and integrity
lib/llm/src/block_manager/offload.rs
Adjusts existing assertions; adds GDS-focused tests, helpers for mixed layouts, pattern population/verification, file lifecycle, and constrained scenarios.
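
To make the bindings cohort above concrete, here is a minimal sketch of the enum mapping and the default-to-FullyContiguous behavior it describes. Plain Rust with PyO3 attributes omitted; the actual types live in the worker.rs files listed above and may differ.

    #[derive(Clone, Copy, Debug, PartialEq)]
    enum PyLayoutType { FullyContiguous, LayerSeparate }

    #[derive(Clone, Copy, Debug, PartialEq)]
    enum LayoutType { FullyContiguous, LayerSeparate }

    impl From<PyLayoutType> for LayoutType {
        fn from(py: PyLayoutType) -> Self {
            match py {
                PyLayoutType::FullyContiguous => LayoutType::FullyContiguous,
                PyLayoutType::LayerSeparate => LayoutType::LayerSeparate,
            }
        }
    }

    // Each optional per-component argument falls back to FullyContiguous.
    fn resolve(arg: Option<PyLayoutType>) -> LayoutType {
        arg.map(Into::into).unwrap_or(LayoutType::FullyContiguous)
    }

    fn main() {
        assert_eq!(resolve(None), LayoutType::FullyContiguous);
        assert_eq!(resolve(Some(PyLayoutType::LayerSeparate)), LayoutType::LayerSeparate);
    }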

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Py as Python
  participant Pyo3 as PyO3 Binding
  participant Cfg as KvbmWorkerConfig Builder
  participant W as KvbmWorker
  participant Dev as Device Layout
  participant Host as Host Layout
  participant Disk as Disk Layout

  Py->>Pyo3: KvbmWorker.new(..., device_layout_type?, host_layout_type?, disk_layout_type?)
  Pyo3->>Cfg: set device/host/disk layout types (default FC)
  Cfg-->>Pyo3: built config
  Pyo3->>W: KvbmWorker::new(config)
  W->>W: infer device layout (match LayoutType)
  W->>Dev: create device layout
  W->>Host: create host layout (from config)
  W->>Disk: create disk layout (from config)
  W-->>Py: ready
  note over W,Dev: Layout-specific dims resolved (layers/outer/inner)

sequenceDiagram
  autonumber
  participant BTH as BlockTransferHandler
  participant Src as Source Blocks
  participant Dst as Target Blocks
  participant IO as write_to(..)

  BTH->>BTH: begin_transfer(sources, targets)
  BTH->>BTH: validate_transfer_compatibility
  alt compatible
    BTH->>IO: write_to(sources -> targets)
    IO-->>BTH: Ok(())
  else mismatch
    BTH-->>BTH: Err(LayoutMismatch)
  end

  opt Post-transfer (caller)
    BTH->>BTH: verify_transfer_integrity(sources, targets, patterns?)
    BTH-->>BTH: Ok / Err (data mismatch)
  end
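
In code, the diagram's flow reduces to validate, transfer, then optionally verify. A stand-in sketch: the types and signatures here are placeholders, not the real API in lib/llm/src/block_manager/distributed/transfer.rs.

    struct BlockTransferHandler;

    #[derive(Debug)]
    enum TransferError { LayoutMismatch, DataMismatch }

    impl BlockTransferHandler {
        // Stand-in: the real check compares layout configs before any bytes move.
        fn validate_transfer_compatibility(&self, src: &[u8], dst: &[u8]) -> Result<(), TransferError> {
            if src.len() == dst.len() { Ok(()) } else { Err(TransferError::LayoutMismatch) }
        }
        // Stand-in for the CUDA/NIXL transfer path.
        fn write_to(&self, src: &[u8], dst: &mut [u8]) -> Result<(), TransferError> {
            dst.copy_from_slice(src);
            Ok(())
        }
        // Optional post-transfer check, as in the opt block above.
        fn verify_transfer_integrity(&self, src: &[u8], dst: &[u8]) -> Result<(), TransferError> {
            if src == dst { Ok(()) } else { Err(TransferError::DataMismatch) }
        }
    }

    fn main() -> Result<(), TransferError> {
        let handler = BlockTransferHandler;
        let src = vec![42u8; 16];
        let mut dst = vec![0u8; 16];
        handler.validate_transfer_compatibility(&src, &dst)?; // fail fast on mismatch
        handler.write_to(&src, &mut dst)?;
        handler.verify_transfer_integrity(&src, &dst)?;
        Ok(())
    }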

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Poem

In tunnels of memory where pages align,
I hop between layers—contiguous, fine.
Device, host, and disk now dance in three,
FC to LS, we copy with glee.
I nibble on checksums; mismatches I fear—
Verified, validated—carrots are clear! 🥕✨

Pre-merge checks

❌ Failed checks (1 warning)

• Description Check: ⚠️ Warning
  Explanation: The provided PR description is incomplete relative to the repository template: the Overview contains only "Fixes DIS-507", the Details and "Where should the reviewer start?" sections are empty placeholders, and the Related Issues entry uses the placeholder "#xxx", so there is no substantive summary of changes, per-file highlights, testing instructions, or a real issue reference. Because these required elements are missing, reviewers lack the context needed to evaluate scope, risk, and where to focus their review. The description therefore does not meet the repository's template requirements.
  Resolution: Please update the PR description to follow the template: provide a concise Overview stating the intent and high-level impact, populate Details with the concrete changes (files modified, key behavioral/API changes, and test coverage), and fill "Where should the reviewer start?" with specific files or lines to inspect; replace the placeholder "closes #xxx" with the actual issue number and include testing steps, migration notes, and any backward-compatibility concerns so reviewers can proceed. Once updated, re-run the pre-merge checks.
✅ Passed checks (2 passed)

• Docstring Coverage: ✅ Passed. Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%.
• Title Check: ✅ Passed. The title accurately names a concrete, important change in the diff—host and disk layout defaults/configuration are moved to FullyContiguous—and is specific and non-generic; however, the PR also contains broader layout work (per-edge LayoutType support, Python bindings, layout-compatibility transfer/convert paths, verification APIs, and tests) that the title does not enumerate.


@coderabbitai bot left a comment (Contributor)


Actionable comments posted: 17

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (7)
lib/llm/src/block_manager/block/data/local.rs (1)

123-130: Guard against overflow when computing fully-contiguous block size

Multiplying mr.size() by num_layers() and num_outer_dims() can overflow usize and won’t be caught. Use checked_mul and surface a BlockError instead of silently wrapping.

Apply this diff:

-            let size = mr.size() * self.num_layers() * self.num_outer_dims();
+            let per_region = mr.size();
+            let size = per_region
+                .checked_mul(self.num_layers())
+                .and_then(|v| v.checked_mul(self.num_outer_dims()))
+                .ok_or_else(|| BlockError::InvalidState("block size overflow".to_string()))?;
lib/bindings/python/rust/llm/block_manager/vllm/connector/worker.rs (2)

353-356: Don’t panic in runtime hot path when a slot is missing

Panic will take down the process. Log and continue (or convert to an error return higher up) to keep the worker resilient.

Apply this diff:

-            } else {
-                // made this condition more strict slot existence checks were added as a prerequesite
-                // to be added to the maybe_finished_offloading set.
-                panic!("request slot missing for {request_id}; however, it was present when added to the maybe finished offloading set");
-            }
+            } else {
+                // stricter existence checks are in place, but avoid crashing the process
+                tracing::error!(
+                    request_id,
+                    "request slot missing; was present when added to maybe_finished_offloading"
+                );
+                continue;
+            }

382-383: Avoid panic in onboarding check as well

Mirror the non-panicking handling here.

Apply this diff:

-            } else {
-                panic!("request slot missing for {request_id}; however, it was present when added to the maybe finished onboarding set");
-            }
+            } else {
+                tracing::error!(
+                    request_id,
+                    "request slot missing; was present when added to maybe_finished_onboarding"
+                );
+                continue;
+            }
lib/bindings/python/rust/llm/block_manager/vllm/connector/trtllm_worker.rs (2)

311-323: Avoid panic on offloading slot miss

Same resilience concern as the vLLM connector; don’t panic in production paths.

Apply this diff:

-            } else {
-                // made this condition more strict slot existence checks were added as a prerequesite
-                // to be added to the maybe_finished_offloading set.
-                panic!("request slot missing for {request_id}; however, it was present when added to the maybe finished offloading set");
-            }
+            } else {
+                tracing::error!(
+                    request_id,
+                    "request slot missing; was present when added to maybe_finished_offloading"
+                );
+                continue;
+            }

349-350: Avoid panic on onboarding slot miss

Mirror non-panicking handling here.

Apply this diff:

-            } else {
-                panic!("request slot missing for {request_id}; however, it was present when added to the maybe finished onboarding set");
-            }
+            } else {
+                tracing::error!(
+                    request_id,
+                    "request slot missing; was present when added to maybe_finished_onboarding"
+                );
+                continue;
+            }
lib/llm/src/block_manager/distributed/transfer.rs (1)

104-115: Bounds-check indices from the request before indexing into pools.

source_pool_list[idx] and target_pool_list[idx] may panic if a request carries out-of-range indices. Validate and error out.

+        // Validate indices before indexing
+        let src_len = source_pool_list.len();
+        let tgt_len = target_pool_list.len();
+        for (from, to) in request.blocks() {
+            if *from >= src_len || *to >= tgt_len {
+                return Err(anyhow::anyhow!(
+                    "Transfer index out of bounds: from={}, to={}, src_len={}, tgt_len={}",
+                    from, to, src_len, tgt_len
+                ));
+            }
+        }
lib/llm/src/block_manager/distributed/worker.rs (1)

581-589: Avoid unwraps in production path (stream creation).

Propagate errors instead of panicking:

-        let transfer_context = Arc::new(TransferContext::new(
-            Arc::new(Some(agent)),
-            DeviceAllocator::new(config.device_id)
-                .unwrap()
-                .ctx()
-                .new_stream()
-                .unwrap(),
+        let stream = DeviceAllocator::new(config.device_id)?
+            .ctx()
+            .new_stream()?;
+        let transfer_context = Arc::new(TransferContext::new(
+            Arc::new(Some(agent)),
+            stream,
             Handle::current(),
         ));
🧹 Nitpick comments (17)
lib/llm/src/block_manager/block/transfer/nixl.rs (1)

55-56: Early return is fine, but keep assertions in debug

The explicit Ok(()) is fine; ensure any preconditions remain covered by debug assertions elsewhere.

lib/llm/src/block_manager/block/transfer/cuda.rs (3)

246-253: Incorrect debug message for overlap check

H2D path message says “D2D copy”; adjust to avoid confusion or remove overlap check entirely for H2D.

Apply this diff:

-        "Source and destination device memory regions must not overlap for D2D copy"
+        "Source and destination memory regions must not overlap"

270-276: Same nit: overlap message mentions D2D in D2H path

Mirror the wording fix here.

Apply this diff:

-        "Source and destination device memory regions must not overlap for D2D copy"
+        "Source and destination memory regions must not overlap"

309-739: Large CUDA tests: keep but ensure they don’t run by default

The tests are cfg(test, feature = "testing-cuda"); that’s good. Consider marking the perf test with #[ignore] to avoid accidental long runs in CI tiers without the feature.
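
For reference, the marker could look like this (test name hypothetical); ignored tests still run on demand via cargo test -- --ignored:

    #[test]
    #[ignore = "long-running perf test; run explicitly with `cargo test -- --ignored`"]
    fn inter_layout_copy_perf() {
        // existing test body unchanged
    }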

lib/llm/src/block_manager/distributed/transfer.rs (1)

21-23: Remove unused import.

layout::BlockLayoutConfig isn’t used here. Drop it to keep warnings clean.

lib/llm/src/block_manager/distributed/worker.rs (1)

217-219: Use checked math for bytes_per_block to avoid usize overflow.

Guard against large shapes:

-        let bytes_per_block =
-            num_layers * outer_dim * config.page_size * inner_dim * config.dtype_width_bytes;
+        let bytes_per_block = num_layers
+            .checked_mul(outer_dim).and_then(|v| v.checked_mul(config.page_size))
+            .and_then(|v| v.checked_mul(inner_dim)).and_then(|v| v.checked_mul(config.dtype_width_bytes))
+            .ok_or_else(|| anyhow::anyhow!("bytes_per_block overflow"))?;
lib/bindings/python/rust/llm/block_manager/distributed/worker.rs (1)

171-181: Minor: simplify Option mapping and let rustfmt reflow.

Use map(Into::into) to reduce verbosity.

-            .device_layout_type(device_layout_type.map(|py_layout| py_layout.into()).unwrap_or(LayoutType::FullyContiguous))
+            .device_layout_type(device_layout_type.map(Into::into).unwrap_or(LayoutType::FullyContiguous))
             .host_layout_type(
-                host_layout_type
-                    .map(|py_layout| py_layout.into())
-                    .unwrap_or(LayoutType::FullyContiguous)
+                host_layout_type.map(Into::into).unwrap_or(LayoutType::FullyContiguous)
             )
             .disk_layout_type(
-                disk_layout_type
-                    .map(|py_layout| py_layout.into())
-                    .unwrap_or(LayoutType::FullyContiguous)
+                disk_layout_type.map(Into::into).unwrap_or(LayoutType::FullyContiguous)
             )
lib/llm/src/block_manager/layout.rs (3)

338-345: Confirm semantics: memory_region_size now equals one page.

Setting memory_region_size = outer_dim_stride_in_bytes aligns LocalMemoryRegion::size to a single [page_size × inner_dim × dtype] region. This is consistent with region addressing used elsewhere. Ensure all callers (e.g., transfer validators) expect per-page size, not per-layer size.
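
For concreteness, one region now covers exactly one page, and block-level totals are derived from it. A sketch with made-up dimensions (field names follow the layout config):

    fn main() {
        let (page_size, inner_dim, dtype_width_bytes) = (16usize, 128, 2);
        let region_size = page_size * inner_dim * dtype_width_bytes; // one page = 4096 bytes
        let (num_layers, outer_dim) = (32usize, 2);
        // What the FC block view computes from per-page regions (see the local.rs change):
        let block_size = region_size * num_layers * outer_dim;
        assert_eq!(block_size, 262_144);
    }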


861-943: LayerSeparate verification and alignment checks are solid.

Good coverage: region math, alignment, and capacity checks per-layer. Minor: consider early-returning on first mismatch to speed failing tests.


1621-1904: Comprehensive tests are valuable; minor nits.

The no-overlap/alignment/stride tests add strong regression guarantees. Consider parameterizing constants to reduce duplication across FC/LS tests.

lib/llm/src/block_manager/layout/utils.rs (7)

19-25: align_up: guard against overflow and enforce precondition in release builds

value + alignment - 1 can overflow; debug_assert! disappears in release. Use a checked add and a hard assert (or return Result) to make it safe.

 #[inline(always)]
 pub fn align_up(value: usize, alignment: usize) -> usize {
-    debug_assert!(alignment.is_power_of_two(), "Alignment must be a power of 2");
-    (value + alignment - 1) & !(alignment - 1)
+    assert!(alignment.is_power_of_two() && alignment > 0, "Alignment must be a power of 2");
+    let add = alignment - 1;
+    let sum = value.checked_add(add).expect("overflow in align_up");
+    sum & !add
 }

27-36: Don’t leak validator::ValidationError from low-level utils

This module otherwise uses LayoutError. Exposing validator::ValidationError here couples layers unnecessarily.

-/// Validates that the given value is a power of 2.
-pub fn validate_power_of_2(alignment: usize) -> Result<(), validator::ValidationError> {
-    if alignment.is_power_of_two() {
-        Ok(())
-    } else {
-        Err(validator::ValidationError::new(
-            "Alignment must be a power of 2",
-        ))
-    }
-}
+/// Validates that the given value is a power of 2.
+pub fn validate_power_of_2(alignment: usize) -> Result<(), LayoutError> {
+    if alignment.is_power_of_two() {
+        Ok(())
+    } else {
+        Err(LayoutError::InvalidConfig("Alignment must be a power of 2".into()))
+    }
+}

If external callers truly need validator::ValidationError, add a thin adapter outside this module.
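
Such an adapter could stay a one-liner at the boundary. Sketch only: it assumes the LayoutError-returning validate_power_of_2 proposed above and the validator crate's ValidationError::new.

    // Hypothetical adapter, kept outside the low-level utils module.
    fn validate_power_of_2_for_validator(
        alignment: usize,
    ) -> Result<(), validator::ValidationError> {
        validate_power_of_2(alignment)
            .map_err(|_| validator::ValidationError::new("Alignment must be a power of 2"))
    }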


167-190: Pre‑allocate results to avoid reallocations

We know total regions up front.

-            let mut results = Vec::new();
+            let mut results = Vec::with_capacity(layout.num_blocks() * layout.num_layers() * layout.outer_dim());

266-289: Stats should ignore “unknown” address state if you adopt tri‑state addr_matches

If you switch to Option<bool>, update counters to only include Some(false) as mismatches and Some(true) in successes.

-            if !result.addr_matches {
+            if matches!(result.addr_matches, Some(false)) {
                 self.stats.addr_mismatches += 1;
             }
 ...
-            if result.addr_matches && result.size_matches {
+            if result.size_matches && matches!(result.addr_matches, Some(true) | None) {
                 self.stats.successful_verifications += 1;
             }

297-300: Critical mismatches should include checksum when verify_data=true

Currently only size mismatches are considered critical. If data verification is opted in, checksum mismatches should elevate to critical as well.

-        pub fn has_critical_mismatches(&self) -> bool {
-            // Only check size mismatches since address verification is layout-specific
-            self.stats.size_mismatches > 0
-        }
+        pub fn has_critical_mismatches(&self) -> bool {
+            self.stats.size_mismatches > 0 || self.stats.checksum_mismatches > 0
+        }

303-328: Avoid intermediate allocations in report building

Use string literals with push_str.

-            report.push_str(&format!("Layout Verification Report\n"));
-            report.push_str(&format!("========================\n"));
+            report.push_str("Layout Verification Report\n");
+            report.push_str("========================\n");

(Apply similarly to the other lines.)


331-396: Compatibility should also check alignment; consider returning a rich report

  • Missing alignment check can green‑light incompatible GDS paths.
  • Returning bool hides the reason; a structured report improves UX.

Add alignment check:

         if source_config.dtype_width_bytes != dest_config.dtype_width_bytes {
             tracing::error!("Data type width mismatch: {} vs {}",
                 source_config.dtype_width_bytes, dest_config.dtype_width_bytes);
             return Ok(false);
         }
+
+        if source_config.alignment != dest_config.alignment {
+            tracing::error!("Alignment mismatch: {} vs {}",
+                source_config.alignment, dest_config.alignment);
+            return Ok(false);
+        }

Follow‑up (non‑blocking): return Result<(), Incompatibility> collecting all mismatches instead of the first-fail boolean.
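
One possible shape for that richer result, collecting every mismatch rather than stopping at the first (illustrative only; none of these names are from the PR):

    #[derive(Debug, PartialEq)]
    enum Incompatibility {
        NumLayers { src: usize, dst: usize },
        DtypeWidth { src: usize, dst: usize },
        Alignment { src: usize, dst: usize },
    }

    struct Config { num_layers: usize, dtype_width_bytes: usize, alignment: usize }

    fn check_compat(src: &Config, dst: &Config) -> Result<(), Vec<Incompatibility>> {
        let mut problems = Vec::new();
        if src.num_layers != dst.num_layers {
            problems.push(Incompatibility::NumLayers { src: src.num_layers, dst: dst.num_layers });
        }
        if src.dtype_width_bytes != dst.dtype_width_bytes {
            problems.push(Incompatibility::DtypeWidth { src: src.dtype_width_bytes, dst: dst.dtype_width_bytes });
        }
        if src.alignment != dst.alignment {
            problems.push(Incompatibility::Alignment { src: src.alignment, dst: dst.alignment });
        }
        if problems.is_empty() { Ok(()) } else { Err(problems) }
    }

    fn main() {
        let a = Config { num_layers: 32, dtype_width_bytes: 2, alignment: 4096 };
        let b = Config { num_layers: 32, dtype_width_bytes: 2, alignment: 512 };
        assert_eq!(check_compat(&a, &b).unwrap_err().len(), 1);
    }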

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 67ff181 and c5e2be2.

📒 Files selected for processing (11)
  • lib/bindings/python/rust/llm/block_manager/distributed/worker.rs (4 hunks)
  • lib/bindings/python/rust/llm/block_manager/vllm/connector/trtllm_worker.rs (2 hunks)
  • lib/bindings/python/rust/llm/block_manager/vllm/connector/worker.rs (2 hunks)
  • lib/llm/src/block_manager/block/data/local.rs (2 hunks)
  • lib/llm/src/block_manager/block/transfer/cuda.rs (2 hunks)
  • lib/llm/src/block_manager/block/transfer/nixl.rs (2 hunks)
  • lib/llm/src/block_manager/distributed/transfer.rs (3 hunks)
  • lib/llm/src/block_manager/distributed/worker.rs (4 hunks)
  • lib/llm/src/block_manager/layout.rs (6 hunks)
  • lib/llm/src/block_manager/layout/utils.rs (3 hunks)
  • lib/llm/src/block_manager/offload.rs (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (8)
lib/llm/src/block_manager/offload.rs (4)
lib/llm/src/block_manager/layout/utils.rs (1)
  • verify_layout_compatibility (332-395)
lib/llm/src/block_manager/storage.rs (8)
  • storage_type (171-171)
  • storage_type (389-391)
  • storage_type (469-471)
  • storage_type (510-512)
  • size (177-177)
  • size (397-399)
  • size (477-479)
  • size (518-520)
lib/llm/src/block_manager/block/data.rs (4)
  • storage_type (26-26)
  • layer_view (49-58)
  • num_layers (32-32)
  • block_view (73-78)
lib/llm/src/block_manager/storage/disk.rs (3)
  • storage_type (123-125)
  • fd (90-92)
  • size (131-133)
lib/llm/src/block_manager/block/transfer/nixl.rs (3)
lib/llm/src/block_manager/layout.rs (1)
  • num_layers (235-237)
lib/llm/src/block_manager/block.rs (2)
  • num_layers (377-379)
  • size (1049-1051)
lib/llm/src/block_manager/storage/nixl.rs (7)
  • size (114-116)
  • size (281-283)
  • size (299-301)
  • size (323-325)
  • size (348-350)
  • size (373-375)
  • size (412-414)
lib/llm/src/block_manager/block/transfer/cuda.rs (4)
lib/llm/src/block_manager/block/data/local.rs (1)
  • num_layers (74-76)
lib/llm/src/block_manager/layout.rs (9)
  • num_layers (235-237)
  • storage (194-194)
  • storage (505-507)
  • storage (842-844)
  • config (206-206)
  • config (519-521)
  • config (804-806)
  • allocate (470-495)
  • allocate (764-796)
lib/llm/src/block_manager/block/data.rs (1)
  • num_layers (32-32)
lib/llm/src/block_manager/storage/cuda.rs (9)
  • size (226-228)
  • size (388-390)
  • default (284-288)
  • default (445-449)
  • ctx (460-462)
  • allocate (301-303)
  • allocate (466-468)
  • addr (222-224)
  • addr (384-386)
lib/bindings/python/rust/llm/block_manager/vllm/connector/trtllm_worker.rs (1)
lib/llm/src/block_manager/block.rs (1)
  • layout (1242-1244)
lib/llm/src/block_manager/distributed/transfer.rs (3)
lib/llm/src/block_manager/block.rs (3)
  • layout (1242-1244)
  • num_layers (377-379)
  • size (1049-1051)
lib/llm/src/block_manager/block/data/local.rs (1)
  • num_layers (74-76)
lib/llm/src/block_manager/block/data.rs (1)
  • num_layers (32-32)
lib/llm/src/block_manager/distributed/worker.rs (1)
lib/llm/src/block_manager/layout.rs (11)
  • builder (296-298)
  • layout_type (191-191)
  • layout_type (501-503)
  • layout_type (836-840)
  • num_layers (235-237)
  • outer_dim (243-245)
  • inner_dim (253-255)
  • config (206-206)
  • config (519-521)
  • config (804-806)
  • page_size (248-250)
lib/llm/src/block_manager/layout.rs (2)
lib/llm/src/block_manager/storage.rs (8)
  • addr (174-174)
  • addr (393-395)
  • addr (473-475)
  • addr (514-516)
  • size (177-177)
  • size (397-399)
  • size (477-479)
  • size (518-520)
lib/llm/src/block_manager/storage/nixl.rs (8)
  • addr (277-279)
  • size (114-116)
  • size (281-283)
  • size (299-301)
  • size (323-325)
  • size (348-350)
  • size (373-375)
  • size (412-414)
lib/llm/src/block_manager/layout/utils.rs (2)
lib/llm/src/block_manager/block.rs (1)
  • layout (1242-1244)
lib/llm/src/block_manager/layout.rs (1)
  • outer_dim (243-245)
🪛 GitHub Actions: Rust pre-merge checks
lib/bindings/python/rust/llm/block_manager/distributed/worker.rs

[error] 10-10: Rustfmt: formatting issue detected by cargo fmt -- --check. Run 'cargo fmt' to fix.


[error] 28-28: Rustfmt: multi-line struct initialization formatting changed. Run 'cargo fmt'.


[error] 168-168: Rustfmt: break device_layout_type call into multi-line layout.


[error] 175-175: Rustfmt: adjust formatting around unwrap_or in device layout chain.


[error] 179-179: Rustfmt: adjust formatting in device layout chain (map/unwrap).

lib/bindings/python/rust/llm/block_manager/vllm/connector/worker.rs

[error] 20-20: Rustfmt: import LayoutType inserted/moved; reformat to match style.


[error] 169-169: Rustfmt: break device_layout_type call into multi-line layout.


[error] 175-175: Rustfmt: trailing comma added and unwrap formatting adjusted.


[error] 179-179: Rustfmt: unwrap/map formatting adjusted in device layout chain.


[error] 26-26: Rustfmt: import LayoutType removed or reordered to satisfy formatting.

lib/bindings/python/rust/llm/block_manager/vllm/connector/trtllm_worker.rs

[error] 19-19: Rustfmt: import LayoutType inserted/moved; reformat to match style.


[error] 25-25: Rustfmt: import LayoutType adjusted to satisfy formatting.

🪛 GitHub Actions: NVIDIA Dynamo Github Validation
lib/llm/src/block_manager/block/transfer/nixl.rs

[error] 1-999: Rust formatting check failed. Run 'cargo fmt' to format code (cargo fmt -- --check).

lib/llm/src/block_manager/block/transfer/cuda.rs

[error] 1-999: Rust formatting check failed. Run 'cargo fmt' to format code (cargo fmt -- --check).

lib/llm/src/block_manager/distributed/transfer.rs

[error] 1-999: Rust formatting check failed. Run 'cargo fmt' to format code (cargo fmt -- --check).

lib/llm/src/block_manager/layout/utils.rs

[error] 1-999: Rust formatting check failed. Run 'cargo fmt' to format code (cargo fmt -- --check).

🔇 Additional comments (15)
lib/bindings/python/rust/llm/block_manager/vllm/connector/worker.rs (2)

26-26: Fix rustfmt import ordering

Sandbox formatting check failed (cargo fmt error: "Could not locate working directory: no /proc/self/exe available"); run cargo fmt locally and commit the resulting changes or reorder imports in lib/bindings/python/rust/llm/block_manager/vllm/connector/worker.rs (around line 26) to satisfy rustfmt.


169-176: Format chained builder calls (rustfmt)

Per-edge layout config is correct; format the chained builder calls to satisfy rustfmt/Clippy.
File: lib/bindings/python/rust/llm/block_manager/vllm/connector/worker.rs (lines 169–176)
Run locally: cargo fmt && cargo clippy -q --no-deps

lib/llm/src/block_manager/block/transfer/nixl.rs (1)

1-999: Run cargo fmt (repo-wide)

Formatter failures are blocking CI on this file; run cargo fmt --all and commit the changes. Automated verification failed in the sandbox with: "Could not locate working directory.: no /proc/self/exe available" — run locally or in CI and re-request verification.

lib/llm/src/block_manager/block/transfer/cuda.rs (1)

1-999: Run cargo fmt (repo-wide)

Formatter failures are blocking CI on this file — run cargo fmt --all and commit the resulting changes. Automated verification here failed (error: "Could not locate working directory: no /proc/self/exe available"); run locally or in CI to confirm formatting is fixed.

lib/bindings/python/rust/llm/block_manager/vllm/connector/trtllm_worker.rs (1)

25-25: Fix rustfmt import ordering — run cargo fmt

rustfmt is failing for the added LayoutType import in lib/bindings/python/rust/llm/block_manager/vllm/connector/trtllm_worker.rs (around line 25). Run cargo fmt locally and commit the formatted changes. (Automated attempt failed with: "Could not locate working directory: no /proc/self/exe available".)

lib/llm/src/block_manager/distributed/transfer.rs (1)

129-151: Run cargo fmt to fix formatting.

The pipeline is failing on rustfmt. Please run cargo fmt and commit the diff.

lib/bindings/python/rust/llm/block_manager/distributed/worker.rs (2)

26-36: Enum mapping LGTM.

PyLayoutType → LayoutType conversion is correct and clear.


1-1: Run cargo fmt.

Rustfmt errors are blocking. Please format this file.

lib/llm/src/block_manager/offload.rs (1)

1404-1408: Nice: index-based verification after bulk onboard.

The revised assertions compare each disk→device pair by index and check contents; this is more robust than counting blocks.

lib/llm/src/block_manager/layout.rs (2)

113-118: Public utils module exposure LGTM.

Making utils public and reusing it across verifiers is consistent with added verification APIs.


558-610: Verification APIs for FullyContiguous look good.

verify_memory_regions, expected_memory_address, and verify_memory_region are clean, use the shared verifier, and provide actionable logs.

lib/llm/src/block_manager/layout/utils.rs (4)

83-104: Bounds checks look good

Indexes validated against config with clear, specific errors. LGTM.


403-447: Nice: cross‑layout compatibility test

Good coverage of FC↔LS size compatibility without touching device memory.


448-476: Incompatible layouts test exercises the right surface

The differing blocks case is a good negative check.


105-477: Formatting check couldn't be executed in sandbox — run rustfmt locally and re-run formatting check

Sandbox error returned: "Unable to proceed. Could not locate working directory.: no /proc/self/exe available. Is /proc mounted?"

Run locally and re-run CI:
cargo fmt && cargo fmt -- --check

Comment on lines 139 to 144

             let mr = self.layout.memory_region(self.block_idx, 0, 0)?;
             let offset = mr.addr();
-            let size = mr.size() * self.num_layers();
+            let size = mr.size() * self.num_layers() * self.num_outer_dims();
             let storage_type = mr.storage_type();
             unsafe { view::BlockViewMut::new(self, offset, size, storage_type) }
         } else {

⚠️ Potential issue

Repeat overflow checks in mutable block view path

Same computation and risk as above; mirror the overflow-safe math.

Apply this diff:

-            let size = mr.size() * self.num_layers() * self.num_outer_dims();
+            let per_region = mr.size();
+            let size = per_region
+                .checked_mul(self.num_layers())
+                .and_then(|v| v.checked_mul(self.num_outer_dims()))
+                .ok_or_else(|| BlockError::InvalidState("block size overflow".to_string()))?;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-            let mr = self.layout.memory_region(self.block_idx, 0, 0)?;
-            let offset = mr.addr();
-            let size = mr.size() * self.num_layers() * self.num_outer_dims();
-            let storage_type = mr.storage_type();
-            unsafe { view::BlockViewMut::new(self, offset, size, storage_type) }
-        } else {
+            let mr = self.layout.memory_region(self.block_idx, 0, 0)?;
+            let offset = mr.addr();
+            let per_region = mr.size();
+            let size = per_region
+                .checked_mul(self.num_layers())
+                .and_then(|v| v.checked_mul(self.num_outer_dims()))
+                .ok_or_else(|| BlockError::InvalidState("block size overflow".to_string()))?;
+            let storage_type = mr.storage_type();
+            unsafe { view::BlockViewMut::new(self, offset, size, storage_type) }
+        } else {
🤖 Prompt for AI Agents
In lib/llm/src/block_manager/block/data/local.rs around lines 139 to 144, the
mutable block-view branch repeats unchecked arithmetic that can overflow; mirror
the overflow-safe math used in the immutable path by computing size and final
offset with checked operations (checked_mul and checked_add), return an error
(propagate with ?) if any checked operation returns None, and only call unsafe
view::BlockViewMut::new with the validated offset and size. Ensure you reuse the
same error type/creation used by the immutable branch so behavior matches
exactly.

Comment on lines 78 to 115

    } else if !src_data.is_fully_contiguous() && dst_data.is_fully_contiguous() {
        // Special case: non-contiguous source (LayerSeparate) → contiguous destination (FullyContiguous)
        // We need to copy each layer sequentially to create a contiguous layout

        assert_eq!(src_data.num_layers(), dst_data.num_layers());

        // Get the full contiguous destination view
        let mut dst_block_view = dst_data.block_view_mut()?;
        let dst_base_ptr = unsafe { dst_block_view.as_mut_ptr() };

        let mut dst_offset = 0;

        for layer_idx in 0..src_data.num_layers() {
            for outer_idx in 0..src_data.num_outer_dims() {
                // Get the source layer view (this has the correct LayerSeparate layout)
                let src_view = src_data.layer_view(layer_idx, outer_idx)?;
                let layer_size = src_view.size();

                // Copy to sequential position in contiguous destination
                unsafe {
                    memcpy_fn(
                        src_view.as_ptr(),
                        dst_base_ptr.add(dst_offset),
                        layer_size,
                        stream,
                    )?;
                }

                dst_offset += layer_size;
            }
        }

        // Verify we filled the entire destination block
        debug_assert_eq!(dst_offset, dst_block_view.size(),
            "Destination offset mismatch: filled {} bytes but destination has {} bytes",
            dst_offset, dst_block_view.size());

    } else if src_data.is_fully_contiguous() && !dst_data.is_fully_contiguous() {

⚠️ Potential issue

NC→FC: assert outer-dim parity and bound-check per-chunk writes

Add an outer_dim parity assert and ensure dst_offset + layer_size never exceeds dst_block_view.size() before copying.

Apply this diff:

-        assert_eq!(src_data.num_layers(), dst_data.num_layers());
+        assert_eq!(src_data.num_layers(), dst_data.num_layers());
+        assert_eq!(src_data.num_outer_dims(), dst_data.num_outer_dims());
@@
-                // Copy to sequential position in contiguous destination
+                // Copy to sequential position in contiguous destination
+                let next = dst_offset.checked_add(layer_size).ok_or_else(|| {
+                    TransferError::ExecutionError("dst_offset overflow".into())
+                })?;
+                debug_assert!(next <= dst_block_view.size(),
+                    "NC→FC: chunk exceeds destination block: next={} total={}",
+                    next, dst_block_view.size());
                 unsafe {
                     memcpy_fn(
                         src_view.as_ptr(),
-                        dst_base_ptr.add(dst_offset),
+                        dst_base_ptr.add(dst_offset),
                         layer_size,
                         stream,
                     )?;
                 }
-
-                dst_offset += layer_size;
+                dst_offset = next;
📝 Committable suggestion


Suggested change
-    } else if !src_data.is_fully_contiguous() && dst_data.is_fully_contiguous() {
-        // Special case: non-contiguous source (LayerSeparate) → contiguous destination (FullyContiguous)
-        // We need to copy each layer sequentially to create a contiguous layout
-        assert_eq!(src_data.num_layers(), dst_data.num_layers());
-        // Get the full contiguous destination view
-        let mut dst_block_view = dst_data.block_view_mut()?;
-        let dst_base_ptr = unsafe { dst_block_view.as_mut_ptr() };
-        let mut dst_offset = 0;
-        for layer_idx in 0..src_data.num_layers() {
-            for outer_idx in 0..src_data.num_outer_dims() {
-                // Get the source layer view (this has the correct LayerSeparate layout)
-                let src_view = src_data.layer_view(layer_idx, outer_idx)?;
-                let layer_size = src_view.size();
-                // Copy to sequential position in contiguous destination
-                unsafe {
-                    memcpy_fn(
-                        src_view.as_ptr(),
-                        dst_base_ptr.add(dst_offset),
-                        layer_size,
-                        stream,
-                    )?;
-                }
-                dst_offset += layer_size;
-            }
-        }
-        // Verify we filled the entire destination block
-        debug_assert_eq!(dst_offset, dst_block_view.size(),
-            "Destination offset mismatch: filled {} bytes but destination has {} bytes",
-            dst_offset, dst_block_view.size());
-    } else if src_data.is_fully_contiguous() && !dst_data.is_fully_contiguous() {
+    } else if !src_data.is_fully_contiguous() && dst_data.is_fully_contiguous() {
+        // Special case: non-contiguous source (LayerSeparate) → contiguous destination (FullyContiguous)
+        // We need to copy each layer sequentially to create a contiguous layout
+        assert_eq!(src_data.num_layers(), dst_data.num_layers());
+        assert_eq!(src_data.num_outer_dims(), dst_data.num_outer_dims());
+        // Get the full contiguous destination view
+        let mut dst_block_view = dst_data.block_view_mut()?;
+        let dst_base_ptr = unsafe { dst_block_view.as_mut_ptr() };
+        let mut dst_offset = 0;
+        for layer_idx in 0..src_data.num_layers() {
+            for outer_idx in 0..src_data.num_outer_dims() {
+                // Get the source layer view (this has the correct LayerSeparate layout)
+                let src_view = src_data.layer_view(layer_idx, outer_idx)?;
+                let layer_size = src_view.size();
+                // Copy to sequential position in contiguous destination
+                let next = dst_offset.checked_add(layer_size).ok_or_else(|| {
+                    TransferError::ExecutionError("dst_offset overflow".into())
+                })?;
+                debug_assert!(next <= dst_block_view.size(),
+                    "NC→FC: chunk exceeds destination block: next={} total={}",
+                    next, dst_block_view.size());
+                unsafe {
+                    memcpy_fn(
+                        src_view.as_ptr(),
+                        dst_base_ptr.add(dst_offset),
+                        layer_size,
+                        stream,
+                    )?;
+                }
+                dst_offset = next;
+            }
+        }
+        // Verify we filled the entire destination block
+        debug_assert_eq!(dst_offset, dst_block_view.size(),
+            "Destination offset mismatch: filled {} bytes but destination has {} bytes",
+            dst_offset, dst_block_view.size());
+    } else if src_data.is_fully_contiguous() && !dst_data.is_fully_contiguous() {
🤖 Prompt for AI Agents
In lib/llm/src/block_manager/block/transfer/cuda.rs around lines 78 to 115, add
an outer-dim parity assertion and a per-chunk bounds check before each copy:
assert that src_data.num_outer_dims() is compatible with
dst_data.num_outer_dims() (e.g. src_data.num_outer_dims() %
dst_data.num_outer_dims() == 0 or the stronger equality if required by your
layout invariants) and, before calling memcpy_fn, ensure dst_offset + layer_size
<= dst_block_view.size() and return an error or panic if it would overflow; keep
the existing dst_offset increment and preserve the final debug_assert_eq
verifying the full fill.

Comment on lines 115 to 151

    } else if src_data.is_fully_contiguous() && !dst_data.is_fully_contiguous() {
        // Special case: contiguous source (FullyContiguous) → non-contiguous destination (LayerSeparate)
        // We need to map the contiguous source data to the layered destination structure

        assert_eq!(src_data.num_layers(), dst_data.num_layers());

        // Get the full contiguous source view
        let src_block_view = src_data.block_view()?;
        let src_base_ptr = unsafe { src_block_view.as_ptr() };

        let mut src_offset = 0;

        for layer_idx in 0..src_data.num_layers() {
            for outer_idx in 0..src_data.num_outer_dims() {
                // Get the destination layer view (this has the correct LayerSeparate layout)
                let mut dst_view = dst_data.layer_view_mut(layer_idx, outer_idx)?;
                let layer_size = dst_view.size();

                // Copy from sequential position in contiguous source
                unsafe {
                    memcpy_fn(
                        src_base_ptr.add(src_offset),
                        dst_view.as_mut_ptr(),
                        layer_size,
                        stream,
                    )?;
                }

                src_offset += layer_size;
            }
        }

        // Verify we consumed the entire source block
        debug_assert_eq!(src_offset, src_block_view.size(),
            "Source offset mismatch: consumed {} bytes but source has {} bytes",
            src_offset, src_block_view.size());


⚠️ Potential issue

FC→NC: assert outer-dim parity and bound-check per-chunk reads

Same safety concerns in the reverse path; guard src_offset growth before memcpy.

Apply this diff:

-        assert_eq!(src_data.num_layers(), dst_data.num_layers());
+        assert_eq!(src_data.num_layers(), dst_data.num_layers());
+        assert_eq!(src_data.num_outer_dims(), dst_data.num_outer_dims());
@@
-                // Copy from sequential position in contiguous source
+                // Copy from sequential position in contiguous source
+                let next = src_offset.checked_add(layer_size).ok_or_else(|| {
+                    TransferError::ExecutionError("src_offset overflow".into())
+                })?;
+                debug_assert!(next <= src_block_view.size(),
+                    "FC→NC: chunk exceeds source block: next={} total={}",
+                    next, src_block_view.size());
                 unsafe {
                     memcpy_fn(
-                        src_base_ptr.add(src_offset),
+                        src_base_ptr.add(src_offset),
                         dst_view.as_mut_ptr(),
                         layer_size,
                         stream,
                     )?;
                 }
-
-                src_offset += layer_size;
+                src_offset = next;
📝 Committable suggestion


Suggested change
-    } else if src_data.is_fully_contiguous() && !dst_data.is_fully_contiguous() {
-        // Special case: contiguous source (FullyContiguous) → non-contiguous destination (LayerSeparate)
-        // We need to map the contiguous source data to the layered destination structure
-        assert_eq!(src_data.num_layers(), dst_data.num_layers());
-        // Get the full contiguous source view
-        let src_block_view = src_data.block_view()?;
-        let src_base_ptr = unsafe { src_block_view.as_ptr() };
-        let mut src_offset = 0;
-        for layer_idx in 0..src_data.num_layers() {
-            for outer_idx in 0..src_data.num_outer_dims() {
-                // Get the destination layer view (this has the correct LayerSeparate layout)
-                let mut dst_view = dst_data.layer_view_mut(layer_idx, outer_idx)?;
-                let layer_size = dst_view.size();
-                // Copy from sequential position in contiguous source
-                unsafe {
-                    memcpy_fn(
-                        src_base_ptr.add(src_offset),
-                        dst_view.as_mut_ptr(),
-                        layer_size,
-                        stream,
-                    )?;
-                }
-                src_offset += layer_size;
-            }
-        }
-        // Verify we consumed the entire source block
-        debug_assert_eq!(src_offset, src_block_view.size(),
-            "Source offset mismatch: consumed {} bytes but source has {} bytes",
-            src_offset, src_block_view.size());
-    } else if src_data.is_fully_contiguous() && !dst_data.is_fully_contiguous() {
+    } else if src_data.is_fully_contiguous() && !dst_data.is_fully_contiguous() {
+        // Special case: contiguous source (FullyContiguous) → non-contiguous destination (LayerSeparate)
+        // We need to map the contiguous source data to the layered destination structure
+        assert_eq!(src_data.num_layers(), dst_data.num_layers());
+        assert_eq!(src_data.num_outer_dims(), dst_data.num_outer_dims());
+        // Get the full contiguous source view
+        let src_block_view = src_data.block_view()?;
+        let src_base_ptr = unsafe { src_block_view.as_ptr() };
+        let mut src_offset = 0;
+        for layer_idx in 0..src_data.num_layers() {
+            for outer_idx in 0..src_data.num_outer_dims() {
+                // Get the destination layer view (this has the correct LayerSeparate layout)
+                let mut dst_view = dst_data.layer_view_mut(layer_idx, outer_idx)?;
+                let layer_size = dst_view.size();
+                // Copy from sequential position in contiguous source
+                let next = src_offset.checked_add(layer_size).ok_or_else(|| {
+                    TransferError::ExecutionError("src_offset overflow".into())
+                })?;
+                debug_assert!(next <= src_block_view.size(),
+                    "FC→NC: chunk exceeds source block: next={} total={}",
+                    next, src_block_view.size());
+                unsafe {
+                    memcpy_fn(
+                        src_base_ptr.add(src_offset),
+                        dst_view.as_mut_ptr(),
+                        layer_size,
+                        stream,
+                    )?;
+                }
+                src_offset = next;
+            }
+        }
+        // Verify we consumed the entire source block
+        debug_assert_eq!(src_offset, src_block_view.size(),
+            "Source offset mismatch: consumed {} bytes but source has {} bytes",
+            src_offset, src_block_view.size());

Comment on lines 56 to 103

    } else if src_data.is_fully_contiguous() && !dst_data.is_fully_contiguous() {
        // Special case: contiguous source → non-contiguous destination
        // We need to map the contiguous source data to the layered destination structure

        assert_eq!(src_data.num_layers(), dst_data.num_layers());

        // Get the full contiguous source view
        let src_block_view = src_data.block_view()?;
        let src_base_ptr = unsafe { src_block_view.as_ptr() } as usize;
        let src_device_id = src_block_view.device_id();

        let mut src_offset = 0;

        for layer_idx in 0..src_data.num_layers() {
            for outer_idx in 0..src_data.num_outer_dims() {
                // Get the destination layer view (this has the correct layout)
                let mut dst_view = dst_data.layer_view_mut(layer_idx, outer_idx)?;
                let layer_size = dst_view.size();

                // Create source descriptor with calculated offset from contiguous block
                unsafe {
                    src_dl.add_desc(
                        src_base_ptr + src_offset,
                        layer_size,
                        src_device_id,
                    )?;
                }

                // Create destination descriptor
                let dst_desc = dst_view.as_nixl_descriptor_mut();
                unsafe {
                    dst_dl.add_desc(
                        dst_desc.as_ptr() as usize,
                        dst_desc.size(),
                        dst_desc.device_id(),
                    )?;
                }

                debug_assert_eq!(layer_size, dst_desc.size());
                src_offset += layer_size;
            }
        }

        // Verify we consumed the entire source block
        debug_assert_eq!(src_offset, src_block_view.size(),
            "Source offset mismatch: consumed {} bytes but source has {} bytes",
            src_offset, src_block_view.size());


⚠️ Potential issue

FC → non‑contiguous: add outer-dim parity check and guard offset arithmetic

You assert layer parity but not outer_dim parity; also src_offset arithmetic can overflow or exceed the source size before the final debug_assert. Add checks per chunk.

Apply this diff:

-        assert_eq!(src_data.num_layers(), dst_data.num_layers());
+        assert_eq!(src_data.num_layers(), dst_data.num_layers());
+        assert_eq!(src_data.num_outer_dims(), dst_data.num_outer_dims());
@@
-                // Create source descriptor with calculated offset from contiguous block
+                // Create source descriptor with calculated offset from contiguous block
+                let next = src_offset.checked_add(layer_size).ok_or_else(|| {
+                    anyhow::anyhow!("src_offset overflow while building NIXL xfer list")
+                })?;
+                debug_assert!(next <= src_block_view.size(),
+                    "FC→NC: chunk exceeds source block: next={} total={}",
+                    next, src_block_view.size());
                 unsafe {
                     src_dl.add_desc(
-                        src_base_ptr + src_offset,
+                        src_base_ptr + src_offset,
                         layer_size,
                         src_device_id,
                     )?;
                 }
@@
-                let dst_desc = dst_view.as_nixl_descriptor_mut();
+                let dst_desc = dst_view.as_nixl_descriptor_mut();
                 unsafe {
                     dst_dl.add_desc(
                         dst_desc.as_ptr() as usize,
-                        dst_desc.size(),
+                        layer_size,
                         dst_desc.device_id(),
                     )?;
                 }
@@
-                src_offset += layer_size;
+                src_offset = next;
📝 Committable suggestion


Suggested change
-    } else if src_data.is_fully_contiguous() && !dst_data.is_fully_contiguous() {
-        // Special case: contiguous source → non-contiguous destination
-        // We need to map the contiguous source data to the layered destination structure
-        assert_eq!(src_data.num_layers(), dst_data.num_layers());
-        // Get the full contiguous source view
-        let src_block_view = src_data.block_view()?;
-        let src_base_ptr = unsafe { src_block_view.as_ptr() } as usize;
-        let src_device_id = src_block_view.device_id();
-        let mut src_offset = 0;
-        for layer_idx in 0..src_data.num_layers() {
-            for outer_idx in 0..src_data.num_outer_dims() {
-                // Get the destination layer view (this has the correct layout)
-                let mut dst_view = dst_data.layer_view_mut(layer_idx, outer_idx)?;
-                let layer_size = dst_view.size();
-                // Create source descriptor with calculated offset from contiguous block
-                unsafe {
-                    src_dl.add_desc(
-                        src_base_ptr + src_offset,
-                        layer_size,
-                        src_device_id,
-                    )?;
-                }
-                // Create destination descriptor
-                let dst_desc = dst_view.as_nixl_descriptor_mut();
-                unsafe {
-                    dst_dl.add_desc(
-                        dst_desc.as_ptr() as usize,
-                        dst_desc.size(),
-                        dst_desc.device_id(),
-                    )?;
-                }
-                debug_assert_eq!(layer_size, dst_desc.size());
-                src_offset += layer_size;
-            }
-        }
-        // Verify we consumed the entire source block
-        debug_assert_eq!(src_offset, src_block_view.size(),
-            "Source offset mismatch: consumed {} bytes but source has {} bytes",
-            src_offset, src_block_view.size());
+    } else if src_data.is_fully_contiguous() && !dst_data.is_fully_contiguous() {
+        // Special case: contiguous source → non-contiguous destination
+        // We need to map the contiguous source data to the layered destination structure
+        assert_eq!(src_data.num_layers(), dst_data.num_layers());
+        assert_eq!(src_data.num_outer_dims(), dst_data.num_outer_dims());
+        // Get the full contiguous source view
+        let src_block_view = src_data.block_view()?;
+        let src_base_ptr = unsafe { src_block_view.as_ptr() } as usize;
+        let src_device_id = src_block_view.device_id();
+        let mut src_offset = 0;
+        for layer_idx in 0..src_data.num_layers() {
+            for outer_idx in 0..src_data.num_outer_dims() {
+                // Get the destination layer view (this has the correct layout)
+                let mut dst_view = dst_data.layer_view_mut(layer_idx, outer_idx)?;
+                let layer_size = dst_view.size();
+                // Create source descriptor with calculated offset from contiguous block
+                let next = src_offset.checked_add(layer_size).ok_or_else(|| {
+                    anyhow::anyhow!("src_offset overflow while building NIXL xfer list")
+                })?;
+                debug_assert!(next <= src_block_view.size(),
+                    "FC→NC: chunk exceeds source block: next={} total={}",
+                    next, src_block_view.size());
+                unsafe {
+                    src_dl.add_desc(
+                        src_base_ptr + src_offset,
+                        layer_size,
+                        src_device_id,
+                    )?;
+                }
+                // Create destination descriptor
+                let dst_desc = dst_view.as_nixl_descriptor_mut();
+                unsafe {
+                    dst_dl.add_desc(
+                        dst_desc.as_ptr() as usize,
+                        layer_size,
+                        dst_desc.device_id(),
+                    )?;
+                }
+                debug_assert_eq!(layer_size, dst_desc.size());
+                src_offset = next;
+            }
+        }
+        // Verify we consumed the entire source block
+        debug_assert_eq!(src_offset, src_block_view.size(),
+            "Source offset mismatch: consumed {} bytes but source has {} bytes",
+            src_offset, src_block_view.size());

Comment on lines +112 to 141

                // Handle potential size mismatches between layouts
                let src_size = src_view.size();
                let dst_size = dst_view.size();
                let copy_size = std::cmp::min(src_size, dst_size);

                // Log a warning if sizes don't match (this indicates a layout issue)
                if src_size != dst_size {
                    tracing::warn!(
                        "Size mismatch in NIXL layer copy: src_size={}, dst_size={}, using copy_size={}. \
                         This may indicate a layout configuration issue.",
                        src_size, dst_size, copy_size
                    );
                }

                let src_desc = src_view.as_nixl_descriptor();
                let dst_desc = dst_view.as_nixl_descriptor_mut();

                unsafe {
                    src_dl.add_desc(
                        src_desc.as_ptr() as usize,
-                       src_desc.size(),
+                       copy_size, // Use the safe copy size
                        src_desc.device_id(),
                    )?;

                    dst_dl.add_desc(
                        dst_desc.as_ptr() as usize,
-                       dst_desc.size(),
+                       copy_size, // Use the safe copy size
                        dst_desc.device_id(),
                    )?;
                }

⚠️ Potential issue

Do not silently truncate on size mismatch; fail fast instead

Using min(src_size, dst_size) risks silent data loss/corruption. Unless explicitly feature-gated, this should error out with actionable context.

Apply this diff:

-                // Handle potential size mismatches between layouts
-                let src_size = src_view.size();
-                let dst_size = dst_view.size();
-                let copy_size = std::cmp::min(src_size, dst_size);
-
-                // Log a warning if sizes don't match (this indicates a layout issue)
-                if src_size != dst_size {
-                    tracing::warn!(
-                        "Size mismatch in NIXL layer copy: src_size={}, dst_size={}, using copy_size={}. \
-                         This may indicate a layout configuration issue.",
-                        src_size, dst_size, copy_size
-                    );
-                }
-
-                let src_desc = src_view.as_nixl_descriptor();
-                let dst_desc = dst_view.as_nixl_descriptor_mut();
-
-                unsafe {
-                    src_dl.add_desc(
-                        src_desc.as_ptr() as usize,
-                        copy_size,  // Use the safe copy size
-                        src_desc.device_id(),
-                    )?;
-
-                    dst_dl.add_desc(
-                        dst_desc.as_ptr() as usize,
-                        copy_size,  // Use the safe copy size
-                        dst_desc.device_id(),
-                    )?;
-                }
+                let src_desc = src_view.as_nixl_descriptor();
+                let dst_desc = dst_view.as_nixl_descriptor_mut();
+                if src_desc.size() != dst_desc.size() {
+                    anyhow::bail!(
+                        "NIXL layer size mismatch at layer={}, outer={}: src={} dst={}",
+                        layer_idx,
+                        outer_idx,
+                        src_desc.size(),
+                        dst_desc.size()
+                    );
+                }
+                unsafe {
+                    src_dl.add_desc(
+                        src_desc.as_ptr() as usize,
+                        src_desc.size(),
+                        src_desc.device_id(),
+                    )?;
+                    dst_dl.add_desc(
+                        dst_desc.as_ptr() as usize,
+                        dst_desc.size(),
+                        dst_desc.device_id(),
+                    )?;
+                }
📝 Committable suggestion


Suggested change
-                // Handle potential size mismatches between layouts
-                let src_size = src_view.size();
-                let dst_size = dst_view.size();
-                let copy_size = std::cmp::min(src_size, dst_size);
-                // Log a warning if sizes don't match (this indicates a layout issue)
-                if src_size != dst_size {
-                    tracing::warn!(
-                        "Size mismatch in NIXL layer copy: src_size={}, dst_size={}, using copy_size={}. \
-                         This may indicate a layout configuration issue.",
-                        src_size, dst_size, copy_size
-                    );
-                }
-                let src_desc = src_view.as_nixl_descriptor();
-                let dst_desc = dst_view.as_nixl_descriptor_mut();
-                unsafe {
-                    src_dl.add_desc(
-                        src_desc.as_ptr() as usize,
-                        copy_size, // Use the safe copy size
-                        src_desc.device_id(),
-                    )?;
-                    dst_dl.add_desc(
-                        dst_desc.as_ptr() as usize,
-                        copy_size, // Use the safe copy size
-                        dst_desc.device_id(),
-                    )?;
-                }
+                let src_desc = src_view.as_nixl_descriptor();
+                let dst_desc = dst_view.as_nixl_descriptor_mut();
+                if src_desc.size() != dst_desc.size() {
+                    anyhow::bail!(
+                        "NIXL layer size mismatch at layer={}, outer={}: src={} dst={}",
+                        layer_idx,
+                        outer_idx,
+                        src_desc.size(),
+                        dst_desc.size()
+                    );
+                }
+                unsafe {
+                    src_dl.add_desc(
+                        src_desc.as_ptr() as usize,
+                        src_desc.size(),
+                        src_desc.device_id(),
+                    )?;
+                    dst_dl.add_desc(
+                        dst_desc.as_ptr() as usize,
+                        dst_desc.size(),
+                        dst_desc.device_id(),
+                    )?;
+                }
🤖 Prompt for AI Agents
In lib/llm/src/block_manager/block/transfer/nixl.rs around lines 112 to 141, the
code currently silently truncates on layout size mismatches by using
min(src_size, dst_size) and only logging a warning; change this to fail fast:
check if src_size != dst_size and return an Err with actionable context (include
src_size, dst_size, layer identifiers if available) instead of proceeding, and
only call add_desc with the full expected size when they match; preserve the
unsafe block but ensure you propagate a clear error before any unsafe operations
when sizes differ.

Comment on lines +225 to +236
Ok(RegionVerificationResult {
    block_idx,
    layer_idx,
    outer_idx,
    expected_addr,
    actual_addr: region.addr,
    expected_size,
    actual_size: region.size,
    addr_matches: true, // Always true since we accept the layout's calculation
    size_matches: expected_size == region.size,
    checksum,
})

⚠️ Potential issue

addr_matches is always true — misleading verification signal

expected_addr is set to region.addr, so addr_matches can never fail. Either compute an independent expected address or mark address verification as “not performed.”

If you can’t compute expected addresses, change the API to express tri‑state:

-        /// Whether the addresses match
-        pub addr_matches: bool,
+        /// Address verification: None when not performed; Some(true/false) when verified
+        pub addr_matches: Option<bool>,

And set:

-                addr_matches: true, // Always true since we accept the layout's calculation
+                addr_matches: None, // Address verification not performed

Update stats/reporting accordingly to ignore None.
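For example, a minimal sketch of the downstream handling, with a simplified stand-in for RegionVerificationResult (field names mirror the comment above, not the full API):

struct RegionResult {
    /// None = verification not performed; Some(b) = performed with outcome b.
    addr_matches: Option<bool>,
    size_matches: bool,
}

/// Count only real address failures; None entries are ignored.
fn addr_failures(results: &[RegionResult]) -> usize {
    results
        .iter()
        .filter(|r| r.addr_matches == Some(false))
        .count()
}

fn main() {
    let results = vec![
        RegionResult { addr_matches: None, size_matches: true },
        RegionResult { addr_matches: Some(false), size_matches: true },
    ];
    assert_eq!(addr_failures(&results), 1); // the None entry is not counted as a failure
}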


Comment on lines +242 to +253
fn calculate_memory_checksum(&self, region: &LocalMemoryRegion) -> Result<u64, LayoutError> {
    unsafe {
        let slice = std::slice::from_raw_parts(region.addr as *const u8, region.size);
        let mut checksum = 0u64;

        for (i, &byte) in slice.iter().enumerate() {
            checksum = checksum.wrapping_add((byte as u64).wrapping_mul(i as u64 + 1));
        }

        Ok(checksum)
    }
}

⚠️ Potential issue

Unsafe read can touch device memory and crash

from_raw_parts(region.addr as *const u8, region.size) assumes host‑accessible memory. On CUDA/NIXL, this can segfault or corrupt. Guard and/or route through a safe storage API.

Minimal hardening:

         fn calculate_memory_checksum(&self, region: &LocalMemoryRegion) -> Result<u64, LayoutError> {
-            unsafe {
+            if region.size == 0 {
+                return Ok(0);
+            }
+            if region.addr == 0 {
+                return Err(LayoutError::InvalidConfig("region.addr is null; cannot checksum".into()));
+            }
+            // TODO: gate on host accessibility once exposed by storage/layout.
+            unsafe {
                 let slice = std::slice::from_raw_parts(region.addr as *const u8, region.size);
                 let mut checksum = 0u64;
                 for (i, &byte) in slice.iter().enumerate() {
                     checksum = checksum.wrapping_add((byte as u64).wrapping_mul(i as u64 + 1));
                 }
                 Ok(checksum)
             }
         }

Longer‑term: expose a host‑access check or read API on Storage/GenericBlockLayout and avoid raw pointer reads here.
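As a sketch of that longer-term shape (the enum and types below are illustrative stand-ins, not the crate's actual Storage API; the checksum mirrors the position-weighted formula above):

enum MemKind {
    Host,
    Device,
}

struct Region {
    kind: MemKind,
    bytes: Vec<u8>, // stands in for host-visible contents
}

/// Checksum only host-accessible regions; refuse device memory outright.
fn checksum_if_host(region: &Region) -> Result<u64, String> {
    match region.kind {
        MemKind::Device => Err("region is device memory; stage it to host first".into()),
        MemKind::Host => Ok(region.bytes.iter().enumerate().fold(0u64, |acc, (i, &b)| {
            acc.wrapping_add((b as u64).wrapping_mul(i as u64 + 1))
        })),
    }
}

fn main() {
    let host = Region { kind: MemKind::Host, bytes: vec![1, 2, 3] };
    assert_eq!(checksum_if_host(&host).unwrap(), 14); // 1*1 + 2*2 + 3*3
    let dev = Region { kind: MemKind::Device, bytes: vec![] };
    assert!(checksum_if_host(&dev).is_err());
}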


Comment on lines +1431 to +1441
// GDS requires 4KB alignment for optimal performance
const GDS_ALIGNMENT: usize = 4096;

let (offload_manager, device_pool, host_pool, disk_pool) = build_pools_with_layout(
    4,
    Some(4),
    Some(4),
    Some(GDS_ALIGNMENT), // Use GDS-friendly alignment
    layout_type,
    BlockRegistrationDuplicationSetting::Disabled,
)?;

⚠️ Potential issue

GDS alignment argument is being passed to inner_dim; the tests never set LayoutConfig.alignment

Confirmed: build_pools_with_layout's 4th parameter is inner_dim: Option<usize> and LayoutConfig.alignment is hardcoded to 1, so passing Some(GDS_ALIGNMENT) sets inner_dim instead of alignment. Location: lib/llm/src/block_manager/offload.rs (call at ~1434–1441; function/signature around ~687–709).

Fix options:

  • Option A (minimal): add an overload build_pools_with_layout_and_alignment(..., alignment: Option<usize>) and call it from the GDS tests.
  • Option B (preferred): extend build_pools_with_layout to accept alignment: Option<usize> and plumb it into LayoutConfig.

Example (Option B):

-    pub fn build_pools_with_layout(
+    pub fn build_pools_with_layout(
         device_blocks: usize,
         host_blocks: Option<usize>,
         disk_blocks: Option<usize>,
-        inner_dim: Option<usize>,
+        inner_dim: Option<usize>,
+        alignment: Option<usize>,
         layout_type: LayoutType,
         duplication_setting: BlockRegistrationDuplicationSetting,
     ) -> Result<(... )> {
         let mut config = LayoutConfig {
             ...
-            alignment: 1,
+            alignment: alignment.unwrap_or(1),
             dtype_width_bytes: 2,
         };
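A self-contained sketch of the Option B plumbing, with LayoutConfig trimmed to the two relevant fields for illustration:

struct LayoutConfig {
    alignment: usize,
    dtype_width_bytes: usize,
}

fn make_config(alignment: Option<usize>) -> LayoutConfig {
    LayoutConfig {
        alignment: alignment.unwrap_or(1), // default stays 1; GDS tests pass Some(4096)
        dtype_width_bytes: 2,
    }
}

fn main() {
    const GDS_ALIGNMENT: usize = 4096;
    let cfg = make_config(Some(GDS_ALIGNMENT));
    assert_eq!(cfg.alignment % GDS_ALIGNMENT, 0); // now actually exercises 4KB alignment
    assert_eq!(cfg.dtype_width_bytes, 2);
}

With that in place, the test above would pass None for inner_dim and Some(GDS_ALIGNMENT) for alignment.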


Comment on lines +1568 to +1576
let expected_size = BLOCK_SIZE * NUM_LAYERS * 2 * 13 * 4; // From test constants
assert!(metadata.len() >= expected_size as u64,
    "Disk file size {} is smaller than expected {}",
    metadata.len(), expected_size);

// Verify file is properly aligned for GDS operations
assert_eq!(metadata.len() % 4096, 0,
    "Disk file size {} is not 4KB aligned for GDS", metadata.len());
}

⚠️ Potential issue

Fix expected_size computation; dtype width and layout dims are hardcoded.

expected_size uses magic numbers (2 * 13 * 4) that don’t match the LayoutConfig used here (dtype_width_bytes = 2, outer_dim = 1 in many setups). Compute from the block’s actual view or layout config to avoid flakiness:

-                    let expected_size = BLOCK_SIZE * NUM_LAYERS * 2 * 13 * 4; // From test constants
+                    let bd = block_data;
+                    let region_size = bd.page_size() * bd.inner_dim() * 2; // dtype_width_bytes
+                    let expected_size = NUM_LAYERS * bd.outer_dim() * region_size;
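To make the arithmetic concrete, a standalone version of that formula (numbers are illustrative; real values come from the block's layout):

fn expected_file_size(
    num_layers: usize,
    outer_dim: usize,
    page_size: usize,
    inner_dim: usize,
    dtype_width_bytes: usize,
) -> usize {
    let region_size = page_size * inner_dim * dtype_width_bytes;
    num_layers * outer_dim * region_size
}

fn main() {
    // e.g. 4 layers, outer_dim 2, page_size 4, inner_dim 13, 2-byte dtype
    assert_eq!(expected_file_size(4, 2, 4, 13, 2), 832);
}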

Comment on lines +1780 to +1836
/// Verify layer-specific patterns are preserved across transfers
fn verify_layer_patterns<S1, L1, M1, S2, L2, M2>(
    source_block: &ImmutableBlock<S1, L1, M1>,
    dest_block: &ImmutableBlock<S2, L2, M2>,
) -> Result<()>
where
    S1: Storage, L1: LocalityProvider, M1: BlockMetadata,
    S2: Storage, L2: LocalityProvider, M2: BlockMetadata,
    ImmutableBlock<S1, L1, M1>: BlockDataProvider,
    ImmutableBlock<S2, L2, M2>: BlockDataProvider,
{
    let src_data = source_block.block_data();
    let dst_data = dest_block.block_data();

    assert_eq!(src_data.num_layers(), dst_data.num_layers());

    for layer_idx in 0..src_data.num_layers() {
        for outer_idx in 0..2 { // Assuming max 2 outer dimensions
            if let (Ok(src_layer), Ok(dst_layer)) = (
                src_data.layer_view(layer_idx, outer_idx),
                dst_data.layer_view(layer_idx, outer_idx),
            ) {
                assert_eq!(src_layer.size(), dst_layer.size());

                let expected_pattern = 0x10 + layer_idx as u8 + outer_idx as u8;

                unsafe {
                    let src_ptr = src_layer.as_ptr();
                    let dst_ptr = dst_layer.as_ptr();
                    let src_size = src_layer.size();
                    let dst_size = dst_layer.size();

                    // Safety checks
                    if src_ptr.is_null() || dst_ptr.is_null() {
                        return Err(anyhow::anyhow!("Layer view returned null pointer"));
                    }
                    if src_size == 0 || dst_size == 0 {
                        continue; // Skip empty layers
                    }

                    let src_slice = std::slice::from_raw_parts(src_ptr, src_size);
                    let dst_slice = std::slice::from_raw_parts(dst_ptr, dst_size);

                    // Verify source has expected pattern
                    assert!(src_slice.iter().all(|&b| b == expected_pattern),
                        "Source layer {} outer {} has incorrect pattern", layer_idx, outer_idx);

                    // Verify destination matches source
                    assert!(dst_slice.iter().all(|&b| b == expected_pattern),
                        "Destination layer {} outer {} has incorrect pattern", layer_idx, outer_idx);
                }
            }
        }
    }

    Ok(())
}

⚠️ Potential issue

Avoid unsafe CPU reads from device memory in helpers.

verify_layer_patterns creates CPU slices from raw pointers without staging device memory; this will break for StorageType::Device. Reuse check_block_contents (which already handles device vs host vs disk) for integrity, or add device→host copies.

  • Minimal: make this helper operate only on host/disk blocks; return Ok(()) or skip when device storage is detected.
  • Better: copy device layers to host before inspection, mirroring get_block_contents.

This will prevent spurious crashes in mixed-layout/device tests.
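A minimal sketch of the first option, guarding before any raw read (the enum is a local stand-in for the crate's storage classification, not the real StorageType):

enum StorageKind {
    Host,
    Disk,
    Device,
}

/// Skip pattern checks for device-backed blocks instead of dereferencing
/// device pointers on the CPU; a fuller version would stage device memory
/// to a host buffer first, as get_block_contents does.
fn verify_or_skip(kind: StorageKind, bytes: &[u8], expected: u8) -> anyhow::Result<()> {
    match kind {
        StorageKind::Device => Ok(()), // not CPU-addressable here; skip
        StorageKind::Host | StorageKind::Disk => {
            anyhow::ensure!(
                bytes.iter().all(|&b| b == expected),
                "pattern mismatch: expected 0x{expected:02x}"
            );
            Ok(())
        }
    }
}

fn main() -> anyhow::Result<()> {
    verify_or_skip(StorageKind::Host, &[0x11, 0x11], 0x11)?;
    verify_or_skip(StorageKind::Device, &[], 0x11)?; // skipped, not crashed
    Ok(())
}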

Signed-off-by: Olga Andreeva <[email protected]>
@oandreeva-nv oandreeva-nv changed the title Transition to FullyContiguous Host and Disk layouts feature: Transition to FullyContiguous Host and Disk layouts Sep 17, 2025
@oandreeva-nv oandreeva-nv changed the title feature: Transition to FullyContiguous Host and Disk layouts feat: Transition to FullyContiguous Host and Disk layouts Sep 17, 2025
@github-actions github-actions bot added the feat label Sep 17, 2025
// Handle potential size mismatches between layouts
let src_size = src_view.size();
let dst_size = dst_view.size();
let copy_size = std::cmp::min(src_size, dst_size);

Why don't we let it panic or return an error in this case? If this happens, it feels like there's a high chance the model will start generating garbage.


I actually thought this would be the case involving G1, where one side is contiguous and the other is not, so it may actually be expected; that's why I left my comment below :-)

Comment on lines +70 to +74
tracing::warn!(
    "Size mismatch in NIXL layer copy: src_size={}, dst_size={}, using copy_size={}. \
     This may indicate a layout configuration issue.",
    src_size, dst_size, copy_size
);

Would this warn message be printed a lot, once per each onboarding and offloading involving G1?

Also, is it really indicating a layout config issue, or is it expected?

}

/// Verify block data integrity after transfer
pub fn verify_transfer_integrity<Source, Target>(

Where is verify_transfer_integrity actually used?

LayoutType::FullyContiguous => {
    let num_layers = shape[1];
    let outer_dim = shape[2];
    let inner_dim = shape[3..].iter().product::<usize>() / config.page_size;

Just a comment: it's interesting that all the remaining dimensions are grouped into inner_dim.
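A worked example of that grouping, with an illustrative shape [num_blocks, num_layers, outer_dim, page, heads, head_dim]:

fn main() {
    let shape = [16usize, 4, 2, 8, 13, 4];
    let page_size = 8usize;

    let num_layers = shape[1];
    let outer_dim = shape[2];
    // Everything after outer_dim collapses into inner_dim, per page:
    let inner_dim = shape[3..].iter().product::<usize>() / page_size;

    assert_eq!((num_layers, outer_dim, inner_dim), (4, 2, 52)); // 8 * 13 * 4 / 8 = 52
}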

Comment on lines +187 to +191
let (outer_contiguous, outer_dim) = if shape[0] >= config.num_device_blocks {
    (false, shape[1])
} else if shape[1] >= config.num_device_blocks {
    (true, shape[0])
} else {

Could you explain why the branching logic here is structured this way, around shape[0]?
