Update coreml codebook #2648
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2648
Note: Links to docs will display an error until the docs builds have been completed.
❌ 2 New Failures, 1 Pending as of commit cceaab2 with merge base 22f9d31. NEW FAILURES: the following jobs have failed.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
weight_tensor = weight_tensor.dequantize()
return func(indices, weight_tensor, **kwargs)

@implements([aten.detach.default, aten.alias.default])
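For context, the fragment above follows the common TorchAO fallback pattern: when an op has no quantized kernel, dequantize the weight and re-dispatch to the dense op. A minimal, self-contained sketch of that pattern (the `fallback_embedding` helper and `weights` tensor are hypothetical names, not from this PR):

```python
import torch

# Hypothetical sketch of the dequantize-then-dispatch fallback pattern:
# if the weight is quantized, dequantize it, then call the dense op.
def fallback_embedding(func, indices, weight_tensor, **kwargs):
    if weight_tensor.is_quantized:
        weight_tensor = weight_tensor.dequantize()
    return func(indices, weight_tensor, **kwargs)

weights = torch.randn(10, 4)  # stand-in for a (possibly quantized) weight
out = fallback_embedding(torch.nn.functional.embedding, torch.tensor([1, 3]), weights)
```

In the PR itself the dispatch goes through `@implements(...)`-registered overrides rather than a plain function, but the dequantize-then-call shape is the same.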
This I will save for a future PR
@@ -48,43 +49,45 @@ def choose_qparams_and_quantize_codebook_coreml(

    Returns:
        Tuple[torch.Tensor, torch.Tensor]: the codebook (lookup table) Tensor and the quantized Tensor (codes, torch.uint8).
        The LUT has dimension input_tensor.dim() + 2, where:
nit: maybe spell out the dimensions with variables/expressions?
dequant = torch.zeros_like(codes, dtype=output_dtype)
# Compute shape of lookup group indices from codes shape and block size
code_shape = codes.shape
ndim = len(code_shape)
nit: can do codes.ndim
dequant[:, i, :] = codebook[i][codes[:, i, :]].squeeze()
# Compute which codebook slice to use for each element
group_indices = []
for dim, bsz in zip(code_shape, block_size):
nit: dim might be confusing? this seems to be code_size_i and block_size_i
mesh = torch.meshgrid(*group_indices, indexing="ij")
group_index_tensor = torch.stack(mesh, dim=-1)  # shape (..., N), where N = ndim

# Flatten everything to index efficiently
flat_codes = codes.reshape(-1)
flat_groups = group_index_tensor.reshape(-1, ndim)  # (..., ndim)

# Compute dequantized values via indexing
# index into codebook with (*group_index, code_index, :)
gathered = codebook[(*flat_groups.T, flat_codes)]  # shape (numel, vec_dim)
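In the spirit of the reviewer's request for illustrative comments, here is a hedged, self-contained sketch of the meshgrid-and-gather dequantization above on concrete shapes. The sizes (a 4x4 `codes` tensor, 2x2 blocks, 4 LUT entries, `vec_dim` of 1) are assumed for illustration, not taken from the PR:

```python
import torch

# 4x4 codes tensor split into 2x2 blocks => a 2x2 grid of codebook groups,
# each group holding 4 entries of vector length 1 (shapes assumed).
codes = torch.randint(0, 4, (4, 4))
block_size = (2, 2)
codebook = torch.randn(2, 2, 4, 1)  # (groups_d0, groups_d1, 2**nbits, vec_dim)

# Which codebook group each element of `codes` belongs to, per dimension
group_indices = [torch.arange(d) // b for d, b in zip(codes.shape, block_size)]
mesh = torch.meshgrid(*group_indices, indexing="ij")
group_index_tensor = torch.stack(mesh, dim=-1)  # (4, 4, 2)

# Flatten so each element carries (group_d0, group_d1, code)
flat_codes = codes.reshape(-1)                            # (16,)
flat_groups = group_index_tensor.reshape(-1, codes.ndim)  # (16, 2)

# Advanced indexing: look each code up in its own group's table
gathered = codebook[(*flat_groups.T, flat_codes)]  # (16, vec_dim)
dequant = gathered.reshape(4, 4)
```

For example, element (3, 3) falls in block (3 // 2, 3 // 2) = (1, 1), so its value comes from `codebook[1, 1, codes[3, 3]]`.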
nit: can you add comments of some examples for these to make it easier to understand
looks good, just some comments about making it easier to understand
This adds palettization support for embedding/linear layers in CoreML using TorchAO's quantize_ API. Note: this needs to wait for pytorch/ao#2648 to land in ao plus a pin bump in ET before landing.
This PR updates the dequantize_codebook quant primitive to be more compatible with CoreML. More specifically:
- code_dtype is changed to nbits because CoreML cannot process a function that has non-standard dtypes in the signature (e.g., torch.uint3)
- The codebook rank is changed to codes.dim() + 2 to follow the convention here: https://apple.github.io/coremltools/source/coremltools.converters.mil.mil.ops.defs.html#coremltools.converters.mil.mil.ops.defs.iOS18.compression.constexpr_lut_to_dense
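A small sketch of what those two conventions mean in practice. The concrete shapes and blocking below are assumed for illustration only; the point is that an integer `nbits` replaces a sub-byte dtype, and the LUT carries rank `codes.dim() + 2` as in CoreML's `constexpr_lut_to_dense`:

```python
import torch

# 1) Pass nbits (a plain int) instead of a non-standard dtype like torch.uint3,
#    which CoreML cannot accept in a function signature.
nbits = 3
codes = torch.randint(0, 2**nbits, (8, 16), dtype=torch.uint8)

# 2) The LUT has rank codes.dim() + 2: a grid of per-block tables followed by
#    (2**nbits, vec_dim). Here: one group per 4 rows, all columns in one group
#    (blocking assumed for the example).
lut = torch.randn(8 // 4, 1, 2**nbits, 1)
assert lut.dim() == codes.dim() + 2  # rank-4 LUT for rank-2 codes
```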