docs: Add column overwrite example to batch mapping guide #7737

Sanjaykumar030 · 2025-08-13T14:20:19Z

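This PR adds a complementary example to the batch mapping guide showing the column-overwriting pattern: the mapped function returns an existing column name, so that column is replaced directly. For transformations that only rewrite a column's values, this is more direct than creating a new column and removing the old one.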

Proposed Change

The original remove_columns example remains untouched. Below it, this PR introduces an alternative approach that overwrites an existing column during batch mapping.

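This teaches users a core .map() capability: a batched function can replace a column by returning it under its existing name, without extra intermediate steps. Note that .map() returns a new dataset rather than modifying the original one.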

New Example:

>>> from datasets import Dataset
>>> dataset = Dataset.from_dict({"a": [0, 1, 2]})
>>> # Overwrite "a" directly to duplicate each value
>>> duplicated_dataset = dataset.map(
...     lambda batch: {"a": [x for x in batch["a"] for _ in range(2)]},
...     batched=True
... )
>>> duplicated_dataset
Dataset({
    features: ['a'],
    num_rows: 6
})
>>> duplicated_dataset["a"]
[0, 0, 1, 1, 2, 2]
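
To make the overwrite pattern easier to reason about, here is a minimal sketch in plain Python that does not import datasets. It assumes that in batched mode the mapped function receives a dict-like batch mapping column names to lists, and that returned columns replace input columns of the same name. The helper merge_batch is hypothetical and only models that merge plus an equal-length check; it is not the library's implementation.

```python
def duplicate_rows(batch):
    """Return column "a" with every value repeated twice."""
    return {"a": [x for x in batch["a"] for _ in range(2)]}


def merge_batch(batch, output):
    """Hypothetical helper: a simplified model of combining a batch with the
    mapped function's output. Returned columns overwrite same-named inputs."""
    merged = {**batch, **output}
    lengths = {name: len(values) for name, values in merged.items()}
    # Assumption: all columns in the resulting batch must have the same length.
    if len(set(lengths.values())) > 1:
        raise ValueError(f"Column lengths differ: {lengths}")
    return merged


# Single column: overwriting "a" changes the row count without conflict.
single = merge_batch({"a": [0, 1, 2]}, duplicate_rows({"a": [0, 1, 2]}))
print(single)

# Two columns: "b" keeps 3 rows while "a" now has 6, so the merge is rejected.
try:
    merge_batch({"a": [0, 1, 2], "b": ["x", "y", "z"]},
                duplicate_rows({"a": [0, 1, 2]}))
except ValueError as err:
    print(err)
```

Under these assumptions, overwriting works as-is when every remaining column ends up with the same number of rows. When other columns keep their original length, the existing remove_columns example is likely the better fit.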

Sanjaykumar030 · 2025-08-30T05:50:38Z

Hi @lhoestq, just a gentle follow-up on this PR.

Sanjaykumar030 added 2 commits August 13, 2025 19:38

Update about_map_batch.mdx

86dc6ab

Update about_map_batch.mdx

7018bf9
