feat(rows): exclude heavy binary columns from /rows response to avoid UI lag (#3052) #3209

Status: Open (1 commit, base: main)

ArjunJagdale

Fixes: #3052

This PR improves the responsiveness of the Dataset Viewer by omitting heavy binary columns (e.g., t5_prompt_embeds, vae_latents) from the /rows endpoint payload.

These columns typically contain thousands of bytes per row and are not meaningful in the UI. The change introduces a hardcoded exclusion list (EXCLUDED_COLUMNS) and drops those columns from the final pyarrow.Table before the response is generated.

Tested manually using datasets with large binary columns and confirmed a reduction in payload size and frontend lag.

Future improvements could include:

  • Auto-detecting such columns by dtype or size
  • Allowing dataset creators to opt columns out explicitly via config or metadata

feat(rows): exclude heavy binary columns like t5_prompt_embeds and vae_latents from /rows output

This change introduces a safeguard against rendering performance issues in the Dataset Viewer by skipping certain heavy binary columns (around 5 KB per row, for example) that are not useful for display.

The exclusion list is currently hardcoded and drops columns such as "t5_prompt_embeds" and "vae_latents", which caused UI freezing in datasets like `frutiemax/themoviedb_posters`. The columns are dropped right before response construction in the /rows endpoint.
Successfully merging this pull request may close these issues.

Don't try to include binary cells in the /rows responses