@mgree mgree commented Sep 17, 2025

The range query on operator IDs was being planned as a cross join followed by a filter; by using generate_series to enumerate the IDs in each range, we get a proper equijoin instead.
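To make the difference concrete, here is a minimal sketch in Python rather than SQL. It is not Materialize's planner or the actual query: the table contents and the function names `range_join` and `series_equijoin` are hypothetical, and it models only the matching rows (an inner join), not the LEFT JOIN used in the PR. It contrasts testing every pair against a range predicate with expanding each half-open range into its individual IDs and joining on equality.

```python
from collections import defaultdict

# Hypothetical stand-ins for the two sides of the join:
# each "lowering" row covers the half-open operator-ID range [start, end),
# and each "advice" row is keyed by a single region_id.
lowerings = [("map", 0, 3), ("reduce", 3, 5), ("join", 5, 9)]
advice = [(1, "hint-a"), (4, "hint-b"), (4, "hint-c"), (20, "hint-d")]


def range_join(lowerings, advice):
    """Cross join plus filter: examines every (lowering, advice) pair."""
    out = []
    for name, start, end in lowerings:
        for region_id, hint in advice:
            if start <= region_id < end:
                out.append((name, region_id, hint))
    return sorted(out)


def series_equijoin(lowerings, advice):
    """Expand each range into its IDs, then join on equality via a hash index."""
    by_region = defaultdict(list)
    for region_id, hint in advice:
        by_region[region_id].append(hint)
    out = []
    for name, start, end in lowerings:
        # range(start, end) plays the role of generate_series(start, end - 1),
        # whose upper bound is inclusive.
        for valid_id in range(start, end):
            for hint in by_region.get(valid_id, []):
                out.append((name, valid_id, hint))
    return sorted(out)


if __name__ == "__main__":
    print(range_join(lowerings, advice))
    print(series_equijoin(lowerings, advice))
```

Both functions return the same rows. The second does work proportional to the total number of IDs in the ranges plus the number of matches, instead of the product of the two table sizes, so it pays off when the ranges are small relative to that product.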

Motivation

Checklist

  • This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
  • If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.

@mgree mgree requested a review from a team as a code owner September 17, 2025 19:22
@mgree mgree requested a review from ggevay September 17, 2025 19:22
]);
from.extend(["JOIN mz_introspection.mz_dataflow_global_ids mdgi ON (mlm.global_id = mdgi.global_id)",
"LEFT JOIN mz_introspection.mz_expected_group_size_advice megsa ON (megsa.dataflow_id = mdgi.id AND mlm.operator_id_start <= megsa.region_id AND megsa.region_id < mlm.operator_id_end)"]);
"LEFT JOIN (generate_series((mlm.operator_id_start) :: int8, (mlm.operator_id_end - 1) :: int8) AS valid_id JOIN \

@ggevay ggevay Sep 18, 2025

I'm wondering if this generate_series could be on the left side of the left join. Having it on the right side means that the right side is correlated with the left side, which is a complicated left join lowering case, with its own code path.

@mgree mgree (author)

I started with it on the left side, but I got redundant rows. (There may be a good way to hide those, but my SQL skill wasn't sufficient to the task.) The resulting plan is not small, but there is no cross join...

@mgree mgree (author)

I'm going to merge after a tiny doc update, but I made https://github.com/MaterializeInc/database-issues/issues/9730 to record your idea.

@mgree mgree requested a review from a team as a code owner September 24, 2025 17:41
@mgree mgree enabled auto-merge (squash) September 24, 2025 17:42
@mgree mgree force-pushed the explain-analyze-avoid-crossjoin branch from e222e02 to 095769b Compare September 24, 2025 17:45
@mgree mgree merged commit 5c122ab into MaterializeInc:main Sep 24, 2025
128 checks passed