@wlwilliamx
Collaborator

What problem does this PR solve?

Issue Number: close #2219

What is changed and how it works?

Currently, when TiCDC outputs DDL events for partitioned tables to MQ downstreams (using open protocol or simple protocol), the event message contains the logical table ID. This is incorrect, as downstream consumers expect the physical table ID (i.e., the partition ID) to correctly identify and process data for a specific partition.

The root cause of this issue is that common.TableInfo objects, which are intended to represent physical partitions in these DDL handlers, were being created with the logical table's ID (info.ID) instead of their own physical partition ID.

This PR fixes this by ensuring that all TableInfo instances representing a physical partition are created with the correct physical table ID.

Changes

This PR introduces a new wrapper function for TableInfo creation and updates all partition-related DDL handlers to use it correctly.

1. pkg/common/table_info.go

  • Introduced WrapTableInfoWithTableID:
    • A new function WrapTableInfoWithTableID(schemaName string, info *model.TableInfo, tableID int64) *TableInfo has been added.
    • This function creates a common.TableInfo wrapper but explicitly passes the provided tableID parameter to NewTableInfo. This allows us to override the ID, using the physical partition ID instead of the logical table ID (info.ID).
  • Updated WrapTableInfo:
    • The existing WrapTableInfo function is now a convenience wrapper that calls WrapTableInfoWithTableID(schemaName, info, info.ID). Its behavior is unchanged: it still uses the logical table ID, which remains correct for callers that are not partition-specific.
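
The relationship between the two wrappers can be sketched as follows. This is a minimal illustration, not the code in pkg/common/table_info.go: ModelTableInfo and TableInfo are simplified, hypothetical stand-ins for model.TableInfo and common.TableInfo, and the real NewTableInfo call is replaced by a struct literal.

```go
package main

import "fmt"

// ModelTableInfo is a hypothetical stand-in for model.TableInfo.
type ModelTableInfo struct {
	ID   int64 // logical table ID
	Name string
}

// TableInfo is a hypothetical stand-in for common.TableInfo.
type TableInfo struct {
	SchemaName string
	TableID    int64
	Name       string
}

// WrapTableInfoWithTableID builds the wrapper with an explicit table ID,
// so a caller can pass a physical partition ID instead of info.ID.
func WrapTableInfoWithTableID(schemaName string, info *ModelTableInfo, tableID int64) *TableInfo {
	return &TableInfo{SchemaName: schemaName, TableID: tableID, Name: info.Name}
}

// WrapTableInfo keeps the original behavior by delegating with the logical ID.
func WrapTableInfo(schemaName string, info *ModelTableInfo) *TableInfo {
	return WrapTableInfoWithTableID(schemaName, info, info.ID)
}

func main() {
	logical := &ModelTableInfo{ID: 100, Name: "orders"}
	fmt.Println(WrapTableInfo("test", logical).TableID)                 // prints 100
	fmt.Println(WrapTableInfoWithTableID("test", logical, 101).TableID) // prints 101
}
```

Keeping WrapTableInfo as a thin delegate means existing call sites need no change, while partition-aware callers opt in to the explicit ID.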

2. logservice/schemastore/persist_storage_ddl_handlers.go

  • Updated extractTableInfoFuncFor... Handlers:

    • The following functions, which are used to find and return the TableInfo for a specific physical partition ID (tableID), have been updated:
      • extractTableInfoFuncForAddPartition
      • extractTableInfoFuncForTruncateAndReorganizePartition
      • extractTableInfoFuncForAlterTablePartitioning
      • extractTableInfoFuncForRemovePartitioning
    • In all these functions, the call common.WrapTableInfo(event.SchemaName, event.TableInfo) has been replaced with common.WrapTableInfoWithTableID(event.SchemaName, event.TableInfo, tableID).
    • This ensures that when a matching partition ID is found, the returned TableInfo object is correctly initialized with that physical tableID.
  • Refactored buildDDLEventCommon:

    • A new function buildDDLEventCommonWithTableID(rawEvent *PersistedDDLEvent, tableID int64, ...) has been introduced. This function takes an explicit tableID parameter.
    • The logic from the old buildDDLEventCommon was moved here. This new function now uses the passed tableID to:
      1. Create the wrapTableInfo using common.WrapTableInfoWithTableID(rawEvent.SchemaName, rawEvent.TableInfo, tableID).
      2. Set the TableID field on the commonEvent.DDLEvent being returned.
    • The original buildDDLEventCommon now calls buildDDLEventCommonWithTableID, passing rawEvent.TableInfo.ID as the default tableID. This preserves the existing behavior for DDLs that are not partition-specific.
  • Updated buildDDLEventFor... Partition Handlers:

    • All DDL event builder functions that operate on a specific physical partition ID (tableID) have been updated to use the new "WithTableID" helper:
      • buildDDLEventForAddPartition
      • buildDDLEventForDropPartition
      • buildDDLEventForTruncateAndReorganizePartition
      • buildDDLEventForExchangeTablePartition
      • buildDDLEventForAlterTablePartitioning
      • buildDDLEventForRemovePartitioning
    • In all these functions, the call buildDDLEventCommon(rawEvent, ...) has been replaced with buildDDLEventCommonWithTableID(rawEvent, tableID, ...).
    • This ensures that the DDLEvent (and its associated TableInfo and PreTableInfo) is built using the correct physical partition ID.
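
The handler changes above follow one pattern, sketched here under simplifying assumptions. All types are hypothetical stand-ins for the real TiCDC types, the filter and tiDBOnly parameters are omitted, and extractTableInfoForPartition is an illustrative name rather than one of the actual handlers.

```go
package main

import "fmt"

// Simplified, hypothetical stand-ins for the real TiCDC types.
type ModelTableInfo struct {
	ID           int64   // logical table ID
	PartitionIDs []int64 // physical partition IDs
}

type TableInfo struct {
	SchemaName string
	TableID    int64
	Source     *ModelTableInfo
}

type PersistedDDLEvent struct {
	SchemaName string
	TableInfo  *ModelTableInfo
}

type DDLEvent struct {
	TableID   int64
	TableInfo *TableInfo
}

func wrapTableInfoWithTableID(schemaName string, info *ModelTableInfo, tableID int64) *TableInfo {
	return &TableInfo{SchemaName: schemaName, TableID: tableID, Source: info}
}

// extractTableInfoForPartition returns a wrapper carrying the physical ID
// when tableID matches one of the table's partitions.
func extractTableInfoForPartition(event *PersistedDDLEvent, tableID int64) (*TableInfo, bool) {
	for _, id := range event.TableInfo.PartitionIDs {
		if id == tableID {
			return wrapTableInfoWithTableID(event.SchemaName, event.TableInfo, tableID), true
		}
	}
	return nil, false
}

// buildDDLEventCommonWithTableID uses the explicit ID for both the event
// and its wrapped table info.
func buildDDLEventCommonWithTableID(rawEvent *PersistedDDLEvent, tableID int64) DDLEvent {
	return DDLEvent{
		TableID:   tableID,
		TableInfo: wrapTableInfoWithTableID(rawEvent.SchemaName, rawEvent.TableInfo, tableID),
	}
}

// buildDDLEventCommon defaults to the logical ID, as the PR describes.
// This sketch assumes rawEvent.TableInfo is non-nil.
func buildDDLEventCommon(rawEvent *PersistedDDLEvent) DDLEvent {
	return buildDDLEventCommonWithTableID(rawEvent, rawEvent.TableInfo.ID)
}

func main() {
	raw := &PersistedDDLEvent{
		SchemaName: "test",
		TableInfo:  &ModelTableInfo{ID: 100, PartitionIDs: []int64{101, 102}},
	}
	if info, ok := extractTableInfoForPartition(raw, 102); ok {
		fmt.Println(info.TableID) // prints 102
	}
	fmt.Println(buildDDLEventCommonWithTableID(raw, 101).TableID) // prints 101
	fmt.Println(buildDDLEventCommon(raw).TableID)                 // prints 100
}
```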

Impact

After this change, DDL events for partitions (add, drop, truncate, etc.) will correctly carry the physical partition ID in the TableID field and in their associated TableInfo objects. This resolves the inconsistency for MQ downstreams and allows consumers to correctly process partition-level events.

Check List

Tests

  • Unit test
  • Integration test

Questions

Will it cause performance regression or break compatibility?

None

Do you need to update user documentation, design documentation or monitoring documentation?

None

Release note

None

@ti-chi-bot ti-chi-bot bot added do-not-merge/needs-triage-completed release-note-none Denotes a PR that doesn't merit a release note. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Oct 29, 2025
@gemini-code-assist

Summary of Changes

Hello @wlwilliamx, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses an issue where DDL events for partitioned tables were being generated with incorrect table identifiers, specifically using the logical table ID instead of the physical partition ID. The changes ensure that all DDL events related to partitioned tables correctly carry their respective physical partition IDs, resolving inconsistencies for downstream message queue consumers and enabling accurate data processing.

Highlights

  • Problem Identification: Previously, DDL events for partitioned tables sent to MQ downstreams incorrectly contained the logical table ID instead of the required physical partition ID, leading to incorrect processing by consumers.
  • New TableInfo Wrapper: A new function, WrapTableInfoWithTableID, was introduced in pkg/common/table_info.go. This function allows explicit specification of the tableID when creating a common.TableInfo object, ensuring the physical partition ID can be used.
  • DDL Handler Updates: Various DDL handlers in logservice/schemastore/persist_storage_ddl_handlers.go were updated. Functions like extractTableInfoFuncFor... and buildDDLEventFor... now utilize the new WrapTableInfoWithTableID or a refactored buildDDLEventCommonWithTableID to correctly embed the physical partition ID in DDL events.
  • Refactoring of Common DDL Event Builder: The logic of buildDDLEventCommon was moved into a new function, buildDDLEventCommonWithTableID, which accepts an explicit tableID and uses it to set the TableID field in the DDLEvent and to create the wrapTableInfo. buildDDLEventCommon remains as a wrapper that passes rawEvent.TableInfo.ID as the default.

@wlwilliamx
Collaborator Author

/test all

@wlwilliamx
Collaborator Author

/check-issue-triage-complete

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request correctly addresses the issue of using logical table IDs instead of physical partition IDs in DDL events for partitioned tables. The introduction of WrapTableInfoWithTableID and buildDDLEventCommonWithTableID is a clean way to handle this, and the updates to partition-related DDL handlers are appropriate. However, the refactoring of buildDDLEventCommon introduces a critical issue that could lead to a panic, as well as an unintended behavior change for TRUNCATE TABLE DDLs. I have provided a detailed comment and a suggested fix for this problem.

Comment on lines 1563 to 1565
func buildDDLEventCommon(rawEvent *PersistedDDLEvent, tableFilter filter.Filter, tiDBOnly bool) (commonEvent.DDLEvent, bool, error) {
	return buildDDLEventCommonWithTableID(rawEvent, rawEvent.TableInfo.ID, tableFilter, tiDBOnly)
}

critical

This refactoring introduces two issues:

  1. Panic on nil TableInfo: For DDLs like CREATE DATABASE, rawEvent.TableInfo is nil. This implementation will cause a panic when trying to access rawEvent.TableInfo.ID.
  2. Incorrect TableID for TRUNCATE TABLE: For a TRUNCATE TABLE DDL, rawEvent.TableInfo.ID is the new table ID, while rawEvent.TableID is the old one. The existing contract for DDLEvent (as per comments in pkg/common/event/ddl_event.go) is that TableID should be the old table ID. This change breaks that contract, which could impact downstream consumers.

Here is a suggested fix that addresses both problems by handling nil TableInfo and special-casing TRUNCATE TABLE to maintain backward compatibility.

func buildDDLEventCommon(rawEvent *PersistedDDLEvent, tableFilter filter.Filter, tiDBOnly bool) (commonEvent.DDLEvent, bool, error) {
	if rawEvent.TableInfo == nil {
		// For DDLs without table info (e.g., CREATE DATABASE), use rawEvent.TableID to avoid a panic.
		return buildDDLEventCommonWithTableID(rawEvent, rawEvent.TableID, tableFilter, tiDBOnly)
	}

	// Use the new table ID from TableInfo to build the event, which correctly wraps the TableInfo.
	ddlEvent, ok, err := buildDDLEventCommonWithTableID(rawEvent, rawEvent.TableInfo.ID, tableFilter, tiDBOnly)
	if err != nil {
		return commonEvent.DDLEvent{}, false, err
	}

	// For TRUNCATE TABLE, DDLEvent.TableID must be the old table ID for backward compatibility.
	if model.ActionType(rawEvent.Type) == model.ActionTruncateTable {
		ddlEvent.TableID = rawEvent.TableID
	}

	return ddlEvent, ok, nil
}
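
To make the two cases concrete, here is a self-contained sketch of the same guard logic with simplified, hypothetical types. The ActionType constants use arbitrary values rather than the real model.ActionType values, and the filter and tiDBOnly parameters and the error return are omitted.

```go
package main

import "fmt"

// ActionType and its constants are hypothetical; the values are arbitrary.
type ActionType int

const (
	ActionCreateSchema ActionType = iota + 1
	ActionTruncateTable
)

// Simplified, hypothetical stand-ins for the real TiCDC types.
type ModelTableInfo struct{ ID int64 }

type PersistedDDLEvent struct {
	Type      ActionType
	TableID   int64           // for TRUNCATE TABLE: the old table ID
	TableInfo *ModelTableInfo // nil for schema-level DDLs
}

type DDLEvent struct {
	Type    ActionType
	TableID int64
}

func buildWithTableID(rawEvent *PersistedDDLEvent, tableID int64) DDLEvent {
	return DDLEvent{Type: rawEvent.Type, TableID: tableID}
}

// buildDDLEventCommon mirrors the suggested fix: guard against a nil
// TableInfo, and keep the old table ID for TRUNCATE TABLE.
func buildDDLEventCommon(rawEvent *PersistedDDLEvent) DDLEvent {
	if rawEvent.TableInfo == nil {
		return buildWithTableID(rawEvent, rawEvent.TableID)
	}
	ev := buildWithTableID(rawEvent, rawEvent.TableInfo.ID)
	if rawEvent.Type == ActionTruncateTable {
		ev.TableID = rawEvent.TableID
	}
	return ev
}

func main() {
	createDB := &PersistedDDLEvent{Type: ActionCreateSchema}
	fmt.Println(buildDDLEventCommon(createDB).TableID) // prints 0, no panic

	truncate := &PersistedDDLEvent{
		Type:      ActionTruncateTable,
		TableID:   100,
		TableInfo: &ModelTableInfo{ID: 200},
	}
	fmt.Println(buildDDLEventCommon(truncate).TableID) // prints 100 (old ID)
}
```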

@wlwilliamx
Collaborator Author

/test all

@ti-chi-bot ti-chi-bot bot added the needs-1-more-lgtm Indicates a PR needs 1 more LGTM. label Oct 31, 2025
@ti-chi-bot

ti-chi-bot bot commented Oct 31, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: lidezhu

@ti-chi-bot

ti-chi-bot bot commented Oct 31, 2025

[LGTM Timeline notifier]

Timeline:

  • 2025-10-31 07:14:50.667945683 +0000 UTC m=+1633596.745198243: ☑️ agreed by lidezhu.

@ti-chi-bot ti-chi-bot bot added the approved label Oct 31, 2025
@wlwilliamx
Collaborator Author

/test all

@ti-chi-bot ti-chi-bot bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Oct 31, 2025
@wlwilliamx
Collaborator Author

/test all

@ti-chi-bot

ti-chi-bot bot commented Oct 31, 2025

@wlwilliamx: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-cdc-kafka-integration-light | cc7bf16 | link | true | /test pull-cdc-kafka-integration-light |
| pull-cdc-kafka-integration-heavy | cc7bf16 | link | true | /test pull-cdc-kafka-integration-heavy |
| pull-cdc-storage-integration-heavy | cc7bf16 | link | true | /test pull-cdc-storage-integration-heavy |

Labels

approved needs-1-more-lgtm Indicates a PR needs 1 more LGTM. release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Development

Successfully merging this pull request may close these issues.

Wrong table ID used in MQ open protocol/simple protocol output for partitioned tables

2 participants