-
Notifications
You must be signed in to change notification settings - Fork 28
fix(ddl): ensure physical table ID is used for partitioned tables in DDL events #2884
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
fix(ddl): ensure physical table ID is used for partitioned tables in DDL events #2884
Conversation
Summary of ChangesHello @wlwilliamx, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses an issue where DDL events for partitioned tables were being generated with incorrect table identifiers, specifically using the logical table ID instead of the physical partition ID. The changes ensure that all DDL events related to partitioned tables correctly carry their respective physical partition IDs, resolving inconsistencies for downstream message queue consumers and enabling accurate data processing. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
|
/test all |
|
/check-issue-triage-complete |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request correctly addresses the issue of using logical table IDs instead of physical partition IDs in DDL events for partitioned tables. The introduction of WrapTableInfoWithTableID and buildDDLEventCommonWithTableID is a clean way to handle this, and the updates to partition-related DDL handlers are appropriate. However, the refactoring of buildDDLEventCommon introduces a critical issue that could lead to a panic, as well as an unintended behavior change for TRUNCATE TABLE DDLs. I have provided a detailed comment and a suggested fix for this problem.
| func buildDDLEventCommon(rawEvent *PersistedDDLEvent, tableFilter filter.Filter, tiDBOnly bool) (commonEvent.DDLEvent, bool, error) { | ||
| return buildDDLEventCommonWithTableID(rawEvent, rawEvent.TableInfo.ID, tableFilter, tiDBOnly) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This refactoring introduces two issues:
- Panic on nil
TableInfo: For DDLs likeCREATE DATABASE,rawEvent.TableInfoisnil. This implementation will cause a panic when trying to accessrawEvent.TableInfo.ID. - Incorrect
TableIDforTRUNCATE TABLE: For aTRUNCATE TABLEDDL,rawEvent.TableInfo.IDis the new table ID, whilerawEvent.TableIDis the old one. The existing contract forDDLEvent(as per comments inpkg/common/event/ddl_event.go) is thatTableIDshould be the old table ID. This change breaks that contract, which could impact downstream consumers.
Here is a suggested fix that addresses both problems by handling nil TableInfo and special-casing TRUNCATE TABLE to maintain backward compatibility.
func buildDDLEventCommon(rawEvent *PersistedDDLEvent, tableFilter filter.Filter, tiDBOnly bool) (commonEvent.DDLEvent, bool, error) {
if rawEvent.TableInfo == nil {
// For DDLs without table info (e.g., CREATE DATABASE), use rawEvent.TableID to avoid a panic.
return buildDDLEventCommonWithTableID(rawEvent, rawEvent.TableID, tableFilter, tiDBOnly)
}
// Use the new table ID from TableInfo to build the event, which correctly wraps the TableInfo.
ddlEvent, ok, err := buildDDLEventCommonWithTableID(rawEvent, rawEvent.TableInfo.ID, tableFilter, tiDBOnly)
if err != nil {
return commonEvent.DDLEvent{}, false, err
}
// For TRUNCATE TABLE, DDLEvent.TableID must be the old table ID for backward compatibility.
if model.ActionType(rawEvent.Type) == model.ActionTruncateTable {
ddlEvent.TableID = rawEvent.TableID
}
return ddlEvent, ok, nil
}|
/test all |
1 similar comment
|
/test all |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: lidezhu The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
[LGTM Timeline notifier]Timeline:
|
|
/test all |
…table-id-for-partitioned-tables-in-tableinfo
|
/test all |
|
@wlwilliamx: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
What problem does this PR solve?
Issue Number: close #2219
What is changed and how it works?
Currently, when TiCDC outputs DDL events for partitioned tables to MQ downstreams (using open protocol or simple protocol), the event message contains the logical table ID. This is incorrect, as downstream consumers expect the physical table ID (i.e., the partition ID) to correctly identify and process data for a specific partition.
The root cause of this issue is that
common.TableInfoobjects, which are intended to represent physical partitions in these DDL handlers, were being created with the logical table's ID (info.ID) instead of their own physical partition ID.This PR fixes this by ensuring that all
TableInfoinstances representing a physical partition are created with the correct physical table ID.Changes
This PR introduces a new wrapper function for
TableInfocreation and updates all partition-related DDL handlers to use it correctly.1.
pkg/common/table_info.goWrapTableInfoWithTableID:WrapTableInfoWithTableID(schemaName string, info *model.TableInfo, tableID int64) *TableInfohas been added.common.TableInfowrapper but explicitly passes the providedtableIDparameter toNewTableInfo. This allows us to override the ID, using the physical partition ID instead of the logical table ID (info.ID).WrapTableInfo:WrapTableInfofunction is now a convenience wrapper that callsWrapTableInfoWithTableID(schemaName, info, info.ID). This maintains its original behavior (using the logical ID) for contexts where that is still the intended behavior.2.
logservice/schemastore/persist_storage_ddl_handlers.goUpdated
extractTableInfoFuncFor...Handlers:TableInfofor a specific physical partition ID (tableID), have been updated:extractTableInfoFuncForAddPartitionextractTableInfoFuncForTruncateAndReorganizePartitionextractTableInfoFuncForAlterTablePartitioningextractTableInfoFuncForRemovePartitioningcommon.WrapTableInfo(event.SchemaName, event.TableInfo)has been replaced withcommon.WrapTableInfoWithTableID(event.SchemaName, event.TableInfo, tableID).TableInfoobject is correctly initialized with that physicaltableID.Refactored
buildDDLEventCommon:buildDDLEventCommonWithTableID(rawEvent *PersistedDDLEvent, tableID int64, ...)has been introduced. This function takes an explicittableIDparameter.buildDDLEventCommonwas moved here. This new function now uses the passedtableIDto:wrapTableInfousingcommon.WrapTableInfoWithTableID(rawEvent.SchemaName, rawEvent.TableInfo, tableID).TableIDfield on thecommonEvent.DDLEventbeing returned.buildDDLEventCommonnow callsbuildDDLEventCommonWithTableID, passingrawEvent.TableInfo.IDas the defaulttableID. This preserves the existing behavior for DDLs that are not partition-specific.Updated
buildDDLEventFor...Partition Handlers:tableID) have been updated to use the new "WithTableID" helper:buildDDLEventForAddPartitionbuildDDLEventForDropPartitionbuildDDLEventForTruncateAndReorganizePartitionbuildDDLEventForExchangeTablePartitionbuildDDLEventForAlterTablePartitioningbuildDDLEventForRemovePartitioningbuildDDLEventCommon(rawEvent, ...)has been replaced withbuildDDLEventCommonWithTableID(rawEvent, tableID, ...).DDLEvent(and its associatedTableInfoandPreTableInfo) is built using the correct physical partition ID.Impact
After this change, DDL events for partitions (add, drop, truncate, etc.) will correctly carry the physical partition ID in the
TableIDfield and in their associatedTableInfoobjects. This resolves the inconsistency for MQ downstreams and allows consumers to correctly process partition-level events.Check List
Tests
Questions
Will it cause performance regression or break compatibility?
None
Do you need to update user documentation, design documentation or monitoring documentation?
None
Release note