ticdc: add new doc for TiCDC data replication capabilities #20967

Merged (20 commits) on May 23, 2025

@hongyunyan hongyunyan commented May 16, 2025

First-time contributors' checklist

What is changed, added or deleted? (Required)

Which TiDB version(s) do your changes apply to? (Required)

Tips for choosing the affected version(s):

By default, CHOOSE MASTER ONLY so your changes will be applied to the next TiDB major or minor releases. If your PR involves a product feature behavior change or a compatibility change, CHOOSE THE AFFECTED RELEASE BRANCH(ES) AND MASTER.

For details, see tips for choosing the affected versions.

  • master (the latest development version)
  • v9.0 (TiDB 9.0 versions)
  • v8.5 (TiDB 8.5 versions)
  • v8.4 (TiDB 8.4 versions)
  • v8.3 (TiDB 8.3 versions)
  • v8.1 (TiDB 8.1 versions)
  • v7.5 (TiDB 7.5 versions)
  • v7.1 (TiDB 7.1 versions)
  • v6.5 (TiDB 6.5 versions)
  • v6.1 (TiDB 6.1 versions)
  • v5.4 (TiDB 5.4 versions)

What is the related PR or file link(s)?

Do your changes match any of the following descriptions?

  • Delete files
  • Change aliases
  • Need modification after applied to another branch
  • Might cause conflicts after applied to another branch

@ti-chi-bot added the missing-translation-status (This PR does not have translation status info.) and size/M (Denotes a PR that changes 30-99 lines, ignoring generated files.) labels on May 16, 2025
@gemini-code-assist gemini-code-assist bot left a comment

Hello @hongyunyan, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

Summary of Changes

Hello team, Gemini here, providing a summary of this pull request to help everyone quickly understand the changes. This PR introduces a new documentation page specifically for TiCDC, detailing its data synchronization capabilities. The goal is to provide users with a clear understanding of what TiCDC can and cannot synchronize, covering aspects like supported statement types (DDL, DML), handling of system tables and temporary tables, index and foreign key synchronization, and consistency guarantees.

Highlights

  • New Documentation: Adds a new markdown file (ticdc/ticdc-data-sync-capabilities.md) dedicated to explaining TiCDC's data synchronization capabilities.
  • TiCDC Principles Explained: The document provides background on how TiCDC works, including monitoring Raft Logs, generating logical operations, and its consistency guarantees (eventual consistency, redo log, syncpoint).
  • Supported/Unsupported Sync Items: Clearly lists what TiCDC synchronizes (DDL, DML) and what it does not (DQL, DCL, system tables, temporary tables).
  • Index and Foreign Key Handling: Explains how TiCDC handles index creation/addition DDLs (asynchronously for TiDB downstream) and foreign key constraint DDLs, noting that it does not synchronize system variables like foreign_key_checks.
  • Downstream Constraint Checking: Mentions that TiCDC checks internal integrity but does not validate changes against downstream constraints, reporting errors if constraints are violated during writing.

Changelog

  • ticdc/ticdc-data-sync-capabilities.md
    • Added a new documentation file explaining TiCDC's data synchronization capabilities.
    • Detailed the background and principles of TiCDC's data capture.
    • Listed supported and unsupported types of statements and tables for synchronization.
    • Described how index and foreign key DDLs are handled.
    • Clarified TiCDC's role regarding downstream constraint checking.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a new document detailing TiCDC's data synchronization capabilities. The document is well-structured and provides valuable information for users. I've identified a few areas for improvement, mostly related to minor style and formatting, along with a broken link that needs fixing. Overall, good work on creating this informative piece!

I've based my review on the provided style guide, focusing on:

  • Correct English grammar, spelling, and punctuation (Style Guide lines 20, 55-57).
  • Writing in the second person ("you") when addressing users (Style Guide line 22).
  • Avoiding unnecessary words (Style Guide line 24).
  • Using sentence case for headings (Style Guide line 32).
  • Using ordered lists for steps (Style Guide line 33).
  • Ensuring code snippets are in backticks (Style Guide line 34).
  • Adding a blank line before and after headings and lists (Style Guide line 38).

Summary of Findings

  • Broken Link: A link to storage services on line 19 is missing a closing parenthesis, which will cause the link to fail.
  • Spelling Error: The heading on line 8, "Backgroud", should be corrected to "Background".
  • Style Guide Adherence: Minor adjustments are suggested for conciseness (lines 3, 17, 27) and addressing the user in the second person (line 29) as per the style guide.
  • Formatting: An extra blank line between list items (line 13) could be removed for better compactness. A newline character is recommended at the end of the file (line 31).
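The broken-link finding above (an inline link missing its closing parenthesis) is the kind of issue a small script can flag before review. The following is a minimal sketch, not part of the PR or of any docs tooling named in this thread: the function name `find_unclosed_links` and the sample paths are hypothetical, and it only handles single-line inline links of the form `[text](target)`.

```python
import re

# Matches the opening part of an inline Markdown link: "[text]("
_LINK_START = re.compile(r"\[[^\]\n]*\]\(")


def find_unclosed_links(markdown_text):
    """Return (line_number, line) pairs where an inline link has no closing ')'."""
    problems = []
    for number, line in enumerate(markdown_text.splitlines(), start=1):
        for match in _LINK_START.finditer(line):
            depth = 1  # we are inside the "(" that opens the link target
            for ch in line[match.end():]:
                if ch == "(":
                    depth += 1
                elif ch == ")":
                    depth -= 1
                    if depth == 0:
                        break
            if depth != 0:
                problems.append((number, line))
                break  # report each line at most once
    return problems


if __name__ == "__main__":
    # Hypothetical sample text; the paths are made up for illustration.
    sample = (
        "See [system variables](/system-variables.md#foreign_key_checks).\n"
        "See [storage services](/ticdc/example-sink.md for details.\n"
    )
    for number, line in find_unclosed_links(sample):
        print(f"line {number}: unclosed link: {line}")
```

A check like this is deliberately conservative: it does not parse reference-style links or links that span several lines, so it complements a reviewer rather than replacing one.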

Merge Readiness

The document is well-written and informative. However, there is a high-severity issue with a broken link (line 19) and a medium-severity spelling mistake (line 8) that should be addressed before merging. Other suggestions are minor and aim to improve clarity and adherence to the style guide. I am unable to approve pull requests, but I recommend making these changes for a better user experience.

Comment on lines 13 to 14


low

There's an extra blank line here between list items 1 and 2. While Markdown is flexible, removing it would make the list formatting more consistent and compact. [1]

Style Guide References

  1. Style guide recommends a blank line before and after lists, implying the entire list block. Consistent spacing within the list improves readability.


4. For the foreign key constraints set in the table, TiCDC will synchronize the corresponding DDL (`add foreign key`) statements, but TiCDC is not responsible for synchronizing the settings of upstream system variables, such as [foreign_key_checks](/system-variables.md#foreign_key_checks). Therefore, customers need to set appropriate system variables in the downstream to determine whether the downstream foreign key constraint check is enabled.

5. TiCDC only checks the integrity of the upstream changes received internally, and does not participate in checking whether the data changes meet the various downstream constraints. If a data change that does not meet the downstream constraints is encountered, TiCDC will report an error when writing downstream.

low

It's a common convention to have a newline character at the end of text files. Could we add one here for consistency and to prevent potential issues with some tools?

Suggested change
5. TiCDC only checks the integrity of the upstream changes received internally, and does not participate in checking whether the data changes meet the various downstream constraints. If a data change that does not meet the downstream constraints is encountered, TiCDC will report an error when writing downstream.
5. TiCDC only checks the integrity of the upstream changes received internally, and does not participate in checking whether the data changes meet the various downstream constraints. If a data change that does not meet the downstream constraints is encountered, TiCDC will report an error when writing downstream.
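Because the two sides of the suggested change differ only in an invisible trailing newline, the difference is easier to see in code. The following is a minimal sketch of the check and the fix; the helper names `ends_with_newline` and `ensure_trailing_newline` are hypothetical and not part of any tool used in this PR.

```python
def ends_with_newline(text):
    """Return True if the text ends with a newline character."""
    return text.endswith("\n")


def ensure_trailing_newline(text):
    """Append a single newline if a non-empty text does not already end with one."""
    if text == "" or text.endswith("\n"):
        return text
    return text + "\n"


if __name__ == "__main__":
    last_line = "5. TiCDC only checks the integrity of the upstream changes received internally"
    print(ends_with_newline(last_line))
    print(ends_with_newline(ensure_trailing_newline(last_line)))
```

Leaving empty input untouched is a design choice in this sketch, so that an empty file is not turned into a file containing a lone newline.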

ti-chi-bot bot commented May 16, 2025

@gemini-code-assist[bot]: adding LGTM is restricted to approvers and reviewers in OWNERS files.

In response to the code review comment above.

hongyunyan and others added 9 commits May 16, 2025 15:49
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@lilin90 added the translation/from-docs-cn label (This PR is translated from a PR in pingcap/docs-cn.) on May 19, 2025
@ti-chi-bot removed the missing-translation-status label (This PR does not have translation status info.) on May 19, 2025
@lilin90 self-assigned this on May 19, 2025
@lilin90 added the area/ticdc (Indicates that the Issue or PR belongs to the area of TiCDC.) and type/enhancement (The issue or PR belongs to an enhancement.) labels on May 19, 2025
@lilin90 changed the title from "ticdc: add new doc to describe the capacity of ticdc data syncing" to "ticdc: add new doc for TiCDC data replication capabilities" on May 20, 2025
@lilin90 added the ONCALL label (Relates to documentation oncall.) on May 20, 2025
ti-chi-bot pushed a commit to ti-chi-bot/docs that referenced this pull request May 23, 2025
@ti-chi-bot
In response to a cherrypick label: new pull request created to branch release-8.5: #21019.
But this PR has conflicts, please resolve them!

@ti-chi-bot
In response to a cherrypick label: new pull request created to branch release-6.1: #21020.
But this PR has conflicts, please resolve them!

@ti-chi-bot
In response to a cherrypick label: new pull request created to branch release-6.5: #21021.
But this PR has conflicts, please resolve them!

@ti-chi-bot
In response to a cherrypick label: new pull request created to branch release-7.1: #21022.
But this PR has conflicts, please resolve them!

@ti-chi-bot
In response to a cherrypick label: new pull request created to branch release-7.5: #21023.
But this PR has conflicts, please resolve them!

@lilin90 removed the needs-cherry-pick-release-6.1 label (Should cherry pick this PR to release-6.1 branch.) on May 23, 2025
Labels

  • approved
  • area/ticdc: Indicates that the Issue or PR belongs to the area of TiCDC.
  • lgtm
  • needs-1-more-lgtm: Indicates a PR needs 1 more LGTM.
  • needs-cherry-pick-release-6.5: Should cherry pick this PR to release-6.5 branch.
  • needs-cherry-pick-release-7.1: Should cherry pick this PR to release-7.1 branch.
  • needs-cherry-pick-release-7.5: Should cherry pick this PR to release-7.5 branch.
  • needs-cherry-pick-release-8.1: Should cherry pick this PR to release-8.1 branch.
  • needs-cherry-pick-release-8.5: Should cherry pick this PR to release-8.5 branch.
  • ONCALL: Relates to documentation oncall.
  • size/M: Denotes a PR that changes 30-99 lines, ignoring generated files.
  • translation/from-docs-cn: This PR is translated from a PR in pingcap/docs-cn.
  • type/enhancement: The issue or PR belongs to an enhancement.

4 participants