docs: MySQL/MariaDB connector accuracy pass #2958

Open
jwhartley wants to merge 3 commits into master from james/docs-mysql-mariadb-ddl-handling

jwhartley commented May 20, 2026

Summary

Three commits, all on the MySQL/MariaDB CDC capture connector pages and the schema-evolution guide.

1. DDL handling guidance (5eea49f)

The "Unsupported Operations" sections claimed most ALTER TABLE operations would halt the capture with an unsupported operation error and required users to remove/re-add bindings. That hasn't been true for a while — the connector now handles column add/drop/rename/retype automatically, deactivates bindings cleanly on DROP TABLE / RENAME TABLE / DROP DATABASE, and silently ignores TRUNCATE TABLE on active tables (see handleQuery and handleAlterTable).
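As a rough illustration of the behavior described above, the sketch below maps a binlog DDL statement to the outcome this PR documents. It is not the connector's code: the real logic lives in the Go functions handleQuery and handleAlterTable, and the `Action` names, the regex rules, and the `UNCLASSIFIED` fallthrough here are all hypothetical simplifications.

```python
import re
from enum import Enum


class Action(Enum):
    # Hypothetical labels for the outcomes described in the PR summary.
    UPDATE_COLUMN_METADATA = "column change handled automatically"
    DEACTIVATE_BINDING = "affected binding is deactivated"
    IGNORE = "statement is ignored"
    UNCLASSIFIED = "not modeled by this sketch"


# Assumption: prefix matching is enough for illustration. A real
# implementation would parse the statement instead of using regexes.
_RULES = [
    (re.compile(r"^\s*ALTER\s+TABLE\b", re.I), Action.UPDATE_COLUMN_METADATA),
    (re.compile(r"^\s*(DROP\s+TABLE|RENAME\s+TABLE|DROP\s+DATABASE)\b", re.I),
     Action.DEACTIVATE_BINDING),
    (re.compile(r"^\s*TRUNCATE\b", re.I), Action.IGNORE),
]


def classify_ddl(query: str) -> Action:
    """Return the documented outcome for a DDL statement on a captured table."""
    for pattern, action in _RULES:
        if pattern.search(query):
            return action
    return Action.UNCLASSIFIED
```

For example, `classify_ddl("ALTER TABLE orders ADD COLUMN note TEXT")` returns `Action.UPDATE_COLUMN_METADATA`, while `classify_ddl("DROP TABLE orders")` returns `Action.DEACTIVATE_BINDING`.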

Replaces those sections with a short "Handling Source Schema Changes" subsection across all five connector pages, and drops the redundant "Unsupported DDL operations (MySQL/MariaDB CDC)" section and stale DROP TABLE troubleshooting-table row from the schema-evolution guide.

Motivated by this Slack thread where Will Donnelly flagged the docs as "woefully out-of-date" after kapa.ai cited them in answer to a prospective-customer question about type changes on large MariaDB tables.

2. MariaDB accuracy pass (a60c7bf)

While verifying the DDL section, audited the rest of MariaDB.md and fixed:

  • Removed the "Azure Database for MariaDB" setup section — Microsoft retired that service on 2025-09-19.
  • Replaced SET PERSIST with SET GLOBAL: SET PERSIST is MySQL 8.0+ syntax that MariaDB does not implement. Added a note about persisting via my.cnf.
  • Fixed the missing semicolon on the CREATE USER … IDENTIFIED BY 'secret' example.
  • Corrected backfill_chunk_size default to 50000 (was documented as 131072; the connector default is 50000 per main.go:222).
  • Stated minimum MariaDB version (10.3+) in Prerequisites per prerequisites.go:43-44.
  • Added the missing discover_schemas and statement_timeout advanced properties to the configuration table.

Applied the same fixes to amazon-rds-mariadb.md where they overlap.
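
To make the SET PERSIST / SET GLOBAL distinction concrete, here is a minimal sketch of how the persistence advice differs by server flavor. The `persistence_plan` helper is hypothetical and exists only for illustration; `binlog_format` is used simply as a familiar dynamic global variable, not as the variable the docs actually set. The sketch handles simple unquoted string values only and does no escaping.

```python
from typing import Optional, TypedDict


class Plan(TypedDict):
    sql: str                # statement to run on the live server
    my_cnf: Optional[str]   # option-file fragment needed to survive a restart


def persistence_plan(flavor: str, variable: str, value: str) -> Plan:
    """Describe how to set a global variable so that it survives a restart."""
    if flavor == "mysql":
        # MySQL 8.0+ persists the setting itself.
        return {"sql": f"SET PERSIST {variable} = '{value}';", "my_cnf": None}
    if flavor == "mariadb":
        # MariaDB has no SET PERSIST: SET GLOBAL applies until the next
        # restart, so the same setting must also go in an option file.
        return {
            "sql": f"SET GLOBAL {variable} = '{value}';",
            "my_cnf": f"[mysqld]\n{variable} = {value}",
        }
    raise ValueError(f"unknown flavor: {flavor}")
```

Calling `persistence_plan("mariadb", "binlog_format", "ROW")` yields the `SET GLOBAL` statement plus a two-line `my.cnf` fragment, whereas the `"mysql"` flavor needs only the single `SET PERSIST` statement.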

3. MySQL parity (0b858ae)

backfill_chunk_size and the two missing advanced properties affect the MySQL family pages identically (same underlying connector code). Mirrors the corrections onto MySQL.md, amazon-rds-mysql.md, and google-cloud-sql-mysql.md.

Test plan

  • Render preview for the five connector pages — confirm anchors (#handling-source-schema-changes) resolve from the "Unhandled Queries" cross-reference, and confirm the new "Statement Timeout" / "Discovery Schema Selection" property rows look right.
  • Render preview for schema-evolution.md and confirm no dangling links after the section removal.
  • Render preview for MariaDB.md — confirm the Self Hosted setup section reads cleanly with the Azure subsection removed and the SET GLOBAL / my.cnf note in place.

The "Unsupported Operations" sections on the MySQL/MariaDB capture
connector pages and the matching section in schema-evolution.md were
written when the connector errored on most ALTER TABLE operations.
The connector now handles column add/drop/rename/retype automatically,
silently deactivates bindings on DROP TABLE / RENAME TABLE / DROP DATABASE,
and ignores TRUNCATE TABLE on active tables.

Replace the stale sections with a short "Handling Source Schema Changes"
subsection that describes the current behavior, and drop the redundant
section and troubleshooting-table row in schema-evolution.md.

github-actions Bot commented May 20, 2026

jwhartley added 2 commits May 20, 2026 10:58
- Remove the "Azure Database for MariaDB" setup section; Microsoft
  retired that service on 2025-09-19.
- Replace `SET PERSIST` (MySQL 8.0+ syntax) with `SET GLOBAL` plus a
  note explaining that MariaDB requires `my.cnf` for persistence.
- Fix the missing semicolon on the `CREATE USER` example.
- Correct the documented `backfill_chunk_size` default (`50000`, not
  `131072`).
- State the supported MariaDB minimum version (10.3+) in Prerequisites.
- Add the missing `discover_schemas` and `statement_timeout` advanced
  properties to the table.

Applies the same fixes to amazon-rds-mariadb.md where they overlap.

Mirror the MariaDB accuracy pass: correct the documented
`backfill_chunk_size` default (`50000`, not `131072`) and add the
missing `discover_schemas` and `statement_timeout` advanced
properties to the table on MySQL.md, amazon-rds-mysql.md, and
google-cloud-sql-mysql.md.
jwhartley changed the title from "docs: update MySQL/MariaDB DDL handling guidance" to "docs: MySQL/MariaDB connector accuracy pass" May 20, 2026
jwhartley requested a review from willdonnelly May 20, 2026 20:12
willdonnelly left a comment

LGTM modulo three comments (which I made on one of the docs but they apply to all of them)

+ The connector handles most DDL on actively-captured tables automatically:

- In the case of `DROP TABLE` and other destructive operations this is not supported, and can only be resolved by removing the offending table(s) from the capture bindings list, after which you may recreate the capture if desired (causing the latest state of the table to be recaptured in its entirety).
+ - `ALTER TABLE` to add, drop, rename, or change the type of a column is applied to the collection schema as the change appears in the binlog. No action is required.

I'd remove "to the collection schema" here as that's not accurate. Our internal decoding logic applies the changes immediately, but the change is applied to collection schemas as part of our normal autodiscovery process.


- In the case of `DROP TABLE` and other destructive operations this is not supported, and can only be resolved by removing the offending table(s) from the capture bindings list, after which you may recreate the capture if desired (causing the latest state of the table to be recaptured in its entirety).
+ - `ALTER TABLE` to add, drop, rename, or change the type of a column is applied to the collection schema as the change appears in the binlog. No action is required.
+ - `DROP TABLE`, `RENAME TABLE`, and `DROP DATABASE` deactivate the affected binding. To resume capturing afterwards, re-add the table to the binding list, which will trigger a fresh backfill.

This is partially accurate. The affected bindings will be deactivated, but if the table is recreated we will re-backfill it and resume capturing. This is necessary because otherwise we'd have a weird "inactive even though the binding is enabled and the table exists" condition which would confuse users and would be a nightmare to support.
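
The lifecycle described in this comment can be sketched as a tiny state model. Everything here is hypothetical (the `Binding` class, its field names, and the backfill counter); it only illustrates the rule that an enabled binding whose table reappears is backfilled again and resumes capturing, so there is never an "enabled, table exists, but inactive" state.

```python
from dataclasses import dataclass


@dataclass
class Binding:
    table: str
    enabled: bool = True   # user-facing setting in the capture spec
    active: bool = True    # whether the table is currently being captured
    backfills: int = 1     # illustrative counter; the initial backfill counts

    def on_table_removed(self) -> None:
        """DROP TABLE / RENAME TABLE / DROP DATABASE: stop capturing."""
        self.active = False

    def on_table_present(self) -> None:
        """The table exists again: re-backfill and resume if still enabled."""
        if self.enabled and not self.active:
            self.active = True
            self.backfills += 1


# Usage: drop and recreate a captured table.
b = Binding("shop.orders")
b.on_table_removed()    # binding is deactivated
b.on_table_present()    # table recreated: fresh backfill, capture resumes
```

A binding that the user has disabled stays inactive when the table reappears, since `on_table_present` only acts on enabled bindings.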

If your capture is failing with an `"unhandled query"` error, some SQL query is present in the binlog which the connector does not (currently) understand.

- In general, this error suggests that the connector should be modified to at least recognize this type of query, and most likely categorize it as either an unsupported [DML Query](#data-manipulation-queries), an unsupported [Table Operation](#unsupported-operations), or something that can safely be ignored. Until such a fix is made the capture cannot proceed, and you will need to backfill all collections to allow the capture to jump ahead to a later point in the binlog.
+ In general, this error suggests that the connector should be modified to at least recognize this type of query, and most likely categorize it as either an unsupported [DML Query](#data-manipulation-queries), a [schema change](#handling-source-schema-changes), or something that can safely be ignored. Until such a fix is made the capture cannot proceed, and you will need to backfill all collections to allow the capture to jump ahead to a later point in the binlog.

I don't think this paragraph is very helpful any more. I would recommend removing it and simply saying that they should contact Estuary support so we can help get the capture unstuck.
