Enable change capture on MySQL / MariaDB

MySQL's native CDC is the binary log (binlog) in row mode: Elglide reads the binlog change stream to sync only what changed. Once it's enabled, Elglide manages the resume position, the first consistent snapshot, and self-healing for you — per task, automatically.

Unlike SQL Server (enabled per-table), the binary log is a server-wide setting — turning it on covers every table. Elglide reads only the one table your task points at. Elglide uses GTID for reliable resume, so on MySQL the server needs gtid_mode = ON (a one-time change that needs a restart).
MariaDB users: MariaDB has no gtid_mode / enforce_gtid_consistency settings — GTID is always available once binary logging is on, so you only need log_bin + binlog_format = ROW + binlog_row_image = FULL (and a primary key). See the MariaDB blocks below; everything else on this page (permissions, snapshot, self-healing) applies identically.

Permissions required

  • To turn on the binary log + GTID (one-time, an admin): edit the server config and restart (step 1), or on managed services the parameter-group change + reboot.
  • For Elglide's connection role (ongoing):
    • REPLICATION SLAVE and REPLICATION CLIENT — to read the binlog change stream.
    • SELECT on each synced table — the initial snapshot + reads.

A typical setup (run by an admin):

GRANT REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'elglide_reader'@'%';
GRANT SELECT ON yourdb.* TO 'elglide_reader'@'%';
FLUSH PRIVILEGES;
If a table loads Full (or RowHash) instead of using the binary log, the source is usually missing one of the server settings below — most often the row image isn't FULL, the table has no primary key, or (MySQL only) gtid_mode isn't ON.

1. Turn on the binary log at the server level (one-time, needs a restart)

On a self-managed MySQL, add to my.cnf under [mysqld]:

[mysqld]
log_bin                  = mysql-bin     # enable binary logging
binlog_format            = ROW           # row-based (not STATEMENT/MIXED)
binlog_row_image         = FULL          # log every column (needed to rebuild rows)
gtid_mode                = ON            # global transaction ids (reliable resume)
enforce_gtid_consistency = ON            # required companion to gtid_mode
server_id                = 1             # any unique id on your network
binlog_expire_logs_seconds = 259200      # keep 3 days of binlog (tune to taste)

Then restart and confirm:

sudo systemctl restart mysql           # or your platform's restart

-- after restart, each should report the expected value:
SHOW VARIABLES LIKE 'log_bin';           -- ON
SHOW VARIABLES LIKE 'binlog_format';     -- ROW
SHOW VARIABLES LIKE 'binlog_row_image';  -- FULL
SHOW VARIABLES LIKE 'gtid_mode';         -- ON

MariaDB — same idea, but gtid_mode / enforce_gtid_consistency don't exist (GTID is always on). Add to my.cnf under [mariadb] (or [mysqld]):

[mariadb]
log_bin          = mariadb-bin   # enable binary logging
binlog_format    = ROW           # row-based (not STATEMENT/MIXED)
binlog_row_image = FULL          # log every column
server_id        = 1             # any unique id on your network
gtid_strict_mode = ON            # recommended for clean GTID resume
expire_logs_days = 3             # keep 3 days of binlog (tune to taste)

Then restart and confirm (note: @@gtid_binlog_pos, not gtid_mode):

sudo systemctl restart mariadb

SELECT @@log_bin, @@binlog_format, @@binlog_row_image;   -- 1, ROW, FULL
SELECT @@gtid_binlog_pos;                                -- e.g. 0-1-42 (non-empty after any write)
Managed services (no shell): set the equivalent parameters in the DB parameter/flags group and reboot.
  • Amazon RDS / Aurora MySQL: set binlog_format = ROW, binlog_row_image = FULL, gtid_mode = ON, and enforce_gtid_consistency = ON in the parameter group; enable automated backups (RDS needs them for binlog); set retention with CALL mysql.rds_set_configuration('binlog retention hours', 72);
  • Google Cloud SQL: enable point-in-time recovery (turns on binlog), then set the binlog_row_image / GTID flags.
  • Azure Database for MySQL: set binlog_row_image = FULL and the GTID parameters in server parameters, then restart.
  • Amazon RDS MariaDB: set binlog_format = ROW and binlog_row_image = FULL in the parameter group, enable automated backups, and set retention with CALL mysql.rds_set_configuration('binlog retention hours', 72); — no GTID parameters needed (MariaDB GTID is always on).

2. Give the Elglide connection role replication access

The role Elglide connects with needs to read the binlog stream plus the tables:

GRANT REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'elglide_reader'@'%';
GRANT SELECT ON yourdb.* TO 'elglide_reader'@'%';
FLUSH PRIVILEGES;

The same grants work on MariaDB. On MariaDB 10.5+ REPLICATION CLIENT is an alias for BINLOG MONITOR — either name is accepted.

3. Make sure each table has a PRIMARY KEY

Row-based binlog identifies a changed row by its key, so each synced table needs a PRIMARY KEY (Elglide uses it as the MERGE key for updates and deletes). A table with no primary key stays on a Full / RowHash sync.

ALTER TABLE yourdb.your_table ADD PRIMARY KEY (id);   -- if it has none

4. That's it — Elglide handles the rest

When you create an Incremental task on a ready table, Elglide automatically:

  • takes a consistent snapshot for the first Full load and records the binlog GTID at exactly that point (no gaps, no duplicates);
  • streams inserts / updates / deletes from the binary log on each run and MERGEs them into Snowflake;
  • tracks its GTID position and advances it only after data lands in Snowflake (at-least-once);
  • self-heals — if the source purges binlog the task still needs (e.g. it was paused past retention), the next run re-baselines with a full sync and resumes.

Data types in change capture

Change-captured rows are cast to Snowflake the same way as the first Full load — same target types, same representation — so a row updated via the binary log matches a row from the initial snapshot. High-precision DECIMAL keeps full precision; DATETIME/TIMESTAMP, JSON, BIT, unsigned integers and BLOB/binary columns all land identically. See the MySQL connector data-type reference (sign-in required) for the per-type rules.

MariaDB note: a MariaDB JSON column is internally LONGTEXT, so it flows as plain JSON text (no separate cast) — query it downstream with PARSE_JSON() just like a MySQL JSON column.

A few edge cases over change capture (the first Full load handles all of them; these only affect change-captured rows):
  • Spatial (POINT/GEOMETRY/…): Full stores WKT text (POINT(1 2)); CDC arrives as the hex of the binary geometry — values preserved, only the text shape differs.
  • DATETIME(6) sub-second precision: microseconds are captured to millisecond precision over CDC (e.g. .999999.999000).
  • Raw-binary VARBINARY/BINARY holding non-text bytes: hex is exact for BLOB and text-shaped binary; arbitrary binary may not round-trip byte-for-byte over CDC.
Need any of these exact over change capture? Let us know.

Keep an eye on binlog retention

The source keeps binlog files for binlog_expire_logs_seconds (RDS: the binlog retention hours setting). As long as a CDC task keeps running it stays well inside that window. If you pause a task longer than retention, the binlog it needs is purged — Elglide detects this and automatically re-baselines with a full sync, so you don't lose data, but the next run is a full reload.

Verify in Elglide

Open the connection's Pipelines tab. A binlog-ready table will create an Incremental task using the Binary Log mechanism — a first Full load to seed Snowflake, then change-only syncs. If a table still loads Full, re-check the boxes above (binary logging, ROW + FULL, a primary key, and on MySQL gtid_mode = ON — MariaDB needs none).