【MySQL】Replication - GTID

Posted by 西维蜀黍 on 2021-10-18, Last Modified on 2022-02-19

Background

In MySQL 5.5, resuming a broken replication setup required you to determine the last binary log file and position, which are distinct on nodes if binary logging is enabled. If the MySQL master fails, replication breaks and the slave will need to switch to another master. You will need to promote the most updated slave node to be a master, and manually determine a new binary log file and position of the last transaction executed by the slave. Another option is to dump the data from the new master node, restore it on slave and start replication with the new master node. These options are of course doable, but not very practical in production.

How GTID Solves the Problem

GTID (Global Transaction Identifier) provides a better transactions mapping across nodes. In MySQL 5.5. or before, Replication works in such a way that all nodes will generate different binlog files. Binlog events are the same and in the same order, but binlog file offsets may vary. With GTID, slaves can see a unique transaction coming in from several masters and this can easily be mapped into the slave execution list if it needs to restart or resume replication.

Every transaction has a unique identifier which identifies it in the same way on every server. It’s not important anymore in which binary log position a transaction was recorded, all you need to know is the GTID: ‘966073f3-b6a4-11e4-af2c-080027880ca6:4’. GTID is built from two parts - the unique identifier of a server where a transaction was first executed, and a sequence number. In the above example, we can see that the transaction was executed by the server with server_uuid of ‘966073f3-b6a4-11e4-af2c-080027880ca6’ and it’s 4th transaction executed there. This information is enough to perform complex topology changes - MySQL knows which transactions have been executed and therefore it knows which transactions need to be executed next. Forget about binary logs, it’s now all in the GTID.

All necessary information for synchronizing with the master can be obtained directly from the replication stream. When you are using GTIDs for replication, you do not need to include MASTER_LOG_FILE or MASTER_LOG_POS options in the CHANGE MASTER TO statement; instead, it is necessary only to enable the MASTER_AUTO_POSITION option.

GTIDs (global transaction identifiers) - Transaction-based Replication

GTID (global transaction identifier) 即全局事务 ID,一个事务对应一个 GTID,保证了在每个在主库上提交的事务在集群中有一个唯一的 ID。

When using GTIDs, each transaction can be identified and tracked as it is committed on the originating server and applied by any replicas; this means that it is not necessary when using GTIDs to refer to log files or positions within those files when starting a new replica or failing over to a new source, which greatly simplifies these tasks. Because GTID-based replication is completely transaction-based, it is simple to determine whether sources and replicas are consistent; as long as all transactions committed on a source are also committed on a replica, consistency between the two is guaranteed.

GTIDs 组成

GTID = source_id:transaction_id

source_id 正常即是 server_uuid,在第一次启动时生成(函数 generate_server_uuid),并持久化到 DATADIR/auto.cnf 文件里。

transaction_id顺序化的序列号(sequence number),在每台 MySQL 服务器上都是从 1 开始自增长的序列,是事务的唯一标识。

Refer to https://dev.mysql.com/doc/refman/8.0/en/replication-gtids-concepts.html

GTID 生成

GTID 的生成受 gtid_next 控制。

在 Master 上,gtid_next 是默认的 AUTOMATIC,即 GTID 在每次事务提交时自动生成。它从当前已执行的 GTID 集合(即 gtid_executed)中,找一个大于 0 的未使用的最小值作为下个事务 GTID。在实际的更新事务记录之前将 GTID 写入到 Binlog。

在 Slave 上,从 Binlog 先读取到主库的 GTID(即 set gtid_next 记录),而后执行的事务采用该 GTID。

GTID复制原理

在原来基于日志的复制中,从库需要告知主库要从哪个偏移量进行增量同步, 如果指定错误会造成数据的遗漏,从而造成数据的不一致。

而基于 GTID 的复制中,从库会告知主库已经执行的事务的 GTID 的值,然后主库会将所有未执行的事务的 GTID 的列表返回给从库,并且可以保证同一个事务只在指定的从库执行一次,通过全局的事务 ID 确定从库要执行的事务的方式代替了以前需要用 Binlog 和 位点确定从库要执行的事务的方式

基于 GTID 的复制过程如下:

  1. master 更新数据时,会在事务前产生 GTID,一同记录到 Binlog 日志中。
  2. slave 端的 I/O 线程将变更的 Binlog,写入到本地的 relay log 中,读取值是根据gitd_next变量,告诉我们 slave 下一个执行哪个 GTID。
  3. SQL 线程从 relay log 中获取 GTID,然后对比 slave 端的 Binlog 是否有记录。如果有记录,说明该 GTID 的事务已经执行,slave 会忽略。
  4. 如果没有记录,slave 就会从 relay log 中执行该 GTID 的事务,并记录到 Binlog。
  5. 在解析过程中会判断是否有主键,如果没有就用二级索引,如果没有二级索引就用全部扫描。

GTID 的好处

  1. GTID 使用 master_auto_position=1 代替了 Binlog 的主从复制方案,相比 Binlog 方式更容易搭建主从复制。
  2. GTID 方便实现主从之间的 failover(主从切换),不用一步一步的去定位 Binlog日志文件和查找 Binlog 的位点信息。

GTID 模式复制局限性

  1. 在一个事务里面混合使用引擎,如 Innodb(支持事务)、MyISAM(不支持事务), 造成多个 GTIDs 和同一个事务相关联出错。
  2. CREATE TABLE…..SELECT 不能使用,该语句产生的两个 Event。 在某一情况会使用同一个 GTID(同一个 GTID 在 slave 只能被使用一次):
    • event one:创建表语句 create table
    • event two :插入数据语句 insert
  3. CREATE TEMPORARY TABLE and DROP TEMPORARY TABLE 不能在事务内使用 (启用了 –enforce-gtid-consistency 参数)。
  4. 使用 GTID 复制从库跳过错误时,不支持 sql_slave_skip_counter 参数的语法。

Reference