Chapter 17 Replication

Table of Contents

17.1 Configuring Replication
17.1.1 Binary Log File Position Based Replication Configuration Overview
17.1.2 Setting Up Binary Log File Position Based Replication
17.1.3 Replication with Global Transaction Identifiers
17.1.4 Changing GTID Mode on Online Servers
17.1.5 MySQL Multi-Source Replication
17.1.6 Replication and Binary Logging Options and Variables
17.1.7 Common Replication Administration Tasks
17.2 Replication Implementation
17.2.1 Replication Formats
17.2.2 Replication Channels
17.2.3 Replication Threads
17.2.4 Relay Log and Replication Metadata Repositories
17.2.5 How Servers Evaluate Replication Filtering Rules
17.3 Replication Security
17.3.1 Setting Up Replication to Use Encrypted Connections
17.3.2 Encrypting Binary Log Files and Relay Log Files
17.3.3 Replication Privilege Checks
17.4 Replication Solutions
17.4.1 Using Replication for Backups
17.4.2 Handling an Unexpected Halt of a Replica
17.4.3 Monitoring Row-based Replication
17.4.4 Using Replication with Different Source and Replica Storage Engines
17.4.5 Using Replication for Scale-Out
17.4.6 Replicating Different Databases to Different Replicas
17.4.7 Improving Replication Performance
17.4.8 Switching Sources During Failover
17.4.9 Switching Sources with Asynchronous Connection Failover
17.4.10 Semisynchronous Replication
17.4.11 Delayed Replication
17.5 Replication Notes and Tips
17.5.1 Replication Features and Issues
17.5.2 Replication Compatibility Between MySQL Versions
17.5.3 Upgrading a Replication Setup
17.5.4 Troubleshooting Replication
17.5.5 How to Report Replication Bugs or Problems

Replication enables data from one MySQL database server (known as a source) to be copied to one or more MySQL database servers (known as replicas). Replication is asynchronous by default; replicas do not need to be connected permanently to receive updates from a source. Depending on the configuration, you can replicate all databases, selected databases, or even selected tables within a database.

Advantages of replication in MySQL include:

For information on how to use replication in such scenarios, see Section 17.4, “Replication Solutions”.

MySQL 8.0 supports different methods of replication. The traditional method is based on replicating events from the source's binary log, and requires the log files and positions in them to be synchronized between source and replica. The newer method based on global transaction identifiers (GTIDs) is transactional and therefore does not require working with log files or positions within these files, which greatly simplifies many common replication tasks. Replication using GTIDs guarantees consistency between source and replica as long as all transactions committed on the source have also been applied on the replica. For more information about GTIDs and GTID-based replication in MySQL, see Section 17.1.3, “Replication with Global Transaction Identifiers”. For information on using binary log file position based replication, see Section 17.1, “Configuring Replication”.

Replication in MySQL supports different types of synchronization. The original type of synchronization is one-way, asynchronous replication, in which one server acts as the source, while one or more other servers act as replicas. This is in contrast to the synchronous replication which is a characteristic of NDB Cluster (see Chapter 23, MySQL NDB Cluster 8.0). In MySQL 8.0, semisynchronous replication is supported in addition to the built-in asynchronous replication. With semisynchronous replication, a commit performed on the source blocks before returning to the session that performed the transaction until at least one replica acknowledges that it has received and logged the events for the transaction; see Section 17.4.10, “Semisynchronous Replication”. MySQL 8.0 also supports delayed replication such that a replica deliberately lags behind the source by at least a specified amount of time; see Section 17.4.11, “Delayed Replication”. For scenarios where synchronous replication is required, use NDB Cluster (see Chapter 23, MySQL NDB Cluster 8.0).

There are a number of solutions available for setting up replication between servers, and the best method to use depends on the presence of data and the engine types you are using. For more information on the available options, see Section 17.1.2, “Setting Up Binary Log File Position Based Replication”.

There are two core types of replication format, Statement Based Replication (SBR), which replicates entire SQL statements, and Row Based Replication (RBR), which replicates only the changed rows. You can also use a third variety, Mixed Based Replication (MBR). For more information on the different replication formats, see Section 17.2.1, “Replication Formats”.

Replication is controlled through a number of different options and variables. For more information, see Section 17.1.6, “Replication and Binary Logging Options and Variables”. Additional security measures can be applied to a replication topology, as described in Section 17.3, “Replication Security”.

You can use replication to solve a number of different problems, including performance, supporting the backup of different databases, and as part of a larger solution to alleviate system failures. For information on how to address these issues, see Section 17.4, “Replication Solutions”.

For notes and tips on how different data types and statements are treated during replication, including details of replication features, version compatibility, upgrades, and potential problems and their resolution, see Section 17.5, “Replication Notes and Tips”. For answers to some questions often asked by those who are new to MySQL Replication, see Section A.14, “MySQL 8.0 FAQ: Replication”.

For detailed information on the implementation of replication, how replication works, the process and contents of the binary log, background threads and the rules used to decide how statements are recorded and replicated, see Section 17.2, “Replication Implementation”.

17.1 Configuring Replication

This section describes how to configure the different types of replication available in MySQL and includes the setup and configuration required for a replication environment, including step-by-step instructions for creating a new replication environment. The major components of this section are:

17.1.1 Binary Log File Position Based Replication Configuration Overview

This section describes replication between MySQL servers based on the binary log file position method, where the MySQL instance operating as the source (where the database changes take place) writes updates and changes as events to the binary log. The information in the binary log is stored in different logging formats according to the database changes being recorded. Replicas are configured to read the binary log from the source and to execute the events in the binary log on the replica's local database.

Each replica receives a copy of the entire contents of the binary log. It is the responsibility of the replica to decide which statements in the binary log should be executed. Unless you specify otherwise, all events in the source's binary log are executed on the replica. If required, you can configure the replica to process only events that apply to particular databases or tables.

Important

You cannot configure the source to log only certain events.

Each replica keeps a record of the binary log coordinates: the file name and position within the file that it has read and processed from the source. This means that multiple replicas can be connected to the source and executing different parts of the same binary log. Because the replicas control this process, individual replicas can be connected and disconnected from the server without affecting the source's operation. Also, because each replica records the current position within the binary log, it is possible for replicas to be disconnected, reconnect and then resume processing.

The source and each replica must be configured with a unique ID (using the server_id system variable). In addition, each replica must be configured with information about the source's host name, log file name, and position within that file. These details can be controlled from within a MySQL session using the CHANGE MASTER TO statement on the replica. The details are stored within the replica's connection metadata repository (see Section 17.2.4, “Relay Log and Replication Metadata Repositories”).

17.1.2 Setting Up Binary Log File Position Based Replication

This section describes how to set up a MySQL server to use binary log file position based replication. There are a number of different methods for setting up replication, and the exact method to use depends on how you are setting up replication, and whether you already have data in the database on the source that you want to replicate.

Tip

To deploy multiple instances of MySQL, you can use InnoDB Cluster which enables you to easily administer a group of MySQL server instances in MySQL Shell. InnoDB Cluster wraps MySQL Group Replication in a programmatic environment that enables you easily deploy a cluster of MySQL instances to achieve high availability. In addition, InnoDB Cluster interfaces seamlessly with MySQL Router, which enables your applications to connect to the cluster without writing your own failover process. For similar use cases that do not require high availability, however, you can use InnoDB ReplicaSet. Installation instructions for MySQL Shell can be found here.

There are some generic tasks that are common to all setups:

Note

Certain steps within the setup process require the SUPER privilege. If you do not have this privilege, it might not be possible to enable replication.

After configuring the basic options, select your scenario:

Before administering MySQL replication servers, read this entire chapter and try all statements mentioned in Section 13.4.1, “SQL Statements for Controlling Source Servers”, and Section 13.4.2, “SQL Statements for Controlling Replica Servers”. Also familiarize yourself with the replication startup options described in Section 17.1.6, “Replication and Binary Logging Options and Variables”.

17.1.2.1 Setting the Replication Source Configuration

To configure a source to use binary log file position based replication, you must ensure that binary logging is enabled, and establish a unique server ID.

Each server within a replication topology must be configured with a unique server ID, which you can specify using the server_id system variable. This server ID is used to identify individual servers within the replication topology, and must be a positive integer between 1 and (232)−1. The default server_id value from MySQL 8.0 is 1. You can change the server_id value dynamically by issuing a statement like this:

SET GLOBAL server_id = 2;

How you organize and select the server IDs is your choice, so long as each server ID is different from every other server ID in use by any other server in the replication topology. Note that if a value of 0 (which was the default in earlier releases) was set previously for the server ID, you must restart the server to initialize the source with your new nonzero server ID. Otherwise, a server restart is not needed when you change the server ID, unless you make other configuration changes that require it.

Binary logging is required on the source because the binary log is the basis for replicating changes from the source to its replicas. Binary logging is enabled by default (the log_bin system variable is set to ON). The --log-bin option tells the server what base name to use for binary log files. It is recommended that you specify this option to give the binary log files a non-default base name, so that if the host name changes, you can easily continue to use the same binary log file names (see Section B.3.7, “Known Issues in MySQL”). If binary logging was previously disabled on the source using the --skip-log-bin option, you must restart the server without this option to enable it.

Note

The following options also have an impact on the source:

  • For the greatest possible durability and consistency in a replication setup using InnoDB with transactions, you should use innodb_flush_log_at_trx_commit=1 and sync_binlog=1 in the source's my.cnf file.

  • Ensure that the skip_networking system variable is not enabled on the source. If networking has been disabled, the replica cannot communicate with the source and replication fails.

17.1.2.2 Setting the Replica Configuration

Each replica must have a unique server ID, as specified by the server_id system variable. If you are setting up multiple replicas, each one must have a unique server_id value that differs from that of the source and from any of the other replicas. If the replica's server ID is not already set, or the current value conflicts with the value that you have chosen for the source or another replica, you must change it.

The default server_id value is 1. You can change the server_id value dynamically by issuing a statement like this:

SET GLOBAL server_id = 21;

Note that a value of 0 for the server ID prevents a replica from connecting to a source. If that server ID value (which was the default in earlier releases) was set previously, you must restart the server to initialize the replica with your new nonzero server ID. Otherwise, a server restart is not needed when you change the server ID, unless you make other configuration changes that require it. For example, if binary logging was disabled on the server and you want it enabled for your replica, a server restart is required to enable this.

If you are shutting down the replica server, you can edit the [mysqld] section of the configuration file to specify a unique server ID. For example:

[mysqld]
server-id=21

Binary logging is enabled by default on all servers. A replica is not required to have binary logging enabled for replication to take place. However, binary logging on a replica means that the replica's binary log can be used for data backups and crash recovery. Replicas that have binary logging enabled can also be used as part of a more complex replication topology. For example, you might want to set up replication servers using this chained arrangement:

A -> B -> C

Here, A serves as the source for the replica B, and B serves as the source for the replica C. For this to work, B must be both a source and a replica. Updates received from A must be logged by B to its binary log, in order to be passed on to C. In addition to binary logging, this replication topology requires the log_slave_updates system variable to be enabled. With replica updates enabled, the replica writes updates that are received from a source and performed by the replica's SQL thread to the replica's own binary log. The log_slave_updates system variable is enabled by default.

If you need to disable binary logging or replica update logging on a replica, you can do this by specifying the --skip-log-bin and --log-slave-updates=OFF options for the replica. If you decide to re-enable these features on the replica, remove the relevant options and restart the server.

17.1.2.3 Creating a User for Replication

Each replica connects to the source using a MySQL user name and password, so there must be a user account on the source that the replica can use to connect. The user name is specified by the MASTER_USER option on the CHANGE MASTER TO command when you set up a replica. Any account can be used for this operation, providing it has been granted the REPLICATION SLAVE privilege. You can choose to create a different account for each replica, or connect to the source using the same account for each replica.

Although you do not have to create an account specifically for replication, you should be aware that the replication user name and password are stored in plain text in the replica's connection metadata repository mysql.slave_master_info (see Section 17.2.4.2, “Replication Metadata Repositories”). Therefore, you may want to create a separate account that has privileges only for the replication process, to minimize the possibility of compromise to other accounts.

To create a new account, use CREATE USER. To grant this account the privileges required for replication, use the GRANT statement. If you create an account solely for the purposes of replication, that account needs only the REPLICATION SLAVE privilege. For example, to set up a new user, repl, that can connect for replication from any host within the example.com domain, issue these statements on the source:

mysql> CREATE USER 'repl'@'%.example.com' IDENTIFIED BY 'password';
mysql> GRANT REPLICATION SLAVE ON *.* TO 'repl'@'%.example.com';

See Section 13.7.1, “Account Management Statements”, for more information on statements for manipulation of user accounts.

Important

To connect to the source using a user account that authenticates with the caching_sha2_password plugin, you must either set up a secure connection as described in Section 17.3.1, “Setting Up Replication to Use Encrypted Connections”, or enable the unencrypted connection to support password exchange using an RSA key pair. The caching_sha2_password authentication plugin is the default for new users created from MySQL 8.0 (for details, see Section 6.4.1.2, “Caching SHA-2 Pluggable Authentication”). If the user account that you create or use for replication (as specified by the MASTER_USER option) uses this authentication plugin, and you are not using a secure connection, you must enable RSA key pair-based password exchange for a successful connection.

17.1.2.4 Obtaining the Replication Source Binary Log Coordinates

To configure the replica to start the replication process at the correct point, you need to note the source's current coordinates within its binary log.

Warning

This procedure uses FLUSH TABLES WITH READ LOCK, which blocks COMMIT operations for InnoDB tables.

If you are planning to shut down the source to create a data snapshot, you can optionally skip this procedure and instead store a copy of the binary log index file along with the data snapshot. In that situation, the source creates a new binary log file on restart. The source binary log coordinates where the replica must start the replication process are therefore the start of that new file, which is the next binary log file on the source following after the files that are listed in the copied binary log index file.

To obtain the source binary log coordinates, follow these steps:

  1. Start a session on the source by connecting to it with the command-line client, and flush all tables and block write statements by executing the FLUSH TABLES WITH READ LOCK statement:

    mysql> FLUSH TABLES WITH READ LOCK;
    
    Warning

    Leave the client from which you issued the FLUSH TABLES statement running so that the read lock remains in effect. If you exit the client, the lock is released.

  2. In a different session on the source, use the SHOW MASTER STATUS statement to determine the current binary log file name and position:

    mysql > SHOW MASTER STATUS;
    +------------------+----------+--------------+------------------+
    | File             | Position | Binlog_Do_DB | Binlog_Ignore_DB |
    +------------------+----------+--------------+------------------+
    | mysql-bin.000003 | 73       | test         | manual,mysql     |
    +------------------+----------+--------------+------------------+
    

    The File column shows the name of the log file and the Position column shows the position within the file. In this example, the binary log file is mysql-bin.000003 and the position is 73. Record these values. You need them later when you are setting up the replica. They represent the replication coordinates at which the replica should begin processing new updates from the source.

    If the source has been running previously with binary logging disabled, the log file name and position values displayed by SHOW MASTER STATUS or mysqldump --master-data are empty. In that case, the values that you need to use later when specifying the source's binary log file and position are the empty string ('') and 4.

You now have the information you need to enable the replica to start reading from the source's binary log in the correct place to start replication.

The next step depends on whether you have existing data on the source. Choose one of the following options:

17.1.2.5 Choosing a Method for Data Snapshots

If the source database contains existing data it is necessary to copy this data to each replica. There are different ways to dump the data from the source database. The following sections describe possible options.

To select the appropriate method of dumping the database, choose between these options:

  • Use the mysqldump tool to create a dump of all the databases you want to replicate. This is the recommended method, especially when using InnoDB.

  • If your database is stored in binary portable files, you can copy the raw data files to a replica. This can be more efficient than using mysqldump and importing the file on each replica, because it skips the overhead of updating indexes as the INSERT statements are replayed. With storage engines such as InnoDB this is not recommended.

Tip

To deploy multiple instances of MySQL, you can use InnoDB Cluster which enables you to easily administer a group of MySQL server instances in MySQL Shell. InnoDB Cluster wraps MySQL Group Replication in a programmatic environment that enables you easily deploy a cluster of MySQL instances to achieve high availability. In addition, InnoDB Cluster interfaces seamlessly with MySQL Router, which enables your applications to connect to the cluster without writing your own failover process. For similar use cases that do not require high availability, however, you can use InnoDB ReplicaSet. Installation instructions for MySQL Shell can be found here.

17.1.2.5.1 Creating a Data Snapshot Using mysqldump

To create a snapshot of the data in an existing source database, use the mysqldump tool. Once the data dump has been completed, import this data into the replica before starting the replication process.

The following example dumps all databases to a file named dbdump.db, and includes the --master-data option which automatically appends the CHANGE MASTER TO statement required on the replica to start the replication process:

shell> mysqldump --all-databases --master-data > dbdump.db
Note

If you do not use --master-data, then it is necessary to lock all tables in a separate session manually. See Section 17.1.2.4, “Obtaining the Replication Source Binary Log Coordinates”.

It is possible to exclude certain databases from the dump using the mysqldump tool. If you want to choose which databases to include in the dump, do not use --all-databases. Choose one of these options:

  • Exclude all the tables in the database using --ignore-table option.

  • Name only those databases which you want dumped using the --databases option.

Note

By default, if GTIDs are in use on the source (gtid_mode=ON), mysqldump includes the GTIDs from the gtid_executed set on the source in the dump output to add them to the gtid_purged set on the replica. If you are dumping only specific databases or tables, it is important to note that the value that is included by mysqldump includes the GTIDs of all transactions in the gtid_executed set on the source, even those that changed suppressed parts of the database, or other databases on the server that were not included in the partial dump. Check the description for mysqldump's --set-gtid-purged option to find the outcome of the default behavior for the MySQL Server versions you are using, and how to change the behavior if this outcome is not suitable for your situation.

For more information, see Section 4.5.4, “mysqldump — A Database Backup Program”.

To import the data, either copy the dump file to the replica, or access the file from the source when connecting remotely to the replica.

17.1.2.5.2 Creating a Data Snapshot Using Raw Data Files

This section describes how to create a data snapshot using the raw files which make up the database. Employing this method with a table using a storage engine that has complex caching or logging algorithms requires extra steps to produce a perfect point in time snapshot: the initial copy command could leave out cache information and logging updates, even if you have acquired a global read lock. How the storage engine responds to this depends on its crash recovery abilities.

If you use InnoDB tables, you can use the mysqlbackup command from the MySQL Enterprise Backup component to produce a consistent snapshot. This command records the log name and offset corresponding to the snapshot to be used on the replica. MySQL Enterprise Backup is a commercial product that is included as part of a MySQL Enterprise subscription. See Section 30.2, “MySQL Enterprise Backup Overview” for detailed information.

This method also does not work reliably if the source and replica have different values for ft_stopword_file, ft_min_word_len, or ft_max_word_len and you are copying tables having full-text indexes.

Assuming the above exceptions do not apply to your database, use the cold backup technique to obtain a reliable binary snapshot of InnoDB tables: do a slow shutdown of the MySQL Server, then copy the data files manually.

To create a raw data snapshot of MyISAM tables when your MySQL data files exist on a single file system, you can use standard file copy tools such as cp or copy, a remote copy tool such as scp or rsync, an archiving tool such as zip or tar, or a file system snapshot tool such as dump. If you are replicating only certain databases, copy only those files that relate to those tables. For InnoDB, all tables in all databases are stored in the system tablespace files, unless you have the innodb_file_per_table option enabled.

The following files are not required for replication:

  • Files relating to the mysql database.

  • The replica's connection metadata repository file master.info, if used; the use of this file is now deprecated (see Section 17.2.4, “Relay Log and Replication Metadata Repositories”).

  • The source's binary log files, with the exception of the binary log index file if you are going to use this to locate the source binary log coordinates for the replica.

  • Any relay log files.

Depending on whether you are using InnoDB tables or not, choose one of the following:

If you are using InnoDB tables, and also to get the most consistent results with a raw data snapshot, shut down the source server during the process, as follows:

  1. Acquire a read lock and get the source's status. See Section 17.1.2.4, “Obtaining the Replication Source Binary Log Coordinates”.

  2. In a separate session, shut down the source server:

    shell> mysqladmin shutdown
    
  3. Make a copy of the MySQL data files. The following examples show common ways to do this. You need to choose only one of them:

    shell> tar cf /tmp/db.tar ./data
    shell> zip -r /tmp/db.zip ./data
    shell> rsync --recursive ./data /tmp/dbdata
    
  4. Restart the source server.

If you are not using InnoDB tables, you can get a snapshot of the system from a source without shutting down the server as described in the following steps:

  1. Acquire a read lock and get the source's status. See Section 17.1.2.4, “Obtaining the Replication Source Binary Log Coordinates”.

  2. Make a copy of the MySQL data files. The following examples show common ways to do this. You need to choose only one of them:

    shell> tar cf /tmp/db.tar ./data
    shell> zip -r /tmp/db.zip ./data
    shell> rsync --recursive ./data /tmp/dbdata
    
  3. In the client where you acquired the read lock, release the lock:

    mysql> UNLOCK TABLES;
    

Once you have created the archive or copy of the database, copy the files to each replica before starting the replication process.

17.1.2.6 Setting Up Replicas

The following sections describe how to set up replicas. Before you proceed, ensure that you have:

The next steps depend on whether you have existing data to import to the replica or not. See Section 17.1.2.5, “Choosing a Method for Data Snapshots” for more information. Choose one of the following:

17.1.2.6.1 Setting Up Replication with New Source and Replicas

When there is no snapshot of a previous database to import, configure the replica to start replication from the new source.

To set up replication between a source and a new replica:

  1. Start up the replica.

  2. Execute a CHANGE MASTER TO statement on the replica to set the source configuration. See Section 17.1.2.7, “Setting the Source Configuration on the Replica”.

Perform these replica setup steps on each replica.

This method can also be used if you are setting up new servers but have an existing dump of the databases from a different server that you want to load into your replication configuration. By loading the data into a new source, the data is automatically replicated to the replicas.

If you are setting up a new replication environment using the data from a different existing database server to create a new source, run the dump file generated from that server on the new source. The database updates are automatically propagated to the replicas:

shell> mysql -h source < fulldb.dump
17.1.2.6.2 Setting Up Replication with Existing Data

When setting up replication with existing data, transfer the snapshot from the source to the replica before starting replication. The process for importing data to the replica depends on how you created the snapshot of data on the source.

Tip

To deploy multiple instances of MySQL, you can use InnoDB Cluster which enables you to easily administer a group of MySQL server instances in MySQL Shell. InnoDB Cluster wraps MySQL Group Replication in a programmatic environment that enables you easily deploy a cluster of MySQL instances to achieve high availability. In addition, InnoDB Cluster interfaces seamlessly with MySQL Router, which enables your applications to connect to the cluster without writing your own failover process. For similar use cases that do not require high availability, however, you can use InnoDB ReplicaSet. Installation instructions for MySQL Shell can be found here.

Note

If the replication source server or existing replica that you are copying to create the new replica has any scheduled events, ensure that these are disabled on the new replica before you start it. If an event runs on the new replica that has already run on the source, the duplicated operation causes an error. The Event Scheduler is controlled by the event_scheduler system variable, which defaults to ON from MySQL 8.0, so events that are active on the original server run by default when the new replica starts up. To stop all events from running on the new replica, set the event_scheduler system variable to OFF or DISABLED on the new replica. Alternatively, you can use the ALTER EVENT statement to set individual events to DISABLE or DISABLE ON SLAVE to prevent them from running on the new replica. You can list the events on a server using the SHOW statement or the Information Schema EVENTS table. For more information, see Section 17.5.1.16, “Replication of Invoked Features”.

Choose one of the following procedures to import the data to the replica.

If you used mysqldump:

  1. Start the replica, using the --skip-slave-start option so that replication does not start.

  2. Import the dump file:

    shell> mysql < fulldb.dump
    

If you created a snapshot using the raw data files:

  1. Extract the data files into your replica's data directory. For example:

    shell> tar xvf dbdump.tar
    

    You may need to set permissions and ownership on the files so that the replica server can access and modify them.

  2. Start the replica, using the --skip-slave-start option so that replication does not start.

  3. Configure the replica with the replication coordinates from the source. This tells the replica the binary log file and position within the file where replication needs to start. Also, configure the replica with the login credentials and host name of the source. For more information on the CHANGE MASTER TO statement required, see Section 17.1.2.7, “Setting the Source Configuration on the Replica”.

  4. Start the replication threads by issuing a START REPLICA | SLAVE statement.

After you have performed this procedure, the replica connects to the source and replicates any updates that have occurred on the source since the snapshot was taken. Error messages are issued to the replica's error log if it is not able to replicate for any reason.

The replica uses information logged in its connection metadata repository and applier metadata repository to keep track of how much of the source's binary log it has processed. From MySQL 8.0, by default, these repositories are tables named slave_master_info and slave_relay_log_info in the mysql database. Do not remove or edit these tables unless you know exactly what you are doing and fully understand the implications. Even in that case, it is preferred that you use the CHANGE MASTER TO statement to change replication parameters. The replica uses the values specified in the statement to update the replication metadata repositories automatically. See Section 17.2.4, “Relay Log and Replication Metadata Repositories”, for more information.

Note

The contents of the replica's connection metadata repository override some of the server options specified on the command line or in my.cnf. See Section 17.1.6, “Replication and Binary Logging Options and Variables”, for more details.

A single snapshot of the source suffices for multiple replicas. To set up additional replicas, use the same source snapshot and follow the replica portion of the procedure just described.

17.1.2.7 Setting the Source Configuration on the Replica

To set up the replica to communicate with the source for replication, configure the replica with the necessary connection information. To do this, execute the following statement on the replica, replacing the option values with the actual values relevant to your system:

mysql> CHANGE MASTER TO
    ->     MASTER_HOST='source_host_name',
    ->     MASTER_USER='replication_user_name',
    ->     MASTER_PASSWORD='replication_password',
    ->     MASTER_LOG_FILE='recorded_log_file_name',
    ->     MASTER_LOG_POS=recorded_log_position;
Note

Replication cannot use Unix socket files. You must be able to connect to the source MySQL server using TCP/IP.

The CHANGE MASTER TO statement has other options as well. For example, it is possible to set up secure replication using SSL. For a full list of options, and information about the maximum permissible length for the string-valued options, see Section 13.4.2.1, “CHANGE MASTER TO Statement”.

Important

As noted in Section 17.1.2.3, “Creating a User for Replication”, if you are not using a secure connection and the user account named in the MASTER_USER option authenticates with the caching_sha2_password plugin (the default from MySQL 8.0), you must specify the MASTER_PUBLIC_KEY_PATH or GET_MASTER_PUBLIC_KEY option in the CHANGE MASTER TO statement to enable RSA key pair-based password exchange.

17.1.2.8 Adding Replicas to a Replication Environment

You can add another replica to an existing replication configuration without stopping the source server. To do this, you can set up the new replica by copying the data directory of an existing replica, and giving the new replica a different server ID (which is user-specified) and server UUID (which is generated at startup).

Note

If the replication source server or existing replica that you are copying to create the new replica has any scheduled events, ensure that these are disabled on the new replica before you start it. If an event runs on the new replica that has already run on the source, the duplicated operation causes an error. The Event Scheduler is controlled by the event_scheduler system variable, which defaults to ON from MySQL 8.0, so events that are active on the original server run by default when the new replica starts up. To stop all events from running on the new replica, set the event_scheduler system variable to OFF or DISABLED on the new replica. Alternatively, you can use the ALTER EVENT statement to set individual events to DISABLE or DISABLE ON SLAVE to prevent them from running on the new replica. You can list the events on a server using the SHOW statement or the Information Schema EVENTS table. For more information, see Section 17.5.1.16, “Replication of Invoked Features”.

To duplicate an existing replica:

  1. Stop the existing replica and record the replica status information, particularly the source binary log file and relay log file positions. You can view the replica status either in the Performance Schema replication tables (see Section 27.12.11, “Performance Schema Replication Tables”), or by issuing SHOW REPLICA | SLAVE STATUS as follows:

    mysql> STOP SLAVE;
    mysql> SHOW SLAVE STATUS\G
    Or from MySQL 8.0.22:
    mysql> STOP REPLICA;
    mysql> SHOW REPLICA STATUS\G
    
  2. Shut down the existing replica:

    shell> mysqladmin shutdown
    
  3. Copy the data directory from the existing replica to the new replica, including the log files and relay log files. You can do this by creating an archive using tar or WinZip, or by performing a direct copy using a tool such as cp or rsync.

    Important
    • Before copying, verify that all the files relating to the existing replica actually are stored in the data directory. For example, the InnoDB system tablespace, undo tablespace, and redo log might be stored in an alternative location. InnoDB tablespace files and file-per-table tablespaces might have been created in other directories. The binary logs and relay logs for the replica might be in their own directories outside the data directory. Check through the system variables that are set for the existing replica and look for any alternative paths that have been specified. If you find any, copy these directories over as well.

    • During copying, if files have been used for the replication metadata repositories (see Section 17.2.4, “Relay Log and Replication Metadata Repositories”), ensure that you also copy these files from the existing replica to the new replica. If tables have been used for the repositories, which is the default from MySQL 8.0, the tables are in the data directory.

    • After copying, delete the auto.cnf file from the copy of the data directory on the new replica, so that the new replica is started with a different generated server UUID. The server UUID must be unique.

    A common problem that is encountered when adding new replicas is that the new replica fails with a series of warning and error messages like these:

    071118 16:44:10 [Warning] Neither --relay-log nor --relay-log-index were used; so
    replication may break when this MySQL server acts as a replica and has his hostname
    changed!! Please use '--relay-log=new_replica_hostname-relay-bin' to avoid this problem.
    071118 16:44:10 [ERROR] Failed to open the relay log './old_replica_hostname-relay-bin.003525'
    (relay_log_pos 22940879)
    071118 16:44:10 [ERROR] Could not find target log during relay log initialization
    071118 16:44:10 [ERROR] Failed to initialize the master info structure
    

    This situation can occur if the relay_log system variable is not specified, as the relay log files contain the host name as part of their file names. This is also true of the relay log index file if the relay_log_index system variable is not used. For more information about these variables, see Section 17.1.6, “Replication and Binary Logging Options and Variables”.

    To avoid this problem, use the same value for relay_log on the new replica that was used on the existing replica. If this option was not set explicitly on the existing replica, use existing_replica_hostname-relay-bin. If this is not possible, copy the existing replica's relay log index file to the new replica and set the relay_log_index system variable on the new replica to match what was used on the existing replica. If this option was not set explicitly on the existing replica, use existing_replica_hostname-relay-bin.index. Alternatively, if you have already tried to start the new replica after following the remaining steps in this section and have encountered errors like those described previously, then perform the following steps:

    1. If you have not already done so, issue STOP REPLICA | SLAVE on the new replica.

      If you have already started the existing replica again, issue STOP REPLICA | SLAVE on the existing replica as well.

    2. Copy the contents of the existing replica's relay log index file into the new replica's relay log index file, making sure to overwrite any content already in the file.

    3. Proceed with the remaining steps in this section.

  4. When copying is complete, restart the existing replica.

  5. On the new replica, edit the configuration and give the new replica a unique server ID (using the server_id system variable) that is not used by the source or any of the existing replicas.

  6. Start the new replica server, specifying the --skip-slave-start option so that replication does not start yet. Use the Performance Schema replication tables or issue SHOW REPLICA | SLAVE STATUS to confirm that the new replica has the correct settings when compared with the existing replica. Also display the server ID and server UUID and verify that these are correct and unique for the new replica.

  7. Start the replica threads by issuing a START REPLICA | SLAVE statement. The new replica now uses the information in its connection metadata repository to start the replication process.

17.1.3 Replication with Global Transaction Identifiers

This section explains transaction-based replication using global transaction identifiers (GTIDs). When using GTIDs, each transaction can be identified and tracked as it is committed on the originating server and applied by any replicas; this means that it is not necessary when using GTIDs to refer to log files or positions within those files when starting a new replica or failing over to a new source, which greatly simplifies these tasks. Because GTID-based replication is completely transaction-based, it is simple to determine whether sources and replicas are consistent; as long as all transactions committed on a source are also committed on a replica, consistency between the two is guaranteed. You can use either statement-based or row-based replication with GTIDs (see Section 17.2.1, “Replication Formats”); however, for best results, we recommend that you use the row-based format.

GTIDs are always preserved between source and replica. This means that you can always determine the source for any transaction applied on any replica by examining its binary log. In addition, once a transaction with a given GTID is committed on a given server, any subsequent transaction having the same GTID is ignored by that server. Thus, a transaction committed on the source can be applied no more than once on the replica, which helps to guarantee consistency.

This section discusses the following topics:

For information about MySQL Server options and variables relating to GTID-based replication, see Section 17.1.6.5, “Global Transaction ID System Variables”. See also Section 12.19, “Functions Used with Global Transaction Identifiers (GTIDs)”, which describes SQL functions supported by MySQL 8.0 for use with GTIDs.

17.1.3.1 GTID Format and Storage

A global transaction identifier (GTID) is a unique identifier created and associated with each transaction committed on the server of origin (the source). This identifier is unique not only to the server on which it originated, but is unique across all servers in a given replication topology.

GTID assignment distinguishes between client transactions, which are committed on the source, and replicated transactions, which are reproduced on a replica. When a client transaction is committed on the source, it is assigned a new GTID, provided that the transaction was written to the binary log. Client transactions are guaranteed to have monotonically increasing GTIDs without gaps between the generated numbers. If a client transaction is not written to the binary log (for example, because the transaction was filtered out, or the transaction was read-only), it is not assigned a GTID on the server of origin.

Replicated transactions retain the same GTID that was assigned to the transaction on the server of origin. The GTID is present before the replicated transaction begins to execute, and is persisted even if the replicated transaction is not written to the binary log on the replica, or is filtered out on the replica. The MySQL system table mysql.gtid_executed is used to preserve the assigned GTIDs of all the transactions applied on a MySQL server, except those that are stored in a currently active binary log file.

The auto-skip function for GTIDs means that a transaction committed on the source can be applied no more than once on the replica, which helps to guarantee consistency. Once a transaction with a given GTID has been committed on a given server, any attempt to execute a subsequent transaction with the same GTID is ignored by that server. No error is raised, and no statement in the transaction is executed.

If a transaction with a given GTID has started to execute on a server, but has not yet committed or rolled back, any attempt to start a concurrent transaction on the server with the same GTID blocks. The server neither begins to execute the concurrent transaction nor returns control to the client. Once the first attempt at the transaction commits or rolls back, concurrent sessions that were blocking on the same GTID may proceed. If the first attempt rolled back, one concurrent session proceeds to attempt the transaction, and any other concurrent sessions that were blocking on the same GTID remain blocked. If the first attempt committed, all the concurrent sessions stop being blocked, and auto-skip all the statements of the transaction.

A GTID is represented as a pair of coordinates, separated by a colon character (:), as shown here:

GTID = source_id:transaction_id

The source_id identifies the originating server. Normally, the source's server_uuid is used for this purpose. The transaction_id is a sequence number determined by the order in which the transaction was committed on the source. For example, the first transaction to be committed has 1 as its transaction_id, and the tenth transaction to be committed on the same originating server is assigned a transaction_id of 10. It is not possible for a transaction to have 0 as a sequence number in a GTID. For example, the twenty-third transaction to be committed originally on the server with the UUID 3E11FA47-71CA-11E1-9E33-C80AA9429562 has this GTID:

3E11FA47-71CA-11E1-9E33-C80AA9429562:23

The GTID for a transaction is shown in the output from mysqlbinlog, and it is used to identify an individual transaction in the Performance Schema replication status tables, for example, replication_applier_status_by_worker. The value stored by the gtid_next system variable (@@GLOBAL.gtid_next) is a single GTID.

GTID Sets

A GTID set is a set comprising one or more single GTIDs or ranges of GTIDs. GTID sets are used in a MySQL server in several ways. For example, the values stored by the gtid_executed and gtid_purged system variables are GTID sets. The START REPLICA | SLAVE clauses UNTIL SQL_BEFORE_GTIDS and UNTIL SQL_AFTER_GTIDS can be used to make a replica process transactions only up to the first GTID in a GTID set, or stop after the last GTID in a GTID set. The built-in functions GTID_SUBSET() and GTID_SUBTRACT() require GTID sets as input.

A range of GTIDs originating from the same server can be collapsed into a single expression, as shown here:

3E11FA47-71CA-11E1-9E33-C80AA9429562:1-5

The above example represents the first through fifth transactions originating on the MySQL server whose server_uuid is 3E11FA47-71CA-11E1-9E33-C80AA9429562. Multiple single GTIDs or ranges of GTIDs originating from the same server can also be included in a single expression, with the GTIDs or ranges separated by colons, as in the following example:

3E11FA47-71CA-11E1-9E33-C80AA9429562:1-3:11:47-49

A GTID set can include any combination of single GTIDs and ranges of GTIDs, and it can include GTIDs originating from different servers. This example shows the GTID set stored in the gtid_executed system variable (@@GLOBAL.gtid_executed) of a replica that has applied transactions from more than one source:

2174B383-5441-11E8-B90A-C80AA9429562:1-3, 24DA167-0C0C-11E8-8442-00059A3C7B00:1-19

When GTID sets are returned from server variables, UUIDs are in alphabetical order, and numeric intervals are merged and in ascending order.

The syntax for a GTID set is as follows:

gtid_set:
    uuid_set [, uuid_set] ...
    | ''

uuid_set:
    uuid:interval[:interval]...

uuid:
    hhhhhhhh-hhhh-hhhh-hhhh-hhhhhhhhhhhh

h:
    [0-9|A-F]

interval:
    n[-n]

    (n >= 1)
mysql.gtid_executed Table

GTIDs are stored in a table named gtid_executed, in the mysql database. A row in this table contains, for each GTID or set of GTIDs that it represents, the UUID of the originating server, and the starting and ending transaction IDs of the set; for a row referencing only a single GTID, these last two values are the same.

The mysql.gtid_executed table is created (if it does not already exist) when MySQL Server is installed or upgraded, using a CREATE TABLE statement similar to that shown here:

CREATE TABLE gtid_executed (
    source_uuid CHAR(36) NOT NULL,
    interval_start BIGINT(20) NOT NULL,
    interval_end BIGINT(20) NOT NULL,
    PRIMARY KEY (source_uuid, interval_start)
)
Warning

As with other MySQL system tables, do not attempt to create or modify this table yourself.

The mysql.gtid_executed table is provided for internal use by the MySQL server. It enables a replica to use GTIDs when binary logging is disabled on the replica, and it enables retention of the GTID state when the binary logs have been lost. Note that the mysql.gtid_executed table is cleared if you issue RESET MASTER.

GTIDs are stored in the mysql.gtid_executed table only when gtid_mode is ON or ON_PERMISSIVE. If binary logging is disabled (log_bin is OFF), or if log_slave_updates is disabled, the server stores the GTID belonging to each transaction together with the transaction in the mysql.gtid_executed table at transaction commit time. In addition, the table is compressed periodically at a user-configurable rate, as described in mysql.gtid_executed Table Compression.

If binary logging is enabled (log_bin is ON), from MySQL 8.0.17 for the InnoDB storage engine only, the server updates the mysql.gtid_executed table in the same way as when binary logging or replica update logging is disabled, storing the GTID for each transaction at transaction commit time. However, in releases before MySQL 8.0.17, and for other storage engines, the server only updates the mysql.gtid_executed table when the binary log is rotated or the server is shut down. At these times, the server writes GTIDs for all transactions that were written into the previous binary log into the mysql.gtid_executed table. This situation applies on a source prior to MySQL 8.0.17, or on a replica prior to MySQL 8.0.17 where binary logging is enabled, or with storage engines other than InnoDB, it has the following consequences:

  • In the event of the server stopping unexpectedly, the set of GTIDs from the current binary log file is not saved in the mysql.gtid_executed table. These GTIDs are added to the table from the binary log file during recovery so that replication can continue. The exception to this is if you disable binary logging when the server is restarted (using --skip-log-bin or --disable-log-bin). In that case, the server cannot access the binary log file to recover the GTIDs, so replication cannot be started.

  • The mysql.gtid_executed table does not hold a complete record of the GTIDs for all executed transactions. That information is provided by the global value of the gtid_executed system variable. In releases before MySQL 8.0.17 and with storage engines other than InnoDB, always use @@GLOBAL.gtid_executed, which is updated after every commit, to represent the GTID state for the MySQL server, instead of querying the mysql.gtid_executed table.

The MySQL server can write to the mysql.gtid_executed table even when the server is in read only or super read only mode. In releases before MySQL 8.0.17, this ensures that the binary log file can still be rotated in these modes. If the mysql.gtid_executed table cannot be accessed for writes, and the binary log file is rotated for any reason other than reaching the maximum file size (max_binlog_size), the current binary log file continues to be used. An error message is returned to the client that requested the rotation, and a warning is logged on the server. If the mysql.gtid_executed table cannot be accessed for writes and max_binlog_size is reached, the server responds according to its binlog_error_action setting. If IGNORE_ERROR is set, an error is logged on the server and binary logging is halted, or if ABORT_SERVER is set, the server shuts down.

mysql.gtid_executed Table Compression

Over the course of time, the mysql.gtid_executed table can become filled with many rows referring to individual GTIDs that originate on the same server, and whose transaction IDs make up a range, similar to what is shown here:

+--------------------------------------+----------------+--------------+
| source_uuid                          | interval_start | interval_end |
|--------------------------------------+----------------+--------------|
| 3E11FA47-71CA-11E1-9E33-C80AA9429562 | 37             | 37           |
| 3E11FA47-71CA-11E1-9E33-C80AA9429562 | 38             | 38           |
| 3E11FA47-71CA-11E1-9E33-C80AA9429562 | 39             | 39           |
| 3E11FA47-71CA-11E1-9E33-C80AA9429562 | 40             | 40           |
| 3E11FA47-71CA-11E1-9E33-C80AA9429562 | 41             | 41           |
| 3E11FA47-71CA-11E1-9E33-C80AA9429562 | 42             | 42           |
| 3E11FA47-71CA-11E1-9E33-C80AA9429562 | 43             | 43           |
...

To save space, the MySQL server can compress the mysql.gtid_executed table periodically by replacing each such set of rows with a single row that spans the entire interval of transaction identifiers, like this:

+--------------------------------------+----------------+--------------+
| source_uuid                          | interval_start | interval_end |
|--------------------------------------+----------------+--------------|
| 3E11FA47-71CA-11E1-9E33-C80AA9429562 | 37             | 43           |
...

The server can carry out compression using a dedicated foreground thread named thread/sql/compress_gtid_table. This thread is not listed in the output of SHOW PROCESSLIST, but it can be viewed as a row in the threads table, as shown here:

mysql> SELECT * FROM performance_schema.threads WHERE NAME LIKE '%gtid%'\G
*************************** 1. row ***************************
          THREAD_ID: 26
               NAME: thread/sql/compress_gtid_table
               TYPE: FOREGROUND
     PROCESSLIST_ID: 1
   PROCESSLIST_USER: NULL
   PROCESSLIST_HOST: NULL
     PROCESSLIST_DB: NULL
PROCESSLIST_COMMAND: Daemon
   PROCESSLIST_TIME: 1509
  PROCESSLIST_STATE: Suspending
   PROCESSLIST_INFO: NULL
   PARENT_THREAD_ID: 1
               ROLE: NULL
       INSTRUMENTED: YES
            HISTORY: YES
    CONNECTION_TYPE: NULL
       THREAD_OS_ID: 18677

When binary logging is enabled on the server, this compression method is not used, and instead the mysql.gtid_executed table is compressed on each binary log rotation. However, when binary logging is disabled on the server, the thread/sql/compress_gtid_table thread sleeps until a specified number of transactions have been executed, then wakes up to perform compression of the mysql.gtid_executed table. It then sleeps until the same number of transactions have taken place, then wakes up to perform the compression again, repeating this loop indefinitely. The number of transactions that elapse before the table is compressed, and thus the compression rate, is controlled by the value of the gtid_executed_compression_period system variable. Setting that value to 0 means that the thread never wakes up, meaning that this explicit compression method is not used. Instead, compression occurs implicitly as required.

From MySQL 8.0.17, InnoDB transactions are written to the mysql.gtid_executed table by a separate process to non-InnoDB transactions. This process is controlled by a different thread, innodb/clone_gtid_thread. This GTID persister thread collects GTIDs in groups, flushes them to the mysql.gtid_executed table, then compresses the table. If the server has a mix of InnoDB transactions and non-InnoDB transactions, which are written to the mysql.gtid_executed table individually, the compression carried out by the compress_gtid_table thread interferes with the work of the GTID persister thread and can slow it significantly. For this reason, from that release it is recommended that you set gtid_executed_compression_period to 0, so that the compress_gtid_table thread is never activated.

From MySQL 8.0.23, the gtid_executed_compression_period default value is 0, and both InnoDB and non-InnoDB transactions are written to the mysql.gtid_executed table by the GTID persister thread.

For releases before MySQL 8.0.17, the default value of 1000 for gtid_executed_compression_period can be used, meaning that compression of the table is performed after each 1000 transactions, or you can choose an alternative value. In those releases, if you set a value of 0 and binary logging is disabled, explicit compression is not performed on the mysql.gtid_executed table, and you should be prepared for a potentially large increase in the amount of disk space that may be required by the table if you do this.

When a server instance is started, if gtid_executed_compression_period is set to a nonzero value and the thread/sql/compress_gtid_table thread is launched, in most server configurations, explicit compression is performed for the mysql.gtid_executed table. In releases before MySQL 8.0.17 when binary logging is enabled, compression is triggered by the fact of the binary log being rotated at startup. In releases from MySQL 8.0.20, compression is triggered by the thread launch. In the intervening releases, compression does not take place at startup.

17.1.3.2 GTID Life Cycle

The life cycle of a GTID consists of the following steps:

  1. A transaction is executed and committed on the source. This client transaction is assigned a GTID composed of the source's UUID and the smallest nonzero transaction sequence number not yet used on this server. The GTID is written to the source's binary log (immediately preceding the transaction itself in the log). If a client transaction is not written to the binary log (for example, because the transaction was filtered out, or the transaction was read-only), it is not assigned a GTID.

  2. If a GTID was assigned for the transaction, the GTID is persisted atomically at commit time by writing it to the binary log at the beginning of the transaction (as a Gtid_log_event). Whenever the binary log is rotated or the server is shut down, the server writes GTIDs for all transactions that were written into the previous binary log file into the mysql.gtid_executed table.

  3. If a GTID was assigned for the transaction, the GTID is externalized non-atomically (very shortly after the transaction is committed) by adding it to the set of GTIDs in the gtid_executed system variable (@@GLOBAL.gtid_executed). This GTID set contains a representation of the set of all committed GTID transactions, and it is used in replication as a token that represents the server state. With binary logging enabled (as required for the source), the set of GTIDs in the gtid_executed system variable is a complete record of the transactions applied, but the mysql.gtid_executed table is not, because the most recent history is still in the current binary log file.

  4. After the binary log data is transmitted to the replica and stored in the replica's relay log (using established mechanisms for this process, see Section 17.2, “Replication Implementation”, for details), the replica reads the GTID and sets the value of its gtid_next system variable as this GTID. This tells the replica that the next transaction must be logged using this GTID. It is important to note that the replica sets gtid_next in a session context.

  5. The replica verifies that no thread has yet taken ownership of the GTID in gtid_next in order to process the transaction. By reading and checking the replicated transaction's GTID first, before processing the transaction itself, the replica guarantees not only that no previous transaction having this GTID has been applied on the replica, but also that no other session has already read this GTID but has not yet committed the associated transaction. So if multiple clients attempt to apply the same transaction concurrently, the server resolves this by letting only one of them execute. The gtid_owned system variable (@@GLOBAL.gtid_owned) for the replica shows each GTID that is currently in use and the ID of the thread that owns it. If the GTID has already been used, no error is raised, and the auto-skip function is used to ignore the transaction.

  6. If the GTID has not been used, the replica applies the replicated transaction. Because gtid_next is set to the GTID already assigned by the source, the replica does not attempt to generate a new GTID for this transaction, but instead uses the GTID stored in gtid_next.

  7. If binary logging is enabled on the replica, the GTID is persisted atomically at commit time by writing it to the binary log at the beginning of the transaction (as a Gtid_log_event). Whenever the binary log is rotated or the server is shut down, the server writes GTIDs for all transactions that were written into the previous binary log file into the mysql.gtid_executed table.

  8. If binary logging is disabled on the replica, the GTID is persisted atomically by writing it directly into the mysql.gtid_executed table. MySQL appends a statement to the transaction to insert the GTID into the table. From MySQL 8.0, this operation is atomic for DDL statements as well as for DML statements. In this situation, the mysql.gtid_executed table is a complete record of the transactions applied on the replica.

  9. Very shortly after the replicated transaction is committed on the replica, the GTID is externalized non-atomically by adding it to the set of GTIDs in the gtid_executed system variable (@@GLOBAL.gtid_executed) for the replica. As for the source, this GTID set contains a representation of the set of all committed GTID transactions. If binary logging is disabled on the replica, the mysql.gtid_executed table is also a complete record of the transactions applied on the replica. If binary logging is enabled on the replica, meaning that some GTIDs are only recorded in the binary log, the set of GTIDs in the gtid_executed system variable is the only complete record.

Client transactions that are completely filtered out on the source are not assigned a GTID, therefore they are not added to the set of transactions in the gtid_executed system variable, or added to the mysql.gtid_executed table. However, the GTIDs of replicated transactions that are completely filtered out on the replica are persisted. If binary logging is enabled on the replica, the filtered-out transaction is written to the binary log as a Gtid_log_event followed by an empty transaction containing only BEGIN and COMMIT statements. If binary logging is disabled, the GTID of the filtered-out transaction is written to the mysql.gtid_executed table. Preserving the GTIDs for filtered-out transactions ensures that the mysql.gtid_executed table and the set of GTIDs in the gtid_executed system variable can be compressed. It also ensures that the filtered-out transactions are not retrieved again if the replica reconnects to the source, as explained in Section 17.1.3.3, “GTID Auto-Positioning”.

On a multithreaded replica (with slave_parallel_workers > 0 ), transactions can be applied in parallel, so replicated transactions can commit out of order (unless slave_preserve_commit_order=1 is set). When that happens, the set of GTIDs in the gtid_executed system variable contains multiple GTID ranges with gaps between them. (On a source or a single-threaded replica, there are monotonically increasing GTIDs without gaps between the numbers.) Gaps on multithreaded replicas only occur among the most recently applied transactions, and are filled in as replication progresses. When replication threads are stopped cleanly using the STOP REPLICA | SLAVE statement, ongoing transactions are applied so that the gaps are filled in. In the event of a shutdown such as a server failure or the use of the KILL statement to stop replication threads, the gaps might remain.

What changes are assigned a GTID?

The typical scenario is that the server generates a new GTID for a committed transaction. However, GTIDs can also be assigned to other changes besides transactions, and in some cases a single transaction can be assigned multiple GTIDs.

Every database change (DDL or DML) that is written to the binary log is assigned a GTID. This includes changes that are autocommitted, and changes that are committed using BEGIN and COMMIT or START TRANSACTION statements. A GTID is also assigned to the creation, alteration, or deletion of a database, and of a non-table database object such as a procedure, function, trigger, event, view, user, role, or grant.

Non-transactional updates as well as transactional updates are assigned GTIDs. In addition, for a non-transactional update, if a disk write failure occurs while attempting to write to the binary log cache and a gap is therefore created in the binary log, the resulting incident log event is assigned a GTID.

When a table is automatically dropped by a generated statement in the binary log, a GTID is assigned to the statement. Temporary tables are dropped automatically when a replica begins to apply events from a source that has just been started, and when statement-based replication is in use (binlog_format=STATEMENT) and a user session that has open temporary tables disconnects. Tables that use the MEMORY storage engine are deleted automatically the first time they are accessed after the server is started, because rows might have been lost during the shutdown.

When a transaction is not written to the binary log on the server of origin, the server does not assign a GTID to it. This includes transactions that are rolled back and transactions that are executed while binary logging is disabled on the server of origin, either globally (with --skip-log-bin specified in the server's configuration) or for the session (SET @@SESSION.sql_log_bin = 0). This also includes no-op transactions when row-based replication is in use (binlog_format=ROW).

XA transactions are assigned separate GTIDs for the XA PREPARE phase of the transaction and the XA COMMIT or XA ROLLBACK phase of the transaction. XA transactions are persistently prepared so that users can commit them or roll them back in the case of a failure (which in a replication topology might include a failover to another server). The two parts of the transaction are therefore replicated separately, so they must have their own GTIDs, even though a non-XA transaction that is rolled back would not have a GTID.

In the following special cases, a single statement can generate multiple transactions, and therefore be assigned multiple GTIDs:

  • A stored procedure is invoked that commits multiple transactions. One GTID is generated for each transaction that the procedure commits.

  • A multi-table DROP TABLE statement drops tables of different types. Multiple GTIDs can be generated if any of the tables use storage engines that do not support atomic DDL, or if any of the tables are temporary tables.

  • A CREATE TABLE ... SELECT statement is issued when row-based replication is in use (binlog_format=ROW). One GTID is generated for the CREATE TABLE action and one GTID is generated for the row-insert actions.

The gtid_next System Variable

By default, for new transactions committed in user sessions, the server automatically generates and assigns a new GTID. When the transaction is applied on a replica, the GTID from the server of origin is preserved. You can change this behavior by setting the session value of the gtid_next system variable:

  • When gtid_next is set to AUTOMATIC, which is the default, and a transaction is committed and written to the binary log, the server automatically generates and assigns a new GTID. If a transaction is rolled back or not written to the binary log for another reason, the server does not generate and assign a GTID.

  • If you set gtid_next to a valid GTID (consisting of a UUID and a transaction sequence number, separated by a colon), the server assigns that GTID to your transaction. This GTID is assigned and added to gtid_executed even when the transaction is not written to the binary log, or when the transaction is empty.

Note that after you set gtid_next to a specific GTID, and the transaction has been committed or rolled back, an explicit SET @@SESSION.gtid_next statement must be issued before any other statement. You can use this to set the GTID value back to AUTOMATIC if you do not want to assign any more GTIDs explicitly.

When replication applier threads apply replicated transactions, they use this technique, setting @@SESSION.gtid_next explicitly to the GTID of the replicated transaction as assigned on the server of origin. This means the GTID from the server of origin is retained, rather than a new GTID being generated and assigned by the replica. It also means the GTID is added to gtid_executed on the replica even when binary logging or replica update logging is disabled on the replica, or when the transaction is a no-op or is filtered out on the replica.

It is possible for a client to simulate a replicated transaction by setting @@SESSION.gtid_next to a specific GTID before executing the transaction. This technique is used by mysqlbinlog to generate a dump of the binary log that the client can replay to preserve GTIDs. A simulated replicated transaction committed through a client is completely equivalent to a replicated transaction committed through a replication applier thread, and they cannot be distinguished after the fact.

The gtid_purged System Variable

The set of GTIDs in the gtid_purged system variable (@@GLOBAL.gtid_purged) contains the GTIDs of all the transactions that have been committed on the server, but do not exist in any binary log file on the server. gtid_purged is a subset of gtid_executed. The following categories of GTIDs are in gtid_purged:

  • GTIDs of replicated transactions that were committed with binary logging disabled on the replica.

  • GTIDs of transactions that were written to a binary log file that has now been purged.

  • GTIDs that were added explicitly to the set by the statement SET @@GLOBAL.gtid_purged.

You can change the value of gtid_purged in order to record on the server that the transactions in a certain GTID set have been applied, although they do not exist in any binary log on the server. When you add GTIDs to gtid_purged, they are also added to gtid_executed. An example use case for this action is when you are restoring a backup of one or more databases on a server, but you do not have the relevant binary logs containing the transactions on the server. Before MySQL 8.0, you could only change the value of gtid_purged when gtid_executed (and therefore gtid_purged) was empty. From MySQL 8.0, this restriction does not apply, and you can also choose whether to replace the whole GTID set in gtid_purged with a specified GTID set, or to add a specified GTID set to the GTIDs already in gtid_purged. For details of how to do this, see the description for gtid_purged.

The sets of GTIDs in the gtid_executed and gtid_purged system variables are initialized when the server starts. Every binary log file begins with the event Previous_gtids_log_event, which contains the set of GTIDs in all previous binary log files (composed from the GTIDs in the preceding file's Previous_gtids_log_event, and the GTIDs of every Gtid_log_event in the preceding file itself). The contents of Previous_gtids_log_event in the oldest and most recent binary log files are used to compute the gtid_executed and gtid_purged sets at server startup:

  • gtid_executed is computed as the union of the GTIDs in Previous_gtids_log_event in the most recent binary log file, the GTIDs of transactions in that binary log file, and the GTIDs stored in the mysql.gtid_executed table. This GTID set contains all the GTIDs that have been used (or added explicitly to gtid_purged) on the server, whether or not they are currently in a binary log file on the server. It does not include the GTIDs for transactions that are currently being processed on the server (@@GLOBAL.gtid_owned).

  • gtid_purged is computed by first adding the GTIDs in Previous_gtids_log_event in the most recent binary log file and the GTIDs of transactions in that binary log file. This step gives the set of GTIDs that are currently, or were once, recorded in a binary log on the server (gtids_in_binlog). Next, the GTIDs in Previous_gtids_log_event in the oldest binary log file are subtracted from gtids_in_binlog. This step gives the set of GTIDs that are currently recorded in a binary log on the server (gtids_in_binlog_not_purged). Finally, gtids_in_binlog_not_purged is subtracted from gtid_executed. The result is the set of GTIDs that have been used on the server, but are not currently recorded in a binary log file on the server, and this result is used to initialize gtid_purged.

If binary logs from MySQL 5.7.7 or older are involved in these computations, it is possible for incorrect GTID sets to be computed for gtid_executed and gtid_purged, and they remain incorrect even if the server is later restarted. For details, see the description for the binlog_gtid_simple_recovery system variable, which controls how the binary logs are iterated to compute the GTID sets. If one of the situations described there applies on a server, set binlog_gtid_simple_recovery=FALSE in the server's configuration file before starting it. That setting makes the server iterate all the binary log files (not just the newest and oldest) to find where GTID events start to appear. This process could take a long time if the server has a large number of binary log files without GTID events.

Resetting the GTID Execution History

If you need to reset the GTID execution history on a server, use the RESET MASTER statement. For example, you might need to do this after carrying out test queries to verify a replication setup on new GTID-enabled servers, or when you want to join a new server to a replication group but it contains some unwanted local transactions that are not accepted by Group Replication.

Warning

Use RESET MASTER with caution to avoid losing any wanted GTID execution history and binary log files.

Before issuing RESET MASTER, ensure that you have backups of the server's binary log files and binary log index file, if any, and obtain and save the GTID set held in the global value of the gtid_executed system variable (for example, by issuing a SELECT @@GLOBAL.gtid_executed statement and saving the results). If you are removing unwanted transactions from that GTID set, use mysqlbinlog to examine the contents of the transactions to ensure that they have no value, contain no data that must be saved or replicated, and did not result in data changes on the server.

When you issue RESET MASTER, the following reset operations are carried out:

  • The value of the gtid_purged system variable is set to an empty string ('').

  • The global value (but not the session value) of the gtid_executed system variable is set to an empty string.

  • The mysql.gtid_executed table is cleared (see mysql.gtid_executed Table).

  • If the server has binary logging enabled, the existing binary log files are deleted and the binary log index file is cleared.

Note that RESET MASTER is the method to reset the GTID execution history even if the server is a replica where binary logging is disabled. RESET REPLICA | SLAVE has no effect on the GTID execution history.

17.1.3.3 GTID Auto-Positioning

GTIDs replace the file-offset pairs previously required to determine points for starting, stopping, or resuming the flow of data between source and replica. When GTIDs are in use, all the information that the replica needs for synchronizing with the source is obtained directly from the replication data stream.

To start a replica using GTID-based replication, you do not include MASTER_LOG_FILE or MASTER_LOG_POS options in the CHANGE MASTER TO statement used to direct the replica to replicate from a given source. These options specify the name of the log file and the starting position within the file, but with GTIDs the replica does not need this nonlocal data. Instead, you need to enable the MASTER_AUTO_POSITION option. For full instructions to configure and start sources and replicas using GTID-based replication, see Section 17.1.3.4, “Setting Up Replication Using GTIDs”.

The MASTER_AUTO_POSITION option is disabled by default. If multi-source replication is enabled on the replica, you need to set the option for each applicable replication channel. Disabling the MASTER_AUTO_POSITION option again makes the replica revert to file-based replication, in which case you must also specify one or both of the MASTER_LOG_FILE or MASTER_LOG_POS options.

When a replica has GTIDs enabled (GTID_MODE=ON, ON_PERMISSIVE, or OFF_PERMISSIVE ) and the MASTER_AUTO_POSITION option enabled, auto-positioning is activated for connection to the source. The source must have GTID_MODE=ON set in order for the connection to succeed. In the initial handshake, the replica sends a GTID set containing the transactions that it has already received, committed, or both. This GTID set is equal to the union of the set of GTIDs in the gtid_executed system variable (@@GLOBAL.gtid_executed), and the set of GTIDs recorded in the Performance Schema replication_connection_status table as received transactions (the result of the statement SELECT RECEIVED_TRANSACTION_SET FROM PERFORMANCE_SCHEMA.replication_connection_status).

The source responds by sending all transactions recorded in its binary log whose GTID is not included in the GTID set sent by the replica. To do this, the source first identifies the appropriate binary log file to begin working with, by checking the Previous_gtids_log_event in the header of each of its binary log files, starting with the most recent. When the source finds the first Previous_gtids_log_event which contains no transactions that the replica is missing, it begins with that binary log file. This method is efficient and only takes a significant amount of time if the replica is behind the source by a large number of binary log files. The source then reads the transactions in that binary log file and subsequent files up to the current one, sending the transactions with GTIDs that the replica is missing, and skipping the transactions that were in the GTID set sent by the replica. The elapsed time until the replica receives the first missing transaction depends on its offset in the binary log file. This exchange ensures that the source only sends the transactions with a GTID that the replica has not already received or committed. If the replica receives transactions from more than one source, as in the case of a diamond topology, the auto-skip function ensures that the transactions are not applied twice.

If any of the transactions that should be sent by the source have been purged from the source's binary log, or added to the set of GTIDs in the gtid_purged system variable by another method, the source sends the error ER_MASTER_HAS_PURGED_REQUIRED_GTIDS to the replica, and replication does not start. The GTIDs of the missing purged transactions are identified and listed in the source's error log in the warning message ER_FOUND_MISSING_GTIDS. The replica cannot recover automatically from this error because parts of the transaction history that are needed to catch up with the source have been purged. Attempting to reconnect without the MASTER_AUTO_POSITION option enabled only results in the loss of the purged transactions on the replica. The correct approach to recover from this situation is for the replica to replicate the missing transactions listed in the ER_FOUND_MISSING_GTIDS message from another source, or for the replica to be replaced by a new replica created from a more recent backup. Consider revising the binary log expiration period (binlog_expire_logs_seconds) on the source to ensure that the situation does not occur again.

If during the exchange of transactions it is found that the replica has received or committed transactions with the source's UUID in the GTID, but the source itself does not have a record of them, the source sends the error ER_SLAVE_HAS_MORE_GTIDS_THAN_MASTER to the replica and replication does not start. This situation can occur if a source that does not have sync_binlog=1 set experiences a power failure or operating system crash, and loses committed transactions that have not yet been synchronized to the binary log file, but have been received by the replica. The source and replica can diverge if any clients commit transactions on the source after it is restarted, which can lead to the situation where the source and replica are using the same GTID for different transactions. The correct approach to recover from this situation is to check manually whether the source and replica have diverged. If the same GTID is now in use for different transactions, you either need to perform manual conflict resolution for individual transactions as required, or remove either the source or the replica from the replication topology. If the issue is only missing transactions on the source, you can make the source into a replica instead, allow it to catch up with the other servers in the replication topology, and then make it a source again if needed.

For a multi-source replica in a diamond topology (where the replica replicates from two or more sources, which in turn replicate from a common source), when GTID-based replication is in use, ensure that any replication filters or other channel configuration are identical on all channels on the multi-source replica. With GTID-based replication, filters are applied only to the transaction data, and GTIDs are not filtered out. This happens so that a replica’s GTID set stays consistent with the source’s, meaning GTID auto-positioning can be used without re-acquiring filtered out transactions each time. In the case where the downstream replica is multi-source and receives the same transaction from multiple sources in a diamond topology, the downstream replica now has multiple versions of the transaction, and the result depends on which channel applies the transaction first. The second channel to attempt it skips the transaction using GTID auto-skip, because the transaction’s GTID was added to the gtid_executed set by the first channel. With identical filtering on the channels, there is no problem because all versions of the transaction contain the same data, so the results are the same. However, with different filtering on the channels, the database can become inconsistent and replication can hang.

17.1.3.4 Setting Up Replication Using GTIDs

This section describes a process for configuring and starting GTID-based replication in MySQL 8.0. This is a cold start procedure that assumes either that you are starting the source server for the first time, or that it is possible to stop it; for information about provisioning replicas using GTIDs from a running source server, see Section 17.1.3.5, “Using GTIDs for Failover and Scaleout”. For information about changing GTID mode on servers online, see Section 17.1.4, “Changing GTID Mode on Online Servers”.

The key steps in this startup process for the simplest possible GTID replication topology, consisting of one source and one replica, are as follows:

  1. If replication is already running, synchronize both servers by making them read-only.

  2. Stop both servers.

  3. Restart both servers with GTIDs enabled and the correct options configured.

    The mysqld options necessary to start the servers as described are discussed in the example that follows later in this section.

  4. Instruct the replica to use the source as the replication data source and to use auto-positioning. The SQL statements needed to accomplish this step are described in the example that follows later in this section.

  5. Take a new backup. Binary logs containing transactions without GTIDs cannot be used on servers where GTIDs are enabled, so backups taken before this point cannot be used with your new configuration.

  6. Start the replica, then disable read-only mode on both servers, so that they can accept updates.

In the following example, two servers are already running as source and replica, using MySQL's binary log position-based replication protocol. If you are starting with new servers, see Section 17.1.2.3, “Creating a User for Replication” for information about adding a specific user for replication connections and Section 17.1.2.1, “Setting the Replication Source Configuration” for information about setting the server_id variable. The following examples show how to store mysqld startup options in server's option file, see Section 4.2.2.2, “Using Option Files” for more information. Alternatively you can use startup options when running mysqld.

Most of the steps that follow require the use of the MySQL root account or another MySQL user account that has the SUPER privilege. mysqladmin shutdown requires either the SUPER privilege or the SHUTDOWN privilege.

Step 1: Synchronize the servers.  This step is only required when working with servers which are already replicating without using GTIDs. For new servers proceed to Step 3. Make the servers read-only by setting the read_only system variable to ON on each server by issuing the following:

mysql> SET @@GLOBAL.read_only = ON;

Wait for all ongoing transactions to commit or roll back. Then, allow the replica to catch up with the source. It is extremely important that you make sure the replica has processed all updates before continuing.

If you use binary logs for anything other than replication, for example to do point in time backup and restore, wait until you do not need the old binary logs containing transactions without GTIDs. Ideally, wait for the server to purge all binary logs, and wait for any existing backup to expire.

Important

It is important to understand that logs containing transactions without GTIDs cannot be used on servers where GTIDs are enabled. Before proceeding, you must be sure that transactions without GTIDs do not exist anywhere in the topology.

Step 2: Stop both servers.  Stop each server using mysqladmin as shown here, where username is the user name for a MySQL user having sufficient privileges to shut down the server:

shell> mysqladmin -uusername -p shutdown

Then supply this user's password at the prompt.

Step 3: Start both servers with GTIDs enabled.  To enable GTID-based replication, each server must be started with GTID mode enabled by setting the gtid_mode variable to ON, and with the enforce_gtid_consistency variable enabled to ensure that only statements which are safe for GTID-based replication are logged. For example:

gtid_mode=ON
enforce-gtid-consistency=ON

In addition, you should start replicas with the --skip-slave-start option before configuring the replica settings. For more information on GTID related options and variables, see Section 17.1.6.5, “Global Transaction ID System Variables”.

It is not mandatory to have binary logging enabled in order to use GTIDs when using the mysql.gtid_executed Table. Source servers must always have binary logging enabled in order to be able to replicate. However, replica servers can use GTIDs but without binary logging. If you need to disable binary logging on a replica server, you can do this by specifying the --skip-log-bin and --log-slave-updates=OFF options for the replica.

Step 4: Configure the replica to use GTID-based auto-positioning.  Tell the replica to use the source with GTID based transactions as the replication data source, and to use GTID-based auto-positioning rather than file-based positioning. Issue a CHANGE MASTER TO statement on the replica, including the MASTER_AUTO_POSITION option in the statement to tell the replica that the source's transactions are identified by GTIDs.

You may also need to supply appropriate values for the source's host name and port number as well as the user name and password for a replication user account which can be used by the replica to connect to the source; if these have already been set prior to Step 1 and no further changes need to be made, the corresponding options can safely be omitted from the statement shown here.

mysql> CHANGE MASTER TO
     >     MASTER_HOST = host,
     >     MASTER_PORT = port,
     >     MASTER_USER = user,
     >     MASTER_PASSWORD = password,
     >     MASTER_AUTO_POSITION = 1;

Neither the MASTER_LOG_FILE option nor the MASTER_LOG_POS option may be used with MASTER_AUTO_POSITION set equal to 1. Attempting to do so causes the CHANGE MASTER TO statement to fail with an error.

Step 5: Take a new backup.  Existing backups that were made before you enabled GTIDs can no longer be used on these servers now that you have enabled GTIDs. Take a new backup at this point, so that you are not left without a usable backup.

For instance, you can execute FLUSH LOGS on the server where you are taking backups. Then either explicitly take a backup or wait for the next iteration of any periodic backup routine you may have set up.

Step 6: Start the replica and disable read-only mode.  Start the replica like this:

mysql> START SLAVE;
Or from MySQL 8.0.22:
mysql> START REPLICA;

The following step is only necessary if you configured a server to be read-only in Step 1. To allow the server to begin accepting updates again, issue the following statement:

mysql> SET @@GLOBAL.read_only = OFF;

GTID-based replication should now be running, and you can begin (or resume) activity on the source as before. Section 17.1.3.5, “Using GTIDs for Failover and Scaleout”, discusses creation of new replicas when using GTIDs.

17.1.3.5 Using GTIDs for Failover and Scaleout

There are a number of techniques when using MySQL Replication with Global Transaction Identifiers (GTIDs) for provisioning a new replica which can then be used for scaleout, being promoted to source as necessary for failover. This section describes the following techniques:

Global transaction identifiers were added to MySQL Replication for the purpose of simplifying in general management of the replication data flow and of failover activities in particular. Each identifier uniquely identifies a set of binary log events that together make up a transaction. GTIDs play a key role in applying changes to the database: the server automatically skips any transaction having an identifier which the server recognizes as one that it has processed before. This behavior is critical for automatic replication positioning and correct failover.

The mapping between identifiers and sets of events comprising a given transaction is captured in the binary log. This poses some challenges when provisioning a new server with data from another existing server. To reproduce the identifier set on the new server, it is necessary to copy the identifiers from the old server to the new one, and to preserve the relationship between the identifiers and the actual events. This is neccessary for restoring a replica that is immediately available as a candidate to become a new source on failover or switchover.

Simple replication.  The easiest way to reproduce all identifiers and transactions on a new server is to make the new server into the replica of a source that has the entire execution history, and enable global transaction identifiers on both servers. See Section 17.1.3.4, “Setting Up Replication Using GTIDs”, for more information.

Once replication is started, the new server copies the entire binary log from the source and thus obtains all information about all GTIDs.

This method is simple and effective, but requires the replica to read the binary log from the source; it can sometimes take a comparatively long time for the new replica to catch up with the source, so this method is not suitable for fast failover or restoring from backup. This section explains how to avoid fetching all of the execution history from the source by copying binary log files to the new server.

Copying data and transactions to the replica.  Executing the entire transaction history can be time-consuming when the source server has processed a large number of transactions previously, and this can represent a major bottleneck when setting up a new replica. To eliminate this requirement, a snapshot of the data set, the binary logs and the global transaction information the source server contains can be imported to the new replica. The server where the snapshot is taken can be either the source or one of its replicas, but you must ensure that the server has processed all required transactions before copying the data.

There are several variants of this method, the difference being in the manner in which data dumps and transactions from binary logs are transfered to the replica, as outlined here:

Data Set
  1. Create a dump file using mysqldump on the source server. Set the mysqldump option --master-data (with the default value of 1) to include a CHANGE MASTER TO statement with binary logging information. Set the --set-gtid-purged option to AUTO (the default) or ON, to include information about executed transactions in the dump. Then use the mysql client to import the dump file on the target server.

  2. Alternatively, create a data snapshot of the source server using raw data files, then copy these files to the target server, following the instructions in Section 17.1.2.5, “Choosing a Method for Data Snapshots”. If you use InnoDB tables, you can use the mysqlbackup command from the MySQL Enterprise Backup component to produce a consistent snapshot. This command records the log name and offset corresponding to the snapshot to be used on the replica. MySQL Enterprise Backup is a commercial product that is included as part of a MySQL Enterprise subscription. See Section 30.2, “MySQL Enterprise Backup Overview” for detailed information.

  3. Alternatively, stop both the source and target servers, copy the contents of the source's data directory to the new replica's data directory, then restart the replica. If you use this method, the replica must be configured for GTID-based replication, in other words with gtid_mode=ON. For instructions and important information for this method, see Section 17.1.2.8, “Adding Replicas to a Replication Environment”.

Transaction History

If the source server has a complete transaction history in its binary logs (that is, the GTID set @@GLOBAL.gtid_purged is empty), you can use these methods.

  1. Import the binary logs from the source server to the new replica using mysqlbinlog, with the --read-from-remote-server and --read-from-remote-master options.

  2. Alternatively, copy the source server's binary log files to the replica. You can make copies from the replica using mysqlbinlog with the --read-from-remote-server and --raw options. These can be read into the replica by using mysqlbinlog > file (without the --raw option) to export the binary log files to SQL files, then passing these files to the mysql client for processing. Ensure that all of the binary log files are processed using a single mysql process, rather than multiple connections. For example:

    shell> mysqlbinlog copied-binlog.000001 copied-binlog.000002 | mysql -u root -p
    

    For more information, see Section 4.6.8.3, “Using mysqlbinlog to Back Up Binary Log Files”.

This method has the advantage that a new server is available almost immediately; only those transactions that were committed while the snapshot or dump file was being replayed still need to be obtained from the existing source. This means that the replica's availability is not instantanteous, but only a relatively short amount of time should be required for the replica to catch up with these few remaining transactions.

Copying over binary logs to the target server in advance is usually faster than reading the entire transaction execution history from the source in real time. However, it may not always be feasible to move these files to the target when required, due to size or other considerations. The two remaining methods for provisioning a new replica discussed in this section use other means to transfer information about transactions to the new replica.

Injecting empty transactions.  The source's global gtid_executed variable contains the set of all transactions executed on the source. Rather than copy the binary logs when taking a snapshot to provision a new server, you can instead note the content of gtid_executed on the server from which the snapshot was taken. Before adding the new server to the replication chain, simply commit an empty transaction on the new server for each transaction identifier contained in the source's gtid_executed, like this:

SET GTID_NEXT='aaa-bbb-ccc-ddd:N';

BEGIN;
COMMIT;

SET GTID_NEXT='AUTOMATIC';

Once all transaction identifiers have been reinstated in this way using empty transactions, you must flush and purge the replica's binary logs, as shown here, where N is the nonzero suffix of the current binary log file name:

FLUSH LOGS;
PURGE BINARY LOGS TO 'source-bin.00000N';

You should do this to prevent this server from flooding the replication stream with false transactions in the event that it is later promoted to the source. (The FLUSH LOGS statement forces the creation of a new binary log file; PURGE BINARY LOGS purges the empty transactions, but retains their identifiers.)

This method creates a server that is essentially a snapshot, but in time is able to become a source as its binary log history converges with that of the replication stream (that is, as it catches up with the source or sources). This outcome is similar in effect to that obtained using the remaining provisioning method, which we discuss in the next few paragraphs.

Excluding transactions with gtid_purged.  The source's global gtid_purged variable contains the set of all transactions that have been purged from the source's binary log. As with the method discussed previously (see Injecting empty transactions), you can record the value of gtid_executed on the server from which the snapshot was taken (in place of copying the binary logs to the new server). Unlike the previous method, there is no need to commit empty transactions (or to issue PURGE BINARY LOGS); instead, you can set gtid_purged on the replica directly, based on the value of gtid_executed on the server from which the backup or snapshot was taken.

As with the method using empty transactions, this method creates a server that is functionally a snapshot, but in time is able to become a source as its binary log history converges with that of the source and other replicas.

Restoring GTID mode replicas.  When restoring a replica in a GTID based replication setup that has encountered an error, injecting an empty transaction may not solve the problem because an event does not have a GTID.

Use mysqlbinlog to find the next transaction, which is probably the first transaction in the next log file after the event. Copy everything up to the COMMIT for that transaction, being sure to include the SET @@SESSION.gtid_next. Even if you are not using row-based replication, you can still run binary log row events in the command line client.

Stop the replica and run the transaction you copied. The mysqlbinlog output sets the delimiter to /*!*/;, so set it back:

mysql> DELIMITER ;

Restart replication from the correct position automatically:

mysql> SET GTID_NEXT=automatic;
mysql> RESET SLAVE;
mysql> START SLAVE;
Or from MySQL 8.0.22:
mysql> SET GTID_NEXT=automatic;
mysql> RESET REPLICA;
mysql> START REPLICA;

17.1.3.6 Replication From a Source Without GTIDs to a Replica With GTIDs

From MySQL 8.0.23, you can set up replication channels to assign a GTID to replicated transactions that do not already have one. This feature enables replication from a source server that does not have GTIDs enabled and does not use GTID-based replication, to a replica that has GTIDs enabled. If it is possible to enable GTIDs on the replication source server, as described in Section 17.1.4, “Changing GTID Mode on Online Servers”, use that approach instead. This feature is designed for replication source servers where you cannot enable GTIDs.

You can enable GTID assignment on a replication channel using the ASSIGN_GTIDS_TO_ANONYMOUS_TRANSACTIONS option of the CHANGE REPLICATION SOURCE TO statement. LOCAL assigns a GTID including the replica's own UUID (the server_uuid setting). uuid assigns a GTID including the specified UUID, such as the server_uuid setting for the replication source server. Using a nonlocal UUID lets you differentiate between transactions that originated on the replica and transactions that originated on the source, and for a multi-source replica, between transactions that originated on different sources. If any of the transactions sent by the source do have a GTID already, that GTID is retained.

Important

A replica set up with ASSIGN_GTIDS_TO_ANONYMOUS_TRANSACTIONS on any channel cannot be promoted to replace the replication source server in the event that a failover is required, and a backup taken from the replica cannot be used to restore the replication source server. The same restriction applies to replacing or restoring other replicas that use ASSIGN_GTIDS_TO_ANONYMOUS_TRANSACTIONS on any channel.

The replica must have gtid_mode=ON set, and this cannot be changed afterwards. If the replica server is started without GTIDs enabled and with ASSIGN_GTIDS_TO_ANONYMOUS_TRANSACTIONS set for any replication channels, the settings are not changed, but a warning message is written to the error log explaining how to change the situation.

For a multi-source replica, you can have a mix of channels that use ASSIGN_GTIDS_TO_ANONYMOUS_TRANSACTIONS, and channels that do not. Channels specific to Group Replication cannot use ASSIGN_GTIDS_TO_ANONYMOUS_TRANSACTIONS, but an asynchronous replication channel for another source on a server instance that is a Group Replication group member can do so. For a channel on a Group Replication group member, do not specify the Group Replication group name as the UUID for creating the GTIDs.

Using ASSIGN_GTIDS_TO_ANONYMOUS_TRANSACTIONS on a replication channel is not the same as introducing GTID-based replication for the channel. The GTID set (gtid_executed) from a replica set up with ASSIGN_GTIDS_TO_ANONYMOUS_TRANSACTIONS should not be transferred to another server or compared with another server's gtid_executed set. The GTIDs that are assigned to the anonymous transactions, and the UUID you choose for them, only have significance for that replica's own use.

A replication channel using ASSIGN_GTIDS_TO_ANONYMOUS_TRANSACTIONS has the following behavior differences to GTID-based replication:

  • GTIDs are assigned to the replicated transactions when they are applied (unless they already had a GTID). A GTID would normally be assigned on the replication source server when the transaction is committed, and sent to the replica along with the transaction. On a multi-threaded replica, this means the order of the GTIDs does not necessarily match the order of the transactions, even if slave-preserve-commit-order=1 is set.

  • The SOURCE_LOG_FILE and SOURCE_LOG_POS options of the CHANGE REPLICATION SOURCE TO statement are used to position the replication I/O thread, rather than the MASTER_AUTO_POSITION option.

  • The SET GLOBAL sql_slave_skip_counter statement is used to skip transactions on a replication channel set up with ASSIGN_GTIDS_TO_ANONYMOUS_TRANSACTIONS, rather than the CHANGE REPLICATION SOURCE TO statement. For instructions, see Section 17.1.7.3, “Skipping Transactions”.

  • The UNTIL SQL_BEFORE_GTIDS and UNTIL_SQL_AFTER_GTIDS options of the START REPLICA statement cannot be used for the channel.

  • The function WAIT_FOR_EXECUTED_GTID_SET(), and also WAIT_UNTIL_SQL_THREAD_AFTER_GTIDS(), which is deprecated from MySQL 8.0.18, cannot be used with the channel.

The Performance Schema table replication_applier_configuration shows whether GTIDs are assigned to anonymous transactions on a replication channel, what the UUID is, and whether it is the UUID of the replica server (LOCAL) or a user-specified UUID (MANUAL). The information is also recorded in the applier metadata repository. A RESET SLAVE ALL statement resets the ASSIGN_GTIDS_TO_ANONYMOUS_TRANSACTIONS setting, but a RESET SLAVE statement does not.

17.1.3.7 Restrictions on Replication with GTIDs

Because GTID-based replication is dependent on transactions, some features otherwise available in MySQL are not supported when using it. This section provides information about restrictions on and limitations of replication with GTIDs.

Updates involving nontransactional storage engines.  When using GTIDs, updates to tables using nontransactional storage engines such as MyISAM cannot be made in the same statement or transaction as updates to tables using transactional storage engines such as InnoDB.

This restriction is due to the fact that updates to tables that use a nontransactional storage engine mixed with updates to tables that use a transactional storage engine within the same transaction can result in multiple GTIDs being assigned to the same transaction.

Such problems can also occur when the source and the replica use different storage engines for their respective versions of the same table, where one storage engine is transactional and the other is not. Also be aware that triggers that are defined to operate on nontransactional tables can be the cause of these problems.

In any of the cases just mentioned, the one-to-one correspondence between transactions and GTIDs is broken, with the result that GTID-based replication cannot function correctly.

CREATE TABLE ... SELECT statements.  Prior to MySQL 8.0.21, CREATE TABLE ... SELECT statements are not allowed when using GTID-based replication. When binlog_format is set to STATEMENT, a CREATE TABLE ... SELECT statement is recorded in the binary log as one transaction with one GTID, but if ROW format is used, the statement is recorded as two transactions with two GTIDs. If a source used STATEMENT format and a replica used ROW format, the replica would be unable to handle the transaction correctly, therefore the CREATE TABLE ... SELECT statement is disallowed with GTIDs to prevent this scenario. This restriction is lifted in MySQL 8.0.21 on storage engines that support atomic DDL. In this case, CREATE TABLE ... SELECT is recorded in the binary log as one transaction. For more information, see Section 13.1.1, “Atomic Data Definition Statement Support”.

Temporary tables.  When binlog_format is set to STATEMENT, CREATE TEMPORARY TABLE and DROP TEMPORARY TABLE statements cannot be used inside transactions, procedures, functions, and triggers when GTIDs are in use on the server (that is, when the enforce_gtid_consistency system variable is set to ON). They can be used outside these contexts when GTIDs are in use, provided that autocommit=1 is set. From MySQL 8.0.13, when binlog_format is set to ROW or MIXED, CREATE TEMPORARY TABLE and DROP TEMPORARY TABLE statements are allowed inside a transaction, procedure, function, or trigger when GTIDs are in use. The statements are not written to the binary log and are therefore not replicated to replicas. The use of row-based replication means that the replicas remain in sync without the need to replicate temporary tables. If the removal of these statements from a transaction results in an empty transaction, the transaction is not written to the binary log.

Preventing execution of unsupported statements.  To prevent execution of statements that would cause GTID-based replication to fail, all servers must be started with the --enforce-gtid-consistency option when enabling GTIDs. This causes statements of any of the types discussed previously in this section to fail with an error.

Note that --enforce-gtid-consistency only takes effect if binary logging takes place for a statement. If binary logging is disabled on the server, or if statements are not written to the binary log because they are removed by a filter, GTID consistency is not checked or enforced for the statements that are not logged.

For information about other required startup options when enabling GTIDs, see Section 17.1.3.4, “Setting Up Replication Using GTIDs”.

Skipping transactions.  sql_slave_skip_counter is not available when using GTID-based replication. If you need to skip transactions, use the value of the source's gtid_executed variable instead. If you have enabled GTID assignment on a replication channel using the ASSIGN_GTIDS_TO_ANONYMOUS_TRANSACTIONS option of the CHANGE REPLICATION SOURCE TO statement, sql_slave_skip_counter is available. For more information, see Section 17.1.7.3, “Skipping Transactions”.

Ignoring servers.  The IGNORE_SERVER_IDS option of the CHANGE MASTER TO statement is deprecated when using GTIDs, because transactions that have already been applied are automatically ignored. Before starting GTID-based replication, check for and clear all ignored server ID lists that have previously been set on the servers involved. The SHOW REPLICA | SLAVE STATUS statement, which can be issued for individual channels, displays the list of ignored server IDs if there is one. If there is no list, the Replicate_Ignore_Server_Ids field is blank.

GTID mode and mysql_upgrade.  Prior to MySQL 8.0.16, when the server is running with global transaction identifiers (GTIDs) enabled (gtid_mode=ON), do not enable binary logging by mysql_upgrade (the --write-binlog option). As of MySQL 8.0.16, the server performs the entire MySQL upgrade procedure, but disables binary logging during the upgrade, so there is no issue.

17.1.3.8 Stored Function Examples to Manipulate GTIDs

MySQL includes some built-in (native) functions for use with GTID-based replication. These functions are as follows:

GTID_SUBSET(set1,set2)

Given two sets of global transaction identifiers set1 and set2, returns true if all GTIDs in set1 are also in set2. Returns false otherwise.

GTID_SUBTRACT(set1,set2)

Given two sets of global transaction identifiers set1 and set2, returns only those GTIDs from set1 that are not in set2.

WAIT_FOR_EXECUTED_GTID_SET(gtid_set[, timeout])

Wait until the server has applied all of the transactions whose global transaction identifiers are contained in gtid_set. The optional timeout stops the function from waiting after the specified number of seconds have elapsed.

For details of these functions, see Section 12.19, “Functions Used with Global Transaction Identifiers (GTIDs)”.

You can define your own stored functions to work with GTIDs. For information on defining stored functions, see Chapter 25, Stored Objects. The following examples show some useful stored functions that can be created based on the built-in GTID_SUBSET() and GTID_SUBTRACT() functions.

Note that in these stored functions, the delimiter command has been used to change the MySQL statement delimiter to a vertical bar, as follows:

mysql> delimiter |

All of these functions take string representations of GTID sets as arguments, so GTID sets must always be quoted when used with them.

This function returns nonzero (true) if two GTID sets are the same set, even if they are not formatted in the same way.

CREATE FUNCTION GTID_IS_EQUAL(gtid_set_1 LONGTEXT, gtid_set_2 LONGTEXT)
RETURNS INT
  RETURN GTID_SUBSET(gtid_set_1, gtid_set_2) AND GTID_SUBSET(gtid_set_2, gtid_set_1)|

This function returns nonzero (true) if two GTID sets are disjoint.

CREATE FUNCTION GTID_IS_DISJOINT(gtid_set_1 LONGTEXT, gtid_set_2 LONGTEXT)
RETURNS INT
  RETURN GTID_SUBSET(gtid_set_1, GTID_SUBTRACT(gtid_set_1, gtid_set_2))|

This function returns nonzero (true) if two GTID sets are disjoint, and sum is the union of the two sets.

CREATE FUNCTION GTID_IS_DISJOINT_UNION(gtid_set_1 LONGTEXT, gtid_set_2 LONGTEXT, sum LONGTEXT)
RETURNS INT
  RETURN GTID_IS_EQUAL(GTID_SUBTRACT(sum, gtid_set_1), gtid_set_2) AND
         GTID_IS_EQUAL(GTID_SUBTRACT(sum, gtid_set_2), gtid_set_1)|

This function returns a normalized form of the GTID set, in all uppercase, with no whitespace and no duplicates. The UUIDs are arranged in alphabetic order and intervals are arranged in numeric order.

CREATE FUNCTION GTID_NORMALIZE(g LONGTEXT)
RETURNS LONGTEXT
RETURN GTID_SUBTRACT(g, '')|

This function returns the union of two GTID sets.

CREATE FUNCTION GTID_UNION(gtid_set_1 LONGTEXT, gtid_set_2 LONGTEXT)
RETURNS LONGTEXT
  RETURN GTID_NORMALIZE(CONCAT(gtid_set_1, ',', gtid_set_2))|

This function returns the intersection of two GTID sets.

CREATE FUNCTION GTID_INTERSECTION(gtid_set_1 LONGTEXT, gtid_set_2 LONGTEXT)
RETURNS LONGTEXT
  RETURN GTID_SUBTRACT(gtid_set_1, GTID_SUBTRACT(gtid_set_1, gtid_set_2))|

This function returns the symmetric difference between two GTID sets, that is, the GTIDs that exist in gtid_set_1 but not in gtid_set_2, and also the GTIDs that exist in gtid_set_2 but not in gtid_set_1.

CREATE FUNCTION GTID_SYMMETRIC_DIFFERENCE(gtid_set_1 LONGTEXT, gtid_set_2 LONGTEXT)
RETURNS LONGTEXT
  RETURN GTID_SUBTRACT(CONCAT(gtid_set_1, ',', gtid_set_2), GTID_INTERSECTION(gtid_set_1, gtid_set_2))|

This function removes from a GTID set all the GTIDs from a specified origin, and returns the remaining GTIDs, if any. The UUID is the identifier used by the server where the transaction originated, which is normally the server_uuid value.

CREATE FUNCTION GTID_SUBTRACT_UUID(gtid_set LONGTEXT, uuid TEXT)
RETURNS LONGTEXT
  RETURN GTID_SUBTRACT(gtid_set, CONCAT(UUID, ':1-', (1 << 63) - 2))|

This function reverses the previously listed function to return only those GTIDs from the GTID set that originate from the server with the specified identifier (UUID).

CREATE FUNCTION GTID_INTERSECTION_WITH_UUID(gtid_set LONGTEXT, uuid TEXT)
RETURNS LONGTEXT
  RETURN GTID_SUBTRACT(gtid_set, GTID_SUBTRACT_UUID(gtid_set, uuid))|

Example 17.1 Verifying that a replica is up to date

The built-in functions GTID_SUBSET and GTID_SUBTRACT can be used to check that a replica has applied at least every transaction that a source has applied.

To perform this check with GTID_SUBSET, execute the following statement on the replica:

SELECT GTID_SUBSET(source_gtid_executed, replica_gtid_executed)

If this returns 0 (false), some GTIDs in source_gtid_executed are not present in replica_gtid_executed, so the source has applied some transactions that the replica has not applied, and the replica is therefore not up to date.

To perform the check with GTID_SUBTRACT, execute the following statement on the replica:

SELECT GTID_SUBTRACT(source_gtid_executed, replica_gtid_executed)

This statement returns any GTIDs that are in source_gtid_executed but not in replica_gtid_executed. If any GTIDs are returned, the source has applied some transactions that the replica has not applied, and the replica is therefore not up to date.


Example 17.2 Backup and restore scenario

The stored functions GTID_IS_EQUAL, GTID_IS_DISJOINT, and GTID_IS_DISJOINT_UNION could be used to verify backup and restore operations involving multiple databases and servers. In this example scenario, server1 contains database db1, and server2 contains database db2. The goal is to copy database db2 to server1, and the result on server1 should be the union of the two databases. The procedure used is to back up server2 using mysqlpump or mysqldump, then restore this backup on server1.

Provided the backup program's option --set-gtid-purged was set to ON or the default of AUTO, the program's output contains a SET @@GLOBAL.gtid_purged statement which adds the gtid_executed set from server2 to the gtid_purged set on server1. The gtid_purged set contains the GTIDs of all the transactions that have been committed on a server but do not exist in any binary log file on the server. When database db2 is copied to server1, the GTIDs of the transactions committed on server2, which are not in the binary log files on server1, must be added to server1's gtid_purged set to make the set complete.

The stored functions can be used to assist with the following steps in this scenario:

  • Use GTID_IS_EQUAL to verify that the backup operation computed the correct GTID set for the SET @@GLOBAL.gtid_purged statement. On server2, extract that statement from the mysqlpump or mysqldump output, and store the GTID set into a local variable, such as $gtid_purged_set. Then execute the following statement:

    server2> SELECT GTID_IS_EQUAL($gtid_purged_set, @@GLOBAL.gtid_executed); 

    If the result is 1, the two GTID sets are equal, and the set has been computed correctly.

  • Use GTID_IS_DISJOINT to verify that the GTID set in the mysqlpump or mysqldump output does not overlap with the gtid_executed set on server1. Having identical GTIDs present on both servers causes errors when copying database db2 to server1. To check, on server1, extract and store the gtid_purged set from the output into a local variable as above, then execute the following statement:

    server1> SELECT GTID_IS_DISJOINT($gtid_purged_set, @@GLOBAL.gtid_executed); 

    If the result is 1, there is no overlap between the two GTID sets, so no duplicate GTIDs are present.

  • Use GTID_IS_DISJOINT_UNION to verify that the restore operation resulted in the correct GTID state on server1. Before restoring the backup, on server1, obtain the existing gtid_executed set by executing the following statement:

    server1> SELECT @@GLOBAL.gtid_executed;

    Store the result in a local variable $original_gtid_executed. Also store the gtid_purged set in a local variable as described above. When the backup from server2 has been restored onto server1, execute the following statement to verify the GTID state:

    server1> SELECT GTID_IS_DISJOINT_UNION($original_gtid_executed,
                                           $gtid_purged_set,
                                           @@GLOBAL.gtid_executed); 

    If the result is 1, the stored function has verified that the original gtid_executed set from server1 ($original_gtid_executed) and the gtid_purged set that was added from server2 ($gtid_purged_set) have no overlap, and also that the updated gtid_executed set on server1 now consists of the previous gtid_executed set from server1 plus the gtid_purged set from server2, which is the desired result. Ensure that this check is carried out before any further transactions take place on server1, otherwise the new transactions in the gtid_executed set cause it to fail.


Example 17.3 Selecting the most up-to-date replica for manual failover

The stored function GTID_UNION could be used to identify the most up-to-date replica from a set of replicas, in order to perform a manual failover operation after a source server has stopped unexpectedly. If some of the replicas are experiencing replication lag, this stored function can be used to compute the most up-to-date replica without waiting for all the replicas to apply their existing relay logs, and therefore to minimize the failover time. The function can return the union of the gtid_executed set on each replica with the set of transactions received by the replica, which is recorded in the Performance Schema table replication_connection_status. You can compare these results to find which replica's record of transactions is the most up-to-date, even if not all of the transactions have been committed yet.

On each replica, compute the complete record of transactions by issuing the following statement:

SELECT GTID_UNION(RECEIVED_TRANSACTION_SET, @@GLOBAL.gtid_executed)
    FROM performance_schema.replication_connection_status
    WHERE channel_name = 'name';

You can then compare the results from each replica to see which one has the most up-to-date record of transactions, and use this replica as the new source.


Example 17.4 Checking for extraneous transactions on a replica

The stored function GTID_SUBTRACT_UUID could be used to check whether a replica has received transactions that did not originate from its designated source or sources. If it has, there might be an issue with your replication setup, or with a proxy, router, or load balancer. This function works by removing from a GTID set all the GTIDs from a specified originating server, and returning the remaining GTIDs, if any.

For a replica that replicates from a single source, issue the following statement, giving the identifier of the originating source, which is normally the server_uuid value:

SELECT GTID_SUBTRACT_UUID(@@GLOBAL.gtid_executed, server_uuid_of_source);

  If the result is not empty, the transactions returned are extra transactions that did not originate from the designated source.

For a replica in a multisource replication topology, repeat the function, for example:

SELECT GTID_SUBTRACT_UUID(GTID_SUBTRACT_UUID(@@GLOBAL.gtid_executed,
                                             server_uuid_of_source_1),
                                             server_uuid_of_source_2);

If the result is not empty, the transactions returned are extra transactions that did not originate from any of the designated sources.


Example 17.5 Verifying that a server in a replication topology is read-only

The stored function GTID_INTERSECTION_WITH_UUID could be used to verify that a server has not originated any GTIDs and is in a read-only state. The function returns only those GTIDs from the GTID set that originate from the server with the specified identifier. If any of the transactions in the server's gtid_executed set have the server's own identifier, the server itself originated those transactions. You can issue the following statement on the server to check:

SELECT GTID_INTERSECTION_WITH_UUID(@@GLOBAL.gtid_executed, my_server_uuid);


Example 17.6 Validating an additional replica in a multisource replication setup

The stored function GTID_INTERSECTION_WITH_UUID could be used to find out if a replica attached to a multisource replication setup has applied all the transactions originating from one particular source. In this scenario, source1 and source2 are both sources and replicas and replicate to each other. source2 also has its own replica. The replica also receives and applies transactions from source source1 if source2 is configured with log_slave_updates=ON, but it does not do so if source2 uses log_slave_updates=OFF. Whatever the case, we currently only want to find out if the replica is up to date with source2. In this situation, the stored function GTID_INTERSECTION_WITH_UUID can be used to identify the transactions that source2 originated, discarding the transactions that source2 has replicated from source1. The built-in function GTID_SUBSET can then be used to compare the result to the gtid_executed set on the replica. If the replica is up to date with source2, the gtid_executed set on the replica contains all the transactions in the intersection set (the transactions that originated from source2).

To carry out this check, store source2's gtid_executed set, source2's server UUID, and the replica's gtid_executed set, into client-side variables as follows:

    $source2_gtid_executed :=
      source2> SELECT @@GLOBAL.gtid_executed;
    $source2_server_uuid :=
      source2> SELECT @@GLOBAL.server_uuid;
    $replica_gtid_executed :=
      replica> SELECT @@GLOBAL.gtid_executed;

Then use GTID_INTERSECTION_WITH_UUID and GTID_SUBSET with these variables as input, as follows:

SELECT GTID_SUBSET(GTID_INTERSECTION_WITH_UUID($source2_gtid_executed,
                                               $source2_server_uuid),
                                               $replica_gtid_executed);

The server identifier from source2 ($source2_server_uuid) is used with GTID_INTERSECTION_WITH_UUID to identify and return only those GTIDs from source2's gtid_executed set that originated on source2, omitting those that originated on source1. The resulting GTID set is then compared with the set of all executed GTIDs on the replica, using GTID_SUBSET. If this statement returns nonzero (true), all the identified GTIDs from source2 (the first set input) are also in the replica's gtid_executed set (the second set input), meaning that the replica has replicated all the transactions that originated from source2.


17.1.4 Changing GTID Mode on Online Servers

This section describes how to change the mode of replication from and to GTID mode without having to take the server offline.

17.1.4.1 Replication Mode Concepts

To be able to safely configure the replication mode of an online server it is important to understand some key concepts of replication. This section explains these concepts and is essential reading before attempting to modify the replication mode of an online server.

The modes of replication available in MySQL rely on different techniques for identifying transactions which are logged. The types of transactions used by replication are as follows:

  • GTID transactions are identified by a global transaction identifier (GTID) in the form UUID:NUMBER. Every GTID transaction in a log is always preceded by a Gtid_log_event. GTID transactions can be addressed using either the GTID or using the file name and position.

  • Anonymous transactions do not have a GTID assigned, and MySQL ensures that every anonymous transaction in a log is preceded by an Anonymous_gtid_log_event. In previous versions, anonymous transactions were not preceded by any particular event. Anonymous transactions can only be addressed using file name and position.

When using GTIDs you can take advantage of GTID auto-positioning and automatic fail-over, as well as use WAIT_FOR_EXECUTED_GTID_SET(), session_track_gtids, and monitor replicated transactions using Performance Schema tables.

Transactions in a relay log that was received from a source running a previous version of MySQL may not be preceded by any particular event at all, but after being replayed and logged in the replica's binary log, they are preceded with an Anonymous_gtid_log_event.

The ability to configure the replication mode online means that the gtid_mode and enforce_gtid_consistency variables are now both dynamic and can be set from a top-level statement by an account that has privileges sufficient to set global system variables. See Section 5.1.9.1, “System Variable Privileges”. In MySQL 5.6 and earlier, both of these variables could only be configured using the appropriate option at server start, meaning that changes to the replication mode required a server restart. In all versions gtid_mode could be set to ON or OFF, which corresponded to whether GTIDs were used to identify transactions or not. When gtid_mode=ON it is not possible to replicate anonymous transactions, and when gtid_mode=OFF only anonymous transactions can be replicated. When gtid_mode=OFF_PERMISSIVE then new transactions are anonymous while permitting replicated transactions to be either GTID or anonymous transactions. When gtid_mode=ON_PERMISSIVE then new transactions use GTIDs while permitting replicated transactions to be either GTID or anonymous transactions. This means it is possible to have a replication topology that has servers using both anonymous and GTID transactions. For example a source with gtid_mode=ON could be replicating to a replica with gtid_mode=ON_PERMISSIVE. The valid values for gtid_mode are as follows and in this order:

  • OFF

  • OFF_PERMISSIVE

  • ON_PERMISSIVE

  • ON

It is important to note that the state of gtid_mode can only be changed by one step at a time based on the above order. For example, if gtid_mode is currently set to OFF_PERMISSIVE, it is possible to change to OFF or ON_PERMISSIVE but not to ON. This is to ensure that the process of changing from anonymous transactions to GTID transactions online is correctly handled by the server. When you switch between gtid_mode=ON and gtid_mode=OFF, the GTID state (in other words the value of gtid_executed) is persistent. This ensures that the GTID set that has been applied by the server is always retained, regardless of changes between types of gtid_mode.

The fields related to GTIDs display the correct information regardless of the currently selected gtid_mode. This means that fields which display GTID sets, such as gtid_executed, gtid_purged, RECEIVED_TRANSACTION_SET in the replication_connection_status Performance Schema table, and the GTID related results of SHOW REPLICA | SLAVE STATUS, now return the empty string when there are no GTIDs present. Fields that display a single GTID, such as CURRENT_TRANSACTION in the Performance Schema replication_applier_status_by_worker table, now display ANONYMOUS when GTID transactions are not being used.

Replication from a source using gtid_mode=ON provides the ability to use GTID auto-positioning, configured using the CHANGE MASTER TO MASTER_AUTO_POSITION = 1; statement. The replication topology being used impacts on whether it is possible to enable auto-positioning or not, as this feature relies on GTIDs and is not compatible with anonymous transactions. It is strongly recommended to ensure there are no anonymous transactions remaining in the topology before enabling auto-positioning, see Section 17.1.4.2, “Enabling GTID Transactions Online”.

The valid combinations of gtid_mode and auto-positioning on source and replica are shown in the following table, where the source's gtid_mode is shown on the horizontal and the replica's gtid_mode is on the vertical. The meaning of each entry is as follows:

  • Y: the gtid_mode of source and replica is compatible

  • N: the gtid_mode of source and replica is not compatible

  • *: auto-positioning can be used with this combination

Table 17.1 Valid Combinations of Source and Replica gtid_mode

gtid_mode

Source OFF

Source OFF_PERMISSIVE

Source ON_PERMISSIVE

Source ON

Replica OFF

Y

Y

N

N

Replica OFF_PERMISSIVE

Y

Y

Y

Y*

Replica ON_PERMISSIVE

Y

Y

Y

Y*

Replica ON

N

N

Y

Y*


The currently selected gtid_mode also impacts on the gtid_next variable. The following table shows the behavior of the server for the different values of gtid_mode and gtid_next. The meaning of each entry is as follows:

  • ANONYMOUS: generate an anonymous transaction.

  • Error: generate an error and fail to execute SET GTID_NEXT.

  • UUID:NUMBER: generate a GTID with the specified UUID:NUMBER.

  • New GTID: generate a GTID with an automatically generated number.

Table 17.2 Valid Combinations of gtid_mode and gtid_next

gtid_next AUTOMATIC

binary log on

gtid_next AUTOMATIC

binary log off

gtid_next ANONYMOUS

gtid_next UUID:NUMBER

gtid_mode OFF

ANONYMOUS

ANONYMOUS

ANONYMOUS

Error

gtid_mode OFF_PERMISSIVE

ANONYMOUS

ANONYMOUS

ANONYMOUS

UUID:NUMBER

gtid_mode ON_PERMISSIVE

New GTID

ANONYMOUS

ANONYMOUS

UUID:NUMBER

gtid_mode ON

New GTID

ANONYMOUS

Error

UUID:NUMBER


When the binary log is off and gtid_next is set to AUTOMATIC, then no GTID is generated. This is consistent with the behavior of previous versions.

17.1.4.2 Enabling GTID Transactions Online

This section describes how to enable GTID transactions, and optionally auto-positioning, on servers that are already online and using anonymous transactions. This procedure does not require taking the server offline and is suited to use in production. However, if you have the possibility to take the servers offline when enabling GTID transactions that process is easier.

From MySQL 8.0.23, you can set up replication channels to assign a GTID to replicated transactions that do not already have one. This feature enables replication from a source server that does not use GTID-based replication, to a replica that does. If it is possible to enable GTIDs on the replication source server, as described in this procedure, use this approach instead. Assigning GTIDs is designed for replication source servers where you cannot enable GTIDs. For more information on this option, see Section 17.1.3.6, “Replication From a Source Without GTIDs to a Replica With GTIDs”.

Before you start, ensure that the servers meet the following pre-conditions:

  • All servers in your topology must use MySQL 5.7.6 or later. You cannot enable GTID transactions online on any single server unless all servers which are in the topology are using this version.

  • All servers have gtid_mode set to the default value OFF.

The following procedure can be paused at any time and later resumed where it was, or reversed by jumping to the corresponding step of Section 17.1.4.3, “Disabling GTID Transactions Online”, the online procedure to disable GTIDs. This makes the procedure fault-tolerant because any unrelated issues that may appear in the middle of the procedure can be handled as usual, and then the procedure continued where it was left off.

Note

It is crucial that you complete every step before continuing to the next step.

To enable GTID transactions:

  1. On each server, execute:

    SET @@GLOBAL.ENFORCE_GTID_CONSISTENCY = WARN;

    Let the server run for a while with your normal workload and monitor the logs. If this step causes any warnings in the log, adjust your application so that it only uses GTID-compatible features and does not generate any warnings.

    Important

    This is the first important step. You must ensure that no warnings are being generated in the error logs before going to the next step.

  2. On each server, execute:

    SET @@GLOBAL.ENFORCE_GTID_CONSISTENCY = ON;
  3. On each server, execute:

    SET @@GLOBAL.GTID_MODE = OFF_PERMISSIVE;

    It does not matter which server executes this statement first, but it is important that all servers complete this step before any server begins the next step.

  4. On each server, execute:

    SET @@GLOBAL.GTID_MODE = ON_PERMISSIVE;

    It does not matter which server executes this statement first.

  5. On each server, wait until the status variable ONGOING_ANONYMOUS_TRANSACTION_COUNT is zero. This can be checked using:

    SHOW STATUS LIKE 'ONGOING_ANONYMOUS_TRANSACTION_COUNT';
    Note

    On a replica, it is theoretically possible that this shows zero and then nonzero again. This is not a problem, it suffices that it shows zero once.

  6. Wait for all transactions generated up to step 5 to replicate to all servers. You can do this without stopping updates: the only important thing is that all anonymous transactions get replicated.

    See Section 17.1.4.4, “Verifying Replication of Anonymous Transactions” for one method of checking that all anonymous transactions have replicated to all servers.

  7. If you use binary logs for anything other than replication, for example point in time backup and restore, wait until you do not need the old binary logs having transactions without GTIDs.

    For instance, after step 6 has completed, you can execute FLUSH LOGS on the server where you are taking backups. Then either explicitly take a backup or wait for the next iteration of any periodic backup routine you may have set up.

    Ideally, wait for the server to purge all binary logs that existed when step 6 was completed. Also wait for any backup taken before step 6 to expire.

    Important

    This is the second important point. It is vital to understand that binary logs containing anonymous transactions, without GTIDs cannot be used after the next step. After this step, you must be sure that transactions without GTIDs do not exist anywhere in the topology.

  8. On each server, execute:

    SET @@GLOBAL.GTID_MODE = ON;
  9. On each server, add gtid_mode=ON and enforce_gtid_consistency=ON to my.cnf.

    You are now guaranteed that all transactions have a GTID (except transactions generated in step 5 or earlier, which have already been processed). To start using the GTID protocol so that you can later perform automatic fail-over, execute the following on each replica. Optionally, if you use multi-source replication, do this for each channel and include the FOR CHANNEL channel clause:

    STOP SLAVE [FOR CHANNEL 'channel'];
    CHANGE MASTER TO MASTER_AUTO_POSITION = 1 [FOR CHANNEL 'channel'];
    START SLAVE [FOR CHANNEL 'channel'];
    Or from MySQL 8.0.22:
    STOP REPLICA [FOR CHANNEL 'channel'];
    CHANGE MASTER TO MASTER_AUTO_POSITION = 1 [FOR CHANNEL 'channel'];
    START REPLICA [FOR CHANNEL 'channel'];
    

17.1.4.3 Disabling GTID Transactions Online

This section describes how to disable GTID transactions on servers that are already online. This procedure does not require taking the server offline and is suited to use in production. However, if you have the possibility to take the servers offline when disabling GTIDs mode that process is easier.

The process is similar to enabling GTID transactions while the server is online, but reversing the steps. The only thing that differs is the point at which you wait for logged transactions to replicate.

Before you start, ensure that the servers meet the following pre-conditions:

  • All servers in your topology must use MySQL 5.7.6 or later. You cannot disable GTID transactions online on any single server unless all servers which are in the topology are using this version.

  • All servers have gtid_mode set to ON.

  • The --replicate-same-server-id option is not set on any server. You cannot disable GTID transactions if this option is set together with the --log-slave-updates option (which is the default) and binary logging is enabled (which is also the default). Without GTIDs, this combination of options causes infinite loops in circular replication.

  1. Execute the following on each replica, and if you are using multi-source replication, do it for each channel and include the FOR CHANNEL channel clause:

    STOP SLAVE [FOR CHANNEL 'channel'];
    CHANGE MASTER TO MASTER_AUTO_POSITION = 0, MASTER_LOG_FILE = file, \
    MASTER_LOG_POS = position [FOR CHANNEL 'channel'];
    START SLAVE [FOR CHANNEL 'channel'];
    Or from MySQL 8.0.22:
    STOP REPLICA [FOR CHANNEL 'channel'];
    CHANGE MASTER TO MASTER_AUTO_POSITION = 0, MASTER_LOG_FILE = file, \
    MASTER_LOG_POS = position [FOR CHANNEL 'channel'];
    START REPLICA [FOR CHANNEL 'channel'];
     
  2. On each server, execute:

    SET @@GLOBAL.GTID_MODE = ON_PERMISSIVE;
  3. On each server, execute:

    SET @@GLOBAL.GTID_MODE = OFF_PERMISSIVE;
  4. On each server, wait until the variable @@GLOBAL.GTID_OWNED is equal to the empty string. This can be checked using:

    SELECT @@GLOBAL.GTID_OWNED;

    On a replica, it is theoretically possible that this is empty and then nonempty again. This is not a problem, it suffices that it is empty once.

  5. Wait for all transactions that currently exist in any binary log to replicate to all replicas. See Section 17.1.4.4, “Verifying Replication of Anonymous Transactions” for one method of checking that all anonymous transactions have replicated to all servers.

  6. If you use binary logs for anything else than replication, for example to do point in time backup or restore: wait until you do not need the old binary logs having GTID transactions.

    For instance, after step 5 has completed, you can execute FLUSH LOGS on the server where you are taking the backup. Then either explicitly take a backup or wait for the next iteration of any periodic backup routine you may have set up.

    Ideally, wait for the server to purge all binary logs that existed when step 5 was completed. Also wait for any backup taken before step 5 to expire.

    Important

    This is the one important point during this procedure. It is important to understand that logs containing GTID transactions cannot be used after the next step. Before proceeding you must be sure that GTID transactions do not exist anywhere in the topology.

  7. On each server, execute:

    SET @@GLOBAL.GTID_MODE = OFF;
  8. On each server, set gtid_mode=OFF in my.cnf.

    If you want to set enforce_gtid_consistency=OFF, you can do so now. After setting it, you should add enforce_gtid_consistency=OFF to your configuration file.

If you want to downgrade to an earlier version of MySQL, you can do so now, using the normal downgrade procedure.

17.1.4.4 Verifying Replication of Anonymous Transactions

This section explains how to monitor a replication topology and verify that all anonymous transactions have been replicated. This is helpful when changing the replication mode online as you can verify that it is safe to change to GTID transactions.

There are several possible ways to wait for transactions to replicate:

The simplest method, which works regardless of your topology but relies on timing is as follows: if you are sure that the replica never lags more than N seconds, just wait for a bit more than N seconds. Or wait for a day, or whatever time period you consider safe for your deployment.

A safer method in the sense that it does not depend on timing: if you only have a source with one or more replicas, do the following:

  1. On the source, execute:

    SHOW MASTER STATUS;

    Note down the values in the File and Position column.

  2. On every replica, use the file and position information from the source to execute:

    SELECT MASTER_POS_WAIT(file, position);

If you have a source and multiple levels of replicas, or in other words you have replicas of replicas, repeat step 2 on each level, starting from the source, then all the direct replicas, then all the replicas of replicas, and so on.

If you use a circular replication topology where multiple servers may have write clients, perform step 2 for each source-replica connection, until you have completed the full circle. Repeat the whole process so that you do the full circle twice.

For example, suppose you have three servers A, B, and C, replicating in a circle so that A -> B -> C -> A. The procedure is then:

  • Do step 1 on A and step 2 on B.

  • Do step 1 on B and step 2 on C.

  • Do step 1 on C and step 2 on A.

  • Do step 1 on A and step 2 on B.

  • Do step 1 on B and step 2 on C.

  • Do step 1 on C and step 2 on A.

17.1.5 MySQL Multi-Source Replication

MySQL multi-source replication enables a replica to receive transactions from multiple immediate sources in parallel. In a multi-source replication topology, a replica creates a replication channel for each source that it should receive transactions from. For more information on how replication channels function, see Section 17.2.2, “Replication Channels”.

You might choose to implement multi-source replication to achieve goals like these:

  • Backing up multiple servers to a single server.

  • Merging table shards.

  • Consolidating data from multiple servers to a single server.

Multi-source replication does not implement any conflict detection or resolution when applying transactions, and those tasks are left to the application if required.

Note

Each channel on a multi-source replica must replicate from a different source. You cannot set up multiple replication channels from a single replica to a single source. This is because the server IDs of replicas must be unique in a replication topology. The source distinguishes replicas only by their server IDs, not by the names of the replication channels, so it cannot recognize different replication channels from the same replica.

A rmulti-source replica can also be set up as a multi-threaded replica, by setting the slave_parallel_workers system variable to a value greater than 0. When you do this on a multi-source replica, each channel on the replica has the specified number of applier threads, plus a coordinator thread to manage them. You cannot configure the number of applier threads for individual channels.

From MySQL 8.0, multi-source replicas can be configured with replication filters on specific replication channels. Channel specific replication filters can be used when the same database or table is present on multiple sources, and you only need the replica to replicate it from one source. For GTID-based replication, if the same transaction might arrive from multiple sources (such as in a diamond topology), you must ensure the filtering setup is the same on all channels. For more information, see Section 17.2.5.4, “Replication Channel Based Filters”.

This section provides tutorials on how to configure sources and replicas for multi-source replication, how to start, stop and reset multi-source replicas, and how to monitor multi-source replication.

17.1.5.1 Configuring Multi-Source Replication

A multi-source replication topology requires at least two sources and one replica configured. In these tutorials, we assume that you have two sources source1 and source2, and a replica replicahost. The replica replicates one database from each of the sources, db1 from source1 and db2 from source2.

Sources in a multi-source replication topology can be configured to use either GTID-based replication, or binary log position-based replication. See Section 17.1.3.4, “Setting Up Replication Using GTIDs” for how to configure a source using GTID-based replication. See Section 17.1.2.1, “Setting the Replication Source Configuration” for how to configure a source using file position based replication.

Replicas in a multi-source replication topology require TABLE repositories for the replica's connection metadata repository and applier metadata repository, which are the default in MySQL 8.0. Multi-source replication is not compatible with the deprecated alternative file repositories.

Create a suitable user account on all the sources that the replica can use to connect. You can use the same account on all the sources, or a different account on each. If you create an account solely for the purposes of replication, that account needs only the REPLICATION SLAVE privilege. For example, to set up a new user, ted, that can connect from the replica replicahost, use the mysql client to issue these statements on each of the sources:

mysql> CREATE USER 'ted'@'replicahost' IDENTIFIED BY 'password';
mysql> GRANT REPLICATION SLAVE ON *.* TO 'ted'@'replicahost';

For more details, and important information on the default authentication plugin for new users from MySQL 8.0, see Section 17.1.2.3, “Creating a User for Replication”.

17.1.5.2 Provisioning a Multi-Source Replica for GTID-Based Replication

If the sources in the multi-source replication topology have existing data, it can save time to provision the replica with the relevant data before starting replication. In a multi-source replication topology, cloning or copying of the data directory cannot be used to provision the replica with data from all of the sources, and you might also want to replicate only specific databases from each source. The best strategy for provisioning such a replica is therefore to use mysqldump to create an appropriate dump file on each source, then use the mysql client to import the dump file on the replica.

If you are using GTID-based replication, you need to pay attention to the SET @@GLOBAL.gtid_purged statement that mysqldump places in the dump output. This statement transfers the GTIDs for the transactions executed on the source to the replica, and the replica requires this information. However, for any case more complex than provisioning one new, empty replica from one source, you need to check what effect the statement has in the version of MySQL used by the replica, and handle the statement accordingly. The following guidance summarizes suitable actions, but for more details, see the mysqldump documentation.

The behavior of the SET @@GLOBAL.gtid_purged statement written by mysqldump is different in releases from MySQL 8.0 compared to MySQL 5.6 and 5.7. In MySQL 5.6 and 5.7, the statement replaces the value of gtid_purged on the replica, and also in those releases that value can only be changed when the replica's record of transactions with GTIDs (the gtid_executed set) is empty. In a multi-source replication topology, you must therefore remove the SET @@GLOBAL.gtid_purged statement from the dump output before replaying the dump files, because you cannot apply a second or subsequent dump file including this statement. Also note that for MySQL 5.6 and 5.7, this limitation means all the dump files from the sources must be applied in a single operation on a replica with an empty gtid_executed set. You can clear a replica's GTID execution history by issuing RESET MASTER on the replica, but if you have other, wanted transactions with GTIDs on the replica, choose an alternative method of provisioning from those described in Section 17.1.3.5, “Using GTIDs for Failover and Scaleout”.

From MySQL 8.0, the SET @@GLOBAL.gtid_purged statement adds the GTID set from the dump file to the existing gtid_purged set on the replica. The statement can therefore potentially be left in the dump output when you replay the dump files on the replica, and the dump files can be replayed at different times. However, it is important to note that the value that is included by mysqldump for the SET @@GLOBAL.gtid_purged statement includes the GTIDs of all transactions in the gtid_executed set on the source, even those that changed suppressed parts of the database, or other databases on the server that were not included in a partial dump. If you replay a second or subsequent dump file on the replica that contains any of the same GTIDs (for example, another partial dump from the same source, or a dump from another source that has overlapping transactions), any SET @@GLOBAL.gtid_purged statement in the second dump file fails, and must therefore be removed from the dump output.

For sources from MySQL 8.0.17, as an alternative to removing the SET @@GLOBAL.gtid_purged statement, you may set mysqldump's --set-gtid-purged option to COMMENTED to include the statement but commented out, so that it is not actioned when you load the dump file. If you are provisioning the replica with two partial dumps from the same source, and the GTID set in the second dump is the same as the first (so no new transactions have been executed on the source in between the dumps), you can set mysqldump's --set-gtid-purged option to OFF when you output the second dump file, to omit the statement.

In the following provisioning example, we assume that the SET @@GLOBAL.gtid_purged statement cannot be left in the dump output, and must be removed from the files and handled manually. We also assume that there are no wanted transactions with GTIDs on the replica before provisioning starts.

  1. To create dump files for a database named db1 on source1 and a database named db2 on source2, run mysqldump for source1 as follows:

    mysqldump -u<user> -p<password> --single-transaction --triggers --routines --set-gtid-purged=ON --databases db1 > dumpM1.sql 
    

    Then run mysqldump for source2 as follows:

    mysqldump -u<user> -p<password> --single-transaction --triggers --routines --set-gtid-purged=ON --databases db2 > dumpM2.sql 
    
  2. Record the gtid_purged value that mysqldump added to each of the dump files. For example, for dump files created on MySQL 5.6 or 5.7, you can extract the value like this:

    cat dumpM1.sql | grep GTID_PURGED | cut -f2 -d'=' | cut -f2 -d$'\''
    cat dumpM2.sql | grep GTID_PURGED | cut -f2 -d'=' | cut -f2 -d$'\'' 
    

    From MySQL 8.0, where the format has changed, you can extract the value like this:

    cat dumpM1.sql | grep GTID_PURGED | perl -p0 -e 's#/\*.*?\*/##sg' | cut -f2 -d'=' | cut -f2 -d$'\''
    cat dumpM2.sql | grep GTID_PURGED | perl -p0 -e 's#/\*.*?\*/##sg' | cut -f2 -d'=' | cut -f2 -d$'\''
    

    The result in each case should be a GTID set, for example:

    source1:   2174B383-5441-11E8-B90A-C80AA9429562:1-1029
    source2:   224DA167-0C0C-11E8-8442-00059A3C7B00:1-2695
    
  3. Remove the line from each dump file that contains the SET @@GLOBAL.gtid_purged statement. For example:

    sed '/GTID_PURGED/d' dumpM1.sql > dumpM1_nopurge.sql
    sed '/GTID_PURGED/d' dumpM2.sql > dumpM2_nopurge.sql 
    
  4. Use the mysql client to import each edited dump file into the replica. For example:

    mysql -u<user> -p<password> < dumpM1_nopurge.sql
    mysql -u<user> -p<password> < dumpM2_nopurge.sql 
    
  5. On the replica, issue RESET MASTER to clear the GTID execution history (assuming, as explained above, that all the dump files have been imported and that there are no wanted transactions with GTIDs on the replica). Then issue a SET @@GLOBAL.gtid_purged statement to set the gtid_purged value to the union of all the GTID sets from all the dump files, as you recorded in Step 2. For example:

    mysql> RESET MASTER;
    mysql> SET @@GLOBAL.gtid_purged = "2174B383-5441-11E8-B90A-C80AA9429562:1-1029, 224DA167-0C0C-11E8-8442-00059A3C7B00:1-2695";
    

    If there are, or might be, overlapping transactions between the GTID sets in the dump files, you can use the stored functions described in Section 17.1.3.8, “Stored Function Examples to Manipulate GTIDs” to check this beforehand and to calculate the union of all the GTID sets.

17.1.5.3 Adding GTID-Based Sources to a Multi-Source Replica

These steps assume you have enabled GTIDs for transactions on the sources using gtid_mode=ON, created a replication user, ensured that the replica is using TABLE based replication applier metadata repositories, and provisioned the replica with data from the sources if appropriate.

Use the CHANGE MASTER TO statement to configure a replication channel for each source on the replica (see Section 17.2.2, “Replication Channels”). The FOR CHANNEL clause is used to specify the channel. For GTID-based replication, GTID auto-positioning is used to synchronize with the source (see Section 17.1.3.3, “GTID Auto-Positioning”). The MASTER_AUTO_POSITION option is set to specify the use of auto-positioning.

For example, to add source1 and source2 as sources to the replica, use the mysql client to issue the CHANGE MASTER TO statement twice on the replica, like this:

mysql> CHANGE MASTER TO MASTER_HOST="source1", MASTER_USER="ted", \
MASTER_PASSWORD="password", MASTER_AUTO_POSITION=1 FOR CHANNEL "source_1";
mysql> CHANGE MASTER TO MASTER_HOST="source2", MASTER_USER="ted", \
MASTER_PASSWORD="password", MASTER_AUTO_POSITION=1 FOR CHANNEL "source_2";

For the full syntax of the CHANGE MASTER TO statement and other available options, see Section 13.4.2.1, “CHANGE MASTER TO Statement”.

To make the replica replicate only database db1 from source1, and only database db2 from source2, use the mysql client to issue the CHANGE REPLICATION FILTER statement for each channel, like this:

mysql> CHANGE REPLICATION FILTER REPLICATE_WILD_DO_TABLE = ('db1.%') FOR CHANNEL "source_1";
mysql> CHANGE REPLICATION FILTER REPLICATE_WILD_DO_TABLE = ('db2.%') FOR CHANNEL "source_2";

For the full syntax of the CHANGE REPLICATION FILTER statement and other available options, see Section 13.4.2.2, “CHANGE REPLICATION FILTER Statement”.

17.1.5.4 Adding Binary Log Based Replication Sources to a Multi-Source Replica

These steps assume that binary logging is enabled on the source (which is the default), the replica is using TABLE based replication applier metadata repositories (which is the default in MySQL 8.0), and that you have enabled a replication user and noted the current binary log position. You need to know the current MASTER_LOG_FILE and MASTER_LOG_POSITION.

Use the CHANGE MASTER TO statement to configure a replication channel for each source on the replica (see Section 17.2.2, “Replication Channels”). The FOR CHANNEL clause is used to specify the channel. For example, to add source1 and source2 as sources to the replica, use the mysql client to issue the CHANGE MASTER TO statement twice on the replica, like this:

mysql> CHANGE MASTER TO MASTER_HOST="source1", MASTER_USER="ted", MASTER_PASSWORD="password", \
MASTER_LOG_FILE='source1-bin.000006', MASTER_LOG_POS=628 FOR CHANNEL "source_1";
mysql> CHANGE MASTER TO MASTER_HOST="source2", MASTER_USER="ted", MASTER_PASSWORD="password", \
MASTER_LOG_FILE='source2-bin.000018', MASTER_LOG_POS=104 FOR CHANNEL "source_2";

For the full syntax of the CHANGE MASTER TO statement and other available options, see Section 13.4.2.1, “CHANGE MASTER TO Statement”.

To make the replica replicate only database db1 from source1, and only database db2 from source2, use the mysql client to issue the CHANGE REPLICATION FILTER statement for each channel, like this:

mysql> CHANGE REPLICATION FILTER REPLICATE_WILD_DO_TABLE = ('db1.%') FOR CHANNEL "source_1";
mysql> CHANGE REPLICATION FILTER REPLICATE_WILD_DO_TABLE = ('db2.%') FOR CHANNEL "source_2";

For the full syntax of the CHANGE REPLICATION FILTER statement and other available options, see Section 13.4.2.2, “CHANGE REPLICATION FILTER Statement”.

17.1.5.5 Starting Multi-Source Replicas

Once you have added channels for all of the replication sources, issue a START REPLICA | SLAVE statement to start replication. When you have enabled multiple channels on a replica, you can choose to either start all channels, or select a specific channel to start. For example, to start the two channels separately, use the mysql client to issue the following statements:

mysql> START SLAVE FOR CHANNEL "source_1";
mysql> START SLAVE FOR CHANNEL "source_2";
Or from MySQL 8.0.22:
mysql> START REPLICA FOR CHANNEL "source_1";
mysql> START REPLICA FOR CHANNEL "source_2";

For the full syntax of the START REPLICA | SLAVE command and other available options, see Section 13.4.2.7, “START REPLICA | SLAVE Statement”.

To verify that both channels have started and are operating correctly, you can issue SHOW REPLICA | SLAVE STATUS statements on the replica, for example:

mysql> SHOW SLAVE STATUS FOR CHANNEL "source_1"\G
mysql> SHOW SLAVE STATUS FOR CHANNEL "source_2"\G
Or from MySQL 8.0.22:
mysql> SHOW REPLICA STATUS FOR CHANNEL "source_1"\G
mysql> SHOW REPLICA STATUS FOR CHANNEL "source_2"\G

17.1.5.6 Stopping Multi-Source Replicas

The STOP REPLICA | SLAVE statement can be used to stop a multi-source replica. By default, if you use the STOP REPLICA | SLAVE statement on a multi-source replica all channels are stopped. Optionally, use the FOR CHANNEL channel clause to stop only a specific channel.

  • To stop all currently configured replication channels:

    mysql> STOP SLAVE;
    Or from MySQL 8.0.22:
    mysql> STOP REPLICA;
  • To stop only a named channel, use a FOR CHANNEL channel clause:

    mysql> STOP SLAVE FOR CHANNEL "source_1";
    Or from MySQL 8.0.22:
    mysql> STOP REPLICA FOR CHANNEL "source_1";

For the full syntax of the STOP REPLICA | SLAVE command and other available options, see Section 13.4.2.9, “STOP REPLICA | SLAVE Statement”.

17.1.5.7 Resetting Multi-Source Replicas

The RESET REPLICA | SLAVE statement can be used to reset a multi-source replica. By default, if you use the RESET REPLICA | SLAVE statement on a multi-source replica all channels are reset. Optionally, use the FOR CHANNEL channel clause to reset only a specific channel.

  • To reset all currently configured replication channels:

    mysql> RESET SLAVE;
    Or from MySQL 8.0.22:
    mysql> RESET REPLICA;
  • To reset only a named channel, use a FOR CHANNEL channel clause:

    mysql> RESET SLAVE FOR CHANNEL "source_1";
    Or from MySQL 8.0.22:
    mysql> RESET REPLICA FOR CHANNEL "source_1";
    

For GTID-based replication, note that RESET REPLICA | SLAVE has no effect on the replica's GTID execution history. If you want to clear this, issue RESET MASTER on the replica.

RESET REPLICA | SLAVE makes the replica forget its replication position, and clears the relay log, but it does not change any replication connection parameters (such as the source host name) or replication filters. If you want to remove these for a channel, issue RESET REPLICA | SLAVE ALL.

For the full syntax of the RESET REPLICA | SLAVE command and other available options, see Section 13.4.2.5, “RESET REPLICA | SLAVE Statement”.

17.1.5.8 Monitoring Multi-Source Replication

To monitor the status of replication channels the following options exist:

  • Using the replication Performance Schema tables. The first column of these tables is Channel_Name. This enables you to write complex queries based on Channel_Name as a key. See Section 27.12.11, “Performance Schema Replication Tables”.

  • Using SHOW REPLICA | SLAVE STATUS FOR CHANNEL channel. By default, if the FOR CHANNEL channel clause is not used, this statement shows the replica status for all channels with one row per channel. The identifier Channel_name is added as a column in the result set. If a FOR CHANNEL channel clause is provided, the results show the status of only the named replication channel.

Note

The SHOW VARIABLES statement does not work with multiple replication channels. The information that was available through these variables has been migrated to the replication performance tables. Using a SHOW VARIABLES statement in a topology with multiple channels shows the status of only the default channel.

The error codes and messages that are issued when multi-source replication is enabled specify the channel that generated the error.

17.1.5.8.1 Monitoring Channels Using Performance Schema Tables

This section explains how to use the replication Performance Schema tables to monitor channels. You can choose to monitor all channels, or a subset of the existing channels.

To monitor the connection status of all channels:

mysql> SELECT * FROM replication_connection_status\G;
*************************** 1. row ***************************
CHANNEL_NAME: source_1
GROUP_NAME:
SOURCE_UUID: 046e41f8-a223-11e4-a975-0811960cc264
THREAD_ID: 24
SERVICE_STATE: ON
COUNT_RECEIVED_HEARTBEATS: 0
LAST_HEARTBEAT_TIMESTAMP: 0000-00-00 00:00:00
RECEIVED_TRANSACTION_SET: 046e41f8-a223-11e4-a975-0811960cc264:4-37
LAST_ERROR_NUMBER: 0
LAST_ERROR_MESSAGE:
LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00
*************************** 2. row ***************************
CHANNEL_NAME: source_2
GROUP_NAME:
SOURCE_UUID: 7475e474-a223-11e4-a978-0811960cc264
THREAD_ID: 26
SERVICE_STATE: ON
COUNT_RECEIVED_HEARTBEATS: 0
LAST_HEARTBEAT_TIMESTAMP: 0000-00-00 00:00:00
RECEIVED_TRANSACTION_SET: 7475e474-a223-11e4-a978-0811960cc264:4-6
LAST_ERROR_NUMBER: 0
LAST_ERROR_MESSAGE:
LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00
2 rows in set (0.00 sec)
	    

In the above output there are two channels enabled, and as shown by the CHANNEL_NAME field they are called source_1 and source_2.

The addition of the CHANNEL_NAME field enables you to query the Performance Schema tables for a specific channel. To monitor the connection status of a named channel, use a WHERE CHANNEL_NAME=channel clause:

mysql> SELECT * FROM replication_connection_status WHERE CHANNEL_NAME='source_1'\G
*************************** 1. row ***************************
CHANNEL_NAME: source_1
GROUP_NAME:
SOURCE_UUID: 046e41f8-a223-11e4-a975-0811960cc264
THREAD_ID: 24
SERVICE_STATE: ON
COUNT_RECEIVED_HEARTBEATS: 0
LAST_HEARTBEAT_TIMESTAMP: 0000-00-00 00:00:00
RECEIVED_TRANSACTION_SET: 046e41f8-a223-11e4-a975-0811960cc264:4-37
LAST_ERROR_NUMBER: 0
LAST_ERROR_MESSAGE:
LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00
1 row in set (0.00 sec)

Similarly, the WHERE CHANNEL_NAME=channel clause can be used to monitor the other replication Performance Schema tables for a specific channel. For more information, see Section 27.12.11, “Performance Schema Replication Tables”.

17.1.6 Replication and Binary Logging Options and Variables

The following sections contain information about mysqld options and server variables that are used in replication and for controlling the binary log. Options and variables for use on sources and replicas are covered separately, as are options and variables relating to binary logging and global transaction identifiers (GTIDs). A set of quick-reference tables providing basic information about these options and variables is also included.

Of particular importance is the server_id system variable.

Command-Line Format --server-id=#
System Variable server_id
Scope Global
Dynamic Yes
SET_VAR Hint Applies No
Type Integer
Default Value (≥ 8.0.3) 1
Default Value (≤ 8.0.2) 0
Minimum Value 0
Maximum Value 4294967295

This variable specifies the server ID. server_id is set to 1 by default. The server can be started with this default ID, but when binary logging is enabled, an informational message is issued if you did not set server_id explicitly to specify a server ID.

For servers that are used in a replication topology, you must specify a unique server ID for each replication server, in the range from 1 to 232 − 1. Unique means that each ID must be different from every other ID in use by any other source or replica in the replication topology. For additional information, see Section 17.1.6.2, “Replication Source Options and Variables”, and Section 17.1.6.3, “Replica Server Options and Variables”.

If the server ID is set to 0, binary logging takes place, but a source with a server ID of 0 refuses any connections from replicas, and a replica with a server ID of 0 refuses to connect to a source. Note that although you can change the server ID dynamically to a nonzero value, doing so does not enable replication to start immediately. You must change the server ID and then restart the server to initialize the replica.

For more information, see Section 17.1.2.2, “Setting the Replica Configuration”.

server_uuid

The MySQL server generates a true UUID in addition to the default or user-supplied server ID set in the server_id system variable. This is available as the global, read-only variable server_uuid.

Note

The presence of the server_uuid system variable does not change the requirement for setting a unique server_id value for each MySQL server as part of preparing and running MySQL replication, as described earlier in this section.

System Variable server_uuid
Scope Global
Dynamic No
SET_VAR Hint Applies No
Type String

When starting, the MySQL server automatically obtains a UUID as follows:

  1. Attempt to read and use the UUID written in the file data_dir/auto.cnf (where data_dir is the server's data directory).

  2. If data_dir/auto.cnf is not found, generate a new UUID and save it to this file, creating the file if necessary.

The auto.cnf file has a format similar to that used for my.cnf or my.ini files. auto.cnf has only a single [auto] section containing a single server_uuid setting and value; the file's contents appear similar to what is shown here:

[auto]
server_uuid=8a94f357-aab4-11df-86ab-c80aa9429562
Important

The auto.cnf file is automatically generated; do not attempt to write or modify this file.

When using MySQL replication, sources and replicas know each other's UUIDs. The value of a replica's UUID can be seen in the output of SHOW REPLICAS | SHOW SLAVE HOSTS. Once START REPLICA | SLAVE has been executed, the value of the source's UUID is available on the replica in the output of SHOW REPLICA | SLAVE STATUS.

Note

Issuing a STOP REPLICA | SLAVE or RESET REPLICA | SLAVE statement does not reset the source's UUID as used on the replica.

A server's server_uuid is also used in GTIDs for transactions originating on that server. For more information, see Section 17.1.3, “Replication with Global Transaction Identifiers”.

When starting, the replication I/O thread generates an error and aborts if its source's UUID is equal to its own unless the --replicate-same-server-id option has been set. In addition, the replication I/O thread generates a warning if either of the following is true:

17.1.6.1 Replication and Binary Logging Option and Variable Reference

The following two sections provide basic information about the MySQL command-line options and system variables applicable to replication and the binary log.

Replication Options and Variables

The command-line options and system variables in the following list relate to replication source servers and replicas. Section 17.1.6.2, “Replication Source Options and Variables” provides more detailed information about options and variables relating to replication source servers. For more information about options and variables relating to replicas, see Section 17.1.6.3, “Replica Server Options and Variables”.

For a listing of all command-line options, system variables, and status variables used with mysqld, see Section 5.1.4, “Server Option, System Variable, and Status Variable Reference”.

Binary Logging Options and Variables

The command-line options and system variables in the following list relate to the binary log. Section 17.1.6.4, “Binary Logging Options and Variables”, provides more detailed information about options and variables relating to binary logging. For additional general information about the binary log, see Section 5.4.4, “The Binary Log”.

For a listing of all command-line options, system and status variables used with mysqld, see Section 5.1.4, “Server Option, System Variable, and Status Variable Reference”.

17.1.6.2 Replication Source Options and Variables

This section describes the server options and system variables that you can use on replication source servers. You can specify the options either on the command line or in an option file. You can specify system variable values using SET.

On the source and each replica, you must set the server_id system variable to establish a unique replication ID. For each server, you should pick a unique positive integer in the range from 1 to 232 − 1, and each ID must be different from every other ID in use by any other source or replica in the replication topology. Example: server-id=3.

For options used on the source for controlling binary logging, see Section 17.1.6.4, “Binary Logging Options and Variables”.

Startup Options for Replication Source Servers

The following list describes startup options for controlling replication source servers. Replication-related system variables are discussed later in this section.

System Variables Used on Replication Source Servers

The following system variables are used for or by replication source servers:

  • auto_increment_increment

    Command-Line Format --auto-increment-increment=#
    System Variable auto_increment_increment
    Scope Global, Session
    Dynamic Yes
    SET_VAR Hint Applies Yes
    Type Integer
    Default Value 1
    Minimum Value 1
    Maximum Value 65535

    auto_increment_increment and auto_increment_offset are intended for use with circular (source-to-source) replication, and can be used to control the operation of AUTO_INCREMENT columns. Both variables have global and session values, and each can assume an integer value between 1 and 65,535 inclusive. Setting the value of either of these two variables to 0 causes its value to be set to 1 instead. Attempting to set the value of either of these two variables to an integer greater than 65,535 or less than 0 causes its value to be set to 65,535 instead. Attempting to set the value of auto_increment_increment or auto_increment_offset to a noninteger value produces an error, and the actual value of the variable remains unchanged.

    Note

    auto_increment_increment is also supported for use with NDB tables.

    As of MySQL 8.0.18, setting the session value of this system variable is no longer a restricted operation.

    When Group Replication is started on a server, the value of auto_increment_increment is changed to the value of group_replication_auto_increment_increment, which defaults to 7, and the value of auto_increment_offset is changed to the server ID. The changes are reverted when Group Replication is stopped. These changes are only made and reverted if auto_increment_increment and auto_increment_offset each have their default value of 1. If their values have already been modified from the default, Group Replication does not alter them. From MySQL 8.0, the system variables are also not modified when Group Replication is in single-primary mode, where only one server writes.

    auto_increment_increment and auto_increment_offset affect AUTO_INCREMENT column behavior as follows:

    • auto_increment_increment controls the interval between successive column values. For example:

      mysql> SHOW VARIABLES LIKE 'auto_inc%';
      +--------------------------+-------+
      | Variable_name            | Value |
      +--------------------------+-------+
      | auto_increment_increment | 1     |
      | auto_increment_offset    | 1     |
      +--------------------------+-------+
      2 rows in set (0.00 sec)
      
      mysql> CREATE TABLE autoinc1
          -> (col INT NOT NULL AUTO_INCREMENT PRIMARY KEY);
        Query OK, 0 rows affected (0.04 sec)
      
      mysql> SET @@auto_increment_increment=10;
      Query OK, 0 rows affected (0.00 sec)
      
      mysql> SHOW VARIABLES LIKE 'auto_inc%';
      +--------------------------+-------+
      | Variable_name            | Value |
      +--------------------------+-------+
      | auto_increment_increment | 10    |
      | auto_increment_offset    | 1     |
      +--------------------------+-------+
      2 rows in set (0.01 sec)
      
      mysql> INSERT INTO autoinc1 VALUES (NULL), (NULL), (NULL), (NULL);
      Query OK, 4 rows affected (0.00 sec)
      Records: 4  Duplicates: 0  Warnings: 0
      
      mysql> SELECT col FROM autoinc1;
      +-----+
      | col |
      +-----+
      |   1 |
      |  11 |
      |  21 |
      |  31 |
      +-----+
      4 rows in set (0.00 sec)
      
    • auto_increment_offset determines the starting point for the AUTO_INCREMENT column value. Consider the following, assuming that these statements are executed during the same session as the example given in the description for auto_increment_increment:

      mysql> SET @@auto_increment_offset=5;
      Query OK, 0 rows affected (0.00 sec)
      
      mysql> SHOW VARIABLES LIKE 'auto_inc%';
      +--------------------------+-------+
      | Variable_name            | Value |
      +--------------------------+-------+
      | auto_increment_increment | 10    |
      | auto_increment_offset    | 5     |
      +--------------------------+-------+
      2 rows in set (0.00 sec)
      
      mysql> CREATE TABLE autoinc2
          -> (col INT NOT NULL AUTO_INCREMENT PRIMARY KEY);
      Query OK, 0 rows affected (0.06 sec)
      
      mysql> INSERT INTO autoinc2 VALUES (NULL), (NULL), (NULL), (NULL);
      Query OK, 4 rows affected (0.00 sec)
      Records: 4  Duplicates: 0  Warnings: 0
      
      mysql> SELECT col FROM autoinc2;
      +-----+
      | col |
      +-----+
      |   5 |
      |  15 |
      |  25 |
      |  35 |
      +-----+
      4 rows in set (0.02 sec)
      

      When the value of auto_increment_offset is greater than that of auto_increment_increment, the value of auto_increment_offset is ignored.

    If either of these variables is changed, and then new rows inserted into a table containing an AUTO_INCREMENT column, the results may seem counterintuitive because the series of AUTO_INCREMENT values is calculated without regard to any values already present in the column, and the next value inserted is the least value in the series that is greater than the maximum existing value in the AUTO_INCREMENT column. The series is calculated like this:

    auto_increment_offset + N × auto_increment_increment

    where N is a positive integer value in the series [1, 2, 3, ...]. For example:

    mysql> SHOW VARIABLES LIKE 'auto_inc%';
    +--------------------------+-------+
    | Variable_name            | Value |
    +--------------------------+-------+
    | auto_increment_increment | 10    |
    | auto_increment_offset    | 5     |
    +--------------------------+-------+
    2 rows in set (0.00 sec)
    
    mysql> SELECT col FROM autoinc1;
    +-----+
    | col |
    +-----+
    |   1 |
    |  11 |
    |  21 |
    |  31 |
    +-----+
    4 rows in set (0.00 sec)
    
    mysql> INSERT INTO autoinc1 VALUES (NULL), (NULL), (NULL), (NULL);
    Query OK, 4 rows affected (0.00 sec)
    Records: 4  Duplicates: 0  Warnings: 0
    
    mysql> SELECT col FROM autoinc1;
    +-----+
    | col |
    +-----+
    |   1 |
    |  11 |
    |  21 |
    |  31 |
    |  35 |
    |  45 |
    |  55 |
    |  65 |
    +-----+
    8 rows in set (0.00 sec)
    

    The values shown for auto_increment_increment and auto_increment_offset generate the series 5 + N × 10, that is, [5, 15, 25, 35, 45, ...]. The highest value present in the col column prior to the INSERT is 31, and the next available value in the AUTO_INCREMENT series is 35, so the inserted values for col begin at that point and the results are as shown for the SELECT query.

    It is not possible to restrict the effects of these two variables to a single table; these variables control the behavior of all AUTO_INCREMENT columns in all tables on the MySQL server. If the global value of either variable is set, its effects persist until the global value is changed or overridden by setting the session value, or until mysqld is restarted. If the local value is set, the new value affects AUTO_INCREMENT columns for all tables into which new rows are inserted by the current user for the duration of the session, unless the values are changed during that session.

    The default value of auto_increment_increment is 1. See Section 17.5.1.1, “Replication and AUTO_INCREMENT”.

  • auto_increment_offset

    Command-Line Format --auto-increment-offset=#
    System Variable auto_increment_offset
    Scope Global, Session
    Dynamic Yes
    SET_VAR Hint Applies Yes
    Type Integer
    Default Value 1
    Minimum Value 1
    Maximum Value 65535

    This variable has a default value of 1. If it is left with its default value, and Group Replication is started on the server in multi-primary mode, it is changed to the server ID. For more information, see the description for auto_increment_increment.

    Note

    auto_increment_offset is also supported for use with NDB tables.

    As of MySQL 8.0.18, setting the session value of this system variable is no longer a restricted operation.

  • immediate_server_version

    Introduced 8.0.14
    System Variable immediate_server_version
    Scope Session
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Integer

    For internal use by replication. This session system variable holds the MySQL Server release number of the server that is the immediate source in a replication topology (for example, 80014 for a MySQL 8.0.14 server instance). If this immediate server is at a release that does not support the session system variable, the value of the variable is set to 0 (UNKNOWN_SERVER_VERSION).

    The value of the variable is replicated from a source to a replica. With this information the replica can correctly process data originating from a source at an older release, by recognizing where syntax changes or semantic changes have occurred between the releases involved and handling these appropriately. The information can also be used in a Group Replication environment where one or more members of the replication group is at a newer release than the others. The value of the variable can be viewed in the binary log for each transaction (as part of the Gtid_log_event, or Anonymous_gtid_log_event if GTIDs are not in use on the server), and could be helpful in debugging cross-version replication issues.

    Setting the session value of this system variable is a restricted operation. The session user must have either the REPLICATION_APPLIER privilege (see Section 17.3.3, “Replication Privilege Checks”), or privileges sufficient to set restricted session variables (see Section 5.1.9.1, “System Variable Privileges”). However, note that the variable is not intended for users to set; it is set automatically by the replication infrastructure.

  • original_server_version

    Introduced 8.0.14
    System Variable original_server_version
    Scope Session
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Integer

    For internal use by replication. This session system variable holds the MySQL Server release number of the server where a transaction was originally committed (for example, 80014 for a MySQL 8.0.14 server instance). If this original server is at a release that does not support the session system variable, the value of the variable is set to 0 (UNKNOWN_SERVER_VERSION). Note that when a release number is set by the original server, the value of the variable is reset to 0 if the immediate server or any other intervening server in the replication topology does not support the session system variable, and so does not replicate its value.

    The value of the variable is set and used in the same ways as for the immediate_server_version system variable. If the value of the variable is the same as that for the immediate_server_version system variable, only the latter is recorded in the binary log, with an indicator that the original server version is the same.

    In a Group Replication environment, view change log events, which are special transactions queued by each group member when a new member joins the group, are tagged with the server version of the group member queuing the transaction. This ensures that the server version of the original donor is known to the joining member. Because the view change log events queued for a particular view change have the same GTID on all members, for this case only, instances of the same GTID might have a different original server version.

    Setting the session value of this system variable is a restricted operation. The session user must have either the REPLICATION_APPLIER privilege (see Section 17.3.3, “Replication Privilege Checks”), or privileges sufficient to set restricted session variables (see Section 5.1.9.1, “System Variable Privileges”). However, note that the variable is not intended for users to set; it is set automatically by the replication infrastructure.

  • rpl_semi_sync_master_enabled

    Command-Line Format --rpl-semi-sync-master-enabled[={OFF|ON}]
    System Variable rpl_semi_sync_master_enabled
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Boolean
    Default Value OFF

    Controls whether semisynchronous replication is enabled on the source server. To enable or disable the plugin, set this variable to ON or OFF (or 1 or 0), respectively. The default is OFF.

    This variable is available only if the source-side semisynchronous replication plugin is installed.

  • rpl_semi_sync_master_timeout

    Command-Line Format --rpl-semi-sync-master-timeout=#
    System Variable rpl_semi_sync_master_timeout
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Integer
    Default Value 10000

    A value in milliseconds that controls how long the source waits on a commit for acknowledgment from a replica before timing out and reverting to asynchronous replication. The default value is 10000 (10 seconds).

    This variable is available only if the source-side semisynchronous replication plugin is installed.

  • rpl_semi_sync_master_trace_level

    Command-Line Format --rpl-semi-sync-master-trace-level=#
    System Variable rpl_semi_sync_master_trace_level
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Integer
    Default Value 32

    The semisynchronous replication debug trace level on the source server. Four levels are defined:

    • 1 = general level (for example, time function failures)

    • 16 = detail level (more verbose information)

    • 32 = net wait level (more information about network waits)

    • 64 = function level (information about function entry and exit)

    This variable is available only if the source-side semisynchronous replication plugin is installed.

  • rpl_semi_sync_master_wait_for_slave_count

    Command-Line Format --rpl-semi-sync-master-wait-for-slave-count=#
    System Variable rpl_semi_sync_master_wait_for_slave_count
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Integer
    Default Value 1
    Minimum Value 1
    Maximum Value 65535

    The number of replica acknowledgments the source must receive per transaction before proceeding. By default rpl_semi_sync_master_wait_for_slave_count is 1, meaning that semisynchronous replication proceeds after receiving a single replica acknowledgment. Performance is best for small values of this variable.

    For example, if rpl_semi_sync_master_wait_for_slave_count is 2, then 2 replicas must acknowledge receipt of the transaction before the timeout period configured by rpl_semi_sync_master_timeout for semisynchronous replication to proceed. If fewer replicas acknowledge receipt of the transaction during the timeout period, the source reverts to normal replication.

    Note

    This behavior also depends on rpl_semi_sync_master_wait_no_slave

    This variable is available only if the source-side semisynchronous replication plugin is installed.

  • rpl_semi_sync_master_wait_no_slave

    Command-Line Format --rpl-semi-sync-master-wait-no-slave[={OFF|ON}]
    System Variable rpl_semi_sync_master_wait_no_slave
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Boolean
    Default Value ON

    Controls whether the source waits for the timeout period configured by rpl_semi_sync_master_timeout to expire, even if the replica count drops to less than the number of replicas configured by rpl_semi_sync_master_wait_for_slave_count during the timeout period.

    When the value of rpl_semi_sync_master_wait_no_slave is ON (the default), it is permissible for the replica count to drop to less than rpl_semi_sync_master_wait_for_slave_count during the timeout period. As long as enough replicas acknowledge the transaction before the timeout period expires, semisynchronous replication continues.

    When the value of rpl_semi_sync_master_wait_no_slave is OFF, if the replica count drops to less than the number configured in rpl_semi_sync_master_wait_for_slave_count at any time during the timeout period configured by rpl_semi_sync_master_timeout, the source reverts to normal replication.

    This variable is available only if the source-side semisynchronous replication plugin is installed.

  • rpl_semi_sync_master_wait_point

    Command-Line Format --rpl-semi-sync-master-wait-point=value
    System Variable rpl_semi_sync_master_wait_point
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Enumeration
    Default Value AFTER_SYNC
    Valid Values

    AFTER_SYNC

    AFTER_COMMIT

    This variable controls the point at which a semisynchronous replication source server waits for replica acknowledgment of transaction receipt before returning a status to the client that committed the transaction. These values are permitted:

    • AFTER_SYNC (the default): The source writes each transaction to its binary log and the replica, and syncs the binary log to disk. The source waits for replica acknowledgment of transaction receipt after the sync. Upon receiving acknowledgment, the source commits the transaction to the storage engine and returns a result to the client, which then can proceed.

    • AFTER_COMMIT: The source writes each transaction to its binary log and the replica, syncs the binary log, and commits the transaction to the storage engine. The source waits for replica acknowledgment of transaction receipt after the commit. Upon receiving acknowledgment, the source returns a result to the client, which then can proceed.

    The replication characteristics of these settings differ as follows:

    • With AFTER_SYNC, all clients see the committed transaction at the same time: After it has been acknowledged by the replica and committed to the storage engine on the source. Thus, all clients see the same data on the source.

      In the event of source failure, all transactions committed on the source have been replicated to the replica (saved to its relay log). An unexpected exit of the source server and failover to the replica is lossless because the replica is up to date. Note, however, that the source cannot be restarted in this scenario and must be discarded, because its binary log might contain uncommitted transactions that would cause a conflict with the replica when externalized after binary log recovery.

    • With AFTER_COMMIT, the client issuing the transaction gets a return status only after the server commits to the storage engine and receives replica acknowledgment. After the commit and before replica acknowledgment, other clients can see the committed transaction before the committing client.

      If something goes wrong such that the replica does not process the transaction, then in the event of an unexpected source server exit and failover to the replica, it is possible for such clients to see a loss of data relative to what they saw on the source.

    This variable is available only if the source-side semisynchronous replication plugin is installed.

    With the addition of rpl_semi_sync_master_wait_point in MySQL 5.7, a version compatibility constraint was created because it increments the semisynchronous interface version: Servers for MySQL 5.7 and higher do not work with semisynchronous replication plugins from older versions, nor do servers from older versions work with semisynchronous replication plugins for MySQL 5.7 and higher.

17.1.6.3 Replica Server Options and Variables

This section explains the server options and system variables that apply to replica servers and contains the following:

Specify the options either on the command line or in an option file. Many of the options can be set while the server is running by using the CHANGE MASTER TO statement. Specify system variable values using SET.

Server ID.  On the source and each replica, you must set the server_id system variable to establish a unique replication ID in the range from 1 to 232 − 1. Unique means that each ID must be different from every other ID in use by any other source or replica in the replication topology. Example my.cnf file:

[mysqld]
server-id=3
Startup Options for Replica Servers

This section explains startup options for controlling replica servers. Many of these options can be set while the server is running by using the CHANGE MASTER TO statement. Others, such as the --replicate-* options, can be set only when the replica server starts. Replication-related system variables are discussed later in this section.

  • --master-info-file=file_name

    Command-Line Format --master-info-file=file_name
    Deprecated 8.0.18
    Type File name
    Default Value master.info

    The use of this option is now deprecated. It was used to set the file name for the replica's connection metadata repository if master_info_repository=FILE was set. --master-info-file and the use of the master_info_repository system variable are deprecated because the use of a file for the connection metadata repository has been superseded by crash-safe tables. For information about the connection metadata repository, see Section 17.2.4.2, “Replication Metadata Repositories”.

  • --master-retry-count=count

    Command-Line Format --master-retry-count=#
    Deprecated Yes
    Type Integer
    Default Value 86400
    Minimum Value 0
    Maximum Value (64-bit platforms) 18446744073709551615
    Maximum Value (32-bit platforms) 4294967295

    The number of times that the replica tries to reconnect to the source before giving up. The default value is 86400 times. A value of 0 means infinite, and the replica attempts to connect forever. Reconnection attempts are triggered when the replica reaches its connection timeout (specified by the slave_net_timeout system variable) without receiving data or a heartbeat signal from the source. Reconnection is attempted at intervals set by the MASTER_CONNECT_RETRY option of the CHANGE MASTER TO statement (which defaults to every 60 seconds).

    This option is deprecated; expect it to be removed in a future MySQL release. Use the MASTER_RETRY_COUNT option of the CHANGE MASTER TO statement instead.

  • --max-relay-log-size=size

    Command-Line Format --max-relay-log-size=#
    System Variable max_relay_log_size
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Integer
    Default Value 0
    Minimum Value 0
    Maximum Value 1073741824

    The size at which the server rotates relay log files automatically. If this value is nonzero, the relay log is rotated automatically when its size exceeds this value. If this value is zero (the default), the size at which relay log rotation occurs is determined by the value of max_binlog_size. For more information, see Section 17.2.4.1, “The Relay Log”.

  • --relay-log-purge={0|1}

    Command-Line Format --relay-log-purge[={OFF|ON}]
    System Variable relay_log_purge
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Boolean
    Default Value ON

    Disable or enable automatic purging of relay logs as soon as they are no longer needed. The default value is 1 (enabled). This is a global variable that can be changed dynamically with SET GLOBAL relay_log_purge = N. Disabling purging of relay logs when enabling the --relay-log-recovery option risks data consistency and is therefore not crash-safe.

  • --relay-log-space-limit=size

    Command-Line Format --relay-log-space-limit=#
    System Variable relay_log_space_limit
    Scope Global
    Dynamic No
    SET_VAR Hint Applies No
    Type Integer
    Default Value 0
    Minimum Value 0
    Maximum Value (64-bit platforms) 18446744073709551615
    Maximum Value (32-bit platforms) 4294967295

    This option places an upper limit on the total size in bytes of all relay logs on the replica. A value of 0 means no limit. This is useful for a replica server host that has limited disk space. When the limit is reached, the I/O thread stops reading binary log events from the source server until the SQL thread has caught up and deleted some unused relay logs. Note that this limit is not absolute: There are cases where the SQL thread needs more events before it can delete relay logs. In that case, the I/O thread exceeds the limit until it becomes possible for the SQL thread to delete some relay logs because not doing so would cause a deadlock. You should not set --relay-log-space-limit to less than twice the value of --max-relay-log-size (or --max-binlog-size if --max-relay-log-size is 0). In that case, there is a chance that the I/O thread waits for free space because --relay-log-space-limit is exceeded, but the SQL thread has no relay log to purge and is unable to satisfy the I/O thread. This forces the I/O thread to ignore --relay-log-space-limit temporarily.

  • --replicate-do-db=db_name

    Command-Line Format --replicate-do-db=name
    Type String

    Creates a replication filter using the name of a database. Such filters can also be created using CHANGE REPLICATION FILTER REPLICATE_DO_DB.

    This option supports channel specific replication filters, enabling multi-source replicas to use specific filters for different sources. To configure a channel specific replication filter on a channel named channel_1 use --replicate-do-db:channel_1:db_name. In this case, the first colon is interpreted as a separator and subsequent colons are literal colons. See Section 17.2.5.4, “Replication Channel Based Filters” for more information.

    Note

    Global replication filters cannot be used on a MySQL server instance that is configured for Group Replication, because filtering transactions on some servers would make the group unable to reach agreement on a consistent state. Channel specific replication filters can be used on replication channels that are not directly involved with Group Replication, such as where a group member also acts as a replica to a source that is outside the group. They cannot be used on the group_replication_applier or group_replication_recovery channels.

    The precise effect of this replication filter depends on whether statement-based or row-based replication is in use.

    Statement-based replication.  Tell the replication SQL thread to restrict replication to statements where the default database (that is, the one selected by USE) is db_name. To specify more than one database, use this option multiple times, once for each database; however, doing so does not replicate cross-database statements such as UPDATE some_db.some_table SET foo='bar' while a different database (or no database) is selected.

    Warning

    To specify multiple databases you must use multiple instances of this option. Because database names can contain commas, if you supply a comma separated list then the list is treated as the name of a single database.

    An example of what does not work as you might expect when using statement-based replication: If the replica is started with --replicate-do-db=sales and you issue the following statements on the source, the UPDATE statement is not replicated:

    USE prices;
    UPDATE sales.january SET amount=amount+1000;
    

    The main reason for this check just the default database behavior is that it is difficult from the statement alone to know whether it should be replicated (for example, if you are using multiple-table DELETE statements or multiple-table UPDATE statements that act across multiple databases). It is also faster to check only the default database rather than all databases if there is no need.

    Row-based replication.  Tells the replication SQL thread to restrict replication to database db_name. Only tables belonging to db_name are changed; the current database has no effect on this. Suppose that the replica is started with --replicate-do-db=sales and row-based replication is in effect, and then the following statements are run on the source:

    USE prices;
    UPDATE sales.february SET amount=amount+100;
    

    The february table in the sales database on the replica is changed in accordance with the UPDATE statement; this occurs whether or not the USE statement was issued. However, issuing the following statements on the source has no effect on the replica when using row-based replication and --replicate-do-db=sales:

    USE prices;
    UPDATE prices.march SET amount=amount-25;
    

    Even if the statement USE prices were changed to USE sales, the UPDATE statement's effects would still not be replicated.

    Another important difference in how --replicate-do-db is handled in statement-based replication as opposed to row-based replication occurs with regard to statements that refer to multiple databases. Suppose that the replica is started with --replicate-do-db=db1, and the following statements are executed on the source:

    USE db1;
    UPDATE db1.table1, db2.table2 SET db1.table1.col1 = 10, db2.table2.col2 = 20;
    

    If you are using statement-based replication, then both tables are updated on the replica. However, when using row-based replication, only table1 is affected on the replica; since table2 is in a different database, table2 on the replica is not changed by the UPDATE. Now suppose that, instead of the USE db1 statement, a USE db4 statement had been used:

    USE db4;
    UPDATE db1.table1, db2.table2 SET db1.table1.col1 = 10, db2.table2.col2 = 20;
    

    In this case, the UPDATE statement would have no effect on the replica when using statement-based replication. However, if you are using row-based replication, the UPDATE would change table1 on the replica, but not table2—in other words, only tables in the database named by --replicate-do-db are changed, and the choice of default database has no effect on this behavior.

    If you need cross-database updates to work, use --replicate-wild-do-table=db_name.% instead. See Section 17.2.5, “How Servers Evaluate Replication Filtering Rules”.

    Note

    This option affects replication in the same manner that --binlog-do-db affects binary logging, and the effects of the replication format on how --replicate-do-db affects replication behavior are the same as those of the logging format on the behavior of --binlog-do-db.

    This option has no effect on BEGIN, COMMIT, or ROLLBACK statements.

  • --replicate-ignore-db=db_name

    Command-Line Format --replicate-ignore-db=name
    Type String

    Creates a replication filter using the name of a database. Such filters can also be created using CHANGE REPLICATION FILTER REPLICATE_IGNORE_DB.

    This option supports channel specific replication filters, enabling multi-source replicas to use specific filters for different sources. To configure a channel specific replication filter on a channel named channel_1 use --replicate-ignore-db:channel_1:db_name. In this case, the first colon is interpreted as a separator and subsequent colons are literal colons. See Section 17.2.5.4, “Replication Channel Based Filters” for more information.

    Note

    Global replication filters cannot be used on a MySQL server instance that is configured for Group Replication, because filtering transactions on some servers would make the group unable to reach agreement on a consistent state. Channel specific replication filters can be used on replication channels that are not directly involved with Group Replication, such as where a group member also acts as a replica to a source that is outside the group. They cannot be used on the group_replication_applier or group_replication_recovery channels.

    To specify more than one database to ignore, use this option multiple times, once for each database. Because database names can contain commas, if you supply a comma-separated list, it is treated as the name of a single database.

    As with --replicate-do-db, the precise effect of this filtering depends on whether statement-based or row-based replication is in use, and are described in the next several paragraphs.

    Statement-based replication.  Tells the replication SQL thread not to replicate any statement where the default database (that is, the one selected by USE) is db_name.

    Row-based replication.  Tells the replication SQL thread not to update any tables in the database db_name. The default database has no effect.

    When using statement-based replication, the following example does not work as you might expect. Suppose that the replica is started with --replicate-ignore-db=sales and you issue the following statements on the source:

    USE prices;
    UPDATE sales.january SET amount=amount+1000;
    

    The UPDATE statement is replicated in such a case because --replicate-ignore-db applies only to the default database (determined by the USE statement). Because the sales database was specified explicitly in the statement, the statement has not been filtered. However, when using row-based replication, the UPDATE statement's effects are not propagated to the replica, and the replica's copy of the sales.january table is unchanged; in this instance, --replicate-ignore-db=sales causes all changes made to tables in the source's copy of the sales database to be ignored by the replica.

    You should not use this option if you are using cross-database updates and you do not want these updates to be replicated. See Section 17.2.5, “How Servers Evaluate Replication Filtering Rules”.

    If you need cross-database updates to work, use --replicate-wild-ignore-table=db_name.% instead. See Section 17.2.5, “How Servers Evaluate Replication Filtering Rules”.

    Note

    This option affects replication in the same manner that --binlog-ignore-db affects binary logging, and the effects of the replication format on how --replicate-ignore-db affects replication behavior are the same as those of the logging format on the behavior of --binlog-ignore-db.

    This option has no effect on BEGIN, COMMIT, or ROLLBACK statements.

  • --replicate-do-table=db_name.tbl_name

    Command-Line Format --replicate-do-table=name
    Type String

    Creates a replication filter by telling the replication SQL thread to restrict replication to a given table. To specify more than one table, use this option multiple times, once for each table. This works for both cross-database updates and default database updates, in contrast to --replicate-do-db. See Section 17.2.5, “How Servers Evaluate Replication Filtering Rules”. You can also create such a filter by issuing a CHANGE REPLICATION FILTER REPLICATE_DO_TABLE statement.

    This option supports channel specific replication filters, enabling multi-source replicas to use specific filters for different sources. To configure a channel specific replication filter on a channel named channel_1 use --replicate-do-table:channel_1:db_name.tbl_name. In this case, the first colon is interpreted as a separator and subsequent colons are literal colons. See Section 17.2.5.4, “Replication Channel Based Filters” for more information.

    Note

    Global replication filters cannot be used on a MySQL server instance that is configured for Group Replication, because filtering transactions on some servers would make the group unable to reach agreement on a consistent state. Channel specific replication filters can be used on replication channels that are not directly involved with Group Replication, such as where a group member also acts as a replica to a source that is outside the group. They cannot be used on the group_replication_applier or group_replication_recovery channels.

    This option affects only statements that apply to tables. It does not affect statements that apply only to other database objects, such as stored routines. To filter statements operating on stored routines, use one or more of the --replicate-*-db options.

  • --replicate-ignore-table=db_name.tbl_name

    Command-Line Format --replicate-ignore-table=name
    Type String

    Creates a replication filter by telling the replication SQL thread not to replicate any statement that updates the specified table, even if any other tables might be updated by the same statement. To specify more than one table to ignore, use this option multiple times, once for each table. This works for cross-database updates, in contrast to --replicate-ignore-db. See Section 17.2.5, “How Servers Evaluate Replication Filtering Rules”. You can also create such a filter by issuing a CHANGE REPLICATION FILTER REPLICATE_IGNORE_TABLE statement.

    This option supports channel specific replication filters, enabling multi-source replicas to use specific filters for different sources. To configure a channel specific replication filter on a channel named channel_1 use --replicate-ignore-table:channel_1:db_name.tbl_name. In this case, the first colon is interpreted as a separator and subsequent colons are literal colons. See Section 17.2.5.4, “Replication Channel Based Filters” for more information.

    Note

    Global replication filters cannot be used on a MySQL server instance that is configured for Group Replication, because filtering transactions on some servers would make the group unable to reach agreement on a consistent state. Channel specific replication filters can be used on replication channels that are not directly involved with Group Replication, such as where a group member also acts as a replica to a source that is outside the group. They cannot be used on the group_replication_applier or group_replication_recovery channels.

    This option affects only statements that apply to tables. It does not affect statements that apply only to other database objects, such as stored routines. To filter statements operating on stored routines, use one or more of the --replicate-*-db options.

  • --replicate-rewrite-db=from_name->to_name

    Command-Line Format --replicate-rewrite-db=old_name->new_name
    Type String

    Tells the replica to create a replication filter that translates the specified database to to_name if it was from_name on the source. Only statements involving tables are affected, not statements such as CREATE DATABASE, DROP DATABASE, and ALTER DATABASE.

    To specify multiple rewrites, use this option multiple times. The server uses the first one with a from_name value that matches. The database name translation is done before the --replicate-* rules are tested. You can also create such a filter by issuing a CHANGE REPLICATION FILTER REPLICATE_REWRITE_DB statement.

    If you use the --replicate-rewrite-db option on the command line and the > character is special to your command interpreter, quote the option value. For example:

    shell> mysqld --replicate-rewrite-db="olddb->newdb"
    

    The effect of the --replicate-rewrite-db option differs depending on whether statement-based or row-based binary logging format is used for the query. With statement-based format, DML statements are translated based on the current database, as specified by the USE statement. With row-based format, DML statements are translated based on the database where the modified table exists. DDL statements are always filtered based on the current database, as specified by the USE statement, regardless of the binary logging format.

    To ensure that rewriting produces the expected results, particularly in combination with other replication filtering options, follow these recommendations when you use the --replicate-rewrite-db option:

    • Create the from_name and to_name databases manually on the source and the replica with different names.

    • If you use statement-based or mixed binary logging format, do not use cross-database queries, and do not specify database names in queries. For both DDL and DML statements, rely on the USE statement to specify the current database, and use only the table name in queries.

    • If you use row-based binary logging format exclusively, for DDL statements, rely on the USE statement to specify the current database, and use only the table name in queries. For DML statements, you can use a fully qualified table name (db.table) if you want.

    If these recommendations are followed, it is safe to use the --replicate-rewrite-db option in combination with table-level replication filtering options such as --replicate-do-table.

    This option supports channel specific replication filters, enabling multi-source replicas to use specific filters for different sources. Specify the channel name followed by a colon, followed by the filter specification. The first colon is interpreted as a separator, and any subsequent colons are interpreted as literal colons. For example, to configure a channel specific replication filter on a channel named channel_1, use:

    shell> mysqld --replicate-rewrite-db=channel_1:db_name1->db_name2
    

    If you use a colon but do not specify a channel name, the option configures the replication filter for the default replication channel. See Section 17.2.5.4, “Replication Channel Based Filters” for more information.

    Note

    Global replication filters cannot be used on a MySQL server instance that is configured for Group Replication, because filtering transactions on some servers would make the group unable to reach agreement on a consistent state. Channel specific replication filters can be used on replication channels that are not directly involved with Group Replication, such as where a group member also acts as a replica to a source that is outside the group. They cannot be used on the group_replication_applier or group_replication_recovery channels.

  • --replicate-same-server-id

    Command-Line Format --replicate-same-server-id[={OFF|ON}]
    Type Boolean
    Default Value OFF

    This option is for use on replicas. The default is 0 (FALSE). With this option set to 1 (TRUE), the replica does not skip events that have its own server ID. This setting is normally useful only in rare configurations.

    When binary logging is enabled on a replica, the combination of the --replicate-same-server-id and --log-slave-updates options on the replica can cause infinite loops in replication if the server is part of a circular replication topology. (In MySQL 8.0, binary logging is enabled by default, and replica update logging is the default when binary logging is enabled.) However, the use of global transaction identifiers (GTIDs) prevents this situation by skipping the execution of transactions that have already been applied. If gtid_mode=ON is set on the replica, you can start the server with this combination of options, but you cannot change to any other GTID mode while the server is running. If any other GTID mode is set, the server does not start with this combination of options.

    By default, the replication I/O thread does not write binary log events to the relay log if they have the replica's server ID (this optimization helps save disk usage). If you want to use --replicate-same-server-id, be sure to start the replica with this option before you make the replica read its own events that you want the replication SQL thread to execute.

  • --replicate-wild-do-table=db_name.tbl_name

    Command-Line Format --replicate-wild-do-table=name
    Type String

    Creates a replication filter by telling the replication SQL thread to restrict replication to statements where any of the updated tables match the specified database and table name patterns. Patterns can contain the % and _ wildcard characters, which have the same meaning as for the LIKE pattern-matching operator. To specify more than one table, use this option multiple times, once for each table. This works for cross-database updates. See Section 17.2.5, “How Servers Evaluate Replication Filtering Rules”. You can also create such a filter by issuing a CHANGE REPLICATION FILTER REPLICATE_WILD_DO_TABLE statement.

    This option supports channel specific replication filters, enabling multi-source replicas to use specific filters for different sources. To configure a channel specific replication filter on a channel named channel_1 use --replicate-wild-do-table:channel_1:db_name.tbl_name. In this case, the first colon is interpreted as a separator and subsequent colons are literal colons. See Section 17.2.5.4, “Replication Channel Based Filters” for more information.

    Note

    Global replication filters cannot be used on a MySQL server instance that is configured for Group Replication, because filtering transactions on some servers would make the group unable to reach agreement on a consistent state. Channel specific replication filters can be used on replication channels that are not directly involved with Group Replication, such as where a group member also acts as a replica to a source that is outside the group. They cannot be used on the group_replication_applier or group_replication_recovery channels.

    This option applies to tables, views, and triggers. It does not apply to stored procedures and functions, or events. To filter statements operating on the latter objects, use one or more of the --replicate-*-db options.

    As an example, --replicate-wild-do-table=foo%.bar% replicates only updates that use a table where the database name starts with foo and the table name starts with bar.

    If the table name pattern is %, it matches any table name and the option also applies to database-level statements (CREATE DATABASE, DROP DATABASE, and ALTER DATABASE). For example, if you use --replicate-wild-do-table=foo%.%, database-level statements are replicated if the database name matches the pattern foo%.

    To include literal wildcard characters in the database or table name patterns, escape them with a backslash. For example, to replicate all tables of a database that is named my_own%db, but not replicate tables from the my1ownAABCdb database, you should escape the _ and % characters like this: --replicate-wild-do-table=my\_own\%db. If you use the option on the command line, you might need to double the backslashes or quote the option value, depending on your command interpreter. For example, with the bash shell, you would need to type --replicate-wild-do-table=my\\_own\\%db.

  • --replicate-wild-ignore-table=db_name.tbl_name

    Command-Line Format --replicate-wild-ignore-table=name
    Type String

    Creates a replication filter which keeps the replication SQL thread from replicating a statement in which any table matches the given wildcard pattern. To specify more than one table to ignore, use this option multiple times, once for each table. This works for cross-database updates. See Section 17.2.5, “How Servers Evaluate Replication Filtering Rules”. You can also create such a filter by issuing a CHANGE REPLICATION FILTER REPLICATE_WILD_IGNORE_TABLE statement.

    This option supports channel specific replication filters, enabling multi-source replicas to use specific filters for different sources. To configure a channel specific replication filter on a channel named channel_1 use --replicate-wild-ignore:channel_1:db_name.tbl_name. In this case, the first colon is interpreted as a separator and subsequent colons are literal colons. See Section 17.2.5.4, “Replication Channel Based Filters” for more information.

    Note

    Global replication filters cannot be used on a MySQL server instance that is configured for Group Replication, because filtering transactions on some servers would make the group unable to reach agreement on a consistent state. Channel specific replication filters can be used on replication channels that are not directly involved with Group Replication, such as where a group member also acts as a replica to a source that is outside the group. They cannot be used on the group_replication_applier or group_replication_recovery channels.

    As an example, --replicate-wild-ignore-table=foo%.bar% does not replicate updates that use a table where the database name starts with foo and the table name starts with bar. For information about how matching works, see the description of the --replicate-wild-do-table option. The rules for including literal wildcard characters in the option value are the same as for --replicate-wild-ignore-table as well.

  • --skip-slave-start

    Command-Line Format --skip-slave-start[={OFF|ON}]
    Type Boolean
    Default Value OFF

    Tells the replica server not to start the replication I/O and SQL threads when the server starts. To start the threads later, use a START REPLICA | SLAVE statement.

  • --slave-skip-errors=[err_code1,err_code2,...|all|ddl_exist_errors]

    Command-Line Format --slave-skip-errors=name
    System Variable slave_skip_errors
    Scope Global
    Dynamic No
    SET_VAR Hint Applies No
    Type String
    Default Value OFF
    Valid Values

    OFF

    [list of error codes]

    all

    ddl_exist_errors

    Normally, replication stops when an error occurs on the replica, which gives you the opportunity to resolve the inconsistency in the data manually. This option causes the replication SQL thread to continue replication when a statement returns any of the errors listed in the option value.

    Do not use this option unless you fully understand why you are getting errors. If there are no bugs in your replication setup and client programs, and no bugs in MySQL itself, an error that stops replication should never occur. Indiscriminate use of this option results in replicas becoming hopelessly out of synchrony with the source, with you having no idea why this has occurred.

    For error codes, you should use the numbers provided by the error message in your replica's error log and in the output of SHOW REPLICA | SLAVE STATUS. Appendix B, Error Messages and Common Problems, lists server error codes.

    The shorthand value ddl_exist_errors is equivalent to the error code list 1007,1008,1050,1051,1054,1060,1061,1068,1094,1146.

    You can also (but should not) use the very nonrecommended value of all to cause the replica to ignore all error messages and keeps going regardless of what happens. Needless to say, if you use all, there are no guarantees regarding the integrity of your data. Please do not complain (or file bug reports) in this case if the replica's data is not anywhere close to what it is on the source. You have been warned.

    Examples:

    --slave-skip-errors=1062,1053
    --slave-skip-errors=all
    --slave-skip-errors=ddl_exist_errors
    
  • --slave-sql-verify-checksum={0|1}

    Command-Line Format --slave-sql-verify-checksum[={OFF|ON}]
    Type Boolean
    Default Value ON

    When this option is enabled, the replica examines checksums read from the relay log. In the event of a mismatch, the replica stops with an error.

The following options are used internally by the MySQL test suite for replication testing and debugging. They are not intended for use in a production setting.

  • --abort-slave-event-count

    Command-Line Format --abort-slave-event-count=#
    Type Integer
    Default Value 0
    Minimum Value 0

    When this option is set to some positive integer value other than 0 (the default) it affects replication behavior as follows: After the replication SQL thread has started, value log events are permitted to be executed; after that, the replication SQL thread does not receive any more events, just as if the network connection from the source were cut. The replication SQL thread continues to run, and the output from SHOW REPLICA | SLAVE STATUS displays Yes in both the Replica_IO_Running and the Replica_SQL_Running columns, but no further events are read from the relay log.

  • --disconnect-slave-event-count

    Command-Line Format --disconnect-slave-event-count=#
    Type Integer
    Default Value 0
System Variables Used on Replica Servers

The following list describes system variables for controlling replica servers. They can be set at server startup and some of them can be changed at runtime using SET. Server options used with replicas are listed earlier in this section.

  • init_slave

    Command-Line Format --init-slave=name
    System Variable init_slave
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type String

    This variable is similar to init_connect, but is a string to be executed by a replica server each time the replication SQL thread starts. The format of the string is the same as for the init_connect variable. The setting of this variable takes effect for subsequent START REPLICA | SLAVE statements.

    Note

    The replication SQL thread sends an acknowledgment to the client before it executes init_slave. Therefore, it is not guaranteed that init_slave has been executed when START REPLICA | SLAVE returns. See Section 13.4.2.7, “START REPLICA | SLAVE Statement” for more information.

  • log_slow_slave_statements

    Command-Line Format --log-slow-slave-statements[={OFF|ON}]
    System Variable log_slow_slave_statements
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Boolean
    Default Value OFF

    When the slow query log is enabled, this variable enables logging for queries that have taken more than long_query_time seconds to execute on the replica. Note that if row-based replication is in use (binlog_format=ROW), log_slow_slave_statements has no effect. Queries are only added to the replica's slow query log when they are logged in statement format in the binary log, that is, when binlog_format=STATEMENT is set, or when binlog_format=MIXED is set and the statement is logged in statement format. Slow queries that are logged in row format when binlog_format=MIXED is set, or that are logged when binlog_format=ROW is set, are not added to the replica's slow query log, even if log_slow_slave_statements is enabled.

    Setting log_slow_slave_statements has no immediate effect. The state of the variable applies on all subsequent START REPLICA | SLAVE statements. Also note that the global setting for long_query_time applies for the lifetime of the SQL thread. If you change that setting, you must stop and restart the replication SQL thread to implement the change there (for example, by issuing STOP REPLICA | SLAVE and START REPLICA | SLAVE statements with the SQL_THREAD option).

  • master_info_repository

    Command-Line Format --master-info-repository={FILE|TABLE}
    Deprecated 8.0.23
    System Variable master_info_repository
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type String
    Default Value (≥ 8.0.2) TABLE
    Default Value (≤ 8.0.1) FILE
    Valid Values

    FILE

    TABLE

    The use of this system variable is now deprecated. The setting TABLE is the default, and is required when multiple replication channels are configured. The alternative setting FILE was previously deprecated.

    With the default setting, the replica records metadata about the source, consisting of status and connection information, to an InnoDB table in the mysql system database named mysql.slave_master_info. For more information on the connection metadata repository, see Section 17.2.4, “Relay Log and Replication Metadata Repositories”.

    The FILE setting wrote the replica's connection metadata repository to a file, which was named master.info by default. The name could be changed using the --master-info-file option.

  • max_relay_log_size

    Command-Line Format --max-relay-log-size=#
    System Variable max_relay_log_size
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Integer
    Default Value 0
    Minimum Value 0
    Maximum Value 1073741824

    If a write by a replica to its relay log causes the current log file size to exceed the value of this variable, the replica rotates the relay logs (closes the current file and opens the next one). If max_relay_log_size is 0, the server uses max_binlog_size for both the binary log and the relay log. If max_relay_log_size is greater than 0, it constrains the size of the relay log, which enables you to have different sizes for the two logs. You must set max_relay_log_size to between 4096 bytes and 1GB (inclusive), or to 0. The default value is 0. See Section 17.2.3, “Replication Threads”.

  • relay_log

    Command-Line Format --relay-log=file_name
    System Variable relay_log
    Scope Global
    Dynamic No
    SET_VAR Hint Applies No
    Type File name

    The base name for relay log files. For the default replication channel, the default base name for relay logs is host_name-relay-bin. For non-default replication channels, the default base name for relay logs is host_name-relay-bin-channel, where channel is the name of the replication channel recorded in this relay log.

    The server writes the file in the data directory unless the base name is given with a leading absolute path name to specify a different directory. The server creates relay log files in sequence by adding a numeric suffix to the base name.

    The relay log and relay log index on a replication server cannot be given the same names as the binary log and binary log index, whose names are specified by the --log-bin and --log-bin-index options. The server issues an error message and does not start if the binary log and relay log file base names would be the same.

    Due to the manner in which MySQL parses server options, if you specify this variable at server startup, you must supply a value; the default base name is used only if the option is not actually specified. If you specify the relay_log system variable at server startup without specifying a value, unexpected behavior is likely to result; this behavior depends on the other options used, the order in which they are specified, and whether they are specified on the command line or in an option file. For more information about how MySQL handles server options, see Section 4.2.2, “Specifying Program Options”.

    If you specify this variable, the value specified is also used as the base name for the relay log index file. You can override this behavior by specifying a different relay log index file base name using the relay_log_index system variable.

    When the server reads an entry from the index file, it checks whether the entry contains a relative path. If it does, the relative part of the path is replaced with the absolute path set using the relay_log system variable. An absolute path remains unchanged; in such a case, the index must be edited manually to enable the new path or paths to be used.

    You may find the relay_log system variable useful in performing the following tasks:

    • Creating relay logs whose names are independent of host names.

    • If you need to put the relay logs in some area other than the data directory because your relay logs tend to be very large and you do not want to decrease max_relay_log_size.

    • To increase speed by using load-balancing between disks.

    You can obtain the relay log file name (and path) from the relay_log_basename system variable.

  • relay_log_basename

    System Variable relay_log_basename
    Scope Global
    Dynamic No
    SET_VAR Hint Applies No
    Type File name
    Default Value datadir + '/' + hostname + '-relay-bin'

    Holds the base name and complete path to the relay log file. The maximum variable length is 256. This variable is set by the server and is read only.

  • relay_log_index

    Command-Line Format --relay-log-index=file_name
    System Variable relay_log_index
    Scope Global
    Dynamic No
    SET_VAR Hint Applies No
    Type File name
    Default Value *host_name*-relay-bin.index

    The name for the relay log index file. The maximum variable length is 256. If you do not specify this variable, but the relay_log system variable is specified, its value is used as the default base name for the relay log index file. If relay_log is also not specified, then for the default replication channel, the default name is host_name-relay-bin.index, using the name of the host machine. For non-default replication channels, the default name is host_name-relay-bin-channel.index, where channel is the name of the replication channel recorded in this relay log index.

    The default location for relay log files is the data directory, or any other location that was specified using the relay_log system variable. You can use the relay_log_index system variable to specify an alternative location, by adding a leading absolute path name to the base name to specify a different directory.

    The relay log and relay log index on a replication server cannot be given the same names as the binary log and binary log index, whose names are specified by the --log-bin and --log-bin-index options. The server issues an error message and does not start if the binary log and relay log file base names would be the same.

    Due to the manner in which MySQL parses server options, if you specify this variable at server startup, you must supply a value; the default base name is used only if the option is not actually specified. If you specify the relay_log_index system variable at server startup without specifying a value, unexpected behavior is likely to result; this behavior depends on the other options used, the order in which they are specified, and whether they are specified on the command line or in an option file. For more information about how MySQL handles server options, see Section 4.2.2, “Specifying Program Options”.

  • relay_log_info_file

    Command-Line Format --relay-log-info-file=file_name
    Deprecated 8.0.18
    System Variable relay_log_info_file
    Scope Global
    Dynamic No
    SET_VAR Hint Applies No
    Type File name
    Default Value relay-log.info

    The use of this system variable is now deprecated. It was used to set the file name for the replica's applier metadata repository if relay_log_info_repository=FILE was set. relay_log_info_file and the use of the relay_log_info_repository system variable are deprecated because the use of a file for the applier metadata repository has been superseded by crash-safe tables. For information about the applier metadata repository, see Section 17.2.4.2, “Replication Metadata Repositories”.

  • relay_log_info_repository

    Command-Line Format --relay-log-info-repository=value
    Deprecated 8.0.23
    System Variable relay_log_info_repository
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type String
    Default Value (≥ 8.0.2) TABLE
    Default Value (≤ 8.0.1) FILE
    Valid Values

    FILE

    TABLE

    The use of this system variable is now deprecated. The setting TABLE is the default, and is required when multiple replication channels are configured. The TABLE setting for the replica's applier metadata repository is also required to make replication resilient to unexpected halts. See Section 17.4.2, “Handling an Unexpected Halt of a Replica” for more information. The alternative setting FILE was previously deprecated.

    With the default setting, the replica stores its applier metadata repository as an InnoDB table in the mysql system database named mysql.slave_relay_log_info. For more information on the applier metadata repository, see Section 17.2.4, “Relay Log and Replication Metadata Repositories”.

    The FILE setting wrote the replica's applier metadata repository to a file, which was named relay-log.info by default. The name could be changed using the relay_log_info_file system variable.

  • relay_log_purge

    Command-Line Format --relay-log-purge[={OFF|ON}]
    System Variable relay_log_purge
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Boolean
    Default Value ON

    Disables or enables automatic purging of relay log files as soon as they are not needed any more. The default value is 1 (ON).

  • relay_log_recovery

    Command-Line Format --relay-log-recovery[={OFF|ON}]
    System Variable relay_log_recovery
    Scope Global
    Dynamic No
    SET_VAR Hint Applies No
    Type Boolean
    Default Value OFF

    If enabled, this variable enables automatic relay log recovery immediately following server startup. The recovery process creates a new relay log file, initializes the SQL thread position to this new relay log, and initializes the I/O thread to the SQL thread position. Reading of the relay log from the source then continues.

    This global variable is read-only at runtime. Its value can be set with the --relay-log-recovery option at replica server startup, which should be used following an unexpected halt of a replica to ensure that no possibly corrupted relay logs are processed, and must be used in order to guarantee a crash-safe replica. The default value is 0 (disabled). For information on the combination of settings on a replica that is most resilient to unexpected halts, see Section 17.4.2, “Handling an Unexpected Halt of a Replica”.

    For a multithreaded replica (where slave_parallel_workers is greater than 0), setting --relay-log-recovery at startup automatically handles any inconsistencies and gaps in the sequence of transactions that have been executed from the relay log. These gaps can occur when file position based replication is in use. (For more details, see Section 17.5.1.34, “Replication and Transaction Inconsistencies”.) The relay log recovery process deals with gaps using the same method as the START REPLICA | SLAVE UNTIL SQL_AFTER_MTS_GAPS statement would. When the replica reaches a consistent gap-free state, the relay log recovery process goes on to fetch further transactions from the source beginning at the SQL (applier) thread position. When GTID-based replication is in use, this process is unnecessary, and from MySQL 8.0.18 a multithreaded replica automatically skips relay log recovery when MASTER_AUTO_POSITION is set to ON, so the setting for relay_log_recovery makes no difference in that case.

    Note

    This variable does not affect the following Group Replication channels:

    • group_replication_applier

    • group_replication_recovery

    Any other channels running on a group are affected, such as a channel which is replicating from an outside source or another group.

  • relay_log_space_limit

    Command-Line Format --relay-log-space-limit=#
    System Variable relay_log_space_limit
    Scope Global
    Dynamic No
    SET_VAR Hint Applies No
    Type Integer
    Default Value 0
    Minimum Value 0
    Maximum Value (64-bit platforms) 18446744073709551615
    Maximum Value (32-bit platforms) 4294967295

    The maximum amount of space to use for all relay logs.

  • report_host

    Command-Line Format --report-host=host_name
    System Variable report_host
    Scope Global
    Dynamic No
    SET_VAR Hint Applies No
    Type String

    The host name or IP address of the replica to be reported to the source during replica registration. This value appears in the output of SHOW REPLICAS | SHOW SLAVE HOSTS on the source server. Leave the value unset if you do not want the replica to register itself with the source.

    Note

    It is not sufficient for the source to simply read the IP address of the replica server from the TCP/IP socket after the replica connects. Due to NAT and other routing issues, that IP may not be valid for connecting to the replica from the source or other hosts.

  • report_password

    Command-Line Format --report-password=name
    System Variable report_password
    Scope Global
    Dynamic No
    SET_VAR Hint Applies No
    Type String

    The account password of the replica to be reported to the source during replica registration. This value appears in the output of SHOW REPLICAS | SHOW SLAVE HOSTS on the source server if the source was started with --show-slave-auth-info.

    Although the name of this variable might imply otherwise, report_password is not connected to the MySQL user privilege system and so is not necessarily (or even likely to be) the same as the password for the MySQL replication user account.

  • report_port

    Command-Line Format --report-port=port_num
    System Variable report_port
    Scope Global
    Dynamic No
    SET_VAR Hint Applies No
    Type Integer
    Default Value [slave_port]
    Minimum Value 0
    Maximum Value 65535

    The TCP/IP port number for connecting to the replica, to be reported to the source during replica registration. Set this only if the replica is listening on a nondefault port or if you have a special tunnel from the source or other clients to the replica. If you are not sure, do not use this option.

    The default value for this option is the port number actually used by the replica. This is also the default value displayed by SHOW REPLICAS | SHOW SLAVE HOSTS.

  • report_user

    Command-Line Format --report-user=name
    System Variable report_user
    Scope Global
    Dynamic No
    SET_VAR Hint Applies No
    Type String

    The account user name of the replica to be reported to the source during replica registration. This value appears in the output of SHOW REPLICAS | SHOW SLAVE HOSTS on the source server if the source was started with --show-slave-auth-info.

    Although the name of this variable might imply otherwise, report_user is not connected to the MySQL user privilege system and so is not necessarily (or even likely to be) the same as the name of the MySQL replication user account.

  • rpl_read_size

    Command-Line Format --rpl-read-size=#
    Introduced 8.0.11
    System Variable rpl_read_size
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Integer
    Default Value 8192
    Minimum Value 8192
    Maximum Value 4294967295

    The rpl_read_size system variable controls the minimum amount of data in bytes that is read from the binary log files and relay log files. If heavy disk I/O activity for these files is impeding performance for the database, increasing the read size might reduce file reads and I/O stalls when the file data is not currently cached by the operating system.

    The minimum and default value for rpl_read_size is 8192 bytes. The value must be a multiple of 4KB. Note that a buffer the size of this value is allocated for each thread that reads from the binary log and relay log files, including dump threads on sources and coordinator threads on replicas. Setting a large value might therefore have an impact on memory consumption for servers.

  • rpl_semi_sync_slave_enabled

    Command-Line Format --rpl-semi-sync-slave-enabled[={OFF|ON}]
    System Variable rpl_semi_sync_slave_enabled
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Boolean
    Default Value OFF

    Controls whether semisynchronous replication is enabled on the replica server. To enable or disable the plugin, set this variable to ON or OFF (or 1 or 0), respectively. The default is OFF.

    This variable is available only if the replica-side semisynchronous replication plugin is installed.

  • rpl_semi_sync_slave_trace_level

    Command-Line Format --rpl-semi-sync-slave-trace-level=#
    System Variable rpl_semi_sync_slave_trace_level
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Integer
    Default Value 32

    The semisynchronous replication debug trace level on the replica server. See rpl_semi_sync_master_trace_level for the permissible values.

    This variable is available only if the replica-side semisynchronous replication plugin is installed.

  • rpl_stop_slave_timeout

    Command-Line Format --rpl-stop-slave-timeout=seconds
    System Variable rpl_stop_slave_timeout
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Integer
    Default Value 31536000
    Minimum Value 2
    Maximum Value 31536000

    You can control the length of time (in seconds) that STOP REPLICA | SLAVE waits before timing out by setting this variable. This can be used to avoid deadlocks between STOP REPLICA | SLAVE and other SQL statements using different client connections to the replica.

    The maximum and default value of rpl_stop_slave_timeout is 31536000 seconds (1 year). The minimum is 2 seconds. Changes to this variable take effect for subsequent STOP REPLICA | SLAVE statements.

    This variable affects only the client that issues a STOP REPLICA | SLAVE statement. When the timeout is reached, the issuing client returns an error message stating that the command execution is incomplete. The client then stops waiting for the replication I/O and SQL threads to stop, but the replication threads continue to try to stop, and the STOP REPLICA | SLAVE instruction remains in effect. Once the replication threads are no longer busy, the STOP REPLICA | SLAVE statement is executed and the replica stops.

  • slave_checkpoint_group

    Command-Line Format --slave-checkpoint-group=#
    System Variable slave_checkpoint_group
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Integer
    Default Value 512
    Minimum Value 32
    Maximum Value 524280
    Block Size 8

    Sets the maximum number of transactions that can be processed by a multithreaded replica before a checkpoint operation is called to update its status as shown by SHOW REPLICA | SLAVE STATUS. Setting this variable has no effect on replicas for which multithreading is not enabled. Setting this variable has no immediate effect. The state of the variable applies on all subsequent START REPLICA | SLAVE commands.

    Note

    Multithreaded replicas are not currently supported by NDB Cluster, which silently ignores the setting for this variable. See Section 23.6.3, “Known Issues in NDB Cluster Replication”, for more information.

    This variable works in combination with the slave_checkpoint_period system variable in such a way that, when either limit is exceeded, the checkpoint is executed and the counters tracking both the number of transactions and the time elapsed since the last checkpoint are reset.

    The minimum allowed value for this variable is 32, unless the server was built using -DWITH_DEBUG, in which case the minimum value is 1. The effective value is always a multiple of 8; you can set it to a value that is not such a multiple, but the server rounds it down to the next lower multiple of 8 before storing the value. (Exception: No such rounding is performed by the debug server.) Regardless of how the server was built, the default value is 512, and the maximum allowed value is 524280.

  • slave_checkpoint_period

    Command-Line Format --slave-checkpoint-period=#
    System Variable slave_checkpoint_period
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Integer
    Default Value 300
    Minimum Value 1
    Maximum Value (64-bit platforms) 18446744073709551615
    Maximum Value (32-bit platforms) 4294967295
    Unit milliseconds

    Sets the maximum time (in milliseconds) that is allowed to pass before a checkpoint operation is called to update the status of a multithreaded replica as shown by SHOW REPLICA | SLAVE STATUS. Setting this variable has no effect on replicas for which multithreading is not enabled. Setting this variable takes effect for all replication channels immediately, including running channels.

    Note

    Multithreaded replicas are not currently supported by NDB Cluster, which silently ignores the setting for this variable. See Section 23.6.3, “Known Issues in NDB Cluster Replication”, for more information.

    This variable works in combination with the slave_checkpoint_group system variable in such a way that, when either limit is exceeded, the checkpoint is executed and the counters tracking both the number of transactions and the time elapsed since the last checkpoint are reset.

    The minimum allowed value for this variable is 1, unless the server was built using -DWITH_DEBUG, in which case the minimum value is 0. Regardless of how the server was built, the default value is 300, and the maximum possible value is 4294967296 (4GB).

  • slave_compressed_protocol

    Command-Line Format --slave-compressed-protocol[={OFF|ON}]
    Deprecated 8.0.18
    System Variable slave_compressed_protocol
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Boolean
    Default Value OFF

    Whether to use compression of the source/replica connection protocol if both source and replica support it. If this variable is disabled (the default), connections are uncompressed. Changes to this variable take effect on subsequent connection attempts; this includes after issuing a START REPLICA | SLAVE statement, as well as reconnections made by a running replication I/O thread (for example, after setting the MASTER_RETRY_COUNT option for the CHANGE MASTER TO statement).

    Binary log transaction compression (available as of MySQL 8.0.20), which is activated by the binlog_transaction_compression system variable, can also be used to save bandwidth. If you use binary log transaction compression in combination with protocol compression, protocol compression has less opportunity to act on the data, but can still compress headers and those events and transaction payloads that are uncompressed. For more information on binary log transaction compression, see Section 5.4.4.5, “Binary Log Transaction Compression”.

    As of MySQL 8.0.18, if slave_compressed_protocol is enabled, it takes precedence over any MASTER_COMPRESSION_ALGORITHMS option specified for the CHANGE MASTER TO statement. In this case, connections to the source use zlib compression if both the source and replica support that algorithm. If slave_compressed_protocol is disabled, the value of MASTER_COMPRESSION_ALGORITHMS applies. For more information, see Section 4.2.8, “Connection Compression Control”.

    As of MySQL 8.0.18, this system variable is deprecated. You should expect it to be removed in a future version of MySQL. See Configuring Legacy Connection Compression.

  • slave_exec_mode

    Command-Line Format --slave-exec-mode=mode
    System Variable slave_exec_mode
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Enumeration
    Default Value

    IDEMPOTENT (NDB)

    STRICT (Other)

    Valid Values

    IDEMPOTENT

    STRICT

    Controls how a replication thread resolves conflicts and errors during replication. IDEMPOTENT mode causes suppression of duplicate-key and no-key-found errors; STRICT means no such suppression takes place.

    IDEMPOTENT mode is intended for use in multi-source replication, circular replication, and some other special replication scenarios for NDB Cluster Replication. (See Section 23.6.10, “NDB Cluster Replication: Bidrectional and Circular Replication”, and Section 23.6.11, “NDB Cluster Replication Conflict Resolution”, for more information.) NDB Cluster ignores any value explicitly set for slave_exec_mode, and always treats it as IDEMPOTENT.

    In MySQL Server 8.0, STRICT mode is the default value.

    Setting this variable takes immediate effect for all replication channels, including running channels.

    For storage engines other than NDB, IDEMPOTENT mode should be used only when you are absolutely sure that duplicate-key errors and key-not-found errors can safely be ignored. It is meant to be used in fail-over scenarios for NDB Cluster where multi-source replication or circular replication is employed, and is not recommended for use in other cases.

  • slave_load_tmpdir

    Command-Line Format --slave-load-tmpdir=dir_name
    System Variable slave_load_tmpdir
    Scope Global
    Dynamic No
    SET_VAR Hint Applies No
    Type Directory name
    Default Value Value of --tmpdir

    The name of the directory where the replica creates temporary files. Setting this variable takes effect for all replication channels immediately, including running channels. The variable value is by default equal to the value of the tmpdir system variable, or the default that applies when that system variable is not specified.

    When the replication SQL thread replicates a LOAD DATA statement, it extracts the file to be loaded from the relay log into temporary files, and then loads these into the table. If the file loaded on the source is huge, the temporary files on the replica are huge, too. Therefore, it might be advisable to use this option to tell the replica to put temporary files in a directory located in some file system that has a lot of available space. In that case, the relay logs are huge as well, so you might also want to set the relay_log system variable to place the relay logs in that file system.

    The directory specified by this option should be located in a disk-based file system (not a memory-based file system) so that the temporary files used to replicate LOAD DATA statements can survive machine restarts. The directory also should not be one that is cleared by the operating system during the system startup process. However, replication can now continue after a restart if the temporary files have been removed.

  • slave_max_allowed_packet

    Command-Line Format --slave-max-allowed-packet=#
    System Variable slave_max_allowed_packet
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Integer
    Default Value 1073741824
    Minimum Value 1024
    Maximum Value 1073741824

    This option sets the maximum packet size in bytes that the replication SQL and I/O threads can handle. Setting this variable takes effect for all replication channels immediately, including running channels. It is possible for a source to write binary log events longer than its max_allowed_packet setting once the event header is added. The setting for slave_max_allowed_packet must be larger than the max_allowed_packet setting on the source, so that large updates using row-based replication do not cause replication to fail.

    This global variable always has a value that is a positive integer multiple of 1024; if you set it to some value that is not, the value is rounded down to the next highest multiple of 1024 for it is stored or used; setting slave_max_allowed_packet to 0 causes 1024 to be used. (A truncation warning is issued in all such cases.) The default and maximum value is 1073741824 (1 GB); the minimum is 1024.

  • slave_net_timeout

    Command-Line Format --slave-net-timeout=#
    System Variable slave_net_timeout
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Integer
    Default Value 60
    Minimum Value 1
    Maximum Value 4294967295

    The number of seconds to wait for more data or a heartbeat signal from the source before the replica considers the connection broken, aborts the read, and tries to reconnect. Setting this variable has no immediate effect. The state of the variable applies on all subsequent START REPLICA | SLAVE commands.

    The default value is 60 seconds (one minute). The first retry occurs immediately after the timeout. The interval between retries is controlled by the MASTER_CONNECT_RETRY option for the CHANGE MASTER TO statement, and the number of reconnection attempts is limited by the MASTER_RETRY_COUNT option for the CHANGE MASTER TO statement.

    The heartbeat interval, which stops the connection timeout occurring in the absence of data if the connection is still good, is controlled by the MASTER_HEARTBEAT_PERIOD option for the CHANGE MASTER TO statement. The heartbeat interval defaults to half the value of slave_net_timeout, and it is recorded in the replica's connection metadata repository and shown in the replication_connection_configuration Performance Schema table. Note that a change to the value or default setting of slave_net_timeout does not automatically change the heartbeat interval, whether that has been set explicitly or is using a previously calculated default. If the connection timeout is changed, you must also issue CHANGE MASTER TO to adjust the heartbeat interval to an appropriate value so that it occurs before the connection timeout.

  • slave_parallel_type

    Command-Line Format --slave-parallel-type=value
    System Variable slave_parallel_type
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Enumeration
    Default Value DATABASE
    Valid Values

    DATABASE

    LOGICAL_CLOCK

    For multithreaded replicas (replicas on which slave_parallel_workers is set to a value greater than 0), slave_parallel_type specifies the policy used to decide which transactions are allowed to execute in parallel on the replica. The variable has no effect on replicas for which multithreading is not enabled. The possible values are:

    • LOGICAL_CLOCK: Transactions that are part of the same binary log group commit on a source are applied in parallel on a replica. The dependencies between transactions are tracked based on their timestamps to provide additional parallelization where possible. When this value is set, the binlog_transaction_dependency_tracking system variable can be used on the source to specify that write sets are used for parallelization in place of timestamps, if a write set is available for the transaction and gives improved results compared to timestamps.

    • DATABASE: Transactions that update different databases are applied in parallel. This value is only appropriate if data is partitioned into multiple databases which are being updated independently and concurrently on the source. There must be no cross-database constraints, as such constraints may be violated on the replica.

    When slave_preserve_commit_order=1 is set, you can only use LOGICAL_CLOCK.

    When your replication topology uses multiple levels of replicas, LOGICAL_CLOCK may achieve less parallelization for each level the replica is away from the source. You can reduce this effect by using binlog_transaction_dependency_tracking on the source to specify that write sets are used instead of timestamps for parallelization where possible.

    When binary log transaction compression is enabled using the binlog_transaction_compression system variable, if slave_parallel_type is set to DATABASE, all the databases affected by the transaction are mapped before the transaction is scheduled. The use of binary log transaction compression with the DATABASE policy can reduce parallelism compared to uncompressed transactions, which are mapped and scheduled for each event.

  • slave_parallel_workers

    Command-Line Format --slave-parallel-workers=#
    System Variable slave_parallel_workers
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Integer
    Default Value 0
    Minimum Value 0
    Maximum Value 1024

    Enables multithreading on the replica and sets the number of applier threads for executing replication transactions in parallel. When the value is a number greater than 0, the replica is a multithreaded replica with the specified number of applier threads, plus a coordinator thread to manage them. If you are using multiple replication channels, each channel has this number of threads.

    Note

    Multithreaded replicas are not currently supported by NDB Cluster, which silently ignores the setting for this variable. See Section 23.6.3, “Known Issues in NDB Cluster Replication”, for more information.

    Retrying of transactions is supported when multithreading is enabled on a replica. When slave_preserve_commit_order=1, transactions on a replica are externalized on the replica in the same order as they appear in the replica's relay log. The way in which transactions are distributed among applier threads is configured by slave_parallel_type.

    To disable parallel execution, set this option to 0, which gives the replica a single applier thread and no coordinator thread. With this setting, the slave_parallel_type and slave_preserve_commit_order system variables have no effect and are ignored.

    Setting slave_parallel_workers has no immediate effect. The state of the variable applies on all subsequent START REPLICA | SLAVE statements.

  • slave_pending_jobs_size_max

    Command-Line Format --slave-pending-jobs-size-max=#
    System Variable slave_pending_jobs_size_max
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Integer
    Default Value (≥ 8.0.12) 128M
    Default Value (≤ 8.0.11) 16M
    Minimum Value 1024
    Maximum Value 16EiB
    Unit bytes
    Block Size 1024

    For multithreaded replicas, this variable sets the maximum amount of memory (in bytes) available to applier queues holding events not yet applied. Setting this variable has no effect on replicas for which multithreading is not enabled. Setting this variable has no immediate effect. The state of the variable applies on all subsequent START REPLICA | SLAVE commands.

    The minimum possible value for this variable is 1024 bytes; the default is 128MB. The maximum possible value is 18446744073709551615 (16 exbibytes). Values that are not exact multiples of 1024 bytes are rounded down to the next lower multiple of 1024 bytes prior to being stored.

    The value of this variable is a soft limit and can be set to match the normal workload. If an unusually large event exceeds this size, the transaction is held until all the worker threads have empty queues, and then processed. All subsequent transactions are held until the large transaction has been completed.

  • slave_preserve_commit_order

    Command-Line Format --slave-preserve-commit-order[={OFF|ON}]
    System Variable slave_preserve_commit_order
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Boolean
    Default Value OFF

    For multithreaded replicas (replicas on which slave_parallel_workers is set to a value greater than 0), setting slave_preserve_commit_order=1 ensures that transactions are executed and committed on the replica in the same order as they appear in the replica's relay log. This prevents gaps in the sequence of transactions that have been executed from the replica's relay log, and preserves the same transaction history on the replica as on the source (with the limitations listed below). This variable has no effect on replicas for which multithreading is not enabled.

    Up to and including MySQL 8.0.18, setting slave_preserve_commit_order=1 requires that binary logging (log_bin) and replica update logging (log_slave_updates) are enabled on the replica, which are the default settings from MySQL 8.0. From MySQL 8.0.19, binary logging and replica update logging are not required on the replica to set slave_preserve_commit_order=1, and can be disabled if wanted. In all releases, setting slave_preserve_commit_order=1 requires that slave_parallel_type is set to LOGICAL_CLOCK, which is not the default setting. Before changing the value of slave_preserve_commit_order and slave_parallel_type, the replication SQL thread (for all replication channels if you are using multiple replication channels) must be stopped.

    When slave_preserve_commit_order=0 is set, which is the default, the transactions that a multithreaded replica applies in parallel may commit out of order. Therefore, checking for the most recently executed transaction does not guarantee that all previous transactions from the source have been executed on the replica. There is a chance of gaps in the sequence of transactions that have been executed from the replica's relay log. This has implications for logging and recovery when using a multithreaded replica. See Section 17.5.1.34, “Replication and Transaction Inconsistencies” for more information.

    When slave_preserve_commit_order=1 is set, the executing worker thread waits until all previous transactions are committed before committing. While a given thread is waiting for other worker threads to commit their transactions, it reports its status as Waiting for preceding transaction to commit. With this mode, a multithreaded replica never enters a state that the source was not in. This supports the use of replication for read scale-out. See Section 17.4.5, “Using Replication for Scale-Out”.

    Note
    • slave_preserve_commit_order=1 does not prevent source binary log position lag, where Exec_master_log_pos is behind the position up to which transactions have been executed. See Section 17.5.1.34, “Replication and Transaction Inconsistencies”.

    • slave_preserve_commit_order=1 does not preserve the commit order and transaction history if the replica uses filters on its binary log, such as --binlog-do-db.

    • slave_preserve_commit_order=1 does not preserve the order of non-transactional DML updates. These might commit before transactions that precede them in the relay log, which might result in gaps in the sequence of transactions that have been executed from the replica's relay log.

    • In releases before MySQL 8.0.19, slave_preserve_commit_order=1 does not preserve the order of statements with an IF EXISTS clause when the object concerned does not exist. These might commit before transactions that precede them in the relay log, which might result in gaps in the sequence of transactions that have been executed from the replica's relay log.

    • A limitation to preserving the commit order on the replica can occur if statement-based replication is in use, and both transactional and non-transactional storage engines participate in a non-XA transaction that is rolled back on the source. Normally, non-XA transactions that are rolled back on the source are not replicated to the replica, but in this particular situation, the transaction might be replicated to the replica. If this does happen, a multithreaded replica without binary logging does not handle the transaction rollback, so the commit order on the replica diverges from the relay log order of the transactions in that case.

  • slave_rows_search_algorithms

    Command-Line Format --slave-rows-search-algorithms=value
    Deprecated 8.0.18
    System Variable slave_rows_search_algorithms
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Set
    Default Value (≥ 8.0.2) INDEX_SCAN,HASH_SCAN
    Default Value (≤ 8.0.1) TABLE_SCAN,INDEX_SCAN
    Valid Values

    TABLE_SCAN,INDEX_SCAN

    INDEX_SCAN,HASH_SCAN

    TABLE_SCAN,HASH_SCAN

    TABLE_SCAN,INDEX_SCAN,HASH_SCAN (equivalent to INDEX_SCAN,HASH_SCAN)

    When preparing batches of rows for row-based logging and replication, this system variable controls how the rows are searched for matches, in particular whether hash scans are used. The use of this system variable is now deprecated. The default setting INDEX_SCAN,HASH_SCAN is optimal for performance and works correctly in all scenarios. See Section 17.5.1.27, “Replication and Row Searches”.

  • slave_skip_errors

    Command-Line Format --slave-skip-errors=name
    System Variable slave_skip_errors
    Scope Global
    Dynamic No
    SET_VAR Hint Applies No
    Type String
    Default Value OFF
    Valid Values

    OFF

    [list of error codes]

    all

    ddl_exist_errors

    Normally, replication stops when an error occurs on the replica, which gives you the opportunity to resolve the inconsistency in the data manually. This variable causes the replication SQL thread to continue replication when a statement returns any of the errors listed in the variable value.

  • slave_sql_verify_checksum

    Command-Line Format --slave-sql-verify-checksum[={OFF|ON}]
    System Variable slave_sql_verify_checksum
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Boolean
    Default Value ON

    Cause the replication SQL thread to verify data using the checksums read from the relay log. In the event of a mismatch, the replica stops with an error. Setting this variable takes effect for all replication channels immediately, including running channels.

    Note

    The replication I/O thread always reads checksums if possible when accepting events from over the network.

  • slave_transaction_retries

    Command-Line Format --slave-transaction-retries=#
    System Variable slave_transaction_retries
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Integer
    Default Value 10
    Minimum Value 0
    Maximum Value (64-bit platforms) 18446744073709551615
    Maximum Value (32-bit platforms) 4294967295

    Sets the maximum number of times for replication SQL threads on a single-threaded or multithreaded replica to automatically retry failed transactions before stopping. Setting this variable takes effect for all replication channels immediately, including running channels. The default value is 10. Setting the variable to 0 disables automatic retrying of transactions.

    If a replication SQL thread fails to execute a transaction because of an InnoDB deadlock or because the transaction's execution time exceeded InnoDB's innodb_lock_wait_timeout or NDB's TransactionDeadlockDetectionTimeout or TransactionInactiveTimeout, it automatically retries slave_transaction_retries times before stopping with an error. Transactions with a non-temporary error are not retried.

    The Performance Schema table replication_applier_status shows the number of retries that took place on each replication channel, in the COUNT_TRANSACTIONS_RETRIES column. The Performance Schema table replication_applier_status_by_worker shows detailed information on transaction retries by individual applier threads on a single-threaded or multithreaded replica, and identifies the errors that caused the last transaction and the transaction currently in progress to be reattempted.

  • slave_type_conversions

    Command-Line Format --slave-type-conversions=set
    System Variable slave_type_conversions
    Scope Global
    Dynamic No
    SET_VAR Hint Applies No
    Type Set
    Default Value
    Valid Values

    ALL_LOSSY

    ALL_NON_LOSSY

    ALL_SIGNED

    ALL_UNSIGNED

    Controls the type conversion mode in effect on the replica when using row-based replication. Its value is a comma-delimited set of zero or more elements from the list: ALL_LOSSY, ALL_NON_LOSSY, ALL_SIGNED, ALL_UNSIGNED. Set this variable to an empty string to disallow type conversions between the source and the replica. Setting this variable takes effect for all replication channels immediately, including running channels.

    For additional information on type conversion modes applicable to attribute promotion and demotion in row-based replication, see Row-based replication: attribute promotion and demotion.

  • sql_slave_skip_counter

    System Variable sql_slave_skip_counter
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Integer
    Default Value 0
    Minimum Value 0
    Maximum Value (64-bit platforms) 18446744073709551615
    Maximum Value (32-bit platforms) 4294967295

    The number of events from the source that a replica should skip. Setting the option has no immediate effect. The variable applies to the next START REPLICA | SLAVE statement; the next START REPLICA | SLAVE statement also changes the value back to 0. When this variable is set to a nonzero value and there are multiple replication channels configured, the START REPLICA | SLAVE statement can only be used with the FOR CHANNEL channel clause.

    This option is incompatible with GTID-based replication, and must not be set to a nonzero value when gtid_mode=ON is set. If you need to skip transactions when employing GTIDs, use gtid_executed from the source instead. If you have enabled GTID assignment on a replication channel using the ASSIGN_GTIDS_TO_ANONYMOUS_TRANSACTIONS option of the CHANGE REPLICATION SOURCE TO statement, sql_slave_skip_counter is available. See Section 17.1.7.3, “Skipping Transactions”.

    Important

    If skipping the number of events specified by setting this variable would cause the replica to begin in the middle of an event group, the replica continues to skip until it finds the beginning of the next event group and begins from that point. For more information, see Section 17.1.7.3, “Skipping Transactions”.

  • sync_master_info

    Command-Line Format --sync-master-info=#
    System Variable sync_master_info
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Integer
    Default Value 10000
    Minimum Value 0
    Maximum Value (64-bit platforms) 18446744073709551615
    Maximum Value (32-bit platforms) 4294967295

    The number of events after which the replica updates the connection metadata repository. When the connection metadata repository is stored as an InnoDB table, which is the default from MySQL 8.0, it is updated after this number of events. If the connection metadata repository is stored as a file, which is deprecated from MySQL 8.0, the replica synchronizes its master.info file to disk (using fdatasync()) after this number of events. The default value is 10000, and a zero value means that the repository is never updated. Setting this variable takes effect for all replication channels immediately, including running channels.

  • sync_relay_log

    Command-Line Format --sync-relay-log=#
    System Variable sync_relay_log
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Integer
    Default Value 10000
    Minimum Value 0
    Maximum Value (64-bit platforms) 18446744073709551615
    Maximum Value (32-bit platforms) 4294967295

    If the value of this variable is greater than 0, the MySQL server synchronizes its relay log to disk (using fdatasync()) after every sync_relay_log events are written to the relay log. Setting this variable takes effect for all replication channels immediately, including running channels.

    Setting sync_relay_log to 0 causes no synchronization to be done to disk; in this case, the server relies on the operating system to flush the relay log's contents from time to time as for any other file.

    A value of 1 is the safest choice because in the event of an unexpected halt you lose at most one event from the relay log. However, it is also the slowest choice (unless the disk has a battery-backed cache, which makes synchronization very fast). For information on the combination of settings on a replica that is most resilient to unexpected halts, see Section 17.4.2, “Handling an Unexpected Halt of a Replica”.

  • sync_relay_log_info

    Command-Line Format --sync-relay-log-info=#
    System Variable sync_relay_log_info
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Integer
    Default Value 10000
    Minimum Value 0
    Maximum Value (64-bit platforms) 18446744073709551615
    Maximum Value (32-bit platforms) 4294967295

    The number of transactions after which the replica updates the applier metadata repository. When the applier metadata repository is stored as an InnoDB table, which is the default from MySQL 8.0, it is updated after every transaction and this system variable is ignored. If the applier metadata repository is stored as a file, which is deprecated from MySQL 8.0, the replica synchronizes its relay-log.info file to disk (using fdatasync()) after this number of transactions. The default value for sync_relay_log_info is 10000, and a zero value means that the file contents are only flushed by the operating system. Setting this variable takes effect for all replication channels immediately, including running channels.

17.1.6.4 Binary Logging Options and Variables

You can use the mysqld options and system variables that are described in this section to affect the operation of the binary log as well as to control which statements are written to the binary log. For additional information about the binary log, see Section 5.4.4, “The Binary Log”. For additional information about using MySQL server options and system variables, see Section 5.1.7, “Server Command Options”, and Section 5.1.8, “Server System Variables”.

Startup Options Used with Binary Logging

The following list describes startup options for enabling and configuring the binary log. System variables used with binary logging are discussed later in this section.

  • --binlog-row-event-max-size=N

    Command-Line Format --binlog-row-event-max-size=#
    System Variable (≥ 8.0.14) binlog_row_event_max_size
    Scope (≥ 8.0.14) Global
    Dynamic (≥ 8.0.14) No
    SET_VAR Hint Applies (≥ 8.0.14) No
    Type Integer
    Default Value 8192
    Minimum Value 256
    Maximum Value (64-bit platforms) 18446744073709551615
    Maximum Value (32-bit platforms) 4294967295

    When row-based binary logging is used, this setting is a soft limit on the maximum size of a row-based binary log event, in bytes. Where possible, rows stored in the binary log are grouped into events with a size not exceeding the value of this setting. If an event cannot be split, the maximum size can be exceeded. The value must be (or else gets rounded down to) a multiple of 256. The default is 8192 bytes.

  • --log-bin[=base_name]

    Command-Line Format --log-bin=file_name
    Type File name

    Specifies the base name to use for binary log files. With binary logging enabled, the server logs all statements that change data to the binary log, which is used for backup and replication. The binary log is a sequence of files with a base name and numeric extension. The --log-bin option value is the base name for the log sequence. The server creates binary log files in sequence by adding a numeric suffix to the base name.

    If you do not supply the --log-bin option, MySQL uses binlog as the default base name for the binary log files. For compatibility with earlier releases, if you supply the --log-bin option with no string or with an empty string, the base name defaults to host_name-bin, using the name of the host machine.

    The default location for binary log files is the data directory. You can use the --log-bin option to specify an alternative location, by adding a leading absolute path name to the base name to specify a different directory. When the server reads an entry from the binary log index file, which tracks the binary log files that have been used, it checks whether the entry contains a relative path. If it does, the relative part of the path is replaced with the absolute path set using the --log-bin option. An absolute path recorded in the binary log index file remains unchanged; in such a case, the index file must be edited manually to enable a new path or paths to be used. The binary log file base name and any specified path are available as the log_bin_basename system variable.

    In earlier MySQL versions, binary logging was disabled by default, and was enabled if you specified the --log-bin option. From MySQL 8.0, binary logging is enabled by default, whether or not you specify the --log-bin option. The exception is if you use mysqld to initialize the data directory manually by invoking it with the --initialize or --initialize-insecure option, when binary logging is disabled by default. It is possible to enable binary logging in this case by specifying the --log-bin option. When binary logging is enabled, the log_bin system variable, which shows the status of binary logging on the server, is set to ON.

    To disable binary logging, you can specify the --skip-log-bin or --disable-log-bin option at startup. If either of these options is specified and --log-bin is also specified, the option specified later takes precedence. When binary logging is disabled, the log_bin system variable is set to OFF.

    When GTIDs are in use on the server, if you disable binary logging when restarting the server after an abnormal shutdown, some GTIDs are likely to be lost, causing replication to fail. In a normal shutdown, the set of GTIDs from the current binary log file is saved in the mysql.gtid_executed table. Following an abnormal shutdown where this did not happen, during recovery the GTIDs are added to the table from the binary log file, provided that binary logging is still enabled. If binary logging is disabled for the server restart, the server cannot access the binary log file to recover the GTIDs, so replication cannot be started. Binary logging can be disabled safely after a normal shutdown.

    The --log-slave-updates and --slave-preserve-commit-order options require binary logging. If you disable binary logging, either omit these options, or specify --log-slave-updates=OFF and --skip-slave-preserve-commit-order. MySQL disables these options by default when --skip-log-bin or --disable-log-bin is specified. If you specify --log-slave-updates or --slave-preserve-commit-order together with --skip-log-bin or --disable-log-bin, a warning or error message is issued.

    In MySQL 5.7, a server ID had to be specified when binary logging was enabled, or the server would not start. In MySQL 8.0, the server_id system variable is set to 1 by default. The server can now be started with this default server ID when binary logging is enabled, but an informational message is issued if you do not specify a server ID explicitly by setting the server_id system variable. For servers that are used in a replication topology, you must specify a unique nonzero server ID for each server.

    For information on the format and management of the binary log, see Section 5.4.4, “The Binary Log”.

  • --log-bin-index[=file_name]

    Command-Line Format --log-bin-index=file_name
    System Variable log_bin_index
    Scope Global
    Dynamic No
    SET_VAR Hint Applies No
    Type File name

    The name for the binary log index file, which contains the names of the binary log files. By default, it has the same location and base name as the value specified for the binary log files using the --log-bin option, plus the extension .index. If you do not specify --log-bin, the default binary log index file name is binlog.index. If you specify --log-bin option with no string or an empty string, the default binary log index file name is host_name-bin.index, using the name of the host machine.

    For information on the format and management of the binary log, see Section 5.4.4, “The Binary Log”.

Statement selection options.  The options in the following list affect which statements are written to the binary log, and thus sent by a replication source server to its replicas. There are also options for replicas that control which statements received from the source should be executed or ignored. For details, see Section 17.1.6.3, “Replica Server Options and Variables”.

  • --binlog-do-db=db_name

    Command-Line Format --binlog-do-db=name
    Type String

    This option affects binary logging in a manner similar to the way that --replicate-do-db affects replication.

    The effects of this option depend on whether the statement-based or row-based logging format is in use, in the same way that the effects of --replicate-do-db depend on whether statement-based or row-based replication is in use. You should keep in mind that the format used to log a given statement may not necessarily be the same as that indicated by the value of binlog_format. For example, DDL statements such as CREATE TABLE and ALTER TABLE are always logged as statements, without regard to the logging format in effect, so the following statement-based rules for --binlog-do-db always apply in determining whether or not the statement is logged.

    Statement-based logging.  Only those statements are written to the binary log where the default database (that is, the one selected by USE) is db_name. To specify more than one database, use this option multiple times, once for each database; however, doing so does not cause cross-database statements such as UPDATE some_db.some_table SET foo='bar' to be logged while a different database (or no database) is selected.

    Warning

    To specify multiple databases you must use multiple instances of this option. Because database names can contain commas, the list is treated as the name of a single database if you supply a comma-separated list.

    An example of what does not work as you might expect when using statement-based logging: If the server is started with --binlog-do-db=sales and you issue the following statements, the UPDATE statement is not logged:

    USE prices;
    UPDATE sales.january SET amount=amount+1000;
    

    The main reason for this just check the default database behavior is that it is difficult from the statement alone to know whether it should be replicated (for example, if you are using multiple-table DELETE statements or multiple-table UPDATE statements that act across multiple databases). It is also faster to check only the default database rather than all databases if there is no need.

    Another case which may not be self-evident occurs when a given database is replicated even though it was not specified when setting the option. If the server is started with --binlog-do-db=sales, the following UPDATE statement is logged even though prices was not included when setting --binlog-do-db:

    USE sales;
    UPDATE prices.discounts SET percentage = percentage + 10;
    

    Because sales is the default database when the UPDATE statement is issued, the UPDATE is logged.

    Row-based logging.  Logging is restricted to database db_name. Only changes to tables belonging to db_name are logged; the default database has no effect on this. Suppose that the server is started with --binlog-do-db=sales and row-based logging is in effect, and then the following statements are executed:

    USE prices;
    UPDATE sales.february SET amount=amount+100;
    

    The changes to the february table in the sales database are logged in accordance with the UPDATE statement; this occurs whether or not the USE statement was issued. However, when using the row-based logging format and --binlog-do-db=sales, changes made by the following UPDATE are not logged:

    USE prices;
    UPDATE prices.march SET amount=amount-25;
    

    Even if the USE prices statement were changed to USE sales, the UPDATE statement's effects would still not be written to the binary log.

    Another important difference in --binlog-do-db handling for statement-based logging as opposed to the row-based logging occurs with regard to statements that refer to multiple databases. Suppose that the server is started with --binlog-do-db=db1, and the following statements are executed:

    USE db1;
    UPDATE db1.table1, db2.table2 SET db1.table1.col1 = 10, db2.table2.col2 = 20;
    

    If you are using statement-based logging, the updates to both tables are written to the binary log. However, when using the row-based format, only the changes to table1 are logged; table2 is in a different database, so it is not changed by the UPDATE. Now suppose that, instead of the USE db1 statement, a USE db4 statement had been used:

    USE db4;
    UPDATE db1.table1, db2.table2 SET db1.table1.col1 = 10, db2.table2.col2 = 20;
    

    In this case, the UPDATE statement is not written to the binary log when using statement-based logging. However, when using row-based logging, the change to table1 is logged, but not that to table2—in other words, only changes to tables in the database named by --binlog-do-db are logged, and the choice of default database has no effect on this behavior.

  • --binlog-ignore-db=db_name

    Command-Line Format --binlog-ignore-db=name
    Type String

    This option affects binary logging in a manner similar to the way that --replicate-ignore-db affects replication.

    The effects of this option depend on whether the statement-based or row-based logging format is in use, in the same way that the effects of --replicate-ignore-db depend on whether statement-based or row-based replication is in use. You should keep in mind that the format used to log a given statement may not necessarily be the same as that indicated by the value of binlog_format. For example, DDL statements such as CREATE TABLE and ALTER TABLE are always logged as statements, without regard to the logging format in effect, so the following statement-based rules for --binlog-ignore-db always apply in determining whether or not the statement is logged.

    Statement-based logging.  Tells the server to not log any statement where the default database (that is, the one selected by USE) is db_name.

    When there is no default database, no --binlog-ignore-db options are applied, and such statements are always logged. (Bug #11829838, Bug #60188)

    Row-based format.  Tells the server not to log updates to any tables in the database db_name. The current database has no effect.

    When using statement-based logging, the following example does not work as you might expect. Suppose that the server is started with --binlog-ignore-db=sales and you issue the following statements:

    USE prices;
    UPDATE sales.january SET amount=amount+1000;
    

    The UPDATE statement is logged in such a case because --binlog-ignore-db applies only to the default database (determined by the USE statement). Because the sales database was specified explicitly in the statement, the statement has not been filtered. However, when using row-based logging, the UPDATE statement's effects are not written to the binary log, which means that no changes to the sales.january table are logged; in this instance, --binlog-ignore-db=sales causes all changes made to tables in the source's copy of the sales database to be ignored for purposes of binary logging.

    To specify more than one database to ignore, use this option multiple times, once for each database. Because database names can contain commas, the list is treated as the name of a single database if you supply a comma-separated list.

    You should not use this option if you are using cross-database updates and you do not want these updates to be logged.

Checksum options.  MySQL supports reading and writing of binary log checksums. These are enabled using the two options listed here:

  • --binlog-checksum={NONE|CRC32}

    Command-Line Format --binlog-checksum=type
    Type String
    Default Value CRC32
    Valid Values

    NONE

    CRC32

    Enabling this option causes the source to write checksums for events written to the binary log. Set to NONE to disable, or the name of the algorithm to be used for generating checksums; currently, only CRC32 checksums are supported, and CRC32 is the default. You cannot change the setting for this option within a transaction.

To control reading of checksums by the replica (from the relay log), use the --slave-sql-verify-checksum option.

Testing and debugging options.  The following binary log options are used in replication testing and debugging. They are not intended for use in normal operations.

  • --max-binlog-dump-events=N

    Command-Line Format --max-binlog-dump-events=#
    Type Integer
    Default Value 0

    This option is used internally by the MySQL test suite for replication testing and debugging.

  • --sporadic-binlog-dump-fail

    Command-Line Format --sporadic-binlog-dump-fail[={OFF|ON}]
    Type Boolean
    Default Value OFF

    This option is used internally by the MySQL test suite for replication testing and debugging.

System Variables Used with Binary Logging

The following list describes system variables for controlling binary logging. They can be set at server startup and some of them can be changed at runtime using SET. Server options used to control binary logging are listed earlier in this section.

  • binlog_cache_size

    Command-Line Format --binlog-cache-size=#
    System Variable binlog_cache_size
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Integer
    Default Value 32768
    Minimum Value 4096
    Maximum Value (64-bit platforms) 18446744073709551615
    Maximum Value (32-bit platforms) 4294967295

    The size of the memory buffer to hold changes to the binary log during a transaction. When binary logging is enabled on the server (with the log_bin system variable set to ON), a binary log cache is allocated for each client if the server supports any transactional storage engines. If the data for the transaction exceeds the space in the memory buffer, the excess data is stored in a temporary file. When binary log encryption is active on the server, the memory buffer is not encrypted, but (from MySQL 8.0.17) any temporary file used to hold the binary log cache is encrypted. After each transaction is committed, the binary log cache is reset by clearing the memory buffer and truncating the temporary file if used.

    If you often use large transactions, you can increase this cache size to get better performance by reducing or eliminating the need to write to temporary files. The Binlog_cache_use and Binlog_cache_disk_use status variables can be useful for tuning the size of this variable. See Section 5.4.4, “The Binary Log”.

    binlog_cache_size sets the size for the transaction cache only; the size of the statement cache is governed by the binlog_stmt_cache_size system variable.

  • binlog_checksum

    Command-Line Format --binlog-checksum=name
    System Variable binlog_checksum
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type String
    Default Value CRC32
    Valid Values

    NONE

    CRC32

    When enabled, this variable causes the source to write a checksum for each event in the binary log. binlog_checksum supports the values NONE (which disables checksums) and CRC32. The default is CRC32. When binlog_checksum is disabled (value NONE), the server verifies that it is writing only complete events to the binary log by writing and checking the event length (rather than a checksum) for each event.

    Setting this variable on the source to a value unrecognized by the replica causes the replica to set its own binlog_checksum value to NONE, and to stop replication with an error. If backward compatibility with older replicas is a concern, you may want to set the value explicitly to NONE.

    Up to and including MySQL 8.0.20, Group Replication cannot make use of checksums and does not support their presence in the binary log, so you must set binlog_checksum=NONE when configuring a server instance to become a group member. From MySQL 8.0.21, Group Replication supports checksums, so group members may use the default setting.

    Changing the value of binlog_checksum causes the binary log to be rotated, because checksums must be written for an entire binary log file, and never for only part of one. You cannot change the value of binlog_checksum within a transaction.

    When binary log transaction compression is enabled using the binlog_transaction_compression system variable, checksums are not written for individual events in a compressed transaction payload. Instead a checksum is written for the GTID event, and a checksum for the compressed Transaction_payload_event.

  • binlog_direct_non_transactional_updates

    Command-Line Format --binlog-direct-non-transactional-updates[={OFF|ON}]
    System Variable binlog_direct_non_transactional_updates
    Scope Global, Session
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Boolean
    Default Value OFF

    Due to concurrency issues, a replica can become inconsistent when a transaction contains updates to both transactional and nontransactional tables. MySQL tries to preserve causality among these statements by writing nontransactional statements to the transaction cache, which is flushed upon commit. However, problems arise when modifications done to nontransactional tables on behalf of a transaction become immediately visible to other connections because these changes may not be written immediately into the binary log.

    The binlog_direct_non_transactional_updates variable offers one possible workaround to this issue. By default, this variable is disabled. Enabling binlog_direct_non_transactional_updates causes updates to nontransactional tables to be written directly to the binary log, rather than to the transaction cache.

    As of MySQL 8.0.14, setting the session value of this system variable is a restricted operation. The session user must have privileges sufficient to set restricted session variables. See Section 5.1.9.1, “System Variable Privileges”.

    binlog_direct_non_transactional_updates works only for statements that are replicated using the statement-based binary logging format; that is, it works only when the value of binlog_format is STATEMENT, or when binlog_format is MIXED and a given statement is being replicated using the statement-based format. This variable has no effect when the binary log format is ROW, or when binlog_format is set to MIXED and a given statement is replicated using the row-based format.

    Important

    Before enabling this variable, you must make certain that there are no dependencies between transactional and nontransactional tables; an example of such a dependency would be the statement INSERT INTO myisam_table SELECT * FROM innodb_table. Otherwise, such statements are likely to cause the replica to diverge from the source.

    This variable has no effect when the binary log format is ROW or MIXED.

  • binlog_encryption

    Command-Line Format --binlog-encryption[={OFF|ON}]
    Introduced 8.0.14
    System Variable binlog_encryption
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Boolean
    Default Value OFF

    Enables encryption for binary log files and relay log files on this server. OFF is the default. ON sets encryption on for binary log files and relay log files. Binary logging does not need to be enabled on the server to enable encryption, so you can encrypt the relay log files on a replica that has no binary log. To use encryption, a keyring plugin must be installed and configured to supply MySQL Server's keyring service. For instructions to do this, see Section 6.4.4, “The MySQL Keyring”. Any supported keyring plugin can be used to store binary log encryption keys.

    When you first start the server with binary log encryption enabled, a new binary log encryption key is generated before the binary log and relay logs are initialized. This key is used to encrypt a file password for each binary log file (if the server has binary logging enabled) and relay log file (if the server has replication channels), and further keys generated from the file passwords are used to encrypt the data in the files. Relay log files are encrypted for all channels, including Group Replication applier channels and new channels that are created after encryption is activated. The binary log index file and relay log index file are never encrypted.

    If you activate encryption while the server is running, a new binary log encryption key is generated at that time. The exception is if encryption was active previously on the server and was then disabled, in which case the binary log encryption key that was in use before is used again. The binary log file and relay log files are rotated immediately, and file passwords for the new files and all subsequent binary log files and relay log files are encrypted using this binary log encryption key. Existing binary log files and relay log files still present on the server are not automatically encrypted, but you can purge them if they are no longer needed.

    If you deactivate encryption by changing the binlog_encryption system variable to OFF, the binary log file and relay log files are rotated immediately and all subsequent logging is unencrypted. Previously encrypted files are not automatically decrypted, but the server is still able to read them. The BINLOG_ENCRYPTION_ADMIN privilege (or the deprecated SUPER privilege) is required to activate or deactivate encryption while the server is running. Group Replication applier channels are not included in the relay log rotation request, so unencrypted logging for these channels does not start until their logs are rotated in normal use.

    For more information on binary log file and relay log file encryption, see Section 17.3.2, “Encrypting Binary Log Files and Relay Log Files”.

  • binlog_error_action

    Command-Line Format --binlog-error-action[=value]
    System Variable binlog_error_action
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Enumeration
    Default Value ABORT_SERVER
    Valid Values

    IGNORE_ERROR

    ABORT_SERVER

    Controls what happens when the server encounters an error such as not being able to write to, flush or synchronize the binary log, which can cause the source's binary log to become inconsistent and replicas to lose synchronization.

    This variable defaults to ABORT_SERVER, which makes the server halt logging and shut down whenever it encounters such an error with the binary log. On restart, recovery proceeds as in the case of an unexpected server halt (see Section 17.4.2, “Handling an Unexpected Halt of a Replica”).

    When binlog_error_action is set to IGNORE_ERROR, if the server encounters such an error it continues the ongoing transaction, logs the error then halts logging, and continues performing updates. To resume binary logging log_bin must be enabled again, which requires a server restart. This setting provides backward compatibility with older versions of MySQL.

  • binlog_expire_logs_seconds

    Command-Line Format --binlog-expire-logs-seconds=#
    Introduced 8.0.1
    System Variable binlog_expire_logs_seconds
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Integer
    Default Value (≥ 8.0.11) 2592000
    Default Value (≤ 8.0.4) 0
    Minimum Value 0
    Maximum Value 4294967295

    Sets the binary log expiration period in seconds. After their expiration period ends, binary log files can be automatically removed. Possible removals happen at startup and when the binary log is flushed. Log flushing occurs as indicated in Section 5.4, “MySQL Server Logs”.

    The default binary log expiration period is 2592000 seconds, which equals 30 days (30*24*60*60 seconds). The default applies if neither binlog_expire_logs_seconds nor the deprecated system variable expire_logs_days has a value set at startup. If a non-zero value for one of the variables binlog_expire_logs_seconds or expire_logs_days is set at startup, this value is used as the binary log expiration period. If a non-zero value for both of those variables is set at startup, the value for binlog_expire_logs_seconds is used as the binary log expiration period, and the value for expire_logs_days is ignored with a warning message.

    To disable automatic purging of the binary log, specify a value of 0 explicitly for binlog_expire_logs_seconds, and do not specify a value for expire_logs_days. For compatibility with earlier releases, automatic purging is also disabled if you specify a value of 0 explicitly for expire_logs_days and do not specify a value for binlog_expire_logs_seconds. In that case, the default for binlog_expire_logs_seconds is not applied.

    To remove binary log files manually, use the PURGE BINARY LOGS statement. See Section 13.4.1.1, “PURGE BINARY LOGS Statement”.

  • binlog_format

    Command-Line Format --binlog-format=format
    System Variable binlog_format
    Scope Global, Session
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Enumeration
    Default Value ROW
    Valid Values

    ROW

    STATEMENT

    MIXED

    This variable sets the binary logging format, and can be any one of STATEMENT, ROW, or MIXED. See Section 17.2.1, “Replication Formats”.

    binlog_format can be set at startup or at runtime, except that under some conditions, changing this variable at runtime is not possible or causes replication to fail, as described later.

    The default is ROW. Exception: In NDB Cluster, the default is MIXED; statement-based replication is not supported for NDB Cluster.

    Setting the session value of this system variable is a restricted operation. The session user must have privileges sufficient to set restricted session variables. See Section 5.1.9.1, “System Variable Privileges”.

    The rules governing when changes to this variable take effect and how long the effect lasts are the same as for other MySQL server system variables. For more information, see Section 13.7.6.1, “SET Syntax for Variable Assignment”.

    When MIXED is specified, statement-based replication is used, except for cases where only row-based replication is guaranteed to lead to proper results. For example, this happens when statements contain user-defined functions (UDF) or the UUID() function.

    For details of how stored programs (stored procedures and functions, triggers, and events) are handled when each binary logging format is set, see Section 25.7, “Stored Program Binary Logging”.

    There are exceptions when you cannot switch the replication format at runtime:

    • The replication format cannot be changed from within a stored function or a trigger.

    • If a session has open temporary tables, the replication format cannot be changed for the session (SET @@SESSION.binlog_format).

    • If any replication channel has open temporary tables, the replication format cannot be changed globally (SET @@GLOBAL.binlog_format or SET @@PERSIST.binlog_format).

    • If any replication channel applier thread is currently running, the replication format cannot be changed globally (SET @@GLOBAL.binlog_format or SET @@PERSIST.binlog_format).

    Trying to switch the replication format in any of these cases (or attempting to set the current replication format) results in an error. You can, however, use PERSIST_ONLY (SET @@PERSIST_ONLY.binlog_format) to change the replication format at any time, because this action does not modify the runtime global system variable value, and takes effect only after a server restart.

    Switching the replication format at runtime is not recommended when any temporary tables exist, because temporary tables are logged only when using statement-based replication, whereas with row-based replication and mixed replication, they are not logged.

    Changing the logging format on a replication source server does not cause a replica to change its logging format to match. Switching the replication format while replication is ongoing can cause issues if a replica has binary logging enabled, and the change results in the replica using STATEMENT format logging while the source is using ROW or MIXED format logging. A replica is not able to convert binary log entries received in ROW logging format to STATEMENT format for use in its own binary log, so this situation can cause replication to fail. For more information, see Section 5.4.4.2, “Setting The Binary Log Format”.

    The binary log format affects the behavior of the following server options:

    These effects are discussed in detail in the descriptions of the individual options.

  • binlog_group_commit_sync_delay

    Command-Line Format --binlog-group-commit-sync-delay=#
    System Variable binlog_group_commit_sync_delay
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Integer
    Default Value 0
    Minimum Value 0
    Maximum Value 1000000

    Controls how many microseconds the binary log commit waits before synchronizing the binary log file to disk. By default binlog_group_commit_sync_delay is set to 0, meaning that there is no delay. Setting binlog_group_commit_sync_delay to a microsecond delay enables more transactions to be synchronized together to disk at once, reducing the overall time to commit a group of transactions because the larger groups require fewer time units per group.

    When sync_binlog=0 or sync_binlog=1 is set, the delay specified by binlog_group_commit_sync_delay is applied for every binary log commit group before synchronization (or in the case of sync_binlog=0, before proceeding). When sync_binlog is set to a value n greater than 1, the delay is applied after every n binary log commit groups.

    Setting binlog_group_commit_sync_delay can increase the number of parallel committing transactions on any server that has (or might have after a failover) a replica, and therefore can increase parallel execution on the replicas. To benefit from this effect, the replica servers must have slave_parallel_type=LOGICAL_CLOCK set, and the effect is more significant when binlog_transaction_dependency_tracking=COMMIT_ORDER is also set. It is important to take into account both the source's throughput and the replicas' throughput when you are tuning the setting for binlog_group_commit_sync_delay.

    Setting binlog_group_commit_sync_delay can also reduce the number of fsync() calls to the binary log on any server (source or replica) that has a binary log.

    Note that setting binlog_group_commit_sync_delay increases the latency of transactions on the server, which might affect client applications. Also, on highly concurrent workloads, it is possible for the delay to increase contention and therefore reduce throughput. Typically, the benefits of setting a delay outweigh the drawbacks, but tuning should always be carried out to determine the optimal setting.

  • binlog_group_commit_sync_no_delay_count

    Command-Line Format --binlog-group-commit-sync-no-delay-count=#
    System Variable binlog_group_commit_sync_no_delay_count
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Integer
    Default Value 0
    Minimum Value 0
    Maximum Value 1000000

    The maximum number of transactions to wait for before aborting the current delay as specified by binlog_group_commit_sync_delay. If binlog_group_commit_sync_delay is set to 0, then this option has no effect.

  • binlog_max_flush_queue_time

    Command-Line Format --binlog-max-flush-queue-time=#
    Deprecated Yes
    System Variable binlog_max_flush_queue_time
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Integer
    Default Value 0
    Minimum Value 0
    Maximum Value 100000

    binlog_max_flush_queue_time is deprecated, and is marked for eventual removal in a future MySQL release. Formerly, this system variable controlled the time in microseconds to continue reading transactions from the flush queue before proceeding with group commit. It no longer has any effect.

  • binlog_order_commits

    Command-Line Format --binlog-order-commits[={OFF|ON}]
    System Variable binlog_order_commits
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Boolean
    Default Value ON

    When this variable is enabled on a replication source server (which is the default), transaction commit instructions issued to storage engines are serialized on a single thread, so that transactions are always committed in the same order as they are written to the binary log. Disabling this variable permits transaction commit instructions to be issued using multiple threads. Used in combination with binary log group commit, this prevents the commit rate of a single transaction being a bottleneck to throughput, and might therefore produce a performance improvement.

    Transactions are written to the binary log at the point when all the storage engines involved have confirmed that the transaction is prepared to commit. The binary log group commit logic then commits a group of transactions after their binary log write has taken place. When binlog_order_commits is disabled, because multiple threads are used for this process, transactions in a commit group might be committed in a different order from their order in the binary log. (Transactions from a single client always commit in chronological order.) In many cases this does not matter, as operations carried out in separate transactions should produce consistent results, and if that is not the case, a single transaction ought to be used instead.

    If you want to ensure that the transaction history on the source and on a multithreaded replica remains identical, set slave_preserve_commit_order=1 on the replica.

  • binlog_rotate_encryption_master_key_at_startup

    Command-Line Format --binlog-rotate-encryption-master-key-at-startup[={OFF|ON}]
    Introduced 8.0.14
    System Variable binlog_rotate_encryption_master_key_at_startup
    Scope Global
    Dynamic No
    SET_VAR Hint Applies No
    Type Boolean
    Default Value OFF

    Specifies whether or not the binary log master key is rotated at server startup. The binary log master key is the binary log encryption key that is used to encrypt file passwords for the binary log files and relay log files on the server. When a server is started for the first time with binary log encryption enabled (binlog_encryption=ON), a new binary log encryption key is generated and used as the binary log master key. If the binlog_rotate_encryption_master_key_at_startup system variable is also set to ON, whenever the server is restarted, a further binary log encryption key is generated and used as the binary log master key for all subsequent binary log files and relay log files. If the binlog_rotate_encryption_master_key_at_startup system variable is set to OFF, which is the default, the existing binary log master key is used again after the server restarts. For more information on binary log encryption keys and the binary log master key, see Section 17.3.2, “Encrypting Binary Log Files and Relay Log Files”.

  • binlog_row_event_max_size

    Command-Line Format --binlog-row-event-max-size=#
    System Variable (≥ 8.0.14) binlog_row_event_max_size
    Scope (≥ 8.0.14) Global
    Dynamic (≥ 8.0.14) No
    SET_VAR Hint Applies (≥ 8.0.14) No
    Type Integer
    Default Value 8192
    Minimum Value 256
    Maximum Value (64-bit platforms) 18446744073709551615
    Maximum Value (32-bit platforms) 4294967295

    When row-based binary logging is used, this setting is a soft limit on the maximum size of a row-based binary log event, in bytes. Where possible, rows stored in the binary log are grouped into events with a size not exceeding the value of this setting. If an event cannot be split, the maximum size can be exceeded. The value must be (or else gets rounded down to) a multiple of 256. The default is 8192 bytes.

    This global system variable is read-only and can be set only at server startup. Its value can therefore only be modified by using the PERSIST_ONLY keyword or the @@persist_only qualifier with the SET statement.

  • binlog_row_image

    Command-Line Format --binlog-row-image=image_type
    System Variable binlog_row_image
    Scope Global, Session
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Enumeration
    Default Value full
    Valid Values

    full (Log all columns)

    minimal (Log only changed columns, and columns needed to identify rows)

    noblob (Log all columns, except for unneeded BLOB and TEXT columns)

    For MySQL row-based replication, this variable determines how row images are written to the binary log.

    Setting the session value of this system variable is a restricted operation. The session user must have privileges sufficient to set restricted session variables. See Section 5.1.9.1, “System Variable Privileges”.

    In MySQL row-based replication, each row change event contains two images, a before image whose columns are matched against when searching for the row to be updated, and an after image containing the changes. Normally, MySQL logs full rows (that is, all columns) for both the before and after images. However, it is not strictly necessary to include every column in both images, and we can often save disk, memory, and network usage by logging only those columns which are actually required.

    Note

    When deleting a row, only the before image is logged, since there are no changed values to propagate following the deletion. When inserting a row, only the after image is logged, since there is no existing row to be matched. Only when updating a row are both the before and after images required, and both written to the binary log.

    For the before image, it is necessary only that the minimum set of columns required to uniquely identify rows is logged. If the table containing the row has a primary key, then only the primary key column or columns are written to the binary log. Otherwise, if the table has a unique key all of whose columns are NOT NULL, then only the columns in the unique key need be logged. (If the table has neither a primary key nor a unique key without any NULL columns, then all columns must be used in the before image, and logged.) In the after image, it is necessary to log only the columns which have actually changed.

    You can cause the server to log full or minimal rows using the binlog_row_image system variable. This variable actually takes one of three possible values, as shown in the following list:

    • full: Log all columns in both the before image and the after image.

    • minimal: Log only those columns in the before image that are required to identify the row to be changed; log only those columns in the after image where a value was specified by the SQL statement, or generated by auto-increment.

    • noblob: Log all columns (same as full), except for BLOB and TEXT columns that are not required to identify rows, or that have not changed.

    Note

    This variable is not supported by NDB Cluster; setting it has no effect on the logging of NDB tables.

    The default value is full.

    When using minimal or noblob, deletes and updates are guaranteed to work correctly for a given table if and only if the following conditions are true for both the source and destination tables:

    • All columns must be present and in the same order; each column must use the same data type as its counterpart in the other table.

    • The tables must have identical primary key definitions.

    (In other words, the tables must be identical with the possible exception of indexes that are not part of the tables' primary keys.)

    If these conditions are not met, it is possible that the primary key column values in the destination table may prove insufficient to provide a unique match for a delete or update. In this event, no warning or error is issued; the source and replica silently diverge, thus breaking consistency.

    Setting this variable has no effect when the binary logging format is STATEMENT. When binlog_format is MIXED, the setting for binlog_row_image is applied to changes that are logged using row-based format, but this setting has no effect on changes logged as statements.

    Setting binlog_row_image on either the global or session level does not cause an implicit commit; this means that this variable can be changed while a transaction is in progress without affecting the transaction.

  • binlog_row_metadata

    Command-Line Format --binlog-row-metadata=metadata_type
    Introduced 8.0.1
    System Variable binlog_row_metadata
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Enumeration
    Default Value MINIMAL
    Valid Values

    FULL (All metadata is included)

    MINIMAL (Limit included metadata)

    Configures the amount of table metadata added to the binary log when using row-based logging. When set to MINIMAL, the default, only metadata related to SIGNED flags, column character set and geometry types are logged. When set to FULL complete metadata for tables is logged, such as column name, ENUM or SET string values, PRIMARY KEY information, and so on.

    The extended metadata serves the following purposes:

    • Replicas use the metadata to transfer data when its table structure is different from the source's.

    • External software can use the metadata to decode row events and store the data into external databases, such as a data warehouse.

  • binlog_row_value_options

    Command-Line Format --binlog-row-value-options=#
    Introduced 8.0.3
    System Variable binlog_row_value_options
    Scope Global, Session
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Set
    Default Value ''
    Valid Values PARTIAL_JSON

    When set to PARTIAL_JSON, this enables use of a space-efficient binary log format for updates that modify only a small portion of a JSON document, which causes row-based replication to write only the modified parts of the JSON document to the after-image for the update in the binary log (rather than writing the full document). This works for an UPDATE statement which modifies a JSON column using any sequence of JSON_SET(), JSON_REPLACE(), and JSON_REMOVE(). If the modification requires more space than the full document, or if the server is unable to generate a partial update, the full document is used instead.

    Setting the session value of this system variable is a restricted operation. The session user must have privileges sufficient to set restricted session variables. See Section 5.1.9.1, “System Variable Privileges”.

    PARTIAL_JSON is the only supported value; to unset binlog_row_value_options, set its value to the empty string.

    binlog_row_value_options=PARTIAL_JSON takes effect only when binary logging is enabled and binlog_format is set to ROW or MIXED. Statement-based replication always logs only the modified parts of the JSON document, regardless of any value set for binlog_row_value_options. To maximize the amount of space saved, use binlog_row_image=NOBLOB or binlog_row_image=MINIMAL together with this option. binlog_row_image=FULL saves less space than either of these, since the full JSON document is stored in the before-image, and the partial update is stored only in the after-image.

    mysqlbinlog output includes partial JSON updates in the form of events encoded as base-64 strings using BINLOG statements. If the --verbose option is specified, mysqlbinlog displays the partial JSON updates as readable JSON using pseudo-SQL statements.

    MySQL Replication generates an error if a modification cannot be applied to the JSON document on the replica. This includes a failure to find the path. Be aware that, even with this and other safety checks, if a JSON document on a replica has diverged from that on the source and a partial update is applied, it remains theoretically possible to produce a valid but unexpected JSON document on the replica.

  • binlog_rows_query_log_events

    Command-Line Format --binlog-rows-query-log-events[={OFF|ON}]
    System Variable binlog_rows_query_log_events
    Scope Global, Session
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Boolean
    Default Value OFF

    This system variable affects row-based logging only. When enabled, it causes the server to write informational log events such as row query log events into its binary log. This information can be used for debugging and related purposes, such as obtaining the original query issued on the source when it cannot be reconstructed from the row updates.

    Setting the session value of this system variable is a restricted operation. The session user must have privileges sufficient to set restricted session variables. See Section 5.1.9.1, “System Variable Privileges”.

    These informational events are normally ignored by MySQL programs reading the binary log and so cause no issues when replicating or restoring from backup. To view them, increase the verbosity level by using mysqlbinlog's --verbose option twice, either as -vv or --verbose --verbose.

  • binlog_stmt_cache_size

    Command-Line Format --binlog-stmt-cache-size=#
    System Variable binlog_stmt_cache_size
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Integer
    Default Value 32768
    Minimum Value 4096
    Maximum Value (64-bit platforms) 18446744073709551615
    Maximum Value (32-bit platforms) 4294967295

    The size of the memory buffer for the binary log to hold nontransactional statements issued during a transaction. When binary logging is enabled on the server (with the log_bin system variable set to ON), separate binary log transaction and statement caches are allocated for each client if the server supports any transactional storage engines. If the data for the nontransactional statements used in the transaction exceeds the space in the memory buffer, the excess data is stored in a temporary file. When binary log encryption is active on the server, the memory buffer is not encrypted, but (from MySQL 8.0.17) any temporary file used to hold the binary log cache is encrypted. After each transaction is committed, the binary log statement cache is reset by clearing the memory buffer and truncating the temporary file if used.

    If you often use large nontransactional statements during transactions, you can increase this cache size to get better performance by reducing or eliminating the need to write to temporary files. The Binlog_stmt_cache_use and Binlog_stmt_cache_disk_use status variables can be useful for tuning the size of this variable. See Section 5.4.4, “The Binary Log”.

    The binlog_cache_size system variable sets the size for the transaction cache.

  • binlog_transaction_compression

    Command-Line Format --binlog-transaction-compression[={OFF|ON}]
    Introduced 8.0.20
    System Variable binlog_transaction_compression
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Boolean
    Default Value OFF

    Enables compression for transactions that are written to binary log files on this server. OFF is the default. Use the binlog_transaction_compression_level_zstd system variable to set the level for the zstd algorithm that is used for compression.

    When binary log transaction compression is enabled, transaction payloads are compressed and then written to the binary log file as a single event (Transaction_payload_event). Compressed transaction payloads remain in a compressed state while they are sent in the replication stream to replicas, other Group Replication group members, or clients such as mysqlbinlog, and are written to the relay log still in their compressed state. Binary log transaction compression therefore saves storage space both on the originator of the transaction and on the recipient (and for their backups), and saves network bandwidth when the transactions are sent between server instances.

    For binlog_transaction_compression=ON to have a direct effect, binary logging must be enabled on the server. When a MySQL server instance has no binary log, if it is at a release from MySQL 8.0.20, it can receive, handle, and display compressed transaction payloads regardless of its value for binlog_transaction_compression. Compressed transaction payloads received by such server instances are written in their compressed state to the relay log, so they benefit indirectly from compression carried out by other servers in the replication topology.

    This system variable cannot be changed within the context of a transaction. Setting the session value of this system variable is a restricted operation. The session user must have privileges sufficient to set restricted session variables. See Section 5.1.9.1, “System Variable Privileges”.

    For more information on binary log transaction compression, including details of what events are and are not compressed, and changes in behavior when transaction compression is in use, see Section 5.4.4.5, “Binary Log Transaction Compression”.

  • binlog_transaction_compression_level_zstd

    Command-Line Format --binlog-transaction-compression-level-zstd=#
    Introduced 8.0.20
    System Variable binlog_transaction_compression_level_zstd
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Integer
    Default Value 3
    Minimum Value 1
    Maximum Value 22

    Sets the compression level for binary log transaction compression on this server, which is enabled by the binlog_transaction_compression system variable. The value is an integer that determines the compression effort, from 1 (the lowest effort) to 22 (the highest effort). If you do not specify this system variable, the compression level is set to 3.

    As the compression level increases, the data compression ratio increases, which reduces the storage space and network bandwidth required for the transaction payload. However, the effort required for data compression also increases, taking time and CPU and memory resources on the originating server. Increases in the compression effort do not have a linear relationship to increases in the data compression ratio.

    This system variable cannot be changed within the context of a transaction. Setting the session value of this system variable is a restricted operation. The session user must have privileges sufficient to set restricted session variables. See Section 5.1.9.1, “System Variable Privileges”.

  • binlog_transaction_dependency_tracking

    Command-Line Format --binlog-transaction-dependency-tracking=value
    Introduced 8.0.1
    System Variable binlog_transaction_dependency_tracking
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Enumeration
    Default Value COMMIT_ORDER
    Valid Values

    COMMIT_ORDER

    WRITESET

    WRITESET_SESSION

    For a replication source server that has multithreaded replicas (replicas on which slave_parallel_workers is set to a value greater than 0), binlog_transaction_dependency_tracking specifies the source of dependency information that the source records in the binary log to help replicas determine which transactions can be executed in parallel. The possible values are:

    • COMMIT_ORDER: Dependency information is generated from the source's commit timestamps. This is the default.

    • WRITESET: Dependency information is generated from the source's write set, and any transactions that write different tuples can be parallelized.

    • WRITESET_SESSION: Dependency information is generated from the source's write set, and any transactions that write different tuples can be parallelized, with the exception that no two updates from the same session can be reordered.

    WRITESET and WRITESET_SESSION modes do not deliver any transaction dependencies that are less optimized than those that would have been returned in COMMIT_ORDER mode. When you set WRITESET or WRITESET_SESSION as the value, the source uses COMMIT_ORDER mode for any transactions that have empty or partial write sets, for any transactions that update tables without primary or unique keys, and for any transactions that update parent tables in a foreign key relationship.

    To set WRITESET or WRITESET_SESSION as the value for binlog_transaction_dependency_tracking, transaction_write_set_extraction must be set to specify an algorithm (not set to OFF). The default in MySQL 8.0 is that transaction_write_set_extraction is set to XXHASH64. The value that you select for transaction_write_set_extraction cannot be changed again while the value of binlog_transaction_dependency_tracking remains as WRITESET or WRITESET_SESSION.

    The number of row hashes to be kept and checked for the latest transaction to have changed a given row is determined by the value of binlog_transaction_dependency_history_size.

    For Group Replication, setting binlog_transaction_dependency_tracking=WRITESET_SESSION can improve performance for a group member, depending on the group's workload. Group Replication carries out its own parallelization after certification when applying transactions from the relay log, independently of the value set for binlog_transaction_dependency_tracking. However, the value of binlog_transaction_dependency_tracking does affect how transactions are written to the binary logs on Group Replication members. The dependency information in those logs is used to assist the process of state transfer from a donor's binary log for distributed recovery, which takes place whenever a member joins or rejoins the group.

  • binlog_transaction_dependency_history_size

    Command-Line Format --binlog-transaction-dependency-history-size=#
    Introduced 8.0.1
    System Variable binlog_transaction_dependency_history_size
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Integer
    Default Value 25000
    Minimum Value 1
    Maximum Value 1000000

    Sets an upper limit on the number of row hashes which are kept in memory and used for looking up the transaction that last modified a given row. Once this number of hashes has been reached, the history is purged.

  • expire_logs_days

    Command-Line Format --expire-logs-days=#
    Deprecated 8.0.3
    System Variable expire_logs_days
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Integer
    Default Value (≥ 8.0.11) 0
    Default Value (≥ 8.0.2, ≤ 8.0.4) 30
    Default Value (≤ 8.0.1) 0
    Minimum Value 0
    Maximum Value 99

    Specifies the number of days before automatic removal of binary log files. expire_logs_days is deprecated, and you should expect it to be removed in a future release. Instead, use binlog_expire_logs_seconds, which sets the binary log expiration period in seconds. If you do not set a value for either system variable, the default expiration period is 30 days. Possible removals happen at startup and when the binary log is flushed. Log flushing occurs as indicated in Section 5.4, “MySQL Server Logs”.

    Any non-zero value that you specify for expire_logs_days is ignored if binlog_expire_logs_seconds is also specified, and the value of binlog_expire_logs_seconds is used instead as the binary log expiration period. A warning message is issued in this situation. A non-zero value for expire_logs_days is only applied as the binary log expiration period if binlog_expire_logs_seconds is not specified or is specified as 0.

    To disable automatic purging of the binary log, specify a value of 0 explicitly for binlog_expire_logs_seconds, and do not specify a value for expire_logs_days. For compatibility with earlier releases, automatic purging is also disabled if you specify a value of 0 explicitly for expire_logs_days and do not specify a value for binlog_expire_logs_seconds. In that case, the default for binlog_expire_logs_seconds is not applied.

    To remove binary log files manually, use the PURGE BINARY LOGS statement. See Section 13.4.1.1, “PURGE BINARY LOGS Statement”.

  • log_bin

    System Variable log_bin
    Scope Global
    Dynamic No
    SET_VAR Hint Applies No
    Type Boolean

    Shows the status of binary logging on the server, either enabled (ON) or disabled (OFF). With binary logging enabled, the server logs all statements that change data to the binary log, which is used for backup and replication. ON means that the binary log is available, OFF means that it is not in use. The --log-bin option can be used to specify a base name and location for the binary log.

    In earlier MySQL versions, binary logging was disabled by default, and was enabled if you specified the --log-bin option. From MySQL 8.0, binary logging is enabled by default, with the log_bin system variable set to ON, whether or not you specify the --log-bin option. The exception is if you use mysqld to initialize the data directory manually by invoking it with the --initialize or --initialize-insecure option, when binary logging is disabled by default. It is possible to enable binary logging in this case by specifying the --log-bin option.

    If the --skip-log-bin or --disable-log-bin option is specified at startup, binary logging is disabled, with the log_bin system variable set to OFF. If either of these options is specified and --log-bin is also specified, the option specified later takes precedence.

    For information on the format and management of the binary log, see Section 5.4.4, “The Binary Log”.

  • log_bin_basename

    System Variable log_bin_basename
    Scope Global
    Dynamic No
    SET_VAR Hint Applies No
    Type File name

    Holds the base name and path for the binary log files, which can be set with the --log-bin server option. The maximum variable length is 256. In MySQL 8.0, if the --log-bin option is not supplied, the default base name is binlog. For compatibility with MySQL 5.7, if the --log-bin option is supplied with no string or with an empty string, the default base name is host_name-bin, using the name of the host machine. The default location is the data directory.

  • log_bin_index

    Command-Line Format --log-bin-index=file_name
    System Variable log_bin_index
    Scope Global
    Dynamic No
    SET_VAR Hint Applies No
    Type File name

    Holds the base name and path for the binary log index file, which can be set with the --log-bin-index server option. The maximum variable length is 256.

  • log_bin_trust_function_creators

    Command-Line Format --log-bin-trust-function-creators[={OFF|ON}]
    System Variable log_bin_trust_function_creators
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Boolean
    Default Value OFF

    This variable applies when binary logging is enabled. It controls whether stored function creators can be trusted not to create stored functions that may cause unsafe events to be written to the binary log. If set to 0 (the default), users are not permitted to create or alter stored functions unless they have the SUPER privilege in addition to the CREATE ROUTINE or ALTER ROUTINE privilege. A setting of 0 also enforces the restriction that a function must be declared with the DETERMINISTIC characteristic, or with the READS SQL DATA or NO SQL characteristic. If the variable is set to 1, MySQL does not enforce these restrictions on stored function creation. This variable also applies to trigger creation. See Section 25.7, “Stored Program Binary Logging”.

  • log_bin_use_v1_row_events

    Command-Line Format --log-bin-use-v1-row-events[={OFF|ON}]
    Deprecated 8.0.18
    System Variable log_bin_use_v1_row_events
    Scope Global
    Dynamic No
    SET_VAR Hint Applies No
    Type Boolean
    Default Value OFF

    This read-only system variable is deprecated. Setting the system variable to ON at server startup enabled row-based replication with replicas running MySQL Server 5.5 and earlier by writing the binary log using Version 1 binary log row events, instead of Version 2 binary log row events which are the default as of MySQL 5.6.

  • log_slave_updates

    Command-Line Format --log-slave-updates[={OFF|ON}]
    System Variable log_slave_updates
    Scope Global
    Dynamic No
    SET_VAR Hint Applies No
    Type Boolean
    Default Value (≥ 8.0.3) ON
    Default Value (≤ 8.0.2) OFF

    Whether updates received by a replica server from a replication source server should be logged to the replica's own binary log.

    Enabling this variable causes the replica to write the updates that are received from a source and performed by the replication SQL thread to the replica's own binary log. Binary logging, which is controlled by the --log-bin option and is enabled by default, must also be enabled on the replica for updates to be logged. See Section 17.1.6, “Replication and Binary Logging Options and Variables”. log_slave_updates is enabled by default, unless you specify --skip-log-bin to disable binary logging, in which case MySQL also disables replica update logging by default. If you need to disable replica update logging when binary logging is enabled, specify --log-slave-updates=OFF at replica server startup.

    Enabling log_slave_updates enables replication servers to be chained. For example, you might want to set up replication servers using this arrangement:

    A -> B -> C
    

    Here, A serves as the source for the replica B, and B serves as the source for the replica C. For this to work, B must be both a source and a replica. With binary logging enabled and log_slave_updates enabled, which are the default settings, updates received from A are logged by B to its binary log, and can therefore be passed on to C.

  • log_statements_unsafe_for_binlog

    Command-Line Format --log-statements-unsafe-for-binlog[={OFF|ON}]
    System Variable log_statements_unsafe_for_binlog
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Boolean
    Default Value ON

    If error 1592 is encountered, controls whether the generated warnings are added to the error log or not.

  • master_verify_checksum

    Command-Line Format --master-verify-checksum[={OFF|ON}]
    System Variable master_verify_checksum
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Boolean
    Default Value OFF

    Enabling this variable causes the source to verify events read from the binary log by examining checksums, and to stop with an error in the event of a mismatch. master_verify_checksum is disabled by default; in this case, the source uses the event length from the binary log to verify events, so that only complete events are read from the binary log.

  • max_binlog_cache_size

    Command-Line Format --max-binlog-cache-size=#
    System Variable max_binlog_cache_size
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Integer
    Default Value 18446744073709551615
    Minimum Value 4096
    Maximum Value 18446744073709551615

    If a transaction requires more than this many bytes of memory, the server generates a Multi-statement transaction required more than 'max_binlog_cache_size' bytes of storage error. The minimum value is 4096. The maximum possible value is 16EiB (exbibytes). The maximum recommended value is 4GB; this is due to the fact that MySQL currently cannot work with binary log positions greater than 4GB.

    max_binlog_cache_size sets the size for the transaction cache only; the upper limit for the statement cache is governed by the max_binlog_stmt_cache_size system variable.

    The visibility to sessions of max_binlog_cache_size matches that of the binlog_cache_size system variable; in other words, changing its value affects only new sessions that are started after the value is changed.

  • max_binlog_size

    Command-Line Format --max-binlog-size=#
    System Variable max_binlog_size
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Integer
    Default Value 1073741824
    Minimum Value 4096
    Maximum Value 1073741824

    If a write to the binary log causes the current log file size to exceed the value of this variable, the server rotates the binary logs (closes the current file and opens the next one). The minimum value is 4096 bytes. The maximum and default value is 1GB. Encrypted binary log files have an additional 512-byte header, which is included in max_binlog_size.

    A transaction is written in one chunk to the binary log, so it is never split between several binary logs. Therefore, if you have big transactions, you might see binary log files larger than max_binlog_size.

    If max_relay_log_size is 0, the value of max_binlog_size applies to relay logs as well.

    With GTIDs in use on the server, when max_binlog_size is reached, if the system table mysql.gtid_executed cannot be accessed to write the GTIDs from the current binary log file, the binary log cannot be rotated. In this situation, the server responds according to its binlog_error_action setting. If IGNORE_ERROR is set, an error is logged on the server and binary logging is halted, or if ABORT_SERVER is set, the server shuts down.

  • max_binlog_stmt_cache_size

    Command-Line Format --max-binlog-stmt-cache-size=#
    System Variable max_binlog_stmt_cache_size
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Integer
    Default Value 18446744073709547520
    Minimum Value 4096
    Maximum Value 18446744073709547520

    If nontransactional statements within a transaction require more than this many bytes of memory, the server generates an error. The minimum value is 4096. The maximum and default values are 4GB on 32-bit platforms and 16EB (exabytes) on 64-bit platforms.

    max_binlog_stmt_cache_size sets the size for the statement cache only; the upper limit for the transaction cache is governed exclusively by the max_binlog_cache_size system variable.

  • original_commit_timestamp

    Introduced 8.0.1
    System Variable original_commit_timestamp
    Scope Session
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Numeric

    For internal use by replication. When re-executing a transaction on a replica, this is set to the time when the transaction was committed on the original source, measured in microseconds since the epoch. This allows the original commit timestamp to be propagated throughout a replication topology.

    Setting the session value of this system variable is a restricted operation. The session user must have either the REPLICATION_APPLIER privilege (see Section 17.3.3, “Replication Privilege Checks”), or privileges sufficient to set restricted session variables (see Section 5.1.9.1, “System Variable Privileges”). However, note that the variable is not intended for users to set; it is set automatically by the replication infrastructure.

  • sql_log_bin

    System Variable sql_log_bin
    Scope Session
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Boolean
    Default Value ON

    This variable controls whether logging to the binary log is enabled for the current session (assuming that the binary log itself is enabled). The default value is ON. To disable or enable binary logging for the current session, set the session sql_log_bin variable to OFF or ON.

    Set this variable to OFF for a session to temporarily disable binary logging while making changes to the source you do not want replicated to the replica.

    Setting the session value of this system variable is a restricted operation. The session user must have privileges sufficient to set restricted session variables. See Section 5.1.9.1, “System Variable Privileges”.

    It is not possible to set the session value of sql_log_bin within a transaction or subquery.

    Setting this variable to OFF prevents GTIDs from being assigned to transactions in the binary log. If you are using GTIDs for replication, this means that even when binary logging is later enabled again, the GTIDs written into the log from this point do not account for any transactions that occurred in the meantime, so in effect those transactions are lost.

  • sync_binlog

    Command-Line Format --sync-binlog=#
    System Variable sync_binlog
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Integer
    Default Value 1
    Minimum Value 0
    Maximum Value 4294967295

    Controls how often the MySQL server synchronizes the binary log to disk.

    • sync_binlog=0: Disables synchronization of the binary log to disk by the MySQL server. Instead, the MySQL server relies on the operating system to flush the binary log to disk from time to time as it does for any other file. This setting provides the best performance, but in the event of a power failure or operating system crash, it is possible that the server has committed transactions that have not been synchronized to the binary log.

    • sync_binlog=1: Enables synchronization of the binary log to disk before transactions are committed. This is the safest setting but can have a negative impact on performance due to the increased number of disk writes. In the event of a power failure or operating system crash, transactions that are missing from the binary log are only in a prepared state. This permits the automatic recovery routine to roll back the transactions, which guarantees that no transaction is lost from the binary log.

    • sync_binlog=N, where N is a value other than 0 or 1: The binary log is synchronized to disk after N binary log commit groups have been collected. In the event of a power failure or operating system crash, it is possible that the server has committed transactions that have not been flushed to the binary log. This setting can have a negative impact on performance due to the increased number of disk writes. A higher value improves performance, but with an increased risk of data loss.

    For the greatest possible durability and consistency in a replication setup that uses InnoDB with transactions, use these settings:

    Caution

    Many operating systems and some disk hardware fool the flush-to-disk operation. They may tell mysqld that the flush has taken place, even though it has not. In this case, the durability of transactions is not guaranteed even with the recommended settings, and in the worst case, a power outage can corrupt InnoDB data. Using a battery-backed disk cache in the SCSI disk controller or in the disk itself speeds up file flushes, and makes the operation safer. You can also try to disable the caching of disk writes in hardware caches.

  • transaction_write_set_extraction

    Command-Line Format --transaction-write-set-extraction[=value]
    System Variable transaction_write_set_extraction
    Scope Global, Session
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Enumeration
    Default Value (≥ 8.0.2) XXHASH64
    Default Value (≤ 8.0.1) OFF
    Valid Values

    OFF

    MURMUR32

    XXHASH64

    For a replication source server that has multithreaded replicas (replicas on which slave_parallel_workers is set to a value greater than 0), where binlog_transaction_dependency_tracking is set to WRITESET or WRITESET_SESSION to generate dependency information from the source's write set, transaction_write_set_extraction specifies the algorithm used to hash the writes extracted during a transaction. binlog_format must be set to ROW to change the value of this system variable.

    When WRITESET or WRITESET_SESSION is set as the value for binlog_transaction_dependency_tracking, transaction_write_set_extraction must be set to specify an algorithm (not set to OFF). The default in MySQL 8.0 is that transaction_write_set_extraction is set to XXHASH64. While the current value of binlog_transaction_dependency_tracking is WRITESET or WRITESET_SESSION, you cannot change the value of transaction_write_set_extraction.

    For Group Replication, transaction_write_set_extraction must be set to XXHASH64. The process of extracting the writes from a transaction is used in Group Replication for conflict detection and certification on all group members. See Section 18.9.1, “Group Replication Requirements”.

    As of MySQL 8.0.14, setting the session value of this system variable is a restricted operation. The session user must have privileges sufficient to set restricted session variables. See Section 5.1.9.1, “System Variable Privileges”.

17.1.6.5 Global Transaction ID System Variables

The MySQL Server system variables described in this section are used to monitor and control Global Transaction Identifiers (GTIDs). For additional information, see Section 17.1.3, “Replication with Global Transaction Identifiers”.

  • binlog_gtid_simple_recovery

    Command-Line Format --binlog-gtid-simple-recovery[={OFF|ON}]
    System Variable binlog_gtid_simple_recovery
    Scope Global
    Dynamic No
    SET_VAR Hint Applies No
    Type Boolean
    Default Value ON

    This variable controls how binary log files are iterated during the search for GTIDs when MySQL starts or restarts.

    When binlog_gtid_simple_recovery=TRUE, which is the default in MySQL 8.0, the values of gtid_executed and gtid_purged are computed at startup based on the values of Previous_gtids_log_event in the most recent and oldest binary log files. For a description of the computation, see The gtid_purged System Variable. This setting accesses only two binary log files during server restart. If all binary logs on the server were generated using MySQL 5.7.8 or later, binlog_gtid_simple_recovery=TRUE can always safely be used.

    If any binary logs from MySQL 5.7.7 or older are present on the server (for example, following an upgrade of an older server to MySQL 8.0), with binlog_gtid_simple_recovery=TRUE, gtid_executed and gtid_purged might be initialized incorrectly in the following two situations:

    • The newest binary log was generated by MySQL 5.7.5 or earlier, and gtid_mode was ON for some binary logs but OFF for the newest binary log.

    • A SET @@GLOBAL.gtid_purged statement was issued on MySQL 5.7.7 or earlier, and the binary log that was active at the time of the SET @@GLOBAL.gtid_purged statement has not yet been purged.

    If an incorrect GTID set is computed in either situation, it remains incorrect even if the server is later restarted with binlog_gtid_simple_recovery=FALSE. If either of these situations apply or might apply on the server, set binlog_gtid_simple_recovery=FALSE before starting or restarting the server.

    When binlog_gtid_simple_recovery=FALSE is set, the method of computing gtid_executed and gtid_purged as described in The gtid_purged System Variable is changed to iterate the binary log files as follows:

    • Instead of using the value of Previous_gtids_log_event and GTID log events from the newest binary log file, the computation for gtid_executed iterates from the newest binary log file, and uses the value of Previous_gtids_log_event and any GTID log events from the first binary log file where it finds a Previous_gtids_log_event value. If the server's most recent binary log files do not have GTID log events, for example if gtid_mode=ON was used but the server was later changed to gtid_mode=OFF, this process can take a long time.

    • Instead of using the value of Previous_gtids_log_event from the oldest binary log file, the computation for gtid_purged iterates from the oldest binary log file, and uses the value of Previous_gtids_log_event from the first binary log file where it finds either a nonempty Previous_gtids_log_event value, or at least one GTID log event (indicating that the use of GTIDs starts at that point). If the server's older binary log files do not have GTID log events, for example if gtid_mode=ON was only set recently on the server, this process can take a long time.

  • enforce_gtid_consistency

    Command-Line Format --enforce-gtid-consistency[=value]
    System Variable enforce_gtid_consistency
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Enumeration
    Default Value OFF
    Valid Values

    OFF

    ON

    WARN

    Depending on the value of this variable, the server enforces GTID consistency by allowing execution of only statements that can be safely logged using a GTID. You must set this variable to ON before enabling GTID based replication.

    The values that enforce_gtid_consistency can be configured to are:

    • OFF: all transactions are allowed to violate GTID consistency.

    • ON: no transaction is allowed to violate GTID consistency.

    • WARN: all transactions are allowed to violate GTID consistency, but a warning is generated in this case.

    --enforce-gtid-consistency only takes effect if binary logging takes place for a statement. If binary logging is disabled on the server, or if statements are not written to the binary log because they are removed by a filter, GTID consistency is not checked or enforced for the statements that are not logged.

    Only statements that can be logged using GTID safe statements can be logged when enforce_gtid_consistency is set to ON, so the operations listed here cannot be used with this option:

    • CREATE TEMPORARY TABLE or DROP TEMPORARY TABLE statements inside transactions.

    • Transactions or statements that update both transactional and nontransactional tables. There is an exception that nontransactional DML is allowed in the same transaction or in the same statement as transactional DML, if all nontransactional tables are temporary.

    • CREATE TABLE ... SELECT statements, prior to MySQL 8.0.21. From MySQL 8.0.21, CREATE TABLE ... SELECT statements are allowed for storage engines that support atomic DDL.

    For more information, see Section 17.1.3.7, “Restrictions on Replication with GTIDs”.

    Prior to MySQL 5.7 and in early releases in that release series, the boolean enforce_gtid_consistency defaulted to OFF. To maintain compatibility with these earlier releases, the enumeration defaults to OFF, and setting --enforce-gtid-consistency without a value is interpreted as setting the value to ON. The variable also has multiple textual aliases for the values: 0=OFF=FALSE, 1=ON=TRUE,2=WARN. This differs from other enumeration types but maintains compatibility with the boolean type used in previous releases. These changes impact on what is returned by the variable. Using SELECT @@ENFORCE_GTID_CONSISTENCY, SHOW VARIABLES LIKE 'ENFORCE_GTID_CONSISTENCY', and SELECT * FROM INFORMATION_SCHEMA.VARIABLES WHERE 'VARIABLE_NAME' = 'ENFORCE_GTID_CONSISTENCY', all return the textual form, not the numeric form. This is an incompatible change, since @@ENFORCE_GTID_CONSISTENCY returns the numeric form for booleans but returns the textual form for SHOW and the Information Schema.

  • gtid_executed

    System Variable gtid_executed
    System Variable gtid_executed
    Scope Global
    Scope Global, Session
    Dynamic No
    Dynamic No
    SET_VAR Hint Applies No
    SET_VAR Hint Applies No
    Type String
    Unit set of GTIDs

    When used with global scope, this variable contains a representation of the set of all transactions executed on the server and GTIDs that have been set by a SET gtid_purged statement. This is the same as the value of the Executed_Gtid_Set column in the output of SHOW MASTER STATUS and SHOW REPLICA | SLAVE STATUS. The value of this variable is a GTID set, see GTID Sets for more information.

    When the server starts, @@GLOBAL.gtid_executed is initialized. See binlog_gtid_simple_recovery for more information on how binary logs are iterated to populate gtid_executed. GTIDs are then added to the set as transactions are executed, or if any SET gtid_purged statement is executed.

    The set of transactions that can be found in the binary logs at any given time is equal to GTID_SUBTRACT(@@GLOBAL.gtid_executed, @@GLOBAL.gtid_purged); that is, to all transactions in the binary log that have not yet been purged.

    Issuing RESET MASTER causes the global value (but not the session value) of this variable to be reset to an empty string. GTIDs are not otherwise removed from this set other than when the set is cleared due to RESET MASTER.

    In some older releases, this variable could also be used with session scope, where it contained a representation of the set of transactions that are written to the cache in the current session. The session scope is now deprecated.

  • gtid_executed_compression_period

    Command-Line Format --gtid-executed-compression-period=#
    System Variable gtid_executed_compression_period
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Integer
    Default Value (≥ 8.0.23) 0
    Default Value (≤ 8.0.22) 1000
    Minimum Value 0
    Maximum Value 4294967295

    Compress the mysql.gtid_executed table each time this many transactions have been processed. When binary logging is enabled on the server, this compression method is not used, and instead the mysql.gtid_executed table is compressed on each binary log rotation. When binary logging is disabled on the server, the compression thread sleeps until the specified number of transactions have been executed, then wakes up to perform compression of the mysql.gtid_executed table. Setting the value of this system variable to 0 means that the thread never wakes up, so this explicit compression method is not used. Instead, compression occurs implicitly as required.

    From MySQL 8.0.17, InnoDB transactions are written to the mysql.gtid_executed table by a separate process to non-InnoDB transactions. If the server has a mix of InnoDB transactions and non-InnoDB transactions, the compression controlled by this system variable interferes with the work of this process and can slow it significantly. For this reason, from that release it is recommended that you set gtid_executed_compression_period to 0.

    From MySQL 8.0.23, InnoDB and non-InnoDB transactions are written to the mysql.gtid_executed table by the same process, and the gtid_executed_compression_period default value is 0.

    See mysql.gtid_executed Table Compression for more information.

  • gtid_mode

    Command-Line Format --gtid-mode=MODE
    System Variable gtid_mode
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Enumeration
    Default Value OFF
    Valid Values

    OFF

    OFF_PERMISSIVE

    ON_PERMISSIVE

    ON

    Controls whether GTID based logging is enabled and what type of transactions the logs can contain. You must have privileges sufficient to set global system variables. See Section 5.1.9.1, “System Variable Privileges”. enforce_gtid_consistency must be true before you can set gtid_mode=ON. Before modifying this variable, see Section 17.1.4, “Changing GTID Mode on Online Servers”.

    Logged transactions can be either anonymous or use GTIDs. Anonymous transactions rely on binary log file and position to identify specific transactions. GTID transactions have a unique identifier that is used to refer to transactions. The different modes are:

    • OFF: Both new and replicated transactions must be anonymous.

    • OFF_PERMISSIVE: New transactions are anonymous. Replicated transactions can be either anonymous or GTID transactions.

    • ON_PERMISSIVE: New transactions are GTID transactions. Replicated transactions can be either anonymous or GTID transactions.

    • ON: Both new and replicated transactions must be GTID transactions.

    Changes from one value to another can only be one step at a time. For example, if gtid_mode is currently set to OFF_PERMISSIVE, it is possible to change to OFF or ON_PERMISSIVE but not to ON.

    The values of gtid_purged and gtid_executed are persistent regardless of the value of gtid_mode. Therefore even after changing the value of gtid_mode, these variables contain the correct values.

  • gtid_next

    System Variable gtid_next
    Scope Session
    Dynamic Yes
    SET_VAR Hint Applies No
    Type Enumeration
    Default Value AUTOMATIC
    Valid Values

    AUTOMATIC

    ANONYMOUS

    UUID:NUMBER

    This variable is used to specify whether and how the next GTID is obtained.

    Setting the session value of this system variable is a restricted operation. The session user must have either the REPLICATION_APPLIER privilege (see Section 17.3.3, “Replication Privilege Checks”), or privileges sufficient to set restricted session variables (see Section 5.1.9.1, “System Variable Privileges”).

    gtid_next can take any of the following values:

    • AUTOMATIC: Use the next automatically-generated global transaction ID.

    • ANONYMOUS: Transactions do not have global identifiers, and are identified by file and position only.

    • A global transaction ID in UUID:NUMBER format.

    Exactly which of the above options are valid depends on the setting of gtid_mode, see Section 17.1.4.1, “Replication Mode Concepts” for more information. Setting this variable has no effect if gtid_mode is OFF.

    After this variable has been set to UUID:NUMBER, and a transaction has been committed or rolled back, an explicit SET GTID_NEXT statement must again be issued before any other statement.

    DROP TABLE or DROP TEMPORARY TABLE fails with an explicit error when used on a combination of nontemporary tables with temporary tables, or of temporary tables using transactional storage engines with temporary tables using nontransactional storage engines.

  • gtid_owned

    System Variable gtid_owned
    Scope Global, Session
    Dynamic No
    SET_VAR Hint Applies No
    Type String
    Unit set of GTIDs

    This read-only variable is primarily for internal use. Its contents depend on its scope.

    • When used with global scope, gtid_owned holds a list of all the GTIDs that are currently in use on the server, with the IDs of the threads that own them. This variable is mainly useful for a multi-threaded replica to check whether a transaction is already being applied on another thread. An applier thread takes ownership of a transaction's GTID all the time it is processing the transaction, so @@global.gtid_owned shows the GTID and owner for the duration of processing. When a transaction has been committed (or rolled back), the applier thread releases ownership of the GTID.

    • When used with session scope, gtid_owned holds a single GTID that is currently in use by and owned by this session. This variable is mainly useful for testing and debugging the use of GTIDs when the client has explicitly assigned a GTID for the transaction by setting gtid_next. In this case, @@session.gtid_owned displays the GTID all the time the client is processing the transaction, until the transaction has been committed (or rolled back). When the client has finished processing the transaction, the variable is cleared. If gtid_next=AUTOMATIC is used for the session, gtid_owned is populated only briefly during the execution of the commit statement for the transaction, so it cannot be observed from the session concerned, although it is listed if @@global.gtid_owned is read at the right point. If you have a requirement to track the GTIDs that are handled by a client in a session, you can enable the session state tracker controlled by the session_track_gtids system variable.

  • gtid_purged

    System Variable gtid_purged
    Scope Global
    Dynamic Yes
    SET_VAR Hint Applies No
    Type String
    Unit set of GTIDs

    The global value of the gtid_purged system variable (@@GLOBAL.gtid_purged) is a GTID set consisting of the GTIDs of all the transactions that have been committed on the server, but do not exist in any binary log file on the server. gtid_purged is a subset of gtid_executed. The following categories of GTIDs are in gtid_purged:

    • GTIDs of replicated transactions that were committed with binary logging disabled on the replica.

    • GTIDs of transactions that were written to a binary log file that has now been purged.

    • GTIDs that were added explicitly to the set by the statement SET @@GLOBAL.gtid_purged.

    When the server starts, the global value of gtid_purged is initialized to a set of GTIDs. For information on how this GTID set is computed, see The gtid_purged System Variable. If binary logs from MySQL 5.7.7 or older are present on the server, you might need to set binlog_gtid_simple_recovery=FALSE in the server's configuration file to produce the correct computation. See the description for binlog_gtid_simple_recovery for details of the situations in which this setting is needed.

    Issuing RESET MASTER causes the value of gtid_purged to be reset to an empty string.

    You can set the value of gtid_purged in order to record on the server that the transactions in a certain GTID set have been applied, although they do not exist in any binary log on the server. An example use case for this action is when you are restoring a backup of one or more databases on a server, but you do not have the relevant binary logs containing the transactions on the server.

    From MySQL 8.0, there are two ways to set the value of gtid_purged. You can either replace the value of gtid_purged with your specified GTID set, or you can append your specified GTID set to the GTID set that is already held by gtid_purged. If the server has no existing GTIDs, for example an empty server that you are provisioning with a backup of an existing database, both methods have the same result. If you are restoring a backup that overlaps the transactions that are already on the server, for example replacing a corrupted table with a partial dump from the source made using mysqldump (which includes the GTIDs of all the transactions on the server, even though the dump is partial), use the first method of replacing the value of gtid_purged. If you are restoring a backup that is disjoint from the transactions that are already on the server, for example provisioning a multi-source replica using dumps from two different servers, use the second method of adding to the value of gtid_purged.

    • To replace the value of gtid_purged with your specified GTID set, use the following statement:

      SET @@GLOBAL.gtid_purged = 'gtid_set'

      gtid_set must be a superset of the current value of gtid_purged, and must not intersect with gtid_subtract(gtid_executed,gtid_purged). In other words, the new GTID set must include any GTIDs that were already in gtid_purged, and must not include any GTIDs in gtid_executed that have not yet been purged. gtid_set also cannot include any GTIDs that are in @@global.gtid_owned, that is, the GTIDs for transactions that are currently being processed on the server.

      The result is that the global value of gtid_purged is set equal to gtid_set, and the value of gtid_executed becomes the union of gtid_set and the previous value of gtid_executed.

    • To append your specified GTID set to gtid_purged, use the following statement with a plus sign (+) before the GTID set:

      SET @@GLOBAL.gtid_purged = '+gtid_set'

      gtid_set must not intersect with the current value of gtid_executed. In other words, the new GTID set must not include any GTIDs in gtid_executed, including transactions that are already also in gtid_purged. gtid_set also cannot include any GTIDs that are in @@global.gtid_owned, that is, the GTIDs for transactions that are currently being processed on the server.

      The result is that gtid_set is added to both gtid_executed and gtid_purged.

Note

If any binary logs from MySQL 5.7.7 or older are present on the server (for example, following an upgrade of an older server to MySQL 8.0), after issuing a SET @@GLOBAL.gtid_purged statement, you might need to set binlog_gtid_simple_recovery=FALSE in the server's configuration file before restarting the server, otherwise gtid_purged can be computed incorrectly. See the description for binlog_gtid_simple_recovery for details of the situations in which this setting is needed.

17.1.7 Common Replication Administration Tasks

Once replication has been started it executes without requiring much regular administration. This section describes how to check the status of replication, how to pause a replica, and how to skip a failed transaction on a replica.

Tip

To deploy multiple instances of MySQL, you can use InnoDB Cluster which enables you to easily administer a group of MySQL server instances in MySQL Shell. InnoDB Cluster wraps MySQL Group Replication in a programmatic environment that enables you easily deploy a cluster of MySQL instances to achieve high availability. In addition, InnoDB Cluster interfaces seamlessly with MySQL Router, which enables your applications to connect to the cluster without writing your own failover process. For similar use cases that do not require high availability, however, you can use InnoDB ReplicaSet. Installation instructions for MySQL Shell can be found here.

17.1.7.1 Checking Replication Status

The most common task when managing a replication process is to ensure that replication is taking place and that there have been no errors between the replica and the source.

The SHOW REPLICA | SLAVE STATUS statement, which you must execute on each replica, provides information about the configuration and status of the connection between the replica server and the source server. From MySQL 8.0.22, SHOW SLAVE STATUS is deprecated, and SHOW REPLICA STATUS is available to use instead. The Performance Schema has replication tables that provide this information in a more accessible form. See Section 27.12.11, “Performance Schema Replication Tables”.

The replication heartbeat information shown in the Performance Schema replication tables lets you check that the replication connection is active even if the source has not sent events to the replica recently. The source sends a heartbeat signal to a replica if there are no updates to, and no unsent events in, the binary log for a longer period than the heartbeat interval. The MASTER_HEARTBEAT_PERIOD setting on the source (set by the CHANGE MASTER TO statement) specifies the frequency of the heartbeat, which defaults to half of the connection timeout interval for the replica (slave_net_timeout). The replication_connection_status Performance Schema table shows when the most recent heartbeat signal was received by a replica, and how many heartbeat signals it has received.

If you are using the SHOW REPLICA | SLAVE STATUS statement to check on the status of an individual replica, the statement provides the following information:

mysql> SHOW REPLICA STATUS\G
*************************** 1. row ***************************
             Replica_IO_State: Waiting for master to send event
                  Source_Host: source1
                  Source_User: root
                  Source_Port: 3306
                Connect_Retry: 60
              Source_Log_File: mysql-bin.000004
          Read_Source_Log_Pos: 931
               Relay_Log_File: replica1-relay-bin.000056
                Relay_Log_Pos: 950
        Relay_Source_Log_File: mysql-bin.000004
           Replica_IO_Running: Yes
          Replica_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Source_Log_Pos: 931
              Relay_Log_Space: 1365
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Source_SSL_Allowed: No
           Source_SSL_CA_File:
           Source_SSL_CA_Path:
              Source_SSL_Cert:
            Source_SSL_Cipher:
               Source_SSL_Key:
        Seconds_Behind_Source: 0
Source_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids: 0

The key fields from the status report to examine are:

  • Replica_IO_State: The current status of the replica. See Section 8.14.5, “Replication I/O Thread States”, and Section 8.14.6, “Replication SQL Thread States”, for more information.

  • Replica_IO_Running: Whether the I/O thread for reading the source's binary log is running. Normally, you want this to be Yes unless you have not yet started replication or have explicitly stopped it with STOP REPLICA | SLAVE.

  • Replica_SQL_Running: Whether the SQL thread for executing events in the relay log is running. As with the I/O thread, this should normally be Yes.

  • Last_IO_Error, Last_SQL_Error: The last errors registered by the I/O and SQL threads when processing the relay log. Ideally these should be blank, indicating no errors.

  • Seconds_Behind_Source: The number of seconds that the replication SQL thread is behind processing the source binary log. A high number (or an increasing one) can indicate that the replica is unable to handle events from the source in a timely fashion.

    A value of 0 for Seconds_Behind_Source can usually be interpreted as meaning that the replica has caught up with the source, but there are some cases where this is not strictly true. For example, this can occur if the network connection between source and replica is broken but the replication I/O thread has not yet noticed this; that is, slave_net_timeout has not yet elapsed.

    It is also possible that transient values for Seconds_Behind_Source may not reflect the situation accurately. When the replication SQL thread has caught up on I/O, Seconds_Behind_Source displays 0; but when the replication I/O thread is still queuing up a new event, Seconds_Behind_Source may show a large value until the replication SQL thread finishes executing the new event. This is especially likely when the events have old timestamps; in such cases, if you execute SHOW REPLICA | SLAVE STATUS several times in a relatively short period, you may see this value change back and forth repeatedly between 0 and a relatively large value.

Several pairs of fields provide information about the progress of the replica in reading events from the source binary log and processing them in the relay log:

  • (Master_Log_file, Read_Master_Log_Pos): Coordinates in the source binary log indicating how far the replication I/O thread has read events from that log.

  • (Relay_Master_Log_File, Exec_Master_Log_Pos): Coordinates in the source binary log indicating how far the replication SQL thread has executed events received from that log.

  • (Relay_Log_File, Relay_Log_Pos): Coordinates in the replica relay log indicating how far the replication SQL thread has executed the relay log. These correspond to the preceding coordinates, but are expressed in replica relay log coordinates rather than source binary log coordinates.

On the source, you can check the status of connected replicas using SHOW PROCESSLIST to examine the list of running processes. Replica connections have Binlog Dump in the Command field:

mysql> SHOW PROCESSLIST \G;
*************************** 4. row ***************************
     Id: 10
   User: root
   Host: replica1:58371
     db: NULL
Command: Binlog Dump
   Time: 777
  State: Has sent all binlog to slave; waiting for binlog to be updated
   Info: NULL

Because it is the replica that drives the replication process, very little information is available in this report.

For replicas that were started with the --report-host option and are connected to the source, the SHOW REPLICAS | SHOW SLAVE HOSTS statement on the source shows basic information about the replicas. The output includes the ID of the replica server, the value of the --report-host option, the connecting port, and source ID:

mysql> SHOW REPLICAS;
+-----------+----------+------+-------------------+-----------+
| Server_id | Host     | Port | Rpl_recovery_rank | Source_id |
+-----------+----------+------+-------------------+-----------+
|        10 | replica1 | 3306 |                 0 |         1 |
+-----------+----------+------+-------------------+-----------+
1 row in set (0.00 sec)

17.1.7.2 Pausing Replication on the Replica

You can stop and start replication on the replica using the STOP REPLICA | SLAVE and START REPLICA | SLAVE statements. From MySQL 8.0.22, STOP SLAVE and START SLAVE are deprecated, and STOP REPLICA and START REPLICA are available to use instead.

To stop processing of the binary log from the source, use STOP REPLICA | SLAVE:

mysql> STOP SLAVE;
Or from MySQL 8.0.22:
mysql> STOP REPLICA;

When replication is stopped, the replication I/O thread stops reading events from the source binary log and writing them to the relay log, and the SQL thread stops reading events from the relay log and executing them. You can pause the I/O or SQL thread individually by specifying the thread type:

mysql> STOP SLAVE IO_THREAD;
mysql> STOP SLAVE SQL_THREAD;
Or from MySQL 8.0.22:
mysql> STOP REPLICA IO_THREAD;
mysql> STOP REPLICA SQL_THREAD;

To start execution again, use the START REPLICA | SLAVE statement:

mysql> START SLAVE;
Or from MySQL 8.0.22:
mysql> START REPLICA;

To start a particular thread, specify the thread type:

mysql> START SLAVE IO_THREAD;
mysql> START SLAVE SQL_THREAD;
Or from MySQL 8.0.22:
mysql> START REPLICA IO_THREAD;
mysql> START REPLICA SQL_THREAD;

For a replica that performs updates only by processing events from the source, stopping only the SQL thread can be useful if you want to perform a backup or other task. The I/O thread continues to read events from the source but they are not executed. This makes it easier for the replica to catch up when you restart the SQL thread.

Stopping only the I/O thread enables the events in the relay log to be executed by the SQL thread up to the point where the relay log ends. This can be useful when you want to pause execution to catch up with events already received from the source, when you want to perform administration on the replica but also ensure that it has processed all updates to a specific point. This method can also be used to pause event receipt on the replica while you conduct administration on the source. Stopping the I/O thread but permitting the SQL thread to run helps ensure that there is not a massive backlog of events to be executed when replication is started again.

17.1.7.3 Skipping Transactions

If replication stops due to an issue with an event in a replicated transaction, you can resume replication by skipping the failed transaction on the replica. Before skipping a transaction, ensure that the replication I/O thread is stopped as well as the SQL thread.

First you need to identify the replicated event that caused the error. Details of the error and the last successfully applied transaction are recorded in the Performance Schema table replication_applier_status_by_worker. You can use mysqlbinlog to retrieve and display the events that were logged around the time of the error. For instructions to do this, see Section 7.5, “Point-in-Time (Incremental) Recovery”. Alternatively, you can issue SHOW RELAYLOG EVENTS on the replica or SHOW BINLOG EVENTS on the source.

Before skipping the transaction and restarting the replica, check these points:

  • Is the transaction that stopped replication from an unknown or untrusted source? If so, investigate the cause in case there are any security considerations that indicate the replica should not be restarted.

  • Does the transaction that stopped replication need to be applied on the replica? If so, either make the appropriate corrections and reapply the transaction, or manually reconcile the data on the replica.

  • Did the transaction that stopped replication need to be applied on the source? If not, undo the transaction manually on the server where it originally took place.

To skip the transaction, choose one of the following methods as appropriate:

To restart replication after skipping the transaction, issue START REPLICA | SLAVE, with the FOR CHANNEL clause if the replica is a multi-source replica.

17.1.7.3.1 Skipping Transactions With GTIDs

When GTIDs are in use (gtid_mode is ON), the GTID for a committed transaction is persisted on the replica even if the content of the transaction is filtered out. This feature prevents a replica from retrieving previously filtered transactions when it reconnects to the source using GTID auto-positioning. It can also be used to skip a transaction on the replica, by committing an empty transaction in place of the failing transaction.

This method of skipping transactions is not suitable when you have enabled GTID assignment on a replication channel using the ASSIGN_GTIDS_TO_ANONYMOUS_TRANSACTIONS option of the CHANGE REPLICATION SOURCE TO statement.

If the failing transaction generated an error in a worker thread, you can obtain its GTID directly from the LAST_SEEN_TRANSACTION field in the Performance Schema table replication_applier_status_by_worker. To see what the transaction is, issue SHOW RELAYLOG EVENTS on the replica or SHOW BINLOG EVENTS on the source, and search the output for a transaction preceded by that GTID.

When you have assessed the failing transaction for any other appropriate actions as described previously (such as security considerations), to skip it, commit an empty transaction on the replica that has the same GTID as the failing transaction. For example:

SET GTID_NEXT='aaa-bbb-ccc-ddd:N';
BEGIN;
COMMIT;
SET GTID_NEXT='AUTOMATIC';

The presence of this empty transaction on the replica means that when you issue a START REPLICA | SLAVE statement to restart replication, the replica uses the auto-skip function to ignore the failing transaction, because it sees a transaction with that GTID has already been applied. If the replica is a multi-source replica, you do not need to specify the channel name when you commit the empty transaction, but you do need to specify the channel name when you issue START REPLICA | SLAVE.

Note that if binary logging is in use on this replica, the empty transaction enters the replication stream if the replica becomes a source or primary in the future. If you need to avoid this possibility, consider flushing and purging the replica's binary logs, as in this example:

FLUSH LOGS;
PURGE BINARY LOGS TO 'binlog.000146';

The GTID of the empty transaction is persisted, but the transaction itself is removed by purging the binary log files.

17.1.7.3.2 Skipping Transactions Without GTIDs

To skip failing transactions when GTIDs are not in use or are being phased in (gtid_mode is OFF, OFF_PERMISSIVE, or ON_PERMISSIVE), you can skip a specified number of events by issuing a SET GLOBAL sql_slave_skip_counter statement. Alternatively, you can skip past an event or events by issuing a CHANGE MASTER TO statement to move the source binary log position forward.

These methods are also suitable when you have enabled GTID assignment on a replication channel using the ASSIGN_GTIDS_TO_ANONYMOUS_TRANSACTIONS option of the CHANGE REPLICATION SOURCE TO statement.

When you use these methods, it is important to understand that you are not necessarily skipping a complete transaction, as is always the case with the GTID-based method described previously. These non-GTID-based methods are not aware of transactions as such, but instead operate on events. The binary log is organized as a sequence of groups known as event groups, and each event group consists of a sequence of events.

  • For transactional tables, an event group corresponds to a transaction.

  • For nontransactional tables, an event group corresponds to a single SQL statement.

A single transaction can contain changes to both transactional and nontransactional tables.

When you use a SET GLOBAL sql_slave_skip_counter statement to skip events and the resulting position is in the middle of an event group, the replica continues to skip events until it reaches the end of the group. Execution then starts with the next event group. The CHANGE MASTER TO statement does not have this function, so you must be careful to identify the correct location to restart replication at the beginning of an event group. However, using CHANGE MASTER TO means you do not have to count the events that need to be skipped, as you do with a SET GLOBAL sql_slave_skip_counter, and instead you can just specify the location to restart.

17.1.7.3.2.1 Skipping Transactions With SET GLOBAL sql_slave_skip_counter

When you have assessed the failing transaction for any other appropriate actions as described previously (such as security considerations), count the number of events that you need to skip. One event normally corresponds to one SQL statement in the binary log, but note that statements that use AUTO_INCREMENT or LAST_INSERT_ID() count as two events in the binary log. When binary log transaction compression is in use, a compressed transaction payload (Transaction_payload_event) is counted as a single counter value, so all the events inside it are skipped as a unit.

If you want to skip the complete transaction, you can count the events to the end of the transaction, or you can just skip the relevant event group. Remember that with SET GLOBAL sql_slave_skip_counter, the replica continues to skip to the end of an event group. Make sure you do not skip too far forward and go into the next event group or transaction so that it is not also skipped.

Issue the SET statement as follows, where N is the number of events from the source to skip:

SET GLOBAL sql_slave_skip_counter = N

This statement cannot be issued if gtid_mode=ON is set, or if the replication I/O and SQL threads are running.

The SET GLOBAL sql_slave_skip_counter statement has no immediate effect. When you issue the START REPLICA | SLAVE statement for the next time following this SET statement, the new value for the system variable sql_slave_skip_counter is applied, and the events are skipped. That START REPLICA | SLAVE statement also automatically sets the value of the system variable back to 0. If the replica is a multi-source replica, when you issue that START REPLICA | SLAVE statement, the FOR CHANNEL clause is required. Make sure that you name the correct channel, otherwise events are skipped on the wrong channel.

17.1.7.3.2.2 Skipping Transactions With CHANGE MASTER TO

When you have assessed the failing transaction for any other appropriate actions as described previously (such as security considerations), identify the coordinates (file and position) in the source's binary log that represent a suitable position to restart replication. This can be the start of the event group following the event that caused the issue, or the start of the next transaction. The replication I/O thread begins reading from the source at these coordinates the next time the thread starts, skipping the failing event. Make sure that you have identified the position accurately, because this statement does not take event groups into account.

Issue the CHANGE MASTER TO statement as follows, where source_log_name is the binary log file that contains the restart position, and source_log_pos is the number representing the restart position as stated in the binary log file:

CHANGE MASTER TO MASTER_LOG_FILE='source_log_name', MASTER_LOG_POS=source_log_pos;

If the replica is a multi-source replica, you must use the FOR CHANNEL clause to name the appropriate channel on the CHANGE MASTER TO statement.

This statement cannot be issued if MASTER_AUTO_POSITION=1 is set, or if the replication I/O and SQL threads are running. If you need to use this method of skipping a transaction when MASTER_AUTO_POSITION=1 is normally set, you can change the setting to MASTER_AUTO_POSITION=1 while issuing the statement, then change it back again afterwards. For example:

CHANGE MASTER TO MASTER_AUTO_POSITION=0, MASTER_LOG_FILE='binlog.000145', MASTER_LOG_POS=235;
CHANGE MASTER TO MASTER_AUTO_POSITION=1;

17.2 Replication Implementation

Replication is based on the source server keeping track of all changes to its databases (updates, deletes, and so on) in its binary log. The binary log serves as a written record of all events that modify database structure or content (data) from the moment the server was started. Typically, SELECT statements are not recorded because they modify neither database structure nor content.

Each replica that connects to the source requests a copy of the binary log. That is, it pulls the data from the source, rather than the source pushing the data to the replica. The replica also executes the events from the binary log that it receives. This has the effect of repeating the original changes just as they were made on the source. Tables are created or their structure modified, and data is inserted, deleted, and updated according to the changes that were originally made on the source.

Because each replica is independent, the replaying of the changes from the source's binary log occurs independently on each replica that is connected to the source. In addition, because each replica receives a copy of the binary log only by requesting it from the source, the replica is able to read and update the copy of the database at its own pace and can start and stop the replication process at will without affecting the ability to update to the latest database status on either the source or replica side.

For more information on the specifics of the replication implementation, see Section 17.2.3, “Replication Threads”.

Source servers and replicas report their status in respect of the replication process regularly so that you can monitor them. See Section 8.14, “Examining Server Thread (Process) Information”, for descriptions of all replicated-related states.

The source's binary log is written to a local relay log on the replica before it is processed. The replica also records information about the current position with the source's binary log and the local relay log. See Section 17.2.4, “Relay Log and Replication Metadata Repositories”.

Database changes are filtered on the replica according to a set of rules that are applied according to the various configuration options and variables that control event evaluation. For details on how these rules are applied, see Section 17.2.5, “How Servers Evaluate Replication Filtering Rules”.

17.2.1 Replication Formats

Replication works because events written to the binary log are read from the source and then processed on the replica. The events are recorded within the binary log in different formats according to the type of event. The different replication formats used correspond to the binary logging format used when the events were recorded in the source's binary log. The correlation between binary logging formats and the terms used during replication are:

  • When using statement-based binary logging, the source writes SQL statements to the binary log. Replication of the source to the replica works by executing the SQL statements on the replica. This is called statement-based replication (which can be abbreviated as SBR), which corresponds to the MySQL statement-based binary logging format.

  • When using row-based logging, the source writes events to the binary log that indicate how individual table rows are changed. Replication of the source to the replica works by copying the events representing the changes to the table rows to the replica. This is called row-based replication (which can be abbreviated as RBR).

    Row-based logging is the default method.

  • You can also configure MySQL to use a mix of both statement-based and row-based logging, depending on which is most appropriate for the change to be logged. This is called mixed-format logging. When using mixed-format logging, a statement-based log is used by default. Depending on certain statements, and also the storage engine being used, the log is automatically switched to row-based in particular cases. Replication using the mixed format is referred to as mixed-based replication or mixed-format replication. For more information, see Section 5.4.4.3, “Mixed Binary Logging Format”.

NDB Cluster.  The default binary logging format in MySQL NDB Cluster 8.0 is MIXED. You should note that NDB Cluster Replication always uses row-based replication, and that the NDB storage engine is incompatible with statement-based replication. See Section 23.6.2, “General Requirements for NDB Cluster Replication”, for more information.

When using MIXED format, the binary logging format is determined in part by the storage engine being used and the statement being executed. For more information on mixed-format logging and the rules governing the support of different logging formats, see Section 5.4.4.3, “Mixed Binary Logging Format”.

The logging format in a running MySQL server is controlled by setting the binlog_format server system variable. This variable can be set with session or global scope. The rules governing when and how the new setting takes effect are the same as for other MySQL server system variables. Setting the variable for the current session lasts only until the end of that session, and the change is not visible to other sessions. Setting the variable globally takes effect for clients that connect after the change, but not for any current client sessions, including the session where the variable setting was changed. To make the global system variable setting permanent so that it applies across server restarts, you must set it in an option file. For more information, see Section 13.7.6.1, “SET Syntax for Variable Assignment”.

There are conditions under which you cannot change the binary logging format at runtime or doing so causes replication to fail. See Section 5.4.4.2, “Setting The Binary Log Format”.

Changing the global binlog_format value requires privileges sufficient to set global system variables. Changing the session binlog_format value requires privileges sufficient to set restricted session system variables. See Section 5.1.9.1, “System Variable Privileges”.

The statement-based and row-based replication formats have different issues and limitations. For a comparison of their relative advantages and disadvantages, see Section 17.2.1.1, “Advantages and Disadvantages of Statement-Based and Row-Based Replication”.

With statement-based replication, you may encounter issues with replicating stored routines or triggers. You can avoid these issues by using row-based replication instead. For more information, see Section 25.7, “Stored Program Binary Logging”.

17.2.1.1 Advantages and Disadvantages of Statement-Based and Row-Based Replication

Each binary logging format has advantages and disadvantages. For most users, the mixed replication format should provide the best combination of data integrity and performance. If, however, you want to take advantage of the features specific to the statement-based or row-based replication format when performing certain tasks, you can use the information in this section, which provides a summary of their relative advantages and disadvantages, to determine which is best for your needs.

Advantages of statement-based replication
  • Proven technology.

  • Less data written to log files. When updates or deletes affect many rows, this results in much less storage space required for log files. This also means that taking and restoring from backups can be accomplished more quickly.

  • Log files contain all statements that made any changes, so they can be used to audit the database.

Disadvantages of statement-based replication
  • Statements that are unsafe for SBR.  Not all statements which modify data (such as INSERT DELETE, UPDATE, and REPLACE statements) can be replicated using statement-based replication. Any nondeterministic behavior is difficult to replicate when using statement-based replication. Examples of such Data Modification Language (DML) statements include the following:

    Statements that cannot be replicated correctly using statement-based replication are logged with a warning like the one shown here:

    [Warning] Statement is not safe to log in statement format.
    

    A similar warning is also issued to the client in such cases. The client can display it using SHOW WARNINGS.

  • INSERT ... SELECT requires a greater number of row-level locks than with row-based replication.

  • UPDATE statements that require a table scan (because no index is used in the WHERE clause) must lock a greater number of rows than with row-based replication.

  • For InnoDB: An INSERT statement that uses AUTO_INCREMENT blocks other nonconflicting INSERT statements.

  • For complex statements, the statement must be evaluated and executed on the replica before the rows are updated or inserted. With row-based replication, the replica only has to modify the affected rows, not execute the full statement.

  • If there is an error in evaluation on the replica, particularly when executing complex statements, statement-based replication may slowly increase the margin of error across the affected rows over time. See Section 17.5.1.29, “Replica Errors During Replication”.

  • Stored functions execute with the same NOW() value as the calling statement. However, this is not true of stored procedures.

  • Deterministic UDFs must be applied on the replicas.

  • Table definitions must be (nearly) identical on source and replica. See Section 17.5.1.9, “Replication with Differing Table Definitions on Source and Replica”, for more information.

  • As of MySQL 8.0.22, DML operations that read data from MySQL grant tables (through a join list or subquery) but do not modify them are performed as non-locking reads on the MySQL grant tables and are therefore not safe for statement-based replication. For more information, see Grant Table Concurrency.

Advantages of row-based replication
  • All changes can be replicated. This is the safest form of replication.

    Note

    Statements that update the information in the mysql system schema, such as GRANT, REVOKE and the manipulation of triggers, stored routines (including stored procedures), and views, are all replicated to replicas using statement-based replication.

    For statements such as CREATE TABLE ... SELECT, a CREATE statement is generated from the table definition and replicated using statement-based format, while the row insertions are replicated using row-based format.

  • Fewer row locks are required on the source, which thus achieves higher concurrency, for the following types of statements:

  • Fewer row locks are required on the replica for any INSERT, UPDATE, or DELETE statement.

Disadvantages of row-based replication
  • RBR can generate more data that must be logged. To replicate a DML statement (such as an UPDATE or DELETE statement), statement-based replication writes only the statement to the binary log. By contrast, row-based replication writes each changed row to the binary log. If the statement changes many rows, row-based replication may write significantly more data to the binary log; this is true even for statements that are rolled back. This also means that making and restoring a backup can require more time. In addition, the binary log is locked for a longer time to write the data, which may cause concurrency problems. Use binlog_row_image=minimal to reduce the disadvantage considerably.

  • Deterministic UDFs that generate large BLOB values take longer to replicate with row-based replication than with statement-based replication. This is because the BLOB column value is logged, rather than the statement generating the data.

  • You cannot see on the replica what statements were received from the source and executed. However, you can see what data was changed using mysqlbinlog with the options --base64-output=DECODE-ROWS and --verbose.

    Alternatively, use the binlog_rows_query_log_events variable, which if enabled adds a Rows_query event with the statement to mysqlbinlog output when the -vv option is used.

  • For tables using the MyISAM storage engine, a stronger lock is required on the replica for INSERT statements when applying them as row-based events to the binary log than when applying them as statements. This means that concurrent inserts on MyISAM tables are not supported when using row-based replication.

17.2.1.2 Usage of Row-Based Logging and Replication

MySQL uses statement-based logging (SBL), row-based logging (RBL) or mixed-format logging. The type of binary log used impacts the size and efficiency of logging. Therefore the choice between row-based replication (RBR) or statement-based replication (SBR) depends on your application and environment. This section describes known issues when using a row-based format log, and describes some best practices using it in replication.

For additional information, see Section 17.2.1, “Replication Formats”, and Section 17.2.1.1, “Advantages and Disadvantages of Statement-Based and Row-Based Replication”.

For information about issues specific to NDB Cluster Replication (which depends on row-based replication), see Section 23.6.3, “Known Issues in NDB Cluster Replication”.

  • Row-based logging of temporary tables.  As noted in Section 17.5.1.31, “Replication and Temporary Tables”, temporary tables are not replicated when using row-based format or (from MySQL 8.0.4) mixed format. For more information, see Section 17.2.1.1, “Advantages and Disadvantages of Statement-Based and Row-Based Replication”.

    Temporary tables are not replicated when using row-based or mixed format because there is no need. In addition, because temporary tables can be read only from the thread which created them, there is seldom if ever any benefit obtained from replicating them, even when using statement-based format.

    You can switch from statement-based to row-based binary logging format at runtime even when temporary tables have been created. However, in MySQL 8.0, you cannot switch from row-based or mixed format for binary logging to statement-based format at runtime, due to any CREATE TEMPORARY TABLE statements having been omitted from the binary log in the previous mode.

    The MySQL server tracks the logging mode that was in effect when each temporary table was created. When a given client session ends, the server logs a DROP TEMPORARY TABLE IF EXISTS statement for each temporary table that still exists and was created when statement-based binary logging was in use. If row-based or mixed format binary logging was in use when the table was created, the DROP TEMPORARY TABLE IF EXISTS statement is not logged. In releases before MySQL 8.0.4 and 5.7.25, the DROP TEMPORARY TABLE IF EXISTS statement was logged regardless of the logging mode that was in effect.

    Nontransactional DML statements involving temporary tables are allowed when using binlog_format=ROW, as long as any nontransactional tables affected by the statements are temporary tables (Bug #14272672).

  • RBL and synchronization of nontransactional tables.  When many rows are affected, the set of changes is split into several events; when the statement commits, all of these events are written to the binary log. When executing on the replica, a table lock is taken on all tables involved, and then the rows are applied in batch mode. Depending on the engine used for the replica's copy of the table, this may or may not be effective.

  • Latency and binary log size.  RBL writes changes for each row to the binary log and so its size can increase quite rapidly. This can significantly increase the time required to make changes on the replica that match those on the source. You should be aware of the potential for this delay in your applications.

  • Reading the binary log.  mysqlbinlog displays row-based events in the binary log using the BINLOG statement (see Section 13.7.8.1, “BINLOG Statement”). This statement displays an event as a base 64-encoded string, the meaning of which is not evident. When invoked with the --base64-output=DECODE-ROWS and --verbose options, mysqlbinlog formats the contents of the binary log to be human readable. When binary log events were written in row-based format and you want to read or recover from a replication or database failure you can use this command to read contents of the binary log. For more information, see Section 4.6.8.2, “mysqlbinlog Row Event Display”.

  • Binary log execution errors and replica execution mode.  Using slave_exec_mode=IDEMPOTENT is generally only useful with MySQL NDB Cluster replication, for which IDEMPOTENT is the default value. (See Section 23.6.10, “NDB Cluster Replication: Bidrectional and Circular Replication”). When slave_exec_mode is IDEMPOTENT, a failure to apply changes from RBL because the original row cannot be found does not trigger an error or cause replication to fail. This means that it is possible that updates are not applied on the replica, so that the source and replica are no longer synchronized. Latency issues and use of nontransactional tables with RBR when slave_exec_mode is IDEMPOTENT can cause the source and replica to diverge even further. For more information about slave_exec_mode, see Section 5.1.8, “Server System Variables”.

    For other scenarios, setting slave_exec_mode to STRICT is normally sufficient; this is the default value for storage engines other than NDB.

  • Filtering based on server ID not supported.  You can filter based on server ID by using the IGNORE_SERVER_IDS option for the CHANGE MASTER TO statement. This option works with statement-based and row-based logging formats, but is deprecated for use when GTID_MODE=ON is set. Another method to filter out changes on some replicas is to use a WHERE clause that includes the relation @@server_id <> id_value clause with UPDATE and DELETE statements. For example, WHERE @@server_id <> 1. However, this does not work correctly with row-based logging. To use the server_id system variable for statement filtering, use statement-based logging.

  • RBL, nontransactional tables, and stopped replicas.  When using row-based logging, if the replica server is stopped while a replica thread is updating a nontransactional table, the replica database can reach an inconsistent state. For this reason, it is recommended that you use a transactional storage engine such as InnoDB for all tables replicated using the row-based format. Use of STOP REPLICA | SLAVE or STOP REPLICA | SLAVE SQL_THREAD prior to shutting down the replica MySQL server helps prevent issues from occurring, and is always recommended regardless of the logging format or storage engine you use.

17.2.1.3 Determination of Safe and Unsafe Statements in Binary Logging

The safeness of a statement in MySQL replication refers to whether the statement and its effects can be replicated correctly using statement-based format. If this is true of the statement, we refer to the statement as safe; otherwise, we refer to it as unsafe.

In general, a statement is safe if it deterministic, and unsafe if it is not. However, certain nondeterministic functions are not considered unsafe (see Nondeterministic functions not considered unsafe, later in this section). In addition, statements using results from floating-point math functions—which are hardware-dependent—are always considered unsafe (see Section 17.5.1.12, “Replication and Floating-Point Values”).

Handling of safe and unsafe statements.  A statement is treated differently depending on whether the statement is considered safe, and with respect to the binary logging format (that is, the current value of binlog_format).

  • When using row-based logging, no distinction is made in the treatment of safe and unsafe statements.

  • When using mixed-format logging, statements flagged as unsafe are logged using the row-based format; statements regarded as safe are logged using the statement-based format.

  • When using statement-based logging, statements flagged as being unsafe generate a warning to this effect. Safe statements are logged normally.

Each statement flagged as unsafe generates a warning. If a large number of such statements were executed on the source, this could lead to excessively large error log files. To prevent this, MySQL has a warning suppression mechanism. Whenever the 50 most recent ER_BINLOG_UNSAFE_STATEMENT warnings have been generated more than 50 times in any 50-second period, warning suppression is enabled. When activated, this causes such warnings not to be written to the error log; instead, for each 50 warnings of this type, a note The last warning was repeated N times in last S seconds is written to the error log. This continues as long as the 50 most recent such warnings were issued in 50 seconds or less; once the rate has decreased below this threshold, the warnings are once again logged normally. Warning suppression has no effect on how the safety of statements for statement-based logging is determined, nor on how warnings are sent to the client. MySQL clients still receive one warning for each such statement.

For more information, see Section 17.2.1, “Replication Formats”.

Statements considered unsafe.  Statements with the following characteristics are considered unsafe:

  • Statements containing system functions that may return a different value on the replica.  These functions include FOUND_ROWS(), GET_LOCK(), IS_FREE_LOCK(), IS_USED_LOCK(), LOAD_FILE(), MASTER_POS_WAIT(), RAND(), RELEASE_LOCK(), ROW_COUNT(), SESSION_USER(), SLEEP(), SYSDATE(), SYSTEM_USER(), USER(), UUID(), and UUID_SHORT().

    Nondeterministic functions not considered unsafe.  Although these functions are not deterministic, they are treated as safe for purposes of logging and replication: CONNECTION_ID(), CURDATE(), CURRENT_DATE(), CURRENT_TIME(), CURRENT_TIMESTAMP(), CURTIME(),, LAST_INSERT_ID(), LOCALTIME(), LOCALTIMESTAMP(), NOW(), UNIX_TIMESTAMP(), UTC_DATE(), UTC_TIME(), and UTC_TIMESTAMP().

    For more information, see Section 17.5.1.14, “Replication and System Functions”.

  • References to system variables.  Most system variables are not replicated correctly using the statement-based format. See Section 17.5.1.39, “Replication and Variables”. For exceptions, see Section 5.4.4.3, “Mixed Binary Logging Format”.

  • UDFs.  Since we have no control over what a UDF does, we must assume that it is executing unsafe statements.

  • Fulltext plugin.  This plugin may behave differently on different MySQL servers; therefore, statements depending on it could have different results. For this reason, all statements relying on the fulltext plugin are treated as unsafe in MySQL.

  • Trigger or stored program updates a table having an AUTO_INCREMENT column.  This is unsafe because the order in which the rows are updated may differ on the source and the replica.

    In addition, an INSERT into a table that has a composite primary key containing an AUTO_INCREMENT column that is not the first column of this composite key is unsafe.

    For more information, see Section 17.5.1.1, “Replication and AUTO_INCREMENT”.

  • INSERT ... ON DUPLICATE KEY UPDATE statements on tables with multiple primary or unique keys.  When executed against a table that contains more than one primary or unique key, this statement is considered unsafe, being sensitive to the order in which the storage engine checks the keys, which is not deterministic, and on which the choice of rows updated by the MySQL Server depends.

    An INSERT ... ON DUPLICATE KEY UPDATE statement against a table having more than one unique or primary key is marked as unsafe for statement-based replication. (Bug #11765650, Bug #58637)

  • Updates using LIMIT.  The order in which rows are retrieved is not specified, and is therefore considered unsafe. See Section 17.5.1.18, “Replication and LIMIT”.

  • Accesses or references log tables.  The contents of the system log table may differ between source and replica.

  • Nontransactional operations after transactional operations.  Within a transaction, allowing any nontransactional reads or writes to execute after any transactional reads or writes is considered unsafe.

    For more information, see Section 17.5.1.35, “Replication and Transactions”.

  • Accesses or references self-logging tables.  All reads and writes to self-logging tables are considered unsafe. Within a transaction, any statement following a read or write to self-logging tables is also considered unsafe.

  • LOAD DATA statements.  LOAD DATA is treated as unsafe and when binlog_format=MIXED the statement is logged in row-based format. When binlog_format=STATEMENT LOAD DATA does not generate a warning, unlike other unsafe statements.

  • XA transactions.  If two XA transactions committed in parallel on the source are being prepared on the replica in the inverse order, locking dependencies can occur with statement-based replication that cannot be safely resolved, and it is possible for replication to fail with deadlock on the replica. When binlog_format=STATEMENT is set, DML statements inside XA transactions are flagged as being unsafe and generate a warning. When binlog_format=MIXED or binlog_format=ROW is set, DML statements inside XA transactions are logged using row-based replication, and the potential issue is not present.

  • DEFAULT clause that refers to a nondeterministic function.  If an expression default value refers to a nondeterministic function, any statement that causes the expression to be evaluated is unsafe for statement-based replication. This includes statements such as INSERT, UPDATE, and ALTER TABLE. Unlike most other unsafe statements, this category of statement cannot be replicated safely in row-based format. When binlog_format is set to STATEMENT, the statement is logged and executed but a warning message is written to the error log. When binlog_format is set to MIXED or ROW, the statement is not executed and an error message is written to the error log. For more information on the handling of explicit defaults, see Explicit Default Handling as of MySQL 8.0.13.

For additional information, see Section 17.5.1, “Replication Features and Issues”.

17.2.2 Replication Channels

In MySQL multi-source replication, a replica opens multiple replication channels, one for each source server. The replication channels represent the path of transactions flowing from a source to the replica. Each replication channel has its own receiver (I/O) thread, one or more applier (SQL) threads, and relay log. When transactions from a source are received by a channel's receiver thread, they are added to the channel's relay log file and passed through to the channel's applier threads. This enables each channel to function independently.

This section describes how channels can be used in a replication topology, and the impact they have on single-source replication. For instructions to configure sources and replicas for multi-source replication, to start, stop and reset multi-source replicas, and to monitor multi-source replication, see Section 17.1.5, “MySQL Multi-Source Replication”.

The maximum number of channels that can be created on one replica server in a multi-source replication topology is 256. Each replication channel must have a unique (nonempty) name, as explained in Section 17.2.2.4, “Replication Channel Naming Conventions”. The error codes and messages that are issued when multi-source replication is enabled specify the channel that generated the error.

Note

Each channel on a multi-source replica must replicate from a different source. You cannot set up multiple replication channels from a single replica to a single source. This is because the server IDs of replicas must be unique in a replication topology. The source distinguishes replicas only by their server IDs, not by the names of the replication channels, so it cannot recognize different replication channels from the same replica.

A multi-source replica can also be set up as a multi-threaded replica, by setting the slave_parallel_workers system variable to a value greater than 0. When you do this on a multi-source replica, each channel on the replica has the specified number of applier threads, plus a coordinator thread to manage them. You cannot configure the number of applier threads for individual channels.

From MySQL 8.0, multi-source replicas can be configured with replication filters on specific replication channels. Channel specific replication filters can be used when the same database or table is present on multiple sources, and you only need the replica to replicate it from one source. For GTID-based replication, if the same transaction might arrive from multiple sources (such as in a diamond topology), you must ensure the filtering setup is the same on all channels. For more information, see Section 17.2.5.4, “Replication Channel Based Filters”.

To provide compatibility with previous versions, the MySQL server automatically creates on startup a default channel whose name is the empty string (""). This channel is always present; it cannot be created or destroyed by the user. If no other channels (having nonempty names) have been created, replication statements act on the default channel only, so that all replication statements from older replicas function as expected (see Section 17.2.2.2, “Compatibility with Previous Replication Statements”. Statements applying to replication channels as described in this section can be used only when there is at least one named channel.

17.2.2.1 Commands for Operations on a Single Channel

To enable MySQL replication operations to act on individual replication channels, use the FOR CHANNEL channel clause with the following replication statements:

An additional channel parameter is introduced for the following function:

The following statements are disallowed for the group_replication_recovery channel:

The following statements are disallowed for the group_replication_applier channel:

FLUSH RELAY LOGS is now permitted for the group_replication_applier channel, but if the request is received while a transaction is being applied, the request is performed after the transaction ends. The requester must wait while the transaction is completed and the rotation takes place. This behavior prevents transactions from being split, which is not permitted for Group Replication.

17.2.2.2 Compatibility with Previous Replication Statements

When a replica has multiple channels and a FOR CHANNEL channel option is not specified, a valid statement generally acts on all available channels, with some specific exceptions.

For example, the following statements behave as expected for all except certain Group Replication channels:

Warning

Use RESET REPLICA | SLAVE with caution as this statement deletes all existing channels, purges their relay log files, and recreates only the default channel.

Some replication statements cannot operate on all channels. In this case, error 1964 Multiple channels exist on the replica. Please provide channel name as an argument. is generated. The following statements and functions generate this error when used in a multi-source replication topology and a FOR CHANNEL channel option is not used to specify which channel to act on:

Note that a default channel always exists in a single source replication topology, where statements and functions behave as in previous versions of MySQL.

17.2.2.3 Startup Options and Replication Channels

This section describes startup options which are impacted by the addition of replication channels.

The master_info_repository and relay_log_info_repository system variables must not be set to FILE when you use replication channels. In MySQL 8.0, the FILE setting is deprecated, and TABLE is the default, so the system variables can be omitted. From MySQL 8.0.23, they must be omitted because their use is deprecated from that release. If these system variables are set to FILE, attempting to add more sources to a replica fails with ER_SLAVE_NEW_CHANNEL_WRONG_REPOSITORY.

The following startup options now affect all channels in a replication topology.

The values set for the following startup options apply on each channel; since these are mysqld startup options, they are applied on every channel.

  • --max-relay-log-size=size

    Maximum size of the individual relay log file for each channel; after reaching this limit, the file is rotated.

  • --relay-log-space-limit=size

    Upper limit for the total size of all relay logs combined, for each individual channel. For N channels, the combined size of these logs is limited to relay_log_space_limit * N.

  • --slave-parallel-workers=value

    Number of replication applier threads per channel.

  • slave_checkpoint_group

    Waiting time by an I/O thread for each source.

  • --relay-log-index=filename

    Base name for each channel's relay log index file. See Section 17.2.2.4, “Replication Channel Naming Conventions”.

  • --relay-log=filename

    Denotes the base name of each channel's relay log file. See Section 17.2.2.4, “Replication Channel Naming Conventions”.

  • --slave_net-timeout=N

    This value is set per channel, so that each channel waits for N seconds to check for a broken connection.

  • --slave-skip-counter=N

    This value is set per channel, so that each channel skips N events from its source.

17.2.2.4 Replication Channel Naming Conventions

This section describes how naming conventions are impacted by replication channels.

Each replication channel has a unique name which is a string with a maximum length of 64 characters and is case-insensitive. Because channel names are used in the replica's applier metadata repository table, the character set used for these is always UTF-8. Although you are generally free to use any name for channels, the following names are reserved:

  • group_replication_applier

  • group_replication_recovery

The name you choose for a replication channel also influences the file names used by a multi-source replica. The relay log files and index files for each channel are named relay_log_basename-channel.xxxxxx, where relay_log_basename is a base name specified using the relay_log system variable, and channel is the name of the channel logged to this file. If you do not specify the relay_log system variable, a default file name is used that also includes the name of the channel.

17.2.3 Replication Threads

MySQL replication capabilities are implemented using three main threads, one on the source server and two on the replica:

  • Binary log dump thread.  The source creates a thread to send the binary log contents to a replica when the replica connects. This thread can be identified in the output of SHOW PROCESSLIST on the source as the Binlog Dump thread.

    The binary log dump thread acquires a lock on the source's binary log for reading each event that is to be sent to the replica. As soon as the event has been read, the lock is released, even before the event is sent to the replica.

  • Replication I/O thread.  When a START REPLICA | SLAVE statement is issued on a replica server, the replica creates an I/O thread, which connects to the source and asks it to send the updates recorded in its binary logs.

    The replication I/O thread reads the updates that the source's Binlog Dump thread sends (see previous item) and copies them to local files that comprise the replica's relay log.

    The state of this thread is shown as Slave_IO_running in the output of SHOW SLAVE STATUS.

  • Replication SQL thread.  The replica creates an SQL thread to read the relay log that is written by the replication I/O thread and execute the transactions contained in it.

There are three main threads for each source/replica connection. A source that has multiple replicas creates one binary log dump thread for each currently connected replica, and each replica has its own replication I/O and SQL threads.

A replica uses two threads to separate reading updates from the source and executing them into independent tasks. Thus, the task of reading transactions is not slowed down if the process of applying them is slow. For example, if the replica server has not been running for a while, its I/O thread can quickly fetch all the binary log contents from the source when the replica starts, even if the SQL thread lags far behind. If the replica stops before the SQL thread has executed all the fetched statements, the I/O thread has at least fetched everything so that a safe copy of the transactions is stored locally in the replica's relay logs, ready for execution the next time that the replica starts.

You can enable further parallelization for tasks on a replica by setting the slave_parallel_workers system variable to a value greater than 0 (the default). When this system variable is set, the replica creates the specified number of worker threads to apply transactions, plus a coordinator thread to manage them. If you are using multiple replication channels, each channel has this number of threads. A replica with slave_parallel_workers set to a value greater than 0 is called a multithreaded replica. With this setup, transactions that fail can be retried.

Note

Multithreaded replicas are not currently supported by NDB Cluster, which silently ignores the setting for this variable. See Section 23.6.3, “Known Issues in NDB Cluster Replication” for more information.

17.2.3.1 Monitoring Replication Main Threads

The SHOW PROCESSLIST statement provides information that tells you what is happening on the source and on the replica regarding replication. For information on source states, see Section 8.14.4, “Replication Source Thread States”. For replica states, see Section 8.14.5, “Replication I/O Thread States”, and Section 8.14.6, “Replication SQL Thread States”.

The following example illustrates how the three main replication threads, the binary log dump thread, replicatin I/O thread, and replication SQL thread, show up in the output from SHOW PROCESSLIST.

On the source server, the output from SHOW PROCESSLIST looks like this:

mysql> SHOW PROCESSLIST\G
*************************** 1. row ***************************
     Id: 2
   User: root
   Host: localhost:32931
     db: NULL
Command: Binlog Dump
   Time: 94
  State: Has sent all binlog to slave; waiting for binlog to
         be updated
   Info: NULL

Here, thread 2 is a Binlog Dump thread that services a connected replica. The State information indicates that all outstanding updates have been sent to the replica and that the source is waiting for more updates to occur. If you see no Binlog Dump threads on a source server, this means that replication is not running; that is, no replicas are currently connected.

On a replica server, the output from SHOW PROCESSLIST looks like this:

mysql> SHOW PROCESSLIST\G
*************************** 1. row ***************************
     Id: 10
   User: system user
   Host:
     db: NULL
Command: Connect
   Time: 11
  State: Waiting for master to send event
   Info: NULL
*************************** 2. row ***************************
     Id: 11
   User: system user
   Host:
     db: NULL
Command: Connect
   Time: 11
  State: Has read all relay log; waiting for the slave I/O
         thread to update it
   Info: NULL

The State information indicates that thread 10 is the replication I/O thread that is communicating with the source server, and thread 11 is the replication SQL thread that is processing the updates stored in the relay logs. At the time that SHOW PROCESSLIST was run, both threads were idle, waiting for further updates.

The value in the Time column can show how late the replica is compared to the source. See Section A.14, “MySQL 8.0 FAQ: Replication”. If sufficient time elapses on the source side without activity on the Binlog Dump thread, the source determines that the replica is no longer connected. As for any other client connection, the timeouts for this depend on the values of net_write_timeout and net_retry_count; for more information about these, see Section 5.1.8, “Server System Variables”.

The SHOW REPLICA | SLAVE STATUS statement provides additional information about replication processing on a replica server. See Section 17.1.7.1, “Checking Replication Status”.

17.2.3.2 Monitoring Replication Applier Worker Threads

On a multithreaded replica, the Performance Schema tables replication_applier_status_by_coordinator and replication_applier_status_by_worker show status information for the replica's coordinator thread and applier worker threads respectively. For a replica with multiple channels, the threads for each channel are identified.

A multithreaded replica's coordinator thread also prints statistics to the replica's error log on a regular basis if the verbosity setting is set to display informational messages. The statistics are printed depending on the volume of events that the coordinator thread has assigned to applier worker threads, with a maximum frequency of once every 120 seconds. The message lists the following statistics for the relevant replication channel, or the default replication channel (which is not named):

Seconds elapsed

The difference in seconds between the current time and the last time this information was printed to the error log.

Events assigned

The total number of events that the coordinator thread has queued to all applier worker threads since the coordinator thread was started.

Worker queues filled over overrun level

The current number of events that are queued to any of the applier worker threads in excess of the overrun level, which is set at 90% of the maximum queue length of 16384 events. If this value is zero, no applier worker threads are operating at the upper limit of their capacity.

Waited due to worker queue full

The number of times that the coordinator thread had to wait to schedule an event because an applier worker thread's queue was full. If this value is zero, no applier worker threads exhausted their capacity.

Waited due to the total size

The number of times that the coordinator thread had to wait to schedule an event because the slave_pending_jobs_size_max limit had been reached. This system variable sets the maximum amount of memory (in bytes) available to applier worker thread queues holding events not yet applied. If an unusually large event exceeds this size, the transaction is held until all the applier worker threads have empty queues, and then processed. All subsequent transactions are held until the large transaction has been completed.

Waited at clock conflicts

The number of nanoseconds that the coordinator thread had to wait to schedule an event because a transaction that the event depended on had not yet been committed. If slave_parallel_type is set to DATABASE (rather than LOGICAL_CLOCK), this value is always zero.

Waited (count) when workers occupied

The number of times that the coordinator thread slept for a short period, which it might do in two situations. The first situation is where the coordinator thread assigns an event and finds the applier worker thread's queue is filled beyond the underrun level of 10% of the maximum queue length, in which case it sleeps for a maximum of 1 millisecond. The second situation is where slave_parallel_type is set to LOGICAL_CLOCK and the coordinator thread needs to assign the first event of a transaction to an applier worker thread's queue, it only does this to a worker with an empty queue, so if no queues are empty, the coordinator thread sleeps until one becomes empty.

Waited when workers occupied

The number of nanoseconds that the coordinator thread slept while waiting for an empty applier worker thread queue (that is, in the second situation described above, where slave_parallel_type is set to LOGICAL_CLOCK and the first event of a transaction needs to be assigned).

17.2.4 Relay Log and Replication Metadata Repositories

A replica server creates several repositories of information to use for the replication process:

  • The replica's relay log, which is written by the replication I/O thread, contains the transactions read from the replication source server's binary log. The transactions in the relay log are applied on the replica by the replication SQL thread. For information about the relay log, see Section 17.2.4.1, “The Relay Log”.

  • The replica's connection metadata repository contains information that the replication I/O thread needs to connect to the replication source server and retrieve transactions from the source's binary log. The connection metadata repository is written to the mysql.slave_master_info table.

  • The replica's applier metadata repository contains information that the replication SQL thread needs to read and apply transactions from the replica's relay log. The applier metadata repository is written to the mysql.slave_relay_log_info table.

The replica's connection metadata repository and applier metadata repository are collectively known as the replication metadata repositories. For information about these, see Section 17.2.4.2, “Replication Metadata Repositories”.

Making replication resilient to unexpected halts.  The mysql.slave_master_info and mysql.slave_relay_log_info tables are created using the transactional storage engine InnoDB. Updates to the replica's applier metadata repository table are committed together with the transactions, meaning that the replica's progress information recorded in that repository is always consistent with what has been applied to the database, even in the event of an unexpected server halt. For information on the combination of settings on the replica that is most resilient to unexpected halts, see Section 17.4.2, “Handling an Unexpected Halt of a Replica”.

17.2.4.1 The Relay Log

The relay log, like the binary log, consists of a set of numbered files containing events that describe database changes, and an index file that contains the names of all used relay log files. The default location for relay log files is the data directory.

The term relay log file generally denotes an individual numbered file containing database events. The term relay log collectively denotes the set of numbered relay log files plus the index file.

Relay log files have the same format as binary log files and can be read using mysqlbinlog (see Section 4.6.8, “mysqlbinlog — Utility for Processing Binary Log Files”). If binary log transaction compression (available as of MySQL 8.0.20) is in use, transaction payloads written to the relay log are compressed in the same way as for the binary log. For more information on binary log transaction compression, see Section 5.4.4.5, “Binary Log Transaction Compression”.

For the default replication channel, relay log file names have the default form host_name-relay-bin.nnnnnn, where host_name is the name of the replica server host and nnnnnn is a sequence number. Successive relay log files are created using successive sequence numbers, beginning with 000001. For non-default replication channels, the default base name is host_name-relay-bin-channel, where channel is the name of the replication channel recorded in the relay log.

The replica uses an index file to track the relay log files currently in use. The default relay log index file name is host_name-relay-bin.index for the default channel, and host_name-relay-bin-channel.index for non-default replication channels.

The default relay log file and relay log index file names and locations can be overridden with, respectively, the relay_log and relay_log_index system variables (see Section 17.1.6, “Replication and Binary Logging Options and Variables”).

If a replica uses the default host-based relay log file names, changing a replica's host name after replication has been set up can cause replication to fail with the errors Failed to open the relay log and Could not find target log during relay log initialization. This is a known issue (see Bug #2122). If you anticipate that a replica's host name might change in the future (for example, if networking is set up on the replica such that its host name can be modified using DHCP), you can avoid this issue entirely by using the relay_log and relay_log_index system variables to specify relay log file names explicitly when you initially set up the replica. This causes the names to be independent of server host name changes.

If you encounter the issue after replication has already begun, one way to work around it is to stop the replica server, prepend the contents of the old relay log index file to the new one, and then restart the replica. On a Unix system, this can be done as shown here:

shell> cat new_relay_log_name.index >> old_relay_log_name.index
shell> mv old_relay_log_name.index new_relay_log_name.index

A replica server creates a new relay log file under the following conditions:

  • Each time the replication I/O thread starts.

  • When the logs are flushed (for example, with FLUSH LOGS or mysqladmin flush-logs).

  • When the size of the current relay log file becomes too large, which is determined as follows:

The replication SQL thread automatically deletes each relay log file after it has executed all events in the file and no longer needs it. There is no explicit mechanism for deleting relay logs because the replication SQL thread takes care of doing so. However, FLUSH LOGS rotates relay logs, which influences when the replication SQL thread deletes them.

17.2.4.2 Replication Metadata Repositories

A replica server creates two replication metadata repositories, the connection metadata repository and the applier metadata repository. The replication metadata repositories survive a replica server's shutdown. If binary log file position based replication is in use, when the replica restarts, it reads the two repositories to determine how far it previously proceeded in reading the binary log from the source and in processing its own relay log. If GTID-based replication is in use, the replica does not use the replication metadata repositories for that purpose, but does need them for the other metadata that they contain.

  • The replica's connection metadata repository contains information that the replication I/O thread needs to connect to the replication source server and retrieve transactions from the source's binary log. The metadata in this repository includes the connection configuration, the replication user account details, the SSL settings for the connection, and the file name and position where the replication I/O thread is currently reading from the source's binary log.

  • The replica's applier metadata repository contains information that the replication SQL thread needs to read and apply transactions from the replica's relay log. The metadata in this repository includes the file name and position up to which the replication SQL thread has executed the transactions in the relay log, and the equivalent position in the source's binary log. It also includes metadata for the process of applying transactions, such as the number of worker threads and the PRIVILEGE_CHECKS_USER account for the channel.

The connection metadata repository is written to the slave_master_info table in the mysql system schema, and the applier metadata repository is written to the slave_relay_log_info table in the mysql system schema. A warning message is issued if mysqld is unable to initialize the tables for the replication metadata repositories, but the replica is allowed to continue starting. This situation is most likely to occur when upgrading from a version of MySQL that does not support the use of tables for the repositories to one in which they are supported.

Important
  1. Do not attempt to update or insert rows in the mysql.slave_master_info or mysql.slave_relay_log_info tables manually. Doing so can cause undefined behavior, and is not supported. Execution of any statement requiring a write lock on either or both of the slave_master_info and slave_relay_log_info tables is disallowed while replication is ongoing (although statements that perform only reads are permitted at any time).

  2. Access privileges for the connection metadata repository table mysql.slave_master_info should be restricted to the database administrator, because it contains the replication user account name and password for connecting to the source. Use a restricted access mode to protect database backups that include this table. From MySQL 8.0.21, you can clear the replication user account credentials from the connection metadata repository, and instead always provide them using the START REPLICA | SLAVE statement or START GROUP_REPLICATION statement that starts the replication channel. This approach means that the replication channel always needs operator intervention to restart, but the account name and password are not recorded in the replication metadata repositories.

RESET REPLICA | SLAVE clears the data in the replication metadata repositories, with the exception of the replication connection parameters (depending on the MySQL Server release). For details, see the description for RESET REPLICA | SLAVE.

Before MySQL 8.0, to create the replication metadata repositories as tables, it was necessary to specify master_info_repository=TABLE and relay_log_info_repository=TABLE at server startup. Otherwise, the repositories were created as files in the data directory named master.info and relay-log.info, or with alternative names and locations specified by the --master-info-file option and relay_log_info_file system variable. From MySQL 8.0, creating the replication metadata repositories as tables is the default, and the use of all these system variables is deprecated.

The mysql.slave_master_info and mysql.slave_relay_log_info tables are created using the InnoDB transactional storage engine. Updates to the applier metadata repository table are committed together with the transactions, meaning that the replica's progress information recorded in that repository is always consistent with what has been applied to the database, even in the event of an unexpected server halt. For information on the combination of settings on a replica that is most resilient to unexpected halts, see Section 17.4.2, “Handling an Unexpected Halt of a Replica”.

When you back up the replica's data or transfer a snapshot of its data to create a new replica, ensure that you include the mysql.slave_master_info and mysql.slave_relay_log_info tables containing the replication metadata repositories. For cloning operations, note that when the replication metadata repositories are created as tables, they are copied to the recipient during a cloning operation, but when they are created as files, they are not copied. When binary log file position based replication is in use, the replication metadata repositories are needed to resume replication after restarting the restored, copied, or cloned replica. If you do not have the relay log files, but still have the applier metadata repository, you can check it to determine how far the replication SQL thread has executed in the source's binary log. Then you can use a CHANGE MASTER TO statement with the MASTER_LOG_FILE and MASTER_LOG_POS options to tell the replica to re-read the binary logs from the source from that point (provided that the required binary logs still exist on the source).

One additional repository, the applier worker metadata repository, is created primarily for internal use, and holds status information about worker threads on a multithreaded replica. The applier worker metadata repository includes the names and positions for the relay log file and the source's binary log file for each worker thread. If the applier metadata repository is created as a table, which is the default, the applier worker metadata repository is written to the mysql.slave_worker_info table. If the applier metadata repository is written to a file, the applier worker metadata repository is written to the worker-relay-log.info file. For external use, status information for worker threads is presented in the Performance Schema replication_applier_status_by_worker table.

The replication metadata repositories originally contained information similar to that shown in the output of the SHOW REPLICA | SLAVE STATUS statement, which is discussed in Section 13.4.2, “SQL Statements for Controlling Replica Servers”. Further information has since been added to the replication metadata repositories which is not displayed by the SHOW REPLICA | SLAVE STATUS statement.

For the connection metadata repository, the following table shows the correspondence between the columns in the mysql.slave_master_info table, the columns displayed by SHOW REPLICA | SLAVE STATUS, and the lines in the deprecated master.info file.

slave_master_info Table Column SHOW REPLICA | SLAVE STATUS Column master.info File Line Description
Number_of_lines [None] 1 Number of columns in the table (or lines in the file)
Master_log_name Source_Log_File 2 The name of the binary log currently being read from the source
Master_log_pos Read_Source_Log_Pos 3 The current position within the binary log that has been read from the source
Host Source_Host 4 The host name of the replication source server
User_name Source_User 5 The replication user account name used to connect to the source
User_password Password (not shown by SHOW REPLICA | SLAVE STATUS) 6 The replication user account password used to connect to the source
Port Source_Port 7 The network port used to connect to the replication source server
Connect_retry Connect_Retry 8 The period (in seconds) that the replica waits before trying to reconnect to the source
Enabled_ssl Source_SSL_Allowed 9 Whether the replica supports SSL connections
Ssl_ca Source_SSL_CA_File 10 The file used for the Certificate Authority (CA) certificate
Ssl_capath Source_SSL_CA_Path 11 The path to the Certificate Authority (CA) certificate
Ssl_cert Source_SSL_Cert 12 The name of the SSL certificate file
Ssl_cipher Source_SSL_Cipher 13 The list of possible ciphers used in the handshake for the SSL connection
Ssl_key Source_SSL_Key 14 The name of the SSL key file
Ssl_verify_server_cert Source_SSL_Verify_Server_Cert 15 Whether to verify the server certificate
Heartbeat [None] 16 Interval between replication heartbeats, in seconds
Bind Source_Bind 17 Which of the replica's network interfaces should be used for connecting to the source
Ignored_server_ids Replicate_Ignore_Server_Ids 18 The list of server IDs to be ignored. Note that for Ignored_server_ids the list of server IDs is preceded by the total number of server IDs to ignore.
Uuid Source_UUID 19 The source's unique ID
Retry_count Source_Retry_Count 20 Maximum number of reconnection attempts permitted
Ssl_crl [None] 21 Path to an SSL certificate revocation-list file
Ssl_crlpath [None] 22 Path to a directory containing SSL certificate revocation-list files
Enabled_auto_position Auto_position 23 Whether GTID auto-positioning is in use or not
Channel_name Channel_name 24 The name of the replication channel
Tls_version Source_TLS_Version 25 TLS version on the source
Public_key_path Source_public_key_path 26 Name of the RSA public key file
Get_public_key Get_source_public_key 27 Whether to request RSA public key from source
Network_namespace Network_namespace 28 Network namespace
Master_compression_algorithm [None] 29 Permitted compression algorithms for the connection to the source
Master_zstd_compression_level [None] 30 zstd compression level
Tls_ciphersuites [None] 31 Permitted ciphersuites for TLSv1.3
Source_connection_auto_failover [None] 32 Whether the asynchronous connection failover mechanism is activated

For the applier metadata repository, the following table shows the correspondence between the columns in the mysql.slave_relay_log_info table, the columns displayed by SHOW REPLICA | SLAVE STATUS, and the lines in the deprecated relay-log.info file.

slave_relay_log_info Table Column SHOW REPLICA | SLAVE STATUS Column Line in relay-log.info File Description
Number_of_lines [None] 1 Number of columns in the table or lines in the file
Relay_log_name Relay_Log_File 2 The name of the current relay log file
Relay_log_pos Relay_Log_Pos 3 The current position within the relay log file; events up to this position have been executed on the replica database
Master_log_name Relay_Source_Log_File 4 The name of the source's binary log file from which the events in the relay log file were read
Master_log_pos Exec_Source_Log_Pos 5 The equivalent position within the source's binary log file of the events that have been executed on the replica
Sql_delay SQL_Delay 6 The number of seconds that the replica must lag the source
Number_of_workers [None] 7 The number of worker threads for applying replication transactions in parallel
Id [None] 8 ID used for internal purposes; currently this is always 1
Channel_name Channel_name 9 The name of the replication channel
Privilege_checks_username [None] 10 The user name for the PRIVILEGE_CHECKS_USER account for the channel
Privilege_checks_hostname [None] 11 The host name for the PRIVILEGE_CHECKS_USER account for the channel
Require_row_format [None] 12 Whether the channel accepts only row-based events
Require_table_primary_key_check [None] 13 The channel's policy on whether tables must have primary keys for CREATE TABLE and ALTER TABLE operations
Assign_gtids_to_anonymous_transactions_type [None] 14 Whether the channel assigns a GTID to replicated transactions that do not already have one, and if so, whether it uses the replica's local UUID or a manually set UUID
Assign_gtids_to_anonymous_transactions_value [None] 15 The UUID used in the GTIDs assigned to anonymous transactions

17.2.5 How Servers Evaluate Replication Filtering Rules

If a replication source server does not write a statement to its binary log, the statement is not replicated. If the server does log the statement, the statement is sent to all replicas and each replica determines whether to execute it or ignore it.

On the source, you can control which databases to log changes for by using the --binlog-do-db and --binlog-ignore-db options to control binary logging. For a description of the rules that servers use in evaluating these options, see Section 17.2.5.1, “Evaluation of Database-Level Replication and Binary Logging Options”. You should not use these options to control which databases and tables are replicated. Instead, use filtering on the replica to control the events that are executed on the replica.

On the replica side, decisions about whether to execute or ignore statements received from the source are made according to the --replicate-* options that the replica was started with. (See Section 17.1.6, “Replication and Binary Logging Options and Variables”.) The filters governed by these options can also be set dynamically using the CHANGE REPLICATION FILTER statement. The rules governing such filters are the same whether they are created on startup using --replicate-* options or while the replica server is running by CHANGE REPLICATION FILTER. Note that replication filters cannot be used on Group Replication-specific channels on a MySQL server instance that is configured for Group Replication, because filtering transactions on some servers would make the group unable to reach agreement on a consistent state.

In the simplest case, when there are no --replicate-* options, the replica executes all statements that it receives from the source. Otherwise, the result depends on the particular options given.

Database-level options (--replicate-do-db, --replicate-ignore-db) are checked first; see Section 17.2.5.1, “Evaluation of Database-Level Replication and Binary Logging Options”, for a description of this process. If no database-level options are used, option checking proceeds to any table-level options that may be in use (see Section 17.2.5.2, “Evaluation of Table-Level Replication Options”, for a discussion of these). If one or more database-level options are used but none are matched, the statement is not replicated.

For statements affecting databases only (that is, CREATE DATABASE, DROP DATABASE, and ALTER DATABASE), database-level options always take precedence over any --replicate-wild-do-table options. In other words, for such statements, --replicate-wild-do-table options are checked if and only if there are no database-level options that apply.

To make it easier to determine what effect a given set of options has, it is recommended that you avoid mixing do-* and ignore-* options, or options containing wildcards with options which do not.

If any --replicate-rewrite-db options were specified, they are applied before the --replicate-* filtering rules are tested.

Note

All replication filtering options follow the same rules for case sensitivity that apply to names of databases and tables elsewhere in the MySQL server, including the effects of the lower_case_table_names system variable.

17.2.5.1 Evaluation of Database-Level Replication and Binary Logging Options

When evaluating replication options, the replica begins by checking to see whether there are any --replicate-do-db or --replicate-ignore-db options that apply. When using --binlog-do-db or --binlog-ignore-db, the process is similar, but the options are checked on the source.

The database that is checked for a match depends on the binary log format of the statement that is being handled. If the statement has been logged using the row format, the database where data is to be changed is the database that is checked. If the statement has been logged using the statement format, the default database (specified with a USE statement) is the database that is checked.

Note

Only DML statements can be logged using the row format. DDL statements are always logged as statements, even when binlog_format=ROW. All DDL statements are therefore always filtered according to the rules for statement-based replication. This means that you must select the default database explicitly with a USE statement in order for a DDL statement to be applied.

For replication, the steps involved are listed here:

  1. Which logging format is used?

    • STATEMENT.  Test the default database.

    • ROW.  Test the database affected by the changes.

  2. Are there any