MySQL replication recovery

MySQL replication can stop if slave fails to execute SQL statement from the binary log. From that moment, slave prints last error and waits for replication recovery. If master has consistent snapshot, then is only necessary to re-point slave to the new master position. It can be done with change master to or sql_slave_skip_counter.

Contents
  1. Replication recovery – change master to
  2. Replication recovery – sql_slave_skip_counter
  3. Disable or enable binary logging for the current connection
  4. My case when replication stopped working

1. Replication recovery – change master to
This recover example will cover the case where slave stops to work and master has consistent snapshot. You will need to stop slave, read binlog name and position from the master, set them to the slave and start slave.

# on master server run "show master status" and remember file and position column
mysql> show master status; 
+------------------+-----------+---------------------+------------------+
| File             | Position  | Binlog_Do_DB        | Binlog_Ignore_DB |
+------------------+-----------+---------------------+------------------+
| mysql-bin.000334 | 502360215 | database1,database2 |                  | 
+------------------+-----------+---------------------+------------------+

# stop slave server, re-point and start
mysql> stop slave;
mysql> change master to master_log_file='mysql-bin.000334', master_log_pos=502360215;
mysql> start slave;

If error occurred in the replication, then the stop slave line is not required because slave is already stopped.

2. Replication recovery – sql_slave_skip_counter
If you want to skip the next few statements from the master, set the sql_slave_skip_counter variable on the slave:

mysql> stop slave;
mysql> set global sql_slave_skip_counter = N;
mysql> start slave;

The value of N should be 1 if the next statement from the master does not use auto_increment or last_insert_id(). Otherwise, the value should be 2. The reason for using 2 as sql_slave_skip_counter value for statements that use auto_increment or last_insert_id() is that they take two events in the binary log of the master.

Setting this variable isn’t like setting other server variables, it can’t be read back (it’s read only by slave thread). When you start the slave again with start slave, the slave skips statements and decrements the variable until it reaches 0, at which point it begins executing statements again. You can watch this happening by executing show slave status, where the variable’s value appears in the Skip_Counter column. This is the only place you can see its value.

The effect is that the setting isn’t persistent. If you set it to 1, start the slave from the error, and the slave has an error in replication sometime later, the variable won’t still be set to 1. At that point, if you want the slave to skip the statement that caused the error, you will have to set it to 1 again.

3. Disable or enable binary logging for the current connection

mysql> set sql_log_bin=0;
mysql> create table ...
mysql> exit;

If you need to execute statements on the master only, then you have to disable binary logging for the current connection (sql_log_bin is a session variable). Statement will not be written to the binary log and therefore slaves will be skipped.

4. My case when replication stopped working
In my case, master has InnoDB tables and slave has MyISAM. Why? Because slave is used for reporting while master for OLTP (online transaction processing) queries. Thus, slave server doesn’t know anything about InnoDB storage type. Replication stopped in a moment when I tried to create a new table with InnoDB storage on the master. Table was created on master without any problem, but create statement failed on slave. To prevent this replication problem, I disabled binary logging for the current connection and then created table on master – binary log was intact. Next step was to manually create table on slave with MyISAM storage structure.

1 thought on “MySQL replication recovery”

Leave a Comment