Fixing MySQL replication after slaves’s relay log was corrupted

MySQL replication on slave (version 5.1.61) has stopped. Slave_IO_Running was marked as Yes, but Slave_SQL_Running as No. Simple stop/start slave didn’t help so further problem analysis was needed. It seemed that current slave’s relay log was corrupted because testing with “mysqlbinlog” has printed out an error. Therefore, the solution was to discard current relay binlogs and to point slave to the last master binlog position.

Here is complete output from show slave status\G on stopped slave server:

               Slave_IO_State: Waiting for master to send event
                  Master_Host: 10.1.79.48
                  Master_User: replica
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.002046
          Read_Master_Log_Pos: 639600842
               Relay_Log_File: triton-relay-bin.001957
                Relay_Log_Pos: 243
        Relay_Master_Log_File: mysql-bin.002045
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
              Replicate_Do_DB: pretinac_radio,web
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: web.logging_www,web.logging_raspored,web.web_korisnik
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 1594
                   Last_Error: Relay log read failure: Could not parse relay log event entry.
                               The possible reasons are: the master's binary log is corrupted
                               (you can check this by running 'mysqlbinlog' on the binary log),
                               the slave's relay log is corrupted (you can check this by running
                               'mysqlbinlog' on the relay log), a network problem, or a bug in the
                               master's or slave's MySQL code. If you want to check the master's
                               binary log or slave's relay log, you will be able to know their names
                               by issuing 'SHOW SLAVE STATUS' on this slave.
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 103641119
              Relay_Log_Space: 983411603
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 1594
               Last_SQL_Error: [the same error description as in Last_Error]

To fix the error, current binlog files on slave should be discarded and set new position. Before setting new binlog position it’s important to remember Relay_Master_Log_File and Exec_Master_Log_Pos values:

Relay_Master_Log_File: mysql-bin.002045
Exec_Master_Log_Pos: 103641119

OK, with this values, new binlog position can be set:

# stop slave
mysql> stop slave;

# make slave forget its replication position in the master's binary log
mysql> reset slave;

# change slave to start reading from stopped position
mysql> change master to master_log_file='mysql-bin.002045', master_log_pos=103641119;

# start slave
mysql> start slave;

Just to note that “reset slave” will delete master.info, relay-log.info and all the relay log files, so it’s not needed to clean leftovers in /var/lib/mysql directory. After all commands were executed, slave has reconnected to the master and start to read SQL statements (Seconds_Behind_Master value was not NULL any more). Complete “reset slave” documentation can be read on MySQL 5.1 – RESET SLAVE Syntax page.

32 thoughts on “Fixing MySQL replication after slaves’s relay log was corrupted”

  1. @Whisley Santos – I’m glad this post is still helpful and thank you for feedback :)

    Cheers!

  2. This looks pretty good, but is there a chance that the slave is going to miss some data sometime between when it stopped replicating and the master’s current Exec_Master_Log_Pos?

  3. Hi,
    when slave is “rewound” to the stopped position, master will send all data since that moment so there shouldn’t be missed anything. Actually, everything contained in masters logs from defined position will be send to the slave server.

    Cheers

Leave a Comment