MySQL master full disk and replication problem

After /var partition on master MySQL server was full, replication has stopped working. Master server wasn’t able to write new changes to the binlog file. Recovery procedure seems simple, clean /var partition and restart slave servers but it was not so easy.

Fixing MySQL replication after slaves’s relay log was corrupted

MySQL replication on slave (version 5.1.61) has stopped. Slave_IO_Running was marked as Yes, but Slave_SQL_Running as No. Simple stop/start slave didn’t help so further problem analysis was needed. It seemed that current slave’s relay log was corrupted because testing with “mysqlbinlog” has printed out an error. Therefore, the solution was to discard current relay binlogs and to point slave to the last master binlog position.

MySQL replication recovery

MySQL replication can stop if slave fails to execute SQL statement from the binary log. From that moment, slave prints last error and waits for replication recovery. If master has consistent snapshot, then is only necessary to re-point slave to the new master position. It can be done with “change master to” or “sql_slave_skip_counter”.