Most of us used to get errors like (Row not found, Duplicate row etc) on slave in Master slave replication and sometimes it is very difficult to find unsynced data and fix it while we know table name(s).
There are few recommended tools from percona to check replication integrity and fixed unsync data:
1. pt-table-checksum: performs an online replication consistency check by executing checksum queries on the master, which produces different results on replicas that are inconsistent with the master.
2. pt-table-sync: synchronizes data efficiently between MySQL tables.
Recommended tools work well with limited data with less time, But what if table size is more and we don't have time to analyze each row of the table with recommended tools which is time consuming?
We can achieve this with very simple step though:
Let's assume a simple cluster(MySQL 5.7.XX) with master m1 and slave s1 with GTID based replication replicating a db(db_ankit) with 5 tables(a,b,c,d,e). Now due to some issue data of table c in db_ankit on slave s1 got corrupted and table if huge in size like > 500GB.
- On Slave s1: Start slave with replication filter to ignore table c in replication.
- From Master m1: Dump table c with replication co-ordinates or GTID. mydumper can be used with multiple threads for faster dump according to the server capacity.
- On Slave s1: Rename current table c and restore(myloader with multiple threads will save your time if dump has been taken with mydumper) dumped table.
- On Slave s1: Once restore finishes stop the slave and note down the master exec position and GTID executed sets.
- On Slave s1: Reset slave & Change replication Filter to replicate only c.
- On Slave s1: Run Change master to start replication of table c from coordinates of dump taken in step 2.
- On Slave s1: start slave until the coordinates noted in step 4.
- On Slave s1: when replication lag becomes 0 and sql_thread_status 0 remove replication filter and start slave without any restriction.
In this way with the help of multithreaded dump and restore one can sync specific table(s) from master.