A GH-OST can save your time!



Today I am going to introduce you all to an awesome tool GH-OST. We are using it since many months to Alter tables online. Believe me it saves lot of time and efforts while altering big tables in MySQL.

gh-ost has been developed at GitHub. To answer a problem we faced with ongoing, continuous production schema changes requiring modifications to MySQL tables. gh-ost changes the existing online table migration paradigm by providing a low impact, controllable, auditable, operations friendly solution.

gh-ost stands for GitHub’s Online Schema Transmogrifier/Transfigurator/Transformer/Thingy.

While migrating table gh-ost creates a ghost table in the likeness of your original table, migrate that table while empty, slowly and incrementally copy data from your original table to the ghost table, meanwhile propagating ongoing changes (any INSERT, DELETE, UPDATE applied to your table) to the ghost table. Finally, at the right time, they replace your original table with the ghost table.


Installation and configuration of Gh-ost:
cd /usr/bin
wget https://github.com/github/gh-ost/releases/download/v1.0.47/gh-ost-binary-linux-20181016015113.tar.gz
tar -zxvf gh-ost-binary-linux-20181016015113.tar.gz
chmod +x gh-ost
After installing gh-ost on one of the replica on cluster, we need to create a mysql user on slaves as well as on master that will allow gh-ost to work upon. Below code tells the minimum privileges needed by gh-ost to operate.
GRANT ALTER, CREATE, DELETE, DROP, INDEX, INSERT, LOCK TABLES, SELECT, TRIGGER, UPDATE, SUPER, REPLICATION SLAVE,REPLICATION CLIENT on *.* to gh_user@'localhost' identified by 'XXXXXX';

gh-ost operation modes:
gh-ost operates by connecting to potentially multiple servers, as well as connecting itself as a replica in order to stream binary log events directly from one of those servers. There are various operation modes, which depend on your setup, configuration, and where you want to run the migration.

a. Connect to replica, migrate on master

This is the mode gh-ost expects by default. gh-ost will investigate the replica, crawl up to find the topology’s master, and connect to it as well. Migration will:
  • Read and write row-data on master
  • Read binary logs events on the replica, apply the changes onto the master
  • Investigate table format, columns & keys, count rows on the replica
  • Read internal changelog events (such as heartbeat) from the replica
  • Cut-over (switch tables) on the master
  • If your master works with SBR, this is the mode to work with. The replica must be configured with binary logs enabled (log_bin, log_slave_updates) and should have binlog_format=ROW (gh-ost can apply the latter for you).
However even with RBR we suggest this is the least master-intrusive operation mode.

b. Connect to master

If you don’t have replicas, or do not wish to use them, you are still able to operate directly on the master. gh-ost will do all operations directly on the master. You may still ask it to be considerate of replication lag.
  • Your master must produce binary logs in RBR format.
  • You must approve this mode via --allow-on-master.

c. Migrate/test on replica

This will perform a migration on the replica. gh-ost will briefly connect to the master but will thereafter perform all operations on the replica without modifying anything on the master. Throughout the operation, gh-ost will throttle such that the replica is up to date.
  • --migrate-on-replica indicates to gh-ost that it must migrate the table directly on the replica. It will perform the cut-over phase even while replication is running.
  • --test-on-replica indicates the migration is for purpose of testing only. Before cut-over takes place, replication is stopped. Tables are swapped and then swapped back: your original table returns to its original place. Both tables are left with replication stopped. You may examine the two and compare data.
Execution:
gh-ost \
 --user='gh_user' \
 --password='XXXX' \
 --allow-on-master  \
 --database='db_name' \
 --table='table_name' \
 --verbose \
 -max-lag-millis=2000 \
 --alter='' \
 --throttle-control-replicas=slavehost1:3306,slavehost2:3306 \
 --initially-drop-ghost-table \
 --initially-drop-old-table \
 --initially-drop-socket-file \
 --max-load=Threads_running=30 \
 --chunk-size=100 \
 --nice-ratio=1 \
 --default-retries=200 \
 --concurrent-rowcount  \
 --switch-to-rbr \
 --hooks-path=/tmp/ \
 --postpone-cut-over-flag-file=/tmp/ghostcutover.file \
 --execute

More details on execution are available at link.

gh-ost provides external hooks using option called hooks-path. This allows us to perform some special task on alter stages. Sample hooks can be found here.

I would like to thank to the dev team of gh-ost for making such nice tool which helps us to work hassle free.

Comments