Replication is a very useful functionality which enables creating identical database copies on different servers. It is an asynchronous operation and it adds insignificant IO & network load to the server. It is also quite easy to setup. I’ll try to explain some aspects, benefits & problems that may arise mainly for statement-based replication which is most commonly used (there are a lot similarities with row-based replication though).
Common Replication Terms
- Master [Server]:
- the MySQL server from which the data is being replicated. All DML statements that modify data (INSERT, DELETE, UPDATE) are (must be) executed here. That goes for DDL statements as well which is also imperative to execute on this server. Every change is replicated as is.
- Slave [Server]:
- the MySQL server which replicates from the Master. Nothing should be written or modified on this server. All changes to data or structures should pass through the Master or else you might run into duplicate-key errors which will stop the replication until resolved.
- [Master] Binary Logs:
- Every data modification statement and DDL statement is written in a special binary format on these files. They are usually located in /var/log/mysql with a filename pattern such as mysql-bin.NNNNNN (unless you specify otherwise in the configuration file). Every time you restart the server or execute FLUSH LOGS, a new log is created and the index is incremented.
You can use this utility to convert binary logs to text mysqlbinlog.
- [Slave] Relay Logs:
- Used to store locally in an identical to Master Binary Logs format SQL statements from the Master Binary logs. Usually located in /var/lib/mysql and follow the pattern hostname-relay-bin.NNNNNN. As you may have already guessed, they can also be read with the mysqlbinlog utility, mainly for troubleshooting.
- SQL Thread:
- Runs on the Slave and is responsible for executing the SQL statements that need to be replicated.
- IO Thread:
- Runs on the Slave but also reads from the Master binary logs SQL statements and stores them locally on the Relay binary logs. Together with the SQL Thread performs the whole concept of asynchronous replication.
- Replication Lag:
- the distance between the master and slave. An approximate measure for this is the value of the Seconds_Behind_Master field, shown by the SHOW SLAVE STATUS command.
- Broken Replication:
- If an error occurs, preventing the SQL thread to execute a certain statement (e.g. a DUPLICATE KEY error) the replication SQL thread stops until the issue is resolved or the statement is skipped. This requires manual handling for the majority of occurences.
(Some) Useful Commands and Configuration Parameters
- START SLAVE
- starts the entire replication procedure.
- STOP SLAVE
- stops the entire replication procedure.
- START SLAVE IO_THREAD
- starts the IO thread. It is very useful to keep the IO thread running because the Master server removes the Binary Logs after some amount of time (specified in the master my.cnf file).
- STOP SLAVE IO_THREAD
- Stops the IO thread.
- START SLAVE SQL_THREAD
- Starts the SQL thread. When this starts, SQL statements will start to be executed on the slave. Statements
- SHOW SLAVE STATUS\G
- Shows the status of the replication process. Notice the \G, it is used on a MySQL command-line client to write each column and value in a line (transpose the resultset – sort of speak) thus providing more readable output. \G is extremely useful in general if you have many columns in your resultset. You can then combine it with grep to find what you really need easily or create a BASH script based on the output for example.
- Replication threads do not start automatically when you restart the MySQL server.
- Specify which databases should be replicated. For multiple values you must include multiple entries of this parameter (let’s call this type of parameter “multi-value” so that I do not have to repeat the examples).
replicate-do-db = db1
replicate-do-db = db2
- Reduce the tables being replicated to specific tables. You must specify the database name as well in the known object format:
replicate-do-table = db1.tableA
replicate-do-table = db1.tableD
replicate-do-table = db2.tableC
A common issue if you target replication to specific tables is that you may run into replication errors involving cross-table DML statements. If we are replicating table A from database db1 and not table B the following example SQL statement will break the replication:
INSERT INTO A(A_col1, A_col2) SELECT B_col1, B_col2 FROM B WHERE B_id < 500
- Better read performance. You can easily rotate your connection to the database among many database servers, thus distributing the load.
a = [‘replica1’, ‘replica2’, … , ‘replicaN’]
host = a[rand(0, N – 1)]
This is specifically for read performance, not read & write. If you for example have a website with user interactions you may need to be extra careful (more on this in a short while).
- A database replica enables you to execute there long running queries for reporting purposes. By controlling replication "consistent" database snapshots can be obtained, thus enabling you to extract aggregated data without interfering with the "live" database. This takes load off your actual server and does not create unpleasant locks on tables which will be surely noticed by and finally annoy users.
- Backup against a server crash, hard disk failure etc . Replication is not backup but it is a well established method of having a live copy of your databases ready to use in case something goes terribly wrong on the master hardware. If this happens you just reverse the configuration – the slave becomes master.
You can use this great tool from Percona Toolkit to hold back your slave server in order to help you prevent damage to your data as well: