AWS Database Migration Service: Replication Task Failed
Problem
While using AWS Database Migration Service (DMS) to replicate data from an RDS MySQL database (see my previous post on this topic for more details), I ran into a replication task that would suddenly fail with the following error after working fine for days, or even months:
Could not find first log file name in binary log index file
From that point onwards, the task would consistently fail to restart with the same error.
Investigation
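After a lot of time spent digging into the problem and double-checking that my RDS instance’s configuration parameters matched the recommended values, I found out about the binlog retention hours configuration parameter, which determines how long binary log files are kept on the server before being purged. If a task falls behind, or stays stopped, for longer than that window, the binary log file it needs to resume from no longer exists on the server, which is what the error reports. Quoting from the documentation: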
The binlog retention hours parameter is used to specify the number of hours to retain binary log files. Amazon RDS normally purges a binary log as soon as possible, but the binary log might still be required for replication with a MySQL database external to Amazon RDS. The default value of binlog retention hours is NULL. This default value is interpreted as follows:
For RDS for MySQL, NULL means binary logs are not retained (0 hours).
For Aurora MySQL, NULL means binary logs are cleaned up lazily. Aurora MySQL binary logs might remain in the system for a certain period, usually not longer than a day.
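To see this in practice, it can help to list the binary log files the server still holds and compare them with the file the failing task is asking for. The following is a minimal sketch, assuming your user has the privileges needed to run the standard MySQL statement SHOW BINARY LOGS (typically REPLICATION CLIENT):

```sql
-- List the binary log files currently present on the server.
-- A file that the replication task needs but that is missing from this
-- list has already been purged.
SHOW BINARY LOGS;
```

If the list is very short, or the oldest file in it is newer than the one the task expects, the retention window is the likely culprit.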
Solution
Unfortunately, at the time of writing, this parameter is not visible in the RDS Configuration tab. It can only be managed by calling stored procedures from an SQL client connected to your RDS instance:
call mysql.rds_show_configuration(); -- read current configuration
call mysql.rds_set_configuration('binlog retention hours', 24); -- set new value
Note: The value 24 in the above snippet is arbitrary. A sensible value for your instance depends on your use case and needs. According to the documentation, the maximum value is:
For MySQL DB instances, the maximum binlog retention hours value is 168 (7 days).
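As a rough sketch of the two ends of that range, the same stored procedures can be used to set the maximum retention or to go back to the default. The values below are only illustrations, not recommendations, and keep in mind that retained binary logs take up storage on the instance:

```sql
-- Keep binary logs for the documented maximum of 168 hours (7 days).
call mysql.rds_set_configuration('binlog retention hours', 168);

-- Go back to the default: NULL means RDS for MySQL does not retain binary logs.
call mysql.rds_set_configuration('binlog retention hours', NULL);

-- Read the configuration again to confirm which value is in effect.
call mysql.rds_show_configuration();
```

Whichever value you choose, it should comfortably cover the longest period you expect a replication task to be stopped or lagging.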