For years, organizations using DB2 for LUW addressed their DR needs via a mechanism known as log shipping. The concept is simple: First, you back up the DB2 database at the primary site and restore it on a DB2 system at the DR site so that two copies of the database exist. Following the database backup operation, data changes (updates, inserts, and deletes) are processed at the primary site and recorded in the transaction log of the primary-site DB2 system. When this transaction log is periodically backed up, a copy of the log backup is sent (electronically) to the DR site. The DB2 system at the DR site processes the log file and applies the data changes it contains, bringing the database copy at that site that much closer to currency with respect to the database at the primary site.
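To make the mechanics concrete, here is a toy in-memory sketch of the log-shipping idea. It is not DB2 code and uses no DB2 API: the `Database` class, the `ship_logs` function, and the idea of measuring a log file in "changes per file" are all hypothetical simplifications chosen for illustration. Changes that sit in a full log file are shipped and replayed at the DR site; changes still in the current, unfilled log file are not.

```python
from dataclasses import dataclass, field


@dataclass
class Database:
    """Hypothetical stand-in for a database: a dict of key -> value."""
    rows: dict = field(default_factory=dict)

    def apply(self, change):
        op, key, value = change
        if op == "delete":
            self.rows.pop(key, None)
        else:  # "insert" and "update" both set the value
            self.rows[key] = value


def ship_logs(changes, changes_per_log_file):
    """Split the change stream into full log files (shipped) and a remainder (not shipped)."""
    full = len(changes) // changes_per_log_file * changes_per_log_file
    shipped = [changes[i:i + changes_per_log_file]
               for i in range(0, full, changes_per_log_file)]
    return shipped, changes[full:]


changes = [
    ("insert", 1, "a"),
    ("insert", 2, "b"),
    ("update", 1, "c"),
    ("delete", 2, None),
    ("insert", 3, "d"),
]

primary, standby = Database(), Database()
for change in changes:
    primary.apply(change)  # the primary site processes every change

shipped, unshipped = ship_logs(changes, changes_per_log_file=2)
for log_file in shipped:   # the DR site replays each log file it receives
    for change in log_file:
        standby.apply(change)

print(primary.rows)  # {1: 'c', 3: 'd'}
print(standby.rows)  # {1: 'c'}
print(unshipped)     # [('insert', 3, 'd')]
```

The single unshipped change is what would be lost if the primary site failed at this moment, which is why the frequency of log backups matters so much for RPO.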
Log shipping can deliver RTO and RPO numbers that are satisfactory for many organizations. On the RTO side, completing the roll-forward recovery operation necessary to make the DR-site DB2 system ready to process application requests for database access will often take only seconds (assuming that log files sent to the DR site are processed as they are received, so there's no backlog of files to be processed). Even if the disaster strikes just after a really large log file has been sent to the DR site (for example, containing 15 minutes of data changes associated with a high-volume OLTP application), applying those changes at the DR site probably won't take more than one or two minutes.
The amount of data lost as a result of the disaster event (the focus of your efforts aimed at achieving an RPO) depends largely on the frequency with which the transaction log at the primary site is backed up. Note that the files typically backed up at the primary site and sent to the DR site are inactive log files, because backing up active log files can negatively impact system performance. Log files become inactive when DB2 marks them for archive because they're full or when they're truncated via the DB2 ARCHIVE LOG command. Repeatedly issuing the ARCHIVE LOG command isn't good for system performance, so organizations using log shipping often control log-backup frequency through log file sizing: smaller log files fill up and become inactive more often, and larger log files less often.
Log shipping is a good thing, but it probably isn't the best DR solution for your organization if your aim is an RPO of less than one minute. It isn't practical to size log files to fill up and become inactive every few seconds.
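The sizing trade-off comes down to simple arithmetic, sketched below. The one DB2-specific fact used is that the LOGFILSIZ configuration parameter is expressed in 4 KB pages; the function names and the sample log write rate of 50 KB per second are assumptions made up for illustration, and the calculation ignores log record overhead and the time needed to transmit the file.

```python
import math

PAGE_BYTES = 4096  # LOGFILSIZ is expressed in 4 KB pages


def seconds_to_fill(logfilsiz_pages, log_bytes_per_sec):
    """Approximate time for one log file to fill at a steady log write rate."""
    return logfilsiz_pages * PAGE_BYTES / log_bytes_per_sec


def pages_for_target(target_seconds, log_bytes_per_sec):
    """Approximate LOGFILSIZ (in pages) that fills once per target interval."""
    return math.ceil(target_seconds * log_bytes_per_sec / PAGE_BYTES)


rate = 50 * 1024  # assumed workload: 50 KB of log data written per second

print(seconds_to_fill(1000, rate))  # 80.0 seconds per log file
print(pages_for_target(60, rate))   # 750 pages for a one-minute interval
print(pages_for_target(5, rate))    # 63 pages for a five-second interval
```

At this assumed rate, a five-second interval would mean log files of only about 63 pages (roughly 250 KB) being filled, archived, and shipped a dozen times a minute, which illustrates why very small log files aren't a practical route to a sub-minute RPO.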
Fortunately, DB2 for LUW version 8.2 provided a new disaster recovery solution called High Availability Disaster Recovery (HADR). Conceptually, HADR is log shipping taken to the theoretical extreme, as though the transaction log at the primary site were backed up and transmitted to the DR site for each record written to the log. In my previous column, I described HADR operating in synchronous or near-synchronous mode as an excellent solution for minimizing downtime due to localized failure events. When running in asynchronous mode, HADR can keep a DR copy of a DB2 database within seconds of currency, even when the DR site is hundreds of miles away from the primary site. In addition to minimizing data loss, HADR enables an organization to achieve an aggressive RTO: The DR-site DB2 system can be made available for application processing in seconds.