
Using IBM Big Replicate to implement an active / standby disaster recovery solution for IBM BigInsights Big SQL

Question & Answer


Question

How do I use IBM BigInsights Big SQL with IBM Big Replicate to provide an active/standby disaster recovery solution?

Answer


This document describes how Big SQL and IBM Big Replicate can be used together to provide an active/standby disaster recovery strategy for Big SQL users.

This strategy supports a multi-cluster configuration with one cluster acting as a typical active Big SQL cluster and one operating as a standby Big SQL cluster (with Big SQL service stopped until needed). In this way, IBM Big Replicate and Big SQL backup and restore work together to ensure all necessary data is replicated and available on each cluster. This setup provides a Big SQL disaster recovery solution that can be utilized in the event of critical failures of the Big SQL Active Cluster. 

Requirements and Prerequisites 

  • Big SQL  
    Big SQL must be installed on both clusters with a consistent configuration (a verification sketch follows this list):
    • Same bigsql_db_path 
    • Same bigsql_data_directories
    • Same number of Big SQL worker nodes
    • Consistent numbering for each Big SQL worker node in the db2nodes.cfg file
  • Big Replicate   
    Big Replicate should be installed and configured as outlined in the Big Replicate documentation.
  • Big SQL High Availability (HA) and Backup / Restore  
    This solution is compatible with Big SQL HA. Big SQL HA can be enabled, and remain enabled, during BIGSQL backup operations on the Big SQL Active Cluster; however, Big SQL HA currently must be disabled during a restore on the Big SQL Standby Cluster. This does not impact the Big SQL with Big Replicate active/standby solution: the Big SQL service on the Big SQL Standby Cluster is stopped during normal operation and is only started during an actual failover or when performing a restore. After an actual failover to the Big SQL Standby Cluster, HA can then be enabled if necessary.
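
    One way to spot-check this consistency is to compare the db2nodes.cfg file on the two primary Head Nodes. The sketch below is illustrative only: it assumes the bigsql instance home is /home/bigsql and that standby-headnode.domain.com is the Big SQL Standby Cluster's primary Head Node; the bigsql_db_path and bigsql_data_directories settings should be compared in each cluster's Big SQL service configuration.

      # Run as the bigsql user on the Big SQL Active Cluster's primary Head Node.
      # Host name and paths are assumptions; adjust for your environment.
      ssh standby-headnode.domain.com cat /home/bigsql/sqllib/db2nodes.cfg | diff /home/bigsql/sqllib/db2nodes.cfg -
      # No output means the worker node count and numbering match on both clusters.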

Solution Architecture
This section outlines the core architecture of the Big SQL active/standby disaster recovery solution.
    Big Replicate Data Replication 
    IBM Big Replicate allows you to migrate Hadoop data across multiple environments, distributions, and hybrid deployments. For more information, refer to the IBM Big Replicate documentation.

    Big SQL Database Replication (Backup and Restore) 
    The process for Big SQL data replication uses a database backup and restore method. Backup and restore operations keep the BIGSQL database data safe and readily available on multiple clusters within the Big SQL active/standby disaster recovery solution.

    With IBM Big SQL and IBM Big Replicate installed and configured appropriately on both clusters, it is then necessary to schedule regular backups of the BIGSQL database on the Big SQL Active Cluster's primary Head Node and to transfer the generated backup images to the Big SQL Standby Cluster's primary Head Node. On the Big SQL Standby Cluster, these backup images are then used either to regularly restore the BIGSQL database, so that the Big SQL Standby Cluster is consistently up-to-date and ready for a failover, or to perform a single restore of the latest backup image only when necessary as part of a user-defined failover procedure.

    The bigsql_bar.py utility is provided to assist with this process, automating all necessary backup and restore operations.   
    Note: An initial offline backup must be taken on the Big SQL Active Cluster, and the offline image restored on the Big SQL Standby Cluster, before proceeding with regular online backups of the BIGSQL database.
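
    As a sketch of that initial one-time sequence (the host name is a placeholder, and the bigsql_bar.py modes and options used here are described later in this document):

      # On the Big SQL Active Cluster's primary Head Node, as the bigsql user:
      # take the initial offline backup and transfer the image to the standby Head Node
      python bigsql_bar.py -m backup --offline -r standby-headnode.domain.com

      # On the Big SQL Standby Cluster's primary Head Node, as the bigsql user:
      # restore the transferred offline image
      python bigsql_bar.py -m restore --offline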

    For more background information on backup and restore, refer to the existing documentation on the BACKUP DATABASE and RESTORE DATABASE commands.

Typical Big SQL Backup & Restore Usage Scenario (as part of Disaster Recovery Solution)
This section describes the typical usage scenario.
Big SQL Active Cluster  
  • Take the necessary initial offline backup  (using bigsql_bar.py) 
  • Perform subsequent online backups regularly (using bigsql_bar.py, possibly as a cron job; see the crontab sketch below)
  • Transfer the backup images to the Big SQL Standby Cluster (using bigsql_bar.py) 
  
Big SQL Standby Cluster 
 
  • Start Big SQL service
  • Restore initial offline backup  (using bigsql_bar.py) 
  • Perform subsequent restores of online backups (using bigsql_bar.py - possibly as a cron job)  
  • Execute HCAT_SYNC_OBJECTS (using bigsql_bar.py, or manually as necessary) 
  • Enable HA as part of failover scenario, if necessary
  • Stop Big SQL service OR start using cluster as Big SQL Active Cluster (failover)

Note: It is the user's responsibility to ensure that any jars containing custom UDFs or SerDes are synchronized between the Big SQL Active Cluster and the Big SQL Standby Cluster as the backup and restore solution does not manage these.
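
For example, the regular online backup and restore steps above could be scheduled with cron. The times, script location, and host name below are illustrative assumptions only:

  # crontab for the bigsql user on the Big SQL Active Cluster's primary Head Node:
  # nightly online backup at 01:00, transferred to the Standby Cluster's primary Head Node
  0 1 * * * python /home/bigsql/bigsql_bar.py -m backup -r standby-headnode.domain.com

  # crontab for the bigsql user on the Big SQL Standby Cluster's primary Head Node:
  # nightly restore of the latest transferred online image at 03:00, skipping the interactive check
  0 3 * * * python /home/bigsql/bigsql_bar.py -m restore -s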

Big SQL Backup & Restore Utility (bigsql_bar.py)  
The bigsql_bar.py utility is provided to assist with the process of taking regular database backups and restoring on a second cluster.  The script works by automating the backup, image transfer and restore operations. 

Requirements and Prerequisites  
  • User 
    The bigsql_bar.py utility must be executed as the configured bigsql service user.  
  • Passwordless SSH 
    To implement a Big SQL active/standby disaster recovery solution, the generated backup images must be regularly transferred from the Big SQL Active Cluster to the Big SQL Standby Cluster.

    To automatically transfer backup images from the Big SQL Active Cluster to the Big SQL Standby Cluster when using the bigsql_bar.py script, passwordless ssh must be configured between the Big SQL Active Cluster's primary Head Node and the Big SQL Standby Cluster's primary Head Node (a setup sketch follows this list).

    If passwordless ssh is not configured, it will be necessary to use your own preferred mechanism, outside of the bigsql_bar.py utility, to transfer backup images to the Big SQL Standby Cluster's primary Head Node.  
  • Where to Run the Big SQL Backup and Restore Utility
    The bigsql_bar.py utility must be executed on the Big SQL Head Node; it can be run from any directory. In cases where Big SQL HA is enabled, the utility must be executed on the Big SQL primary Head Node.
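
    A minimal passwordless ssh setup sketch, assuming the bigsql user on both Head Nodes and standby-headnode.domain.com as a placeholder host name for the Standby Cluster's primary Head Node:

      # Run as the bigsql user on the Big SQL Active Cluster's primary Head Node
      ssh-keygen -t rsa                                   # accept the defaults; leave the passphrase empty
      ssh-copy-id bigsql@standby-headnode.domain.com      # copy the public key to the standby Head Node
      ssh bigsql@standby-headnode.domain.com hostname     # verify: should return the host name with no password prompt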

Usage 
The bigsql_bar.py utility can be executed in one of three modes:   
  • Backup:    Take a backup of the BIGSQL DB  
  • Restore:    Restore BIGSQL DB from an existing backup image 
  • Transfer:   Transfer an existing backup image to a remote machine 
One of these modes must be specified for execution to continue.

IMPORTANT NOTE: Because the Big SQL Backup & Restore Utility was developed to be used as part of the IBM Big SQL with IBM Big Replicate Active/Standby Disaster Recovery Solution, the restore operation (online and offline) starts the Big SQL service before performing the restore and stops the Big SQL service after a successful restore has completed.

Options 
There are a number of options available when executing the bigsql_bar.py utility.  All options except '-m' (mode) are optional and default values will be used if not explicitly specified. 
    The script accepts the following options:     
    -m mode 
    Required - Valid values are: 'Backup', 'Restore' or 'Transfer' 

    -r remote machine 
    Optional - Machine to transfer the backup image to (e.g.: Big SQL Standby Cluster's primary Head Node)

    -d backup directory 
    Optional - Directory for backup images (Default: /var/ibm/bigsql/backups) 

    Note: Ensure there is sufficient space available in the backup directory to store the generated backup images.

    -t timestamp 
    Optional - Timestamp of Backup Image (Default: timestamp for latest available image)  
    This option is valid with restore or transfer modes only. 

    -c, --hcat 
    Optional - Run HCAT_SYNC_OBJECTS after a restore 

    -o, --offline 
    Optional - Force offline option 
    This option is valid with restore, backup or transfer modes. 

    -l, --local 
    Optional - Use local backup images for restore (i.e., images in '<backup directory>/local_online_backups')

    -s, --skipcheck 
    Optional - Skips the interactive check during a restore.  
    Enables restore operations to be run as a cron job by not prompting for user verification when deactivating the BIGSQL database.

    -h, --help 
    Optional - Display the help menu

    Default Values
    Option                 Default
    -m mode                No default (execution will be aborted)
    -r remote machine      No default (image will not be transferred)
    -d backup directory    '/var/ibm/bigsql/backups'
    -t timestamp           The most recent backup available (online / offline)
    -c, --hcat             HCAT_SYNC_OBJECTS will not be executed
    -o, --offline          Online assumed
    -l, --local            Remote backup images used
    -s, --skipcheck        User verification sought when deactivating the BIGSQL database
    -h, --help             N/A

Logging & Troubleshooting 
A complete log of the script execution is located at /var/ibm/bigsql/logs/bigsql_bar.log on the Big SQL primary Head Node (where the script is executed).  All debug, information and error messages are written to this file.   

Backup & Restore Directory Structure  
Backup images will be stored in and fetched from either the default backup directory (/var/ibm/bigsql/backups) or a backup directory specified using the '-d' option. This backup directory is the parent directory for all backup images. When executed, bigsql_bar.py will store offline and online backup images in sub-directories under this parent backup directory, as shown below.

This directory structure is mirrored on both the Big SQL Active Cluster and the Big SQL Standby Cluster.

   <parent backup-directory>  ('/var/ibm/bigsql/backups') 
           <offline_backups>  
           <local_online_backups>  
           <remote_online_backups> 

Note: It is the user's responsibility to ensure that there is sufficient disk space available in the backup directory and to manage any necessary housekeeping of these directories.
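
For example, standard operating system tools can be used to monitor the backup directory (the default path is shown; use your '-d' directory if different):

  # Free space on the file system holding the parent backup directory
  df -h /var/ibm/bigsql/backups
  # Size of the offline and online image sub-directories, for housekeeping decisions
  du -sh /var/ibm/bigsql/backups/*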

Backup Directory for BIGSQL DB Logs  
Any existing database logs are backed-up before a restore is performed.  The directory used for this is: <backup directory>/logstream_backup/  (by default: /var/ibm/bigsql/backups/logstream_backup/)

bigsql_bar.py Utility Location 
Currently the bigsql_bar.py utility is attached to this tech note and can be downloaded directly.

bigsql_bar.py Utility and Timestamps  
When a BIGSQL backup is performed, the output will include something like the following:

Part  Result  
----  -------------------------------------------------------------- 
0000  DB20000I  The BACKUP DATABASE command completed successfully.  
  
Backup successful. The timestamp for this backup image is : 20161011093130 


This output includes the timestamp associated with the backup image. This is also part of the backup image file name. For Example:  BIGSQL.0.bigsql.DBPART000.20161011093130.001  
  • When using bigsql_bar.py restore mode, you can specify the timestamp of a particular backup image to restore.  
  • When using bigsql_bar.py transfer mode, you can specify the timestamp of a particular backup image to transfer.  

If a timestamp is not specified during a restore or transfer operation, the most recent backup image available in the backup directory (e.g.: '/var/ibm/bigsql/backups/...' by default) is used.
By default this will be an online backup image; however, an offline image can be used by specifying the '-o, --offline' option.


HCAT_SYNC_OBJECTS
The HCAT_SYNC_OBJECTS stored procedure imports the definitions of Hive objects into the local database catalogs, making the objects available for use within queries. For more details on this, see HCAT_SYNC_OBJECTS.

The Backup and Restore utility, bigsql_bar.py, provides the option to automatically run HCAT_SYNC_OBJECTS after a successful restore using the '-c, --hcat' option.
While this is off by default, it is important to note that if there have been any DDL changes since the most recent available backup, HCAT_SYNC_OBJECTS should be executed to ensure a fully up-to-date cluster (necessary as part of a failover). This can be performed as part of bigsql_bar.py execution using the '-c, --hcat' option or by running HCAT_SYNC_OBJECTS manually after a restore has completed, as shown in the sketch below.
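
A minimal sketch of running HCAT_SYNC_OBJECTS manually after a restore, assuming it is run as the bigsql user on the Big SQL primary Head Node and that MYSCHEMA is an illustrative schema name; refer to the HCAT_SYNC_OBJECTS documentation referenced above for the full argument list:

  # Connect to the BIGSQL database and import the Hive definitions for all objects
  # in the (illustrative) schema MYSCHEMA into the local catalogs
  db2 connect to bigsql
  db2 "CALL SYSHADOOP.HCAT_SYNC_OBJECTS('MYSCHEMA', '.*', 'a', 'REPLACE', 'CONTINUE')"
  db2 connect reset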

Special consideration must be given to HCAT_SYNC_OBJECTS if impersonation is enabled for Big SQL. The HCAT_SYNC_OBJECTS execution as part of the bigsql_bar.py script does not currently support Big SQL impersonation. For more details on the use of HCAT_SYNC_OBJECTS with impersonation, see Impersonation in Big SQL.

Log File Management
In order to use an active/standby disaster recovery solution for Big SQL, the database must be made recoverable and database logging must be updated to use archive logging. That is, the logarchmeth1 database configuration parameter is changed to LOGRETAIN during the initial offline backup. As database activity occurs, log files may accumulate in the directory pointed to by the log file path database configuration parameter (usually set to: /var/ibm/bigsql/database/bigdb/bigsql/NODE0000/SQL00001/LOGSTREAM0000/), and it may be necessary to occasionally remove the archived log files that are no longer needed.
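
For example, the current logging configuration for the BIGSQL database can be inspected as the bigsql user:

  # Show the log archive method and the active log path
  db2 get db cfg for bigsql | grep -iE "logarchmeth1|path to log files"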

In order to remove the archived log files not needed for Big SQL disaster recovery, you can do the following:

1. Discover the value of the "First active log file" database configuration parameter. Log files with a sequence number lower than this file's can be removed. For example:


  db2 get db cfg for bigsql | grep -i "first active log file"
  First active log file = S0000245.LOG

This returns the name of the first active log file, S0000245.LOG in the example above.

2. Issue the 'db2 prune logfile...' command as follows:

  db2 prune logfile prior to <first active log file name> 
 

For the example above:


   db2 prune logfile prior to S0000245.LOG
   DB20000I  The PRUNE command completed successfully.

Note: The prune logfile command will NOT delete any active log files, even if the file name passed to it is more recent than the first active log file.
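
As a sketch, the two steps above can be combined into a small script (run as the bigsql user on the Big SQL primary Head Node) so that pruning can be scripted or scheduled:

  # Look up the first active log file and prune archived log files older than it
  FIRST_ACTIVE=$(db2 get db cfg for bigsql | grep -i "first active log file" | awk '{print $NF}')
  db2 connect to bigsql
  db2 "prune logfile prior to $FIRST_ACTIVE"
  db2 connect reset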


Example invocation of the bigsql_bar.py Utility 
Backup  
  • To perform an offline backup using all defaults, invoke bigsql_bar.py in the following way: 
    • python bigsql_bar.py -m backup --offline  
      OR
      python bigsql_bar.py -m backup -o 
     
  • To perform an online backup using all defaults and without transferring the backup image to another cluster, invoke bigsql_bar.py in the following way: 
    • python bigsql_bar.py -m backup  
     
  • To perform an online backup, specifying a user-defined backup directory to use while using all other defaults, invoke bigsql_bar.py in the following way:

    • python bigsql_bar.py -m backup -d /home/bigsql/my_backups  
     
  • To perform an offline backup using all defaults and specifying a remote machine to transfer the backup image to (e.g.: primary Head Node of Big SQL Standby Cluster), invoke bigsql_bar.py in the following way:

    • python bigsql_bar.py  -m backup --offline -r <machine.domain.com>  
      OR
      python bigsql_bar.py  -m backup -o -r <machine.domain.com>  
     
Transfer
  • To transfer the latest available offline image to the remote machine specified,  invoke bigsql_bar.py in the following way: 
    • python bigsql_bar.py -m transfer -r <machine.domain.com>  --offline  OR  
      python bigsql_bar.py -m transfer -r <machine.domain.com>  -o    
     
  • To transfer the latest available online image to the remote machine specified,  invoke bigsql_bar.py in the following way: 
    • python bigsql_bar.py -m transfer -r <machine.domain.com>  
     
  • To transfer the online image associated with the timestamp specified to the remote machine specified,  invoke the utility in the following way: 
    • python bigsql_bar.py -m transfer -r machine.domain.com -t 20161011093130  
     
  • To transfer the latest available offline image in the specified backup directory (/home/bigsql/mybackups22 in this case) to the remote machine specified, invoke the bigsql_bar.py utility in the following way:

    • python bigsql_bar.py -m transfer -r machine.domain.com  --offline  -d /home/bigsql/mybackups22  
     
Restore
  • To perform a restore  of the latest available offline backup image using all defaults, invoke the bigsql_bar.py utility in the following way: 
    • python bigsql_bar.py -m restore --offline 
      OR
      python bigsql_bar.py -m restore -o 
     
  • To perform a restore  of the latest available online backup image using all defaults, invoke the bigsql_bar.py utility in the following way: 

    • python bigsql_bar.py -m restore 
     
  • To perform a restore  of the latest available online backup image using all defaults and skipping the user confirmation when deactivating the BIGSQL database, invoke the utility in the following way: 
    • python bigsql_bar.py -m restore --skipcheck  
      OR
      python bigsql_bar.py -m restore -s 
     
  • To perform a restore of the latest online backup image available in the local backup subdirectory ('/var/ibm/bigsql/backups/local_online_backups/' by default ) using all defaults, invoke the bigsql_bar.py utility in the following way: 
    • python bigsql_bar.py -m restore --local  
      OR
      python bigsql_bar.py -m restore -l 
     
  • To perform a restore of the latest available online backup image using all defaults and executing HCAT_SYNC_OBJECTS after, invoke bigsql_bar.py in the following way: 
    • python bigsql_bar.py -m restore --hcat  
      OR
      python bigsql_bar.py -m restore -c  
     
  • To perform a restore of the online backup image associated with a specified timestamp and from a specified parent backup-directory (/home/bigsql/my_backups in this case) and execute HCAT_SYNC_OBJECTS after the restore completes, invoke the bigsql_bar.py utility in the following way: 
    • python bigsql_bar.py -m restore -t 20161011093130 -d /home/bigsql/my_backups --hcat

    [{"Product":{"code":"SSCRJT","label":"IBM Db2 Big SQL"},"Business Unit":{"code":"BU059","label":"IBM Software w\/o TPS"},"Component":"Big SQL","Platform":[{"code":"PF016","label":"Linux"}],"Version":"4.2.0","Edition":"Enterprise Edition","Line of Business":{"code":"LOB10","label":"Data and AI"}},{"Product":{"code":"SSVPQ7","label":"IBM Big Replicate"},"Business Unit":{"code":"BU059","label":"IBM Software w\/o TPS"},"Component":" ","Platform":[{"code":"","label":""}],"Version":"","Edition":"","Line of Business":{"code":"LOB10","label":"Data and AI"}},{"Product":{"code":"SSCRJT","label":"IBM Db2 Big SQL"},"Business Unit":{"code":"BU059","label":"IBM Software w\/o TPS"},"Component":" ","Platform":[{"code":"","label":""}],"Version":"","Edition":"","Line of Business":{"code":"LOB10","label":"Data and AI"}}]

Document Information

Modified date:
18 July 2020

UID
swg21994020