Question & Answer
Question
How do I use IBM BigInsights Big SQL with IBM Big Replicate to provide an active/standby disaster recovery solution?
Answer
This document describes how Big SQL and IBM Big Replicate can be used together to provide an active/standby disaster recovery strategy for Big SQL users.
This strategy supports a multi-cluster configuration with one cluster acting as a typical active Big SQL cluster and one operating as a standby Big SQL cluster (with Big SQL service stopped until needed). In this way, IBM Big Replicate and Big SQL backup and restore work together to ensure all necessary data is replicated and available on each cluster. This setup provides a Big SQL disaster recovery solution that can be utilized in the event of critical failures of the Big SQL Active Cluster.
Requirements and Prerequisites
- Big SQL
Big SQL must be installed on both clusters with a consistent configuration - Same bigsql_db_path
- Same bigsql_data_directories
- Same number of Big SQL worker nodes
- Consistent numbering for each Big SQL worker node in the db2nodes.cfg file
- Big Replicate
Big Replicate should be installed and configured as outlined in the Big Replicate documentation. - Big SQL High Availability (HA) and Backup / Restore
This solution is compatible with Big SQL HA. Big SQL HA can be enabled, and remain enabled during BIGSQL backup operations (Big SQL Active Cluster), however, currently Big SQL HA must be disabled during a restore ( Big SQL Standby Cluster). This does not impact the current Big SQL with Big Replicate active/standby solution. The Big SQL service on the Big SQL Standby Cluster will be stopped during normal operation and only started during an actual failover situation or when performing a restore. After an actual failover to the Big SQL Standby Cluster, HA can then be enabled if necessary.
Solution Architecture
This section will outline the core architecture of the Big SQL active/standby disaster recovery solution.
- Big Replicate Data Replication
IBM Big Replicate allows you to migrate Hadoop data across multiple environments, distributions, and hybrid deployments. For more information on IBM Big Replicate refer to IBM Big Replicate.
Big SQL Database Replication (Backup and Restore)
The process for Big SQL data replication utilises a database backup and restore method. Here, backup and restore operations are used to keep the BIGSQL DB data safe and readily available on multiple clusters within our Big SQL active/standby disaster recovery solution.
With IBM Big SQL and IBM Big Replicate installed and configured appropriately on both clusters, it is then necessary to schedule regular backups of the BIGSQL database on the Big SQL Active Cluster's primary Head Node and transfer the generated backup images to the Big SQL Standby Cluster's primary Head Node. On the Big SQL Standby Cluster, these backup images are then used to either regularly restore the BIGSQL database, so that the Big SQL Standby Cluster is consistently up-to-date and ready for a failover, or to perform a single restore of the latest backup image only when necessary as part of a user defined failover procedure.
The bigsql_bar.py utility is provided to assist with this process, automating all necessary backup and restore operations.
- Note: An initial offline backup and restore of the offline image must be performed before proceeding with regular online backups of the BIGSQL database.
For more background information on backup and restore refer to the existing documentation on BACKUP DATABASE command and RESTORE DATABASE command
Typical Big SQL Backup & Restore Usage Scenario (as part of Disaster Recovery Solution)
This section describes the typical usage scenario.
Big SQL Active Cluster
- Take the necessary initial offline backup (using bigsql_bar.py)
- Perform subsequent online backups regularly (using bigsql_bar.py - possibly as a cron job)
- Transfer the backup images to the Big SQL Standby Cluster (using bigsql_bar.py)
Big SQL Standby Cluster
- Start Big SQL service
- Restore initial offline backup (using bigsql_bar.py)
- Perform subsequent restores of online backups (using bigsql_bar.py - possibly as a cron job)
- Execute HCAT_SYNC_OBJECTS (using bigsql_bar.py, or manually as necessary)
- Enable HA as part of failover scenario, if necessary
- Stop Big SQL service OR start using cluster as Big SQL Active Cluster (failover)
Note: It is the user's responsibility to ensure that any jars containing custom UDFs or SerDes are synchronized between the Big SQL Active Cluster and the Big SQL Standby Cluster as the backup and restore solution does not manage these.
Big SQL Backup & Restore Utility (bigsql_bar.py)
The bigsql_bar.py utility is provided to assist with the process of taking regular database backups and restoring on a second cluster. The script works by automating the backup, image transfer and restore operations.
Requirements and Prerequisites
- User
The bigsql_bar.py utility must be executed as the configured bigsql service user.
- Passwordless SSH
To implement a Big SQL active/standby disaster recovery solution, the generated backup images must be regularly transferred from the Big SQL Active Cluster to the Big SQL Standby Cluster.
To automatically transfer backup images from the Big SQL Active Cluster to the Big SQL Standby Cluster when using the bigsql_bar.pyscript, passwordless ssh must be configured between the Big SQL Active Cluster's primary Head Node and the Big SQL Standby Cluster's primary Head Node.
If passwordless ssh is not configured, it will be necessary to use your own preferred mechanism, outside of the bigsql_bar.py utility, to transfer backup images to the Big SQL Standby Cluster's primary Head Node.
- Where to Run the Big SQL Backup and Restore Utility
The bigsql_bar.py utility must be executed on the Big SQL Head Node from any directory. In cases where Big SQL HA is enabled, the utility must be executed on the Big SQL primary Head Node.
Usage
The bigsql_bar.py utility can be executed in one of three modes:
- Backup: Take a backup of the BIGSQL DB
- Restore: Restore BIGSQL DB from an existing backup image
- Transfer: Transfer an existing backup image to a remote machine
IMPORTANT NOTE: As the Big SQL Backup & Restore Utility was developed to be used as part of the IBM Big SQL with IBM Big Replicate Active /Standby Disaster Recovery Solution, the restore operation (online and offline) will start the Big SQL service before performing the restore and stop the Big SQL service after a successful restore has completed.
Options
There are a number of options available when executing the bigsql_bar.py utility. All options except '-m' (mode) are optional and default values will be used if not explicitly specified.
- The script accepts the following options:
-m mode
Required - Valid values are: 'Backup', 'Restore' or 'Transfer'
-r remote machine
Optional - Machine to transfer backup image to (e.g.: Big SQL Active Cluster's Head Node)
-d backup directory
Optional - Directory for backup images (Default: /var/ibm/bigsql/backups)
Note: Ensure there is sufficient space available in the backup directory to store the generated backup images.
-t timestamp
Optional - Timestamp of Backup Image (Default: timestamp for latest available image)
This option is valid with restore or transfer modes only.
-c, --hcat
Optional - Run HCAT_SYNC_OBJECTS after a restore
-o, --offline
Optional - Force offline option
This option is valid with restore, backup or transfer modes.
-l, --local
Optional - Use local backup images for restore (ie. Images in '<backup directory>/local_online_backups')
-s, --skipcheck
Optional - Skips the interactive check during a restore.
Enables restore operations to be run as a cron-job by not looking for user verification when deactivating the BIGSQL database.
-h, --help
Optional - Display the help menu
- Default Values
Option | Default |
-m mode | No Default (execution will be aborted) |
-r remote machine | No Default (image will not be transferred) |
-d backup directory | '/var/ibm/bigsql/backups' |
-t timestamp | The most recent backup available (online / offline) |
-c, --hcat | HCAT_SYNC_OBJECTS will not be executed |
-o, --offline | Online assumed |
-l, --local | Remote backup images used |
-s, --skipcheck | User verification sought when deactivating the BIGSQL database |
-h, --help | NA |
Logging & Troubleshooting
A complete log of the script execution is located at /var/ibm/bigsql/logs/bigsql_bar.log on the Big SQL primary Head Node (where the script is executed). All debug, information and error messages are written to this file.
Backup & Restore Directory Structure
Backup images will be stored in and fetched from either the default backup-directory (/var/ibm/bigsql/backups) or a backup-directory specified using the '-d' option. This backup directory is the parent directory for all backup images. When executed, bigsql_bar.py will store offline and online backup images in sub-directories under this parent backup-directory, as shown below.
This directory structure is mirrored on both the primary and the standby clusters.
<parent backup-directory> ('/var/ibm/bigsql/backups') <offline_backups> <local_online_backups> <remote_online_backups> |
Note: It is the user's responsibility to ensure that there is sufficient diskspace available in the backup-directory and to manage any necessary housekeeping of these directories.
Backup Directory for BIGSQL DB Logs
Any existing database logs are backed-up before a restore is performed. The directory used for this is: <backup directory>/logstream_backup/ (by default: /var/ibm/bigsql/backups/logstream_backup/)
bigsql_bar.py Utility Location
Currently the bigsql_bar.py utility is attached to this tech note and can be downloaded directly.
bigsql_bar.py Utility and Timestamps
When a BIGSQL backup is performed, the output will include something like the following:
Part Result ---- -------------------------------------------------------------- 0000 DB20000I The BACKUP DATABASE command completed successfully. Backup successful. The timestamp for this backup image is : 20161011093130 |
This output includes the timestamp associated with the backup image. This is also part of the backup image file name. For Example: BIGSQL.0.bigsql.DBPART000.20161011093130.001
- When using bigsql_bar.py restore mode, you can specify the timestamp of a particular backup image to restore.
- When using bigsql_bar.py transfer mode, you can specify the timestamp of a particular backup image to transfer.
If a timestamp is not specified during a restore or transfer operation, we default to the most recent backup image available in the backup-directory (e.g.: '/var/ibm/bigsql/backups/...' by default).
By default this will be an online backup image, however an offline image can be used by specifying the '-o, --offline' option.
HCAT_SYNC_OBJECTS
The HCAT_SYNC_OBJECTS stored procedure imports the definition of the Hive objects into the local database catalogs. This action makes the objects available for use within queries. For more details on this see HCAT_SYNC_OBJECTS.
The Backup and Restore utility, bigsql_bar.py, provides the option to automatically run HCAT_SYNC_OBJECTS after a successful restore using the '-c, --hcat ' option.
While this is off by default it's important to note that if there have been any DDL changes since the most recent available backup, HCAT_SYNC_OBJECTS should to be executed to ensure a fully up-to-date cluster (necessary as part of a failover). This can be performed as part of bigsql_bar.py execution using the '-c, --hcat' option or by running HCAT_SYNC_OBJECTS manually after a restore has completed.
Special consideration must be given to HCAT_SYNC_OBJECTS if impersonation is enabled for Big SQL. The HCAT_SYNC_OBJECTS execution as part of the bigsql_bar.py script does not currently support Big SQL impersonation. For more details on the use of HCAT_SYNC_OBJECTS with impersonation, see Impersonation in Big SQL.
Log File Management
In order to use an active / standby disaster recovery solution for Big SQL, the database must be made recoverable and database logging must be updated to use archive logging. That is, the logarchmeth1 database configuration parameter is changed to logretain during the initial offline backup. As database activity occurs, log files may accumulate and build up in the directory pointed to by the log file path database configuration parameter (usually set to: /var/ibm/bigsql/database/bigdb/bigsql/NODE0000/SQL00001/LOGSTREAM0000/) and it may be necessary to occasionally remove the archived log files that are no longer needed.
In order to remove the archived log files not needed for Big SQL disaster recovery, you can do the following:
1. Discover the value of the "First active log file" database configuration parameter. Files with a number lower than this file can be removed. For example:
db2 get db cfg for bigsql | grep -i "first active log file" First active log file = S0000245.LOG |
This returns the name of the first active log file. S0000245.LOG in our example above
2. Issue the 'db2 prune logfile...' command as follows:
db2 prune logfile prior to <first active log file name> |
For our Example:
db2 prune logfile prior to S0000245.LOG DB20000I The PRUNE command completed successfully. |
Note: The prune logfile command will NOT delete any active files even if the file name passed to it is more recent than the correct file name.
Example invocation of the bigsql_bar.py Utility
Backup
- python bigsql_bar.py -m backup --offline
OR
python bigsql_bar.py -m backup -o
- python bigsql_bar.py -m backup
python bigsql_bar.py -m backup -d /home/bigsql/my_backups
python bigsql_bar.py -m backup --offline -r <machine.domain.com>
OR
python bigsql_bar.py -m backup -o -r <machine.domain.com>
Transfer
- python bigsql_bar.py -m transfer -r <machine.domain.com> --offline OR
python bigsql_bar.py -m transfer -r <machine.domain.com> -o
- python bigsql_bar.py -m transfer -r <machine.domain.com>
- python bigsql_bar.py -m transfer -r machine.domain.com -t 20161011093130
python bigsql_bar.py -m transfer -r machine.domain.com --offline -d /home/bigsql/mybackups22
Restore
- python bigsql_bar.py -m restore --offline
OR
python bigsql_bar.py -m restore -o
python bigsql_bar.py -m restore
- python bigsql_bar.py -m restore --skipcheck
OR
python bigsql_bar.py -m restore -s
- python bigsql_bar.py -m restore --local
OR
python bigsql_bar.py -m restore -l
- python bigsql_bar.py -m restore --hcat
OR
python bigsql_bar.py -m restore -c
- python bigsql_bar.py -m restore -t 20161011093130 -d /home/bigsql/my_backups --hcat
Was this topic helpful?
Document Information
Modified date:
18 July 2020
UID
swg21994020