IBM Support

Requirements for installing WebSphere Application Server on systems using high-availability failover filesystems

Question & Answer


Question

What does the WebSphere Application Server product require for running in an environment with high-availability failover?

Cause

What is an "external high-availability" solution?
WebSphere Application Server can be installed on remote storage, and mounted by a networked filesystem (NFS) as long as the environment meets certain requirements. This allows the product to be configured in a manner where a backup system mounts the product's files and takes over services if the primary hosting server experiences an outage.

WebSphere Application Server Network Deployment has built-in high-availability capabilities, such as the clustering features and WebSphere Plugin load-balancing capabilities. This is referred to as "cluster fail-over". This allows a group of servers to continue running applications even if a smaller number of servers suffer an outage. But, sometimes clients require an additional layer of protection from failures beyond what cluster fail-over can provide. They require an external high-availability solution.

In a typical high-availability configuration, clients implement a solution where WebSphere Application Server normally runs on a "primary" system. If the primary system experiences a hardware or network failure, then the client's high-availability software will activate a "standby" system (sometimes called a "warm standby", "hot standby", or "secondary system"). The standby system will take over running the application servers. This process is called "failover", where the operations on the primary system are said to "fail over" to running on the standby system.

The set of systems involved in the high-availability solution includes the primary system and one or more standby systems. These systems are collectively referred to as the "high-availability cluster".

In a high-availability configuration, the product files are located on a high-availability networked filesystem, located separately from the primary and standby systems. This filesystem can potentially be shared among any system in the high-availability cluster. During the failover process, application server processes on the primary system are stopped, and the primary system is eventually disconnected from the networked filesystem where the product's files are located. The standby system will connect to the networked filesystem (the same product files that the primary system was using) and start the application servers to resume service to the applications affected by the primary system's outage.

In addition, requests (web traffic) that were being routed to the primary system are re-routed to the standby system. One way to accomplish this is to have high-availability systems in the cluster all share the same "virtual address". The high-availability solution determines which system actually receives traffic directed to the virtual address. This is achieved through a combination of software and/or high-availability network appliances.

These high-availability solutions use configuration techniques and software which are separate from the WebSphere Application Server product. Some solutions are available and documented by IBM (such as the WebSphere Application Server Network Deployment V6: High Availability Solutions Redbooks), and others are available from third-party vendors. In either case, the solutions are implemented externally from WebSphere Application Server. This technote refers to these solutions as "external high-availability solutions". The solutions are not integrated with the product in the way that the clustering features or WebSphere Plugin load-balancing features are.

How are these solutions supported?
Since external high-availability solutions are not integrated with the WebSphere Application Server product, IBM WebSphere Application Server Support cannot provide specific advice or instructions for using the product with every external high-availability solution that is compatible with the product. Instead, IBM Support can provide guidelines that explain which solutions are acceptable to use with WebSphere Application Server.

The purpose of this technote is to provide guidelines for the process of installing WebSphere Application Server and configuring profiles on highly available systems.

Answer

Filesystem requirements
There are some requirements for installing and running WebSphere Application Server on a networked filesystem. In general, the networked filesystem must be configured to be consistently available, with appropriate file permissions and supporting processes (such as lock daemon processes and port-mapping processes). The system administrator must be entirely responsible for the configuration and troubleshooting of the networked filesystem. For more details about WebSphere Application Server support of networked filesystems, refer to the technote General guidelines for installation of WebSphere Application Server V6.1 and V7.0 on Networked Filesystems (NFS), Mapped Network Drives, or Storage Area Network (SAN) resources.
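
As a rough illustration of the availability and permission requirement above, the following Python sketch probes a mounted directory before the product is installed or started. The function name probe_shared_filesystem and the example path are hypothetical and are not part of WebSphere Application Server. The check only confirms that the directory exists and accepts a write; it does not validate lock daemons or port-mapping processes.

```python
import os
import tempfile


def probe_shared_filesystem(path):
    """Return a list of problems found with a shared directory (empty if none)."""
    problems = []
    if not os.path.isdir(path):
        problems.append("not a directory or not mounted: " + path)
        return problems
    if not os.access(path, os.R_OK | os.X_OK):
        problems.append("directory is not readable: " + path)
    try:
        # Create and remove a scratch file to confirm that writes really succeed.
        with tempfile.NamedTemporaryFile(dir=path, prefix=".probe-"):
            pass
    except OSError as exc:
        problems.append("write test failed: " + str(exc))
    return problems


if __name__ == "__main__":
    # Hypothetical mount point; substitute the real location of the product files.
    for problem in probe_shared_filesystem("/mnt/was_shared"):
        print(problem)
```

An empty result means only that the basic checks passed at that moment; the system administrator remains responsible for the ongoing health of the networked filesystem.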


High-availability solution requirements
The system administrator must be familiar with the high-availability solution that WebSphere Application Server is installed into. The system administrator will be responsible for configuring the high-availability software and ensuring that it is capable of starting and stopping application servers when necessary. The system administrator will need to provide troubleshooting expertise if the high-availability solution encounters issues.

An IBM Redbooks publication is available which can assist with these scenarios. Refer to the Redbooks publication WebSphere Application Server Network Deployment V6: High Availability Solutions.


WebSphere Application Server installation requirements

  • Modification to wasprofile.properties, profileRegistry.xml, NIFRegistry.properties, setupCmdLine.sh, fsdb files, or variables.xml:
    Do not edit wasprofile.properties, profileRegistry.xml, NIFRegistry.properties, setupCmdLine.sh (or .bat), variables.xml, or any files in the WAS_HOME/properties/fsdb directory to compensate for issues with file locations in a high-availability cluster. IBM Support cannot provide assistance for installing, creating/augmenting profiles, or updating the product with fix packs or patches if any of those files have been modified in an effort to configure a high-availability solution.

    Note: Some publications concerning WebSphere Application Server and high-availability solutions advocate modifications to those files. Making modifications to those files can simplify the process of installing the product on a large number of systems and facilitate cloning configuration data between multiple systems. But, editing those files also causes significant untested changes to the behavior of the product's installation and profile tools, which negatively impacts IBM Support's ability to provide service and diagnose issues. Any solutions published by IBM which advocate those modifications make it clear that the solution is not fully supported through IBM Support. However, publications by third-parties might not make this clear. If you are following instructions provided by a third-party, verify whether those instructions require editing those files.

  • Use consistent users to install the product initially and update it with fix packs:
    The user who installed the WebSphere Application Server product must also be the user who applies maintenance updates and feature packs to the product. In other words, make sure you use the same user ID to run the product installer, Update Installer, and/or Installation Manager for the entire lifetime of the product. (This is a standard requirement which applies to non-high-availability environments as well.)

    It is possible to run application servers as a different user than the one that installed and updated the product. Configuring the file permissions and application servers to allow this is outside the scope of this article; refer to articles such as Managing profiles for non-root users and the "Run As User" section of Process execution settings.

  • Execute the installer and update utilities on one system (the "primary" system):
    If the product is only installed once in a cluster of high-availability systems (which means that the standby systems use that same install), then the product's installer must be executed on the primary system. When the product is updated with maintenance packs or feature packs, then the Update Installer or Installation Manager must also be executed on the primary system.

  • Be sure to back up the installer registry files such as ".nifregistry" or Installation Factory files:
    The V7.0 product installer maintains a file named ".nifregistry". All product installations by the same user share information in this file. This file can be created in a specific directory if the installer is run as the root user, or it can be created in a subdirectory of the user's home directory if the installer is run as a non-root user. Refer to the Installation MustGather for information about where to find the .nifregistry file. Be sure to include this file as part of the high-availability cluster's backup plan.

    The Installation Manager utility maintains a set of data called the "IM Agent Data", or "IM_DATA". It is absolutely essential to maintain backups of this data. It is helpful to maintain historical backups of this data up to two years in the past. For information about where this data is located by default, refer to the Information Center for the current version of Installation Manager. The relevant article for Installation Manager V1.5.x is here.

    Note: The application servers do not need access to this data when they are running. In fact, the application server runtime is unaware that the ".nifregistry" file or IM_DATA exists at all. So, this data does not necessarily need to be present on a failover system. However, the profile creation process, Update Installer, Feature Pack installer, and other installation-related utilities require this data to remain intact throughout the lifetime of the product. If this data is lost, it might no longer be possible to apply feature packs or other updates to the product. You may need to reinstall the entire set of products in order to recover from the loss of this data. To make sure this data is never lost, include it in the backup plan for the product.

  • You can clone Virtual Machines with the product installed, but do not clone profiles:

    Editor: update section to reference new guidelines on cloning; i.e., standalone profiles can be cloned when nodename and hostname are changed (along with other restrictions such as cellname not being changed)

    If the product is installed on a virtual machine (a "VM"), it is acceptable to clone that virtual machine with the product installation intact. The product installation on the clone will be supported. However, there are some caveats associated with doing this. Please continue to read below.

    In this discussion, "cloning a virtual machine" assumes that you are using virtual machine manager software (such as VMware) which is capable of taking a virtual machine and cloning its system configuration and its filesystems. This means that the cloned system uses the same virtualized system "hardware" and the same filesystems. The clone's filesystems contain all the product files, file permissions, symbolic links, and other associated data (like the .nifregistry file). Since the clone is a thorough, identical copy of the original, the product installation on the clone is supported.

  • Do not clone the product files by simple directory copying:
    Do not clone the WebSphere Application Server installation directory in order to use the product in other high-availability environments. (Do not simply copy all the files from one filesystem to another filesystem.) The WebSphere Application Server product needs to be installed using the product installer on the primary system in each high-availability cluster.

    If you choose to use the "Multiple product installation per cluster" method (described below), you will also need to install the product on the standby systems in the high-availability cluster.

  • Avoid concurrent file access from multiple systems:
    Do not allow multiple systems to execute the product files concurrently. WebSphere Application Server does not support multiple systems accessing a single install simultaneously. Installing the product once on a shared networked filesystem, then allowing multiple systems to run application servers from that single set of files simultaneously, is not supported.

    Note: It is acceptable for a single physical system running virtualization software (such as AIX WPARs or Solaris Zones) to allow virtualized clients on the same system to have read-only access to product files shared locally on the host partition. This is acceptable because the host system is using loopback filesystems to access the same files simultaneously, which provides proper file-locking facilities.
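
The "consistent users" requirement above lends itself to a simple pre-flight check before an installer or update utility is launched. The sketch below is an assumption-laden illustration: it treats the owner of the installation directory as the installing user, which is a simplification, and the function name installer_user_matches is hypothetical. It relies on os.geteuid, so it applies to POSIX systems only.

```python
import os


def installer_user_matches(install_dir):
    """Return True if the current effective user owns install_dir.

    Assumption: the owner of the installation directory is the user
    that originally ran the product installer.
    """
    owner_uid = os.stat(install_dir).st_uid
    return owner_uid == os.geteuid()
```

A wrapper script could call this function and refuse to start the Update Installer or Installation Manager when it returns False.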

Product cloning guidelines
It is common for the product to be used in virtual machines (such as AIX WPARs or VMware virtual machines) in high availability environments. Virtualization features allow an entire virtual machine (a "VM") to be cloned. The concept of cloning VM systems and cloning product configurations is outside the scope of this article; however, some guidelines are important to be aware of when you are setting up a high-availability environment. They are covered by the cloning items in the "WebSphere Application Server installation requirements" list above and by the profile guidelines below.


Profile management and product update requirements

In this discussion, the "product binaries" refer to the set of files which is delivered by the product installer, and the "profiles" refer to the sets of configuration files which are generated by the profile creation process. The profiles can be located separately from the product binaries.
  • Profile creation guidelines:
    When creating a profile, be sure to specify the high-availability cluster's virtual address (fully-qualified hostname or IP address) as the hostname associated with the profile. Also, use the high-availability virtual address to refer to the deployment manager's hostname when federating nodes to the deployment manager. (For example, when using the addNode command, specify the virtual address that the deployment manager uses, not the actual address of that system.)

    Further guidelines about profile creation and augmentation are specified in other sections of this technote.

  • Application servers and other profile utilities require read/write access to the profile and read-only access to the binaries:
    In general, any operations run from within a profile require read/write access to the entire profile directory, but only require read access to the binaries. This includes running an application server or node agent.

    Profile creation and augmentation requires read/write access to both the profile and product binaries.

  • All profiles must be available and writeable when a fix pack is applied:
    The Update Installer and Installation Manager require read/write access to the entire product binaries directory. In addition, the Update Installer and Installation Manager require read/write access to every profile directory which is associated with the product. The Update Installer and Installation Manager might deliver configuration changes to every profile while installing fixes, so they must have access to every profile during the process.

    WebSphere Application Server V6.1 and V7.0 do not support manual methods of delivering these configuration updates. So, the process of delivering configuration updates to profiles cannot be separated from the process of installing a fix pack.

  • Do not clone profiles in order to create new ones:
    Do not clone profiles by copying files from one directory to another. Profiles are only supported if they are created by the Profile Management Tool or the manageprofiles utility.
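
The profile creation guideline above (use the high-availability virtual address as the profile hostname, and as the deployment manager hostname when federating) can be sketched as follows. This Python example only assembles argument lists for the manageprofiles and addNode commands and does not run them. The installation path, profile path, virtual hostnames, and port in the usage portion are placeholder assumptions; consult the product documentation for the full set of parameters that your profile type requires.

```python
def build_create_profile_command(was_home, profile_name, profile_path,
                                 template, virtual_host):
    """Build a manageprofiles argument list that uses the HA virtual address."""
    return [
        was_home + "/bin/manageprofiles.sh",
        "-create",
        "-profileName", profile_name,
        "-profilePath", profile_path,
        "-templatePath", was_home + "/profileTemplates/" + template,
        # The virtual address, not the physical hostname of the primary system.
        "-hostName", virtual_host,
    ]


def build_add_node_command(profile_path, dmgr_virtual_host, dmgr_port):
    """Build an addNode argument list that targets the dmgr virtual address."""
    return [profile_path + "/bin/addNode.sh", dmgr_virtual_host, str(dmgr_port)]


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    create = build_create_profile_command(
        "/mnt/was_shared/AppServer", "AppSrv01",
        "/mnt/was_shared/profiles/AppSrv01", "managed",
        "node1-vip.example.com")
    federate = build_add_node_command(
        "/mnt/was_shared/profiles/AppSrv01", "dmgr-vip.example.com", 8879)
    print(" ".join(create))
    print(" ".join(federate))
```

Keeping the virtual address in one variable makes it harder to accidentally record a physical hostname in the profile.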


Installation methods in high-availability environments
There are generally two methods of installing the WebSphere Application Server product in a high availability environment.

In one method, the product is installed on the primary system only, and its files are installed onto the NFS. The high-availability cluster only maintains one installation of the product binaries. This is called the "Single product installation per cluster" method. This method has the advantage of only requiring one set of product binaries to be maintained.

Note that the "Single product installation per cluster" method applies to scenarios where each individual node in a WebSphere Application Server cell has its own high-availability cluster.

In the other method, the product is installed onto the primary system and all of the standby systems. A separate product installation is maintained for each system in the high-availability cluster. Each product installation shares the same common set of profiles. This is called the "Multiple product installation per cluster" method. This method has the advantage of providing opportunities to avoid downtime due to failed fix pack updates or file corruption of the product binaries. However, it has the disadvantage of requiring installations and profile creation tasks to be performed once for each system in the cluster, and the data resulting from these operations cannot be cloned to save time. Due to the large number of redundant steps required to set up this environment, IBM Support does not recommend using this installation method.

In both methods, only one set of profiles is maintained in the high-availability cluster. To be clear, it is certainly possible to create and run many profiles. You are not restricted to using just one profile. But, that set of profiles is not cloned by the high-availability software. (If a WebSphere administrator is interested in cloning configurations among different profiles, then the administrator would take advantage of the server clustering feature in Network Deployment edition to clone multiple servers in a single cell. If the Network Deployment edition is not being used, or if one high-availability environment is using multiple cells, then server cloning features between members of different cells are not supported.)


Requirements for "Single product installation per cluster" method
If the product binaries are installed only once in a single high-availability cluster, then it is acceptable for that single install to run application servers from any member of the high-availability cluster, as long as only one high-availability cluster member is active at a time. Be mindful of these additional requirements in that configuration:
  • Perform product updates from the primary system only:
    The Update Installer and Installation Manager, which are responsible for applying fix packs and patches, should only be run from the primary system. Do not run the Update Installer or Installation Manager while the high-availability cluster is in failover mode.

  • Perform profile creation and augmentation from the primary system only:
    The profile tools, such as the Profile Management Tool and manageprofiles utility, which are responsible for creating, deleting, backing up, and augmenting profiles, should only be run from the primary system. Do not create, delete, or augment profiles when the high-availability cluster is in failover mode.

  • Do not allow a high-availability cluster member to start an application server if that same application server is running on another member:
    If an application server is running on a high-availability cluster member, then do not try to start up that same application server from another cluster member. Typically, the high-availability management software will facilitate this by making sure that only one system in the high-availability cluster is attempting to access the shared filesystem at any given time.

  • A profile only needs to be created once:
    The "Single product installation per cluster" method only requires you to run the profile creation process once per profile. You don't need to create multiple copies of a single profile.
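
The rule that only one cluster member may run a given application server is normally enforced by the high-availability management software. Purely as an illustration of the idea, the sketch below uses an exclusively created marker file on the shared filesystem. The function names and marker location are hypothetical, a marker left behind by a crashed system would need separate handling, and the atomicity of exclusive creation on a networked filesystem depends on the NFS version and client, so this is not a substitute for the high-availability software.

```python
import os
import socket


def claim_server(marker_path):
    """Try to claim the right to start a server; return True on success."""
    try:
        # O_EXCL makes the create fail if another member already holds the marker.
        fd = os.open(marker_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as marker:
        # Record which host holds the claim, to help troubleshooting.
        marker.write(socket.gethostname() + "\n")
    return True


def release_server(marker_path):
    """Give up the claim after the server has been stopped."""
    os.remove(marker_path)
```

A start script would call claim_server before starting the application server and release_server after the server has stopped.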

Requirements for "Multiple product installation per cluster" method
If the product binaries are installed on each individual system in a high-availability cluster, then it is acceptable for all the product installations to share the same set of profiles, as long as only one high-availability cluster member is active at a time. Be mindful of these additional requirements in that configuration:
  • Run the installer on every cluster member:
    Run the product installer on each individual cluster member in the high-availability cluster. Do not install the product once, then copy that directory to multiple systems.

  • Run the profile creation process for every cluster member:
    It is necessary to run the profile creation tool once for each member of the high-availability cluster. After creating the profile on one system, delete the new profile's directory, then create the profile using exactly the same parameters on a standby system. This is redundant, but necessary to ensure that each set of product binaries contains proper references to the profile.

    Note: The IBM Redbooks publication mentioned earlier in this article does not explain this process. As a result, IBM Support has published an addendum which explains it: the technote Addendum to High Availability solution for Network Deployment using HACMP from IBM Redbooks Publication SG24-6688-00.

  • Do not use V6.1 Feature Packs:
    The "Multiple product installation per cluster" method is not compatible with WebSphere Application Server V6.1 Feature Packs due to the requirement that a profile be augmented in order to take advantage of the feature pack. Although it is possible to augment a profile once using the primary system, the standby systems would not be aware of the augmentation. A profile can only be augmented once, so it is not possible to repeat the augmentation procedure for each standby system.

    Note: V7.0 Feature Packs are compatible with the "Multiple product installation per cluster" method. Be sure to run the augmentation process for every profile in every high-availability cluster member.

  • High-availability cluster members can temporarily run at different patch levels, but update them as soon as possible:
    It is possible to update the primary system product binaries with a fix pack or patch, and not update the standby systems. If the high-availability system fails over from the primary system to a standby system, then the product will be running at a different patch level. For the purposes of failover, the profiles typically tolerate this change in versions; however, this is not recommended. It is best practice to update all standby systems to the same patch level as the primary system as quickly as possible.

  • Do not apply updates to any cluster member while the product is running an application server anywhere in the cluster:
    Do not apply fix packs or patches using the Update Installer or Installation Manager while another system in the high-availability cluster is running application servers. The Update Installer or Installation Manager might modify the profiles, and these modifications are not supported while servers are running.

    Note: This restriction means that the system administrator cannot put the high-availability cluster into failover mode while applying updates to the primary system. When a product's binaries are updated, none of its associated profiles are allowed to be running. The section, "Requirements for minimizing downtime during product upgrades" below explains this requirement in greater detail.
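
To keep track of the patch-level guideline above, an administrator might compare the version reported on each cluster member. The following sketch assumes version strings in dotted numeric form (for example 7.0.0.11) that have already been collected by some other means; the host names and function names are hypothetical.

```python
def parse_level(version):
    """Convert a dotted version string such as '7.0.0.11' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def members_behind_primary(primary_version, standby_versions):
    """Return the standby hosts whose patch level is older than the primary's."""
    primary = parse_level(primary_version)
    return sorted(host for host, version in standby_versions.items()
                  if parse_level(version) < primary)
```

Comparing tuples of integers rather than raw strings matters here, because a plain string comparison would rank 7.0.0.9 above 7.0.0.11.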

Requirements for minimizing downtime during product upgrades
If an environment must be configured such that the applications experience zero downtime during product upgrades, then avoid the "Multiple product installation per cluster" method. It is not possible to put the high-availability cluster into failover mode and apply product updates to the primary system, because the fix pack installer needs exclusive access to every profile during the fix pack install process. It cannot update profiles which are currently running application servers. In order to apply updates, the servers associated with a profile must be completely shut down, regardless of which system the profile is running from.

The best way to minimize downtime during product updates is to use WebSphere Application Server Network Deployment edition to maintain a cell containing two or more nodes. Those nodes can use WebSphere clustering to provide redundancy while some nodes in the cluster are shut down for upgrades. For some clients, this WebSphere clustering redundancy is a sufficient high-availability solution, and they do not need to resort to external high-availability. However, some clients do absolutely require external high-availability. In that case, the deployment manager and each node/profile should have a dedicated high-availability cluster. This provides both WebSphere clustering redundancy as well as external high-availability clustering redundancy.

Here is an example of an environment which provides both WebSphere clustering and external high-availability clustering:
  • The deployment manager has its own primary and standby systems in a high-availability cluster. WebSphere Application Server is installed once on the primary system in this high-availability cluster.

  • Node One has its own primary and standby systems in a separate high-availability cluster. WebSphere Application Server is installed once on the primary system in this high-availability cluster; it is separate from the deployment manager high-availability cluster.

  • Node Two has its own primary and standby systems in a separate high-availability cluster. WebSphere Application Server is installed once on the primary system in this high-availability cluster; it is separate from the deployment manager and Node One high-availability clusters.

  • Nodes One and Two contain WebSphere-clustered application servers which both run the same client application. Both nodes accept requests for the client application. If Node One becomes unavailable due to a maintenance outage, then Node Two will handle the burden of all requests for the application.

  • If Node One becomes unexpectedly unavailable due to a hardware failure, then Node Two can temporarily handle the burden of all requests for the application. The high-availability configuration will fail over Node One from the broken primary system to a working standby system. Node One will resume operations on the standby system.

  • This scenario can scale for multiple applications, multiple application servers, multiple WebSphere cluster members in the same cell, and multiple standby systems in the high-availability cluster. At a minimum, a cell should be designed so that if one or two nodes are taken down, the remaining nodes can temporarily handle the increased load per node.

  • This scenario will not work if Node One, Node Two, or any other WebSphere nodes share the same product installation files in the same high-availability cluster members.
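
The sizing guideline in the example above (the remaining nodes must absorb the load when one or two nodes are down) comes down to simple arithmetic. The sketch below assumes that requests are spread evenly across the running nodes and that load and capacity are expressed in the same unit, such as requests per second; both are simplifying assumptions, and the function names are hypothetical.

```python
def load_per_remaining_node(total_load, node_count, nodes_down):
    """Load each running node carries, assuming an even spread of requests."""
    remaining = node_count - nodes_down
    if remaining <= 0:
        raise ValueError("no nodes left to serve requests")
    return total_load / remaining


def survives_outage(total_load, node_count, node_capacity, nodes_down):
    """True if every remaining node stays within its capacity."""
    per_node = load_per_remaining_node(total_load, node_count, nodes_down)
    return per_node <= node_capacity
```

For example, with a total load of 900, four nodes, and a per-node capacity of 500, losing two nodes leaves 450 per node, which still fits. With three nodes of capacity 400, losing one node leaves 450 per node, which does not.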

When applying product upgrades, follow the standard procedure for rolling-out product updates in a multi-node cell. This process is outlined below:
  1. Shut down the deployment manager. Note that the nodes running the application servers can continue to run, because they do not require the deployment manager during normal runtime.

  2. Make sure that the deployment manager filesystem is backed up using your standard filesystem backup process.

  3. Using the Update Installer (or Installation Manager, if applicable), apply product updates to the deployment manager. If the process fails, then use the Update Installer or Installation Manager to roll-back the fix pack update.

  4. After the deployment manager is upgraded, choose a high-availability cluster member to upgrade. Shut down all application servers running on that high-availability cluster member. Do not allow other high-availability members to run those same application servers during this process.

  5. Make sure the node's filesystem is backed up using your standard filesystem backup process.

  6. Apply the product updates to that node. Once the process successfully completes, the application servers can be restarted.

  7. Repeat steps 4, 5, and 6 for each high-availability cluster member to upgrade other product installations.
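
The ordering of the steps above can be captured in a small planning helper, which is useful for generating a checklist before a maintenance window. This is only a sketch: the step wording and the function name rolling_upgrade_plan are hypothetical, and the function produces a list of descriptions rather than running any product command.

```python
def rolling_upgrade_plan(node_clusters):
    """Return the ordered upgrade steps for a deployment manager and its nodes."""
    steps = [
        "stop deployment manager",
        "back up deployment manager filesystem",
        "apply updates to deployment manager",
    ]
    # Each node is fully upgraded and restarted before the next one is stopped,
    # so the other nodes keep serving requests in the meantime.
    for node in node_clusters:
        steps.append("stop application servers on " + node)
        steps.append("back up filesystem for " + node)
        steps.append("apply updates to " + node)
        steps.append("restart application servers on " + node)
    return steps


if __name__ == "__main__":
    for number, step in enumerate(rolling_upgrade_plan(["node1", "node2"]), start=1):
        print(number, step)
```

Upgrading one node at a time is what allows the remaining WebSphere cluster members to carry the workload during the update.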


Notes concerning filesystem backups and recovery from failed product updates
IBM Support highly encourages using a backup solution in high-availability environments. This provides further redundancy by allowing the system administrator to roll a system back to a previous state if the product becomes corrupt or if an upgrade process fails.

If a product update fails (such as encountering a failure while installing a fix pack using Update Installer or Installation Manager), then there are two options for recovery: Use the Update Installer or Installation Manager to roll-back the failed fix pack update, or recover the product's files from a system backup.

If a system administrator chooses to recover the product's files from backup, be sure to recover the entirety of the product binaries, all the profile files, and the ".nifregistry" file mentioned earlier in this technote. If backups of these items are not available, then use the Update Installer or Installation Manager to roll-back the fix instead.

Also, if a system administrator chooses to recover the product's files from backup due to a bad fix pack upgrade, then be sure to delete the product's old files before recovering the backup files. This is important because the process of installing fix packs will always introduce new files to the product. Those new files must never be allowed to mix with an older version of the product; otherwise, application servers might encounter serialization and classloading errors, and it might also be impossible to install the fix pack again. In other words, when recovering files from backup, do not simply overlay the backup files on top of the existing files. Delete the existing product directories first, then recover the files from backup.
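
The "delete first, then recover" rule above can be illustrated with a short Python sketch. It assumes that the backup is available as a plain directory tree, which may not match your backup tooling, and the function name restore_from_backup is hypothetical. A real restore would normally use the site's backup software, which also needs to preserve file ownership and permissions.

```python
import os
import shutil


def restore_from_backup(backup_dir, product_dir):
    """Replace product_dir with the contents of backup_dir, never overlaying."""
    # Confirm that the backup exists before anything is deleted.
    if not os.path.isdir(backup_dir):
        raise FileNotFoundError("backup not found: " + backup_dir)
    # Remove the existing files so that nothing from the failed update survives.
    if os.path.exists(product_dir):
        shutil.rmtree(product_dir)
    shutil.copytree(backup_dir, product_dir, symlinks=True)
```

Because the existing directory is removed before the copy, files introduced by the failed fix pack cannot mix with the restored version of the product.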

Notes concerning product licensing
For questions regarding the impact of using high-availability configurations on product licensing and the number of Processor Value Units (PVUs) consumed by standby systems, refer to your IBM Salesperson or IBM Account Representative. IBM Support encourages clients to direct all questions regarding contract details and license consumption to those individuals.


Document Information

Modified date:
15 June 2018

UID

swg21419214