Security considerations in a multi-node WebSphere Application Server environment

WebSphere® Application Server Network Deployment supports centralized management of distributed nodes and application servers. This support inherently brings complexity, especially when security is included. Because everything is distributed, security plays an even larger role in ensuring that communications are appropriately secure between application servers and node agents, and between the node agents (node-specific configuration managers) and the deployment manager (a domain-wide, centralized configuration manager).

Before you begin

[AIX Solaris HP-UX Linux Windows][IBM i]Because the processes are distributed, the authentication mechanism that must be used is Lightweight Third Party Authentication (LTPA). LTPA tokens are encrypted, signed, and forwardable to remote processes. However, the tokens expire. The SOAP connector, which is the default connector, is used for administrative security and does not have retry logic for expired tokens. However, the protocol is stateless, so a new token is created for each request if the time remaining in the current token is not sufficient to run the request. An alternative connector is the RMI connector, which is stateful and has some retry logic to correct expired tokens by resubmitting the requests after the error is detected. Also, because tokens expire at a specific time, synchronized system clocks are crucial to the proper operation of token-based validation. If the clocks differ by too much (approximately 10-15 minutes), you can encounter unrecoverable validation failures that synchronized clocks would avoid. Verify that the time and date agree between systems and that the time zone is set correctly on each system. Nodes can be in different time zones, provided that the times are correct within those time zones (for example, 5 PM CST = 6 PM EST).

[z/OS]Because the processes are distributed, you must select an authentication mechanism that supports an authentication token, such as Lightweight Third Party Authentication (LTPA). The tokens are encrypted, signed, and forwardable to remote processes. However, the tokens have expiration times, which are set on the WebSphere Application Server administrative console. The SOAP connector, which is the default connector, is used for administrative security and does not have retry logic for expired tokens. However, the protocol is stateless, so a new token is created for each request if the time remaining in the current token is not sufficient to run the request. An alternative connector is the Remote Method Invocation (RMI) connector, which is stateful and has some retry logic to correct expired tokens by resubmitting the requests after the error is detected. Also, because tokens expire at a specific time, synchronized system clocks are crucial to the proper operation of token-based validation. If the clocks differ by too much (approximately 10-15 minutes), you can encounter unrecoverable validation failures that synchronized clocks would avoid. Verify that the time and date agree between systems and that the time zone is set correctly on each system. Nodes can be in different time zones, provided that the times are correct within those time zones (for example, 5 PM CST = 6 PM EST).
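
The time zone rule can be illustrated with a short Python sketch. It is not part of the product. It only shows that two clock readings taken in different time zones are compared as instants, so 5 PM CST and 6 PM EST count as the same time. The fixed offsets, the sample dates, and the 10-minute MAX_SKEW value are assumptions made for this illustration; the text above says only that the tolerance is approximately 10-15 minutes.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets used only for this illustration (no daylight saving handling).
CST = timezone(timedelta(hours=-6), "CST")
EST = timezone(timedelta(hours=-5), "EST")

# Hypothetical tolerance, chosen from the "approximately 10-15 minutes" range.
MAX_SKEW = timedelta(minutes=10)


def clock_skew(a: datetime, b: datetime) -> timedelta:
    """Return the absolute difference between two timezone-aware readings."""
    if a.tzinfo is None or b.tzinfo is None:
        raise ValueError("both readings must be timezone-aware")
    # Subtracting aware datetimes compares the underlying instants,
    # so the time zone labels themselves do not matter.
    return abs(a - b)


def within_tolerance(a: datetime, b: datetime, max_skew: timedelta = MAX_SKEW) -> bool:
    """Return True when the two readings differ by no more than max_skew."""
    return clock_skew(a, b) <= max_skew


# Sample readings (made up for the example).
node_a = datetime(2024, 1, 15, 17, 0, tzinfo=CST)   # 5 PM CST
node_b = datetime(2024, 1, 15, 18, 0, tzinfo=EST)   # 6 PM EST, same instant
node_c = datetime(2024, 1, 15, 18, 20, tzinfo=EST)  # clock running 20 minutes fast

print(within_tolerance(node_a, node_b))
print(within_tolerance(node_a, node_c))
```

In this sketch, node_a and node_b agree even though their wall-clock times differ, while node_c falls outside the assumed tolerance.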

Deprecated feature: Support for the RMI connector is deprecated. Use JSR160RMI connectors instead of RMI connectors.
[z/OS]You have additional considerations with Secure Sockets Layer (SSL). WebSphere Application Server for z/OS® can use Resource Access Control Facility (RACF®) keyrings to store the keys and the truststores that are used for SSL, but different SSL protocols are used internally. You must be sure to set up both:
  • A system SSL repertoire for use by the web container
  • A Java™ Secure Sockets Extension (JSSE) SSL repertoire for use by the SOAP HTTP connector if the SOAP connector is used for administrative requests
Verify that the keystores and truststores that you configure are set up to trust only the servers with which they communicate. Make sure that the trust files of all servers in the domain include the necessary signer certificates from those servers. When you use a certificate authority (CA) to create personal certificates, it is easier to ensure that all servers trust one another by adding the CA root certificate to the signers of every server.

[z/OS]The WebSphere z/OS Profile Management Tool or the zpmt command uses the same certificate authority to generate certificates for all servers within a given cell, including those of the node agents and the deployment manager.

About this task

Consider the following issues when using or planning for a WebSphere Application Server Network Deployment environment.

Procedure

  • When you run system management commands such as the stopNode command, explicitly specify administrative credentials to perform the operation. Most commands accept -username and -password parameters to specify the user ID and password, respectively. Specify the user ID and password of an administrative user; for example, a console user with Operator or Administrator privileges, or the administrative user ID that is configured in the user registry.
    An example of the stopNode command follows:

    [AIX Solaris HP-UX Linux Windows][IBM i]stopNode -username user -password pass

    [z/OS]stopNode.sh -username user -password pass

  • Verify that the configuration at the node agents is always synchronized with the deployment manager before you start or restart a node. To synchronize the configuration manually, issue the syncNode command from each node that is not synchronized. To synchronize the configuration for node agents that are started, click System Administration > Nodes. Select all the started nodes, and then click Synchronize.
  • [AIX Solaris HP-UX Linux Windows][IBM i]Verify that the clocks on all systems are in sync, including the time and date. If they are out of sync, the tokens expire immediately when they reach the target server due to the time differences. Coordinated Universal Time (UTC) is used by default, and all other machines must have the same UTC time. Consult your operating system documentation for information regarding how to ensure this.
  • Verify that the LTPA token expiration period is long enough to complete your longest downstream request. Some credentials are cached and therefore the timeout does not always include the length of the request. Specifically for cached credentials, you might need to evaluate your settings for the security cache (WSSecureMap) and LTPA timeout.
  • The administrative connector that is used by default for system management is SOAP. SOAP is a stateless HTTP protocol. For most situations, this connector is sufficient. If you have a problem using the SOAP connector, you might want to change the default connector on all the servers from SOAP to RMI. The RMI connector uses Common Secure Interoperability Version 2 (CSIv2), a stateful, interoperable protocol, and can be configured to use identity assertion (downstream delegation), message-layer authentication (BasicAuth or Token), and client certificate authentication (for server trust isolation). To change the default connector on a given server, go to Administration Services under Additional properties for that server.
  • [z/OS]An error message might occur within the administrative subsystem security. This error indicates that the sending process did not supply a credential to the receiving process. Typically, the cause of this problem is that the sending process has security disabled while the receiving process has security enabled. This setup typically indicates that one of the two processes is not synchronized with the cell. Having security disabled for a specific application server does not have any effect on administrative security.
  • [AIX Solaris HP-UX Linux Windows][IBM i]An error message might occur within the administrative subsystem security. This error indicates that the sending process did not supply a credential to the receiving process. Typically the causes of this problem are:
    • The sending process has security disabled while the receiving process has security enabled. This setup typically indicates that one of the two processes is not synchronized with the cell. Having security disabled for a specific application server does not have any effect on administrative security.
    • The clocks on the systems are not synchronized, which immediately invalidates the credential tokens. Verify that the time, date, and time zones are consistent between the two machines. An error similar to the following might occur:
      [9/18/02 16:48:23:859 CDT] 3b9cef35 RoleBasedAuth A CWSCJ0305I: Role based 
      authorization check failed for security name <null>, accessId NO_CRED_NO_ACCESS_ID 
      while invoking method propagateNotifications:[Ljavax.management.Notification; 
      on resource NotificationService and module NotificationService.
  • [AIX Solaris HP-UX Linux Windows][IBM i]If you receive a token validation error, verify that the clocks are synchronized between all servers within the cell and that the configurations are synchronized between all nodes and the deployment manager. An error similar to the following might occur:
    [9/18/02 16:48:22:859 CDT] 3bd06f34 LTPAServerObj E CWSCJ0372E: Validation of 
    the token failed.
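
When several servers are involved, it can help to scan the collected logs for the two message IDs that are shown in the preceding examples. The following Python sketch is one possible way to do that and is not a product tool. The find_security_hints function, the hint wording (paraphrased from the items above), and the third sample line are all made up for this illustration, and the log line layout is assumed to match the samples above.

```python
import re

# Message IDs taken from the sample log lines above.
# The hint text paraphrases the guidance in the procedure.
HINTS = {
    "CWSCJ0305I": "No credential supplied: check that both processes are "
                  "synchronized with the cell and that the clocks agree.",
    "CWSCJ0372E": "Token validation failed: check clock synchronization and "
                  "node-to-deployment-manager configuration synchronization.",
}

# Matches identifiers such as CWSCJ0305I or CWSCJ0372E.
MESSAGE_ID = re.compile(r"\b(CWSCJ\d{4}[A-Z])\b")


def find_security_hints(lines):
    """Return (line_number, message_id, hint) for each recognized message ID."""
    found = []
    for number, line in enumerate(lines, start=1):
        for message_id in MESSAGE_ID.findall(line):
            if message_id in HINTS:
                found.append((number, message_id, HINTS[message_id]))
    return found


# The first two lines are abbreviated from the samples above;
# the third is a made-up unrelated line.
sample = [
    "[9/18/02 16:48:23:859 CDT] 3b9cef35 RoleBasedAuth A CWSCJ0305I: Role based",
    "[9/18/02 16:48:22:859 CDT] 3bd06f34 LTPAServerObj E CWSCJ0372E: Validation of",
    "[9/18/02 16:48:24:000 CDT] 00000001 SomeComponent I unrelated message",
]

for number, message_id, hint in find_security_hints(sample):
    print(f"line {number}: {message_id}: {hint}")
```

Running the same function over the logs from every node and the deployment manager gives a quick view of which processes report credential or token problems.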

Results

Proper understanding of the security interactions between distributed servers greatly reduces the problems that are encountered with secure communications. Security adds complexity because additional function must be managed. For security to work properly, it needs thorough consideration during the planning of your infrastructure.

What to do next

When you have security problems that are related to the WebSphere Application Server Network Deployment environment, see Troubleshooting security configurations to find additional information about the problem. Because the servers are distributed, solving a problem with trace often requires that you gather trace on all servers simultaneously while re-creating the problem. This trace can be enabled dynamically or statically, depending on the type of problem that is occurring.