Allocating the files and directories
The following features use files on UNIX System Services (USS):
- End-to-end scheduling with fault tolerance capabilities.
- End-to-end scheduling with z-centric capabilities, if SSLKEYRINGTYPE is set to USS in the HTTPOPTS statement.
- Features running Java utilities:
    - Historical run data archiving for Dynamic Workload Console reporting
    - Event-driven workload automation for data set triggering
By default, the EQQJOBS installation aid sets the following paths for the following directories:
- End-to-end with fault tolerance work directory (EQQJOBS8): /var/TWS/inst
- JAVA utilities enablement work directory (EQQJOBS9): /var/TWS/inst
- SSL for TCP/IP connection work directory (EQQJOBSC): /var/TWS/inst/ssl
With the default paths, the Java work directory coincides with the end-to-end work directory and the SSL work directory is nested inside it, so deleting the end-to-end work directory also deletes the Java and SSL work directories. To avoid this problem, set a different path for each work directory. For example:
- End-to-end with fault tolerance work directory (EQQJOBS8): /var/TWS/E2E
- JAVA utilities enablement work directory (EQQJOBS9): /var/TWS/JAVAUTL
- SSL for TCP/IP connection work directory (EQQJOBSC): /var/TWS/SSL
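The separation above can be sketched from a USS shell. This is a minimal illustration only: the root path defaults to a hypothetical /tmp/TWS-demo, and on a real system each directory would typically be allocated as its own zFS data set and mounted at these paths rather than simply created with mkdir.

```shell
# Create three distinct work directories so that deleting one path
# cannot remove the others. ROOT is a hypothetical placeholder;
# substitute your real mount points (for example /var/TWS).
ROOT="${ROOT:-/tmp/TWS-demo}"
mkdir -p "$ROOT/E2E" "$ROOT/JAVAUTL" "$ROOT/SSL"
ls "$ROOT"
```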
To create the correct directories and files, run the following sample jobs for each controller that supports the specific feature:
- The EQQPCS05 sample for the end-to-end scheduling with fault tolerance capabilities
- The EQQPCS08 sample for the historical run data archiving and event-driven workload automation
To run the previous samples, you must have one of the following permissions:
- UNIX System Services (USS) user ID (UID) equal to 0
- BPX.SUPERUSER FACILITY class profile in RACF®
- UID specified in the JCL in eqqUID and belonging to the group (GID) specified in the JCL in eqqGID
For the EQQPCS05 sample, if the GID or the UID was not specified in EQQJOBS, you can specify it in the STDENV DD before running the sample. Make sure that you specify a unique UID with a nonzero value; for additional information about this requirement, see INFO APAR II1423.
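As a sketch, the STDENV DD statement might carry the two variables as in-stream data. The values below (group TWSGRP, UID 3000) are hypothetical examples, not defaults:

```
//STDENV   DD *
eqqGID=TWSGRP
eqqUID=3000
/*
```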
The user must also have the /bin/sh login shell defined in the OMVS segment of the RACF user profile. Make sure that the login shell is set as a system default, or use the following TSO command to define it:
ALTUSER username OMVS(PROGRAM('/bin/sh'))
To check the current settings:
- Run the following TSO command:
LISTUSER username OMVS
- Look in the PROGRAM line of the OMVS section.
After running EQQPCS05, you find the following files in the work directory:
- localopts: Defines the attributes of the local workstation (OPCMASTER) for the batchman, mailman, netman, and writer processes and for SSL. The parameters that have no effect in an end-to-end environment are indicated and commented out. For information about customizing this file, see Tivoli Workload Scheduler for z/OS: Customization and Tuning.
- mozart/globalopts: Defines the attributes of the Tivoli Workload Scheduler network (OPCMASTER ignores them).
- Netconf: Netman configuration files.
- TWSCCLog.properties: Defines attributes for the trace function.
You will also find the following directories in the work directory:
- mozart
- pobox
- stdlist
- stdlist/logs, which contains the log files of the USS processes
After running EQQPCS08, you find the following file in the work directory:
- java/env.profile: Defines the environment variables required by the Java utilities.
Configuring for end-to-end scheduling with fault tolerance capabilities in a SYSPLEX environment
In a configuration with a controller and no stand-by controllers, define the end-to-end server work directory in a file system mounted under either a system-specific HFS or a system-specific zFS.
Then configure the Byte Range Lock Manager (BRLM) server in a distributed form (see the considerations about BRLM later in this section). In this way, the server is not affected by the failure of other systems in the sysplex.
Having a shared HFS or zFS in a sysplex configuration means that all file systems are available to all systems participating in the shared HFS or zFS support. With shared HFS or zFS support, there is no I/O performance reduction for a file system mounted read-only (R/O). However, the cross-system coupling facility (XCF) communication required for shared HFS or zFS might affect the response time of read/write (R/W) file systems being shared in a sysplex. For example, assume that a user on system SYS1 issues a read request against a file system owned R/W by system SYS2. With shared HFS or zFS support, the read request message is sent through an XCF messaging function. After SYS2 receives the message, it gathers the requested data from the file and returns the data using the same request message.
In many cases, when accessing data on the system that owns a file system, the file I/O time is only the path length to the buffer manager to retrieve the data from the cache. In contrast, file I/O to a shared HFS or zFS from a client that does not own the mount requires additional path length, plus the time involved in the XCF messaging function. Increased XCF message traffic is a factor that can contribute to performance degradation. For this reason, it is recommended that file systems be owned by the system where the end-to-end server runs.
In a configuration with an active controller and several stand-by controllers, make sure that all the related end-to-end servers running on the different systems in the sysplex have access to the same work directory.
On z/OS® systems, the shared zFS capability is available: all file systems that are mounted by a system participating in shared zFS are available to all participating systems. When allocating the work directory in a shared zFS, you can decide to define it in a file system mounted under the system-specific zFS or in a file system mounted under the sysplex root. A system-specific file system becomes unreachable if the system is not active. To make good use of the takeover process, define the work directory in a file system mounted under the sysplex root and defined as automove.
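As an illustration, a BPXPRMxx MOUNT statement for such a work-directory file system might look like the following. The data set name and mount point are hypothetical placeholders:

```
MOUNT FILESYSTEM('OMVS.TWS.E2EWRK')
      MOUNTPOINT('/var/TWS/E2E')
      TYPE(ZFS)
      MODE(RDWR)
      AUTOMOVE
```

The AUTOMOVE setting lets another participating system take over ownership of the file system if the owning system leaves the sysplex, which is what the takeover process described above relies on.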
The Byte Range Lock Manager (BRLM) locks some files in the work directory. The BRLM can be implemented:
- With a central BRLM server running on one member of the sysplex and managing locks for all processes running in the sysplex.
- In a distributed form, where each system in the sysplex has its own BRLM server responsible for handling lock requests for all regular files in a file system which is mounted and owned locally (see APARs OW48204 and OW52293).
If the system where the BRLM runs experiences a scheduled or unscheduled outage, all locks held under the old BRLM are lost. To preserve data integrity, further locking and I/O on any opened files is prevented until files are closed and reopened. Moreover, any process locking a file is terminated.
To avoid these kinds of errors in the end-to-end server, before starting a scheduled shutdown procedure for a system, you must stop the end-to-end server if either or both of the following conditions occur:
- The work directory is owned by the system to be closed
- The df -v command in the OMVS shell displays the owners of the mounted file systems
- The system hosts the central BRLM server
- The console command DISPLAY OMVS,O can be used to display the name of the system where the BRLM runs. If the central BRLM server becomes unavailable, distributed BRLM is implemented; in this case, the end-to-end server needs to be stopped only if the system that owns the work directory is stopped.
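Before a scheduled shutdown, the two checks described above can be issued as follows; /var/TWS/E2E stands in for your actual work directory path:

```
df -v /var/TWS/E2E       (OMVS shell: the output includes the owner of the file system)
DISPLAY OMVS,O           (operator console: the output includes the system where the BRLM runs)
```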
The server can be restarted after a new system in the sysplex has taken over ownership of the file system, a new BRLM has been established by one of the surviving systems, or both.
To minimize the risk of filling up the Tivoli® Workload Scheduler internal queues while the server is down, schedule the closure of the system when the workload is low.
A separate file system data set is recommended for the stdlist directory, mounted R/W at /var/TWS/inst/stdlist, where inst varies depending on your configuration.
When you calculate the size of the file system, consider that you need 10 MB for each of the following files: Intercom.msg, Mailbox.msg, pobox/tomaster.msg, and pobox/CPUDOMAIN.msg.
You need 512 bytes for each record in the Symphony, Symold, Sinfonia, and Sinfold files. Consider a record for each CPU, schedule, and job/recovery job.
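The sizing rules above can be combined into a quick estimate. The workload counts below (50 CPUs, 400 schedules, 5000 jobs) are assumptions for illustration only:

```shell
# Message files: 10 MB each for Intercom.msg, Mailbox.msg,
# pobox/tomaster.msg, and pobox/CPUDOMAIN.msg.
MSG_MB=$((4 * 10))

# Symphony-type files: 512 bytes per record (one record per CPU,
# schedule, and job/recovery job), in each of Symphony, Symold,
# Sinfonia, and Sinfold.
RECORDS=$((50 + 400 + 5000))          # assumed workload
SYM_BYTES=$((RECORDS * 512 * 4))

echo "Message files: ${MSG_MB} MB"
echo "Symphony files: ${SYM_BYTES} bytes"
```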
You can specify the number of days that the trace files are kept on the file system using the parameter TRCDAYS in the TOPOLOGY statement.