Troubleshooting
Problem
The InfoSphere DataStage parallel engine on Microsoft Windows may 'lock up'. Windows functionality (Explorer, Task Manager, etc.) still works. Existing parallel jobs (.OSH files) keep running, but you cannot start or compile other parallel jobs. You can still start InfoSphere DataStage server jobs. This note explains how to diagnose and fix the issue.
Symptom
Windows cannot create new processes that are linked to the MKS Toolkit. These processes include all .OSH file processes, connectivity tools such as SSH, and many other system utilities provided by the MKS Toolkit.
Cause
Each process created by the InfoSphere DataStage client on a Windows Server requires a security ID token in the MKS Toolkit. On very busy systems, the maximum number of security IDs must be increased above the default.
Environment
Windows
Diagnosing The Problem
If you suspect InfoSphere DataStage is in a hung state, where no additional InfoSphere DataStage parallel jobs will start, use the following test to confirm it.
- Obtain an MKS NuTCRACKER process report dump.
- Open a Korn shell by clicking Start->Run->ksh
- From the Korn shell enter the following command: process -a -w 5000 > process-log.txt
The process command reports how many NuTCRACKER platform processes are running and displays information on active NuTCRACKER platform processes.
Example
Note: no redirection to a log file is used in this example.
$ Process -a -w 5000
There are 5 currently running NuTCRACKER processes
There are 32 currently allocated process slots
A total of 2000 processes can run simultaneously
7 processes have run simultaneously
Process 4216, program C:\Program Files\MKS Toolkit\bin\rlogind.exe
ppid is not a NuTCRACKER application
pgroup 4216
sess_id 4216
nice 0
Process 4552, program C:\Program Files\MKS Toolkit\bin\secshd.exe
ppid is not a NuTCRACKER application
pgroup 4552
sess_id 4552
nice 0
Process 29672, program C:\PROGRA~1\MKSTOO~1\mksnt\perl.exe
ppid is not a NuTCRACKER application
pgroup 29672
sess_id 29672
nice 0
Process 30936, program C:\PROGRA~1\MKSTOO~1\bin\Process.exe
ppid is not a NuTCRACKER application
pgroup 30936
sess_id 30936
nice 0
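The summary lines at the top of the report can be pulled out of a saved log with a small helper. The sketch below is illustrative only: summarize_process_report is a hypothetical function name, and it assumes the report was saved to a file such as process-log.txt in the format shown in the example above. It reports process counts only; it does not measure security ID usage.

```sh
#!/bin/sh
# Hypothetical helper: summarize a saved "process -a" report.
# Assumes the two summary lines appear exactly as in the sample report.
summarize_process_report() {
    report="$1"
    # "There are 5 currently running NuTCRACKER processes" -> 5
    running=$(sed -n 's/^There are \([0-9][0-9]*\) currently running NuTCRACKER processes.*/\1/p' "$report")
    # "A total of 2000 processes can run simultaneously" -> 2000
    limit=$(sed -n 's/^A total of \([0-9][0-9]*\) processes can run simultaneously.*/\1/p' "$report")
    if [ -z "$running" ] || [ -z "$limit" ]; then
        echo "summary lines not found in $report" >&2
        return 1
    fi
    echo "running=$running limit=$limit"
}
```

For example, after saving the report, running summarize_process_report process-log.txt prints the two values on one line so they can be compared at a glance.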
- Check that you can successfully run a parallel job as follows:
To simplify testing, first create an environment file in the PXEngine directory. You can create the file with an editor such as Windows Notepad. The format of the file is as follows:
#!/bin/sh
export APT_ORCHHOME="C:/IBM/InformationServer/Server/PXEngine"
export APT_CONFIG_FILE="C:/IBM/InformationServer/Server/Configurations/default.apt"
export PATH="$APT_ORCHHOME/bin;$PATH"
export APT_PM_SHOWRSH=1
The above example assumes that Information Server was installed on the default C: drive. If you installed to a different drive, substitute the appropriate drive letter for C:.
Once defined, save the file as apt.env in the C:/IBM/InformationServer/Server/PXEngine directory. Note: if you are using Windows Notepad, be sure to set the “Save as type” to “All Files” and enter apt.env as the File name. Do not use a .TXT file name extension.
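As an alternative to Notepad, the same file can be written from the Korn shell, which avoids the .TXT extension problem. This is a minimal sketch: write_apt_env is a hypothetical function name, and the paths are the default C: drive locations used in the example above.

```sh
#!/bin/sh
# Hypothetical helper: write apt.env into the given directory.
# The quoted 'EOF' delimiter keeps $APT_ORCHHOME and $PATH unexpanded,
# so they are resolved later, when the file is sourced.
write_apt_env() {
    target_dir="$1"
    cat > "$target_dir/apt.env" <<'EOF'
#!/bin/sh
export APT_ORCHHOME="C:/IBM/InformationServer/Server/PXEngine"
export APT_CONFIG_FILE="C:/IBM/InformationServer/Server/Configurations/default.apt"
export PATH="$APT_ORCHHOME/bin;$PATH"
export APT_PM_SHOWRSH=1
EOF
}
```

For example, write_apt_env C:/IBM/InformationServer/Server/PXEngine creates the file in the default location.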
After creating the environment file, you can test the parallel engine by executing a simple parallel job from the shell or command level as follows:
- Open a Korn shell by clicking Start->Run->ksh
- Change to the C:/IBM/InformationServer/Server/PXEngine directory by entering
cd C:/IBM/InformationServer/Server/PXEngine
- Source the apt.env environment file by entering
. ./apt.env
- Run a test by entering
osh "generator -schema record(a:int32) | peek"
The job should execute normally, without any hang or error message dialog. If an error dialog appears when you run the osh test, increase the number of MKS NuTCRACKER security IDs as described in the following section.
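The shell steps above can be combined into one function for repeated testing. This is a sketch under assumptions: run_osh_smoke_test is a hypothetical name, and it assumes osh returns a nonzero exit status when the job fails. It cannot detect a hang; if the engine is locked up, the command may simply not return.

```sh
#!/bin/sh
# Hypothetical helper: source apt.env and run the osh test job.
# Argument: the PXEngine directory that contains apt.env.
run_osh_smoke_test() {
    px_dir="$1"
    (
        # Run in a subshell so the caller's directory and
        # environment are left unchanged.
        cd "$px_dir" || exit 2
        . ./apt.env || exit 2
        osh "generator -schema record(a:int32) | peek"
    )
    status=$?
    if [ "$status" -eq 0 ]; then
        echo "osh test completed (exit status 0)"
    else
        echo "osh test failed (exit status $status)"
    fi
    return "$status"
}
```

For example: run_osh_smoke_test C:/IBM/InformationServer/Server/PXEngine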
Resolving The Problem
The problem can be avoided by changing a parameter in the MKS Control Panel:
1. Make sure there are no DataStage jobs, uvsh.exe or osh.exe processes running on the server. Any job which uses the NuTCRACKER service will prevent changed settings from taking effect.
2. Using the DataStage control panel applet, stop all the DataStage services.
3. Open the 'Configure MKS Toolkit' Control Panel. Select the 'Manage Services' tab.
Starting with the bottom service shown in the dropdown, stop each MKS service.
4. After you stop the last service, press the 'Refresh' button and be sure that the 'Active Processes' box displays zero. If necessary, stop any remaining services. Unless the Active Processes count is zero, the setting changes will not take effect.
5. a. Select the "Runtime Settings" tab
b. Select 'Miscellaneous Settings' from the Category dropdown.
6. Change the value in the 'Max Number of Security ID.' box to 5000.
8. Select the 'Manage Services' tab again and restart the services in the order shown in the dropdown, starting at the top.
NOTE: When the system reboots, the MKS services and the DataStage services will all restart. So, if you plan to reboot the Windows system, you can skip restarting the services and hit OK to exit the MKS control panel.
9. Hit OK to exit the MKS control panel.
10. Restart the DataStage Services using the DataStage control panel.
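The check in step 1 for running osh.exe or uvsh.exe processes can be scripted before you open the MKS control panel. The sketch below is an illustration, not part of the documented procedure: count_blocking_processes is a hypothetical name, and it assumes the Windows tasklist command is reachable from the Korn shell and that its CSV output starts each line with the quoted image name.

```sh
#!/bin/sh
# Hypothetical helper: count osh.exe and uvsh.exe entries in
# "tasklist /FO CSV /NH" output read from standard input.
count_blocking_processes() {
    # -i: image names may appear in any case
    # -c: print the number of matching lines
    # grep exits nonzero when the count is 0, so mask that status.
    grep -i -c -E '^"(osh|uvsh)\.exe",' || :
}
```

For example, tasklist /FO CSV /NH | count_blocking_processes should print 0 before you change the setting.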
Document Information
Modified date:
16 June 2018
UID
swg21615249