Troubleshooting
Problem
Failure updating the Virtual I/O Server (VIOS) using the updateios command.
Symptom
VIOS update failed or was interrupted (e.g. the network connection was lost while the updateios command was running).
Cause
This document provides common causes and things to consider to resolve the problem.
Environment
VIOS 2.2.x and above.
Diagnosing The Problem
The /home/padmin/install.log file is key to understanding why the update failed.
Depending on your ioslevel, this file may be overwritten or appended to each time updateios is run.
Review the log with focus on the timestamp that coincides with when the updateios failure occurred. Pay close attention to the Installation Summary for FAILED results. Sometimes the log details may clearly show the cause of failure.
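As a rough aid for the review described above, the FAILED rows of the Installation Summary can be filtered out of the log. This is only a sketch, not part of the VIOS command set: failed_filesets is a hypothetical helper name, and it assumes awk is available (for example from the oem_setup_env root shell, or on another system holding a copy of the log) and that summary rows have the five-column layout shown in the samples later in this document.

```shell
# Print Installation Summary rows whose Result column is FAILED.
# Reads install.log text on stdin. (Hypothetical helper, not a VIOS command.)
failed_filesets() {
    # Assumes a summary row has five fields: Name Level Part Event Result.
    awk 'NF == 5 && $5 == "FAILED" { print $1, $2, $3, $4 }'
}

# Demo on an excerpt copied from a sample in this document.
# Against a real log: failed_filesets < /home/padmin/install.log
failed_filesets <<'EOF'
Installation Summary
--------------------
Name Level Part Event Result
-------------------------------------------------
perfagent.tools 6.1.9.200 USR APPLY SUCCESS
perfagent.tools 6.1.9.200 ROOT APPLY FAILED
perfagent.tools 6.1.9.200 ROOT CLEANUP SUCCESS
EOF
```

For the excerpt above, the only row printed is the ROOT APPLY entry for perfagent.tools. The surrounding log text near the matching timestamp still needs to be read to find the actual cause.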
Resolving The Problem
Based on install.log details, determine if any of the following scenarios may be applicable to you.
1. The updateios process was interrupted before completing.
2. Hardware component (e.g. disk, adapter, etc.) related to VIOS rootvg may be going bad.
3. Insufficient VIOS memory resources.
4. Updateios process hung.
5. Updateios may fail if an ifix exists on the VIOS.
6. Updateios fails for ios.cli.rte due to missing /home/padmin/config files.
7. Updateios fails for perfagent.tools due to missing /etc/perf directory/files.
8. Missing Requisites reported during a VIOS update.
If none of the above scenarios are applicable to you.
Scenario 1
The updateios process was interrupted before completing (e.g. the network connection was lost before updateios completed).
Recommendation
Attempt to clean up the incomplete installation before retrying updateios. As padmin, run:
$ updateios -cleanup
-cleanup: Specifies the cleanup flag, which removes all incomplete pieces of the previous installation. Perform cleanup processing whenever a software product or update is left in a state of either applying or committing after an interrupted installation or update. You can run this flag manually, as needed.
Scenario 2
The following symptoms are commonly seen when a hardware component (e.g. disk, adapter, etc.) related to VIOS rootvg may be going bad.
a. If install.log shows I/O errors similar to the ones below, check for possible hardware failure.
...
ar: There is an input or output error.
ar: 0707-114 The fread system call failed.
0503-037 inurest: Failure on system call to execute command
OBJECT_MODE=32_64 /usr/ccs/bin/ar -or ./usr/lib/liblpmcommon.a.new
./usr/lib/inst_updt/liblpmcommon.a/shr.o.
0503-464 installp: The installation has FAILED for the "usr" part of the following filesets:
<fileset name>
...
b. If install.log reports "allocp" errors, that may indicate the VIOS rootvg is mirrored and there may be an issue with one of the mirrored disks. The log may show something similar to the following:
...
lquerypv: Warning, physical volume hdisk1 is excluded since it may be either missing or removed.
0516-404 allocp: This system cannot fulfill the allocation request.
There are not enough free partitions or not enough physical volumes to keep strictness and satisfy allocation requests. The command should be retried with different allocation characteristics.
lquerypv: Warning, physical volume hdisk1 is excluded since it may be either missing or removed.
0516-404 allocp: This system cannot fulfill the allocation request.
There are not enough free partitions or not enough physical volumes to keep strictness and satisfy allocation requests. The command should be retried with different allocation characteristics.
0503-008 installp: There is not enough free disk space in filesystem /usr (3017866 more 512-byte blocks are required).
An attempt to extend this filesystem was unsuccessful.
Make more space available then retry this operation.
...
c. Other symptoms of a potential hardware issue include updateios and/or other VIOS commands failing in the padmin and/or oem_setup_env shell, e.g.
$ snap
...
/usr/sbin/snap[23]: 13697170 Bus error
/usr/sbin/snap[32]: 13697172 Bus error
/usr/sbin/snap[42]: 13697174 Bus error
/usr/sbin/snap[52]: 13697176 Bus error
/usr/sbin/snap[62]: 13697178 Bus error
/usr/sbin/snap[72]: 13697180 Bus error
/usr/sbin/snap[110]: 12189936 Bus error
/usr/sbin/rsct/bin/ctsnap[2006]: 14680184 Bus error
Bad read?
./etc/objrepos/lpp: I/O error
Bad read?
./etc/objrepos/product: I/O error
Bad read?
./usr/lib/objrepos/fix: I/O error
Unable to open file: /home/ios/logs/ioscli_global.trace for append
Error from cliCheckFile:-1
...
# bosboot -a
/usr/bin/bosboot[24]: 14090370 Bus error(coredump)
...
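To check quickly whether install.log contains the message IDs quoted in symptoms (a) and (b) above, the log can be scanned for them. The following is a minimal sketch under stated assumptions: scan_hw_signatures is a hypothetical helper name, awk is assumed to be available (for example from the oem_setup_env root shell or on a system holding a copy of the log), and only the three IDs that appear in this document are counted. A zero count does not rule out a hardware problem.

```shell
# Count lines that carry the message IDs quoted in this document:
#   0707-114 (ar fread failure), 0516-404 (allocp), 0503-008 (installp space).
# Reads install.log text on stdin. (Hypothetical helper, not a VIOS command.)
scan_hw_signatures() {
    awk '
        /0707-114/ { n["0707-114"]++ }
        /0516-404/ { n["0516-404"]++ }
        /0503-008/ { n["0503-008"]++ }
        END {
            # "+ 0" forces a numeric 0 when an ID never matched.
            print "0707-114", n["0707-114"] + 0
            print "0516-404", n["0516-404"] + 0
            print "0503-008", n["0503-008"] + 0
        }'
}

# Demo on lines copied from the allocp sample above.
# Against a real log: scan_hw_signatures < /home/padmin/install.log
scan_hw_signatures <<'EOF'
lquerypv: Warning, physical volume hdisk1 is excluded since it may be either missing or removed.
0516-404 allocp: This system cannot fulfill the allocation request.
0516-404 allocp: This system cannot fulfill the allocation request.
0503-008 installp: There is not enough free disk space in filesystem /usr (3017866 more 512-byte blocks are required).
EOF
```

Each output line is a message ID followed by the number of log lines that contain it.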
Recommendation
Determine the disk and adapter used by VIOS rootvg. Then check VIOS errlog for disk, adapter, LVM, or file system errors related to VIOS rootvg or its underlying physical devices.
To list all physical volumes in VIOS rootvg, as padmin, run:
$ lspv|grep rootvg
hdisk0 00c1f1a077f16a01 rootvg active
OR
$ lsvg -pv rootvg
rootvg:
PV_NAME           PV STATE          TOTAL PPs   FREE PPs    FREE DISTRIBUTION
hdisk0            active            546         328         109..15..13..82..109
Ensure all disks show PV STATE "active". If there is more than one physical volume in the volume group and one shows in "Missing" or "Removed" state, correct the disk problem, and then try the update again.
If the volume group is mirrored, you can temporarily break the mirror off the physical volume in Missing/Removed state and try the update again.
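The PV STATE check described above can be sketched as a small filter over saved lsvg -pv rootvg output. This is an illustration only: inactive_pvs is a hypothetical helper name, awk is assumed to be available (for example from the oem_setup_env root shell), and the hdisk1 row in the demo is made-up sample data, not output from a real system.

```shell
# Print every physical volume whose PV STATE is not "active".
# Reads text in the layout of `lsvg -pv rootvg` output on stdin.
# (Hypothetical helper, not a VIOS command.)
inactive_pvs() {
    awk '
        $1 ~ /:$/ || $1 == "PV_NAME" { next }   # skip "rootvg:" and the header row
        NF >= 2 && $2 != "active" { print $1, $2 }'
}

# Demo: the hdisk0 row is from this document; the hdisk1 row is hypothetical.
inactive_pvs <<'EOF'
rootvg:
PV_NAME PV STATE TOTAL PPs FREE PPs FREE DISTRIBUTION
hdisk0 active 546 328 109..15..13..82..109
hdisk1 missing 546 328 109..15..13..82..109
EOF
```

No output means every listed physical volume reported PV STATE active.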
To check if the volume group is mirrored, run:
$ lsvg -lv rootvg
If the number of PPs is double the number of LPs for a logical volume, then that logical volume is mirrored.
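The PPs-to-LPs comparison above can be applied to every logical volume at once. The sketch below is hedged: mirror_status is a hypothetical helper name, awk is assumed to be available (for example from the oem_setup_env root shell), the LPs and PPs values are assumed to be the third and fourth columns of the listing, and the demo rows are made-up sample data.

```shell
# Classify each logical volume by comparing PPs with LPs.
# Reads text in the layout of `lsvg -lv rootvg` output on stdin.
# Assumes column 3 is LPs and column 4 is PPs. (Hypothetical helper.)
mirror_status() {
    awk '
        $1 ~ /:$/ || $1 == "LV" { next }        # skip "rootvg:" and the header row
        NF >= 4 && $3 ~ /^[0-9]+$/ && $4 ~ /^[0-9]+$/ {
            if ($4 == 2 * $3)   print $1, "mirrored"
            else if ($4 == $3)  print $1, "not-mirrored"
            else                print $1, "check-manually"
        }'
}

# Demo on hypothetical sample rows (not taken from a real system).
mirror_status <<'EOF'
rootvg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
hd5 boot 1 2 2 closed/syncd N/A
hd2 jfs2 20 20 1 open/syncd /usr
EOF
```

Any ratio other than 1:1 or 2:1 is reported as check-manually rather than guessed at.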
Note: If rootvg is mirrored and both mirrored disks show PV STATE active, check the number of FREE PPs on both disks. The 0516-404 allocp and 0503-008 installp errors may be triggered if one of the mirrored disks has insufficient or no FREE PPs for the filesystem(s) to be increased on both mirror copies during the updateios process. In such a case, you can break the mirror off the disk that is almost full before re-attempting updateios.
To unmirror the volume group, run:
$ unmirrorios <hdisk#_in_missing_state>
To determine the disk and adapter type, run:
$ lsdev -type disk|grep hdisk0
hdisk0 Available SAS Disk Drive
$ lsdev -dev hdisk0 -parent
parent
sas0
$ lsdev -type adapter|grep sas0
sissas0 Available PCI-X266 Planar 3Gb SAS Adapter
$ errlog|pg
...
...errors to watch out for include but are not limited to:
...
E86653C3 0524184215 P H LVDD I/O ERROR DETECTED BY LVM
B6267342 0524184215 P H hdisk0 DISK OPERATION ERROR
...
...
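When the errlog listing is long, it can help to reduce it to the resources that logged permanent hardware errors. The sketch below is an assumption-laden illustration: permanent_hw_errors is a hypothetical helper name, awk is assumed to be available (for example from the oem_setup_env root shell), and the summary listing is assumed to use the column order IDENTIFIER, TIMESTAMP, T (type), C (class), RESOURCE_NAME, DESCRIPTION, with P meaning permanent and H meaning hardware. The third demo row is illustrative sample data.

```shell
# List each resource that logged a permanent (T = P) hardware-class (C = H) error.
# Reads errlog summary text on stdin; assumed columns:
#   IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION
# (Hypothetical helper, not a VIOS command.)
permanent_hw_errors() {
    # seen[] suppresses repeats while keeping first-seen order.
    awk '$3 == "P" && $4 == "H" && !seen[$5]++ { print $5 }'
}

# Demo: the first two rows are from this document; the third is illustrative.
permanent_hw_errors <<'EOF'
E86653C3 0524184215 P H LVDD I/O ERROR DETECTED BY LVM
B6267342 0524184215 P H hdisk0 DISK OPERATION ERROR
A6DF45AA 0524184015 I O RMCdaemon The daemon is started.
EOF
```

The resource names that come out (LVDD and hdisk0 for the demo rows) can then be compared with the disks and adapters identified for rootvg.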
For adapter and internal disk errors, contact your local Hardware Support Representative.
For issues with SAN-attached devices, contact your local Storage Support Representative.
If no hardware errors are found related to any of the physical devices used by VIOS rootvg, but the errlog lists LVM I/O errors, consider running hardware diagnostics or contact your local Hardware Support Representative to get a clean bill of health from the hardware perspective before pursuing further from the software side.
If filesystem corruption errors are found, schedule a maintenance window as soon as possible to boot the VIOS partition into maintenance mode and run a thorough filesystem check (fsck).
Scenario 3
Insufficient VIOS memory resources are known to cause updateios to fail. In some cases, you may see memory-related errors in the install.log, e.g.
"Out of memory, malloc() failed."
or
"...
/usr/lib/instl/cleanup: cannot fork: no swap space
/usr/lib/instl/cleanup[3]: cannot fork: no swap space
...
installp: fork system call failed
installp: fork system call failed
installp: fork system call failed
..."
In other cases, the install.log may not show memory errors, but after the updateios attempt, other VIOS commands may fail with similar memory errors, permission-related errors, or library errors like the ones below:
sh: 0403-031 The fork function failed. There is not enough memory available.
lppchk: 0504-210 Unable to read file
/usr/sbin/rsct/install/bin/srMigrate because of permissions.
ar: 0707-114 The fread system call failed.
0503-037 inurest: Failure on system call to execute command
OBJECT_MODE=32_64 /usr/ccs/bin/ar -or ./usr/lib/liblpmcommon.a.new
./usr/lib/inst_updt/liblpmcommon.a/shr.o.
Note: For VIOS 2.2, a minimum of 4 GB of memory is recommended when the VIOS is first installed. Then, once the overall configuration is in place (e.g. storage, virtual devices, etc. have been added), the VIOS Performance Advisor tool should be run to get a recommendation based on the VIOS workload running at the time the performance data is collected.
Recommendation
Increase VIOS memory to the value suggested by the VIOS Performance Advisor tool, and then try the update again.
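The memory-related messages quoted in this scenario can be searched for in one pass. This is a sketch only: memory_errors is a hypothetical helper name, grep -E is assumed to be available (for example from the oem_setup_env root shell or on a system holding a copy of the log), and the pattern covers only the messages quoted in this document.

```shell
# Print, with line numbers, log lines that match the memory-related
# messages quoted in this scenario. Reads install.log text on stdin.
# (Hypothetical helper, not a VIOS command.)
memory_errors() {
    grep -n -E 'malloc\(\) failed|cannot fork|fork system call failed|0403-031'
}

# Demo on lines copied from the samples above.
# Against a real log: memory_errors < /home/padmin/install.log
memory_errors <<'EOF'
installp: APPLYING software for:
/usr/lib/instl/cleanup: cannot fork: no swap space
installp: fork system call failed
sh: 0403-031 The fork function failed. There is not enough memory available.
EOF
```

No output means none of these particular messages were found. As noted above, the log may show no memory errors even when memory is the underlying problem.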
Scenario 4
Updateios process hung.
Recommendation
Review /home/padmin/install.log to determine how far updateios got before it hung and whether your log details match the sample output below, in which the log stops at "Restarting the CAS nonstop service" while applying DirectorCommonAgent.
On the other hand, if the log details do not reveal enough information to determine why it hung, refer to the technote Troubleshooting a Hung Process or Command on PowerVM Virtual I/O Server, which describes how to gather a process dump for further investigation. Then contact your local IBM SupportLine Representative and be ready to provide the following testcase:
1. /home/padmin/install.log
2. pdump data as per the technote
3. VIOS snap
...
installp: APPLYING software for:
DirectorCommonAgent 6.3.5.0
0513-044 The cas_agent Subsystem was requested to stop.
Stopping The LWI Nonstop Profile...
Waiting for the LWI Nonstop Profile to exit ...
Waiting for the LWI Nonstop Profile to exit ...
Waiting for the LWI Nonstop Profile to exit ...
Waiting for the LWI Nonstop Profile to exit ...
Waiting for the LWI Nonstop Profile to exit ...
Waiting for the LWI Nonstop Profile to exit ...
Stopped The LWI Nonstop Profile.
. . . . . << Copyright notice for DirectorCommonAgent >> . . . . . . .
Licensed Materials - Property of IBM
5765-DRP
Copyright International Business Machines Corp. 2010, 2011.
All rights reserved.
US Government Users Restricted Rights - Use, duplication or disclosure
restricted by GSA ADP Schedule Contract with IBM Corp.
. . . . . << End of copyright notice for DirectorCommonAgent >>. . . .
Restarting the CAS nonstop service.......................................
...
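To see how far updateios got before hanging, it can be enough to find the last fileset that installp started applying. The sketch below assumes, based on the samples in this document, that the fileset name and level appear on the line immediately after "installp: APPLYING software for:". last_applying is a hypothetical helper name, and awk is assumed to be available (for example from the oem_setup_env root shell).

```shell
# Print the fileset line that follows the last
# "installp: APPLYING software for:" marker in the log.
# Reads install.log text on stdin. (Hypothetical helper, not a VIOS command.)
last_applying() {
    awk '
        /installp: APPLYING software for:/ { grab = 1; next }
        grab { sub(/^[ \t]+/, ""); last = $0; grab = 0 }   # trim leading indent
        END  { if (last != "") print last }'
}

# Demo: entries combined from two separate samples in this document.
# Against a real log: last_applying < /home/padmin/install.log
last_applying <<'EOF'
installp: APPLYING software for:
ios.cli.rte 6.1.9.201
installp: APPLYING software for:
DirectorCommonAgent 6.3.5.0
Restarting the CAS nonstop service.......................................
EOF
```

For the demo input this prints the DirectorCommonAgent line, which corresponds to the hang pattern shown in the sample above.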
Recommendation
Remove Systems Director from the VIO Server if it is not being used. Then try the update again.
cas.agent is not the root of the problem. The complex set of dependencies for the install, which is made slightly more tricky with Systems Director on the server, is the concern.
The following support document has instructions, and links to further instructions, for disabling and removing Systems Director: IBM Systems Director Common Agent on AIX Options for Removing, Disabling, or Upgrading/Installing.
Scenario 5
Updateios may fail if one or more ifixes are installed, because an ifix can cause filesets to be locked by the EFIX manager.
The following is sample output from install.log:
...snip...
+---------------------------------------------------------------+
BUILDDATE Verification ...
+---------------------------------------------------------------+
Verifying build dates...done
The updates being installed do not contain all the APARs to allow all existing interim fixes to be automatically removed. Please ensure the interim fixes are enabled for automatic removal and obtain the updates that contain the APARs for the following interim fixes, or remove the interim fixes, as described below.
IV62348s3b
0503-006 installp: Cannot change to directory /tmp/.workdir.8192072.6095004_1/usr/lpp/ios.cli.
Check path name and permissions.
EFIX MANAGER LOCKS
------------------
* * * ATTENTION * * *
The following selected filesets are locked by EFIX manager:
ios.cli.rte
installp has halted this operation because one or more files in the filesets listed above are registered as having an EFIX. You must remove these EFIXES before performing operations on the given fileset.
To get a listing of all locked filesets and the locking EFIX label, execute the following command:
# /usr/sbin/emgr -P
To remove the given EFIX, execute the following command:
# /usr/sbin/emgr -r -L <EFIX label>
For more information on EFIX management please see the emgr man page and documentation.
...
Recommendation
Remove the ifix(es) prior to re-running updateios.
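Before removing anything, it can be useful to confirm from install.log which filesets were reported as locked. The following is a minimal sketch: locked_filesets is a hypothetical helper name, awk is assumed to be available (for example from the oem_setup_env root shell), and the log is assumed to list one fileset per line between the "locked by EFIX manager" message and the "installp has halted" message, as in the sample above.

```shell
# Print the fileset names listed under the EFIX MANAGER LOCKS message.
# Reads install.log text on stdin. (Hypothetical helper, not a VIOS command.)
locked_filesets() {
    awk '
        /filesets are locked by EFIX manager:/ { inlist = 1; next }
        /installp has halted/                  { inlist = 0 }
        inlist && NF == 1                      { print $1 }'
}

# Demo on lines copied from the sample above.
# Against a real log: locked_filesets < /home/padmin/install.log
locked_filesets <<'EOF'
* * * ATTENTION * * *
The following selected filesets are locked by EFIX manager:
ios.cli.rte
installp has halted this operation because one or more files in the filesets listed above are registered as having an EFIX.
EOF
```

The log text itself names the follow-up commands: /usr/sbin/emgr -P lists the locked filesets with their locking EFIX labels, and /usr/sbin/emgr -r -L <EFIX label> removes a given EFIX.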
Scenario 6
Part of the updateios process is to check for the existence of the following Network Time Protocol (NTP) files:
/home/padmin/config/ntp.conf
/home/padmin/config/ntp.drift
When these files are missing, the install.log may look similar to the sample output below:
. . .
installp: APPLYING software for:
ios.cli.rte 6.1.9.201
. . . . . << Copyright notice for ios.cli >> . . . . . . .
Licensed Materials - Property of IBM
5765G3400
Copyright International Business Machines Corp. 2004, 2016.
All rights reserved.
US Government Users Restricted Rights - Use, duplication or disclosure
restricted by GSA ADP Schedule Contract with IBM Corp.
. . . . . << End of copyright notice for ios.cli >>. . . .
sysck: 3001-022 The file
/home/padmin/config/ntp.conf
was not found.
sysck: 3001-022 The file
/home/padmin/config/ntp.drift
was not found.
chmod: /etc/snmpd.conf: A file or directory in the path name does not exist.
chown: /home/padmin/config/ntp.conf: A file or directory in the path name does not exist.
chown: /home/padmin/config/ntp.drift: A file or directory in the path name does not exist.
sysck: 3001-022 The file
/home/padmin/config/ntp.conf
was not found.
sysck: 3001-022 The file
/home/padmin/config/ntp.drift
was not found.
sysck: 3001-017 Errors were detected validating the files
for package ios.cli.rte.
0503-464 installp: The installation has FAILED for the "usr" part
of the following filesets:
ios.cli.rte 6.1.9.201
installp: Cleaning up software for:
ios.cli.rte 6.1.9.201
sysck: 3001-022 The file
/home/padmin/config/ntp.conf
was not found.
sysck: 3001-022 The file
/home/padmin/config/ntp.drift
was not found.
sysck: 3001-017 Errors were detected validating the files
for package ios.cli.rte.
Filesets processed: 403 of 461 (Total time: 24 mins 31 secs).
...
Installation Summary
--------------------
Name Level Part Event Result
-------------------------------------------------
. . .
ios.cli.rte 6.1.9.201 USR APPLY FAILED
ios.cli.rte 6.1.9.201 USR CLEANUP SUCCESS
. . .
Recommendation
If the /home/padmin/config directory was removed, recreate it and copy the files into it from another VIOS, ensuring that the permissions match those on a working VIOS. Then try the VIOS update again.
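Whether the two NTP files named in this scenario are present can be checked before retrying the update. This is a sketch under stated assumptions: missing_ntp_files is a hypothetical helper name, and it checks only for the existence of ntp.conf and ntp.drift in the directory it is given, not their ownership or permissions.

```shell
# Print the path of each expected NTP file that does not exist
# under the given config directory. (Hypothetical helper, not a VIOS command.)
# Usage: missing_ntp_files /home/padmin/config
missing_ntp_files() {
    for name in ntp.conf ntp.drift; do
        [ -e "$1/$name" ] || echo "$1/$name"
    done
}

# On a VIOS, no output would mean both files are present.
missing_ntp_files /home/padmin/config
```

Ownership and permissions still need to be compared by hand against a working VIOS, for example with ls -l on both partitions.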
Scenario 7
Part of the updateios process involves accessing certain directories and files that are expected to exist on a VIOS partition by default. One example is the /etc/perf directory.
This directory and its contents should not be removed. If they are removed, the updateios process may only partially update the perfagent.tools fileset. The install.log may look similar to the sample output below (obtained from an updateios attempt to 2.2.4.30):
...
installp: APPLYING software for:
perfagent.tools 6.1.9.200
. . . . . << Copyright notice for perfagent.tools >> . . . . . . .
Licensed Materials - Property of IBM
5765G6200
Copyright International Business Machines Corp. 1993, 2016.
Copyright Regents of the University of California 1982, 1986, 1987.
Copyright BULL 1993, 2016.
All rights reserved.
US Government Users Restricted Rights - Use, duplication or disclosure
restricted by GSA ADP Schedule Contract with IBM Corp.
. . . . . << End of copyright notice for perfagent.tools >>. . . .
mkdir: 0653-357 Cannot access directory /etc/perf.
/etc/perf: A file or directory in the path name does not exist.
chmod: /etc/perf/daily: A file or directory in the path name does not exist.
touch: 0652-046 Cannot create /etc/perf/xmtopas.log1.
chmod: /etc/perf/xmtopas.log1: A file or directory in the path name does not exist.
touch: 0652-046 Cannot create /etc/perf/xmtopas.log2.
chmod: /etc/perf/xmtopas.log2: A file or directory in the path name does not exist.
touch: 0652-046 Cannot create /etc/perf/xmwlm.log1.
chmod: /etc/perf/xmwlm.log1: A file or directory in the path name does not exist.
touch: 0652-046 Cannot create /etc/perf/xmwlm.log2.
chmod: /etc/perf/xmwlm.log2: A file or directory in the path name does not exist.
error creating /etc/perf directories or files for xmtopas/xmwlm
update: Failed while executing the perfagent.tools.config_u script.
0503-464 installp: The installation has FAILED for the "root" part
of the following filesets:
perfagent.tools 6.1.9.200
installp: Cleaning up software for:
perfagent.tools 6.1.9.200
...
Installation Summary
--------------------
Name Level Part Event Result
-------------------------------------------------
. . .
perfagent.tools 6.1.9.200 USR APPLY SUCCESS
perfagent.tools 6.1.9.200 ROOT APPLY FAILED
perfagent.tools 6.1.9.200 ROOT CLEANUP SUCCESS
Recommendation
If the /etc/perf directory was removed, recreate it and copy its contents into it from another VIOS, ensuring that the permissions match those on a working VIOS. Then try the VIOS update again.
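To build a list of what needs to be restored, the paths that the log reports as nonexistent can be collected. The sketch below is an illustration only: missing_paths is a hypothetical helper name, awk is assumed to be available (for example from the oem_setup_env root shell), and it relies on the message wording "A file or directory in the path name does not exist" exactly as it appears in the samples in this document.

```shell
# Print each distinct path that the log reports with
# "A file or directory in the path name does not exist".
# Reads install.log text on stdin. (Hypothetical helper, not a VIOS command.)
missing_paths() {
    awk -F': ' '
        /A file or directory in the path name does not exist/ {
            path = $(NF - 1)                 # field just before the message text
            sub(/^[ \t]+/, "", path)         # trim leading indent
            if (!seen[path]++) print path    # report each path once
        }'
}

# Demo on lines copied from the sample above.
# Against a real log: missing_paths < /home/padmin/install.log
missing_paths <<'EOF'
mkdir: 0653-357 Cannot access directory /etc/perf.
/etc/perf: A file or directory in the path name does not exist.
chmod: /etc/perf/daily: A file or directory in the path name does not exist.
touch: 0652-046 Cannot create /etc/perf/xmtopas.log1.
chmod: /etc/perf/xmtopas.log1: A file or directory in the path name does not exist.
EOF
```

The same filter also picks up the chown and chmod messages shown in the Scenario 6 sample, since they use the same wording.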
If none of the above scenarios are applicable to you
Contact your local IBM SupportLine Representative and be ready to provide the following testcase:
1. /home/padmin/install.log
2. VIOS snap
Document Information
Modified date:
19 February 2022
UID
isg3T1022699