mkcondition Command

Purpose

Creates a new condition definition which can be monitored.

Syntax

mkcondition -r resource_class -e "event_expression" [ -E "rearm_expression" ] [ -d "event_description" ] [ -D "rearm_description" ] [ -b interval[,max_events][,retention_period][,max_totalsize] ] [ -m l | m | p ] [ -n node_name1[,node_name2…]] [-p node_name] [ --qnotoggle | --qtoggle ] [ -s "selection_string" ] [ -S c | w | i ] [ -g 0 | 1 | 2 ] [-h] [-TV] condition

mkcondition -c existing_condition[:node_name] [-r resource_class] [ -e "event_expression" ] [ -E "rearm_expression" ] [ -d "event_description" ] [ -D "rearm_description" ] [ -b interval[,max_events][,retention_period][,max_totalsize] ] [ -m l | m | p ] [ -n node_name1[,node_name2…]] [-p node_name] [ --qnotoggle | --qtoggle ] [ -s "selection_string" ] [ -S c | w | i ] [ -g 0 | 1 | 2 ] [-h] [-TV] condition

Description

The mkcondition command creates a new condition with the name specified by the condition parameter. The condition is used to monitor a resource for the occurrence of the condition (or event). Use the mkresponse command to define one or more responses to an event. You can then link the conditions to the responses using the mkcondresp command, or you can use the startcondresp command to link the responses and start monitoring.
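The sequence described above can be sketched as a small dry-run script. The condition name, the response name, the notifyevent action script path, and the `run` helper are illustrative assumptions rather than values taken from this page. The helper only prints each command unless APPLY=1 is set, so the sketch can be tried on a system without RSCT.

```shell
#!/bin/sh
# Dry-run sketch: print each RSCT command unless APPLY=1 is set.
run() {
    if [ "${APPLY:-0}" = 1 ]; then
        "$@"
    else
        printf '+'
        printf ' [%s]' "$@"   # brackets show argument boundaries
        printf '\n'
    fi
}

setup_monitoring() {
    cond="FileSystem space used"   # hypothetical condition name
    resp="E-mail root any time"    # hypothetical response name

    # 1. Define the condition (event and rearm expressions).
    run mkcondition -r IBM.FileSystem \
        -e "PercentTotUsed > 90" -E "PercentTotUsed < 85" "$cond"

    # 2. Define a response; the action script path is an assumption.
    run mkresponse -n "E-mail root" \
        -s "/opt/rsct/bin/notifyevent root" "$resp"

    # 3. Link the condition to the response and start monitoring.
    run startcondresp "$cond" "$resp"
}

setup_monitoring
```

To link the condition and response without starting monitoring, the third step would use mkcondresp in place of startcondresp.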

Using the -b flag, multiple events can be batched or grouped together and passed to a response. The grouping of events is by the time span in which they occur. In addition, the grouping can be done such that a specified maximum number of events are grouped within the time span. A response that handles batched events must be defined as supporting batched events.

In a cluster environment, use the -p flag to specify the node in the domain that is to contain the condition definition. If you are using mkcondition on the management server and you want the condition to be defined on the management server, do not specify the -p flag. If the -p flag is not specified, the condition is defined on the local node. If the node where the condition will be defined is:

in a cluster of nodes, the condition can monitor resources on more than one node. Use the -n flag to specify the nodes on which the condition will be monitored.
the management server in a management domain, a management scope (-m) of local (l) or management domain (m) can be specified to indicate how the condition applies. The selection string will be evaluated using the entire management domain when management scope is set to the management domain and the node is the management server.
a managed node in a management domain, only a management scope (-m) of local (l) can be used.
in a peer domain, a management scope (-m) of peer domain (p) or local (l) can be used to indicate how the condition and the selection string apply.
in both a management domain and a peer domain, a management scope (-m) of management domain (m), peer domain (p), or local (l) can be used to indicate how the condition and its selection string apply.

To lock a condition so it cannot be modified or removed, use the chcondition command (with its -L flag).

If Cluster Systems Management (CSM) is installed on your system, you can use CSM defined node groups as node name values to refer to more than one node. For information about working with CSM node groups and using the CSM nodegrp command, see the CSM: Administration Guide and the CSM: Command and Technical Reference.

Flags

-b interval[,max_events][,retention_period][,max_totalsize]

Specifies one or more batching-related attributes. Use commas to separate the attribute values. Do not insert any spaces between the values or the commas.

interval specifies that the events are to be batched together for the indicated interval. Batching continues until no events are generated for an interval. Use an interval of 0 to turn batching off.

max_events specifies that the events are to be batched together until the max_events number of events are generated. The interval restarts if the max_events number of events is reached before the interval expires.

retention_period specifies the retention period in hours. The batched event file is saved for the time specified as the retention period. Once this time is reached, the file is automatically deleted.

max_totalsize specifies the total size for the batched event file in megabytes (MB). The batched event file is saved until this size is reached. Once the size is reached, the file is automatically deleted.

max_events, retention_period, and max_totalsize cannot be specified unless interval is greater than 0.

When interval is greater than 0 and max_events is 0, no maximum number of events is used.

If retention_period and max_totalsize are both specified, the batched event file is saved until the specified time or size is reached, whichever occurs first.

If you want to change one, two, or three attribute values, you must specify a valid value or an empty field for any attributes that precede the value you want to change. You do not have to specify any values for attributes that follow the value you want to change. For example, if you only need to change the retention period, you need to specify values for interval and max_events as well. You can provide an empty field if an attribute does not need to be changed. To change the retention period to 36 hours without changing the values of interval and max_events, enter:

```
mkcondition -c existing_condition -b ,,36 condition
```
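The empty-field rule can be illustrated with a small helper that assembles the value passed to -b. The function name `batch_spec` is hypothetical. It joins up to four fields with commas and drops trailing empty fields, so attributes that follow the last one you set are omitted.

```shell
#!/bin/sh
# Build a -b value from: interval, max_events, retention_period, max_totalsize.
# Pass "" for a leading attribute that should be left as an empty field.
batch_spec() {
    spec="${1-},${2-},${3-},${4-}"
    # Strip trailing commas left by unset trailing fields.
    while [ "${spec%,}" != "$spec" ]; do
        spec="${spec%,}"
    done
    printf '%s\n' "$spec"
}

batch_spec "" "" 36    # prints ",,36"
batch_spec 60 20       # prints "60,20"
batch_spec 5 "" 72     # prints "5,,72"
```

The result could then be passed on the command line as `-b "$(batch_spec "" "" 36)"`.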

-c existing_condition[:node_name]

Copies an existing condition. The existing condition is defined on node_name. If node_name is not specified, the local node is used. node_name is a node within the scope determined by the CT_MANAGEMENT_SCOPE environment variable. If any other flags are specified, the new condition is updated as indicated by those flags. Links with responses are not copied.

-d "event_description"

Describes the event expression.

-D "rearm_description"

Describes the rearm expression.

-e "event_expression"

Specifies an event expression, which determines when an event occurs. An event expression consists of a dynamic attribute or a persistent attribute of resource_class, a mathematical comparison symbol (> or <, for example), and a constant. When this expression evaluates to True, an event is generated.

-E "rearm_expression"

Specifies a rearm expression. After event_expression has evaluated to True and an event is generated, the rearm expression determines when monitoring for the event expression will begin again. Typically, the rearm expression prevents multiple events from being generated for the same event evaluation. The rearm expression consists of dynamic attributes or persistent attributes of resource_class, mathematical comparison symbols (> or <, for example), logical operators (|| or &&), constants, and an optional qualifier.
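The interplay of the event and rearm expressions can be pictured with a simplified model. The sketch below is not how ERRM is implemented. It assumes integer samples, the thresholds used in the examples on this page (event when the value is above 90, rearm when it is below 85), and a hypothetical function name `simulate_toggle`.

```shell
#!/bin/sh
# Simplified model: only one expression is evaluated at a time.
simulate_toggle() {
    state=event   # which expression is currently being evaluated
    for v in "$@"; do
        if [ "$state" = event ] && [ "$v" -gt 90 ]; then
            echo "event at $v"
            state=rearm          # stop evaluating the event expression
        elif [ "$state" = rearm ] && [ "$v" -lt 85 ]; then
            echo "rearm at $v"
            state=event          # resume evaluating the event expression
        fi
    done
}

simulate_toggle 80 92 95 88 84 91
# prints:
#   event at 92
#   rearm at 84
#   event at 91
```

In this model the sample 95 produces no second event, because the event expression is not evaluated again until the rearm expression has become true.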

-g 0 | 1 | 2

Specifies granularity levels that control audit logging for the condition. The levels of granularity are:

0: Enables audit logging. ERRM writes all activities to the audit log. This is the default.
1: Enables error logging only. ERRM writes only in case of errors to the audit log.
2: Disables audit logging. ERRM does not write any records to the audit log.

-m l | m | p

Specifies the management scope to which the condition applies. The management scope determines how the condition is registered and how the selection string is evaluated. The scope can be different from the current configuration, but monitoring cannot be started until an appropriate scope is selected. The valid values are:

l: Specifies local scope. This is the default. The condition applies only to the local node (the node where the condition is defined; see the -p flag). Only the local node is used in evaluating the selection string.
m: Specifies management domain scope. The condition applies to the management domain in which the node where the condition is defined belongs (see the -p flag). All nodes in the management domain are used in evaluating the selection string. The node where the condition is defined must be the management server in order to use management domain scope.
p: Specifies peer domain scope. The condition applies to the peer domain in which the node where the condition is defined belongs (see the -p flag). All nodes in the peer domain are used in evaluating the selection string.

-n node_name1[,node_name2…]

Specifies the host name for a node (or a list of host names separated by commas for multiple nodes) where this condition will be monitored. Node group names can also be specified, which are expanded into a list of node names.

You must specify the -m flag with a value of m or p if you want to use the -n flag. This way, you can monitor conditions on specific nodes instead of the entire domain.

The host name does not have to be online in the current configuration, but once the condition is monitored, the condition will be in error if the node does not exist. The condition will remain in error until the node is valid.
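The rule that -n requires a scope of m or p can be checked before the command is run. The following is a minimal sketch of such a pre-flight check. The function name `check_node_list` is hypothetical, and the check covers only this one rule.

```shell
#!/bin/sh
# Return 0 if the -m/-n combination is allowed by the rule above, 1 otherwise.
check_node_list() {
    scope=$1   # value intended for -m: l, m, or p
    nodes=$2   # value intended for -n, or empty when -n is not used
    [ -z "$nodes" ] && return 0
    case $scope in
        m|p) return 0 ;;
        *)   echo "error: -n requires -m m or -m p (got '$scope')" >&2
             return 1 ;;
    esac
}

check_node_list p "nodeA,nodeB" && echo "ok: peer domain scope with a node list"
check_node_list l "nodeA" 2>/dev/null || echo "rejected: local scope with a node list"
```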

-p node_name

Specifies the name of the node where the condition is defined. This is used in a cluster environment and the node name is the name by which the node is known in the domain. The default node_name is the local node on which the command runs. node_name is a node within the scope determined by the CT_MANAGEMENT_SCOPE environment variable.

If you are using mkcondition on the management server and you want the condition to be defined on the management server, do not specify the -p flag.

--qnotoggle

Specifies that monitoring does not toggle between the event expression and the rearm expression, but instead the event expression is always evaluated.

--qtoggle

Specifies that monitoring toggles between the event expression and the rearm expression.

-r resource_class

Specifies the resource class to be monitored by this condition. You can display the resource class names using the lsrsrcdef command.

-s "selection_string"

Specifies a selection string that is applied to all of the resource_class attributes to determine which resources should be monitored by the event_expression. The default is to monitor all resources within the resource_class. The resources used to evaluate the selection string are determined by the management scope (the -m flag). The selection string must be enclosed within double or single quotation marks. For information on how to specify selection strings, see the RSCT: Administration Guide.

-S c | w | i

Specifies the severity of the event:

c: Critical
w: Warning
i: Informational (the default)

-h

Writes the command's usage statement to standard output.

-T

Writes the command's trace messages to standard error. For your software service organization's use only.

-V

Writes the command's verbose messages to standard output.

Parameters

condition: The condition name is a character string that identifies the condition. If the name contains spaces, it must be enclosed in quotation marks. A name cannot consist of all spaces, be null, or contain embedded double quotation marks.

Security

The user needs write permission for the IBM.Condition resource class to run mkcondition. Permissions are specified in the access control list (ACL) file on the contacted system. See the RSCT: Administration Guide for details on the ACL file and how to modify it.

Exit Status

0: The command ran successfully.
1: An error occurred with RMC.
2: An error occurred with a command-line interface script.
3: An incorrect flag was entered on the command line.
4: An incorrect parameter was entered on the command line.
5: An error occurred that was based on incorrect command-line input.
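A calling script can turn these codes into readable messages. The helper below is a sketch with a hypothetical name, `explain_status`, and its messages are copied from the list above.

```shell
#!/bin/sh
# Map an mkcondition exit status to the description given on this page.
explain_status() {
    case $1 in
        0) echo "The command ran successfully." ;;
        1) echo "An error occurred with RMC." ;;
        2) echo "An error occurred with a command-line interface script." ;;
        3) echo "An incorrect flag was entered on the command line." ;;
        4) echo "An incorrect parameter was entered on the command line." ;;
        5) echo "An error occurred that was based on incorrect command-line input." ;;
        *) echo "Unknown exit status: $1" ;;
    esac
}

# Typical use after running the command:
#   mkcondition ... ; explain_status $?
explain_status 3
```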

Environment Variables

CT_CONTACT

Determines the system where the session with the resource monitoring and control (RMC) daemon occurs. When CT_CONTACT is set to a host name or IP address, the command contacts the RMC daemon on the specified host. If CT_CONTACT is not set, the command contacts the RMC daemon on the local system where the command is being run. The target of the RMC daemon session and the management scope determine the resource classes or resources that are processed.

CT_IP_AUTHENT

When the CT_IP_AUTHENT environment variable exists, the RMC daemon uses IP-based network authentication to contact the RMC daemon on the system that is specified by the IP address to which the CT_CONTACT environment variable is set. CT_IP_AUTHENT only has meaning if CT_CONTACT is set to an IP address; it does not rely on the domain name system (DNS) service.

CT_MANAGEMENT_SCOPE

Determines the management scope that is used for the session with the RMC daemon in processing the resources of the event-response resource manager (ERRM). The management scope determines the set of possible target nodes where the resources can be processed. The valid values are:

0: Specifies local scope.
1: Specifies local scope.
2: Specifies peer domain scope.
3: Specifies management domain scope.

If this environment variable is not set, local scope is used.
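As a quick reference, the values above can be decoded in a script before a command is run. The function name `scope_name` is hypothetical, and treating an empty value the same as an unset variable is an assumption of this sketch.

```shell
#!/bin/sh
# Translate a CT_MANAGEMENT_SCOPE value into the scope it selects.
scope_name() {
    case $1 in
        ""|0|1) echo "local" ;;              # unset (or empty) defaults to local
        2)      echo "peer domain" ;;
        3)      echo "management domain" ;;
        *)      echo "invalid value: $1" >&2
                return 1 ;;
    esac
}

echo "Session scope: $(scope_name "${CT_MANAGEMENT_SCOPE-}")"
```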

Implementation Specifics

This command is part of the Reliable Scalable Cluster Technology (RSCT) fileset for AIX®.

Standard Output

When the -h flag is specified, this command's usage statement is written to standard output. All verbose messages are written to standard output.

Standard Error

All trace messages are written to standard error.

Examples

These examples apply to standalone systems:

To define a condition with the name "FileSystem space used" to check for percentage of space used greater than 90% and to rearm when the percentage is back down below 85%, enter:
```
mkcondition -r IBM.FileSystem \
-e "PercentTotUsed > 90" -E "PercentTotUsed < 85" \
"FileSystem space used"
```

To define a condition with the name "tmp space used" to check for percentage of space used greater than 90% for /tmp and to rearm when the percentage is back down below 85%, including comments, enter:

```
mkcondition -r IBM.FileSystem \
-e "PercentTotUsed > 90" -E "PercentTotUsed < 85" \
-d "Generate event when tmp > 90% full" \
-D "Restart monitoring tmp again after back down < 85% full" \
-s 'Name=="/tmp"' "tmp space used"
```

To define a condition with the name "Space used" as a copy of "FileSystem space used", enter:
```
mkcondition -c "FileSystem space used"  "Space used"
```
To define a condition with the name "var space used" as a copy of "tmp space used", but change the selection to /var, enter:
```
mkcondition -c "tmp space used" -s 'Name=="/var"' \
"var space used"
```

To define a condition with the name "vmstat is running" to monitor when user joe is running the vmstat program in a 64-bit environment, enter:

```
mkcondition -r "IBM.Program" \
-e "Processes.CurPidCount > 0" -E "Processes.CurPidCount <= 0" \
-d "Generate event when user starts vmstat" \
-D "Restart monitoring when vmstat is terminated" \
-s "ProgramName == \"vmstat64\" && Filter==\"ruser==\\\"joe\\\"\"" \
-S "i" -m "l" "vmstat is running"
```

To define a condition with the name "myscript terminated" to monitor when a script has ended, enter:
```
mkcondition -r "IBM.Program" \
-e "Processes.CurPidCount <= 0" -E "Processes.CurPidCount > 0" \
-d "Generate event when myscript is down" \
-D "Rearm the event when myscript is running" \
-s "ProgramName == \"ksh\" && Filter == 'args[1]==\"/home/joe/myscript\"'" \
-m "l" "myscript terminated"
```
In this example, args represents the array of argument strings that was passed to main. Because this is an array, args[1] references the first argument after the program name. Use the ps -el command to determine the ProgramName. See the lsrsrcdef command for more information.
To batch together a maximum of 20 events at a time that come from a sensor named DBInit in 60-second intervals, enter:
```
mkcondition -r "IBM.Sensor" \
-e "Int32 < 0" -E "Int32 > 0"  -b 60,20 \
-s "Name == \"DBInit\""  "DBInit Sensor"
```
To define a condition with the name tmp space used to check for percentage of space used greater than 90% for /tmp for at least seven out of the last 10 observations, including comments, enter:
```
mkcondition -r IBM.FileSystem \
-e "PercentTotUsed > 90 __QUAL_COUNT(7,10)" \
-d "Generate event when tmp > 90% full for 7 out of 10 last observations" \
-s 'Name=="/tmp"' "tmp space used"
```

To define a condition with the name adapter stability to check for adapter status that has changed four times within one minute, including comments, enter:

```
mkcondition -r IBM.NetworkInterface \
-e "OpState != OpState@P __QUAL_RATE(4,60)" \
-d "Generate event when OpState is changed 4 times within 1 minute" \
"adapter stability"
```

To define a condition for a batched event called "tmp space used" that checks for percentage of space used by /tmp greater than 90%, with a batch interval of 5 seconds and a batch event file retention period of 72 hours, enter:
```
mkcondition -r IBM.FileSystem -e "PercentTotUsed > 90" -b 5,,72 -s 'Name=="/tmp"' "tmp space used"
```
To define a condition called "tmp space used" that checks for percentage of space used by /tmp greater than 90%, with audit logging enabled only in case of errors, enter:
```
mkcondition -r IBM.FileSystem -e "PercentTotUsed > 90" -g 1 -s 'Name=="/tmp"' "tmp space used"
```

These examples apply to management domains:

To define a condition with the name "FileSystem space used" to check for percentage of space used greater than 90%, to rearm when the percentage is back down below 85%, and to monitor all nodes in the domain, run this command on the management server:
```
mkcondition -r IBM.FileSystem -e "PercentTotUsed > 90" \
-E "PercentTotUsed < 85" -m m "FileSystem space used"
```
To define a condition with the name "FileSystem space used" to check for percentage of space used greater than 90%, to rearm when the percentage is back down below 85%, and to monitor nodes nodeA and nodeB in the domain, run this command on the management server:
```
mkcondition -r IBM.FileSystem -e "PercentTotUsed > 90" \
-E "PercentTotUsed < 85" -n nodeA,nodeB -m m \
"FileSystem space used"
```
To define a condition with the name "nodeB FileSystem space used" on nodeB to check for percentage of space used greater than 90%, to rearm when the percentage is back down below 85%, and to monitor the condition with local scope, run this command on the management server:
```
mkcondition -r IBM.FileSystem -e "PercentTotUsed > 90" \
-E "PercentTotUsed < 85" -m l -p nodeB \
"nodeB FileSystem space used"
```
To define a condition with the name "local FileSystem space used" to check for percentage of space used greater than 90%, to rearm when the percentage is back down below 85%, and to monitor the local node, run this command on a managed node:
```
mkcondition -r IBM.FileSystem -e "PercentTotUsed > 90" \
-E "PercentTotUsed < 85" -m l "local FileSystem space used"
```

These examples apply to peer domains:

To define a condition on nodeA with the name "FileSystem space used" to check for percentage of space used greater than 90%, to rearm when the percentage is back down below 85%, and to monitor all nodes in the domain, run this command:
```
mkcondition -r IBM.FileSystem -e "PercentTotUsed > 90" \
-E "PercentTotUsed < 85" -m p -p nodeA "FileSystem space used"
```
To define a condition on nodeC with the name "FileSystem space used" to check for percentage of space used greater than 90%, to rearm when the percentage is back down below 85%, and to monitor nodes nodeA and nodeB in the domain, run this command:
```
mkcondition -r IBM.FileSystem -e "PercentTotUsed > 90" \
-E "PercentTotUsed < 85" -n nodeA,nodeB -m p -p nodeC \
"FileSystem space used"
```
To define a condition with the name "local FileSystem space used" on nodeB to check for percentage of space used greater than 90%, to rearm when the percentage is back down below 85%, and to monitor the local node only, run this command:
```
mkcondition -r IBM.FileSystem -e "PercentTotUsed > 90" \
-E "PercentTotUsed < 85" -m l -p nodeB "local FileSystem space used"
```

Location

/opt/rsct/bin/mkcondition