IBM Support

ITM Situation to Monitor Stale files

Question & Answer


Question

How to alert when a file is not changing?

Answer

Linux or Unix:

Create a situation in the Unix OS group or the Linux OS group using the File Information attribute group. Here is an example of what is needed:




The time last changed needs to be less than or equal to the current timestamp minus 10 minutes. When that is true, the file is stale and an alert is generated. The cell function is "Compare Time to a time + or - delta". You will get an initial dialog box for another time compare style that needs to be closed. When the function is selected again you can choose the compare test.
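The comparison itself is simple and can be illustrated outside ITM. The following is a minimal Python sketch of the same test, not the ITM implementation; the function name `is_stale` and the sample timestamps are made up for illustration.

```python
from datetime import datetime, timedelta


def is_stale(last_changed: datetime, now: datetime,
             stale_after: timedelta = timedelta(minutes=10)) -> bool:
    """Hypothetical helper: True when last_changed <= now - stale_after."""
    return last_changed <= now - stale_after


# Sample clock reading, chosen only for illustration.
now = datetime(2018, 6, 17, 12, 0, 0)

# Changed 15 minutes ago: older than the 10 minute threshold, so stale.
print(is_stale(datetime(2018, 6, 17, 11, 45, 0), now))

# Changed 5 minutes ago: still fresh.
print(is_stale(datetime(2018, 6, 17, 11, 55, 0), now))
```

Note that the test uses "less than or equal", so a file changed exactly 10 minutes ago already counts as stale.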

In the formula, the Last Changed Time attribute is in Agent time, while Local_Time.Timestamp is in TEMS time [See Note 1]. If the Agent and the TEMS it connects to are in the same time zone, this works as expected. If they are in different time zones, the formula must be changed. For example, in a test environment the hub TEMS showed 3:43am and the agent showed 5:43am. To achieve the expected results, the time comparison value had to be '#'Local_Time.Timestamp'+110M'. That works because the hub time plus 2 hours, or 120 minutes, equals the agent time. Subtract the ten-minute stale threshold and you get 110 minutes.
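The arithmetic behind the 110 minute value can be checked with a short Python sketch. The helper `timestamp_delta` is hypothetical and only reproduces the calculation described above; it returns the delta suffix and does not assert the exact spacing or quoting the Situation Editor expects.

```python
from datetime import datetime


def timestamp_delta(tems_now: datetime, agent_now: datetime,
                    stale_minutes: int) -> str:
    """Hypothetical helper: delta to apply to the TEMS timestamp.

    offset = agent clock minus TEMS clock, in minutes
    delta  = offset minus the stale threshold
    """
    offset = round((agent_now - tems_now).total_seconds() / 60)
    delta = offset - stale_minutes
    return f"{delta:+d}M"


# Values from the test environment described above (the date is arbitrary).
hub = datetime(2018, 6, 17, 3, 43)
agent = datetime(2018, 6, 17, 5, 43)
print(timestamp_delta(hub, agent, 10))   # +110M

# Agent and TEMS in the same time zone: the plain 10 minute test.
print(timestamp_delta(hub, hub, 10))     # -10M
```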

The Agent time versus TEMS time can be a bothersome issue which prevents a single situation from working across all agent instances. To avoid this issue, configure remote TEMS instances in each time zone. This can also ease daylight saving time transitions.

Another issue appears if the system clocks on the agents and the TEMS differ slightly. ITM itself handles such cases when the difference is a minute or two. Most large environments have a time synchronization process such as Network Time Protocol in Linux/Unix. If time synchronization is not running or is misconfigured, then the cross-system time checks may not work as expected.

Another issue is what sampling interval to use. The actual time of day a situation formula is evaluated is unpredictable. For sampled situations, the maximum delay between the condition occurring and the situation event being created is twice the sampling interval. That means you have to reduce the sampling interval to get an event sooner. On the other hand, tests with *TIME can place a heavy load on the TEMS. Every single file in the directory will be sent to the TEMS, and that can result in overload and TEMS instability. That effect is reduced if there are fewer files in the directory. These types of situations should be carefully tested before committing to production.
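As a rough planning aid, the trade-off can be put into numbers. The sketch below assumes that the "condition" is the file crossing the stale threshold, so the worst case from the last file change to the event is the threshold plus twice the sampling interval. The function name and the sample intervals are illustrative only.

```python
def worst_case_alert_minutes(stale_minutes: int,
                             sampling_interval_minutes: int) -> int:
    """Hypothetical estimate: minutes from last file change to event.

    Assumes the event may lag the stale condition by up to twice
    the sampling interval, as stated for sampled situations.
    """
    return stale_minutes + 2 * sampling_interval_minutes


# 10 minute stale test with a few candidate sampling intervals.
for interval in (1, 5, 15):
    print(interval, worst_case_alert_minutes(10, interval))
```

A shorter interval tightens the worst case but increases how often file rows are sent to the TEMS.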

The last issue involves situation evaluation timing. Agent processing is largely sequential and is never precise as to timing. For example, if the agent has many situations to evaluate at one point in time, any particular situation may be delayed, and the transmission of results may be delayed as well. The same issue may prevent timely evaluation at the TEMS the agent reports to. Communication delays may further delay processing. Because of these factors, an alert may be raised or not raised unexpectedly. Therefore, consider setting a longer stale time test to avoid false alerts.

Note:
    1) Wild cards [asterisks] cannot be used in this context: the file name must be specific.
    2) Use both a path and a file attribute: the file attribute comparison does not include the path.
    3) No action command is allowed.

If (3) is important, use a helper situation like this:



Two tests are required because a situation with a single situation comparison is invalid.

The first test is always true [Local Time attribute group Time attribute hhmmss]. The second test is against the base situation IBM_stale_file PMR. This helper situation must have the same distribution and sampling interval as the base situation. In this helper situation you can add an action command and have attributes from the base to substitute in the action command.

Note: The Sampling interval is defined in the Situation Editor on the Formula tab. It defines how often the situation formula should be evaluated. The minimum time is 30 seconds and the maximum is 999 days.

The IBM_stale_file situation xml dump is attached for Linux OS agents. It can be installed with the command:
    tacmd createSit -i IBM_stale_file.xml

The IBM_stale_file_unix situation xml dump is attached for Unix OS Agents. It can be installed with the command:
    tacmd createSit -i IBM_stale_file_unix.xml


Windows:

Create a situation in the Windows OS group using the File Trend attribute group. Here is an example of what is needed:





The rest of the discussion is the same.

The IBM_stale_windows situation xml dump is attached. It can be installed with the command:

tacmd createSit -i IBM_stale_windows.xml


Note 1:

In most cases Local_Time.Timestamp refers to the local time at the agent. However, the usage in the examples refers to TEMS local time.

*AND *TIME Linux_File_Information.Last_Changed_Time *LE 'Local_Time.Timestamp - 10M'

From the standpoint of the agent, the presence of the *TIME causes the potential result rows to be transmitted to the TEMS without more filtering. The TEMS dataserver [SQL] component recognizes the construction and performs the test of the result row Linux_File_Information.Last_Changed_Time against the current TEMS Timestamp less 10 minutes.

The examples here use the Local Time attribute group. In addition, attributes like Linux_File_Information.Last_Changed_Time are in local time. It would be possible to use the Universal Time attribute group, but then you would need to work out the correct displacement between an agent time attribute in local time and a universal time value at the TEMS the agent connects to, so the issue is not simplified.

The only possible simplification is to have a remote TEMS in the same time zone as the agents where this sort of situation is being run. Then a single situation could work for agents in different time zones.


Document Information

Modified date:
17 June 2018

UID

swg21393829