IBM Tivoli Netcool/OMNIbus, Version 7.4

Best practices for creating triggers

When you create or modify triggers, ensure that the triggers are as efficient as possible, and have the shortest possible execution time.

A trigger has exclusive access to the ObjectServer database for the duration of its execution. By minimizing the execution time of a trigger, you can free up time for other triggers or clients that require access to the database. It is especially important to reduce the execution time of database triggers because they interrupt the database operation that caused them to fire, and so slow down that operation. For example, a pre-insert trigger on the alerts.status table fires for every new event. So, if an event flood occurs, the trigger is executed many times in quick succession.

The ObjectServer records the amount of time that each trigger uses during each granularity period and saves the details in the $NCHOME/omnibus/log/servername_trigger_stats.logn file. Use this file to identify which triggers are using the most time, prioritize which triggers to review, and monitor the system. In general, review a trigger if it uses more than 3 seconds of time every 60 seconds (that is, the default granularity period).

Whenever you update your triggers, review the log file to verify that your changes do not cause a degradation in performance.

Use the following guidelines to improve the performance of your triggers.

Avoid table scans in database triggers

Table scans are expensive operations and can occur when SQL statements such as FOR EACH ROW are applied to a database table. In a database trigger, the cost of these scans is high if the trigger fires frequently and if the table has a large number of rows. For example, suppose you change the deduplication trigger on the alerts.status table so that every time the trigger fires, it scans alerts.status for rows that match a set of criteria. In this case, the scalability of the system is limited, because the database trigger takes longer and longer as the number of rows to be scanned increases. Also avoid nested scans.

You can use the following techniques to avoid the table scan in database triggers:
  • Perform the scan in a temporal trigger that is written so that one scan can match many rows. See the generic_clear trigger in $NCHOME/omnibus/etc/automation.sql for an example.
  • If using a lookup table to enrich events, access the lookup table by using its primary key, as described further on. The use of the primary key results in a direct lookup of the row rather than a scan (V7.2, or later). You can also limit the size of the lookup table. The number of rows that are acceptable for a lookup table is site-specific. It also depends on factors such as how often the lookup table is accessed, and hardware performance.
  • Access a lookup table by using an index.
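The first technique can be sketched as follows. This is a hypothetical example (the trigger name, group, and threshold are invented); see the generic_clear trigger in $NCHOME/omnibus/etc/automation.sql for the authoritative pattern. A single periodic scan services many rows at once, instead of one scan per inserted event:

-- Hypothetical sketch: one scan every 60 seconds that can match many rows
create or replace trigger flash_stale_criticals
group madeup_triggers
priority 10
every 60 seconds
begin
	for each row stale in alerts.status where stale.Severity = 5
		and stale.LastOccurrence < (getdate() - 600)
	begin
		-- Mark critical events that have not recurred for 10 minutes
		set stale.Flash = 1;
	end;
end;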

Avoid the EVALUATE clause

When a trigger contains an EVALUATE clause, a temporary table is created to hold the results of the SELECT statement in the EVALUATE clause. The amount of time and resources that this temporary table consumes depends on the number of columns that are selected and the number of rows matched by the condition in the WHERE clause.

In most cases, you can replace the EVALUATE clause with a FOR EACH ROW clause. The FOR EACH ROW clause reads over the data and does not incur the overhead of creating a temporary table.

A suitable use for an EVALUATE clause is when a GROUP BY clause is being applied to an SQL query.
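The two forms can be compared with a hypothetical pair of temporal triggers. The trigger names, and the use of the Grade column to flag events, are invented for illustration:

-- Hypothetical EVALUATE form: a temporary table holds the SELECT results
create or replace trigger flag_criticals_evaluate
group madeup_triggers
priority 10
every 60 seconds
evaluate select Serial, Identifier from alerts.status where Severity = 5
bind as crit
begin
	update alerts.status via crit.Identifier set Grade = 1;
end;

-- Equivalent FOR EACH ROW form: reads over the rows directly,
-- without the overhead of a temporary table
create or replace trigger flag_criticals_foreach
group madeup_triggers
priority 10
every 60 seconds
begin
	for each row crit in alerts.status where crit.Severity = 5
	begin
		set crit.Grade = 1;
	end;
end;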

Avoid excessive use of the WRITE INTO statement for logging out to file

When triggers are deployed in production environments, comment out or remove WRITE INTO statements. It is advisable to limit the use of WRITE INTO statements to development environments, for debugging triggers. The quantity of data that is logged during debugging might cause a bottleneck in production environments.

Determine what is suitable for your system. For example, if logging is called infrequently, there is probably no issue. However, if logging is called multiple times per INSERT statement (for example, within a nested loop), a bottleneck can occur.
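For example, a debugging statement might be left in place but commented out for production. The file name, message, and CREATE FILE declaration below are hypothetical, and the CREATE FILE options for your system may differ:

-- Declared once, for development use only (hypothetical path):
-- create file debug_log '/tmp/trigger_debug.log';

-- Inside the trigger body, disable the logging before deploying
-- the trigger to production:
-- write into debug_log values ('set_hostname fired for Node: ', new.Node);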

Where possible, use the primary key when modifying rows

If the primary key of a database table is used in the WHERE clause of an UPDATE statement, the row is accessed by using direct lookup, rather than a table scan. For example:

update alerts.status set Severity = Severity + 1 where Identifier = tt.Identifier;
Note: The VIA keyword is no longer required in V7.2, or later. The following command (which uses VIA) is equivalent to the preceding command:
update alerts.status via tt.Identifier set Severity = Severity + 1;

Use indexes when using lookup tables

In V7.2, or later, the ObjectServer uses an index to access rows in a table if the primary key is used in a FOR EACH ROW statement.

This functionality is most useful where an ObjectServer table is being used as a lookup table, for example to enrich events. In such a case, design the table and triggers to access the lookup table by its primary keys, to prevent costly full table scans. For example:

create table alerts.iplookup persistent
(
	IpAddr		varchar(32) primary key,
	HostName	varchar(8),
	Owner		varchar(40)
);

create or replace trigger set_hostname
group madeup_triggers
priority 10
before insert on alerts.status
for each row
begin
	-- Access the lookup table using the primary key
	for each row tt in alerts.iplookup where tt.IpAddr = new.Node
	begin
		set new.Hostname = tt.HostName;
	end;
end;

Review and modify triggers produced from migrating from V3.6

If you migrated from V3.6 to V7.2.1 as part of an upgrade to the current version, the V7.2.1 migration tool produced best-effort replications of the V3.6 triggers. After the upgrade to the current version, these triggers are functionally correct, but might not perform efficiently. Review and modify these triggers as follows:
  • The V3.6 ObjectServer supported only temporal triggers, while V7.0 or later, includes database and signal triggers. The processing that is performed by a temporal trigger in V3.6 might be better suited to a database trigger in V7.0 or later. Because the triggers are converted only on a like-for-like basis, identify which triggers can be better implemented by using the new trigger types.
  • Where V3.6 triggers have the condition select *, the migration tool implements the condition as an EVALUATE clause, where all the columns in the alerts.status table are selected. Where possible, replace the EVALUATE clause with a FOR EACH ROW statement.
  • When migrating from V3.6, the migration tool also creates generic clear triggers that work in the same manner as in V3.6. However, the triggers supplied in V7.0, or later, are more efficient. Therefore, it is advisable to use the V7.0 or later triggers, which are by default disabled, instead of using the triggers migrated from V3.6.

Use the generic_clear trigger as a basis for correlation type triggers

The standard generic_clear trigger (see $NCHOME/omnibus/etc/automation.sql) correlates resolution events with their related problem events. After this trigger runs, all matched rows have their severity set to 0, in readiness for removal by the delete_clears automation. If you need different types of correlation trigger, base these triggers on the generic_clear trigger.

The standard generic_clear trigger does not use the EVALUATE clause to select the events. Instead it uses the FOR EACH ROW construct to loop over the events to populate a temporary table with the problem events. Because this temporary table contains only a subset of the events in the alerts.status table, the cost of the update operation that is applied to relate the problems with resolutions is reduced. Additionally, because the identifier of the problem event is stored in the temporary table, the problem events can be updated directly in alerts.status by using the UPDATE VIA command to perform a direct lookup on the row; this takes advantage of the Identifier field being a primary key.
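In outline, and with the first pass simplified away (the loop variable is invented for illustration; the shipped trigger in automation.sql is the definitive version), the pattern is:

-- Hypothetical outline of the generic_clear pattern:
-- 1. One FOR EACH ROW pass over the resolution events populates a
--    temporary table of matching problem Identifiers.
-- 2. A second pass clears each problem row by direct lookup, because
--    Identifier is the primary key of alerts.status:
for each row problem in alerts.problem_events
begin
	update alerts.status via problem.Identifier set Severity = 0;
end;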

Use deduplication to clear events where possible

The deduplication trigger can be used to clear problem events with the incoming resolution event when there is a one-to-one mapping between the problem and resolution. The following modifications are required to your existing system:
  • Write the probe rules so that the problem and resolution events have the same identifier.
  • Modify the deduplication trigger so that when it fires it checks the Type field. If the type of the incoming event is set to 2 (resolution), set the severity of the existing event to 0.

This approach reduces the amount of processing that the generic_clear trigger must perform, leaving it to resolve only the cases where a single resolution event can clear many problem events.
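A sketch of such a modified deduplication trigger follows. This is an abbreviated, hypothetical fragment; base any real change on the deduplication trigger supplied in automation.sql, which also maintains Tally, LastOccurrence, and other fields:

-- Hypothetical fragment of a modified deduplication trigger
create or replace trigger deduplication
group default_triggers
priority 1
before reinsert on alerts.status
for each row
begin
	-- Normal deduplication processing (Tally, LastOccurrence, and so on)
	-- is omitted here.
	if (new.Type = 2) then
		-- The incoming event is a resolution: clear the existing
		-- problem event that it deduplicates against.
		set old.Severity = 0;
	end if;
end;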

Notes on creating automations

Use the following best practice guidelines when you create new automations:
  • Confirm whether an automation currently exists before attempting to create new automations with the same function.
  • In any WHERE conditions within the automation, use the guidelines for reordering of predicates in SQL queries. For example, compare integers, compare characters, and then compare regular expressions. For more information, see Optimization rules for SQL queries.
  • Ensure that the automation trigger does not pick up events that were already processed, especially in the case of external scripts.
  • For temporal triggers, set the firing interval of different triggers to prevent them from being activated together.
  • Add a description to all newly created automations.
  • If an automation modifies events in the ObjectServer database, it can also add a journal entry to record the change.
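For the final point, a journal entry can be added with the jinsert procedure. The loop condition, UID value, and journal text below are hypothetical:

-- Hypothetical fragment: after modifying an event, record why.
-- jinsert takes the Serial of the event, the UID of the user making
-- the change, the time of the change, and the journal text.
for each row updated in alerts.status where updated.Grade = 1
begin
	execute jinsert( updated.Serial, 0, getdate(), 'Escalated by automation' );
end;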

Test your changes

After new triggers are developed and validated, test the performance of the triggers as follows:
  1. Ensure that the data on which you run the tests is representative of the production system.
  2. Ensure that the number of rows in any table that the trigger accesses is representative of the production system.
  3. Measure the effect on system performance by using profiling and by collecting trigger statistics.