Rete rules executing in an infinite loop

Technote (troubleshooting)


Problem (Abstract)

One or more rules are firing in an infinite loop when executing with the Rete algorithm.

Symptom

The method invoked to execute a ruleset never returns and appears to keep a full CPU core busy indefinitely.

Or, while the rule execution trace is being captured, an Out Of Memory error occurs that appears to be caused by a large number of IlrRuleEventImpl instances held in heap memory.

Or, when rule execution can grow the domain object graph associated with the rule session, an Out Of Memory error occurs that appears to be caused by a large number of such domain objects held in heap memory.

Cause

The side effect of a rule's action part causes that rule or another rule to become eligible to fire. The transitive closure of this chain of rules contains a cycle that can lead to an infinite loop. The refraction principle of the Rete algorithm normally prevents such an infinite loop from happening:

A rule instance that has been fired cannot be reinserted into the agenda if no new fact has occurred, that is, if none of the objects matched by the rule are modified, or if no new object is matched by the rule.
(See in the product documentation: Rule Studio > Optimizing execution > Reference > The RetePlus algorithm > Refraction)

There are, however, certain patterns of rules that appear to defeat the refraction principle even though the rule engine is applying it correctly. Three such situations are illustrated below.

1 - Creation of new facts in the condition part of rules through "in" and "from" keywords

Normally the refraction principle of the Rete algorithm prevents a rule (or chain of rules) from firing repeatedly by keeping track of which tuples of matching objects have already been considered for each rule. But this principle can be defeated if new facts are created while rules are being evaluated, causing rules to fire again with new tuples of objects matching their conditions. This can happen through the use of the from or in IRL keywords, because the accessor method (on the right-hand side of the keyword) is executed on each evaluation or re-evaluation of the rule. Consider the following IRL rule, for example:

rule infinite_loop_rule
{
  when
  {
    obj1: SomeClass() from SomeClass.createNew();
    evaluate(someInputParam.equals("foo"));
  }
  then
  {
    someOutputParam = "bar";
    context.updateContext();
  }
};



The following points are important to note:

  1. The condition part matches instances of SomeClass that are accessed through the from IRL keyword so that whenever the rule is evaluated the method SomeClass.createNew() is invoked and a new instance is created.
  2. The condition part of the rule also checks the value of an input ruleset parameter called someInputParam. Since ruleset parameters are attached to the engine context, if the context is updated, this rule will be eligible for re-evaluation because it is interested in the value of this ruleset parameter.
  3. The action part of the rule sets the value of an output ruleset parameter (someOutputParam). It is necessary in this case to perform context.updateContext() to notify the rule engine of this change.

So when the rule above fires, the update of the context (3) makes the rule eligible for re-evaluation (2) and while it is being re-evaluated a new SomeClass instance is created (1). The result is that the rule will fire again but this time with a different value for obj1.
If the method createNew had instead returned the same object with each invocation then the refraction principle would prevent the rule from firing again because no new fact would be introduced in the rule engine context.
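
The difference between a fresh instance and a reused instance can be illustrated with a small Java sketch. This is a toy model only, not the rule engine's implementation: the class FromRefractionSketch, the method run and the identity-based set of already fired objects are assumptions made for illustration, and the evaluation cap stands in for the fact that the real loop would never end.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;
import java.util.function.Supplier;

public class FromRefractionSketch {
    /** Stand-in for the domain class matched by the rule (hypothetical). */
    static final class SomeClass {}

    /**
     * Re-evaluates a single rule up to maxEvaluations times, as if every
     * firing ended with context.updateContext(). Returns the number of firings.
     */
    static int run(Supplier<SomeClass> fromSource, int maxEvaluations) {
        // Objects the rule has already fired for, compared by identity.
        Set<SomeClass> alreadyFired =
                Collections.newSetFromMap(new IdentityHashMap<>());
        int firings = 0;
        for (int i = 0; i < maxEvaluations; i++) {
            // The "from" accessor runs again on each (re-)evaluation.
            SomeClass obj1 = fromSource.get();
            if (!alreadyFired.add(obj1)) {
                break; // same object as before: refraction blocks the firing
            }
            firings++; // new object: the rule fires and updates the context
        }
        return firings;
    }

    public static void main(String[] args) {
        SomeClass cached = new SomeClass();
        // A new instance per evaluation: fires until the cap is reached.
        System.out.println(run(SomeClass::new, 1000)); // prints 1000
        // The same instance per evaluation: fires once.
        System.out.println(run(() -> cached, 1000));   // prints 1
    }
}
```

In this model the rule only stops firing when the accessor hands back an object it has already seen, which mirrors the remark above about createNew returning the same object.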


2 - Using "refresh" type of engine update

A direct way to prevent the refraction principle from applying is to use the IRL keyword refresh when updating an object (or to use the API IlrContext.update(Object object, boolean refresh)). The refresh keyword explicitly overrides the refraction principle. When it is in use, rules can fire repeatedly with the same tuples of matching objects for as long as their tests continue to evaluate to true.
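
The effect described above can be sketched in a few lines of Java. This is a simplified model under stated assumptions, not engine code: the class RefreshSketch, the counter object and the rule "while counter < limit, increment counter and update it" are all hypothetical, and maxFirings is an artificial cap so that the sketch always terminates.

```java
public class RefreshSketch {
    /**
     * Models one rule whose condition is (counter < limit) and whose action
     * increments the counter and then updates it, with or without refresh.
     * Returns the number of firings, capped at maxFirings.
     */
    static int run(boolean refresh, int limit, int maxFirings) {
        int counter = 0;
        boolean fired = false; // record that this tuple has already fired
        int firings = 0;
        while (firings < maxFirings && counter < limit && !fired) {
            counter++;         // action part modifies the matched object
            firings++;
            fired = !refresh;  // with refresh, the firing record is discarded
        }
        return firings;
    }

    public static void main(String[] args) {
        System.out.println(run(false, 10, 1000));               // prints 1
        System.out.println(run(true, 10, 1000));                // prints 10
        System.out.println(run(true, Integer.MAX_VALUE, 1000)); // prints 1000
    }
}
```

The last call shows the risky case: with refresh, the rule keeps firing until its own side effects make the condition false, so a condition that effectively never turns false is only stopped by the cap.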


3 - Rules mutually re-enabling each other to fire

Consider the following ruleset, made of two rules that set a grade parameter. The first rule, with high priority, sets the grade to 5 if it is greater than 5. The second rule, with low priority, sets the grade to 6 if it is less than 6.

ruleset grade_infinite_loop
{
  inout java.lang.Integer grade = 7;
};

rule grade_greater_than_5
{
  priority = high;
  when
  {
    IlrContext() from context;
    evaluate(grade > 5);
  }
  then
  {
    grade = 5;
    System.out.println("grade="+grade);
    context.updateContext();
  }
};

rule grade_less_than_6
{
  priority = low;
  when
  {
    IlrContext() from context;
    evaluate(grade < 6);
  }
  then
  {
    grade = 6;
    System.out.println("grade="+grade);
    context.updateContext();
  }
};


When this ruleset executes, starting with a grade value of 7, the first rule fires and its action part makes its own condition part false (grade is now 5, so it is not strictly greater than 5). The lower priority rule then fires and sets the grade to 6: this makes its own condition part false, while making the condition part of the first rule true again. The refraction principle would normally prevent the first rule from firing again for the same tuple of objects. However, because the condition part of the first rule was made false (by its own action part) and then true again (by the action part of the second rule), the refraction principle does not apply here, so the first rule fires again. The second rule's condition has likewise become false and then true again, so it fires in turn, and so on.
One way to prevent this loop is to use "greater than or equal to" in the condition part of the first rule, or "less than or equal to" in the condition part of the second rule. This avoids having the condition part of a rule turn false as a side effect of its own action part, which is what allows another rule to make it eligible to fire again.
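
The firing sequence of the two grade rules can be traced with a small Java simulation. It is only a sketch of the behavior described above, not the engine's algorithm: the class GradeLoopSketch, the per-rule "fired" flags and the maxFirings cap are assumptions introduced for illustration.

```java
import java.util.function.IntPredicate;

public class GradeLoopSketch {
    /**
     * Toy model of the two grade rules. A rule is eligible when its condition
     * is true and it has not fired since the condition last became true; the
     * firing record is dropped when the condition turns false.
     * Returns the number of firings, capped at maxFirings.
     */
    static int run(IntPredicate highCondition, IntPredicate lowCondition,
                   int maxFirings) {
        int grade = 7;
        boolean highFired = false;
        boolean lowFired = false;
        int firings = 0;
        while (firings < maxFirings) {
            // A condition that turned false forgets the earlier firing.
            if (!highCondition.test(grade)) highFired = false;
            if (!lowCondition.test(grade)) lowFired = false;

            if (highCondition.test(grade) && !highFired) {
                grade = 5;        // action of grade_greater_than_5
                highFired = true;
            } else if (lowCondition.test(grade) && !lowFired) {
                grade = 6;        // action of grade_less_than_6
                lowFired = true;
            } else {
                break;            // no rule is eligible any more
            }
            firings++;
        }
        return firings;
    }

    public static void main(String[] args) {
        // Original conditions: loops until the cap is reached.
        System.out.println(run(g -> g > 5, g -> g < 6, 100));  // prints 100
        // "Greater than or equal to" in the first rule: stops after 2 firings.
        System.out.println(run(g -> g >= 5, g -> g < 6, 100)); // prints 2
        // "Less than or equal to" in the second rule: stops after 3 firings.
        System.out.println(run(g -> g > 5, g -> g <= 6, 100)); // prints 3
    }
}
```

In both corrected variants one of the rules keeps its condition true after firing, so in this model its firing record is never dropped and the cycle is broken.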

Diagnosing the problem

To determine which rule task and which set of rules are involved in the infinite loop, it can be useful to enable the ruleset trace in Rule Execution Server by setting the ruleset property ruleset.trace.enabled to true. Refer to the documentation:


Rule Execution Server Console online help > Viewing and managing rulesets > Predefined ruleset properties > ruleset.trace.enabled

If there is indeed an infinite loop, the execution unit (XU) log should contain a large number of rule execution statements showing the rule name, as well as the name of the rule task that was entered before the rules executed. The rule task and rule(s) involved in this execution trace should be reviewed to determine how they interact and what is causing the loop.


Resolving the problem

Modifying the rules

First identify the rule or set of rules involved in the infinite loop by enabling the ruleset execution trace.

Then review the definition of the rules involved and:

  1. in the action part of the rules, note occurrences of the IRL keywords update and modify (as well as the methods IlrContext.update() and IlrContext.updateContext()). In particular, occurrences of the keyword refresh should be reviewed carefully, as it is easy to create an infinite loop with this keyword.
  2. in the condition part of the rules, identify which rules will be eligible for re-evaluation as a result of the update notifications noted in the previous point
  3. in the condition part of the rules found to be candidates for re-evaluation, check whether the objects that are matched through navigation by the in and from IRL keywords constitute a finite set
  4. in the condition part of the rules found to be candidates for re-evaluation, check whether the tests can evaluate alternately to true and false as a side effect of the action part of the rules

If the looping situation seems to be caused by the creation of new facts in the condition part of the rule (step 3 above), it is best to postpone the creation of the new objects to the action part of the rule whenever possible. If the object must be present when the condition part is evaluated, create the object before the rule is evaluated, for example in the initial action of the rule task, or in the action part of a higher priority rule that executes before all others.

If the looping situation seems to be caused by rules mutually re-enabling each other to fire, consider whether the logic implemented by the rules is indeed correct. Also consider whether the condition part of the rules can be modified to remain true once the action part is executed, with a modification that stays true to the intended logic of the rule.

If the looping situation seems to be caused by a "refresh" keyword, make sure the use of the keyword is indeed appropriate for the situation. Then make sure that side effects in the action part of the rules will eventually make the condition part evaluate to false.

Exit criteria and runtime rule selection

For each rule task, it is possible to define the maximum number of rules that will fire, and also to define runtime rule selection criteria that can be used to prevent further rules from firing.

See in the product documentation:

Rule Studio > Orchestrating ruleset execution > Tasks > Working with ruleflows > Creating rule tasks > Setting the exit criteria


Rule Studio > Orchestrating ruleset execution > Concepts > Rule selection > Runtime rule selection

Other engine execution modes

The Sequential and FastPath algorithms perform stateless pattern matching and are not at risk of such infinite loops. If a rule task does not need to be executed with the Rete algorithm, a possible workaround is to use either of these algorithms. See in the product documentation:

Rule Studio > Optimizing execution > Concepts > Engine execution modes


Cross reference information
Segment: Business Integration
Product: IBM Operational Decision Manager
Component: (none listed)
Platform: Platform Independent
Version: 8.0.1, 8.0, 7.5
Edition: Enterprise


Document information


More support for:

WebSphere ILOG JRules
Modules:Java Engine

Software version:

7.0, 7.0.1, 7.0.2, 7.0.3, 7.1, 7.1.1

Operating system(s):

Platform Independent

Reference #:

1432462

Modified date:

2010-06-01
