Optimizing sweeps with FilteredQueryTimeout
The use of the FilteredQueryTimeout property can optimize the performance of job sweeps and policy sweeps, depending on the sweep filter conditions.
Default sweep query behavior
A job sweep or policy sweep first issues a query to retrieve a batch of objects from a database table. The returned objects are considered candidate objects because the sweep has yet to determine which objects will be processed. To evaluate, the sweep applies the filter expression conditions on the candidate objects. The objects that satisfy the conditions are then processed by the sweep.
For example, the SQL statement is a sweep query of the Document class and all subclasses:
SELECT TOP 2000 bcn.Id, bcn.StorageArea, bcn.ReplicationGroup
FROM Document bcn WITH INCLUDESUBCLASSES
WHERE Id > {00000000-0000-0000-0000-000000000000} ORDER BY Id
The query retrieves a batch of candidate Document objects from the database. Say that the sweep is a disposal policy that has a filter expression of IsReserved=TRUE AND ContentSize > 10000. When the disposal policy evaluates the candidate objects, it deletes only instances that are in the reservation state and with content that is larger than 10,000 bytes.
FilteredQueryTimeout property
For default sweep query behavior, the FilteredQueryTimeout property is set to 0. Changing FilteredQueryTimeout to a value greater than 0 changes the sweep query behavior. That is, queries that are issued to the database now include the conditions of the filter expression in the WHERE clause.
Expanding on the previous example, using a FilteredQueryTimeout value of 30 seconds and a filter expression of IsReserved=TRUE AND ContentSize > 10000 results in the following sweep query:
SELECT TOP 2000 bcn.Id, bcn.StorageArea, bcn.ReplicationGroup, bcn.ContentSize, bcn.IsReserved
FROM Document bcn WITH INCLUDESUBCLASSES
WHERE Id > {00000000-0000-0000-0000-000000000000}
AND (IsReserved=TRUE AND ContentSize > 10000)
ORDER BY Id OPTIONS(TIMELIMIT 30)
The value of the FilterExpression property is used to expand the WHERE clause and the selection list of the query. The TIMELIMIT value of the query is set to the value of the FilteredQueryTimeout property (TIMELIMIT 30).
When the time limit is greater than 0, it specifies the acceptable time for the query to finish. The time limit is applied only to the first query of the sweep iteration. If the query time limit is exceeded, then the sweep falls back to using a query that does not contain the filter expression in the WHERE clause. The sweep continues to use the WHERE clause without the filter expression until the iteration is completed. If the query time limit is not exceeded during the first query, then the sweep continues to use the WHERE clause with the filter expression until the sweep iteration is completed.
In the administrator console, the FilteredQueryTimeout property cannot be set during the creation of a sweep job or sweep policy with a wizard. The property must be set after the sweep is created. The FilteredQueryTimeout property is listed in the Properties tab of the sweep job or sweep policy.
Multiple filter expressions
Policy sweeps can include multiple policies; for example, a sweep of the Document class can include a disposal sweep policy and a retention update sweep policy. When multiple policies are defined for a sweep, the conditions of each filter expression of each policy are combined with an OR operator and are included in the WHERE clause.
In the following example, the FilteredQueryTimeout value is set to 90 seconds for a sweep of the Document class that includes two sweep policies. The first policy defines the filter expression as IsReserved=TRUE, and the second policy defines the filter expression as ContentSize<5000000. The sweep must select objects that match either expression, by using the following SQL statement:
SELECT TOP 2000 bcn.Id, bcn.Owner, bcn.CmRetentionDate, bcn.IsReserved, bcn.DateCreated, bcn.ContentSize
FROM Document bcn WITH INCLUDESUBCLASSES
WHERE Id > {00000000-0000-0000-0000-000000000000}
AND ((IsReserved=TRUE) OR (ContentSize<5000000))
ORDER BY Id OPTIONS(TIMELIMIT 90)
Usage guidelines
The use of the FilteredQueryTimeout property is normally a trade-off between longer-running queries or evaluating candidate objects before processing them. If the FilteredQueryTimeout property is used, each query runs longer because it includes the conditions of the filter expression in the WHERE clause. However, the sweep job or policy runs faster because it need not evaluate candidate objects that are returned by the database before it processes them. If the FilteredQueryTimeout property is not used, then each query runs faster, but 100% of the instances are examined by the policy.
To determine whether the FilteredQueryTimeout property (set to a value greater than 0) provides a performance advantage for a particular sweep, follow these guidelines:
- Create the appropriate database indexes for the sweep. Without the appropriate indexes, sweep performance is not improved by including filter conditions in the sweep query. For more information, see Creating an index to improve sweep performance.
- Initially, set the time limit for the query to a fairly low value, such as 30 seconds. Adjust the value later as sweep performance dictates. For information about monitoring sweep performance, see Sweep iteration progress monitoring.