QRadar: Auto Updates Can Interrupt Network Activity Data Collection

Flashes (Alerts)

Abstract

FLASH NOTICE: An issue has been identified that impacts flow processing on 17xx, 18xx, and Console appliances. Administrators are encouraged to notify users that a service restart is required to correct this issue. This is an important notice to administrators with Network Activity or flow data.

Content

Urgency

IMPORTANT: Administrators should schedule a Deploy Full Configuration from the Admin tab in QRadar to correct the flow issue described in this flash notice. Alternatively, restarting the ECS-EC service on affected appliances with QFlow processors also corrects the issue, in which the flow processor thread is stopped. This article outlines the issue and the options available to administrators.

Summary

On Monday, July 10, 2017, we identified an issue that can cause a Network Activity (flow) data outage when the weekly auto update adds new QFlow items to the QRadar Identifier (QID) map. This week's auto update (WAU Version Serial 1499356784) was the first time QFlow QID map entries were added by a QRadar automatic update.

If you are not using QRadar Automatic Updates, or if you do not have flows in your deployment, you are not affected by this issue. Administrators who complete a 'Deploy Full Configuration' after receiving a QRadar auto update are unlikely to experience this issue. QRadar support recommends that all administrators review their system notifications or QRadar logs to verify whether any flow processor components in ECS-EC are impacted.

An APAR is pending for this issue. This article will be updated with an APAR link for administrators to track this issue.

Affected Products and Versions

QRadar appliances with QFlow process components at 7.2.x or 7.3.x (all patch levels) are affected by this issue:

Affected Appliances

QRadar Consoles (31xx)
All-in-One Consoles (31xx-C)
QRadar Flow Processors (17xx)
QRadar Combination Event/Flow Processors (18xx)

Not Affected

QRadar Network Insights (19xx)
QFlow Collectors (12xx and 13xx)
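
The lists above can be expressed as a small lookup, which might be useful when checking many hosts. This is only an illustrative sketch: the function name appliance_flow_status is hypothetical and is not part of QRadar, and the patterns assume that appliance types are reported as four-digit numbers such as 3199 or 1801.

```shell
#!/bin/bash
# Illustrative helper (not an IBM tool): classify a four-digit appliance
# type against the affected / not affected lists in this flash notice.
appliance_flow_status() {
    case "$1" in
        # Consoles (31xx), Flow Processors (17xx), Event/Flow Processors (18xx)
        31[0-9][0-9]|17[0-9][0-9]|18[0-9][0-9]) echo "affected" ;;
        # Network Insights (19xx), QFlow Collectors (12xx and 13xx)
        19[0-9][0-9]|12[0-9][0-9]|13[0-9][0-9]) echo "not affected" ;;
        # Any type that this notice does not list
        *) echo "unknown" ;;
    esac
}
```

For example, appliance_flow_status 1801 prints "affected". An appliance type in an affected family still needs the log check described below, because only appliances that logged the thread exception require a restart.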

How to Diagnose this QFlow Process Issue

Administrators can determine whether they are experiencing this flow processing issue from the user interface or from the command line.

From the user interface:
Administrators who experience this issue can check whether System Notifications is generating a WARN: Performance Degradation message. System Notifications can be viewed by all administrators or by users with the 'View System Notifications' user role in QRadar.

- System Notification: Performance degradation has been detected in the event pipeline. Events were routed directly to storage.
- QRadar Identifier No.: 38750088
- Hover Text Message: Flow Support Filter has sent a total of ### flows directly to storage. ### flow(s) have been sent in the last 60 seconds. Queue is at 100% capacity. The current incoming raw flow rate: ###.## fps is currently exceeding the ###.## fps license set on the system.
  

Optionally, if the administrator is away or someone has recently cleared the system notifications, the notification can still be found by searching for its QID with the Quick Filter: 38750088.

From the command-line interface:
Administrators with root access can use the all_servers.sh support script to check the QRadar logs for error messages related to FlowSupportProcessingThread. The goal of this procedure is to identify appliances that are experiencing the flow thread exception so that ECS-EC can be restarted on the individual affected appliances, without completing a 'Deploy Full Configuration'.

Error string from the QRadar logs:

/var/log/qradar.error:Jul 11 10:43:57 ::ffff:IP ADDRESS [ecs-ec] [FlowSupportProcessingThread_1] com.q1labs.frameworks.core.ThreadExceptionHandler: [ERROR] [NOT:0000003000][IP ADDRESS/- -] [-/- -]Exception was uncaught in thread: FlowSupportProcessingThread_1

/var/log/qradar.error:Jul 11 10:43:57 ::ffff:IP ADDRESS [ecs-ec] [FlowSupportProcessingThread_2] com.q1labs.frameworks.core.ThreadExceptionHandler: [ERROR] [NOT:0000003000][IP ADDRESS/- -] [-/- -]Exception was uncaught in thread: FlowSupportProcessingThread_2

Procedure

1. Using SSH, log in to the QRadar Console as the root user.
2. To identify appliances with this issue, type:
  /opt/qradar/support/all_servers.sh -C "grep -r FlowSupportProcessingThread /var/log/qradar* | grep ERROR"
  
  NOTE: It might take several minutes for this command to return results. The command returns the IP address of each appliance where FlowSupportProcessingThread generated an error in the QRadar logs. When this command is run, an error message that references the qradar.java.debug log might be displayed; administrators can ignore it.
3. Any appliance that is experiencing this issue returns the text 'Exception was uncaught in thread: FlowSupportProcessingThread'. For example:
  
  [root@Console] /opt/qradar/support/all_servers.sh -C "grep -r FlowSupportProcessingThread /var/log/qradar* | grep ERROR"
  IP ADDRESS -> hostname.example.com
  Appliance Type: 3199 Product Version: 7.2.8.20170530170730
  11:21:13 up 14:17, 5 users, load average: 6.77, 7.47, 5.21
  ------------------------------------------------------------------------
  IP ADDRESS -> hostname.example.com
  Appliance Type: 1801 Product Version: 7.2.8.20170530170730
  11:21:13 up 14:24, 1 user, load average: 9.28, 4.93, 2.37
  ------------------------------------------------------------------------
  grep: /var/log/qradar.java.debug: No such file or directory
  /var/log/qradar.error:Jul 11 10:43:57 ::ffff:IP ADDRESS [ecs-ec] [FlowSupportProcessingThread_1] com.q1labs.frameworks.core.ThreadExceptionHandler: [ERROR] [NOT:0000003000][IP ADDRESS/- -] [-/- -]Exception was uncaught in thread: FlowSupportProcessingThread_1
  /var/log/qradar.error:Jul 11 10:43:57 ::ffff:IP ADDRESS [ecs-ec] [FlowSupportProcessingThread_2] com.q1labs.frameworks.core.ThreadExceptionHandler: [ERROR] [NOT:0000003000][IP ADDRESS/- -] [-/- -]Exception was uncaught in thread: FlowSupportProcessingThread_2
  /var/log/qradar.log:Jul 11 10:43:57 ::ffff:IP ADDRESS [ecs-ec] [FlowSupportProcessingThread_1] com.q1labs.frameworks.core.ThreadExceptionHandler: [ERROR] [NOT:0000003000][IP ADDRESS/- -] [-/- -]Exception was uncaught in thread: FlowSupportProcessingThread_1
  /var/log/qradar.log:Jul 11 10:43:57 ::ffff:IP ADDRESS [ecs-ec] [FlowSupportProcessingThread_2] com.q1labs.frameworks.core.ThreadExceptionHandler: [ERROR] [NOT:0000003000][IP ADDRESS/- -] [-/- -]Exception was uncaught in thread: FlowSupportProcessingThread_2
  
  Note: In this example the 3199 appliance is not affected as no exceptions were returned from the logs; however, the 1801 appliance returned exception errors and will require a service restart to resolve this issue.
4. Only appliances that return exceptions are affected by the flow processor issue.
5. Administrators should note the IP address or hostnames of the affected appliances.
6. See the remediation section below for instructions on restarting ECS-EC on individual appliances.
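
On a single appliance, the matching log entries can also be summarized per thread. The following is a minimal sketch and not an IBM-provided utility: the function name count_flow_thread_errors is hypothetical, and it assumes the log lines end with the thread name, as in the error strings shown above.

```shell
#!/bin/bash
# Hypothetical helper: read QRadar log lines on standard input and print
# each FlowSupportProcessingThread that logged an uncaught exception,
# followed by the number of matching lines.
count_flow_thread_errors() {
    grep 'Exception was uncaught in thread: FlowSupportProcessingThread' \
        | sed 's/.*Exception was uncaught in thread: //' \
        | sort \
        | uniq -c \
        | awk '{print $2, $1}'
}
```

A possible use is count_flow_thread_errors < /var/log/qradar.error when logged in to the appliance. If nothing is printed, no uncaught flow thread exceptions were found in that file.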

Remediation for Administrators

Administrators can complete either procedure outlined below. Administrators who do not want to complete a Deploy Full Configuration can identify which hosts are affected (see the diagnosis procedure above) and then use 'Option 2' to restart ECS-EC on those appliances to resolve this issue.

Option 1: Complete a Deploy Full Configuration
1. Log in to the QRadar user interface as an administrator.
2. Click the Admin tab.
3. Click Advanced > Deploy Full Configuration.
  
  Results
  After the deploy completes and the services restart, administrators can remove the system notification for the pipeline degradation or monitor their Flow appliances for QID 38750088. For any additional questions, contact QRadar Support or ask in our forums: http://ibm.biz/qradarforums.

Option 2: Restart the ECS-EC Service on an Individual Appliance
1. Using SSH, log in to the QRadar Console as the root user.
2. SSH to the appliance with the affected flow processor.
3. Type one of the following commands, based on version:
  - For QRadar 7.2: service ecs-ec restart
  - For QRadar 7.3: systemctl restart ecs-ec
4. Optional. Use the all_servers utility to restart ECS-EC on the affected appliances from the Console.
  
  For example, to restart ECS-EC on individual managed hosts with flow processes:
  For QRadar 7.2: /opt/qradar/support/all_servers.sh -I IPADDRESS,IPADDRESS,IPADDRESS "service ecs-ec restart"
  or
  For QRadar 7.3: /opt/qradar/support/all_servers.sh -I IPADDRESS,IPADDRESS,IPADDRESS "systemctl restart ecs-ec"
  or
  For QRadar 7.2: /opt/qradar/support/all_servers.sh -a '18%' "service ecs-ec restart"
  or
  For QRadar 7.3: /opt/qradar/support/all_servers.sh -a '17%' "systemctl restart ecs-ec"
  
  NOTE: The -I option in the commands above is an upper-case letter i, not a lower-case L; the two can look alike depending on how your browser displays fonts. IPADDRESS,IPADDRESS,IPADDRESS is the comma-separated list of affected appliances.
  
  Results
  After the service is restarted administrators can remove the system notification for the pipeline degradation or monitor their Flow appliances for QID 38750088. For any additional questions, contact QRadar Support or ask in our forums: http://ibm.biz/qradarforums.
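
Because the restart command differs between 7.2 and 7.3, administrators who script this step might select the command from the product version. The sketch below is illustrative only: the function name ecs_ec_restart_command is hypothetical, and it assumes that the version string begins with 7.2 or 7.3, as in the 'Product Version' field shown by all_servers.sh.

```shell
#!/bin/bash
# Hypothetical helper: print the ECS-EC restart command that this notice
# gives for a QRadar version string (for example, 7.2.8.20170530170730).
ecs_ec_restart_command() {
    case "$1" in
        7.2|7.2.*) echo "service ecs-ec restart" ;;
        7.3|7.3.*) echo "systemctl restart ecs-ec" ;;
        *)
            # Versions outside 7.2.x and 7.3.x are not covered by this notice.
            echo "unsupported version: $1" >&2
            return 1
            ;;
    esac
}
```

For example, ecs_ec_restart_command 7.3.0 prints the systemctl form. The function only prints the command, so an administrator can review it before running it on an affected appliance.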
