@Christopher Detzel,
Here is the reason:
The main issue was that the MX's results tablespace was almost full.
The MX was constantly generating system events about the tablespace running low on space.
- When a tablespace is running low on space, a system event is generated. For example: The Tablespace RESULTS is running low on space. Warning Threshold of 85% has exceeded. Used space: 86%
- When running a DAS scan, we check the tablespace, and if its used space is over the warning threshold, this system event is issued.
- This system event is aggregated, and when many scans run simultaneously, a StaleObjectException is thrown on the SystemEventAggregationItem.
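To make the threshold check concrete, here is a minimal Python sketch of the kind of comparison described in the bullets above. It is only an illustration and not the MX's actual code: the function name `tablespace_warning` and its parameters are hypothetical, and it assumes "exceeded" means strictly greater than the threshold. The message text and the 85% value are taken from the example event.

```python
from typing import Optional

WARNING_THRESHOLD_PCT = 85  # value taken from the example event above


def tablespace_warning(name: str, used_bytes: int, total_bytes: int,
                       threshold_pct: int = WARNING_THRESHOLD_PCT) -> Optional[str]:
    """Return a warning message when usage exceeds the threshold, else None.

    Hypothetical helper: the real check runs inside the MX and may differ.
    """
    used_pct = used_bytes * 100 // total_bytes  # whole-percent usage
    if used_pct <= threshold_pct:
        return None  # enough space left, no system event
    return (f"The Tablespace {name} is running low on space. "
            f"Warning Threshold of {threshold_pct}% has exceeded. "
            f"Used space: {used_pct}%")
```

Calling `tablespace_warning("RESULTS", 86, 100)` produces the message quoted in the example, while a tablespace at exactly 85% returns `None` under the assumption stated above.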
For some reason, these system events were sent at a high rate (they are sent daily at 6:00 and on each scan run), which caused collisions between them (system events and results tablespace). This is the root cause of the issue.
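For readers who have not run into a stale-object error before, the collision can be pictured with a small Python sketch of optimistic locking, where every write checks a version number. This is a generic illustration under my own assumptions, not the MX or Hibernate implementation: `Store`, `AggregationRow` and `StaleObjectError` are made-up names standing in for the aggregated event row and the exception.

```python
from dataclasses import dataclass


class StaleObjectError(Exception):
    """Raised when a row was changed by someone else since it was read."""


@dataclass
class AggregationRow:
    count: int = 0    # how many times the event was aggregated
    version: int = 0  # bumped on every successful write


class Store:
    """In-memory stand-in for a table with a version column (hypothetical)."""

    def __init__(self) -> None:
        self._row = AggregationRow()

    def read(self) -> AggregationRow:
        # Return a detached copy, roughly what a session would hold.
        return AggregationRow(self._row.count, self._row.version)

    def write(self, updated: AggregationRow) -> None:
        # The write only succeeds if the version that was read is still current.
        if updated.version != self._row.version:
            raise StaleObjectError("row was updated by another transaction")
        self._row = AggregationRow(updated.count, updated.version + 1)


store = Store()

# Two scans read the same aggregated event before either one writes.
scan_a = store.read()
scan_b = store.read()

scan_a.count += 1
store.write(scan_a)  # succeeds, stored version goes 0 -> 1

scan_b.count += 1
try:
    store.write(scan_b)  # still carries version 0, so it is rejected
    collided = False
except StaleObjectError:
    collided = True
```

The more scans that try to update the same aggregated event at the same moment, the more likely one of them is holding an outdated version when it writes, which matches the failure pattern described above.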
Solution / Workaround
The bug is known to be resolved starting with v12. However, it should be noted that the fix evidently only pre-configures the Assessment Thresholds to 1,000.
The steps below should be used as general troubleshooting steps when dealing with the errors mentioned above:
1. Create a new policy that contains the same configuration as a policy that fails with the hibernate errors listed above.
   - Next, run the scan.
   - If it fails, continue with step 2 and onward.
   - If it does not fail, chances are the problem has to do with what is currently stored for the scan. E.g., purging scans for the scan you replicated may allow it to continue without error.
2. Change the assessment results purging definitions to be "by size" and keep 2-3 scan runs (Admin -> Maintenance -> Assessment Results Archive -> Purge Definitions).
3. Reduce the result records threshold to 1000 (Admin -> System Definitions -> Management Server Settings -> Assessments).
4. Consider deleting old, unused scans.
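As a rough picture of what keeping only 2-3 scan runs means for the stored results, here is a small Python sketch. It is an assumption-laden illustration, not how the MX purge job is implemented: `split_runs_for_purge` and its arguments are hypothetical, and the real setting is changed in the UI path given above.

```python
from typing import List, Tuple


def split_runs_for_purge(run_ids_oldest_first: List[int],
                         keep: int = 3) -> Tuple[List[int], List[int]]:
    """Return (kept, purged) so that only the `keep` newest runs remain.

    Hypothetical helper: run IDs are assumed to be ordered oldest to newest.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    kept = run_ids_oldest_first[-keep:]    # the newest `keep` runs
    purged = run_ids_oldest_first[:-keep]  # everything older than that
    return kept, purged
```

With five stored runs and `keep=3`, the two oldest runs end up in the purged list, which is the effect that shrinks what is stored for each scan.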
------------------------------
Ira Miga
Imperva
Knowledge Engineer
------------------------------
Original Message:
Sent: 06-17-2020 11:43
From: Christopher Detzel
Subject: What do you do when all DAS scans suddenly failed?
This question has been asked more than once of our support team, so I thought I would share it with you. @Ira Miga will answer it with her thoughts.
The following error was displayed, when we clicked on the "information" next to "Failed" status in the scan's history tab:
The following error is displayed, only when running the scan (the scan almost instantly failed):
#DataRiskAnalytics(formerlyCounterBreach)
------------------------------
Christopher Detzel
Community Manager
Imperva
------------------------------