The result set from the SQL query run by the trashcan-cleaner job can cause performance issues or OOM crashes when the repository contains a large number of deleted nodes.
The query returns all children of the archive workspace at each execution and doesn't take the trashcan configuration settings into account.
- Enable the trashcan cleaner job
- Set the keepPeriod to 2 hours
- Set the deleteBatchCount to 50
- Set the cron to 15 minute intervals
- Enable SQL tracing (e.g. with P6Spy) to capture the result set being returned
- Restart Alfresco
- Delete a large amount of content (e.g. 10,000 nodes) so that it ends up in the trashcan
The SQL result set should reflect the job's settings. For example, if the keepPeriod is P30D (30 days), the result set should be restricted accordingly, and also limited by the batch count of 50.
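
To make the expected behaviour concrete, here is a minimal sketch of the selection the job's settings imply: only nodes archived before `now - keepPeriod` are eligible, and at most `deleteBatchCount` of them are returned per run. The class `TrashcanBatchSketch`, the `ArchivedNode` type and the `selectBatch` method are hypothetical names invented for this illustration and are not part of Alfresco or the trashcan-cleaner module. The in-memory filtering is for illustration only; in the real job the same restriction would presumably need to be applied in the SQL query itself so that the full archive is never loaded.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

public class TrashcanBatchSketch {

    /** Hypothetical stand-in for an archived node; not an Alfresco class. */
    public static final class ArchivedNode {
        final long id;
        final Instant archivedDate;

        ArchivedNode(long id, Instant archivedDate) {
            this.id = id;
            this.archivedDate = archivedDate;
        }
    }

    /**
     * Returns at most deleteBatchCount nodes whose archive date is older
     * than the keep period, oldest first.
     */
    public static List<ArchivedNode> selectBatch(List<ArchivedNode> archived,
                                                 Duration keepPeriod,
                                                 int deleteBatchCount,
                                                 Instant now) {
        Instant cutoff = now.minus(keepPeriod);
        return archived.stream()
                .filter(n -> n.archivedDate.isBefore(cutoff))
                .sorted(Comparator.comparing((ArchivedNode n) -> n.archivedDate))
                .limit(deleteBatchCount)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2024-01-31T00:00:00Z");
        List<ArchivedNode> archived = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            // Even ids were archived 40 days ago, odd ids only 1 hour ago.
            Instant when = (i % 2 == 0)
                    ? now.minus(Duration.ofDays(40))
                    : now.minus(Duration.ofHours(1));
            archived.add(new ArchivedNode(i, when));
        }

        // keepPeriod P30D and deleteBatchCount 50, as in the example above.
        List<ArchivedNode> batch =
                selectBatch(archived, Duration.parse("P30D"), 50, now);
        System.out.println(batch.size()); // prints 50
    }
}
```

With 10,000 archived nodes, of which 5,000 are older than the keep period, this selection yields 50 rows per execution rather than all 10,000.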
Regardless of the settings, all archive nodes are returned at each execution of the job. With larger result sets, this can fill the heap and trigger excessive garbage collection or even OOM errors.
The customer identified the following in mitigation:
- this line in our code
- this issue raised against the package on GitHub