[MNT-10812] Enormous SQL SELECT load when doing a verbose query for all audit data Created: 28-Feb-14  Updated: 03-Sep-14  Resolved: 29-Jul-14

Status: Closed
Project: Service Packs and Hot Fixes
Component/s: Auditing, Database and Persistence
Affects Version/s: 4.2
Fix Version/s: 4.2.3

Type: Service Pack Request
Reporter: Antonio Soler-Asenci [X] (Inactive) Assignee: Closed Bugs
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: 0 minutes
Time Spent: 4 hours, 30 minutes
Original Estimate: Not Specified

Attachments: PNG File Alfresco-AuditSQL-withPropClassCached.png     PNG File Alfresco-AuditSQL.png     PNG File Alfresco-PropClassTable.png     File createcontent.js     Text File threaddump(2).txt    
Issue Links:
Related
relates to ALF-15582 Performance issues with Audit Query API Closed
relates to MNT-10067 Cleanup alf_prop_XXX data Closed
relates to MNT-10070 Provide audit filter capabilities for... Closed
Bug Priority:
Category 3
ACT Numbers:

00149037

Build Location: http://releases.alfresco.com/Enterprise%204.2/4.2.3/build-00050/

 Description   

Different jobs to clean up or process audit data (i.e. to turn them into notification mails) have to process the entire set of audit data for a specific audit application.

During their run, they generate a lot of SELECT statements (several millions) on the alf_prop_class table which has fewer than 20 values and can easily be cached.

This has already been described in ALF-15582 where Derek Hulley correctly identified the main improvement to be made: the PropertyDAOImpl needs to have its propertyClassCache configured. Unfortunately, that bug has been closed as "Not a Bug". Since we see major improvements in SQL load / performance when the cache is used (see YourKit profiler screenshots) and using paging is suggested only "masks" the problem instead of fixing it, we do not agree with that specific resolution. We request that the trivial change to properly activate the propertyClassCache be included in future versions / service packs. Alfresco users / customers should not have to customize the DAO configuration to activate an improvement that affects all Audit / AttributeService use cases (not just mass data query).

Steps to reproduce:
1) Activate alfresco-access auditing on global properties
audit.enabled=true
audit.alfresco-access.enabled=true

2) Perform a bulk content generation / load of at least several tens of thousand nodes (see attached script)

3) Start a SQL / Java profiling tool to capture SQL selects
on my case I used this process
https://wiki.alfresco.com/wiki/Driver-side_DB_Query_Profiling

4) Perform an audit query (e.g. curl -u admin:admin "http://localhost:8080/alfresco/service/api/audit/query/alfresco-access?verbose=true&forward=false&limit=0")

NOTES: The most significant query is the select * from alf_prop_class where id = ? query with hundreds of thousands or even millions of calls.
Expectation: Since the alf_prop_class table has very few entries, these should be cached by Alfresco and there should be at most a couple of dozen calls to the query - regardless of the amount of audit data, as the SQL call is part of the post-processing of results from the main audit query.


Generated at Mon Aug 10 17:51:11 BST 2020 using Jira 7.13.15#713015-sha1:7c5ddd2c3e1709974ae9c48c17df8edd3919fe2c.