Different jobs to clean up or process audit data (i.e. to turn them into notification mails) have to process the entire set of audit data for a specific audit application.
During their run, they generate a lot of SELECT statements (several millions) on the alf_prop_class table which has fewer than 20 values and can easily be cached.
This has already been described in
ALF-15582 where Derek Hulley correctly identified the main improvement to be made: the PropertyDAOImpl needs to have its propertyClassCache configured. Unfortunately, that bug has been closed as "Not a Bug". Since we see major improvements in SQL load / performance when the cache is used (see YourKit profiler screenshots) and using paging is suggested only "masks" the problem instead of fixing it, we do not agree with that specific resolution. We request that the trivial change to properly activate the propertyClassCache be included in future versions / service packs. Alfresco users / customers should not have to customize the DAO configuration to activate an improvement that affects all Audit / AttributeService use cases (not just mass data query).
Steps to reproduce:
1) Activate alfresco-access auditing on global properties
2) Perform a bulk content generation / load of at least several tens of thousand nodes (see attached script)
3) Start a SQL / Java profiling tool to capture SQL selects
on my case I used this process
4) Perform an audit query (e.g. curl -u admin:admin "http://localhost:8080/alfresco/service/api/audit/query/alfresco-access?verbose=true&forward=false&limit=0")
NOTES: The most significant query is the select * from alf_prop_class where id = ? query with hundreds of thousands or even millions of calls.
Expectation: Since the alf_prop_class table has very few entries, these should be cached by Alfresco and there should be at most a couple of dozen calls to the query - regardless of the amount of audit data, as the SQL call is part of the post-processing of results from the main audit query.