Uploaded image for project: 'Service Packs and Hot Fixes'
  1. Service Packs and Hot Fixes
  2. MNT-10812

Enormous SQL SELECT load when doing a verbose query for all audit data

    Details

      Description

      Different jobs to clean up or process audit data (i.e. to turn them into notification mails) have to process the entire set of audit data for a specific audit application.

      During their run, they generate a lot of SELECT statements (several millions) on the alf_prop_class table which has fewer than 20 values and can easily be cached.

      This has already been described in ALF-15582 where Derek Hulley correctly identified the main improvement to be made: the PropertyDAOImpl needs to have its propertyClassCache configured. Unfortunately, that bug has been closed as "Not a Bug". Since we see major improvements in SQL load / performance when the cache is used (see YourKit profiler screenshots) and using paging is suggested only "masks" the problem instead of fixing it, we do not agree with that specific resolution. We request that the trivial change to properly activate the propertyClassCache be included in future versions / service packs. Alfresco users / customers should not have to customize the DAO configuration to activate an improvement that affects all Audit / AttributeService use cases (not just mass data query).

      Steps to reproduce:
      1) Activate alfresco-access auditing on global properties
      audit.enabled=true
      audit.alfresco-access.enabled=true

      2) Perform a bulk content generation / load of at least several tens of thousand nodes (see attached script)

      3) Start a SQL / Java profiling tool to capture SQL selects
      on my case I used this process
      https://wiki.alfresco.com/wiki/Driver-side_DB_Query_Profiling

      4) Perform an audit query (e.g. curl -u admin:admin "http://localhost:8080/alfresco/service/api/audit/query/alfresco-access?verbose=true&forward=false&limit=0")

      NOTES: The most significant query is the select * from alf_prop_class where id = ? query with hundreds of thousands or even millions of calls.
      Expectation: Since the alf_prop_class table has very few entries, these should be cached by Alfresco and there should be at most a couple of dozen calls to the query - regardless of the amount of audit data, as the SQL call is part of the post-processing of results from the main audit query.

        Attachments

        1. threaddump(2).txt
          160 kB
        2. createcontent.js
          0.6 kB
        3. Alfresco-PropClassTable.png
          Alfresco-PropClassTable.png
          144 kB
        4. Alfresco-AuditSQL-withPropClassCached.png
          Alfresco-AuditSQL-withPropClassCached.png
          139 kB
        5. Alfresco-AuditSQL.png
          Alfresco-AuditSQL.png
          182 kB

          Issue Links

            Structure

              Activity

                People

                • Assignee:
                  closedbugs Closed Bugs
                  Reporter:
                  asolerasenci Antonio Soler-Asenci [X] (Inactive)
                • Votes:
                  0 Vote for this issue
                  Watchers:
                  5 Start watching this issue

                  Dates

                  • Created:
                    Updated:
                    Resolved:

                    Time Tracking

                    Estimated:
                    Original Estimate - Not Specified
                    Not Specified
                    Remaining:
                    Remaining Estimate - 0 minutes
                    0m
                    Logged:
                    Time Spent - 4 hours, 30 minutes
                    4h 30m

                      Structure Helper Panel