Service Packs and Hot Fixes / MNT-11343

CLONE - SOLR: Long reindex times when deleting or changing sites env with 15K sites

    Details

      Description

      STEPS TO REPRODUCE:
      1) Create 15k sites.
      2) Create 350 users.
      3) Add all users to every site.
      4) Delete 100 sites and upload a couple of documents. This causes all users to be re-indexed, and Solr is stuck for many hours. New documents cannot be searched until Solr has caught up with the transactions that follow this deletion.

      EXPECTED BEHAVIOR:
      New documents should be searchable shortly after being added to the system (timescale of minutes).

      OBSERVED BEHAVIOR:
      In the dev environment, re-indexing took approx. 2 hours per deleted site. On their prod environment, the combined impact of all the deletions was 5 days.

      ANALYSIS
      During indexing there was a high network load between Solr and Alfresco (2 to 10 MB/s). We enabled Solr debug logging for a short time
      (log attached). It seems that, for each user, Solr indexes all possible paths. For each site there are 6 paths per user:
      1) /system/authorities/GROUP_site_{site}/…
      2) /system/zones/APP.SHARE/GROUP_site_{site}/…
      3) /system/zones/AUTH.ALF/GROUP_site_{site}/…
      4) /system/authorities/GROUP_site_{site}/GROUP_site_{site}_{role}/…
      5) /system/zones/APP.SHARE/GROUP_site_{site}/GROUP_site_{site}_{role}/…
      6) /system/zones/AUTH.ALF/GROUP_site_{site}/GROUP_site_{site}_{role}/…
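
      The six path patterns listed above can be expanded for a concrete site to see what one user contributes per site. This is only an illustrative sketch: the helper name `paths_for`, the site name "demo" and the role "SiteManager" are hypothetical values chosen for the example, and the trailing "…" is left elided exactly as in the list above.

```python
# Path templates copied from the analysis; the trailing "…" is kept as-is
# because the report does not spell out what follows it.
PATH_TEMPLATES = [
    "/system/authorities/GROUP_site_{site}/…",
    "/system/zones/APP.SHARE/GROUP_site_{site}/…",
    "/system/zones/AUTH.ALF/GROUP_site_{site}/…",
    "/system/authorities/GROUP_site_{site}/GROUP_site_{site}_{role}/…",
    "/system/zones/APP.SHARE/GROUP_site_{site}/GROUP_site_{site}_{role}/…",
    "/system/zones/AUTH.ALF/GROUP_site_{site}/GROUP_site_{site}_{role}/…",
]


def paths_for(site: str, role: str) -> list[str]:
    """Expand the six templates for one site and one role (hypothetical helper)."""
    # str.format ignores the unused `role` argument for templates 1-3.
    return [template.format(site=site, role=role) for template in PATH_TEMPLATES]


if __name__ == "__main__":
    # "demo" and "SiteManager" are example values, not taken from the report.
    for path in paths_for("demo", "SiteManager"):
        print(path)
```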

      In our case this means each user has 6 * 15,000 = 90,000 paths for the site groups alone. In addition, there will be more paths for the other groups of which the user is a member. The same applies to the configurations/preferences child nodes of each user (2 extra nodes per user), which gives 3 nodes per user. For 350 users and 15k sites this means more than 350 * 3 * 6 * 15,000 = 94.5 million path fields have to be indexed. We suspect this is the cause of the very slow indexing we experience.
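
      The estimate above can be reproduced with a few lines of arithmetic. This is a back-of-the-envelope sketch using only the figures from this report; the function and constant names are hypothetical, and it counts site-group paths only, so it is a lower bound.

```python
SITES = 15_000          # sites in the environment
USERS = 350             # users, each a member of every site
PATHS_PER_SITE = 6      # the six path patterns per user and site
NODES_PER_USER = 3      # user node + 2 configurations/preferences child nodes


def site_group_paths_per_node(sites: int = SITES) -> int:
    """Site-group paths carried by a single user-related node."""
    return PATHS_PER_SITE * sites


def total_path_fields(users: int = USERS, sites: int = SITES) -> int:
    """Lower bound on path fields indexed when every user is re-indexed."""
    return users * NODES_PER_USER * site_group_paths_per_node(sites)


if __name__ == "__main__":
    print(site_group_paths_per_node())   # 90000
    print(total_path_fields())           # 94500000
```

      Under the same assumptions, the count barely changes after 100 sites are deleted (`total_path_fields(sites=14_900)` gives 93,870,000), which is consistent with a full re-index of all users being almost as expensive after the deletion as before it.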

      People

      Assignee: closedbugs Closed Bugs (Inactive)
      Reporter: asolerasenci Antonio Soler-Asenci [X] (Inactive)