Service Packs and Hot Fixes
  MNT-3965

Deadlock during reindexing when many threads are running (e.g. LDAP sync)

    Details

      Description

      A customer running LDAP sync with 8 threads at the same time as reindexing finds that indexing threads sometimes deadlock: at least one thread hangs in waitForHeadOfQueue() while another thread hangs in setStatus(). An example thread dump is below.

      Thread 'indexTrackerThread2', process 'server0', index '357'
      "indexTrackerThread2" Id=488 IN_WAIT_WITH_TIMEOUT
      cpu=170610.0 ms (system=38790.0 / user=131820.0) allocated=34267227704 B
      user="cifs/sap.corp@SAP.CORP" requestId="1"
      application="sap.com/com.sap.ca.alfresco.alfresco-internal.ear"
      Thread is in wait() operation with timeout: waiting on object monitor org.alfresco.repo.node.index.AbstractReindexComponent$ReindexWorkerRunnable (addr=0x0000000001f41d58)
      at java.lang.Object.wait(J)V(Native Method)
      at org.alfresco.repo.node.index.AbstractReindexComponent$ReindexWorkerRunnable.waitForHeadOfQueue()V(AbstractReindexComponent.java:831)
      at org.alfresco.repo.node.index.AbstractReindexComponent$ReindexWorkerRunnable.handleQueue()V(AbstractReindexComponent.java:993)
      at org.alfresco.repo.node.index.AbstractReindexComponent$ReindexWorkerRunnable.afterCommit()V(AbstractReindexComponent.java:962)
      at org.alfresco.repo.transaction.AlfrescoTransactionSupport$TransactionSynchronizationImpl.afterCompletion(I)V(AlfrescoTransactionSupport.java:814)
      at org.springframework.transaction.support.TransactionSynchronizationUtils.invokeAfterCompletion(Ljava.util.List;I)V(TransactionSynchronizationUtils.java:133)
      at org.springframework.transaction.support.AbstractPlatformTransactionManager.invokeAfterCompletion(Ljava.util.List;I)V(AbstractPlatformTransactionManager.java:904)
      at org.springframework.transaction.support.AbstractPlatformTransactionManager.triggerAfterCompletion(Lorg.springframework.transaction.support.DefaultTransactionStatus;I)V(AbstractPlatformTransactionManager.java:879)
      at org.springframework.transaction.support.AbstractPlatformTransactionManager.processCommit(Lorg.springframework.transaction.support.DefaultTransactionStatus;)V(AbstractPlatformTransactionManager.java:707)
      at org.springframework.transaction.support.AbstractPlatformTransactionManager.commit(Lorg.springframework.transaction.TransactionStatus;)V(AbstractPlatformTransactionManager.java:632)
      at org.springframework.transaction.interceptor.TransactionAspectSupport.commitTransactionAfterReturning(Lorg.springframework.transaction.interceptor.TransactionAspectSupport$TransactionInfo;)V(TransactionAspectSupport.java:314)
      at org.alfresco.util.transaction.SpringAwareUserTransaction.commit()V(SpringAwareUserTransaction.java:467)
      at org.alfresco.repo.transaction.RetryingTransactionHelper.doInTransaction(Lorg.alfresco.repo.transaction.RetryingTransactionHelper$RetryingTransactionCallback;ZZ)Ljava.lang.Object;(RetryingTransactionHelper.java:349)
      at org.alfresco.repo.node.index.AbstractReindexComponent$ReindexWorkerRunnable.run()V(AbstractReindexComponent.java:880)
      at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Ljava.lang.Runnable;)V(ThreadPoolExecutor.java:651)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run()V(ThreadPoolExecutor.java:676)
      at java.lang.Thread.run()V(Thread.java:666)

      Thread 'indexTrackerThread1', process 'server0', index '356'
      "indexTrackerThread1" Id=487 IN_WAIT_WITHOUT_TIMEOUT
      cpu=52390.0 ms (system=13150.0 / user=39240.0) allocated=6560897856 B
      user="cifs/sap.corp@SAP.CORP" requestId="1"
      application="sap.com/com.sap.ca.alfresco.alfresco-internal.ear"
      Thread is in wait() operation without a timeout: waiting on object monitor java.lang.Object (addr=0x00002aacedb05170)
      at java.lang.Object.wait(J)V(Native Method)
      at java.lang.Object.wait()V(Object.java:474)
      at org.alfresco.repo.search.impl.lucene.index.IndexInfo.setStatus(Ljava.lang.String;Lorg.alfresco.repo.search.impl.lucene.index.TransactionStatus;Ljava.util.Set;Ljava.util.Set;)V(IndexInfo.java:1348)
      at org.alfresco.repo.search.impl.lucene.AbstractLuceneBase.setStatus(Lorg.alfresco.repo.search.impl.lucene.index.TransactionStatus;)V(AbstractLuceneBase.java:267)
      at org.alfresco.repo.search.impl.lucene.AbstractLuceneIndexerImpl.prepare()I(AbstractLuceneIndexerImpl.java:499)
      at org.alfresco.repo.search.impl.lucene.AbstractLuceneIndexerAndSearcherFactory.prepare()I(AbstractLuceneIndexerAndSearcherFactory.java:810)
      at org.alfresco.repo.transaction.AlfrescoTransactionSupport$TransactionSynchronizationImpl.beforeCommit(Z)V(AlfrescoTransactionSupport.java:695)
      at org.springframework.transaction.support.TransactionSynchronizationUtils.triggerBeforeCommit(Z)V(TransactionSynchronizationUtils.java:48)
      at org.springframework.transaction.support.AbstractPlatformTransactionManager.triggerBeforeCommit(Lorg.springframework.transaction.support.DefaultTransactionStatus;)V(AbstractPlatformTransactionManager.java:835)
      at org.springframework.transaction.support.AbstractPlatformTransactionManager.processCommit(Lorg.springframework.transaction.support.DefaultTransactionStatus;)V(AbstractPlatformTransactionManager.java:645)
      at org.springframework.transaction.support.AbstractPlatformTransactionManager.commit(Lorg.springframework.transaction.TransactionStatus;)V(AbstractPlatformTransactionManager.java:632)
      at org.springframework.transaction.interceptor.TransactionAspectSupport.commitTransactionAfterReturning(Lorg.springframework.transaction.interceptor.TransactionAspectSupport$TransactionInfo;)V(TransactionAspectSupport.java:314)
      at org.alfresco.util.transaction.SpringAwareUserTransaction.commit()V(SpringAwareUserTransaction.java:467)
      at org.alfresco.repo.transaction.RetryingTransactionHelper.doInTransaction(Lorg.alfresco.repo.transaction.RetryingTransactionHelper$RetryingTransactionCallback;ZZ)Ljava.lang.Object;(RetryingTransactionHelper.java:349)
      at org.alfresco.repo.node.index.AbstractReindexComponent$ReindexWorkerRunnable.run()V(AbstractReindexComponent.java:880)
      at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Ljava.lang.Runnable;)V(ThreadPoolExecutor.java:651)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run()V(ThreadPoolExecutor.java:676)
      at java.lang.Thread.run()V(Thread.java:666)
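
Both threads in the dump are parked in Object.wait() rather than blocked on a monitor, so a lock-cycle check such as ThreadMXBean.findDeadlockedThreads() is unlikely to flag them. A minimal sketch of an alternative check follows: it lists threads that are in a waiting state and have a given method on their stack. The class name WaitingThreadScan and its helper are hypothetical, not part of Alfresco, and the code would have to run inside the same JVM as the repository to see the index tracker threads.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical diagnostic helper, not part of Alfresco. */
final class WaitingThreadScan {

    /**
     * Returns the names of threads that are WAITING or TIMED_WAITING
     * and have a frame with the given method name on their stack.
     */
    static List<String> findWaitingIn(String methodName) {
        List<String> hits = new ArrayList<>();
        // dumpAllThreads captures state and stack trace together for each thread.
        ThreadInfo[] infos = ManagementFactory.getThreadMXBean().dumpAllThreads(false, false);
        for (ThreadInfo info : infos) {
            Thread.State state = info.getThreadState();
            if (state != Thread.State.WAITING && state != Thread.State.TIMED_WAITING) {
                continue;
            }
            for (StackTraceElement frame : info.getStackTrace()) {
                if (frame.getMethodName().equals(methodName)) {
                    hits.add(info.getThreadName());
                    break;
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Method names taken from the thread dump above.
        System.out.println("in waitForHeadOfQueue: " + findWaitingIn("waitForHeadOfQueue"));
        System.out.println("in setStatus: " + findWaitingIn("setStatus"));
    }
}
```

A single hit is not proof of a hang, since a worker can legitimately wait in either method for a short time. Seeing the same thread names in both lists across several samples taken some seconds apart would be a stronger signal that the condition described here has occurred.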

          Activity

          Steve Rigby added a comment -

          For retest in 3.3.5 (and other versions as listed in Fix Versions list)

          Alfresco QA Team added a comment -

          Could not reproduce against 3.2.0 (.s5 21) performing multithreaded CRUD operations during LDAP sync (triggered via JMX console)

          Alfresco QA Team added a comment -

          Could not reproduce against 3.1.2 (.a8 478)

          Alfresco QA Team added a comment -

          Couldn't reproduce against 3.3.5 (340)

          Alfresco QA Team added a comment -

          Could not reproduce against 3.3.5 (366)

          Alfresco QA Team added a comment -

          Validated against 3.4.1.234

          Alfresco QA Team added a comment -

          Could not reproduce against 3.3.4 (.9 16)


            People

            • Assignee: Closed Bugs
            • Reporter: dward