  1. Service Packs and Hot Fixes
  2. MNT-7181

Uploading « corrupted files » while using SHARE makes the system unresponsive. JODConverter issue and/or "soffice" issue.

    Details

    • Type: Service Pack Request
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 3.2 R, 3.4
    • Fix Version/s: 4.2
    • Component/s: Repository
    • Labels:
      None

      Description

      Problem starts with 3.2R.

      Symptom:
      Almost 100% of the CPU is used by « soffice » and the system becomes very slow and unresponsive.

      How to reproduce?:
      Install an AOB 3.4.0 version.
      Configure JODConverter like this:

      jodconverter.enabled=true
      jodconverter.officeHome=C:/Alfresco/openoffice/App/openoffice
      jodconverter.portNumbers=8101,8102,8103,8104,8105,8106,8107  

      jodconverter.maxTasksPerProcess=10

      # timeouts are in milliseconds
      jodconverter.taskExecutionTimeout=5000
      jodconverter.taskQueueTimeout=30000
      ooo.enabled=false

      Create a site.
      In SHARE, upload the attached set of corrupted documents. (Duplicate some if required to have a set of at least 10 docs.)
      Switch to "detailed view" and slowly move the mouse over the document icons; you can also open the "document details" page.

      Observed result:
      CPU utilisation goes to 100% and stays there for a very long period, more than 15 minutes if not forever.
      The system becomes very slow and unresponsive because all the CPU is taken by the "soffice" processes.

      Other observations:
      Lowering jodconverter.taskExecutionTimeout to 50000 did not help, and lowering jodconverter.taskQueueTimeout to 30000 did not help either. I was not able to find a setting that copes with this situation.
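For readers unfamiliar with what an execution timeout is meant to achieve, here is a minimal Python sketch of the general idea: run a task in a child process and kill it if it exceeds a deadline. It is not how JODConverter is implemented; `run_with_timeout` and the busy-loop child are hypothetical stand-ins for a stuck conversion.

```python
import subprocess
import sys


def run_with_timeout(command, timeout_seconds):
    """Run command; return True if it finished in time, False if it was killed."""
    try:
        # subprocess.run kills the child and re-raises when the timeout expires.
        subprocess.run(command, timeout=timeout_seconds, check=False)
        return True
    except subprocess.TimeoutExpired:
        return False


if __name__ == "__main__":
    # Stand-in for a stuck conversion: a child process that spins forever.
    busy_loop = [sys.executable, "-c", "while True: pass"]
    print(run_with_timeout(busy_loop, 0.5))
```

A timeout of this kind only helps if the stuck process really is terminated when the deadline passes, which is the behaviour the reporter did not observe here.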

      Remarks:
      To prevent this situation, we could avoid calling transformations on documents marked "NITF" during the indexing process (see method indexProperty in ADMLuceneIndexerImpl). Documents are "marked" because we simply index "NITF" in place of the content that we failed to obtain.
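The remark above could be sketched roughly as follows. This is an illustration in Python, not the Alfresco code: `should_transform`, `index_documents`, and the dictionary of previously indexed text are hypothetical, and only the "NITF" marker comes from the ticket.

```python
NOT_INDEXED_MARKER = "NITF"  # placeholder indexed when content could not be obtained


def should_transform(indexed_text):
    """Return False for documents whose indexed content is only the failure marker."""
    return indexed_text.strip() != NOT_INDEXED_MARKER


def index_documents(documents, transform):
    """Call transform(name) only for documents not already marked as failed.

    documents maps a document name to the text previously indexed for it.
    """
    results = {}
    for name, indexed_text in documents.items():
        if should_transform(indexed_text):
            results[name] = transform(name)
        else:
            results[name] = None  # skipped: an earlier attempt already failed
    return results
```

The point of the design is that a document which already failed once is never handed to the converter again, so a corrupted file can trigger at most one expensive attempt.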

        Attachments

          Activity

          apriscott Alex Priscott added a comment -

          I believe we have the exact same problem, and it is causing corruption of (Mac Office) files over CIFS and, in some cases, deletion of the original file on CIFS, leading to an overall impression of a flaky CIFS service.

          What I believe is happening is as follows:

          1. You have a very large Word document on CIFS
          2. You open it and make some changes
          3. You save the document
          4. At the same time while the document is saving, you go into the Share interface and bring up that document's details page (this triggers the preview generation and/or thumbnail generation)
          5. The system CPU is then highly utilized by the preview and/or thumbnail generation and so it doesn't respond to Word in a timely manner
          6. Word then generates a message that it cannot save due to errors.
          7. Sometimes the original file is deleted because the saving process of a Word document involves deleting the original file and renaming the current .tmp file to have the same name as the deleted file. If the renaming process fails, you end up with just the .tmp file and the original file is deleted from the repository.
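Step 7 can be reproduced in miniature. The following Python sketch assumes the delete-then-rename save sequence described above; `save_by_replace` and `failing_rename` are hypothetical names, and the failing rename simply simulates a server that does not answer in time.

```python
import os
import tempfile


def save_by_replace(original_path, tmp_path, rename=os.rename):
    """Delete the original, then rename the temp file into its place."""
    os.remove(original_path)
    try:
        rename(tmp_path, original_path)
    except OSError:
        # The original is already gone; only the .tmp file is left.
        return False
    return True


def failing_rename(src, dst):
    """Simulate a rename that fails because the server is too busy."""
    raise OSError("simulated timeout from a busy server")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        original = os.path.join(folder, "report.doc")
        tmp = os.path.join(folder, "report.tmp")
        for path, text in ((original, "old"), (tmp, "new")):
            with open(path, "w") as handle:
                handle.write(text)
        save_by_replace(original, tmp, rename=failing_rename)
        print(sorted(os.listdir(folder)))
```

Because the delete happens before the rename, any failure in between leaves the folder without a file under the original name.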

          cnguyen Chi Nguyen added a comment -

          Hmm, for some reason JIRA thinks I was Alex Priscott. That last comment was made by me (I had to log out and log back in).

          cnguyen Chi Nguyen added a comment -

          We need to run these processes at a lower priority than the core Alfresco/Tomcat process.
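One way to picture the suggestion, on a POSIX host only, is to start the office process through the standard `nice` utility. The Python sketch below just builds such a command line; `with_lower_priority` is a hypothetical helper, and how a wrapper like this would be wired into Alfresco or JODConverter is not covered by this ticket.

```python
def with_lower_priority(command, niceness=10):
    """Prefix command with the POSIX `nice` utility to lower its CPU priority.

    This sketch only accepts 0..19: higher values mean lower priority, and
    negative values (higher priority) would need elevated privileges.
    """
    if not 0 <= niceness <= 19:
        raise ValueError("niceness must be between 0 and 19")
    return ["nice", "-n", str(niceness)] + list(command)
```

The returned list could be passed to `subprocess.Popen`. On Windows, as used in the reproduction steps, a different mechanism would be needed.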

          adavis Alan Davis added a comment - - edited

          This is a bug in OpenOffice 3.2. If either of these files is loaded interactively into OpenOffice, a dialogue is displayed that indicates that the files are corrupt and asks whether an attempt should be made to recover them. If NO is selected, the OpenOffice window locks up and a CPU core goes to 100%. The process must be killed to bring the usage back down.

          The good news, however, is that LibreOffice 3.5 does not have this issue. There is no spike in CPU. I have not been able to find a reference to the fixed OpenOffice/LibreOffice bug. The Alfresco Community version 4.2.a uses LibreOffice 3.5. Both files load successfully in Alfresco 4.2.a. It is possible to preview the ppt, and a thumbnail is created in the document library. The content does not look that bad but definitely is corrupted. The same is true of the xls, but it is more corrupted and appears mainly as pages of text.

          Although not a supported option on 3.4.x (this would be considered a functional change), it is possible to reconfigure 3.4.10 onwards to point at LibreOffice 3.5 rather than OpenOffice by setting the following Alfresco global properties (Enterprise customers normally use jod rather than ooo):

          ooo.exe=D:/Alfresco4.2.a/libreoffice/App/libreoffice/program/soffice.exe
          ooo.enabled=false
          ooo.port=8100

          jodconverter.enabled=true
          jodconverter.officeHome=D:/Alfresco4.2.a/libreoffice/App/libreoffice
          jodconverter.portNumbers=8101

          alfrescoqa Alfresco QA Team added a comment -

          The issue is not reproduced on Alfresco Enterprise v4.2.0 (r52916-b183), Tomcat, PostgreSQL, Java (all installer deployed), LibreOffice
          TatianaK


            People

            • Assignee:
              closedbugs Closed Bugs
              Reporter:
              pdubois Philippe Dubois
            • Votes:
              3
              Watchers:
              7

                Time Tracking

                Estimated: Not Specified
                Remaining: 0 minutes
                Logged: 1 week, 4 days, 7 hours