  1. Service Packs and Hot Fixes
  2. MNT-8339

The content of .msg files uploaded via Share is not searchable

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Unprioritized
    • Resolution: Fixed
    • Affects Version/s: 3.4.7
    • Fix Version/s: None
    • Component/s: Installer
    • Labels:
      None
    • Environment:
      Alfresco 3.4.7, Windows 2008R2, Tomcat, SQL Server

      Description

      The content of .msg files uploaded via Share is not searchable.

      To reproduce:

      1. Upload a .msg file (attached to this issue) to Share.
      2. Search, with either basic or advanced search, for any of these three terms from the body of the message:
         Enjoy!
         Mukilteo
         Mukilteo Speedway

      The file does not show up in the results for any of these terms.

        Attachments

          Activity

          hseritt Harlin Seritt created issue -
          hseritt Harlin Seritt made changes -
          Field Original Value New Value
          Attachment Test message Share upload of MSG file.msg [ 29604 ]
          jcohorn jcohorn made changes -
          Assignee Sustaining Support Team [ alf_sustaining ] John Cohorn [ jcohorn ]
          jcohorn jcohorn added a comment -

          After enabling transform logging:

          log4j.logger.org.alfresco.repo.content.transform=DEBUG

          I see the following logged in 3.4.7 when uploading the file via Explorer (the Share upload fails):

          15:43:15,769 DEBUG [content.transform.ContentTransformerRegistry] Searched for transformer:
          source mimetype: application/vnd.ms-outlook
          target mimetype: text/plain
          transformers: [PoiContentTransformer[ average=0ms], MailContentTransformer[ average=0ms]]
          15:43:30,917 DEBUG [content.metadata.MetadataExtracterRegistry] Finding extractors for application/vnd.ms-outlook
          15:43:30,926 WARN [content.metadata.AbstractMappingMetadataExtracter] Metadata extraction failed (turn on DEBUG for full error):
          Extracter: org.alfresco.repo.content.metadata.MailMetadataExtracter@54aa1384
          Content: ContentAccessor[ contentUrl=store:///Users/jc/alf/installs/34/e347/tomcat/temp/Alfresco/alfresco7365161311622237915.upload, mimetype=application/vnd.ms-outlook, size=17920, encoding=UTF-8, locale=en_US]
          Failure: Invalid chunk name Olk10SideProps_0001null
          15:43:31,055 DEBUG [content.transform.ContentTransformerRegistry] Searched for transformer:
          source mimetype: application/vnd.ms-outlook
          target mimetype: text/plain
          transformers: [PoiContentTransformer[ average=0ms], MailContentTransformer[ average=60000ms]]
          15:43:33,300 DEBUG [content.transform.ContentTransformerRegistry] Searched for transformer:
          source mimetype: application/vnd.ms-outlook
          target mimetype: text/plain
          transformers: [PoiContentTransformer[ average=60000ms], MailContentTransformer[ average=60000ms]]
          15:44:26,620 DEBUG [content.transform.ContentTransformerRegistry] Searched for transformer:
          source mimetype: application/vnd.ms-outlook
          target mimetype: text/plain
          transformers: [PoiContentTransformer[ average=60000ms], MailContentTransformer[ average=60000ms]]

          It looks as though the metadata/text extractors encountered a malformed element in the file:

          Failure: Invalid chunk name Olk10SideProps_0001null
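
When comparing several uploads, it can help to pull just the transformer averages and the failure messages out of the DEBUG output. The following is a minimal sketch, assuming the log lines keep the same shape as the excerpt above; the function name `summarize_transform_log` and both regular expressions are illustrative and not part of Alfresco.

```python
import re

# Matches entries such as "MailContentTransformer[ average=60000ms]".
TRANSFORMER_RE = re.compile(r"(\w+)\[\s*average=(\d+)ms\]")
# Matches the "Failure: ..." line that follows a failed metadata extraction.
FAILURE_RE = re.compile(r"^\s*Failure:\s*(.+?)\s*$")


def summarize_transform_log(lines):
    """Return (latest average per transformer in ms, list of failure messages).

    Hypothetical helper: it assumes the line shapes seen in this ticket only.
    """
    averages = {}
    failures = []
    for line in lines:
        # A later line overwrites an earlier one, so the last value wins.
        for name, avg in TRANSFORMER_RE.findall(line):
            averages[name] = int(avg)
        match = FAILURE_RE.match(line)
        if match:
            failures.append(match.group(1))
    return averages, failures


if __name__ == "__main__":
    sample = [
        "transformers: [PoiContentTransformer[ average=0ms], MailContentTransformer[ average=0ms]]",
        "Failure: Invalid chunk name Olk10SideProps_0001null",
        "transformers: [PoiContentTransformer[ average=60000ms], MailContentTransformer[ average=60000ms]]",
    ]
    averages, failures = summarize_transform_log(sample)
    print(averages)
    print(failures)
```

Run against a full alfresco.log, this only reports the last average seen for each transformer, which is enough to spot the jump from 0ms to 60000ms that follows the failure in the excerpt.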

          This probably came from the POI Library. There seem to have been fixes in POI related to issues that resemble this:
          https://issues.apache.org/bugzilla/show_bug.cgi?id=51873

          Can you confirm that this is only seen with certain ill-formed email files? All MSG files? Only MSG files saved by a certain version of Outlook?

          Also, since 4.0 includes a slightly newer version of POI you may want to verify that the same issue is seen there.
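
As a first step in answering the questions above, a batch of sample files can be triaged before uploading them. The sketch below only checks that each file starts with the 8-byte OLE2 compound file signature, which is the container format used by Outlook .msg files. It does not run POI, so a file that passes can still trigger the "Invalid chunk name" failure; the helper names and the folder argument are hypothetical.

```python
from pathlib import Path

# Signature found at the start of every OLE2 compound file.
OLE2_MAGIC = bytes.fromhex("d0cf11e0a1b11ae1")


def has_ole2_signature(header: bytes) -> bool:
    """True if the given leading bytes begin with the OLE2 signature."""
    return header[:8] == OLE2_MAGIC


def triage_msg_files(folder):
    """Map each .msg file name in `folder` to whether it has the signature.

    Hypothetical helper for sorting test files; not an Alfresco or POI API.
    """
    results = {}
    for path in sorted(Path(folder).glob("*.msg")):
        with path.open("rb") as handle:
            results[path.name] = has_ole2_signature(handle.read(8))
    return results
```

Files that fail this check are not OLE2 containers at all (for example, a message saved as plain text with a .msg extension) and can be set aside. The remaining files are the ones worth uploading to both 3.4.7 and 4.0 to compare behaviour.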

          jcohorn jcohorn made changes -
          Assignee John Cohorn [ jcohorn ] Harlin Seritt [ hseritt ]
          mrogers Mark Rogers added a comment -

          It's certainly not the case that all msg files are unsearchable, since we have unit tests, including quick.msg.

          hdann Helen Dann (Inactive) made changes -
          ACT Numbers 15024-39876 39876
          hdann Helen Dann (Inactive) made changes -
          Assignee Harlin Seritt [ hseritt ] Closed Issues [ closedissues ]
          Status New [ 10001 ] Closed [ 6 ]
          Resolution Fixed [ 1 ]
          adavis Alan Davis made changes -
          Workflow Alfjira_II_PM [ 140338 ] Alfjira_II_PM 3 [ 204179 ]
          adavis Alan Davis made changes -
          Assignee Closed Issues [ closedissues ] Closed Bugs [ closedbugs ]
          adavis Alan Davis made changes -
          Project Alfresco [ 10281 ] Service Packs and Hot Fixes [ 11350 ]
          Key ALF-13206 MNT-8339
          Workflow Alfjira_II_PM 3 [ 204179 ] Service Packs and Hot Fixes [ 206021 ]
          Last Developer unassigned
          Affects Version/s 3.4.7 [ 12743 ]
          Affects Version/s 3.4.7 [ 10793 ]
          Rank (Obsolete) 37730000000
          ACT Numbers 39876
          Component/s ZZ Do not use [ 12815 ]
          Component/s Sustaining [ 10760 ]
          adavis Alan Davis made changes -
          Component/s Installation [ 12296 ]
          Component/s ZZ Do not use [ 12815 ]
          adavis Alan Davis made changes -
          Assignee Closed Bugs [ closedbugs ] Closed Issues [ closedissues ]
          Helen Dann (Inactive) made transition -
          Transition: New to Closed
          Time In Source Status: 99d 17h 45m
          Execution Times: 1

            People

            • Assignee:
              closedissues Closed Issues
              Reporter:
              hseritt Harlin Seritt
              My watchers:
              Helen Dann (Inactive)
            • Votes:
              0
              Watchers:
              1

              Dates

              • Created:
                Updated:
                Resolved: