Alfresco / ALF-21847

Share document library: error loading contents of folder with 4-byte unicode character


    • Type: Bug
    • Status: Closed
    • Priority: Unprioritized
    • Resolution: Won't Fix
    • Affects Version/s: Community Edition 201605 GA, Community Edition 201612 GA
    • Fix Version/s: None
    • Component/s: Share Application
    • Security Level: external (External user)
    • Labels:
    • Environment:
      Oracle JDK 1.8.0_112, Tomcat 7.0.47, MariaDB 5.5.50 with utf8mb4, PostgreSQL 9.5 on Windows 10


      Alfresco, as a heavily internationalized product, is considered to fully support Unicode (provided the backing database and servlet container are properly configured). It is possible to create a folder in Share whose name is a 4-byte Unicode character / emoji such as the "pile of poo" character (note: JIRA does not allow inclusion of the character here).
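      To make "4-byte Unicode character" concrete, the following is a minimal Java sketch (the class and method names are made up for illustration and are not part of Alfresco or Surf). It shows that the "pile of poo" character (U+1F4A9) is a single code point that Java stores as two UTF-16 chars (a surrogate pair) and that occupies four bytes in UTF-8:

```java
import java.nio.charset.StandardCharsets;

// Illustrative only: hypothetical helper class, not Alfresco code.
public class FourByteCharacterDemo {

    // Number of UTF-16 code units (Java chars) in the string.
    public static int utf16Units(String s) {
        return s.length();
    }

    // Number of Unicode code points in the string.
    public static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    // Number of bytes the string occupies when encoded as UTF-8.
    public static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // U+1F4A9 written as its surrogate pair so the source file stays ASCII.
        String pileOfPoo = "\uD83D\uDCA9";
        System.out.println("UTF-16 units: " + utf16Units(pileOfPoo));
        System.out.println("Code points:  " + codePoints(pileOfPoo));
        System.out.println("UTF-8 bytes:  " + utf8Bytes(pileOfPoo));
        System.out.println("High surrogate first: "
                + Character.isHighSurrogate(pileOfPoo.charAt(0)));
    }
}
```

      Any code that walks such a string one char at a time sees two halves of a character rather than the character itself.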

      When navigating into a folder whose name contains a 4-byte Unicode character in Alfresco Share, the document library fails to load the folder contents, displays "No elements" and hides the toolbar. The network monitor in the browser's developer tools shows that the data web script call was answered with an HTTP 404 response code.

      Steps to reproduce:

      1. Ensure the DB / servlet container are set up to fully support Unicode (note: the MySQL/MariaDB utf8 character set only supports up to 3 bytes per character - see ACE-773)
      2. Create a folder via the Share UI (e.g. in My Files) and use the "pile of poo" emoji as its name
      3. Navigate into the created folder

      Expectation: Document library shows help text for empty folder and toolbar items to create/upload new elements
      Observation: Message "No elements" is shown and toolbar items are hidden, developer tools show HTTP 404 error

      Analysis: The Repository-tier backend is perfectly capable of handling a request to load items for a path that contains a 4-byte Unicode character. Debugging the Share-tier DefaultDoclistDataUrlResolver class shows that the URI encoding (Surf URLEncoder class) applied to the call to the Repository web script does not correctly handle 4-byte Unicode characters. This is likely the same root problem as in ALF-21846: instead of handling Unicode code points, the code only handles individual characters without checking for surrogate pairs (high/low surrogate characters).
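      The class of bug described in the analysis can be sketched as follows. This is not the Surf URLEncoder source. It is a hypothetical, self-contained Java example (all names are assumptions) that contrasts a percent-encoder iterating over individual chars with one iterating over code points. In the char-based variant each surrogate half is encoded on its own, which cannot produce valid UTF-8, so the resulting URI no longer identifies the folder:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of the suspected bug class; not Surf code.
public class CodePointUrlEncoder {

    // RFC 3986 "unreserved" characters are passed through unescaped.
    private static boolean isUnreserved(int c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
                || (c >= '0' && c <= '9')
                || c == '-' || c == '_' || c == '.' || c == '~';
    }

    // Appends the UTF-8 bytes of s as %XX escapes.
    private static void appendPercentEncoded(StringBuilder out, String s) {
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            out.append('%').append(String.format("%02X", b & 0xFF));
        }
    }

    // Flawed variant: treats every UTF-16 char as a complete character.
    // A lone surrogate has no UTF-8 form, so the halves of a surrogate
    // pair are not encoded as the bytes of the real character.
    public static String encodeByChar(String input) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (isUnreserved(c)) {
                out.append(c);
            } else {
                appendPercentEncoded(out, String.valueOf(c));
            }
        }
        return out.toString();
    }

    // Code-point-aware variant: surrogate pairs are encoded as one unit.
    public static String encodeByCodePoint(String input) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < input.length(); ) {
            int cp = input.codePointAt(i);
            if (isUnreserved(cp)) {
                out.append((char) cp);
            } else {
                appendPercentEncoded(out, new String(Character.toChars(cp)));
            }
            i += Character.charCount(cp); // advance by 1 or 2 chars
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String folderName = "\uD83D\uDCA9"; // U+1F4A9 as a surrogate pair
        System.out.println("by char:       " + encodeByChar(folderName));
        System.out.println("by code point: " + encodeByCodePoint(folderName));
    }
}
```

      Under these assumptions only the code-point-aware variant yields the four UTF-8 bytes of U+1F4A9 (%F0%9F%92%A9), which is the form the Repository tier would need to resolve the path.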




            • Assignee:
              closedissues Closed Issues
              afaust Axel Faust