Enterprise 2.2 / ETWOTWO-1133

Incorrect CRC32 Values for non-ASCII names

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: Service Pack 2
    • Fix Version/s: Service Pack 4
    • Component/s: None
    • Security Level: external (External user)
    • Labels:
      None
    • Environment:
      Windows 2003, or Linux with the JVM encoding forced to cp1252 (-Dsun.jnu.encoding=cp1252 -Dfile.encoding=cp1252)

      Description

      After upgrading from 2.2.0 to 2.2.2 on a Windows server, where the JVM picks up the platform encoding (cp1252) automatically, a space or document named "été" can no longer be accessed over CIFS. The client reports this error:

      "..... refers to a location that is unavailable. It could be on a hard drive on this computer, on a network, or on a different computer on your home network. Check to make sure that the disk is properly inserted, or that you are connected to the Internet or home network, and then try again. If it still cannot be located, the information might have been moved to a different location."

      Before the upgrade, CIFS could access this folder without problems.

      To reproduce:

      1) Install a clean 2.2.0 on Windows with MySQL (on Linux, set JAVA_OPTS to -Dsun.jnu.encoding=cp1252 -Dfile.encoding=cp1252)
      2) Start the server and create a directory whose name contains non-ASCII characters (for example, acute accents)
      3) Upgrade to 2.2 SP2 by deploying the WAR
      4) Try to access the folder using CIFS
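Because the reproduction depends on the JVM running with cp1252, it can help to confirm the effective encodings before creating the test folder. The following is a minimal sketch; the class name EncodingCheck is hypothetical and not part of Alfresco, and it only reports the JVM's view when started with the same -D flags as the server.

```java
import java.nio.charset.Charset;

final class EncodingCheck {
    /** Builds a one-line summary of the encodings the running JVM reports. */
    static String describe() {
        // System.getProperty returns null for an unset property; string
        // concatenation then simply prints "null" for it.
        return "file.encoding=" + System.getProperty("file.encoding")
                + " sun.jnu.encoding=" + System.getProperty("sun.jnu.encoding")
                + " defaultCharset=" + Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

If the reported default charset is not a cp1252 variant, the steps above would not be expected to reproduce the failure, going by the combinations tested later in this issue.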

      Results:
      ========
      The CIFS (Windows XP) client cannot access the folder and gets:
      "..... refers to a location that is unavailable. It could be on a hard drive on this computer, on a network, or on a different computer on your home network. Check to make sure that the disk is properly inserted, or that you are connected to the Internet or home network, and then try again. If it still cannot be located, the information might have been moved to a different location."

      The web UI can still access the folder and one can rename it using the web UI.

      Expected result:
      ============
      The client can access the folder.

          Activity

          Alex Madon added a comment - edited
          After further testing I was able to narrow down the conditions needed to reproduce the issue.

          In fact only 1) c) is relevant.
          JAVA_OPTS:
          -Dsun.jnu.encoding=cp1252 -Dfile.encoding=cp1252


          The Alfresco MySQL tables are UTF-8 encoded, independently of the default encoding used by the mysql client or the mysqld daemon. This can be checked with:

          mysqlshow --status alfrescodb -u alfrescouser -p

          The combinations I tried are listed below. The first two columns give the default encodings of the mysql client and the mysqld daemon, as set in the my.cnf file and/or reported by mysql/mysqld --print-defaults. The OK or FAILS after the arrow is the result of the upgrade process.


          mysql utf-8 + mysqld utf-8 + "-Dsun.jnu.encoding=utf-8 -Dfile.encoding=utf-8" => OK
          mysql latin1 + mysqld latin1 + "-Dsun.jnu.encoding=utf-8 -Dfile.encoding=utf-8" => OK
          mysql latin1 + mysqld latin1 + "-Dsun.jnu.encoding=cp1252 -Dfile.encoding=cp1252" => FAILS
          mysql latin1 + mysqld latin1 + "-Dsun.jnu.encoding=utf-8 -Dfile.encoding=utf-8" => OK
          mysql utf-8 + mysqld utf-8 + "-Dsun.jnu.encoding=cp1252 -Dfile.encoding=cp1252" => FAILS
          Alex Madon added a comment -
          Changing the JVM encoding to UTF-8 before the upgrade does not resolve the issue, even with a FULL re-index.
          Alex Madon added a comment -
          To rephrase the bug description to show the scale of the issue:

          Any customer upgrading from 2.2.0 to 2.2.2 on a Windows server will end up with documents and folders whose names contain non-ASCII characters being inaccessible over CIFS.

          This is because Windows machines use the cp1252 encoding by default.
          Helen Dann (Inactive) added a comment - edited
          Re-wrote description.

          Assigning to Derek per Paul, since it seems to be specific to the upgrade.
          Derek Hulley added a comment -
          Diagnosis:

          The upgrade process is NOT modifying the data. Rather, the calculation of the cm:name CRC values changed (ALFCOM-1335) from being based on the JVM encoding to being based on UTF-8 encoding.

          After the upgrade, the column alf_child_assoc.child_node_name_crc therefore contains incorrect values for cm:name properties that contain characters outside the ASCII range.

          Workaround to follow.
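The effect described in the diagnosis can be illustrated with a short sketch: the same name yields different CRC32 values depending on which charset turns it into bytes. The class NameCrcDemo and its method are hypothetical and are not Alfresco's actual code; the lowercasing is an assumption taken from the LOWER() call in the MySQL workaround on this issue.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.zip.CRC32;

final class NameCrcDemo {
    /** CRC32 of the lowercased name, encoded with the given charset. */
    static long crcOf(String name, Charset charset) {
        CRC32 crc = new CRC32();
        crc.update(name.toLowerCase(Locale.ROOT).getBytes(charset));
        return crc.getValue();
    }

    public static void main(String[] args) {
        Charset cp1252 = Charset.forName("windows-1252");
        // "\u00e9t\u00e9" is "été"; escapes keep this source file
        // independent of the compiler's own default encoding.
        for (String name : new String[] {"summer", "\u00e9t\u00e9"}) {
            System.out.printf("%s: cp1252=%d utf-8=%d%n",
                    name, crcOf(name, cp1252), crcOf(name, StandardCharsets.UTF_8));
        }
    }
}
```

For a pure ASCII name both charsets produce identical bytes, so the two values agree. For "été" the byte sequences differ (three bytes in cp1252, five in UTF-8), so a CRC stored under one encoding no longer matches a lookup computed under the other.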
          Derek Hulley added a comment -
          The fix for ALFCOM-1335 caused this regression.
          Derek Hulley added a comment -
          Workaround:

          On MySQL, the CRC can be updated to match the UTF-8 version of the string:

          update alf_child_assoc ca
          join alf_node n on (ca.child_node_id = n.id and ca.child_node_name_crc > 0)
          join alf_node_properties np on (n.id = np.node_id)
          join alf_qname qn on (np.qname_id = qn.id and qn.local_name = 'name')
          join alf_namespace ns on (qn.ns_id = ns.id and ns.uri = 'http://www.alfresco.org/model/content/1.0')
          set ca.child_node_name_crc = crc32(CONVERT(LOWER(np.string_value) USING utf8))
          ;
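The update above rewrites every row with a positive CRC, but only names whose cp1252 and UTF-8 byte sequences differ should end up with a changed value. The sketch below flags such names in a list. It assumes, following the diagnosis, that the old CRC was computed from the cp1252 bytes of the lowercased name. The class AffectedNames and the sample names are hypothetical, and Java's lowercasing may not match MySQL's LOWER() for every character.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

final class AffectedNames {
    private static final Charset CP1252 = Charset.forName("windows-1252");

    /** True when the lowercased name encodes differently in cp1252 and UTF-8. */
    static boolean isAffected(String name) {
        String lower = name.toLowerCase(Locale.ROOT);
        return !Arrays.equals(lower.getBytes(CP1252),
                              lower.getBytes(StandardCharsets.UTF_8));
    }

    /** Keeps only the names whose stored CRC would change under the workaround. */
    static List<String> affected(List<String> names) {
        return names.stream()
                .filter(AffectedNames::isAffected)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical sample of cm:name values; "\u00e9t\u00e9" is "été".
        List<String> sample = Arrays.asList("Data Dictionary", "\u00e9t\u00e9", "report.pdf");
        System.out.println(affected(sample));
    }
}
```

A check like this could be used to estimate how many nodes the workaround actually touches before running it against a production database.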
          Steve Rigby added a comment -
          For retest on Alfresco 2.2 sp4 build 425
          mkononovich added a comment -
          Validated in Alfresco 2.2 SP4 build 425 using Windows 2003 SP2, Tomcat 6.0.18, MySQL 5.1.31, JDK 6u7

            People

            • Assignee: Closed Bugs
            • Reporter: Alex Madon