Type: Hot Fix Request
Resolution: Won't Fix
Affects Version/s: 18.104.22.168, 5.2.2
Fix Version/s: None
Environment: Alfresco Version - 22.214.171.124
Database - Oracle 12c
Application Server - Tomcat
Hot Fix Version: 5.0.0
When documents whose filename contains the Japanese character '[problem_char]' are uploaded, invalid/garbled characters are stored in the string_end_lower & string_value columns of the alf_prop_string_value table. (The character itself cannot be entered in JIRA, which rejects it as invalid on Create. It is the 1st letter of the filename in the attached sample files, Sample Files.zip.) If the original filename is '[problem_char]由美子DSCN8762.png', the stored values look something like this:
string_end_lower - '?由美子dscn8762.png'
string_value - ' 由美子DSCN8762.png' (the value appears to start with a whitespace; copying it from the DB into a text editor shows that the whitespace is actually a sequence of invalid characters)
NOTE: The problem surfaces when the database is migrated from one DB to another (Oracle -> Postgres). The migration fails with 'ERROR: Invalid byte sequence with encoding method "UTF 8": 0xed 0xb0 0x8b', an encoding incompatibility caused by the invalid values stored in the DB. Once the string_end_lower & string_value columns are changed to valid values, the migration is successful.
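The byte sequence quoted in the migration error can be inspected outside the database. The following is a minimal Python sketch, not part of Alfresco or of the migration tooling; the helper names is_valid_utf8 and describe_bytes are hypothetical. It shows that the three bytes 0xed 0xb0 0x8b are rejected by a strict UTF-8 decoder and that they correspond to a UTF-16 surrogate code point rather than to a real character. That would be consistent with a character outside the Basic Multilingual Plane having been stored as surrogates, but the ticket does not confirm the root cause.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the bytes decode as strict UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def describe_bytes(raw: bytes) -> list:
    """Decode leniently (surrogates allowed) and label each code point."""
    text = raw.decode("utf-8", errors="surrogatepass")
    labels = []
    for ch in text:
        cp = ord(ch)
        kind = "surrogate" if 0xD800 <= cp <= 0xDFFF else "scalar"
        labels.append(f"U+{cp:04X} ({kind})")
    return labels


# The bytes reported in the migration error message.
bad = bytes([0xED, 0xB0, 0x8B])

print(is_valid_utf8(bad))    # False
print(describe_bytes(bad))   # ['U+DC0B (surrogate)']
```

A check like is_valid_utf8 could be run over the raw column bytes exported from Oracle to list the rows that need correcting before the migration; how the bytes are exported is left open here.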
Steps to reproduce
1) Enable the auditing feature in the app: set audit.enabled=true & audit.alfresco-access.enabled=true in alfresco-global.properties.
2) Upload files whose names contain the Japanese character '[problem_char]' to Alfresco (either Drag&Drop or the Upload button).
3) The documents are uploaded successfully, but in the alf_prop_string_value table the 'string_end_lower' column value is "?由美子dscn8762.png" and the string_value column (" 由美子DSCN8762.png") appears to start with a whitespace. Copying the value from the DB into a text editor shows that the whitespace is actually a sequence of invalid characters.
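The check in step 3 can be expressed as a small comparison. This is a rough Python sketch under stated assumptions: check_stored is a hypothetical helper, the stored values are assumed to have already been read from alf_prop_string_value as text, and string_end_lower is assumed to hold a lowercased trailing part of the filename (inferred from the column name and the example values, not from Alfresco documentation). U+2000B is used purely as a stand-in for '[problem_char]', since the ticket does not identify the actual character.

```python
def check_stored(original, string_value, string_end_lower):
    """Compare the stored column values with the uploaded filename."""
    problems = []
    if string_value != original:
        problems.append("string_value differs from the original filename")
    # Assumption: string_end_lower is a lowercased suffix of the filename.
    if not original.lower().endswith(string_end_lower):
        problems.append("string_end_lower is not a suffix of the lowercased filename")
    return problems


# Stand-in for '[problem_char]'; the real character is only in the attachment.
STAND_IN = "\U0002000B"
original = STAND_IN + "由美子DSCN8762.png"

# Values as reported in this ticket: both columns are flagged.
print(check_stored(original, " 由美子DSCN8762.png", "?由美子dscn8762.png"))

# Values as expected after a fix: no problems reported.
print(check_stored(original, original, original.lower()))
```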
The issue can be reproduced internally in the out-of-the-box versions 126.96.36.199 and 5.2.2. Screenshots and sample files are attached.
Expected behavior: When documents containing the Japanese character '[problem_char]' in the filename are uploaded, the string_end_lower & string_value columns should contain the original Japanese character, not garbled or whitespace characters.
Actual behavior: Garbled characters are stored in the string_value & string_end_lower columns of the alf_prop_string_value table.