How to reproduce?
1) Install a plain vanilla Alfresco 4.1.4 (Linux, PostgreSQL, Tomcat).
2) On your local computer, create files whose names are UTF-8 encoded and Latin-1 (ISO-8859-1) encoded.
You can use the attached Python script, which creates 4 files:
or use the resulting files in the attached ZIP.
3) In Alfresco Explorer, create a folder 'import' under 'Company Home'.
4) go to:
As "Import directory", enter your local export path
As "Target space (NodeRef or Path)" enter
5) submit the form
Result: only 3 files are imported; the file whose name contains a Latin-1 encoded accented character fails.
In the logs, if we set
then we see:
2013-06-27 14:51:53,936 WARN [bulkimport.impl.DirectoryAnalyserImpl] [BulkFilesystemImport-BackgroundThread] Skipping unreadable file '/home/madon/act/69176_michelinusa/export2/ISO-8859-1_with_�ccent.txt'.
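The test files from step 2 can be recreated with a short Python script. The following is a minimal sketch under stated assumptions, not the attached script: the function name `create_test_files`, the four file names, and the accented character ("à") are hypothetical, chosen only to resemble the file name in the log line above. It assumes a Linux filesystem that accepts arbitrary bytes in file names.

```python
import os

# Assumption: the accented character used in the original files is unknown;
# "à" (U+00E0) is used here purely for illustration.
ACCENTED = "\u00e0"


def create_test_files(target_dir):
    """Create 4 small files whose names are raw byte strings.

    Two names are UTF-8 encoded, two are ISO-8859-1 encoded; in each pair
    one name is pure ASCII and one contains an accented character.
    Returns the list of file names as bytes.
    """
    names = [
        b"UTF-8_without_accent.txt",
        "UTF-8_with_{}ccent.txt".format(ACCENTED).encode("utf-8"),
        b"ISO-8859-1_without_accent.txt",
        "ISO-8859-1_with_{}ccent.txt".format(ACCENTED).encode("iso-8859-1"),
    ]
    base = os.fsencode(target_dir)
    os.makedirs(base, exist_ok=True)
    for name in names:
        # Bytes paths are passed to the OS unchanged, so the name on disk
        # keeps exactly the encoding chosen above.
        with open(os.path.join(base, name), "wb") as handle:
            handle.write(b"test content\n")
    return names
```

Calling `create_test_files("export2")` would produce a directory that can then be entered as the "Import directory" in step 4. Only the fourth name is not valid UTF-8, which matches the single skipped file reported above.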
1) If a failure on non-UTF-8 filenames is expected, then the WARN message in the logs should state the reason.
Filename encoding is not mentioned. We should specify that the tool only works if the filenames on the filesystem are UTF-8 encoded.
3) Could we modify the default log level for the bulk importer?
At the current default, WARN messages are not logged.
4) Tools exist that can help convert file name encodings. One of them is convmv; example of use:
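As an alternative to an external tool, the same check and conversion can be sketched in Python. This is a minimal sketch, not convmv and not part of Alfresco: the function names `find_non_utf8_names` and `rename_latin1_to_utf8` are hypothetical, and the code assumes that every name that is not valid UTF-8 is ISO-8859-1 encoded, which may not hold for a real export directory.

```python
import os


def find_non_utf8_names(root):
    """Return paths (as bytes) under root whose last component is not valid UTF-8.

    The tree is walked bottom-up, so entries inside a directory are listed
    before the directory itself; renaming in list order is therefore safe.
    """
    bad = []
    for dirpath, dirnames, filenames in os.walk(os.fsencode(root), topdown=False):
        for name in dirnames + filenames:
            try:
                name.decode("utf-8")
            except UnicodeDecodeError:
                bad.append(os.path.join(dirpath, name))
    return bad


def rename_latin1_to_utf8(path, dry_run=True):
    """Re-encode the last component of path from ISO-8859-1 to UTF-8.

    Assumption: the name really is ISO-8859-1. With dry_run=True nothing is
    renamed and only the would-be new path is returned.
    """
    parent, name = os.path.split(path)
    new_path = os.path.join(parent, name.decode("iso-8859-1").encode("utf-8"))
    if not dry_run:
        os.rename(path, new_path)
    return new_path
```

Running `find_non_utf8_names` on the import directory before submitting the form lists the files the bulk importer would be expected to skip. The dry-run default lets the new names be reviewed before any file is renamed.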