Troubleshooting
Problem
If the transact.dat file used by the Content Engine Bulk Import Tool contains Arabic text in any Properties field (for example, Description), then after the import into ACCE the Description field (Unfiled Documents > Document_Name > Properties) shows gibberish characters instead of Arabic. The default encoding of the transact.dat file is ANSI, which causes the conflict when Arabic text is processed. Furthermore, if the transact.dat file is saved with UTF-8 or Unicode encoding, the bulk import operation still fails and returns the following exception: Exception in thread "Import" java.lang.NumberFormatException: For input string: "meaningless string"
Symptom
When the transact.dat file is saved with UTF-8 or Unicode encoding, the bulk import operation fails with the following error:
Exception in thread "Import" java.lang.NumberFormatException: For input string: "meaningless string"
at java.lang.NumberFormatException.forInputString(Unknown Source)
at java.lang.Integer.parseInt(Unknown Source)
at java.lang.Integer.valueOf(Unknown Source)
at bulkImport.BI_Import$UserDoc.<init>(BI_Import.java:1056)
at bulkImport.BI_Import.transactDatRd(BI_Import.java:1909)
at bulkImport.BI_Import.run(BI_Import.java:2051)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Cause
The Bulk Import Tool code was not written to handle characters outside the ASCII range, nor to open and read its configuration files in UTF-8 or Unicode encoding.
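The following is a minimal sketch, not the Bulk Import Tool's actual code, of how an encoding mismatch can produce both symptoms. It rests on several assumptions: the class and helper names are hypothetical, the two-line sample layout is invented for illustration and is not the real transact.dat format, ISO-8859-1 stands in for a single-byte ANSI code page, and the UTF-8 file is assumed to start with a byte order mark (BOM). The only detail taken from the stack trace is that a field is passed to Integer.valueOf.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not the Bulk Import Tool implementation.
class TransactEncodingSketch {

    // Encode sample content as UTF-8 and prepend the UTF-8 BOM (EF BB BF),
    // as some editors do when saving a file as UTF-8.
    static byte[] utf8WithBom(String content) {
        byte[] body = content.getBytes(StandardCharsets.UTF_8);
        byte[] out = new byte[body.length + 3];
        out[0] = (byte) 0xEF;
        out[1] = (byte) 0xBB;
        out[2] = (byte) 0xBF;
        System.arraycopy(body, 0, out, 3, body.length);
        return out;
    }

    // Decode the bytes with the given charset and return the first line.
    static String firstLine(byte[] fileBytes, Charset charset) {
        String text = new String(fileBytes, charset);
        return text.split("\r?\n", 2)[0];
    }

    // Decode as UTF-8 and drop a leading BOM character (U+FEFF) if present.
    // Java's UTF-8 decoder does not remove the BOM on its own.
    static String firstLineUtf8NoBom(byte[] fileBytes) {
        String line = firstLine(fileBytes, StandardCharsets.UTF_8);
        return line.startsWith("\uFEFF") ? line.substring(1) : line;
    }

    // True if Integer.valueOf accepts the string, false if it throws.
    static boolean parsesAsInteger(String s) {
        try {
            Integer.valueOf(s);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Invented layout: a numeric field on line 1, Arabic text on line 2.
        String arabic = "\u0648\u0635\u0641";
        byte[] bytes = utf8WithBom("12\n" + arabic + "\n");

        // Read as a single-byte code page: the BOM bytes become stray
        // characters in front of the number, so parsing fails.
        String ansiLine = firstLine(bytes, StandardCharsets.ISO_8859_1);
        System.out.println("single-byte decode parses: " + parsesAsInteger(ansiLine));

        // Read as UTF-8 with the BOM removed: the number parses.
        String utf8Line = firstLineUtf8NoBom(bytes);
        System.out.println("UTF-8 decode parses: " + parsesAsInteger(utf8Line));

        // Arabic bytes decoded with the wrong charset no longer match the original.
        String garbled = new String(arabic.getBytes(StandardCharsets.UTF_8),
                StandardCharsets.ISO_8859_1);
        System.out.println("Arabic survives wrong decode: " + garbled.equals(arabic));
    }
}
```

In this sketch the number parses only when the bytes are decoded with the charset they were written in and the BOM is removed, which is consistent with the need to change how the tool opens and reads its files.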
Resolving The Problem
This is working as designed up to at least Content Platform Engine 5.2.1. An enhancement request has been opened to support Arabic by modifying the code to handle UTF-8 or Unicode encoding when opening and reading the configuration files.
Document Information
Modified date:
17 June 2018
UID
swg21983846