How To Handle Special Characters When Converting From Html To Docx
I have a application that converts html files to DocX using DocX4J. I´m having problems with special characters like ç,á,é,í,ã,etc. My text font in the html files is Arial bu
Solution 1:
Following the tip given by JasonPlutext, I found an example of how to map a font to the XHTMLImporter at the DocX4J forum (http://www.docx4java.org/forums/docx-java-f6/docx-to-html-and-back-to-docx-t1913.html).
Now my code is working! See the final version below.
public WordprocessingMLPackage export(String xhtml) {
WordprocessingMLPackage wordMLPackage = null;
try {
RFonts arialRFonts = Context.getWmlObjectFactory().createRFonts();
arialRFonts.setAscii("Arial");
arialRFonts.setHAnsi("Arial");
XHTMLImporterImpl.addFontMapping("Arial", arialRFonts);
wordMLPackage = WordprocessingMLPackage.createPackage();
XHTMLImporter importer = new XHTMLImporterImpl(wordMLPackage);
List<Object> content = importer.convert(xhtml,null);
wordMLPackage.getMainDocumentPart().getContent().addAll(content);
}
catch (Docx4JException e) {
// ...
}
return wordMLPackage;
}
Post a Comment for "How To Handle Special Characters When Converting From Html To Docx"