Java JAXB UTF-8/ISO conversions
I have an XML file that contains non-standard characters (like a weird
"quote").
I read the XML using UTF-8 / ISO / ASCII and unmarshalled it:
BufferedReader br = new BufferedReader(new InputStreamReader(
        conn.getInputStream(), "ISO-8859-1"));
String output;
StringBuffer sb = new StringBuffer();
while ((output = br.readLine()) != null) {
    // fetch XML
    sb.append(output);
}
try {
    jc = JAXBContext.newInstance(ServiceResponse.class);
    Unmarshaller unmarshaller = jc.createUnmarshaller();
    ServiceResponse OWrsp = (ServiceResponse) unmarshaller
            .unmarshal(new InputSource(new StringReader(sb.toString())));
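For reference, here is a minimal sketch of what choosing the charset in InputStreamReader does to the characters. It assumes the service sends UTF-8, which is only an assumption since the raw bytes are not shown here. DecodeDemo and decode are names made up for the illustration:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DecodeDemo {
    // Hypothetical helper: decodes the same raw bytes with a caller-chosen charset.
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // "10–11" with an en dash (U+2013), encoded the way a UTF-8 source would send it.
        byte[] raw = "10\u201311".getBytes(StandardCharsets.UTF_8);

        String asUtf8 = decode(raw, StandardCharsets.UTF_8);
        String asLatin1 = decode(raw, StandardCharsets.ISO_8859_1);

        // The en dash is three bytes in UTF-8; ISO-8859-1 turns each byte into its own char.
        System.out.println(asUtf8.length());   // 5
        System.out.println(asLatin1.length()); // 7
    }
}
```

If the bytes really are UTF-8, decoding them as ISO-8859-1 does not fail; it silently splits every multi-byte character into several unrelated ones before JAXB ever sees the text.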
I have an Oracle function that takes ISO-8859-1 codes and
converts/maps them to "literal" symbols, e.g. "’" => "left single
quote".
Unmarshalling with JAXB using ISO displays the characters with the ISO
conversion fine, i.e. all the weird single quotes are encoded as "’".
So suppose my string is: class of 10¨C11©\year©\olds (note the weird "-"
between 11 and year).
jc = JAXBContext.newInstance(ScienceProductBuilderInfoType.class);
Marshaller m = jc.createMarshaller();
m.setProperty(Marshaller.JAXB_ENCODING, "ISO-8859-1");
//save a temp file
File file2 = new File("tmp.xml");
This saves the following in the file:
class of 10–11‐year‐olds (which is what I want, so saving to a file
works!)
[Side note: I have also read the file back using a Java file reader, and it
outputs the above string fine.]
The issue is that the String I get from the JAXB unmarshaller has weird
output; for some reason I cannot get the string to represent ¨C.
When I check:
1: the unmarshalled XML output:
class of 10?11?year?olds
2: the file output:
class of 10–11‐year‐olds
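The "?" in output 1 matches what Java produces when a string containing characters outside ISO-8859-1 is encoded to ISO-8859-1. Here is a minimal sketch, assuming the intended characters are U+2013 (en dash) and U+2010 (hyphen) as in the file output. EncodeDemo and its methods are names made up for the illustration:

```java
import java.nio.charset.StandardCharsets;

public class EncodeDemo {
    // Hypothetical helper: encodes to ISO-8859-1 and decodes back,
    // mimicking a trip through a Latin-1 console, stream or column.
    static String throughLatin1(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    // Reports whether every character of s exists in ISO-8859-1.
    static boolean latin1CanEncode(String s) {
        return StandardCharsets.ISO_8859_1.newEncoder().canEncode(s);
    }

    public static void main(String[] args) {
        String s = "class of 10\u201311\u2010year\u2010olds";
        System.out.println(latin1CanEncode(s)); // false
        System.out.println(throughLatin1(s));   // class of 10?11?year?olds
    }
}
```

So the Java String may well hold the right characters, and the "?" may only appear at the point where it is printed or written with a charset that has no en dash. Comparing code points (for example with s.codePointAt(11)) rather than printed output would tell the two cases apart.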
I even tried reading the saved XML file back in and unmarshalling that
(in the hope of getting the ¨C in my string):
String sCurrentLine;
BufferedReader br = new BufferedReader(new FileReader("tmp.xml"));
StringBuffer sb = new StringBuffer();
while ((sCurrentLine = br.readLine()) != null) {
    sb.append(sCurrentLine);
}
ScienceProductBuilderInfoType rsp =
        (ScienceProductBuilderInfoType) unm
                .unmarshal(new InputSource(new StringReader(sb.toString())));
That did not help either.
Any ideas how to get the ISO-8859-1 encoded character in JAXB?
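One thing that may be worth checking: FileReader decodes with the platform default charset and ignores the encoding named in the XML declaration, whereas an XML parser given the raw bytes reads that declaration itself. The sketch below uses the JDK's DOM parser in place of JAXB so that it runs without extra dependencies. It assumes the saved file stores the dashes as numeric character references (&#8211; and &#8208;), which is a guess about the file, since its raw bytes are not shown. CharRefDemo, parseText and the <title> element are made up for the illustration:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class CharRefDemo {
    // Hypothetical helper: parses raw XML bytes and returns the root element's text.
    // The parser picks the charset from the XML declaration, not from a Reader.
    static String parseText(byte[] xmlBytes) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xmlBytes));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Assumed file content: ISO-8859-1 document with numeric character references.
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"
                + "<title>class of 10&#8211;11&#8208;year&#8208;olds</title>";
        String text = parseText(xml.getBytes(StandardCharsets.ISO_8859_1));

        // Compare against the expected code points instead of printing the text,
        // so the console charset cannot distort the check.
        System.out.println(text.equals("class of 10\u201311\u2010year\u2010olds")); // true
    }
}
```

With JAXB, the equivalent would be to pass the File or InputStream straight to Unmarshaller.unmarshal instead of going through readLine and a StringReader.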