Transformation of multiple input files

Right now I am using this Java method (which receives one XML file as a parameter) to perform an XSLT transformation:
static public byte[] simpleTransform(byte[] sourcebytes, int ref_id) {
    try {
        StreamSource xmlSource = new StreamSource(new ByteArrayInputStream(sourcebytes));
        StringWriter writer = new StringWriter();
        transformations_list.get(ref_id).transformer.transform(xmlSource, new StreamResult(writer));
        return writer.toString().getBytes("UTF-8");
    } catch (Exception e) {
        e.printStackTrace();
        return new byte[0];
    }
}
In my XSLT file I use document('f2.xml') to refer to other files needed by the transformation.
I want my Java method to receive multiple XML files instead:
static public byte[] simpleTransform(byte[] f1, byte[] f2, byte[] f3, int ref_id)
And in my XSLT I don't want to call document('f2.xml'); I want to refer to the f2 content received by my Java method.
Is there a way to do it? How do I refer to f2.xml in my XSLT this way?

I'm not entirely sure what is in f1, f2 etc. Is it the URL of a document? or the XML document content itself?
There are two possible approaches you could consider.
One is to write a URIResolver. When you call document('f2.xml') Saxon will call your URIResolver to get the relevant document as a Source object. Your URIResolver could return a StreamSource initialized with a ByteArrayInputStream referring to the relevant byte[] value.
A second approach is to supply the documents as parameters to the stylesheet. You could declare a global parameter <xsl:param name="f2" as="document-node()"/> and then use Transformer.setParameter() to supply the actual document; within the stylesheet, replace document('f2.xml') by $f2. Saxon will accept a Source object as the value supplied to setParameter, so you could again create a StreamSource initialized with a ByteArrayInputStream referring to the relevant byte[] value; alternatively (and perhaps better) you could pre-build the tree by calling a Saxon DocumentBuilder.
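The first approach (the URIResolver) can be sketched roughly as follows, using only the standard JAXP interfaces. The class name InMemoryDocuments, the helper names and the map of href to byte[] are illustrative assumptions, not part of the original code, and the Transformer is built inline here instead of being looked up in transformations_list. What a processor does when the resolver returns null is processor-specific, so the sketch only relies on the case where the href is found.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.URIResolver;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper: serves document('...') calls from byte arrays held in memory.
class InMemoryDocuments {

    // Maps the href passed to document() onto an in-memory byte[] (assumed naming scheme).
    static URIResolver resolverFor(Map<String, byte[]> docs) {
        return (href, base) -> {
            byte[] content = docs.get(href);
            if (content == null) {
                return null; // not one of ours; handling is left to the processor
            }
            return new StreamSource(new ByteArrayInputStream(content));
        };
    }

    // Transforms f1 with the given stylesheet text; document('name') is answered from docs.
    static byte[] transform(String xslt, byte[] f1, Map<String, byte[]> docs) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            transformer.setURIResolver(resolverFor(docs));
            StringWriter writer = new StringWriter();
            transformer.transform(new StreamSource(new ByteArrayInputStream(f1)),
                    new StreamResult(writer));
            return writer.toString().getBytes(StandardCharsets.UTF_8);
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }
}
```

With this in place the stylesheet can keep calling document('f2.xml'); the Java side only has to register the f2 bytes under the key "f2.xml" before running the transformation.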

javax.xml.transform.TransformerFactory Unicode issue- Java

We are not able to transform Unicode characters properly. We supply the input as XML, but after the transformation we do not get the original string back.
This is the code I'm using:
StringCarrier OStringCarrier = new StringCarrier();
String SXmlFileData= "<export_candidate_response><criteria><output><lastname>Bhagavath</lastname><firstname>ガネーシュ</firstname></output></export_candidate_response>";
String SResult = "";
try
{
    TransformerFactory tFactory = TransformerFactory.newInstance();
    Transformer transformer = tFactory.newTransformer(new StreamSource(SXslFileName));
    transformer.setOutputProperty(OutputKeys.ENCODING, "UTF8");
    OutputStream xmlResult = (OutputStream)new ByteArrayOutputStream();
    StreamResult outResult = new StreamResult(xmlResult);
    transformer.transform(new StreamSource(
            new ByteArrayInputStream(SXmlFileData.getBytes("UTF8"))),outResult);
    SResult = outResult.getOutputStream().toString();
}
catch (TransformerConfigurationException OException)
{
    //Exception has been thrown
    OException.printStackTrace();
    return OStringCarrier;
}
catch (TransformerException OException)
{
    //Exception has been thrown
    OException.printStackTrace();
    return OStringCarrier;
}
catch (Exception OException)
{
    //Exception has been thrown
    OException.printStackTrace();
    return OStringCarrier;
}
This is the output i'm getting ã‚¬ãƒ?ãƒ¼ã‚·ãƒ¥ in place of ガネーシュ

This is the output i'm getting ã‚¬ãƒ?ãƒ¼ã‚·ãƒ¥ in place of ガネーシュ
That tells you that somewhere in this process, data in UTF-8 is being read by a piece of software that thinks it is reading Latin-1. What it doesn't tell you is where in the process this is happening. So you need to divide-and-conquer - you need to find the last point at which the data is correct.
Start by establishing whether the problem is before the transformation or after it. That's very easy if you're using an XSLT 2.0 processor: you can use string-to-codepoints() to see what string of characters the XSLT processor has been given. It's a bit trickier with a 1.0 processor, but you can use substring($in, $n, 1) to extract the nth character, and that should give you a clue.
My suspicion is that it's the input. Firstly, putting non-ASCII characters in a Java string literal is always a bit dangerous, because the round trip to a source repository can easily corrupt the code if you're not very careful about everything being configured correctly. Secondly, if the string is correct, it would be much safer to read it using a StringReader, rather than converting it to a byte stream. Try:
transformer.transform(new StreamSource(
        new StringReader(SXmlFileData)), outResult);
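To make the suggestion concrete, here is a minimal sketch that keeps the data as characters on both sides, reading through a StringReader and writing into a StringWriter, so no byte-to-character decoding happens in the Java code at all. It uses the identity transformer rather than the stylesheet from the question, and the class name CharacterStreamTransform is an assumption made for illustration. Note also that ByteArrayOutputStream.toString() without a charset argument decodes with the platform default, which may be worth checking in the original code.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper: runs an identity transformation entirely on character streams.
class CharacterStreamTransform {

    static String identity(String xml) {
        try {
            // newTransformer() with no stylesheet copies the input to the output.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            // Reader in, Writer out: the transformer never sees encoded bytes.
            transformer.transform(new StreamSource(new StringReader(xml)),
                    new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }
}
```

When trying this out, writing the test string with Unicode escapes (for example "\u30AC\u30CD\u30FC\u30B7\u30E5") sidesteps the source-file encoding risk mentioned above.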

JDOM Transformer - don't contract empty elements

I'm using JDOM 2.0.6 to transform XML into HTML with XSLT, but I'm coming across the following problem: sometimes the data is empty. That is, I'll have the following in my XSLT:
<div class="someclass"><xsl:value-of select="somevalue"/></div>
and when somevalue is empty, the output I get is:
<div class="someclass"/>
which may be perfectly valid XML, but is not valid HTML, and causes problems when displaying the resulting page.
Similar problems occur for <span> or <script> tags.
So my question is - how can I tell JDOM not to contract empty elements, and leave them as <div></div>?
Edit
I suspect the problem is not in the actual XSLTTransformer, but later when using JDOM to write to html. Here is the code I use:
XMLOutputter htmlout = new XMLOutputter(Format.getPrettyFormat());
htmlout.getFormat().setEncoding("UTF-8");
Document htmlDoc = transformer.transform(model);
htmlDoc.setDocType(new DocType("html"));
try (OutputStreamWriter osw = new OutputStreamWriter(new FileOutputStream(outHtml), "UTF-8")) {
    htmlout.output(htmlDoc, osw);
}
Currently the proposed solution of adding a zero-width space works for me, but I'm interested to know if there is a way to tell JDOM to treat the document as an HTML (be it in the transform stage or the output stage, but I'm guessing the problem lies in the output stage).

You can put a zero-width space between the opening and closing tags. This doesn't affect how the HTML is displayed, but it keeps the open and close tags separate because the element now has non-empty content.
<div class="someclass">&#x200B;<xsl:value-of select="somevalue"/></div>
Downside is: the tag is not really empty anymore. That would matter if your output would be XML. But for HTML - which is probably the last stage of processing - it should not matter.

In your case, the XML transform is happening directly to a file/stream, and it is no longer in the control of JDOM.
In JDOM, you can select whether the output from the JDOM document has expanded, or not-expanded output for empty elements. Typically, people have output from JDOM like:
XMLOutputter xout = new XMLOutputter(Format.getPrettyFormat());
xout.output(document, System.out);
You can modify the output format, though, and expand the empty elements
Format expanded = Format.getPrettyFormat().setExpandEmptyElements(true);
XMLOutputter xout = new XMLOutputter(expanded);
xout.output(document, System.out);
If you 'recover' (assuming it is valid XHTML?) the XSLT transformed xml as a new JDOM document you can output the result with expanded empty elements.

If you want to transform to an HTML file, consider using a JAXP Transformer with a JDOMSource and a StreamResult. The Transformer will then serialize the transformation result as HTML if the output method is html (either as set in your code or as implied by a no-namespace root element named html).
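A minimal sketch of that idea, assuming nothing beyond the JDK: it uses an identity Transformer with the output method set to html, and a StreamSource in place of the JDOMSource so that it runs without JDOM on the classpath. The class name HtmlSerialization is made up for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper: lets the JAXP serializer write the tree using HTML rules.
class HtmlSerialization {

    static String toHtml(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            // The html output method writes empty non-void elements as <div></div>.
            transformer.setOutputProperty(OutputKeys.METHOD, "html");
            transformer.setOutputProperty(OutputKeys.INDENT, "no");
            StringWriter out = new StringWriter();
            // With JDOM available, the source could be new JDOMSource(htmlDoc) instead.
            transformer.transform(new StreamSource(new StringReader(xml)),
                    new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("HTML serialization failed", e);
        }
    }
}
```

The point of this design is that the decision about empty elements moves from the XMLOutputter to a serializer that knows HTML, so no placeholder characters are needed in the stylesheet.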

In addition to the "expandEmptyElements" option, you could create your own writer and pass it to the XMLOutputter:
XMLOutputter outputter = new XMLOutputter(Format.getPrettyFormat().setExpandEmptyElements(true));
StringWriter writer = new HTML5Writer();
outputter.output(document, writer);
System.out.println(writer.toString());
This writer can then modify all HTML5 void elements. Elements like "script" for example won't be touched:
private static class HTML5Writer extends StringWriter {
    private static String[] VOIDELEMENTS = new String[] { "area", "base", "br", "col", "command", "embed", "hr",
            "img", "input", "keygen", "link", "meta", "param", "source", "track", "wbr" };
    private boolean inVoidTag;
    private StringBuffer voidTagBuffer;

    @Override
    public void write(String str) {
        if (voidTagBuffer != null) {
            // Buffering a void element's start tag until the "></" token arrives.
            if (str.equals("></")) {
                voidTagBuffer.append(" />");
                super.write(voidTagBuffer.toString());
                voidTagBuffer = null;
            } else {
                voidTagBuffer.append(str);
            }
        } else if (inVoidTag) {
            // Swallow the rest of the expanded end tag up to its closing ">".
            if (str.equals(">")) {
                inVoidTag = false;
            }
        } else {
            for (int i = 0; i < VOIDELEMENTS.length; i++) {
                if (str.equals(VOIDELEMENTS[i])) {
                    inVoidTag = true;
                    voidTagBuffer = new StringBuffer(str);
                    return;
                }
            }
            super.write(str);
        }
    }
}
I know, this is dirty, but I had the same problem and didn't find any other way.

How to attach single or multiple attachments to CouchbaseLite document - Android?

I want to attach files to CouchbaseLite document. How can I do so? I did not find any code sample on official CBLite website for this - CBLite code Sample. I am still stuck how to accomplish it.
One way to do this in code is:
Document document = mDatabaseLocal.createDocument();
document.getCurrentRevision().createRevision().setAttachment(name, contentType, contentStream);
But this is not clear. *What should be the name?* Is it the absolute path of the attachment on your local disk?
For contentType: I do not know if there exists any enum class or constants that I can pass as contentType.
How would I attach multiple files to a document? Do I need to create unsavedRevision for every attachment?

The name must be unique per attachment. It doesn't refer to the local file; it is the name under which you want to fetch the attachment from the document later.
In this case you would call createRevision() once and then setAttachment() multiple times on the revision, before saving it.

You have to supply an InputStream as the attachment to your document.
An example can be found here: CouchBase Attachment Example.
You have to convert each file into an InputStream, and then you can set it on the document.
For the conversion you can use something like this:
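private InputStream getAsStream(YourData data)
{
    // Serialize the object into an in-memory buffer (YourData must be Serializable).
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    // try-with-resources closes the ObjectOutputStream, which flushes its buffer into baos.
    try (ObjectOutputStream objOstream = new ObjectOutputStream(baos))
    {
        objOstream.writeObject(data);
    } catch (IOException e)
    {
        e.printStackTrace();
    }
    // Expose the serialized bytes as a stream that can be attached.
    byte[] bArray = baos.toByteArray();
    return new ByteArrayInputStream(bArray);
}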
In this example YourData can be any serializable object, including your own types.
Hope this explanation helps.

How apply CDATA to transformer parameter with jdom

I have tried to surround the parameters sExtraParameter, sExtraParameter2 and sExtraParameter3 with a <![CDATA[ ]]> string in order to get "pretty-printed" Latin characters. But every time I check the XML output, it still shows badly parsed characters.
So, is there another way to apply CDATA to these parameters?
public static Element xslTransformJDOM(File xmlFile, String xslStyleSheet,
        String sExtraParameter, String sExtraParameterValue,
        String sExtraParameter2, String sExtraParameterValue2,
        String sExtraParameter3, String sExtraParameterValue3)
        throws JDOMException, TransformerConfigurationException, FileNotFoundException, IOException {
    try {
        Transformer transformer = TransformerFactory.newInstance().newTransformer(new StreamSource(xslStyleSheet));
        transformer.setParameter(sExtraParameter, sExtraParameterValue);
        transformer.setParameter(sExtraParameter2, sExtraParameterValue2);
        transformer.setParameter(sExtraParameter3, sExtraParameterValue3);
        JDOMResult out = new JDOMResult();
        transformer.transform(new StreamSource(xmlFile), out);
        Element result = out.getDocument().detachRootElement();
        setSize(new XMLOutputter().outputString(result).length());
        return result;
    }
    catch (TransformerException e) {
        throw new JDOMException("XSLT Transformation failed", e);
    }
}
Edit:
I am taking over a project from my boss, which is why I do not have the entire code to show you here.

Maybe I have missed the question, but the API (http://docs.oracle.com/javaee/1.4/api/javax/xml/transform/Transformer.html#setParameter(java.lang.String, java.lang.Object)) for setParameter does not expect a particular type of value. It documents the argument as:
"value - The value object. This can be any valid Java object. It is up to the processor to provide the proper object coersion or to simply pass the object on for use in an extension."
This could then vary by implementation, assuming you are using JDOM.
There may be a CDATA xml element that would then be processed correctly. Maybe: http://www.jdom.org/docs/apidocs/org/jdom2/CDATA.html
You could still think about setting the serializer settings to some sort of whitespace preservation. http://www.jdom.org/docs/apidocs.1.1/org/jdom/output/Format.TextMode.html

jaxb marshaller characterEscapeHandler

I have the following problem. I've set the following properties to the marshaller:
marshaller.setProperty( Marshaller.JAXB_FORMATTED_OUTPUT, Boolean.TRUE );
marshaller.setProperty( "com.sun.xml.bind.characterEscapeHandler", new CharacterEscapeHandler() {
    public void escape(char[] ch, int start, int length, boolean isAttVal, Writer out) throws IOException {
        String s = new String(ch, start, length);
        System.out.println("Inside CharacterEscapeHandler...");
        out.write(StringEscapeUtils.escapeXml(StringEscapeUtils.unescapeXml(s)));
    }
});
When I try to marshal an object to a SOAPBody with the following code:
SOAPMessage message = MessageFactory.newInstance().createMessage();
marshaller.marshal(request, message.getSOAPBody());
the CharacterEscapeHandler.escape is not invoked, and the characters are not escaped, but this code:
StringWriter writer = new StringWriter();
marshaller.marshal(request, writer);
invokes CharacterEscapeHandler.escape(), and all the characters are escaped. Is this normal behaviour for JAXB? And how can I escape characters before placing them inside the SOAP body?
Update:
Our system have to communicate with another system, which expects the text to be escaped.
Example for message sent by the other system:
<env:Envelope xmlns:env="http://www.w3.org/2003/05/soap-envelope">
<env:Body xmlns:ac="http://www.ACORD.org/Standards/AcordMsgSvc/1">
<ac:CallRs xmlns:ac="http://www.ACORD.org/Standards/AcordMsgSvc/1">
<ac:Sender>
<ac:PartyId>urn:partyId</ac:PartyId>
<ac:PartyRoleCd/>
<ac:PartyName>PARTYNAME</ac:PartyName>
</ac:Sender>
<ac:Receiver>
<ac:PartyRoleCd>broker</ac:PartyRoleCd>
<ac:PartyName>Ð�Ð¼Ð°Ñ€Ð°Ð½Ñ‚ Ð‘ÑŠÐ»Ð³Ð°Ñ€Ð¸Ñ� ÐžÐžÐ”</ac:PartyName>
</ac:Receiver>
<ac:Application>
<ac:ApplicationCd>applicationCd</ac:ApplicationCd>
<ac:SchemaVersion>schemaversion/</ac:SchemaVersion>
</ac:Application>
<ac:TimeStamp>2011-05-11T18:41:19</ac:TimeStamp>
<ac:MsgItem>
<ac:MsgId>30d63016-fa7d-4410-a19a-510e43674e70</ac:MsgId>
<ac:MsgTypeCd>Error</ac:MsgTypeCd>
<ac:MsgStatusCd>completed</ac:MsgStatusCd>
</ac:MsgItem>
<ac:RqItem>
<ac:MsgId>d8c2d9c4-3f1c-459f-abe1-0e9accbd176b</ac:MsgId>
<ac:MsgTypeCd>RegisterPolicyRq</ac:MsgTypeCd>
<ac:MsgStatusCd>completed</ac:MsgStatusCd>
</ac:RqItem>
<ac:WorkFolder>
<ac:MsgFile>
<ac:FileId>cid:28b8c9d1-9655-4727-bbb2-3107482e7f2e</ac:FileId>
<ac:FileFormatCd>text/xml</ac:FileFormatCd>
</ac:MsgFile>
</ac:WorkFolder>
</ac:CallRs>
</env:Body>
</env:Envelope>
So I need to escape all the text between the opening/closing tags.. like this inside ac:PartyName

When you marshal to a DOM Document, JAXB is not in charge of the actual serialization and escaping, it just builds the DOM tree in memory. The serialization is then handled by the DOM implementation.
Needing additional escaping when writing xml is usually a sign of a design problem or not using xml correctly. If you can give some more context why you need this escaping, maybe I could suggest an alternative solution.
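The difference between the in-memory tree and its serialized form can be seen with plain DOM, without JAXB or SAAJ involved. In this sketch the text node holds the raw characters, and escaping only appears once the tree is written out by a serializer. The class name DomEscaping and the sample element name are assumptions for illustration only.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo: a DOM tree stores raw text; escaping is the serializer's job.
class DomEscaping {

    // Builds a one-element document whose text node holds the string unchanged.
    static Document buildDocument(String text) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element element = doc.createElement("PartyName");
            element.setTextContent(text);
            doc.appendChild(element);
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("Could not create DOM document", e);
        }
    }

    // Writes the tree as XML text; markup characters are escaped at this point.
    static String serialize(Document doc) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Serialization failed", e);
        }
    }
}
```

This is why a handler that hooks into JAXB's own writer has nothing to do when the target is a DOM node: no characters are written as markup until something serializes the tree later.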
