Matching string to xml data - java

I pulled two strings from user input. I need to match one of them to the row ID.
I don't know whether I need to parse the strings into integers (or perhaps parse all of my XML data into strings and then look it up), or whether I can use the strings directly to look up the XML data. Perhaps neither.
Here's an example of what I'm storing in my XML file:
<?xml version="1.0" encoding="UTF-8"?>
<data>
<language>
<row id="7101">
<language-from>English</language-from>
<language-to>Thai</language-to>
<cost>30.00</cost>
<comment><![CDATA[out source]]></comment>
</row>
</language>
</data>

Looking at the XML document, I suggest you parse it into a set of structured objects and store them in a map for easy lookup, where the key is the search criterion that comes from the user's input. Unless the XML file changes often, parsing the document once is better than running string-based searches on it.
I would define a class to hold that information as follows:
class TranslationCost
{
private int id;
private String sourceLang;
private String targetLang;
private float cost;
private String comment;
public float getCost() { return cost; }
}
Map<Integer, TranslationCost> idToCostMap;
public float calculateCost(int id, int countOfWords)
{
TranslationCost costObj = idToCostMap.get(id);
if (null == costObj) {
throw new IllegalArgumentException("Unknown row id: " + id);
}
return costObj.getCost() * countOfWords;
}
Something like that...
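To make the "parse once, then look up" idea concrete, here is a minimal sketch using the JDK's DOM parser. The class name RowCostIndex and the choice to index only the cost value are assumptions made for illustration; a fuller version would store a TranslationCost object per row.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: indexes every <row> by its numeric id attribute.
class RowCostIndex {
    static Map<Integer, Float> load(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<Integer, Float> costById = new HashMap<>();
            NodeList rows = doc.getElementsByTagName("row");
            for (int i = 0; i < rows.getLength(); i++) {
                Element row = (Element) rows.item(i);
                int id = Integer.parseInt(row.getAttribute("id").trim());
                String cost = row.getElementsByTagName("cost").item(0).getTextContent();
                costById.put(id, Float.parseFloat(cost.trim()));
            }
            return costById;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<data><language><row id=\"7101\">"
                + "<language-from>English</language-from><language-to>Thai</language-to>"
                + "<cost>30.00</cost></row></language></data>";
        Map<Integer, Float> costById = load(xml);
        // The user's input arrives as a String, so convert it before the lookup.
        int id = Integer.parseInt("7101");
        System.out.println(costById.get(id));
    }
}
```

Since the map key is an Integer here, the user's string has to be parsed with Integer.parseInt first; keeping the key as a String would avoid that step at the cost of treating "7101" and " 7101" as different ids.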

You could use an XPath query to look up the nodes...
Document xmlDoc = // Load the XML into a DOM document
Node root = xmlDoc.getDocumentElement();
XPathFactory xFactory = XPathFactory.newInstance();
XPath xPath = xFactory.newXPath();
XPathExpression xExpress = xPath.compile("/data/language/row[language-from='English']");
NodeList nodeList = (NodeList) xExpress.evaluate(root, XPathConstants.NODESET);
I'd personally build a simpler model around the concept to make it easier to call into, but the concept is the same.
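Since the question is about matching a user-supplied string against the row id, the same XPath approach can select on the id attribute directly, with no integer parsing. The sketch below is one possible way to do it; the class name RowCostXPath and the variable name $id are assumptions. It binds the input through an XPathVariableResolver rather than concatenating it into the expression.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper: looks up the <cost> of the row whose id matches the user's string.
class RowCostXPath {
    // Returns the cost text, or an empty string when no row has that id.
    static String costFor(String xml, String rowId) {
        try {
            XPath xPath = XPathFactory.newInstance().newXPath();
            // Bind the input as the XPath variable $id instead of building the query by concatenation.
            xPath.setXPathVariableResolver(name -> "id".equals(name.getLocalPart()) ? rowId : null);
            return xPath.evaluate("/data/language/row[@id = $id]/cost",
                    new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("XPath lookup failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<data><language><row id=\"7101\">"
                + "<language-from>English</language-from><language-to>Thai</language-to>"
                + "<cost>30.00</cost></row></language></data>";
        System.out.println(costFor(xml, "7101"));
    }
}
```

This re-parses the XML on every call, so it suits occasional lookups; for repeated lookups the map-based approach is likely the better fit.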


XML Merging using XPATH

I am trying to merge two XML documents using "javax.xml.xpath.XPath".
Both the source and the destination XML look like the sample below.
I want to append all the child nodes of "bpmn:process" in the second XML document to the first one.
<?xml version="1.0" encoding="UTF-8"?>
<bpmn:definitions xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:bpmn="http://www.omg.org/spec/BPMN/20100524/MODEL" xmlns:bpmndi="http://www.omg.org/spec/BPMN/20100524/DI" xmlns:dc="http://www.omg.org/spec/DD/20100524/DC" xmlns:di="http://www.omg.org/spec/DD/20100524/DI" id="Definitions_1" targetNamespace="http://bpmn.io/schema/bpmn">
<bpmn:collaboration id="Collaboration_1ah989h">
<bpmn:participant id="Participant_108if28" processRef="Process_2" />
</bpmn:collaboration>
<bpmn:process id="Process_1" isExecutable="false">
<bpmn:startEvent id="StartEvent_1">
<bpmn:outgoing>SequenceFlow_1i0zw0x</bpmn:outgoing>
</bpmn:startEvent>
<bpmn:intermediateThrowEvent id="IntermediateThrowEvent_00epl00">
<bpmn:incoming>SequenceFlow_1i0zw0x</bpmn:incoming>
<bpmn:outgoing>SequenceFlow_05qx4z2</bpmn:outgoing>
</bpmn:intermediateThrowEvent>
</bpmn:process>
</bpmn:definitions>
Below is the code used to merge the XML:
Document destination = (Document) xpath.evaluate("/", new InputSource("C:/diagram_Sec.bpmn"), XPathConstants.NODE);
NodeList listPosts = (NodeList) xpath.evaluate("//bpmn:process//*", new InputSource("C:/diagram_Fir.xml"), XPathConstants.NODESET);
Element processElement = (Element) xpath.evaluate("//bpmn:process", destination, XPathConstants.NODE);
for (int i = 0; i < listPosts.getLength(); i++) {
Node listPost = listPosts.item(i);
Element element = (Element) listPost;
NamedNodeMap map = element.getAttributes();
for (int j = 0; j < map.getLength(); j++)
{
element.setAttribute(map.item(j).getLocalName(), map.item(j).getNodeValue());
}
Node node = destination.adoptNode(element);
processElement.appendChild(node);
}
DOMImplementationLS impl = (DOMImplementationLS) destination.getImplementation();
System.out.println(impl.createLSSerializer().writeToString(destination));
The problem is that this code treats every descendant of the "bpmn:process" tag as a separate node and puts it directly under "bpmn:process" (so all the sub-children also end up directly under "bpmn:process"). The output looks like this:
<bpmn:process id="Process_1" isExecutable="false">
//Here comes First xml nodes
//Second XML Content after merge
<bpmn:startEvent id="StartEvent_1">
</bpmn:startEvent>
//This tag should be inside bpmn:startEvent tag
<bpmn:outgoing>SequenceFlow_1i0zw0x</bpmn:outgoing>
<bpmn:intermediateThrowEvent id="IntermediateThrowEvent_00epl00">
</bpmn:intermediateThrowEvent>
//This should be inside above bpmn:intermediateThrowEvent tag
<bpmn:incoming>SequenceFlow_1i0zw0x</bpmn:incoming>
</bpmn:process>
But the expected output is:
<bpmn:process id="Process_1" isExecutable="false">
//Here comes First xml Children
//Second XML Content
<bpmn:startEvent id="StartEvent_1">
// outgoing is Inside bpmn:startEvent tag
<bpmn:outgoing>SequenceFlow_1i0zw0x</bpmn:outgoing>
</bpmn:startEvent>
<bpmn:intermediateThrowEvent id="IntermediateThrowEvent_00epl00">
// Inside bpmn:intermediateThrowEvent tag
<bpmn:incoming>SequenceFlow_1i0zw0x</bpmn:incoming>
</bpmn:intermediateThrowEvent>
</bpmn:process>
Please let me know the correct way of doing this.
Thanks,
XPath is a read-only language: you can't use it to construct new XML trees. For that you need XSLT or XQuery. Or DOM, if you really want to sink that low.
But this is so easy in XSLT that using DOM really seems a waste of effort. In XSLT you just need two template rules: the standard identity rule to copy everything unchanged, plus the rule
<xsl:template match="bpmn:process">
<xsl:copy>
<xsl:copy-of select="@*|node()"/>
<xsl:copy-of select="document('second.xml')//bpmn:process/node()"/>
</xsl:copy>
</xsl:template>
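If you do stay with DOM, the flattening described in the question comes from selecting every descendant ("//bpmn:process//*") and appending each one separately. A minimal sketch of the alternative is to walk only the direct children of the source process and import each one deeply, so nested elements travel with their parent. The class name ProcessMerger and the string-based input are assumptions made so the example is self-contained.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.ls.DOMImplementationLS;
import org.xml.sax.InputSource;

// Hypothetical helper: appends the children of one bpmn:process to another.
class ProcessMerger {
    static final String BPMN_NS = "http://www.omg.org/spec/BPMN/20100524/MODEL";

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    static String merge(String destinationXml, String sourceXml) {
        Document destination = parse(destinationXml);
        Document source = parse(sourceXml);
        Node target = destination.getElementsByTagNameNS(BPMN_NS, "process").item(0);
        Node from = source.getElementsByTagNameNS(BPMN_NS, "process").item(0);
        // Only direct children are visited; importNode(child, true) copies each whole subtree.
        for (Node child = from.getFirstChild(); child != null; child = child.getNextSibling()) {
            target.appendChild(destination.importNode(child, true));
        }
        DOMImplementationLS ls = (DOMImplementationLS) destination.getImplementation();
        return ls.createLSSerializer().writeToString(destination);
    }

    public static void main(String[] args) {
        String open = "<bpmn:definitions xmlns:bpmn=\"" + BPMN_NS + "\">";
        String first = open + "<bpmn:process id=\"Process_1\"><bpmn:startEvent id=\"A\"/>"
                + "</bpmn:process></bpmn:definitions>";
        String second = open + "<bpmn:process id=\"Process_2\"><bpmn:startEvent id=\"B\">"
                + "<bpmn:outgoing>Flow_1</bpmn:outgoing></bpmn:startEvent>"
                + "</bpmn:process></bpmn:definitions>";
        System.out.println(merge(first, second));
    }
}
```

importNode leaves the source document untouched, which keeps the sibling iteration safe; adoptNode would move nodes out of the source while the loop is still walking it.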

CreateTextNode escape characters in large text string

I am trying to include the correct characters in an XML document text node:
Document doc = docBuilder.newDocument();
Element rootElement = doc.createElement("rootnode");
doc.appendChild(rootElement);
Element request = doc.createElement("requestnode");
request.appendChild(doc.createTextNode(xml));
rootElement.appendChild(request);
The xml string is a segment of a large XML file which I have read in:
<firstname>John</firstname>
<dateOfBirth>28091999</dateOfBirth>
<surname>Doe</surname>
The problem is that passing this into createTextNode replaces some of the characters:
&lt;firstname&gt;John&lt;/firstname&gt;
&lt;dateOfBirth&gt;28091999&lt;/dateOfBirth&gt;
&lt;surname&gt;Doe&lt;/surname&gt;
Is there any way I can keep the correct characters (< and >) in the text node? I have read about using importNode, but this is not a well-formed XML document, only a segment of a file.
Any help would be greatly appreciated.
EDIT: I need the xml string (which is not fully formatted xml, only a segment of an external xml file) to be in the "request node" as I am building XML to be imported into SOAP UI
You can't pass the element tags and text to the createTextNode() method. You only need to pass the text, and then append this text node to an element.
If the source is another XML document, you must extract the text node from an element and insert it into the other. You can't grab a Node (element and text) and try to insert it as a text node in the other document. That is why you are seeing all the escape characters.
On the other hand, you can insert this Node into the other XML document (if the structure allows it) and it should be just fine.
In your context, I assume "request" is some sort of Node. The child of a Node could be another element, text, etc. You have to be very specific.
You can do something like:
Element name = doc.createElement("name");
Element dob = doc.createElement("dateOfBirth");
Element surname = doc.createElement("surname");
name.appendChild( doc.createTextNode("John") );
dob.appendChild( doc.createTextNode("28091999") );
surname.appendChild( doc.createTextNode("Doe") );
Then you can add these elements to a parent node:
node.appendChild(name);
node.appendChild(dob);
node.appendChild(surname);
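UPDATE: As an alternative, you can parse your XML string from a byte stream into a second document and import its nodes. Something like this (untested code, but close):
String xmlString = "<firstname>John</firstname><dateOfBirth>28091999</dateOfBirth><surname>Doe</surname>";
// The fragment has several top-level elements, so wrap it in a temporary root to make it parseable.
String wrapped = "<wrapper>" + xmlString + "</wrapper>";
DocumentBuilderFactory fac = javax.xml.parsers.DocumentBuilderFactory.newInstance();
DocumentBuilder builder = fac.newDocumentBuilder();
Document newDoc = builder.parse(new ByteArrayInputStream(wrapped.getBytes(StandardCharsets.UTF_8)));
Element newElem = doc.createElement("whatever");
// doc already has a root element, so attach the new element below it.
rootElement.appendChild(newElem);
// Import each child of the temporary root into the target document.
NodeList children = newDoc.getDocumentElement().getChildNodes();
for (int i = 0; i < children.getLength(); i++) {
newElem.appendChild(doc.importNode(children.item(i), true));
}
Something like that should do the trick.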
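It may also help to see that the escaping only exists in the serialized form. The sketch below (the class name TextNodeEscaping is an assumption) builds the same rootnode/requestnode structure as the question, serializes it with the JDK's identity Transformer, and reads the text back: the serialized output carries entity references, while the text node itself still holds the original characters, which is what any XML-aware consumer will see after parsing.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class: shows text-node content before and after serialization.
class TextNodeEscaping {
    static Document buildDoc(String fragment) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement("rootnode");
            doc.appendChild(root);
            Element request = doc.createElement("requestnode");
            request.appendChild(doc.createTextNode(fragment));
            root.appendChild(request);
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException("Could not build document", e);
        }
    }

    static String serialize(Document doc) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not serialize document", e);
        }
    }

    public static void main(String[] args) {
        Document doc = buildDoc("<firstname>John</firstname>");
        // Serialized form: markup characters in the text are written as entity references.
        System.out.println(serialize(doc));
        // In-memory form: the text node still contains the literal characters.
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```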

Parse xml with text and xml tags in same xml tag

I want to parse an XML document with Java that looks something like this:
<sentence>This is a <a><b>long</b></a> sentence.</sentence>
<sentence>This is a second <a><b>even</b></a> longer sentence.</sentence>
As a result I need the whole sentence without the XML tags. I tried to parse this with dom4j. Calling element.getText() (where the current element is the sentence tag), I only get the sentence without the text inside the nested XML tags.
Thanks for your help!
Regards
You can use XPath to select all the text nodes:
String getAllTextContent(Node node) {
List<Node> nodes = node.selectNodes("descendant-or-self::text()");
StringBuilder buf = new StringBuilder();
for ( Node n : nodes ) {
buf.append(n.getText());
}
return buf.toString();
}
// usage
System.out.println(getAllTextContent(doc.selectSingleNode("//sentence")));
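For comparison, here is a minimal sketch of the same idea with the JDK's built-in DOM API instead of dom4j, where Node.getTextContent() already concatenates the text of all descendants. The class name SentenceText and the wrapping doc root element are assumptions, added because the sample in the question has no single root.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: returns the plain text of every <sentence> element.
class SentenceText {
    static List<String> sentences(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName("sentence");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                // getTextContent() joins the text of the element and all nested elements.
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<doc>"
                + "<sentence>This is a <a><b>long</b></a> sentence.</sentence>"
                + "<sentence>This is a second <a><b>even</b></a> longer sentence.</sentence>"
                + "</doc>";
        for (String sentence : sentences(xml)) {
            System.out.println(sentence);
        }
    }
}
```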
Keep your data in a CDATA section inside your XML tags:
<sentence><![CDATA[This is a <a><b>long</b></a> sentence.]]></sentence>

Java xPath - extract subdocument from XML

I have an XML document as follows:
<DocumentWrapper>
<DocumentHeader>
...
</DocumentHeader>
<DocumentBody>
<Invoice>
<Buyer/>
<Seller/>
</Invoice>
</DocumentBody>
</DocumentWrapper>
I would like to extract the content of the DocumentBody element as a String containing the raw XML:
<Invoice>
<Buyer/>
<Seller/>
</Invoice>
With XPath it should be simple to select with:
/DocumentWrapper/DocumentBody
Unfortunately, my Java code doesn't work as I want. It returns empty lines instead of the expected result. Is there any way to do that, or do I have to return a NodeList and then generate an XML document from it?
My Java code:
XPathFactory xPathFactory = XPathFactory.newInstance();
XPath xPath = xPathFactory.newXPath();
XPathExpression expression = xPath.compile(xPathQuery);
String result = expression.evaluate(xmlDocument);
Calling this method
String result = expression.evaluate(xmlDocument);
is the same as calling this
String result = (String) expression.evaluate(xmlDocument, XPathConstants.STRING);
which returns the character data of the result node, or the character data of all child nodes in case the result node is an element.
You should probably do something like this:
Node result = (Node) expression.evaluate(xmlDocument, XPathConstants.NODE);
TransformerFactory.newInstance().newTransformer()
.transform(new DOMSource(result), new StreamResult(System.out));
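To get the result as a String rather than on System.out, the same transform can write into a StringWriter. The sketch below is one way to put it together; the class name SubdocumentExtractor is an assumption, and note that it selects /DocumentWrapper/DocumentBody/* so the output starts at Invoice rather than at the DocumentBody wrapper.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper: serializes the first node matched by an XPath query to a String.
class SubdocumentExtractor {
    static String extract(String xml, String xPathQuery) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Node node = (Node) XPathFactory.newInstance().newXPath()
                    .evaluate(xPathQuery, doc, XPathConstants.NODE);
            if (node == null) {
                return ""; // nothing matched the query
            }
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            // Leave out the <?xml ...?> header so only the subtree itself is returned.
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(node), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not extract subdocument", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<DocumentWrapper><DocumentHeader/><DocumentBody>"
                + "<Invoice><Buyer/><Seller/></Invoice>"
                + "</DocumentBody></DocumentWrapper>";
        System.out.println(extract(xml, "/DocumentWrapper/DocumentBody/*"));
    }
}
```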

How to access a value read from XML using XPath in Java

I want to read XML data using XPath in Java.
I have the following XML file named MyXML.xml:
<?xml version="1.0" encoding="iso-8859-1" ?>
<REPOSITORY xmlns:LIBRARY="http://www.openarchives.org/LIBRARY/2.0/"
xmlns:xsi="http://www.w3.prg/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.openarchives.org/LIBRARY/2.0/ http://www.openarchives.org/LIBRARY/2.0/LIBRARY-PHM.xsd">
<repository>Test</repository>
<records>
<record>
<ejemplar>
<library_book:book
xmlns:library_book="http://www.w3c.es/LIBRARY/book/"
xmlns:book="http://www.w3c.es/LIBRARY/book/"
xmlns:bookAssets="http://www.w3c.es/LIBRARY/book/"
xmlns:bookAsset="http://www.w3c.es/LIBRARY/book/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.w3c.es/LIBRARY/book/ http://www.w3c.es/LIBRARY/replacement/book.xsd">
<book:bookAssets count="1">
<book:bookAsset nasset="1">
<book:bookAsset.id>value1</book:bookAsset.id>
<book:bookAsset.event>
<book:bookAsset.event.id>value2</book:bookAsset.event.id>
</book:bookAsset.event>
</book:bookAsset>
</book:bookAssets>
</library_book:book>
</ejemplar>
</record>
</records>
</REPOSITORY>
I want to access the value1 and value2 values. To do so, I tried this:
// Standard of reading a XML file
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setNamespaceAware(true);
DocumentBuilder builder;
Document doc = null;
XPathExpression expr = null;
builder = factory.newDocumentBuilder();
doc = builder.parse("MyXML.xml");
// Create a XPathFactory
XPathFactory xFactory = XPathFactory.newInstance();
// Create a XPath object
XPath xpath = xFactory.newXPath();
expr = xpath.compile("//REPOSITORY/records/record/ejemplar/library_book:book//book:bookAsset.event.id/text()");
Object result = expr.evaluate(doc, XPathConstants.STRING);
System.out.println("RESULT=" + (String)result);
But I don't get any results; it only prints RESULT=.
How can I access the value1 and value2 values? Which XPath expression should I apply?
Thanks in advance.
I'm using JDK6.
You are having problems with namespaces. You can either:
1. take them into account, or
2. ignore them using the XPath local-name() function.
Solution 1 means implementing a NamespaceContext that maps namespace prefixes to URIs and setting it on the XPath object before querying.
Solution 2 is easy: you just need to change your XPath (though depending on your XML you may need to fine-tune the XPath to be sure you select the correct element):
XPath xpath = xFactory.newXPath();
expr = xpath.compile("//*[local-name()='bookAsset.event.id']/text()");
Object result = expr.evaluate(doc, XPathConstants.STRING);
System.out.println("RESULT=" + result);
Runnable example on ideone.
You can take a look at the following blog article to better understand the uses of namespaces and XPath in Java (even if old)
Try
Object result = expr.evaluate(doc, XPathConstants.NODESET);
// Cast the result to a DOM NodeList
NodeList nodes = (NodeList) result;
for (int i=0; i<nodes.getLength();i++){
System.out.println(nodes.item(i).getNodeValue());
}
One approach is to implement a namespace context like:
public static class UniversalNamespaceResolver implements NamespaceContext {
private Document sourceDocument;
public UniversalNamespaceResolver(Document document) {
sourceDocument = document;
}
public String getNamespaceURI(String prefix) {
if (prefix.equals(XMLConstants.DEFAULT_NS_PREFIX)) {
return sourceDocument.lookupNamespaceURI(null);
} else {
return sourceDocument.lookupNamespaceURI(prefix);
}
}
public String getPrefix(String namespaceURI) {
return sourceDocument.lookupPrefix(namespaceURI);
}
public Iterator getPrefixes(String namespaceURI) {
return null;
}
}
And then use it like
xpath.setNamespaceContext(new UniversalNamespaceResolver(doc));
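You also need to move all the namespace declarations up to the root node (REPOSITORY), because the resolver above only looks up prefixes that are in scope at the document element. Otherwise prefixes declared deeper in the tree, or declared on two different levels, will not be resolved.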
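If you cannot change the XML, a fixed prefix-to-URI mapping avoids depending on where the declarations sit in the document. The sketch below is a minimal version of that idea; the class name BookAssetReader and the choice to map only the book prefix are assumptions. It evaluates a union expression as a NODESET so that both value1 and value2 come back.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: evaluates a namespace-aware XPath with a hard-coded "book" prefix.
class BookAssetReader {
    static final String BOOK_NS = "http://www.w3c.es/LIBRARY/book/";

    static List<String> values(String xml, String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // The prefix used in the XPath is resolved here, independently of the document's prefixes.
            xpath.setNamespaceContext(new NamespaceContext() {
                public String getNamespaceURI(String prefix) {
                    return "book".equals(prefix) ? BOOK_NS : XMLConstants.NULL_NS_URI;
                }
                public String getPrefix(String uri) {
                    return BOOK_NS.equals(uri) ? "book" : null;
                }
                public Iterator<String> getPrefixes(String uri) {
                    String prefix = getPrefix(uri);
                    List<String> prefixes = prefix == null
                            ? Collections.<String>emptyList() : Collections.singletonList(prefix);
                    return prefixes.iterator();
                }
            });
            NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<REPOSITORY><library_book:book xmlns:library_book=\"" + BOOK_NS + "\""
                + " xmlns:book=\"" + BOOK_NS + "\"><book:bookAsset.id>value1</book:bookAsset.id>"
                + "<book:bookAsset.event><book:bookAsset.event.id>value2</book:bookAsset.event.id>"
                + "</book:bookAsset.event></library_book:book></REPOSITORY>";
        System.out.println(values(xml, "//book:bookAsset.id | //book:bookAsset.event.id"));
    }
}
```

Because matching is done on the namespace URI, the single book prefix in the expression also matches elements written with library_book or any other prefix bound to the same URI.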
