Xpath query with Java - java

I need the right expression for an XPath query in Java
Good evening.
I'm here because I can't get the right result. I'm trying to get UF1, UF2, UF3 and Steve Rogers from the XML document below. I'm using Java with the following expression: XPathExpression expr = xpath.compile("/modulos/modulo[@m='M01']/alumno/nombre/text()");
I get Steve Rogers only. If I write
XPathExpression expr = xpath.compile("/modulos/modulo[@m='M01']/alumno/nombre/UF1/UF2/UF3/text()");
I don't get anything. Any ideas? Thanks a lot. Regards
<?xml version="1.0"?>
<modulos>
<modulo m="M01">
<alumno>
<nombre>Steve Rogers</nombre>
<UF1>5.00</UF1>
<UF2>3.00</UF2>
<UF3>7.00</UF3>
</alumno>
<alumno>
<nombre>Bruce Banner</nombre>
<UF1>9.00</UF1>
<UF2>8.50</UF2>
<UF3>8.00</UF3>
</alumno>
<alumno>
<nombre>Tony Stark</nombre>
<UF1>9.00</UF1>
<UF2>9.00</UF2>
<UF3>9.00</UF3>
</alumno>
</modulo>
<modulo m="M02">
<alumno>
<nombre>Bruce Banner</nombre>
<UF1>10.00</UF1>
<UF2>7.75</UF2>
<UF3>6.00</UF3>
</alumno>
</modulo>
<modulo m="M03">
<alumno>
<nombre>Bruce Banner</nombre>
<UF1>8.50</UF1>
<UF2>6.50</UF2>
<UF3>5.00</UF3>
</alumno>
<alumno>
<nombre>Tony Stark</nombre>
<UF1>8.00</UF1>
<UF2>10.00</UF2>
<UF3>9.00</UF3>
</alumno>
</modulo>
</modulos>

The slash (/) represents a parent-child relationship.
/modulos/modulo means “all modulo elements which are children of the root modulos element.”
alumno/nombre/UF1/UF2/UF3 means “a UF3 element which is a child of a UF2 element which is a child of a UF1 element which is a child of a nombre element which is a child of an alumno element.” In other words, your XPath is looking for this:
<modulos>
<modulo m="M01">
<alumno>
<nombre>
<UF1>
<UF2>
<UF3>7.00</UF3>
</UF2>
</UF1>
</nombre>
</alumno>
<alumno>
<nombre>
<UF1>
<UF2>
<UF3>8.00</UF3>
</UF2>
</UF1>
</nombre>
</alumno>
<alumno>
<nombre>
<UF1>
<UF2>
<UF3>9.00</UF3>
</UF2>
</UF1>
</nombre>
</alumno>
</modulo>
Obviously, this isn’t what you want. You want the text of each child element in the alumno element. The easiest way to do that is:
xpath.compile("/modulos/modulo[@m='M01']/alumno/*/text()");
The * means “an element with any name.” The above expression is safe as long as you can be sure that the only element children of alumno will be nombre, UF1, UF2, and UF3.
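To make the difference concrete, here is a minimal sketch that runs both expressions with the JDK's javax.xml.xpath API. The class name AlumnoValues, the helper select, and the trimmed two-student copy of the document are assumptions made for illustration only. The result is requested as XPathConstants.NODESET; asking for a string instead returns only the first matching node's value, which is probably why the first attempt printed just Steve Rogers.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class AlumnoValues {
    // Trimmed copy of the question's document: module M01 with two students.
    static final String XML =
        "<modulos><modulo m=\"M01\">"
        + "<alumno><nombre>Steve Rogers</nombre><UF1>5.00</UF1><UF2>3.00</UF2><UF3>7.00</UF3></alumno>"
        + "<alumno><nombre>Bruce Banner</nombre><UF1>9.00</UF1><UF2>8.50</UF2><UF3>8.00</UF3></alumno>"
        + "</modulo></modulos>";

    // Evaluates the expression as a node set and collects the value of every selected text node.
    static List<String> select(String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expression, new InputSource(new StringReader(XML)), XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getNodeValue());
            }
            return values;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("Invalid XPath: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Nested path: looks for UF3 inside UF2 inside UF1 inside nombre, which does not exist.
        System.out.println(select("/modulos/modulo[@m='M01']/alumno/nombre/UF1/UF2/UF3/text()"));
        // Wildcard path: the text of every element child of each alumno.
        System.out.println(select("/modulos/modulo[@m='M01']/alumno/*/text()"));
    }
}
```

The wildcard version returns the values of all students in the module in document order, so the caller still has to group them per alumno if that matters.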

All requested elements are children of alumno. This XPath gets the values:
//modulo[@m="M01"]/alumno[nombre[.="Steve Rogers"]]/*/text()
Or
//modulo[@m="M01"]/alumno[1]/*/text()
Or
//modulo[@m="M01"]/alumno[1]/descendant::*[local-name()="nombre" or starts-with(local-name(),"UF")]/text()
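If the module and student name come from user input, one possible variation of the first expression is to pass them as XPath variables rather than concatenating strings. This is only a sketch: the class StudentLookup, the method valuesFor, and the variable names $module and $student are made up for the example, and the document is a shortened copy of the one in the question.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class StudentLookup {
    // Shortened copy of the question's document (two modules).
    static final String XML =
        "<modulos>"
        + "<modulo m=\"M01\">"
        + "<alumno><nombre>Steve Rogers</nombre><UF1>5.00</UF1><UF2>3.00</UF2><UF3>7.00</UF3></alumno>"
        + "<alumno><nombre>Bruce Banner</nombre><UF1>9.00</UF1><UF2>8.50</UF2><UF3>8.00</UF3></alumno>"
        + "</modulo>"
        + "<modulo m=\"M02\">"
        + "<alumno><nombre>Bruce Banner</nombre><UF1>10.00</UF1><UF2>7.75</UF2><UF3>6.00</UF3></alumno>"
        + "</modulo>"
        + "</modulos>";

    // Returns the name and the three marks of one student in one module.
    static List<String> valuesFor(String module, String student) {
        Map<String, String> variables = Map.of("module", module, "student", student);
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Resolve $module and $student from the map above.
        xpath.setXPathVariableResolver(name -> variables.get(name.getLocalPart()));
        try {
            NodeList nodes = (NodeList) xpath.evaluate(
                "//modulo[@m=$module]/alumno[nombre=$student]/*/text()",
                new InputSource(new StringReader(XML)), XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getNodeValue());
            }
            return values;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(valuesFor("M01", "Steve Rogers"));
        System.out.println(valuesFor("M02", "Bruce Banner"));
    }
}
```

Selecting by name instead of by position (alumno[1]) keeps the query stable if the order of students in the file changes.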

Related

XPath search by "id" attribute , giving NPE - Java

All,
I have multiple XML templates that I need to fill with data. To allow my document builder class to use multiple templates and insert data correctly,
I designate the node that I want my class to insert data into by adding an attribute of:
id="root"
One example of an XML
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<SiebelMessage MessageId="07f33fa0-2045-46fd-b88b-5634a3de9a0b" MessageType="Integration Object" IntObjectName="" IntObjectFormat="Siebel Hierarchical" ReturnCode="0" ErrorMessage="">
<listOfReadAudit >
<readAudit id="root">
<recordId mapping="Record ID"></recordId>
<userId mapping="User ID"></userId>
<customerId mapping="Customer ID"></customerId>
<lastUpd mapping="Last Updated"></lastUpd>
<lastUpdBy mapping="Last Updated By"></lastUpdBy>
<busComp mapping="Entity Name"></busComp>
</readAudit>
</listOfReadAudit>
</SiebelMessage>
Code
expr = xpath.compile("//SiebelMessage[@id='root']");
root = (Element) expr.evaluate(xmlDoc, XPathConstants.NODE);
Element temp = (Element) root.cloneNode(true);
Using this example:
XPath to select Element by attribute value
The expression is not working:
//SiebelMessage[@id='root']
Any ideas what I am doing wrong?
Try this:
//readAudit[@id='root']
This selects all readAudit elements with the id attribute set to root (it should be just 1 element in your case).
You could make sure it returns at most 1 element with this:
//readAudit[@id='root'][1]
What you are doing is selecting SiebelMessage nodes with the attribute id='root'.
But the SiebelMessage element doesn't have an id; it's the readAudit element you are after. So either do
//readAudit[@id='root']
or
//SiebelMessage//readAudit[@id='root']
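The NullPointerException follows from how the JDK reports "no match": when XPathConstants.NODE is requested and nothing is selected, evaluate returns null, and the later root.cloneNode(true) call fails. A small sketch of both cases, assuming a cut-down version of the template and a hypothetical helper class TemplateRoot:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class TemplateRoot {
    // Cut-down version of the template from the question.
    static final String XML =
        "<SiebelMessage MessageType=\"Integration Object\">"
        + "<listOfReadAudit><readAudit id=\"root\">"
        + "<recordId mapping=\"Record ID\"></recordId>"
        + "<userId mapping=\"User ID\"></userId>"
        + "</readAudit></listOfReadAudit></SiebelMessage>";

    // Returns the first element selected by the expression, or null if nothing matches.
    static Element find(String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            return (Element) xpath.evaluate(expression, doc, XPathConstants.NODE);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // SiebelMessage has no id attribute, so nothing is selected.
        Element wrong = find("//SiebelMessage[@id='root']");
        System.out.println("SiebelMessage match: " + wrong);

        // readAudit carries id="root"; check for null before cloning.
        Element root = find("//readAudit[@id='root']");
        if (root != null) {
            Element copy = (Element) root.cloneNode(true);
            System.out.println("Cloned element: " + copy.getTagName());
        }
    }
}
```

Checking the result for null before cloning also gives a clearer failure when a template is missing its id="root" marker.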

How to access OWL documents using XPath in Java?

I am having an OWL document in the form of an XML file. I want to extract elements from this document. My code works for simple XML documents, but it does not work with OWL XML documents.
I was actually looking to get this element: /rdf:RDF/owl:Ontology/rdfs:label, for which I did this:
DocumentBuilder builder = builderfactory.newDocumentBuilder();
Document xmlDocument = builder.parse(
new File(XpathMain.class.getResource("person.xml").getFile()));
XPathFactory factory = javax.xml.xpath.XPathFactory.newInstance();
XPath xPath = factory.newXPath();
XPathExpression xPathExpression = xPath.compile("/rdf:RDF/owl:Ontology/rdfs:label/text()");
String nameOfTheBook = xPathExpression.evaluate(xmlDocument,XPathConstants.STRING).toString();
I also tried extracting only the rdfs:label element this way:
XPathExpression xPathExpression = xPath.compile("//rdfs:label");
NodeList nodes = (NodeList) xPathExpression.evaluate(xmlDocument, XPathConstants.NODESET);
But this nodelist is empty.
Please let me know where I am going wrong. I am using Java XPath API.
Don't query RDF (or OWL) with XPath
There's already an accepted answer, but I wanted to elaborate on @Michael's comment on the question. It's a very bad idea to try to work with RDF as XML (and hence with the RDF/XML serialization of an OWL ontology), and the reason is very simple: the same RDF graph can be serialized as lots of different XML documents. In the question, all that's being asked for is the rdfs:label of an owl:Ontology element, so how much could go wrong? Well, here are two serializations of the ontology.
The first is fairly human readable, and was generated by the OWL API when I saved the ontology using the Protégé ontology editor. The query in the accepted answer would work on this, I think.
<rdf:RDF xmlns="http://www.example.com/labelledOnt#"
xml:base="http://www.example.com/labelledOnt"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xmlns:owl="http://www.w3.org/2002/07/owl#"
xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<owl:Ontology rdf:about="http://www.example.com/labelledOnt">
<rdfs:label>Here is a label on the Ontology.</rdfs:label>
</owl:Ontology>
</rdf:RDF>
Here is the same RDF graph using fewer of the fancy features available in the RDF/XML encoding. This is the same RDF graph, and thus the same OWL ontology. However, there is no owl:Ontology XML element here, and the XPath query will fail.
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:owl="http://www.w3.org/2002/07/owl#"
xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
xmlns="http://www.example.com/labelledOnt#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#" >
<rdf:Description rdf:about="http://www.example.com/labelledOnt">
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Ontology"/>
<rdfs:label>Here is a label on the Ontology.</rdfs:label>
</rdf:Description>
</rdf:RDF>
You cannot reliably query an RDF graph in RDF/XML serialization by using typical XML-processing techniques.
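A quick way to see this is to run one XPath expression against both serializations. The sketch below uses the JDK's DOM and XPath APIs with local-name() tests, so no prefix bindings are needed; the class RdfSerializations, its constants, and the method label are invented for the example, and the two documents are condensed copies of the ones above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class RdfSerializations {
    static final String NS =
        "xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\" "
        + "xmlns:owl=\"http://www.w3.org/2002/07/owl#\" "
        + "xmlns:rdfs=\"http://www.w3.org/2000/01/rdf-schema#\"";

    // First serialization: the ontology appears as an owl:Ontology element.
    static final String TYPED_NODE =
        "<rdf:RDF " + NS + ">"
        + "<owl:Ontology rdf:about=\"http://www.example.com/labelledOnt\">"
        + "<rdfs:label>Here is a label on the Ontology.</rdfs:label>"
        + "</owl:Ontology></rdf:RDF>";

    // Second serialization: same graph, but written as rdf:Description plus rdf:type.
    static final String DESCRIPTION =
        "<rdf:RDF " + NS + ">"
        + "<rdf:Description rdf:about=\"http://www.example.com/labelledOnt\">"
        + "<rdf:type rdf:resource=\"http://www.w3.org/2002/07/owl#Ontology\"/>"
        + "<rdfs:label>Here is a label on the Ontology.</rdfs:label>"
        + "</rdf:Description></rdf:RDF>";

    static final String QUERY =
        "/*[local-name()='RDF']/*[local-name()='Ontology']/*[local-name()='label']";

    // Returns the label found by QUERY, or an empty string if nothing matches.
    static String label(String rdfXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(rdfXml)));
            return XPathFactory.newInstance().newXPath().evaluate(QUERY, doc);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("typed node:  '" + label(TYPED_NODE) + "'");
        System.out.println("description: '" + label(DESCRIPTION) + "'");
    }
}
```

The query finds the label in the first document and nothing in the second, even though an RDF tool would treat the two files as the same graph.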
Query RDF with SPARQL
Well, if we cannot reliably query RDF with XPath, what are we supposed to use? The standard query language for RDF is SPARQL. RDF is a graph-based representation, and SPARQL queries include graph patterns that can match a graph.
In this case, the pattern that we want to match in a graph consists of two triples. A triple is a 3-tuple of the form [subject,predicate,object]. Both triples have the same subject.
The first triple says that the subject is of type owl:Ontology. The relationship “is of type” is rdf:type, so the first triple is [?something,rdf:type,owl:Ontology].
The second triple says that subject (now known to be an ontology) has an rdfs:label, and that's the value that we're interested in. The corresponding triple is [?something,rdfs:label,?label].
In SPARQL, after defining the necessary prefixes, we can write the following query.
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?label WHERE {
?ontology a owl:Ontology ;
rdfs:label ?label .
}
(Note that because rdf:type is so common, SPARQL includes a as an abbreviation for it. The notation s p1 o1; p2 o2 . is just shorthand for the two-triple pattern s p1 o1 . s p2 o2 ..)
You can run SPARQL queries against your model in Jena either programmatically, or using the command line tools. If you do it programmatically, it is fairly easy to get the results out. To confirm that this query gets the value we're interested in, we can use Jena's command line for arq to test it out.
$ arq --data labelledOnt.owl --query getLabel.sparql
--------------------------------------
| label |
======================================
| "Here is a label on the Ontology." |
--------------------------------------
XPath does not know the namespaces you are using.
Try using:
"/*[local-name()='RDF']/*[local-name()='Ontology']/*[local-name()='label']/text()"
local-name() ignores the namespaces, so this will work (for the first instance that it finds).
You can use namespace prefixes in the query if you implement javax.xml.namespace.NamespaceContext yourself. Have a look at this answer, https://stackoverflow.com/a/5466030/1443529, which explains how to do it.
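As a rough sketch of that approach (not the code from the linked answer), the following maps the three prefixes used in the question to their namespace URIs and evaluates the original prefixed expression. The class OwlLabel, the PREFIXES map, and the inline SAMPLE document are assumptions for the example. Note that the DocumentBuilderFactory is made namespace-aware so that the parsed elements carry namespace URIs for the prefixed names to match against.

```java
import java.io.StringReader;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class OwlLabel {
    // Prefix-to-URI bindings for the prefixes used in the XPath expression.
    static final Map<String, String> PREFIXES = Map.of(
        "rdf", "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
        "owl", "http://www.w3.org/2002/07/owl#",
        "rdfs", "http://www.w3.org/2000/01/rdf-schema#");

    static final NamespaceContext CONTEXT = new NamespaceContext() {
        @Override
        public String getNamespaceURI(String prefix) {
            if (prefix == null) {
                throw new IllegalArgumentException("prefix must not be null");
            }
            return PREFIXES.getOrDefault(prefix, XMLConstants.NULL_NS_URI);
        }

        @Override
        public String getPrefix(String namespaceURI) {
            for (Map.Entry<String, String> entry : PREFIXES.entrySet()) {
                if (entry.getValue().equals(namespaceURI)) {
                    return entry.getKey();
                }
            }
            return null;
        }

        @Override
        public Iterator<String> getPrefixes(String namespaceURI) {
            String prefix = getPrefix(namespaceURI);
            List<String> found = prefix == null ? List.of() : List.of(prefix);
            return found.iterator();
        }
    };

    static final String SAMPLE =
        "<rdf:RDF xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\" "
        + "xmlns:owl=\"http://www.w3.org/2002/07/owl#\" "
        + "xmlns:rdfs=\"http://www.w3.org/2000/01/rdf-schema#\">"
        + "<owl:Ontology rdf:about=\"http://www.example.com/labelledOnt\">"
        + "<rdfs:label>Here is a label on the Ontology.</rdfs:label>"
        + "</owl:Ontology></rdf:RDF>";

    // Evaluates the question's prefixed expression against the given document text.
    static String label(String rdfXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // elements need namespace URIs to match prefixed names
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(rdfXml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(CONTEXT);
            return xpath.evaluate("/rdf:RDF/owl:Ontology/rdfs:label/text()", doc);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(label(SAMPLE));
    }
}
```

The prefixes in the expression only have to map to the same URIs as the document's; they do not need to be spelled the same as the prefixes in the file.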

dom4J: How to get the value of Elements of a Node?

I am reading an XML using dom4j by using XPath techniques for selecting desired nodes. Consider that my XML looks like this:
<Employees>
<Emp id="1">
<name>jame</name>
<age>12</age>
</Emp>
.
.
.
</Employees>
Now I need to store the information of all employees in a list of my Employee class. So far I have written the following:
List<? extends Node> lstprmntEmps = document.selectNodes("//Employees/Emp");
ArrayList<Employee> Employees = new ArrayList<Employee>();//Employee is my custom class
for (Node node : lstprmntEmps)
{
Employees.add(ParseEmployee(node));//ParseEmployee(. . .) is my custom function that parses the Emp XML and returns an Employee object
}
Now how do I get the name and age of the currently selected node?
Is there a method such as node.getElementValue("name");?
Cast each node to Element, then ask the element for its first "name" sub-element and its first "age" sub-element and get their text.
See http://dom4j.sourceforge.net/apidocs/org/dom4j/Element.html.
The elementText(String) method of Element appears to get a sub-element by name and retrieve its text in one operation, but it's undocumented, so it's hard to say.
Note that variables and methods should always start with a lowercase letter in Java.

Find duplicated XML Element Names (xPath with variable)

I'm using XPath 1.0 parsers alongside CLiXML in my Java project, and I'm trying to set up a CLiXML constraint rules file.
I would like to show an error if there are duplicate element names under a specific child.
For example
<parentNode version="1">
<childA version="1">
<ignoredChild/>
</childA>
<childB version="1">
<ignoredChild/>
</childB>
<childC version="4">
<ignoredChild/>
</childC>
<childA version="2">
<ignoredChild/>
</childA>
<childD version="6">
<ignoredChild/>
</childD>
</parentNode>
childA appears more than once, so I would show an error about this.
NOTE: I only want to 'check/count' the Element name, not the attributes inside or the children of the element.
The code inside my .clx rules file that I've tried is:
<forall var="elem1" in=".//parentNode/*">
<equal op1="count(.//parentNode/$elem1)" op2="1"/>
</forall>
But that doesn't work, I get the error:
Caused by: class org.jaxen.saxpath.XPathSyntaxException: count(.//PLC-Mapping/*/$classCount: 23: Expected one of '.', '..', '@', '*', <QName>
I want the code to check each child name and run another XPath query with that name; if the count is above 1, it should give an error.
Any ideas?
Just get the list of child nodes with an appropriate path expression and check for duplicates in that list:
XPathExpression xPathExpression = xPath.compile("//parentNode/*");
NodeList children = (NodeList) xPathExpression.evaluate(config, XPathConstants.NODESET);
for (int i = 0; i < children.getLength(); i++) {
// maintain a HashSet of element names here and check whether the current name is already in it
}
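Filled in, the loop above might look like the following sketch. The class DuplicateChildren and the method duplicateNames are hypothetical names, and the sample document is a shortened copy of the one in the question; only element names are compared, so attributes and nested children are ignored as requested.

```java
import java.io.StringReader;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DuplicateChildren {
    // Shortened copy of the question's document: childA appears twice.
    static final String XML =
        "<parentNode version=\"1\">"
        + "<childA version=\"1\"><ignoredChild/></childA>"
        + "<childB version=\"1\"><ignoredChild/></childB>"
        + "<childC version=\"4\"><ignoredChild/></childC>"
        + "<childA version=\"2\"><ignoredChild/></childA>"
        + "<childD version=\"6\"><ignoredChild/></childD>"
        + "</parentNode>";

    // Returns the names of child elements of parentNode that occur more than once.
    static Set<String> duplicateNames(String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList children = (NodeList) xpath.evaluate(
                "//parentNode/*", new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            Set<String> seen = new HashSet<>();
            Set<String> duplicates = new TreeSet<>();
            for (int i = 0; i < children.getLength(); i++) {
                String name = children.item(i).getNodeName();
                // add() returns false when the name was already recorded.
                if (!seen.add(name)) {
                    duplicates.add(name);
                }
            }
            return duplicates;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Duplicated element names: " + duplicateNames(XML));
    }
}
```

This runs in plain Java rather than inside the CLiXML rules file, so it only helps if the check can live in the surrounding application code.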
This cannot be done with a single XPath 1.0 expression (see this similar question I answered today).
Here is a single XPath 2.0 expression (in case you can use XPath 2.0):
/*/*[(for $n in name()
return count(/*/*[name()=$n])
)
>1
]
This selects all elements that are children of the top element of the XML document and that occur more than once.

Iterate and concat using XPath Expression

I have the following xml file:
<author>
<firstname>Akhilesh</firstname>
<lastname>Singh</lastname>
</author>
<author>
<firstname>Prassana</firstname>
<lastname>Nagaraj</lastname>
</author>
And I am using the following JXPath expression,
concat(author/firstName," ",author/lastName)
to get the value Akhilesh Singh ,Prassana Nagaraj, but I am getting only Akhilesh Singh.
My requirement is that I get the value of both authors by executing only one JXPath expression.
XPath 2.0 solution:
/*/author/concat(firstname, ' ', lastname, following-sibling::author/string(', '))
With XPath 1.0, when an argument of a type other than node set is expected, the first node in the node set is selected and the type conversion is then applied (boolean conversion works somewhat differently).
So your expression (note: no capitals):
concat(author/firstname," ",author/lastname)
It's the same as:
concat( string( (author/firstname)[1] ), " ", string( (author/lastname)[1] ) )
Depending on the host language you could use:
author/firstname|author/lastname
This evaluates to a node set with the firstname and lastname elements in document order, so you could then iterate over the node set and extract each string value.
In XPath 2.0 you could use:
string-join(author/concat(firstname,' ', lastname),' ,')
Output:
Akhilesh Singh ,Prassana Nagaraj
Note: Now, with the sequence data type and function calls as steps, XPath resembles the functional language it claims to be. Higher-order functions and partial application must wait for XPath 2.1 ...
Edit: Thanks to Dimitre's comments, I've corrected the string separator.
concat() returns a single string. If you want both results, you need to iterate over the author elements and evaluate concat(firstname, " ", lastname) for each one.
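Here is one way that iteration could look. It uses the JDK's javax.xml.xpath API rather than JXPath, so treat it as an illustration of the idea, not as JXPath code; the class AuthorNames, the method joinedNames, and the <authors> wrapper element (the fragment in the question has no single root) are assumptions.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class AuthorNames {
    // The question's fragment wrapped in a hypothetical <authors> root element.
    static final String XML =
        "<authors>"
        + "<author><firstname>Akhilesh</firstname><lastname>Singh</lastname></author>"
        + "<author><firstname>Prassana</firstname><lastname>Nagaraj</lastname></author>"
        + "</authors>";

    // Builds "first last" for every author and joins the results with " ,".
    static String joinedNames(String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList authors = (NodeList) xpath.evaluate(
                "//author", new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            // Relative expression, evaluated once per author node.
            XPathExpression fullName = xpath.compile("concat(firstname, ' ', lastname)");
            List<String> names = new ArrayList<>();
            for (int i = 0; i < authors.getLength(); i++) {
                names.add(fullName.evaluate(authors.item(i)));
            }
            return String.join(" ,", names);
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(joinedNames(XML));
    }
}
```

Because concat() is evaluated with each author as the context node, the first-node rule of XPath 1.0 no longer drops the second author.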
