Find duplicated XML Element Names (xPath with variable) - java

I'm using XPath 1.0 parsers alongside CLiXML in my Java project, and I'm trying to set up a CLiXML constraint rules file.
I would like to show an error if there are duplicate element names under a specific child.
For example
<parentNode version="1">
<childA version="1">
<ignoredChild/>
</childA>
<childB version="1">
<ignoredChild/>
</childB>
<childC version="4">
<ignoredChild/>
</childC>
<childA version="2">
<ignoredChild/>
</childA>
<childD version="6">
<ignoredChild/>
</childD>
</parentNode>
childA appears more than once, so I would show an error about this.
NOTE: I only want to 'check/count' the Element name, not the attributes inside or the children of the element.
The code inside my .clx rules file that I've tried is:
<forall var="elem1" in=".//parentNode/*">
<equal op1="count(.//parentNode/$elem1)" op2="1"/>
</forall>
But that doesn't work, I get the error:
Caused by: class org.jaxen.saxpath.XPathSyntaxException: count(.//PLC-Mapping/*/$classCount: 23: Expected one of '.', '..', '@', '*', <QName>
I want the rule to take each child's element name and run another XPath query with that name; if the count is above 1, it should give an error.
Any ideas?

Get the list of child nodes with an appropriate path expression and check for duplicates in that list:
XPathExpression xPathExpression = xPath.compile("//parentNode/*");
NodeList children = (NodeList) xPathExpression.evaluate(config, XPathConstants.NODESET);
for (int i = 0; i < children.getLength(); i++) {
// maintain a HashSet of element names here and check whether the current element's name is already in it
}
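The loop body above can be filled in along these lines. This is a minimal sketch, assuming the standard JAXP classes (javax.xml.xpath and javax.xml.parsers); the class name DuplicateChildNames and the method findDuplicates are hypothetical names used only for illustration, and the sample XML is a shortened version of the one in the question.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of CLiXML or any library.
class DuplicateChildNames {

    // Returns each element name that occurs more than once among the
    // children of parentNode, listed once, in order of first repetition.
    static List<String> findDuplicates(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xPath = XPathFactory.newInstance().newXPath();
            NodeList children = (NodeList) xPath.evaluate(
                    "//parentNode/*", doc, XPathConstants.NODESET);

            Set<String> seen = new HashSet<>();      // names seen so far
            Set<String> reported = new HashSet<>();  // names already reported
            List<String> duplicates = new ArrayList<>();
            for (int i = 0; i < children.getLength(); i++) {
                // Only the element name is compared; attributes and children are ignored.
                String name = children.item(i).getNodeName();
                if (!seen.add(name) && reported.add(name)) {
                    duplicates.add(name);
                }
            }
            return duplicates;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse or query the XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<parentNode><childA version=\"1\"/><childB version=\"1\"/>"
                + "<childC version=\"4\"/><childA version=\"2\"/><childD version=\"6\"/></parentNode>";
        System.out.println(findDuplicates(xml)); // prints [childA]
    }
}
```

A non-empty result is the condition under which the error would be shown. This runs in Java code rather than inside the .clx rules file.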

This cannot be done with a single XPath 1.0 expression (see this similar question I answered today).
Here is a single XPath 2.0 expression (in case you can use XPath 2.0):
/*/*[(for $n in name()
return count(/*/*[name()=$n])
)
>1
]
This selects all elements that are children of the top element of the XML document and that occur more than once.

Related

XPath search by "id" attribute , giving NPE - Java

All,
I have multiple XML templates that I need to fill with data. To allow my document builder class to use multiple templates and insert data correctly,
I designate the node that I want my class to insert data into by adding the attribute:
id="root"
One example of an XML template:
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<SiebelMessage MessageId="07f33fa0-2045-46fd-b88b-5634a3de9a0b" MessageType="Integration Object" IntObjectName="" IntObjectFormat="Siebel Hierarchical" ReturnCode="0" ErrorMessage="">
<listOfReadAudit >
<readAudit id="root">
<recordId mapping="Record ID"></recordId>
<userId mapping="User ID"></userId>
<customerId mapping="Customer ID"></customerId>
<lastUpd mapping="Last Updated"></lastUpd>
<lastUpdBy mapping="Last Updated By"></lastUpdBy>
<busComp mapping="Entity Name"></busComp>
</readAudit>
</listOfReadAudit>
</SiebelMessage>
Code
expr = xpath.compile("//SiebelMessage[@id='root']");
root = (Element) expr.evaluate(xmlDoc, XPathConstants.NODE);
Element temp = (Element) root.cloneNode(true);
Using this example:
XPath to select Element by attribute value
The expression is not working:
//SiebelMessage[@id='root']
Any ideas what I am doing wrong?
Try this:
//readAudit[@id='root']
This selects all readAudit elements whose id attribute is set to root (it should be just one element in your case).
To make sure it returns at most one element, wrap the path in parentheses before applying the position predicate:
(//readAudit[@id='root'])[1]
What you are doing is selecting SiebelMessage nodes with the attribute id='root'.
But the SiebelMessage doesn't have an id, it's the readAudit you are after. So either do
//readAudit[@id='root']
or
//SiebelMessage//readAudit[@id='root']
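The NullPointerException in the question comes from calling cloneNode on the null that evaluate returns when nothing matches. A minimal sketch of the corrected lookup with a null check, assuming standard JAXP; the class name FindByIdAttribute and the method findRoot are hypothetical, and the template is shortened from the one in the question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper class for illustration only.
class FindByIdAttribute {

    // Returns the readAudit element marked with id="root", or null if there is none.
    static Element findRoot(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // '@id' addresses the attribute; a bare 'id' would look for a child element.
            return (Element) xpath.evaluate(
                    "//readAudit[@id='root']", doc, XPathConstants.NODE);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse or query the XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<SiebelMessage MessageType=\"Integration Object\">"
                + "<listOfReadAudit><readAudit id=\"root\">"
                + "<recordId mapping=\"Record ID\"></recordId>"
                + "</readAudit></listOfReadAudit></SiebelMessage>";
        Element root = findRoot(xml);
        if (root == null) {
            // evaluate(..., XPathConstants.NODE) yields null when nothing matches.
            System.out.println("No element with id='root' found");
        } else {
            Element temp = (Element) root.cloneNode(true);
            System.out.println(temp.getTagName()); // prints readAudit
        }
    }
}
```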

xpath expression to match text ending with a variable

I have an XML with entities like this :
<Entity>
<Name>Lwresd_Dns_Server|LwresdDnsServer</Name>
</Entity>
<Entity>
<Name>Lwresd_Dns_Server_Data|LwresdDnsServerData</Name>
</Entity>
My xpath expression is
XPathExpression expr = xpath1.compile("//Entity[matches(Name,'" +line+ "')]");
where line is a variable with value LwresdDnsServer.
The above xpath expression matches both entities , where I need it to match only the first one, i.e
Lwresd_Dns_Server|LwresdDnsServer
How should I frame the expression to do that?
I believe this should do the trick:
XPathExpression expr =
xpath1.compile("//Entity[contains(concat('|', Name, '|'),'|" +line+ "|')]");
This compares the entity Name enclosed in |s with the variable name enclosed in |s, so you get something like:
contains('|Lwresd_Dns_Server|LwresdDnsServer|', '|LwresdDnsServer|') => Yes
contains('|Lwresd_Dns_Server_Data|LwresdDnsServerData|', '|LwresdDnsServer|') => No
As a result, only the first of the two Entities is selected.
If you only want to find entities that end with line (and not just those that contain an exact match for it), then you can do this (assuming the values are guaranteed to not contain the character $ - if there's the possibility it would contain a $, you should choose a different delimiter that it definitely won't contain, or use Dimitre Novatchev's answer to this question):
XPathExpression expr =
xpath1.compile("//Entity[contains(concat(Name, '$'),'" +line+ "$')]");
I haven't used the matches() function in XPath (it's not supported in XPath 1.0), but I suspect the following would also work for finding a value at the end of an Entity name, if your XPath evaluator supports matches():
XPathExpression expr =
xpath1.compile("//Entity[matches(Name,'" +line+ "$')]");
Here, $ is the RegEx symbol for the end of a string.
Here is an XPath 1.0 expression that implements what the XPath 2.0 function ends-with($s, $t) does:
substring($s, string-length($s) - string-length($t) +1) = $t
You can substitute $s and $t above with specific strings.
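Instead of pasting strings into the expression, $t can also be bound from Java. A minimal sketch, assuming standard JAXP and its XPathVariableResolver; the class name EntityEndsWith, the method namesEndingWith, and the Entities wrapper element (added so the sample is well-formed) are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class for illustration only.
class EntityEndsWith {

    // Returns the Name text of every Entity whose Name ends with the given suffix.
    static List<String> namesEndingWith(String xml, String suffix) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // Bind the XPath variable $t to the Java value; other variables are unknown.
            xpath.setXPathVariableResolver(
                    qname -> "t".equals(qname.getLocalPart()) ? suffix : null);
            // XPath 1.0 emulation of ends-with(Name, $t).
            NodeList hits = (NodeList) xpath.evaluate(
                    "//Entity[substring(Name, string-length(Name) - string-length($t) + 1) = $t]",
                    doc, XPathConstants.NODESET);

            List<String> names = new ArrayList<>();
            for (int i = 0; i < hits.getLength(); i++) {
                names.add(xpath.evaluate("Name", hits.item(i)));
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse or query the XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Entities>"
                + "<Entity><Name>Lwresd_Dns_Server|LwresdDnsServer</Name></Entity>"
                + "<Entity><Name>Lwresd_Dns_Server_Data|LwresdDnsServerData</Name></Entity>"
                + "</Entities>";
        System.out.println(namesEndingWith(xml, "LwresdDnsServer"));
        // prints [Lwresd_Dns_Server|LwresdDnsServer]
    }
}
```

Binding the value as a variable also avoids quoting problems when the searched text itself contains an apostrophe.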

XPath normalize-space() to return a sequence of normalized strings

I need to use the XPath function normalize-space() to normalize the text I want to extract from an XHTML document: http://test.anahnarciso.com/clean_bigbook_0.html
I'm using the following expression:
//*[@slot="address"]/normalize-space(.)
Which works perfectly in Qizx Studio, the tool I use to test XPath expressions.
let $doc := doc('http://test.anahnarciso.com/clean_bigbook_0.html')
return $doc//*[@slot="address"]/normalize-space(.)
This simple query returns a sequence of xs:string.
144 Hempstead Tpke
403 West St
880 Old Country Rd
8412 164th St
8412 164th St
1 Irving Pl
1622 McDonald Ave
255 Conklin Ave
22011 Hempstead Ave
7909 Queens Blvd
11820 Queens Blvd
1027 Atlantic Ave
1068 Utica Ave
1002 Clintonville St
1002 Clintonville St
1156 Hempstead Tpke
Route 49
10007 Rockaway Blvd
12694 Willets Point Blvd
343 James St
Now, I want to use the previous expression in my Java code.
String exp = "//*[@slot=\"address\"]/normalize-space(.)";
XPath xpath = XPathFactory.newInstance().newXPath();
XPathExpression expr = xpath.compile(exp);
Object result = expr.evaluate(doc, XPathConstants.NODESET);
But the last line throws an Exception:
Cannot convert XPath value to Java object: required class is org.w3c.dom.NodeList; supplied value has type xs:string
Obviously, I should change XPathConstants.NODESET to something else; I tried XPathConstants.STRING, but it only returns the first element of the sequence.
How can I obtain something like an array of Strings?
Thanks in advance.
Your expression works in XPath 2.0, but is illegal in XPath 1.0 (which is used in Java) - it should be normalize-space(//*[@slot='address']).
Anyway, in XPath 1.0, when normalize-space() is called on a node-set, only the first node (in document order) is taken.
In order to do what you want to do, you'll need to use a XPath 2.0 compatible parser, or traverse the resulting node-set and call normalize-space() on every node:
XPath xpath = XPathFactory.newInstance().newXPath();
XPathExpression expr;
String select = "//*[@slot='address']";
expr = xpath.compile(select);
NodeList result = (NodeList)expr.evaluate(input, XPathConstants.NODESET);
String normalize = "normalize-space(.)";
expr = xpath.compile(normalize);
int length = result.getLength();
for (int i = 0; i < length; i++) {
System.out.println(expr.evaluate(result.item(i), XPathConstants.STRING));
}
...outputs exactly your given output.
It depends on what version of XPath you're using. Check out this post, hopefully it'll answer your question: Is it possible to apply normalize-space to all nodes XPath expression finds? Good luck.
The expression:
//*[@slot="address"]/normalize-space(.)
is a syntactically legal (and practically useful) XPath 2.0 expression.
The same expression is not syntactically legal in XPath 1.0 -- it isn't allowed for a location step to be a function call.
In fact, it isn't possible to write a single XPath 1.0 expression the result of whose evaluation is the wanted set of strings.
You need to use in your program a product that implements XPath 2.0 -- such as Saxon 9.x.
As you noted, the XPath 2.0 expression //*[@slot="address"]/normalize-space(.) returns a sequence of strings. This return type is not supported by the JAXP XPathConstants class, because the JAXP interfaces were not designed to support XPath 2.0.
This leaves you with two choices:
Use an XPath 2.0 processor that has native interfaces for XPath 2.0 or that can convert sequences to a return type supported by JAXP
Use only XPath 1.0 expressions. For example, in your case you could simply select the target nodes:
//*[@slot="address"]
And then iterate the resulting nodeset, collecting the results into an array or List.
Note that it's important to distinguish between the processor you're using to evaluate the expression and the interface you're using to initiate the evaluation.
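The second choice (select the nodes with XPath 1.0, then iterate) can be sketched as follows to get a List of strings. This assumes standard JAXP; the class name NormalizedAddresses, the method normalizedText, and the small sample document are hypothetical stand-ins for the real XHTML page.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class for illustration only.
class NormalizedAddresses {

    // Selects nodes with an XPath 1.0 expression, then normalizes each node's text.
    static List<String> normalizedText(String xml, String selectExpr) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                    selectExpr, doc, XPathConstants.NODESET);
            // Evaluated once per node, with that node as the context item.
            XPathExpression normalize = xpath.compile("normalize-space(.)");

            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(normalize.evaluate(nodes.item(i)));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse or query the XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<html><body>"
                + "<span slot=\"address\">  144   Hempstead\n Tpke </span>"
                + "<span slot=\"address\">403 West St</span>"
                + "<span slot=\"phone\">555</span>"
                + "</body></html>";
        System.out.println(normalizedText(xml, "//*[@slot='address']"));
        // prints [144 Hempstead Tpke, 403 West St]
    }
}
```

Calling result.toArray(new String[0]) on the returned list gives the array of Strings asked for in the question.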

dom4J: How to get the value of Elements of a Node?

I am reading an XML using dom4j by using XPath techniques for selecting desired nodes. Consider that my XML looks like this:
<Employees>
<Emp id="1">
<name>jame</name>
<age>12</age>
</Emp>
.
.
.
</Employees>
Now I need to store the information of all employees in a list of my Employee class. So far I have written the following:
List<? extends Node> lstprmntEmps = document.selectNodes("//Employees/Emp");
ArrayList<Employee> Employees = new ArrayList<Employee>();//Employee is my custom class
for (Node node : lstprmntEmps)
{
Employees.add(ParseEmployee(node));//ParseEmployee(. . .) is my custom function that parses the Emp XML and returns an Employee object
}
Now how do I get the name and age of the currently selected node?
Is there a method such as node.getElementValue("name");?
Cast each node to Element, then ask the element for its first "name" sub-element and its first "age" sub-element and get their text.
See http://dom4j.sourceforge.net/apidocs/org/dom4j/Element.html.
The elementText(String) method of Element appears to get a sub-element by name and retrieve its text in one operation, but its Javadoc entry has no description, so it's hard to say for certain.
Note that variables and methods should always start with a lowercase letter in Java.

Iterate and concat using XPath Expression

I have the following xml file:
<author>
<firstname>Akhilesh</firstname>
<lastname>Singh</lastname>
</author>
<author>
<firstname>Prassana</firstname>
<lastname>Nagaraj</lastname>
</author>
And I am using the following JXPath expression,
concat(author/firstName," ",author/lastName)
I expect to get the value Akhilesh Singh ,Prassana Nagaraj, but
I am getting only Akhilesh Singh.
My requirement is that I should get the value of both author by executing only one JXPath expression.
XPath 2.0 solution:
/*/author/concat(firstname, ' ', lastname, following-sibling::author/string(', '))
In XPath 1.0, when an argument of a type other than node-set is expected, the first node of the node-set (in document order) is selected and the type conversion is then applied to it (conversion to boolean works somewhat differently).
So, your expression (note: no capitals):
concat(author/firstname," ",author/lastname)
It's the same as:
concat( string( (author/firstname)[1] ), " ", string( (author/lastname)[1] ) )
Depending on the host language you could use:
author/firstname|author/lastname
This evaluates to a node-set containing the firstname and lastname elements in document order, so you could then iterate over this node-set, extracting the string values.
In XPath 2.0 you could use:
string-join(author/concat(firstname,' ', lastname),' ,')
Output:
Akhilesh Singh ,Prassana Nagaraj
Note: Now, with the sequence data type and function calls as steps, XPath resembles the functional language it claims to be. Higher-order functions and partial application must wait for XPath 2.1 ...
Edit: Thanks to Dimitre's comments, I've corrected the string separator.
concat() returns a single string. If you want both results, you need to iterate over the "author" elements and evaluate concat(firstname," ",lastname) on each one.
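The iteration described above might look like this. It is a minimal sketch that swaps in standard JAXP (javax.xml.xpath) for JXPath, so it shows the technique rather than the JXPath API; the class name AuthorNames, the method joinAuthors, and the authors wrapper element (added so the sample has a single root) are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class for illustration only.
class AuthorNames {

    // Evaluates concat(firstname, ' ', lastname) once per author and joins the results.
    static String joinAuthors(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList authors = (NodeList) xpath.evaluate(
                    "//author", doc, XPathConstants.NODESET);
            // Relative paths are resolved against each author element in turn.
            XPathExpression fullName = xpath.compile("concat(firstname, ' ', lastname)");

            List<String> names = new ArrayList<>();
            for (int i = 0; i < authors.getLength(); i++) {
                names.add(fullName.evaluate(authors.item(i)));
            }
            // Same separator as in the expected output of the question.
            return String.join(" ,", names);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse or query the XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<authors>"
                + "<author><firstname>Akhilesh</firstname><lastname>Singh</lastname></author>"
                + "<author><firstname>Prassana</firstname><lastname>Nagaraj</lastname></author>"
                + "</authors>";
        System.out.println(joinAuthors(xml)); // prints Akhilesh Singh ,Prassana Nagaraj
    }
}
```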
