Iterate and concat using XPath Expression - java

I have the following xml file:
<author>
<firstname>Akhilesh</firstname>
<lastname>Singh</lastname>
</author>
<author>
<firstname>Prassana</firstname>
<lastname>Nagaraj</lastname>
</author>
And I am using the following JXPath expression,
concat(author/firstName," ",author/lastName)
To get the value Akhilesh Singh ,Prassana Nagaraj but
I am getting only Akhilesh Singh.
My requirement is that I should get the value of both author by executing only one JXPath expression.

XPath 2.0 solution:
/*/author/concat(firstname, ' ', lastname, following-sibling::author[1]/string(', '))

In XPath 1.0, when a function expects an argument of a type other than node set and is given a node set, only the first node in document order is used, and the type conversion is then applied to that node (conversion to boolean is different: it tests whether the node set is non-empty).
So, your expression (note: no capitals in the element names):
concat(author/firstname," ",author/lastname)
It's the same as:
concat( string( (author/firstname)[1] ), " ", string( (author/lastname)[1] ) )
Depending on the host language you could use:
author/firstname|author/lastname
This evaluates to a node set containing the firstname and lastname elements in document order, so you could then iterate over this node set, extracting the string value of each node.
In XPath 2.0 you could use:
string-join(author/concat(firstname,' ', lastname),' ,')
Output:
Akhilesh Singh ,Prassana Nagaraj
Note: Now, with the sequence data type and function calls as steps, XPath resembles the functional language it claims to be. Higher-order functions and partial application must wait for XPath 2.1 ...
Edit: Thanks to Dimitre's comments, I've corrected the string separator.

concat() returns a single string. If you want both results, you need to iterate over the author elements and evaluate concat(firstname," ",lastname) for each one.
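A minimal sketch of that per-author iteration, using the JDK's built-in JAXP XPath 1.0 API rather than JXPath. The `<authors>` wrapper element, the class name `AuthorNames`, and the method `fullNames` are assumptions made for illustration; the question's XML fragment has no root element, so one is added here.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class AuthorNames {

    // Returns "firstname lastname" for every <author> under the assumed <authors> root.
    static List<String> fullNames(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();

            // Select all author elements first.
            NodeList authors = (NodeList) xpath.evaluate(
                    "/authors/author", doc, XPathConstants.NODESET);

            // Relative expression, evaluated once with each author as the context node.
            XPathExpression fullName = xpath.compile("concat(firstname, ' ', lastname)");

            List<String> names = new ArrayList<>();
            for (int i = 0; i < authors.getLength(); i++) {
                names.add(fullName.evaluate(authors.item(i)));
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<authors>"
                + "<author><firstname>Akhilesh</firstname><lastname>Singh</lastname></author>"
                + "<author><firstname>Prassana</firstname><lastname>Nagaraj</lastname></author>"
                + "</authors>";
        // Prints: Akhilesh Singh ,Prassana Nagaraj
        System.out.println(String.join(" ,", fullNames(xml)));
    }
}
```

The join happens in Java because a single XPath 1.0 expression cannot return one string per node.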

Related

How to use apostrophe in formatted xpath?

I have placed the locator in a properties file like this:
header.navigation.product.link = //div[contains(@class,'grid-')]//li/a[contains(.,'%s')]
and I use this locator in my code:
String headerproductlink = String.format(ConfigurationManager.getBundle()
.getString("header.navigation.category.link"), category);
where category = Women's Gym Clothing
When I try to locate the element, it cannot be found.
I have even tried Women\'s Gym Clothing, but with no success.
Can someone please suggest a way?
In XPath 1.0 you can use either single quotes or double quotes to delimit a string literal, and you can use the other kind of quote inside the string to represent itself. A string literal cannot contain both single and double quotes, but you can use concat() to get around this limitation:
concat('He said: "', "I won't", '"')
The situation is more complicated if the XPath expression appears within a host language that imposes its own constraints; in that case any quotes within the XPath expression must be escaped using host language conventions, for example \" in Java or &quot; in XML.
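The quoting rules above can be wrapped in a small helper that turns any Java string into a valid XPath 1.0 string literal. This is only a sketch: the class `XPathLiterals` and the method `toLiteral` are hypothetical names, and it assumes the locator template contains a bare %s (no quotes around it), so that the helper supplies the delimiters.

```java
class XPathLiterals {

    // Turns an arbitrary Java string into an XPath 1.0 string literal.
    static String toLiteral(String value) {
        if (!value.contains("'")) {
            return "'" + value + "'";          // no apostrophe: single quotes are safe
        }
        if (!value.contains("\"")) {
            return "\"" + value + "\"";        // apostrophe only: delimit with double quotes
        }
        // Both quote kinds present: split on apostrophes and rejoin with concat().
        StringBuilder sb = new StringBuilder("concat(");
        String[] parts = value.split("'", -1);
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                sb.append(", \"'\", ");        // the apostrophe itself, as a double-quoted literal
            }
            sb.append('\'').append(parts[i]).append('\'');
        }
        return sb.append(')').toString();
    }

    public static void main(String[] args) {
        String template = "//li/a[contains(., %s)]"; // assumed template with an unquoted %s
        String category = "Women's Gym Clothing";
        // Prints: //li/a[contains(., "Women's Gym Clothing")]
        System.out.println(String.format(template, toLiteral(category)));
    }
}
```

No backslash escaping of the apostrophe is needed, because XPath 1.0 string literals have no escape character.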
The following approaches worked for me:
Locator in Property file:
another.header=xpath=//h1[contains(.,"%s")]
Java code:
String t = "st. john\'s bay - women";
String header = String.format(getBundle().getString("another.header"), t);
CommonStep.get("https://www.jcpenney.com/g/st-johns-bay-women/N-bwo3xZ1z0nvauZ1z0nh7w");
String headerText=ElementFactory.$(header).getText();
The following also worked fine:
Locator in Property file:
another.header={'locator':'xpath=//h1[contains(.,"%s")]'}
Java code:
String t = "st. john\\'s bay - women";
...
Or
Locator in Property file:
another.header={"locator":"xpath=//h1[contains(.,\\"%s\\")]"}
Java code:
String t = "st. john's bay - women";
...

Fetching data from xml Using Xquery with starts with function

I want to fetch data from XML using XQuery with the starts-with function.
data.xml
<data><employee id="1"><name value="vA-12">A</name> <title id="2">Manager</title></employee>
<employee id="2"><name value="vC-12">C</name><title id="2">Manager</title></employee>
<employee id="2"><name value="vB-12">B</name><title id="2">Manager</title></employee>
</data>
Now I want to fetch the name for which employee/@id equals title/@id and name/@value starts with 'vC'.
I have written the XQuery below for this, but I am getting an error:
for $x in /data/employee where $x/@id=$x/title/@id and [fn:starts-with($x/name/@value,vC)] return data($x/name)
This is the error:
Error on line 1 column 55
XPST0003 XQuery syntax error near #.../title/@id and [fn:starts-with#:
Unexpected token "[" in path expression
net.sf.saxon.s9api.SaxonApiException: Unexpected token "[" in path expression
at net.sf.saxon.s9api.XQueryCompiler.compile(XQueryCompiler.java:544)
at Xml.process(Xml.java:46)
at Xml.main(Xml.java:30)
Caused by: net.sf.saxon.trans.XPathException: Unexpected token "[" in path expression
at net.sf.saxon.query.XQueryParser.grumble(XQueryParser.java:479)
at net.sf.saxon.expr.parser.XPathParser.grumble(XPathParser.java:221)
The starts-with() function expects two parameters. I'm not familiar with Saxon specifically, but in XQuery generally you can do it this way:
for $x in /data/employee[@id=title/@id and name[starts-with(@value,'vC')]]
return data($x/name)
or, using a where clause instead of a predicate in the for clause, as in your attempted query:
for $x in /data/employee
where $x/@id=$x/title/@id and $x/name/starts-with(@value,'vC')
return data($x/name)
Just use this shorter and simpler XPath 2.0 one-liner:
/*/employee[@id eq title/@id and starts-with(name/@value, 'vC')]/data(name)
Depending on your exact requirements, you might need this instead. I'll let you work through the differences, but basically it selects all names whose @value starts with vC, whereas the other solutions here select all names of all employees that have at least one name starting with vC:
/data/employee[@id eq title/@id]/name[starts-with(@value, 'vC')]/data(.)
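If only an XPath 1.0 engine is available (for example the JDK's default JAXP implementation instead of Saxon), the same filter can be written with = in place of eq and with the string values read on the Java side instead of via data(). A rough sketch follows; the class `EmployeeFilter` and the method `matchingNames` are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class EmployeeFilter {

    // Names of employees whose @id equals title/@id and whose name/@value starts with "vC".
    static List<String> matchingNames(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList names = (NodeList) xpath.evaluate(
                    "/data/employee[@id = title/@id and starts-with(name/@value, 'vC')]/name",
                    doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < names.getLength(); i++) {
                result.add(names.item(i).getTextContent()); // plays the role of data(name)
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<data>"
                + "<employee id='1'><name value='vA-12'>A</name><title id='2'>Manager</title></employee>"
                + "<employee id='2'><name value='vC-12'>C</name><title id='2'>Manager</title></employee>"
                + "<employee id='2'><name value='vB-12'>B</name><title id='2'>Manager</title></employee>"
                + "</data>";
        // Prints: [C]
        System.out.println(matchingNames(xml));
    }
}
```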

xpath expression to match text ending with a variable

I have an XML with entities like this :
<Entity>
<Name>Lwresd_Dns_Server|LwresdDnsServer</Name>
</Entity>
<Entity>
<Name>Lwresd_Dns_Server_Data|LwresdDnsServerData</Name>
</Entity>
My xpath expression is
XPathExpression expr = xpath1.compile("//Entity[matches(Name,'" +line+ "')]");
where line is a variable with value LwresdDnsServer.
The above XPath expression matches both entities, whereas I need it to match only the first one, i.e.
Lwresd_Dns_Server|LwresdDnsServer
How should I frame the expression to do that?
I believe this should do the trick:
XPathExpression expr =
xpath1.compile("//Entity[contains(concat('|', Name, '|'),'|" +line+ "|')]");
This compares the entity Name enclosed in |s with the variable name enclosed in |s, so you get something like:
contains('|Lwresd_Dns_Server|LwresdDnsServer|', '|LwresdDnsServer|') => Yes
contains('|Lwresd_Dns_Server_Data|LwresdDnsServerData|', '|LwresdDnsServer|') => No
As a result, only the first of the two entities is selected.
If you only want to find entities that end with line (and not just those that contain an exact match for it), then you can do this (assuming the values are guaranteed to not contain the character $ - if there's the possibility it would contain a $, you should choose a different delimiter that it definitely won't contain, or use Dimitre Novatchev's answer to this question):
XPathExpression expr =
xpath1.compile("//Entity[contains(concat(Name, '$'),'" +line+ "$')]");
I haven't used the matches() function in XPath (it's not supported in XPath 1.0), but I suspect the following would also work for finding a value at the end of an Entity name, if your XPath evaluator supports matches():
XPathExpression expr =
xpath1.compile("//Entity[matches(Name,'" +line+ "$')]");
Here, $ is the RegEx symbol for the end of a string.
Here is an XPath 1.0 expression that implements what the XPath 2.0 function ends-with($s, $t) does:
substring($s, string-length($s) - string-length($t) +1) = $t
You can substitute $s and $t above with specific strings.
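As an illustration, the XPath 1.0 ends-with idiom above can be applied to the Entity data through JAXP, passing the suffix as an XPath variable instead of concatenating it into the expression string. This is a sketch under assumptions: the `<Entities>` root element, the class `EntityEndsWith`, the method `namesEndingWith`, and the variable name `$line` are chosen for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class EntityEndsWith {

    // Returns the Name values of all Entity elements whose Name ends with the given suffix.
    static List<String> namesEndingWith(String xml, String suffix) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // Resolve $line to the suffix; any other variable name is left unresolved.
            xpath.setXPathVariableResolver(
                    name -> "line".equals(name.getLocalPart()) ? suffix : null);
            NodeList hits = (NodeList) xpath.evaluate(
                    "//Entity[substring(Name, string-length(Name) - string-length($line) + 1) = $line]/Name",
                    doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < hits.getLength(); i++) {
                result.add(hits.item(i).getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Entities>"
                + "<Entity><Name>Lwresd_Dns_Server|LwresdDnsServer</Name></Entity>"
                + "<Entity><Name>Lwresd_Dns_Server_Data|LwresdDnsServerData</Name></Entity>"
                + "</Entities>";
        // Prints: [Lwresd_Dns_Server|LwresdDnsServer]
        System.out.println(namesEndingWith(xml, "LwresdDnsServer"));
    }
}
```

Using a variable also sidesteps quoting problems if the suffix ever contains an apostrophe.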

XPath normalize-space() to return a sequence of normalized strings

I need to use the XPath function normalize-space() to normalize the text I want to extract from an XHTML document: http://test.anahnarciso.com/clean_bigbook_0.html
I'm using the following expression:
//*[@slot="address"]/normalize-space(.)
Which works perfectly in Qizx Studio, the tool I use to test XPath expressions.
let $doc := doc('http://test.anahnarciso.com/clean_bigbook_0.html')
return $doc//*[@slot="address"]/normalize-space(.)
This simple query returns a sequence of xs:string.
144 Hempstead Tpke
403 West St
880 Old Country Rd
8412 164th St
8412 164th St
1 Irving Pl
1622 McDonald Ave
255 Conklin Ave
22011 Hempstead Ave
7909 Queens Blvd
11820 Queens Blvd
1027 Atlantic Ave
1068 Utica Ave
1002 Clintonville St
1002 Clintonville St
1156 Hempstead Tpke
Route 49
10007 Rockaway Blvd
12694 Willets Point Blvd
343 James St
Now, I want to use the previous expression in my Java code.
String exp = "//*[@slot=\"address\"]/normalize-space(.)";
XPath xpath = XPathFactory.newInstance().newXPath();
XPathExpression expr = xpath.compile(exp);
Object result = expr.evaluate(doc, XPathConstants.NODESET);
But the last line throws an Exception:
Cannot convert XPath value to Java object: required class is org.w3c.dom.NodeList; supplied value has type xs:string
Obviously, I should change XPathConstants.NODESET to something else; I tried XPathConstants.STRING, but it only returns the first element of the sequence.
How can I obtain something like an array of Strings?
Thanks in advance.
Your expression works in XPath 2.0, but is illegal in XPath 1.0 (which is used in Java) - it should be normalize-space(//*[@slot='address']).
Anyway, in XPath 1.0, when normalize-space() is called on a node-set, only the first node (in document order) is taken.
In order to do what you want to do, you'll need to use an XPath 2.0 compatible parser, or traverse the resulting node-set and call normalize-space() on every node:
XPath xpath = XPathFactory.newInstance().newXPath();
XPathExpression expr;
String select = "//*[@slot='address']";
expr = xpath.compile(select);
NodeList result = (NodeList)expr.evaluate(input, XPathConstants.NODESET);
String normalize = "normalize-space(.)";
expr = xpath.compile(normalize);
int length = result.getLength();
for (int i = 0; i < length; i++) {
System.out.println(expr.evaluate(result.item(i), XPathConstants.STRING));
}
...outputs exactly your given output.
It depends on what version of XPath you're using. Check out this post, hopefully it'll answer your question: Is it possible to apply normalize-space to all nodes XPath expression finds? Good luck.
The expression:
//*[@slot="address"]/normalize-space(.)
is a syntactically legal (and practically useful) XPath 2.0 expression.
The same expression is not syntactically legal in XPath 1.0 -- it isn't allowed for a location step to be a function call.
In fact, it isn't possible to write a single XPath 1.0 expression the result of whose evaluation is the wanted set of strings.
You need to use in your program a product that implements XPath 2.0 -- such as Saxon 9.x.
As you noted, the XPath 2.0 expression //*[@slot="address"]/normalize-space(.) returns a sequence of strings. This return type is not supported by the JAXP XPathConstants class, because the JAXP interfaces were not designed to support XPath 2.0.
This leaves you with two choices:
Use an XPath 2.0 processor that has native interfaces for XPath 2.0 or that can convert sequences to a return type supported by JAXP
Use only XPath 1.0 expressions. For example, in your case you could simply select the target nodes:
//*[@slot="address"]
And then iterate the resulting nodeset, collecting the results into an array or List.
Note that it's important to distinguish between the processor you're using to evaluate the expression and the interface you're using to initiate the evaluation.
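The second choice (XPath 1.0 only, collecting into a List) might look roughly like the following self-contained sketch. The inline sample markup, the class `AddressExtractor`, and the method `normalizedAddresses` are assumptions for illustration; the real page would be parsed from its URL instead.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class AddressExtractor {

    // Selects the nodes with XPath 1.0, then normalizes each one separately.
    static List<String> normalizedAddresses(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                    "//*[@slot='address']", doc, XPathConstants.NODESET);
            XPathExpression normalize = xpath.compile("normalize-space(.)");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                // Each node becomes the context node for normalize-space(.).
                result.add(normalize.evaluate(nodes.item(i)));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<div>"
                + "<span slot='address'>  144   Hempstead\n Tpke </span>"
                + "<span slot='address'>403 West St</span>"
                + "<span slot='name'>ignored</span>"
                + "</div>";
        // Prints: [144 Hempstead Tpke, 403 West St]
        System.out.println(normalizedAddresses(xml));
    }
}
```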

Find duplicated XML Element Names (xPath with variable)

I'm using XPath 1.0 parsers alongside CLiXML in my Java project, and I'm trying to set up a CLiXML constraint rules file.
I would like to show an error if there are duplicate element names under a specific child.
For example
<parentNode version="1">
<childA version="1">
<ignoredChild/>
</childA>
<childB version="1">
<ignoredChild/>
</childB>
<childC version="4">
<ignoredChild/>
</childC>
<childA version="2">
<ignoredChild/>
</childA>
<childD version="6">
<ignoredChild/>
</childD>
</parentNode>
childA appears more than once, so I would show an error about this.
NOTE: I only want to 'check/count' the Element name, not the attributes inside or the children of the element.
The code inside my .clx rules file that I've tried is:
<forall var="elem1" in=".//parentNode/*">
<equal op1="count(.//parentNode/$elem1)" op2="1"/>
</forall>
But that doesn't work, I get the error:
Caused by: class org.jaxen.saxpath.XPathSyntaxException: count(.//PLC-Mapping/*/$classCount: 23: Expected one of '.', '..', '@', '*', <QName>
I want the code to check each child name and run another XPath query with that name; if the count is above 1, it should give an error.
Any ideas?
Just get the list of child nodes with an appropriate path expression and check for duplicates in that list:
XPathExpression xPathExpression = xPath.compile("//parentNode/*");
NodeList children = (NodeList) xPathExpression.evaluate(config, XPathConstants.NODESET);
for (int i = 0; i < children.getLength(); i++) {
// maintain a HashSet of element names here and check whether the current name is already in it
}
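A fuller sketch of that HashSet approach, runnable on its own with the JDK's JAXP API. The class `DuplicateChildren` and the method `duplicateNames` are hypothetical names; only the element names are compared, and attributes and children are ignored, as the question requires.

```java
import java.io.StringReader;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DuplicateChildren {

    // Returns the element names that occur more than once directly under parentNode.
    static Set<String> duplicateNames(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList children = (NodeList) xpath.evaluate(
                    "//parentNode/*", doc, XPathConstants.NODESET);

            Set<String> seen = new HashSet<>();
            Set<String> duplicates = new LinkedHashSet<>(); // keeps first-seen order
            for (int i = 0; i < children.getLength(); i++) {
                String name = children.item(i).getNodeName();
                // add() returns false when the name was already present.
                if (!seen.add(name)) {
                    duplicates.add(name);
                }
            }
            return duplicates;
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<parentNode version='1'>"
                + "<childA version='1'><ignoredChild/></childA>"
                + "<childB version='1'><ignoredChild/></childB>"
                + "<childC version='4'><ignoredChild/></childC>"
                + "<childA version='2'><ignoredChild/></childA>"
                + "<childD version='6'><ignoredChild/></childD>"
                + "</parentNode>";
        // Prints: [childA]
        System.out.println(duplicateNames(xml));
    }
}
```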
This cannot be done with a single XPath 1.0 expression (see this similar question I answered today).
Here is a single XPath 2.0 expression (in case you can use XPath 2.0):
/*/*[(for $n in name()
return count(/*/*[name()=$n])
)
>1
]
This selects all elements that are children of the top element of the XML document and that occur more than once.
