Resolving an xpath that includes namespaces in Java appears to require the use of a NamespaceContext object, mapping prefixes to namespace urls and vice versa. However, I can find no mechanism for getting a NamespaceContext other than implementing it myself. This seems counter-intuitive.
The question: Is there any easy way to acquire a NamespaceContext from a document, or to create one, or failing that, to forgo prefixes altogether and specify the xpath with fully qualified names?
It is possible to get a NamespaceContext instance without writing your own class. Its class-use page shows you can get one using the javax.xml.stream package.
String ctxtTemplate = "<data xmlns=\"http://base\" xmlns:foo=\"http://foo\" />";
NamespaceContext nsContext = null;
XMLInputFactory factory = XMLInputFactory.newInstance();
XMLEventReader evtReader = factory
.createXMLEventReader(new StringReader(ctxtTemplate));
while (evtReader.hasNext()) {
XMLEvent event = evtReader.nextEvent();
if (event.isStartElement()) {
nsContext = ((StartElement) event)
.getNamespaceContext();
break;
}
}
System.out.println(nsContext.getNamespaceURI(""));
System.out.println(nsContext.getNamespaceURI("foo"));
System.out.println(nsContext
.getNamespaceURI(XMLConstants.XMLNS_ATTRIBUTE));
System.out.println(nsContext
.getNamespaceURI(XMLConstants.XML_NS_PREFIX));
Forgoing prefixes altogether is likely to lead to ambiguous expressions - if you want to drop namespace prefixes, you'd need to change the document format. Creating a context from a document doesn't necessarily make sense. The prefixes have to match the ones used in the XPath expression, not the ones in any document, as in this code:
String xml = "<data xmlns=\"http://base\" xmlns:foo=\"http://foo\" >"
+ "<foo:value>"
+ "hello"
+ "</foo:value>"
+ "</data>";
String expression = "/stack:data/overflow:value";
class BaseFooContext implements NamespaceContext {
#Override
public String getNamespaceURI(String prefix) {
if ("stack".equals(prefix))
return "http://base";
if ("overflow".equals(prefix))
return "http://foo";
throw new IllegalArgumentException(prefix);
}
#Override
public String getPrefix(String namespaceURI) {
throw new UnsupportedOperationException();
}
#Override
public Iterator<String> getPrefixes(
String namespaceURI) {
throw new UnsupportedOperationException();
}
}
XPathFactory factory = XPathFactory.newInstance();
XPath xpath = factory.newXPath();
xpath.setNamespaceContext(new BaseFooContext());
String value = xpath.evaluate(expression,
new InputSource(new StringReader(xml)));
System.out.println(value);
Neither the implementation returned by the StAX API nor the one above implement the full class/method contracts as defined in the doc. You can get a full, map-based implementation here.
I've just been working through using xpath and NamespaceContexts myself. I came across a good treatment of the issue on developerworks.
I found a convenient implementation in "Apache WebServices Common Utilities" called NamespaceContextImpl.
You can use the following maven dependency to obtain this class:
<dependency>
<groupId>org.apache.ws.commons</groupId>
<artifactId>ws-commons-util</artifactId>
<version>1.0.1</version>
</dependency>
I've use it in the following manner (I know its built for sax, but after reading the code, its o.k):
NamespaceContextImpl nsContext = new NamespaceContextImpl();
nsContext.startPrefixMapping("foo", "my.name.space.com");
You don't need to called endPrefixMapping.
If you are using the Spring framework you can reuse their NamespaceContext implementation
org.springframework.util.xml.SimpleNamespaceContext
This is a similar answer like the one from Asaf Mesika. So it doesn't give you automatic a NamespaceContext based on your document. You have to construct it yourself. Still it helps you because it at least gives you an implementation to starts with.
When we faced a similar problem, Both the spring SimpleNamespaceContext and the "Apache WebServices Common Utilities" worked. We wanted to avoid to the addition jar dependency on Apache WebServices Common Utilities and used the Spring one, because our application is Spring based.
If you are using Jersey 2 and only have a default XML namespace (xmlns="..."), you can use SimpleNamespaceResolver:
<?xml version="1.0" encoding="UTF-8"?>
<Outer xmlns="http://host/namespace">
<Inner />
</Outer>
DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
docBuilderFactory.setNamespaceAware(true);
DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
Document document = docBuilder.parse(new File("document.xml"));
String query = "/t:Outer/t:Inner";
XPath xpath = XPathFactory.newInstance().newXPath();
String xmlns = document.getDocumentElement().getAttribute("xmlns");
xpath.setNamespaceContext(new SimpleNamespaceResolver("t", xmlns));
NodeList nodeList = (NodeList) xpath.evaluate(query, document, XPathConstants.NODESET);
//nodeList will contain the <Inner> element
You can also specify xmlns manually if you want.
Related
I'm parsing a xml string with dom4j and I'm using xpath to select some element from it, the code is :
String test = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><epp xmlns=\"urn:ietf:params:xml:ns:epp-1.0\"><response><result code=\"1000\"><msg lang=\"en-US\">Command completed successfully</msg></result><trID><clTRID>87285586-99412370</clTRID><svTRID>52639BB8-1-ARNES</svTRID></trID></response></epp>";
SAXReader reader = new SAXReader();
reader.setIncludeExternalDTDDeclarations(false);
reader.setIncludeInternalDTDDeclarations(false);
reader.setValidation(false);
Document xmlDoc;
try {
xmlDoc = reader.read(new StringReader(test));
xmlDoc.getRootElement();
Node nodeStatus = xmlDoc.selectSingleNode("//epp/response/result");
System.out.print(nodeStatus.getText());
} catch (DocumentException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
I always get null for the nodeStatus variable. I actualy nead to read the code from the result noad from the xml
<result code="1000">
This is the XML that I am reading from the String test:
<?xml version="1.0" encoding="UTF-8"?>
<epp xmlns="urn:ietf:params:xml:ns:epp-1.0">
<response>
<result code="1000">
<msg lang="en-US">Command completed successfully</msg>
</result>
<trID>
<clTRID>87285586-99412370</clTRID>
<svTRID>52639BB8-1-ARNES</svTRID>
</trID>
</response>
</epp>
Any hints?
Your XML has a namespace. DOM4J returns null because it won't find your nodes.
To make it work, you first have to register the namespaces you are using. You will need a prefix. Any one. And you will have to use that prefix in your XPath.
You could use tns for "target namespace". Then you have to create a xpath object with it like this:
XPath xpath = new DefaultXPath("/tns:epp/tns:response/tns:result");
To register the namespaces you will need to create a Map, add the namespace with the prefix you used in the xpath expression, and pass it to the setNamespaceURIs() method.
namespaces.put("tns", "urn:ietf:params:xml:ns:epp-1.0");
xpath.setNamespaceURIs(namespaces);
Now you can call selectSingleNode, but you will call it on your XPath object passing the document as the argument:
Node nodeStatus = xpath.selectSingleNode(xmlDoc);
From there you can extract the data you need. getText() won't give you the data you want. If you want the contents of the result node as XML, you can use:
nodeStatus.asXML()
Edit: to retrieve just the code, change your XPath to:
/tns:epp/tns:response/tns:result/#code
And retrieve the result with
nodeStatus.getText();
I replaced the double slash // (which means descendant-or-self) with / since the expression contains the full path and / is more efficient. But if you only have one result node in your whole file, you can use:
//result/#code
to extract the data. It will match all descendants. If there is more than one result, it will return a node-set.
I have to parse an XML file with following structure:
<root>
<object_1>
<pro1> abc </pro1>
<pro2> pqr </pro2>
<pro3> xyz </pro3>
<children>
<object_a>
<pro1> abc </pro1>
<pro2> pqr </pro2>
<pro3> xyz </pro3>
<children>
.
.
.
</children>
</object_a>
</children>
</object_1>
<object_2>
.
.
.
</object_n>
</root>
Aim is to parse this multilevel nesting. A few classes are defined in Java.
Class Object_1
Class Object_2
.
.
.
Class Object_N
with their respective properties.
The following code is working for me, but then this is not the best way of doing things.
File file = new File(fileName);
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(file);
doc.getDocumentElement().normalize();
if(doc ==null) return;
Node node = doc.getFirstChild();
NodeList lst = node.getChildNodes();
Node children = null ;
int len = lst.getLength();
for(int index=0;index<len;index++)
{
Node child = lst.item(index);
String name = child.getNodeName();
if(name=="Name")
name = child.getNodeValue();
else if(name=="Comment")
comment = child.getNodeValue());
else if(name=="children")
children = child;
}
if(children==null) return;
lst = children.getChildNodes();
len = lst.getLength();
Class<?> obj=null;
AbsModel model = null;
for(int index=0;index<len;index++)
{
Node childNode = lst.item(index);
String modelName = childNode.getNodeName();
try {
obj = Class.forName(modelName);
} catch (ClassNotFoundException e) {
e.printStackTrace();
}
if(obj!=null)
model = (AbsModel) obj.newInstance();
else
model = new GenericModel();
model.restoreDefaultPropFromXML(childNode);
addChild(model);
}
}
Is there a better way of parsing this XML.
Consider using JAXB, which is part of Java since version 6. You should be able to parse (“unmarshall”) your XML file into your own classes with almost no code, just adding a few annotations expliciting the mapping between your object structure and your XML structure.
StAX and or JAXB is almost always the way to go.
If the XML is really dynamic (like attributes specify the property name) ie <prop name="property" value="" /> then you will need to use StAX only or live with what JAXB will map it to (a POJO with name and value properties) and post process.
Personally I find combining StAX and JAXB the best. I parse to the elements I want and then use JAXB to turn the element into a POJO.
See Also:
My own utility library that will turn an XML Stream into an iterator of objects.
Parsing very large XML files and marshalling to Java Objects
http://tedone.typepad.com/blog/2011/06/unmarshalling-benchmark-in-java-jaxb-vs-stax-vs-woodstox.html
While JAXB may be the best choice I'd also like to mention jOOX which provides a JQuery-like API and makes working with XML documents really pleasant.
So I've tried searching and searching on how to do this but I keep seeing a lot of complicated answers for what I need. I basically am using the Flurry Analytics API to return some xml code from an HTTP request and this is what it returns.
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<eventMetrics type="Event" startDate="2011-2-28" eventName="Tip Calculated" endDate="2011-3-1" version="1.0" generatedDate="3/1/11 11:32 AM">
<day uniqueUsers="1" totalSessions="24" totalCount="3" date="2011-02-28"/>
<day uniqueUsers="0" totalSessions="0" totalCount="0" date="2011-03-01"/>
<parameters/>
</eventMetrics>
All I want to get is that totalCount number which is 3 with Java to an int or string. I've looked at the different DOM and SAX methods and they seem to grab information outside of the tags. Is there someway I can just grab totalCount within the tag?
Thanks,
Update
I found this url -http://www.androidpeople.com/android-xml-parsing-tutorial-%E2%80%93-using-domparser/
That helped me considering it was in android. But I thank everyone who responded for helping me out. I checked out every answer and it helped out a little bit for getting to understand what's going on. However, now I can't seem to grab the xml from my url because it requires an HTTP post first to then get the xml. When it goes to grab xml from my url it just says file not found.
Update 2
I got it all sorted out reading it in now and getting the xml from Flurry Analytics (for reference if anyone stumbles upon this question)
HTTP request for XML file
totalCount is what we call an attribute. If you're using the org.w3c.dom API, you call getAttribute("totalCount") on the appropriate element.
If you are using an SAX handler, override the startElement callback method to access attributes:
public void startElement (String uri, String name, String qName, Attributes atts)
{
if("day".equals (qName)) {
String total = attrs.getValue("totalCount");
}
}
A JDOM example. Note the use of SAXBuilder to load the document.
URL httpSource = new URL("some url string");
Document document = SAXBuilder.build(httpSource);
List<?> elements = document.getDescendants(new KeyFilter());
for (Element e : elements) {
//do something more useful with it than this
String total = (Element) e.getAttributeValue("totalCount");
}
class KeyFilter implements Filter {
public boolean matches (Object obj) {
return (Element) obj.getName().equals("key");
}
}
I think that the simplest way is to use XPath, below is an example based on vtd-xml.
import com.ximpleware.*;
public class test {
public static void main(String[] args) throws Exception {
String xpathExpr = "/eventMetrics/day/#totalCount";
VTDGen vg = new VTDGen();
int i = -1;
if (vg.parseHttpUrl("http://localhost/test.xml", true)) {
VTDNav vn = vg.getNav();
AutoPilot ap = new AutoPilot();
ap.selectXPath(xpathExpr);
ap.bind(vn);
System.out.println("total count "+(int)ap.evalXPathtoDouble());
}
}
}
I am constructing XML code using Java. See my code snippet.
Document document = null;
String xml = "";
ReportsDAO objReportsDAO = null;
try
{
logger.info("Getting XML data for Consumable Report Starts...");
objReportsDAO = new ReportsDAO();
List consumableDTOLst = objReportsDAO.getConsumableData(issuedBy, issuedTo, employeeType, itemCode, itemName, className, transactionFromDate, transactionToDate, machineCode, workOrderNumber, jobName, customerId);
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
document = builder.newDocument();
Element rootElmnt = (Element) document.createElement("items");
document.appendChild(rootElmnt);
Element elmt = null;
ConsumableDTO objConsumableDTO = null;
SimpleDateFormat sdf = new SimpleDateFormat("MM/dd/yyyy");
for (int i = 0; i < consumableDTOLst.size(); i++)
{
objConsumableDTO = (ConsumableDTO)consumableDTOLst.get(i);
elmt = (Element) document.createElement("item");
elmt.setAttribute("IssuedBy", objConsumableDTO.getIssuedBy());
elmt.setAttribute("IssuedTo", objConsumableDTO.getIssuedTo());
elmt.setAttribute("EMPLOYECADRE", objConsumableDTO.getEmployeeType());
elmt.setAttribute("ITEMCODE", objConsumableDTO.getItemCode());
elmt.setAttribute("ITEMNAME", objConsumableDTO.getItemName());
elmt.setAttribute("ITEMCLASS", objConsumableDTO.getClassName());
elmt.setAttribute("DATE", sdf.format(objConsumableDTO.getTransactionDate()));
elmt.setAttribute("machineCode", objConsumableDTO.getMachineCode());
elmt.setAttribute("JOB", objConsumableDTO.getJobName());
elmt.setAttribute("WORKORDERNUMBER", objConsumableDTO.getWorkOrderNumber());
elmt.setAttribute("CustomerName", objConsumableDTO.getCustomerName());
elmt.setAttribute("RoleName", objConsumableDTO.getGroupName());
elmt.setAttribute("VendorName", objConsumableDTO.getVendorName());
elmt.setAttribute("QTY", String.valueOf(Math.abs(objConsumableDTO.getQuantity())));
elmt.setAttribute("unitDescription", objConsumableDTO.getUnitDescription());
elmt.setAttribute("RATEPERQTY", String.valueOf(objConsumableDTO.getRate()));
elmt.setAttribute("AMOUNT", String.valueOf(objConsumableDTO.getAmount()));
rootElmnt.appendChild(elmt);
}
The problem is all the attributes are sorted automatically. How to restrict it?
For eg,
<empdetails age="25" name="john"/>
but i want
<empdetails name="john" age="25"/>
Please suggest some idea.
Thanks,
Duplicate: Order of XML attributes after DOM processing
From the accepted answer:
Look at section 3.1 of the XML
recommendation. It says, "Note that
the order of attribute specifications
in a start-tag or empty-element tag is
not significant."
If a piece of software requires
attributes on an XML element to appear
in a specific order, that software is
not processing XML, it's processing
text that looks superficially like
XML. It needs to be fixed.
If it can't be fixed, and you have to
produce files that conform to its
requirements, you can't reliably use
standard XML tools to produce those
files.
Credit to Robert Rossney
XML attributes are not ordered. How they're output is dependent on the XML output mechanism you use.
Consequently you could write your on output mechanism, but you shouldn't rely on any consumer to consume them in an ordered fashion. If you want/need ordering, you should instead specify a sequence of XML elements below this node.
I have a XmlDocument in java, created with the Weblogic XmlDocument parser.
I want to replace the content of a tag in this XMLDocument with my own data, or insert the tag if it isn't there.
<customdata>
<tag1 />
<tag2>mfkdslmlfkm</tag2>
<location />
<tag3 />
</customdata>
For example I want to insert a URL in the location tag:
<location>http://something</location>
but otherwise leave the XML as is.
Currently I use a XMLCursor:
XmlObject xmlobj = XmlObject.Factory.parse(a.getCustomData(), options);
XmlCursor xmlcur = xmlobj.newCursor();
while (xmlcur.hasNextToken()) {
boolean found = false;
if (xmlcur.isStart() && "schema-location".equals(xmlcur.getName().toString())) {
xmlcur.setTextValue("http://replaced");
System.out.println("replaced");
found = true;
} else if (xmlcur.isStart() && "customdata".equals(xmlcur.getName().toString())) {
xmlcur.push();
} else if (xmlcur.isEnddoc()) {
if (!found) {
xmlcur.pop();
xmlcur.toEndToken();
xmlcur.insertElementWithText("schema-location", "http://inserted");
System.out.println("inserted");
}
}
xmlcur.toNextToken();
}
I tried to find a "quick" xquery way to do this since the XmlDocument has an execQuery method, but didn't find it very easy.
Do anyone have a better way than this? It seems a bit elaborate.
How about an XPath based approach? I like this approach as the logic is super-easy to understand. The code is pretty much self-documenting.
If your xml document is available to you as an org.w3c.dom.Document object (as most parsers return), then you could do something like the following:
// get the list of customdata nodes
NodeList customDataNodeSet = findNodes(document, "//customdata" );
for (int i=0 ; i < customDataNodeSet.getLength() ; i++) {
Node customDataNode = customDataNodeSet.item( i );
// get the location nodes (if any) within this one customdata node
NodeList locationNodeSet = findNodes(customDataNode, "location" );
if (locationNodeSet.getLength() > 0) {
// replace
locationNodeSet.item( 0 ).setTextContent( "http://stackoverflow.com/" );
}
else {
// insert
Element newLocationNode = document.createElement( "location" );
newLocationNode.setTextContent("http://stackoverflow.com/" );
customDataNode.appendChild( newLocationNode );
}
}
And here's the helper method findNodes that does the XPath search.
private NodeList findNodes( Object obj, String xPathString )
throws XPathExpressionException {
XPath xPath = XPathFactory.newInstance().newXPath();
XPathExpression expression = xPath.compile( xPathString );
return (NodeList) expression.evaluate( obj, XPathConstants.NODESET );
}
How about an object oriented approach? You could deserialise the XML to an object, set the location value on the object, then serialise back to XML.
XStream makes this really easy.
For example, you would define the main object, which in your case is CustomData (I'm using public fields to keep the example simple):
public class CustomData {
public String tag1;
public String tag2;
public String location;
public String tag3;
}
Then you initialize XStream:
XStream xstream = new XStream();
// if you need to output the main tag in lowercase, use the following line
xstream.alias("customdata", CustomData.class);
Now you can construct an object from XML, set the location field on the object and regenerate the XML:
CustomData d = (CustomData)xstream.fromXML(xml);
d.location = "http://stackoverflow.com";
xml = xstream.toXML(d);
How does that sound?
If you don't know the schema the XStream solution probably isn't the way to go. At least XStream is on your radar now, might come in handy in the future!
You should be able to do this with query
try
fn:replace(string,pattern,replace)
I am new to xquery myself and I have found it to be a painful query language to work with, but it does work quiet well once you get over the initial learning curve.
I do still wish there was an easier way which was as efficient?