This is my XML, for example:
<?xml version="1.0" encoding="UTF_8" standalone="yes"?>
<StoreMessage xmlns="http://www.xxx.com/feed">
<billingDetail>
<billingDetailId>987</billingDetailId>
<contextId>0</contextId>
<userId>
<pan>F0F8DJH348DJ</pan>
<contractSerialNumber>46446</contractSerialNumber>
</userId>
<declaredVehicleClass>A</declaredVehicleClass>
</billingDetail>
<billingDetail>
<billingDetailId>543</billingDetailId>
<contextId>0</contextId>
<userId>
<pan>F0F854534534348DJ</pan>
<contractSerialNumber>4666546446</contractSerialNumber>
</userId>
<declaredVehicleClass>C</declaredVehicleClass>
</billingDetail>
</StoreMessage>
With the JDOM parser I want to get all <billingDetail> XML nodes from it.
My code:
SAXBuilder builder = new SAXBuilder();
try {
Reader in = new StringReader(xmlAsString);
Document document = (Document)builder.build(in);
Element rootNode = document.getRootElement();
List<?> list = rootNode.getChildren("billingDetail");
XMLOutputter outp = new XMLOutputter();
outp.setFormat(Format.getCompactFormat());
for (int i = 0; i < list.size(); i++) {
Element node = (Element)list.get(i);
StringWriter sw = new StringWriter();
outp.output(node.getContent(), sw);
StringBuffer sb = sw.getBuffer();
String text = sb.toString();
xmlRecords.add(sb.toString());
}
} catch (IOException io) {
io.printStackTrace();
} catch (JDOMException jdomex) {
jdomex.printStackTrace();
}
But I never get each XML node as a string in the output, like:
<billingDetail>
<billingDetailId>987</billingDetailId>
<contextId>0</contextId>
<userId>
<pan>F0F8DJH348DJ</pan>
<contractSerialNumber>46446</contractSerialNumber>
</userId>
<declaredVehicleClass>A</declaredVehicleClass>
</billingDetail>
What am I doing wrong? How can I get this output with the JDOM parser?
EDIT
And why does it work if the XML starts with
<StoreMessage> instead of <StoreMessage xmlns="http://www.xxx.com/MediationFeed">?
How is this possible?
The problem is that there are two versions of the getChildren method:
java.util.List getChildren(java.lang.String name)
This returns a List of all the child elements nested directly (one level deep) within this element with the given local name and belonging to no namespace, returned as Element objects.
and
java.util.List getChildren(java.lang.String name, Namespace ns)
This returns a List of all the child elements nested directly (one level deep) within this element with the given local name and belonging to the given Namespace, returned as Element objects.
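The first one doesn't find your nodes because they belong to a namespace: the default namespace declared by xmlns on <StoreMessage> also applies to its children. Use the second one and pass that namespace.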
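A minimal sketch of the same idea, written against the JDK's built-in W3C DOM API rather than JDOM so that it runs without extra dependencies. In JDOM itself the corresponding call would be rootNode.getChildren("billingDetail", rootNode.getNamespace()). The class name NamespacedChildren, the method billingDetails and the shortened sample XML are made up for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NamespacedChildren {
    // Namespace URI taken from the xmlns attribute of the question's root element.
    static final String NS = "http://www.xxx.com/feed";

    // Shortened, illustrative version of the XML in the question.
    static final String SAMPLE =
          "<StoreMessage xmlns=\"" + NS + "\">"
        + "<billingDetail><billingDetailId>987</billingDetailId></billingDetail>"
        + "<billingDetail><billingDetailId>543</billingDetailId></billingDetail>"
        + "</StoreMessage>";

    /** Returns each billingDetail element serialized as its own XML string. */
    static List<String> billingDetails(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for namespace-based lookups
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            // Look the elements up by namespace URI + local name.
            // Note: unlike JDOM's getChildren, this matches all descendants.
            NodeList nodes = doc.getElementsByTagNameNS(NS, "billingDetail");

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

            List<String> records = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                StringWriter sw = new StringWriter();
                // Serialize the element itself, not just its content.
                transformer.transform(new DOMSource(nodes.item(i)), new StreamResult(sw));
                records.add(sw.toString());
            }
            return records;
        } catch (Exception e) {
            throw new IllegalStateException("Could not extract billingDetail nodes", e);
        }
    }

    public static void main(String[] args) {
        billingDetails(SAMPLE).forEach(System.out::println);
    }
}
```

The serialized elements carry the xmlns declaration themselves, because each one is written out as a standalone fragment.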
The first time I run this program I need to create the XML. I first create the file, create a Document object, and then get its root as an Element object.
xmlDoc = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>";
xmlDoc +="<head>";
xmlDoc += "</head>";
Document xmlFile = XmlParser.parseXmlString(xmlDoc);
Element element = xmlFile.getDocumentElement();
I have already verified this with its node type code: when I create the parent node it gives me ELEMENT_NODE == 1. I attach this node to the element object.
Element newElement = xmlFile.createElement("parent");
newElement.setAttribute("id", i);
element.appendChild(newElement);
I will put the child in a parent if it isn't already a child of that parent element. I check for this, and if it isn't a child yet I create a new Node and give it text content.
Node newChild = xmlFile.createElement("child");
newChild.setTextContent(text);
newElement.appendChild(newChild);
Then I will save this file with a transformer.
Transformer transformer = null;
try {
transformer = TransformerFactory.newInstance().newTransformer();
} catch (TransformerConfigurationException | TransformerFactoryConfigurationError e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
DOMSource source = new DOMSource(xmlFile);
StreamResult console = new StreamResult(System.out);
try {
transformer.transform(source, new StreamResult(new FileOutputStream(file.getPath())));
} catch (TransformerException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (FileNotFoundException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
Now the second time I run the program I parse straight from this file. The XML file that was created has the structure below:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<head>
<parent id="1">
<child>text1</child>
<child>text2</child>
<child>text3</child>
</parent>
<parent id="2">
<child>text1</child>
<child>text2</child>
</parent>
</head>
Now that the file is created, it is read and parsed to create the element instead of using the hard-coded string.
xmlDoc = this.readFile(file, Charset.forName("UTF-8"));
Document xmlFile = XmlParser.parseXmlString(xmlDoc);
Element element = xmlFile.getDocumentElement();
...
String readFile(File file, Charset charset) throws IOException {
return new String(Files.readAllBytes(file.toPath()), charset);
}
The problem is that now the parent element cannot be cast to an Element and has the TEXT_NODE type value == 3. The following object cannot be cast:
Element nextSib = (Element) element.getFirstChild();
The idea is that I can now append a relevant child to a parent by going through each parent node, which is why I need it in Element form so I can use the id attribute. But I cannot do this since the parent node comes back as a text node for some reason.
Because you use indentation when writing out the tree, there is white space between the element nodes, so a child node can be a text node containing only white space. If you are looking for the first element child, either use XPath (*[1], or simply the name of the element, e.g. foo[1]), or, if you want to do it with childNodes, keep checking the nodeType until you reach an element node.
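The nodeType check can be sketched as below. This is only an illustration: the class FirstElementChild and its helper parseRoot are hypothetical names, and the sample XML is a cut-down version of the file in the question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class FirstElementChild {
    /** Returns the first child that is an element, or null if there is none. */
    static Element firstElementChild(Node parent) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            // Skip whitespace text nodes, comments, etc.
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                return (Element) n;
            }
        }
        return null;
    }

    /** Hypothetical helper: parses the string and returns the document element. */
    static Element parseRoot(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<head>\n  <parent id=\"1\">\n    <child>text1</child>\n  </parent>\n</head>";
        Element head = parseRoot(xml);
        // The very first child is the indentation, i.e. a text node.
        System.out.println(head.getFirstChild().getNodeType() == Node.TEXT_NODE);
        // The first *element* child is the parent with id 1.
        System.out.println(firstElementChild(head).getAttribute("id"));
    }
}
```

The same loop, continued from getNextSibling() of the element just found, walks all the parent elements one by one.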
I am trying to parse a simple XML file. It looks like this
<?xml version="1.0" encoding="utf-8"?>
<resources xmlns:ns1="urn:oasis:names:tc:xliff:document:1.2">
<string name="action_settings">Settings</string>
<string name="app_name">Colatris Sample</string>
<string name="cdata"><![CDATA[<p>Text<p>]]></string>
<string name="content_description_sample">Something</string>
<string name="countdown"><xliff:g example="5 days" id="time">%1$s</xliff:g> until holiday</string>
</resources>
This is my parsing method:
List<CsString> extract(Document document) throws CsException {
List<CsString> csStrings = new ArrayList<>();
Element resources = document.getDocumentElement();
NodeList strings = resources.getElementsByTagName("string");
for (int i = 0; i < strings.getLength(); i++) {
Node string = strings.item(i);
csStrings.add(new CsString(string.getAttributes().getNamedItem("name").getNodeValue(), string.getTextContent()));
}
return csStrings;
}
I am building the passed Document with this method.
Document getDocument() throws CsException {
try {
Application application = core.getApplication();
AssetManager assetManager = application.getAssets();
InputStream inputStream = assetManager.open("colatris/values.xml");
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setIgnoringElementContentWhitespace(true);
DocumentBuilder builder = factory.newDocumentBuilder();
return builder.parse(inputStream);
} catch (IOException | ParserConfigurationException | SAXException e) {
throw new CsException("Unable to get parser");
}
}
Everything is working great, except for the cdata and countdown elements. I want to get just the literal content between the string tags. However, the parser returns only the text inside the CDATA section and strips out the xliff tags.
String countdown = %1$s until holiday
String cdata = <p>Text<p>
I want the parsed strings to look like this so I can persist them literally. I need to be able to reconstruct the XML down the road with the metadata in the correct places.
String countdown = <ns1:g example="5 days" id="time">%1$s</ns1:g> until holiday
String cdata = <![CDATA[<p>Text<p>]]>
Are there any configuration tricks for Document to keep the nodes between two elements as literal strings? For most users stripping CDATA makes sense, but I need to get around that.
The reason is of course that you are just extracting the text from the string element. What you should do is get the sub-node (or maybe sub-nodes; I don't know the exact layout of your files) and output them again using a javax.xml.transform.Transformer. The code would look something like:
NodeList list = document.getDocumentElement().getElementsByTagName("string");
Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.setOutputProperty("omit-xml-declaration", "yes");
for (int i = 0; i < list.getLength(); i++) {
Node node = list.item(i);
Node child = node.getFirstChild();
StringWriter writer = new StringWriter();
transformer.transform(new DOMSource(child), new StreamResult(writer));
System.out.println(writer.toString()); // Do your list thing instead
}
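The snippet above serializes only the first child of each string element, while an entry like countdown has two children (the xliff element and the trailing text). A hedged sketch that writes out every child in order follows; the class InnerXml and the helper firstStringContent are hypothetical names, and the document is parsed without namespace awareness, as in the question. How a CDATA section is written back can depend on the parser and serializer settings, so that part is worth checking on the target platform.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class InnerXml {
    /** Serializes every child node of parent, in document order, into one string. */
    static String innerXml(Node parent) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            for (Node child = parent.getFirstChild(); child != null; child = child.getNextSibling()) {
                // Each child (element or text) is appended to the same writer.
                transformer.transform(new DOMSource(child), new StreamResult(out));
            }
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not serialize child nodes", e);
        }
    }

    /** Hypothetical helper: parses xml and returns the inner XML of its first string element. */
    static String firstStringContent(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return innerXml(doc.getElementsByTagName("string").item(0));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<resources><string name=\"countdown\">"
                + "<xliff:g id=\"time\">%1$s</xliff:g> until holiday</string></resources>";
        System.out.println(firstStringContent(xml));
    }
}
```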
I am creating a W3C Document object using a String value. Once I created the Document object, I want to add a namespace to the root element of this document. Here's my current code:
Document document = builder.parse(new InputSource(new StringReader(xmlString)));
document.getDocumentElement().setAttributeNS("http://com", "xmlns:ns2", "Test");
document.setPrefix("ns2");
TransformerFactory tranFactory = TransformerFactory.newInstance();
Transformer aTransformer = tranFactory.newTransformer();
Source src = new DOMSource(document);
Result dest = new StreamResult(new File("c:\\xmlFileName.xml"));
aTransformer.transform(src, dest);
What I use as input:
<product>
<arg0>DDDDDD</arg0>
<arg1>DDDD</arg1>
</product>
What the output should look like:
<ns2:product xmlns:ns2="http://com">
<arg0>DDDDDD</arg0>
<arg1>DDDD</arg1>
</ns2:product>
I need to add the prefix value and namespace also to the input xml string. If I try the above code I am getting this exception:
NAMESPACE_ERR: An attempt is made to create or change an object in a way which is incorrect with regard to namespaces.
Appreciate your help!
Since there is not an easy way to rename the root element, we'll have to replace it with an element that has the correct namespace and attribute, and then copy all the original children into it. Forcing the namespace declaration is not needed because by giving the element the correct namespace (URI) and setting the prefix, the declaration will be automatic.
Replace the setAttributeNS and setPrefix calls (lines 2 and 3 of your code) with this:
String namespace = "http://com";
String prefix = "ns2";
// Upgrade the DOM level 1 to level 2 with the correct namespace
Element originalDocumentElement = document.getDocumentElement();
Element newDocumentElement = document.createElementNS(namespace, originalDocumentElement.getNodeName());
// Set the desired namespace and prefix
newDocumentElement.setPrefix(prefix);
// Copy all children
NodeList list = originalDocumentElement.getChildNodes();
while(list.getLength()!=0) {
newDocumentElement.appendChild(list.item(0));
}
// Replace the original element
document.replaceChild(newDocumentElement, originalDocumentElement);
In the original code the author tried to declare an element namespace like this:
.setAttributeNS("http://com", "xmlns:ns2", "Test");
The first parameter is the namespace of the attribute, and since it's a namespace declaration attribute it needs to have the http://www.w3.org/2000/xmlns/ URI. The namespace being declared goes in the third parameter:
.setAttributeNS("http://www.w3.org/2000/xmlns/", "xmlns:ns2", "http://com");
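As a possible alternative to copying the children by hand, DOM Level 3 offers Document.renameNode, which can move an element into a namespace and give it a prefix in one call. The sketch below is an assumption-laden illustration rather than the answer's method: the class RenameRoot and the method qualifyRoot are invented names, and the parser is made namespace aware here.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class RenameRoot {
    /** Puts the root element into namespaceUri under the given prefix and returns the XML. */
    static String qualifyRoot(String xml, String namespaceUri, String prefix) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            Element root = doc.getDocumentElement();
            // renameNode may replace the node with a new one, so do not keep
            // using the old reference afterwards; re-read it from the document.
            doc.renameNode(root, namespaceUri, prefix + ":" + root.getLocalName());

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter sw = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(sw));
            return sw.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not rename root element", e);
        }
    }

    public static void main(String[] args) {
        String input = "<product><arg0>DDDDDD</arg0><arg1>DDDD</arg1></product>";
        System.out.println(qualifyRoot(input, "http://com", "ns2"));
    }
}
```

Only the root is renamed, so the children stay unprefixed and outside the namespace, which matches the output the question asks for.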
The approach below also works for me, but it probably should not be used in performance-critical cases.
1. Add the namespace to the document root element as an attribute.
2. Transform the document to an XML string. The purpose of this step is to make the child elements in the XML string inherit the parent element's namespace.
3. Now the XML string has the namespace.
4. You can use the XML string to build a document again, or use it for JAXB unmarshalling, etc.
private static String addNamespaceToXml(InputStream in)
throws ParserConfigurationException, SAXException, IOException,
TransformerException {
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
/*
* Must not be namespace aware, otherwise the generated XML string will
* have the wrong namespace
*/
// dbf.setNamespaceAware(true);
DocumentBuilder db = dbf.newDocumentBuilder();
Document document = db.parse(in);
Element documentElement = document.getDocumentElement();
// Add name space to root element as attribute
documentElement.setAttribute("xmlns", "http://you_name_space");
String xml = transformXmlNodeToXmlString(documentElement);
return xml;
}
private static String transformXmlNodeToXmlString(Node node)
throws TransformerException {
TransformerFactory transFactory = TransformerFactory.newInstance();
Transformer transformer = transFactory.newTransformer();
StringWriter buffer = new StringWriter();
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
transformer.transform(new DOMSource(node), new StreamResult(buffer));
String xml = buffer.toString();
return xml;
}
Partially gleaned from here, and also from a comment above, I was able to get it to work (transforming an arbitrary DOM Node and adding a prefix to it and all its children) thus:
private String addNamespacePrefix(Document doc, Node node) throws TransformerException {
Element mainRootElement = doc.createElementNS(
"http://abc.de/x/y/z", // namespace
"my-prefix:fake-header-element" // prefix to "register" it with the DOM so we don't get exceptions later...
);
List<Element> descendants = nodeListToArrayRecurse(node.getChildNodes()); // for some reason we have to grab all these before doing the first "renameNode" ... no idea why ...
mainRootElement.appendChild(node);
doc.renameNode(node, "http://abc.de/x/y/z", "my-prefix:" + node.getNodeName());
descendants.stream().forEach(c -> doc.renameNode(c, "http://abc.de/x/y/z", "my-prefix:" + c.getNodeName()));
}
private List<Element> nodeListToArrayRecurse(NodeList entryNodes) {
List<Element> allEntries = new ArrayList<>();
for (int i = 0; i < entryNodes.getLength(); i++) {
Node child = entryNodes.item(i);
if (child.getNodeType() == Node.ELEMENT_NODE) {
allEntries.add((Element) child);
allEntries.addAll(nodeListToArrayRecurse(child.getChildNodes())); // recurse
} // ignore other [i.e. text] nodes https://stackoverflow.com/questions/14566596/loop-through-all-elements-in-xml-using-nodelist
}
return allEntries;
}
If it helps anybody. I then convert it to string, then manually remove the extra header and closing lines. What a pain, I must be doing something wrong...
This seems to be working for me, and it's much simpler than those answers provided:
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
document = builder.parse(new File(filename));
document.getDocumentElement().setAttributeNS("http://www.w3.org/2000/xmlns/", "xmlns:yourNamespace", "http://whatever/else");
I have to parse an XML file using JDOM and get some information from all of its elements.
<?xml version="1.0" encoding="UTF-8"?>
<root>
<element1>something</element1>
<element2>
<subelement21>moo</subelement21>
<subelement22>
<subelement221>toto</subelement221>
<subelement222>tata</subelement222>
</subelement22>
</element2>
</root>
So, for element1 it's easy. But for element2 I have to go through its children, and if a child has children go through them too, and so on.
public static void getInfos(Vector<String> files) {
Document document = null;
Element root = null;
SAXBuilder sxb = new SAXBuilder();
for (int i =0 ; i< files.size() ; i++)
{
System.out.println("n°" + i + " : " + files.elementAt(i));
try
{
document = sxb.build(files.elementAt(i));
root = document.getRootElement();
List<?> listElements = root.getChildren();
Iterator<?> it = listElements.iterator();
while(it.hasNext())
{
Element courant = (Element)it.next();
System.out.println(courant.getName());
if(courant.getChildren().size() > 0)
{
// here is the problem -> the element has a children
}
}
}
catch (Exception e) {
e.printStackTrace();
}
}
}
What do you suggest in this case: a recursive call, or something else so I can use the same function?
Thanks.
I would use SAX. I'd keep a stack in the contenthandler that tracked what my current path was in the document, and keep a buffer that my characters method appended to. In endElement I'd get the content from the buffer and clear it out, then use the current path to decide what to do with it.
(this is assuming this document has no mixed-content.)
Here's a link to an article on using SAX to process complex XML documents, it expands on what I briefly described into an approach that handles recursive data structures. (It also has a predecessor article that is an introduction to SAX.)
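The stack-and-buffer idea can be sketched roughly like this, using the SAX parser that ships with the JDK. The class name PathCollector and the "path=value" output format are assumptions made for the example, and like the description above it assumes no mixed content.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class PathCollector extends DefaultHandler {
    private final Deque<String> path = new ArrayDeque<>(); // current position in the document
    private final StringBuilder text = new StringBuilder(); // characters seen since the last tag
    final List<String> leaves = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        path.addLast(qName);
        text.setLength(0); // drop indentation that preceded this start tag
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length); // may be called several times per text run
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String value = text.toString().trim();
        if (!value.isEmpty()) {
            // Use the current path to decide what to do; here we just record it.
            leaves.add("/" + String.join("/", path) + "=" + value);
        }
        text.setLength(0);
        path.removeLast();
    }

    /** Parses the XML string and returns one "path=value" entry per text-bearing element. */
    static List<String> collect(String xml) {
        try {
            PathCollector handler = new PathCollector();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.leaves;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<root><element1>something</element1><element2>"
                + "<subelement21>moo</subelement21><subelement22>"
                + "<subelement221>toto</subelement221><subelement222>tata</subelement222>"
                + "</subelement22></element2></root>";
        collect(xml).forEach(System.out::println);
    }
}
```

Nesting depth does not matter here: the stack grows and shrinks with the document, so no explicit recursion is needed.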
You could consider using XPath to get the exact elements you want. The example here uses namespaces but the basic idea holds.
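As one illustration of the XPath route, the expression //*[not(*)] selects every element that has no element children, at any depth. The sketch uses the JDK's W3C DOM and javax.xml.xpath rather than JDOM, and the class LeafElements and its "name=text" output format are invented for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class LeafElements {
    /** Returns "name=text" for every element without element children, in document order. */
    static List<String> leaves(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // //* = every element in the document; [not(*)] = keep those with no child element
            NodeList nodes = (NodeList) xpath.evaluate("//*[not(*)]", doc, XPathConstants.NODESET);

            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Node node = nodes.item(i);
                result.add(node.getNodeName() + "=" + node.getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<root><element1>something</element1><element2>"
                + "<subelement21>moo</subelement21><subelement22>"
                + "<subelement221>toto</subelement221><subelement222>tata</subelement222>"
                + "</subelement22></element2></root>";
        leaves(xml).forEach(System.out::println);
    }
}
```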
I want to read feed entries and I'm stuck. Take this for example: https://stackoverflow.com/feeds/question/2084883. Let's say I want to read the value of every summary node inside each entry node in the document. How do I do that? I've tried many variations of the code; I think this one is closest to what I want to achieve:
Element entryPoint = document.getRootElement();
Element elem;
for(Iterator iter = entryPoint.elements().iterator(); iter.hasNext();){
elem = (Element)iter.next();
System.out.println(elem.getName());
}
It goes through all nodes in the XML file and writes their names. Now what I wanted to do next is
if(elem.getName().equals("entry"))
to get only the entry nodes. How do I get the elements of the entry nodes, and how do I get, let's say, summary and its value? Thanks.
Question: how to get the values of the summary nodes from this link?
Have you tried JDOM? I find it simpler and more convenient.
http://www.jdom.org/
To get all children of an XML element with a given name, you can just do
SAXBuilder sb = new SAXBuilder();
StringReader sr = new StringReader(xmlDocAsString);
Document doc = sb.build(sr);
Element root = doc.getRootElement();
List l = root.getChildren("entry", root.getNamespace()); // Atom elements are namespaced, so pass the root's namespace
for (Iterator iter = l.iterator(); iter.hasNext();) {
...//do whatever...
}
Here's how you'd do it using vanilla Java:
//read the XML into a DOM
StreamSource source = new StreamSource(new StringReader("<theXml></theXml>"));
DOMResult result = new DOMResult();
Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.transform(source, result);
Node root = result.getNode();
//make XPath object aware of namespaces
XPath xpath = XPathFactory.newInstance().newXPath();
xpath.setNamespaceContext(new NamespaceContext(){
@Override
public String getNamespaceURI(String prefix) {
if ("atom".equals(prefix)){
return "http://www.w3.org/2005/Atom";
}
return null;
}
@Override
public String getPrefix(String namespaceURI) {
return null;
}
@Override
public Iterator getPrefixes(String namespaceURI) {
return null;
}
});
//get all summaries
NodeList summaries = (NodeList) xpath.evaluate("/atom:feed/atom:entry/atom:summary", root, XPathConstants.NODESET);
for (int i = 0; i < summaries.getLength(); ++i) {
Node summary = summaries.item(i);
//print out all the attributes
for (int j = 0; j < summary.getAttributes().getLength(); ++j) {
Node attr = summary.getAttributes().item(j);
System.out.println(attr.getNodeName() + "=" + attr.getNodeValue());
}
//print text content
System.out.println(summaries.item(i).getTextContent());
}
if(elem.getName() == "entry")
I have no idea whether this is your problem (you don't really state what your problem is), but never test string equality with ==. Instead, use equals():
if(elem.getName().equals("entry"))
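A small, self-contained illustration of the difference (the class and method names are made up for the example): == compares object references, while equals() compares the characters.

```java
class StringEquality {
    /** True only if both variables point at the very same String object. */
    static boolean sameReference(String a, String b) {
        return a == b;
    }

    /** True if both strings contain the same characters. */
    static boolean sameContent(String a, String b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        // A string built at runtime, as a parser would return it,
        // is a different object from the literal "entry".
        String name = new String("entry");
        System.out.println(sameReference(name, "entry")); // false
        System.out.println(sameContent(name, "entry"));   // true
    }
}
```

Comparisons with == can appear to work when both sides happen to be interned literals, which is why the bug is easy to miss.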
A bit late but it might be useful for people googling...
There is a specialized API for dealing with RSS and Atom feeds in Java. It's called Rome and can be found here:
http://java.net/projects/rome/
It is really quite useful: it makes it easy to read a feed whatever the RSS or Atom version. You can also build feeds and generate the XML with it, though I have no experience with this feature.
Here is a simple example that reads a feed and prints out the description of every entry in the feed:
URL feedSource = new URL("http://....");
SyndFeed feed = new SyndFeedInput().build(new XmlReader(feedSource));
List<SyndEntryImpl> entries = (List<SyndEntryImpl>)feed.getEntries();
for(SyndEntryImpl entry : entries){
System.out.println(entry.getDescription().getValue());
}
Simple enough.