First of all, please excuse my shallow understanding of coding, as I am a business analyst. Now my question: I am writing Java code to convert a CSV into XML. I am able to read the CSV successfully into objects. However, while writing the XML, an error is thrown when a special character such as a space or "=" is encountered.
Here is the problematic piece of code. I have improvised the value passed to createElement just to highlight the problem; in the actual code I get this value from an object:
DocumentBuilderFactory documentFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder documentBuilder = documentFactory.newDocumentBuilder();
Document xmlDocument= documentBuilder.newDocument();
Element root = xmlDocument.createElement("Media NationalGroupId=\"8\" AllFTA=\"1002\" AllSTV=\"1001\"");
xmlDocument.appendChild(root);
My xml should look something like this
<Media DateCreated="20200224 145251" NationalGroupId="8" AllFTA="1002" AllSTV="1001" AllTV="1000" NextId="1000000">
createElement should only receive Media as the argument.
To add the other attributes (DateCreated, NationalGroupId, etc), you need to call setAttribute on root, one by one.
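A minimal sketch of that fix, using the attribute names and values from the sample XML in the question. The class name MediaElementSketch and the helper buildMediaDocument are made up for illustration.

```java
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class MediaElementSketch {
    // Builds a document whose root is <Media>, with each attribute set separately.
    static Document buildMediaDocument() throws Exception {
        DocumentBuilderFactory documentFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder documentBuilder = documentFactory.newDocumentBuilder();
        Document xmlDocument = documentBuilder.newDocument();

        // The element name must be a plain XML name: no spaces and no "=".
        Element root = xmlDocument.createElement("Media");

        // Attribute values, unlike element names, may contain spaces.
        root.setAttribute("DateCreated", "20200224 145251");
        root.setAttribute("NationalGroupId", "8");
        root.setAttribute("AllFTA", "1002");
        root.setAttribute("AllSTV", "1001");
        root.setAttribute("AllTV", "1000");
        root.setAttribute("NextId", "1000000");

        xmlDocument.appendChild(root);
        return xmlDocument;
    }

    public static void main(String[] args) throws Exception {
        Element root = buildMediaDocument().getDocumentElement();
        System.out.println(root.getTagName() + " has NationalGroupId=" + root.getAttribute("NationalGroupId"));
    }
}
```

In the real converter the attribute values would come from the objects read from the CSV rather than from literals.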
I have never had to download an XML file in Java before and parse it after. I'm looking to download and parse this file http://api.irishrail.ie/realtime/realtime.asmx/getStationDataByNameXML?StationDesc=Bayside
All I want to do is read the train times. I've been reading about parsing XML but I'm not really getting anywhere with it. I just keep reading about parsers like StAX, and after that I'm a bit lost.
Can anyone give me some basic advice of what I need to do?
You can use JAXB for this and any other XML processing needs. Start here.
You can use a DOM parser, or the Java Architecture for XML Binding (JAXB), which helps with marshalling (converting an object to XML) and unmarshalling (converting XML to an object).
link for the example
http://www.vogella.com/tutorials/JAXB/article.html
You can use the DOM Parser to create a Document object from the XML file.
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(new File(filename));
The DOM Parser creates a traversable tree from your XML data.
You can then pick out the data you need from the DOM tree using XPath.
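As a rough sketch of the DOM-plus-XPath approach: the XML layout below (trains/train/due) is a made-up stand-in, not the real Irish Rail response, whose element names and namespaces would have to be checked against the actual feed. The class and method names are also hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathSketch {
    // Parses the XML text into a DOM tree and collects the text of every <due> element.
    static List<String> readTimes(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder db = dbf.newDocumentBuilder();
        Document doc = db.parse(new InputSource(new StringReader(xml)));

        // Assumed path; adjust it to the structure of the real document.
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate("/trains/train/due", doc, XPathConstants.NODESET);

        List<String> times = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            times.add(nodes.item(i).getTextContent());
        }
        return times;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<trains><train><due>10:15</due></train><train><due>10:45</due></train></trains>";
        System.out.println(readTimes(xml));
    }
}
```

For a downloaded document, DocumentBuilder.parse also accepts an InputStream, so the stream opened from the URL could be passed in place of the InputSource.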
Try this library:
http://x-stream.github.io/download.html
It is simple and fast to use for writing and reading XML from Java.
My XML file looks like this:
<Messages>
<Contact Name="Robin" Number="8775454554">
<Message Date="24 Jan 2012" Time="04:04">this is report1</Message>
</Contact>
<Contact Name="Tobin" Number="546456456">
<Message Date="24 Jan 2012" Time="04:04">this is report2</Message>
</Contact>
</Messages>
I need to check whether the 'Number' attribute of Contact element is equal to 'somenumber' and if it is, I'm required to insert one more Message element inside Contact element.
How can it be achieved using DOM? And what are the drawbacks of using DOM?
The main drawback of using DOM is that the whole model has to be loaded into memory at once, whereas if you are simply parsing the document as a stream, you can limit how much data you keep in memory at any one point. This of course isn't really an issue until you're processing very large XML documents.
As for the processing side of things, something like the following should work:
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document dom = db.parse(is);
NodeList contacts = dom.getElementsByTagName("Contact");
for (int i = 0; i < contacts.getLength(); i++) {
    Element contact = (Element) contacts.item(i);
    String contactNumber = contact.getAttribute("Number");
    if (contactNumber.equals(somenumber)) {
        Element newMessage = dom.createElement("Message");
        // Configure the message element
        contact.appendChild(newMessage);
    }
}
DOM has two main disadvantages:
It requires reading of the complete XML into a Java representation in memory. That can be both time and memory consuming
It is a pretty verbose API, so you need to write a lot of code to achieve simple things like you're asking for.
If time and memory consumption is OK for you, but verbosity is not, you could still use jOOX, a library that I have created to wrap standard Java DOM objects to simplify manipulation of XML. These are some examples of how you would implement your requirement with jOOX:
// With css-style selectors
String result1 = $(file).find("Contact[Number=somenumber]").append(
$("<Message Date=\"25 Jan 2012\" Time=\"23:44\">this is report2</Message>")
).toString();
// With XPath
String result2 = $(file).find("//Contact[@Number = somenumber]").append(
$("<Message Date=\"25 Jan 2012\" Time=\"23:44\">this is report2</Message>")
).toString();
// Instead of file, you can also provide your source XML in various other forms
Note that jOOX only wraps the standard Java DOM. The underlying operations (find() and append(), as well as $()) actually perform various DOM operations.
You will do something to this effect.
Get the NodeList of Contact element.
Iterate through the NodeList and get Contact element.
Get Number through contact.getAttribute("Number") where contact is of type Element.
If your number equals someNumber, then add Message by calling contact.appendChild(). Message must be an element.
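The steps above can be sketched as follows with the standard DOM API only. The class name, the helper methods and the Date/Time/text values passed in are illustrative assumptions; the element and attribute names come from the XML in the question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class AppendMessageSketch {
    static Document parse(String xml) throws Exception {
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return db.parse(new InputSource(new StringReader(xml)));
    }

    // Appends a new <Message> to every <Contact> whose Number attribute equals someNumber.
    // Returns how many contacts were changed.
    static int addMessage(Document doc, String someNumber, String date, String time, String text) {
        int changed = 0;
        NodeList contacts = doc.getElementsByTagName("Contact");
        for (int i = 0; i < contacts.getLength(); i++) {
            Element contact = (Element) contacts.item(i);
            if (someNumber.equals(contact.getAttribute("Number"))) {
                Element message = doc.createElement("Message");
                message.setAttribute("Date", date);
                message.setAttribute("Time", time);
                message.setTextContent(text);
                contact.appendChild(message);
                changed++;
            }
        }
        return changed;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Messages>"
                + "<Contact Name=\"Robin\" Number=\"8775454554\">"
                + "<Message Date=\"24 Jan 2012\" Time=\"04:04\">this is report1</Message></Contact>"
                + "<Contact Name=\"Tobin\" Number=\"546456456\">"
                + "<Message Date=\"24 Jan 2012\" Time=\"04:04\">this is report2</Message></Contact>"
                + "</Messages>";
        Document doc = parse(xml);
        int changed = addMessage(doc, "546456456", "25 Jan 2012", "23:44", "this is report3");
        System.out.println("Contacts changed: " + changed);
    }
}
```

Writing the modified tree back to a file would need a separate step, for example a javax.xml.transform.Transformer.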
Use the Element class to create a new element
Element message = doc.createElement("Message");
message.setAttribute("message", strMessage);
Now add this element after whatever element you want using
elem.getParentNode().insertBefore(message, elem.getNextSibling());
You might want to take a look at this tutorial; it's about exactly what you want to do.
I am trying to build a server that sends an XML file to a client. I am getting the info from a database and want to build the XML file from it.
But I have a problem with:
DocumentBuilder documentBuilder = null;
Document doc =documentBuilder.newDocument();
I am getting a NullPointerException. Here is my full code:
public void createXmlTree() throws Exception {
    //This method creates an element node
    DocumentBuilder documentBuilder = null;
    Document doc = documentBuilder.newDocument();
    Element root = doc.createElement("items");
    //adding a node after the last child node of the specified node.
    doc.appendChild(root);
    for (int i = 0; i < db.stories.size(); i++) {
        Element child = doc.createElement("item");
        root.appendChild(child);
        Element child1 = doc.createElement("title");
        child.appendChild(child1);
        Text text = doc.createTextNode(db.stories.get(i).title);
        child1.appendChild(text);
        //Comment comment = doc.createComment("Employee in roseindia");
        //child.appendChild(comment);
        Element child2 = doc.createElement("date");
        child.appendChild(child2);
        Text text2 = doc.createTextNode(db.stories.get(i).date);
        child2.appendChild(text2);
        Element child3 = doc.createElement("text");
        child.appendChild(child3);
        Text text3 = doc.createTextNode(db.stories.get(i).text);
        child3.appendChild(text3);
        root.appendChild(child3);
    }
}
Well yes, you would get a NullPointerException. You're calling a method on a null reference - very clearly, given that you've assigned the documentBuilder a null value on the line before. You need to get an instance of DocumentBuilder to start with. For example:
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder documentBuilder = factory.newDocumentBuilder();
Of course you are getting a NullPointerException: your DocumentBuilder is null.
Try instantiating it first.
// Step 1: create a DocumentBuilderFactory
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
// Step 2: create a DocumentBuilder
DocumentBuilder db = dbf.newDocumentBuilder();
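Applied to the method from the question, the fix might look roughly like this. The Story class is a hypothetical stand-in for whatever db.stories holds, and the appendTextChild helper is added only to cut repetition.

```java
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class StoryXmlSketch {
    // Hypothetical stand-in for the entries of db.stories in the question.
    static final class Story {
        final String title;
        final String date;
        final String text;

        Story(String title, String date, String text) {
            this.title = title;
            this.date = date;
            this.text = text;
        }
    }

    static Document createXmlTree(List<Story> stories) throws Exception {
        // Obtain a real DocumentBuilder from the factory instead of starting from null.
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder documentBuilder = factory.newDocumentBuilder();
        Document doc = documentBuilder.newDocument();

        Element root = doc.createElement("items");
        doc.appendChild(root);
        for (Story story : stories) {
            Element item = doc.createElement("item");
            root.appendChild(item);
            appendTextChild(doc, item, "title", story.title);
            appendTextChild(doc, item, "date", story.date);
            appendTextChild(doc, item, "text", story.text);
        }
        return doc;
    }

    // Creates <name>value</name> and appends it to parent.
    private static void appendTextChild(Document doc, Element parent, String name, String value) {
        Element child = doc.createElement(name);
        child.appendChild(doc.createTextNode(value));
        parent.appendChild(child);
    }

    public static void main(String[] args) throws Exception {
        Document doc = createXmlTree(List.of(new Story("Title", "2012-01-24", "Body")));
        System.out.println(doc.getElementsByTagName("item").getLength() + " item(s) created");
    }
}
```

One difference from the original loop: the sketch leaves the text element inside its item. The question's final root.appendChild(child3) would move that element out of the item and directly under the root, because appendChild relocates a node that is already in the tree.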
The others are right about DocumentBuilder. But may I offer another solution? Your servlet mostly deals with generating the XML itself, i.e. it produces a kind of markup. This is the purpose of JSP. You can implement a simple JSP page that contains a template of your XML and some code that inserts the dynamic data. This is much simpler and easier to maintain.
Yes, JSPs typically generate HTML, but nothing says they cannot generate XML or any other text format. Just do not forget to set the content type to text/xml.
Do you really need to write your XML manually?
Do you have the XSD of the XML you want to write?
Because, it would be easier to generate some classes using XJC/JAXB and use the marshaller to write your XML file.
I am currently parsing XHTML documents with a DOM parser, like:
final DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setValidating(false);
final DocumentBuilder db = dbf.newDocumentBuilder();
db.setEntityResolver(MY_ENTITY_RESOLVER);
db.setErrorHandler(MY_ERROR_HANDLER);
...
final Document doc = db.parse(inputSource);
And my problem is that when my document contains an entity reference like, for example:
<p>&euro;</p>
My parser creates a Text node for that content containing "€" instead of "&euro;". That is, it is resolving the entity in the way it is supposed to (the XHTML 1.0 Strict DTD links to the ENTITIES Latin1 DTD, which in turn establishes the equivalence of "&euro;" with "€").
The problem is, I don't want the parser to do such a thing. I would like to keep the "&euro;" text unmodified.
I've already tried with:
final DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setExpandEntityReferences(false);
But:
I don't like this because I fear this might make some parser implementations not navigate from the XHTML 1.0 Strict DTD to the ENTITIES Latin1 DTD and therefore not consider "&euro;" as a declared entity.
When I do this, it weirdly creates two nodes: a "pound" Entity node, and a Text node with the "€" symbol after it.
Any ideas? Is it possible to configure this in a DOM Parser without resorting to preprocessing the XHTML and substituting all "&" symbols for something other?...
Solutions could be for a DOM parser or also a SAX one, I wouldn't mind using SAX parsing and then creating my DOM using a transformation...
Also, I cannot switch to a non-standard XML parsing library. No jdom, no jsoup, no HtmlCleaner, etc.
Thanks a lot.
The approach I took was to replace any entities with a unique marker that is treated as plain text by Xerces. Once converted into a Document object, the markers are replaced with Entity Reference objects.
See the convertStringToDocument() function in http://sourceforge.net/p/commonclasses/code/14/tree/trunk/src/com/redhat/ecs/commonutils/XMLUtilities.java
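A simplified sketch of the marker idea, not the code from the linked class: the marker format [[ENTITY:name]] and all names below are assumptions. Unlike the approach described above, it restores the reference as plain text when a string is read back, rather than creating EntityReference nodes in the tree.

```java
import java.io.StringReader;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class EntityMarkerSketch {
    // Named entity references, excluding the five that XML predefines.
    private static final Pattern ENTITY =
            Pattern.compile("&(?!(?:amp|lt|gt|quot|apos);)([A-Za-z][A-Za-z0-9]*);");

    // Hypothetical marker format; it must never occur in the real content.
    private static final Pattern MARKER =
            Pattern.compile("\\[\\[ENTITY:([A-Za-z][A-Za-z0-9]*)\\]\\]");

    // Replaces each named reference (for example &euro;) with a plain-text marker.
    static String protect(String xml) {
        return ENTITY.matcher(xml).replaceAll("[[ENTITY:$1]]");
    }

    // Turns markers found in text taken from the DOM back into entity references.
    static String restore(String text) {
        return MARKER.matcher(text).replaceAll("&$1;");
    }

    // Parses the protected text; the parser sees only ordinary characters.
    static Document parseProtected(String xml) throws Exception {
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return db.parse(new InputSource(new StringReader(protect(xml))));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parseProtected("<p>&euro;</p>");
        String text = doc.getDocumentElement().getTextContent();
        System.out.println(restore(text));
    }
}
```

Because the substitution is a blind text replacement, it would also rewrite references inside CDATA sections and comments, which a fuller implementation would need to handle.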
I inherited an "XML" license file containing no root element, but rather two XML fragments (<XmlCreated> and <Product>), so when I try to parse the file, I (as expected) get an error about the document not being well-formed.
I need to get both the XmlCreated and Product tags.
Sample XML file:
<?xml version="1.0"?>
<XmlCreated>May 11 2009</XmlCreated>
<!-- License Key file Attributes -->
<Product image ="LicenseKeyFile">
<!-- MyCompany -->
<Manufacturer ID="7f">
<SerialNumber>21072832521007</SerialNumber>
<ChassisId>72060034465DE1C3</ChassisId>
<RtspMaxUsers>500</RtspMaxUsers>
<MaxChannels>8</MaxChannels>
</Manufacturer>
</Product>
Here is the current code that I use to attempt to load the XML. It does not work, but I've used it before as a starting point for well-formed XML.
public static void main(String[] args) {
    try {
        File file = new File("C:\\path\\LicenseFile.xml");
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder db = dbf.newDocumentBuilder();
        Document doc = db.parse(file);
    } catch (Exception e) {
        e.printStackTrace();
    }
}
At the db.parse(file) line, I get the following Exception:
[Fatal Error] LicenseFile.xml:6:2: The markup in the document following the root element must be well-formed.
org.xml.sax.SAXParseException: The markup in the document following the root element must be well-formed.
at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(Unknown Source)
at javax.xml.parsers.DocumentBuilder.parse(Unknown Source)
at com.mycompany.licensesigning.LicenseSigner.main(LicenseSigner.java:20)
How would I go about parsing this frustrating file?
If you know this document is always going to be non-well-formed, make it well-formed yourself. Add a new dummy <root> tag after the <?xml...> declaration and </root> after the last of the data.
You're going to need to create two separate Document objects by breaking the file up into smaller pieces and parsing those pieces individually (or alternatively reconstructing them into a larger document by adding a tag which encloses both of them).
If you can rely on the structure of the file it should be easy to read the file into a string and then search for substrings like <Product and </Product> and then use those markers to create a string you can pass into a document builder.
How about implementing a simple wrapper around InputStream that wraps the input from the file with a root-level tag, and using that as the input to DocumentBuilder.parse()?
If the expected input is small enough to load into memory, read into a string, wrap it with a dummy start/end tag and then use:
DocumentBuilder.parse(new InputSource(new StringReader(string)))
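A small sketch of the read-into-a-string variant. It assumes the file fits in memory, and it strips a leading XML declaration first, since a declaration is only allowed at the very start of a document and would otherwise end up inside the dummy root. The class name, method name and the tag name root are arbitrary.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class WrapFragmentsSketch {
    // Wraps the fragments in a dummy <root> element and parses the result.
    static Document parseFragments(String content) throws Exception {
        // Drop a leading <?xml ...?> declaration if there is one.
        String body = content.replaceFirst("^\\s*<\\?xml[^>]*\\?>", "");
        String wrapped = "<root>" + body + "</root>";

        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return db.parse(new InputSource(new StringReader(wrapped)));
    }

    public static void main(String[] args) throws Exception {
        String content = "<?xml version=\"1.0\"?>\n"
                + "<XmlCreated>May 11 2009</XmlCreated>\n"
                + "<Product image =\"LicenseKeyFile\"><Manufacturer ID=\"7f\"/></Product>";
        Document doc = parseFragments(content);
        System.out.println(doc.getElementsByTagName("XmlCreated").item(0).getTextContent());
    }
}
```

The file content itself could be loaded with something like Files.readString (Java 11 and later) before being passed in.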
I'd probably create a SequenceInputStream where you sandwich the real stream with two ByteArrayInputStreams that return some dummy root start tag, and end tag.
Then I'd use the parse method that takes a stream rather than a file name.
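The SequenceInputStream idea might be sketched like this. Note the assumptions: the incoming stream must not start with an <?xml ...?> declaration (for the sample file it would have to be skipped first), the content is assumed to be UTF-8 compatible, and the tag name root is arbitrary.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class SequenceStreamSketch {
    // Sandwiches the fragment stream between a dummy start tag and end tag, then parses it.
    // Assumes the fragment stream contains no XML declaration.
    static Document parseWithDummyRoot(InputStream fragments) throws Exception {
        InputStream open = new ByteArrayInputStream("<root>".getBytes(StandardCharsets.UTF_8));
        InputStream close = new ByteArrayInputStream("</root>".getBytes(StandardCharsets.UTF_8));
        InputStream wrapped = new SequenceInputStream(new SequenceInputStream(open, fragments), close);

        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return db.parse(wrapped);
    }

    public static void main(String[] args) throws Exception {
        String fragments = "<XmlCreated>May 11 2009</XmlCreated>"
                + "<Product image =\"LicenseKeyFile\"><Manufacturer ID=\"7f\"/></Product>";
        Document doc = parseWithDummyRoot(
                new ByteArrayInputStream(fragments.getBytes(StandardCharsets.UTF_8)));
        System.out.println(doc.getDocumentElement().getChildNodes().getLength() + " top-level node(s)");
    }
}
```

Compared with building one big string, this keeps the file content streaming, which matters only for large inputs.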
I agree with Jim Garrison to some extent: use an InputStream or StreamReader and wrap the input in the required tags; it's a simple and easy method. The main problem I can foresee is that you'll need some checks for valid and invalid formatting (if you want to be able to use the method for both valid and invalid data): if the formatting is invalid because the root-level tags are missing, wrap the input with the tags; if it's valid, don't wrap the input. If the input is invalid for some other reason, you can also alter the input to correct the formatting issues.
Also, it's probably better to store the input in a collection of strings (of some sort) rather than in a single string; this means you won't have as much of a limit on your input size. Make each string one line from the file. You should end up with a logical and easy-to-follow structure, which will make it easier to allow for corrections of other formatting issues in the future.
The hardest part about that is figuring out what has caused the invalid formatting. In your case, just check for root-level tags: if the tags exist and are formatted correctly, don't wrap; if not, wrap.