Java, Xerces 2.9.1
insertHere.setAttributeNS(XMLConstants.XML_NS_URI, "xml:space", "preserve");
and
insertHere.setAttributeNS(XMLConstants.XML_NS_URI, "space", "preserve")
both end up with an attribute of just space='preserve', no XML prefix.
insertHere.setAttribute( "xml:space", "preserve")
works, but it seems somehow wrong. Am I missing anything?
EDIT
I checked.
I read a template document in with setNamespaceAware turned on.
I then use the following to make a copy of it, and then I start inserting new elements.
public static Document copyDocument(Document input) {
DocumentType oldDocType = input.getDoctype();
DocumentType newDocType = null;
Document newDoc;
String oldNamespaceUri = input.getDocumentElement().getNamespaceURI();
if (oldDocType != null) {
// cloning doctypes is 'implementation dependent'
String oldDocTypeName = oldDocType.getName();
newDocType = input.getImplementation().createDocumentType(oldDocTypeName,
oldDocType.getPublicId(),
oldDocType.getSystemId());
newDoc = input.getImplementation().createDocument(oldNamespaceUri, oldDocTypeName,
newDocType);
} else {
newDoc = input.getImplementation().createDocument(oldNamespaceUri,
input.getDocumentElement().getNodeName(),
null);
}
Element newDocElement = (Element)newDoc.importNode(input.getDocumentElement(), true);
newDoc.replaceChild(newDocElement, newDoc.getDocumentElement());
return newDoc;
}
When I run the following code:
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
public class Demo {
public static void main(String[] args) throws Exception {
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setNamespaceAware(true);
DocumentBuilder db = dbf.newDocumentBuilder();
Document document = db.newDocument();
Element rootElement = document.createElement("root");
document.appendChild(rootElement);
rootElement.setAttributeNS(XMLConstants.XML_NS_URI, "space", "preserve");
TransformerFactory tf = TransformerFactory.newInstance();
Transformer t = tf.newTransformer();
t.transform(new DOMSource(document), new StreamResult(System.out));
}
}
I get the following output:
<root xml:space="preserve"/>
How are you building your document?
Related
I have some kind of complex XML data structure. The structure contains different fragments like in the following example:
<data>
<content-part-1>
<h1>Hello <strong>World</strong>. This is some text.</h1>
<h2>.....</h2>
</content-part1>
....
</data>
The h1 tag within the tag 'content-part-1' is of interest. I want to get the full content of the xml tag 'h1'.
In java I used the javax.xml.parsers.DocumentBuilder and tried something like this:
String my_content="<h1>Hello <strong>World</strong>. This is some text.</h1>";
// parse h1 tag..
DocumentBuilder documentBuilder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
Document doc = documentBuilder.parse(new InputSource(new StringReader(my_content)));
Node node = doc.importNode(doc.getDocumentElement(), true);
if (node != null && node.getNodeName().equals("h1")) {
return node.getTextContent();
}
But the method 'getTextContent()' will return:
Hello World. This is some text.
The tag "strong" is removed by the xml parser (as it is the documented behavior).
My question is how I can extract the full content of a single XML Node within a org.w3c.dom.Document without any further parsing the node content?
Although java DOM parser provides functionality for parsing mixed content, in this particular case it could be more convenient to use Jsoup library. When using it code to extract h1 element content would be as follows:
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
String text = "<data>\n"
+ " <content-part1>\n"
+ " <h1>Hello <strong>World</strong>. This is some text.</h1>\n"
+ " <h2></h2>\n"
+ " </content-part1>\n"
+ "</data>";
Document doc = Jsoup.parse(text);
Elements h1Elements = doc.select("h1");
for (Element h1 : h1Elements) {
System.out.println(h1.html());
}
Output in this case will be "Hello <strong>World</strong>. This is some text."
What you probaly want is XML generation from some subnode of your document.
So with slighlty modified nodeToString from earlier answer to similar question I can propose to
generate text <h1>Hello <strong>World</strong>. This is some text.</h1>. Some extra effor might be needed to get rid of <h1> and </h1>
package com.github.vtitov.test;
import org.junit.Test;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import java.io.StringReader;
import java.io.StringWriter;
import static org.hamcrest.MatcherAssert.*;
import static org.hamcrest.Matchers.*;
public class XmlTest {
#Test
public void buildXml() throws Exception {
String my_content="<h1>Hello <strong>World</strong>. This is some text.</h1>";
// parse h1 tag..
DocumentBuilder documentBuilder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
Document doc = documentBuilder.parse(new InputSource(new StringReader(my_content)));
Node node = doc.importNode(doc.getDocumentElement(), true);
String h1Content = null;
if (node != null && node.getNodeName().equals("h1")) {
h1Content = nodeToString(node);
}
assertThat("h1", h1Content, equalTo("<h1>Hello <strong>World</strong>. This is some text.</h1>"));
}
private static String nodeToString(Node node) throws TransformerException {
StringWriter sw = new StringWriter();
Transformer t = TransformerFactory.newInstance().newTransformer();
t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
t.setOutputProperty(OutputKeys.INDENT, "no");
t.transform(new DOMSource(node), new StreamResult(sw));
return sw.toString();
}
}
I have the below code that appends to an xml file. The problem is that it includes the xml configuration line each time it appends along with the main element. It works fine for the first record added to the file because there is not previous existing elements. How would I modify this code to exclude those lines for an existing file with existing elements?
import java.io.FileWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.SAXException;
public class WriteXMLFile {
public static void main() throws ParserConfigurationException,SAXException,Exception{
DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();//
DocumentBuilder docBuilder = docFactory.newDocumentBuilder();//
Document doc = docBuilder.newDocument();// this is difrent
Element rootElement = doc.createElement("Contacts");//
doc.appendChild(rootElement); // this is difrent
Element Contact1 = doc.createElement("Contact1");
rootElement.appendChild(Contact1);
Contact1.setAttribute("id","1");
Element firstname = doc.createElement("Name");
firstname.appendChild(doc.createTextNode(EmailFrame.name.getText()));
Contact1.appendChild(firstname);
//Email Element
Element email = doc.createElement("Email");
email.appendChild(doc.createTextNode(EmailFrame.email.getText()));
Contact1.appendChild(email);
// phone element
Element phone= doc.createElement("Phone");
phone.appendChild(doc.createTextNode(EmailFrame.phone.getText()));
Contact1.appendChild(phone);
//id element
Element id = doc.createElement("ID");
id.appendChild(doc.createTextNode(EmailFrame.id.getText()));
Contact1.appendChild(id);
try{
// write the content into xml file
TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = transformerFactory.newTransformer();
DOMSource source = new DOMSource(doc);
StreamResult result = new StreamResult(new FileWriter("C:/Users/steve/Desktop/xmlemail/Email.xml",true));
transformer.transform(source, result);
System.out.println("File saved!");
}
catch (TransformerException tfe) {
tfe.printStackTrace();
}
}}
If you are looking for "overriding" the file then you shall use
StreamResult result = new StreamResult(new FileWriter("D:/tmp/Email.xml")); as boolean parameter just appends to the document.
If you indeed want to append but don't want <?xml version="1.0" encoding="UTF-8" standalone="no"?> then you will have to "read" the file, get the root node and then add elements to the root node.
Remember if you just keep on adding nodes to the document and not to the root element, then even without <?xml version="1.0" encoding="UTF-8" standalone="no"?>, your xml will be invalid, as it will contain multiple root elements "contacts".
Something like:
Document doc = null;
Node rootElement = null;
if (f.exists()) {
doc = docBuilder.parse("C:/Users/steve/Desktop/xmlemail/Email.xml");
rootElement = doc.getFirstChild();//
} else {
doc = docBuilder.newDocument();// this is difrent
rootElement = doc.createElement("Contacts");//
doc.appendChild(rootElement); // this is difrent
}
And then save the document to file without appending.
I need to create a XML file using java, I got the file as I like but i am missing relation tag before every end tag, How Can I get that
Expected File:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<FlyBoy>
<learJet>CL-215</learJet>
<rank>2</rank>
<FlyBoy>
<viper>Mark II</viper>
<rank>1</rank>
<FlyBoy>
<viper>Mark II4455</viper>
<rank>2</rank>
<FlyBoy>
<viper>Mark II56666</viper>
<rank>3</rank>
<relation name="Date" table="Sam"/>
</FlyBoy>
<relation name="Date" table="Mark"/>
</FlyBoy>
<relation name="Date" table="sechma"/>
</FlyBoy>
<relation name="Date" table="John"/>
</FlyBoy>
Output I got:
<FlyBoy><learJet>CL-215</learJet><rank>2</rank><FlyBoy><viper>Mark II</viper><rank>1</rank><FlyBoy><viper>Mark II4455</viper><rank>2</rank><FlyBoy><viper>Mark II56666</viper><rank>3</rank></FlyBoy></FlyBoy></FlyBoy></FlyBoy>
Code:
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.TransformerFactoryConfigurationError;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
public class XmlGenerator {
/**
* Render flyboy
*
*/
private Element renderFlyBoy(Element parent, String viper, String rank) {
Element flyBoyEl = document.createElement("FlyBoy");
parent.appendChild(flyBoyEl);
Element viperEl = document.createElement("viper");
viperEl.setTextContent(viper);
flyBoyEl.appendChild(viperEl);
Element rankEl = document.createElement("rank");
rankEl.setTextContent(rank);
flyBoyEl.appendChild(rankEl);
return flyBoyEl;
}
// Test
public static void main(String[] args) {
try {
Document document = null;
Element root = null;
DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
document = documentBuilderFactory.newDocumentBuilder().newDocument();
root = document.createElement("FlyBoy");
document.appendChild(root);
Element learJet = document.createElement("learJet");
learJet.setTextContent("CL-215");
root.appendChild(learJet);
Element rank = document.createElement("rank");
rank.setTextContent("2");
root.appendChild(rank);
Element flyBoy1 = renderFlyBoy(root, "Mark II", "1");
Element flyBoy2 = renderFlyBoy(flyBoy1, "Mark II4455", "2");
Element flyBoy3 = renderFlyBoy(flyBoy2, "Mark II56666", "3");
Element relation_schema= doc.createElement("relation");
flyBoy1.appendChild(relation_schema);
Attr join2 = doc.createAttribute("name");
join2.setValue("Date");
relation_schema.setAttributeNode(join2);
Attr type = doc.createAttribute("table");
type.setValue("xxxxxx");
relation_schema.setAttributeNode(type);
DOMSource domSource = new DOMSource(document);
Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.setOutputProperty(OutputKeys.METHOD, "xml");
StreamResult result = new StreamResult(new File("my.xml"));
transformer.transform(domSource, result);
} catch (ParserConfigurationException | TransformerFactoryConfigurationError | TransformerException e) {
e.printStackTrace();
}
System.out.println("done...");
} catch (Exception e) {
e.printStackTrace();
}
}
}
Inside the renderFlyBoy method you need to create the relation in the same way you created others. Then relation will be added inside FlyBoy
Element rankEl = document.createElement("relation");
// code to add attributes
flyBoyEl.appendChild(rankEl);
How to update the XML node from <ns0:Request> to <ns1:Request xmlns:ns1="with some URL">
in Java using DOM parser.
I do not want to use the replaceAll() method.
You can do the following using the Node.replaceChild(Node, Node) method:
http://docs.oracle.com/javase/7/docs/api/org/w3c/dom/Node.html#replaceChild%28org.w3c.dom.Node,%20org.w3c.dom.Node%29
import javax.xml.parsers.*;
import javax.xml.transform.*;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
public class Demo {
public static void main(String[] args) throws Exception {
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setNamespaceAware(true);
DocumentBuilder db = dbf.newDocumentBuilder();
// Create original document
Document document = db.newDocument();
Element root = document.createElementNS("urn:FOO", "ns0:Root");
document.appendChild(root);
Element request = document.createElementNS("urn:FOO", "ns0:Request");
root.appendChild(request);
// Create new Request element.
Element newRequest = document.createElementNS("urn:BAR", "ns1:Request");
// Replace Request element
root.replaceChild(newRequest, request);
// Output the new document
TransformerFactory tf = TransformerFactory.newInstance();
Transformer t = tf.newTransformer();
DOMSource source = new DOMSource(document);
StreamResult result = new StreamResult(System.out);
t.transform(source, result);
}
}
In my XML file I have some entities such as ’
So I have created a DTD tag for my XML document to define these entities. Below is the Java code used to read the XML file.
SAXBuilder builder = new SAXBuilder();
URL url = new URL("http://127.0.0.1:8080/sample/subject.xml");
InputStream stream = url.openStream();
org.jdom.Document document = builder.build(stream);
Element root = document.getRootElement();
Element name = root.getChild("name");
result = name.getText();
System.err.println(result);
How can I change the Java code to retrieve a DTD over HTTP to allow the parsing of my XML document to be error free?
Simplified example of the xml document.
<main>
<name>hello ‘ world ’ foo & bar </name>
</main>
One way to do this would be to read the document and then validate it with the transformer:
import java.net.URL;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
public class ValidateWithExternalDTD {
private static final String URL = "http://127.0.0.1:8080/sample/subject.xml";
private static final String DTD = "http://127.0.0.1/YourDTD.dtd";
public static void main(String args[]) {
try {
DocumentBuilderFactory factory= DocumentBuilderFactory.newInstance();
factory.setValidating(true);
DocumentBuilder builder = factory.newDocumentBuilder();
// Set the error handler
builder.setErrorHandler(new org.xml.sax.ErrorHandler() {
public void fatalError(SAXParseException spex)
throws SAXException {
// output error and exit
spex.printStackTrace();
System.exit(0);
}
public void error(SAXParseException spex)
throws SAXParseException {
// output error and continue
spex.printStackTrace();
}
public void warning(SAXParseException spex)
throws SAXParseException {
// output warning and continue
spex.printStackTrace();
}
});
// Read the document
URL url = new URL(ValidateWithExternalDTD.URL);
Document xmlDocument = builder.parse(url.openStream());
DOMSource source = new DOMSource(xmlDocument);
// Use the tranformer to validate the document
StreamResult result = new StreamResult(System.out);
TransformerFactory tf = TransformerFactory.newInstance();
Transformer transformer = tf.newTransformer();
transformer.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM, ValidateWithExternalDTD.DTD);
transformer.transform(source, result);
// Process your document if everything is OK
} catch (Exception ex) {
ex.printStackTrace();
}
}
}
Another way would be to replace the XML title with the XML title plus the DTD reference
Replace this:
<?xml version = "1.0"?>
with this:
<?xml version = "1.0"?><!DOCTYPE ...>
Of course you would replace the first occurance only and not try to go through the whole xml document
You have to instantiate the SAXBuilder by passing true(validate) to its constructor:
SAXBuilder builder = new SAXBuilder(true);
or call:
builder.setValidation(true)