Setting Namespace Attributes on an Element - java

I'm trying to create an XML document in Java that contains the following Element:
<project xmlns="http://www.imsglobal.org/xsd/ims_qtiasiv1p2"
xmlns:acme="http://www.acme.com/schemas"
color="blue">
I know how to create the project Node. I also know how to set the color attribute using
element.setAttribute("color", "blue");
Do I set the xmlns and xmlns:acme attributes the same way using setAttribute() or do I do it in some special way since they are namespace attributes?

I believe that you have to use:
element.setAttributeNS("http://www.w3.org/2000/xmlns/", "xmlns:acme", "http://www.acme.com/schemas");
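A minimal sketch of that call in context, assuming a namespace-aware JAXP DocumentBuilder. The class name NamespaceAttributeSketch and the method buildProject() are made up for illustration; XMLConstants.XMLNS_ATTRIBUTE_NS_URI is the standard constant for "http://www.w3.org/2000/xmlns/".

```java
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper class, not part of any library.
final class NamespaceAttributeSketch {
    static final String QTI_NS = "http://www.imsglobal.org/xsd/ims_qtiasiv1p2";
    static final String ACME_NS = "http://www.acme.com/schemas";

    static Element buildProject() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().newDocument();

            // Put the element itself in the default namespace.
            Element project = doc.createElementNS(QTI_NS, "project");

            // Namespace declarations are attributes in the reserved xmlns namespace.
            project.setAttributeNS(XMLConstants.XMLNS_ATTRIBUTE_NS_URI, "xmlns", QTI_NS);
            project.setAttributeNS(XMLConstants.XMLNS_ATTRIBUTE_NS_URI, "xmlns:acme", ACME_NS);

            // Ordinary attributes still use plain setAttribute().
            project.setAttribute("color", "blue");

            doc.appendChild(project);
            return project;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The declaration can then be read back with getAttributeNS("http://www.w3.org/2000/xmlns/", "acme"), since the local name of xmlns:acme is acme.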

I do not think the code below answers the question:
myDocument.createElementNS("http://www.imsglobal.org/xsd/ims_qtiasiv1p2","project");
This creates a DOM element whose name is qualified by the namespace URI, conceptually something like:
<http://www.imsglobal.org/xsd/ims_qtiasiv1p2:project>
It does not add a namespace attribute to the element in the DOM. To add one explicitly using DOM, we can do something like this:
Element request = doc.createElement("project");
Attr attr = doc.createAttribute("xmlns");
attr.setValue("http://www.imsglobal.org/xsd/ims_qtiasiv1p2");
request.setAttributeNode(attr);
This sets the first attribute as shown below; you can set further attributes in the same way.
<project xmlns="http://www.imsglobal.org/xsd/ims_qtiasiv1p2">

The short answer is: you do not create xmlns attributes yourself. The Java XML class library automatically creates those. By default, it will auto-create namespace mappings and will choose prefixes based on some internal algorithm.
If you don't like the default prefixes assigned by the Java XML serializer, you can control them by creating your own namespace resolver, as explained in this article:
https://www.intertech.com/Blog/jaxb-tutorial-customized-namespace-prefixes-example-using-namespaceprefixmapper/

You can simply specify the namespace when you create the elements. For example:
myDocument.createElementNS("http://www.imsglobal.org/xsd/ims_qtiasiv1p2","project");
Then the Java DOM libraries will handle the namespace declarations for you when the document is serialized.
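To check this claim, here is a small sketch that creates the element with createElementNS() and serializes it with the default JAXP Transformer. The class name CreateElementNsSketch and the method serializeProject() are invented for illustration. With the JDK's built-in Transformer I would expect the xmlns declaration to appear in the output even though no xmlns attribute was ever set on the DOM element.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

// Hypothetical helper class, not part of any library.
final class CreateElementNsSketch {
    static final String QTI_NS = "http://www.imsglobal.org/xsd/ims_qtiasiv1p2";

    static String serializeProject() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().newDocument();

            // Only the namespace URI and name are given; no xmlns attribute is added.
            doc.appendChild(doc.createElementNS(QTI_NS, "project"));

            // Identity transform: DOM in, text out.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```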

The only way that worked for me, in 2019, was using the attr() method (note that attr() belongs to jsoup's Element class, not to org.w3c.dom.Element):
Element element = doc.createElement("project");
element.attr("xmlns","http://www.imsglobal.org/xsd/ims_qtiasiv1p2");

Related

Java XSLT transformer with default namespace without xmlns

I'm working on some Java code that takes XML in DOM, with no namespace prefixes declared, yet each element has a namespace of http://www.w3.org/1999/xhtml. (This is equivalent to the HTML DOM a browser gets.) The code uses the following to serialize the DOM to a string:
TransformerFactory tf = TransformerFactory.newInstance();
Transformer transformer = tf.newTransformer();
The resulting string looks like this:
…
<html xmlns="http://www.w3.org/1999/xhtml">
…
Note the presence of xmlns="http://www.w3.org/1999/xhtml", which the DOM did not have. In terms of XML, this is entirely correct: if the element uses a namespace (even without a prefix), the namespace must be declared on that element or an ancestor element; and this being the document element, the namespace declaration must go here.
However, HTML is a slightly different story. The WHATWG HTML5 Specification § 2.1.3 XML compatibility says:
To ease migration from HTML to XML, user agents conforming to this specification will place elements in HTML in the http://www.w3.org/1999/xhtml namespace, at least for the purposes of the DOM and CSS.
In other words, HTML browsers will assume the http://www.w3.org/1999/xhtml namespace even without a namespace declaration. Typical clean HTML will not have a namespace declaration, and for this particular use case a namespace declaration is not required.
How can I tell a transformer not to add a default namespace declaration for the document? Alternatively, how can I remove it later without resorting to brute force such as regular expression matching?
Internally the com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl creates a com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO instance, which eventually calls com.sun.org.apache.xml.internal.serializer.ToStream.startPrefixMapping(String prefix, String uri, boolean shouldFlush). Here is the "offending" code that adds the xmlns="http://www.w3.org/1999/xhtml" on the document element:
if (EMPTYSTRING.equals(prefix))
{
name = "xmlns";
addAttributeAlways(XMLNS_URI, name, name, "CDATA", uri, false);
}
But to be more precise, I see that the actual adding of the attribute is done by com.sun.org.apache.xml.internal.serializer.AttributesImplSerializer.addAttribute(String uri, String local, String qname, String type, String val). This class extends org.xml.sax.helpers.AttributesImpl and implements org.xml.sax.Attributes.
Is there some way I can splice my own customized Attributes implementation into a Transformer, so that I can check this special case and forgo adding the xmlns="http://www.w3.org/1999/xhtml" attribute in the appropriate context?
I suppose as a last resort, is there a way to tell the Transformer to be namespace aware, but never to add xmlns declarations that weren't already in the DOM?
(For those who insist on asking where I got a DOM with an HTML namespace without an xmlns declaration, it's irrelevant. Let's assume that I constructed an XML DOM instance programmatically but want to output it as "clean" HTML5, so I remove the default xmlns attribute, but the Transformer is putting it back.)
(Full disclosure: I'm actually fixing a bug in jsoup, which is an HTML parser that accepts dirty HTML as found in the wild and presents it to the application in DOM as a browser would. I fixed a bug in which elements were not assigned the HTML namespace, which they should have even without a namespace declaration. Now the existing W3CDom.asString(Document doc) serializer method tries to add the xmlns namespace declaration, but users are accustomed to it returning an HTML serialization without the xmlns (which for HTML5 isn't wrong). So I'm trying to keep from breaking code that relies on the original "clean" HTML serialization without rewriting the serializer.)
The following is an ugly kludge, but given the constraints I don't see an alternative. I welcome a better approach!
/**
 * Pattern to detect the <code>xmlns="http://www.w3.org/1999/xhtml"</code> default namespace
 * declaration when serializing the DOM to HTML. This pattern is "good enough", relying in part
 * on the output of the {@link Transformer} used in the implementation, but is not a complete
 * solution for all the serializations possible; that is, if one constructed an XML string
 * manually, it might be possible to find an obscure variation that this pattern would not
 * match.
 */
static final Pattern HTML_DEFAULT_NAMESPACE_PATTERN =
    Pattern.compile("<html[^>]*(\\sxmlns=['\"]http://www.w3.org/1999/xhtml['\"])");
/**
 * Removes the default <code>xmlns="http://www.w3.org/1999/xhtml"</code> HTML namespace
 * declaration if present in the string.
 *
 * @param html The serialized HTML.
 * @return A string without the default <code>xmlns="http://www.w3.org/1999/xhtml"</code> HTML
 *         namespace declaration.
 * @see <a href="https://github.com/jhy/jsoup/issues/1837">Issue #1837: Bug: DOM elements not
 *      being placed in (X)HTML namespace.</a>
 */
static String removeDefaultHtmlNamespaceDeclaration(String html) {
    Matcher matcher = HTML_DEFAULT_NAMESPACE_PATTERN.matcher(html);
    if (matcher.find()) {
        // Cut out only the captured xmlns declaration, keeping the rest of the <html> tag.
        html = html.substring(0, matcher.start(1)) + html.substring(matcher.end(1));
    }
    return html;
}
It looks to me as if the DOM was created by an application that put the nodes in the XHTML namespace, and therefore the serializer is entirely correct to serialize them in that namespace. From your description, the application did that because it was parsing HTML5 and that's what the HTML5 specification says it should do.
Part of the problem is that you're using an XSLT 1.0 serializer, and XSLT 1.0 predates XHTML and certainly predates HTML5. Unfortunately, just because W3C or WHATWG issues a proclamation doesn't mean that everyone changes their software. You may have better luck using an XSLT 3.0 serializer (Saxon) with the HTML5 output method, but I don't know what your project constraints are.
It seems to me that since you're relying on some other code to perform the serialization, you will actually need to modify all the elements so that their names are not in the XHTML namespace (if you want a non-kludge solution).
You can recursively traverse all the elements in the document starting from the root element, and use the renameNode method of the DOM Document object to rename them all to have a name which is the same as their old local name, but with no namespace URI.
document.renameNode(element, null, element.getLocalName());
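The recursive traversal could be sketched roughly as follows. This is only an illustration under assumptions: the class StripNamespaceSketch and its methods are invented names, the tiny html/body document stands in for the real DOM, and the output method is set to "xml" just to keep the serialization predictable.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical helper class, not part of jsoup or the JDK.
final class StripNamespaceSketch {
    static final String XHTML_NS = "http://www.w3.org/1999/xhtml";

    /** Renames every element at or below the given node to its local name with no namespace. */
    static void stripNamespace(Document doc, Node node) {
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            String local = node.getLocalName() != null ? node.getLocalName() : node.getNodeName();
            // renameNode may return a replacement node, so continue with its result.
            node = doc.renameNode(node, null, local);
        }
        Node child = node.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // capture before the child is possibly replaced
            stripNamespace(doc, child);
            child = next;
        }
    }

    /** Builds a small XHTML-namespaced DOM, strips the namespace, and serializes it. */
    static String demo() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().newDocument();
            Element html = doc.createElementNS(XHTML_NS, "html");
            html.appendChild(doc.createElementNS(XHTML_NS, "body"));
            doc.appendChild(html);

            stripNamespace(doc, doc);

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            transformer.setOutputProperty(OutputKeys.METHOD, "xml");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because the elements no longer have a namespace URI, the serializer has nothing to declare on the document element. The trade-off is that the DOM itself is modified, so this is best done on a copy if the namespaced DOM is still needed afterwards.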
You could do the same renaming in XSLT, but I'm guessing you're probably more comfortable doing it in Java.

How TargetNamespace is different from Namespace [duplicate]

I'm new to XSD and am learning about namespaces. I have read most of the existing questions and blog posts, but none of them helped me understand this in a simple way.
What I have understood so far is that namespaces are used to differentiate elements that have the same name. (I am not clear about targetNamespace.)
My understanding of how a namespace is declared is as follows:
xmlns:foo="URI" --> Namespace (a unique token, I would say, that is responsible for differentiating elements). The declaration is given a name, and that name is then used in the form Prefix:ElementName --> Prefix.
I have got one example
<foo:tag xmlns:foo="http://me.com/namespaces/foofoo"
xmlns:bar="http://me.com/namespaces/foobar"
>
<foo:head>
<foo:title>An example document</foo:title>
</foo:head>
<bar:body>
<bar:e1>a simple document</bar:e1>
<bar:e2>
Another element
</bar:e2>
</bar:body>
</foo:tag>
If we want to use multiple namespaces in an XSD, we can declare them once, as in the example above, where the same element (foo:tag) declares multiple namespaces:
foo:tag --->xmlns:foo="http://me.com/namespaces/foofoo"
foo:tag --->xmlns:bar="http://me.com/namespaces/foobar"
Is this like Java, where a package can contain multiple classes and each class has its own attributes (in the case of XML, its elements)? Am I correct? Can anyone help me understand targetNamespace?
The targetNamespace is the namespace that is going to be assigned to the schema you are creating or the namespace that this schema is intended to target, or validate. It is the namespace an instance is going to use to access the types it declares.
For example:
<schema xmlns="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.bestxml.com/jswbook/po">
...
</schema>
In an XML document instance, you declare the namespaces you are going to be using by means of the xmlns attribute
<purchaseOrder xmlns="http://www.bestxml.com/jswbook/po"
xmlns:addr="http://www.bestxml.com/jwsbook/addr">
<accountName>Shanita</accountName>
<accountNumber>123456</accountNumber>
<addr:street>20 King St</addr:street>
</purchaseOrder>
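One way to see the effect of targetNamespace from Java is to validate instances with javax.xml.validation. The sketch below is only an illustration: the class TargetNamespaceSketch, the method isValid(), and the cut-down schema (a single accountName child, with elementFormDefault="qualified") are assumptions and not part of the original example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical helper class, not part of any library.
final class TargetNamespaceSketch {
    // Assumed, simplified schema: everything it declares lives in the target namespace.
    static final String XSD =
        "<schema xmlns='http://www.w3.org/2001/XMLSchema'"
        + " targetNamespace='http://www.bestxml.com/jswbook/po'"
        + " elementFormDefault='qualified'>"
        + "<element name='purchaseOrder'>"
        + "<complexType><sequence>"
        + "<element name='accountName' type='string'/>"
        + "</sequence></complexType>"
        + "</element>"
        + "</schema>";

    static Schema loadSchema() {
        try {
            return SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("schema itself is broken", e);
        }
    }

    /** Returns true if the instance document validates against the schema above. */
    static boolean isValid(String xml) {
        try {
            loadSchema().newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // validation error, e.g. element not in the target namespace
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

An instance whose purchaseOrder element is in http://www.bestxml.com/jswbook/po validates, while the same markup with no xmlns declaration does not, because the schema only declares elements in its target namespace.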

Can't register namespace with XMLUnit

I can't seem to work out how to set a namespace when comparing XML documents using xmlunit-2.
I tried this:
@Test
public void testDiff_withIgnoreWhitespaces_shouldSucceed() {
    // prepare testData
    String controlXml = "<a><text:b>Test Value</text:b></a>";
    String testXml = "<a>\n <text:b>\n Test Value\n </text:b>\n</a>";
    Map<String, String> namespaces = new HashMap<String, String>();
    namespaces.put("text","urn:oasis:names:tc:opendocument:xmlns:text:1.0");
    // run test
    Diff myDiff = DiffBuilder.compare(Input.fromString(controlXml).build())
        .withTest(Input.fromString(testXml).build())
        .withNamespaceContext(namespaces)
        .ignoreWhitespace()
        .build();
    // validate result
    Assert.assertFalse("XML similar " + myDiff.toString(), myDiff.hasDifferences());
}
but always get
org.xmlunit.XMLUnitException: The prefix "text" for element "text:b"
is not bound.
Stripping the namespace prefix from the elements makes it work, but I would like to learn how to register the namespace properly with DiffBuilder.
I have the same problem (or ignorance) with xmlunit-1.x, so I would appreciate hints for that library as well.
EDIT, based on the answer
By adding the namespace attribute to the root node I managed to bind the namespace
<a xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0">
thanks Stefan
NamespaceContext is only used for the XPath associated with the "targets" of comparisons. It is not intended to be used to provide mappings for the XML documents you compare.
There is no way of binding XML namespaces to prefixes outside of the documents themselves in XMLUnit. This means you either must use xmlns attributes or not use prefixes at all.
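The underlying rule can be seen without XMLUnit at all. The sketch below uses a plain namespace-aware JAXP parser (not XMLUnit's API); the class PrefixBindingSketch and the method parses() are invented for illustration. It shows that a prefixed element is rejected unless the document itself binds the prefix.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class, not part of XMLUnit.
final class PrefixBindingSketch {
    /** Returns true if the string parses as namespace-well-formed XML. */
    static boolean parses(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // e.g. an unbound prefix such as "text"
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With this helper, "<a><text:b>Test Value</text:b></a>" is rejected, while the same document with xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0" on the root element is accepted.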

xstream: only parsing child element

I have the following XML:
<root>
<child-1>
</child-1>
<child-2>
<subchild-21>
</subchild-21>
</child-2>
</root>
My requirement is such that I only want to parse child-2. I am unaware of root and child-1.
Is it possible with xstream because I couldn't find a way to ignore root.
There are several ways to go, depending on your requirements.
If you know the name of the class to parse (child-2 here), you could look for the <child-2> and </child-2> entry in the XML, copy them along with the content in-between to a new temporary XML file (you can create temporary files using createTempFile() from the standard File class). This is the way I would suggest.
If you want to take out the child-2 instance without knowing its name, but you know the names of the surrounding classes, you could mock their classes, that is create classes of the same name, but without their specific content. In your example there is no content (might have been ignored at export time), but it's important to have the same member data in the mock classes for the import to succeed. (unless you use ignoreUnknownElements() as stated by Philipi Willemann)
Of course, if you're the one creating the XML, you should be able to export only the child-2 instance in the first place.
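As a variation on the first suggestion, the child-2 fragment can be cut out in memory instead of via a temporary file. The following is a rough sketch using the JDK's XPath and Transformer APIs rather than string searching; the class ExtractChildSketch and the method extract() are invented names, and the element name is assumed to be a plain XML name that is safe to splice into the XPath expression.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of XStream.
final class ExtractChildSketch {
    /** Returns the first element with the given name, serialized as XML, or null if absent. */
    static String extract(String xml, String elementName) {
        try {
            // Find the first matching element anywhere in the document.
            Node node = (Node) XPathFactory.newInstance().newXPath().evaluate(
                "//" + elementName,
                new InputSource(new StringReader(xml)),
                XPathConstants.NODE);
            if (node == null) {
                return null;
            }
            // Serialize only that subtree.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(node), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The returned string contains only the child-2 subtree, so it can be handed to xstream.fromXML(String) without XStream ever seeing root or child-1.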
If you know the root name, you can create a simple class that has a field of the class you have mapped to child-2:
@XStreamAlias("root")
class Root {
    @XStreamAlias("child-2")
    private Child2 child;
    //get and set
}
Then when you are processing the XML you can set XStream to ignore unknown elements with xstream.ignoreUnknownElements();

Custom name space JAXB, XML

I want to map the following XML using a custom namespace. I checked "How to have custom namespace prefix" but could not find an answer.
<p385:execute xmlns:p385="http://tal.myserver.com">
<version xsi:type="xsd:string">0.1.0</version>
<xmlData xsi:type="xsd:string">
.... xml encoded data
</xmlData>
</p385:execute>
How can I map this to a Java class?
Since only the root element is namespace-qualified, you just need to specify the namespace on the @XmlRootElement annotation for the class.
@XmlRootElement(namespace="http://tal.myserver.com")
public class Execute {
}
You can then suggest the prefix that should be used for the namespace using the package-level @XmlSchema annotation:
http://blog.bdoughan.com/2011/11/jaxb-and-namespace-prefixes.html
Use the wsimport tool to generate artifacts such as JAXB classes from a WSDL:
http://docs.oracle.com/javase/7/docs/technotes/tools/share/wsimport.html
http://jax-ws-commons.java.net/jaxws-maven-plugin/wsimport-mojo.html
