Why can we call methods of interface org.w3c.dom.Document? - java

I don't see any class implementing the methods of the interface org.w3c.dom.Document. So why can we (usually) call the getDocumentElement method of this interface to get the root element?

org.w3c.dom.Document is a part of XML specifications which can be implemented by many different libraries. If you want to know which exact implementation is used, try
org.w3c.dom.Document doc = <your instance>;
System.out.println(doc.getClass().getName());
at the same place where you call methods on it. That prints the name of the implementing class, which defines those methods (or inherits them from a superclass).
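A minimal sketch of that check, assuming the JDK's default JAXP factory is on the classpath. The class name DocumentImplementationProbe is hypothetical, and the printed class name depends on which parser your runtime selects (on a stock OpenJDK it is typically a Xerces-derived class, but that is not guaranteed).

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;

// Hypothetical helper class, for illustration only.
class DocumentImplementationProbe {

    /** Returns the name of the concrete class behind a freshly created Document. */
    static String implementationClassName() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .newDocument();
            return doc.getClass().getName();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No usable DOM parser found", e);
        }
    }

    public static void main(String[] args) {
        // Document itself is only an interface...
        System.out.println(Document.class.isInterface());
        // ...while the object you hold is an instance of some parser's class.
        System.out.println(implementationClassName());
    }
}
```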

The org.w3c.dom package and its interfaces are part of the Java API for XML Processing (JAXP). The package provides the Java language binding for the DOM Core API (Level 2 originally, Level 3 since Java 5).
The language binding merely exists to provide an interface that can be implemented by various DOM parsers. After all, different parsers will have different techniques to maintain the internal data structures that represent the DOM. Multiple JAXP parsers that comply with the DOM Core API can co-exist in the libraries available to the JVM. At runtime, only one of these will be utilized for parsing XML documents.
You can call the method once a suitable DOM parser that implements JAXP has read the contents of an XML document and populated its internal structures, making an instance of a class that implements Document available to you. In other words, the DOM parser is responsible for providing you with a Document object after parsing an XML document.
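As a rough illustration of that flow, the sketch below asks JAXP for whatever parser is configured, parses a small XML string, and calls getDocumentElement on the result. The class name RootElementExample and the sample XML are made up for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical example class, for illustration only.
class RootElementExample {

    /** Parses the given XML text and returns the tag name of its root element. */
    static String rootElementName(String xml) {
        try {
            // JAXP picks the parser; we only ever see the Document interface.
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTagName();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(rootElementName("<catalog><book/></catalog>")); // prints "catalog"
    }
}
```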

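A well-known implementation is Xerces. JDOM, by contrast, is an alternative API with its own object model rather than an implementation of the org.w3c.dom interfaces.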

Related

org.w3c.dom.NodeList doesn't extend Iterable

Is there any reason why would authors of Java org.w3c.dom library choose not to support the Iterable interface? For example, the interface NodeList seems like a perfect fit for extending Iterable.
The World Wide Web consortium has defined the Document Object Model (DOM) as follows:
The Document Object Model is a platform- and language-neutral
interface that will allow programs and scripts to dynamically access
and update the content, structure and style of documents.
Its bindings for a number of languages look very much alike, which smart people thought was a good idea when they designed it many years ago.
As a result, it doesn't look like anything familiar in any language.
If you want to use an alternative to the W3C DOM that does look like a Java library, use JDOM. Or map your XML to Java objects using a mapping/binding solution, such as JAXB.
But if you need to interface with existing libraries that already use w3c DOM (like the built-in XSLT and XSD processors), then you're stuck with it. Unfortunately.
To @eis:
Yes there is a reason that you can't add an interface such as Iterable to NodeList, and that reason is that the Java binding of the Document Object Model is defined in the standard. Take NodeList, it is 100% defined in the standard. No room for any extra interfaces.
org/w3c/dom/NodeList.java:
package org.w3c.dom;
public interface NodeList {
    public Node item(int index);
    public int getLength();
}
There is no binding in the standard for C#, but there is one for EcmaScript. I believe the IXMLDocument interfaces that you mention are also used for their EcmaScript implementation (but I could be wrong), in which case they still need to stick to the standard in terms of what methods they support and what the type hierarchy is.
The difference is that the EcmaScript binding only describes which methods should exist, while the Java binding spells out the exact interface, method by method.
There is no reason though in Java that the class that implements NodeList can't implement Iterable too. However, if your code depended on that it would not work with the DOM standard, but with a particular implementation only.
Microsoft has never really bothered with this fine distinction since they generally don't cater for multiple standards compliant implementations - if you use any of the methods that Microsoft has labelled with "* Denotes an extension to the World Wide Web Consortium (W3C) DOM." in Microsoft's implementation, then you're not using the DOM standard.
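If you want for-each syntax without depending on a particular implementation, one option is a small adapter of your own that relies only on item() and getLength(). The following is a sketch under that assumption; the NodeLists class and its method names are hypothetical. Note that many NodeList instances are live, so modifying the tree while iterating may produce surprising results.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical utility class, for illustration only.
class NodeLists {

    /** Wraps a NodeList in an Iterable using only the two standard methods. */
    static Iterable<Node> iterable(final NodeList list) {
        return () -> new Iterator<Node>() {
            private int index = 0;

            @Override
            public boolean hasNext() {
                return index < list.getLength();
            }

            @Override
            public Node next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return list.item(index++);
            }
        };
    }

    /** Usage example: collects the names of the root's child elements. */
    static List<String> childElementNames(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            List<String> names = new ArrayList<>();
            for (Node child : iterable(doc.getDocumentElement().getChildNodes())) {
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    names.add(child.getNodeName());
                }
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }
}
```

Because the adapter lives in your code rather than in the interface, it works with any standards-compliant DOM implementation.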

Import Java Custom Method in Xquery

I am using Weblogic Integration framework. While transforming one XML format to another using .xq file, I want to apply some logic written in a custom Java Class.
For example, XML1 has tag: <UnitCode>XYZ</UnitCode>
Custom Java Class:
public class unitcodemapper {
    public static String getMappedUnitCode(String unitCode) {
        // Compare string contents with equals(); == only compares references.
        if ("XYZ".equals(unitCode))
            return <<value from DB table>>;
        else
            return unitCode;
    }
}
XML2 will have a tag: <UnitCode>unitcodemapper.getMappedUnitCode(XML1/UnitCode)</UnitCode>
I cannot find any documentation or example to do this. Can someone please help in understanding how this can be done?
This is known as an "extension function". The documentation for your XQuery implementation should have a section telling you how to write such functions and plug them into the processor. (The details may differ from one XQuery processor to another, which is why I'm referring you to the manual.)
@keshlam mentions extension functions, which are indeed supported by many implementations, each with its own API.
I think what you are looking for instead is Java binding from XQuery. Many implementations also support this and tend to use the same approach. I do not know whether WebLogic supports it. If it does, the trick is to start your namespace URI declaration with java: followed by the fully qualified name of a Java class; you can then call each static method of that class directly through that namespace.
You can find two examples of implementations that offer this Java binding functionality here:
http://exist-db.org/exist/apps/doc/xquery.xml#calling-java
http://docs.basex.org/wiki/Java_Bindings
These could serve as examples for you to try on WebLogic to see if it is supported in the same way. However, I strongly suggest you check their documentation as they may take a different approach.

What's the correct way to extend the functionality of DOM elements?

After taking quite a long break from active coding I am just starting to get accustomed to Java again, so this might be considered a "newbie question". Any help is appreciated.
Consider the following scenario. I am parsing an XML document as DOM. I am using javax.xml.parsers.DocumentBuilder to obtain an org.w3c.dom.Document node and scan through its org.w3c.dom.Element nodes, and I am fine with that.
However, I would like to extend the functionality of my org.w3c.dom.Element objects. Say, I would like to have a convenient way to extract some information from the nodes by giving them some public FancyObject toFancyObject() method. What's the right way of doing this?
Considering that org.w3c.dom.Element is an interface, inheritance seems to be no option. Composition, on the other hand, seems to be quite cumbersome in this case, since this would be like 5% new functionality and 95% delegation of the existing methods.
Also, I am aware that I could always write a static utility method to obtain my FancyObject, but I would like to avoid this solution.
You have a couple of options:
Use the user data field of the Node interface. You can attach arbitrary objects to it and build something that resembles your static variant.
Use JDOM or DOM4J instead. These APIs are better suited for your requirements w.r.t. extending base implementation classes. For example, with JDOM you can define a custom JDOMFactory that creates your customized Element implementations.
Use JAXB to unmarshal the XML into an object graph. In this case, you have almost complete freedom to implement custom behavior.
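The first option can be sketched roughly as follows, using the setUserData and getUserData methods that Node has carried since DOM Level 3. FancyObject here is only a stand-in for the type in the question, and the key string, class name and the way the object is built are assumptions made for the example.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical example class, for illustration only.
class UserDataExample {

    // Hypothetical key under which the derived object is stored on the node.
    private static final String FANCY_KEY = "example.fancyObject";

    /** Stand-in for the question's FancyObject. */
    static final class FancyObject {
        final String label;

        FancyObject(String label) {
            this.label = label;
        }
    }

    /** Returns the FancyObject attached to the element, creating it on first use. */
    static FancyObject toFancyObject(Element element) {
        Object cached = element.getUserData(FANCY_KEY);
        if (cached instanceof FancyObject) {
            return (FancyObject) cached;
        }
        FancyObject fancy = new FancyObject(
                element.getTagName() + "#" + element.getAttribute("id"));
        // No UserDataHandler is registered, hence the null third argument.
        element.setUserData(FANCY_KEY, fancy, null);
        return fancy;
    }

    /** Builds a small element to try the helper on. */
    static Element sampleElement() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .newDocument();
            Element element = doc.createElement("item");
            element.setAttribute("id", "42");
            return element;
        } catch (Exception e) {
            throw new IllegalStateException("No usable DOM parser found", e);
        }
    }
}
```

This is still a static helper, but the derived object travels with the node, so repeated calls on the same element return the same instance.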

Java: Ways to parse XML in E4X?

I was wondering if there was a way to parse XML using E4X, or something similar to E4X.
Does such a framework / library exist?
Thanks!
You can use the Rhino JavaScript engine from Java; it can handle E4X.
http://blogs.oracle.com/sundararajan/entry/desktop_scripting_applications_with_netbeans
http://www.ibm.com/developerworks/library/ws-ajax1/
Java cannot support dynamically defined members, as JavaScript can.
However, with design-time generation, you can get Java whose members reflect the XML. E.g., JAXB
E4X is a language extension in which XML is treated like a primitive. E4X is not just for parsing XML; it makes XML a real type in the language.
This can't be simulated or done with a Java 'framework', it would require a language extension for Java.
There is no parsing XML with E4X. It is a specification that makes XML a native data type. Among browsers, only Firefox supports it as of now.
Here's a list of all known implementations of the spec.
A framework can only mimic making XML access easier, but will not fundamentally change the way we use XML. For example, the SimpleXML extension in PHP simplifies things a lot, but under the hood it converts elements to objects using reflection.
So to have something like E4X, it has to be implemented in the language itself and there is no other non-ECMAScript based language that has this as of now.

Putting each attribute on a new line during xml serialization

Let's say I have a DOM object (or a string containing XML). Is it in any way possible to serialize the XML in such a way that each attribute appears on a new line?
This is the output I want:
<parent>
    <anElement
        attrOne="1"
        attrTwo="2"
        attrThree="3"
    />
</parent>
Preferably the solution would be part of the standard Java API, but I suspect such a feature is not available there. Or am I wrong?
I found a property for a serializer in the .NET Framework, called NewLineOnAttributes. What I am searching for is something equivalent, but in Java.
The DecentXML parser can do this.
The XOM library has a Serializer class which you can override to output in whatever format you want.
I don't know of any XML API for Java that provides that specific ability. I've checked the source code for JDOM and XOM; both print attributes on the same line and provide no specific hook for changing that.
Both XOM and JDOM do have dedicated classes for serializing XML (Serializer and XMLOutputter, respectively), and both classes have protected or public methods that handle the serialization of attributes, so you could subclass them and override the appropriate methods to format attributes the way you want.
As for the standard Java API, though, forget it, that stuff is pretty nasty.
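If pulling in another library is not an option, a hand-rolled walk over the standard DOM is one possible fallback. The sketch below is not a feature of any serializer mentioned above. It is a deliberately small writer (the class name AttributePerLineWriter is made up) that handles only elements, attributes and non-blank text, and ignores comments, CDATA sections and processing instructions. Attributes are written in whatever order the NamedNodeMap reports them, which the DOM does not define.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical minimal writer, for illustration only.
class AttributePerLineWriter {

    /** Serializes the element and its subtree, one attribute per line. */
    static String serialize(Element element) {
        StringBuilder out = new StringBuilder();
        write(element, 0, out);
        return out.toString();
    }

    private static void write(Element element, int depth, StringBuilder out) {
        String pad = indent(depth);
        out.append(pad).append('<').append(element.getTagName());

        // Each attribute goes on its own line, one level deeper than the tag.
        NamedNodeMap attrs = element.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            Node attr = attrs.item(i);
            out.append('\n').append(indent(depth + 1))
               .append(attr.getNodeName()).append("=\"")
               .append(escape(attr.getNodeValue())).append('"');
        }
        if (attrs.getLength() > 0) {
            out.append('\n').append(pad); // closing bracket on its own line
        }

        // Render children first so we know whether the element is empty.
        StringBuilder body = new StringBuilder();
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                write((Element) child, depth + 1, body);
            } else if (child.getNodeType() == Node.TEXT_NODE
                    && !child.getNodeValue().trim().isEmpty()) {
                body.append(indent(depth + 1))
                    .append(escape(child.getNodeValue().trim())).append('\n');
            }
        }

        if (body.length() == 0) {
            out.append("/>\n");
        } else {
            out.append(">\n").append(body).append(pad)
               .append("</").append(element.getTagName()).append(">\n");
        }
    }

    private static String indent(int depth) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            sb.append("    ");
        }
        return sb.toString();
    }

    private static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;")
                   .replace(">", "&gt;").replace("\"", "&quot;");
    }

    /** Convenience helper: parses XML text and returns its root element. */
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }
}
```

Calling AttributePerLineWriter.serialize(AttributePerLineWriter.parseRoot(xml)) on the example from the question should give the layout asked for, apart from the attribute order caveat.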
