I'm trying to generate flat file for test data using JAVA. Flat file has own mapping document which describes all fields of each line.
I was suggested to use XSD for mapping and I did some research on XSD. As I understood that XSD is for only validation of XML. In this case I have to randomly generate XML file based on XSD and convert it to txt or other format. Because as an output I need flat file, not XML.
It seems like using XSD i'm adding extra steps in creating the file as first create XML, validate with XSD and convert it expected format.
What would be your recommendation in my situation for following the mapping document?
Thanks in advance.
I have seen your kind of setup before. The reasons may be different than your particular scenario, but nonetheless it made sense. One thing to consider has to do with the skills and tools people have available and so whatever makes the job done quick and well, goes.
You seem to describe an offset based, "flat" data structure. In my case, people used COBOL copybooks which are very good at describing this. IBM Rational Developer had a built-in wizard which allows the creation of Java Data Bindings from a COBOL copybook. This is to say that within a minute one gets a Java class which can create a record for your flat file in no time (it comes with all the logic required to do the padding, etc.)
To get the data generated, there are tools capable of generating XML files which cover all constraints defined by an XSD (e.g. alternate content i.e. xsd:choice, enumerated values, etc.) Now, assuming you have a proper XSD describing your logical model of your flat file, one can get 10s, 100s, even 100K XML generated from an XSD spec. It takes a click, plus the time spent by the tool to create those files.
Next, to get the XML files in your generated Java class, and so avoid going through XSLT or whatever (many shops don't have the skills) it may be as simple as writing Java mapping code between a JAXB generated class and the one created above, or if matching is possible, simply annotate the generated class to support JAXB unmarshalling. This last step may take longer to code, but it would be trivial code any Java developer would know how to do it.
This could possibly give you a view into why someone may have recommended Java and XSD for this task. XSD is a modeling language with built in support for constraints, which may prove helpful in generating test data through combinatorial techniques.
I have to build an XML file for an input to a SOAP service in Java. The input xml can consist of at least 1000 tags. What is the best way to build the XML? I have the XSD files but it is a bit complicated to use JAXB. Is XMLStreamWriter a good option for that?
XMLStreamWriter is one of the better APIs to use for writing XML from a Java application, but it has a few quirks (e.g. its namespace handling is a bit bizarre) and you may find it worthwhile to wrap it in a convenience API that knows about the kind of document you are writing, e.g. what namespaces it uses.
One of the advantages of the XMLStreamWriter interface is that there are plenty of implementations to choose from. For example Saxon has an implementation that gives you full control over all the XSLT/XQuery serialization options plus Saxon extensions (for example, you can even control the output order of attributes!)
One of the problems I hit with all event-based APIs is that sooner or later you find yourself forgetting to write an end tag, and that can be quite tricky to debug. Using a wrapper API that forces you to include the element name in a call on endElement() can be useful for debugging; if debugging is switched on you can keep a stack of element names and check that endElement() is writing the right tag; with debugging switched off you just drop this check.
Serializing using JAXB is higher-level, of course, but the downside is that it gives you less control.
Is anyone aware of a library that makes the process of creating XSDs in Java a little easier than using something like a DocumentBuilder? Something like a helper or utility class. I came across the org.eclipse.xsd maven jar but I'm having ClassNotFoundException issues when working with it in Eclipse and I'm not entirely sure it's meant to be used as a standalone kind of thing. This is a bit difficult to Google for as well since there are lot of search results around automatic generation/translation from Java to XSD and vice versa.
Essentially what I need to do is to programmatically create an XSD from a certain source of data -- not Java classes.
Apache XMLSchema is a lightweight Java object model that can be used to manipulate and generate XML schema representations. You can use it to read XML Schema (xsd) files into memory and analyze or modify them, or to create entirely new schemas from scratch.
The fact that with this API one can create an XSD from scratch, it sounds as a starting point to achieve the ask; as to the fitness, it depends on what that "certain source of data" is.
For a current project the decision has to be made whether to use XML and an XSL-transformation to produce HTML or to directly use HTML-templates.
I'd be interested in arguments for or against the XSL-approach. I understand that in cases where you have to support many different layouts, an XSL-solution has a lot of advantages, but why would you choose it in those cases where you only have to support one target layout?
Edit: We're talking about Java here.
XSLT is a functional programming language and you can use it to create frontends as rich as any templating system. However, you shouldn't — you and your team will go insane.
Both options present the opportunity of transforming objects into a presentation form in a logical sort of way. XSLT is best suited for creating more XML, which might lead you to believe that it's a perfect candidate to use to create XHTML. However, creating XHTML shouldn't be the primary goal — Creating a user experience is. Don't concern yourself with the medium.
Two significant drawbacks to XSLT concern the syntax: Your templates, and the templates that they include, and the templates that those templates include will all be gigantic and verbose. Second, you'll have to do a lot of functional programming, and less-experienced engineers may be confused and terrified when they encounter a recursive template with an accumulating function parameter instead of a simple for loop.
If you're attracted by the beauty of transforming logically-constructed, valid XML entities, consider instead a type-safe templating system that transforms beans instead. Check out Google XML Pages, and create logically-organized, type-safe templates that will be easy for future engineers to pick up and extend.
I created an XML/XSLT-driven UI for an enterprise product about 5 years ago. We're still using it, and I can now look back on my experience and see many pros and cons:
Pros:
XSL is a powerful declarative language, useful & fun for experienced developers, and transforms can do pretty amazing things in a few lines of code
XSL is designed for use with XML, so if your data is already XML then it makes a lot of sense
Separation of concerns (rendering vs. data) is better than many template languages
XSL-based rendering can be easily "subclassed". By that I mean: let's say you have data class A with associated template A.xslt. For class B derived from A, you can easily create B.xslt with only the small differences, and include A.xslt for inherited behaviors. This makes it less succeptible to breaking due to changes in A.xslt.
The above point also gives you the power to do overrides. For class A with associated A.xslt, we can easily switch the associated template to A-custom.xslt, which is a few small changes plus inheritance of A.xslt. We can do this on the fly in the field and again, the benefit is that A-custom.xslt is only a few lines, not an entire modified copy of the original A.xslt. The small footprint means it's more likely to work with multiple versions of A.xslt.
In .NET 2.0, XSLT is compiled and becomes very fast. There may be similar tech for Java. (Most template languages do this now too.)
In .NET, it's possible to create an "Object XPath Navigator", which lets you transform your data objects without having to convert them to an XML object. Again there may be similar tech in Java
XSLT is smart about HTML & handles escaping, white space issues, etc. well
Cons:
XSL is a powerful declarative language, confusing to newer programmers - and fewer people know XSLT well
XSL is verbose. XML is often verbose too.
XSL transforms are probably slower than "native" templates. Even when compiled there's still more state overhead to XSL than most template languages
It's hard to pass parameters to XSLs, you have to either send them in line with your data (forcing you to create extra XML) or via system-specific methods (which may also involve constructing XML data)
If you don't have an ObjectXPathNavigator or equivalent, you'll incur significant overhead when turning your data objects into XML for transformation
Depending on the capabilities of your transformer, you may also incur buffering overhead as you transform into a string buffer and then send that string to the output device
The more advanced your XSLT usage, the less likely it is that your tools will support you (specifically as you start to use includes or faster ways to pass XML data in)
I'll try to update as I think of more issues. I think that looking back now, my verdict would be to stick with a common template language. What were once big issues when I selected XML/XSLT have now been addressed by newer and more mature revisions of the major template engines. We do still benefit greatly from the ability to inherit .xslt files, which is something most template engines don't do well. But in the end the value of having lots of developers providing examples is far greater (compare ASP.NET answers vs XSLT answers on StackOverflow, for instance.)
Hope that helps!
I've done significant development using XSLT and it has been both tremendously successful and a complete failure at two different sites.
A few thoughts before a conclusion:
I don't think anyone would argue that XSLT is far more powerful than a template parsing engine, it's a functional language.
Although it's not as widely adopted as most procedural languages, it's still a real language that's being used out there for actual projects, people can be hired already with knowledge of XSLT and it's a transferable skill for your current staff.
XSLT has also been around for a while now, the implementations are mature, I'm sure this is the case for long running templating engines (like Velocity) but newer engines may be less robust.
Whatever template language you decide on it's unlikely to be as well documented as XSLT. Check out any of the Michael Kay Programmer's reference series for an example on how to do a great reference book.
Tool support is generally very good ... if you have a budget. XMLSpy and Stylus Studio have both been very useful for me in the past.
XSLT is not only hard but, more importantly, different. Most people are not Computer Science graduates formally trained in functional programming. The majority of programmers will write XSLT in a procedural style which will not harness any power of the language and give you a maintenance headache.
XSLT transforms can be slow and can take a lot of memory. You may have problems if you have a stylesheet with a large XML input.
I love XSLT but whether you should use it or not comes down to a few points:
Are you committed to XSLT? Do you have serious in-house expertise in XSLT? Are you prepared to get some?
Is your data in XML? Does it make sense in XML? Do you have someone in-house who loves your data enough to make sure it's well structured and there's always an appropriate schema?
Unless the answer to those questions is yes and you have complex data that requires a complex rendering process, I wouldn't consider using XSLT ... especially if there's no experience in the team. Bad XSLT is much, much, much worse than a bad template.
However, it can render complex data in a maintainable fashion which would be impossible using many of today's templating engines.
Going the XSL way will future-proof your application. Meaning, if you decide in the future to add more templates with different layouts you will be able capitalize on those advantages. In my current project we save off the XML used (in an XMLType or CLOB) and allow other applications to access the data and XSL templates to generate documents via a web service. This was an after thought of the original design that was super easy to implement due to our decision to use XML/XSL.
XSLT has the advantage of being able to also produce output in other document types (i.e. pdf) and pdf output is very likely nowadays. XML/XSLT does also separate data from the view.
When we have done XSLT in the past, it was to allow the ability to extend our product. The output remained the same, only the presentation layer needed to change. This allowed us a lot of flexibility when we had clients that wanted to "customize" their UI, since all we needed to do was replace the XSLT file. If you foresee needing to make a lot of those kinds of changes, XSLT might be your answer.
However, as stated above, the XSLT syntax and functional programming mentality can make it difficult to effectively produce templates. We found that we liked to stick to the tricks that we learned and when we had client requests that fell outside of what we already knew, no one wanted to volunteer for the ticket. Usually someone eventually figured out how to do the task and our "bag of tricks" got larger, but it was often very cumbersome to figure out new things.
If you don't foresee change the UI ever, or at least not much, XSLT may not be worth the extra effort.
Please don't use XML/XSLT for web front-ends. I was in projects like this and it's horrible. Often you have to first produce the XML from objects or something similar, which doesn't make sense. A second point is, that there are so many good HTML editors out there for free, but I've found none for XSLT. So editing complex XSLT is no fun. I would recommend to go with HTML templates and a common template engine.
Depending on your application, having an XML layer that is then transformed to XHTML via XSLT also meens, that you can write easy WebServices to the XML layer - allowing your customers to consume your sites data...
Having the XML sent to the browser with a transformation link (forgot the exact syntax...) also meens less bandwith needed, as the XSLT file will stay the same and you only need to pass the raw XML it is built from - sort of like using an external CSS style sheet instead of adding the style attributes to your markup ;)
I think you need to examine what the source of your data will be. As mentioned by boris callens earlier, if the you are pulling from a database you will have to transform first to XML, then apply your transformations. Should the data source be RSS or the like, then XSLT is a natural choice.
XPATH and XSLT has a high learning curve and functional programming can be daunting to get your arms around. In time crunch this may not be the right choice.
For front end work JSON has a lighter payload, and is readily supported by jQuery and other Javascript libraries. You may want to consider JSON as the data protocol as the jQuery library is far more accessible to developers and the time to productivity with the framework is far less than with XSLT, embedded Javascript in tags, awful syntax and all the other minutia that come with XML/XPATH/XSLT on the front end.
Keep it simple. That's a principle that one gets to appreciate more and more.
Velocity or Freemarker are incredibly flexible and versatile. Your code base will be clear, easily understandable, and it will run much (much) faster than the X monstrosities.
http://fishbowl.pastiche.org/2002/02/12/xslt_is_the_spawn_of_satan/
I see how the XSL approach can be handy if your data is already XML.
But usually it isn't. It's somewhere in a database, needs to be generated on the spot or comes from some service.
Creating XML from this source to then be able to create HTML from that XML is useless in my opinion. I would stick with (X)Html templates.
In contrast to HTML, there are a lot of XML tools available if you need to do parsing and processing of the templates in any way. So you should choose XML to get the benefits of using tools and libraries for XML.
However, that said, it may just be that XHTML fits your needs, since this gives you full support of XML tools and libraries while still being normal HTML which is correctly processed by modern web browsers. If you need to do post-processing of those later on, you can still apply XSLT to the XHTML data.
I've used XML & XSLT in a previous project, financial web sites, and it worked well for us, but:
We had multiple customers, which
varied the number of outputs we had.
We could replace the XSLT stylesheet
and this made changes to the site
easier to manage for the developers
We had a specialist web editor on the team. We gave them example XML & they could edit the stylesheets directly
If there were ever any wording changes that needed to go onto the website yesterday ( it was a bank, this happened surprisingly often), we could just deploy the new XSLT without redeploying the entire site.
Multiple different output formats were needed. We used FOP for transformation to PDF, which is based upon the same sort of technology, so wasn't too hard for us to understand :-)
The main reason I see for using XSLT is if you have multiple sites all based upon the same XML, but requiring different HTML output.
XML + XSLT are really cool. You have the ability to output many types of target formats in the future.
But ne aware of embedded HTML in the XML. Firefox XSLT doesn't support "disable-output-escaping". See Bugzilla.
We use XSLT to generate html in our content management system and it works just fine.
Some hints: Don't try to generate all the page at once from one big hairy XML, you'll go insane. Use the HTML template (plain text/html file with styles, decorations and basic markup) with embedded markers (like, <!--MENU-->, <!--CONTENT-->), and replace markers with xslt-transformation of appropriate data.
Having said that, I doubt you really need xslt if you only going to have one layout, forever.
Out of all the libraries for inputing and outputting xml with java, in which circumstances is commons-digester the tool of choice?
From the digester wiki
Why use Digester?
Digester is a layer on top of the SAX
xml parser API to make it easier to
process xml input. In particular,
digester makes it easy to create and
initialise a tree of objects based on
an xml input file.
The most common use for Digester is to
process xml-format configuration
files, building a tree of objects
based on that information.
Note that digester can create and
initialise true objects, ie things
that relate to the business goals of
the application and have real
behaviours. Many other tools have a
different goal: to build a model of
the data in the input XML document,
like a W3C DOM does but a little more
friendly.
and
And unlike tools that generate
classes, you can write your
application's classes first, then
later decide to use Digester to build
them from an xml input file. The
result is that your classes are real
classes with real behaviours, that
happen to be initialised from an xml
file, rather than simple "structs"
that just hold data.
As an example of what it's NOT used for:
If, however, you are looking for a direct representation of the input xml document, as
data rather than true objects, then digester is not for you; DOM, jDOM or other more
direct binding tools will be more appropriate.
So, digester will map XML directly into java objects. In some cases that's more useful than having to read through the tree and pull out options.
My first take would be "never"... but perhaps it has its place. I agree with eljenso that it has been surpassed by competition.
So for good efficient and simple object binding/mapping, JAXB is much better, or XStream. Much more convenient and even faster.
EDIT 2019: also, Jackson XML, similar to JAXB in approach but using Jackson annotations
If you want to create and intialize "true" objects from XML, use a decent bean container, like the one provided by Spring.
Also, reading in the XML and processing it yourself using XPath, or using Java/XML binding tools like Castor, are good and maybe more standard alternatives.
I have worked with the Digester when using Struts, but it seems that it has been surpassed by other tools and frameworks for the possible uses it has.