I'm about to start my first large project: a program for learning a foreign language, very similar to Rosetta Stone, written in Java using Swing. I plan on letting the user select downloaded courses to learn from. I will be able to create an English course since I am a native English speaker. However, I want people who speak other languages to be able to write courses as well (this is essential for my program to work).
Since I want users to be able to download courses for the languages they want, hard-coding the courses into the program is out of the question. The courses need to be interpreted at runtime. Also, since I want others to collaborate on my work (i.e. make courses), I need to make it easy for them to do so.
What would be the best way to go about doing this?
The idea I have come up with is a strict, empty course outline (hard-coded) with a simple XML file which details the text and sounds to be used. The drawback is that it severely limits the author. Different languages may need to start out by teaching different parts.
Any advice on the problem at hand as well as the project as a whole will be greatly appreciated. Any links to any relevant resources or information would also be greatly appreciated.
Thank you for your time and effort,
Joseph Pond
Simply, you should base your program on a system such as Eclipse RCP, or the Netbeans Platform. Both of these systems already deal with exactly this problem, and both are perfectly adequate for this task. They're not just for IDEs.
It's a larger first step, as you will need to learn one of these platforms on top of Swing.
But, they solve the problem, and their overall organization and technique will serve your program well anyway.
Don't reinvent this wheel, just learn one of these instead.
If you are set on doing this from scratch (Will's idea isn't bad), what I would do is first lay down the file format that would be easiest to create your language course in. It could be XML, plain text, or some other format you come up with yourself.
You will probably need some flexibility in the course format because you will want to be able to specify things like questions and answers. XML is a pain because of all the extra terminators, but it gives a good amount of metadata. If you like XML for that, you may consider defining your course file in YAML; it gives you the data of XML but uses whitespace delimiters instead of angle brackets.
You probably also want to define your file in the language it's created for, so you might or might not want to require English words as keys. If you don't want any English, you may have to skip both XML and YAML and come up with your own file format, possibly one where the layout and/or special symbols define the flow and "functionality".
Once you have defined the file format, you won't have to worry about hard-coding anything... you won't be able to because it will already be in the file.
Plug-in functionality would be nice as well... This is where your definition file also contains information that tells you what class to instantiate (reflectively) and use to parse/display the data. In that way you could add new types of questions just by delivering a new jar file.
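A minimal sketch of the reflective plug-in idea, assuming a made-up QuestionType interface and a class name that would in practice be read from the course file. None of these names come from an existing library; they are only for illustration.

```java
import java.util.Map;

public class QuestionPlugins {
    /** Hypothetical plug-in contract that every question type implements. */
    public interface QuestionType {
        String render(Map<String, String> data);
    }

    /** Example built-in type; a separately delivered jar could supply others. */
    public static class MultipleChoice implements QuestionType {
        @Override
        public String render(Map<String, String> data) {
            return "Choose: " + data.get("prompt");
        }
    }

    /** Instantiates the named class through its public no-arg constructor. */
    public static QuestionType load(String className) throws ReflectiveOperationException {
        Class<?> raw = Class.forName(className);
        // asSubclass fails fast if the class does not implement the contract.
        Class<? extends QuestionType> type = raw.asSubclass(QuestionType.class);
        return type.getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        // Stands in for a class name read from the course definition file.
        String nameFromCourseFile = MultipleChoice.class.getName();
        QuestionType question = load(nameFromCourseFile);
        System.out.println(question.render(Map.of("prompt", "the word for 'house'")));
    }
}
```

For classes shipped in a new jar, the same load step would need a class loader that can see that jar (for example a URLClassLoader); the sketch only covers classes already on the classpath.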
If this is confusing, sorry, this is difficult in a one-way forum because I can't look at your face and see if you're following me or if I'm even going in the right direction. If you think I'm on the right track and want more details (I've done a bit of this stuff before) feel free to leave a follow-up question (or an email address) in a comment and I'd be glad to discuss it with you further.
If I was doing this, I'd seriously consider using Eclipse EMF to model the "language" for defining courses. EMF is rather daunting to start with, but it gives you:
A high-level model that can be entered/edited in a variety of ways.
An automatic mechanism for serializing "instances" (i.e. courses) to XML. (And you can tinker with the serialization if you choose.)
Automatically generated Java classes for in-memory representations of your instances. These provide APIs that are tuned to your model, and generic ones that are the EMF equivalent of Java reflection ... but based on EMF model classes rather than Java classes.
An automatically generated tree editor for your "instances".
Hooks for implementing your own constraints / validation rules to say what is a valid "course".
Related Eclipse plugins offer:
Mappings to text-based languages with generation of parsers/unparsers
Mappings to graphical languages; e.g. notations using boxes / arrows / etc
Various more advanced persistence mechanisms
Comparisons/differencing, model-to-model transformations, constraints in OCL, etc
I've used EMF in a couple of largish projects, and the main point that keeps me coming back for more is ease of model evolution ... compared with building everything at a lower level of abstraction. If my model (language) needs to be extended / changed, I can make the necessary changes using the EMF Model editor, regenerate the code, extend my custom code to do the right stuff with the extensions, and I'm pretty much done (modulo conversion of stored instances).
For a current project the decision has to be made whether to use XML and an XSL-transformation to produce HTML or to directly use HTML-templates.
I'd be interested in arguments for or against the XSL-approach. I understand that in cases where you have to support many different layouts, an XSL-solution has a lot of advantages, but why would you choose it in those cases where you only have to support one target layout?
Edit: We're talking about Java here.
XSLT is a functional programming language and you can use it to create frontends as rich as any templating system. However, you shouldn't — you and your team will go insane.
Both options present the opportunity of transforming objects into a presentation form in a logical sort of way. XSLT is best suited for creating more XML, which might lead you to believe that it's a perfect candidate for creating XHTML. However, creating XHTML shouldn't be the primary goal; creating a user experience is. Don't concern yourself with the medium.
Two significant drawbacks to XSLT concern the syntax: Your templates, and the templates that they include, and the templates that those templates include will all be gigantic and verbose. Second, you'll have to do a lot of functional programming, and less-experienced engineers may be confused and terrified when they encounter a recursive template with an accumulating function parameter instead of a simple for loop.
If you're attracted by the beauty of transforming logically-constructed, valid XML entities, consider instead a type-safe templating system that transforms beans instead. Check out Google XML Pages, and create logically-organized, type-safe templates that will be easy for future engineers to pick up and extend.
I created an XML/XSLT-driven UI for an enterprise product about 5 years ago. We're still using it, and I can now look back on my experience and see many pros and cons:
Pros:
XSL is a powerful declarative language, useful & fun for experienced developers, and transforms can do pretty amazing things in a few lines of code
XSL is designed for use with XML, so if your data is already XML then it makes a lot of sense
Separation of concerns (rendering vs. data) is better than many template languages
XSL-based rendering can be easily "subclassed". By that I mean: let's say you have data class A with associated template A.xslt. For class B derived from A, you can easily create B.xslt with only the small differences, and include A.xslt for inherited behaviors. This makes it less susceptible to breaking due to changes in A.xslt.
The above point also gives you the power to do overrides. For class A with associated A.xslt, we can easily switch the associated template to A-custom.xslt, which is a few small changes plus inheritance of A.xslt. We can do this on the fly in the field and again, the benefit is that A-custom.xslt is only a few lines, not an entire modified copy of the original A.xslt. The small footprint means it's more likely to work with multiple versions of A.xslt.
In .NET 2.0, XSLT is compiled and becomes very fast. There may be similar tech for Java. (Most template languages do this now too.)
In .NET, it's possible to create an "Object XPath Navigator", which lets you transform your data objects without having to convert them to an XML object. Again there may be similar tech in Java
XSLT is smart about HTML & handles escaping, white space issues, etc. well
Cons:
XSL is a powerful declarative language, confusing to newer programmers - and fewer people know XSLT well
XSL is verbose. XML is often verbose too.
XSL transforms are probably slower than "native" templates. Even when compiled there's still more state overhead to XSL than most template languages
It's hard to pass parameters to XSLs, you have to either send them in line with your data (forcing you to create extra XML) or via system-specific methods (which may also involve constructing XML data)
If you don't have an ObjectXPathNavigator or equivalent, you'll incur significant overhead when turning your data objects into XML for transformation
Depending on the capabilities of your transformer, you may also incur buffering overhead as you transform into a string buffer and then send that string to the output device
The more advanced your XSLT usage, the less likely it is that your tools will support you (specifically as you start to use includes or faster ways to pass XML data in)
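On the Java side, JAXP (javax.xml.transform) seems to cover two of the points above: a stylesheet can be prepared once as a reusable Templates object, and parameters can be passed with Transformer.setParameter instead of being folded into the XML. A minimal sketch, assuming an illustrative inline stylesheet and a made-up class name CompiledXslt:

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class CompiledXslt {
    // Illustrative stylesheet; a real one would normally live in its own file.
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='html'/>"
      + "<xsl:param name='title'/>"
      + "<xsl:template match='/items'>"
      + "<div><h1><xsl:value-of select='$title'/></h1>"
      + "<ul><xsl:for-each select='item'><li><xsl:value-of select='.'/></li></xsl:for-each></ul>"
      + "</div></xsl:template>"
      + "</xsl:stylesheet>";

    // Prepared once and reused; each call below only creates a cheap Transformer.
    private static final Templates COMPILED = compile();

    private static Templates compile() {
        try {
            return TransformerFactory.newInstance()
                .newTemplates(new StreamSource(new StringReader(XSL)));
        } catch (TransformerException e) {
            throw new IllegalStateException("stylesheet did not compile", e);
        }
    }

    /** Transforms the XML to HTML, passing the title as a stylesheet parameter. */
    public static String render(String xml, String title) throws TransformerException {
        Transformer transformer = COMPILED.newTransformer();
        transformer.setParameter("title", title);
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(render("<items><item>a &amp; b</item></items>", "Courses"));
    }
}
```

The sketch buffers into a StringWriter for simplicity. StreamResult can also wrap an OutputStream or Writer directly, which may avoid the buffering overhead mentioned in the list above.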
I'll try to update as I think of more issues. I think that looking back now, my verdict would be to stick with a common template language. What were once big issues when I selected XML/XSLT have now been addressed by newer and more mature revisions of the major template engines. We do still benefit greatly from the ability to inherit .xslt files, which is something most template engines don't do well. But in the end the value of having lots of developers providing examples is far greater (compare ASP.NET answers vs XSLT answers on StackOverflow, for instance.)
Hope that helps!
I've done significant development using XSLT and it has been both tremendously successful and a complete failure at two different sites.
A few thoughts before a conclusion:
I don't think anyone would dispute that XSLT is far more powerful than a template parsing engine; it's a functional language.
Although it's not as widely adopted as most procedural languages, it's still a real language that's being used out there for actual projects, people can be hired already with knowledge of XSLT and it's a transferable skill for your current staff.
XSLT has also been around for a while now and the implementations are mature. I'm sure this is also the case for long-running templating engines (like Velocity), but newer engines may be less robust.
Whatever template language you decide on it's unlikely to be as well documented as XSLT. Check out any of the Michael Kay Programmer's reference series for an example on how to do a great reference book.
Tool support is generally very good ... if you have a budget. XMLSpy and Stylus Studio have both been very useful for me in the past.
XSLT is not only hard but, more importantly, different. Most people are not Computer Science graduates formally trained in functional programming. The majority of programmers will write XSLT in a procedural style, which will not harness any of the language's power and will give you a maintenance headache.
XSLT transforms can be slow and can take a lot of memory. You may have problems if you have a stylesheet with a large XML input.
I love XSLT but whether you should use it or not comes down to a few points:
Are you committed to XSLT? Do you have serious in-house expertise in XSLT? Are you prepared to get some?
Is your data in XML? Does it make sense in XML? Do you have someone in-house who loves your data enough to make sure it's well structured and there's always an appropriate schema?
Unless the answer to those questions is yes and you have complex data that requires a complex rendering process, I wouldn't consider using XSLT ... especially if there's no experience in the team. Bad XSLT is much, much, much worse than a bad template.
However, it can render complex data in a maintainable fashion which would be impossible using many of today's templating engines.
Going the XSL way will future-proof your application. Meaning, if you decide in the future to add more templates with different layouts, you will be able to capitalize on those advantages. In my current project we save off the XML used (in an XMLType or CLOB) and allow other applications to access the data and XSL templates to generate documents via a web service. This was an afterthought to the original design that was super easy to implement due to our decision to use XML/XSL.
XSLT has the advantage of also being able to produce output in other document types (e.g. PDF), and PDF output is very likely to be needed nowadays. XML/XSLT also separates data from the view.
When we have done XSLT in the past, it was to allow the ability to extend our product. The output remained the same, only the presentation layer needed to change. This allowed us a lot of flexibility when we had clients that wanted to "customize" their UI, since all we needed to do was replace the XSLT file. If you foresee needing to make a lot of those kinds of changes, XSLT might be your answer.
However, as stated above, the XSLT syntax and functional programming mentality can make it difficult to effectively produce templates. We found that we liked to stick to the tricks that we learned and when we had client requests that fell outside of what we already knew, no one wanted to volunteer for the ticket. Usually someone eventually figured out how to do the task and our "bag of tricks" got larger, but it was often very cumbersome to figure out new things.
If you don't foresee ever changing the UI, or at least not much, XSLT may not be worth the extra effort.
Please don't use XML/XSLT for web front-ends. I was in projects like this and it's horrible. Often you have to first produce the XML from objects or something similar, which doesn't make sense. A second point is that there are so many good HTML editors out there for free, but I've found none for XSLT. So editing complex XSLT is no fun. I would recommend going with HTML templates and a common template engine.
Depending on your application, having an XML layer that is then transformed to XHTML via XSLT also means that you can write easy web services on top of the XML layer, allowing your customers to consume your site's data...
Having the XML sent to the browser with a transformation link (forgot the exact syntax...) also means less bandwidth is needed, as the XSLT file will stay the same and you only need to pass the raw XML it is built from - sort of like using an external CSS style sheet instead of adding the style attributes to your markup ;)
I think you need to examine what the source of your data will be. As mentioned by boris callens earlier, if you are pulling from a database you will have to transform first to XML, then apply your transformations. Should the data source be RSS or the like, then XSLT is a natural choice.
XPath and XSLT have a steep learning curve, and functional programming can be daunting to get your arms around. In a time crunch this may not be the right choice.
For front-end work, JSON has a lighter payload and is readily supported by jQuery and other JavaScript libraries. You may want to consider JSON as the data format, as the jQuery library is far more accessible to developers and the time to productivity with the framework is far less than with XSLT, embedded JavaScript in tags, awful syntax and all the other minutiae that come with XML/XPath/XSLT on the front end.
Keep it simple. That's a principle that one gets to appreciate more and more.
Velocity or Freemarker are incredibly flexible and versatile. Your code base will be clear, easily understandable, and it will run much (much) faster than the X monstrosities.
http://fishbowl.pastiche.org/2002/02/12/xslt_is_the_spawn_of_satan/
I see how the XSL approach can be handy if your data is already XML.
But usually it isn't. It's somewhere in a database, needs to be generated on the spot or comes from some service.
Creating XML from this source to then be able to create HTML from that XML is useless in my opinion. I would stick with (X)Html templates.
In contrast to HTML, there are a lot of XML tools available if you need to do parsing and processing of the templates in any way. So you should choose XML to get the benefits of using tools and libraries for XML.
However, that said, it may just be that XHTML fits your needs, since this gives you full support of XML tools and libraries while still being normal HTML which is correctly processed by modern web browsers. If you need to do post-processing of those later on, you can still apply XSLT to the XHTML data.
I've used XML & XSLT in a previous project, financial web sites, and it worked well for us, but:
We had multiple customers, which varied the number of outputs we had. We could replace the XSLT stylesheet and this made changes to the site easier to manage for the developers
We had a specialist web editor on the team. We gave them example XML & they could edit the stylesheets directly
If there were ever any wording changes that needed to go onto the website yesterday ( it was a bank, this happened surprisingly often), we could just deploy the new XSLT without redeploying the entire site.
Multiple different output formats were needed. We used FOP for transformation to PDF, which is based upon the same sort of technology, so wasn't too hard for us to understand :-)
The main reason I see for using XSLT is if you have multiple sites all based upon the same XML, but requiring different HTML output.
XML + XSLT are really cool. You have the ability to output many types of target formats in the future.
But be aware of embedded HTML in the XML. Firefox XSLT doesn't support "disable-output-escaping". See Bugzilla.
We use XSLT to generate html in our content management system and it works just fine.
Some hints: Don't try to generate all the page at once from one big hairy XML, you'll go insane. Use the HTML template (plain text/html file with styles, decorations and basic markup) with embedded markers (like, <!--MENU-->, <!--CONTENT-->), and replace markers with xslt-transformation of appropriate data.
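The marker approach above could look roughly like this in Java. This is only a sketch: MarkerTemplate is a made-up name, and the fragments are hard-coded strings standing in for the output of the individual XSLT transformations.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MarkerTemplate {
    /** Replaces each <!--NAME--> marker with the fragment registered under NAME. */
    public static String fill(String template, Map<String, String> fragments) {
        String page = template;
        for (Map.Entry<String, String> entry : fragments.entrySet()) {
            page = page.replace("<!--" + entry.getKey() + "-->", entry.getValue());
        }
        return page;
    }

    public static void main(String[] args) {
        String template = "<html><body><!--MENU--><div><!--CONTENT--></div></body></html>";
        Map<String, String> fragments = new LinkedHashMap<>();
        // In a real system each value would be produced by its own small transform.
        fragments.put("MENU", "<ul><li>Home</li></ul>");
        fragments.put("CONTENT", "<p>Hello</p>");
        System.out.println(fill(template, fragments));
    }
}
```

Markers without a registered fragment are left in place as ordinary HTML comments, so a missing transform degrades quietly rather than breaking the page.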
Having said that, I doubt you really need XSLT if you're only going to have one layout, forever.
I am writing dynamic HTML parser functionality.
I will want to modify existing parsers and also add more parsers (I expect parsers will be modified as sites are modified, and new parsers will be needed for new sites).
I started writing generic functionality which uses an XML file with conditions and rules for each site, but although this works fine for now, I'm pretty sure it will need constant modifications...
The parsers will parse and write the data to a DB.
My application runs on JBOSS 4.
Any known best practice for that?
Thanks,
Rod
Thanks for your answer. Maybe I was unclear; I realized that immediately from the rating my question got. What I am writing is a feature that manages parser execution. Each parser will parse a different text document structure. Document structures might change from time to time, and new document structures will be added. I don't want to recompile, build, and deploy my application for each parser change.
I want to manage the execution of each parser, as they might be executed in parallel or according to execution rules.
Might the Java ScriptEngine be a good option?
There are lots of ways to have some code that can be modified without redeploying. Using Groovy scripts to do the parsing is one. It is a rather simple matter to check whether the script has been modified and automatically reload it.
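The modification check can be sketched in plain Java. ReloadingScript is a hypothetical helper, and the "compiler" function is a placeholder for whatever turns script source into something executable (for Groovy that might be a script engine, assuming one is on the classpath).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.function.Function;

public class ReloadingScript<T> {
    private final Path file;
    private final Function<String, T> compiler;
    private FileTime loadedAt;   // timestamp of the version currently cached
    private T compiled;

    public ReloadingScript(Path file, Function<String, T> compiler) {
        this.file = file;
        this.compiler = compiler;
    }

    /** Returns the cached result, recompiling only when the file's timestamp changed. */
    public synchronized T get() throws IOException {
        FileTime modified = Files.getLastModifiedTime(file);
        if (!modified.equals(loadedAt)) {
            String source = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            compiled = compiler.apply(source);
            loadedAt = modified;
        }
        return compiled;
    }

    public static void main(String[] args) throws IOException {
        Path script = Files.createTempFile("parser", ".groovy");
        Files.write(script, "println 'v1'".getBytes(StandardCharsets.UTF_8));
        // The placeholder compiler just trims the source text.
        ReloadingScript<String> parser = new ReloadingScript<>(script, String::trim);
        System.out.println(parser.get());
    }
}
```

Calling get() before each parse keeps the check cheap: one file-attribute read per call, and a recompile only after an edit.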
The design sounds convoluted to me, but IFF you prove to yourself there's not a much simpler way to accomplish the same task, you may want a rules engine like Drools...
I am a beginner in accessing backend XML files (which act like a database) in JSP code. Can anyone please provide me the links and references that provide good understanding for beginners like me? Please help.
Some tips when working with JSP: Keep as much code as possible outside of the JSP. I've had very good results with creating a helper object at the top of the JSP. In the HTML of the JSP, I can then call the methods of the helper object to get at my data.
This way, I have a normal object (which doesn't depend on the JSP cra....framework) which I can test and use just like any other object.
So my suggestion is to create a couple of objects which allow you to access the database. In the JSP, have as little actual Java code as possible.
You may want to take a look at how to implement typical webapp design patterns in J2EE (see e.g. Sun's blueprint describing the webapp designs). Depending on the complexity of your application, decide which pattern to use. You may also choose to use one of the existing MVC frameworks built on J2EE (although I'd not advise that to a beginner).
Building model classes around your XML files would be a good start (there are a variety of ways to process XML in Java; check e.g. JAXP). Once a model is ready, you can start using it in your JSPs (implementing the view and controller per the pattern you choose).
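As a rough sketch of such a model class, using the JAXP DOM parser from the JDK. The customers/customer document layout and the CustomerStore name are invented for illustration; your XML files will differ.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class CustomerStore {
    /** Plain model object; it knows nothing about JSP or XML. */
    public static final class Customer {
        public final String id;
        public final String name;

        Customer(String id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    /** Reads every customer element into a model object. */
    public static List<Customer> parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        NodeList nodes = doc.getElementsByTagName("customer");
        List<Customer> customers = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            Element element = (Element) nodes.item(i);
            customers.add(new Customer(element.getAttribute("id"), element.getTextContent().trim()));
        }
        return customers;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<customers><customer id='1'>Ann</customer><customer id='2'>Bob</customer></customers>";
        for (Customer c : parse(xml)) {
            System.out.println(c.id + ": " + c.name);
        }
    }
}
```

A JSP (or a helper object used by it) would then only iterate over the returned list, and the parsing code can be tested without a servlet container.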
I am creating a tool that will check dynamically generated XHTML and validate it against expected contents.
I need to confirm the structure is correct and that specific attributes exist/match. There may be other attributes which I'm not interested in, so a direct string comparison is not suitable.
One way of validating this is with XPath, and I have implemented this already, but I would also like something less verbose - I want to be able to use CSS Selectors, like I can with jQuery, but on the server - within CFML code - as opposed to on the client.
Is there a CFML or Java library that allows me to use CSS Selectors against an XHTML string?
I've just released an open source project which is a W3C CSS Selectors Level 3 implementation in Java. Please give it a try. I was looking for the same thing and decided to implement my own engine. It's inspired by the code in WebKit etc.
http://github.com/chrsan/css-selectors/tree
I don't know of a Java library itself, but there is a Ruby library called Hpricot that does exactly what you're looking for. In conjunction with the Ruby implementation on the Java platform, JRuby, it should be relatively straightforward to call Ruby methods from your Java code (using BSF, the JSR-223 Scripting APIs, or an internal API).
Are you using ColdFusion 8? ColdFusion 8, being based on Java 6, supports the JSR-223 Scripting APIs (javax.script).
Take a look at this blog entry on embedding PHP within CFML. You should be able to do the same with Ruby. There is ZIP file example code linked from this blog posting, and if you crack open the CFML, you'll see a good example of embedding Ruby within CFML.
It might take a bit of work to make all the pieces work together, but with a bit of investment, it should give you the robust parsing/CSS selector querying that you're looking for.
Hpricot is definitely a fantastic solution if the JRuby route is open to you.
Wrt. XPath being the "correct" way to access XML documents... sorry but this is rubbish. There are numerous ways to access elements of an XML document: DOM traversal, XPath, XQuery, CSS selectors to name a few. XPath is certainly popular but CSS selectors are very very powerful, assuming your XML document has HTML semantics.
If you can use PHP within your CFML (as mentioned above), you could take advantage of this excellent "jQuery for PHP" library, phpQuery
Full CSS selector support, manipulation functions, traversing, etc. It should work great for what you need.
Hope it helps.
There is a theoretical difference between the server and client. To a web browser, the document is a living DOM hierarchy. To your server code it's merely an XML document of whatever type. XPath is the "correct" way to access elements of an XML document.
So unless you have a serious performance problem with your current XPath solution, or it doesn't actually work correctly, I suggest you stick with it. Trying something too clever brings the risk of breaking something that's working.
If you find the XPath to be too verbose and ugly to leave sitting around, or want more power to re-use the tool in different cases, or just can't resist trying to do something clever, then you could try writing a utility that compiles a given CSS selector into an XPath. You could then call this in one line whenever you needed.
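A minimal sketch of such a selector-to-XPath utility. CssToXPath is a made-up class, and it only handles a small subset: tag names, #id, .class, [attr], [attr=value], and the descendant and child combinators. Attribute values containing spaces or quotes are not supported, and XHTML parsed with namespace awareness would need prefixed names.

```java
import java.io.StringReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class CssToXPath {
    private static final Pattern TAG = Pattern.compile("[\\w-]+|\\*");
    // Matches one of: #id, .class, [attr], [attr=value]
    private static final Pattern PART = Pattern.compile(
        "#([\\w-]+)|\\.([\\w-]+)|\\[([\\w-]+)(?:=['\"]?([^'\"\\]]*)['\"]?)?\\]");

    /** Translates a selector such as "div#main ul > li.item" into an XPath expression. */
    public static String toXPath(String css) {
        StringBuilder xpath = new StringBuilder();
        String axis = "//";
        for (String token : css.trim().replaceAll("\\s*>\\s*", " > ").split("\\s+")) {
            if (token.equals(">")) {
                axis = "/";          // child combinator applies to the next step
                continue;
            }
            xpath.append(axis).append(compound(token));
            axis = "//";             // default back to descendant
        }
        return xpath.toString();
    }

    private static String compound(String token) {
        String name = "*";
        int pos = 0;
        Matcher tag = TAG.matcher(token);
        if (tag.lookingAt()) {
            name = tag.group();
            pos = tag.end();
        }
        StringBuilder out = new StringBuilder(name);
        Matcher m = PART.matcher(token);
        m.region(pos, token.length());
        while (m.lookingAt()) {
            if (m.group(1) != null) {
                out.append("[@id='").append(m.group(1)).append("']");
            } else if (m.group(2) != null) {
                out.append("[contains(concat(' ', normalize-space(@class), ' '), ' ")
                   .append(m.group(2)).append(" ')]");
            } else if (m.group(4) != null) {
                out.append("[@").append(m.group(3)).append("='").append(m.group(4)).append("']");
            } else {
                out.append("[@").append(m.group(3)).append("]");
            }
            pos = m.end();
            m.region(pos, token.length());
        }
        if (pos != token.length()) {
            throw new IllegalArgumentException("Unsupported selector: " + token);
        }
        return out.toString();
    }

    /** The one-line call: parse the XHTML string and return the nodes the selector matches. */
    public static NodeList select(String xhtml, String css) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xhtml)));
        return (NodeList) XPathFactory.newInstance().newXPath()
            .evaluate(toXPath(css), doc, XPathConstants.NODESET);
    }
}
```

With this, a check like CssToXPath.select(xhtml, "div#main li.item").getLength() replaces a hand-written XPath. Anything outside the subset throws rather than silently matching the wrong nodes.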
It may be easier to use cQuery.com, an API-based 'Content Query Engine' that extracts content from live websites by using CSS.
You can use it programmatically in your application.