Is there a well-designed, maintained RSS-parsing library for Java?

I know this question has been asked before, but that was several years ago, and of the two answers, Rome and Abdera, the first no longer seems to be maintained (there aren't even any download links on the website, nor can I find documentation). The latter also appears rather complicated, and neither appears up to contemporary standards of Java library design.
Are there any new alternatives out there that are well designed, and well maintained?

Sorry, I do not know of any library, but, that said, seeing as RSS is an XML format you should be able to roll your own using SAX/JAXB/DOM. Which one to use depends on whether you want ease of integration with Java (JAXB) or speed (SAX). There is a middle ground in DOM.
RSS is not a complicated format, so I think you could just develop the features you need as you come across them, and it'll be faster (and the skills you learn more transferable) than exhaustively searching for a library if one cannot be found easily.
Hope this helps.
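To give an idea of what rolling your own with DOM might look like, here is a minimal sketch that pulls the item titles out of an RSS 2.0 document using only the JDK's built-in XML parser. The class name SimpleRssReader and the method itemTitles are made up for illustration, and it assumes the feed uses the plain RSS 2.0 element names (item, title) without namespaces. A real reader would also need to fetch the feed and handle Atom, dates, and malformed input.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of any library.
class SimpleRssReader {

    /** Returns the <title> text of every <item> in an RSS 2.0 document. */
    static List<String> itemTitles(String rssXml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Feeds come from untrusted sources, so refuse DOCTYPE declarations.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(rssXml)));

        List<String> titles = new ArrayList<>();
        NodeList items = doc.getElementsByTagName("item");
        for (int i = 0; i < items.getLength(); i++) {
            Element item = (Element) items.item(i);
            // Only look at <title> elements inside this <item>,
            // so the channel-level <title> is skipped.
            NodeList titleNodes = item.getElementsByTagName("title");
            if (titleNodes.getLength() > 0) {
                titles.add(titleNodes.item(0).getTextContent().trim());
            }
        }
        return titles;
    }

    public static void main(String[] args) throws Exception {
        String sample = "<rss version=\"2.0\"><channel><title>Feed</title>"
                + "<item><title>First post</title></item>"
                + "<item><title>Second post</title></item>"
                + "</channel></rss>";
        for (String title : itemTitles(sample)) {
            System.out.println(title);
        }
    }
}
```

DOM keeps the whole document in memory, which is fine for typical feeds; for very large feeds SAX would be the leaner choice.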

I did find this class, RSSDigester. It might help. I don't really have the time to investigate it right now, sorry.

RSS reading hasn't really needed changing for some time. ROME really is quite nice, and as for fetching it, you can get it from http://download.java.net/maven/2/rome/.

I eventually found HorroRSS, which is exactly what I was hoping for. It's simple, easy to use, and appears robust.

Related

Where can I get good explanations of all thymeleaf out of the box processors?

There are a couple mentioned in the documentation but looking through the api, there are tons more processors
The current JavaDoc is indeed lacking in completeness. I find myself in the same situation, constantly searching for which abstract processor is the best to use, even though the names of the processors are pretty straightforward.
Currently the only way to know exactly what a processor does and how to use it is to open the source code.

Java API interface

Ever since I started playing around with Scala, I have had one big question concerning the Java API: why does Oracle keep the same old HTML page with "frameset" tags and no search function at all? It looks like they haven't made it to the Web 2.0...
The Scala API documentation, on the other hand, while not the best website in web history, is several orders of magnitude more usable.
Anyways, if anybody knows why that is and, more importantly, if there exists a Java API documentation with a better interface, please let me know!
Recently, for Java 7, JavaDoc was improved so it could use custom CSS. Here are the first results: http://download.java.net/jdk7/docs/api/. The work continues and I think we'll see more when new updates come out. I do agree that ScalaDoc is superior, but they didn't have to deal with 15 years of legacy.
Javadoc produces its output in that format, and it's published at that address. I guess no one really saw the need for improvement, but now that you mention it, it makes for an interesting side project. I googled around to see if there was any "better" interface, but no luck.
You could run javadoc -h to see what extra options are available if you want to re-generate the javadocs. Some interesting ones are to provide custom header/footer and linking to the source, but nothing to the effect that you are asking.
Those HTML pages were made using the Javadoc tools, a standard way to build documentation in Java.
I don't know if there are other webpages with a better formatting of the API, but if it helps you with anything, and you are using an IDE and the SDK, you can see the source code for most of the files there.
JavaDoc was designed to be the lowest common denominator. Virtually any web browser can display it, even without JavaScript support.
If you are looking for quicker access and search capabilities, you can access JavaDoc from within an IDE such as Eclipse.

Are there any tools to isolate the content of a webpage?

I'm working on a school project in which we would like to analyze the content of webpages. We don't, however, want to deal with things like nav bars and comments. If we were looking at a specific website we could make a parser to filter that sort of extraneous stuff out specifically for that site, but we are hoping to work on arbitrary sites that we may not have ever encountered before.
I feel like it's a bit much to hope for, so I won't be surprised if nothing like this exists already, but does anyone know of a tool that can do that sort of content isolation on arbitrary websites? I've had a bit of luck diffing pages with others from the same site, but it's imperfect and leaves comments and such.
I am working in Java, but would welcome anything open source in any language that I can use for ideas.
I'm a little late to this one (especially for a school project), but if anyone finds this at some future point, the following may be helpful.
I stumbled across a Java library to do exactly this. Performance, in my simple tests, is similar to Readability.
http://code.google.com/p/boilerpipe/
You could try an unofficial API of arc90's Readability.
Basically, what Readability does is extract the content of a webpage and present it to you as a nicely formatted article. Nav bars, comments, and all the other stuff that surrounds content on a webpage are gone.
I'm also a bit late to this conversation, but ...
The Java Boilerpipe extractors are probably what you want (probably ArticleSentencesExtractor), although there is at least one port of arc90's Readability to Java on GitHub.
If you want to build a poor man's Boilerpipe, you might try diffing two pages from the same site (assuming they use the same template, you will likely get an interesting result).
The main difference between Boilerpipe, Readability and a diff-based hack is that Boilerpipe will strip out all HTML but preserve some structure.
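For what it's worth, the diff-based hack can be sketched in a few lines of plain Java. This is only a rough illustration of the idea, not how Boilerpipe or Readability work: it assumes both pages come from the same template and that the template markup is identical line for line, and the names TemplateDiff and uniqueLines are made up for the example.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper class illustrating the "diff two pages" idea.
class TemplateDiff {

    /**
     * Returns the trimmed, non-empty lines of page that do not occur
     * anywhere in sibling (another page from the same site).
     */
    static List<String> uniqueLines(String page, String sibling) {
        // Lines shared with the sibling page are assumed to be template.
        Set<String> shared = new HashSet<>();
        for (String line : sibling.split("\\R")) {
            shared.add(line.trim());
        }
        List<String> unique = new ArrayList<>();
        for (String line : page.split("\\R")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty() && !shared.contains(trimmed)) {
                unique.add(trimmed);
            }
        }
        return unique;
    }

    public static void main(String[] args) {
        String pageA = "<nav>Home | About</nav>\n<p>Article one text</p>\n<footer>Site footer</footer>";
        String pageB = "<nav>Home | About</nav>\n<p>Article two text</p>\n<footer>Site footer</footer>";
        for (String line : uniqueLines(pageA, pageB)) {
            System.out.println(line);
        }
    }
}
```

As the question already notes, this leaves in anything that differs between pages, such as comments, so it is a starting point rather than a real extractor.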
I doubt that anything exists that would do what you want. Without some sort of semantic markup it is next to impossible to distinguish "real" content from the other stuff. This is a task that requires real intelligence.
There are of course good tools for parsing HTML of varying degrees of correctness, and it is often possible to cobble together some pattern-based solution for dealing with pages on a particular site ... assuming that there are common structures / patterns to be elicited.

How can one convince a team to use a new technology (LINQ, MVC, etc.)?

Obviously, it's easier to do with some developers, but I'm sure many of us are on teams that prefer the status quo.
You know the type. You see some benefit in a piece of new technology and they prefer the tried and true methods.
Try, for example, explaining to a DBA/C# programmer the advantages of using LINQ (not necessarily LINQ to SQL, just LINQ in general).
Or, when a project requirement is to be cross-platform, instead of thinking about how one can run Windows on a Mac through a VM, try introducing the idea of using the relatively new Silverlight or creating it in Java (as an option to look into).
I know most people don't like to be out of their comfort level, so it takes a bit of convincing, and not ALL new technology makes business sense... but how have you convinced your team to look at a new technology?
What technologies have you successfully introduced to your workplace?
What technologies do you think are hardest to introduce? ( I'm thinking paradigm-shifting ones, like MVC from WebForms... or new languages )
What strategies do you employ to make these new technologies appealing?
Know the technology well before pitching it. You're going to get questions like "but how can we make it do X?", and you want to be able to give at least a general answer.
Try not to be a religious zealot. Acknowledging that the new technology is not perfect, that it's just another tool in the toolbox, goes a long way towards credibility.
Give a well-prepared live demo to show what it can do. For example, a friend of mine built a simple blog in Ruby on Rails in half an hour, in front of a live audience. I want to stress the word "well-prepared"; if things keep breaking along the way, or you don't fully understand what you're doing, or you are unable to answer basic questions, you'll hurt your cause rather than help it.
When it comes to coding practices my favorite is to quite simply use examples. I will take a few hours and edit our code base to use the new technique in place of the previous pattern. Then send a shelveset or changelist around to the rest of the developer list displaying the difference. Or just have a meeting to talk about the difference.
Showing examples in real production code really helps other developers see the advantages.
I've successfully introduced LINQ to my company and it has helped quite a bit.
What worked for me? Show and tell. Our previous technology was database programming with C, which is quite the mess. Our lead developer did about 3000 lines of code to fill a dataset, and I did it in a 10th of that with LINQ/C#.
Once I broke down what I did and he saw how powerful it was, he was convinced it was time to upgrade.
The advice from people who convinced the management to consider using F# goes something like this:
Implement the most important bits of the next key project of the company in F# in your free time and then show others what benefits it has, how quickly you were able to implement it and how easy it is to adapt the solution to changing requirements.
I think this is quite an effective way: when people actually see the productivity (of any new technology), it is much easier to convince them that it is worth learning.
It's best to lead by example. Complete a successful project using the new tool and wait for developers to ask how you did it.
I managed to convince the team I'm part of to switch from CVS to Mercurial. Can you believe we were still using CVS? Neither could I when I started.
I became something of a preacher, a royal pain in the butt. Every time CVS screwed up or caused some sort of discomfort (being brutally slow, for example), I gave a little speech about how much better it could be.
Soon enough they accepted the possibility that there were alternatives (none of them actually knew there were alternatives to CVS!) and began to say things like "if there indeed are alternatives, anything must be better than this".
That's when I made my move and simply ran some scripts converting the CVS repository into a Mercurial one and uploaded it to the company server. Once they saw it in action, they were sold.
Not that I planned anything during this little migration black ops, but in retrospect I would give the following advice to anyone attempting something similar:
Let people know there are (better) alternatives; it is entirely possible to work outside your comfort zone.
Lead by example: if you want something done, do it yourself. Show the alternative in action. No one is going to make the jump unless you jump first, especially not if they are already hesitating.
Show them how it solves a common problem. Pick out some problem that appears regularly for them and show them the solution. That will usually at least get them thinking about it.
Stand the two technologies next to one another. It's safe to assume that advancements have been made and that what you are bringing to the table is better suited to the job at hand.
Put the raw results in front of them and let them decide for themselves!
I work for a data bureau, and until recently the company was hooked on MS Access, which was cumbersome and unfit for the job. After some serious convincing and showing the power of SQL in comparison to Access, it's now the weapon of choice.
And it took standing the two techs side by side and allowing the guys up top to see for themselves, the time saved did make business sense!
You'll need to show why it's a BETTER technology (or at least better at something) than the current tool/method in use, and probably significantly so. Otherwise, why go through the effort of learning something new?
Otherwise, convince the boss and then get a mandate ... (though I don't really recommend that if you can't get at least half the team on board).

Are there some good and modern alternatives to Javadoc? [closed]

Let's face it: You don't need to be a designer to see that default Javadoc looks ugly.
There are some resources on the web which offer re-styled Javadoc. But the default behaviour represents the product and should be reasonably good-looking.
Another problem is the fact that the usability of Javadoc is not up-to-date compared to other similar resources.
Especially huge projects are hard to navigate using Firefox's quick search.
Practical question:
Are there any standalone (desktop) applications which are able to browse existing Javadoc in a more usable way than a browser would?
I'm thinking about something like Mono's documentation browser.
Theoretical question:
Does anyone know, if there some plans to evolve Javadoc, in a somehow-standardized way?
EDIT: A useful link to Sun's wiki on this topic.
I have created a Markdown (java) Doclet which will take source comments in Markdown formatted text and create the same HTML Javadocs.
The new doclet also does some restyling on the text, but the HTML generated is not changed at this stage.
That goes some way to address the HTML-in-Java-comments issue, which is probably the biggest usability problem with current Javadoc.
I don't think that the concepts of Javadoc are outdated. As far as I can see, these concepts are rooted years ago in a product named doxygen, which is still available for other languages (e.g. Objective-C, where it is heavily used). Even this has its predecessors: have a look at the programming environment used by Donald Knuth to create TeX (literate programming).
Nevertheless, it is an intriguing idea to have a single source for program code and documentation.
Besides that, the presentation of the documentation can be customized to your special needs using a plug-in system supported by the JavaDoc tool. You might provide a plug-in (as we do) that publishes directly into a database which is directly accessible via the web. Using collaboration, anyone can provide additional comments or clarifications to the documentation that might find their way back into the original source.
Javadoc is the best source code auto-documentation generation system I've ever seen. Large part of that is that it's so simple - I can browse javadocs even with my 5 year old cell phone if I want to! While I agree that a bit of a facelift could be in order and especially JDK is a pain to browse through, I wouldn't dare reinventing the wheel entirely because what we currently have is a RESTful, easy to use solution for its purpose which works just about anywhere.
I recently got a mail forwarded that Sun is working on modernizing the Javadoc HTML output. From said mail:
We are proposing improvements to javadoc/doclet for JDK7. The
project wiki page is located at
http://wikis.sun.com/display/Javadoc/Home. As a part of the proposed
improvements, the UI of the javadoc output will be revamped. The new
design screenshots are uploaded to the project wiki. The javadoc output
markup will be modified to be valid HTML and WCAG 2.0 compliant.
So there is definitely still work going on there, even if somewhat late. However, in my eyes one of the biggest drawbacks of Javadoc is its very close coupling with HTML. Many classes have Javadoc which includes literal HTML and relies on the output being HTML, too. Unfortunate, but this won't change anytime soon, I think. Still, this means that developers are free to include whatever they want in HTML there, which might as well be invalid, non-well-formed, etc. So adapting the output from the javadoc tool is only one part of this; the other part won't and can't change and thus remains.
As for browsing documentation I also find the HTML documentation a little unwieldy. I usually use the Javadoc view in Eclipse. It has drawbacks as well (slow and you can't really search) but it's Good Enough™ for most things.
Personally I still find Javadoc to be very useful. Especially since it is standardized. I don't know of any major documentation style that I find easier to navigate (that might very well be subjective, but I personally find MSDN horrible to use, for example).
For the search: Use the Javadoc Search Frame, it makes using Javadoc of all kinds a lot easier. It's available as a Userscript for Firefox and as a Google Chrome Extension.
To answer your practical question, I googled and asked friends and came up with these: Forrestdoc, doclet and doxygen.
As for the second question, I would say that yes, it's not very "Web-oh-twoeye", but at least it's guaranteed to work in an offline environment, and it's small enough to ship along with your API. I despise the use of frames, but they work rather well for Javadoc. I have not seen any plans to change it.
Eclipse has some support for javadoc as far as reading, interpreting and generating it goes.
You might want to phrase that in a less aggressive and overbearing manner. Most people don't care what a technical resource looks like, and "It's not Web 2.0 enough!" sounds like vapid marketroidspeak.
And what exactly would you consider "more usable"? Personally, I would definitely like a full-text search and a better usage browser, and AJAX could probably help with those.
Well, the nice thing about JavaDoc is that it's the opposite of outdated - it's arbitrarily extensible. Why don't you go ahead and write a doclet that produces the kind of API doc you want?
Why nobody else has done that so far (which apparently is the case) is anyone's guess - maybe nobody else feels as strongly about it as you.
There's a DocBook doclet. DocBook is a richer document type than (X)HTML and is better for describing technical content. From DocBook source you can generate all sorts of different output formats.
I personally would like a more readable "comment documentation" standard than the HTML (and hence tag-wieldy) JavaDoc.
For example, Markdown, as used here, would be excellent: human-readable in the source, nicely formatted external to the source.
With the current JavaDoc, I imagine many people use JavaDoc comments, but don't actually document to the extent they could. I'm sure everyone has browsed an API's online JavaDoc that has been non-documented or barely-documented, and thus far harder to use than it should be.
This isn't helped by code reformatters (e.g., within Eclipse, or maybe upon source commit) that totally destroy any readable structure you might have put within a JavaDoc comment (e.g., a list of items), turning it into one big blob of text unless you literally use two carriage returns where you wish to use one.
Does anyone know, if there some plans to evolve Javadoc, in a somehow-standardized way?
The corresponding JSR (JSR 260), which specifies enhancements to Javadoc, has been voted out of JDK 7 (for now). An overview of what was planned (from this site):
Upgrade Javadoc to provide a richer set of tags to allow more structured presentation of Javadoc documentation. This JSR covers: categorization of methods and fields, semantical index of classes and packages, distinction of static, factory, deprecated methods from ordinary methods, distinction of property accessors, combining and splitting information into views, embedding of examples and common use-cases, and more.
The overall outlook for JDK 7 is pretty grim.
JavaDoc is itself extremely flexible because you can replace the standard doclet with a custom doclet to provide something that meets your project's specific needs.
On the project I've been working on, we created an HTML/XML-based documentation system (using client-side XSLT 2.0 on JS) for our product with JavaDoc fully integrated. For this, a custom doclet was used to produce JavaDoc data in XML; it used tagsoup to ensure that even HTML markup within code comments was well formed.
With this, we were able to deliver an interactive user experience using a single-page app (similar to a desktop tool), but all from within the browser - without any server-side code/infrastructure. The viewer included standard features such as search, tree navigation etc.
Here's a link to a sample entry point in the rather vast documentation:
JavaDoc viewer sample
A smart searchable Javadoc viewer:
Many times I have faced the problem of browsing JavaDoc. I was looking for something like the Android docs' search option, and at last I found something like that. If you use Firefox, the solution is here.
Install the Greasemonkey plugin, which lets you customize the way a web page is displayed. (We need to customize any JavaDoc page so that we can search on class names.)
https://addons.mozilla.org/en-US/firefox/addon/greasemonkey/
For Greasemonkey to work, we need a user script that does the customization. Greasemonkey can download it automatically. Install the user script from JavaDoc search frame or JavaDoc incremental search.
This works great for me.
