Single documentation for mixed (Scala/Java) project? - java

For a project with modules in Scala and Java (side by side), how to combine scaladoc with javadoc to provide a single view of the documentation for the project?
(this could be using maven, or ant, or sbt, more a general question).
Any thoughts and experiences appreciated.

With Scala 2.8's new scaladoc that will replace the one used with Scala 2.7, the differences will be even more striking. However, there was a request that a function be provided that translated scaladoc into javadoc format, for use by IDEs when displaying help.
If this function becomes available, then something that generates javadocs from scaladocs would be theoretically feasible.
But for any of that to become true, the people who have interest in such a thing would have to speak up at the appropriate fora. And, of course, if they are too small a group, it is likely nothing happens unless they do it for themselves.

What's the advantage of Scaladoc being different from Javadoc? There is a huge number of tools for Javadoc and almost nothing for Scaladoc. The mainstream IDEs (Eclipse, NetBeans, IDEA - real-world enterprise development, not academic research) know nothing about Scaladoc. It feels like being in Siberia: isolated.

Scaladoc and Javadoc are very different, with different formats. They are just two different animals and I don't think it makes sense to combine them. So, AFAIK, Maven doesn't offer any support for combining them (which is not surprising); just generate both of them separately.

Related

Java code generation

I am looking for a nice (Java) code generation engine.
I have found cglib, but it is very poorly documented, and I am not sure whether it can generate actual Java classes (files) or only dynamic classes. If I am wrong, maybe someone has a link to an example.
Roman
Have a look at codemodel, used with success for my projects.
I haven't really tried it, but you may want to take a look at another Java code generation framework called Javassist, which also has a pretty thorough tutorial. Also, Hibernate changed its code generation framework from cglib to Javassist. A quote explaining why:
The simple fact of the matter is that development on CGLIB has largely stopped. It happens. Developers for whatever reason (the reasons are their own) move on to new priorities.
Source
I just released cgV19 here: https://github.com/carstenSpraener/cgV19. It's based on a code generator I wrote between 2002 and 2006, which is still in production use. cgV19 is a reimplementation with lessons learned. It has:
Support for gradle
Uses Groovy as a template language
a modular "cartridge" system to add several generator for different aspects
small footprint
Please try it out; feedback would be very welcome.
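To make the distinction in the question concrete (source files on disk versus classes created dynamically at runtime), here is a minimal sketch that writes a .java file using only the JDK. It does not use cglib, CodeModel, or Javassist; the class SourceFileGenerator, its methods, and the "generated" output directory are all made up for illustration. It assumes Java 11 or later.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of cglib, CodeModel, or Javassist.
final class SourceFileGenerator {

    // Renders the source of a simple class: one private field and one getter per entry.
    // Keys are field names, values are Java type names.
    static String render(String className, Map<String, String> fields) {
        StringBuilder src = new StringBuilder();
        src.append("public class ").append(className).append(" {\n");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            src.append("    private ").append(field.getValue())
               .append(' ').append(field.getKey()).append(";\n");
        }
        for (Map.Entry<String, String> field : fields.entrySet()) {
            String name = field.getKey();
            String getter = "get" + Character.toUpperCase(name.charAt(0)) + name.substring(1);
            src.append("    public ").append(field.getValue()).append(' ').append(getter)
               .append("() { return ").append(name).append("; }\n");
        }
        return src.append("}\n").toString();
    }

    // Writes <className>.java into the given directory and returns the file path.
    static Path write(Path dir, String className, Map<String, String> fields) {
        try {
            Files.createDirectories(dir);
            return Files.writeString(dir.resolve(className + ".java"), render(className, fields));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("name", "String");
        fields.put("age", "int");
        System.out.println("Wrote " + write(Path.of("generated"), "Person", fields));
    }
}
```

String concatenation like this gets unwieldy quickly, which is the gap that template-based or model-based generators such as the ones mentioned above are meant to fill.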

How to refactor thousands of lines of Java code? Is there any tool available?

In our application we have two or three classes which contain the entire Java Swing application logic. These two or three classes contain around 7k lines of code.
Now I have been assigned the task to refactor this Java code.
How do I start? Is there any tool available that will do the refactoring or at least guide us?
I'd recommend Eclipse - the brilliant Java IDE for the editing and refactoring. It has several tools for refactoring. An excellent tutorial on how to do it with Eclipse is located at:
http://www.cs.umanitoba.ca/~eclipse/13-Refactoring.pdf
There's a brilliant article on the power of refactoring with Eclipse, if you're not yet convinced, at:
http://www.eclipse.org/articles/article.php?file=Article-Unleashing-the-Power-of-Refactoring/index.html
And finally another article on how to refactor in Eclipse, including techniques and tools, is available at:
http://www.ibm.com/developerworks/opensource/library/os-ecref/
There's also another stackoverflow question on strategies for refactoring Java code that you may be interested in:
https://stackoverflow.com/questions/128498/what-are-the-best-code-refactoring-strategies
Hope that helps, good luck!
I assume that you are trying to break up these large classes into smaller ones. The most common way to do this is with the Extract Class refactoring. It just happens that this is a major topic in my PhD thesis work.
One of the hard parts is deciding what goes into the new classes. There are two publicly available tools that I know of that help - ExtC (my tool) and JDeodorant. Both are Eclipse plug-ins, and I would classify both as being prototypes. If you want to try to use my tool, I'll be glad to help.
Once you decide what should go into the new class, you have to do the actual work of separating the class into others. Eclipse's Extract Class refactoring is misnamed and isn't really helpful. IntelliJ IDEA's version is much better, but still has some bugs. JDeodorant can also perform the split, but it also has some bugs.
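For readers who have not seen Extract Class before, here is a minimal before/after sketch of the idea. All class and field names are hypothetical, and it is not the output of ExtC, JDeodorant, or any IDE: the fields that belong together, and the behaviour that touches only them, move into a class of their own.

```java
// All class and field names here are hypothetical.
final class ExtractClassSketch {

    // Before: one class owns both the customer identity and the address details.
    static final class CustomerBefore {
        private final String name;
        private final String street;
        private final String city;
        private final String zip;

        CustomerBefore(String name, String street, String city, String zip) {
            this.name = name;
            this.street = street;
            this.city = city;
            this.zip = zip;
        }

        String mailingLabel() {
            return name + "\n" + street + "\n" + zip + " " + city;
        }
    }

    // After: the address fields and the formatting that uses only them live in a new class.
    static final class Address {
        private final String street;
        private final String city;
        private final String zip;

        Address(String street, String city, String zip) {
            this.street = street;
            this.city = city;
            this.zip = zip;
        }

        String format() {
            return street + "\n" + zip + " " + city;
        }
    }

    // The original class keeps its public behaviour but delegates to the extracted class.
    static final class Customer {
        private final String name;
        private final Address address;

        Customer(String name, Address address) {
            this.name = name;
            this.address = address;
        }

        String mailingLabel() {
            return name + "\n" + address.format();
        }
    }

    public static void main(String[] args) {
        CustomerBefore before = new CustomerBefore("Ada", "1 Main St", "Springfield", "12345");
        Customer after = new Customer("Ada", new Address("1 Main St", "Springfield", "12345"));
        // The refactoring must not change observable behaviour.
        System.out.println(before.mailingLabel().equals(after.mailingLabel()));
    }
}
```

In a 7k-line Swing class the hard part is the one described above: deciding which fields and methods form such a cluster in the first place.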
IntelliJ has all the smarts for understanding Java code and provides excellent refactorings. And now there is a free and open source version.
Eclipse has some built-in refactoring tools. You can refactor method signatures, extract interfaces and classes, pull methods up and down in the hierarchy tree, move packages ... and all that with just a couple of clicks.
Also, you could start with Martin Fowler's book "Refactoring: Improving the Design of Existing Code".
As refactoring code relies primarily on the developer (assisted by tooling), your IDE is a very important tool when it comes to refactoring.
Both Eclipse and IntelliJ IDEA have plenty of refactoring support.
For an overview, check out:
http://www.jetbrains.com/idea/features/refactoring.html
http://help.eclipse.org/galileo/index.jsp?topic=/org.eclipse.jdt.doc.user/concepts/concept-refactoring.htm
I have created my own refactoring tool that tries to group together methods that use the same set of variables. It is very much an early prototype. It is only available as a Windows Eclipse plugin.
Variable Usage Eclipse Plugin
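The idea of grouping methods by the variables they use can be sketched in a few lines. This is only an illustration of the general technique, not the plugin's actual algorithm (which is not described here): it assumes the method-to-fields mapping has already been extracted somehow, and it groups only methods whose field sets match exactly. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch; the input map would normally come from parsing the class.
final class VariableUsageGrouping {

    // Groups method names by the exact set of fields each method uses.
    static Map<Set<String>, List<String>> groupByFieldsUsed(Map<String, Set<String>> fieldsUsedByMethod) {
        Map<Set<String>, List<String>> groups = new LinkedHashMap<>();
        for (Map.Entry<String, Set<String>> entry : fieldsUsedByMethod.entrySet()) {
            // A sorted copy gives every group a stable, readable key.
            Set<String> key = new TreeSet<>(entry.getValue());
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(entry.getKey());
        }
        return groups;
    }

    // Made-up sample data: method name -> fields that method reads or writes.
    static Map<String, Set<String>> sampleUsage() {
        Map<String, Set<String>> usage = new LinkedHashMap<>();
        usage.put("paint", Set.of("width", "height"));
        usage.put("resize", Set.of("height", "width"));
        usage.put("save", Set.of("file"));
        return usage;
    }

    public static void main(String[] args) {
        groupByFieldsUsed(sampleUsage())
            .forEach((fields, methods) -> System.out.println(fields + " -> " + methods));
    }
}
```

Each resulting group is a candidate for a class of its own; a real tool would presumably also have to handle methods whose field sets merely overlap.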

Straight Java/Groovy versus ETL tool (Talend/etc) - what libraries would you use?

Assume you have a small project which on the surface looks like a good match for an ETL tool like Talend.
But assume further, that you have never used Talend and furthermore, you do not trust "visual programming" tools in general and would rather code everything the old fashioned way (text on a nice IDE!) with the help of an appropriate language & support libraries.
What are some language patterns & support libraries that could help you stay away from the ETL tool temptation/trap?
It depends on whether the deliverable is the processor or the output itself. If you just need to deliver the output, you don't need to maintain the code. If the code needs to be maintained then will it be you maintaining it or somebody else?
If somebody else needs to maintain it, I'd use Java or give them Talend.
If it's throwaway code, I'd use what will be easier or fun to program with.
If you need to maintain it and the processing is complex, I'd use Scala. It has:
some libraries to interact with databases
xml literals
parser combinators
interesting features on its collection packages (map, filter, groupBy, partition, ...)
and of course any other existing Java libraries.
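The collection operations in that list (map, filter, groupBy, partition) have rough counterparts in Java's own stream API, so here is a small sketch of a transform step written that way. It is in Java rather than Scala, and the Sale type, the two-column "region,amount" line format, and the method names are all assumptions made for the example.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical mini-ETL: parse "region,amount" lines, then aggregate and split them.
final class StreamEtlSketch {

    static final class Sale {
        final String region;
        final double amount;

        Sale(String region, double amount) {
            this.region = region;
            this.amount = amount;
        }

        String region() { return region; }
        double amount() { return amount; }
    }

    // Extract: skip blank lines (filter) and turn each remaining line into a Sale (map).
    static List<Sale> parse(List<String> lines) {
        return lines.stream()
            .map(String::trim)
            .filter(line -> !line.isEmpty())
            .map(line -> line.split(","))
            .map(cols -> new Sale(cols[0].trim(), Double.parseDouble(cols[1].trim())))
            .collect(Collectors.toList());
    }

    // Transform: total amount per region (the groupBy step).
    static Map<String, Double> totalByRegion(List<Sale> sales) {
        return sales.stream().collect(
            Collectors.groupingBy(Sale::region, TreeMap::new, Collectors.summingDouble(Sale::amount)));
    }

    // Transform: split into sales at or above the threshold (true) and below it (false).
    static Map<Boolean, List<Sale>> splitByThreshold(List<Sale> sales, double threshold) {
        return sales.stream().collect(Collectors.partitioningBy(s -> s.amount() >= threshold));
    }

    public static void main(String[] args) {
        List<Sale> sales = parse(List.of("north,10.5", "", "south,4", "north,2.5"));
        System.out.println(totalByRegion(sales));
        System.out.println(splitByThreshold(sales, 5.0).get(true).size());
    }
}
```

The load step is left out on purpose; it would be whatever database or file library the project already uses.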
I used to think that "visual programming" is something for people who can't program. Then I was exposed to Talend in a project, and I realized that this type of tool is exactly right for the job, when it comes to moving data from A to B, and transforming it in the process. It's component-oriented software design, by a more academic label.
I still consider myself a decent programmer who can do anything, and then some, with a text editor and a shell prompt. But I've become a big fan of Talend as well.
Full disclosure: I now work for the company :-)
Check out DataExpress. It's a Scala-based, cross-database ETL toolkit.
I think this is a pretty good match for Rails-inspired frameworks, such as Grails on Groovy or Lift on Scala.
Depending on the size of the DB schema, you could map everything real quick in Hibernate and just use the resulting object model to do your work (depending on what you want the ETL tool for anyways)

Interfacing R to Java

Is rJava the only way to connect R to Java? I am asking because there is a disclaimer at the end of the web page:
This interface uses Java reflection API to find the correct method so it is much slower and may not be right (works for simple examples but may not for more complex ones). For now its use is discouraged in programs as it may change in the future.
This is slightly concerning. How do you address this issue? I know that RWeka has a self-contained interface, so I may look into that package, but maybe many R users have already gone through the pains.
It is not the only one as the Omegahat project also has the RSJava package. But as many of the other brilliant innovations from Omegahat (which practically speaking is really just Duncan Temple Lang), this one may not build as easily or reliably.
The rJava package, on the other hand, is used by almost thirty other packages:
CADStat, Containers, Deducer, JGR, RFreak, RImageJ, RJDBC, RLadyBug, aCGH.Spline, ant, arulesNBMiner, colbycol, cshapes, dynGraph, farmR, gWidgetsrJava, glmulti, helloJavaWorld, iplots, rSymPy, rcdk, rcdklibs, scagnostics, spcosa, RKEA, RWeka, Snowball, openNLP, wordnet
which I take as quite the endorsement.
I think that disclaimer only applies if you use the $ operator to access your Java objects. As long as you stick with the .jcall function you won't incur the overhead.
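To show on the Java side why a lookup "by reflection to find the correct method" can be both slower and wrong, here is a rough sketch. It is not rJava's implementation; the class and both helper methods are hypothetical. The first helper searches by name and argument count on every call, while the second is told the exact signature by the caller, which is roughly the difference between guessing and stating the method you want.

```java
import java.lang.reflect.Method;

// Hypothetical illustration; this is not how rJava is implemented.
final class ReflectiveCallSketch {

    // Guessing: scan all public methods on every call and take the first one
    // whose name and parameter count match. With overloads this can pick the wrong one.
    static Object callByName(Object target, String name, Object... args) {
        try {
            for (Method m : target.getClass().getMethods()) {
                if (m.getName().equals(name) && m.getParameterCount() == args.length) {
                    return m.invoke(target, args);
                }
            }
            throw new IllegalArgumentException("no method named " + name);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Stating: the caller supplies the exact parameter types, so there is no search
    // and no ambiguity between overloads.
    static Object callExact(Object target, String name, Class<?>[] types, Object... args) {
        try {
            return target.getClass().getMethod(name, types).invoke(target, args);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(callByName("hello", "length"));
        System.out.println(callExact("hello", "substring", new Class<?>[] {int.class}, 1));
    }
}
```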
In terms of experience using rJava, I've found it works exactly as advertised and for my package (farmR) it hasn't caused any performance problems. I don't make a huge number of calls into Java though, and I haven't used any of the Java GUI toolkits.
I am an RWeka user, and I can tell you it is amazingly quick; it outperforms Weka alone while letting you use its functions in the R environment. I think that the R package has a very special way of integrating Java libraries into the language; nevertheless, these libraries need to be prepared to allow this. To do a proper integration, you will need to do a significant amount of research to see how to make things fit properly. I recommend reading the documentation that comes with R, which details the best practices for writing new libraries.

Are there compelling reasons not to use Groovy?

I'm developing a LoB application in Java after a long absence from the platform (having spent the last 8 years or so entrenched in Fortran, C, a smidgin of C++ and latterly .NET).
Java, the language, is not much changed from how I remember it. I like its strengths and I can work around its weaknesses. The platform has grown, and deciding among the myriad of different frameworks, which appear to do much the same thing as one another, is a different story, but that can wait for another day. All in all, I'm comfortable with Java. However, over the last couple of weeks I've become enamoured with Groovy, purely from a selfish point of view, but not just because it makes development against the JVM a more succinct and entertaining (and, well, "groovy") proposition than Java (the language).
What strikes me most about Groovy is its inherent maintainability. We all (I hope!) strive to write well documented, easy to understand code. However, sometimes the languages we use themselves defeat us. An example: in 2001 I wrote a library in C to translate EDIFACT EDI messages into ANSI X12 messages. This is not a particularly complicated process, if slightly involved, and I thought at the time I had documented the code properly - and I probably had - but some six years later when I revisited the project (and after becoming acclimatised to C#) I found myself lost in so much C boilerplate (mallocs, pointers, etc. etc.) that it took three days of thoughtful analysis before I finally understood what I'd been doing six years previously.
This evening I've written about 2000 lines of Java (it is the day of rest, after all!). I've documented as best as I know how, but of those 2000 lines of Java a significant proportion is Java boilerplate.
This is where I see Groovy and other dynamic languages winning through: maintainability and later comprehension. Groovy lets you concentrate on your intent without getting bogged down in the platform-specific implementation; it's almost, but not quite, self-documenting. I see this as being a huge boon to me when I revisit my current project (which I'll port to Groovy asap) in several years' time, and to my successors who will inherit it and carry on the good work.
So, are there any reasons not to use Groovy?
There are two reasons I can think of not to use Groovy (or Jython, or JRuby):
If you really, truly need performance
If you will miss static type checking
Those are both big ifs. Performance is probably less of a factor in most apps than people think, and static type checking is a religious issue. That said, one strength of all of these languages is their ability to mix and match with native Java code. Best of both worlds and all that.
Since I'm not responsible for your business, I say "Go for it".
If you use Groovy, you're basically throwing away useful information about types. This leaves your code "groovy": nice and concise.
Bird b
becomes
def b
Plus you get to play with all the meta-class stuff and dynamic method calls which are a torture in Java.
However -- and yes I have tried IntelliJ, Netbeans and Eclipse extensively -- serious automatic refactoring is not possible in Groovy. It's not IntelliJ's fault: the type information just isn't there. Developers will say, "but if you have unit tests for every single code path (hmmmm), then you can refactor more easily." But don't believe the hype: adding more code (unit tests) will add to the safety of massive refactoring, but they don't make the work easier. Now you have to hand fix the original code and the unit tests.
So this means that you don't refactor as often in Groovy, especially when a project is mature. While your code will be concise and easy to read, it will not be as brilliant as code that has been automatically refactored daily, hourly and weekly.
When you realize that a concept represented by a class in Java is no longer necessary, you can just delete it. In Eclipse or Netbeans or whatever, your project hierarchy lights up like a Christmas tree, telling you exactly what you've screwed up with this change. def thing tells the compiler (and therefore your IDE) nothing about how a variable will be used, whether the method exists, etc. And IDEs can only do so much guessing.
In the end, Java code is filled with "boilerplate," but it's been kneaded into its final form after many refactorings. And to me, that's the only way to get high-quality, readable code for future programmers, including when those future programmers are you-in-the-future.
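The point about def telling the compiler nothing can be imitated in plain Java. The sketch below uses Java reflection as a stand-in for dynamic dispatch (it is not how Groovy works internally), and the Bird class and helper names are hypothetical. The typed call is checked by the compiler; the untyped call carries the method name as a string, so a typo only shows up when the code runs.

```java
import java.lang.reflect.Method;

// Hypothetical illustration: reflection stands in for a dynamically typed call.
final class DynamicCallSketch {

    static final class Bird {
        public String fly() {
            return "flap";
        }
    }

    // Typed: misspelling fly() here would be a compile error, and an IDE can
    // find and rename every caller.
    static String typedCall(Bird b) {
        return b.fly();
    }

    // Untyped: the compiler only sees an Object and a string, so a wrong
    // method name is discovered at runtime.
    static String untypedCall(Object thing, String methodName) {
        try {
            Method m = thing.getClass().getMethod(methodName);
            return String.valueOf(m.invoke(thing));
        } catch (NoSuchMethodException e) {
            return "no such method: " + methodName;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Bird b = new Bird();
        System.out.println(typedCall(b));
        System.out.println(untypedCall(b, "fly"));
        System.out.println(untypedCall(b, "flyy")); // typo, only noticed when this line runs
    }
}
```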
Two reasons why Scala might be a compelling alternative to Groovy:
Performance on par with Java
Static typing without clutter
One of the biggest things you lose when you use dynamic languages, especially in a large codebase, is the ability to use an IDE to refactor. Languages that allow dynamically adding code to objects simply can't be parsed by today's IDEs to allow the kind of easy refactoring methods you can get from Eclipse, etc. for Java, C++, etc.
It's not really a case of "Dynamic languages are better than Static". Use what's best for you. The really cool thing about Groovy in particular is you can mix and match Java and Groovy in the same project, and it all runs on the VM. Yes, Scala is another example.
I think the biggest issue is the lack of IDE support compared to Java; however, the plugins for Eclipse and NetBeans are getting better all the time. Also, if I remember correctly, Groovy does not support anonymous inner classes, if you really need them for some reason. I would personally choose Groovy anytime though.
