I'm separating a module from a Java project (essentially creating a separate Java project for the module). I've spent a lot of time figuring out the class dependencies. What strategy/tool can I use to help speed up the process?
For example, if I know for sure that I need to extract class A into the new project, how would I quickly identify all the classes on which class A depends for successful compilation?
Approaches tried:
At the moment, I'm going by repetitive cycles of: add required classes to new project -> compile -> gather necessary classes from original project based on compiler error message -> repeat.
Generating UML Class Diagrams is not working out as most UML reverse-engineering tools seem to have some limitation or the other when it comes to identifying inheritance, associations, etc. I already tried ArgoUML.
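The add -> compile -> repeat cycle described above is effectively a hand-computed transitive closure over "class X needs class Y". As a minimal sketch, assuming you already have the direct dependencies of each class from some tool (the map below is hypothetical data, and DependencyClosure is an illustrative name, not part of any IDE), the full set needed by class A could be collected like this:

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

final class DependencyClosure {

    /**
     * Returns the root class plus every class reachable from it through
     * the direct-dependency map. Cycles are handled because a class is
     * only expanded the first time it is seen.
     */
    static Set<String> requiredClasses(String root, Map<String, Set<String>> directDeps) {
        Set<String> needed = new LinkedHashSet<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.push(root);
        while (!pending.isEmpty()) {
            String cls = pending.pop();
            if (needed.add(cls)) {
                // Classes without an entry are treated as having no dependencies.
                for (String dep : directDeps.getOrDefault(cls, Collections.emptySet())) {
                    pending.push(dep);
                }
            }
        }
        return needed;
    }
}
```

Anything in the original project that is not in the returned set does not have to be copied for A to compile, as long as the input map is complete.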
It looks like you need the "Dependency viewer" of IntelliJ IDEA.
It can be found under Analyze > Analyze Dependencies...
In my experience, Eclipse has the best built-in utilities to get information about class dependencies and such. You could go into the debug view and trace each class and/or method.
Related
In Eclipse, when I start a new Project, I go through the wizard, and when I get to writing my first class for that project I am asked to select a package. Sometimes out of laziness I just choose the default package.
The wizard warns me this is discouraged. Even if I ignore the warnings I never have any problems with the application due to this. Or at least, so far I have never had any problem.
So why does Eclipse want me to create a new package?
Eclipse, and most other IDEs, are geared towards large projects. Hobbyist programming and small-scale assignments can often get by in an IDE, but be aware that the general assumption is a larger project - anything between 10 and 5,000+ classes.
There is also a chance that you create a class which has a similar name to something in the Java API - for example:
java.rmi.MarshalException and
javax.xml.bind.MarshalException
If both classes are on the same classpath and both are imported (for example through wildcard imports), instantiating the class by its simple name (throw new MarshalException();) is ambiguous and causes a compilation error.
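To illustrate the kind of name clash meant here, this is a small sketch that swaps in java.util.Date and java.sql.Date, since both ship with a standard JDK (the MarshalException pair above may not both be available on a current JDK). The class name AmbiguousNames is made up for the example:

```java
import java.sql.*;
import java.util.*;

final class AmbiguousNames {

    // With both wildcard imports above, the following line would NOT compile,
    // because the simple name "Date" matches java.util.Date and java.sql.Date:
    //
    //     Date today = new Date(0L);   // error: reference to Date is ambiguous

    // Fully qualified names (or a single-type import) remove the ambiguity.
    static java.util.Date utilEpoch() {
        return new java.util.Date(0L);
    }

    static java.sql.Date sqlEpoch() {
        return new java.sql.Date(0L);
    }
}
```

Putting your own classes in a named package gives you the same way out: you can always refer to them by their fully qualified name.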
You actually can't import classes from the default package. I understand it's allowed as a convenience or for very small (one class file) tasks.
When writing code in an Eclipse project, I'm usually quite messy and undisciplined in how I create and organize my classes, at least in the early hacky and experimental stages. In particular, I create more than one class with a main method for testing different ideas that share most of the same classes.
If I come up with something like a useful app, I can export it to a runnable jar so I can share it with friends. But this simply packs up the whole project, which can become several megabytes big if I'm relying on a large library such as httpclient.
Also, if I decide to refactor my lump of code into several projects once I work out what works, and I can't remember which source files are used in a particular run configuration, all I can do is copy the main class to a new project and then keep copying missing types till the new project compiles.
Is there a way in Eclipse to determine which classes are actually used in a particular run configuration?
EDIT: Here's an example. Say I'm experimenting with web scraping, and so far I've tried to scrape the search-result pages of both youtube.com and wrzuta.pl. I have a bunch of classes that implement scraping in general, a few that are specific to each of youtube and wrzuta. On top of this I have a basic gui common to both scrapers, but a few wrzuta- and youtube-specific buttons and options.
The WrzutaGuiMain and YoutubeGuiMain classes each contain a main method to configure and show the gui for each respective website. Can Eclipse look at each of these to determine which types are referenced?
Take a look at ProGuard, it is a "java shrinker, optimizer, obfuscator, and preverifier". I think you'll mainly be interested in the first capability for this problem.
Yes it's not technically part of Eclipse, as you requested, but it can be run from an Ant script, which can be pretty easily run in Eclipse.
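A different, runtime-based way to approach the question (not ProGuard, and not an Eclipse feature) is to start one run configuration with the JVM option -verbose:class and collect the class names it logs. The sketch below parses such a log. The two line formats are assumptions based on typical HotSpot output (older "[Loaded ... from ...]" lines and newer "[class,load] ... source: ..." lines) and may differ on your JVM; LoadedClassLog and the scraper package name are hypothetical. Note that this only sees classes loaded during that particular run.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LoadedClassLog {

    // Assumed older format:  [Loaded com.example.Foo from file:/some/path/]
    private static final Pattern LEGACY =
            Pattern.compile("\\[Loaded (\\S+) from .*");

    // Assumed newer format:  [0.123s][info][class,load] com.example.Foo source: file:/some/path/
    private static final Pattern UNIFIED =
            Pattern.compile(".*\\[class,load\\s*\\] (\\S+) source: .*");

    /** Returns the sorted names of logged classes that start with packagePrefix. */
    static Set<String> loadedClasses(List<String> logLines, String packagePrefix) {
        Set<String> result = new TreeSet<>();
        for (String line : logLines) {
            for (Pattern pattern : new Pattern[] {LEGACY, UNIFIED}) {
                Matcher m = pattern.matcher(line.trim());
                if (m.matches() && m.group(1).startsWith(packagePrefix)) {
                    result.add(m.group(1));
                }
            }
        }
        return result;
    }
}
```

Filtering by your own package prefix drops the JDK and library classes, leaving a list of your own types that the run touched.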
I create more than one class with a main method for testing different ideas that share most of the same classes.
It's better to be pedantic than lazy, it saves you time when coding :-)
You can have one class with a main method that accepts a command-line argument and calls a certain branch of functionality based on its value.
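A minimal sketch of such a single entry point, reusing the YoutubeGuiMain and WrzutaGuiMain names from the question. ScraperLauncher and the argument values "youtube" and "wrzuta" are made up for illustration, and the real launcher would call into the GUI classes instead of printing:

```java
final class ScraperLauncher {

    /** Maps the first command-line argument to the name of the GUI to start. */
    static String select(String[] args) {
        String target = args.length > 0 ? args[0] : "";
        switch (target) {
            case "youtube":
                return "YoutubeGuiMain";
            case "wrzuta":
                return "WrzutaGuiMain";
            default:
                throw new IllegalArgumentException(
                        "Usage: ScraperLauncher <youtube|wrzuta>");
        }
    }

    public static void main(String[] args) {
        // Placeholder: in the real project this would configure and show the chosen GUI.
        System.out.println("Starting " + select(args));
    }
}
```

Each Eclipse run configuration then differs only in its program arguments, not in its main class.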
Currently we are studying a Java-based tool which is primarily a reporting tool. It was developed in the 2000/2001 period and uses many open source libraries such as Apache Avalon, Mx4J.Adaptor and edu.Oswego (the Java concurrent package). The tool uses JDK 1.3.1 and the goal is to upgrade to JDK 1.5. We have also been asked to remove these 'outdated' packages and replace them with standard Java packages if possible.
Unfortunately, we have the code available for study but it lacks any documentation, and it is really difficult to track the flow during debugging (the total number of classes might be more than 1000).
What's the best way to understand this kind of tool? Any graphical tool to see the relationship between the classes?
Thanks,
SR
You could try some of the Source Code Analyzer plugins to eclipse. Tools like DIVER or X-Ray might be useful.
That's a common problem (unfortunately), and again unfortunately there is no easy solution.
There are many tools to help you (see below), but these are only helpers, they will not solve the problem for you.
I have found that a systematic approach is best. There is a good article on this:
Swallowing an elephant in 10 easy steps, about understanding a large, undocumented system. It's about Perl, but the ideas are independent of language.
Some tools that might help:
Step through interesting parts in a debugger (e.g. Eclipse's debugger)
Use Eclipse's "Call hierarchy" and "find references" to understand which part of the code uses what
Run tests with simple input data, understand what they produce
Write javadocs into the code documenting what you found, possibly correcting existing docs
Use tools to visualize class dependencies. I have used JDepend with some success; there are many others.
Eclipse (and newer versions of NetBeans and perhaps IntelliJ) have wonderful tools for analyzing large codebases:
Call hierarchy (CTRL + ALT + H) - you see the hierarchies of calls to/from a given method
Type hierarchy (F4) - you see the whole inheritance structure
Data hierarchy
Right click on item > References
many different search options
Any graphical tool to see the relationship between the classes?
If you want to see the relationship between classes you could try Green UML. It creates a nice UML class diagram out of your repository. It works in Eclipse.
I hope that helps.
You can do it easily in NetBeans.
Select the method signature and press ALT+F7 (or alternately right click and then click "Find Usages") this would show you from where a particular method is being called.
The second option is a little tedious but may give some results: configure log4j for your project and add proper logging code to each method.
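As a rough idea of what per-method trace logging looks like, here is a sketch that swaps in java.util.logging so it runs without extra jars; with log4j the calls would go through its own logger API instead. TracedService and parsePort are hypothetical names, not code from the tool in question:

```java
import java.util.logging.Logger;

final class TracedService {

    private static final Logger LOG = Logger.getLogger(TracedService.class.getName());

    /** Example method instrumented with entry and exit trace records. */
    static int parsePort(String text) {
        // entering/exiting log at level FINER, so they only show up
        // when the logger is configured to be that verbose.
        LOG.entering(TracedService.class.getName(), "parsePort", text);
        int port = Integer.parseInt(text.trim());
        LOG.exiting(TracedService.class.getName(), "parsePort", port);
        return port;
    }
}
```

With the logger level raised for a run, the resulting log shows the order in which instrumented methods were entered and left, which helps when tracking the flow through undocumented code.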
After years of programming, we all have a set of small functions used as helper utilities that we wish came built-in, so that we could use them in any project and have them taken care of by more people (tested and optimized).
I have quite a collection of these functions. I wonder how you guys organize them? Do you have any tips?
This is how I do it: I put them in a separate project (an Eclipse project), let's say "MyUtils", which is referred to by other projects. This works, but because the utils collection is getting bigger and bigger, it is kind of weird that the utils are bigger than the project code (for small projects). And to ship it in a jar, you have to select them all by hand (or include them all). Is there a better way?
Also, as Java requires all functions to be in a class, I have a ton of static functions (those that do not fit in OOP), for example a function to read a text file given a file name. Like this:
package nawaman.myutil;

public class UText {
    static public String ReadTextFile(String pFileName) {
        ...
    }
    static public String[] ReadLines_fromFile(String pFileName) {
        ...
    }
    static public String ReadLine_fromFile(String pFileName, int pLineNumber) {
        ...
    }
    ...
}
So when I need one of them, all the functions get included even though they are not used.
Is there a better way to do this?
I use Eclipse on Linux, in case there is a special technique for it, but feel free to share if you have techniques with other tools.
I treat such utility classes just like other components external to the software that I develop:
For each component I create an Eclipse project and build it to a jar.
Classes are grouped logically in packages, e.g. [domain].util.net, [domain].util.text etc.
In a project I include the dependencies I need. Maven can help you here.
You write that utility classes have a lot of static methods. That's something I don't use a lot. For example the text functions you show can be refactored to a class or set of classes that extend or implement classes and interfaces from the collections framework. That makes it easier to integrate my code with other libraries.
This works, but because the utils collection is getting bigger and bigger, it is kind of weird that the utils are bigger than the project code (for small projects). And to ship it in a jar, you have to select them all by hand (or include them all). Is there a better way?
For my projects I use javac to select all the classes from my util libraries. For this I compile all classes from my project to an empty output directory. javac automatically resolves the dependencies to the util libraries because I added the util library paths as source paths. Now I can create a jar that contains all classes of my project and only the needed classes of the util libraries.
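The same idea can be driven from Java through the standard javax.tools compiler API, which accepts the usual javac options. This is only a sketch under some assumptions: it needs a JDK (not a bare JRE), and SelectiveCompile and its parameter names are invented for the example:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

final class SelectiveCompile {

    /**
     * Compiles entrySource into outDir. Sources under utilSourceRoot are only
     * compiled when the entry source (transitively) refers to them.
     * Returns the emitted class files, relative to outDir, in sorted order.
     */
    static List<String> compile(Path entrySource, Path utilSourceRoot, Path outDir) {
        JavaCompiler javac = ToolProvider.getSystemJavaCompiler();
        if (javac == null) {
            throw new IllegalStateException("No system Java compiler found; run on a JDK");
        }
        int status = javac.run(null, null, null,
                "-d", outDir.toString(),
                "-sourcepath", utilSourceRoot.toString(),
                entrySource.toString());
        if (status != 0) {
            throw new IllegalStateException("Compilation failed with status " + status);
        }
        try (Stream<Path> files = Files.walk(outDir)) {
            return files
                    .filter(p -> p.toString().endsWith(".class"))
                    .map(p -> outDir.relativize(p).toString().replace('\\', '/'))
                    .sorted()
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The output directory then holds exactly what the compiler needed, and that directory is what goes into the jar. Classes that are only reached through reflection will not be picked up this way.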
Also, as Java requires all functions to be in a class, I have a ton of static functions (those that do not fit in OOP), for example a function to read a text file given a file name.
I do it the same way. But I try to have a lot of small util classes instead of a few big ones, so that I don't have to include tons of unneeded methods in my jars.
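As an illustration of one such small, focused utility class, here is a sketch covering only the text-file helpers from the question. The name TextFiles, the UTF-8 encoding, the unchecked exceptions and the 1-based line numbering are all my assumptions, since the original method bodies are not shown:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

final class TextFiles {

    private TextFiles() {
        // static helpers only, no instances
    }

    /** Reads the whole file as one UTF-8 string. */
    static String readAll(Path file) {
        try {
            return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the file as a list of lines, without line terminators. */
    static List<String> readLines(Path file) {
        try {
            return Files.readAllLines(file, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns a single line; lineNumber is 1-based (an assumption). */
    static String readLine(Path file, int lineNumber) {
        return readLines(file).get(lineNumber - 1);
    }
}
```

A project that only reads text files then pulls in this one class rather than a large catch-all utility class.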
My "utilities" have their own package namespace and SVN repository. They are, in essence my own libraries: distinct projects which may be pulled in, shared, tagged, updated, whatever.
The organization used within each of these "libraries" depends on the scope and function in question.
Because I disagree with the structure being a slave to some potential class/JAR output:
If you are concerned about "method bloat" in the classes and/or JARs, please use an automated tool to combat this. ProGuard is just one example and, while it can obfuscate, it can work equally well at just "dead code elimination".
Split your utils module into smaller subprojects. Use Maven or another build system to track versions of all your util modules. They are crucial to your systems because I think they are used in almost all your projects. Use tools like Findbugs or PMD to measure the quality of your code.
Every project needs to know which version of the utils module it is using. In my opinion it is unacceptable to add loosely coupled util classes to the binaries/sources of one of your 'non-utils' projects.
Please compare your classes with other common projects such as Apache Commons. I assume that a lot of your utility code is similar. Think about rewriting your static methods, because they obstruct testing (I'm sure that Findbugs will complain a lot too).
To sum up - creating a utils library is hard work and a lot of responsibility, so the requirements for code quality are very high. I hope that my advice will help.
You should be very careful with removing classes after compilation - you may end up in a class not found situation at runtime. If you never use reflection or Class.forName() you should be safe, but those introduce runtime dependencies which the compiler cannot help you with (like it can with "new").
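A short sketch of such a runtime-only dependency (ReflectiveLoad is a made-up name). The class name is just a string, so the compiler never checks that the class exists and a tool that follows compile-time references cannot see it:

```java
final class ReflectiveLoad {

    /**
     * Instantiates a class by name through its no-argument constructor.
     * Fails only at runtime if the class was left out of the shipped jar.
     */
    static Object newInstance(String className) {
        try {
            return Class.forName(className).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(
                    "Class missing or not instantiable at runtime: " + className, e);
        }
    }
}
```

If the name comes from a configuration file or a plugin list, the only way to know the class is needed is to know about that file.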
Remember - classes that are not used do not use memory in the running program; they only use bytes on disk.
Personally I've ended up saying that disk space is cheap, and the risk of accidentally removing a class definition that is used, causing a runtime break, is not worth it to me, so I say - all code used for compilation must be shipped.
I don't use Eclipse, but in Visual Studio you can add a reference to file without it being physically moved or copied. This allows you to define a file in the root of your source control that all of your projects can reference without it being included in every project or having to deal with the copying problem. With this kind of solution you can intelligently split your util methods into different files and selectively include them based on what individual projects need. Also you can get rid of the extra .jar.
That said, I have no idea if Eclipse supports this kind of file referencing, but it might be worthwhile to look.
I have recently joined a project that is made up of multiple separate projects. A lot of these projects depend on each other, using JAR files of the other projects included as libraries, so any time you change one project, you have to know which other projects use it and update them too. I would like to make this much easier, and was thinking about merging all this Java code into one project with separate packages. Is it possible to do this and then deploy only some of the packages in a jar? I would prefer not to deploy only part of it, but I have been asked whether this is possible.
Is there a better way to handle this?
Approach 1: Using Hudson
If you use a continuous integration server like Hudson, then you can configure upstream/downstream projects (see Terminology).
A project can have one or several downstream projects. The downstream projects are added to the build queue if the current project is built successfully. It is possible to set it up so that the downstream project is added to the call queue even if the current project is unstable (default is off).
What this means is, if someone checks in some code into one project, at least you would get early warning if it broke other builds.
Approach 2: Using Maven
If the projects are not too complex, then perhaps you could create a main project, and make these sub-projects child modules of this project. However, mangling a project into a form that Maven likes can be quite tricky.
If you use Eclipse (or any decent IDE) you can just make one project depend on another, and supply that configuration aspect in your SVN, and assume checkouts in your build scripts.
Note that if one project depends on a certain version of another project, the jar file is a far simpler way to manage this. A major refactoring could immediately mean lots of work in all the other projects to fix things, whereas you could just drop the new jar into each project as required and do the migration work then.
I guess it probably all depends on the specific project, but I think I would keep all the projects separate. This helps keep the whole system loosely coupled. You can use a tool such as Maven to help manage all the dependencies between the projects. Managing dependencies like this is one of Maven's main strengths.
Using Ant as your build tool, you can package your project any way that you want. However, leaving parts of your code out of the distribution seems like it would be error prone; you might accidentally leave out necessary classes (presumably, all of your classes are necessary).
In relation to keeping your code in different projects, I have a loose guideline. Keep the code that changes together in the same project and package it in its own jar file. This works best when some of your code can be broken out into utility libraries that change less frequently than your main application.
For example, you might have an application where you've generated web service client classes from a web service WSDL (using something like the Axis library). The web service interface will likely change infrequently, so you don't want to have the regeneration step reoccurring all the time in your main application build. Create a separate project for this piece so that you only have to recreate the web service client classes when the WSDL changes. Create a separate jar and use it in your main application. This style also allows other projects to reuse these utility modules.
When following this style, you should place a version number in the jar manifest so that you can keep track of which applications are using which versions of your module. Depending on how far you want to take this, you could also keep a text file in the jar that details the changes that have occurred for each revision (much like an open source library).
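For the version-in-manifest part, a minimal sketch with the standard java.util.jar classes. It writes a jar that contains nothing but a manifest, held in memory, and reads the version back. Using the Implementation-Version attribute is my choice for the example, and VersionedJar, the title and the version string are hypothetical:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.jar.Attributes;
import java.util.jar.JarInputStream;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

final class VersionedJar {

    /** Builds a jar (as bytes) whose manifest carries a title and version. */
    static byte[] jarWithVersion(String title, String version) {
        Manifest manifest = new Manifest();
        Attributes main = manifest.getMainAttributes();
        // Manifest-Version must be set, otherwise the attributes are not written.
        main.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        main.put(Attributes.Name.IMPLEMENTATION_TITLE, title);
        main.put(Attributes.Name.IMPLEMENTATION_VERSION, version);
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (JarOutputStream jar = new JarOutputStream(bytes, manifest)) {
            // A real build would add the class files here.
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Reads Implementation-Version from a jar, or null if there is none. */
    static String readVersion(byte[] jarBytes) {
        try (JarInputStream jar = new JarInputStream(new ByteArrayInputStream(jarBytes))) {
            Manifest manifest = jar.getManifest();
            return manifest == null
                    ? null
                    : manifest.getMainAttributes().getValue(Attributes.Name.IMPLEMENTATION_VERSION);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In practice the build tool usually writes these attributes for you. The reading side is what lets an application or a script report which module version it is actually running against.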
It's all possible (we had the same situation some years ago). How hard or easy it'll be depends on your IDE (refactoring, merging, organizing the new project) and your build tool (deploying). We used IDEA as the IDE and Ant as the build tool and it wasn't too hard. One Sunday (nobody working + committing), 2 people on one computer.
I'm not sure what you mean by
"deploy only some of the packages in a jar"
I think you will need all of them at runtime, won't you? As I understood they depend on each other.