What tools do you use to find unused/dead code in large Java projects? Our product has been in development for some years, and it is getting very hard to manually detect code that is no longer in use. We do, however, try to delete as much unused code as possible.
Suggestions for general strategies/techniques (other than specific tools) are also appreciated.
Edit: Note that we already use code coverage tools (Clover, IntelliJ), but these are of little help. Dead code still has unit tests, and shows up as covered. I guess an ideal tool would identify clusters of code that have very little other code depending on them, allowing for focused manual inspection.
An Eclipse plugin that works reasonably well is Unused Code Detector.
It processes an entire project, or a specific file and shows various unused/dead code methods, as well as suggesting visibility changes (i.e. a public method that could be protected or private).
CodePro was recently released by Google with the Eclipse project. It is free and highly effective. The plugin has a 'Find Dead Code' feature with one/many entry point(s). Works pretty well.
I would instrument the running system to keep logs of code usage, and then start inspecting code that is not used for months or years.
For example if you are interested in unused classes, all classes could be instrumented to log when instances are created. And then a small script could compare these logs against the complete list of classes to find unused classes.
Of course, if you go to the method level you should keep performance in mind. For example, the methods could log only their first use. I don't know how this is best done in Java. We have done this in Smalltalk, which is a dynamic language and thus allows for code modification at runtime. We instrument all methods with a logging call and uninstall the logging code after a method has been logged for the first time, so after some time no more performance penalties occur. Maybe a similar thing can be done in Java with static boolean flags...
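A minimal Java sketch of the "log only the first use" idea, under some assumptions: `UsageTracker` and `OrderService` are hypothetical names, and a concurrent set is used here instead of one static boolean flag per method. Unlike the Smalltalk approach, the call is never uninstalled, so each later invocation still pays for one set lookup.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper: records the first use of each instrumented method. */
final class UsageTracker {
    // Thread-safe set of method names that have been hit at least once.
    private static final Set<String> USED = ConcurrentHashMap.newKeySet();

    private UsageTracker() {}

    /** Call at the top of an instrumented method; returns true only on the first hit. */
    static boolean hit(String methodName) {
        boolean first = USED.add(methodName);
        if (first) {
            System.out.println("first use: " + methodName); // stand-in for a real log call
        }
        return first;
    }

    /** Returns the names in allMethods that were never hit, in sorted order. */
    static Set<String> unused(Set<String> allMethods) {
        Set<String> result = new TreeSet<>(allMethods);
        result.removeAll(USED);
        return result;
    }
}

/** Hypothetical application class with two instrumented methods. */
final class OrderService {
    void placeOrder() {
        UsageTracker.hit("OrderService.placeOrder");
        // ... real work ...
    }

    void legacyExport() {
        UsageTracker.hit("OrderService.legacyExport");
        // ... real work ...
    }
}
```

After the system has run for a while, comparing `UsageTracker.unused(...)` against the complete list of method names plays the role of the "small script" mentioned above.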
I'm surprised ProGuard hasn't been mentioned here. It's one of the most mature products around.
ProGuard is a free Java class file shrinker, optimizer, obfuscator,
and preverifier. It detects and removes unused classes, fields,
methods, and attributes. It optimizes bytecode and removes unused
instructions. It renames the remaining classes, fields, and methods
using short meaningless names. Finally, it preverifies the processed
code for Java 6 or for Java Micro Edition.
Some uses of ProGuard are:
Creating more compact code, for smaller code archives, faster transfer across networks, faster loading, and smaller memory footprints.
Making programs and libraries harder to reverse-engineer.
Listing dead code, so it can be removed from the source code.
Retargeting and preverifying existing class files for Java 6 or higher, to take full advantage of their faster class loading.
Here is an example of listing dead code: https://www.guardsquare.com/en/products/proguard/manual/examples#deadcode
One thing I've been known to do in Eclipse, on a single class, is change all of its methods to private and then see what complaints I get. For methods that are used, this will provoke errors, and I return them to the lowest access level I can. For methods that are unused, this will provoke warnings about unused methods, and those can then be deleted. And as a bonus, you often find some public methods that can and should be made private.
But it's very manual.
Use a test coverage tool to instrument your codebase, then run the application itself, not the tests.
Emma and EclEmma will give you nice reports of what percentage of which classes are run for any given run of the code.
We've started to use FindBugs to help identify some of the funk in our codebase's target-rich environment for refactorings. I would also consider Structure101 to identify spots in your codebase's architecture that are too complicated, so you know where the real swamps are.
In theory, you can't deterministically find unused code. There's a mathematical proof of this (well, this is a special case of a more general theorem). If you're curious, look up the Halting Problem.
This can manifest itself in Java code in many ways:
Loading classes based on user input, config files, database entries, etc;
Loading external code;
Passing object trees to third party libraries;
etc.
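To illustrate the first item, here is a small sketch of a class that no code references by name but that is still live, because its name arrives as a string at runtime. `CsvExporter` and `PluginLoader` are hypothetical names; the string would typically come from a config file or database entry.

```java
import java.util.function.Supplier;

/** Never referenced by name elsewhere; only reachable through reflection. */
final class CsvExporter implements Supplier<String> {
    public CsvExporter() {}

    @Override
    public String get() {
        return "csv";
    }
}

/** Hypothetical loader that instantiates a class whose name is only known at runtime. */
final class PluginLoader {
    private PluginLoader() {}

    @SuppressWarnings("unchecked")
    static Supplier<String> load(String className) {
        try {
            Class<?> type = Class.forName(className);
            return (Supplier<String>) type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("cannot load " + className, e);
        }
    }
}
```

A static analyzer that only follows direct references would report `CsvExporter` as unused, even though a call such as `PluginLoader.load(configuredName)` may instantiate it in production.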
That being said, I use IntelliJ IDEA as my IDE of choice, and it has extensive analysis tools for finding dependencies between modules, unused methods, unused members, unused classes, etc. It's quite intelligent too: a private method that isn't called is tagged as unused, but a public method requires more extensive analysis.
In Eclipse, go to Window > Preferences > Java > Compiler > Errors/Warnings
and change all of them to errors. Fix all the errors. This is the simplest way. The beauty is that this will allow you to clean up the code as you write.
IntelliJ has code analysis tools for detecting unused code. You should try making as many fields/methods/classes non-public as possible; that will reveal more unused methods/fields/classes.
I would also try to locate duplicate code as a way of reducing code volume.
My last suggestion is to look for open source code which, if used, would make your code simpler.
The Structure101 slice perspective will give a list (and dependency graph) of any "orphans" or "orphan groups" of classes or packages that have no dependencies to or from the "main" cluster.
DCD is not a plugin for some IDE but can be run from ant or standalone. It looks like a static tool and it can do what PMD and FindBugs can't. I will try it.
P.S. As mentioned in a comment below, the project now lives on GitHub.
There are tools which profile code and provide code coverage data. This lets you see (as code is run) how much of it is being called. You can get any of these tools to find out how much orphan code you have.
FindBugs is excellent for this sort of thing.
PMD (Project Mess Detector) is another tool that can be used.
However, neither can find public static methods that are unused in a workspace. If anyone knows of such a tool then please let me know.
Use coverage tools, such as EMMA. But it's not a static tool (i.e. it requires actually running the application through regression testing, and through all possible error cases, which is, well, impossible :) )
Still, EMMA is very useful.
Code coverage tools, such as Emma, Cobertura, and Clover, will instrument your code and record which parts of it get invoked by running a suite of tests. This is very useful, and should be an integral part of your development process. It will help you identify how well your test suite covers your code.
However, this is not the same as identifying real dead code. It only identifies code that is covered (or not covered) by tests. This can give you false positives (if your tests do not cover all scenarios) as well as false negatives (if your tests access code that is actually never used in a real world scenario).
I imagine the best way to really identify dead code would be to instrument your code with a coverage tool in a live running environment and to analyse code coverage over an extended period of time.
If you are running in a load-balanced redundant environment (and if not, why not?) then I suppose it would make sense to instrument only one instance of your application and to configure your load balancer such that a random, but small, portion of your users run on your instrumented instance. If you do this over an extended period of time (to make sure that you have covered all real world usage scenarios, such as seasonal variations), you should be able to see exactly which areas of your code are accessed under real world usage and which parts are really never accessed and hence dead code.
I have never personally seen this done, and do not know how the aforementioned tools can be used to instrument and analyse code that is not being invoked through a test suite - but I am sure they can be.
There is a Java project called Dead Code Detector (DCD). For source code it doesn't seem to work well, but for .jar files it's really good. Plus, you can filter by class and by method.
For NetBeans, here is a dead code detector plugin.
It would be better if it could link to and highlight the unused code. You can vote and comment here: Bug 181458 - Find unused public classes, methods, fields
Eclipse can show/highlight code that can't be reached. JUnit can show you code coverage, but you'd need some tests and have to decide if the relevant test is missing or the code is really unused.
I found the Clover coverage tool, which instruments code and highlights the code that is used and the code that is unused. Unlike Google CodePro AnalytiX, it also works for web applications (in my experience; I may be wrong about Google CodePro).
The only drawback that I noticed is that it does not take Java interfaces into account.
I use Doxygen to generate a method call map to locate methods that are never called. On the graph you will find islands of method clusters without callers. This doesn't work for libraries, since you always need to start from some main entry point.
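The "islands" idea can be sketched as a reachability search over a call map. This is an illustration only, not Doxygen output: the `CallGraph` class and the map-of-lists representation are assumptions, and the call map itself would have to be extracted by some other tool.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper: finds methods that no entry point can reach. */
final class CallGraph {
    private CallGraph() {}

    /** Returns every method in the graph that cannot be reached from the entry points. */
    static Set<String> unreachable(Map<String, List<String>> calls, Set<String> entryPoints) {
        Set<String> seen = new HashSet<>(entryPoints);
        Deque<String> queue = new ArrayDeque<>(entryPoints);
        // Breadth-first search from the entry points along caller -> callee edges.
        while (!queue.isEmpty()) {
            String caller = queue.removeFirst();
            for (String callee : calls.getOrDefault(caller, Collections.emptyList())) {
                if (seen.add(callee)) {
                    queue.addLast(callee);
                }
            }
        }
        // Collect all known methods (callers and callees), then drop the reachable ones.
        Set<String> all = new TreeSet<>(calls.keySet());
        for (List<String> callees : calls.values()) {
            all.addAll(callees);
        }
        all.removeAll(seen);
        return all;
    }
}
```

Searching from entry points also catches islands whose methods only call each other, which a simple "has no callers" check would miss.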
I'm currently making an IDE for the Java platform. This IDE is for educational purposes only. I'm working on the documentation and am in the analysis phase.
Right now I'm at the stage of making the domain model for my project, and I'm confused as to what the domain model diagram should look like.
The IDE will feature
open/save
create/remove class
intellisense
compile
execute
syntax highlighting/formatting
So what will the domain model look like? And what is a domain?
Any guidance will be helpful. Thanks
Well, I would suggest to start with identifying the use cases for your IDE:
1. Maintain files (open, save, delete, rename)
2. Parse Code Syntax and display results.
3. Pass File to compiler and display results.
(And then write out the simple steps of what these use cases do. This will help a lot, as well as giving you a 'context' for all those niggly little requirements that will pop up.
Otherwise it's simply a list of functionality that is very hard to organize, to implement consistently and completely, and to check for anything you missed.)
So, you could say you have 3 domain objects now: File and Code and Compiler.
Anyway it's a start
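As a rough illustration of those three domain objects, here is one possible first cut in Java. Everything beyond the names File, Code and Compiler is an assumption: `SourceFile` stands in for File, `StubCompiler` is a placeholder, and the brace count only stands in for real parsing.

```java
import java.util.ArrayList;
import java.util.List;

/** Domain object for use case 1: a file the IDE opens, saves, or renames. */
final class SourceFile {
    private final String name;
    private final String text;

    SourceFile(String name, String text) {
        this.name = name;
        this.text = text;
    }

    String name() { return name; }
    String text() { return text; }
}

/** Domain object for use case 2: the parsed view of a file's text. */
final class Code {
    private final SourceFile file;

    Code(SourceFile file) { this.file = file; }

    /** Placeholder syntax check: net brace imbalance instead of real parsing. */
    int unmatchedBraces() {
        int depth = 0;
        for (char c : file.text().toCharArray()) {
            if (c == '{') depth++;
            if (c == '}') depth--;
        }
        return Math.abs(depth);
    }
}

/** Domain object for use case 3: turns a file into a list of diagnostics. */
interface Compiler {
    List<String> compile(SourceFile file);
}

/** Stand-in compiler that only reports what the placeholder syntax check finds. */
final class StubCompiler implements Compiler {
    @Override
    public List<String> compile(SourceFile file) {
        List<String> diagnostics = new ArrayList<>();
        int unmatched = new Code(file).unmatchedBraces();
        if (unmatched > 0) {
            diagnostics.add(file.name() + ": " + unmatched + " unmatched brace(s)");
        }
        return diagnostics;
    }
}
```

Writing out the use case steps first would show which of these objects need more behaviour and which new ones (for example an editor or a project) are missing.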
Yes, A HUGE project for simple curiosity.
You might also look at how Eclipse is built, as well as how an OO compiler is built. These may give you ideas for your domain objects.
It sounds like you need to read up on Domain Driven Design. Your domain objects and ubiquitous language are driven by the language used by the domain experts. Fortunately you're familiar with this language already since you know the domain (programming) already.
We have a Java code base that has grown to be too big for a single monolithic JAR (more than 5000 classes). One of the tasks that we are investigating is how much effort would it be to break this single JAR into smaller components with controlled dependencies between them. However, it's somewhat hard to look at a big bag of code and be sure that you are finding the best points of separation without some analysis.
Are there good tools to inspect and visualize the interpackage dependencies? Given those, we would have a set of suggested cut points where we could begin separating code.
As an example, in the days before Netbeans and Eclipse (and at a different job), we used TogetherJ and TogetherEnterprise. Those had the ability to do a static package analysis and draw the UML diagram. That sort of behavior would be optimal but that feature alone is not sufficient to justify the cost.
I have recently discovered CodePro AnalytiX, formerly from Instantiations, now available for free from Google:
https://developers.google.com/java-dev-tools/codepro/doc/features/dependencies/dependencies
I used stan4j for the same purpose, but unfortunately the community edition has a 500-class limit. On the other hand, it works as an Eclipse extension.
IntelliJ IDEA has one.
JDepend is a free tool for analyzing package dependencies.
It identifies circular dependencies, which would impede breaking this monolith into smaller pieces.
We put this check for circular dependencies into our unit tests, to prevent them from the start.
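The kind of check such a unit test performs can be sketched without any library. The code below does not use JDepend's API; `PackageCycles` is a hypothetical helper that runs a depth-first search over a package dependency map obtained by some other means.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper: detects cycles in a package dependency graph. */
final class PackageCycles {
    private enum State { VISITING, DONE }

    private PackageCycles() {}

    /** Returns true if the package dependency graph contains at least one cycle. */
    static boolean containsCycle(Map<String, Set<String>> dependsOn) {
        Map<String, State> state = new HashMap<>();
        for (String pkg : dependsOn.keySet()) {
            if (visit(pkg, dependsOn, state)) {
                return true;
            }
        }
        return false;
    }

    private static boolean visit(String pkg, Map<String, Set<String>> dependsOn,
                                 Map<String, State> state) {
        State s = state.get(pkg);
        if (s == State.VISITING) return true;  // back edge: pkg is already on the current path
        if (s == State.DONE) return false;     // fully explored earlier, no cycle through it
        state.put(pkg, State.VISITING);
        for (String dep : dependsOn.getOrDefault(pkg, Collections.emptySet())) {
            if (visit(dep, dependsOn, state)) {
                return true;
            }
        }
        state.put(pkg, State.DONE);
        return false;
    }
}
```

A test would build the map for the current code base and fail when `containsCycle` returns true, so a new cycle is caught when it is introduced.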
There's a corresponding Eclipse plug-in.
You can send the output to GraphViz. However, the visualization becomes less understandable as the number of packages grows.
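For a sense of what GraphViz consumes, here is a small sketch that renders a package dependency map as DOT text. It is not JDepend's own exporter; `DotExporter` and the map representation are assumptions made for illustration.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical helper: writes package dependencies in GraphViz DOT syntax. */
final class DotExporter {
    private DotExporter() {}

    /** Renders the dependencies as a digraph, with edges in sorted order. */
    static String toDot(Map<String, Set<String>> dependsOn) {
        StringBuilder dot = new StringBuilder("digraph packages {\n");
        for (Map.Entry<String, Set<String>> entry : new TreeMap<>(dependsOn).entrySet()) {
            for (String target : new TreeSet<>(entry.getValue())) {
                dot.append("  \"").append(entry.getKey()).append("\" -> \"")
                   .append(target).append("\";\n");
            }
        }
        return dot.append("}\n").toString();
    }
}
```

Saving the result as `deps.dot` and running `dot -Tpng deps.dot -o deps.png` produces an image; with hundreds of packages the picture gets crowded, as noted above.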
Now that CodePro AnalytiX [mentioned first by Fabian Steeg above] is free, it's worth another look. At least prior to purchase by Google, Instantiations reliably produced great software. I played with it some years back, and recall no complaints other than cost.
A good approach would be to reverse-engineer your JAR file into a class diagram. I found this tutorial, which explains how to reverse-engineer a project composed of JAR files into a UML class diagram: http://www.ejb3.org/jar_file_reverse/jar_file_reverse.html
You will be able to reverse-engineer at the package level and see package relations, and also see classes from one package that have relations to other packages. Once the project has been reverse-engineered, you can reorganize it as a model and give this documentation to the implementation team.
SonarJ is a good tool to do that, but it is expensive.
Another very good tool is XDepend, which is cheaper. For your purpose, I would recommend this tool. I think it is the best choice in terms of quality for the price.
With far fewer features, you can use a Sonar (free and open source) analysis and its dependency matrix.
Do the classes use packages in a normal fashion or are all the classes in the same package? If the first case is true, I'd consider writing a special-purpose tool to do the first cut.
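A first cut of such a special-purpose tool might simply collapse class-level dependencies into package-level ones. The sketch below assumes the class-to-class map has already been extracted by some other means; `PackageDependencies` is a hypothetical name.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical helper: aggregates class dependencies per package. */
final class PackageDependencies {
    private PackageDependencies() {}

    /** Package part of a fully qualified class name, or "" for the default package. */
    static String packageOf(String className) {
        int dot = className.lastIndexOf('.');
        return dot < 0 ? "" : className.substring(0, dot);
    }

    /** Collapses class-to-class dependencies into package-to-package dependencies. */
    static Map<String, Set<String>> fromClasses(Map<String, Set<String>> classDeps) {
        Map<String, Set<String>> result = new TreeMap<>();
        for (Map.Entry<String, Set<String>> entry : classDeps.entrySet()) {
            String from = packageOf(entry.getKey());
            for (String target : entry.getValue()) {
                String to = packageOf(target);
                if (!from.equals(to)) { // ignore dependencies inside one package
                    result.computeIfAbsent(from, k -> new TreeSet<>()).add(to);
                }
            }
        }
        return result;
    }
}
```

Packages with few incoming and outgoing edges in the resulting map are natural candidates for the first cut points.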
This is exactly the kind of use case I built Degraph for.
It allows you to define slices, i.e. sets of classes that belong together, and visualizes them as one collapsible node. Your jars-to-be would be slices that you can tweak until they no longer have any cyclic dependencies, at which point they can become their own jars.
This makes it easy to spot dependencies between slices that you need to break. Since you can open the slice node and see all the contained classes, it is also easy to identify possible refactorings (introducing interfaces, moving classes, ...) to achieve your goal.
Can impact analysis be done in Eclipse? That is, if a few classes and methods need to be changed, can I find the impact of that change on the rest of the application code (other classes and methods)?
The core issue is when there is code apart from core Java, such as XML, JSP, framework code, etc.
One of the most advanced projects on this topic might be X-Ray.
You can try it and check if that does provide some of the answer you are looking for (note: I have not yet tested it)
X-Ray is an open-source software visualization plug-in for the Eclipse framework. It provides System Complexity View, Class and Package Dependency View for a given Java project.
Other advanced tools exist (but are not free) for exploring code dependencies:
nWire, from SO contributor extraordinaire Zviki Cohen (zvikico)
XDepend, now part of JArchitect (lets you extract, visualize, seek and control the structure of your applications and frameworks)
The simplest way (and still free) to make a quick dependency analysis remains, for me:
CDA - Class Dependency Analyzer
(not directly integrated into Eclipse, but very simple to use)
Simplest method is: right-click the class or method you would like to change, select "refactor" (or press alt-shift-T) and then the refactoring you propose to do (rename, move, change method signature, etc ). Then select "preview" (or next as the case may be). You'll then see the impact of the proposed change. For rename and move class, you'll also get the option to apply the changes to non-java files. Next to that, you can use the search function.
Try the JRipple Eclipse plugin. It's a good one.
There is a plugin for jQAssistant that brings Test Impact Analysis to the Java world. The plugin is called jQAssistant Test Impact Analysis and is available at https://github.com/jqassistant-contrib/jqassistant-test-impact-analysis-plugin.
Is there an IDE/tool/script/something that can show call hierarchy and/or data flow in Scala+Java programs (preferably from source code)?
Or (as a backup plan) is there a tool that can show it using Java bytecode? (And preferably give the option to go to source code, if provided by user).
All that, preferably integrated into an IDE and/or Maven :-)
The requirement to support Scala is crucial in this question. I already know of and use such tools for Java, in 3 IDEs. They do not work very well (actually: at all) when Scala is involved.
TIA
Poor man's call hierarchy: Comment the method out and see where your red squigglies show up. [/me ducks]
Did you try Eclipse?
SBT can do that. You'll have to check it out to get more information, because I haven't done it.
EDIT
Sorry, I confused things. SBT can generate component dependencies, not call hierarchy.