I have a relatively small project (50 classes) that includes 13 JAR libraries (android-support, Gson, Guava, etc.) and uses one Android library (Sherlock).
The problem is that whenever I press "Build Project" it takes around two minutes to build and deploy to a device. (A newly created Android project, however, takes no more than 5 seconds to build.)
Is there something I can do about it (limit the number of libraries, switch to another IDE)? Or is this just normal behavior?
The project takes that long to compile because it needs to add the JAR files and Sherlock library. Every library added to the project will increase the build time of the project.
Switching to another IDE would not make a huge difference because a lot of the building is handled by the Android SDK.
Your best bet would probably be to limit the libraries in your project, using elimination to see which one causes the largest increase in build time.
AlexR has a good answer. To add:
I use IDEA so I don't know how to do this in Eclipse but I know it can be done - probably right click context menu.
Exclude any layouts that you don't really need for dev/test from your build. I tend to put placeholder layouts in early in a project and refine them later when the main code is getting to alpha quality. I exclude everything except one set, e.g. "normal" portrait, do all of my code creation, then only add the layouts back in near the end.
Same goes for assets. Do you have a lot of assets or assets of large size? Exclude these, or perhaps temporarily replace with smaller ones.
All this said, the compilers used do a very good job of optimisation and as the number of classes grows, it is inevitable that your build time will increase. It's not exponential (I've never actually measured it) but it's certainly worse than linear.
I do not have a direct answer to your question, but I can recommend the following steps.
You compare build+deploy of the "big" project with the build of a "small" project. Did you try to compare exactly the same operation for both? What is the result?
Try creating a new workspace. Sometimes this helps Eclipse.
Try to take the "small" project and move it towards the "big" one step by step. For example, add the dependencies first and try to build the project. Did the time change significantly? If not, go forward and add 20 custom classes, and so on. When the time changes, try to find which library or step caused it. You can also choose another strategy: add the classes without adding the dependent JARs and just comment out the code that uses the dependencies. This way may be more complicated, but I believe it is still possible since your project is indeed not too big.
Good luck.
What do you normally do when you want to make use of some utility which is part of some Java library? I mean, if you add it as a Maven dependency, does the whole library get included in the assembled JAR?
I don't have storage problems of any sort and I don't mind the JAR to get bloated but I'm just curious if there's some strategy to reduce the JAR size in this case.
Thanks
It is even worse: you not only add the "whole library" to your project, but also its dependencies, whether you need them or not.
Joachim Sauer is right in saying that you do not bundle dependencies into your artifact unless you want it to be runnable. But IMHO this just moves the problem to a different point. Eventually, you want to run the stuff. At some point, a runnable JAR, a WAR or an EAR is built and it will incorporate the whole dependency tree (minus the fact that you get only one version per artifact).
The Maven shade plugin can help you to minimise your jar (see Minimize an Uber Jar correctly, Using Shade-Plugin) by only adding the "necessary" classes. But this is of course tricky in general:
Classes referenced by non-Java elements are not found.
Classes used by reflection are not found.
If you have some dependency injection going on, you only use the interfaces in your code. So the implementations might get kicked out.
So your minimizedJar might in general just be too small.
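For reference, a rough sketch of such a shade configuration, with minimizeJar enabled and a filter that force-keeps an artifact that is only reached via reflection or DI (the com.example coordinates are placeholders, not something from the question):

    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <executions>
        <execution>
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
          <configuration>
            <!-- drop classes that are not statically referenced -->
            <minimizeJar>true</minimizeJar>
            <filters>
              <filter>
                <!-- force-keep an artifact that is only used via reflection/DI -->
                <artifact>com.example:di-implementation</artifact>
                <includes>
                  <include>**</include>
                </includes>
              </filter>
            </filters>
          </configuration>
        </execution>
      </executions>
    </plugin>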
As you see I don't know any general solution for this.
What are sensible approaches?
Don't build JARs that have five purposes, but build JARs that are small. To avoid a huge number of different build processes, use multi-module projects.
Don't add libraries as dependencies unless you really need them. Better to duplicate a method that converts Strings than to add a whole String library just for that purpose.
I am rephrasing this question to make it a little more straightforward and easy to understand, hopefully.
I have roughly 30 components (internal) that go into a single web application. That means 30 different projects with their own separate POM. I use inheritance quite a bit in my POMs so one of the things they inherit is a PMD/CPD configuration to prevent code duplication.
Even though I have CPD/PMD running, it only detects duplicate code within the same project. I would like it to detect in any of my projects if there is code shared among the projects that can be refactored out. Moreover, I was looking for something that could (using the same concept/pattern) verify that no code is shared between other open source dependencies.
It would be CPD/PMD, except it would operate on the source jars. This task would consume a large amount of memory if you scan all projects and their dependencies for duplication. Right now, I would just like to apply that to internal projects. If it works, then it would be relatively easy/straightforward to scale that out.
Walter
I'm not sure I got everything but...
I'd create an aggregating module with all projects as dependencies, use the maven-dependency-plugin and its unpack-dependencies mojo to fetch the sources JAR of every dependency (the mojo can take a classifier as a parameter) and unpack them (maybe into target/generated-sources/java; the Maven build helper plugin may help here), and finally run pmd:cpd on the whole source base.
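A sketch of the relevant build section of that aggregating POM (phases and the output directory are my assumptions):

    <build>
      <plugins>
        <!-- unpack the sources jars of all dependencies into generated-sources -->
        <plugin>
          <groupId>org.apache.maven.plugins</groupId>
          <artifactId>maven-dependency-plugin</artifactId>
          <executions>
            <execution>
              <id>unpack-sources</id>
              <phase>generate-sources</phase>
              <goals>
                <goal>unpack-dependencies</goal>
              </goals>
              <configuration>
                <classifier>sources</classifier>
                <failOnMissingClassifierArtifact>false</failOnMissingClassifierArtifact>
                <outputDirectory>${project.build.directory}/generated-sources/java</outputDirectory>
              </configuration>
            </execution>
          </executions>
        </plugin>
        <!-- make the unpacked sources part of the module's source roots -->
        <plugin>
          <groupId>org.codehaus.mojo</groupId>
          <artifactId>build-helper-maven-plugin</artifactId>
          <executions>
            <execution>
              <id>add-unpacked-sources</id>
              <phase>generate-sources</phase>
              <goals>
                <goal>add-source</goal>
              </goals>
              <configuration>
                <sources>
                  <source>${project.build.directory}/generated-sources/java</source>
                </sources>
              </configuration>
            </execution>
          </executions>
        </plugin>
      </plugins>
    </build>

Then something like mvn generate-sources pmd:cpd on that module should analyse the combined source base.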
This may need some tweaking, I didn't test this at all.
It sounds like you want to find duplicate code anywhere in your 30 projects. I can't speak for PMD; I assume you tell it to make one giant project containing all the source files from the union of the projects. But yes, this would take a lot of RAM and CPU.
Another tool that does this is the Java CloneDR. CloneDR finds duplicate code whether it is exactly the same or close (e.g. a few edits apart), regardless of source code layout or intervening comments. It is pretty easy to set it up to process all the files in your set of projects.
Just run PMD's CPD as a stand-alone program. All it needs is a directory, and it will recurse. At least, it did for me. I moved all my source to one directory and ran the CPD GUI from the batch file distributed with PMD-4.2.5.
You can perhaps take a look at Sonar: its Sonar-CPD engine is much more scalable and can detect cross-project duplications.
You can try Lizard for Python.
It doesn't work on source jars, though.
"Code Duplicate Detector
lizard -Eduplicate {path to your code}"
https://pypi.org/project/lizard/
PMD/CPD provides more granularity, since it allows the user to specify the minimum number of tokens before a block of code is flagged as a duplicate.
https://pmd.github.io/latest/pmd_userdocs_cpd.html#cli-options-reference
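If you drive CPD from Maven instead (as in the per-project setup described in the question), the same knob is, as far as I know, exposed by the maven-pmd-plugin as minimumTokens; a minimal sketch:

    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-pmd-plugin</artifactId>
      <configuration>
        <!-- only report blocks of at least 100 duplicated tokens -->
        <minimumTokens>100</minimumTokens>
      </configuration>
      <executions>
        <execution>
          <goals>
            <goal>cpd-check</goal>
          </goals>
        </execution>
      </executions>
    </plugin>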
I'm working on a couple of web services that use JAXB bindings for the messages (in JAX-WS or spring-ws). When using these bindings there's always some code that is automatically generated from the WSDL to bind the message objects. I'm struggling to figure out the best way I can make this work so that it's easy to work with, hard to break and integrates nicely with IDEs (mostly using eclipse).
I think there are a couple of ways to go about this. The three main options I see right now are:
Generate code, keep the source artifacts and check them into the repository. Pros: integrates easily with IDEs (source highlighting etc), works within the build system. Cons: generated code changes each time you regenerate it, possibly creating noisy commits. It's also redundant since the WSDL file is already checked in, usually.
Generate code as part of the build process. Don't keep source artifacts or only keep them in output directories. Pros: fixes all the cons from the previous one. Cons: harder to integrate with IDE, though maybe this build step can be run automatically? I currently use this on one of my projects but the first time I checkout the project it appears broken, which is a minor nuisance.
Keep generated bindings in separate libraries (jars) included with maven or manually updated jars, depending on your build process. I got the idea from a thread on java.net. This seems more stable and uses explicit versioning but seems a bit heavyweight.
Which one of these options would you implement and how? We're currently using maven and eclipse, so any ideas in that regard would be great. I think this problem generalises to most other build systems and IDE combinations though, even other languages perhaps.
I went for option 3. If you already host your own repository (and optionally CI), it's not that heavyweight. All it takes is a simple POM. It's even possible to include some utility/wrapper/builder classes (that often make life easier with generated classes) and use them in several projects.
I'd go for option 2 and generate code in the "standard" ${project.build.directory}/generated-sources/<toolname> location as part of the build process. Using generated sources is well supported by m2eclipse (use Maven > Update Project Configuration once sources have been generated) and, if I remember correctly, by the maven eclipse plugin as well (i.e. the folder will be added to the Java Build Path). Actually, I think NetBeans also handles this fine. Not sure about IDEA.
For the generation itself, you may need the maven-jaxb2-plugin if I understood correctly.
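If it helps, a minimal configuration along those lines (the org.jvnet coordinates and the schema directory are assumptions; by default the sources end up under target/generated-sources/xjc):

    <plugin>
      <groupId>org.jvnet.jaxb2.maven2</groupId>
      <artifactId>maven-jaxb2-plugin</artifactId>
      <executions>
        <execution>
          <goals>
            <!-- runs XJC during generate-sources -->
            <goal>generate</goal>
          </goals>
        </execution>
      </executions>
      <configuration>
        <schemaDirectory>src/main/resources/wsdl</schemaDirectory>
        <schemaIncludes>
          <include>*.xsd</include>
        </schemaIncludes>
      </configuration>
    </plugin>

(For generating straight from the WSDL rather than from standalone XSDs, the JAX-WS wsimport tooling may be a better fit.)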
I'm currently working on a project that contains many different Eclipse projects referencing each other to make up one large project. Is there a point where a developer should ask themselves if they should rethink the way their development project is structured?
NOTE: My project currently contains 25+ different Eclipse projects.
My general rule of thumb is that I create a new project for every reusable component. So, for example, if I have some isolated functionality that can be packaged, say, as a jar, I create a new project so I can build, package, and distribute the component independently.
Also, if there are certain projects that you do not need to make frequent changes to, you can build them only when required and keep them "closed" in eclipse to save time on indexing, etc. Even if you think that a certain component is not reusable, as long as it is separated from the rest of the code base in terms of logic/concerns you may be well served by just separating it out. Sometimes seemingly specific code might be reusable in another project or in a future version of the same project.
When compiled, a project would typically result in a jar. So if your application consists of potentially reusable components, it is ok to use a project for each.
I'm a big fan of using a lot of projects, I feel that this "breaks down" large things beyond what I can do with packages, and helps me orient and navigate.
Of course, if you're developing Eclipse plug-ins, everything would be a project anyway.
The only thing I would watch out for has to do with your source control and its ability to handle moves of files between projects. Subclipse had been giving me trouble with this, or maybe it was my SVN server.
If your project has that many sub-projects, or modules, needed to compose your final artifact, then it is time to look at something like Maven and setting up a multi-module project. It will allow you to work on each module independently without IDE worries, and allows easy setup in your IDE (and in others' IDEs) through the mvn eclipse:eclipse goal. In addition, when building your entire top-level project, Maven will be able to derive from the list of dependencies you have described which modules need to be built in what order.
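For illustration, the top-level POM of such a multi-module build is little more than pom packaging plus the module list (the coordinates and module names here are made up):

    <project xmlns="http://maven.apache.org/POM/4.0.0">
      <modelVersion>4.0.0</modelVersion>
      <groupId>com.example</groupId>
      <artifactId>parent</artifactId>
      <version>1.0-SNAPSHOT</version>
      <packaging>pom</packaging>
      <!-- Maven builds the listed modules in dependency order -->
      <modules>
        <module>core</module>
        <module>client</module>
        <module>webapp</module>
      </modules>
    </project>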
Here's a quick link via google and a link to the book Maven: The Definitive Guide, which will explain things in much better detail in chapter 6 (once you have the basics).
This will also force your project not to be explicitly tied to Eclipse. Being able to build independently of an IDE means that any Joe Schmoe can come along and easily work with your code base using whatever tools he/she needs.
Create jars for the projects you don't work in often. That should greatly reduce the clutter. If you work on all the projects often, then you can add targets to your build that will jar up the respective projects for you, which condenses everything down to one file that you can then include on the class path.
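If the build is Ant-based, such a target can be as small as the following (project names and paths are placeholders):

    <!-- package one project's compiled classes as a jar for the shared classpath -->
    <target name="jar-utils" depends="compile">
      <jar destfile="dist/utils.jar" basedir="build/classes/utils"/>
    </target>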
An additional method is to create many different workspaces. The benefit of separate workspaces is that you can remove some of the visual clutter/ performance overhead of having lots of projects. You can use targets to jar up all of you projects and put them in a repository so you can reference them in each workspace.
At a former job the entire application was more than 170 projects. While it was rarely necessary to have all projects checked out locally, even the 30-40 projects constantly in our scope made reindexing, etc., very slow.
Yeesh. One project for each project. If you have reusable projects, make them into a library, for heaven's sake. Break the non-reusable projects into packages; that's what they are there for.
That's a hard question, and answers span from having just one Eclipse project to having one Eclipse project for every single class.
My bottom line:
You can have too few projects, and never too many (of course, use automation, e.g. mvn eclipse:eclipse).
Use -Declipse.useProjectReferences=true/false with Maven to switch the workspace mode between jar and project dependencies.
Use the mvn release plugin to generate consecutive releases (automatic version increase).
Multiple projects give you independent versioning, which is extremely important. E.g. one dev may work on a new version of a module while you still depend on the previous one, and at some point you decide to upgrade to the newer version (possibly by increasing its version in the pom.xml dependency section). Or, in another scenario, if one project contains a bug you downgrade to its previous version.
Multiple projects make you think about the architecture more than if you just have packages.
Multiple projects generally make architectural problems more evident than if you have just one project. Would anyone like to comment on this?
You never know if your project evolves into OSGi/SOA/EDA where you need the separation.
Even if you're 100% sure that your projects will be deployed as one jar in the old way in a single JVM, it still does not hurt (mvn assembly plugin) to have multiple Eclipse projects for logically independent pieces of code.
BTW, the project I work on is divided into 24 eclipse projects.
Hell, we have more than 100. Projects don't cost anything.
We're trying to separate a big code base into logical modules. I would like some recommendations for tools, as well as whatever experiences you might have had with this sort of thing.
The application consists of a server WAR and several rich clients distributed in JARs. The trouble is that it's all in one big, hairy code base, one source tree of > 2k files. Each JAR has a dedicated class with a main method, but the tangle of dependencies ensnares you quickly. It's not all that bad; good practices were followed consistently and there are components with specific tasks. It just needs some improvement to help our team scale as it grows.
The modules will each be in a Maven project, built by a parent POM. The process has already started of moving each JAR/WAR into its own project, but it's obvious that this will only scratch the surface: a few classes in each app JAR and a mammoth "legacy" project with everything else. Also, there are already some unit and integration tests.
Anyway, I'm interested in tools, techniques, and general advice for breaking up an overly large and entangled code base into something more manageable. Free/open source is preferred.
Have a look at Structure 101. It is awesome for visualizing dependencies, and for showing the dependencies to break on your way to a cleaner structure.
We recently have accomplished a similar task, i.e. a project that consisted of > 1k source files with two main classes that had to be split up. We ended up with four separate projects, one for the base utility classes, one for the client database stuff, one for the server (the project is a rmi-server-client application), and one for the client gui stuff. Our project had to be separated because other applications were using the client as a command line only and if you used any of the gui classes by accident you were experiencing headless exceptions which only occurred when starting on the headless deployment server.
Some things to keep in mind from our experience:
Use an entire sprint for separating the projects (don't let other tasks interfere with the split-up, for you will need the whole sprint)
Use version control
Write unit tests before you move any functionality somewhere else
Use a continuous integration system (doesn't matter if home grown or out of the box)
Minimize the number of files in the current changeset (you will save yourself a lot of work when you have to undo some changes)
Use a dependency analysis tool all the way before moving classes (we have had good experiences with DependencyFinder)
Take the time to restructure the packages into reasonable per project package sets
Don't be afraid to change interfaces, but have all dependent projects in the workspace so that you get all the compilation errors
Two pieces of advice: the first thing you need is a test suite. The second is to work in small steps.
If you already have a strong test suite, then you're in a good position. Otherwise, I would write some good high-level tests (aka system tests).
The main advantage of high-level tests is that a relatively small number of tests can get you great coverage. They will not help you pinpoint a bug, but you won't really need that: if you work in small steps and make sure to run the tests after each change, you'll be able to quickly detect (accidentally introduced) bugs, since the root of the bug is in the small portion of the code that has changed since the last time you ran the tests.
I would start with the various tasks that you need to accomplish.
I was faced with a similar task recently, given a 15 year old code base that had been made by a series of developers who didn't have any communication with one another (one worked on the project, left, then another got hired, etc, with no crosstalk). The result is a total mishmash of very different styles and quality.
To make it work, we've had to isolate the necessary functionality from the decorative fluff around it. For instance, there are a lot of different string classes in there, and one person spent what must have been a great deal of time writing a 2k-line conversion between COleDateTime and const char* and back again; that was fluff, code to solve a task ancillary to the main goal (getting things into and out of a database).
What we ended up having to do was identify a large goal that this code accomplished, and then write the base logic for that. When there was a task we needed to accomplish that we knew had been done before, we found it and wrapped it in library calls, so that it could exist on its own. One code chunk, for instance, activates a USB device driver to create an image; that code is untouched by this current project, but called when necessary via library calls. Another code chunk works the security dongle, and still another queries remote servers for data. That's all necessary code that can be encapsulated. The drawing code, though, was built over 15 years and was such an edifice to insanity that a rewrite in OpenGL over the course of a month was a better use of time than trying to figure out what someone else had done and then how to add to it.
I'm being a bit handwavy here, because our project was MFC C++ to .NET C#, but the basic principles apply:
find the major goal
identify all the little goals that make the major goal possible
Isolate the already encapsulated portions of code, if any, to be used as library calls
figure out the logic to piece it all together.
I hope that helps...
To continue Itay's answer, I suggest reading Michael Feathers' "Working Effectively With Legacy Code" (pdf). He also recommends that every step be backed by tests. There is also a book-length version.
Maven allows you to set up small projects as children of a larger one. If you want to extract a portion of your project as a separate library for other projects, Maven lets you do that as well.
Having said that, you definitely need to document your tasks, what each smaller project will accomplish, and then (as has been stated here multiple times) test, test, test. You need tests which work through the whole project, then have tests that work with the individual portions of the project which will wind up as child projects.
When you start to pull out functionality, you need additional tests to make sure that your functionality is consistent, and that you can mock input into your various child projects.