package structure & directory structure - java

In a Java web application, what is the exact meaning of the terms "package structure" and "directory structure"? Aren't they the same? I have seen articles that use both terms, but I am not sure of their exact meaning or the difference between them.

A package is a collection of code that changes together, is used together, and is shipped together. So a jar/war is a package.
Package Design Principles
I understand that you meant a source package, which is more like a directory structure. But I believe a directory is a physical representation on the hard drive.
EDIT: I wrote the original answer more than three years ago and did not change it because it had been accepted. I am changing it now so that new visitors may benefit, and to avoid link rot. Some additional meaning of "package" may be drawn from the discussion below. For example, is a jar a package?
Classes that are reused together should be packaged together, so that the package can be treated as a sort of complete product. Classes that are reused together should also be kept apart from those they are not reused with. For example, your logging utility classes are not necessarily used together with your file I/O classes, so package the logging classes separately. The logging classes, however, may well be related to one another. So create a sort of complete product for logging (say, for want of a better name, commons-logging) and package it in a (re)usable jar; then create another, separate complete product for I/O utilities, again for want of a better name, say commons-io.jar. If you update the commons-io library to, say, support Java NIO, you may not need to make any changes to the logging library. So separating them is better.
Now, let's say you want your logging utility classes to support structured logging for some sort of log analysis by tools like Splunk. Some clients of your logging utility may want to update to your newer version; others may not. So when you release a new version, package together all the classes that are needed and reused together for the migration. Some clients can then safely delete your old commons-logging jar and move to the commons-logging-new jar, while other clients stay on the older jar. However, no client should need both jars (new and old) just because you forced it to use some classes from the older jar.
Avoid cyclic dependencies: a depends on b, b on c, c on d, but d depends on a. This scenario is clearly undesirable: it becomes very difficult to define layers or modules, and you cannot vary them independently of each other.
Also, package your classes so that if one layer or module changes, the other modules or layers do not necessarily have to change. For example, if you decide to move from an old MVC framework to REST APIs, then only the view and controller may need changes; your model does not.
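A minimal sketch of how such a cycle could be detected, assuming the dependencies are available as a plain map from each package (or jar) name to the names it depends on. The class name DependencyCycles and the map layout are assumptions made for illustration, not part of any tool mentioned in this thread.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: detects a cycle in a dependency graph with a depth-first search.
class DependencyCycles {

    // Returns true if following "depends on" edges can lead from some node back to itself.
    static boolean hasCycle(Map<String, List<String>> dependsOn) {
        Set<String> finished = new HashSet<>();   // fully explored, known to be cycle-free
        Set<String> inProgress = new HashSet<>(); // nodes on the current search path
        for (String node : dependsOn.keySet()) {
            if (visit(node, dependsOn, finished, inProgress)) {
                return true;
            }
        }
        return false;
    }

    private static boolean visit(String node, Map<String, List<String>> dependsOn,
                                 Set<String> finished, Set<String> inProgress) {
        if (finished.contains(node)) {
            return false;
        }
        if (!inProgress.add(node)) {
            return true; // node is already on the current path, so we came back to it
        }
        for (String dep : dependsOn.getOrDefault(node, Collections.emptyList())) {
            if (visit(dep, dependsOn, finished, inProgress)) {
                return true;
            }
        }
        inProgress.remove(node);
        finished.add(node);
        return false;
    }
}
```

For the a, b, c, d example, a map with a -> [b], b -> [c], c -> [d], d -> [a] would be reported as cyclic; removing the last entry would make it acyclic.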

In most Java applications, the package structure should be matched by the directory structure for the .java and .class files. However, these directories are part of a larger directory structure that also holds data other than the source and bytecode.
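The correspondence can be sketched as a simple name-to-path mapping. The helper PackagePaths and the example class name com.acme.Widget are hypothetical; the sketch covers only top-level classes and ignores nested classes.

```java
// Hypothetical helper: maps a fully qualified class name to the relative path
// where the source or class file is conventionally found under a root directory.
class PackagePaths {

    // e.g. package com.acme, class Widget -> com/acme/Widget.java
    static String toSourcePath(String fullyQualifiedClassName) {
        return fullyQualifiedClassName.replace('.', '/') + ".java";
    }

    // The compiled bytecode mirrors the same layout with a .class extension.
    static String toClassFilePath(String fullyQualifiedClassName) {
        return fullyQualifiedClassName.replace('.', '/') + ".class";
    }
}
```

The root that these relative paths hang from (a source folder, a classes folder, or the inside of a jar) is the part of the directory structure that the package structure says nothing about.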
Depending on the context, the "package structure" might also refer to delivery packages, each containing an application or a library.

Related

How to better handle having the same classes in two JARs

I wrote two jars. Each of them is responsible for sending a different HTTP/HTTPS request.
Naturally, they share certain classes, such as the ones that build the requests or send them. The process might be a bit different, but the general structure and class names are the same.
Building a separate jar per request is a requirement from my managers! So using one jar for all my HTTP requests is not acceptable.
Now, in my client program I need to send one request using JarA and one using JarB. But compilation fails because, naturally, I am using very similar names for the classes and methods.
For example, I have a UserData class in both jars. So when I try to use it in my client program, the compiler complains: "reference to UserData is ambiguous".
I could start improvising jar-specific class names, but that is ugly...
How would you suggest to solve this problem?
If the classes are identical, pull them out into a third JAR and then have the client program reference the common JAR plus JarA or JarB.
If the classes are similar but not identical, then put them into different packages. You can have classes with the same names if they're in different packages.
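As an illustration with two real JDK classes that share a simple name, java.util.Date and java.sql.Date can be used side by side as long as at least one is referred to by its fully qualified name. The demo class SameSimpleName is hypothetical; the same approach would apply to two UserData classes placed in different packages.

```java
// Hypothetical demo class: two classes named Date coexist because their packages differ.
class SameSimpleName {

    static String describe() {
        // Fully qualified names remove the ambiguity. Importing both classes
        // with single-type imports would be a compile error.
        java.util.Date utilDate = new java.util.Date(0L);
        java.sql.Date sqlDate = new java.sql.Date(0L);
        return utilDate.getClass().getName() + " and " + sqlDate.getClass().getName();
    }
}
```

A common compromise is to import the class used most often and spell out the other one in full.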
Put common classes in a third jar and either bundle it in the two http jars or add it to the classpath at runtime (which is the best choice will depend on how you're deploying, etc.).
First, you have to decide which kind of architecture you are working with.
If your managers are asking for different jars for the sake of modularization, it is certainly worth making a common jar that contains all the common classes.
I suppose your project should be built with Maven, Gradle, or another build system that helps you manage dependencies.
It is a different matter if you are supposed to follow a 'microservices' architecture. Then code duplication is inevitable.
To deal with identical class names when you do have duplication, I would recommend giving every module a different package name.
Use a build system like Maven, where libraries can declare dependencies on a common third jar. It maintains a repository of versioned jars.
Another solution: if the same class in the same package appears in two different jars, and both jars are required in your project, you can download the source code of the duplicated class and keep a copy in your own project under the same package structure. This way the JVM loads your project's class first (assuming your project's classes come before the jars on the classpath) and uses it in preference to the copies in the jars.

Resolving java package dependencies

It is time to sub-divide a platform I'm developing, and I'm looking for advice on how to handle cross-component dependencies. I suppose there are many cases, so I'll give an example.
I have an Address class that I want to make visible to developers. It is also referenced by classes in my.Contacts, my.Appointments, and my.Location packages - each of which I want to be separately compiled, jar-d, and delivered. Of course I want Address to be a single class - an Address works across these platform components transparently.
How should Address be packaged, built, and delivered?
Thanks!
Two thoughts:
Address sounds like a common component that can be used in different deliverables and so should be available in some common or core library
It may make sense for your components to talk to an Address interface, and the implementation can be provided separately (e.g. provide an Address interface and an AddressImpl implementation). This will reduce the amount of binding between the core library and the library your developers will develop.
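A minimal sketch of that second idea, assuming the interface would ship in a shared core jar and the implementation in a separate one. The accessor names and the AddressFormatter class are assumptions made for illustration; only Address and AddressImpl come from the suggestion above.

```java
// Would live in the shared "core" jar that every component compiles against.
interface Address {
    String street();
    String city();
}

// Would live in a separate implementation jar; components never name this type directly.
final class AddressImpl implements Address {
    private final String street;
    private final String city;

    AddressImpl(String street, String city) {
        this.street = street;
        this.city = city;
    }

    @Override
    public String street() {
        return street;
    }

    @Override
    public String city() {
        return city;
    }
}

// Hypothetical code in my.Contacts, my.Appointments or my.Location: depends only on the interface.
class AddressFormatter {
    static String oneLine(Address address) {
        return address.street() + ", " + address.city();
    }
}
```

Because AddressFormatter mentions only Address, the implementation jar can be rebuilt or replaced without recompiling the components that consume addresses.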
In this case Address is part of a library that deserves its own jar. If you create a class named Address in each of my.Contacts, my.Appointments, and my.Location and you want to use all these jars in the same application, you'll have a conflict for your Address class.
I suggest you don't "Deliver" these jars separately. Java has very subtle versioning issues that you don't want to run into. Build everything together and package it into one or two jars and always deliver both jars, or build them together and deliver a subset of jars (but never combine new and old jars--don't just try to send a single jar as an update).
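If you must build them separately, be very aware that final constants (static final primitives and strings initialised with constant expressions) are compiled into the classes that use them rather than referenced. So if you change one and deliver a new jar, any references from an older jar will not be updated.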
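A small sketch of the difference, using a hypothetical Limits class that is assumed to be delivered in its own jar. The names and the value 10 are illustrative only.

```java
// Hypothetical class, assumed to be delivered in its own jar.
class Limits {

    // Compile-time constant: javac copies the value 10 into every class that reads this field,
    // so callers compiled earlier keep the old value until they are recompiled.
    static final int MAX_INLINED = 10;

    // Initialised by a method call, so it is not a compile-time constant;
    // callers read the field from this class at run time.
    static final int MAX_READ_AT_RUNTIME = Integer.valueOf(10);

    // An accessor method is another way to avoid the inlining.
    static int max() {
        return 10;
    }
}
```

If the value might ever change between releases, the second or third form lets a client pick up the new value from an updated jar without being rebuilt.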
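Also, method signatures that change can produce strange results: an older jar that still calls the old signature typically fails only at run time, for example with a NoSuchMethodError.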
It sounds like you want a developer interface as well--that may be a set of interfaces and classes that reside in a separate jar. If you make that one jar well enough that you never have to rev it (and, of course, with no references to external constants) you can probably get away with not updating it which will keep your customer's extensions from getting crusty.

What is best practice (and implications) for packaging projects into JARs?

What is considered best practice when deciding how to define the set of JARs for a project (for example a Swing GUI)? There are many possible groupings:
JAR per layer (presentation, business, data)
JAR per (significant?) GUI panel. For a significant system, this results in a large number of JARs, but the JARs are (or should be) more reusable, because they are fine-grained
JAR per "project" (in the sense of an IDE project); "common.jar", "resources.jar", "gui.jar", etc
I am an experienced developer; I know the mechanics of creating JARs, I'm just looking for wisdom on best practice.
Personally, I like the idea of a JAR per component (e.g. a panel), as I am mad-keen on encapsulation and the holy grail of reuse across projects. I am concerned, however, that on a practical, performance level, the JVM would struggle with class loading over dozens, maybe hundreds, of small JARs. Each JAR would contain the GUI panel code and the necessary resources (i.e. not centralised), so each panel can stand alone.
When I say "holy grail of reuse", I say this more because it demonstrates a cleanly decoupled, encapsulated design than because I necessarily expect its reuse elsewhere. I consider myself a "normally intelligent" person; I find that the spaghetti of intertwined nonsense I've had to deal with during my career slows me down 10 to 100-fold. A cleanly decoupled design allows me to deal with one concept at a time, one layer, one class.
Does anyone have wisdom to share?
I would recommend as few JARs as possible.
The logic behind it: disk storage is the cheapest commodity available, but the time spent tracing down complex dependencies is priceless.
Hence the emergence of the .war files where all dependencies of the web application are put into a single file.
BTW, Eclipse has a JAR exporter plugin which puts all dependent jars into a super jar and exposes the entry-level main method, so you can start your app with the java -jar file.jar command. Although the resulting jar may be large, the upside is not having to maintain very complex class paths for your application.
So, in your case I would go with one jar per project. If you determine that you do need to reuse some code in another project, just refactor it into a base project and make that a dependency of both your existing project and the other project.
You can actually use both approaches. Spring, for example, offers a big monolithic jar file which contains the most common functionality. If you prefer, however, you can also download independent jar files. It is then left to the user to select what is best. Big jar files are easier to deploy, but they are harder to upgrade. Also, you may need to add a big jar when you only need a single class. I find that it is easier to spot dependencies with small jar files. I also think that updating/upgrading is easier.
Java provides encapsulation and re-use at the class layer - the jar file format doesn't really provide it. I don't see much advantage in putting a significant component in its own jar, unless you think lots of people will be downloading it.
I read somewhere (and I was trying to find it when I found this) that project per layer is the best. It's what I've been doing. Struts, Spring MVC, Swing, whatever in one layer, EJBs in another, business services in another and DAOs in another. I put all of the DTOs in their own project as well, even though they don't represent a layer, but are instead passed through the layers.
The main benefit I remember reading about was being able to version each jar separately.
Oh, and BTW, each layer actually has two jars, one for the interfaces that the layer above uses, and another for the implementation(s).

Split packages in plain java

OSGi has a problem with split packages, i.e. same package but hosted in multiple bundles.
Are there any edge cases where split packages might cause problems in plain Java (without OSGi)?
Just curious.
Where split packages come from
Split packages (in OSGi) occur when the manifest header Require-Bundle is used (as it is, I believe, in Eclipse's manifests). Require-Bundle names other bundles which are used to search for classes (if the package isn't Imported). The search happens before the bundle's own classpath is searched. This allows the classes for a single package to be loaded from the exports of multiple bundles (probably distinct jars).
The OSGi spec (4.1) section 3.13 describes Require-Bundle and has a long list of (unexpected) consequences of using this header (ought this header be deprecated?), one section of which is devoted to split packages. Some of these consequences are bizarre (and rather OSGi-specific) but most are avoided if you understand one thing:
if a class (in a package) is provided by more than one bundle then you are in trouble.
If the package pieces are disjoint, then all should be well, except that you might not have the classes visible everywhere and package visibility members might appear to be private if viewed from a "wrong" part of a split package.
[Of course that's too simple—multiple versions of packages can be installed—but from the application's point of view at any one time all classes from a package should be sourced from a single module.]
What happens in 'standard Java'
In standard Java, without fancy class-loaders, you have a classpath, and the order of searching of jars (and directories) for classes to load is fixed and well-defined: what you get is what you get. (But then, we give up manageable modularity.)
Sure, you can have split packages—it's quite common in fact—and it is an indication of poor modularity. The symptoms can be obscure compile/build-time errors, but in the case of multiple class implementations (one over-rides the rest in a single class-path) it most often produces obscure run-time behaviour, owing to subtly-different semantics.
If you are lucky, you end up looking at the wrong code without realising it and asking yourself, "but how can that possibly be doing that?" If you are unlucky, you are looking at the right code and asking exactly the same thing, because something else was producing unexpected answers.
This is not entirely unlike the old database adage: "if you record the same piece of information in two places, pretty soon it won't be the same anymore". Our problem is that 'pretty soon' isn't normally soon enough.
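One way to look for such duplicates, sketched under the assumption that the application runs from a plain classpath with the system class loader: ask the loader for every location that provides a given class file. The class ClassLocations and the example names are hypothetical; ClassLoader.getResources is the standard JDK call.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: lists every classpath location that supplies a given class.
class ClassLocations {

    // com.acme.Widget -> com/acme/Widget.class
    static String toResourceName(String fullyQualifiedClassName) {
        return fullyQualifiedClassName.replace('.', '/') + ".class";
    }

    // More than one URL in the result means several jars or directories contain the class.
    static List<URL> find(String fullyQualifiedClassName) {
        ClassLoader loader = ClassLoader.getSystemClassLoader();
        try {
            return Collections.list(loader.getResources(toResourceName(fullyQualifiedClassName)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Running a check like this over the classes of a suspect package is one way to learn which jar actually "wins" and which copies are being shadowed.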
For OSGi, packages in different bundles are different, regardless of their name, because each bundle uses its own class loader. This is not a problem but a feature that ensures encapsulation of bundles.
So in plain Java this is normally not a problem, until you start using some framework that uses class loaders. That is typically the case when components are loaded.
Splitting packages across jars probably isn't a great idea. I suggest making all packages within jars sealed (put "Sealed: true" in the main section of the manifest). Sealed packages can't be split between jars.
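As a sketch of what that manifest entry looks like, the following builds a manifest in memory with the JDK's java.util.jar API and renders it as text. The class name SealedManifest is hypothetical; in practice the entry is usually written by the build tool or kept in a hand-written MANIFEST.MF.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

// Hypothetical helper: produces the text of a manifest whose main section seals all packages.
class SealedManifest {

    static String build() {
        Manifest manifest = new Manifest();
        Attributes main = manifest.getMainAttributes();
        // Manifest-Version must be present, otherwise the main attributes are not written.
        main.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        main.put(Attributes.Name.SEALED, "true");

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            manifest.write(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

The returned text contains the lines "Manifest-Version: 1.0" and "Sealed: true", which is the main-section entry described above.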
In the case of OSGi, classes with the same package name but a different class loader are treated as if they are in different packages.
You'll get a nasty runtime error if you have classes in the same package and some are in a signed JAR while others are not.
Are you asking because the package in question is yours, not third party code?
An easy example would be a web app with service and persistence layers as separate OSGi bundles. The persistence interfaces would have to be shared by both bundles.
If I've interpreted your question correctly, would the solution be to create a sealed JAR containing the shared interfaces and make it part of both bundles?
I don't mean to try and hijack the thread. I'm asking for clarification and some better insight from those who might have done more with OSGi to date than I have.

How should I structure a Java application, where do I put my classes?

First of all, I know how to build a Java application. But I have always been puzzled about where to put my classes. There are proponents for organizing the packages in a strictly domain oriented fashion, others separate by tier.
I myself have always had problems with
naming,
placing
So,
Where do you put your domain specific constants (and what is the best name for such a class)?
Where do you put classes for stuff which is both infrastructural and domain specific (for instance, I have a FileStorageStrategy class, which stores the files either in the file system or, alternatively, in the database)?
Where to put Exceptions?
Are there any standards to which I can refer?
I've really come to like Maven's Standard Directory Layout.
One of the key ideas for me is to have two source roots - one for production code and one for test code like so:
MyProject/src/main/java/com/acme/Widget.java
MyProject/src/test/java/com/acme/WidgetTest.java
(here, both src/main/java and src/test/java are source roots).
Advantages:
Your tests have package (or "default") level access to your classes under test.
You can easily package only your production sources into a JAR by dropping src/test/java as a source root.
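The first advantage can be sketched as follows. Both classes are shown in one listing with the package declaration omitted so that the sketch compiles as a single file; in the layout above, each would sit in its own file under its own source root, both declaring package com.acme. The method internalCount and its value are assumptions made for illustration.

```java
// Would live in src/main/java/com/acme/Widget.java
class Widget {
    // No access modifier: package-private, invisible to code outside the package.
    int internalCount() {
        return 42;
    }
}

// Would live in src/test/java/com/acme/WidgetTest.java (same package, different source root).
class WidgetTest {
    // Compiles only because the test shares the package of the class under test.
    static boolean canSeeInternals() {
        return new Widget().internalCount() == 42;
    }
}
```

Because the test class lives under a different source root, it can be left out of the production jar without moving it to another package.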
One rule of thumb about class placement and packages:
Generally speaking, well structured projects will be free of circular dependencies. Learn when they are bad (and when they are not), and consider a tool like JDepend or SonarJ that will help you eliminate them.
I'm a huge fan of organized sources, so I always create the following directory structure:
/src - for your packages & classes
/test - for unit tests
/docs - for documentation, generated and manually edited
/lib - 3rd party libraries
/etc - unrelated stuff
/bin (or /classes) - compiled classes, output of your compile
/dist - for distribution packages, hopefully auto generated by a build system
In /src I'm using the default Java patterns: Package names starting with your domain (org.yourdomain.yourprojectname) and class names reflecting the OOP aspect you're creating with the class (see the other commenters). Common package names like util, model, view, events are useful, too.
I tend to put constants for a specific topic in their own class, like SessionConstants or ServiceConstants, in the same package as the domain classes.
Where I'm working, we're using Maven 2 and we have a pretty nice archetype for our projects. The goal was to obtain a good separation of concerns, thus we defined a project structure using multiple modules (one for each application 'layer'):
- common: common code used by the other layers (e.g., i18n)
- entities: the domain entities
- repositories: this module contains the DAO interfaces and implementations
- services-intf: interfaces for the services (e.g., UserService, ...)
- services-impl: implementations of the services (e.g, UserServiceImpl)
- web: everything regarding the web content (e.g., css, jsps, jsf pages, ...)
- ws: web services
Each module has its own dependencies (e.g., repositories could have jpa) and some are project wide (thus they belong in the common module). Dependencies between the different project modules clearly separate things (e.g., the web layer depends on the service layer but doesn't know about the repository layer).
Each module has its own base package, for example if the application package is "com.foo.bar", then we have:
com.foo.bar.common
com.foo.bar.entities
com.foo.bar.repositories
com.foo.bar.services
com.foo.bar.services.impl
...
Each module respects the standard maven project structure:
src\
    main\
        java\
        resources\
    test\
        java\
        resources\
Unit tests for a given layer easily find their place under \src\test... Everything that is domain specific has its place in the entities module. Now something like a FileStorageStrategy should go into the repositories module, since we don't need to know exactly what the implementation is. In the services layer, we only know the repository interface; we do not care what the specific implementation is (separation of concerns).
There are multiple advantages to this approach:
clear separation of concerns
each module is packageable as a jar (or a war in the case of the web module) and thus allows for easier code reuse (e.g., we could install the module in the maven repository and reuse it in another project)
maximum independence of each part of the project
I know this doesn't answer all your questions, but I think this could put you on the right path and could prove useful to others.
Class names should always be descriptive and self-explanatory. If you have multiple domains of responsibility for your classes then they should probably be refactored.
Likewise for your packages. They should be grouped by domain of responsibility. Every domain has its own exceptions.
Generally don't sweat it until you get to a point where it is becoming overwhelming and bloated. Then sit down and don't code, just refactor the classes out, compiling regularly to make sure everything works. Then continue as you did before.
Use packages to group related functionality together.
Usually the top of your package tree is your domain name reversed (com.domain.subdomain) to guarantee uniqueness, and then usually there will be a package for your application. Then subdivide that by related area, so your FileStorageStrategy might go in, say, com.domain.subdomain.myapp.storage, and then there might be specific implementations/subclasses/whatever in com.domain.subdomain.myapp.storage.file and com.domain.subdomain.myapp.storage.database. These names can get pretty long, but import keeps them all at the top of files and IDEs can help to manage that as well.
Exceptions usually go in the same package as the classes that throw them, so if you had, say, FileStorageException it would go in the same package as FileStorageStrategy. Likewise an interface defining constants would be in the same package.
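A rough sketch of that arrangement. Everything is shown in one listing without package declarations so that it compiles as a single file; the comments note where each type would live. The method signature, the MapBackedStorage class (a stand-in for a real database-backed implementation), and its validation rule are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Would live in com.domain.subdomain.myapp.storage
interface FileStorageStrategy {
    void store(String name, byte[] content) throws FileStorageException;
}

// Would live in com.domain.subdomain.myapp.storage, next to the code that throws it
class FileStorageException extends Exception {
    FileStorageException(String message) {
        super(message);
    }
}

// Would live in a sub-package such as com.domain.subdomain.myapp.storage.database;
// a map stands in for the real storage so the sketch runs on its own.
class MapBackedStorage implements FileStorageStrategy {
    private final Map<String, byte[]> files = new HashMap<>();

    @Override
    public void store(String name, byte[] content) throws FileStorageException {
        if (name == null || name.isEmpty()) {
            throw new FileStorageException("file name must not be empty");
        }
        files.put(name, content.clone());
    }

    int size() {
        return files.size();
    }
}
```

Callers depend on FileStorageStrategy and FileStorageException from the storage package and never need to import the implementation sub-packages.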
There's not really any standard as such, just use common sense, and if it all gets too messy, refactor!
One thing that I found very helpful for unit tests was to have a myApp/src/ and also myApp/test_src/ directories. This way, I can place unit tests in the same packages as the classes they test, and yet I can easily exclude the test cases when I prepare my production installation.
Short answer: draw your system architecture in terms of modules, drawn side-by-side, with each module sliced vertically into layers (e.g. view, model, persistence). Then use a structure like com.mycompany.myapp.somemodule.somelayer, e.g. com.mycompany.myapp.client.view or com.mycompany.myapp.server.model.
Using the top level of packages for application modules, in the old-fashioned computer-science sense of modular programming, ought to be obvious. However, on most of the projects I have worked on we end up forgetting to do that, and end up with a mess of packages without that top-level structure. This anti-pattern usually shows itself as a package for something like 'listeners' or 'actions' that groups otherwise unrelated classes simply because they happen to implement the same interface.
Within a module, or in a small application, use packages for the application layers. Likely packages include things like the following, depending on the architecture:
com.mycompany.myapp.view
com.mycompany.myapp.model
com.mycompany.myapp.services
com.mycompany.myapp.rules
com.mycompany.myapp.persistence (or 'dao' for data access layer)
com.mycompany.myapp.util (beware of this being used as if it were 'misc')
Within each of these layers, it is natural to group classes by type if there are a lot. A common anti-pattern here is to unnecessarily introduce too many packages and levels of sub-package so that there are only a few classes in each package.
I think you should keep it simple and not overthink it. Don't over-abstract or layer too much. Just keep it neat, and as it grows, refactoring it is trivial. One of the best features of IDEs is refactoring, so why not make use of it and save your brain power for solving problems that are related to your app, rather than meta issues like code organisation?
One thing I've done in the past - if I'm extending a class I'll try and follow their conventions. For example, when working with the Spring Framework, I'll have my MVC Controller classes in a package called com.mydomain.myapp.web.servlet.mvc
If I'm not extending something I just go with what is simplest. com.mydomain.domain for Domain Objects (although if you have a ton of domain objects this package could get a bit unwieldy).
For domain-specific constants, I actually put them as public constants in the most related class. For example, if I have a "Member" class and have a maximum member name length constant, I put it in the Member class. Some shops make a single separate Constants class, but I don't see the value in lumping unrelated numbers and strings into one class. I've seen other shops try to solve this problem by creating several separate Constants classes, but that just seems like a waste of time, and the result is too confusing. With that setup, a large project with multiple developers will end up duplicating constants all over the place.
I like to break my classes down into packages that are related to each other.
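A minimal sketch of the Member example, assuming a limit of 50 characters and a constructor that enforces it; both the value and the validation are assumptions made for illustration.

```java
// The constant lives in the class it describes rather than in a shared Constants class.
class Member {
    // Hypothetical limit; the value 50 is chosen only for illustration.
    public static final int MAX_NAME_LENGTH = 50;

    private final String name;

    Member(String name) {
        if (name == null || name.length() > MAX_NAME_LENGTH) {
            throw new IllegalArgumentException(
                    "name must be at most " + MAX_NAME_LENGTH + " characters");
        }
        this.name = name;
    }

    String name() {
        return name;
    }
}
```

Other code refers to the limit as Member.MAX_NAME_LENGTH, so the constant is found wherever the class is.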
For example:
Model: for database-related calls
View: classes that deal with what you see
Control: core functionality classes
Util: any misc. classes that are used (typically static functions)
etc.
