Automatic Refactoring of Monolith


We have a fairly large monolithic application that we would like to refactor at scale. The first step will be to derive several artefacts that can be compiled independently. Given the size of the application, we would like to automate this as much as possible.
An example:
+ package1
| |
| + Service1
|
+ package2
| |
| + Service2
|
+ interfacepackage
Assuming Service1 is only used from within package1, it should not be touched. Assuming Service2 is used from Service1, I would like to automatically generate a minimal interface for Service2, put that interface in the package interfacepackage, and change the dependency within Service1 to the interface.
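To make the intended result concrete, here is a minimal sketch of what the target state might look like. All names beyond Service1, Service2 and interfacepackage (for example Service2Api, lookup, reindex) are hypothetical, and the types are nested in one file only so the example compiles on its own; in the real code base they would live in the packages noted in the comments.

```java
// Hypothetical target state after the refactoring; method names are illustrative only.
public class ExtractedInterfaceSketch {

    // Would live in interfacepackage: only the methods Service1 actually calls.
    interface Service2Api {
        String lookup(String key);
    }

    // Stays in package2; may expose more methods than the extracted interface.
    static class Service2 implements Service2Api {
        @Override
        public String lookup(String key) {
            return "value-for-" + key;
        }

        // Not called by Service1, so it is not part of the minimal interface.
        public void reindex() {
        }
    }

    // Stays in package1; now depends on the interface instead of Service2.
    static class Service1 {
        private final Service2Api service2;

        Service1(Service2Api service2) {
            this.service2 = service2;
        }

        String describe(String key) {
            return "Service1 got " + service2.lookup(key);
        }
    }

    public static void main(String[] args) {
        Service1 service1 = new Service1(new Service2());
        System.out.println(service1.describe("a")); // prints "Service1 got value-for-a"
    }
}
```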
Doing this manually would be no trouble at all. Both IDEA and Eclipse provide semi-automatic refactorings, but we would like to formulate them as meta-rules. I had hoped that either Eclipse or IntelliJ would have a programmatic interface for defining such rules, but I have not been able to find one yet.
I have also found the Eclipse refactoring scripts, but these seem to be restricted to refactorings of named classes. They would help if I knew all the services that should be refactored, but not if I want to define conditions on the classes to be refactored.
Where should I look for a solution?
Clarification, in response to the comment "So what's your problem?":
We have a high three-digit number of services that make up this monolith, spread over approximately 20 packages. The whole codebase is approaching 1 million lines of code. My problem is simply the size: doing the refactorings manually could take months, and we might miss something. Also, disentangling the services is only the first step, so we expect to apply many similar refactorings down the road.

IntelliJ IDEA has an "Open API" that can be used for plugin development.
The advantage is that IntelliJ parses the Java code, and the resulting "meta model" is available to you as a plugin author.
In IntelliJ, the "AST" model refers to the abstract syntax tree. This structure is invaluable for plugins that perform refactorings.
Through it you can easily inspect the package structure, class names, code, and so on.
https://www.jetbrains.org/intellij/sdk/docs/basics/getting_started.html
Note: the Java functionality for plugin development has been extracted into a separate plugin.
https://blog.jetbrains.com/platform/2019/06/java-functionality-extracted-as-a-plugin/
Please also have a look at my own plugins on GitHub, where I have posted the source code.
https://github.com/Steve-Murphy/unencapsulate-plugin

Related

How to split a huge project into maven modules

I have an OSS project consisting of 20,000 lines of code and 300 classes. These classes are divided into two modules: front-end and back-end.
A problem has recently arisen: there are so many classes in one module that it takes a long time to compile.
Changing just one line causes Maven to rebuild all the classes in that module.
To solve this, I am thinking of dividing it further into several modules.
The current package configuration is as follows:
.
|
|-FrontModule/
|
|-BackEndModule/
| |-com.example.package/
| | |-resolver/
| | | |-...
| | |-loader/
| | |-installer/
| | |-BackendDaemon.java
We are considering refactoring this as follows:
.
|
|-FrontModule/
|
|-BackEndModule/
| |-com.example.package/
| | |-BackendDaemon.java
|-ResolverModule/
| |-...
|-LoaderModule/
| |-...
|-InstallerModule/
| |-...
Currently, representative instances of resolver, loader, and installer are stored in BackendDaemon's constants.
The installer also operates using an instance of resolver in BackendDaemon.
In this situation, I believe that refactoring a package into a module will invariably result in interdependencies somewhere.
Is there any way or design to somehow split this huge code into modules?
I would appreciate it if someone could answer my question.
Also, since I am using a translator to translate from Japanese, if there are expressions that are difficult to understand or things that are not conveyed, please let me know.
Thank you.

A good way to do this is to follow the so-called onion architecture; see for example https://dev.to/barrymcauley/onion-architecture-3fgl. The basic idea is that your domain/service module sits at the core of your model. Being at the center means that it has no dependencies on other modules. Around it comes a first layer of modules containing, for example, database access, then another layer with interfaces to other systems, and finally the user interfaces in the outermost layer. You only create dependencies that point inward, which prevents dependency cycles.
I don't know what your application is about, but going by what you put in the question I'd say you have a domain module at the core, then a resolver in the 2nd layer. The next layer has your loader and installer. You already made the outermost layer in a separate module in the past, which is the front module.
Also keep in mind that you can use dependency inversion (the D of the famous SOLID acronym) to make sure that your dependencies point the right way. So if your domain module needs to resolve something, it'll use a 'ResolverInterface' from the domain module that is implemented by a class inside the resolver module. This way there is only a dependency from the resolver module towards the domain module. A dependency injection framework (Spring is a very popular one) will make sure that an implementation of that 'ResolverInterface' is available at runtime via autowiring.
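As a rough illustration of the dependency direction described above, a minimal sketch could look like this. Only the name ResolverInterface comes from the text; DomainService, DefaultResolver and the method names are hypothetical, and the wiring is done by hand in main where a framework such as Spring would otherwise supply the implementation. The types are nested in one file only so that the example compiles on its own.

```java
// Sketch of dependency inversion between a domain module and a resolver module.
public class DependencyInversionSketch {

    // Domain module: owns the interface it needs.
    interface ResolverInterface {
        String resolve(String name);
    }

    // Domain module: depends only on its own interface, never on the resolver module.
    static class DomainService {
        private final ResolverInterface resolver;

        DomainService(ResolverInterface resolver) {
            this.resolver = resolver;
        }

        String install(String name) {
            return "installing " + resolver.resolve(name);
        }
    }

    // Resolver module: depends on the domain module because it implements its interface.
    static class DefaultResolver implements ResolverInterface {
        @Override
        public String resolve(String name) {
            return name + "-1.0";
        }
    }

    public static void main(String[] args) {
        // Manual wiring; a dependency injection framework would normally do this step.
        DomainService service = new DomainService(new DefaultResolver());
        System.out.println(service.install("plugin")); // prints "installing plugin-1.0"
    }
}
```

The only compile-time dependency between the two modules runs from DefaultResolver to ResolverInterface, so no cycle is introduced.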

I think you should first ask yourself why you want to split the project; using JRebel may already help with your compile-time problem. Otherwise, a Maven parent POM with submodules is one way to go, or you could use other build tools or build flows with a little scripting. However, I recommend not just organizing your project conceptually in the IDE: first pick an architecture based on your project's scalability needs, prerequisites, and feature list, then plan how to move step by step.

interfaces without implementation in Java

Is there a quick way (maybe using IntelliJ, Maven, etc.) to find all Java interfaces that have no implementation?
I need this because we did a code clean-up in which we removed a number of classes (and their corresponding interfaces) that we think are not in use. Because it was a huge clean-up done by hand, some interfaces may have been left behind without an implementation; if one of them is still used by a bean, this could cause a runtime exception. I am validating manually, but an automated way would be great.
regards
Sanjay

Try Analyze | Run Inspection by Name | Unused declaration, then choose the necessary filters.
See the relevant documentation for more options.
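If you would rather check this outside the IDE, a small reflection-based sketch along the following lines might serve as a starting point. It assumes you already have the list of classes to inspect (how to collect them from your build output is left open), and the names UnimplementedInterfaces, Used, Orphan and UsedImpl are hypothetical. It will not see implementations created at runtime, such as lambdas or dynamic proxies.

```java
import java.util.ArrayList;
import java.util.List;

public class UnimplementedInterfaces {

    // Hypothetical sample types used only for the demonstration in main.
    interface Used { }
    interface Orphan { }
    static class UsedImpl implements Used { }

    /** Returns the interfaces in candidates that no non-interface class in candidates implements. */
    static List<Class<?>> findUnimplemented(List<Class<?>> candidates) {
        List<Class<?>> result = new ArrayList<>();
        for (Class<?> type : candidates) {
            // Only plain interfaces are of interest; annotation types are skipped.
            if (!type.isInterface() || type.isAnnotation()) {
                continue;
            }
            boolean implemented = false;
            for (Class<?> other : candidates) {
                // isAssignableFrom also covers indirect implementation via superclasses or sub-interfaces.
                if (!other.isInterface() && type.isAssignableFrom(other)) {
                    implemented = true;
                    break;
                }
            }
            if (!implemented) {
                result.add(type);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Class<?>> classes = List.of(Used.class, Orphan.class, UsedImpl.class);
        // Only Orphan is reported, because UsedImpl implements Used.
        System.out.println(findUnimplemented(classes));
    }
}
```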

Strategy to separate module from Java project

I'm separating a module from a Java project (essentially creating a separate Java project for the module). I've been spending a lot of time figuring out the class dependencies. What strategy or tool can I use to speed up the process?
For example, if I know for sure that I need to extract class A into the new project, how can I quickly identify all the classes on which class A depends for successful compilation?
Approaches tried:
At the moment, I'm going by repetitive cycles of: add required classes to new project -> compile -> gather necessary classes from original project based on compiler error message -> repeat.
Generating UML Class Diagrams is not working out as most UML reverse-engineering tools seem to have some limitation or the other when it comes to identifying inheritance, associations, etc. I already tried ArgoUML.
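The compile-and-repeat cycle described above effectively computes the transitive closure of class A's dependencies. As a minimal sketch, assuming the direct dependencies of each class are already known (for instance exported from a dependency analysis tool such as the JDK's jdeps), the full set could be computed in one pass. The class name DependencyClosure and the sample dependency map are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class DependencyClosure {

    /** Returns root plus every class reachable from it through the direct-dependency map. */
    static Set<String> requiredFor(String root, Map<String, Set<String>> directDeps) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(root);
        while (!queue.isEmpty()) {
            String current = queue.poll();
            // add() returns false if the class was already visited, which also handles cycles.
            if (!seen.add(current)) {
                continue;
            }
            for (String dep : directDeps.getOrDefault(current, Set.of())) {
                if (!seen.contains(dep)) {
                    queue.add(dep);
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        // Hypothetical direct dependencies: A -> B, C; B -> D; C -> D; D -> A (cycle); E -> A.
        Map<String, Set<String>> deps = Map.of(
                "A", Set.of("B", "C"),
                "B", Set.of("D"),
                "C", Set.of("D"),
                "D", Set.of("A"),
                "E", Set.of("A"));
        // Contains A, B, C and D; E is not needed because nothing reachable from A uses it.
        System.out.println(requiredFor("A", deps));
    }
}
```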

It looks like you need the "Dependency Viewer" of IntelliJ IDEA.
It can be found under Analyze > Analyze Dependencies...

In my experience, Eclipse has the best built-in utilities to get information about class dependencies and such. You could go into the debug view and trace each class and/or method.

Why shouldn't we use the (default)src package?

I recently started using Eclipse IDE and have read at a number of places that one shouldn't use the default(src) package and create new packages.
I just wanted to know the reason behind this.

Using the default package may create namespace collisions. Imagine you're creating a library which contains a MyClass class. Someone uses your library in his project and also has a MyClass class in his default package. What should the compiler do? A package in Java is effectively a namespace that uniquely identifies your project's classes. So it's important not to use the default package in real-world projects.

Originally, it was intended as a means to ensure there were no clashes between different pieces of Java code.
Because Java was meant to be run anywhere, and over the net (meaning it might pick up bits from Sun, IBM or even Joe Bloggs and the Dodgy Software Company Pty Ltd), the fact that I owned paxdiablo.com (I don't actually but let's pretend I do for the sake of this answer) meant that it would be safe to call all my code com.paxdiablo.blah.blah.blah and that wouldn't interfere with anyone else, unless they were mentally deficient in some way and used my namespace :-)
From chapter 7, "Packages", of the Java Language Spec:
Programs are organized as sets of packages. Each package has its own set of names for types, which helps to prevent name conflicts.
I actually usually start by using the default package and only move it into a real package (something fairly easy to do with the Eclipse IDE) if it survives long enough to be released to the wild.

Java uses the package as a way to differentiate between classes. By using packages, you can have an org.example.Something class and an org.example.extended.Something class and be able to differentiate between them even though they are both named Something. Since their packages are different, you can use them both in the same project.
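The JDK itself contains such a pair, which can stand in for the org.example.Something example: java.util.Date and java.sql.Date share the simple name Date and can be used side by side through their fully qualified names. The class name SameNameDifferentPackage below is hypothetical.

```java
public class SameNameDifferentPackage {

    /** Returns the fully qualified names of two classes that share the simple name "Date". */
    static String[] names() {
        java.util.Date utilDate = new java.util.Date(0L);
        java.sql.Date sqlDate = new java.sql.Date(0L);
        return new String[] {
            utilDate.getClass().getName(),
            sqlDate.getClass().getName()
        };
    }

    public static void main(String[] args) {
        for (String name : names()) {
            System.out.println(name); // prints "java.util.Date", then "java.sql.Date"
        }
    }
}
```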

By declaring a package you define your own namespace (for classes). This way, if you have two classes with the same name, the package name (namespace) differentiates which one you want to use.

The main reasons I can think of are:
1. It keeps things organised, which will help you (and others!) know where to look for classes/functionality.
2. You can define classes with the same name if they are in different packages.
3. Classes/etc in the default package cannot be imported into named packages. This means that in order to use your classes, other people will have to put all their classes in the default package too. This exacerbates the problems which reasons 1 & 2 solve.

From a Java point of view, there are two general dev/deploy lifecycles you can follow: using Ant to build and deploy, or the Maven lifecycle. Both look for source code and resources in local directories and, in the case of Maven, in defined repositories, either local or on the net.
The point is that when you set up a project for development and eventual deployment, you want a project structure that is portable and not dependent on the IDE, i.e. your project can be built and deployed using either of these build environments. If you depend heavily on the Eclipse framework for providing class variables, compile paths, etc., you may find that your project only builds and deploys with that configuration and is not portable to another developer's environment.

How to structure applications as multiple projects and how to name the packages in Java?

I would like to know how you set up your projects in Java. For example, in my current work project, a six-year-old J2EE app with approximately 2 million LoC, we only have one project in Eclipse. The package structure is split into tiers and then domains, so it follows the guidelines from Sun/Oracle. A huge Ant script builds different jars out of this one source folder.
Personally I think it would be better to have multiple projects, at least for each tier. Recently I was playing around with a project structure like this:
Domainproject (contains only annotated pojos, needed by all other projects)
Datalayer (only persistence)
Businesslogic (services)
Presenter
View
This way, it should be easier to exchange components. In addition, when using a build tool like Maven I can have everything in a repository so when I am only working on the frontend I can get the rest as a dependency in my classpath.
Does this make sense to you? Do you use different approaches, and what do they look like?
Furthermore, I am struggling with how to name my packages/projects correctly. Right now, the above project structure is reflected in the names of the packages, e.g. de.myapp.view, and it continues with some technical subfolders like internal or interfaces. What I am missing here, and I don't know how to do this properly, is the distinction by domain. When the project gets bigger, it would be nice to recognise a particular domain as well as the technical details, to navigate more easily within the project.
This leads to my second question: how do you name your projects and packages?

Your approach makes sense. I would normally decompose into a model (shared), numerous libraries, and then the applications consuming that code and the GUIs - all as separate projects. I tend to follow the Pragmatic Programmers' dictum of build toolsets, not applications. That way you can reassemble your components in lots of different ways.
Each library/application would be its own project, with unit/functional tests and a deliverable (in your case, a Maven artifact that you can store and version appropriately).
The only headache is managing the interfaces and linking between these components. An effective integration test environment is key here.

This leads to my second question: how do you name your projects and packages?
For project names I prefer an internal code name, in the way Longhorn was the internal name for Windows Vista. This name never changes (like my kids' names), so marketing etc. can register any name, rebrand, and so on.
Packages are a question of (personal) preference and style, and normally the senior programmer decides the structure. Of course there are some "standards", such as "gui" for UI classes, "util", "misc", "impl" for interface implementations, "domain" for domain object classes, etc., that you should use consistently while expressing your own style.
