Code refactoring on bad system design

Code refactoring on bad system design - java

I am a junior software engineer who've been given a task to take over a old system. This system has several problems, based on my preliminary assessment.
spaghetti code
repetitive code
classes with 10k lines and above
misuse and over-logging using log4j
bad database table design
Missing source control -> I have setup Subversion for this
Missing documents -> I have no idea of the business rule, except to read the codes
How should I go about it to enhance the quality of the system and resolve such issues? I can think of using static code analysis software to resolve any bad coding practice.
However, it can't detect any bad design issues or problems. How should I go about resolving these issues step by step?

Get and read Working Effectively With Legacy Code. It deals exactly with this situation.
As others have also advised, for refactoring you need a solid set of unit tests. However, legacy code is typically very difficult to unit test as is, since it has not been written to be unit testable. So you need to refactor first to allow unit testing, which would allow you to start refactoring... a bad catch.
This is where the book will help you. It gives lots of practical advice on how to make badly designed code unit testable with the minimal, and safest possible, code changes. Automatic refactorings can also help you here, but there are tricks described in the book which can only be done by hand. Then once the first set of unit tests are in place, you can start gradually refactoring towards better, more maintainable code.
Update: For hints on how to take over legacy code, you may find this earlier answer of mine useful.
As #Alex noted, unit tests are also very useful to understand and document the actual behaviour of the code. This is especially useful when documentation about the system is nonexistent or outdated.

Focus on stability first. You can't enhance or refactor until you have some kind of stable environment in-place around the application.
Some thoughts:
Revision control. You've made a start by setting-up subversion. Now make sure that your database schemas, stored procedures, scripts, third-party components, etc. are under revision control too. Have a version labelling system, make sure you label versions and can accurately access old versions in the future.
Build and release. Have a way to build stable releases on a machine other than your dev machine. You may want to use ant/nant, make, msbuild, or even a batch file or shell script. You may need deployment scripts / installers too if they don't exist.
Get it under test. Do not change the app until you have a way to know whether your change has broken it. For this you need tests. You should hopefully be able to write xunit unit tests for some of the simpler, stand-alone classes, but try to build some system/integration tests that exercise the application as a whole. Without high code coverage (which you won't have to begin with) integration tests are your best bet. Get into the habit of running the tests as often as possible. Take every opportunity to extend them.
Make small, focussed changes. Try to identify systems/subsystems within the application, and improve the boundaries between them. This reduces the knock-on effects of changes you may make. Beware the temptation to "pretty-up" the code by reformatting it or imposing the latest fashionable design pattern. Turning-around a system like this takes time.
Documentation. Its necessary, but don't worry too much about it. System documentation is rarely used in my experience. Good tests are usually better than good documentation. Concentrate on documenting the interfaces between the application and the system context that it runs in (inputs, outputs, file structures, db schemas, etc).
Manage expectations. If its in bad shape then it will probably resist your efforts to make changes and timescales may be harder than usual to estimate. Make sure management and stakeholders understand that.
At all costs, beware the temptation to just rewrite the whole thing. Its almost never the right thing to do in this situation. If it works, concentrate on keeping it working.
As a junior developer, don't be afraid to ask for help. As others have said, Working Effectively With Legacy Code is a good book to read, as is Martin Fowler's Refactoring.
Good luck!

First, don't fix what isn't broken. As long as the system you are to take over works, leave functionality alone.
The system is obviuosly broken when it comes to maintainability, however, so that is what you tackle. As mentioned above, write some tests first, get the source backed up in a cvs, and THEN start by cleaning up small pieces first, then the larger ones and so on. Do NOT attack the bigger architectural issues until you have gained a good understanding of how the system works. Tools won't help you as long as you don't dive into the code yourself, but when you do, they do help a lot.
Remember, nothing is "perfect". Don't over-engineer. Obey the KISS and YAGNI principles.
EDIT: Added direct link to YAGNI article

Your issue #7 is by far the most important. As long as you have no idea how the system is supposed to behave, all technical considerations are secondary. Everyone is suggesting unit tests - but how can you write a useful test if you can't distinguish between wanted and unwanted behaviour?
So before you start touching the code, you have to understand the system from the user's point of view: talk to users, observe them using the system, write documentation on the use case level.
Yes, I am seriously suggesting that you spend days, more likely weeks, without changing a single line of code. Because right now, any change you make is likely to break things without you realizing it.
Once you understand the app, you'll at least know which functionality is important to test (manually or automated).

Write some unit tests first, and make sure they pass. Then with each refactoring change you make, just keep making sure the tests keep passing. Then you can be confident that your application behaviour to the outside world hasn't changed.
This also has the added benefit that the tests will always be there, so for any future changes the tests should still pass, guarding against any regressions in the new changes.

First and foremost, make sure you have source control system installed and all source code is versioned and can be built.
Next, you can try writing unit test for core parts of your system. From there, when you have a more or less solid body of regression tests, you can actually proceed with refactoring.
When I encounter messy codebase, I usually start with renaming poorly-named types and methods to better reflect their initial intent. Next you can try splitting huge methods into smaller ones.

Keep in mind that this legacy system, with all it's spaghetti code, currently works. Don't go changing things just because they don't look as pretty as they should. Focus on stability, new features & familiarity before ripping old code out left right and centre.

Firstly, let me say that Working Effectively with Legacy Code is probably a really good book to read, judging by three answers within a minute of each other.
bad database table design
This one, you are probably stuck with. If you try to change an existing database design you are probably committing yourself to redesigning the whole system and writing migration tools for the existing data. Leave well alone.

My standard answer to this question is: Refactor the Low-hanging Fruit. In this case, I'd be inclined to take one of the 10K-line classes and seek out opportunities to Sprout Class, but that's just my own proclivity; you might be more comfortable changing other things first (setting up source control was an excellent first step!) Test what you can; refactor what can't be tested, take a step at a time, and make it better.
Keep in mind as you progress how much better you are making things; if you concentrate only on how bad things still are, you're likely to become discouraged.

As others have noted, don't change something that works just to make it prettier. The risk that you will introduce errors is great.
My philosophy is: As I have to make changes to satisfy new requirements or to fix reported bugs, I try to make the piece of code that I have to change a little cleaner. I'm going to have to test the changed code anyway, so now is a good time to do a little clean-up at small additional cost.
Fundamental design changes are the toughest and must be saved for occasions where you have to make a big enough change that you would be testing all the changed code anyway.
Changing bad database design is hardest of all because the poorly designed tables are likely used by many programs. Any change to the database requires changing every program that reads or writes it. The best way to accomplish this is usually to try to reduce the number of places that access any given part of the database. To take a simple example: Suppose there are 20 places that read through customer records and calculate the customer account balance. Replace this with one function that reads the database and returns the total, and twenty calls to that function. Now you can change the schema for the customer records and there is only one piece of code to change instead of 20. The principle is simple enough, but in practice it is unlikely that every function that accesses a given record is doing the same thing. Even if the original programmer was clumsy enough to write the same code 20 times (not unlikely -- I've seen plenty of that), the real situation is probably not that he wrote 1 function 20 times, period, but that he wrote function A 20 times, function B 12 times, function C 4 times, etc.

Working Effectively With Legacy Code might be helpful.

Design issues are very difficult to catch. The first place to start is understanding the design of the application. I find it useful to diagram using either UML or a process flow diagram, anything works that communicates the design and working for the application.
From there I go into more detail, and ask myself the questions "Would I have done it this way", what other options are there. It is easy to see code-debt, i.e. the debt that we get from making bad choices, as always bad, but sometimes there are other factors involved like budget, time, availability of resources etc. Their you have to ask the question if it is worth refactoring a working but bad designed application.
If there are many upcoming new features, changes, bug fixes, etc I would say it is good to refactor, but if the application rarely changes and is stable, then maybe leaving it as is is a better approach.
Another sidepoint to note, is that if the code is used by another application as a service or module, then refactoring might first mean create a stub around the code that servers as the interfaces, once that is defined clearly and has unit test to prove it work. You can choose any technology to fill in the details.

A good book on this subject is Working Effectively with Legacy Code By Michael Feathers (2004). It goes through the process of making small changes, while working towards a bigger clean up.
Write unit test & Find and remove duplicate code.
Write unit test & Break long methods into a series of short methods.
Write unit test & Find and remove duplicate method.
Write unit test & Break apart classes so that the follow the single responsibility principle.

Try to create some unit tests first that can trigger some actions in your code.
Commit everyting in SVN and TAG it (in case that something goes bad you'll have an escape pod).
Use inCode Eclipse plugin http://www.intooitus.com/inCode.html and look for what refactorings it proposes. Check if the refactorings proposed seem ok for your proble. Try to understand them.
Retest with the units created before.
Now you can use FindBugs and/or PMD to check for other subtle issues.
If everything is oka you might want to check-in again.
I'd also try reading the source in order to detect some cases where patterns can be applied.

Related

How should I create a Java application, where do I place the class?

First of all, I know how to build a Java application. But I have always been puzzled about where to put my classes. There are proponents for organizing the packages in a strictly domain oriented fashion, others separate by tier.
I have a problems with
naming and placing
So,
Where do you put your domain specific constants?
Where do you place class for stuff which is both infrastructural and domain specific (for instance I have a FileStorageStrategy class, which stores the files either in the database, or alternatively in database)?
thx

Here's an excellent approach to coming up with your own style and approach to class naming and organization style:
Find a great IDE. I really like Eclipse. It provides all of the features that follow.
Identify a naming style and organizational layout. This is an inductive approach to developing your own effective best practices.
Ask yourself: does #2 help you or are you fighting with it?
Archive your system frequently. Use a great configuration management system like git or GitHub! It will make #5 easier.
Make small changes. The IDE should allow you to rename class at any point of your system and make the changes globally. Moving classes should be equally easy. Doing this will prevent getting hamstrung by analysis: analysis paralysis.
Loop to #2. Yes, infinite loop. I've been doing this loop for a very long time. (Duh, it's an infinite loop!) In fact, sometimes I thrash: I flip-flop between 2 styles.
While in your infinite loop, examine online style guides and coding standards. Google and NASA JPL have great ones. Also, look at the great open-source APIs to see how the community of developers name and organize. If someone else is going to work with your code, you don't want to confuse them in any way. Worst case, they'll ignore what you did and rewrite it.
Don't be afraid to experiment. Don't be afraid to make mistakes. Don't always rely on what others do or have done. Good luck.

Beginning Android and Java

I started a job as an appdeveloper for android. After a week they figured that I'm not fast enough getting the apps ready so the fired me and offered me an internship, for free.
I'm wondering how I can get better fast. Should I read more (I'm missing architecture stuff I think)? What should I read? should I just start coding stuff, or should I do a course of some sort?
I'm reading design patterns from head first. Will Java design patterns help me with Android or are they not applicable on android. Basically I'm at home reading android stuff. The company said they would take me back when I'm good enough so I need to get better quick.

Best way to learning ( imho ) is by looking in to already coded project and disecting them. Try to understand how the program is made up. Then work on some tutorials and expand then!
Resources:
Java: http://download.oracle.com/javase/tutorial/java/index.html
Android tutorials: http://p-xr.com

I assume you are the only developer in a very small company. If I'm wrong and there are multiple developers then they should be doing design and architecture and you shouldn't need to worry about that--just code the sections they give you to code.
So the rest of this is based on the assumption that you are the sole developer.
Don't worry about architecture or design or anything--Get a GUI together as fast as you possibly can--it's all about what you can show. Even if it has nothing at all behind it, it's visible progress.
A well architected solution will save you bunches of time in the futre, but from the sound of it you don't have the experience, and also a well architected job is front-loaded.. a lot more time spent before you can show anything. Doesn't sound like that will make your employers happy.
I'd actually spend 2 or 3 hours some evening researching some of the "Project Management and Design portions of XP (you may have to do this on your own time at first--but then if you're getting paid $0 all your time is your own time right now)
XP is all about your process being transparent to your customer (in this case your customer is your employer). This allows them to understand why something might take time and it actually allows them to make decisions on the fly to correct problems or speed the process by eliminating features.
I'd start by writing down each task on a piece of paper. Each of these tasks should take a couple hours to a day or two to do so make them pretty fine-grained but don't write down a time estimate, instead when you are done with the cards go through them and rank them all in difficulty 1-5 (The purpose becomes clear when you see how XP does time estimations).
Then you start interacting with your employeer---they choose which tasks they want done for the day/week. You do them.
As you are executing tasks you may encounter new ones or have to break some up. This is expected--make more sheets of paper. If your employers want to know when it will be ready, look into how to do burndown charts and time prediction. In XP this practice does a good job of taking into account the fact that programmers suck at estimating time (It uses your history and the difficulty 1-5 ranking to determine how long a "2" actually takes you to do.
Soon after implementing this practice, decide on what metrics they want you to maintain in order to get paid, or see if they want to pay you per task.
On the other hand they are only treating you like that because you let them. You're fired now, start looking for another job. If you find one, just walk away.

Best for learning is doing. All reading will get yo nowhere if you don't practice.
And the design pattern may help, but much of the classical wisdom may not directly apply - if only for the fact that the devices have limited memory.
Take the JavaBeans approach, where a bean has to have a getter and setter. On Android (before 2.3 iirc), this had a negative impact on performance, so that making fields public and directly using them is preferred.
To learn, take an open source project (e.g. my Zwitscher app), use it, try to find out what you want to improve and improve it.

Firstly, make sure that your employer was justified in his reasons for firing you. Some of the questions you should be asking them are:
- Were the deadlines not respected? In this case, if you did give them timelines for completion of phases of your work, then you may need to look at improving your effort estimation. If they gave you timelines, see if they were reasonable
- Were features added to your work on an ad-hoc basis? Many times I've found that a freeze on requirements is consistently broken
- If you are slow and haven't picked up some of the basics, then come up with a sample application that closely resembles your employer's application, and DO it. Use SO, developer docs and books to clear doubts and hone your skills.
Best of Luck with Programming :)

Is there any tool/software that decouples Java Swing GUI logic and business logic?

I have been working on Java Swing project. Its design is pretty poor. I have been given task to make its design better. Primarily I was thinking that I will decouple Java Swing code and the business code by following MVC pattern.
Doing this stuff manually seems error prone. Is there any tool/software that decouples GUI layer code (written in Java Swing) and the business layer code (written in core Java).

As far as I can tell the answer to that question is "No, there is not such a tool."
I would hazard a guess to say that it is not even likely that someone could make such a tool. The problem is that thereis no easy way to distinquish between "business logic" and "GUI" code.
On projects that I have worked on, we made a concerted effort in the design phase to keep the GUI and the business logic separate, but I can't count the number of times that it was unavoidable that there was cross-over, especially when some business rule drove how the GUI should behave.
I don't envy your task. Dealing with, and modifying existing code, especially code that you didn't create, IS error prone. And the whole concept of MVC programming is really just way to think about how you build a program not a set of existing library of tools to use.
I learned MVC programming back in Smalltalk days, when it was originally invented by the smart people at Xerox Parc, and even they couldn't avoid some cross-over coupling of code.

Automated tools work well when the input is well defined and structured. Based on what you have described, your inputs are far from that.
Yes, doing this work is going to be complicated and a lot of work. But this is something that you just need to roll up your sleeves and dive into it. Before you start touching the code, sit down and make sure you fully understand how things are laid out now (and why that is a bad thing). Then start planning how you would do it better.
Now that you have a plan of how things should be, start working with the code and updating the legacy code.

There isn't a tool, but there are techniques.
Refactoring is a practice of making small changes to code to alter it's architecture without altering it's function. You've already probably done refactoring to a degree, but sometimes reading up on a subject you know about can help sharpen one's skills.
Many IDEs do support some of the simple refactorings. Eclipse and NetBeans have different strengths in their refactoring support, and depending on your current environment, you may find one to be more useful than the other.
What refactoring won't do for you is to give you a roadmap of how to get from point A (your spaghetti code) to point B (your cleanly separated out MVC). Instead it will just make you a better driver along the path. Unfortunately, (or fortunately for all the developers out there) good brains are still needed to figure out what sections of code intend to do, and whether they are more appropriately placed in the Model, View, or Controller. The rest is just teasing out the dependencies in such a manner that you can eventually move the code up or down in the sequence, and eventually push it into a method that can then be moved to the appropriate class.
Good luck.

I also don't know of any tool. And as others here - kind of - point out, I think such tools would end up making programmers redundant. We don't want that, do we. :-D
But, checking manually if you actually managed to decouple the code should not be to complicated. Just make sure your "business part" imports no swing packages:
find in files "swing"
move all those files in one package
check if there is any "business logic" in there => take it out
...

ADO.net principles in java

I'm asking on opinion about implementing framework that emulates ado.net in java (data tables, data sets etc). Idea is to avoid writing domain objects since they are, so far, just used to transport data from server to client and back, with no particular business methods inside them. Main goal is to speed up development time.
Could i benefit from writing this kind of framework? If it's done before provide link please.

Instead of emulating ADO.NET, I would use an ORM tool such as Hibernate. This way you can have a library that handles all of your SQL and persistence needs on it's own, and you don't have to worry about dealing with a pseudo-table like structure.
I find working with strongly-typed domain objects to be far, far easier and quicker to develop (and test) than working with SQL, resultsets, etc.

Whether it's ADO.NET or anything else, I've learned that my first step should be to not write any libraries or frameworks. Instead, I get on with the job.
The second time I'm tempted to write a library or framework, I typically think about it for a while, decide not to write it, and get on with the job.
The third time I'm tempted, I'll usually give into the temptation. By that time, I will have written a bunch of code that worked without the library or framework. I will refactor that code (that real, used, tested code). I'll do this iteratively, running my automated unit tests constantly, and writing new tests as necessary. When I'm interrupted in this task (as will happen, since writing frameworks may not be part of my job), I can be certain that I'm leaving running code behind, even though it's only half finished.
When I get back to it, I'll continue refactoring up to a point, remembering that I should get some experience using what I've refactored before going too much further in the direction I think it should go. With real code using it, I won't have to guess which direction to carry the refactoring.

How do you begin designing a large system? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 4 years ago.
Improve this question
It's been mentioned to me that I'll be the sole developer behind a large new system. Among other things I'll be designing a UI and database schema.
I'm sure I'll receive some guidance, but I'd like to be able to knock their socks off. What can I do in the meantime to prepare, and what will I need to keep in mind when I sit down at my computer with the spec?
A few things to keep in mind: I'm a college student at my first real programming job. I'll be using Java. We already have SCM set up with automated testing, etc...so tools are not an issue.

Do you know much about OOP? If so, look into Spring and Hibernate to keep your implementation clean and orthogonal. If you get that, you should find TDD a good way to keep your design compact and lean, especially since you have "automated testing" up and running.
UPDATE:
Looking at the first slew of answers, I couldn't disagree more. Particularly in the Java space, you should find plenty of mentors/resources on working out your application with Objects, not a database-centric approach. Database design is typically the first step for Microsoft folks (which I do daily, but am in a recovery program, er, Alt.Net). If you keep the focus on what you need to deliver to a customer and let your ORM figure out how to persist your objects, your design should be better.

This sounds very much like my first job. Straight out of university, I was asked to design the database and business logic layer, while other people would take care of the UI. Meanwhile the boss was looking over my shoulder, unwilling to let go of what used to be his baby and was now mine, and poking his finger in it. Three years later, developers were fleeing the company and we were still X months away from actually selling anything.
The big mistake was in being too ambitious. If this is your first job, you will make mistakes and you will need to change how things work long after you've written them. We had all sorts of features that made the system more complicated than it needed to be, both on the database level and in the API that it presented to other developers. In the end, the whole thing was just far too complicated to support all at once and just died.
So my advice:
If you're not sure about taking on such a big job single-handed, don't. Tell your employers, and get them to find or hire somebody for you to work with who can help you out. If people need to be added to the project, then it should be done near the start rather than after stuff starts going wrong.
Think very carefully about what the product is for, and to boil it down to the simplest set of requirements you can think of. If the people giving you the spec aren't technical, try to see past what they've written to what will actually work and make money. Talk to customers and salespeople, and understand the market.
There's no shame in admitting you're wrong. If it turns out that the entire system needs to be rewritten, because you made some mistake in your first version, then it's better to admit this as soon as possible so you can get to it. Correspondingly, don't try to make an architecture that can anticipate every possible contingency in your first version, because you don't know what every contingency is and will just get it wrong. Write once with an eye to throwing away and starting again - you may not have to, the first version may be fine, but admit it if you do.

I also disagree about starting with the database. The DB is simply an artifact of how your business objects are persisted. I don't know of an equivalent in Java, but .Net has stellar tools such as SubSonic that allow your DB design to stay fluid as you iterate through your business objects design. I'd say first and foremost (even before deciding on what technologies to introduce) focus on the process and identify your nouns and verbs ... then build out from those abstractions. Hey, it really does work in the "real world", just like OOP 101 taught you!

Before you start coding, plan out your database schema - everything else will flow from that. Getting the database reasonably correct early on will save you time and headaches later.

The main thing is being able to abstract the complexity of the system so that you don't get bogged down by it as soon as you start off.
First read the spec like a story (skimming through it). Don't stop at every requirement to analyze it right there and then. This will allow you to get an overall picture of the system without too many details. At this point you would start identifying the major functional components of the system. Start putting these down (use a mindmap tool if you like).
Then take each component and start exploding it (and tying each detail with requirements in the spec document). Do this for all components, till you have covered all requirements.
Now, you should start looking at relationships between the components, and whether there are repetitions of features or functions across the various components (which you can then pull out to create utility components, or such). Around now, you would have a good detailed map of your requirements in your mind.
NOW, you should think of designing the database, ER diagrams, Class Design, DFDs, deployment, etc.
The problem with doing the last step first is that you can get bogged down in the complexity of your system without really gaining an overall understanding in the first place.

I do it the other way around. I find that doing it database-schema-first gets the system stuck in a data-driven-design that is difficult to abstract from persistence. We try to do domain model designs first and then base the database schema on those.
And then there's the infrastructure design: the team should settle on conventions on how to structure the program first and foremost. And then we work together to agree first on a design for the common functionality of the system (e.g., things everyone needs like persistence, logging, etc.). This becomes the framework of the system.
We all work on that together first before we split the rest of the functionalities amongst ourselves.

It has been my experience that Java applications (.NET also) that consider the database last are highly likely to perform poorly when placed into a corporate environment. You need to really think about your audience. You didn't say if it was a web app or not. Either way the infrastructure that you are implementing on is important when considering how you handle your data.
No matter what methodology you consider, how you get and save your data and it's impact on performance should be right up there as one of your #1 priorities.

I'd suggest thinking about how this application will be used. How will future users work with it? I'm sure you know at least a few things about what this application needs to handle, but my first advice is "think of the user and what he or she needs".
Draw it up on plain paper, thinking of where to section off the code. Remeber not to mix logic with GUI code (common error). This way you will be set to extend your applications reach in the future to servlets and/or applets or whatever platform comes along. Section in layers so that you can respond to large changes faster without rebuilding everything. Layers should not see any other layers than their closest neighbouring layers.
Begin with true core functionallity. All that time consuming fluff (that will make your project 4 weeks late), wont matter much to the wast majority of users. It can be added later once you are sure you can deliver on time.
Btw. Even though this has nothing to do with design I'd just like to say that you won't deliver on time. Make a realistic estimate on time consumption and then double it :-) I assume here that you will not be alone in this project and that people will come and go as the project progresses. You may need to train people midway through the project, people go on holiday / need surgery etc.

Split the big system to smaller pieces.
And don't think that it's so complex, because it usually isn't. By thinking too complex it just ruins your thoughts and eventually the design. Some point you just realize that you could do the same thing easier, and then you redesign it.
Atleast this has been my major mistake in designing.
Keep it simple!

I found very insightful ideas about starting a new large project, based on
common good practices
Test Driven Development
and pragmatic approach
in the book Growing Object-Oriented Software, Guided by Tests.
It is still under development, but first 3 chapters may be what You are looking for and IMHO worth reading.

We Keep Coding

Java is a programming language and computing platform first released by Sun Microsystems in 1995.