Dealing with uninitialized solutions in OptaPlanner - java

I'm creating a schedule generator for a school and I am facing two challenges:
1: User feedback during construction phase
During the construction heuristic phase I'm not getting any callbacks to the bestSolutionConsumer passed to SolverManager.solveAndListen, which means that I'm not able to give any feedback to the user during this phase. (It's only about 10 seconds or so as of today, but still annoying.)
I suspect that this is by design (judging from this question), but please correct me if I'm wrong.
(I suspect that the idea is that the construction heuristic phase should be quick anyway, and that 99% of a long running solve will be spent in the local search phase, and thus that's the only phase that actually matters. Correct?)
2: Manual placement of lectures
This scheduling program will only be semi-automatic. I'd like the user to be able to pin lectures, move lectures around manually, and even remove lectures from the schedule by putting them in a pile on the side for later placement (where the later placement could perhaps be done by OptaPlanner).
Rethink definition of initialized?
This led me to rethink what I consider an initialized solution. If I...
want to have progress feedback even during initial placement of lectures, and
want to allow the user to interact with a schedule where only half of the lectures have been scheduled
...then maybe I should make the timeslot nullable or have a sentinel timeslot value for unscheduled lectures, and simply penalize such solutions.
In this scenario, I'm imagining that a solution is immediately and trivially initialized (all lectures initially in the unscheduled state, but formally speaking the solution is still initialized) and that the construction phase is basically skipped.
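Concretely, I'm imagining something like this minimal sketch (assuming an OptaPlanner 8-style setup with a Timeslot class and a value range provider named timeslotRange; all names here are illustrative, not actual code from my project):

import org.optaplanner.core.api.domain.entity.PlanningEntity;
import org.optaplanner.core.api.domain.variable.PlanningVariable;
import org.optaplanner.core.api.score.buildin.hardsoft.HardSoftScore;
import org.optaplanner.core.api.score.stream.Constraint;
import org.optaplanner.core.api.score.stream.ConstraintFactory;
import org.optaplanner.core.api.score.stream.ConstraintProvider;

@PlanningEntity
class Lecture {

    // null is a legal value and means "in the unscheduled pile", so the
    // solution counts as initialized even while lectures lack a timeslot.
    @PlanningVariable(valueRangeProviderRefs = "timeslotRange", nullable = true)
    private Timeslot timeslot;

    public Timeslot getTimeslot() { return timeslot; }
    public void setTimeslot(Timeslot timeslot) { this.timeslot = timeslot; }
}

class TimetableConstraintProvider implements ConstraintProvider {

    @Override
    public Constraint[] defineConstraints(ConstraintFactory factory) {
        return new Constraint[] {
                // Without a penalty, leaving every timeslot null would be
                // the cheapest "solution"; this pushes lectures onto the grid.
                factory.forEachIncludingNullVars(Lecture.class)
                        .filter(lecture -> lecture.getTimeslot() == null)
                        .penalize(HardSoftScore.ONE_SOFT)
                        .asConstraint("Unscheduled lecture"),
        };
    }
}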
Questions
Is this stupid for some reason?! It feels like I'm throwing out a large part of OptaPlanner's capabilities.
Am I overlooking any downsides of this approach?
Is it even possible to skip the construction phase by doing this?
Also, repeated planning is of importance to me, and the docs say:
Repeated planning (especially real-time planning) does not mix well with a nullable planning variable.
Does the same apply also to the approach of using an unscheduled sentinel value?

1/ No, this is not stupid. It is, in fact, an example of over-constrained planning.
2/ Well, now that variables are nullable, you need to write your constraints such that nulls are acceptable. And you may run into situations where the solver will find it easier to just leave some variables null, unless there is a pretty substantial penalty for that. You may have to design special constraints to work around that, or in the worst case even custom moves.
3/ Construction heuristics are not mandatory, but they can still be useful. Even if they leave some variables null, they can still give you a decent initial solution. You may also want to try a custom phase.
4/ If you worry about some of the things above, indeed introducing a dummy value instead of making a variable nullable could solve some of those worries. (And introduce others, as every constraint now has to deal with this dummy value.)
My advice is to do a quick proof of concept. See how each of the approaches behaves. Pick the one that you prefer to deal with. There are no silver bullets.
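5/ As for the manual pinning you mention: that part at least is directly supported via @PlanningPin. A minimal sketch, reusing the illustrative Lecture entity from the question (the field name is up to you):

import org.optaplanner.core.api.domain.entity.PlanningEntity;
import org.optaplanner.core.api.domain.entity.PlanningPin;
import org.optaplanner.core.api.domain.variable.PlanningVariable;

@PlanningEntity
class Lecture {

    @PlanningVariable(valueRangeProviderRefs = "timeslotRange", nullable = true)
    private Timeslot timeslot;

    // When true, the solver treats this lecture as immovable: the user's
    // manual placement (or deliberate "unscheduled" state) is preserved.
    @PlanningPin
    private boolean pinned;

    public boolean isPinned() { return pinned; }
    public void setPinned(boolean pinned) { this.pinned = pinned; }
}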

Related

Code Standard - is it better to have a getter/setter even though they are never used?

The IDE has suggested adding a getter/setter to a private field.
It is never used, and the field is only reached from within the class.
What is the preferred coding style? Keeping the never-used methods?
I'm asking specifically about Java/Kotlin, but this is a general question.
There are a few distinctions that you need to know about to answer this question yourself - because it depends on a ton of things; far too many to ask about here and for you to write down:
For this entire answer it's important to think about the distinction between layers of code. These layers can be a bit hard to think about if the project you're imagining when thinking about layers is something small and written just by yourself. So don't do that - think about, say, Microsoft Word as a product. It's written by many people, over many years - entire departments and dev teams. It's somewhat modular (there's the "Mail Merge" system that doesn't interact, at all, with the 'show available fonts' dropdown).
What's the whole private fields, public getters/setters all about in the first place?
Fields are highly inflexible constructs. If you 'expose' them (make them public), then there is no granularity available to you. The only knob you can twiddle is:
You can make a field unchangeable for everybody - you can't change it, nor can anybody else. (To do this, mark it final.)
That's it. You can't do anything else 'to' it. You can't have more fine-grained control over access (such as allowing code 'nearby' to change it, but not code further out), and you can't run some code as field writes/reads happen either. Perhaps you need more granularity. Keep in mind that we're trying to write code that will survive 10 years in an environment with 100 programmers, most of whom won't last the entire 10 years, in many different teams. So, imagine you wanted to:
Make it a field that everybody gets to read, but only 'your' code (that is, the programmers working on this particular corner of the codebase who are aware of this particular corner's rules and needs) should get to change.
Make it a field that everybody gets to read, and write, but, if it's not 'your' code doing the writing, a log line should be emitted.
Make it a field that nobody gets to write (not even you - it is initialized at object creation, and that's it; that makes it easier to reason about, which is why we 'handcuff' ourselves: when you need to maintain code for 10 years, ruling certain things out and having a compiler that enforces this is quite useful), and 'outsiders' can read, but you want to tweak the read a bit - for example, substitute 'the current date' when the value is blank (two of these scenarios are sketched right after this list).
And so on.
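To make that concrete, a minimal sketch (the class and field names are made up for illustration) of the logging write and the date-substituting read, once the field sits behind accessors:

import java.time.LocalDate;
import java.util.logging.Logger;

public class Invoice {

    private static final Logger LOG = Logger.getLogger(Invoice.class.getName());

    private LocalDate dueDate; // nobody outside this class touches it directly

    // Read tweak: substitute the current date when the value is blank.
    public LocalDate getDueDate() {
        return dueDate != null ? dueDate : LocalDate.now();
    }

    // Write tweak: emit a log line on every change - a behaviour you can
    // bolt on years later without breaking a single caller.
    public void setDueDate(LocalDate dueDate) {
        LOG.fine(() -> "dueDate changed to " + dueDate);
        this.dueDate = dueDate;
    }
}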
Even more important is time: sometimes you start out just wanting to expose a field to everybody right now, but later on you realize: oh, wait, we need to emit a log line. Or: oops, we need to return the current date if the value is blank.
If you just make a field public, you:
Do not have any of that granularity.
Even if you're okay with that now, you can't later on update your code and add stuff that needs this granularity - not without turning the field into a getter/setter pair, and that is not backwards compatible: you need to send a mail to those 100 developers or start refactoring their code, which is a huge undertaking.
Hence, even if you don't see any point or purpose in giving yourself the granularity powers right now, it's still advised to just make that field private and add getters and setters: that way, if later on some currently unforeseeable request comes in (such as: log the writes to this field, please!), you can add that feature without having to ask all 100 other developers to pull the change and edit all their branches.
YAGNI
A maxim in the programming world is YAGNI - You aren't gonna need it.
YAGNI is a dangerous beast - it applies -solely- to semi-local endeavours.
The basic principle of YAGNI is: Code is a flowing concept, and you should never hesitate to make improvements, especially if you can't think of a way this would break any existing usage. Hence, given that your development processes should be set up such that adding stuff is easy, don't add stuff until you need it - after all, if you add stuff even if you don't currently need it, maybe you never need it and you're now just clogging up the code for no good reason. IF somebody needs it, they can trivially add it then.
The problem with YAGNI is that predicate: YAGNI is based on the notion that making a change is quick and painless.
Imagine this scenario: The Microsoft Office development crew decides to write their own font rendering system, because what windows delivers just looks bad on HiDPI screens. So, they spend a ton of time and research on this and with much fanfare release a new version. Everybody loves it.
The OS team comes a-knocking and the MS Office team decides to 'hand over' the new font rendering engine to the OS team. In order to avoid having 2 teams spend the resources on maintaining it, the next version of MS Office is pegged to only run on a new version of the OS that includes the new pipeline, and thus, the MS Office team removes the font rendering engine from it - it's now the OS's job.
Whoops, any YAGNI is now quite a big problem: If there's something foreseeable and obvious the MSOffice team needed that they didn't add (or if the Windows OS team applied YAGNI to the API they expose to apps to do font rendering stuff), then the MS Office team needs to give a call to the Windows OS team that's in another country, working on other source control, and having entirely different versioning pipelines, and ask them for a change. It'll take 2 years before it's all said and done.
Linters/stylecheckers are tools, and fairly stupid ones at that
Any warning about style or suggestion about changes are just that - suggestions. These tools aren't perfect, and will absolutely suggest very silly things from time to time. You should never apply style advice until you understand why it is given and under what circumstances it should be followed, and you should feel free to tell linters/stylecheckers to buzz off if they are wrong.
Some dev shops put out absolutist rules ('you can NEVER check in code that fails our linter tool - we have git commit hooks that enforce this!'), but those shops are misguided: they seem to think that if only you rigorously apply enough style rules, the code will therefore be well written, bug-free, and performant. This is obviously entirely false. You should absolutely help programmers (and might lightly enforce this, even) to help themselves and avail themselves of the tools available to write better code, but you can't beat the bird to make it sing, so to speak.
Thus, be aware that sometimes the best thing to do about a style suggestion - is to ignore it.
Back to your question
So, now you know what I'm driving at when I ask these questions, which naturally leads you to answering your own question:
Is the field even meant to be exposed in the first place? Anything you 'expose' is likely to be used by code that's relatively far removed from you (different team, different time, different context), and once you expose it, you have to continue to support it - any changes you make can't fundamentally change/remove what you exposed. So, perhaps just having a private field with no getters and setters is the best place to start:
If you're sure it makes no sense to expose it, then don't. Just leave it as a private field; the code in this source file can edit it, and other code cannot even assume this field exists - it should know nothing about it.
If you're sure it makes perfect sense to expose it; it is the very point of the class, then make a private field with public getter (and if you want, setter - do you intend for it to be mutable or not?) - even if you don't see any need to do special stuff in that getter. Java programmers expect to access properties from other source files via getters and setters and you keep the flexibility to change things later without breaking compatibility.
If you're not sure, then think about YAGNI: Is this an API that is going to be exposed so far and wide it'll affect people who cannot easily modify the codebase? Then, sorry, you're going to have to think some more and make a decision. But most likely you're not writing that kind of code, and anybody who might want to access this thing could change the code fairly easily: It'll be you, or a colleague working in the same source tree. In which case, don't think about it too long - err on the side of caution and don't make getters and setters. If someone needs em later, well, let them make the call - with the benefit of that use case they now have, they'll be more likely to make a well informed decision than you can, without that benefit.

Choco solver propogation and search strategy interaction

I have started to use choco-solver in my work and don't understand how propagators and search strategies interact with each other.
I think Choco has some flag that shows whether any domain of the constrained variables changed during propagation. If one did, then propagation runs again and again until no domain changes occur. And after that, if the constraints are still neither satisfied nor failed, the search strategy is brought into the solving process.
But the output of my program shows me that I'm wrong. The propagators really do run 2 or 3 times, changing domains each time, but then the search strategy is called.
Help me please, where am I wrong in my conclusions?
Or should it work just the way I think, and there are some mistakes in my code that lead to the wrong output?
Sorry for my bad English.
Choco is a Constraint Programming solver; these solvers all work according to the same principle.
Unlike a brute-force search, a constraint solver will first call all (relevant) propagators to remove values from the variable domains that it knows the variables can't take. Calling one propagator might cause new values to become impossible and might thus trigger other propagators to run again.
Once all propagators report that they can't remove any more values (we call this a fixpoint), the search strategy is consulted to see what to do next. (In general this is a guess at what might lead to a solution, and we might need to backtrack.)
If all variables have only one possible value left, this is a solution. However, it can happen that during search a variable loses all its possible values. In this case a propagator fails. If we had already made search decisions, we need to backtrack. If this happens at the root node, it means the problem is unsatisfiable.
For more information, try the tutorials of several constraint solvers. A lot of them can be found on Wikipedia. You might also be able to find online courses.
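To see this in action, here is a minimal sketch using the Choco 4 API (the model itself is made up): the two constraints wake each other up during propagation until the fixpoint is reached, then the search strategy starts taking decisions, which the trace output makes visible:

import org.chocosolver.solver.Model;
import org.chocosolver.solver.Solver;
import org.chocosolver.solver.variables.IntVar;

public class FixpointDemo {
    public static void main(String[] args) {
        Model model = new Model("fixpoint demo");
        IntVar x = model.intVar("x", 0, 10);
        IntVar y = model.intVar("y", 0, 10);

        // Each propagator can shrink domains; shrinking one domain may
        // wake the other propagator, until a fixpoint is reached.
        model.arithm(x, "+", y, "=", 10).post();
        model.arithm(x, ">", y).post();

        Solver solver = model.getSolver();
        solver.showDecisions();  // print each branching decision
        solver.showStatistics(); // nodes, fails, propagation counts

        if (solver.solve()) {
            System.out.println("x = " + x.getValue() + ", y = " + y.getValue());
        }
    }
}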
To complete Dekker's answer, and based on my experience: the fixpoint is generally reached within a pretty small number of iterations in practice. This can still be slow (because there are many constraints, or because "global constraints" may be slow to propagate), but it is rare for the ping-pong effect to be dramatic. Choco Solver and similar solvers have many tricks to make propagation efficient...
So it is completely ok that each propagator is called only 2-3 times before branching.

Why do we need getters?

I have read the Stack Overflow page which discusses "Why use getters and setters?", and I have been convinced by some of the reasons for using a setter, for example: later validation, data encapsulation, etc. But what is the reason for using getters anyway? I don't see any harm in getting the value of a private field, or any reason for validation before you get a field's value. Is it OK to never use a getter and always get a field's value using dot notation?
If a given non-final field in a Java class is visible for reading (on the RHS of an expression), then it must also be possible to assign that field (on the LHS of an expression). For example:
class A {
    public int someValue;
}

A a = new A();
int value = a.someValue; // if you can do this (potentially harmless)
a.someValue = 10;        // then you can also do this (bad)
Besides the above problem, a major reason for having a getter in a class is to shield the consumer of that class from implementation details. A getter does not necessarily have to simply return a value. It could return a value distilled from a Collection or something else entirely. By using a getter (and a setter), we free the consumer of the class from having to worry about the implementation changing over time.
I want to focus on practicalities, since I think you're at a point where you haven't seen the conceptual benefits line up just yet with the actual practice.
The obvious conceptual benefit is that setters and getters can be changed without impacting the outside world using those functions. Another Java-specific benefit is that all methods not marked as final are capable of being overriden, so you get the ability for subclasses to override the behavior as a bonus.
Overkill?
Yet you're probably at a point where you've heard these conceptual benefits before and it still sounds like overkill for your more daily scenarios. A difficult part of understanding software engineering practices is that they are generally designed to deal with very real world, large-scale codebases being managed by teams of developers. A lot of things are going to seem like overkill initially when you're just working on a small project of your own.
So let's get into some practical, real-world scenarios. I formerly worked in a very large-scale codebase. It was a low-level C codebase with a long legacy, sometimes barely a step above assembly, but many of the lessons I learned there translate to all kinds of languages.
Real-World Grief
In this codebase, we had a lot of bugs, and the majority of them related to state management and side effects. For example, we had cases where two fields of a structure were supposed to stay in sync with each other. The range of valid values for one field depended on the value of the other. Yet we ran into bugs where those two fields were out of sync. Unfortunately since they were just public variables with a very global scope ('global' should really be considered a degree with respect to the amount of code that can access a variable rather than an absolute), there were potentially tens of thousands of lines of code that could be the culprit.
As a simpler example, we had cases where the value of a field was never supposed to be negative, yet in our debugging sessions, we found negative values. Let's call this value that's never supposed to be negative, x. When we discovered the bugs resulting from x being negative, it was long after x was touched by anything. So we spent hours placing memory breakpoints and trying to find needles in a haystack by looking at all possible places that modified x in some way. Eventually we found and fixed the bug, but it was a bug that should have been discovered years earlier and should have been much less painful to fix.
Such would have been the case if large portions of the codebase weren't just directly accessing x and used functions like set_x instead. If that were the case, we could have done something as simple as this:
#include <assert.h>

extern int x; /* the field that must never go negative */

void set_x(int new_value)
{
    assert(new_value >= 0); /* fail fast at the write site */
    x = new_value;
}
... and we would have discovered the culprit immediately and fixed it in a matter of minutes. Instead, we discovered it years after the bug was introduced and it took us meticulous hours of headaches to trace it down and fix.
Such is the price we can pay for ignoring engineering wisdom, and after dealing with the 10,000th issue which could have been avoided with a practice as simple as depending on functions rather than raw data throughout a codebase, if your hairs haven't all turned grey at that point, you're still generally not going to have a cheerful disposition.
The biggest value of getters and setters comes from the setters. It's the state manipulation that you generally want to control the most to prevent/detect bugs. The getter becomes a necessity simply as a result of requiring a setter to modify the data. Yet getters can also be useful sometimes, when you want to exchange raw state for a computation non-intrusively (by just changing one function's implementation), e.g. returning a value computed on demand instead of one stored in a field.
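As a hypothetical illustration of that kind of swap (all names invented): the stored value is replaced by a computation, and the getter's unchanged signature shields every caller from the change:

import java.util.ArrayList;
import java.util.List;

class LineItem {
    private final int priceCents;
    LineItem(int priceCents) { this.priceCents = priceCents; }
    int getPriceCents() { return priceCents; }
}

public class Order {
    private final List<LineItem> items = new ArrayList<>();

    public void add(LineItem item) { items.add(item); }

    // This used to return a stored 'total' field that had to be kept in
    // sync on every change; it now distills the value from the collection
    // instead. Callers never noticed the difference.
    public int getTotalCents() {
        return items.stream().mapToInt(LineItem::getPriceCents).sum();
    }
}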
Interface Stability
One of the most difficult things to appreciate earlier in your career is going to be interface stability (that is, keeping public interfaces from changing constantly). This is something that can only be appreciated with projects of scale and possibly compatibility issues with third parties.
When you're working on a small project on your own, you might be able to change the public definition of a class to your heart's content and rewrite all the code using it to update it with your changes. It won't seem like a big deal to constantly rewrite the code this way, as the amount of code using an interface might be quite small (ex: a few hundred lines of code using your class, and all code that you personally wrote).
When you work on a large-scale project and look down at millions of lines of code, changing the public definition of a widely-used class might mean that 100,000 lines of code need to be rewritten using that class in response. And a lot of that code won't even be your own code, so you have to intrusively analyze and fix other people's code and possibly collaborate with them closely to coordinate these changes. Some of these people may not even be on your team: they may be third parties writing plugins for your software or former developers who have moved on to other projects.
You really don't want to run into this scenario repeatedly, so designing public interfaces well enough to keep them stable (unchanging) becomes a key skill for your most central interfaces. If those interfaces are leaking implementation details like raw data, then the temptation to change them over and over is going to be a scenario you can face all the time.
So you generally want to design interfaces to focus on "what" they should do, not "how" they should do it, since the "how" might change a lot more often than the "what". For example, perhaps a function should append a new element to a list. However, you may want to swap out the list data structure it's using for another, or introduce a lock to make that function thread safe ("how" concerns). If these "how" concerns are not leaked to the public interface, then you can change the implementation of that class (how it's doing things) locally without affecting any of the existing code that is requesting it to do things.
You also don't want classes to do too much and become monolithic, since then your class variables will become "more global" (become visible to a lot more code even within the class's implementation) and it'll also be hard to settle on a stable design when it's already doing so much (the more classes do, the more they'll want to do).
Getters and setters aren't the best examples of such interface design, but they do avoid exposing those "how" details at least slightly better than a publicly exposed variable, and thus have fewer reasons to change (break).
Practical Avoidance of Getters/Setters
Is it OK to never use a getter and always get a field's value using dot notation?
This could sometimes be okay. For example, if you are implementing a tree structure and it utilizes a node class as a private implementation detail that clients never use directly, then trying too hard to focus on the engineering of this node class is probably going to start becoming counter-productive.
There your node class isn't a public interface. It's a private implementation detail for your tree. You can guarantee that it won't be used by anything more than the tree implementation, so there it might be overkill to apply these kinds of practices.
Where you don't want to ignore such practices is in the real public interface, the tree interface. You don't want to allow the tree to be misused and left in an invalid state, and you don't want an unstable interface which you're constantly tempted to change long after the tree is being widely used.
Another case where it might be okay is if you're just working on a scrap project/experiment as a kind of learning exercise, and you know for sure that the code you write is rather disposable and is never going to be used in any project of scale or grow into anything of scale.
Nevertheless, if you're very new to these concepts, I think it's a useful exercise even for your small scale projects to err on the side of using getters/setters. It's similar to how Mr. Miyagi got Daniel-San to paint the fence, wash the car, etc. Daniel-San finds it all pointless with his arms exhausted on top of that. Then Mr. Miyagi goes "hyah hyah hyoh hyah" throwing big punches and kicks, and using that indirect training, Daniel-San blocks all of them without realizing how he's even doing it.
In Java you can't tell the compiler to allow read-only access to a public field from outside while the class itself keeps write access (a final field is read-only for everyone after construction).
So exposing public fields opens the door to uncontrolled modifications.
Fields are not polymorphic.
The alternative to a getter would be a public field; however, fields are not polymorphic.
This means that you cannot extend the class and "override" the field without introducing weird behaviour. Basically, the value you get will depend on how you refer to the field.
Furthermore, you can't include the field in an interface and you can't perform validation (that applies more to a setter).
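A short example of that weird behaviour: a field redeclared in a subclass hides the parent's field rather than overriding it, so the field you read depends on the static type of the reference, while the overridden getter is resolved by the runtime type:

class Base {
    public String name = "base";
    public String getName() { return name; }
}

class Derived extends Base {
    public String name = "derived"; // hides Base.name, does not override it
    @Override
    public String getName() { return name; }
}

public class FieldHidingDemo {
    public static void main(String[] args) {
        Base b = new Derived();
        System.out.println(b.name);      // "base"    - resolved by static type
        System.out.println(b.getName()); // "derived" - resolved by runtime type
    }
}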

Correctly using onUpgrade (and content providers) to handle updates without blocking the main thread, are `Loader`s pointless?

This is one of those questions that involve crossing what I call the "Hello World Gulf". I'm on the "Hello World" side: I can use SQLite and content providers (and resolvers), but I now need to cross to the other side, and I cannot make the assumption that onUpgrade will be quick.
Now my go-to book (Wrox, Professional Android 4 Development - I didn't choose it because of "professional", I chose it because Wrox are like the O'Reilly of guides - O'Reilly suck at guides, they are reference books) only touches briefly on using Loaders, so I've done some searching, some more reading and so forth.
I've basically concluded a Loader is little more than a wrapper: it just does things on a different thread and gives you callbacks to process things in. It gives you 3 steps: initiating the query, using the results of the query, and resetting the query.
This seems like quite a thin wrapper, so question 1:
Why would I want to use Loaders?
I sense I may be missing something, you see; most "utilities" like this in Android are really useful if you go with the grain, so to speak, and as I said, Loaders seem like a pretty thin wrapper, and they force me to have callback names, which could become tedious if there are multiple queries going on.
http://developer.android.com/reference/android/content/Loader.html
Reading that points out that "they ought to monitor the data and act upon changes" - this sounds great, but it isn't obvious how that is actually done (I am thinking about database tables here).
Presentation
How should this alter the look of my application? Should I put up a loading spinner (I'm not sure of the name; I've never needed one before) after a certain amount of time post activity creation? So the fragment is blank, but if X time elapses without the Loader reporting back, I show a spinner?
Other operations
Loaders are clearly useless for updates and such - their name alone tells one this much - so any nasty updates would have to be wrapped by my own system for shunting work to a worker thread. This further leads me to wonder why I would want Loaders.
What I think my answer is
Some sort of wrapper (at some level, content provider or otherwise) to do stuff on a worker thread will mean that the upgrade takes place on that thread, which solves the problem because... well, that's not on the main thread.
If I write my own, I can then (if I want to) ensure queries happen in a certain order and use my own data structures (rather than Bundles); it seems that I have better control.
What I am really looking for
Discussion. I find that when one knows why things are the way they are, one makes fewer mistakes and just generally has more confidence. I am sure there's a reason Loaders exist, and there will be some pattern that all of Android lends itself towards; I want to know why this is.
Example:
Adapters (for ListViews): it's not immediately obvious how one keeps track of rows (insert), or why one must specify a default style (and why ArrayAdapter uses toString) when most of the time (in my experience, dare I say) it is subclassed. Reading the source code gives one an understanding of what the Adapter must actually do; then I challenge myself: "Can I think of a (better) system that meets these requirements?" Usually (and hopefully) my answer converges on how it's actually done.
Thus the "Hello World Gulf" is crossed.
I look forward to reading answers and any linked text-walls on the matter.
You shouldn't use Loaders directly, but rather the LoaderManager.
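To expand on that one-liner: the LoaderManager owns the Loader across configuration changes, runs the query on a worker thread, and delivers results back on the main thread; a CursorLoader additionally re-queries when the content provider calls notifyChange() on the watched URI, which is the "monitor the data and act upon changes" part. A minimal sketch of the classic pattern (the URI and loader id are placeholders):

import android.app.Activity;
import android.app.LoaderManager;
import android.content.CursorLoader;
import android.content.Loader;
import android.database.Cursor;
import android.net.Uri;
import android.os.Bundle;

public class ScheduleActivity extends Activity
        implements LoaderManager.LoaderCallbacks<Cursor> {

    private static final int LECTURES_LOADER = 0; // arbitrary loader id
    private static final Uri LECTURES_URI =
            Uri.parse("content://com.example.schedule/lectures"); // placeholder

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        // initLoader reconnects to an existing Loader after rotation
        // instead of re-running the query.
        getLoaderManager().initLoader(LECTURES_LOADER, null, this);
    }

    @Override
    public Loader<Cursor> onCreateLoader(int id, Bundle args) {
        // Runs the query on a worker thread; re-runs it whenever the
        // provider calls notifyChange() on this URI.
        return new CursorLoader(this, LECTURES_URI, null, null, null, null);
    }

    @Override
    public void onLoadFinished(Loader<Cursor> loader, Cursor data) {
        // Called on the main thread: swap the cursor into your adapter,
        // hide the loading spinner, etc.
    }

    @Override
    public void onLoaderReset(Loader<Cursor> loader) {
        // The cursor is about to be closed: drop all references to it.
    }
}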

"Cosmetic" clean-up of old, unknown code. Which steps, which order? How invasive?

When I receive code I have not seen before to refactor it into some sane state, I normally fix "cosmetic" things (like converting StringTokenizers to String#split(), replacing pre-1.2 collections with newer collections, making fields final, converting C-style arrays to Java-style arrays, ...) while reading the source code I have to get familiar with.
Are there many people using this strategy (maybe it is some kind of "best practice" I don't know about), or is this considered too dangerous, with not touching old code unless absolutely necessary generally preferred? Or is it more common to combine the "cosmetic cleanup" step with the more invasive "general refactoring" step?
What are the common "low-hanging fruits" when doing "cosmetic clean-up" (vs. refactoring with more invasive changes)?
In my opinion, "cosmetic cleanup" is "general refactoring." You're just changing the code to make it more understandable without changing its behavior.
I always refactor by attacking the minor changes first. The more readable you can make the code quickly, the easier it will be to do the structural changes later - especially since it helps you look for repeated code, etc.
I typically start by looking at code that is used frequently and will need to be changed often, first. (This has the biggest impact in the least time...) Variable naming is probably the easiest and safest "low hanging fruit" to attack first, followed by framework updates (collection changes, updated methods, etc). Once those are done, breaking up large methods is usually my next step, followed by other typical refactorings.
There is no right or wrong answer here, as this depends largely on circumstances.
If the code is live, working, undocumented, and contains no testing infrastructure, then I wouldn't touch it. If someone comes back in the future and wants new features, I will try to work them into the existing code while changing as little as possible.
If the code is buggy, problematic, missing features, and was written by a programmer that no longer works with the company, then I would probably redesign and rewrite the whole thing. I could always still reference that programmer's code for a specific solution to a specific problem, but it would help me reorganize everything in my mind and in source. In this situation, the whole thing is probably poorly designed and it could use a complete re-think.
For everything in between, I would take the approach you outlined. I would start by cleaning up everything cosmetically so that I can see what's going on. Then I'd start working on whatever code stood out as needing the most work. I would add documentation as I come to understand how things work, to help me remember what's going on.
Ultimately, remember that if you're going to be maintaining the code now, it should be up to your standards. Where it's not, you should take the time to bring it up to your standards - whatever that takes. This will save you a lot of time, effort, and frustration down the road.
The lowest-hanging cosmetic fruit is (in Eclipse, anyway) shift-control-F. Automatic formatting is your friend.
The first thing I do is try to hide as much as possible from the outside world. If the code is crappy, most of the time the person who implemented it did not know much about data hiding and the like.
So my advice, first thing to do:
Make as many members and methods private as you can without breaking the compilation.
As a second step I try to identify the interfaces. I replace the concrete classes with the interfaces in all methods of related classes. This way you decouple the classes a bit.
Further refactoring can then be done more safely and locally.
You can buy a copy of Refactoring: Improving the Design of Existing Code by Martin Fowler; you'll find a lot of things you can do during your refactoring operation.
Plus you can use tools provided by your IDE and other code analyzers such as FindBugs or PMD to detect problems in your code.
Resources:
www.refactoring.com
wikipedia - List of tools for static code analysis in java
On the same topic:
How do you refactor a large messy codebase?
Code analyzers: PMD & FindBugs
By starting with "cosmetic cleanup" you get a good overview of how messy the code is and this combined with better readability is a good beginning.
I always (yeah, right... sometimes there's something called a deadline that messes with me) start with this approach, and it has served me very well so far.
You're on the right track. By doing the small fixes you'll be more familiar with the code and the bigger fixes will be easier to do with all the detritus out of the way.
Run a tool like JDepend, CheckStyle or PMD on the source. They can automatically suggest loads of changes that are cosmetic but based on general refactoring rules.
I do not change old code except to reformat it using the IDE. There is too much risk of introducing a bug - or removing a bug that other code now depends upon! Or introducing a dependency that didn't exist such as using the heap instead of the stack.
Beyond the IDE reformat, I don't change code that the boss hasn't asked me to change. If something is egregious, I ask the boss if I can make changes and state a case of why this is good for the company.
If the boss asks me to fix a bug in the code, I make as few changes as possible. Say the bug is in a simple for loop. I'd refactor the loop into a new method. Then I'd write a test case for that method to demonstrate I have located the bug. Then I'd fix the new method. Then I'd make sure the test cases pass.
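A minimal sketch of that extract-and-test workflow, with entirely hypothetical names and a made-up off-by-one bug (JUnit 4 style):

import static org.junit.Assert.assertEquals;
import org.junit.Test;

public class PriceCalculatorTest {

    // The extracted method, formerly an inline for loop. The bug: it
    // started summing at index 1 and skipped the first item.
    static int totalCents(int[] itemCents) {
        int total = 0;
        for (int i = 0; i < itemCents.length; i++) { // was: int i = 1
            total += itemCents[i];
        }
        return total;
    }

    // This test demonstrates the located bug (it failed before the fix).
    @Test
    public void includesFirstItem() {
        assertEquals(60, totalCents(new int[] {10, 20, 30}));
    }
}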
Yeah, I'm a contractor. Contracting gives you a different point of view. I recommend it.
There is one thing you should be aware of: the code you are starting with has been TESTED and approved, and your changes automatically mean that retesting must happen, as you may have inadvertently broken some behaviour elsewhere.
Besides, everybody makes errors. Every non-trivial change you make (changing StringTokenizer to split is not an automatic feature in e.g. Eclipse, so you write it yourself) is an opportunity for errors to creep in. Did you get the exact behaviour of a conditional right, or did you by mere mistake forget a !?
Hence, your changes imply retesting. That work may be quite substantial and severely overwhelm the small changes you have made.
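The StringTokenizer-to-split conversion mentioned above is a good concrete example: the two APIs disagree about empty tokens, so a hand-written conversion can silently change behaviour. A small demonstration:

import java.util.Arrays;
import java.util.StringTokenizer;

public class TokenizeDemo {
    public static void main(String[] args) {
        String line = "a,,b";

        // StringTokenizer silently skips empty tokens: prints a, then b
        StringTokenizer st = new StringTokenizer(line, ",");
        while (st.hasMoreTokens()) {
            System.out.println("token: " + st.nextToken());
        }

        // String#split keeps interior empty tokens: prints [a, , b]
        System.out.println(Arrays.toString(line.split(",")));
    }
}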
I don't normally bother going through old code looking for problems. However, if I'm reading it, as you appear to be doing, and it makes my brain glitch, I fix it.
Common low-hanging fruits for me tend to be more about renaming classes, methods, fields etc., and writing examples of behaviour (a.k.a. unit tests) when I can't be sure of what a class is doing by inspection - generally making the code more readable as I read it. None of these are what I'd call "invasive" but they're more than just cosmetic.
From experience it depends on two things: time and risk.
If you have plenty of time then you can do a lot more, if not then the scope of whatever changes you make is reduced accordingly. As much as I hate doing it I have had to create some horrible shameful hacks because I simply didn't have enough time to do it right...
If the code you are working on has lots of dependencies or is critical to the application then make as few changes as possible - you never know what your fix might break... :)
It sounds like you have a solid idea of what things should look like so I am not going to say what specific changes to make in what order 'cause that will vary from person to person. Just make small localized changes first, test, expand the scope of your changes, test. Expand. Test. Expand. Test. Until you either run out of time or there is no more room for improvement!
BTW, when testing you are likely to see where things break most often - create test cases for them (JUnit or whatever).
EXCEPTION:
Two things that I always find myself doing are reformatting (Ctrl+Shift+F in Eclipse) and commenting code that is not obvious. After that I just hammer the most obvious nail first...
