While studying the standard Java library and its classes, I couldn't help noticing that some of those classes have methods that, in my opinion, have next to no relevance to the purpose of those classes.
The methods I'm talking about are, for example, Integer#getInteger, which retrieves the value of a "system property", or System#arraycopy, whose purpose is well defined by its name.
Still, both of these methods seem kinda out of place, especially the first one, which for some reason binds working with system resources to a primitive type wrapper class.
From my current point of view, such a method placement policy looks like a violation of a fundamental OOP design principle: that each class must be dedicated to solving its particular set of problems and not turn itself into a Swiss army knife.
But since I don't think that the Java designers are idiots, I assume that there's some logic behind the decision to place those methods right where they are. So I'd be grateful if someone could explain what that logic really is.
Thanks!
Update
A few people have hinted that Java does have its illogical parts that are simply remnants of a turbulent past. Let me reformulate my question, then: why is Java so unwilling to mark its architectural flaws as deprecated? It's not as if the existing deprecated features are likely to be removed in the foreseeable future, and deprecating things really does help keep them out of newly written code.
This is a good thing to wonder about. I know that for more recent features (such as generics, lambdas etc.) there are several blogs and mailing-list posts that explain the choices made by the library makers. These are very interesting to read.
In your case I expect the answer isn't too exciting. The reason they were made is hard to tell, but both classes have existed since JDK 1.0. In those days the quality of programming in general (and also of Java and OO in particular) was perhaps lower (meaning there were fewer common practices, and library makers had to invent many paradigms themselves). There were also other constraints in those times, such as object creation being expensive.
Many of those awkwardly designed methods and classes now have a better alternative. (See Date and the package java.time)
You would expect arraycopy to live in the Arrays class. Arrays did later gain copyOf and copyOfRange (in Java 6), but the original System.arraycopy was never moved or deprecated.
Ideally the original method would be deprecated for a while and then removed. Many libraries follow this strategy. Java, however, is very conservative about this and only deprecates things that really should not be used (such as Thread.stop()). I don't think a method has ever been removed in Java due to deprecation. This means it is fairly easy to upgrade your software to a newer version of Java, but it comes at the cost of leaving some clutter in the libraries.
The fact that Java is so conservative about keeping new JDK/JRE versions compatible with older source code and binaries is both loved and hated. For your hobby project, or a small actively developed project, upgrading to a new JVM that removes deprecated functions after a few years is not too difficult. But don't forget that many projects are not actively developed, or the developers have a hard time making changes safely, for instance because they lack a proper regression test suite. In these projects, API changes cost a lot of time to comply with and run the risk of introducing bugs.
Also, libraries often try to support older versions of Java as well as newer versions; they will have a problem doing so when methods have been deleted.
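The deprecate-then-replace strategy described above can be sketched as follows. This is only an illustration: the class LegacyMath and both of its methods are made up, and the plain @Deprecated annotation is used so that it compiles on older Java versions (Java 9 and later also accept since and forRemoval attributes).

```java
class LegacyMath {
    /**
     * @deprecated use {@link #sum(int...)} instead.
     */
    @Deprecated
    static int add(int a, int b) {
        // The old entry point keeps working by delegating to its replacement.
        return sum(a, b);
    }

    // The preferred replacement (hypothetical).
    static int sum(int... values) {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        // Callers of add() still compile, but the compiler warns about the deprecated call.
        System.out.println(add(2, 3));
        System.out.println(sum(2, 3, 4));
    }
}
```

Existing callers keep compiling and running; they only get a warning nudging them toward the replacement.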
The Integer-example is probably just a design decision. If you want to implicitly interpret a property as Integer use java.lang.Integer. Otherwise you would have to provide a getter method for each java.lang-Type. Something like:
System.getPropertyAsBoolean(String)
System.getPropertyAsByte(String)
System.getPropertyAsInteger(String)
...
And for each data type, you'd require one additional method for the default:
- System.getPropertyAsBoolean(String, boolean)
- System.getPropertyAsByte(String, byte)
...
Since the java.lang types already have some parsing abilities (Integer.valueOf(String)), I am not too surprised to find a property getter here. Convenience in exchange for bending principles a tiny bit.
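To make the convenience argument concrete, here is a minimal sketch comparing Integer.getInteger with a hand-written equivalent. The property names demo.port and demo.missing, the class name, and the helper readIntProperty are made up for illustration.

```java
class GetIntegerDemo {
    // Hand-written equivalent: read the property, parse it, fall back to a default.
    static int readIntProperty(String name, int fallback) {
        String raw = System.getProperty(name);
        if (raw == null) {
            return fallback;
        }
        try {
            return Integer.decode(raw);
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.setProperty("demo.port", "8080"); // hypothetical property name

        // The built-in shortcut on the wrapper class.
        Integer viaWrapper = Integer.getInteger("demo.port", 80);
        int viaHelper = readIntProperty("demo.port", 80);
        System.out.println(viaWrapper + " " + viaHelper);

        // A missing property makes both variants return the default.
        System.out.println(Integer.getInteger("demo.missing", 80));
    }
}
```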
As for System.arraycopy, I guess it is an operation that depends on the operating system. You probably copy memory from one location to another in a very efficient way. If I wanted to copy an array like that, I'd look for it in java.lang.System.
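For reference, this is roughly what calling System.arraycopy looks like next to Arrays.copyOfRange, which covers the common case of copying into a fresh array. The class name and the slice helper are made up for this sketch.

```java
import java.util.Arrays;

class ArrayCopyDemo {
    // Copies `length` elements starting at index `from` into a new array.
    static int[] slice(int[] source, int from, int length) {
        int[] target = new int[length];
        // Arguments: source, source position, destination, destination position, count.
        System.arraycopy(source, from, target, 0, length);
        return target;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4, 5};
        int[] viaSystem = slice(data, 1, 3);
        int[] viaArrays = Arrays.copyOfRange(data, 1, 4); // end index is exclusive
        System.out.println(Arrays.toString(viaSystem));
        System.out.println(Arrays.equals(viaSystem, viaArrays));
    }
}
```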
"I assume that there's some logic behind a decision to place those
methods right where they are."
While that is often true, I have found that when something's off, this assumption is typically where you are misled.
A language is in constant development, from the day someone proposes a new language to the day it is antiquated. In between those extremes are some phases that the language goes through. Especially if someone is spending money on it and wants people to use it, a very peculiar phase often occurs, just before or after the first release:
The "we need this to work yesterday" phase.
This is where stuff like this happens: you have an almost complete language, but the programmers need to do something to show what the language can do, or a specific application needs a feature that was not designed into the language.
So where do we add this feature?
- Well, where it makes most sense to that particular programmer whose task it is to "make it work yesterday".
The logic may be that this is where the function makes the most sense, since it doesn't belong anywhere else and it doesn't deserve a class of its own. It could also be something like: so far, we have never done an array copy without using System, so let's put arraycopy in there and save everyone an extra import.
In the next generation of the language, people will not move the feature, since some experienced programmers will complain. So the feature may be duplicated, and found in a place where it makes more sense.
Much later, it will be marked as deprecated, and deleted, if anyone cares to clean it up.
Related
The IDE has suggested to add a getter/setter to a private field.
It is never used, and reaching the field is only from within the class.
What is the preferred coding style? Keeping the never-used methods?
I'm asking specifically about Java/Kotlin, but this is a general question.
There are a few distinctions that you need to know about to answer this question yourself - as it depends on a ton of things; far too much to ask for and for you to write down:
For this entire answer it's important to think about the distinction between layers of code. These layers can be a bit hard to think about if the project you're imagining when thinking about layers is something small and written just by yourself. So don't do that - think about, say, Microsoft Word as a product. It's written by many people, over many years - entire departments and dev teams. It's somewhat modular (there's the "Mail Merge" system that doesn't interact, at all, with the 'show available fonts' dropdown).
What's the whole private fields, public getters/setters all about in the first place?
Fields are highly inflexible constructs. If you 'expose' one (make it public), then there is no granularity available to you. About the only knob you can twiddle is:
You can make a field unchangeable for everybody - you can't change it, nor can anybody else. (To do this, mark it final).
That's it. You can't do anything else 'to' it. You can't have more fine-grained control over access (such as allowing code 'nearby' to change it, but not code further out), and you can't run some code as field writes/reads happen either. Perhaps you need more granularity. Keep in mind that we're trying to write code that will survive 10 years in an environment with 100 programmers, most of whom won't last the entire 10 years, in many different teams. So, imagine you wanted to:
Make it a field that everybody gets to read, but only 'your' code (that is, the programmers working on this particular corner of the codebase who are aware of this particular corner's rules and needs) should get to change.
Make it a field that everybody gets to read and write, but, if it's not 'your' code doing the writing, a log line should be emitted.
Make it a field that nobody gets to write (not even you - it is initialized at object creation and that's it, which makes it easier to reason about. That's why we 'handcuff' ourselves: when you need to maintain code for 10 years, ruling certain things out and having a compiler that enforces this is quite useful), and that 'outsiders' can read, but you want to tweak the read a bit, for example, substitute 'the current date' when the value is blank.
And so on.
Even more important is time: sometimes you start out just wanting to expose a field to everybody right now, but later on you realize: Oh, wait, we need to emit a log line. Or: Oops, we need to return the current date if the value is blank.
If you just make a field public, you:
Do not have any of that granularity.
Even if you're okay with that now, you can't later on update your code and add stuff that needs this granularity; not without turning the field into a getter/setter pair, and that is not backwards compatible: You need to send a mail to those 100 developers or start refactoring their code which is a huge undertaking.
Hence, even if you don't see any point or purpose in giving yourself these granularity powers right now, it's still advised to just make that field private and add getters and setters: that way, if later on some currently unforeseeable request comes in (such as: Log the writes to this field, please!), you can add that feature without having to ask all the other 100 developers to pull the change and edit all their branches, which is a huge undertaking.
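As a rough sketch of what that flexibility buys you, here is a class where both 'later requests' from the examples above (log every write, substitute the current date when the value is blank) were added without touching any caller. The class Invoice, its dueDate property, and the in-memory audit log are all made up for illustration; a real codebase would use its own logging framework.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

class Invoice {
    private LocalDate dueDate; // null means "blank"
    private final List<String> auditLog = new ArrayList<>();

    // Everybody may read; a blank value is replaced by today's date.
    public LocalDate getDueDate() {
        return dueDate != null ? dueDate : LocalDate.now();
    }

    // Everybody may write, but every write leaves a log line.
    public void setDueDate(LocalDate newDueDate) {
        auditLog.add("dueDate: " + dueDate + " -> " + newDueDate);
        this.dueDate = newDueDate;
    }

    // Exposed only so the sketch can show the log; hypothetical.
    List<String> auditLog() {
        return auditLog;
    }

    public static void main(String[] args) {
        Invoice invoice = new Invoice();
        System.out.println(invoice.getDueDate()); // today's date, because the field is blank
        invoice.setDueDate(LocalDate.of(2030, 1, 1));
        System.out.println(invoice.getDueDate());
        System.out.println(invoice.auditLog());
    }
}
```

Had dueDate been a public field, neither behaviour could have been added without changing every piece of code that reads or writes it.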
YAGNI
A maxim in the programming world is YAGNI - You aren't gonna need it.
YAGNI is a dangerous beast - it applies -solely- to semi-local endeavours.
The basic principle of YAGNI is: Code is a flowing concept, and you should never hesitate to make improvements, especially if you can't think of a way this would break any existing usage. Hence, given that your development processes should be set up such that adding stuff is easy, don't add stuff until you need it - after all, if you add stuff even if you don't currently need it, maybe you never need it and you're now just clogging up the code for no good reason. If somebody needs it, they can trivially add it then.
The problem with YAGNI is that premise: YAGNI is based on the notion that making a change is quick and painless.
Imagine this scenario: The Microsoft Office development crew decides to write their own font rendering system, because what windows delivers just looks bad on HiDPI screens. So, they spend a ton of time and research on this and with much fanfare release a new version. Everybody loves it.
The OS team comes aknocking and the MS Office team decides to 'hand over' the new font rendering engine to the OS team. In order to avoid having 2 teams spend the resources on maintaining it, the next version of MS Office is pegged to only run on a new version of the OS that includes the new pipeline, and thus, the MS Office team removes the font rendering engine from it - it's now the OS's job.
Whoops, any YAGNI is now quite a big problem: If there's something foreseeable and obvious the MSOffice team needed that they didn't add (or if the Windows OS team applied YAGNI to the API they expose to apps to do font rendering stuff), then the MS Office team needs to give a call to the Windows OS team that's in another country, working on other source control, and having entirely different versioning pipelines, and ask them for a change. It'll take 2 years before it's all said and done.
Linters/stylecheckers are tools, and fairly stupid ones at that
Any warning about style or suggestion about changes are just that - suggestions. These tools aren't perfect, and will absolutely suggest very silly things from time to time. You should never apply style advice until you understand why it is given and under what circumstances it should be followed, and you should feel free to tell linters/stylecheckers to buzz off if they are wrong.
Some dev shops put out absolutist rules ('you can NEVER check in code that fails our linter tool - we have git commit hooks that enforce this!'), but those shops are misguided: They seem to think that if only you rigorously apply enough style rules, the code will therefore be well written, bug free, and performant. This is obviously entirely false. You should absolutely help programmers (and might even lightly enforce this) to avail themselves of the tools available to write better code, but you can't beat the bird to make it sing, so to speak.
Thus, be aware that sometimes the best thing to do about a style suggestion - is to ignore it.
Back to your question
So, now you know what I'm driving at when I ask these questions, which naturally lead you to answering your own question:
Is the field even meant to be exposed in the first place? Anything you 'expose' is likely to be used by code that's relatively far removed from you (different team, different time, different context), and once you expose it, you have to continue to support it - any changes you make can't fundamentally change/remove what you exposed. So, perhaps just having a private field with no getters and setters is the best place to start:
If you're sure it makes no sense to expose it, then don't. Just leave them as private fields, the code in this source file can edit them, and other code cannot even assume this field exists - they should know nothing about it.
If you're sure it makes perfect sense to expose it; it is the very point of the class, then make a private field with public getter (and if you want, setter - do you intend for it to be mutable or not?) - even if you don't see any need to do special stuff in that getter. Java programmers expect to access properties from other source files via getters and setters and you keep the flexibility to change things later without breaking compatibility.
If you're not sure, then think about YAGNI: Is this an API that is going to be exposed so far and wide it'll affect people who cannot easily modify the codebase? Then, sorry, you're going to have to think some more and make a decision. But most likely you're not writing that kind of code, and anybody who might want to access this thing could change the code fairly easily: It'll be you, or a colleague working in the same source tree. In which case, don't think about it too long - err on the side of caution and don't make getters and setters. If someone needs em later, well, let them make the call - with the benefit of that use case they now have, they'll be more likely to make a well informed decision than you can, without that benefit.
I'm using Java as an example. Let's say I'm developing a piece of software and I have this
CustomList extends ArrayList<ObjectMapper>
Now this has custom functions added to ArrayList. It's designed specifically to cater to lists of ObjectMapper instances (maybe JsonNode, TreeNode; they belong to the Jackson dependency). Now the entire project utilizes this. But someone mentioned this: "What if ArrayList gets deprecated or obsolete in the future?" If that does happen, we'd have to rewrite or modify the entire source code. So they called this Bad Design. How can I avoid this? Should I create an interface OurList that extends List and have it implemented as OurArrayList, which is what we would be using? Please explain how it can be properly done. Say I want to be able to use ArrayList's iterator method. What's the best way to do so without compromising the maintainability of the project code?
You are too worried, as ArrayList most likely won't be deprecated anytime soon.
Even if deprecated, don't think of it as a promise that your code will break.
You may have heard the term, "self-deprecating humor," or humor that minimizes the speaker's importance.
A deprecated class or method is like that. It is no longer important. It is so unimportant, in fact, that you should no longer use it, since it has been superseded and may cease to exist in the future.
If your worries stem from obsolescence, you should focus more on requirements and meeting demand. Things become obsolete, it's inevitable, don't let it haunt you. ArrayList becoming deprecated shouldn't be a worry unless it's deprecated for a serious problem.
Do you worry about Thread being deprecated? Vector was quite popular until ArrayList superseded it. You'll still see it around, as it was never removed, to preserve backwards compatibility.
You'll be wasting time if you invest it in modifying code primarily to stay "up with the trends", rather than investing that time into current problems. Worrying about ArrayList being deprecated could be distracting you from other potential problems your project may have.
I'm almost certain ArrayList will never be deprecated. It's too widely used. However, if you're certain about doing that, use CustomList implements List.
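A minimal sketch of that suggestion, assuming composition over inheritance: the class exposes the List interface and keeps the ArrayList as a private detail, so the backing implementation could be swapped in one place. It is written generically to stay self-contained (the question's element type would be substituted for T), extends AbstractList only to avoid hand-writing every List method, and the helper lastOrNull is a made-up stand-in for the project's custom functions.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.List;

class CustomList<T> extends AbstractList<T> {
    // The only place that knows about ArrayList.
    private final List<T> delegate = new ArrayList<>();

    @Override public T get(int index) { return delegate.get(index); }
    @Override public int size() { return delegate.size(); }
    @Override public void add(int index, T element) { delegate.add(index, element); }
    @Override public T set(int index, T element) { return delegate.set(index, element); }
    @Override public T remove(int index) { return delegate.remove(index); }

    // Hypothetical project-specific helper.
    public T lastOrNull() {
        return delegate.isEmpty() ? null : delegate.get(delegate.size() - 1);
    }

    public static void main(String[] args) {
        CustomList<String> names = new CustomList<>();
        names.add("first");
        names.add("second");
        // Iteration comes for free from AbstractList.
        for (String name : names) {
            System.out.println(name);
        }
        System.out.println(names.lastOrNull());
    }
}
```

Code elsewhere in the project can then declare variables as List<T> or CustomList<T> and never mention ArrayList.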
I have inherited a massive system from my predecessor and I am beginning to understand how it works, but I can't fathom why.
It's in Java and uses interfaces which should add an extra layer, but they add 5 or 6.
Here's how it goes when the user interface button is pressed: it calls a function which looks like this
foo.create(stuff...)
{
    bar.create(stuff...);
}
bar.create is exactly the same except it calls foobar.create, and that in turn calls barfoo.create. This goes on through 9 classes before it reaches a function that accesses the database.
As far as I know each extra function call incurs more performance cost, so this seems stupid to me.
Also, in foo.create all the variables are error checked. This makes sense, but in every other call the error checks happen again; it looks like cut-and-paste code.
This seems like madness: once the variables have been checked they should not need to be rechecked, as this is just wasting processor cycles in my opinion.
This is my first project using Java and interfaces, so I'm just confused as to what's going on.
Can anyone explain why the system was designed like this, what benefits/drawbacks it has, and what I can do to improve it if it is bad?
Thank you.
I suggest you look at design patterns, and see if they are being used in the project. Search for words like factory and abstract factory initially. Only then will the intentions of the previous developer be understood correctly.
Also, in general, unless you are running on a resource constrained device, don't worry about the cost of an extra call or level of indirection. If it helps your design, makes it easier to understand or open to extension, then the extra calls are worth making.
However, if there is copy-paste in the code, then that is not a good sign, and the developer probably did not know what he was doing.
It is very hard to understand what exactly is done in your software. Maybe it even makes sense. But I've seen a couple of projects done by "design pattern maniacs". It looked like they wanted to demonstrate their knowledge of all sorts of delegates, indirections, etc. Maybe that is your case.
I cannot comment on the architecture without carefully examining it, but generally speaking, separating services across different layers is a good idea. That way, if you change the implementation of one service, the other services remain unchanged. However, this will be true only if there is loose coupling between the layers.
In addition, it is generally the norm that each service handles the exceptions that specifically pertain to the kind of service it provides, leaving the rest to others. This also allows us to reduce the coupling between service layers.
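To illustrate what loose coupling between two layers can look like, here is a small sketch in which the service layer validates its input once and talks to the persistence layer only through an interface. All names (UserStore, InMemoryUserStore, UserService) are invented for the example, and the in-memory map merely stands in for a real database.

```java
import java.util.HashMap;
import java.util.Map;

class LayeringDemo {
    public static void main(String[] args) {
        UserService service = new UserService(new InMemoryUserStore());
        service.create("42", "Ada");
        System.out.println(service.nameOf("42"));
    }
}

// Contract of the persistence layer; the service knows nothing else about storage.
interface UserStore {
    void save(String id, String name);
    String find(String id);
}

// One possible implementation; a database-backed one could replace it unchanged.
class InMemoryUserStore implements UserStore {
    private final Map<String, String> rows = new HashMap<>();

    @Override public void save(String id, String name) { rows.put(id, name); }
    @Override public String find(String id) { return rows.get(id); }
}

// Service layer: validates once at the boundary, then delegates.
class UserService {
    private final UserStore store;

    UserService(UserStore store) {
        this.store = store;
    }

    void create(String id, String name) {
        if (id == null || id.isEmpty()) {
            throw new IllegalArgumentException("id is required");
        }
        store.save(id, name);
    }

    String nameOf(String id) {
        return store.find(id);
    }
}
```

Swapping InMemoryUserStore for another UserStore implementation requires no change to UserService, which is the benefit a layered design is supposed to deliver.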
When I receive code I have not seen before to refactor it into some sane state, I normally fix "cosmetic" things (like converting StringTokenizers to String#split(), replacing pre-1.2 collections by newer collections, making fields final, converting C-style arrays to Java-style arrays, ...) while reading the source code I have to get familiar with.
Are there many people using this strategy (maybe it is some kind of "best practice" I don't know?) or is this considered too dangerous, and not touching old code if it is not absolutely necessary is generally preferred? Or is it more common to combine the "cosmetic cleanup" step with the more invasive "general refactoring" step?
What are the common "low-hanging fruits" when doing "cosmetic clean-up" (vs. refactoring with more invasive changes)?
In my opinion, "cosmetic cleanup" is "general refactoring." You're just changing the code to make it more understandable without changing its behavior.
I always refactor by attacking the minor changes first. The more readable you can make the code quickly, the easier it will be to do the structural changes later - especially since it helps you look for repeated code, etc.
I typically start by looking at code that is used frequently and will need to be changed often, first. (This has the biggest impact in the least time...) Variable naming is probably the easiest and safest "low hanging fruit" to attack first, followed by framework updates (collection changes, updated methods, etc). Once those are done, breaking up large methods is usually my next step, followed by other typical refactorings.
There is no right or wrong answer here, as this depends largely on circumstances.
If the code is live, working, undocumented, and contains no testing infrastructure, then I wouldn't touch it. If someone comes back in the future and wants new features, I will try to work them into the existing code while changing as little as possible.
If the code is buggy, problematic, missing features, and was written by a programmer that no longer works with the company, then I would probably redesign and rewrite the whole thing. I could always still reference that programmer's code for a specific solution to a specific problem, but it would help me reorganize everything in my mind and in source. In this situation, the whole thing is probably poorly designed and it could use a complete re-think.
For everything in between, I would take the approach you outlined. I would start by cleaning up everything cosmetically so that I can see what's going on. Then I'd start working on whatever code stood out as needing the most work. I would add documentation as I understand how it works so that I will help remember what's going on.
Ultimately, remember that if you're going to be maintaining the code now, it should be up to your standards. Where it's not, you should take the time to bring it up to your standards - whatever that takes. This will save you a lot of time, effort, and frustration down the road.
The lowest-hanging cosmetic fruit is (in Eclipse, anyway) shift-control-F. Automatic formatting is your friend.
The first thing I do is try to hide as much as possible from the outside world. If the code is crappy, most of the time the person who implemented it did not know much about data hiding and the like.
So my advice, first thing to do:
Turn as many members and methods as private as you can without breaking the compilation.
As a second step I try to identify the interfaces. I replace the concrete classes with the interfaces in all methods of related classes. This way you decouple the classes a bit.
Further refactoring can then be done more safely and locally.
You can buy a copy of Refactoring: Improving the Design of Existing Code by Martin Fowler; you'll find a lot of things you can do during your refactoring operation.
Plus you can use the tools provided by your IDE and other code analyzers such as FindBugs or PMD to detect problems in your code.
Resources :
www.refactoring.com
wikipedia - List of tools for static code analysis in java
On the same topic :
How do you refactor a large messy codebase?
Code analyzers: PMD & FindBugs
By starting with "cosmetic cleanup" you get a good overview of how messy the code is and this combined with better readability is a good beginning.
I always (yeah, right... sometimes there's something called a deadline that messes with me) start with this approach and it has served me very well so far.
You're on the right track. By doing the small fixes you'll be more familiar with the code and the bigger fixes will be easier to do with all the detritus out of the way.
Run a tool like JDepend, CheckStyle or PMD on the source. They can automatically flag loads of issues that are cosmetic but based on general refactoring rules.
I do not change old code except to reformat it using the IDE. There is too much risk of introducing a bug - or removing a bug that other code now depends upon! Or introducing a dependency that didn't exist such as using the heap instead of the stack.
Beyond the IDE reformat, I don't change code that the boss hasn't asked me to change. If something is egregious, I ask the boss if I can make changes and state a case of why this is good for the company.
If the boss asks me to fix a bug in the code, I make as few changes as possible. Say the bug is in a simple for loop. I'd refactor the loop into a new method. Then I'd write a test case for that method to demonstrate I have located the bug. Then I'd fix the new method. Then I'd make sure the test cases pass.
Yeah, I'm a contractor. Contracting gives you a different point of view. I recommend it.
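The "extract the loop, test it, then fix it" workflow described above might look like this. The bug itself is invented for the sketch (a loop that used to start at index 1 and skip the first element), as are the class and method names; the hand-rolled check stands in for a proper JUnit test case.

```java
class OrderTotals {
    // Extracted from a larger method so the loop can be tested in isolation.
    // Hypothetical original bug: the loop started at i = 1 and skipped the first price.
    static int totalCents(int[] pricesInCents) {
        int total = 0;
        for (int i = 0; i < pricesInCents.length; i++) {
            total += pricesInCents[i];
        }
        return total;
    }

    public static void main(String[] args) {
        // This check failed against the buggy loop and passes after the fix.
        int actual = totalCents(new int[] {100, 250, 5});
        if (actual != 355) {
            throw new AssertionError("expected 355 but got " + actual);
        }
        System.out.println("ok");
    }
}
```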
There is one thing you should be aware of. The code you are starting with has been TESTED and approved, and your changes automatically mean that retesting must happen, as you may have inadvertently broken some behaviour elsewhere.
Besides, everybody makes errors. Every non-trivial change you make (changing StringTokenizer to split is not an automatic feature in e.g. Eclipse, so you write it yourself) is an opportunity for errors to creep in. Do you get the exact behaviour right of a conditional, or did you by mere mistake forget a !?
Hence, your changes imply retesting. That work may be quite substantial and can far outweigh the small changes you have made.
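The StringTokenizer-to-split conversion mentioned above is a good illustration of how a "cosmetic" change can alter behaviour: the two do not treat empty tokens the same way. The following sketch (class and method names are made up) shows the difference on an input with two adjacent delimiters.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

class TokenizerVsSplit {
    // Old style: StringTokenizer silently skips empty tokens.
    static List<String> viaTokenizer(String input) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(input, ",");
        while (tokenizer.hasMoreTokens()) {
            tokens.add(tokenizer.nextToken());
        }
        return tokens;
    }

    // New style: split keeps empty strings between adjacent delimiters.
    static List<String> viaSplit(String input) {
        return Arrays.asList(input.split(","));
    }

    public static void main(String[] args) {
        String input = "a,,b";
        System.out.println(viaTokenizer(input).size()); // 2 tokens: "a", "b"
        System.out.println(viaSplit(input).size());     // 3 tokens: "a", "", "b"
    }
}
```

Any caller that relied on indexing into the tokens would behave differently after the conversion, which is exactly the kind of regression that retesting is meant to catch.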
I don't normally bother going through old code looking for problems. However, if I'm reading it, as you appear to be doing, and it makes my brain glitch, I fix it.
Common low-hanging fruits for me tend to be more about renaming classes, methods, fields etc., and writing examples of behaviour (a.k.a. unit tests) when I can't be sure of what a class is doing by inspection - generally making the code more readable as I read it. None of these are what I'd call "invasive" but they're more than just cosmetic.
From experience it depends on two things: time and risk.
If you have plenty of time then you can do a lot more, if not then the scope of whatever changes you make is reduced accordingly. As much as I hate doing it I have had to create some horrible shameful hacks because I simply didn't have enough time to do it right...
If the code you are working on has lots of dependencies or is critical to the application then make as few changes as possible - you never know what your fix might break... :)
It sounds like you have a solid idea of what things should look like so I am not going to say what specific changes to make in what order 'cause that will vary from person to person. Just make small localized changes first, test, expand the scope of your changes, test. Expand. Test. Expand. Test. Until you either run out of time or there is no more room for improvement!
BTW When testing you are likely to see where things break most often - create test cases for them (JUnit or whatever).
EXCEPTION:
Two things that I always find myself doing are reformatting (CTRL+SHIFT+F in Eclipse) and commenting code that is not obvious. After that I just hammer the most obvious nail first...
I have always worked on statically typed languages (C/C++, Java). I have been playing with Clojure and I really like it.
One thing I am worried about is this: say that I have a window function that takes 3 modules as arguments, and along the way the requirements change and I need to pass another module to the function. In a statically typed language I just change the function and the compiler complains everywhere I used it. But in Clojure it won't complain until the function is called. I can do a regex search and replace, but it seems there is a chance of missing a call, and that will go unnoticed until the function is actually called. How do you guys deal with this?
This is one of the reasons automated testing/test driven development is even more important in dynamically typed languages. I haven't used Clojure (I mostly use Ruby), so unfortunately I can't recommend a specific testing framework.
The first thing I'd like to mention is that Bruce Eckel has written a very interesting article called Strong Typing vs Strong Testing (the link is down at the moment, unfortunately, but hopefully it will be up soon).
His idea is that when dealing with compiled languages, the compiler is just acting as the first, automatic step of automatic testing. When making the move to a dynamic language, you lose this first level of automatic testing. But in both cases, this first, automatic level is just one part of testing, and not even a very important part.
His point is that if you're developing programs properly, i.e. doing some form of tests and regression tests, the lack of a compiler will only force you to add some more, somewhat basic tests anyways, which is why it's no big loss.
So I guess the first answer I'd give you is, focus on your testing, something you should be doing anyway, and such changes shouldn't affect you too badly.
The second thing I'd like to mention is many dynamic languages that I've seen (for example, Python) have much better abilities to change what methods/classes do without breaking existing code.
For example, with Python, if your method used to accept two parameters but now requires a third one, you can always add the third as a parameter with a default value: existing code keeps working, and new code can make use of it. This is a very basic technique, but in Python's case (and I assume most other dynamic languages as well), these techniques can get much more interesting; since they're dynamic, you can pretty much change the implementation of functions for specific modules, change what variables mean, etc.
I'd suggest looking at which techniques Clojure has that allow similar things, and deciding if they apply in your situation.
You do the same thing you did if the method was part of a public interface that you weren't the only user of.
You add a new method with the extra module and change the old one to call the new one with a suitable default.
Oh and if your program is that big, make sure you have good tests (test-is should make it simpler than Java)
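The "old method calls the new one with a default" approach can be sketched as follows. It is shown in Java to match the other examples on this page (in Clojure the same idea would be a multi-arity function), and every name in it, including the "default-module" value, is made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

class WindowFactory {
    // Old three-module entry point, kept so that existing callers still work.
    static List<String> createWindow(String first, String second, String third) {
        return createWindow(first, second, third, "default-module");
    }

    // New entry point that takes the additional module.
    static List<String> createWindow(String first, String second, String third, String fourth) {
        return Arrays.asList(first, second, third, fourth);
    }

    public static void main(String[] args) {
        System.out.println(createWindow("menu", "editor", "status"));
        System.out.println(createWindow("menu", "editor", "status", "search"));
    }
}
```

Callers can then be migrated to the four-argument version one at a time, instead of all at once.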
Test coverage is definitely important. But a dynamically typed language will allow you to work in a different way. In a statically typed language (like Java), a change in the interface means modifying all the callers. In Ruby, you could do this-- but probably won't. Instead, you'll probably add flexibility to the method in one of a few ways. Namely:
You tend to have very few methods that take as many as three parameters in Ruby (as opposed to Java). Because you don't have Java's statically typed interfaces, you break the problem down into smaller pieces and steps. It's much more common to write methods that take just 1 parameter, and then refactor when it becomes more complex.
It's possible-- and common-- to leave the old behavior in place while adding more arguments. For example, if you have to add a third argument to a two-argument method, you will set its default value to preserve the old behavior (and save you a refactor). If you are familiar with JavaScript libraries like jQuery, they take advantage of this everywhere with "optional" arguments.
Similar to optional arguments, methods can grow to take a flexible parameter list. With solid test coverage, you can quite easily add a new behavior to an existing method and safely know you haven't broken the existing code. In Rails, methods like "render" take a wide range of options.
You're not completely without compiler support in Clojure. In the specific example you give, it's the arity of the function that changed, which would be picked up by compiling the Clojure code. I'm still making the strong -> dynamic typing transition and find this comforting!
You lose some level of refactoring and type safety when you move to dynamic languages. The more information the compiler has, the more it can do at compile time for you.
Tim Bray has discussed this, Cedric has written a critique of that discussion, and there is a post on Artima that discusses it at length.
If you really need static typing, you can use https://github.com/clojure/core.typed and its Leiningen module to type-check your code.