Java super-tuning, a few questions

Before I ask my question can I please ask not to get a lecture about optimising for no reason.
Consider the following questions purely academic.
I've been thinking about the efficiency of accesses between root (ie often used and often accessing each other) classes in Java, but this applies to most OO languages/compilers. The fastest way (I'm guessing) that you could access something in Java would be a static final reference. Theoretically, since that reference is available during loading, a good JIT compiler would remove the need to do any reference lookup to access the variable and point any accesses to that variable straight to a constant address. Perhaps for security reasons it doesn't work that way anyway, but bear with me...
Say I've decided that there are some order of operations problems or some arguments to pass at startup that means I can't have a static final reference, even if I were to go to the trouble of having each class construct the other as is recommended to get Java classes to have static final references to each other. Another reason I might not want to do this would be... oh, say, just for example, that I was providing platform specific implementations of some of these classes. ;-)
Now I'm left with two obvious choices. I can have my classes know about each other with a static reference (on some system hub class), which is set after constructing all classes (during which I mandate that they cannot access each other yet, thus doing away with order of operations problems at least during construction). On the other hand, the classes could have instance final references to each other, were I now to decide that sorting out the order of operations was important or could be made the responsibility of the person passing the args - or more to the point, providing platform specific implementations of these classes we want to have referencing each other.
A static variable means you don't have to look up the location of the variable with respect to the class it belongs to, saving you one operation. A final variable means you don't have to look up the value at all, but it does have to belong to your class, so you save 'one operation'. OK, I know I'm really handwaving now!
Then something else occurred to me: I could have static final stub classes, kind of like a wacky interface where each call was relegated to an 'impl' which can just extend the stub. The performance hit then would be the double function call required to run the functions and possibly I guess you can't declare your methods final anymore. I hypothesised that perhaps those could be inlined if they were appropriately declared, then gave up as I realised I would then have to think about whether or not the references to the 'impl's could be made static, or final, or...
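To make that concrete, here is a rough sketch of what I mean by the stub-plus-impl idea; every class and method name below is hypothetical, and the platform choice is hard-coded just to keep the sketch short:
class GraphicsStub {
    // the static final reference discussed above, resolved once at class load
    private static final GraphicsStub IMPL = new DesktopGraphicsImpl();

    static void drawLine(int x1, int y1, int x2, int y2) {
        IMPL.doDrawLine(x1, y1, x2, y2);   // the "double function call" mentioned above
    }

    // the overridable hook a platform-specific impl extends
    protected void doDrawLine(int x1, int y1, int x2, int y2) { }
}

class DesktopGraphicsImpl extends GraphicsStub {
    @Override
    protected void doDrawLine(int x1, int y1, int x2, int y2) {
        // platform-specific drawing would go here
    }
}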
So which of the three would turn out fastest? :-)
Any other thoughts on lowering frequent-access overheads or even other ways of hinting performance to the JIT compiler?
UPDATE: After running several hours of tests of various things and reading http://www.ibm.com/developerworks/java/library/j-jtp02225.html I've found that most things you would normally look at when tuning, e.g. C++, go out the window completely with the JIT compiler. I've seen it run 30 seconds of calculations once, twice, and on the third (and subsequent) runs decide "Hey, you aren't reading the result of that calculation, so I'm not running it!".
FWIW you can test data structures and I was able to develop an arraylist implementation that was more performant for my needs using a microbenchmark. The access patterns must have been random enough to keep the compiler guessing, but it still worked out how to better implement a generic-ified growing array with my simpler and more tuned code.
As far as the test here was concerned, I simply could not get a benchmark result! My simple test of calling a function and reading a variable from a final vs non-final object reference revealed more about the JIT than the JVM's access patterns. Unbelievably, calling the same function on the same object at different places in the method changes the time taken by a factor of FOUR!
As the guy in the IBM article says, the only way to test an optimisation is in-situ.
Thanks to everyone who pointed me along the way.

It's worth noting that static fields are stored in a special per-class object which contains the static fields for that class. Using static fields instead of object fields is unlikely to be any faster.

See the update, I answered my own question by doing some benchmarking, and found that there are far greater gains in unexpected areas and that performance for simple operations like referencing members is comparable on most modern systems where performance is limited more by memory bandwidth than CPU cycles.

Assuming you found a way to reliably profile your application, keep in mind that it will all go out the window should you switch to another JDK impl (IBM to Sun to OpenJDK etc.), or even upgrade the version of your existing JVM.
The reason you are having trouble, and would likely have different results with different JVM impls, lies in the Java spec - it explicitly states that it does not define optimizations and leaves it to each implementation to optimize (or not) in any way, so long as execution behavior is unchanged by the optimization.

Java methods in different classes returning the same variables

I am a Java programmer who is new to the corporate world. Recently I've developed an application using Groovy and Java. The code I wrote used quite a good number of statics throughout. I was asked by the senior technical lot to cut down on the number of statics used. I've googled about it, and I find that many programmers are fairly against using static variables.
I find static variables more convenient to use. And I presume that they are efficient too (please correct me if I am wrong), because if I had to make 10,000 calls to a function within a class, I would be glad to make the method static and use a straightforward Class.methodCall() on it instead of cluttering the memory with 10,000 instances of the class, right?
Moreover statics reduce the inter-dependencies on the other parts of the code. They can act as perfect state holders. Adding to this I find that statics are widely implemented in some languages like Smalltalk and Scala. So why is this opposition to statics prevalent among programmers (especially in the world of Java)?
PS: please do correct me if my assumptions about statics are wrong.
Static variables represent global state. That's hard to reason about and hard to test: if I create a new instance of an object, I can reason about its new state within tests. If I use code which is using static variables, it could be in any state - and anything could be modifying it.
I could go on for quite a while, but the bigger concept to think about is that the tighter the scope of something, the easier it is to reason about. We're good at thinking about small things, but it's hard to reason about the state of a million line system if there's no modularity. This applies to all sorts of things, by the way - not just static variables.
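A minimal sketch of that contrast (both classes below are hypothetical):
class Counter {
    private int count;                     // instance state: each new Counter starts at 0
    int next() { return ++count; }
}

class GlobalCounter {
    static int count;                      // global state: anything may already have modified it
    static int next() { return ++count; }
}
In a test, new Counter().next() is always 1, so it's easy to reason about; GlobalCounter.next() depends on whatever code ran before the test.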
It's not very object-oriented:
One reason statics might be considered "evil" by some people is that they are contrary to the object-oriented paradigm. In particular, they violate the principle that data is encapsulated in objects (which supports extension, information hiding, etc.). Statics, in the way you are describing using them, are essentially global variables used to avoid dealing with issues like scope. However, global variables are one of the defining characteristics of the procedural or imperative programming paradigm, not a characteristic of "good" object-oriented code. This is not to say the procedural paradigm is bad, but I get the impression your supervisor expects you to be writing "good object oriented code" and you're really wanting to write "good procedural code".
There are many gotchas in Java when you start using statics that are not always immediately obvious. For example, if you have two copies of your program running in the same VM, will they share the static variable's value and mess with the state of each other? Or what happens when you extend the class - can you override the static member? Is your VM running out of memory because you have insane numbers of statics and that memory cannot be reclaimed for other needed instance objects?
Object Lifetime:
Additionally, statics have a lifetime that matches the entire runtime of the program. This means that, even once you're done using your class, the memory from all those static variables cannot be garbage collected. If, instead, you made your variables non-static, and in your main() function you made a single instance of your class, asked it to execute a particular function 10,000 times, and then deleted your references to that single instance once those 10,000 calls were done, all of those (now non-static) variables could be garbage collected and reused.
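A sketch of that lifetime point, with a hypothetical Worker class:
class Worker {
    private final int[] buffer = new int[1024];  // instance state, not static
    void doWork() { buffer[0]++; }
}

public class Main {
    public static void main(String[] args) {
        Worker w = new Worker();                 // one instance holds all the state
        for (int i = 0; i < 10_000; i++) {
            w.doWork();
        }
        w = null;  // the instance and its fields are now eligible for garbage collection;
                   // had buffer been static, it would stay reachable until the class unloads
    }
}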
Prevents certain re-use:
Also, static methods cannot be used to implement an interface, so static methods can prevent certain object oriented features from being usable.
Other Options:
If efficiency is your primary concern, there might be other, better ways to solve the speed problem than considering only the advantage of invocation usually being faster than creation. Consider whether the transient or volatile modifiers are needed anywhere. To preserve the ability to be inlined, a method could be marked as final instead of static. Method parameters and other variables can be marked final to permit certain compiler optimizations based on assumptions about what can change those variables. An instance object could be reused multiple times rather than creating a new instance each time. There may be compiler optimization switches that should be turned on for the app in general. Perhaps the design should be set up so that the 10,000 runs can be multi-threaded and take advantage of multi-processor cores. If portability isn't a concern, maybe a native method would get you better speed than your statics do.
If for some reason you do not want multiple copies of an object, the singleton design pattern has advantages over static objects, such as thread-safety (presuming your singleton is coded well), permitting lazy initialization, guaranteeing the object has been properly initialized when it is used, sub-classing, and advantages in testing and refactoring your code. Not to mention, if at some point you change your mind about only wanting one instance of an object, it is MUCH easier to remove the code that prevents duplicate instances than it is to refactor all your static variable code to use instance variables. I've had to do that before; it's not fun, and you end up having to edit a lot more classes, which increases your risk of introducing new bugs... so it is much better to set things up "right" the first time, even if it seems like it has its disadvantages. For me, the re-work required should you decide down the road that you need multiple copies of something is probably one of the most compelling reasons to use statics as infrequently as possible. And thus I would also disagree with your statement that statics reduce inter-dependencies; I think you will end up with code that is more coupled if you have lots of statics that can be directly accessed, rather than an object that "knows how to do something" itself.
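For reference, one common thread-safe, lazily initialized singleton sketch (the "initialization-on-demand holder" idiom; the Configuration name is just illustrative):
public class Configuration {
    private Configuration() { }                  // nobody else can instantiate it

    private static class Holder {
        static final Configuration INSTANCE = new Configuration();
    }

    public static Configuration getInstance() {
        return Holder.INSTANCE;                  // created lazily, safely published by the JVM
    }
}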
Evil is a subjective term.
You don't control statics in terms of creation and destruction. They live at the behest of the program loading and unloading.
Since statics live in one space, all threads wishing to use them must go through access control that you have to manage. This means that programs are more coupled, and change is harder to envisage and manage (as J Skeet says). This leads to problems of isolating change impact and thus affects how testing is managed.
These are the two main issues I have with them.
No. Global state is not evil per se. But we have to see your code to see if you used it properly. It is quite possible that a newbie abuses global state, just like he would abuse every language feature.
Global state is an absolute necessity. We cannot avoid global state. We cannot avoid reasoning about global state - if we care to understand our application's semantics.
People who try to get rid of global state for the sake of it inevitably end up with a much more complex system - and the global state is still there, cleverly/idiotically disguised under many layers of indirection; and we still have to reason about global state after unwrapping all the indirections.
Like the Spring people who lavishly declare global state in XML and think it's somehow superior.
@Jon Skeet: if I create a new instance of an object, now you have two things to reason about - the state within the object, and the state of the environment hosting the object.
If you are using the ‘static’ keyword without the ‘final’ keyword, this should be a signal to carefully consider your design. Even the presence of a ‘final’ is not a free pass, since a mutable static final object can be just as dangerous.
I would estimate somewhere around 85% of the time I see a ‘static’ without a ‘final’, it is WRONG. Often, I will find strange workarounds to mask or hide these problems.
Please don’t create static mutables. Especially Collections. In general, Collections should be initialized when their containing object is initialized and should be designed so that they are reset or forgotten about when their containing object is forgotten.
Using statics can create very subtle bugs which will cause sustaining engineers days of pain. I know, because I’ve both created and hunted these bugs.
If you would like more details, please read on…
Why Not Use Statics?
There are many issues with statics, including writing and executing tests, as well as subtle bugs that are not immediately obvious.
Code that relies on static objects can’t be easily unit tested, and statics can’t be easily mocked (usually).
If you use statics, it is not possible to swap the implementation of the class out in order to test higher level components. For example, imagine a static CustomerDAO that returns Customer objects it loads from the database. Now I have a class CustomerFilter, that needs to access some Customer objects. If CustomerDAO is static, I can’t write a test for CustomerFilter without first initializing my database and populating useful information.
And database population and initialization takes a long time. And in my experience, your DB initialization framework will change over time, meaning data will morph, and tests may break. IE, imagine Customer 1 used to be a VIP, but the DB initialization framework changed, and now Customer 1 is no longer VIP, but your test was hard-coded to load Customer 1…
A better approach is to instantiate a CustomerDAO and pass it into the CustomerFilter when it is constructed. (An even better approach would be to use Spring or another Inversion of Control framework.)
Once you do this, you can quickly mock or stub out an alternate DAO in your CustomerFilterTest, allowing you to have more control over the test.
Without the static DAO, the test will be faster (no db initialization) and more reliable (because it won’t fail when the db initialization code changes). For example, in this case ensuring Customer 1 is and always will be a VIP, as far as the test is concerned.
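A rough sketch of that constructor-injection shape (the interfaces and methods below are illustrative, not a real API):
interface CustomerDAO {
    Customer findById(long id);
}

class Customer {
    private final boolean vip;
    Customer(boolean vip) { this.vip = vip; }
    boolean isVip() { return vip; }
}

class CustomerFilter {
    private final CustomerDAO dao;               // injected, not reached through a static

    CustomerFilter(CustomerDAO dao) {
        this.dao = dao;
    }

    boolean isVip(long customerId) {
        return dao.findById(customerId).isVip();
    }
}
In CustomerFilterTest a stub can replace the database-backed DAO, e.g. new CustomerFilter(id -> new Customer(true)).isVip(1L).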
Executing Tests
Statics cause a real problem when running suites of unit tests together (for example, with your Continuous Integration server). Imagine a static map of network Socket objects that remains open from one test to another. The first test might open a Socket on port 8080, but you forgot to clear out the Map when the test gets torn down. Now when a second test launches, it is likely to crash when it tries to create a new Socket for port 8080, since the port is still occupied. Imagine also that Socket references in your static Collection are not removed, and (with the exception of WeakHashMap) are never eligible to be garbage collected, causing a memory leak.
This is an over-generalized example, but in large systems, this problem happens ALL THE TIME. People don’t think of unit tests starting and stopping their software repeatedly in the same JVM, but it is a good test of your software design, and if you have aspirations towards high availability, it is something you need to be aware of.
These problems often arise with framework objects, for example, your DB access, caching, messaging, and logging layers. If you are using Java EE or some best of breed frameworks, they probably manage a lot of this for you, but if like me you are dealing with a legacy system, you might have a lot of custom frameworks to access these layers.
If the system configuration that applies to these framework components changes between unit tests, and the unit test framework doesn’t tear down and rebuild the components, these changes can’t take effect, and when a test relies on those changes, they will fail.
Even non-framework components are subject to this problem. Imagine a static map called OpenOrders. You write one test that creates a few open orders, and checks to make sure they are all in the right state, then the test ends. Another developer writes a second test which puts the orders it needs into the OpenOrders map, then asserts the number of orders is accurate. Run individually, these tests would both pass, but when run together in a suite, they will fail.
Worse, failure might be based on the order in which the tests were run.
In this case, by avoiding statics, you avoid the risk of persisting data across test instances, ensuring better test reliability.
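A sketch of that kind of interference (the classes are hypothetical, and plain asserts stand in for a real test framework):
import java.util.ArrayList;
import java.util.List;

class OpenOrders {
    static final List<String> ORDERS = new ArrayList<>();  // shared by every test in the JVM
}

class FirstOrderTest {
    void testCreatesOrders() {
        OpenOrders.ORDERS.add("order-1");
        OpenOrders.ORDERS.add("order-2");
        assert OpenOrders.ORDERS.size() == 2;   // passes when run alone
    }
}

class SecondOrderTest {
    void testCountsOrders() {
        OpenOrders.ORDERS.add("order-3");
        assert OpenOrders.ORDERS.size() == 1;   // passes alone, fails after FirstOrderTest
    }
}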
Subtle Bugs
If you work in high availability environment, or anywhere that threads might be started and stopped, the same concern mentioned above with unit test suites can apply when your code is running on production as well.
When dealing with threads, rather than using a static object to store data, it is better to use an object initialized during the thread’s startup phase. This way, each time the thread is started, a new instance of the object (with a potentially new configuration) is created, and you avoid data from one instance of the thread bleeding through to the next instance.
When a thread dies, a static object doesn't get reset or garbage collected. Imagine you have a thread called "EmailCustomers", and when it starts it populates a static String collection with a list of email addresses, then begins emailing each of the addresses. Let's say the thread is interrupted or canceled somehow, so your high availability framework restarts the thread. When the thread starts up, it reloads the list of customers. But because the collection is static, it might retain the list of email addresses from the previous run. Now some customers might get duplicate emails.
An Aside: Static Final
The use of "static final" is effectively the Java equivalent of a C #define, although there are technical implementation differences. A C/C++ #define is swapped out of the code by the pre-processor, before compilation. A Java "static final" will end up memory resident in the JVM's class memory, making it (usually) permanent in RAM. In that way, it is more similar to a "static const" variable in C++ than it is to a #define.
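A tiny illustration (hypothetical class): a primitive or String "static final" with a constant initializer is a compile-time constant that javac inlines at use sites, which is the closest Java gets to a #define:
public class Physics {
    public static final double SPEED_OF_LIGHT = 299_792_458.0;  // inlined into using classes by javac
    public static final String UNIT = "m/s";                    // also a compile-time constant
}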
Summary
I hope this helps explain a few basic reasons why statics are problematic. If you are using a modern Java framework like Java EE or Spring, etc., you may not encounter many of these situations, but if you are working with a large body of legacy code, they can become much more frequent.
There are 2 main problems with static variables:
Thread Safety - static resources are by definition not thread-safe
Code implicitness - you do not know when a static variable is initialized, or whether it will be initialized before another static variable
Summarising a few basic advantages & disadvantages of using static methods in Java:
Advantages:
Globally accessible i.e. not tied with any particular object instance.
One instance per JVM.
Can be accessed by using the class name (no object required).
Contains a single value applicable to all instances.
They are created when their class is loaded and die when the JVM shuts down.
They don't operate on the state of a particular object.
Disadvantages:
Static members are always part of memory whether they are in use or not.
You cannot control the creation and destruction of static variables. Usually they are created when the class is loaded and destroyed when the program unloads (or when the JVM shuts down).
You can make statics thread-safe using synchronization, but it takes some extra effort.
If one thread changes the value of a static variable, that can possibly break the functionality of other threads.
You must understand "static" before using it.
You cannot override static methods.
Serialization doesn't work well with them.
They don't participate in runtime polymorphism.
There is a memory issue (to some extent, but not much I guess) if a large number of static variables/methods are used, because they will not be garbage collected until the program ends.
Static methods are hard to test too.
Static variables are generally considered bad because they represent global state and are therefore much more difficult to reason about. In particular, they break the assumptions of object-oriented programming. In object-oriented programming, each object has its own state, represented by instance (non-static) variables. Static variables represent state across instances which can be much more difficult to unit test. This is mainly because it is more difficult to isolate changes to static variables to a single test.
That being said, it is important to make a distinction between regular static variables (generally considered bad), and final static variables (AKA constants; not so bad).
Since no one* has mentioned it: concurrency. Static variables can surprise you if you have multiple threads reading and writing to the static variable. This is common in web applications (e.g., ASP.NET) and it can cause some rather maddening bugs. For example, if you have a static variable that is updated by a page, and the page is requested by two people at "nearly the same time", one user may get the result expected by the other user, or worse.
statics reduce the inter-dependencies on the other parts of the code. They can act as perfect state holders
I hope you're prepared to use locks and deal with contention.
*Actually, Preet Sangha mentioned it.
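A minimal sketch of that kind of race (the class is hypothetical):
public class PageCounter {
    static int hits;                             // shared by every thread in the JVM

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> {
            for (int i = 0; i < 100_000; i++) {
                hits++;                          // read-modify-write: not atomic
            }
        };
        Thread a = new Thread(task);
        Thread b = new Thread(task);
        a.start(); b.start();
        a.join(); b.join();
        System.out.println(hits);                // usually prints less than 200000
    }
}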
if I had to make 10,000 calls to a function within a class, I would be glad to make the method static and use a straightforward class.methodCall() on it instead of cluttering the memory with 10,000 instances of the class, Right?
You have to balance the need for encapsulating data into an object with a state, versus the need of simply computing the result of a function on some data.
Moreover statics reduce the inter-dependencies on the other parts of the code.
So does encapsulation. In large applications, statics tend to produce spaghetti code and don't easily allow refactoring or testing.
The other answers also provide good reasons against excessive use of statics.
In my opinion it's hardly ever about performance, it's about design. I don't consider the use of static methods wrong, as opposed to the use of static variables (but I guess you are actually talking about method calls).
It's simply about how to isolate logic and give it a good place. Sometimes that justifies using static methods, of which java.lang.Math is a good example. I think when you name most of your classes XxxUtil or XxxHelper, you'd better reconsider your design.
I have just summarized some of the points made in the answers. If you find anything wrong please feel free to correct it.
Scaling: We have exactly one instance of a static variable per JVM. Suppose we are developing a library management system and we decide to put the name of the book in a static variable, as there is only one per book. But if the system grows and we start using multiple JVMs, then we don't have a way to figure out which book we are dealing with.
Thread-safety: Both instance variables and static variables need to be controlled when used in a multi-threaded environment. But an instance variable does not need protection unless it is explicitly shared between threads, whereas a static variable is always shared by all the threads in the process.
Testing: Though testable design does not equal good design, we will rarely observe a good design that is not testable. Static variables represent global state, and it gets very difficult to test them.
Reasoning about state: If I create a new instance of a class, then I can reason about the state of this instance, but if it has static variables then it could be in any state. Why? Because it is possible that the static variable has been modified by some different instance, as a static variable is shared across instances.
Serialization: Serialization also does not work well with them.
Creation and destruction: The creation and destruction of static variables cannot be controlled. Usually they are created and destroyed at program loading and unloading time. This means they are bad for memory management and also add to the initialization time at startup.
But what if we really need them?
But sometimes we may have a genuine need for them. If we really feel the need for many static variables that are shared across the application, then one option is to make use of the Singleton design pattern, which will hold all these variables. Or we can create some object which will hold these variables and can be passed around.
Also, if the static variable is marked final it becomes a constant, and the value assigned to it once cannot be changed. That saves us from the problems we face due to its mutability - though note that a final reference to a mutable object (a collection, say) can still have its contents changed.
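A small sketch of that caveat (the class and role lists are hypothetical):
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class Config {
    static final List<String> ROLES = new ArrayList<>();    // final reference, but still mutable state

    static final List<String> SAFE_ROLES =
            Collections.unmodifiableList(Arrays.asList("ADMIN", "USER")); // behaves like a real constant
}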
Seems to me that you're asking about static variables but you also point out static methods in your examples.
Static variables are not evil - they have their place as global values, like constants, in most cases combined with the final modifier; but, as has been said, don't overuse them.
Static methods, a.k.a. utility methods: it isn't generally bad practice to use them, but a major concern is that they might obstruct testing.
As an example of a great Java project that uses a lot of statics and does it the right way, please look at the Play! framework. There is also a discussion about it on SO.
Static variables/methods combined with static imports are also widely used in libraries that facilitate declarative programming in Java, like Make It Easy or Hamcrest. It wouldn't be possible without a lot of static variables and methods.
So static variables (and methods) are good but use them wisely!
Most importantly, static variables create problems with the safety of data (it can be changed at any time, by anyone, with direct access and no object, etc.).
For further info read this
Thanks.
It might be suggested that in most cases where you use a static variable, you really want to be using the singleton pattern.
The problem with global states is that sometimes what makes sense as global in a simpler context, needs to be a bit more flexible in a practical context, and this is where the singleton pattern becomes useful.
Yet another reason: fragility.
If you have a class, most people expect to be able to create it and use it at will.
You can document it's not the case, or protect against it (singleton/factory pattern) - but that's extra work, and therefore an additional cost.
Even then, in a big company, chances are someone will try at some point to use your class without fully paying attention to all the nice comments or the factory.
If you're using static variables a lot, that will break. Bugs are expensive.
Between a .0001% performance improvement and robustness to change by potentially clueless developers, in a lot of cases robustness is the better choice.
I find static variables more convenient to use. And I presume that they are efficient too (Please correct me if I am wrong) because if I had to make 10,000 calls to a function within a class, I would be glad to make the method static and use a straightforward class.methodCall() on it instead of cluttering the memory with 10,000 instances of the class, Right?
I see what you think, but a simple Singleton pattern will do the same without having to instantiate 10 000 objects.
Static methods can be used, but only for functions that are related to the object domain and do not need or use the internal properties of the object.
ex:
public class WaterContainer {
    private int size;
    private int brand;

    // Utility conversions: they don't read or write instance state, so static fits here.
    public static int convertToGallon(int liters) {
        return (int) Math.round(liters * 0.264172);
    }

    public static int convertToLiters(int gallon) {
        return (int) Math.round(gallon * 3.78541);
    }
}
The issue of 'statics being evil' is really an issue of global state. The appropriate time for a variable to be static is when it never has more than one state; i.e. tools that should be accessible by the entire framework and always return the same results for the same method calls are never 'evil' as statics. As to your comment:
I find static variables more convenient to use. And I presume that they are efficient too
Statics are the ideal and efficient choice for variables/classes that do not ever change.
The problem with global state is the inherent inconsistency that it can create. Documentation about unit tests often addresses this issue, since any time there is global state that can be accessed by multiple unrelated objects, your unit tests will be incomplete and not 'unit' grained. As mentioned in this article about global state and singletons, if objects A and B are unrelated (as in, one is not expressly given a reference to the other), then A should not be able to affect the state of B.
There are some exceptions to the ban on global state in good code, such as the clock. Time is global, and - in some sense - it changes the state of objects without having a coded relationship.
My $.02 is that several of these answers are confusing the issue; rather than saying "statics are bad", I think it's better to talk about scoping and instances.
What I would say is that a static is a "class" variable - it represents a value that is shared across all instances of that class. Typically it should be scoped that way as well (protected or private to the class and its instances).
If you plan to put class-level behavior around it and expose it to other code, then a singleton may be a better solution to support changes in the future (as @Jessica suggested). This is because you can use interfaces at the instance/singleton level in ways that you cannot use at the class level - in particular inheritance.
Some thoughts on why I think some of the aspects in other answers are not core to the question...
Statics are not "global". In Java scoping is controlled separately from static/instance.
Concurrency is no less dangerous for statics than instance methods. It's still state that needs to be protected. Sure you may have 1000 instances with an instance variable each and only one static variable, but if the code accessing either isn't written in a thread-safe way you are still screwed - it just may take a little longer for you to realize it.
Managing life cycle is an interesting argument, but I think it's a less important one. I don't see why it's any harder to manage a pair of class methods like init()/clear() than the creation and destruction of a singleton instance. In fact, some might say a singleton is a little more complicated due to GC.
PS, in terms of Smalltalk, many of its dialects do have class variables, but in Smalltalk classes are actually instances of a Metaclass, so they really are variables on the Metaclass instance. Still, I would apply the same rule of thumb. If they are being used for shared state across instances, then OK. If they are supporting public functionality, you should look at a Singleton. Sigh, I sure do miss Smalltalk....
There are two main questions in your post.
First, about static variables.
Static variables are completely unnecessary and their use can be avoided easily. In OOP languages in general, and in Java particularly, object references are passed to functions by value - that is to say, if you pass an object to a function, you are effectively passing a pointer to the object - so you don't need to define static variables, since you can pass a reference to the object to any scope that needs this information. Even if this implies that you will fill your memory with pointers, it will not necessarily mean poor performance, because modern memory paging systems are optimized to handle this, and they will keep in memory the pages referenced by the pointers you passed to the new scope; usage of static variables may cause the system to load the memory page where they are stored when they need to be accessed (this will happen if the page has not been accessed in a long time). A good practice is to put all that static stuff together in some little "configuration classes"; this will ensure the system puts it all in the same memory page.
Second, about static methods.
Static methods are not so bad, but they can quickly reduce performance. For example, think about a method that compares two objects of a class and returns a value indicating which of the objects is bigger (a typical comparison method). This method can be static or not, but when invoking it the non-static form will be more efficient, since it only has to resolve two references (one for each object) versus the three references the static version of the same method has to resolve (one for the class plus two, one for each object). But as I said, this is not so bad; if we take a look at the Math class, we can find a lot of math functions defined as static methods. This is really more efficient than putting all these methods in the class defining the numbers, because most of them are rarely used, and including all of them in the number class would cause the class to be very complex and consume a lot of resources unnecessarily.
In conclusion: avoid the use of static variables and find the correct performance balance when dealing with static or non-static methods.
PS: Sorry for my english.
There's nothing wrong with static variables per se. It's just the Java syntax that's broken. Each Java class actually defines two structures- a singleton object which encapsulates static variables, and an instance. Defining both in the same source block is pure evil, and results in a code that's hard to read. Scala did that right.
Everything can have its purpose: if you have a bunch of threads that need to share/cache data, and all memory is accessible (i.e. you don't split into contexts within one JVM), then a static is the best choice. Of course you can force just one instance instead, but why?
I find some of the comments in this thread evil, not the statics ;)
Static variables are neither good nor evil. They represent attributes that describe the whole class and not a particular instance. If you need to have a counter for all the instances of a certain class, a static variable would be the right place to hold the value.
Problems appear when you try to use static variables for holding instance related values.
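A sketch of such an instance counter (Connection is a hypothetical class; AtomicInteger keeps the class-wide count thread-safe):
import java.util.concurrent.atomic.AtomicInteger;

public class Connection {
    private static final AtomicInteger OPEN_COUNT = new AtomicInteger(); // describes the class, not one instance

    public Connection() {
        OPEN_COUNT.incrementAndGet();
    }

    public static int openCount() {
        return OPEN_COUNT.get();
    }
}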
a) Reason about programs.
If you have a small- to mid-size program where the static variable Global.foo is accessed, the access normally comes from nowhere - there is no path, and therefore no timeline, showing how the variable got to the place where it is used. Now how do I know who set it to its actual value? How do I know what happens if I modify it right now? I have to grep over the whole source to collect all accesses to know what is going on.
If you know how you use it, because you just wrote the code, the problem is invisible, but if you try to understand foreign code, you will understand.
b) Do you really only need one?
Static variables often prevent multiple programs of the same kind from running in the same JVM with different values. You often don't foresee usages where more than one instance of your program is useful, but if it evolves, or if it is useful for others, they might experience situations where they would like to start more than one instance of your program.
Only more or less useless code, which will not be used by many people over a longer time in an intensive way, might go well with static variables.
All the answers above show why statics are bad. The reason they are evil is because it gives the false impression that you are writing object oriented code, when in fact you are not.
That is just plain evil.
There are plenty of good answers here, adding to it,
Memory:
Static variables live as long as their class loader lives (in general, until the VM dies), but this is only a concern when bulk objects/references are stored as statics.
Modularization:
Consider concepts like IoC, dependency injection, proxies, etc. All of them work against tight coupling/static implementations.
Other cons: thread safety, testability.
I've played with statics a lot and may I give you a slightly different answer--or maybe a slightly different way to look at it?
When I've used statics in a class (Members and methods both) I eventually started to notice that my class is actually two classes sharing responsibility--there is the "Static" part which acts a lot like a singleton and there is the non-static part (a normal class). As far as I know you can always separate those two classes completely by just selecting all the statics for one class and non-statics for the other.
This used to happen a lot when I had a static collection inside a class holding instances of the class, and some static methods to manage the collection. Once you think about it, it's obvious that your class is not doing "just one thing"; it's being a collection and also doing something completely different.
Now, let's refactor the problem a little: If you split your class up into one class where everything is static and another which is just a "Normal Class" and forget about the "Normal Class" then your question becomes purely Static class vs Singleton which is addressed in length here (and probably a dozen other questions).
Static fields are de facto GC roots (see the How Garbage Collection Works section earlier in this chapter), which means they are never garbage-collected! For convenience alone, static fields and collections are often used to hold caches or share state across threads. Mutable static fields need to be cleaned up explicitly. If the developer does not consider every possibility (a near certainty), the cleanup will not take place, resulting in a memory leak. This sort of careless programming means that static fields and collections have become the most common cause of memory leaks!
In short, never use mutable static fields—use only constants. If you think you need mutable static fields, think about it again, and then again! There's always a more appropriate technique.
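A sketch of the kind of static cache the passage warns about (the class is hypothetical):
import java.util.HashMap;
import java.util.Map;

public class ResultCache {
    private static final Map<String, byte[]> CACHE = new HashMap<>(); // a GC root: never collected

    static void put(String key, byte[] value) {
        CACHE.put(key, value);       // grows until someone remembers to clean it up
    }

    static void clear() {
        CACHE.clear();               // the explicit cleanup that is easy to forget
    }
}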
I think excessive use of global variables with the static keyword will also lead to memory leaks at some point in the application.
From my point of view a static variable should only hold read-only data or variables created by convention.
For example, we have the UI of some project, and we have a list of countries, languages, user roles, etc., and a class to organize this data. We are absolutely sure that the app will not work without these lists, so the first thing we do on app init is check them for updates and fetch them from the API (if needed). So we agree that this data is "always" present in the app. It is practically read-only data, so we don't need to take care of its state - and thinking about this case, we really don't want to have a lot of instances of this data - so it looks like a perfect candidate to be static.

Reducing potential excess of garbage when using primitive types with generics?

I have this system that allows me to create and implement type-safe variables per-player. Each variable is defined along the lines of
@Foo
public static final Bar KILLS = new Bar();
The annotation marks the variable to be picked up at runtime for registration purposes. These variables are essentially static mutator methods that adjust the underlying value for the player in question like so
KILLS.set(player, 10);
The system works great; however, each type (Object, int, String, etc.) is backed by a mutable type which is lazily loaded into the respective player's variable map. I was curious about the potential garbage issues this design may introduce as the player count scales upward. I know some things are unavoidable due to Java's autoboxing mechanism, but maybe there is some room for improvement elsewhere. I'm not entirely familiar with the Java memory model, so excuse my explanation or lack thereof.
EDIT:
To give a bit more clarity, each variable type extends a parent class and provides a type T respectively. These child classes can then override the required methods, which allows them to mutate the value of the player variable in question.
Firstly, I would like to say I am a bit confused by your mention of "garbage issues", because it is not entirely clear from your context whether you're referring to Java's implicit garbage collection of runtime memory, or whether you're concerned that this registration data is piling up somewhere else, like a database.
If you're referring to Java's runtime memory management: Java is (as you probably know) automatically garbage collected, but you can suggest (with no guarantee whatsoever) that Java collect garbage at any time with System.gc(). This is almost always a micro-optimization and provides little to no real benefit, as Java is already amazing at determining the correct time to do garbage collection without you. I wouldn't even bother worrying about it in Java.
If you're referring to an external model of storage of this data, I would be concerned. By having an abstract 'setter' method, for lack of a better term, you're exposing that memory to bugs and side effects that can ravage it. That needs to be managed on two fronts: keeping the Java implementation as bug-free as possible, and managing the memory on the external storage yourself.
Nonetheless, a situation like this is almost certainly indicating the backing code requires restructuring and more direction. This abstract setter method is (in my opinion) a code smell.

Do private functions use more or less computer resources than public ones?

Computer resources being RAM, processing power, and disk space. I am just curious, even though it is more or less by a tiny itty-bitty amount.
It could, in theory, be a hair faster in some cases. In practice, they're equally fast.
Non-static, non-private methods are invoked using the invokevirtual bytecode op. This opcode requires the JVM to dynamically resolve the actual method: if you have a call that's statically compiled to AbstractList::contains, should that resolve to ArrayList::contains, or LinkedList::contains, etc.? What's more, the compiler can't just reuse the result of this resolution for next time; what if the next time that myList.contains(val) gets called, it's on a different implementation? So, the compiler has to do at least some amount of checking, roughly per-invocation, for non-private methods.
Private methods can't be overridden, and they're invoked using invokespecial. This opcode is used for various kinds of method calls that you can resolve just once, and then never change: constructors, calls to super methods, etc. For instance, if I'm in ArrayList::add and I call super.add(value) (which doesn't happen there, but let's pretend it did), then the compiler can know for sure that this refers to AbstractList::add, since a class's super class can't ever change.
So, in very rough terms, an invokevirtual call requires resolving the method and then invoking it, while an invokespecial call doesn't require resolving the method (after the first time it's called -- you have to resolve everything at least once!).
This is covered in the JVM spec, section 5.4.3:
Resolution of the symbolic reference of one occurrence of an invokedynamic instruction does not imply that the same symbolic reference is considered resolved for any other invokedynamic instruction.
For all other instructions above, resolution of the symbolic reference of one occurrence of an instruction does imply that the same symbolic reference is considered resolved for any other non-invokedynamic instruction.
(emphasis in original)
Okay, now for the "but you won't notice the difference" part. The JVM is heavily optimized for virtual calls. It can do things like detecting that a certain site always sees an ArrayList specifically, and so "staticify" the List::add call to actually be ArrayList::add. To do this, it needs to verify that the incoming object really is the expected ArrayList, but that's very cheap; and if some earlier method call has already done that work in this method, it doesn't need to happen again. This is called a monomorphic call site: even though the code is technically polymorphic, in practice the list only has one form.
The JVM optimizes monomorphic call sites, and even bimorphic call sites (for instance, the list is always an ArrayList or a LinkedList, never anything else). Once it sees three forms, it has to use a full polymorphic dispatch, which is slower. But then again, at that point you're comparing apples to oranges: a non-private, polymorphic call to a private call that's monomorphic by definition. It's more fair to compare the two kinds of monomorphic calls (virtual and private), and in that case you'll probably find that the difference is minuscule, if it's even detectible.
I just did a quick JMH benchmark to compare (a) accessing a field directly, (b) accessing it via a public getter and (c) accessing it via a private getter. All three took the same amount of time. Of course, uber-micro benchmarks are very hard to get right, because the JIT can do such wonderful things with optimizations. Then again, that's kind of the point: The JIT does such wonderful things with optimizations that public and private methods are just as fast.
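A sketch in the spirit of that comparison, not the author's actual benchmark (class and method names are made up):
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

@State(Scope.Benchmark)
public class AccessBenchmark {
    private int value = 42;

    private int privateGetter() { return value; }
    public int publicGetter() { return value; }

    @Benchmark
    public int direct() { return value; }               // (a) field access

    @Benchmark
    public int viaPublic() { return publicGetter(); }   // (b) public getter

    @Benchmark
    public int viaPrivate() { return privateGetter(); } // (c) private getter
}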
Do private functions use more or less computer resources than public ones?
No. The JVM uses the same resources regardless of the access modifier on individual fields or methods.
But there is a far better reason to prefer private (or protected) besides resource utilization; namely, encapsulation. Also, I highly recommend you read The Developer Insight Series: Part 1 - Write Dumb Code.
I am just curious, even though it is more or less by a tiny itty-bitty amount.
While it is good to be curious ... if you start taking this kind of thing into account when you are programming, then:
you are liable to waste a lot of time looking for micro-optimizations that are not needed,
your code is liable to be unmaintainable because you are sacrificing good design principles, and
you even risk making your code less efficient* than it would be if you didn't optimize.
* - It can go like this. 1) You spend a lot of time tweaking your code to run fast on your test platform. 2) When you run on the production platform, you find that the hardware gives you different performance characteristics. 3) You upgrade the Java installation, and the new JVM's JIT compiler optimizes your code differently, or it has a bunch of new optimizations that are inhibited by your tweaks. 4) When you run your code on real-world workloads, you discover that the assumptions that were the basis for your tweaking are invalid.

For what specific reason does the Java language initialize the fields of objects automatically?

"The Java language automatically initializes fields of objects, in contrast to local variables of methods that the programmers are responsible for initializing. Given what you know of intra- and inter-procedural data flow analysis, explain why the language designers may have made these design choices."
It's obvious to me that it's to prevent a bug. However, what exactly is that bug?
Would it be to condense the possible control flow of some given method?
Could someone go into greater detail on this? I'd really appreciate the help.
It's really easy to do intra-procedural data flow, so it's really easy to check whether a field has been initialized and give warnings if it hasn't (one can write a simplistic decidable algorithm, e.g. make sure all branches of an if initialize a variable, and if one branch doesn't, fail, even if the branch is unreachable).
It's really hard to do inter-procedural data flow, so it's really hard to check whether a field of an object has ever been initialized anywhere in the code (you quickly get into undecidable territory for any reasonable approximation).
Thus Java does the former and gives compile-time errors when it detects uninitialized local variables, but doesn't do the latter and initializes an object's fields to their defaults.
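A small illustration of that asymmetry (the class is hypothetical):
public class InitExample {
    private int count;              // field: automatically initialized to 0

    int readLocal(boolean flag) {
        int local;
        if (flag) {
            local = 1;
        }
        // return local;            // compile error: local might not have been initialized
        return count;               // fine: fields always have a default value
    }
}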
It is not always the case that they are initialized. Objects can be instantiated without invoking any constructor by using reflection in combination with the class sun.misc.Unsafe or ObjectInputStream to access these classes' private native methods, or directly through JNI. These are intended for the purpose of object serialization/deserialization, and they expect the fields to be populated by the deserializer. As for why the designers would have chosen to eliminate direct access to these methods (i.e. without reflection), it stands to reason that pointers still left in memory could be used for stack-smashing or return-to-libc attacks. Clearing the memory allocated for these fields "automatically" for most programs reduces the security risk as well as the chance of errors. Also note that an attempt to read a local variable that has not been initialized results in a compile error for much the same reason.

What is the reason for these PMD rules?

DataflowAnomalyAnalysis: Found 'DD'-anomaly for variable 'variable' (lines 'n1'-'n2').
DataflowAnomalyAnalysis: Found 'DU'-anomaly for variable 'variable' (lines 'n1'-'n2').
DD and DU sound familiar...I want to say in things like testing and analysis relating to weakest pre and post conditions, but I don't remember the specifics.
NullAssignment: Assigning an Object to null is a code smell. Consider refactoring.
Wouldn't setting an object to null assist in garbage collection, if the object is a local object (not used outside of the method)? Or is that a myth?
MethodArgumentCouldBeFinal: Parameter 'param' is not assigned and could be declared final
LocalVariableCouldBeFinal: Local variable 'variable' could be declared final
Are there any advantages to using final parameters and variables?
LooseCoupling: Avoid using implementation types like 'LinkedList'; use the interface instead
If I know that I specifically need a LinkedList, why would I not use one to make my intentions explicitly clear to future developers? It's one thing to return the class that's highest up the class path that makes sense, but why would I not declare my variables to be of the strictest sense?
AvoidSynchronizedAtMethodLevel: Use block level rather than method level synchronization
What advantages does block-level synchronization have over method-level synchronization?
AvoidUsingShortType: Do not use the short type
My first languages were C and C++, but in the Java world, why should I not use the type that best describes my data?
DD and DU anomalies (if I remember correctly—I use FindBugs and the messages are a little different) refer to assigning a value to a local variable that is never read, usually because it is reassigned another value before ever being read. A typical case would be initializing some variable with null when it is declared. Don't declare the variable until it's needed.
Assigning null to a local variable in order to "assist" the garbage collector is a myth. PMD is letting you know this is just counter-productive clutter.
Specifying final on a local variable should be very useful to an optimizer, but I don't have any concrete examples of current JITs taking advantage of this hint. I have found it useful in reasoning about the correctness of my own code.
Specifying interfaces in terms of… well, interfaces is a great design practice. You can easily change implementations of the collection without impacting the caller at all. That's what interfaces are all about.
I can't think of many cases where a caller would require a LinkedList, since it doesn't expose any API that isn't declared by some interface. If the client relies on that API, it's available through the correct interface.
Block level synchronization allows the critical section to be smaller, which allows as much work to be done concurrently as possible. Perhaps more importantly, it allows the use of a lock object that is privately controlled by the enclosing object. This way, you can guarantee that no deadlock can occur. Using the instance itself as a lock, anyone can synchronize on it incorrectly, causing deadlock.
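A sketch of that private-lock idiom (the class is hypothetical):
public class SafeCounter {
    private final Object lock = new Object();   // private monitor: callers can't synchronize on it
    private int count;

    public void increment() {
        synchronized (lock) {                   // block-level: only the critical section is guarded
            count++;
        }
    }

    public int get() {
        synchronized (lock) {
            return count;
        }
    }
}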
Operands of type short are promoted to int in any operations. This rule is letting you know that this promotion is occurring, and you might as well use an int. However, using the short type can save memory, so if it is an instance member, I'd probably ignore that rule.
DataflowAnomalyAnalysis: Found 'DD'-anomaly for variable 'variable' (lines 'n1'-'n2').
DataflowAnomalyAnalysis: Found 'DU'-anomaly for variable 'variable' (lines 'n1'-'n2').
No idea.
NullAssignment: Assigning an Object to null is a code smell. Consider refactoring.
Wouldn't setting an object to null assist in garbage collection, if the object is a local object (not used outside of the method)? Or is that a myth?
Objects local to a method become eligible for garbage collection once the method returns. Setting them to null won't make any difference.
Since it would make less experienced developers wonder what that null assignment is all about, it may be considered a code smell.
MethodArgumentCouldBeFinal: Parameter 'param' is not assigned and could be declared final
LocalVariableCouldBeFinal: Local variable 'variable' could be declared final
Are there any advantages to using final parameters and variables?
It makes it clearer that the value won't change during the lifecycle of the object.
Also, if by any chance someone tries to assign a value to it, the compiler will catch this coding error at compile time.
consider this:
public void businessRule(SomeImportantArgument important) {
    if (important.xyz()) {
        doXyz();
    }
    // some fuzzy logic here
    important = new NotSoImportant();
    // add for/if's/while etc
    if (important.abc()) { // <-- bug
        burnTheHouse();
    }
}
Suppose that you're assigned to solve some mysterious bug that from time to time burns the house.
You know what the parameter was; what you don't understand is WHY the burnTheHouse method is invoked when the conditions are not met (according to your findings).
It may take you a while to find out that at some point in the middle, someone changed the reference, and that you are using another object.
Using final helps to prevent this kind of thing.
LooseCoupling: Avoid using implementation types like 'LinkedList'; use the interface instead
If I know that I specifically need a LinkedList, why would I not use one to make my intentions explicitly clear to future developers? It's one thing to return the class that's highest up the class path that makes sense, but why would I not declare my variables to be of the strictest sense?
There is no difference in this case. I would think that since you are not using LinkedList-specific functionality, the suggestion is fair.
Today, LinkedList could make sense, but by using an interface you help yourself (or others) to change it easily when it no longer does.
For small, personal projects this may not make sense at all, but since you're using an analyzer already, I guess you already care about code quality.
Also, it helps less experienced developers create good habits. [I'm not saying you're one, but the analyzer does not know you ;)]
AvoidSynchronizedAtMethodLevel: Use block level rather than method level synchronization
What advantages does block-level synchronization have over method-level synchronization?
The smaller the synchronized section the better. That's it.
Also, if you synchronize at the method level you'll lock on the whole object. When you synchronize at the block level, you just synchronize that specific section; in some situations that's what you need.
AvoidUsingShortType: Do not use the short type
My first languages were C and C++, but in the Java world, why should I not use the type that best describes my data?
I've never heard of this, and I agree with you :) I've never used short, though.
My guess is that by not using it, you'll be helping yourself to upgrade to int seamlessly.
Code smells are more about code quality than performance optimizations. So the advice is given for less experienced programmers and to avoid pitfalls, rather than to improve program speed.
This way, you could save a lot of time and frustration when trying to change the code to fit a better design.
If the advice doesn't make sense, just ignore it; remember, you are the developer in charge, and the tool is just that - a tool. If something goes wrong, you can't blame the tool, right?
Just a note on the final question.
Putting "final" on a variable results in it being assignable only once. This does not necessarily mean that it is easier to write, but it most certainly means that it is easier to read for a future maintainer.
Please consider these points:
Any variable marked final can be immediately classified as "will not change value while I'm watching".
By implication, this means that if all variables which will not change are marked final, then the variables NOT marked final actually WILL change.
This means that, already when reading through the declarations, you can see which variables to look out for, as they may change value during the code, and the maintainer can spend his/her effort better as the code is more readable.
Wouldn't setting an object to null assist in garbage collection, if the object is a local object (not used outside of the method)? Or is that a myth?
The only thing it does is make it possible for the object to be GCd before the method's end, which is rarely ever necessary.
Are there any advantages to using final parameters and variables?
It makes the code somewhat clearer, since you don't have to worry about the value being changed somewhere when you analyze the code. More often than not, you don't need or want to change a variable's value once it's set anyway.
If I know that I specifically need a LinkedList, why would I not use one to make my intentions explicitly clear to future developers?
Can you think of any reason why you would specifically need a LinkedList?
It's one thing to return the class that's highest up the class path that makes sense, but why would I not declare my variables to be of the strictest sense?
I don't care much about local variables or fields, but if you declare a method parameter of type LinkedList, I will hunt you down and hurt you, because it makes it impossible for me to use things like Arrays.asList() and Collections.emptyList().
What advantages does block-level synchronization have over method-level synchronization?
The biggest one is that it enables you to use a dedicated monitor object so that only those critical sections are mutually exclusive that need to be, rather than everything using the same monitor.
in the Java world, why should I not use the type that best describes my data?
Because types smaller than int are automatically promoted to int for all calculations, and you have to cast down to assign anything to them. This leads to cluttered code and quite a lot of confusion (especially when autoboxing is involved).
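A tiny illustration of that promotion (the wrapper class is just scaffolding):
public class ShortDemo {
    public static void main(String[] args) {
        short a = 1;
        short b = 2;
        // short c = a + b;          // compile error: a + b is promoted to int
        short c = (short) (a + b);   // an explicit narrowing cast is required
        System.out.println(c);
    }
}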
AvoidUsingShortType: Do not use the short type
short is 16-bit, two's complement in Java.
A short arithmetic operation with anything in the integer family other than another short will require a runtime sign-extension conversion to the larger size; operating against a floating point requires sign extension and a non-trivial conversion to IEEE-754.
I can't find proof, but with a 32-bit or 64-bit register you're no longer saving on 'processor instructions' at the bytecode level. You're parking a compact car in a semi-trailer's parking spot as far as the processor register is concerned.
If you are optimizing your project at the bytecode level, wow. Just wow. ;P
I agree on the design side with ignoring this PMD warning; just weigh accurately describing your data with a 'short' against the conversions it incurs.
In my opinion, the incurred performance hits are minuscule on most machines. Ignore the error.
What advantages does block-level synchronization have over method-level synchronization?
Synchronizing an instance method is like wrapping its body in a synchronized(this) block (for a static method it is synchronized(getClass())), and it locks the whole object.
Maybe you don't want that.
