I prefer to use a static field for instances of classes that don't store their state in fields, rather than anonymous inner classes. I think this is good practice because it reduces memory and GC usage when sort() (or a similar method) is called very often. But my colleague prefers to use anonymous inner classes in this case, saying that the JIT will optimize it.
class MyClass {
    // no state fields in this class
    /*access modifier*/ static final Comparator<MyClass> comparator = new Comparator<MyClass>() {
        @Override
        public int compare(MyClass o1, MyClass o2) {
            // comparing logic
        }
    };
}
Usage example (I prefer):
List<MyClass> list = ...;
Collections.sort(list, MyClass.comparator);
Usage example (what my colleague prefers):
List<MyClass> list = ...;
Collections.sort(list, new Comparator<MyClass>() {
    @Override
    public int compare(MyClass o1, MyClass o2) {
        // comparing logic
    }
});
1. Are anonymous inner classes optimized in OpenJDK?
2. What is the good practice for this case?
I think this is good practice because it reduces memory and GC usage when sort() (or a similar method) is called very often.
Well, it's the other way round. If you are worried about memory, the static field will stay in memory until the class is unloaded.
However, the concern here is more about readability than memory or performance. If you find yourself using a Comparator instance maybe 2-3 times or more, it's better to store it in a field to avoid repeating the code. Even better, mark the field final. If you are going to use it only once, there is no point in storing it as a static field.
But my colleague prefers to use anonymous inner classes in this case, saying that the JIT will optimize it.
I don't understand what kind of optimization your colleague is talking about; you should ask him/her for further clarification. IMO this is just premature optimization, and you really shouldn't be bothered.
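To make the allocation point concrete, here is a minimal, self-contained sketch (the String comparator is just an illustration, not the question's code) showing that the inline version produces a new Comparator object on every call, while the static field is a single shared instance:
import java.util.Comparator;

public class AllocationDemo {
    // One shared instance for the lifetime of the class.
    static final Comparator<String> SHARED = new Comparator<String>() {
        @Override
        public int compare(String a, String b) {
            return a.compareTo(b);
        }
    };

    // A fresh object is allocated on every call; the JIT's escape analysis
    // may eliminate some of these allocations, but that is not guaranteed.
    static Comparator<String> inline() {
        return new Comparator<String>() {
            @Override
            public int compare(String a, String b) {
                return a.compareTo(b);
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(SHARED == SHARED);     // true: a single instance
        System.out.println(inline() == inline()); // false: two allocations
    }
}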
Related
This is a bit of an interesting question but I wanted to know everyone's thoughts on this design pattern.
public class MyThreadedMap {
    private ConcurrentHashMap<Integer, Object> map;
    ...

    public class Wrapper {
        public Object get(int index) {
            return map.get(index);
        }
    }
}
At this point multiple threads will have their own instance of Wrapper and would be accessing the map with wrapper.get(index).
I found that the performance difference between having the wrapper and not having it is small: the wrapper helps slightly. When I place synchronized on the get method, there is a serious performance hit.
What exactly is happening here? When an inner class is instantiated, am I creating a copy of that get method for each instance? Would it be best to just leave the wrapper out, since there is no real performance gain?
ConcurrentHashMap has fancy ways of minimizing synchronization overhead. When you synchronize the get method, it imposes normal synchronization overhead, thus the performance hit.
If there is no other code in the Wrapper class, I would just leave it out as it doesn't appear to add anything.
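As a sketch of the contrast (class and method names here are made up for illustration): ConcurrentHashMap.get does not lock readers, whereas a synchronized wrapper method forces every reader to acquire the wrapper's monitor first:
import java.util.concurrent.ConcurrentHashMap;

public class MapAccess {
    private final ConcurrentHashMap<Integer, Object> map =
            new ConcurrentHashMap<Integer, Object>();

    // Readers proceed concurrently; ConcurrentHashMap manages its own safety.
    public Object fastGet(int index) {
        return map.get(index);
    }

    // Every caller must first acquire this object's monitor, so concurrent
    // readers are serialized; this is the performance hit observed above.
    public synchronized Object slowGet(int index) {
        return map.get(index);
    }
}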
During code review, a colleague of mine looked at this piece of code:
public List<Item> extractItems(List<Object[]> results) {
    return Lists.transform(results, new Function<Object[], Item>() {
        @Override
        public Item apply(Object[] values) {
            ...
        }
    });
}
He suggests changing it to this:
public List<Item> extractItems(List<Object[]> results) {
    return Lists.transform(results, getTransformer());
}

private Function<Object[], Item> transformer;

private Function<Object[], Item> getTransformer() {
    if (transformer == null) {
        transformer = new Function<Object[], Item>() {
            @Override
            public Item apply(Object[] values) {
                ...
            }
        };
    }
    return transformer;
}
So we are looking at taking the new Function() construction and moving it over to a member variable so it can be re-used on the next call.
While I understand his logic and reasoning, I'm not sold that I should do this for every object I create that follows this pattern. It seems like there would be some good reasons not to, but I'm not sure.
What are your thoughts? Should we always cache repeatedly created objects like this?
UPDATE
The Function is a Google Guava thing and holds no state. A couple of people have pointed out the non-thread-safe aspect of this change, which is perfectly valid but isn't actually a concern here. I'm asking more about the practice of constructing vs. caching small objects: which is better?
Your colleague's proposal is not thread safe. It also reeks of premature optimization. Is the construction of a Function object a known (tested) CPU bottleneck? If not, there's no reason to do this. It's not a memory problem - you're not keeping a reference, so GC will sweep it away, probably from Eden.
As already said, it's all premature optimization. The gain is probably not measurable and the whole story should be forgotten.
However, with the transformer being stateless, I'd go for it for readability reasons: anonymous functions passed as arguments rather pollute the code.
Just drop the lazy initialization - you're gonna use the transformer whenever you use the class, right? (*) So put it in a static final field and maybe you can reuse it somewhere else.
(*) And even if not, creating and holding a cheap object during the whole application lifetime doesn't matter.
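A sketch of that suggestion, assuming Guava's Function and the Item type from the question (the apply body is a placeholder):
import com.google.common.base.Function;
import com.google.common.collect.Lists;
import java.util.List;

public class ItemExtractor {
    // One stateless, shareable instance for the lifetime of the class.
    private static final Function<Object[], Item> TRANSFORMER =
            new Function<Object[], Item>() {
                @Override
                public Item apply(Object[] values) {
                    return new Item(values); // placeholder for the real mapping
                }
            };

    public List<Item> extractItems(List<Object[]> results) {
        return Lists.transform(results, TRANSFORMER);
    }
}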
Given the following two code options, is there any performance benefit (over a very large scale or a long period of time) of the second over the first?
Option 1
private Map<Long, Animal> animals = ...;

public Map<Long, Animal> getAnimals() {
    return animals;
}

public void useAnimals() {
    for (int i = 0; i < SOME_LARGE_NUMBER; i++) {
        Animal animal = getAnimals().get(id);
    }
    // Many many calls to getAnimals() are made...
}
Option 2 - no getter
private Map<Long, Animal> animals = ...;

public void useAnimals() {
    for (int i = 0; i < SOME_NUMBER; i++) {
        Animal animal = animals.get(id);
    }
    // No method calls made
}
If it is bad for performance, why, and how should I determine whether it is worth mitigating?
And, would storing the result of getAnimals() as a local provide a benefit...
if SOME_NUMBER is hundreds or thousands?
if SOME_NUMBER is only in the order of magnitude of 10?
Note: I previously said "encapsulation". I changed it to "getter" because the purpose is actually not that the field can't be modified but that it can't be reassigned. The encapsulation is simply to remove responsibility for assignment from subclasses.
Most likely the JVM will inline the getAnimals() invocation in a tight loop, effectively falling back to Option 1. So don't bother; this is really a micro (nano?) optimization.
Another thing is migrating from field access to a local variable. This sounds good, since instead of going through the this reference every time, you always have a reference on the stack (two memory accesses vs. one). However, I believe (correct me if I'm wrong) that since animals is private and non-volatile, the JVM will again perform this optimization for you at runtime.
The second snippet is more encapsulated than the first one. The first one gives access to the internal map to anyone, whereas the second keeps it encapsulated in the class.
Both will lead to comparable performance.
EDIT: since you change the question, I'll also change the answer.
If you go through a getter, and the getter is not final, it means that subclasses may return another map than the one you hold in the class. Choose whether you want your method to operate on the subclass's map or on the class's map. Both could be acceptable, depending on the context.
Anyway, suppose your subclass always makes a defensive copy of the map, you'll end up having many copies if you don't cache the result of the getter in a local variable of useAnimals. It might be required to always work on the latest value of the subclass's map, but I doubt it's the case.
If there is no subclass, or the subclass doesn't override the method, or override it by always returning the same map, both will lead to comparable performance and you shouldn't care about it.
Have you profiled this to see if it matters? For a modern JIT, I would guess it would get optimized away, especially if animals were marked final, but there is nothing stopping you from testing this yourself.
Either way, I am 100% sure this would NEVER be your bottleneck in an application.
Well, I don't think the JVM will inline the method call, so it may affect performance. The better way is to create a local variable and assign the class field animals to it.
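A sketch of that suggestion applied to Option 1 (SOME_LARGE_NUMBER and id are the placeholders from the question):
public void useAnimals() {
    // Read the field (or call the getter) once; the loop then works
    // from a local reference on the stack.
    Map<Long, Animal> localAnimals = getAnimals();
    for (int i = 0; i < SOME_LARGE_NUMBER; i++) {
        Animal animal = localAnimals.get(id);
    }
}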
I have this class:
public class MyClass {
    public void initialize(Collection<String> data) {
        this.data = data; // <-- Bad!
    }

    private Collection<String> data;
}
This is obviously bad style, because I'm introducing a shared mutable state. What's the preferred way to handle this?
Ignore it?
Clone the collection?
...?
EDIT: To clarify why this is bad, imagine this:
MyClass myObject = new MyClass();
List<String> data = new ArrayList<String>();
myObject.initialize(data); // myObject.data.size() == 0
data.add("Test"); // myObject.data.size() == 1
Just storing the reference opens a way to inject data into the private field myObject.data, even though it should be completely private.
Depending on the nature of MyClass this could have serious impacts.
The best way is to deep clone the parameter. For performance reasons, this is usually not possible. On top of that, not all objects can be cloned, so deep copying might throw exceptions and cause all kinds of headaches.
The next best way would be a "copy-on-write" clone. There is no support for this in the Java runtime.
If you think it's possible that someone will mutate the collection, do a shallow copy using the copy constructor:
this.data = new HashSet<String>(data);
This will solve your problem (since String is immutable), but it will fail when the element type in the set is mutable.
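Applied to the question's example, a small sketch showing the effect of the copy (the size() helper is added only for this demo):
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;

public class CopyDemo {
    private Collection<String> data;

    public void initialize(Collection<String> data) {
        this.data = new HashSet<String>(data); // shallow defensive copy
    }

    public int size() {
        return data.size(); // helper added only for this demo
    }

    public static void main(String[] args) {
        CopyDemo myObject = new CopyDemo();
        List<String> data = new ArrayList<String>();
        myObject.initialize(data);
        data.add("Test");                    // mutates only the caller's list
        System.out.println(myObject.size()); // prints 0: the field is decoupled
    }
}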
Another solution is to always make the sets immutable as soon as you store them somewhere:
Set<String> set = ...
...build the set...
// Freeze the set
set = Collections.unmodifiableSet(set);
// Now you can safely pass it elsewhere
obj.setData (set);
The idea here is to turn collections into "value objects" as soon as possible. Anyone who wants to change the collection must copy it, change it, and then save it back.
Within a class, you can keep the set mutable and wrap it in the getter (which you should do anyway).
Problems with this approach: performance (but it's probably not as bad as you'd expect) and discipline (it breaks if you forget it somewhere).
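A sketch of that getter wrapping (the class and accessor names are illustrative):
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public class Holder {
    private final Set<String> data = new HashSet<String>();

    // Inside the class the set stays mutable; callers only ever get a
    // read-only view, so they cannot modify it behind your back.
    public Set<String> getData() {
        return Collections.unmodifiableSet(data);
    }
}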
Null check (if you want to restrict null)
Either defensive copy (if you don't want shared state)
or as you did (if a live view on data is useful)
Depends heavily on your requirements.
Edited:
Ignoring it should not be an option. Failing silently is, well... a debugging nightmare.
public class Foo {
    private final Collection<String> collection = new ArrayList<String>();

    public void initialise(final Collection<String> collection) {
        this.collection.addAll(collection);
    }
}
Sorry for not addressing your concern directly, but I would never directly pass a Collection to a setXxx() bean setter method. Instead, I would do:
private final List<MyClass> theList;

public void addXxx(MyClass item) { ... }
public void removeXxx(MyClass item) { ... } // or by index

public Iterator<MyClass> iterateXxx() {
    return Collections.unmodifiableList(theList).iterator();
}
I would go for defensive copying / deep cloning only if I were sure there would be no side effects from using it, and as for speed, I wouldn't concern myself with it, since in business applications reliability has ten times more priority than speed. ;-)
One idea would be to pass the data as a String array and create the Set inside MyClass. Of course, MyClass should test that the input data is valid. I believe this is good practice anyway.
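A minimal sketch of that idea, where a plain non-null check stands in for whatever the real validity test would be:
import java.util.HashSet;
import java.util.Set;

public class MyClass {
    private Set<String> data;

    public void initialize(String[] input) {
        Set<String> set = new HashSet<String>();
        for (String s : input) {
            if (s == null) { // stand-in for the real validity check
                throw new IllegalArgumentException("null entry in data");
            }
            set.add(s);
        }
        // The set is built here, so no caller holds a reference to it.
        this.data = set;
    }
}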
If both the caller of MyClass and MyClass itself actually work with a Set<String>, then you could consider cloning the collection. The Set however needs to be constructed somehow. I would prefer to move this responsibility to MyClass.
Suppose you're maintaining an API that was originally released years ago (before Java gained enum support), and it defines a class with enumeration values as ints:
public class VitaminType {
    public static final int RETINOL = 0;
    public static final int THIAMIN = 1;
    public static final int RIBOFLAVIN = 2;
}
Over the years the API has evolved and gained Java 5-specific features (generified interfaces, etc). Now you're about to add a new enumeration:
public enum NutrientType {
    AMINO_ACID, SATURATED_FAT, UNSATURATED_FAT, CARBOHYDRATE;
}
The 'old style' int-enum pattern has no type safety, no possibility of adding behaviour or data, etc, but it's published and in use. I'm concerned that mixing two styles of enumeration is inconsistent for users of the API.
I see three possible approaches:
Give up and define the new enum (NutrientType in my fictitious example) as a series of ints like the VitaminType class. You get consistency but you're not taking advantage of type safety and other modern features.
Decide to live with an inconsistency in a published API: keep VitaminType around as is, and add NutrientType as an enum. Methods that take a VitaminType are still declared as taking an int, methods that take a NutrientType are declared as taking such.
Deprecate the VitaminType class and introduce a new VitaminType2 enum. Define the new NutrientType as an enum. Congratulations, for the next 2-3 years until you can kill the deprecated type, you're going to deal with deprecated versions of every single method that took a VitaminType as an int and adding a new foo(VitaminType2 v) version of each. You also need to write tests for each deprecated foo(int v) method as well as its corresponding foo(VitaminType2 v) method, so you just multiplied your QA effort.
What is the best approach?
How likely is it that the API consumers are going to confuse VitaminType with NutrientType? If it is unlikely, then maybe it is better to maintain API design consistency, especially if the user base is established and you want to minimize the delta of work/learning required by customers. If confusion is likely, then NutrientType should probably become an enum.
This needn't be a wholesale overnight change; for example, you could expose the old int values via the enum:
public enum Vitamin {
    RETINOL(0), THIAMIN(1), RIBOFLAVIN(2);

    private final int intValue;

    Vitamin(int n) {
        intValue = n;
    }

    public int getVitaminType() {
        return intValue;
    }

    public static Vitamin asVitamin(int intValue) {
        for (Vitamin vitamin : Vitamin.values()) {
            if (intValue == vitamin.getVitaminType()) {
                return vitamin;
            }
        }
        throw new IllegalArgumentException();
    }
}

/** Use foo.Vitamin instead */
@Deprecated
public class VitaminType {
    public static final int RETINOL = Vitamin.RETINOL.getVitaminType();
    public static final int THIAMIN = Vitamin.THIAMIN.getVitaminType();
    public static final int RIBOFLAVIN = Vitamin.RIBOFLAVIN.getVitaminType();
}
This allows you to update the API and gives you some control over when to deprecate the old type and scheduling the switch-over in any code that relies on the old type internally.
Some care is required to keep the literal values in sync with those that may have been in-lined with old consumer code.
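For example, a call site bridging the two worlds might look like this sketch (assuming the Vitamin and VitaminType definitions above):
public class BridgeDemo {
    public static void main(String[] args) {
        // Old int constant -> new enum, and back again at the boundary.
        Vitamin v = Vitamin.asVitamin(VitaminType.THIAMIN);
        int legacyValue = v.getVitaminType();
        System.out.println(v + " <-> " + legacyValue); // THIAMIN <-> 1
    }
}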
My personal opinion is that it's probably not worth the effort of trying to convert. For one thing, the "public static final int" idiom isn't going away any time soon, given that it's sprinkled liberally all over the JDK. For another, tracking down usages of the original ints is likely to be really unpleasant, given that your classes will compile away the reference, so you're likely not to know you've broken anything until it's too late. By which I mean:
class A {
    public static final int MY_CONSTANT = 1;
}

class B {
    ....
    i += A.MY_CONSTANT;
}
gets compiled into
i += 1;
So if you rewrite A, you may never realize that B is broken until you recompile B later.
It's a pretty well-known idiom, probably not so terrible to leave it in, and certainly better than the alternative.
There is a rumor that the creator of "make" realized that the syntax of Makefiles was bad, but felt that he couldn't change it because he already had 10 users.
Backwards compatibility at all costs, even if it hurts your customers, is a bad thing. SO can't really give you a definitive answer on what to do in your case, but be sure to consider the cost to your users over the long term.
Also think about ways you can refactor the core of your code while keeping the old integer-based enums only at the outer layer.
Wait for the next major revision, change everything to enum and provide a script (sed, perl, Java, Groovy, ...) to convert existing source code to use the new syntax.
Obviously this has two drawbacks:
No binary compatibility. How important this is depends on the use cases, but it can be acceptable in the case of a new major release.
Users have to do some work. If the work is simple enough, then this too may be acceptable.
In the meantime, add new types as enums and keep old types as ints.
The best option would be to just fix the published versions, if possible. In my opinion, consistency would be the best solution, so you would need to do some refactoring. I personally don't like deprecated things, because they get in the way. You might be able to wait until a bigger version release and use those ints until then, then refactor everything in one big project. If that is not possible, you might consider yourself stuck with the ints, unless you create some kind of wrapper or something.
If nothing helps but you still evolve the code, you end up losing consistency or living with the deprecated versions. In any case, usually at some point people become fed up with old stuff if it has lost its consistency, and they create something new from scratch... So you would face that refactoring in the future no matter what.
The customer might scrap the project and buy another product if something goes wrong. Usually it is not the customer's problem whether you can afford refactoring or not; they just buy what is appropriate and usable for them. So in the end it is a tricky problem, and care needs to be taken.