Efficiency and Instantiation - Java

I understand it might be hard to answer this question without knowing the details of the problem, but I hope someone has encountered a similar situation before and can help me.
I am writing a simulator in Java with n neurons that communicate with each other during the simulation. Each of these neurons has specific parameters and properties, and I need to access and possibly manipulate their values during the simulation.
I am wondering which of the following is the "right" choice:
Storing the information in 1-D and 2-D ArrayLists - this means a lot of lookups and requires extra care to make sure the information stays linked properly.
Having one class with the fields and methods required for a neuron, and making a different instance of it for every neuron (using the constructor to provide the parameters specific to that neuron).
Basically, my question is where is the limit for making instances of a class? When does it become too many and inefficient? 100s? 1000s?
Let me know if I should explain more.
Appreciate any other suggestion as well.
Thank you.

Memory is the only real limit on the number of instances of a class; there are no other performance-related issues with having many instances (*).
However, class instances do have some additional bookkeeping stored with them, so they come with more memory overhead than arrays: expect roughly 16 bytes of object-header overhead per instance on a typical 64-bit JVM.
(*) In theory, anyway. In practice you may encounter more GC overhead, or worse performance due to cache misses, if the instances are not laid out favorably in memory compared with an array-based solution.
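To make the second option concrete, here is a minimal sketch of a per-instance neuron class. The field names and the firing rule are illustrative assumptions, not taken from the question:

```java
public class Neuron {
    private final int id;
    private final double threshold;     // firing threshold (assumed parameter)
    private double membranePotential;   // mutable per-neuron state

    public Neuron(int id, double threshold) {
        this.id = id;
        this.threshold = threshold;
        this.membranePotential = 0.0;
    }

    // Accumulate input from another neuron.
    public void receive(double input) {
        membranePotential += input;
    }

    // Whether the accumulated input has crossed the threshold.
    public boolean fires() {
        return membranePotential >= threshold;
    }

    public static void main(String[] args) {
        // Thousands of instances are fine; each costs its fields plus
        // roughly 16 bytes of object header.
        Neuron[] net = new Neuron[10_000];
        for (int i = 0; i < net.length; i++) {
            net[i] = new Neuron(i, 1.0);
        }
        net[42].receive(1.5);
        System.out.println(net[42].fires()); // prints: true
    }
}
```

Each neuron owns its own state, so there is no index bookkeeping to get wrong, which is the main practical argument for this design over parallel ArrayLists.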

Related

What is the benefit of computing age of an object?

I stumbled upon a question on Stack Overflow related to the Calendar object. Reading it led me to ask myself: what if we were able to find the age of an object, in terms of hours/days/weeks/months/years? What could we do with that? Would it be helpful at all? Would it help me analyze my program's runtime?
I guess the age of an object is easiest to relate to for singletons, since a singleton exists for the lifetime of the application: from the time the singleton is loaded, its life begins, and it ends when the application ends. But the same notion could be extended to non-singleton objects as well.
I assumed this question might be worth pursuing, but I wanted to confirm with the Stack Overflow community. Do you think we might encounter a scenario at application/system runtime where we need to compute the age of an object in the JVM or in runtime memory, and where computing it would let us make decisions for the application?
After some small research, I found that the garbage collector needs to be aware of the age of objects.
So, what I would like to know is the following:
What if we were able to find the age of an object, in terms of hours/days/weeks/months/years? What could we do with that?
Would it be helpful at all?
Would it help us analyze a program's runtime, and if so, what could we do with that?
What is a probable/possible part of an application's runtime, or a scenario, that might benefit from computing the age of an object?
Have you encountered a scenario during development/debugging/field issues where you wished you could compute the age of an object?
(Hopefully this won't get downvoted; I just wanted to ask the Stack Overflow community out of curiosity. I apologize that this is not a programming exercise that fits here; I have added tags to bring context to this question.)
I don't know what the benefit would be. I suppose, it might help with certain performance tuning and debugging problems.
However, there is no practical way to do this (in standard Java) unless the object is coded to record its own age. And that would have significant time and space costs.
After some small research, I found that the garbage collector needs to be aware of the age of objects.
Actually, a generational GC only needs the approximate age. It can determine that based on where the object currently lives; i.e. what "space" it is currently allocated in.
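As noted above, the only portable way to get a wall-clock age in standard Java is to have the object record its own creation time. A minimal sketch (the class name is illustrative):

```java
import java.time.Duration;
import java.time.Instant;

public class AgeTracked {
    // The object records its own creation time; nothing in the JVM
    // does this for you in standard Java.
    private final Instant createdAt = Instant.now();

    public Duration age() {
        return Duration.between(createdAt, Instant.now());
    }

    public static void main(String[] args) throws InterruptedException {
        AgeTracked obj = new AgeTracked();
        Thread.sleep(50);
        System.out.println(obj.age().toMillis() >= 50); // prints: true
    }
}
```

This is exactly the "coded to record its own age" approach, with its time and space cost: one extra field per instance and a clock read per construction.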
For the Java language and mainstream implementations ...
What if we were able to find the age of an object?
We can't, in general. This makes the remaining questions "moot", generally speaking.
And what could we do with that?
Moot.
Would it be helpful at all?
Moot.
Would it help us analyze a program's runtime, and if so, what could we do with that?
In some cases, possibly yes. However, the question is moot.
What is a probable/possible part of an application's runtime, or a scenario, that might benefit from computing the age of an object?
Possibly performance tuning. Possibly debugging. However, the question is moot.
Have you encountered a scenario during development/debugging/field issues where you wished you could compute the age of an object?
Possibly. (I've been programming for 40+ years, and it is hard to remember everything I've done in that time period.) However I didn't stress about it.
Also, I've never considered that benefits of (hypothetically) recording the lifetime of each and every object would be worth the cost.
But if you are designing / implementing a new experimental programming language, feel free to include this in the feature list. If it turns out to be a useful feature1, there should be some research papers in it.
1 - and (cynically) even if not :-)

Most efficient way to store 5 attributes

So I'm trying to store 5 attributes of an object, which are 5 different integers.
What would be the best way to store these? I was thinking of arrays, but arrays aren't flexible. I also need to be able to retrieve all 5 attributes, so arrays probably won't work well.
Here's some background if it helps: I am currently making a game similar to Terraria (or Minecraft in 2D).
I wanted to store where the object is on the map (x, y), where it is on the screen for the visible part of the map (x, y), and what type of object it is.
import java.awt.Point;

public class MyClass {
    private Point pointOnMap;
    private Point pointOnScreen;
    // ...
}
The Point class binds x & y values into a single object (which makes sense) and gives you useful, basic methods it sounds like you'll need, such as translate and distance. http://docs.oracle.com/javase/7/docs/api/java/awt/Point.html
First, it is not possible to predict the most efficient way to store the attributes without seeing all of your code. (And I for one don't want to :-)) Second, you haven't clearly explained what you are optimizing for. Speed? Memory usage? Minimization of GC pauses?
However, this smells of premature optimization: wasting lots of time trying to optimize the performance of something that hasn't been built, without any evidence that the performance of this part of the codebase is going to be significant.
My advice would be:
Pick a simple design and implement it; e.g. 5 private int variables with getters and setters. If that is inconvenient, then choose a more convenient API.
Complete the program.
Get it working.
Benchmark it. Does it run fast enough? If yes, stop.
Profile it. Pick the biggest performance hotspot and optimize that.
Rerun the benchmarking and profiling to check that your optimization has made things faster. If yes, then "commit" it. If not, back it out.
Go to step 4.
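Step 1 of the list above can be sketched as a plain class with five int fields. All names below are illustrative guesses at the asker's domain, not from the question:

```java
public class Tile {
    private int mapX;     // position on the map
    private int mapY;
    private int screenX;  // position on the screen
    private int screenY;
    private int type;     // what kind of object this is

    public Tile(int mapX, int mapY, int screenX, int screenY, int type) {
        this.mapX = mapX;
        this.mapY = mapY;
        this.screenX = screenX;
        this.screenY = screenY;
        this.type = type;
    }

    public int getMapX()    { return mapX; }
    public int getMapY()    { return mapY; }
    public int getScreenX() { return screenX; }
    public int getScreenY() { return screenY; }
    public int getType()    { return type; }

    public void setScreenX(int screenX) { this.screenX = screenX; }
    public void setScreenY(int screenY) { this.screenY = screenY; }

    public static void main(String[] args) {
        Tile t = new Tile(10, 20, 3, 4, 7);
        t.setScreenX(5);
        System.out.println(t.getScreenX()); // prints: 5
    }
}
```

Retrieving "all 5 attributes" is then just five getter calls, and the names document what each value means, which an int[5] cannot do.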
I would suggest a HashMap where the key is objectId-attributeName and the value is the integer attribute value, since you have to do retrieval based on a key. Lookup is an O(1) operation.

Profiling a Java EE applications - What to look for and what changes to make?

I am a bit new to profiling applications for improving performance. I have selected YourKit as my profiler. There is no doubt that YourKit provides very interesting statistics. Where I am getting stuck is what to do with these statistics.
For instance, consider a method that operates on a JAXB POJO. The method iterates through the POJO to access a tag/element that is deeply nested inside the XML. This requires four layers of for loops to get to the element/tag, as shown below:
List<Bundle> bundles = null;
List<Item> items = null;
for (Info info : data) {
    bundles = info.getBundles();
    for (Bundle bundle : bundles) {
        items = bundle.getItems();
        // ... more loops like this till we get to the required element
    }
}
YourKit tells me that the above code is a 'hot spot' and that 80 objects are garbage collected for each call to the method containing this code. The above code is just an example and not the only place where I am stuck. Most of the time I have no clue what to do with the information the profiler gives me. What can I do to reduce the number of temporary objects in the above code? Are there any well-defined principles for improving the performance of an application? What statistics should I look for when profiling an application, and what implications does each kind of statistic have?
Edit :
The main objective for profiling the application is to increase the throughput and response time. The current throughput is only 10 percent of the required throughput!
Focus on the statistics relevant to your performance goal. You are interested in minimal response time, so look at how much each method contributes to response time, and focus on those that contribute a lot (for single-threaded processing, that's simply the elapsed time during the method call, summed over all invocations of that method). I am not sure how YourKit defines hot spots (check the docs), but they are probably the methods with the highest cumulative elapsed time, so hot spots are a good thing to look at. In contrast, object allocation has no direct impact on response time, and is irrelevant in your case (unless you have identified that the garbage collector contributes a significant proportion of CPU time, which it usually doesn't).
I absolutely agree with the given answers.
I would like to add that, considering your concrete example, you can actually make an improvement by using the XPath API to access the specific location in the XML.
In situations where you don't need to iterate the entire document, this should be your first choice, since it is declarative and hence more expressive and less error-prone.
It will often give you superior performance as well (for very complex queries this may not be the case, but you seem to have a simple scenario).
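As a sketch of the XPath suggestion: the four nested loops collapse into one declarative expression. The element names below are assumptions inferred from the getters in the question, and the helper method name is illustrative:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class XPathDemo {
    // Count the nodes matched by an XPath expression against an XML string.
    static int count(String xml, String expr) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(
                            xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expr, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<data><info><bundles><bundle><items>"
                + "<item>a</item><item>b</item>"
                + "</items></bundle></bundles></info></data>";
        // One expression replaces the four nested for loops.
        System.out.println(
                count(xml, "/data/info/bundles/bundle/items/item")); // prints: 2
    }
}
```

Note this operates on a DOM rather than the JAXB POJOs; whether that is a net win depends on whether you need the rest of the bound object graph.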
A way to improve the loop would be to change your schema and essentially flatten the model; of course, this depends on whether you can change the schema. That way the generated Java will not require four layers of looping. At the end of the day, though, you need to ask yourself whether the code is really a problem: 80 objects are getting GCed, so what? Is your application running slowly? Are you experiencing memory issues? Remember, premature optimization is the root of all evil!
Profiling and optimization is a complex beast and depends on many things (Java version, 32- vs 64-bit OS, etc.). Furthermore, optimization might not always require code changes; for example, you could resolve problems by changing the JVM's GC policy - there are GC policies that are more effective when your code creates many small objects that need to be GCed frequently. If you gave specifics, it might be easier to help you, but your question seems too broad. In fact, there are many books written on the topic which might be worth a read.

If classes all contain lots of useful class variables, will it have an impact on performances?

Whenever I write a new class, I use quite a lot of class variables to describe the class's properties, to the point where, when I go back to review the code I've typed, I see 40 to 50 class variables. Whether they are public, protected, or private, they are all used prominently throughout the classes I've defined.
Even though the class variables consist mostly of primitives - booleans, integers, doubles, etc. - I still have this uneasy feeling that some of my classes with large numbers of class variables may have an impact on performance, however negligible it may be.
Being as rational as possible: if I assume unlimited RAM and an unlimited number of class variables, an instance of a Java class is just a large block of memory whose first portion contains the fields and whose remainder refers to the class's methods. With that much RAM, the performance impact is hard to pin down.
But that isn't making me feel any easier. If we consider limited RAM but unlimited class variables, what would the result be? What would really happen in an environment where performance matters?
And, as perhaps should be mentioned upfront: I don't know whether having lots of class variables counts as bad Java practice when all of them are important and all the classes have been refactored.
Thanks in advance.
Performance has nothing to do with the number of fields an object has. Memory consumption is of course potentially affected, but if the variables are needed, you can't do much about it.
Don't worry too much about performance. Make your code simple, readable, maintainable, tested. Then, if you notice performance problems, measure and profile to see where they come from, and optimize where needed.
Maintainability and readability are affected by the number of fields an object has, though. 40 to 50 fields is quite a lot, and is probably an indication that your classes do too much on their own and have too many responsibilities. Refactoring them into many smaller classes and using composition would probably be a good idea.
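A hedged sketch of that composition-based refactoring; all class and field names here are illustrative, not from the question:

```java
// One responsibility per class instead of 50 fields in one.
class Position {
    double x, y;
}

class Health {
    int current = 100;
    int max = 100;

    boolean isAlive() { return current > 0; }
}

public class Entity {
    // The big class becomes a thin composition of small, focused parts.
    private final Position position = new Position();
    private final Health health = new Health();

    Position position() { return position; }
    Health health() { return health; }

    public static void main(String[] args) {
        Entity e = new Entity();
        e.position().x = 3.0;
        e.health().current = 0;
        System.out.println(e.health().isAlive()); // prints: false
    }
}
```

Each part can now be reasoned about, tested, and reused independently, which is the maintainability win the answer describes.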
I hope I don't sound like an ass, but in my view having more than 10 properties in a class is usually a hint of a bad design and requires justification.
Performance-wise, if you very often need all those properties together, then you may actually save some memory, as each object also has a header: instead of having 5-10 classes, you put everything into one and save some bytes.
Depending on which garbage collector you use, having bigger objects can be more expensive to allocate (this is true for the CMS garbage collector, but not for the parallel one). More GC work = less time for your app to run.
Unless you're writing a high-traffic, low-latency application, the benefits of having fewer classes (and using less memory) are going to be completely overwhelmed by the extra maintenance effort.
The biggest problem I see with a class that has a lot of variables is thread safety: it is going to be really hard to reason about the invariants in such a case. Reading and maintaining such a class is also going to be really hard.
Of course, if you make as many fields immutable as you can, that is going to be a lot better.
I try to go with: less is better, easier to maintain.
A basic principle we are always taught is to keep cohesion high (one class focuses on one task) and coupling low (less interdependency among classes, so that changes in one do not affect others).
When designing a system, I believe the focus should be on a maintainable design; performance will take care of itself. I don't think there is a fixed limit on the number of variables a class can have as a matter of good practice, as this strictly depends on your requirements.
For example, if I have a requirement where the application suggests a course to a student, and the algorithm needs 50 inputs (scores, hobbies, etc.), it will not matter whether this data lives in one class or several, as the whole of it needs to be loaded into RAM for fast execution.
Again: take care of your design. It is harmful both to keep unnecessary variables in a class (as that loads unneeded information into RAM) and to split into more classes than required (more references, and hence more pointer chasing).
1. I always use this as a rule of thumb: a class should have only one reason to change, so it should do only one thing.
2. Keeping this in mind, I take those variables which are needed to define the class's attributes.
3. I make sure that my class follows the cohesion principle, where the methods within the class reflect the class name.
4. After sorting everything out, if I need some other variables to make my class work, then I use them; I have no choice. Moreover, a class built with all this thought and care will hardly be affected by a few additional variables.
Sometimes class variables are used as static final constants to store default strings like the product name, version, OS version, etc., or to store product-specific settings like font size and type. Those static variables can be kept at class level.
You could also use a HashMap instead of a simple class if you just want to store constants or settings that rarely change. That may help speed up your response time.
Two things I would like to mention:
1. All instance variables are stored in the heap area of RAM.
2. All static variables are stored in a non-heap area (the method area, to be specific).
Whatever the type of variable (instance or static), ultimately they all reside in RAM.
Now, coming to your question: as far as instance variables are concerned, Java's built-in garbage collector will, in most cases, work effectively to keep freeing memory. Static variables, however, are not garbage collected in the same way.
If you are highly concerned about memory use due to a large number of variables in your class, you can resort to weak references instead of traditional strong references.
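A minimal sketch of the weak-reference idea. Note that whether and when the referent is actually cleared is entirely up to the garbage collector:

```java
import java.lang.ref.WeakReference;

public class WeakDemo {
    public static void main(String[] args) {
        byte[] payload = new byte[1024];
        WeakReference<byte[]> ref = new WeakReference<>(payload);

        // While a strong reference exists, the referent stays reachable.
        System.out.println(ref.get() != null); // prints: true

        payload = null; // drop the strong reference
        System.gc();    // only a hint; after a collection, ref.get() may be null
    }
}
```

Weak references suit caches of recomputable data; they are not a substitute for simply removing fields you don't need.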

How to decrease java program's memory consumption

I have a Java 3D program which uses lots (hundreds) of 3D models (3ds and obj).
The more models I use (and I really, really have to; it's a 3D model of a real-world object), the heavier the program becomes, to the point where any single operation takes ages to complete.
CPU consumption rarely reaches 50%, mostly staying between 10% and 30%, but memory consumption grows (obviously) with each added 3D model.
I know how to minimize the memory footprint of C/C++ programs, but with Java's GC, is there anything I can do about it other than increasing the JVM's heap with -Xmx? I am already running with -Xmx512m.
I have checked the GC logs using GCViewer and didn't find anything suspicious.
I am aware of some very similar questions on SO, but none answered my question exactly.
My IDE is IntelliJ 11.
There are two simple ways to decrease the number of objects you are creating, and one or both of them may work for your purposes, though without specifications I can't be sure.
1) Work with highly mutable objects. If you need to simulate a large number of things with a great deal of similarity, but which don't have to interact with each other (for example, a hundred thousand simulations of a dozen particles interacting, with slightly different particles each time), then reuse the same dozen or so objects over and over, using mutator methods to reset their state; this transfers the burden from memory to the CPU. However, I doubt that using many objects in sequence is the problem for you, given that Java already has built-in garbage collection.
2) Make the similarities between similar objects their own class. If you need a lot of objects, and a significant proportion of them share a number of memory-intensive characteristics, store those characteristics in their own class and refer to a single instance of that class in every object with the exact same field values. As an example, consider a set of Apple objects. You could create different classes GrannySmithApple, MackintoshApple, and RedDeliciousApple, each with static fields for the characteristics shared across that class (e.g. RedDeliciousApple sets a static String field, declared in the abstract class Apple, to "red"). Or, to allow more flexibility at runtime, each Apple could hold an instance of a CoreCharacteristic class. If multiple objects share the same core characteristics, the code:
CoreCharacteristic c = new CoreCharacteristic(<some parameters>);
Apple apple1 = new Apple(c);
Apple apple2 = new Apple(c);
Apple apple3 = new Apple(c);
Apple apple4 = new Apple(c);
will use only one CoreCharacteristic for all four apples, quartering the amount of memory needed to store the fields of CoreCharacteristic which would have otherwise been replicated for each Apple.
There are two different approaches I can think of to handle your issues:
Control when the GC kicks in: here is a full article describing how to control the garbage collector and the different algorithms used to clean up the memory. I found this approach very useful when an application is creating thousands and thousands of DTOs in a minute.
If your application is not creating too many objects that are trashed quickly, then I suggest you have a look at your model and improve its efficiency. Remember that when dealing with 3D visualization, what matters most is how you structure your scene graph.
On a side note, 3D visualization does not take that much CPU when using OpenGL based solutions. This is mainly due to the fact that it is the GPU that gets involved a lot when rendering the scene graph, not the CPU.
From my point of view you have two options:
decrease the creation of new objects by making them immutable and reusing them when they do not vary
use the flyweight pattern - reuse created objects and work with setters on them, instead of creating new ones again and again. Here is a great example implementation that may suit your needs exactly: http://www.javacamp.org/designPattern/flyweight.html - it's about creating colored circles. Without the flyweight it took 2.5x longer, and memory consumption was ~200 times larger. Try it.
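A minimal flyweight sketch, independent of the linked circle example (the class and method names are illustrative): shared, immutable "intrinsic" state is cached and handed out instead of being allocated per object.

```java
import java.util.HashMap;
import java.util.Map;

// Immutable shared state: safe to hand the same instance to many objects.
class CircleStyle {
    final String color;

    CircleStyle(String color) {
        this.color = color;
    }
}

class StyleFactory {
    private static final Map<String, CircleStyle> CACHE = new HashMap<>();

    // Return the cached instance for this color, creating it only once.
    static CircleStyle of(String color) {
        return CACHE.computeIfAbsent(color, CircleStyle::new);
    }
}

public class FlyweightDemo {
    public static void main(String[] args) {
        CircleStyle a = StyleFactory.of("red");
        CircleStyle b = StyleFactory.of("red");
        System.out.println(a == b); // prints: true (one shared instance)
    }
}
```

For the 3D-model case, the shared state would be the heavy mesh/texture data, with each placed model holding only its own position and orientation.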
