I'm just curious: is there a difference in speed and performance between these two loop implementations? Assume that the size() method returns the length of the array, collection, or object that handles a group of elements (it's actually from the XOM API).
Implementation 1:
int size = someArray.size();
for (int i = 0; i < size; i++) {
// do stuff here
}
Implementation 2:
for (int i = 0; i < someArray.size(); i++) {
// do stuff here
}
From a performance point of view, there is little difference. This is because a loop can be optimized so that the size() lookup is inlined, resulting in very little performance difference.
The main difference is if the size changes while looping. The first case will try to iterate a fixed number of times. In the second case, the number of iterations will depend on the final size().
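To make the behavioral difference concrete, here is a minimal sketch. The class and method names are made up for illustration, and the list is grown from inside the loop body purely to keep the demonstration deterministic:

```java
import java.util.ArrayList;
import java.util.List;

final class LoopBoundDemo {
    // size() is read once, so the bound stays at the initial size.
    static int countWithCachedBound(List<Integer> list) {
        int iterations = 0;
        for (int i = 0, size = list.size(); i < size; i++) {
            if (i < 2) {
                list.add(100 + i); // grow the list during the first two passes
            }
            iterations++;
        }
        return iterations;
    }

    // size() is re-read before every pass, so added elements extend the loop.
    static int countWithLiveBound(List<Integer> list) {
        int iterations = 0;
        for (int i = 0; i < list.size(); i++) {
            if (i < 2) {
                list.add(100 + i); // same growth as above
            }
            iterations++;
        }
        return iterations;
    }

    public static void main(String[] args) {
        System.out.println(countWithCachedBound(new ArrayList<>(List.of(1, 2, 3, 4))));
        System.out.println(countWithLiveBound(new ArrayList<>(List.of(1, 2, 3, 4))));
    }
}
```

Starting from four elements, the cached-bound loop runs four times, while the live-bound loop also visits the two elements added along the way.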
The 1st snippet is bound to execute faster since it calls size() only once. The 2nd snippet calls size() N times. Depending on the implementation, that might incur a significant penalty, especially if the compiler finds it hard to inline the method and/or size() does more than return a non-volatile variable.
I'd have rewritten it like for(int i=0, s=someCollection.size(); i<s; i++)
Note: arrays don't have size() method.
Yes, there is a difference. In the first loop, the size() method is only called once. In the second one, it's called at each iteration.
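A rough way to see the call counts is to wrap a list in a counting class. CountingList below is a hypothetical helper, not part of any library, and the numbers describe source-level invocations only; they say nothing about what the JIT does with them or about elapsed time:

```java
import java.util.List;

final class SizeCallCounter {
    /** Hypothetical wrapper that counts how often size() is invoked. */
    static final class CountingList {
        private final List<String> items;
        int sizeCalls = 0;

        CountingList(List<String> items) {
            this.items = items;
        }

        int size() {
            sizeCalls++;
            return items.size();
        }

        String get(int index) {
            return items.get(index);
        }
    }

    // size() is hoisted into a local variable before the loop.
    static int sizeCallsWithCachedBound(List<String> data) {
        CountingList list = new CountingList(data);
        int size = list.size();
        for (int i = 0; i < size; i++) {
            list.get(i);
        }
        return list.sizeCalls;
    }

    // size() sits in the loop condition and runs before every pass.
    static int sizeCallsWithLiveBound(List<String> data) {
        CountingList list = new CountingList(data);
        for (int i = 0; i < list.size(); i++) {
            list.get(i);
        }
        return list.sizeCalls;
    }

    public static void main(String[] args) {
        List<String> data = List.of("a", "b", "c");
        System.out.println(sizeCallsWithCachedBound(data));
        System.out.println(sizeCallsWithLiveBound(data));
    }
}
```

For a three-element list the cached version calls size() once and the live version four times: one check per element plus the final check that ends the loop.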
If the iteration modifies the size of the collection (which is very very uncommon), the second one is needed. In most cases, you should prefer the first one, but limit the scope of the size variable :
for (int i = 0, size = someArray.size(); i < size; i++) {
// ...
}
But most of the time, you should prefer the foreach syntax anyway :
for (Foo foo : collection) {
// ...
}
which will iterate over the array or collection efficiently, even for a LinkedList for example, where indexed access is not optimal.
Don't worry about it, JVM optimization is very aggressive these days.
Use the 2nd form, for it's more readable, and most likely as fast. Premature optimization yada yada.
And when you do need to improve speed, always profile first, don't guess.
It is extremely unlikely that caching size() in a local variable could benefit your app noticeably. If it does, you must be doing simple operations over a huge dataset. You shouldn't use ArrayList at all in that case.
It may be worth noting that this construct:
for (String s : getStringsList()) {
//...
}
invokes getStringsList() only once and then operates on iterator behind the scenes. So it is safe to perform lengthy operations or change some state inside getStringsList().
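One way to check this is to count the calls. In the sketch below, getStringsList() is a stand-in with a static counter added for illustration:

```java
import java.util.Arrays;
import java.util.List;

final class ForEachExpressionDemo {
    static int calls = 0;

    // Stand-in for the method from the answer; counts its own invocations.
    static List<String> getStringsList() {
        calls++;
        return Arrays.asList("a", "b", "c");
    }

    // Runs a for-each loop over three elements and reports how often
    // getStringsList() was invoked in the process.
    static int callsDuringLoop() {
        calls = 0;
        int visited = 0;
        for (String s : getStringsList()) {
            visited++;
        }
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(callsDuringLoop());
    }
}
```

The loop body runs three times, but the counter ends at one.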
Always avoid anything that can be done outside of the loop like method calls, assigning values to variables, or testing for conditions.
Method calls are more costly than the equivalent code without the call, and by repeating method calls again and again, you just add overhead to your application.
Move any method calls out of the loop, even if this requires rewriting of the code.
Benefits :-
Unless the compiler optimizes it, the loop condition will be calculated for each iteration over the loop.
If the condition value is not going to change, the code will execute faster if the method call is moved out of the loop.
Note :-
If the method returns a value that will not change during the loop, then store its value in a temporary variable before the loop.
Hence its value is stored in a temporary variable size outside the loop, and then used as the loop termination condition.
Related
I am doing some performance optimization for my Java application and I am confused about using a tmp variable to remove the method invocation in the loop termination condition. Here is my situation:
Vector myVector = new Vector();
// some code
for (int i=0;i<myVector.size();i++){
//some code here;
}
I want to use
int tmp = myVector.size();
for(int i=0;i<tmp;i++){
//some code here
}
What would be the negative impact of using the second scenario? My application is pretty large and I am not sure when and where myVector is being updated.
This change will not have any noticeable impact on performance, neither positive nor negative. So you should not change this as long as there is no compelling reason to do so.
Regarding your question
What could be negative impact of using second scenario ?
you should be aware that the two implementations may behave differently in a multi-threaded environment. In the first case, changes made to the vector by any other thread will be taken into account and may affect how many times the loop runs. In the second case, the number of iterations is computed once and will not change later (even if the size of the vector changes). However, changing the contents of a vector while iterating over it with either loop is dangerous and should be avoided if possible.
BTW: The benchmark that was linked in the comment from @geoand is as flawed as a microbenchmark can be. It does not tell you anything.
I agree the foreach loop reduces typing and is good for readability.
A little background: I work on low-latency application development and receive 1 million packets to process per second. I iterate through a million packets and send this information across to its listeners. I was using a foreach loop to iterate through the set of listeners.
While profiling, I found that a lot of Iterator objects were being created to execute the foreach loop. After converting the foreach loop to an index-based for loop, I observed a huge drop in the number of objects created, thereby reducing the number of GCs and increasing application throughput.
Edit: (Sorry for the confusion; making this question clearer.)
For example, I have a list of listeners (fixed size) and I loop through it a million times a second. Is foreach overkill in Java?
Example:
for(String s:listOfListeners)
{
// logic
}
compared to
for (int i=0;i<listOfListeners.size();i++)
{
// logic
}
Profiled screenshot for the code
for (int cnt = 0; cnt < 1_000_000; cnt++)
{
for (String string : list_of_listeners)
{
//No Code here
}
}
EDIT: Answering the vastly different question of:
For example, I have a list of listeners (fixed size) and I loop through it a million times a second. Is foreach overkill in Java?
That depends - does your profiling actually show that the extra allocations are significant? The Java allocator and garbage collector can do a lot of work per second.
To put it another way, your steps should be:
1. Set performance goals alongside your functional requirements
2. Write the simplest code you can to achieve your functional requirements
3. Measure whether that code meets the performance goals
4. If it doesn't:
   - Profile to work out where to optimize
   - Make a change
   - Run the tests again to see whether they make a significant difference in your meaningful metrics (number of objects allocated probably isn't a meaningful metric; number of listeners you can handle probably is)
   - Go back to step 3.
Maybe in your case, the enhanced for loop is significant. I wouldn't assume that it is though - nor would I assume that the creation of a million objects per second is significant. I would measure the meaningful metrics before and after... and make sure you have concrete performance goals before you do anything else, as otherwise you won't know when to stop micro-optimizing.
Size of list is around a million objects streaming in.
So you're creating one iterator object, but you're executing your loop body a million times.
While profiling, I found that a lot of Iterator objects were being created to execute the foreach loop.
Nope? Only a single iterator object should be created. As per the JLS:
The enhanced for statement is equivalent to a basic for statement of the form:
for (I #i = Expression.iterator(); #i.hasNext(); ) {
VariableModifiers(opt) TargetType Identifier =
(TargetType) #i.next();
Statement
}
As you can see, that calls the iterator() method once, and then calls hasNext() and next() on it on each iteration.
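The expansion can be written out by hand and compared with the enhanced for loop. The sketch below assumes a hypothetical CountingIterable wrapper that records how often iterator() is requested; the hand-written loop mirrors the JLS form with `it` in place of the synthetic `#i`:

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class ForEachDesugarDemo {
    /** Hypothetical Iterable that counts iterator() requests. */
    static final class CountingIterable implements Iterable<String> {
        private final List<String> items;
        int iteratorCalls = 0;

        CountingIterable(List<String> items) {
            this.items = items;
        }

        @Override
        public Iterator<String> iterator() {
            iteratorCalls++;
            return items.iterator();
        }
    }

    // Enhanced for loop, as written in source code.
    static String joinWithForEach(Iterable<String> source) {
        StringBuilder sb = new StringBuilder();
        for (String s : source) {
            sb.append(s);
        }
        return sb.toString();
    }

    // Hand-written equivalent following the JLS expansion.
    static String joinWithExplicitIterator(Iterable<String> source) {
        StringBuilder sb = new StringBuilder();
        for (Iterator<String> it = source.iterator(); it.hasNext(); ) {
            String s = it.next();
            sb.append(s);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        CountingIterable source = new CountingIterable(Arrays.asList("x", "y", "z"));
        System.out.println(joinWithForEach(source) + " " + source.iteratorCalls);
        System.out.println(joinWithExplicitIterator(source) + " " + source.iteratorCalls);
    }
}
```

Both loops produce the same result, and each full traversal asks for exactly one iterator regardless of how many elements it visits.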
Do you think that extra object allocation will actually hurt your performance significantly?
How much do you value readability over performance? I take the approach of using the enhanced for loop wherever it helps readability, until it proves to be a performance problem - and my personal experience is that it's never hurt performance significantly in anything I've written. That's not to say that would be true for all applications, but the default position should be to only use the less readable code after proving it will improve things significantly.
The "foreach" loop creates just one Iterator object, while the second loop creates none. If you are executing many, many separate loops that execute just a few times each, then yes, "foreach" may be unnecessarily expensive. Otherwise, this is micro-optimizing.
EDIT: The question has changed so much since I wrote my answer that I'm not sure what I'm answering at the moment.
Looking up stuff with list.get(i) can actually be a lot slower if it's a linked list, since for each lookup, it has to traverse the list, while the iterator remembers the position.
Example:
list.get(0) will get the first element
list.get(1) will first get the first element to find pointer to the next
list.get(2) will first get the first element, then go to the second and then to the third
etc.
So to do a full loop, you're actually looping over elements in this manner:
0
0->1
0->1->2
0->1->2->3
etc.
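The traversal pattern above can be counted with a toy list. This is a sketch using a hypothetical singly linked list written for the purpose, not java.util.LinkedList, which is doubly linked and starts from whichever end is nearer, so its real cost is roughly half of what is counted here but still grows quadratically:

```java
final class LinkedTraversalCost {
    static final class Node {
        final int value;
        Node next;

        Node(int value) {
            this.value = value;
        }
    }

    /** Hypothetical singly linked list that counts every node it touches. */
    static final class CountingLinkedList {
        Node head;
        Node tail;
        int size;
        long nodesVisited;

        void add(int value) {
            Node node = new Node(value);
            if (head == null) {
                head = node;
            } else {
                tail.next = node;
            }
            tail = node;
            size++;
        }

        // Walks from the head every time, like the 0, 0->1, 0->1->2 pattern.
        int get(int index) {
            if (index < 0 || index >= size) {
                throw new IndexOutOfBoundsException("index: " + index);
            }
            Node current = head;
            nodesVisited++;
            for (int i = 0; i < index; i++) {
                current = current.next;
                nodesVisited++;
            }
            return current.value;
        }
    }

    static CountingLinkedList build(int n) {
        CountingLinkedList list = new CountingLinkedList();
        for (int i = 0; i < n; i++) {
            list.add(i);
        }
        return list;
    }

    // Full traversal using get(i): visits 1 + 2 + ... + n nodes.
    static long visitsWithIndexedLoop(int n) {
        CountingLinkedList list = build(n);
        for (int i = 0; i < list.size; i++) {
            list.get(i);
        }
        return list.nodesVisited;
    }

    // Full traversal with a cursor that remembers its position: visits n nodes.
    static long visitsWithCursor(int n) {
        CountingLinkedList list = build(n);
        for (Node current = list.head; current != null; current = current.next) {
            list.nodesVisited++;
        }
        return list.nodesVisited;
    }

    public static void main(String[] args) {
        System.out.println(visitsWithIndexedLoop(1000));
        System.out.println(visitsWithCursor(1000));
    }
}
```

For four elements the indexed loop touches 10 nodes against 4 for the cursor; for 1000 elements it is 500500 against 1000.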
I do not think you should worry about the effectiveness here.
Most of time is consumed by your actual application logic (in this case - by what you do inside the loop).
So, I would not worry about the price you pay for convenience here.
Is this comparison even fair ? You are comparing using an Iterator Vs using get(index)
Furthermore, each loop would only create one additional Iterator. Unless the Iterator is itself inefficient for some reason, you should see comparable performance.
Which code is better from a performance point of view? I think the second, because creating the reference inside the for loop is not good.
May I know your opinion?
// First Code
for (int i = 0; i < array.size(); i++) {
SipSession abc = (SipSession) array1.get(i);
}
// Second Code
SipSession abc = null;
for (int i = 0; i < array.size(); i++) {
abc = (SipSession) array1.get(i);
}
You should only choose on performance grounds after you've profiled your code and established that this is the bottleneck.
Until you've done that, choose whichever version you think is clearer and easier to maintain.
I would always choose the first version except when I need the last SipSession reference to outlive the loop.
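As a small illustration of that exception, the sketch below uses plain strings in place of SipSession (an assumption made to keep it self-contained) and keeps one variable inside the loop and one outside:

```java
import java.util.Arrays;
import java.util.List;

final class LoopVariableScopeDemo {
    // Returns the last element seen, or null for an empty list.
    static String lastSeen(List<String> sessions) {
        String last = null; // declared outside: still visible after the loop
        for (int i = 0; i < sessions.size(); i++) {
            String current = sessions.get(i); // declared inside: scoped to one pass
            last = current;
        }
        // 'current' is out of scope here; only 'last' can be returned.
        return last;
    }

    public static void main(String[] args) {
        System.out.println(lastSeen(Arrays.asList("a", "b", "c")));
    }
}
```

If nothing after the loop needs the value, the inner declaration alone is enough and keeps the scope as narrow as possible.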
Ultimately it will make no difference. The JIT will optimize that code away to exactly the same thing.
The only difference is the scope, of course.
I don't think there's much performance difference between the two. Only major difference is the scope of the SipSession reference. But you should try profiling if you care that much.
In your first code the VM or even the compiler will simply remove the reference variable, because it is never used within its scope.
It will be optimized to
for(int i=0;i<array.size();i++){
array1.get(i);
}
Depending on what is done in the get method, the whole loop may be removed during optimization.
If the order the elements are accessed is not important you can also:
for (int i = array.size()-1; 0 <= i ;) {
SipSession abc = (SipSession) array1.get(i--);
}
This would call array.size() only once instead of in each loop iteration.
This would be a micro-optimization, and it is better to spend effort on other kinds of optimization than to do this without proof that it is the bottleneck, which is not the case here.
Never try to optimize without profiling. The JIT compiler does the heavy lifting so you don't have to.
That aside, your array seems to be a raw List instead of a generic List<SipSession>. Generics won't necessarily optimize your code, but it makes it much easier to understand and maintain. Your simple loop could be rewritten as:
List<SipSession> array;
for(SipSession abc : array){
// Stuff
}
I had an argument with my friend regarding this.
Consider the below snippet,
for(i=0; i<someList.size(); i++) {
//some logic
}
Here someList.size() will be executed for every iteration, so it is recommended to migrate this size calculation to outside(before) the loop.
Now what happens when I use an extended for loop like this,
for(SpecialBean bean: someBean.getSpecialList()) {
//some logic
}
Is it necessary to move someBean.getSpecialList() to outside the loop?
How many times will someBean.getSpecialList() execute if I were to retain the 2nd snippet as it is?
Repeated calls to list.size() won't result in any performance penalty. The JIT compiler will most probably inline it and even if it doesn't, it will still be quite inexpensive because it just involves reading the value of a field.
A much more serious problem with your first example is that the loop body will have to involve list.get(i), and for a LinkedList, accessing the i-th element has O(i) cost with a quite significant constant factor due to pointer chasing, which translates to data-dependent loads at the CPU level. The CPU's prefetcher can't optimize this access pattern.
This means that the overall computational complexity will be O(n^2) when applied to a LinkedList.
Your second example compiles to iteration via Iterator and will evaluate someBean.getSpecialList().iterator() only once. The cost of iterator.next() is constant in all cases.
From Item 46 in Effective Java by Joshua Bloch :
The for-each loop, introduced in release 1.5, gets rid of the clutter
and the opportunity for error by hiding the iterator or index
variable completely. The resulting idiom applies equally to
collections and arrays:
// The preferred idiom for iterating over collections and arrays
for (Element e : elements) {
    doSomething(e);
}
When you see the colon (:), read it as “in.” Thus, the loop above reads as “for each element e in elements.” Note
that there is no performance penalty for using the for-each loop, even
for arrays. In fact, it may offer a slight performance advantage over
an ordinary for loop in some circumstances, as it computes the limit
of the array index only once. While you can do this by hand (Item 45),
programmers don’t always do so.
See also: Is there a performance difference between a for loop and a for-each loop?
An alternative to the first snippet would be:
for(i=0, l=someList.size(); i<l; i++) {
//some logic
}
With regard to the for..each loop, the call to getSpecialList() will only be made once (you could verify this by adding some debugging/logging inside the method).
As the extended loop uses an Iterator taken from the Iterable, it wouldn't be possible or sensible to execute someBean.getSpecialList() more than once. Moving it outside the loop will not change the performance of the loop, but you could do it if it improves readability.
Note: if you iterate by index it can be faster for random access collections e.g. ArrayList as it doesn't create an Iterator, but slower for indexed collections which don't support random access.
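If you want code that picks the traversal style based on the list it receives, one option is the java.util.RandomAccess marker interface, which ArrayList implements and LinkedList does not. The sketch below is an illustration only; the class and method names are invented, and whether the indexed path is measurably faster is something to profile rather than assume:

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;

final class TraversalChooser {
    // True when the list advertises cheap positional access.
    static boolean usesIndexedPath(List<?> list) {
        return list instanceof RandomAccess;
    }

    static long sum(List<Integer> list) {
        long total = 0;
        if (usesIndexedPath(list)) {
            // Indexed loop: no Iterator object, get(i) is cheap here.
            for (int i = 0, size = list.size(); i < size; i++) {
                total += list.get(i);
            }
        } else {
            // Iterator-based loop: avoids repeated walks on linked structures.
            for (int value : list) {
                total += value;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(new ArrayList<>(List.of(1, 2, 3))));
        System.out.println(sum(new LinkedList<>(List.of(1, 2, 3))));
    }
}
```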
for each variation will be same as below
for (Iterator i = c.iterator(); i.hasNext(); ) {
doSomething((Element) i.next());
}
From Item 46: Prefer for-each loops to traditional for loops, in Effective Java:
for-each loop provides compelling advantages over the traditional for loop in clarity and bug prevention, with no performance penalty. You should use it wherever you can.
So my first guess was wrong: there is no penalty for using a method call inside a for-each loop.
This is my Java code:
List<Object> objects = new ArrayList<>();
// Assign values to objects
...
for (int i = 0; i < objects.size(); i++) {
Object object = objects.get(i);
...
}
I have two questions:
Is objects.size() calculated only once before starting the loop, or is it calculated on each iteration?
If objects.size() is calculated on each iteration, then if another thread changes the list at the same time without any multi-thread protection, the code may crash.
Am I correct?
Answers:
objects.size() is called every loop (whether it is calculated depends on the ArrayList implementation, which you shouldn't care about)
Yes, another thread may change the list and this will affect your loop
Real answer:
You shouldn't have to care, and here's how you don't have to:
Use a CopyOnWriteArrayList, which is thread-safe. If you iterate over it using an Iterator (as the foreach syntax uses internally), you'll iterate over the list as it was when the iteration started
Use the foreach syntax, which means you don't have to use an index etc - it's done for you:
for (Object object : objects) {
// do something with each object
}
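The snapshot behavior of CopyOnWriteArrayList can be seen without any threads. In this sketch the list is modified from inside the loop body, which stands in for a concurrent writer and keeps the result deterministic; the class name is invented:

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

final class SnapshotIterationDemo {
    // Returns {elements visited by the loop, list size afterwards}.
    static int[] visitedAndFinalSize() {
        List<String> listeners = new CopyOnWriteArrayList<>(List.of("a", "b", "c"));
        int visited = 0;
        for (String listener : listeners) {
            // Each add() copies the backing array; the iterator keeps the old one.
            listeners.add(listener + "-copy");
            visited++;
        }
        return new int[] { visited, listeners.size() };
    }

    public static void main(String[] args) {
        int[] result = visitedAndFinalSize();
        System.out.println(result[0] + " visited, final size " + result[1]);
    }
}
```

The loop visits only the three original elements even though the list holds six afterwards. The same body on a plain ArrayList would fail with a ConcurrentModificationException.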
Yes, it is calculated each time. If you have another thread altering the size of your objects list, the loop condition will keep changing.
Yes, when you use objects.size() in the loop condition, it is evaluated on every iteration. A better way is to store it in a variable before entering the loop, like this:
int limit = objects.size();
for (int i = 0; i < limit; i++) {
Object object = objects.get(i);
...
}
If you have another thread, it may change the list, but with the option above this will not affect or crash your program.
Yes, it is calculated each time.
If you look at the for statement, the first clause sets the counter's initial value.
Then it checks against the maximum value, executes the loop body, increments the counter, and checks against the maximum value again. Each check against the maximum value calls the size method.
Is objects.size() calculated only once before starting the loop, or is it calculated on each iteration?
Each time.
If objects.size() is calculated on each iteration, then if another thread changes the list at the same time without any multi-thread protection, the code may crash.
Yes. Or at least, you may get a ConcurrentModificationException, and not have any reasonable way to deal with it.
Please note that this could happen even if you cached objects.size(), except now the .get() will fail instead because you are trying to get an index that no longer exists. objects.size() changes because something is removed from, or added to, the container.
Don't modify collections while you are iterating over them.
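A minimal single-threaded sketch of the failure and one safe alternative follows. The names are illustrative; note that ArrayList's fail-fast check is documented as best effort, so the exception is a debugging aid and not something to rely on:

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;

final class ModifyWhileIteratingDemo {
    // Removes an element from inside a for-each loop and reports whether
    // the iterator noticed the structural change.
    static boolean removeInsideForEachThrows() {
        List<Integer> numbers = new ArrayList<>(List.of(1, 2, 3, 4));
        try {
            for (Integer n : numbers) {
                if (n == 1) {
                    numbers.remove(n); // remove(Object): changes the list behind the iterator
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Lets the collection handle removal itself instead of mutating mid-loop.
    static List<Integer> removeOddsSafely() {
        List<Integer> numbers = new ArrayList<>(List.of(1, 2, 3, 4));
        numbers.removeIf(n -> n % 2 != 0);
        return numbers;
    }

    public static void main(String[] args) {
        System.out.println(removeInsideForEachThrows());
        System.out.println(removeOddsSafely());
    }
}
```

When removal during traversal is really needed, removeIf or Iterator.remove() keeps the iterator and the list in sync.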
Notionally, objects.size() could be evaluated on each loop. However, as the method is short, it can be inlined and its result cached, since it's not a volatile variable; i.e. another thread could change it, but there is no guarantee that you would see the change if it did.
A short way to save the size is the following:
for (int i = 0, size = objects.size(); i < size; i++) {
Object object = objects.get(i);
...
}
However if you are concerned that another thread could change the size, this approach only protects you if an object is added. If an object is removed you can still get an exception when you attempt to access the value which is now beyond the end of the list.
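Both cases can be reproduced in one thread. In this sketch the add or remove inside the loop body stands in for what another thread might do; the class and method names are invented for illustration:

```java
import java.util.ArrayList;
import java.util.List;

final class StaleBoundDemo {
    // An element disappears after the bound was cached: get(i) eventually
    // asks for an index that no longer exists.
    static boolean cachedBoundFailsAfterRemoval() {
        List<String> objects = new ArrayList<>(List.of("a", "b", "c"));
        try {
            for (int i = 0, size = objects.size(); i < size; i++) {
                if (i == 0) {
                    objects.remove(objects.size() - 1); // simulated concurrent removal
                }
                objects.get(i);
            }
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    // An element is added after the bound was cached: no exception, but the
    // new element is never visited.
    static int visitedWhenElementIsAdded() {
        List<String> objects = new ArrayList<>(List.of("a", "b", "c"));
        int visited = 0;
        for (int i = 0, size = objects.size(); i < size; i++) {
            if (i == 0) {
                objects.add("d"); // simulated concurrent addition
            }
            objects.get(i);
            visited++;
        }
        return visited;
    }

    public static void main(String[] args) {
        System.out.println(cachedBoundFailsAfterRemoval());
        System.out.println(visitedWhenElementIsAdded());
    }
}
```

With real threads the outcome additionally depends on timing and visibility, so this only shows the best-case logic of each scenario.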
Using a CopyOnWriteArrayList avoids these issues (provided you use an Iterator) but makes writes more expensive.