I've been trying to optimize some of my code, and I've reached a strange conclusion regarding for loops.
In my test case I've created a new project with a main activity. The activity initializes a List of 500 objects, runs an explicit GC and starts a thread. The thread calls the function doCalculations in a loop.
this.objects is a list of 500 MyObject, previous is MyObject, value is int. The function bodies hold no real logic; they are just there to do some work. The difference is in the inner for.
Function 1
public void doCalculations()
{
    for(MyObject o : this.objects)
        for(int i=0; i<this.objects.size(); i++)
            if(this.objects.get(i) == o)
                o.value = this.objects.get(i).value;
}
Function 2
public void doCalculations()
{
    for(MyObject o : this.objects)
        for(MyObject o2 : this.objects)
            if(o2 == o)
                o.value = o2.value;
}
With function 2 GC is called every ~10 secs on my nexus s, freeing ~1.7MB.
With function 1 GC is never to be seen.
Why is that?
One creates an iterator, the other doesn't.
Is the GC actually a bottleneck in your application? (It seems unlikely. Many devs, myself included, would consider the readability benefit to outweigh a few microseconds of GC.)
That said, your entire loop is a no-op anyway.
My guess is that it's because the inner for-each loop creates an Iterator for each run of the outer for loop (in function 2).
These Iterator instances are not created in function 1.
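To make the iterator claim concrete, here is a minimal sketch that counts how many iterators each loop style requests. CountingList and IteratorCountDemo are hypothetical names made up for this illustration, and plain Object instances stand in for MyObject. It only counts calls to iterator(); how much garbage that produces on a particular Android runtime is a separate question that would need measuring on the device.

```java
import java.util.ArrayList;
import java.util.Iterator;

final class IteratorCountDemo {
    // Hypothetical helper: an ArrayList that counts calls to iterator().
    static final class CountingList<E> extends ArrayList<E> {
        int iteratorCalls = 0;

        @Override
        public Iterator<E> iterator() {
            iteratorCalls++;
            return super.iterator();
        }
    }

    static CountingList<Object> fill(int n) {
        CountingList<Object> list = new CountingList<>();
        for (int i = 0; i < n; i++) {
            list.add(new Object());
        }
        return list;
    }

    // Function 1 style: indexed inner loop, so only the outer loop asks for an iterator.
    static int indexedInner(int n) {
        CountingList<Object> objects = fill(n);
        for (Object o : objects) {
            for (int i = 0; i < objects.size(); i++) {
                if (objects.get(i) == o) { /* no-op, as in the question */ }
            }
        }
        return objects.iteratorCalls;
    }

    // Function 2 style: enhanced for inside, so every outer pass asks for a new iterator.
    static int foreachInner(int n) {
        CountingList<Object> objects = fill(n);
        for (Object o : objects) {
            for (Object o2 : objects) {
                if (o2 == o) { /* no-op, as in the question */ }
            }
        }
        return objects.iteratorCalls;
    }

    public static void main(String[] args) {
        System.out.println(indexedInner(500)); // prints 1
        System.out.println(foreachInner(500)); // prints 501
    }
}
```

With 500 elements, function 2 requests 501 iterators per call to doCalculations, while function 1 requests one. Since the thread calls doCalculations in a loop, those short-lived objects add up.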
Why is the class below not thread safe?
public class UnsafeCachingFactorizer implements Servlet {
    private final AtomicReference<BigInteger> lastNumber = new AtomicReference<>();
    private final AtomicReference<BigInteger[]> lastFactors = new AtomicReference<>();
    public void service(ServletRequest req, ServletResponse resp) {
        BigInteger i = extractFromRequest(req);
        if (i.equals(lastNumber.get())) {
            encodeIntoResponse(resp, lastFactors.get());
        }
        else {
            BigInteger[] factors = factor(i);
            lastNumber.set(i);
            lastFactors.set(factors);
            encodeIntoResponse(resp, factors);
        }
    }
}
The instance variables are thread safe, so why is the whole class not thread safe?
It's not thread safe because you don't always get the right answer when multiple threads call the code.
Let's say that lastNumber=1 and lastFactors=factors(1). In the one-thread case, where the thread calls with i=1:
T1: if (lastNumber.get().equals(1)) { // true
T1: encodeIntoResponse(resp, lastFactors.get());
Fine, this is the expected result. But consider a multi-threaded case, where the actions within each thread takes place in the same order, but can arbitrarily interleave. One such interleaving is (where i=1 and i=2 for the two threads respectively):
T1: if (lastNumber.get().equals(1)) { // true
T2: if (lastNumber.get().equals(2)) { // false
T2: } else {
T2: lastNumber.set(2);
T2: lastFactors.set(factors(2));
T1: encodeIntoResponse(resp, lastFactors.get()); // oops! You wrote the factors(2), not factors(1).
The problem is that you're not getting and setting the AtomicReferences atomically: that is, there is nothing to stop another thread sneaking in and changing the values (of one or both) between the get and the set.
In general, whilst individual calls to methods on an AtomicReference are atomic, multiple calls are not (and they definitely aren't atomic between instances of AtomicReference). So, if you ever find yourself writing code like:
if (/* some condition with ref.get() */) {
/* some statement with ref.set() */
}
then you probably aren't using AtomicReference correctly (or, at least, it's not thread-safe).
To fix this, you need something that can be read and set atomically. For example, create a simple class to hold both:
class Holder {
    final BigInteger number;
    final BigInteger[] factors;

    Holder(BigInteger number, BigInteger[] factors) {
        this.number = number;
        this.factors = factors;
    }
}
Then store this in a single AtomicReference, and use updateAndGet:
BigInteger[] factors = holderRef.updateAndGet(h -> {
if (h != null && h.number.equals(i)) {
return h;
}
return new Holder(i, factor(i));
}).factors;
encodeIntoResponse(resp, factors);
Upon reflection, updateAndGet isn't necessarily the right way to do this. If factor sometimes takes a long time to compute, a long computation might be repeated many times: shorter computations on other threads keep changing the reference first, so the update function has to be called again.
Instead, you can just always set the reference if you had to recompute it:
Holder h = holderRef.get();
if (h == null || !h.number.equals(i)) {
    h = new Holder(i, factor(i));
    holderRef.set(h);
}
return h.factors;
This may seem to violate what I said previously, in that separate calls to holderRef are not atomic, and thus not thread-safe.
It's a bit more nuanced, however: my first paragraph states that the lack of thread safety in the original code stems from the fact that you might get the factors for the wrong input. This problem doesn't occur here: you either get the holder for the right number (and hence the factors for the right number), or you compute the factors for the input.
The issue arises in what this holder is actually meant to be storing: the "last" number/factors is rather hard to define in terms of multithreading. When are you measuring "last-ness" from? The most recent call to start? The most recent call to finish? Other?
This code simply stores "a" previously computed value, without attempting to nail down this ambiguity.
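Putting the pieces together, a self-contained sketch of the "store a previously computed value" approach might look like the following. The servlet plumbing is left out, and factor() here is a naive trial-division stand-in that I made up so the example runs; the question's real factor() is not shown. The class name CachingFactorizer and the defensive clone() are also my own choices.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

final class CachingFactorizer {
    // Immutable pair: a number and its factors are always published together.
    private static final class Holder {
        final BigInteger number;
        final BigInteger[] factors;

        Holder(BigInteger number, BigInteger[] factors) {
            this.number = number;
            this.factors = factors;
        }
    }

    private final AtomicReference<Holder> holderRef = new AtomicReference<>();

    BigInteger[] factorsOf(BigInteger i) {
        Holder h = holderRef.get();
        if (h == null || !h.number.equals(i)) {
            // Cache miss: compute, then publish number and factors as one object.
            h = new Holder(i, factor(i));
            holderRef.set(h);
        }
        // Copy so callers cannot mutate the cached array.
        return h.factors.clone();
    }

    // Hypothetical stand-in for the question's factor(): naive trial division.
    static BigInteger[] factor(BigInteger n) {
        List<BigInteger> out = new ArrayList<>();
        BigInteger d = BigInteger.valueOf(2);
        while (d.multiply(d).compareTo(n) <= 0) {
            if (n.mod(d).signum() == 0) {
                out.add(d);
                n = n.divide(d);
            } else {
                d = d.add(BigInteger.ONE);
            }
        }
        if (n.compareTo(BigInteger.ONE) > 0) {
            out.add(n);
        }
        return out.toArray(new BigInteger[0]);
    }

    public static void main(String[] args) {
        CachingFactorizer f = new CachingFactorizer();
        System.out.println(java.util.Arrays.toString(f.factorsOf(BigInteger.valueOf(12)))); // prints [2, 2, 3]
    }
}
```

Whatever thread wins the race to call set(), a caller only ever returns factors taken from a Holder whose number matches its own input, which is the property the original two-reference version lacked.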
I am trying to write a program that simulates different threads removing and adding objects in an ArrayList. However, late in the simulation I get ConcurrentModificationExceptions (when the threads try to access and modify the same variable while an iterator is being used to iterate through the data). I have searched and found some topics saying that I need to use locks/synchronization and/or ListIterators instead of enhanced for loops; however, none of these options seemed to fix the problem. Here is what I have tried so far:
public Object removeSomething1(){
    synchronized(this){ //Also tried only putting it around the remove block
        for(Object o : myList){
            myList.remove(o);
            return o;
        }
    }
    return null; // nothing to remove
}
//This is another variation which did not yield any improved result
public Object removeSomething2(){
    ListIterator<Object> iter = myList.listIterator();
    while(iter.hasNext()){
        Object s = iter.next();
        synchronized(this){
            iter.remove();
        }
        return s;
    }
    return null; // nothing to remove
}
//After some request here is also the simple code which adds to the list
public void addSomething(Object o){
    myList.add(o);
}
I execute 5 threads which call these methods in their run() method at an interval of 500 ms (using Thread.sleep()). If I increase the sleep timer in each thread and put a Thread.sleep() between each instantiation of threads, the problem seems to go away, but I want the threads to run at (nearly) the same time without interfering with the iterator, which is what causes the ConcurrentModificationException.
After a suggestion from the user Louis Wasserman, I moved the synchronized block to contain all of removeSomething2(). This makes sense, as it now only lets one thread do the whole iteration at a time. This is how the solution looks:
public Object removeSomething2(){
    synchronized(this){
        ListIterator<Object> iter = myList.listIterator();
        while(iter.hasNext()){
            Object s = iter.next();
            iter.remove();
            return s;
        }
    }
    return null; // nothing to remove
}
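One caveat worth illustrating: the lock only helps if every method that touches the list takes it, including the one that adds. Below is a minimal sketch under that assumption. SharedList and stressTest are hypothetical names for this example, not part of the original program.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class SharedList {
    private final List<Object> myList = new ArrayList<>();

    // Adding takes the same lock as removing, so the two can never overlap.
    public synchronized void addSomething(Object o) {
        myList.add(o);
    }

    // The whole create-iterator / next / remove sequence runs under the lock.
    public synchronized Object removeSomething() {
        Iterator<Object> iter = myList.iterator();
        if (!iter.hasNext()) {
            return null; // nothing to remove
        }
        Object s = iter.next();
        iter.remove();
        return s;
    }

    public synchronized int size() {
        return myList.size();
    }

    // Each worker adds perThread objects and then removes perThread objects.
    static int stressTest(int threads, int perThread) {
        SharedList shared = new SharedList();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread w = new Thread(() -> {
                for (int i = 0; i < perThread; i++) shared.addSomething(new Object());
                for (int i = 0; i < perThread; i++) shared.removeSomething();
            });
            workers.add(w);
            w.start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return shared.size();
    }

    public static void main(String[] args) {
        System.out.println(stressTest(5, 1000)); // prints 0
    }
}
```

Because each worker finishes its adds before it starts removing, the list is never empty when a removal runs, so the final size is 0 and no ConcurrentModificationException is thrown.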
I'm trying to understand the difference in behaviour of an ArrayList and a Vector. Does the following snippet in any way illustrate the difference in synchronization? The output for the ArrayList (f1) is unpredictable while the output for the Vector (f2) is predictable. I think it may just be luck that f2 has predictable output, because modifying f2 slightly to get the thread to sleep for even a millisecond (f3) causes an empty vector! What's causing that?
public class D implements Runnable {
    ArrayList<Integer> al;
    Vector<Integer> vl;
    public D(ArrayList al_, Vector vl_) {
        al = al_;
        vl = vl_;
    }
    public void run() {
        if (al.size() < 20)
            f1();
        else
            f2();
    } // 1
    public void f1() {
        if (al.size() == 0)
            al.add(0);
        else
            al.add(al.get(al.size() - 1) + 1);
    }
    public void f2() {
        if (vl.size() == 0)
            vl.add(0);
        else
            vl.add(vl.get(vl.size() - 1) + 1);
    }
    public void f3() {
        if (vl.size() == 0) {
            try {
                Thread.sleep(1);
                vl.add(0);
            } catch (InterruptedException e) {
                System.out.println(e.getMessage());
            }
        } else {
            vl.add(vl.get(vl.size() - 1) + 1);
        }
    }
    public static void main(String... args) {
        Vector<Integer> vl = new Vector<Integer>(20);
        ArrayList<Integer> al = new ArrayList<Integer>(20);
        for (int i = 1; i < 40; i++) {
            new Thread(new D(al, vl), Integer.toString(i)).start();
        }
    }
}
To answer the question: yes, Vector is synchronized, which means that concurrent actions on the data structure itself won't lead to unexpected behavior (e.g. NullPointerExceptions). Hence calls like size() are perfectly safe with a Vector in concurrent situations, but not with an ArrayList. (Note that if there are only read accesses, ArrayLists are safe too; problems start as soon as at least one thread writes to the data structure, e.g. add/remove.)
The problem is that this low-level synchronization is basically completely useless, and your code already demonstrates this.
if (al.size() == 0)
    al.add(0);
else
    al.add(al.get(al.size() - 1) + 1);
What you want here is to add a number to your data structure depending on the current size (i.e. if N threads execute this, in the end we'd want the list to contain the numbers [0..N)). Sadly, that does not work.
Assume that 2 threads execute this code sample concurrently on an empty list/vector. The following timeline is quite possible:
T1: size() # go to true branch of if
T2: size() # alas we again take the true branch.
T1: add(0)
T2: add(0) # ouch
Both execute size() and get back the value 0. They then both go into the true branch of the if and both add 0 to the data structure. That's not what you want.
Hence you'll have to synchronize in your business logic anyhow to make sure that size() and add() are executed atomically. Hence the synchronization of Vector is quite useless in almost any scenario. (Contrary to some claims, on modern JVMs the performance hit of an uncontended lock is completely negligible, but the Collections API is much nicer, so why not use it.)
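As a rough sketch of what "synchronize in your business logic" could look like for this snippet: hold one lock around the whole check-then-add. Here the lock is the Vector itself, which works because Vector's own methods are synchronized on the instance. AtomicAppendDemo, appendNext and run are names invented for this example.

```java
import java.util.Vector;

final class AtomicAppendDemo {
    // Append 0 to an empty vector, otherwise last + 1, as one atomic step.
    static void appendNext(Vector<Integer> vl) {
        synchronized (vl) { // same monitor that Vector's own methods use
            if (vl.isEmpty()) {
                vl.add(0);
            } else {
                vl.add(vl.get(vl.size() - 1) + 1);
            }
        }
    }

    // Start the given number of threads, each appending once, and wait for all of them.
    static Vector<Integer> run(int threads) {
        Vector<Integer> vl = new Vector<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> appendNext(vl));
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return vl;
    }

    public static void main(String[] args) {
        System.out.println(run(5)); // prints [0, 1, 2, 3, 4]
    }
}
```

The same block would be just as correct with an ArrayList guarded by a lock of your choosing, which is the point: the per-call locking inside Vector adds nothing once you need the compound operation to be atomic.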
In The Beginning (Java 1.0) there was the "synchronized vector".
Which entailed a potentially HUGE performance hit.
Hence the addition of "ArrayList" and friends in Java 1.2 onwards.
Your code illustrates the rationale for making vectors synchronized in the first place. But it's simply unnecessary most of the time, and better done in other ways most of the rest of the time.
IMHO...
PS:
An interesting link:
http://www.coderanch.com/t/523384/java/java/ArrayList-Vector-size-incrementation
Vectors are Thread safe. ArrayLists are not. That is why ArrayList is faster than the vector.
The below link has nice info about this.
http://www.javaworld.com/javaworld/javaqa/2001-06/03-qa-0622-vector.html
"I'm trying to understand the difference in behaviour of an ArrayList and a Vector"
Vector is synchronized while ArrayList is not. ArrayList is not thread-safe.
"Does the following snippet in any way illustrate the difference in synchronization?"
No difference since only Vector is synchronized
I have two threads modifying the same objects. The objects are custom, non-synchronized objects in an ArrayList (not a vector). I want to make these two threads work nicely together, since they are called at the same time.
Here is the only important method in thread 1.
public void doThread1Action() {
    //something...
    for(myObject x : MyArrayList){
        modify(x);
    }
}
Here is a method in thread 2:
public void doThread2Action() {
    //something...
    for(myObject x : MyArrayList){
        modifyAgain(x);
    }
}
At the moment, when testing, I occasionally get ConcurrentModificationExceptions. (I think it depends on how fast thread 1 finishes its iterations, before thread 2 tries to modify the objects.)
Am I right in thinking that by simply adding synchronized to the beginning of these two methods, the threads will work together in a synchronized way and not try to access the ArrayList at the same time? Or should I change the ArrayList to a Vector?
A ConcurrentModificationException does not stem from modifying objects in a collection but from adding / removing from a collection while an iterator is active.
The shared resource is the collection, and there must be a third method that adds to or removes from it. To get concurrency right, you must synchronize access to the collection in all methods that access it.
To avoid overly long synchronized blocks, a common pattern is to copy the collection in a synchronized block and then iterate over the copy. If you do it this way, be aware that the problem you were talking about in the first place (concurrent modification of your objects) is back, but this time you can lock on another resource.
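A minimal sketch of that copy-then-iterate pattern, assuming a private lock object and a wrapper class (SnapshotList, add and snapshot are hypothetical names for this illustration):

```java
import java.util.ArrayList;
import java.util.List;

final class SnapshotList<T> {
    private final Object lock = new Object();
    private final List<T> items = new ArrayList<>();

    void add(T item) {
        synchronized (lock) {
            items.add(item);
        }
    }

    // Copy under the lock; the caller iterates the copy without holding it.
    List<T> snapshot() {
        synchronized (lock) {
            return new ArrayList<>(items);
        }
    }

    public static void main(String[] args) {
        SnapshotList<String> list = new SnapshotList<>();
        list.add("a");
        list.add("b");
        for (String s : list.snapshot()) {
            list.add(s + "!"); // safe: the loop walks the copy, not the live list
        }
        System.out.println(list.snapshot()); // prints [a, b, a!, b!]
    }
}
```

The lock is held only for the duration of the copy, so long-running work per element does not block writers. The elements themselves are still shared between the copy and the live list, so mutating them needs its own synchronization.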
You do not need to synchronize access to the list as long as you don't modify it structurally, i.e. as long as you don't add or remove objects from the list. You also shouldn't see ConcurrentModificationExceptions, because these are only thrown when you structurally modify the list.
So, assuming that you only modify the objects contained in the list, but you do not add or remove or reorder objects on the list, it is possible to synchronize on the contained objects whenever you modify them, like so:
void modifyAgain(MyObject x) {
    synchronized(x) {
        // do the modification
    }
}
I would not use the synchronized modifier on the modifyAgain() method, as that would not allow two distinct objects in the list to be modified concurrently.
The modify() method in the other thread must of course be implemented in the same way as modifyAgain().
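Here is a small runnable sketch of that per-object locking, assuming the list is fully built before the threads start and is never structurally modified afterwards. The MyObject class with a single int field and the two increments are placeholders for whatever the real modifications are.

```java
import java.util.ArrayList;
import java.util.List;

final class PerObjectLockDemo {
    static final class MyObject {
        int value;
    }

    // Both methods lock on the element itself, never on the list.
    static void modify(MyObject x) {
        synchronized (x) {
            x.value += 1;
        }
    }

    static void modifyAgain(MyObject x) {
        synchronized (x) {
            x.value += 2;
        }
    }

    static List<MyObject> run(int size, int rounds) {
        List<MyObject> list = new ArrayList<>();
        for (int i = 0; i < size; i++) {
            list.add(new MyObject());
        }
        Thread t1 = new Thread(() -> {
            for (int r = 0; r < rounds; r++) {
                for (MyObject x : list) modify(x);
            }
        });
        Thread t2 = new Thread(() -> {
            for (int r = 0; r < rounds; r++) {
                for (MyObject x : list) modifyAgain(x);
            }
        });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(run(100, 1000).get(0).value); // prints 3000
    }
}
```

Both threads iterate the same ArrayList at the same time without a ConcurrentModificationException because neither adds nor removes elements, and no update is lost because each element's read-modify-write happens under that element's monitor.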
You need to synchronize access to the collection on the same lock, so just using the synchronized keyword on the methods (assuming they are in different classes) would lock on two different objects.
So here is an example of what you might need to do:
Object lock = new Object();
public void doThread1Action(){
    //something...
    synchronized(lock){
        for(myObject x : MyArrayList){
            modify(x);
        }
    }
}
public void doThread2Action(){
    //something...
    synchronized(lock){
        for(myObject x : MyArrayList){
            modifyAgain(x);
        }
    }
}
Also you could consider using a CopyOnWriteArrayList instead of Vector
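For a feel of how CopyOnWriteArrayList behaves, this short sketch adds to a list while iterating over it, once with an ArrayList and once with a CopyOnWriteArrayList. It runs in a single thread to keep the outcome deterministic; the helper name addWhileIterating is made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

final class CopyOnWriteDemo {
    // Returns true if adding during iteration throws ConcurrentModificationException.
    static boolean addWhileIterating(List<Integer> list) {
        try {
            for (Integer x : list) {
                list.add(x + 10);
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Integer> plain = new ArrayList<>(Arrays.asList(1, 2, 3));
        System.out.println(addWhileIterating(plain)); // prints true

        List<Integer> cow = new CopyOnWriteArrayList<>(Arrays.asList(1, 2, 3));
        System.out.println(addWhileIterating(cow)); // prints false
        System.out.println(cow); // prints [1, 2, 3, 11, 12, 13]
    }
}
```

The iterator of a CopyOnWriteArrayList walks the array as it was when the iterator was created, so it never throws, but every write copies the whole backing array. That trade-off tends to suit lists that are read far more often than they are written.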
I guess your problem is related to ConcurrentModificationException. Its Javadoc says:
/**
 * This exception may be thrown by methods that have detected concurrent
 * modification of an object when such modification is not permissible.
 */
In your case, the problem is that the list may be modified while an iterator over it is active. I guess the following implementation will solve your problem:
public void doThread1Action()
{
    synchronized(x) // x: for example, a lock object shared by both methods
    {
        //something...
        for(myObject x : MyArrayList)
        {
            modify(x);
        }
    }
}
and then:
public void doThread2Action()
{
    synchronized(x) // x: for example, a lock object shared by both methods
    {
        //something...
        for(myObject x : MyArrayList)
        {
            modifyAgain(x);
        }
    }
}
I would welcome corrections to this solution.
I am wondering if it is possible to avoid the lost update problem, where multiple threads are updating the same data, while avoiding using synchronized(x) { }.
I will be doing numerous adds and increments:
val++;
ary[x] += y;
ary[z]++;
I do not know how Java compiles these into bytecode, or whether a thread could be interrupted in the middle of the bytecode for one of these statements. In other words, are those statements thread safe?
Also, I know that the Vector class is synchronized, but I am not sure what that means. Will the following code be thread safe, in that the value at position i will not change between the vec.get(i) and vec.set(...)?
class myClass {
    Vector<Integer> vec = new Vector<>();
    public void someMethod() {
        for (int i=0; i < vec.size(); i++)
            vec.set(i, vec.get(i) + value);
    }
}
Thanks in advance.
For the purposes of threading, ++ and += are treated as two operations (four for double and long). So updates can clobber one another. And not just by one: a scheduler acting at the wrong moment could wipe out milliseconds of updates.
java.util.concurrent.atomic is your friend.
Your code can be made safe, assuming you don't mind each element updating individually and you don't change the size(!), as:
for (int i=0; i < vec.size(); i++) {
    synchronized (vec) {
        vec.set(i, vec.get(i) + value);
    }
}
If you want to add resizing to the Vector you'll need to move the synchronized statement outside of the for loop, and you might as well just use plain new ArrayList. There isn't actually a great deal of use for a synchronised list.
But you could use AtomicIntegerArray:
private final AtomicIntegerArray ints = new AtomicIntegerArray(KNOWN_SIZE);
[...]
int len = ints.length();
for (int i=0; i<len; ++i) {
ints.addAndGet(i, value);
}
}
That has the advantage of no locks(!) and no boxing. The implementation is quite fun too, and you would need to understand it to do more complex updates (random number generators, for instance).
vec.set() and vec.get() are thread safe in that they will not set and retrieve values in such a way as to lose sets and gets in other threads. It does not mean that your set and your get will happen without an interruption.
If you're really going to be writing code like in the examples above, you should probably lock on something. And synchronized(vec) { } is as good as any. You're asking here for two operations to happen in sync, not just one thread safe operation.
Even with java.util.concurrent.atomic, a separate get() followed by a set() is still two operations. You need the get-and-increment to happen in one operation, which is what single calls such as getAndIncrement() or addAndGet() provide.
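For the three statements in the question, a lock-free version might look like the sketch below, using AtomicInteger for val and AtomicIntegerArray for ary. The class name, the array size, the indices 0 and 1 and the increment of 5 are arbitrary choices for the example.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicIntegerArray;

final class AtomicCounters {
    // Runs the three updates from several threads and returns {val, ary[0], ary[1]}.
    static int[] run(int threads, int perThread) {
        AtomicInteger val = new AtomicInteger();
        AtomicIntegerArray ary = new AtomicIntegerArray(2);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    val.incrementAndGet();  // replaces val++
                    ary.addAndGet(0, 5);    // replaces ary[x] += y
                    ary.incrementAndGet(1); // replaces ary[z]++
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return new int[] { val.get(), ary.get(0), ary.get(1) };
    }

    public static void main(String[] args) {
        int[] r = run(4, 10000);
        System.out.println(r[0] + " " + r[1] + " " + r[2]); // prints 40000 200000 40000
    }
}
```

Each call is a single atomic read-modify-write, so no increments are lost. What this does not give you is atomicity across several calls: if two elements must change together, you are back to a lock.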