HashSet<String> thread safe

===update====
from comment
So I clearly read the doc and know it's not thread safe, and I wanted to run a small experiment to see how it will break. The doc says the result is non-deterministic. Does anyone know what could happen? If I want to prove it's not thread safe, how can I write sample code so that I can actually see that it's not thread safe? Have you actually tried it and seen a non-working example? Do you have sample code?
If I have three threads accessing the hashset of string.
One adding a new string
Second removing the string
Third removing all
Is the HashSet thread safe?
public void test()
{
Set<String> test = new HashSet<>();
Thread t0= new Thread(new Runnable() {
@Override
public void run() {
while (true) {
boolean c = test.contains("test");
System.out.println("checking " + c);
try {
Thread.sleep(50);
} catch (InterruptedException e) {
e.printStackTrace();
}
}
}
});
Thread t1 = new Thread(new Runnable() {
@Override
public void run() {
while (true) {
test.add("test");
System.out.println("adding");
try {
Thread.sleep(50);
} catch (InterruptedException e) {
e.printStackTrace();
}
}
}
});
Thread t2 = new Thread(new Runnable() {
@Override
public void run() {
while (true) {
if (!test.isEmpty())
{
test.removeAll(test);
}
System.out.println("removing");
try {
Thread.sleep(500);
} catch (InterruptedException e) {
e.printStackTrace();
}
}
}
});
t0.start();
t1.start();
t2.start();
while(true) {
}
}
I have this test code and ran it, and it seems to work. No exceptions were thrown. I was a little confused because HashSet is not thread-safe.
What am I missing?

From comment:
So I clearly read the doc and know it's not thread safe, and I wanted to run a small experiment to see how it will break. The doc says the result is non-deterministic. Does anyone know what could happen? If I want to prove it's not thread safe, how can I write sample code so that I can actually see that it's not thread safe? Have you actually tried it and seen a non-working example? Do you have sample code?
The problem is that updating the Set may not be an atomic operation, especially not when the internal hash table needs to be re-sized.
If two threads are updating at the same time, you may get the simple result that one thread overrides the change by the other thread, so you lose a change. More seriously, the conflict may corrupt the internal structure of the Set.
To show this, here is a small program that causes heavy contention while adding values. All values added are distinct, so they should all be added, but you will see that the size of the Set is incorrect when the program is done, proving that some added values got lost.
final int THREAD_COUNT = 10;
final int NUMS_TO_ADD = 100000;
Set<Integer> set = new HashSet<>();
Thread[] threads = new Thread[THREAD_COUNT];
for (int i = 0; i < THREAD_COUNT; i++) {
final int threadNo = i;
threads[i] = new Thread() {
@Override public void run() {
for (int j = 0; j < NUMS_TO_ADD; j++)
set.add(j * THREAD_COUNT + threadNo); // all distinct values
}
};
threads[i].start();
}
for (int i = 0; i < threads.length; i++)
threads[i].join();
System.out.println("Found " + set.size() + " values, expected " + THREAD_COUNT * NUMS_TO_ADD);
Each time you run it, you will get a different result, e.g.
Found 898070 values, expected 1000000
Found 825773 values, expected 1000000
Found 731886 values, expected 1000000
Exception in thread "Thread-7" java.lang.ClassCastException: java.base/java.util.HashMap$Node cannot be cast to java.base/java.util.HashMap$TreeNode
at java.base/java.util.HashMap$TreeNode.moveRootToFront(HashMap.java:1883)
at java.base/java.util.HashMap$TreeNode.putTreeVal(HashMap.java:2063)
at java.base/java.util.HashMap.putVal(HashMap.java:638)
at java.base/java.util.HashMap.put(HashMap.java:612)
at java.base/java.util.HashSet.add(HashSet.java:220)
at Test$1.run(Test.java:16)
Or the program simply hangs!

Thread Safe
Thread-unsafe does not mean that you cannot use the class from multiple threads, or that the program will throw exceptions. It means you cannot always get the result you expect when the program runs with multiple threads. See this for more.
In computer programming, thread-safe describes a program portion or
routine that can be called from multiple programming threads without
unwanted interaction between the threads.
Also, you cannot conclude that an object is thread safe just because an experiment gave the expected results, because the results may vary between environments. You should use the synchronization mechanisms provided by the JDK.
HashSet
HashSet is not thread safe, this means:
If you write an object into it, this object may not be visible to
other threads.
If you read the set from different threads at the same time, they may get different results.
If you call add first, then call removeAll, the objects in this
set may not be removed.
......
User Andreas's example is pretty clear. Since HashSet is backed by a HashMap's key set, you can refer to How to prove that HashMap in java is not thread-safe.
Solution
The JDK provides a thread-safe wrapper for sets; every operation on the wrapped set has to acquire an internal monitor lock first.
Set s = Collections.synchronizedSet(new HashSet(...));
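A minimal sketch of how the wrapper is typically used; the class and method names below are made up for illustration. Single calls such as add and remove are atomic on the wrapper, but iteration is a compound operation, and the Collections.synchronizedSet Javadoc requires you to synchronize on the returned set manually while iterating.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public class SynchronizedSetSketch {

    // Hypothetical helper: several threads add distinct values to the given set.
    static int addFromThreads(Set<Integer> set, int threadCount, int perThread)
            throws InterruptedException {
        Thread[] threads = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++) {
            final int threadNo = t;
            threads[t] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    set.add(j * threadCount + threadNo); // all distinct values
                }
            });
            threads[t].start();
        }
        for (Thread thread : threads) {
            thread.join();
        }
        return set.size();
    }

    // Iteration is a compound operation, so hold the set's own lock for the whole loop.
    static long sum(Set<Integer> set) {
        long total = 0;
        synchronized (set) {
            for (int value : set) {
                total += value;
            }
        }
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        Set<Integer> safe = Collections.synchronizedSet(new HashSet<Integer>());
        System.out.println("size = " + addFromThreads(safe, 4, 10_000)); // no additions are lost
        System.out.println("sum  = " + sum(safe));
    }
}
```

Passing a plain new HashSet<>() to the same helper is the unsafe variant: the reported size may then come out too small, or a thread may fail with an exception.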

Related

Create a HashSet with duplicate values

Is there a possibility that HashSet could have duplicate values in case of multiple threads adding items to it?
I'm not looking from modifying the equals or hashcode methods perspective but simply from multithreaded environment.

Is there a possibility that HashSet could have duplicate values in case of multiple threads adding items to it?
HashSet is not a thread-safe class. If you update a HashSet from multiple threads without proper synchronization, then the behavior is unspecified, and difficult to predict. (And Java version dependent, given that the implementation of HashSet has changed a number of times over the lifetime of Java SE.)
The unspecified behavior could include duplicates appearing in the set as observed via the set's iterator.
If you want to share a (mutable) set between multiple threads, either use a concurrent set (for example ConcurrentHashMap.newKeySet(), since the JDK has no ConcurrentHashSet class), or a Collections.synchronizedSet wrapper, or an explicit Lock or mutex to synchronize operations.
(The different alternatives all have caveats associated with them. We can't recommend a specific alternative based on the limited information you have given.)
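As a rough sketch of what these alternatives look like in code: the concurrent set comes from ConcurrentHashMap.newKeySet(), the wrapper from Collections.synchronizedSet, and LockedSet is a hypothetical hand-written wrapper around a ReentrantLock (it is not a JDK class). Every thread adds the same 1000 values, so each set should end up with exactly 1000 elements.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

public class SharedSetOptions {

    // Option 3: a plain HashSet guarded by an explicit lock (illustrative wrapper).
    static class LockedSet<T> {
        private final Set<T> delegate = new HashSet<>();
        private final ReentrantLock lock = new ReentrantLock();

        boolean add(T value) {
            lock.lock();
            try {
                return delegate.add(value);
            } finally {
                lock.unlock();
            }
        }

        int size() {
            lock.lock();
            try {
                return delegate.size();
            } finally {
                lock.unlock();
            }
        }
    }

    // Runs the same task on several threads and waits for all of them.
    static void runInThreads(int threadCount, Runnable task) throws InterruptedException {
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(task);
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Set<Integer> concurrent = ConcurrentHashMap.newKeySet();                      // option 1
        Set<Integer> wrapped = Collections.synchronizedSet(new HashSet<Integer>());   // option 2
        LockedSet<Integer> locked = new LockedSet<>();                                // option 3

        runInThreads(8, () -> {
            for (int i = 0; i < 1000; i++) {
                concurrent.add(i);
                wrapped.add(i);
                locked.add(i);
            }
        });
        // prints "1000 1000 1000"
        System.out.println(concurrent.size() + " " + wrapped.size() + " " + locked.size());
    }
}
```

Which option fits depends on the access pattern: the concurrent set avoids a single global lock, while the wrapper and the explicit lock make it easier to guard compound operations.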

We can discuss this answer part by part:
HashSet does not allow duplicate elements, which means you cannot store duplicate values in a HashSet. Its relative HashMap does not allow duplicate keys, but it does allow duplicate values.
HashSet in Java is not thread-safe as it is not synchronized by default. If you are using HashSet in a multi-threaded environment where it is accessed by multiple threads concurrently and structurally modified too by even a single thread then it must be synchronized externally. A structural modification is defined as any operation that adds or deletes one or more elements, or explicitly resizes the backing array; merely setting the value of an element is not a structural modification.
So, when you update HashSet from multiple threads without external sync, its behavior will be unpredictable.
To avoid this unpredictable behavior, we can synchronize HashSet by using Collections.synchronizedSet() method.
Example:
First, we’ll see an example what happens if HashSet is used in a multi-threaded environment without synchronizing it.
In the Java code, four threads are created, and each of these threads adds 5 elements to the Set. After all the threads are done, the Set size should be 20.
public class SetSynchro implements Runnable{
private Set<String> numSet;
public SetSynchro(Set<String> numSet){
this.numSet = numSet;
}
public static void main(String[] args) {
Set<String> numSet = new HashSet<String>();
/// 4 threads
Thread t1 = new Thread(new SetSynchro(numSet));
Thread t2 = new Thread(new SetSynchro(numSet));
Thread t3 = new Thread(new SetSynchro(numSet));
Thread t4 = new Thread(new SetSynchro(numSet));
t1.start();
t2.start();
t3.start();
t4.start();
try {
t1.join();
t2.join();
t3.join();
t4.join();
} catch (InterruptedException e) {
e.printStackTrace();
}
System.out.println("Size of Set is " + numSet.size());
}
@Override
public void run() {
System.out.println("in run method" + Thread.currentThread().getName());
String str = Thread.currentThread().getName();
for(int i = 0; i < 5; i++){
// adding thread name to make element unique
numSet.add(i + str);
try {
// delay to verify thread interference
Thread.sleep(500);
} catch (InterruptedException e) {
e.printStackTrace();
}
}
}
}
Output:
in run methodThread-2
in run methodThread-0
in run methodThread-3
in run methodThread-1
Size of Set is 19
In one run the size was 19, in another 18, and sometimes even 20, so you can see that thread interference makes the behavior unpredictable. Now we'll synchronize the HashSet, using the same example.
public class SetSynchro implements Runnable{
private Set<String> numSet;
public SetSynchro(Set<String> numSet){
this.numSet = numSet;
}
public static void main(String[] args) {
// Synchronized Set
Set<String> numSet = Collections.synchronizedSet(new HashSet<String>());
/// 4 threads
Thread t1 = new Thread(new SetSynchro(numSet));
Thread t2 = new Thread(new SetSynchro(numSet));
Thread t3 = new Thread(new SetSynchro(numSet));
Thread t4 = new Thread(new SetSynchro(numSet));
t1.start();
t2.start();
t3.start();
t4.start();
try {
t1.join();
t2.join();
t3.join();
t4.join();
} catch (InterruptedException e) {
e.printStackTrace();
}
System.out.println("Size of Set is " + numSet.size());
}
@Override
public void run() {
System.out.println("in run method" + Thread.currentThread().getName());
String str = Thread.currentThread().getName();
for(int i = 0; i < 5; i++){
// adding thread name to make element unique
numSet.add(i + str);
try {
// delay to verify thread interference
Thread.sleep(500);
} catch (InterruptedException e) {
e.printStackTrace();
}
}
}
}
Output:
in run methodThread-3
in run methodThread-2
in run methodThread-1
in run methodThread-0
Size of Set is 20
Now the size of the HashSet is 20 every time.
For more details, this link is useful. The code block is also taken from there.

From Javadoc
Note that this implementation is not synchronized. If multiple threads
access a hash set concurrently, and at least one of the threads
modifies the set, it must be synchronized externally. This is
typically accomplished by synchronizing on some object that naturally
encapsulates the set. If no such object exists, the set should be
"wrapped" using the Collections.synchronizedSet method. This is best
done at creation time, to prevent accidental unsynchronized access to
the set:
Set s = Collections.synchronizedSet(new HashSet(...));
Basically, the way that I interpret this, is that under the right (or wrong) conditions, there might be a slight possibility that this could happen. HOWEVER, since the assumption is that you know that multiple threads will access this set, you need to synchronize access externally (since HashSet is not thread-safe). OR, in the absence of such external mechanism, you will need to wrap this set as shown above.
If you really want to find out, create an application with a lot of threads that attempt to set the same value. After insertion, print out some message to the console if the size of the set is ever greater than 1. That should tell you the answer. Maybe the chances of this happening are very small. But the class documentation tells you that if you expect to use in multithreaded process, you should synchronize it.
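The experiment described above could be sketched roughly as follows. The class and method names are made up for illustration. Note that size() returning more than 1 on the plain HashSet would show that the set's internal bookkeeping got out of step, which is not necessarily the same as an iterator returning the value twice; the outcome for the unsynchronized case is unspecified, so no particular result is promised.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

public class SameValueExperiment {

    // Many threads add the same value; returns the largest size() any thread observed.
    static int maxObservedSize(Set<String> set, int threadCount, int iterations)
            throws InterruptedException {
        AtomicInteger max = new AtomicInteger();
        Thread[] threads = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++) {
            threads[t] = new Thread(() -> {
                for (int i = 0; i < iterations; i++) {
                    set.add("same");
                    max.accumulateAndGet(set.size(), Math::max);
                }
            });
            threads[t].start();
        }
        for (Thread thread : threads) {
            thread.join();
        }
        return max.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // Unsynchronized: the result is unspecified and may differ from run to run.
        System.out.println("HashSet:         "
                + maxObservedSize(new HashSet<String>(), 16, 100_000));
        // Synchronized wrapper: the observed size can never exceed 1.
        System.out.println("synchronizedSet: "
                + maxObservedSize(Collections.synchronizedSet(new HashSet<String>()), 16, 100_000));
    }
}
```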

Real world example of Memory Consistency Errors in multi-threading?

In the Java multi-threading tutorial, there is an example of Memory Consistency Errors, but I cannot reproduce it. Is there any other way to simulate Memory Consistency Errors?
The example provided in the tutorial:
Suppose a simple int field is defined and initialized:
int counter = 0;
The counter field is shared between two threads, A and B. Suppose thread A increments counter:
counter++;
Then, shortly afterwards, thread B prints out counter:
System.out.println(counter);
If the two statements had been executed in the same thread, it would be safe to assume that the value printed out would be "1". But if the two statements are executed in separate threads, the value printed out might well be "0", because there's no guarantee that thread A's change to counter will be visible to thread B — unless the programmer has established a happens-before relationship between these two statements.

I answered a question a while ago about a bug in Java 5. Why doesn't volatile in java 5+ ensure visibility from another thread?
Given this piece of code:
public class Test {
volatile static private int a;
static private int b;
public static void main(String [] args) throws Exception {
for (int i = 0; i < 100; i++) {
new Thread() {
@Override
public void run() {
int tt = b; // makes the jvm cache the value of b
while (a==0) {
}
if (b == 0) {
System.out.println("error");
}
}
}.start();
}
b = 1;
a = 1;
}
}
The volatile store of a happens after the normal store of b. So when the thread runs and sees a != 0, because of the rules defined in the JMM, we must see b == 1.
The bug in the JRE allowed the thread to make it to the error line and was subsequently resolved. This definitely would fail if you don't have a defined as volatile.
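The same guarantee can be shown in a smaller form. This is a sketch with made-up names (VolatilePublish, data, ready), not the code from the bug report: a plain write followed by a volatile write is visible to any thread that has seen the volatile write.

```java
public class VolatilePublish {
    static int data;               // plain field
    static volatile boolean ready; // volatile flag that publishes the write to data

    // Returns the value of data seen by a reader thread once it observes ready == true.
    static int readAfterPublish() throws InterruptedException {
        data = 0;
        ready = false;
        int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            while (!ready) {
                // spin: each iteration performs a volatile read of ready
            }
            seen[0] = data; // happens-after the volatile write below
        });
        reader.start();
        data = 42;     // ordinary write ...
        ready = true;  // ... published by the volatile write that follows it
        reader.join(); // join() makes seen[0] visible to this thread
        return seen[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(readAfterPublish()); // prints 42
    }
}
```

If ready were not volatile, the reader would be allowed to spin forever or to read a stale data value, although on many machines it would still appear to work.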

This might reproduce the problem, at least on my computer, I can reproduce it after some loops.
Suppose you have a Holder class:
class Holder {
boolean flag = false;
long modifyTime = Long.MAX_VALUE;
}
Let thread_A set flag to true and save the time in modifyTime.
Let another thread, say thread_B, read the Holder's flag. If thread_B still gets false at a time later than modifyTime, then we can say we have reproduced the problem.
Example code
class Holder {
boolean flag = false;
long modifyTime = Long.MAX_VALUE;
}
public class App {
public static void main(String[] args) {
while (!test());
}
private static boolean test() {
final Holder holder = new Holder();
new Thread(new Runnable() {
@Override
public void run() {
try {
Thread.sleep(10);
holder.flag = true;
holder.modifyTime = System.currentTimeMillis();
} catch (Exception e) {
e.printStackTrace();
}
}
}).start();
long lastCheckStartTime = 0L;
long lastCheckFailTime = 0L;
while (true) {
lastCheckStartTime = System.currentTimeMillis();
if (holder.flag) {
break;
} else {
lastCheckFailTime = System.currentTimeMillis();
System.out.println(lastCheckFailTime);
}
}
if (lastCheckFailTime > holder.modifyTime
&& lastCheckStartTime > holder.modifyTime) {
System.out.println("last check fail time " + lastCheckFailTime);
System.out.println("modify time " + holder.modifyTime);
return true;
} else {
return false;
}
}
}
Result
last check time 1565285999497
modify time 1565285999494
This means thread_B got false from the Holder's flag field at time 1565285999497, even though thread_A had set it to true at time 1565285999494 (3 milliseconds earlier).

The example used is too weak to demonstrate the memory consistency issue. Making it work would require brittle reasoning and complicated code, and even then you may not be able to see the result. Multi-threading issues occur due to unlucky timing. To increase the chances of observing the issue, we need to increase the chances of unlucky timing.
Following program achieves it.
public class ConsistencyIssue {
static int counter = 0;
public static void main(String[] args) throws InterruptedException {
Thread thread1 = new Thread(new Increment(), "Thread-1");
Thread thread2 = new Thread(new Increment(), "Thread-2");
thread1.start();
thread2.start();
thread1.join();
thread2.join();
System.out.println(counter);
}
private static class Increment implements Runnable{
@Override
public void run() {
for(int i = 1; i <= 10000; i++)
counter++;
}
}
}
Execution 1 output: 10963,
Execution 2 output: 14552
The final count should have been 20000, but it is less than that. The reason is that counter++ is a multi-step operation:
1. read counter
2. increment the value
3. store it back
Two threads may both read, say, 1 at the same time, each increment it to 2, and both write back 2, so one increment is lost. In a serial execution the value would have gone 1 -> 2 -> 3.
We need a way to make all 3 steps atomic, i.e. executed by only one thread at a time.
Solution 1: Synchronized
Surround the increment with a synchronized block. Since counter is a static variable, you need to use class-level synchronization.
@Override
public void run() {
for (int i = 1; i <= 10000; i++)
synchronized (ConsistencyIssue.class) {
counter++;
}
}
Now it outputs: 20000
Solution 2: AtomicInteger
public class ConsistencyIssue {
static AtomicInteger counter = new AtomicInteger(0);
public static void main(String[] args) throws InterruptedException {
Thread thread1 = new Thread(new Increment(), "Thread-1");
Thread thread2 = new Thread(new Increment(), "Thread-2");
thread1.start();
thread2.start();
thread1.join();
thread2.join();
System.out.println(counter.get());
}
private static class Increment implements Runnable {
@Override
public void run() {
for (int i = 1; i <= 10000; i++)
counter.incrementAndGet();
}
}
}
We could also do this with semaphores or explicit locking, but for this simple code AtomicInteger is enough.
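For comparison, the explicit-locking variant could look roughly like this. It is a sketch using java.util.concurrent.locks.ReentrantLock; the class name LockedCounter and the count helper are made up for illustration.

```java
import java.util.concurrent.locks.ReentrantLock;

public class LockedCounter {
    private static final ReentrantLock LOCK = new ReentrantLock();
    private static int counter;

    // Runs threadCount threads that each increment the shared counter under LOCK.
    static int count(int threadCount, int incrementsPerThread) throws InterruptedException {
        counter = 0;
        Thread[] threads = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++) {
            threads[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    LOCK.lock();
                    try {
                        counter++; // read, increment, store: only one thread at a time
                    } finally {
                        LOCK.unlock(); // always release, even if the body throws
                    }
                }
            });
            threads[t].start();
        }
        for (Thread thread : threads) {
            thread.join();
        }
        return counter;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(count(2, 10_000)); // prints 20000
    }
}
```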

Sometimes when I try to reproduce some real concurrency problems, I use the debugger.
Make a breakpoint on the print and a breakpoint on the increment and run the whole thing.
Releasing the breakpoints in different sequences gives different results.
Maybe too simple, but it worked for me.

Please have another look at how the example is introduced in your source.
The key to avoiding memory consistency errors is understanding the happens-before relationship. This relationship is simply a guarantee that memory writes by one specific statement are visible to another specific statement. To see this, consider the following example.
This example illustrates the fact that multi-threading is not deterministic, in the sense that you get no guarantee about the order in which operations of different threads will be executed, which might result in different observations across several runs. But it does not illustrate a memory consistency error!
To understand what a memory consistency error is, you need to first get an insight about memory consistency. The simplest model of memory consistency has been introduced by Lamport in 1979. Here is the original definition.
The result of any execution is the same as if the operations of all the processes were executed in some sequential order and the operations of each individual process appear in this sequence in the order specified by its program
Now, consider an example multi-threaded program; please have a look at this image from a more recent research paper about sequential consistency. It illustrates what a real memory consistency error might look like.
To finally answer your question, please note the following points:
A memory consistency error always depends on the underlying memory model (a particular programming language may allow more behaviours for optimization purposes). What the best memory model is remains an open research question.
The example above shows a sequential consistency violation, but there is no guarantee that you can observe it with your favorite programming language, for two reasons: it depends on the exact memory model of the programming language, and due to nondeterminism, you have no way to force a particular incorrect execution.
Memory models are a wide topic. To get more information, you can for example have a look at Torsten Hoefler and Markus Püschel course at ETH Zürich, from which I understood most of these concepts.
Sources
Leslie Lamport. How to Make a Multiprocessor Computer That Correctly Executes Multiprocessor Programs, 1979
Wei-Yu Chen, Arvind Krishnamurthy, Katherine Yelick, Polynomial-Time Algorithms for Enforcing Sequential Consistency in SPMD Programs with Arrays, 2003
Design of Parallel and High-Performance Computing course, ETH Zürich

Why is this multithreaded counter producing the right result?

I'm learning multithreaded counter and I'm wondering why no matter how many times I ran the code it produces the right result.
public class MainClass {
public static void main(String[] args) {
Counter counter = new Counter();
for (int i = 0; i < 3; i++) {
CounterThread thread = new CounterThread(counter);
thread.start();
}
}
}
public class CounterThread extends Thread {
private Counter counter;
public CounterThread(Counter counter) {
this.counter = counter;
}
public void run() {
for (int i = 0; i < 10; i++) {
this.counter.add();
}
this.counter.print();
}
}
public class Counter {
private int count = 0;
public void add() {
this.count = this.count + 1;
}
public void print() {
System.out.println(this.count);
}
}
And this is the result
10
20
30
Not sure if this is just a fluke or is this expected? I thought the result is going to be
10
10
10

Try increasing the loop count from 10 to 10000 and you'll likely see some differences in the output.
The most logical explanation is that with only 10 additions, each thread finishes so quickly that it is done before the next thread even starts, so the next thread adds on top of the previous result.

I'm learning multithreaded counter and I'm wondering why no matter how many times I ran the code it produces the right result.
<ttdr> Check out @manouti's answer. </ttdr>
Even though you are sharing the same Counter object, which is unsynchronized, there are a couple of things that are causing your 3 threads to run (or look like they are running) serially with data synchronization. I had to work hard on my 8 proc Intel Linux box to get it to show any interleaving.
When threads start and when they finish, there are memory barriers that are crossed. According to the Java Memory Model, the guarantee is that the thread that does the thread.join() will see the results of the thread published to it but I suspect a central memory flush happens when the thread finishes. This means that if the threads run serially (and with such a small loop it's hard for them not to) they will act as if there is no concurrency because they will see each other's changes to the Counter.
Putting a Thread.sleep(100); at the front of the thread run() method causes it to not run serially. It also hopefully causes the threads to cache the Counter and not see the results published by other threads that have already finished. Still needed help though.
Starting the threads in a loop after they all have been instantiated helps concurrency.
Another thing that causes synchronization is:
System.out.println(this.count);
System.out is a PrintStream, which is a synchronized class. Every time a thread calls println(...) it is publishing its results to central memory. If you instead recorded the value and then displayed it later, it might show better interleaving.
I really wonder if some Java compiler inlining of the Counter class at some point is causing part of the artificial synchronization. For example, I'm really surprised that a Thread.sleep(1000) at the front and end of the thread.run() method doesn't show 10,10,10.
It should be noted that on a non-intel architecture, with different memory and/or thread models, this might be easier to reproduce.
Oh, as commentary and apropos of nothing, typically it is recommended to implement Runnable instead of extending Thread.
So the following is my tweaks to your test program.
public class CounterThread extends Thread {
private Counter counter;
int result;
...
public void run() {
try {
Thread.sleep(100);
} catch (InterruptedException e1) {
Thread.currentThread().interrupt(); // good pattern
return;
}
for (int i = 0; i < 10; i++) {
counter.add();
}
result = counter.count;
// no print here
}
}
Then your main could do something like:
Counter counter = new Counter();
List<CounterThread> counterThreads = new ArrayList<>();
for (int i = 0; i < 3; i++) {
counterThreads.add(new CounterThread(counter));
}
// start in a loop after constructing them all which improves the overlap chances
for (CounterThread counterThread : counterThreads) {
counterThread.start();
}
// wait for them to finish
for (CounterThread counterThread : counterThreads) {
counterThread.join();
}
// print the results
for (CounterThread counterThread : counterThreads) {
System.out.println(counterThread.result);
}
Even with this, I never see 10,10,10 output on my box and I often see 10,20,30. Closest I get is 12,12,12.
That shows you how hard it is to properly test a threaded program. Believe me, if this code were in production and you were relying on the "free" synchronization, that is exactly when it would fail you. ;-)
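One more way to improve the overlap, sketched here as an assumption rather than something tested on the box mentioned above, is to hold all threads at a start gate and release them together with a CountDownLatch. The names StartGateSketch, plainCount and atomicCount are made up; the atomic counter is only there as a reference value.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

public class StartGateSketch {
    static int plainCount;                                        // unsynchronized, may lose updates
    static final AtomicInteger atomicCount = new AtomicInteger(); // reference value

    static void run(int threadCount, int addsPerThread) throws InterruptedException {
        plainCount = 0;
        atomicCount.set(0);
        CountDownLatch startGate = new CountDownLatch(1);
        Thread[] threads = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++) {
            threads[t] = new Thread(() -> {
                try {
                    startGate.await(); // park until every thread has been started
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int i = 0; i < addsPerThread; i++) {
                    plainCount = plainCount + 1;   // racy read-modify-write
                    atomicCount.incrementAndGet(); // atomic, for comparison
                }
            });
            threads[t].start();
        }
        startGate.countDown(); // release all threads at once
        for (Thread thread : threads) {
            thread.join();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        run(3, 100_000);
        System.out.println("plain:  " + plainCount);        // often less than 300000
        System.out.println("atomic: " + atomicCount.get()); // always 300000
    }
}
```

With a larger loop count and a common start, lost updates on the plain field tend to show up far more readily than with 10 additions per thread, though how often still depends on the machine.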

Testing the difference between StringBuilder and StringBuffer

I just want to see the difference between them visually, so below is the code. But it always fails. Can someone please help me with this? I have seen questions on SO too, but none of them have shown the difference programmatically.
public class BBDifferencetest {
protected static int testnum = 0;
public static void testStringBuilder() {
final StringBuilder sb = new StringBuilder();
Thread t1 = new Thread() {
@Override
public void run() {
for (int x = 0; x < 100; x++) {
testnum++;
sb.append(testnum);
sb.append(" ");
}
}
};
Thread t2 = new Thread() {
public void run() {
for (int x = 0; x < 100; x++) {
testnum++;
sb.append(testnum);
sb.append(" ");
}
}
};
t1.start();
t2.start();
try {
t1.join();
t2.join();
} catch (InterruptedException e) {
e.printStackTrace();
}
System.out.println("Result is: " + sb.toString());
}
public static void main(String args[]) {
testStringBuilder();
}
}
When I execute this, I sometimes get the output in a random order, which is what I expected from StringBuilder. But when I replace StringBuilder with StringBuffer and test again, it still gives me unexpected output (rather than the sequential numbers from 1 to 200). So can someone help me see the difference visually?
P.S : If anyone has your code which shows the difference, I would be very glad to accept it as an answer. Because I am not sure whether I can achieve the difference with my code even though it is modified.

(rather than sequential which from 1 to 200)
Each thread is performing a read, modify, write operation on testnum. That in itself is not thread-safe.
Then each thread is fetching the value of testnum again in order to append it. The other thread may well have interrupted by then and incremented the value again.
If you change your code to:
AtomicInteger counter = new AtomicInteger();
...
sb.append(counter.getAndIncrement());
then you're more likely to see what you expect.
To make it clearer, change your loops to only call append once, like this:
for (int x = 0; x < 100; x++) {
sb.append(counter.incrementAndGet() + " ");
}
When I do that, for StringBuffer I always get "perfect" output. For StringBuilder I sometimes get output like this:
97 98 100 102 104
Here the two threads have both been appending at the same time, and the contents have been screwed up.
EDIT: Here's a somewhat shorter complete example:
import java.util.concurrent.atomic.AtomicInteger;
public class Test {
public static void main(String[] args) throws InterruptedException {
final AtomicInteger counter = new AtomicInteger();
// Change to StringBuffer to see "working" output
final StringBuilder sb = new StringBuilder();
Runnable runnable = new Runnable() {
@Override
public void run() {
for (int x = 0; x < 100; x++) {
sb.append(counter.incrementAndGet() + " ");
}
}
};
Thread t1 = new Thread(runnable);
Thread t2 = new Thread(runnable);
t1.start();
t2.start();
t1.join();
t2.join();
System.out.println(sb);
}
}

StringBuffer is synchronized at the method level. It means that no thread can enter one of its methods while another thread is already inside one of its methods. But it does not guarantee that one thread will be blocked from using the StringBuffer for as long as the other thread uses it, so the two threads will still compete for access to the methods, and you may randomly get an unordered result.
The only way to really lock an access to the StringBuffer is to put the code that access it in a synchronized block:
public void run() {
synchronized(sb) {
for (int x = 0; x < 100; x++) {
testnum++;
sb.append(testnum);
sb.append(" ");
}
}
}
If you don't do that, then Thread 1 can go into sb.append(testnum) and Thread 2 will wait at the entry of it, and when Thread 1 goes out, Thread 2 can potentially go inside and starts to write before Thread 1 enters sb.append(" "). So you would see:
12 13 1415 16 ....
The thing is, locking like this will make things work for StringBuilder also. That's why one could say that the synchronization mechanism on StringBuffer is quite useless, and therefore why it's not used anymore (the same thing for Vector).
So, doing this way can not show you the difference between StringBuilder and StringBuffer. The suggestion in Jon Skeet answer is better.

+1 for what Cyrille said. I imagine that it is only the nature of arrays of inherently atomic types (primitives <= 32 bit) that saves you from getting a ConcurrentModificationException with the StringBuilder, as you would with, say, appending to a List<Integer>.
Basically, you have two threads, each 100 individual operations. The two compete for lock of the object before each append, and release it afterwards, 100 times each. The thread that wins on each iteration will be randomized by the (extremely) small amount of time taken to increment the loop counter and testnum.
More exemplary of the difference from your example is not necessarily the ordering, but ensuring that all insertions are actually accounted for when using a StringBuilder. It has no internal synchronization, so it's entirely possible that some will get munged or overwritten in the process. The StringBuffer will handle this with internal synchronization guaranteeing that all inserts make it in properly, but you'll need external synchronization such as Cyrille's example above to hold a lock for the entire iteration sequence of each thread to safely use a StringBuilder.
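Counting insertions instead of looking at their order can be sketched like this. The class and method names are made up for illustration. Two threads each append a single character a fixed number of times, so the expected length is known in advance: a StringBuffer always reaches it, while a StringBuilder may come up short or fail inside append.

```java
import java.io.IOException;

public class AppendCountSketch {

    // Two threads each append one character perThread times; returns the final length.
    static <T extends Appendable & CharSequence> int fill(T target, int perThread)
            throws InterruptedException {
        Runnable appender = () -> {
            for (int i = 0; i < perThread; i++) {
                try {
                    target.append('x');
                } catch (IOException | RuntimeException e) {
                    // An unsynchronized StringBuilder can fail in the middle of an append;
                    // treat that append as lost and keep going.
                }
            }
        };
        Thread t1 = new Thread(appender);
        Thread t2 = new Thread(appender);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        return target.length();
    }

    public static void main(String[] args) throws InterruptedException {
        int perThread = 100_000;
        System.out.println("expected:      " + 2 * perThread);
        System.out.println("StringBuffer:  " + fill(new StringBuffer(), perThread));  // always 200000
        System.out.println("StringBuilder: " + fill(new StringBuilder(), perThread)); // unspecified
    }
}
```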

Assigning an object to a field defined outside a synchronized block - is it thread safe?

Is there anything wrong with the thread safety of this java code? Threads 1-10 add numbers via sample.add(), and Threads 11-20 call removeAndDouble() and print the results to stdout. I recall from the back of my mind that someone said that assigning item in same way as I've got in removeAndDouble() using it outside of the synchronized block may not be thread safe. That the compiler may optimize the instructions away so they occur out of sequence. Is that the case here? Is my removeAndDouble() method unsafe?
Is there anything else wrong from a concurrency perspective with this code? I am trying to get a better understanding of concurrency and the memory model with java (1.6 upwards).
import java.util.*;
import java.util.concurrent.*;
public class Sample {
private final List<Integer> list = new ArrayList<Integer>();
public void add(Integer o) {
synchronized (list) {
list.add(o);
list.notify();
}
}
public void waitUntilEmpty() {
synchronized (list) {
while (!list.isEmpty()) {
try {
list.wait(10000);
} catch (InterruptedException ex) { }
}
}
}
public void waitUntilNotEmpty() {
synchronized (list) {
while (list.isEmpty()) {
try {
list.wait(10000);
} catch (InterruptedException ex) { }
}
}
}
public Integer removeAndDouble() {
// item declared outside synchronized block
Integer item;
synchronized (list) {
waitUntilNotEmpty();
item = list.remove(0);
}
// Would this ever be anything but that from list.remove(0)?
return Integer.valueOf(item.intValue() * 2);
}
public static void main(String[] args) {
final Sample sample = new Sample();
for (int i = 0; i < 10; i++) {
Thread t = new Thread() {
public void run() {
while (true) {
System.out.println(getName()+" Found: " + sample.removeAndDouble());
}
}
};
t.setName("Consumer-"+i);
t.setDaemon(true);
t.start();
}
final ExecutorService producers = Executors.newFixedThreadPool(10);
for (int i = 0; i < 10; i++) {
final int j = i * 10000;
Thread t = new Thread() {
public void run() {
for (int c = 0; c < 1000; c++) {
sample.add(j + c);
}
}
};
t.setName("Producer-"+i);
t.setDaemon(false);
producers.execute(t);
}
producers.shutdown();
try {
producers.awaitTermination(600, TimeUnit.SECONDS);
} catch (InterruptedException e) {
e.printStackTrace();
}
sample.waitUntilEmpty();
System.out.println("Done.");
}
}

It looks thread safe to me. Here is my reasoning.
Every time you access list, you do it synchronized. This is great. Even though you pull out a part of the list in item, that item is not accessed by multiple threads.
As long as you only access list while synchronized, you should be good (in your current design.)

Your synchronization is fine, and will not result in any out-of-order execution problems.
However, I do notice a few issues.
First, your waitUntilEmpty method would be much more timely if you add a list.notifyAll() after the list.remove(0) in removeAndDouble. This will eliminate an up-to 10 second delay in your wait(10000).
Second, your list.notify in add(Integer) should be a notifyAll, because notify only wakes one thread, and it may wake a thread that is waiting inside waitUntilEmpty instead of waitUntilNotEmpty.
Third, none of the above is terminal to your application's liveness, because you used bounded waits, but if you make the two above changes, your application will have better threaded performance (waitUntilEmpty) and the bounded waits become unnecessary and can become plain old no-arg waits.
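Put together, the suggested changes might look like the following sketch. It is a trimmed-down variant, not your exact class: the name NotifyAllSample is made up, waitUntilNotEmpty is inlined into removeAndDouble, and InterruptedException is propagated instead of swallowed.

```java
import java.util.ArrayList;
import java.util.List;

public class NotifyAllSample {
    private final List<Integer> list = new ArrayList<>();

    public void add(Integer o) {
        synchronized (list) {
            list.add(o);
            list.notifyAll(); // wake every waiter; each one re-checks its own condition
        }
    }

    public void waitUntilEmpty() throws InterruptedException {
        synchronized (list) {
            while (!list.isEmpty()) {
                list.wait(); // unbounded wait is fine now that removals notify
            }
        }
    }

    public Integer removeAndDouble() throws InterruptedException {
        Integer item;
        synchronized (list) {
            while (list.isEmpty()) {
                list.wait();
            }
            item = list.remove(0);
            list.notifyAll(); // lets waitUntilEmpty() notice the removal promptly
        }
        return item * 2;
    }

    public static void main(String[] args) throws InterruptedException {
        NotifyAllSample sample = new NotifyAllSample();
        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < 3; i++) {
                    System.out.println("Found: " + sample.removeAndDouble());
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();
        for (int i = 1; i <= 3; i++) {
            sample.add(i);
        }
        sample.waitUntilEmpty();
        consumer.join();
        System.out.println("Done.");
    }
}
```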

Your code as-is is in fact thread safe. The reasoning behind this is two part.
The first is mutual exclusion. Your synchronization correctly ensures that only one thread at a time will modify the collections.
The second has to do with your concern about compiler reordering. You're worried that the compiler can reorder the assignment in a way that would not be thread safe. You don't have to worry about that in this case. Synchronizing on the list creates a happens-before relationship. All removes from the list happen-before the write to Integer item. This tells the compiler that it cannot reorder the write to item in that method.

Your code is thread-safe, but not concurrent (as in parallel). As everything is accessed under a single mutual exclusion lock, you are serialising all access, in effect access to the structure is single-threaded.
If you require the functionality as described in your production code, the java.util.concurrent package already provides a BlockingQueue with (fixed size) array and (growable) linked list based implementations. These are very interesting to study for implementation ideas at the very least.
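As a rough illustration of that suggestion, here is a producer/consumer sketch built on LinkedBlockingQueue. The class name, the helper method and the -1 "poison pill" used to stop the consumers are illustrative choices, not part of your code; it assumes the real items are never -1.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

public class BlockingQueueSketch {

    // Producers put 1..itemsPerProducer on the queue; consumers add each doubled item to a total.
    static long produceAndDouble(int producers, int itemsPerProducer, int consumers)
            throws InterruptedException {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
        AtomicLong total = new AtomicLong();
        final int poison = -1; // sentinel telling a consumer to stop

        Thread[] consumerThreads = new Thread[consumers];
        for (int c = 0; c < consumers; c++) {
            consumerThreads[c] = new Thread(() -> {
                try {
                    while (true) {
                        int item = queue.take(); // blocks while the queue is empty
                        if (item == poison) {
                            return;
                        }
                        total.addAndGet(item * 2L);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            consumerThreads[c].start();
        }

        Thread[] producerThreads = new Thread[producers];
        for (int p = 0; p < producers; p++) {
            producerThreads[p] = new Thread(() -> {
                try {
                    for (int i = 1; i <= itemsPerProducer; i++) {
                        queue.put(i);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            producerThreads[p].start();
        }

        for (Thread t : producerThreads) {
            t.join();
        }
        for (int c = 0; c < consumers; c++) {
            queue.put(poison); // one pill per consumer, queued after all real items
        }
        for (Thread t : consumerThreads) {
            t.join();
        }
        return total.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(produceAndDouble(2, 100, 3)); // prints 20200
    }
}
```

The queue does its own locking and waiting, so no synchronized, wait or notify calls are needed in this version.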
