Simple data thread question - java

If I have a static array which does not change after being populated, multiple threads can read this array at the same time, can't they? I believe problems arise only when one thread tries to read the array while another is modifying it.
Thank you for your response.

Just don't access it with multiple threads while the array is being populated. If nothing is modifying the data (only reads) then you should be fine. Your assumptions are correct.

It's safe, provided the array is populated during static initialization. The first step in static initialization is to synchronize on the class [1].
If all other accesses are reads, the program is correctly synchronized.
[1] http://java.sun.com/docs/books/jls/third_edition/html/execution.html#12.4.2
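A minimal sketch of the case this answer describes, assuming the array is filled in a static initializer and only read afterwards. The class name `SquareTable` and its contents are made up for illustration; the `join()` calls are only there so the calling thread can collect the results.

```java
class SquareTable {
    // Populated once during class initialization and never modified afterwards.
    private static final int[] SQUARES = new int[100];

    static {
        for (int i = 0; i < SQUARES.length; i++) {
            SQUARES[i] = i * i;
        }
    }

    // Read-only access: no locking needed, because nothing writes after initialization.
    static long sumOfSquares() {
        long sum = 0;
        for (int value : SQUARES) {
            sum += value;
        }
        return sum;
    }

    // Lets several threads read the array at the same time; each stores its own result.
    static long[] sumFromThreads(int threadCount) {
        long[] results = new long[threadCount];
        Thread[] threads = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++) {
            final int slot = t;
            threads[t] = new Thread(() -> results[slot] = sumOfSquares());
            threads[t].start();
        }
        for (Thread thread : threads) {
            try {
                thread.join(); // after join(), that thread's write to results[slot] is visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return results;
    }
}
```

Every reader thread computes the same sum, since none of them can observe a half-populated array.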

Yes, exactly. Google "critical region" for more details.

To avoid multiple threads accessing the same data at the same moment, you have to use synchronized.

Yes, as long as everything is read-only then you'll be fine. Just make sure that no threads attempt to read while the array is being populated (e.g. if it's lazily populated).

Yes, data reads are thread safe - it is only when you are changing the data that you have to be concerned.

If you are mostly reading the array, ReadWriteLock can perform better than synchronized: http://download.oracle.com/javase/1.5.0/docs/api/java/util/concurrent/locks/ReadWriteLock.html

I am sure you already have your answer by now, thanks to the answers above. Since you are working with multiple threads, I will assume that you are working with multiple cores, either on the same chip or distributed. In both cases there is something you can do to improve performance: since access is read-only, let each thread have its own copy of the data. This way the threads can use caches or local memory efficiently and avoid reading over inter-core or inter-processor links. This is obviously impractical if the array is too big (bigger than the largest cache).

Related

Java. Read, write, separate synch

I am learning multithreading, and I have a little question.
When I share a variable between threads (an ArrayList, or something else such as a double or float), should it be locked by the same object for both reads and writes? I mean, when one thread is setting the variable's value, can another thread read it at the same time without any problems? Or should it be locked by the same object, forcing the reading thread to wait until the other thread has changed it?
All access to shared state must be guarded by the same lock, both reads and writes. A read operation must wait for the write operation to release the lock.
As a special case, if all you would do inside your synchronized blocks amounts to exactly one read or write operation, then you may dispense with the synchronized block and mark the variable as volatile.
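A small sketch of that special case, assuming a stop flag that one thread writes once and another thread only reads. The class `StopFlag` and its method names are hypothetical. Note that volatile covers single reads and writes only; a compound action such as `count++` would still need a lock or an atomic class.

```java
class StopFlag {
    // volatile: a write by one thread is guaranteed to be seen by later reads in other threads.
    private volatile boolean running = true;

    void stop() {
        running = false; // exactly one write, so no synchronized block is needed
    }

    boolean isRunning() {
        return running; // exactly one read
    }

    // Starts a busy worker, stops it after the given delay, and reports whether it ended.
    static boolean workerStopsAfter(long delayMillis) {
        StopFlag flag = new StopFlag();
        Thread worker = new Thread(() -> {
            while (flag.isRunning()) {
                // busy work would go here
            }
        });
        worker.start();
        try {
            Thread.sleep(delayMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            flag.stop(); // always release the worker, even if we were interrupted
        }
        try {
            worker.join(2000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }
}
```

Without the volatile modifier, the worker loop would not be guaranteed to ever see the updated flag.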
Short: It depends.
Longer:
There are many "correct answers", one for each scenario (and that is what makes programming fun).
Does the value being read have to be the latest one?
Does a written value have to become known to all readers?
Do I need to handle a race condition if two threads write?
Will there be any issue if an old value is read?
What is the correct behaviour?
Does it really need to be correct? (Yes, sometimes you genuinely don't care.)
tl;dr
For example, not all threaded programs need to be "always correct":
sometimes you trade correctness for performance (e.g. a log or progress counter)
sometimes reading an old value is just fine
sometimes you need eventual correctness (e.g. in map-reduce, nothing is right or synchronized until everything is done)
in some cases, correctness is mandatory at every moment (e.g. your bank account balance)
in write-once, read-only cases it doesn't matter
sometimes threads work in groups with complex cases
sometimes many small, independent locks run faster, but sometimes a single global lock is faster
and many, many other possible cases
Here is my suggestion: if you are learning, you should think "why do I need a lock?", "why can a lock help in DIFFERENT cases?" (not just the given sample from a textbook), and "will it fail, or what could happen, if a lock is missing?"
If all threads are reading, you do not need to synchronize.
If one or more threads are reading and one or more are writing, you will need to synchronize somehow. If the collection is small you can use synchronized. You can either add a synchronized block around the accesses to the collection, synchronize the methods that access the collection, or use a thread-safe collection (for example, Vector).
If you have a large collection and you want to allow shared reading but exclusive writing, you can use a ReadWriteLock. See the JavaDoc for an exact description of what you want, with examples:
ReentrantReadWriteLock
Note that this question is pretty common and there are plenty of similar examples on this site.
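As a rough sketch of the shared-read, exclusive-write idea, here is a list wrapper guarded by a `ReentrantReadWriteLock`. The wrapper class `ReadMostlyList` is hypothetical; only the lock classes come from `java.util.concurrent.locks`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class ReadMostlyList {
    private final List<String> items = new ArrayList<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    // Writers take the write lock: exclusive against readers and other writers.
    void add(String item) {
        lock.writeLock().lock();
        try {
            items.add(item);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Readers take the read lock: many readers may hold it at the same time.
    String get(int index) {
        lock.readLock().lock();
        try {
            return items.get(index);
        } finally {
            lock.readLock().unlock();
        }
    }

    int size() {
        lock.readLock().lock();
        try {
            return items.size();
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

Whether this beats plain synchronized depends on how read-heavy the workload really is, so it is worth measuring before committing to it.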

Thread safety when only one thread is writing

I know that if two threads are writing to the same place I need to make sure they do it safely, but what if just one thread reads and does all the writing while another only reads?
In my case I'm using a thread in a small game for the first time, to keep the updating apart from the rendering. The class that does all the rendering will never write to anything it reads, so I am no longer sure whether I need to handle every read and write for everything they share.
I will take the right steps to make sure the renderer does not try to read anything that is no longer there, but when calling things like the player's and entities' getters, should I treat them in the same way? Or would declaring values like x and y coordinates and booleans like "alive" as volatile do the trick?
My understanding has become very murky on this and could do with some enlightening.
Edit: The shared data will be anything that needs to be drawn and moved, stored in lists of objects.
For example, the player and other entities.
With the given information it is not possible to specify an exact solution, but it is clear that you need some way to synchronize between the threads. The issue is that as long as the write operations are not atomic, you could be reading data at the very moment it is being updated. This means that you could, for instance, get an old y-coordinate together with a new x-coordinate.
Basically, you can skip synchronization only if both threads are just reading the information or - even better - if all the data structures are immutable (so neither thread can modify the objects). The best way to proceed is to think first about which operations need to be atomic, and then create a solution that makes those operations atomic.
Don't forget: get it working, get it right, get it optimized (in that order).
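One way to make an x/y update atomic, sketched under the assumption that a single update thread writes and the renderer only reads: keep both coordinates in an immutable object and publish it through a volatile reference. The classes `Position` and `Entity` are invented for this illustration and are not from the question.

```java
final class Position {
    final int x;
    final int y;

    Position(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

class Entity {
    // The reference is swapped in a single write, so a reader sees either the old pair or the new pair.
    private volatile Position position = new Position(0, 0);

    // Called by the update thread only.
    void moveTo(int x, int y) {
        position = new Position(x, y);
    }

    // Called by the render thread; the returned snapshot never changes.
    Position position() {
        return position;
    }

    // The updater always keeps x == y; the calling thread counts snapshots where they differ.
    static int countTornReads(int moves) {
        Entity entity = new Entity();
        Thread updater = new Thread(() -> {
            for (int i = 1; i <= moves; i++) {
                entity.moveTo(i, i);
            }
        });
        updater.start();
        int torn = 0;
        while (updater.isAlive()) {
            Position p = entity.position();
            if (p.x != p.y) {
                torn++;
            }
        }
        return torn;
    }
}
```

Making x and y two separate volatile fields would not give this guarantee, because a reader could still run between the two writes.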
You could have problems in this case if the lists' sizes are variable and you don't synchronize access to them. Consider this:
the read-only thread reads the size of mySharedList and sees that it is 15; at that moment its CPU time ends and the read-write thread is given the CPU
the read-write thread deletes an element from the list, so its size is now 14
the read-only thread is granted CPU time again and tries to read the last element using the (now obsolete) size it read before being interrupted, which throws an exception.
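The interleaving above goes away if the size check and the element read happen under the same lock as the removal. A minimal sketch, with a hypothetical `SharedEntities` class standing in for mySharedList:

```java
import java.util.ArrayList;
import java.util.List;

class SharedEntities {
    private final List<String> entities = new ArrayList<>();

    synchronized void add(String entity) {
        entities.add(entity);
    }

    synchronized boolean remove(String entity) {
        return entities.remove(entity);
    }

    // Size check and element read share one critical section,
    // so the size cannot change between the two steps.
    synchronized String lastOrNull() {
        return entities.isEmpty() ? null : entities.get(entities.size() - 1);
    }

    // The reader can iterate over a private copy without holding the lock.
    synchronized List<String> snapshot() {
        return new ArrayList<>(entities);
    }
}
```

A render loop could call `snapshot()` once per frame and draw from the copy, at the cost of one allocation per frame.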

Accessing one array with multiple threads but either only reading or only writing

I'm wondering if there could be any problems while accessing one array with multiple threads but either only reading or only writing.
When the threads write to the array it wouldn't matter in which order they write and even if they write to the same entry all threads would write the same value.
For example, if I want to find prime numbers via the Sieve of Eratosthenes:
I create an array of consecutive numbers and set all multiples of prime numbers to 0 using multiple threads.
It wouldn't matter if the thread which strikes off the multiples of two and the thread which strikes off the multiples of 5 set the entry of the number 20 to 0 at the same time or one before or after the other.
So it's not a question of the quality or consistency of the data, but of whether it is technically possible to do this without running into any Java errors.
I'm assuming you mean 'without synchronization controls'. The short answer is no: you cannot safely do this without synchronization.
Synchronization is used for 2 reasons:
Mutual exclusion of data
Communication between threads
Your setup indicates that the first reason isn't really a problem in your case. The algorithm effectively separates the data out so that multiple worker threads won't be using the same data.
However, in order for changes done in one thread to become visible to another thread, you must use synchronization. Without synchronization, the JVM makes no guarantee as to the ordering of writes. Updates that one thread makes may be visible in another thread at any time later, or even never. See Effective Java Item #66, and maybe look at the Java Concurrency in Practice book.
I don't think it would work, since eventually you need to read the variables (to output them, save them to disk, etc.), and that read has to be synchronized in order to guarantee correct inter-thread operation ordering. Remember that without synchronization Java only guarantees intra-thread operation ordering.
Now, you could say that you don't want to read them at all anyway, but if that is the case, Java can simply optimize the whole code away.
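A sketch of the sieve with the visibility concern handled by `Thread.join()`, which establishes a happens-before edge between each worker's writes and the code after the join. This is one possible arrangement, not the asker's code: it assumes a boolean "composite" array instead of setting entries to 0, and workers strike multiples of every candidate they are assigned (prime or not) so that no worker has to read the array while others write to it.

```java
class ParallelSieve {
    // Returns an array where composite[i] is true if i is not prime (for 2 <= i <= n).
    static boolean[] sieve(int n, int threadCount) {
        boolean[] composite = new boolean[n + 1];
        Thread[] workers = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++) {
            final int offset = t;
            workers[t] = new Thread(() -> {
                // Each worker handles candidates 2 + offset, 2 + offset + threadCount, ...
                for (int i = 2 + offset; (long) i * i <= n; i += threadCount) {
                    // Several workers may write 'true' to the same slot; all write the same value.
                    for (int m = i * i; m <= n; m += i) {
                        composite[m] = true;
                    }
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join(); // makes that worker's writes visible to this thread
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return composite;
    }

    static int countPrimes(int n, int threadCount) {
        boolean[] composite = sieve(n, threadCount);
        int count = 0;
        for (int i = 2; i <= n; i++) {
            if (!composite[i]) {
                count++;
            }
        }
        return count;
    }
}
```

The array is read only after every worker has been joined, which is the synchronization point the answers above ask for. The sketch is meant for moderate n; it does not guard against int overflow for n close to Integer.MAX_VALUE.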

arraylist that supports multithreads

I need to write an arraylist to a file.
It gets filled all the time and when it gets too big I need to start writing it.
So I thought I would check when the arraylist size is greater than 100, and then append the current rows to the file.
But the problem is that sometimes it doesn't get filled for a few minutes, and I still want to dump the data to a file.
So my second thought was to have another thread that checks every few seconds whether there are rows and dumps them to the file.
But then I would need to manage locks between threads.
My questions are:
1. Is the multithreaded design OK?
2. Is there an arraylist that supports multithreading in any form?
You can make synchronized lists using Collections.synchronizedList.
Check out this thread.
You should use
List synchronizedList = Collections.synchronizedList(list);
Instead of ArrayList I would recommend using a proper implementation of Queue<E>. If you are only appending data to the list, then dumping it to the file and removing the saved items, a queue is a much better choice.
Some implementations are thread-safe and will even allow the calling thread to block until something actually appears in the queue, which is a much better approach than having a polling thread. BlockingQueue looks very promising for your case.
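A rough sketch of the BlockingQueue approach for this use case, assuming rows are strings and leaving out the actual file writing. The class `RowBatcher` and its method names are made up for illustration; `poll` with a timeout and `drainTo` are standard `BlockingQueue` methods.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class RowBatcher {
    private final BlockingQueue<String> rows = new LinkedBlockingQueue<>();

    // Called by producer threads; no explicit locking is needed.
    void submit(String row) {
        rows.add(row);
    }

    // Called by the writer thread: waits up to timeoutMillis for a first row,
    // then takes whatever else is already queued, up to maxBatch rows in total.
    List<String> nextBatch(int maxBatch, long timeoutMillis) {
        List<String> batch = new ArrayList<>();
        try {
            String first = rows.poll(timeoutMillis, TimeUnit.MILLISECONDS);
            if (first != null) {
                batch.add(first);
                rows.drainTo(batch, maxBatch - 1);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
        return batch;
    }
}
```

The writer thread would loop on `nextBatch(100, ...)` and append each non-empty batch to the file, which covers both the "too big" case and the "nothing arrived for a while" case without a separate polling thread.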
From your question it appears that most of the time you are going to perform write operations, and those from a single thread, and that you will intermittently check the size of the list.
If so, a plain old ArrayList is enough; otherwise you cannot use a plain old ArrayList.
Synchronizing the list and then using locks to access it looks like overkill.
Rather, have an if check that tests the size.
If you are going to access the list from multiple threads, then to avoid ConcurrentModificationException use the method suggested by @Ludevik.
There are other approaches as well, but for the sake of simplicity @Ludevik's approach fits the bill.
Instead of an ArrayList you can use a Vector, since Vector's methods are synchronized.
ArrayList is not thread-safe, but you can get a thread-safe list with Collections.synchronizedList()
You need to use Collections.synchronizedList(List) to create a thread-safe list. However, you still need to synchronize compound operations (such as a size check followed by an add or remove, or iteration) and synchronize updates to the objects held in the list.
The simple solutions are to use Collections.synchronizedList(list) or Vector. However, there is a gotcha.
The iterator() method of a synchronized list / Vector created as above is NOT synchronized. So there's nothing to stop another thread from adding a new element to the list while you are copying it, whether you use an iterator explicitly, use for (type var : list) {...}, or use a copy constructor that relies on the iterator.
This is liable to result in concurrent modification exceptions. To avoid that problem, you will need to do your own locking.
It may be a better idea to use a concurrent Queue class, so that the thread that writes to the file doesn't need to iterate over a list.
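The "do your own locking" step could look like the sketch below: hold the list's own monitor for the whole iteration, as the `Collections.synchronizedList` documentation requires. The helper class `SyncListHelper` and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SyncListHelper {
    // Iterates while holding the list's own lock, so no other thread can modify it mid-loop.
    static int totalLength(List<String> syncList) {
        int total = 0;
        synchronized (syncList) {
            for (String row : syncList) {
                total += row.length();
            }
        }
        return total;
    }

    // Copies and clears in one critical section, so no row is lost or written twice.
    static List<String> drain(List<String> syncList) {
        synchronized (syncList) {
            List<String> copy = new ArrayList<>(syncList);
            syncList.clear();
            return copy;
        }
    }

    // Convenience factory for the kind of list these helpers expect.
    static List<String> newSynchronizedList() {
        return Collections.synchronizedList(new ArrayList<>());
    }
}
```

This only works if the argument really is the wrapper returned by `Collections.synchronizedList` (or a Vector), because the wrapper's own methods lock on that same object.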

Java: Large collection and concurrent threads

I am facing this issue:
I have lots of threads (1024) that access one large collection, a Vector.
Question:
Is it possible to do something that would allow concurrent actions on it without having to synchronize everything (since that takes time)? I mean something like the way a MySQL database works, where you don't have to worry about synchronization and thread-safety issues. Is there a collection like that in Java? Thanks
Vector is a very old Java class - predates the Collections API. It synchronizes on every operation, so you're not going to have any luck trying to speed it up.
You should consider reworking your code to use something like ConcurrentHashMap or a LinkedBlockingQueue, which are highly optimized for concurrent access.
Failing that, you mention that you'd like performance and access semantics similar to a database - why not use a dedicated database or a message queue? They are likely to implement it better than you ever will, and it's less code for you to write!
[edit] Given your comment:
all what thread does is adding elements to vector
(only if num of elements in vector = 0) &
removing elements from vector. (if vector size > 0)
it sounds very much like you should be using something much more like a queue than a list! A bounded queue with size 1 will give you these semantics - although I'd question why you can't add elements if there is already something there. When you've got thousands of threads this seems like a very inefficient design.
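To illustrate the "bounded queue with size 1" suggestion, here is a minimal sketch using `ArrayBlockingQueue` with capacity 1. The wrapper class `SingleSlot` is hypothetical; `offer` and `poll` are the standard non-blocking queue methods.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class SingleSlot {
    // Capacity 1: an element can only be added when the queue is empty.
    private final BlockingQueue<String> slot = new ArrayBlockingQueue<>(1);

    // Returns false instead of blocking when something is already stored.
    boolean tryPut(String value) {
        return slot.offer(value);
    }

    // Returns null instead of blocking when nothing is stored.
    String tryTake() {
        return slot.poll();
    }
}
```

The check and the add happen atomically inside `offer`, so the "only if empty" rule holds even with many threads calling `tryPut` at once.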
Well, first off, this design doesn't sound right. It sounds like you need to think about using a proper database rather than a simple data structure, even if this means just using something like an in-memory instance of HypersonicDB.
However, if you insist on doing things this way, then the java.util.concurrent package has a number of highly concurrent, non-locking data structures. One of them might suit your purpose (e.g. ConcurrentHashMap, if you can use a Map rather than a List)
It looks like you are implementing the producer-consumer pattern. You should google "producer consumer java" or have a look at the BlockingQueue interface.
I agree with skaffman about looking at java.util.concurrent.
ConcurrentHashMap is very scalable. However, while other threads are updating the map, the size() call on it returns only an approximation. So, for example, your app may occasionally add elements to it even when the collection is not actually empty.
If you want to strictly enforce the condition you gave, there is no other way than to synchronize.
Instead of having tons of context switches, I guess you could let your user threads post a Callable on a queue and have only one thread dealing with the mutation. This would eliminate the need for synchronization on the collection. The user threads can wait on Future.get().
Just an idea.
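That idea might be sketched as follows, assuming a single-thread executor acts as the queue and the one mutating thread. The class `SingleWriter`, its methods, and the `await` helper are all invented for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class SingleWriter {
    // Touched only by the executor's single thread, so it needs no locking of its own.
    private final List<Integer> items = new ArrayList<>();
    private final ExecutorService mutator = Executors.newSingleThreadExecutor();

    // User threads post a task; the check and the add run together on the mutator thread.
    Future<Boolean> addIfEmpty(int value) {
        return mutator.submit(() -> {
            if (items.isEmpty()) {
                items.add(value);
                return true;
            }
            return false;
        });
    }

    // Removes and returns the first element, or null if there is none.
    Future<Integer> removeFirst() {
        return mutator.submit(() -> items.isEmpty() ? null : items.remove(0));
    }

    void shutdown() {
        mutator.shutdown();
    }

    // Small helper so callers can wait on Future.get() without handling checked exceptions.
    static <T> T await(Future<T> future) {
        try {
            return future.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }
}
```

A caller would write something like `boolean added = SingleWriter.await(writer.addIfEmpty(42));`. All mutations are serialized through one thread, so throughput is bounded by that thread, which is the trade-off of this design.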
If you do not want to change your data structure and have only infrequent writes, you might also use one or more ReentrantReadWriteLock instances to synchronize access. Then many threads can read at the same time, but when a thread wants to write, all reads are blocked until the write is done.
But you should check whether the used data structure is appropriate for the task, or whether another of the many java.util or java.util.concurrent classes is more appropriate. java.util.Vector is synchronized, by the way.
