We have a multithreaded Java program. Multiple threads will write to a file, and one thread will read from that file. I am looking for some design ideas. Is synchronization necessary?
FileChannel is in theory thread safe. From the javadoc:
File channels are safe for use by multiple concurrent threads. The close method may be invoked at any time, as specified by the Channel interface. Only one operation that involves the channel's position or can change its file's size may be in progress at any given time; attempts to initiate a second such operation while the first is still in progress will block until the first operation completes. Other operations, in particular those that take an explicit position, may proceed concurrently; whether they in fact do so is dependent upon the underlying implementation and is therefore unspecified.
If you can structure your writes around these guarantees, then you can rely on FileChannel's built-in synchronization rather than having to write your own.
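For example, here is a minimal sketch (the file name and fixed record layout are made up for illustration) in which each writer thread writes to its own region of the file via the position-explicit FileChannel.write, so the writers never touch the channel's shared position:

    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.channels.FileChannel;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Path;
    import java.nio.file.StandardOpenOption;

    public class PositionalWriters {
        static final int RECORD_SIZE = 64; // hypothetical fixed-size records

        public static void main(String[] args) throws IOException, InterruptedException {
            try (FileChannel channel = FileChannel.open(Path.of("data.bin"),
                    StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
                Thread[] writers = new Thread[4];
                for (int i = 0; i < writers.length; i++) {
                    final int slot = i;
                    writers[i] = new Thread(() -> {
                        byte[] record = String.format("%-64s", "record from writer " + slot)
                                .getBytes(StandardCharsets.US_ASCII);
                        try {
                            // Explicit position: does not touch the channel's shared position
                            channel.write(ByteBuffer.wrap(record), (long) slot * RECORD_SIZE);
                        } catch (IOException e) {
                            e.printStackTrace();
                        }
                    });
                    writers[i].start();
                }
                for (Thread t : writers) t.join();
            }
        }
    }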
I would consider synchronization in this case. Imagine that two threads (t1 and t2) open the file at the same time and start writing to it. The changes performed by the first thread can be overwritten by the second thread, because the second thread is the last to save its changes to the file. While t1 is writing to the file, t2 must wait until t1 finishes its task before it can open it.
Also, if you care about reading the latest possible update of the file, you should synchronize the writing threads with the thread that reads the file, so that if any thread is writing to the file, the reading thread waits.
If being synchronous isn't important, you could have your writer running in its own thread, and allow other threads to queue up writes to the file. Although I think the first thing to consider is whether writing to a file is really what you want to do. Especially in high-traffic situations, having a lot of disk I/O may not be very efficient.
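A minimal sketch of that idea, assuming a hypothetical log file: writer threads just enqueue strings, and one dedicated thread drains the queue and does all the disk I/O:

    import java.io.BufferedWriter;
    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;

    public class QueuedFileWriter {
        private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();

        // Called from any thread; never touches the file directly
        public void write(String line) {
            queue.add(line);
        }

        // Start exactly one of these; it owns the file
        public void startWriterThread(Path file) {
            Thread writer = new Thread(() -> {
                try (BufferedWriter out = Files.newBufferedWriter(file)) {
                    while (!Thread.currentThread().isInterrupted()) {
                        out.write(queue.take()); // blocks until something is queued
                        out.newLine();
                        out.flush();
                    }
                } catch (InterruptedException | IOException e) {
                    // exit the writer loop
                }
            });
            writer.setDaemon(true);
            writer.start();
        }
    }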
If you wanted multiple readers and one writer, you would be looking for a Read Write Lock or a Read Write Mutex.
But you want multiple writers and one reader. How do you know these writers won't overwrite each other's data? Are they somehow segregated?
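If the multiple-readers/one-writer case is what you end up with, a ReentrantReadWriteLock is the standard tool. A minimal sketch (the guarded buffer is made up for illustration):

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.locks.ReadWriteLock;
    import java.util.concurrent.locks.ReentrantReadWriteLock;

    public class GuardedBuffer {
        private final List<String> lines = new ArrayList<>();
        private final ReadWriteLock lock = new ReentrantReadWriteLock();

        // Many readers may hold the read lock at once
        public List<String> snapshot() {
            lock.readLock().lock();
            try {
                return new ArrayList<>(lines);
            } finally {
                lock.readLock().unlock();
            }
        }

        // The write lock is exclusive: no readers and no other writer
        public void append(String line) {
            lock.writeLock().lock();
            try {
                lines.add(line);
            } finally {
                lock.writeLock().unlock();
            }
        }
    }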
Once multiple threads access shared data, synchronization is necessary. If multiple threads write to the same file without some form of locking, you will potentially end up with a lost-update problem.
Reading is not as big an issue in all circumstances, but you need to consider: if a thread is reading the file and at the same time another thread updates it, does the reading thread need to know about the change? If so, you need to lock the file for the reading thread as well.
You need synchronization (locking) if you have a mix of readers and writers or writers and writers. If you only have readers, you don't need any synchronization.
You don't want two processes writing to the same file or one process writing a file that another is reading.
Synchronization is necessary in this case. FileLock (obtained via FileChannel) is useful for preventing a file being modified by processes outside the JVM, but not for applications in which multiple threads write to a single file.
From (further down in) the JavaDoc for FileChannel:
File locks are held on behalf of the entire Java virtual machine. They are not suitable for controlling access to a file by multiple threads within the same virtual machine.
See this post for a brief discussion of strategies to share file writing between threads.
Related
I have a shared collection (an ArrayList), and I use a ReentrantReadWriteLock to guard the critical section against access from different threads. I have three threads: a writer, a reader, and a delete thread, and I acquire the correct lock in each case. The logic is that I insert data into the ArrayList, read it when necessary, and when a timer reaches its limit, delete some entries. The process runs smoothly and everything is perfect.
My question now is: can I transfer the above logic and implement it with an LMAX Disruptor in order to avoid lock overhead and improve performance? If yes, can you describe an ideal setup, and if you are also able to post code I would really appreciate it.
I assume that instead of the ArrayList the data would be entered into a ring buffer, and I would have two producers (write, delete) and a consumer for read. I must also make sure that I use producer barriers. Will performance be better than in the lock case? I am not sure I understand everything correctly; please help me and give me directions.
If your shared state is the ArrayList, you have one thread reading and processing its elements, and you want updates to that shared state serialised, then usually the ArrayList would be owned by a single EventHandler that processes events such as Read, Write and Delete and updates the shared state.
This would all run on one thread, but that is pretty much what is happening now, as you cannot read at the same time as writing/deleting.
As you only have one reading thread there is not much to be gained from using a ReadWriteLock as you will never have concurrent reads.
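A minimal sketch of that single-owner pattern, assuming LMAX Disruptor 3.x; the ListEvent type and its fields are made up for illustration:

    import com.lmax.disruptor.EventHandler;
    import com.lmax.disruptor.dsl.Disruptor;
    import com.lmax.disruptor.util.DaemonThreadFactory;
    import java.util.ArrayList;
    import java.util.List;

    public class DisruptorListSketch {
        // Hypothetical event: which operation to apply and its payload
        static class ListEvent {
            enum Op { WRITE, DELETE, READ }
            Op op;
            String value;
        }

        public static void main(String[] args) {
            Disruptor<ListEvent> disruptor = new Disruptor<>(
                    ListEvent::new, 1024, DaemonThreadFactory.INSTANCE); // size must be a power of two

            // The list is owned by this single handler, so it needs no locking
            List<String> state = new ArrayList<>();
            disruptor.handleEventsWith((EventHandler<ListEvent>) (event, sequence, endOfBatch) -> {
                switch (event.op) {
                    case WRITE:  state.add(event.value);    break;
                    case DELETE: state.remove(event.value); break;
                    case READ:   System.out.println(state); break;
                }
            });
            disruptor.start();

            // Any producer thread publishes events; the translator fills the ring buffer slot
            disruptor.getRingBuffer().publishEvent((event, seq) -> {
                event.op = ListEvent.Op.WRITE;
                event.value = "hello";
            });
        }
    }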
I am developing an application and, at a given moment, I start about 10000 threads to stress-test a database. I want to synchronize this in the following way: I want all threads to read all the data from a table, then I want each thread to wait for the other threads to stop reading. After all threads have finished reading, I delete all records from that table, and then I want all the threads to insert the data they read previously. Now, how do I synchronize my threads so that they wait for each other in the order described above? What is the best solution?
Use CyclicBarrier:
CyclicBarriers are useful in programs involving a fixed sized party of threads that must occasionally wait for each other.
The example in the JavaDoc quoted above solves the exact same problem.
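A minimal sketch of the read / delete / insert phasing, assuming hypothetical readTable, deleteAll and insertRows helpers; the barrier action runs exactly once, after every thread has finished the read phase:

    import java.util.List;
    import java.util.concurrent.BrokenBarrierException;
    import java.util.concurrent.CyclicBarrier;

    public class StressTest {
        static final int THREADS = 100; // scale up as needed

        public static void main(String[] args) {
            // The barrier action is executed once, by the last thread to arrive
            CyclicBarrier barrier = new CyclicBarrier(THREADS, StressTest::deleteAll);

            for (int i = 0; i < THREADS; i++) {
                new Thread(() -> {
                    try {
                        List<String> rows = readTable(); // phase 1: every thread reads
                        barrier.await();                 // wait for all readers, then delete runs once
                        insertRows(rows);                // phase 2: every thread re-inserts its data
                    } catch (InterruptedException | BrokenBarrierException e) {
                        Thread.currentThread().interrupt();
                    }
                }).start();
            }
        }

        // Hypothetical database helpers, left unimplemented here
        static List<String> readTable() { return List.of(); }
        static void deleteAll() { }
        static void insertRows(List<String> rows) { }
    }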
10 thousand threads? Make sure you are testing your database, not your CPU and memory (the context-switching overhead might be tremendous). Have you considered JMeter in distributed mode?
This may not be exactly what you are looking for, but you could give CountDownLatch a look:
A synchronization aid that allows one or more threads to wait until a set of operations being performed in other threads completes.
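A minimal sketch, assuming a hypothetical worker task: the main thread blocks on the latch until every worker has counted down:

    import java.util.concurrent.CountDownLatch;

    public class LatchExample {
        public static void main(String[] args) throws InterruptedException {
            int workers = 5;
            CountDownLatch done = new CountDownLatch(workers);

            for (int i = 0; i < workers; i++) {
                final int id = i;
                new Thread(() -> {
                    System.out.println("worker " + id + " finished reading");
                    done.countDown(); // signal completion
                }).start();
            }

            done.await(); // blocks until the count reaches zero
            System.out.println("all workers finished; safe to delete the table");
        }
    }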
I have some confusion about Java file locks.
Here's my situation.
Each thread can read/write a file.
My file-manipulating method can be called by several threads at the same time,
and my goal is clear: no concurrent writes to a file by threads. Only one thread at a time is allowed to write to a file.
My questions are
If FileOutputStream.write() were thread safe, I wouldn't have to put any concurrency mechanism in my code, since the call to write() would block until the locked file is released. However, my program does not seem to block when a file is already opened by another thread (I am not sure about this).
If FileOutputStream.write() were NOT thread safe, I would have to write additional code so that a file is accessed by only one thread at a time. Therefore, I used FileChannel.lock() to do so. However, contrary to the JDK documentation, it does not block but throws an OverlappingFileLockException.
I would appreciate your clear advice.
It is not thread safe, so you need to ensure safety programmatically. Just put the relevant code in a synchronized block, assuming there is no major performance requirement for your app.
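A minimal sketch of that advice (the file name and the shared lock object are made up for illustration): every thread funnels its writes through one synchronized block, so only one write is in flight at a time:

    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.nio.charset.StandardCharsets;

    public class SynchronizedFileWriter {
        private static final Object FILE_LOCK = new Object();

        // Safe to call from any number of threads
        public static void appendLine(String line) throws IOException {
            synchronized (FILE_LOCK) {
                // Open in append mode, write, and close while holding the lock
                try (FileOutputStream out = new FileOutputStream("shared.log", true)) {
                    out.write((line + System.lineSeparator()).getBytes(StandardCharsets.UTF_8));
                }
            }
        }
    }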
I know of concepts that allow inter-process communication. My program needs to launch a second thread. I know how to pass or "push" data from one thread to another in Java/Android, but I have not seen a lot of information regarding "pulling" data. The child thread needs to grab data from the parent thread every so often. How is this done?
Since threads share memory, you can just use a thread-safe data structure. Refer to java.util.concurrent for some. Everything in that package is designed for multithreaded situations.
In your case you might want to use a LinkedBlockingQueue. This way the parent thread can put things into the queue, and the child thread can grab it off whenever it likes. It also allows the child thread to block if the Queue is empty.
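A minimal sketch of that parent/child arrangement (the String payload is just for illustration):

    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;

    public class ParentChildQueue {
        public static void main(String[] args) throws InterruptedException {
            BlockingQueue<String> queue = new LinkedBlockingQueue<>();

            // Child thread: pulls data whenever it likes, blocking if the queue is empty
            Thread child = new Thread(() -> {
                try {
                    while (true) {
                        String item = queue.take();
                        System.out.println("child got: " + item);
                    }
                } catch (InterruptedException e) {
                    // parent asked us to stop
                }
            });
            child.start();

            // Parent thread: puts things into the queue
            queue.put("first update");
            queue.put("second update");

            Thread.sleep(100);
            child.interrupt();
        }
    }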
You may be confusing threads and data. Threads are lines of code execution which may operate on some data but they are not data themselves and they do not contain data. Data is contained in memory and threads are executed by CPU (or vm or whatever level you choose).
You access data in the same way whether it is done in threads or not. That is you use variables or object fields etc. But with threads you need to make sure that there are no race conditions which happen when threads concurrently access the same data.
To summarize, if you have an object that has some method executed by a thread, you can still get data from this object in the regular way, as long as you make sure that only one thread does so at a time.
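A minimal sketch of that idea, using a made-up Worker object: both threads touch the same field only through synchronized methods, so at most one of them does it at a time:

    public class Worker implements Runnable {
        private int progress; // shared data, guarded by 'this'

        public synchronized int getProgress() {
            return progress;
        }

        private synchronized void setProgress(int value) {
            progress = value;
        }

        @Override
        public void run() {
            for (int i = 0; i <= 100; i++) {
                setProgress(i); // the worker thread updates the field
                try {
                    Thread.sleep(10);
                } catch (InterruptedException e) {
                    return;
                }
            }
        }

        public static void main(String[] args) {
            Worker worker = new Worker();
            new Thread(worker).start();
            // Another thread reads the same object in the regular way,
            // but through a synchronized method, so the access is safe
            System.out.println("progress so far: " + worker.getProgress());
        }
    }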
If your text can change and will only be accessed from a single thread, use a StringBuilder, because StringBuilder is unsynchronized. If your text can change and will be accessed from multiple threads, use a StringBuffer, because StringBuffer is synchronized.
What is meant by multiple threads? Can anyone explain this to me? I mean, is it something like two methods or two programs trying to access another method at the same time?
Threads are paths of execution that can be executed concurrently. You can have multiple threads in your Java program, which can call the same method of the same object at the same time. If the method e.g. prints something on screen, you might see the messages coming from different threads jumbled up - unless you explicitly ensure that only one message can be printed out at a time, and all other requests to print shall wait until the actual message is fully printed.
Or, if you have a field in that object, all threads see it. And if one of them modifies the field... that's when the interesting part begins :-) Other threads may only see the updated value at a later time, or not at all, unless you specifically ensure that it is safe to use by multiple threads. This can result in subtle, hard to reproduce bugs. This is why writing concurrent programs correctly is a difficult task.
On machines with a single processor core, only a single thread can run at any time, so different threads are executed one after another, but the OS switches between them frequently (many times per second), giving the user the illusion of multiple threads running in parallel. OTOH multicore machines can really run several threads at the same time, as many as they have processor cores.
Every Java program has at least one thread. You may manually create additional threads within your program and pass them tasks to execute.
A detailed explanation of threads and processes - and further, concurrency in Java - can be found in the Java Tutorials.
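A tiny illustration of the jumbled output mentioned above: two threads call the same method on the same object at the same time, and without the synchronized keyword their characters may interleave:

    public class JumbledOutput {
        // Remove 'synchronized' and the characters of the two messages may interleave
        public synchronized void printSlowly(String message) {
            for (char c : message.toCharArray()) {
                System.out.print(c);
            }
            System.out.println();
        }

        public static void main(String[] args) {
            JumbledOutput printer = new JumbledOutput();
            new Thread(() -> printer.printSlowly("hello from thread one")).start();
            new Thread(() -> printer.printSlowly("hello from thread two")).start();
        }
    }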
Threads are like little processes.
Consider the case where a string is shared between two threads which are running concurrently.
Both of them operate on it, so the string is being manipulated by both threads and won't remain in a consistent state.
So.
StringBuffer is designed to be thread-safe and all public methods in StringBuffer are synchronized. StringBuilder does not handle thread-safety issue and none of its methods is synchronized.
StringBuilder has better performance than StringBuffer under most circumstances.
Use the new StringBuilder wherever possible.
For more on concurrency, refer to this.
I think you can use StringBuilder in both cases, but be very careful in multithreaded programs. Synchronization at the StringBuffer method level is not useful when you must perform several operations on such a string (think of it like database transactions), e.g. delete 3 chars at the beginning, then delete 3 chars at the end, and compare the result with something. Even when each delete operation is synchronized (and thus atomic) you can have:
the first thread gets the string and deletes 3 chars at the beginning
the second thread gets the string and deletes 3 chars at the beginning
the string is not in a consistent state (6 chars deleted from the beginning)
You should synchronize access to such variables at your method level, not rely on StringBuffer's method synchronization. With StringBuffer you would have two levels of synchronization, while with StringBuilder you will have only your own.
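A minimal sketch of that point: the compound trim-both-ends operation is made atomic by one external lock around a plain StringBuilder, rather than by relying on StringBuffer's per-method synchronization:

    public class TrimmedText {
        private final StringBuilder text = new StringBuilder("some shared text value");
        private final Object lock = new Object();

        // The whole compound operation happens under one lock,
        // so no other thread can see the half-trimmed state
        public String trimBothEnds() {
            synchronized (lock) {
                text.delete(0, 3);                             // delete 3 chars at the beginning
                text.delete(text.length() - 3, text.length()); // delete 3 chars at the end
                return text.toString();
            }
        }
    }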
Multiple threads are like running parts of the same program at the same time, sharing the same data.
A typical example is that when the program needs to do a long calculation, it can create a separate thread to do the calculation in the background and keep reacting to user input on the main thread.
The problem with multiple threads is that, since they are running at the same time and you do not really know what they are doing (they can make their own decisions), it becomes dangerous to rely on certain actions on the shared data always being done in a certain order.
There are various techniques for dealing with that; one is the synchronized keyword, which allows only synchronized access. This means that one thread blocks access to an object while it is busy, so when the other threads want to get access, they have to wait.
So that's what is meant by StringBuffer being synchronized: it will block access from other threads while one thread is updating it.
Using multiple threads is considered an advanced topic, and not all problems have been solved in a satisfactory manner. Relying on synchronized objects to deal with concurrency will not get you very far, because typically you will update multiple objects in a coordinated manner, and those updates must also be synchronized.
My advice : stay away from there until you've read a good book and experimented on exercises. Till then share no data between threads (other than the simplest of signaling flags).