Is Files.write() method thread safe. [duplicate]

I have a Java program that uses 20 threads. Each of them writes its results to a file called output.txt.
I always get a different number of lines in output.txt.
Could this be a thread synchronization problem? Is there a way to handle it?

Can it be a problem with the synchronization of threads?
Yes.
Is there a way to handle this?
Yes, ensure that writes are serialized by synchronizing on a relevant mutex. Or alternately, have only one thread that actually outputs to the file, and have all of the other threads simply queue text to be written to a queue that the one writing thread draws from. (That way the 20 main threads don't block on I/O.)
Re the mutex: For instance, if they're all using the same FileWriter instance (or whatever), which I'll refer to as fw, then they could use it as a mutex:
synchronized (fw) {
    fw.write(...);
}
If they're each using their own FileWriter or whatever, find something else they all share to be the mutex.
But again, having a thread doing the I/O on behalf of the others is probably also a good way to go.
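As a minimal sketch of the mutex approach, assuming the threads append with Files.write rather than sharing a FileWriter: the class name SynchronizedAppender, the LOCK object, and the temporary file are all hypothetical, chosen only for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

final class SynchronizedAppender {
    // Hypothetical shared mutex: every writing thread synchronizes on this one object.
    private static final Object LOCK = new Object();

    // Appends one line; only one thread at a time can be inside the synchronized block.
    static void appendLine(Path file, String line) {
        synchronized (LOCK) {
            try {
                Files.write(file, List.of(line), StandardCharsets.UTF_8,
                        StandardOpenOption.CREATE, StandardOpenOption.APPEND);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    // Starts the workers, waits for them, and returns how many lines ended up in the file.
    static int runDemo(int threads, int linesPerThread) {
        try {
            Path file = Files.createTempFile("output", ".txt");
            Thread[] workers = new Thread[threads];
            for (int t = 0; t < threads; t++) {
                final int id = t;
                workers[t] = new Thread(() -> {
                    for (int i = 0; i < linesPerThread; i++) {
                        appendLine(file, "thread " + id + " line " + i);
                    }
                });
                workers[t].start();
            }
            for (Thread worker : workers) {
                worker.join();
            }
            int count = Files.readAllLines(file, StandardCharsets.UTF_8).size();
            Files.delete(file);
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(runDemo(20, 50) + " lines written");
    }
}
```

Each call opens and closes the file, which is simple but slow; that cost is part of why the single-writer design may be preferable.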

I'd suggest organizing it this way: a single consumer thread takes all the data and writes it to the file, and all worker threads hand their data to that consumer in a synchronized way. Alternatively, if several threads must write to the file themselves, use a mutex or another lock implementation.

If you want any semblance of performance and ease of management, go with the producer-consumer queue and just one file writer, as suggested by Alex and others. Letting all the threads at the file with a mutex is just messy: every disk delay is transferred directly into your main app functionality (with added contention). This is especially unfunny with slow network drives that tend to go away without warning.
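The producer-consumer arrangement could look roughly like the sketch below. It assumes an unbounded LinkedBlockingQueue and a hypothetical STOP sentinel to shut the writer down; QueueFileWriter and the temporary file are illustrative names, not anything from the question.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class QueueFileWriter {
    // Hypothetical sentinel that tells the writer thread to stop; compared by identity.
    private static final String STOP = new String("STOP");

    // Runs the workers plus one writer thread and returns the number of lines written.
    static int runDemo(int workers, int linesPerWorker) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        try {
            Path file = Files.createTempFile("output", ".txt");

            // The only thread that ever touches the file.
            Thread writer = new Thread(() -> {
                try (BufferedWriter out = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
                    for (String line = queue.take(); line != STOP; line = queue.take()) {
                        out.write(line);
                        out.newLine();
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            writer.start();

            // Workers only enqueue text, so they never block on disk I/O.
            Thread[] threads = new Thread[workers];
            for (int t = 0; t < workers; t++) {
                final int id = t;
                threads[t] = new Thread(() -> {
                    for (int i = 0; i < linesPerWorker; i++) {
                        queue.add("worker " + id + " result " + i);
                    }
                });
                threads[t].start();
            }
            for (Thread thread : threads) {
                thread.join();
            }

            queue.add(STOP);   // all results are queued, so the writer can finish
            writer.join();
            int count = Files.readAllLines(file, StandardCharsets.UTF_8).size();
            Files.delete(file);
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(runDemo(20, 50) + " lines written");
    }
}
```

A bounded queue with put() instead of add() would apply back-pressure if the workers can outrun the disk.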

If you can hold your file as a FileOutputStream you can lock it like this:
FileOutputStream file = ...
....
// Thread safe version.
void write(byte[] bytes) {
    try {
        boolean written = false;
        do {
            try {
                // Lock it!
                FileLock lock = file.getChannel().lock();
                try {
                    // Write the bytes.
                    file.write(bytes);
                    written = true;
                } finally {
                    // Release the lock.
                    lock.release();
                }
            } catch ( OverlappingFileLockException ofle ) {
                try {
                    // Wait a bit
                    Thread.sleep(0);
                } catch (InterruptedException ex) {
                    throw new InterruptedIOException ("Interrupted waiting for a file lock.");
                }
            }
        } while (!written);
    } catch (IOException ex) {
        log.warn("Failed to lock " + fileName, ex);
    }
}

You should use synchronization in this case. Imagine that two threads (t1 and t2) open the file at the same time and start writing to it. The changes made by the first thread are overwritten by the second thread, because the second thread is the last to save its changes to the file. While thread t1 is writing to the file, t2 must wait until t1 finishes its task before it can open the file.

Well, without any implementation details it is hard to know, but as my test case shows, I always get 220 lines of output (20 threads writing 11 lines each), i.e., a constant number of lines, with FileWriter. Notice that no synchronized is used here.
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
/**
* Working example of synchronous, competitive writing to the same file.
* @author WesternGun
*
*/
public class ThreadCompete implements Runnable {
private FileWriter writer;
private int status;
private int counter;
private boolean stop;
private String name;
public ThreadCompete(String name) {
this.name = name;
status = 0;
stop = false;
// open the file in append mode (the second argument, true, keeps any existing content)
try {
writer = new FileWriter(new File("test.txt"), true);
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
public static void main(String[] args) {
for (int i=0; i<20; i++) {
new Thread(new ThreadCompete("Thread" + i)).start();
}
}
private int generateRandom(int range) {
return (int) (Math.random() * range);
}
@Override
public void run() {
while (!stop) {
try {
writer = new FileWriter(new File("test.txt"), true);
if (status == 0) {
writer.write(this.name + ": Begin: " + counter);
writer.write(System.lineSeparator());
status ++;
} else if (status == 1) {
writer.write(this.name + ": Now we have " + counter + " books!");
writer.write(System.lineSeparator());
counter++;
if (counter > 8) {
status = 2;
}
} else if (status == 2) {
writer.write(this.name + ": End. " + counter);
writer.write(System.lineSeparator());
stop = true;
}
writer.flush();
writer.close();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
}
}
As far as I understand (and have tested), the process goes through these phases:
all threads are created and started, ready to grab the file;
one of them grabs it, and I guess it then locks it internally and prevents the other threads from accessing it, because I never see a line that combines content from two threads. So while one thread is writing, the others wait until it completes the line and, very likely, releases the file. So no race condition happens.
the quickest of the others grabs the file and begins writing.
It is just like a crowd waiting outside a bathroom without queuing...
So, if your implementation is different, show the code and we can help break it down.

Related

Share an array between threads in java

I'm currently working on my own little board game and came across a problem with multithreading. I have one thread that renders the board and one that supplies the data to render. The supply thread writes its data to an array, and the renderer then takes this data and renders it on the screen (the render thread never writes to the array). So I started reading about multithreading, about sharing objects and arrays between threads, and about the volatile keyword on arrays, and found out that making an array volatile doesn't solve the problem. Then I read about happens-before relations and got a little confused. https://docs.oracle.com/javase/6/docs/api/java/util/concurrent/package-summary.html#MemoryVisibility says that
An unlock (synchronized block or method exit) of a monitor happens-before every subsequent lock (synchronized block or method entry) of that same monitor. And because the happens-before relation is transitive, all actions of a thread prior to unlocking happen-before all actions subsequent to any thread locking that monitor.
So if I understood this correctly, I would have to do the reads and writes to the array in synchronized methods. In this code, then, does the reader always see the data that the writer wrote to the array?
class Test {
private static String[] data = new String[10];
public synchronized String read(int index) {
return data[index];
}
public synchronized void write(String value, int index) {
data[index] = value;
}
private Thread writer = new Thread(() -> {
while(true) {
write("Test", 0);
System.out.println("Wrote " + read(0) + " to 0.");
try {
Thread.sleep(10000);
} catch (InterruptedException exc) {
//Handle exception
}
}
});
private Thread reader = new Thread(() -> {
while(true) {
System.out.println("Read " + read(0) + " from 0.");
try {
Thread.sleep(10000);
} catch (InterruptedException exc) {
//Handle exception
}
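//Render the data
}
});
public static void main(String[] args){
Test test = new Test(); // writer and reader are instance fields, so an instance is needed
test.writer.start();
test.reader.start();
}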
}
Thanks.
PS. I'm not a native speaker so please excuse some grammar mistakes.

An AtomicReferenceArray will give you the semantics you are looking for. It is effectively an array with volatile elements.
In technical terms, if a read of some element in the array sees a write (to the same element), then the read synchronizes with that write. So there is a happens-before edge between the write and the read.
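A minimal sketch of how the read and write methods from the question might look on top of AtomicReferenceArray. The class name SharedBoard and the spin-wait in runDemo are hypothetical and only there to show that the reader eventually sees the writer's value without any synchronized keyword.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

final class SharedBoard {
    // Every element behaves like a volatile field.
    private final AtomicReferenceArray<String> data = new AtomicReferenceArray<>(10);

    void write(String value, int index) {
        data.set(index, value);   // volatile write
    }

    String read(int index) {
        return data.get(index);   // volatile read
    }

    // One thread writes slot 0 while the caller spins until the value becomes visible.
    static String runDemo() {
        SharedBoard board = new SharedBoard();
        Thread writer = new Thread(() -> board.write("Test", 0));
        writer.start();

        String seen;
        while ((seen = board.read(0)) == null) {
            Thread.onSpinWait();  // hint to the runtime that this is a busy-wait loop
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("Read " + runDemo() + " from 0.");
    }
}
```

This guarantees visibility per element only; if the renderer needs a consistent snapshot of several cells at once, a lock or an immutable snapshot object is still needed.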

You can try using CopyOnWriteArrayList, but from what you describe, a Queue seems to be the data structure that best fits your needs. You have a data producer (your supplier thread) and a consumer (your renderer thread). Read about the Queue interface and choose the implementation you need, probably LinkedBlockingQueue.

Snap method for Java console game

I've just put together one of my first full Java programs for practice. It is a simple snap game but I'm not happy with the method for the actual "Snap" condition. I may be being fussy but I wonder if there is something better someone could suggest?
public static boolean snap() {
Scanner response = new Scanner(System.in);
double compReflex = (Math.random() * (1000 - 250 + 1)) + 250;
long reflex = Math.round(compReflex);
long startTime = System.currentTimeMillis();
System.out.println("go");
response.nextLine();
if (System.currentTimeMillis() > startTime + reflex) {
System.out.println("I win");
response.close();
return false;
} else {
System.out.println(System.currentTimeMillis() - startTime);
System.out.println("Well done");
response.close();
return true;
}
}
The issue is that I would like the else clause to run immediately when a key is pressed, and the if clause to run automatically after the reflex delay if no key is pressed. At the moment Enter has to be pressed, and only then does the computer judge who had the shortest reaction time. Which isn't snap...
I looked at KeyListeners, but they only seem to be available for UIs such as Swing. I also looked at thread interruption but couldn't work out how to trigger a thread interrupt and then handle the exceptions with the correct program flow. Or is it even possible?
I think it needs to be a multi-threaded solution, but I don't fully have a handle on concurrency/multi-threading yet, so any really good learning resources would be appreciated in addition to solutions.

If the console API weren't so dreadfully old, you could simply do something like
try {
System.in.readLine(100, TimeUnit.MILLIS);
System.out.println("You win!");
} catch (InterruptedException e) {
System.out.println("Too slow!");
}
but unfortunately, the API to read from a console was defined in the very first release of the Java programming language, and not reworked since, so it doesn't allow reading with a timeout. If a thread reads from an InputStream, it won't stop reading until there is data, the InputStream itself signals an error, or the entire JVM exits.
So if you really want to do this, you'd need something like this:
public static void main(String[] args) {
var readerThread = new Thread(() -> {
try (var scanner = new Scanner(System.in)) {
scanner.nextLine();
gameOver(true);
}
});
readerThread.setDaemon(true); // this thread should not inhibit JVM termination
readerThread.start();
System.out.println("Go!");
sleep(500, TimeUnit.MILLISECONDS);
gameOver(false);
}
static void sleep(int duration, TimeUnit unit) {
try {
Thread.sleep(unit.toMillis(duration));
} catch (InterruptedException e) {
throw new RuntimeException(e);
}
}
synchronized static void gameOver(boolean victory) {
if (!over) {
System.out.println(victory ? "You win!" : "I win!");
over = true;
}
}
static boolean over;
A few things to note here:
Since the two threads race to gameOver, we need to ensure they don't both execute it. By making the method synchronized, we ensure that the threads execute it one after the other, and by setting a boolean, we can detect whether the other thread was faster.
Everything is static because we can't cancel the reading thread. Granted, we could keep it running and reuse it for the next instance of the game, but it would eat any console input in the meantime (such as the answer to "do you want to try again?"), which is annoying. So I am not going to pretend that this solution is nice and reusable, and thus can make my life easier by making everything static.
The try-with-resources statement is a compact way to close a resource (such as a Scanner) once we are done with it.
The utility method for sleep is just there to move the pointless, but required, catch block out of the main method, so the main method is easier to read.

Questions about Threads and Callbacks in Java

I am reading Network Programming in Java by Elliotte, and in the chapter on threads he gives this piece of code as an example of a computation that can be run in a different thread
import java.io.*;
import java.security.*;
public class ReturnDigest extends Thread {
private String filename;
private byte[] digest;
public ReturnDigest(String filename) {
this.filename = filename;
}
@Override
public void run() {
try {
FileInputStream in = new FileInputStream(filename);
MessageDigest sha = MessageDigest.getInstance("SHA-256");
DigestInputStream din = new DigestInputStream(in, sha);
while (din.read() != -1) ; // read entire file
din.close();
digest = sha.digest();
} catch (IOException ex) {
System.err.println(ex);
} catch (NoSuchAlgorithmException ex) {
System.err.println(ex);
}
}
public byte[] getDigest() {
return digest;
}
}
To use this thread, he gave an approach which he referred to as the solution novices might use.
The solution most novices adopt is to make the getter method return a
flag value (or perhaps throw an exception) until the result field is
set.
And the solution he is referring to is:
public static void main(String[] args) {
ReturnDigest[] digests = new ReturnDigest[args.length];
for (int i = 0; i < args.length; i++) {
// Calculate the digest
digests[i] = new ReturnDigest(args[i]);
digests[i].start();
}
for (int i = 0; i < args.length; i++) {
while (true) {
// Now print the result
byte[] digest = digests[i].getDigest();
if (digest != null) {
StringBuilder result = new StringBuilder(args[i]);
result.append(": ");
result.append(DatatypeConverter.printHexBinary(digest));
System.out.println(result);
break;
}
}
}
}
He then went on to propose a better approach using callbacks, which he described as:
In fact, there’s a much simpler, more efficient way to handle the
problem. The infinite loop that repeatedly polls each ReturnDigest
object to see whether it’s finished can be eliminated. The trick is
that rather than having the main program repeatedly ask each
ReturnDigest thread whether it’s finished (like a five-year-old
repeatedly asking, “Are we there yet?” on a long car trip, and almost
as annoying), you let the thread tell the main program when it’s
finished. It does this by invoking a method in the main class that
started it. This is called a callback because the thread calls its
creator back when it’s done
And the code for the callback approach he gave is below:
import java.io.*;
import java.security.*;
public class CallbackDigest implements Runnable {
private String filename;
public CallbackDigest(String filename) {
this.filename = filename;
}
@Override
public void run() {
try {
FileInputStream in = new FileInputStream(filename);
MessageDigest sha = MessageDigest.getInstance("SHA-256");
DigestInputStream din = new DigestInputStream( in , sha);
while (din.read() != -1); // read entire file
din.close();
byte[] digest = sha.digest();
CallbackDigestUserInterface.receiveDigest(digest, filename); // this is the callback
} catch (IOException ex) {
System.err.println(ex);
} catch (NoSuchAlgorithmException ex) {
System.err.println(ex);
}
}
}
And the implementation of CallbackDigestUserInterface and its usage was given as:
public class CallbackDigestUserInterface {
public static void receiveDigest(byte[] digest, String name) {
StringBuilder result = new StringBuilder(name);
result.append(": ");
result.append(DatatypeConverter.printHexBinary(digest));
System.out.println(result);
}
public static void main(String[] args) {
for (String filename: args) {
// Calculate the digest
CallbackDigest cb = new CallbackDigest(filename);
Thread t = new Thread(cb);
t.start();
}
}
}
But my question (or clarification) is regarding what he said about this method...He mentioned
The trick is
that rather than having the main program repeatedly ask each
ReturnDigest thread whether it’s finished, you let the thread
tell the main program when it’s finished
Looking at the code, the thread that was created to run a separate computation is actually the one that continues executing the original program. It is not as if it passed the result back to the main thread. It seems it becomes the main thread!
So it is not as if the main thread gets notified when the task is done (instead of the main thread polling). It is that the main thread does not care about the result. It runs to its end and finishes. The new thread just runs another computation when it is done.
Do I understand this correctly?
How does this play with debugging? Does the thread now become the main thread, and would the debugger now treat it as such?
Is there another means to actually pass the result back to the main thread?
I would appreciate any help in understanding this better :)

It is a common misunderstanding to think that the "main" thread, the one that public static void main runs on, should be considered the main thread of the application. If you write a GUI app, for instance, the starting thread will likely finish and die well before the program ends.
Also, callbacks are normally called by the thread that they are handed off to. This is true in Swing, and in many other places (including DataFetcher, for example).

None of the other threads becomes the "main thread". Your main thread is the thread that starts with the main() method. Its job is to start the other threads... then it dies.
At this point, you never return to the main thread, but the child threads have callbacks... and that means that when they are done, they know where to redirect the flow of the program.
That is your receiveDigest() method. Its job is to display the results of the child threads once they complete. Is this method being called from the main thread, or the child threads? What do you think?
It is possible to pass the result back to the main thread. To do this, you need to keep the main thread from terminating, so it will need to have a loop to keep it going indefinitely, and to keep that loop from eating up processor duty, it will need to be put to sleep while the other threads work.
You can read an example of fork and join architecture here:
https://www.tutorialspoint.com/java_concurrency/concurrency_fork_join.htm
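One way to hand results back to the main thread without writing the polling loop yourself is a Future from an ExecutorService. The sketch below is only an illustration under a few assumptions: it hashes in-memory strings instead of files to stay self-contained, and the names FutureDigest, sha256 and digestAll are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class FutureDigest {
    static byte[] sha256(byte[] data) throws NoSuchAlgorithmException {
        return MessageDigest.getInstance("SHA-256").digest(data);
    }

    // Digests are computed on pool threads; the results are collected by the calling thread.
    static List<byte[]> digestAll(List<String> inputs) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<byte[]>> futures = new ArrayList<>();
            for (String input : inputs) {
                Callable<byte[]> task = () -> sha256(input.getBytes(StandardCharsets.UTF_8));
                futures.add(pool.submit(task));
            }
            List<byte[]> results = new ArrayList<>();
            for (Future<byte[]> future : futures) {
                results.add(future.get()); // blocks (without spinning) until that task is done
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<byte[]> digests = digestAll(List.of(args));
        for (int i = 0; i < args.length; i++) {
            System.out.println(args[i] + ": " + digests.get(i).length + " digest bytes");
        }
    }
}
```

Here the main thread stays alive, waits inside get(), and is the thread that prints the results.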

The book is misleading you.
First of all, there is no callback in the example. There is only one function calling another function by name. A true callback is a means of communication between different software modules. It is a pointer or reference to a function or object-with-methods that module A provides to module B so that module B can call it when something interesting happens. It has nothing at all to do with threads.
Second of all, the alleged callback communicates nothing between threads. The function call happens entirely in the new thread, after the main() thread has already died.
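To illustrate the definition above, here is a small single-threaded sketch of a callback in that sense. The names CallbackSketch and computeLength are hypothetical; the point is only that the called code receives a reference to invoke and does not know its caller by name.

```java
import java.util.function.BiConsumer;

final class CallbackSketch {
    // "Module B": does some work and reports through whatever callback it was given.
    static void computeLength(String text, BiConsumer<String, Integer> onDone) {
        int length = text.length();
        onDone.accept(text, length); // the callback: B calls back into code supplied by A
    }

    public static void main(String[] args) {
        // "Module A": supplies the callback as a lambda.
        computeLength("output.txt", (text, length) ->
                System.out.println(text + " has " + length + " characters"));
    }
}
```

No threads are involved here, which matches the claim that callbacks and threading are separate concerns.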

Using System.out.println(Thread.currentThread().getName() + " " +count); leads to synchronization

I was learning multithreading in Java. The tutorial said that removing synchronized should make the program buggy, and it did. So I was just experimenting and added the print line System.out.println(Thread.currentThread().getName() + " " +count);
and removed the synchronized keyword, and even then the program worked fine. But if only the synchronized keyword is removed and the print line (System.out.println(Thread.currentThread().getName() + " " +count);) is not added, the program is buggy as expected.
I can't understand how adding a print line can make it synchronized.
public class intro implements Runnable {
int n=10000;
private int count = 0;
public int getCount() { return count; }
public synchronized void incrSync() { count++; }
public void run () {
for (int i=0; i<n; i++) {
incrSync();
//System.out.println(Thread.currentThread().getName() + " " +count);
}
}
public static void main(String [] args) {
intro mtc = new intro();
Thread t1 = new Thread(mtc);
Thread t2 = new Thread(mtc);
t1.start();
t2.start();
try {
t1.join();
t2.join();
} catch (InterruptedException ie) {
System.out.println(ie);
ie.printStackTrace();
}
System.out.println("count = "+ mtc.getCount());
}
}

Synchronization issues happen between threads when multiple threads attempt access to the same field at the same time.
Without the printing the run method sits in a tight loop accessing the counter almost continuously. Making multiple threads do that without synchronization is very likely to cause a fault.
By adding the printing you are changing the loop to spend most (almost all) of its time printing and only occasionally increment the count. This is much less likely to cause contention.
The code is still buggy with the printing. The only difference is that the contention will happen much less often, and your test of just 10,000 iterations per thread is not sufficient to demonstrate the issue. You'd probably have to run it for a few years before the threads clashed.
This is a classic demonstration of why threading issues are so difficult to find and fix. That loop (with its print statement) could run on multiple threads for years without contention, but if just one clash between threads happens then the code breaks. Imagine that happening in a heart pacemaker or a satellite or a nuclear power station!
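For completeness, a counter that does not depend on timing luck can be sketched with AtomicInteger instead of a plain int. This is a swapped-in alternative to the synchronized method from the question, and the name AtomicCounterDemo is hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class AtomicCounterDemo {
    // Each thread increments the shared counter; incrementAndGet is an atomic read-modify-write.
    static int runDemo(int threads, int incrementsPerThread) {
        AtomicInteger count = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    count.incrementAndGet();
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return count.get();
    }

    public static void main(String[] args) {
        System.out.println("count = " + runDemo(2, 10000));
    }
}
```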

The println method calls newLine, which is a method with a synchronized block. The program gives the correct result, though it is not thread safe.
Consider T1 and T2 both reading count as 5 simultaneously: then a race condition happens. You get the correct result only because you are using System.out.println(Thread.currentThread().getName() + " " +count); which is blocking. Increase the number of threads.
private void newLine() {
try {
synchronized (this) {
ensureOpen();
textOut.newLine();
textOut.flushBuffer();
charOut.flushBuffer();
if (autoFlush)
out.flush();
}
}
catch (InterruptedIOException x) {
Thread.currentThread().interrupt();
}
catch (IOException x) {
trouble = true;
}
}

How can multithreading help increase performance in this situation?

I have a piece of code like this:
while(){
x = jdbc_readOperation();
y = getTokens(x);
jdbc_insertOperation(y);
}
public List<String> getTokens(String divText){
List<String> tokenList = new ArrayList<String>();
Matcher subMatcher = Pattern.compile("\\[[^\\]]*]").matcher(divText);
while (subMatcher.find()) {
String token = subMatcher.group();
tokenList.add(token);
}
return tokenList;
}
What I know is that multithreading can save time when one thread is blocked by I/O or the network. In this synchronous code, every step has to wait for the previous step to finish. What I want here is to maximize CPU utilization in getTokens().
My first thought was to put getTokens() in the run method of a class and create multiple threads. But I think it will not work, since there seems to be no performance benefit from having multiple threads on pure computation.
Will adopting multithreading help increase performance in this case? If so, how can I do that?

It depends on the rate at which jdbc_readOperation() produces data compared with the rate at which getTokens(x) processes it. Knowing that will help you figure out whether multithreading is going to help you.
You could try something like this (just for you to get the idea):
int workToBeDoneQueueSize = 1000;
int workDoneQueueSize = 1000;
BlockingQueue<String> workToBeDone = new LinkedBlockingQueue<>(workToBeDoneQueueSize);
// getTokens() returns a List<String>, so the result queue has to hold lists
BlockingQueue<List<String>> workDone = new LinkedBlockingQueue<>(workDoneQueueSize);
new Thread(() -> {
try {
while (true) {
workToBeDone.put(jdbc_readOperation());
}
} catch (InterruptedException e) {
e.printStackTrace();
// handle InterruptedException here
}
}).start();
int numOfWorkerThreads = 5; // just an example
for (int i = 0; i < numOfWorkerThreads; i++) {
new Thread(() -> {
try {
while (true) {
workDone.put(getTokens(workToBeDone.take()));
}
} catch (InterruptedException e) {
e.printStackTrace();
// handle InterruptedException here
}
}).start();
}
new Thread(() -> {
// you could improve this by making a batch operation
try {
while (true) {
jdbc_insertOperation(workDone.take());
}
} catch (InterruptedException e) {
e.printStackTrace();
// handle InterruptedException here
}
}).start();
Or you could learn how to use the ThreadPoolExecutor. (https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ThreadPoolExecutor.html)
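As a rough sketch of the executor route: Executors.newFixedThreadPool returns a pool backed by ThreadPoolExecutor, and invokeAll runs one task per row. The class name TokenPool and the method tokenizeAll are hypothetical, and the JDBC calls are left out so the example stays self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TokenPool {
    // Compiled once and shared; Pattern is immutable and safe to use from many threads.
    private static final Pattern TOKEN = Pattern.compile("\\[[^\\]]*]");

    static List<String> getTokens(String divText) {
        List<String> tokenList = new ArrayList<>();
        Matcher subMatcher = TOKEN.matcher(divText);
        while (subMatcher.find()) {
            tokenList.add(subMatcher.group());
        }
        return tokenList;
    }

    // Tokenizes every row on a fixed-size pool; results come back in the same order as the rows.
    static List<List<String>> tokenizeAll(List<String> rows, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<List<String>>> tasks = new ArrayList<>();
            for (String row : rows) {
                tasks.add(() -> getTokens(row));
            }
            List<List<String>> results = new ArrayList<>();
            for (Future<List<String>> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(tokenizeAll(List.of("a [x] b [y]", "no tokens here"), 2));
    }
}
```

Whether this is faster than the single-threaded loop depends on how expensive getTokens is relative to the database calls, so it is worth measuring before committing to it.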

To speed up getTokens() you can split the input String divText using the String.substring() method. Split it into as many substrings as you will have threads running getTokens(), so that each thread works on its own substring of divText. Choose the split points so that no bracketed token is cut in two.
Avoid creating more threads than the CPU can handle, since context switches create inefficiency.
https://docs.oracle.com/javase/8/docs/api/java/lang/String.html#substring-int-int-
An alternative is to split the input String of getTokens with the String.split method http://docs.oracle.com/javase/7/docs/api/java/lang/String.html#split%28java.lang.String%29, e.g. when the text is made up of words separated by spaces or other symbols. Specific parts of the resulting String array can then be passed to different threads.
