Multi-threaded code and condition variable usage - java

A multi-threaded piece of code accesses a resource (e.g. a filesystem) asynchronously.
To achieve this, I'll use condition variables. Suppose the FileSystem is an interface like:
class FileSystem {
    // sends a read request to the file system
    void read(String fileName) {
        // ...
        // upon completion, execute a callback
        callback(returnCode, buffer);
    }
}
I now have an application accessing the FileSystem. Suppose I can issue multiple reads through a readFile() method.
The operation should write data to the byte buffer passed to it.
// constructor
public Test() {
    FileSystem disk = ...
    boolean readReady = ...
    Lock lock = ...
    Condition responseReady = lock.newCondition();
}
// the read file method in question
public void readFile(String file) {
    try {
        lock.lock(); // let's imagine this operation needs a lock
        // this operation may take a while to complete,
        // but the method should return immediately
        disk.read(file);
        while (!readReady) { // <<< THIS
            responseReady.awaitUninterruptibly();
        }
    }
    finally {
        lock.unlock();
    }
}
public void callback(int returnCode, byte[] buffer) {
    // other code snipped...
    readReady = true; // <<< AND THIS
    responseReady.signal();
}
Is this the correct way to use condition variables? Will readFile() return immediately?
(I know there is some silliness in using locks for reads, but writing to a file is also an option.)

There's a lot missing from your question (e.g. no specific mention of threads), but I will try to answer anyway.
Neither the lock nor the condition variables give you background capabilities -- they are just used for one thread to wait for signals from other threads. Although you don't mention it, the disk.read(file) method could spawn a thread to do the IO and then return immediately, but the caller is going to sit in the readReady loop anyway, which seems pointless. If the caller has to wait, it could perform the IO itself.
A better pattern could be to use something like the Java 5 Executors service:
ExecutorService pool = Executors.newFixedThreadPool(numThreads);
You can then call pool.submit(Callable), which submits the job to be performed in the background in another thread (when the pool next has one available). submit() returns a Future which the caller can use to check whether the background task has finished, and which can carry a result object as well. The concurrent classes take care of the locking and condition signal/wait logic for you.
Hope this helps.
p.s. Also, you should make readReady volatile, since it is not synchronized.
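For illustration, a minimal sketch of that executor-based approach might look like the following. It assumes a hypothetical blocking read(String) that returns the file contents as a byte[] (the interface, class, and method names here are made up, not part of the original question):
import java.util.concurrent.*;

public class AsyncFileReader {

    // Hypothetical blocking read API, not the callback-based one from the question.
    public interface BlockingFileSystem {
        byte[] read(String fileName) throws Exception;
    }

    private final ExecutorService pool = Executors.newFixedThreadPool(4);
    private final BlockingFileSystem disk;

    public AsyncFileReader(BlockingFileSystem disk) {
        this.disk = disk;
    }

    // Returns immediately; the actual read runs on a pool thread.
    public Future<byte[]> readFile(String file) {
        return pool.submit(() -> disk.read(file));
    }
}
The caller keeps the returned Future and calls get() only at the point where it actually needs the data; that call is what blocks, not readFile().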

Related

Is it safe to use AtomicBoolean for database locking in Scala/Java?

I have an application where I want to ensure that a method is called at most once concurrently, say when updating user balance in a database.
I am thinking of using the following locking mechanism (showing Scala code below, but it should be similar with Java lambdas):
object Foo {
  val dbLocked = new java.util.concurrent.atomic.AtomicBoolean(false)

  def usingAtoimcDB[T](f: => T): T = {
    if (dbLocked.get) throw new Exception("db is locked")
    dbLocked.set(true)
    try f
    finally dbLocked.set(false)
  }
}
Is this safe to use when usingAtoimcDB may be called concurrently?
EDIT: The corrected code is below, as pointed out in this answer:
def usingAtoimcDB[T](f: => T): T = {
  if (dbLocked.compareAndSet(false, true)) {
    // db is now locked
    try f
    finally dbLocked.set(false)
  } else {
    // db is already locked
    throw new Exception("db is locked")
  }
}
EDIT 2:
Using a spin loop. Is this also OK?
def usingAtoimcDB[T](f: => T): T = {
  while (!dbLocked.compareAndSet(false, true)) { Thread.sleep(1) }
  try f
  finally dbLocked.set(false)
}
EDIT 3: Based on the answers and comments below, I am also considering using queues.
Inadvisable. You are requiring that the same piece of code, running in the same application instance on the same server, is the single point where that transaction happens. There is also no provision to make this constraint stand out. Long after you have moved on, someone may start a second application instance, or whatever.
A database commit/rollback, on the other hand, is a quite simple and reliable mechanism.
If you cannot write an integration (unit) test that enforces this single point of access, then do not do it.
If you do it:
Revoke rights to modify the table from the normal database user
Add a new database user who has sufficient rights granted
And still: do not do it.
The code you posted above is not thread-safe, because you are not using an atomic check-and-set operation. Two threads can both be executing the if (dbLocked.get) statement at the same time and both get false as the answer, and then both will do dbLocked.set(true) and call f.
If you really want to use AtomicBoolean, then you must use compareAndSet as @leshkin already showed - this is an atomic operation that does the check and set in one go, without the possibility of another thread doing the same thing at the same time, so it is thread-safe.
You are using an AtomicBoolean as a lock here. There are classes in the standard Java library which are better suited (and specifically made) for this purpose; have a look at the package java.util.concurrent.locks.
You could for example use class ReentrantReadWriteLock, which combines two locks for reading and writing. The write lock is exclusive (when it's locked, nobody else can read or write); the read lock is shared (when it's locked, nobody can write, but others can read at the same time). This allows for there to be multiple readers concurrently, but only one writer at a time, possibly improving efficiency (it's not necessary to make reading an exclusive operation).
Example:
import java.util.concurrent.locks._

object Foo {
  private val lock: ReadWriteLock = new ReentrantReadWriteLock

  def doWriteOperation[T](f: => T): T = {
    // Locks the write lock
    lock.writeLock.lock()
    try {
      f
    } finally {
      lock.writeLock.unlock()
    }
  }

  def doReadOperation[T](f: => T): T = {
    // Locks the read lock
    lock.readLock.lock()
    try {
      f
    } finally {
      lock.readLock.unlock()
    }
  }
}
Yes, it should work as expected. I would slightly modify your function to use a compareAndSet call.
The compareAndSet method has the advantage of being an atomic operation: the check and the update happen together, so there are no race conditions and the value is changed atomically.
def usingAtoimcDB[T](f: => T): T = {
  if (dbLocked.compareAndSet(false, true)) {
    // db is now locked
    try f
    finally dbLocked.set(false)
  } else {
    // db is already locked
    throw new Exception("db is locked")
  }
}

get input from multiple threads and upload file with fixed size to S3

I wrote a thread-safe class that takes input from multiple threads and uploads the result to S3 once it reaches a fixed size.
S3Exporter class
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import com.google.common.base.Preconditions;

// this class is thread safe.
public class S3Exporter {
    private static final int BUFFER_PADDING = 1000;
    private final int targetSize;
    private final ByteArrayOutputStream buf;
    private volatile boolean started;

    public S3Exporter(final int targetSize) {
        buf = new ByteArrayOutputStream(targetSize + BUFFER_PADDING);
        this.targetSize = targetSize;
        started = false;
    }

    public synchronized void start() {
        started = true;
    }

    public synchronized void end() {
        started = false;
        flush();
    }

    public synchronized void export(byte[] data) throws IOException {
        Preconditions.checkState(started, "Not started!");
        buf.write(data, 0, data.length);
        flushIfNeeded();
    }

    private void flushIfNeeded() {
        if (buf.size() >= targetSize) {
            flush();
        }
    }

    public synchronized void flush() {
        if (buf.size() > 0) {
            // upload buf to S3, it's a time-consuming operation
            buf.reset();
        }
    }
}
The client calls the export method to pass data, and if an exception is thrown the client will pass that data again later.
To avoid losing data when restarting the application, I add a shutdown hook when creating S3Exporter object:
S3Exporter exporter = new S3Exporter(10000);
Runtime.getRuntime().addShutdownHook(new Thread(() -> exporter.end()));
My concern is that the class is not scalable; it could become a bottleneck of the system as the amount of data grows. I can think of two ways to improve the situation:
do the time-consuming upload operation asynchronously: use an executor to upload and call ThreadPoolExecutor.awaitTermination() in the shutdown hook.
just put data into a LinkedBlockingQueue in the export method and use multiple threads to handle it (this way is more scalable than the first, as I understand it).
Then I need to do more work in the shutdown hook thread to make sure the accepted data is not lost, and I know that's not a good idea. Otherwise I take the risk of losing data when restarting the application, which is the last thing I want to see.
My question
Is my concern about scalability really a problem? (To make the question less vague, let's say each data item is a few bytes and the export method is called at about 500 TPS.)
If the answer to the first question is yes, what about my improvements, are they right? How should I do the cleanup work to avoid losing data?
Scalability depends on requirements, constraints, desired service level, personal preferences, expected user growth rate, and especially money: given infinite resources, every piece of software can be scaled. You didn't mention any, so I guess you don't have any actual figures. In this phase, as a programmer, your job is to make a correct program that uses a predictable amount of resources.
Your program seems correct, and most of your assumptions are correct, too. However, I suggest immediately storing chunks in some local persistent database (or the raw filesystem), having a periodic job, run in a separate thread, that uploads groups of chunks to S3, and removing any shutdown hooks (you can use Camel for the boring parts). This is because such hooks are unreliable and should only be used as a last resort for quick, optional cleanup (optional in the sense that you must be prepared for the cleanup not to have run to completion).
Using a file instead of memory, your data can survive fatal errors, and the working memory required by your application is almost independent of the load: there's a negligible amount of extra CPU and some disk I/O, which is way cheaper than memory.
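As a rough illustration of that suggestion (not the asker's code), the sketch below spools chunks to a local file and has a scheduled task upload and truncate it periodically. The class name, spool path, and period are made up, and the S3 upload itself is left as a comment, as in the original snippet:
import java.io.IOException;
import java.nio.file.*;
import java.util.concurrent.*;

public class SpoolingS3Exporter {
    private final Path spoolFile;
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    public SpoolingS3Exporter(Path spoolFile, long periodSeconds) {
        this.spoolFile = spoolFile;
        // Periodic upload job running in a separate thread.
        scheduler.scheduleAtFixedRate(this::uploadSpool, periodSeconds, periodSeconds,
                TimeUnit.SECONDS);
    }

    // Called by producer threads; appending to a local file is cheap and durable.
    public synchronized void export(byte[] data) throws IOException {
        Files.write(spoolFile, data, StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    private synchronized void uploadSpool() {
        try {
            if (Files.exists(spoolFile) && Files.size(spoolFile) > 0) {
                // upload spoolFile to S3 here (time-consuming), then truncate it
                Files.write(spoolFile, new byte[0], StandardOpenOption.TRUNCATE_EXISTING);
            }
        } catch (IOException e) {
            // leave the file in place; the next run will retry
        }
    }
}
Because exported data hits the disk before export() returns, a crash or restart does not lose it; the next periodic run simply picks it up.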

How to implement Java single Database thread

I have made a Java program that connects to a SQLite database using SQLite4Java.
I read from the serial port and write values to the database. This worked fine in the beginning, but now my program has grown and I have several threads. I have tried to handle that with a SQLiteQueue variable that executes database operations, with something like this:
public void insertTempValue(final SQLiteStatement stmt, final long logTime, final double tempValue)
{
    if (checkQueue("insertTempValue(SQLiteStatement, long, double)", "Queue is not running!", false))
    {
        queue.execute(new SQLiteJob<Object>()
        {
            protected Object job(SQLiteConnection connection) throws SQLiteException
            {
                stmt.bind(1, logTime);
                stmt.bind(2, tempValue);
                stmt.step();
                stmt.reset(true);
                return null;
            }
        });
    }
} // end insertTempValue(SQLiteStatement, long, double)
But now my SQLite class can't execute the statements, reporting:
DB[1][U]: disposing [INSERT INTO Temperatures VALUES (?,?)]DB[1][U] from alien thread
SQLiteDB$6#8afbefd: job exception com.almworks.sqlite4java.SQLiteException: [-92] statement is disposed
So the execution does not happen.
I have tried to figure out what's wrong, and I think I need a Java wrapper that makes all the database operation calls from a single thread that the other threads go through.
Here is my problem: I don't know how to implement this in a good way.
How can I make a method call and ensure that it always runs from the same thread?
Put all your database access code into a package and make all the classes package private. Write one Runnable or Thread subclass with a run() method that runs a loop. The loop checks for queued information requests, and runs the appropriate database access code to find the information, putting the information into the request and marking the request complete before going back to the queue.
Client code queues data requests and waits for answers, perhaps by blocking until the request is marked complete.
Data requests would look something like this:
public class InsertTempValueRequest {

    // This constructor is called from client threads;
    // the client thread queues this object after construction
    public InsertTempValueRequest(
            final long logTime,
            final double tempValue
    ) {
        this.logTime = logTime;
        this.tempValue = tempValue;
    }

    // This method is called from client threads after queueing to check for completion
    public boolean isComplete() {
        return isComplete;
    }

    // This method is called from the database thread after dequeuing this object
    public void execute(
            SQLiteConnection connection,
            SQLiteStatement statement
    ) {
        // execute the statement using logTime and tempValue member data, and commit
        isComplete = true;
    }

    private volatile long logTime;
    private volatile double tempValue;
    private volatile boolean isComplete = false;
}
This will work, but I suspect there will be a lot of hassle in the implementation. I think you could also get by with a lock that only permits one thread at a time to access the database, and also - this is the difference from your existing situation - beginning each access by creating the database resources, including statements, from scratch, and disposing of those resources before releasing the lock.
I found a solution to my problem. I have now implemented a wrapper class that performs all operations on my older SQLite class using an ExecutorService, inspired by Thread Executor Example, and got the correct usage from the Java Doc for ExecutorService.
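For illustration only, a minimal sketch of that kind of wrapper might look like the following; it uses a single-threaded executor so every statement is created and executed on one dedicated thread (the class and method names here are made up, not the poster's actual code):
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SingleThreadDbWrapper {
    // One thread owns the SQLite connection and every prepared statement.
    private final ExecutorService dbExecutor = Executors.newSingleThreadExecutor();

    public Future<?> insertTempValue(final long logTime, final double tempValue) {
        return dbExecutor.submit(() -> {
            // prepare, bind, and step the INSERT here; because this always runs on
            // the same single thread, sqlite4java never sees an "alien thread"
        });
    }

    public void shutdown() {
        dbExecutor.shutdown();
    }
}
Callers that need to know when an insert has finished can block on the returned Future; callers that don't can simply ignore it.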

Pause execution of a method until callback is finished

I am fairly new to Java and extremely new to concurrency. However, I have worked with C# for a while. It doesn't really matter, but for the sake of example, I am trying to pull data from a table on a server. I want the method to wait until the data is completely pulled. In C#, we have the async-await pattern, which can be used like this:
private async Task<List<ToDoItem>> PullItems()
{
    var newRemoteItems = await (from p in remoteTable select p).ToListAsync();
    return newRemoteItems;
}
I am trying to have a similar effect in Java. Here is the exact code I'm trying to port (look inside the SynchronizeAsync method). However, the Java Azure SDK works with callbacks. So I have a few options:
Use the wait and notify pattern. The following code doesn't work, since I don't understand what I'm doing.
final List<TEntity> newRemoteItems = new ArrayList<TEntity>();
synchronized (this) {
    remoteTable.where().field("lastSynchronized").gt(currentTimeStamp)
        .execute(new TableQueryCallback<TEntity>() {
            public void onCompleted(List<TEntity> result,
                                    int count,
                                    Exception exception,
                                    ServiceFilterResponse response) {
                if (exception == null) {
                    newRemoteItems.clear();
                    for (TEntity item : result) {
                        newRemoteItems.add(item);
                    }
                }
            }
        });
}
this.wait();
//DO SOME OTHER STUFF
My other option is to move DO SOME OTHER STUFF right inside the callback's if (exception == null) block. However, this would result in my method logic being chopped into pieces, disturbing the continuous flow. I don't really like this approach.
Now, here are questions:
What is the recommended way of doing this? I am working through the Java concurrency tutorial at Oracle, but I'm still clueless. Almost everywhere I read, it is recommended to use higher-level constructs rather than wait and notify.
What is wrong with my wait and notify?
My implementation blocks the main thread, and that's considered bad practice. But what else can I do? I must wait for the server to respond! Also, doesn't C#'s await block the main thread? How is that not a bad thing?
Either put DO SOME OTHER STUFF into the callback, or declare a semaphore, call semaphore.release() in the callback, and call semaphore.acquire() where you want to wait. Remove synchronized(this) and this.wait().
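A minimal sketch of the semaphore variant, reusing the callback-style query fragment from the question (the query API itself is taken from the question as-is, not verified here):
import java.util.concurrent.Semaphore;

final Semaphore done = new Semaphore(0);  // zero permits, so acquire() blocks
final List<TEntity> newRemoteItems = new ArrayList<TEntity>();

remoteTable.where().field("lastSynchronized").gt(currentTimeStamp)
    .execute(new TableQueryCallback<TEntity>() {
        public void onCompleted(List<TEntity> result,
                                int count,
                                Exception exception,
                                ServiceFilterResponse response) {
            if (exception == null) {
                newRemoteItems.addAll(result);
            }
            done.release();  // wake the waiting thread, success or failure
        }
    });

done.acquireUninterruptibly();  // blocks here until the callback has fired
// DO SOME OTHER STUFF
Releasing the semaphore unconditionally in the callback keeps the waiting thread from hanging forever when the query fails.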

Encapsulating a multi-threaded operation in Java

I have a situation where I have a large number of classes that need to do file (read-only) access. This is part of a web app running on top of OSGi, so there will be a lot of concurrent need for access.
So I'm building an OSGi service to access the file system for all the other pieces that will need it, providing centralized access, as this also simplifies configuration of file locations, etc.
It occurs to me that a multi-threaded approach, along with a thread pool, makes the most sense.
So the question is this:
If I do this and I have a service with an interface like:
FileService.getFileAsClass(class);
and the method getFileAsClass(class) looks kind of like this (this is a sketch; it may not be perfect Java code):
public <T> T getFileAsClass(Class<T> clazz) {
    Future<InputStream> classFuture = threadpool.submit(new Callable<InputStream>() {
        /* initialization block */
        {
            // any setup from configs
        }

        /* implement Callable */
        public InputStream call() {
            InputStream stream = null; // new InputStream from the file location
            boolean giveUp = false;
            while (null == stream && !giveUp) {
                // Code that tries to read in the file 4
                // times with a Thread.sleep(), then gives up;
                // this is here to make sure we aren't busy updating the file.
            }
            return stream;
        }
    });
    // once we have the file, convert it and return it.
    return InputStreamToClassConverter.<T>convert(classFuture.get());
}
Will that correctly wait until the relevant operation is done before calling InputStreamToClassConverter.convert?
This is my first time writing multithreaded Java code, so I'm not sure what to expect for some of the behavior. I don't care about the order in which threads complete, only that the file handling is done asynchronously, and that once the file pull is done, then and only then is the converter used.
