I have made a Java program that connects to a SQLite database using SQLite4Java.
I read from the serial port and write values to the database. This worked fine in the beginning, but now my program has grown and I have several threads. I have tried to handle that with a SQLiteQueue variable that executes database operations with something like this:
public void insertTempValue(final SQLiteStatement stmt, final long logTime, final double tempValue)
{
if(checkQueue("insertTempValue(SQLiteStatement, long, double)", "Queue is not running!", false))
{
queue.execute(new SQLiteJob<Object>()
{
protected Object job(SQLiteConnection connection) throws SQLiteException
{
stmt.bind(1, logTime);
stmt.bind(2, tempValue);
stmt.step();
stmt.reset(true);
return null;
}
});
}
} // end insertTempValue(SQLiteStatement, long, double)
But now my SQLite class can't execute the statements, reporting:
DB[1][U]: disposing [INSERT INTO Temperatures VALUES (?,?)]DB[1][U] from alien thread
SQLiteDB$6#8afbefd: job exception com.almworks.sqlite4java.SQLiteException: [-92] statement is disposed
So the execution does not happen.
I have tried to figure out what's wrong, and I think I need a Java wrapper that makes all the database operation calls from a single thread that the other threads go through.
Here is my problem: I don't know how to implement this in a good way.
How can I make a method-call and ensure that it always runs from the same thread?
Put all your database access code into a package and make all the classes package private. Write one Runnable or Thread subclass with a run() method that runs a loop. The loop checks for queued information requests, and runs the appropriate database access code to find the information, putting the information into the request and marking the request complete before going back to the queue.
Client code queues data requests and waits for answers, perhaps by blocking until the request is marked complete.
Data requests would look something like this:
import com.almworks.sqlite4java.SQLiteConnection;
import com.almworks.sqlite4java.SQLiteException;
import com.almworks.sqlite4java.SQLiteStatement;
public class InsertTempValueRequest {
    // Called from a client thread; the client queues this object after construction
    public InsertTempValueRequest(
            final long logTime,
            final double tempValue
    ) {
        this.logTime = logTime;
        this.tempValue = tempValue;
    }
    // Called from client threads after queueing to check for completion
    public boolean isComplete() {
        return isComplete;
    }
    // Called from the database thread after dequeuing this object
    void execute(
            SQLiteConnection connection,
            SQLiteStatement statement
    ) throws SQLiteException {
        // execute the statement using logTime and tempValue member data, and commit
        isComplete = true;
    }
    // Set once in the constructor, so final is enough for safe publication
    private final long logTime;
    private final double tempValue;
    // Written by the database thread, read by client threads
    private volatile boolean isComplete = false;
}
This will work, but I suspect there will be a lot of hassle in the implementation. You could also get by with a lock that permits only one thread at a time to access the database. The difference from your existing situation is that each access would begin by creating the database resources, including statements, from scratch, and would dispose of those resources before releasing the lock.
I found a solution to my problem. I have now implemented a wrapper class that performs all operations on my older SQLite class through an ExecutorService, inspired by the "Thread Executor Example" and with the correct usage taken from the ExecutorService Javadoc.
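For anyone landing here later, this is roughly the shape of such a wrapper. It is only a minimal sketch, not the code from my project: the class name SingleThreadDb is made up, and a plain ArrayList stands in for the SQLite connection so the example runs without sqlite4java. The point is that Executors.newSingleThreadExecutor() runs every submitted job on the same worker thread, so the connection and its statements would only ever be touched from that thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical wrapper: every database operation is funneled through one thread.
public class SingleThreadDb {
    // One worker thread; jobs run one at a time, in submission order.
    private final ExecutorService dbThread = Executors.newSingleThreadExecutor();
    // Stand-in for the real database; only ever touched from dbThread.
    private final List<String> rows = new ArrayList<>();

    // Callable from any thread; the Future yields the name of the thread that did the work.
    public Future<String> insertTempValue(final long logTime, final double tempValue) {
        Callable<String> job = () -> {
            // With sqlite4java, the connection and statements would be created and used here.
            rows.add(logTime + "," + tempValue);
            return Thread.currentThread().getName();
        };
        return dbThread.submit(job);
    }

    // Reads also go through the database thread.
    public Future<Integer> rowCount() {
        Callable<Integer> job = rows::size;
        return dbThread.submit(job);
    }

    public void shutdown() {
        dbThread.shutdown();
    }

    public static void main(String[] args) throws Exception {
        SingleThreadDb db = new SingleThreadDb();
        String first = db.insertTempValue(1L, 20.5).get();
        String second = db.insertTempValue(2L, 21.0).get();
        System.out.println(first.equals(second)); // true: same worker thread both times
        System.out.println(db.rowCount().get());  // 2
        db.shutdown();
    }
}
```

Callers that need the result block on Future.get(); callers that don't can simply ignore the returned Future.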
Setup
I have a multithreaded Java application that will receive 200-300 requests per second, each asking it to perform a task 'A' (which takes approximately 30 milliseconds) on the input received in the request.
The application has a cache (max size = 1 MB) which is read by each thread to perform task 'A' on the input received:
import java.util.HashMap;
public class DataProvider {
    private HashMap<KeyObject, ValueObject> cache;
    private Database database;
    // Scheduled to run in interval of 15 seconds by a background thread
    public synchronized void updateData() {
        this.cache = database.getData();
    }
    public HashMap<KeyObject, ValueObject> getCache() {
        return this.cache;
    }
}
KeyObject and ValueObject are POJOs. ValueObject contains a List of another POJO.
For every request received, the task is done in the following way:
public class TaskExecutor {
    private DataProvider dataProvider;
    public boolean doTask(final InputObject input) {
        final HashMap<KeyObject, ValueObject> data = dataProvider.getCache(); // shallow copy I think
        // Do Task 'A' using data
    }
}
Problem
One of the threads starts executing task 'A' at timestamp 't' using data 'd1' from the cache. At time 't + t1' the cache data gets updated to 'd2'. The thread now uses data 'd2' to finish the rest of the task, which completes at 't + t1 + t2'. Half of the task was completed with different data, which leads to an invalid outcome.
Current Approach
Each thread will create a deep copy of the cache and then use that copy to perform the task, using whichever of the following approaches performs best:
How do you make a deep copy of an object in Java?
Deep clone utility recommendation
Limitation
Cloning using deep copy will create thousands of objects, which may crash the JVM.
None of the cloning approaches looks good in terms of performance.
For your use case, returning a new cache from database.getData() is a much better choice. That way, you only have to create a new cache object once every 15 seconds. If you clone the cache in each task, you would have to create up to about 4,500 cache copies in the same 15 seconds (300 requests per second × 15 seconds). Returning a new cache object is clearly the right choice.
If the code you provided is the same as in your project, I believe database.getData() is changing the content of a single cache object instead of returning a new one. If you return a new cache object from this method, your problem will be solved: a task that read the old reference at its start keeps working on the old, unchanged map until it finishes.
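A minimal sketch of that idea, under some assumptions: String and Integer stand in for KeyObject and ValueObject, the fresh data is passed in as a parameter instead of coming from database.getData(), and the class name SnapshotCache is made up. Each refresh builds a brand new map and swaps the reference; the field is volatile so that reader threads see the new reference. A map that has been handed out is never modified afterwards.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class SnapshotCache {
    // volatile: readers always see the most recently published snapshot
    private volatile Map<String, Integer> cache = Collections.emptyMap();

    // Called by the background refresher: build a NEW map, then swap the reference.
    public void updateData(Map<String, Integer> fresh) {
        this.cache = Collections.unmodifiableMap(new HashMap<>(fresh));
    }

    // Returns the current snapshot; the returned map is never modified afterwards.
    public Map<String, Integer> getCache() {
        return cache;
    }

    public static void main(String[] args) {
        SnapshotCache provider = new SnapshotCache();
        provider.updateData(Map.of("k", 1));
        Map<String, Integer> data = provider.getCache();  // the task reads the reference once
        provider.updateData(Map.of("k", 2));              // a refresh happens mid-task
        System.out.println(data.get("k"));                // 1: the task still sees its snapshot
        System.out.println(provider.getCache().get("k")); // 2: new tasks see the new data
    }
}
```

The task must read the reference once at its start and use that local variable throughout; calling getCache() again in the middle of the task would bring the original problem back.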
I wrote a thread-safe class that takes input from multiple threads and uploads the result to S3 once it reaches a fixed size.
S3Exporter class
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import com.google.common.base.Preconditions;
// this class is thread safe.
public class S3Exporter {
    private static final int BUFFER_PADDING = 1000;
    private final int targetSize;
    private final ByteArrayOutputStream buf;
    private volatile boolean started;
    public S3Exporter(final int targetSize) {
        buf = new ByteArrayOutputStream(targetSize + BUFFER_PADDING);
        this.targetSize = targetSize;
        started = false;
    }
    public synchronized void start() {
        started = true;
    }
    public synchronized void end() {
        started = false;
        flush();
    }
    public synchronized void export(byte[] data) throws IOException {
        Preconditions.checkState(started, "Not started!");
        // append the whole array; the offset is into data, not into buf
        buf.write(data, 0, data.length);
        flushIfNeeded();
    }
    // only called from export(), which already holds the lock
    private void flushIfNeeded() {
        if (buf.size() >= targetSize) {
            flush();
        }
    }
    public synchronized void flush() {
        if (buf.size() > 0) {
            // upload buf to s3, it's a time-consuming operation
            buf.reset();
        }
    }
}
The client calls the export method to pass data; if an exception is thrown, the client will pass that data again later.
To avoid losing data when restarting the application, I add a shutdown hook when creating S3Exporter object:
S3Exporter exporter = new S3Exporter(10000);
Runtime.getRuntime().addShutdownHook(new Thread(() -> exporter.end()));
My concern is that the class is not scalable; I mean it could become the bottleneck of the system as the volume of data grows. I could think of 2 ways to improve the situation:
Do the time-consuming upload operation asynchronously: use an executor to upload and call ThreadPoolExecutor.awaitTermination() in the shutdown hook.
Just put the data into a LinkedBlockingQueue in the export method and use multiple threads to handle it. (This way is more scalable than the first, as I understand it.)
Either way, I then need to do more work in the shutdown hook thread to make sure accepted data is not lost, which as far as I know is not a good idea. Otherwise I would be taking the risk of losing data when restarting the application, which is the last thing I want to see.
My question
Is my concern about scalability a real problem? (To make the question less stupid, let's say the data size is a few bytes and the export method is called at 500 TPS.)
If the answer to the first question is yes, are my improvements right? How should I do the cleanup work to avoid losing data?
Scalability depends on requirements, constraints, desired service level, personal preferences, expected users growth rate, and especially money: given infinite resources, every piece of software can be scaled. You didn't mention any, so I guess you don't have any actual figure. In this phase, as a programmer, your job is to make a correct program that uses a predictable amount of resources.
Your program seems correct, and most of your assumptions are correct, too. However, I suggest immediately storing chunks in some local persistent database (or the raw filesystem) and having a periodic job, run in a separate thread, upload groups of chunks to S3 (you can use Camel for the boring parts). I also suggest removing any shutdown hooks. Such hooks are unreliable and should only be used as a last resort for quick and optional cleanup (optional in the sense that you must be prepared for the cleanup not having run properly to the end).
Using a file instead of memory, your data can survive fatal errors, and the working memory required by your application is almost independent of the load: there's an irrelevant amount of extra CPU and some disk I/O, which is way cheaper than memory.
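To make the suggestion concrete, here is a minimal sketch that uses the raw filesystem. Everything in it is an assumption made for illustration: the class name FileSpool, the single spool file, and the uploader callback that stands in for the real S3 upload. export() appends each chunk to the file before returning, and a periodic job calls drain(), which truncates the file only after the uploader has returned normally.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.function.Consumer;

// Hypothetical spool: export() appends to a local file; a periodic job drains it.
public class FileSpool {
    private final Path spoolFile;

    public FileSpool(Path spoolFile) {
        this.spoolFile = spoolFile;
    }

    // Called by client threads: the chunk is written to the file before this returns.
    public synchronized void export(byte[] data) throws IOException {
        Files.write(spoolFile, data,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.APPEND);
    }

    // Called by the periodic job: pass everything spooled so far to the uploader,
    // and empty the file only if the uploader returned without throwing.
    public synchronized void drain(Consumer<byte[]> uploader) throws IOException {
        if (!Files.exists(spoolFile) || Files.size(spoolFile) == 0) {
            return;
        }
        byte[] pending = Files.readAllBytes(spoolFile);
        uploader.accept(pending); // stands in for the S3 upload
        Files.write(spoolFile, new byte[0],
                StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("spool", ".bin");
        FileSpool spool = new FileSpool(file);
        spool.export("ab".getBytes(StandardCharsets.UTF_8));
        spool.export("cd".getBytes(StandardCharsets.UTF_8));
        spool.drain(bytes -> System.out.println(new String(bytes, StandardCharsets.UTF_8)));
        System.out.println(Files.size(file)); // 0 once the drain has succeeded
        Files.delete(file);
    }
}
```

In this sketch drain() holds the lock for the whole upload, so export() callers wait while it runs. A real version would probably rotate to a new file before uploading the old one, so that the slow upload happens outside the lock.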
I am fairly new to Java and extremely new to concurrency. However, I have worked with C# for a while. It doesn't really matter, but for the sake of example, I am trying to pull data off a table on a server. I want the method to wait until the data is completely pulled. In C#, we have the async-await pattern, which can be used like this:
private async Task<List<ToDoItem>> PullItems ()
{
var newRemoteItems = await (from p in remoteTable select p).ToListAsync();
return newRemoteItems;
}
I am trying to get a similar effect in Java. Here is the exact code I'm trying to port (look inside the SynchronizeAsync method). However, the Java Azure SDK works with callbacks. So, I have a few options:
Use the wait and notify pattern. The following code doesn't work, since I don't understand what I'm doing.
final List<TEntity> newRemoteItems = new ArrayList<TEntity>();
synchronized( this ) {
remoteTable.where().field("lastSynchronized").gt(currentTimeStamp)
.execute(new TableQueryCallback<TEntity>() {
public void onCompleted(List<TEntity> result,
int count,
Exception exception,
ServiceFilterResponse response) {
if (exception == null) {
newRemoteItems.clear();
for (TEntity item: result) {
newRemoteItems.add(item);
}
}
}
});
}
this.wait();
//DO SOME OTHER STUFF
My other option is to move DO SOME OTHER STUFF right inside the callback's if(exception == null) block. However, this would chop my whole method logic into pieces and disturb the continuous flow. I don't really like this approach.
Now, here are questions:
What is the recommended way of doing this? I am working through the tutorial on Java concurrency at Oracle and am still clueless. Almost everywhere I read, it is recommended to use higher-level constructs rather than wait and notify.
What is wrong with my wait and notify?
My implementation blocks the main thread and it's considered a bad practice. But what else can I do? I must wait for the server to respond! Also, doesn't C# await block the main thread? How is that not a bad thing?
Either put DO SOME OTHER STUFF into the callback, or declare a semaphore (created with zero permits), call semaphore.release() in the callback, and call semaphore.acquire() where you want to wait. Remove synchronized(this) and this.wait().
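A minimal sketch of the semaphore variant. The queryAsync method here is a made-up stand-in for the SDK's callback-based execute(...) call, so that the example runs on its own; it simply delivers a fixed list to the callback on another thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Semaphore;
import java.util.function.Consumer;

public class SemaphoreWait {
    // Hypothetical stand-in for a callback-based API:
    // it delivers the result to the callback on another thread.
    static void queryAsync(Consumer<List<String>> onCompleted) {
        new Thread(() -> onCompleted.accept(Arrays.asList("a", "b"))).start();
    }

    static List<String> pullItems() throws InterruptedException {
        final List<String> newRemoteItems = new ArrayList<>();
        final Semaphore done = new Semaphore(0); // zero permits: acquire() blocks until release()

        queryAsync(result -> {
            newRemoteItems.addAll(result);
            done.release();                      // signal completion from the callback thread
        });

        done.acquire();                          // wait here until the callback has run
        return newRemoteItems;                   // DO SOME OTHER STUFF can follow from here
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(pullItems());         // prints [a, b]
    }
}
```

Make sure the callback calls release() on the error path as well, otherwise the waiting thread blocks forever when the request fails.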
A multi-threaded piece of code accesses a resource (eg: a filesystem) asynchronously.
To achieve this, I'll use condition variables. Suppose the FileSystem is an interface like:
class FileSystem {
// sends a read request to the fileSystem
read(String fileName) {
// ...
// upon completion, execute a callback
callback(returnCode, buffer);
}
}
I have now an application accessing the FileSystem. Suppose I can issue multiple reads through a readFile() method.
The operation should write data to the byte buffer passed to it.
// constructor
public Test() {
FileSystem disk = ...
boolean readReady = ...
Lock lock = ...
Condition responseReady = lock.newCondition();
}
// the read file method in question
public void readFile(String file) {
try {
lock.lock(); // lets imagine this operation needs a lock
// this operation may take a while to complete;
// but the method should return immediately
disk.read(file);
while (!readReady) { // <<< THIS
responseReady.awaitUninterruptibly();
}
}
finally {
lock.unlock();
}
}
public void callback(int returnCode, byte[] buffer) {
// other code snipped...
readReady = true; // <<< AND THIS
responseReady.signal();
}
Is this the correct way to use condition variables? Will readFile() return immediately?
(I know there is some silliness in using locks for reads, but writing to a file is also an option.)
There's a lot missing from your question (i.e. no specific mention of Threads) but I will try to answer anyway.
Neither the lock nor the condition variable gives you background capabilities; they are just used for a thread to wait for signals from other threads. Although you don't mention it, the disk.read(file) method could spawn a thread to do the IO and then return immediately, but the caller is going to sit in the readReady loop anyway, which seems pointless. If the caller has to wait, then it could perform the IO itself.
A better pattern could be to use something like the ExecutorService introduced in Java 5:
ExecutorService pool = Executors.newFixedThreadPool(numThreads);
You can then call pool.submit(Callable) which will submit the job to be performed in the background in another thread (when the pool next has one available). Submit returns a Future which the caller can use to investigate if the background task has finished. It can return a result object as well. The concurrent classes take care of the locking and conditional signal/wait logic for you.
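Here is a small sketch of that pattern. The slowRead method is a hypothetical stand-in for your disk.read(file), and the pool size of 2 is arbitrary. The submit call returns at once, and the caller only blocks when it asks the Future for the result.

```java
import java.nio.charset.StandardCharsets;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BackgroundRead {
    // Hypothetical stand-in for the slow disk.read(file) call.
    static byte[] slowRead(String file) {
        return ("contents of " + file).getBytes(StandardCharsets.UTF_8);
    }

    // Returns immediately; the read itself runs on a pool thread.
    static Future<byte[]> readFile(ExecutorService pool, String file) {
        Callable<byte[]> job = () -> slowRead(file);
        return pool.submit(job);
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Future<byte[]> pending = readFile(pool, "a.txt");
            // ... the caller is free to do other work here ...
            byte[] buffer = pending.get(); // blocks only when the result is actually needed
            System.out.println(new String(buffer, StandardCharsets.UTF_8));
        } finally {
            pool.shutdown();
        }
    }
}
```

If the caller never needs to block, it can poll pending.isDone() instead of calling get().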
Hope this helps.
p.s. Also, you should make readReady be volatile since it is not synchronized.
I'm relatively new to Hibernate, so please be gentle. I'm having an issue with a long-running method (~2 min long) and changing the value of a status field on an object stored in the DB. The pseudo-code below should help explain my issue.
public foo(thing) {
if (thing.getStatus() == "ready") {
thing.setStatus("finished");
doSomethingAndTakeALongTime();
} else {
// Thing already has a status of finished. Send the user back a message.
}
}
The pseudo-code shouldn't need much explanation. I want doSomethingAndTakeALongTime() to run, but only if thing has a status of "ready". My issue arises because doSomethingAndTakeALongTime() takes 2 minutes to finish, and the change to thing's status field doesn't get persisted to the database until execution leaves foo(). So another user can put in a request during those 2 minutes and the if statement will evaluate to true.
I've already tried updating the field and flushing the session manually, but it didn't seem to work. I'm not sure what to do from here and would appreciate any help.
PS: My hibernate session is managed by spring.
Basically you need to let it run in a separate Thread so that the method returns immediately. Otherwise it will indeed block until the long-running task is finished. You can pass the entity itself to the thread, so that the thread can update the status itself. Here's a basic kickoff example using a simple Thread.
public class Task extends Thread {
private Entity entity;
public Task(Entity entity) {
this.entity = entity;
}
public void run() {
entity.setStatus(Status.RUNNING);
// ...
// Long running task here.
// ...
entity.setStatus(Status.FINISHED);
}
}
and
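public synchronized void foo(Entity entity) {
    if (entity.getStatus() == Status.READY) {
        // Mark it as running before returning, so a second call cannot start it again
        entity.setStatus(Status.RUNNING);
        new Task(entity).start();
    } else {
        // ...
    }
}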
With the Status in an enum you can even use a switch statement instead of an if/else.
switch (entity.getStatus()) {
case READY:
new Task(entity).start();
break;
case RUNNING:
// It is still running .. Have patience!
break;
case FINISHED:
// It is finished!
break;
}
For more robust control of running threads, you may want to consider ExecutorService instead. With it you can control the maximum number of threads and specify a timeout.
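A minimal sketch of the ExecutorService variant, under some assumptions: the status lives in an in-memory AtomicReference rather than in a persisted entity, the class name GuardedTask and the pool size of 2 are made up, and the status goes to FINISHED even if the work throws. The compareAndSet call makes the check and the status change a single atomic step, so only one caller can ever start the work.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

public class GuardedTask {
    enum Status { READY, RUNNING, FINISHED }

    // Hypothetical in-memory stand-in for the persisted status field.
    private final AtomicReference<Status> status = new AtomicReference<>(Status.READY);
    // The fixed pool caps the number of tasks running at the same time.
    private final ExecutorService pool = Executors.newFixedThreadPool(2);

    // Returns true only for the one caller that moves the status from READY to RUNNING.
    public boolean foo(Runnable longRunningWork) {
        if (!status.compareAndSet(Status.READY, Status.RUNNING)) {
            return false; // already running or finished
        }
        pool.execute(() -> {
            try {
                longRunningWork.run();
            } finally {
                status.set(Status.FINISHED);
            }
        });
        return true;
    }

    public Status status() {
        return status.get();
    }

    // Stop accepting work and wait up to the given time for running tasks to end.
    public boolean shutdownAndWait(long timeout, TimeUnit unit) throws InterruptedException {
        pool.shutdown();
        return pool.awaitTermination(timeout, unit);
    }

    public static void main(String[] args) throws InterruptedException {
        GuardedTask task = new GuardedTask();
        System.out.println(task.foo(() -> { })); // true: the first request starts the work
        System.out.println(task.foo(() -> { })); // false: the status is no longer READY
        System.out.println(task.shutdownAndWait(5, TimeUnit.SECONDS));
        System.out.println(task.status());
    }
}
```

With a status that lives in the database, the same idea would need the READY to RUNNING change to be committed before foo() returns, which this in-memory sketch does not show.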
What is the method doSomethingAndTakeALongTime() doing? Is it performing a DB operation or just executing some business logic?
If it's not doing any DB operation and the status check has passed, you can persist the object before calling that method.
If it is doing a DB operation, then you need to wait for it. So even if you run it in a thread, you need to wait for that thread to complete (thread.join() does that).
The thing is, before you persist you must have completed all operations based on your ORM object, right? So try to optimize the method's logic so that it finishes before you persist.
Thanks.