Why does a single thread make my Java Program so much faster?


I have been given the task of creating a SQL database and a Java GUI to access it. I pretty much have it working, but I have a question about threads. Before today I did not use any threads in my program, and as a result I had to wait around 5 - 10 seconds just to pull 150 records from the database. This was very inconvenient, and I was not sure whether I could fix the issue. Today I read online about using threads in programs similar to mine, and I decided to use just one thread in this method:
public Vector VectorizeView(final String viewName) {
    final Vector table = new Vector();
    int cCount = 0;
    try {
        cCount = getColumnCount(viewName);
    } catch (SQLException e1) {
        e1.printStackTrace();
    }
    final int viewNameCount = cCount;
    Thread runner = new Thread(){
        public void run(){
            try {
                Connection connection = DriverManager.getConnection(getUrl(),
                        getUser(), getPassword());
                Statement statement = connection.createStatement();
                ResultSet result = statement.executeQuery("Select * FROM "
                        + viewName);
                while (result.next()) {
                    Vector row = new Vector();
                    for (int i = 1; i <= viewNameCount; i++) {
                        String resultString = result.getString(i);
                        if (result.wasNull()) {
                            resultString = "NULL";
                        } else {
                            resultString = result.getString(i);
                        }
                        row.addElement(resultString);
                    }
                    table.addElement(row);
                }
            } catch (SQLException e) {
                e.printStackTrace();
            }
        }
    };
    runner.start();
    return table;
}
The only thing I really changed was adding the thread 'runner', and the performance increased dramatically. Pulling 500 records now happens almost instantly.
The method looked like this before:
public Vector VectorizeTable(String tableName) {
    Vector<Vector> table = new Vector<Vector>();
    try {
        Connection connection = DriverManager.getConnection(getUrl(),
                getUser(), getPassword());
        Statement statement = connection.createStatement();
        ResultSet result = statement.executeQuery("Select * FROM "
                + tableName);
        while (result.next()) {
            Vector row = new Vector();
            for (int i = 1; i <= this.getColumnCount(tableName); i++) {
                String resultString = result.getString(i);
                if (result.wasNull()) {
                    resultString = "NULL";
                } else {
                    resultString = result.getString(i);
                }
                row.addElement(resultString);
            }
            table.addElement(row);
        }
    } catch (SQLException e) {
        e.printStackTrace();
    }
    return table;
}
My question is why is the method with the thread so much faster than the one without? I don't use multiple threads anywhere in my program. I have looked online but nothing seems to answer my question.
Any information anyone could give would be greatly appreciated. I'm a noob on threads XO
If you need any other additional information to help understand what is going on let me know!
Answer:
Look at Aaron's answer: this wasn't an issue with threads at all. I feel very noobish right now :(. THANKS @Aaron!

I think that what you are doing is appearing to make the database load faster because the VectorizeView method is returning before the data has been loaded. The load is then proceeding in the background, and completing in (probably) the same time as before.
You could test this theory by adding a thread.join() call after the thread.start() call.
If this is what is going on, you probably need to do something to stop other parts of your application from accessing the table object before loading has completed. Otherwise your application is liable to behave incorrectly if the user does something too soon after launch.
FWIW, loading 100 or 500 records from a database should be quick, unless the query itself is expensive for the database. That shouldn't be the case for a simple select from a table ... unless you are actually selecting from a view rather than the table, and the view is poorly designed. Either way, you probably would be better off focussing on why such a simple query is taking so long, rather than trying to run it in a separate thread.
In your follow-up you say that the version with the join after the start is just as fast as the version without it.
My first reaction is to say: "Leave the join there. You've fixed the problem."
But this doesn't explain what is actually going on. And I'm now completely baffled. The best I can think of is that something your application is doing before this on the current thread is the cause of this.
Maybe you should investigate what the application is doing in the period in which this is occurring. See if you can figure out where all the time is being spent.
Take a thread dump and look at the threads.
Run it under the debugger to see where the "pause" is occurring.
Profile it.
Set the application logging to a high level and see if there are any clues.
Check the database logs.
Etcetera

It looks like you kick off (i.e. start) a background thread to perform the query, but you don't join to wait for the computation to complete. When you return table, it won't be filled in with the results of the query yet -- the other thread will fill it in over time, after your method returns. The method returns almost instantly, because it's doing no real work.
If you want to ensure that the data is loaded before the method returns, you'll need to call runner.join(). If you do so, you'll see that loading the data is taking just as long as it did before. The only difference with the new code is that the work is performed in a separate thread of execution, allowing the rest of your code to get on with other work that it needs to perform. Note that failing to call join could lead to errors if code in your main thread tries to use the data in the Vector before it's actually filled in by the background thread.
Update: I just noticed that you're also precomputing getColumnCount in the multi-threaded version, while in the single-threaded version you're computing it for each iteration of the inner loop. Depending on the complexity of that method, that might explain part of the speedup (if there is any).
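To see the effect in isolation, here is a minimal sketch that needs no database. The class JoinDemo, the method rowsVisibleOnReturn and the 200 ms sleep are made up for illustration; the sleep stands in for the query, and a plain thread-safe list stands in for the Vector.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class JoinDemo {
    // Starts a background "loader" that needs about 200 ms, mimicking the query thread.
    // If waitForLoader is true, the caller joins the thread before looking at the table.
    static int rowsVisibleOnReturn(boolean waitForLoader) {
        List<String> table = new CopyOnWriteArrayList<>();
        Thread runner = new Thread(() -> {
            try {
                Thread.sleep(200); // stand-in for the database round trip
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            for (int i = 0; i < 150; i++) {
                table.add("row " + i);
            }
        });
        runner.start();
        if (waitForLoader) {
            try {
                runner.join(); // block until the loader has finished
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return table.size();
    }

    public static void main(String[] args) {
        System.out.println("without join: " + rowsVisibleOnReturn(false));
        System.out.println("with join:    " + rowsVisibleOnReturn(true));
    }
}
```

Without the join the method comes back before the loader has added anything, so the caller sees an empty table; with the join it sees all 150 rows but has to wait for the full load time.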

Are you sure that it is faster? Since you start a separate thread, you return table immediately. But are you sure that you measure the time after it's fully populated with data?
Update
To measure the time correctly, save the runner object somewhere and call runner.join(). You can even do it in the same method for testing.

Ok, I think that if you examine table at the end of this method you will find it's empty. That's because start starts running the thread in the background, and you immediately return table without the background thread having a chance to populate it. So it appears to be going faster but actually isn't.

Related

How to call multiple Uni concurrently

Recently I have been working on a project where I have to make 2 asynchronous calls at the same time. Since I'm working with Quarkus, I ended up trying to make use of Mutiny and the vert.x library. However, I cannot get my code working with Unis. In the code below, I would expect both Unis to be called and the Uni that finishes fastest to be returned. However, it seems that when combining Unis it simply returns the first one in the list, even though the first Uni should take longer.
The below code prints out one one when it should print out two two since the uniFast should finish first. How do I combine Unis and have the faster one return first?
@Test
public void testUniJion(){
    var uniSLow = Uni.createFrom().item(() -> {
        try {
            Thread.sleep(1000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        return "one";
    });
    var uniFast = Uni.createFrom().item(() -> {
        try {
            Thread.sleep(100);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        return "two";
    });
    var resp = Uni.join().first(uniSLow,uniFast).withItem().await().indefinitely();
    System.out.println(resp);
    var resp2 = Uni.combine().any().of(uniSLow,uniFast).await().indefinitely();
    System.out.println(resp2);
}
Note: This is not the actual code I am trying to implement. In my code, I am trying to fetch from 2 different databases, and one database often has a lot more latency than the other. However, Uni seems to always wait for the slower database. I'm simply trying to understand Mutiny and Unis better, so I made this code example.

The problem is that you are not telling Mutiny which thread each Uni should run on. If I add a System.out to your example:
// Slow and Fast for the different Uni
System.out.println( "Slow - " + Thread.currentThread().getId() + ":" + Thread.currentThread().getName() );
I get the following output:
Slow - 1:Test worker
one
Slow - 1:Test worker
Fast - 1:Test worker
one
The output shows that everything runs on the same thread and therefore when we block the first one, the second one is blocked too.
That's why the output is one one.
One way to run the Unis in parallel is to use a different executor at subscription:
ExecutorService executorService = Executors.newFixedThreadPool( 5 );
uniSlow = uniSlow.runSubscriptionOn( executorService );
uniFast = uniFast.runSubscriptionOn( executorService );
Now, when I run the test, I have the expected output:
Slow - 16:pool-3-thread-1
Fast - 17:pool-3-thread-2
two
Slow - 18:pool-3-thread-3
Fast - 19:pool-3-thread-4
two
Note that this time Slow and Fast are running on different threads.
The Mutiny guide has a section about the difference between emitOn vs. runSubscriptionOn and some examples on how to change the emission thread.
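The same principle can be seen without Mutiny. The following is a minimal sketch using only the JDK's CompletableFuture as a swapped-in analogue; it is not Mutiny's API, and the names FirstToFinish, delayed and race are made up for illustration. Each call gets its own pool thread, so the slow one cannot hold up the fast one.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class FirstToFinish {
    // Returns the value after sleeping, standing in for a slow or fast database call.
    static String delayed(String value, long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return value;
    }

    // Runs both calls on separate pool threads and returns whichever completes first.
    static String race() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            CompletableFuture<String> slow =
                    CompletableFuture.supplyAsync(() -> delayed("one", 1000), pool);
            CompletableFuture<String> fast =
                    CompletableFuture.supplyAsync(() -> delayed("two", 100), pool);
            return (String) CompletableFuture.anyOf(slow, fast).join();
        } finally {
            pool.shutdownNow(); // interrupts the slower call, whose result is no longer needed
        }
    }

    public static void main(String[] args) {
        System.out.println(race()); // prints "two"
    }
}
```

If both suppliers ran one after another on the calling thread, the slow one would have to finish before the fast one could even start, which matches the behaviour seen in the question.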

How to prevent "NucleusDataStoreException: Concurrent Modification"?

Our app suddenly got a lot of traffic and there were some design flaws in the system (or rather we never thought it would get this much traffic so we just skipped it by choice).
As the topic states I'm looking for a way to prevent the error: org.datanucleus.exceptions.NucleusDataStoreException: Concurrent Modification
Currently I have an entity called Group that looks like this:
@PersistenceCapable
public class Group extends PersistableString {
    private static final long serialVersionUID = 6215353466976945628L;
    @Persistent
    private int yesCount;
    @Persistent
    private int noCount;
    public void increaseYesCount()
    {
        yesCount++;
    }
    public void increaseNoCount()
    {
        noCount++;
    }
}
The following code is how the update of the entity is done:
int answer = Integer.parseInt(req.getParameter("answer"));
try {
    PersistenceManager pm = PMF.getPersistenceManager();
    for(String groupId : allGroupsToBeUpdated)
    {
        Group group = pm.getObjectById(Group.class, groupId);
        if(answer == 0)
            group.increaseNoCount();
        else
            group.increaseYesCount();
    }
    pm.close();
} catch (Exception e) {
    e.printStackTrace();
}
allGroupsToBeUpdated is a list that contains around 30 string IDs. Is there some way I can avoid the Concurrent Modification error? Can I check whether the entity that I retrieve is being updated and then just discard (or ignore) the update? It's not SUPER important that the write actually succeeds; I just wanna make sure I don't get the error (or that it doesn't keep retrying the write), because it's causing the request to take between 10 and 30 seconds.
Should I maybe open (get new PM-instance) and close the connection (pm.close()) between each update instead of waiting for all of the 30ish updates to go through?
I know of sharded counters and should have (obviously) used them, but right now I'm looking for a "quick-fix" to this problem.
EDIT:
I'm using:
App Engine SDK 1.8.9
JDO 3.0
Stacktrace can be found at:
http://pastebin.com/TWnmkpPU

Posting as an answer due to length.
Transactions probably aren't good in your case since you are really just looking to hide the issue from the user which is manifesting itself in slow request times. Perhaps kicking off an async push task to do the writes in the background outside of the request would be your best bet.
I really would not recommend design based on hiding errors and swallowing exceptions though. Looking to "prevent" an exception that is doing what it is supposed to (signaling a failed write due to contention), means you should avoid the condition which caused it in the first place.
I totally understand needing to get things working fast early on, but it may be a good idea to start adopting best practices now that the bad design decisions are starting to make their mark. Continuing to rely on "quick-fixes" and hiding problems can land you in a real mess later on.

How to implement Java single Database thread

I have made a Java program that connects to a SQLite database using SQLite4Java.
I read from the serial port and write values to the database. This worked fine in the beginning, but now my program has grown and I have several threads. I have tried to handle that with a SQLiteQueue variable that executes database operations with something like this:
public void insertTempValue(final SQLiteStatement stmt, final long logTime, final double tempValue)
{
    if(checkQueue("insertTempValue(SQLiteStatement, long, double)", "Queue is not running!", false))
    {
        queue.execute(new SQLiteJob<Object>()
        {
            protected Object job(SQLiteConnection connection) throws SQLiteException
            {
                stmt.bind(1, logTime);
                stmt.bind(2, tempValue);
                stmt.step();
                stmt.reset(true);
                return null;
            }
        });
    }
} // end insertTempValue(SQLiteStatement, long, double)
But now my SQLite class can't execute the statements, reporting:
DB[1][U]: disposing [INSERT INTO Temperatures VALUES (?,?)]DB[1][U] from alien thread
SQLiteDB$6@8afbefd: job exception com.almworks.sqlite4java.SQLiteException: [-92] statement is disposed
So the execution does not happen.
I have tried to figure out what's wrong and I think I need a Java wrapper that makes all the database operations calls from a single thread that the other threads go through.
Here is my problem: I don't know how to implement this in a good way.
How can I make a method-call and ensure that it always runs from the same thread?

Put all your database access code into a package and make all the classes package private. Write one Runnable or Thread subclass with a run() method that runs a loop. The loop checks for queued information requests, and runs the appropriate database access code to find the information, putting the information into the request and marking the request complete before going back to the queue.
Client code queues data requests and waits for answers, perhaps by blocking until the request is marked complete.
Data requests would look something like this:
public class InsertTempValueRequest {
    // This constructor is called from client threads before queueing
    // Client thread queues this object after construction
    public InsertTempValueRequest(
        final long logTime,
        final double tempValue
    ) {
        this.logTime = logTime;
        this.tempValue = tempValue;
    }
    // This method is called from client threads after queueing to check for completion
    public boolean isComplete() {
        return isComplete;
    }
    // This method is called from the database thread after dequeuing this object
    void execute(
        SQLiteConnection connection,
        SQLiteStatement statement
    ) {
        // execute the statement using logTime and tempValue member data, and commit
        isComplete = true;
    }
    private volatile long logTime;
    private volatile double tempValue;
    private volatile boolean isComplete = false;
}
This will work, but I suspect there will be a lot of hassle in the implementation. I think you could also get by by using a lock that only permits one thread at a time to access the database, and also - this is the difference from your existing situation - beginning the access by creating the database resources - including statements - from scratch, and disposing of those resources before releasing the lock.

I found a solution to my problem. I have now implemented a wrapper-class that makes all operations with my older SQLite-class using an ExecutorService, inspired from Thread Executor Example and got the correct usage from Java Doc ExecutorService.
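The wrapper itself is not shown above, so the following is only a minimal sketch of how such a class could look, not the asker's actual code. The names DatabaseThread and call and the thread name "db-thread" are made up for illustration. It relies on Executors.newSingleThreadExecutor, which runs every submitted job on the same worker thread, in submission order.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class DatabaseThread {
    // One worker thread: every submitted job runs on it, in submission order.
    private final ExecutorService executor =
            Executors.newSingleThreadExecutor(r -> new Thread(r, "db-thread"));

    // Runs the job on the database thread and blocks the caller until it has finished.
    <T> T call(Callable<T> job) {
        try {
            return executor.submit(job).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for database job", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("database job failed", e.getCause());
        }
    }

    // Lets the worker thread exit once all queued jobs are done.
    void shutdown() {
        executor.shutdown();
    }

    public static void main(String[] args) {
        DatabaseThread db = new DatabaseThread();
        // A real job would open, use and dispose the connection and its statements here,
        // so that all of them are only ever touched by the worker thread.
        System.out.println(db.call(() -> Thread.currentThread().getName())); // prints "db-thread"
        db.shutdown();
    }
}
```

Any client thread can call db.call(...), but the job body always runs on the one worker. Judging by the "alien thread" message in the question, the statements would probably also have to be prepared inside a job rather than passed in from a client thread.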

Pause execution of a method until callback is finished

I am fairly new to Java and extremely new to concurrency. However, I have worked with C# for a while. It doesn't really matter, but for the sake of example, I am trying to pull data off a table on a server. I want the method to wait until the data is completely pulled. In C#, we have the async-await pattern, which can be used like this:
private async Task<List<ToDoItem>> PullItems ()
{
    var newRemoteItems = await (from p in remoteTable select p).ToListAsync();
    return newRemoteItems;
}
I am trying to have similar effect in Java. Here is the exact code I'm trying to port (Look inside SynchronizeAsync method.)! However, Java Azure SDK works with callbacks. So, I have a few options:
Use wait and notify pattern. Following code doesn't work since I don't understand what I'm doing.
final List<TEntity> newRemoteItems = new ArrayList<TEntity>();
synchronized( this ) {
    remoteTable.where().field("lastSynchronized").gt(currentTimeStamp)
        .execute(new TableQueryCallback<TEntity>() {
            public void onCompleted(List<TEntity> result,
                                    int count,
                                    Exception exception,
                                    ServiceFilterResponse response) {
                if (exception == null) {
                    newRemoteItems.clear();
                    for (TEntity item: result) {
                        newRemoteItems.add(item);
                    }
                }
            }
        });
}
this.wait();
//DO SOME OTHER STUFF
My other option is to move DO SOME OTHER STUFF right inside the callback's if(exception == null) block. However, this would result in my whole method logic chopped off into the pieces, disturbing the continuous flow. I don't really like this approach.
Now, here are questions:
What is recommended way of doing this? I am completing the tutorial on Java concurrency at Oracle. Still, clueless. Almost everywhere I read, it is recommended to use higher level stuff rather than wait and notify.
What is wrong with my wait and notify?
My implementation blocks the main thread and it's considered a bad practice. But what else can I do? I must wait for the server to respond! Also, doesn't C# await block the main thread? How is that not a bad thing?

Either put DO SOME OTHER STUFF into the callback, or declare a semaphore, call semaphore.release() in the callback, and call semaphore.acquire() where you want to wait. Remove synchronized(this) and this.wait().
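The semaphore variant could look roughly like the sketch below. It does not use the Azure SDK: fetchAsync is a hypothetical stand-in for any callback-style API that answers on another thread, and the names CallbackWait and fetchBlocking are made up as well.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Semaphore;
import java.util.function.Consumer;

class CallbackWait {
    // Hypothetical callback-style API: delivers its result on another thread.
    static void fetchAsync(Consumer<List<String>> onCompleted) {
        new Thread(() -> {
            try {
                Thread.sleep(100); // stand-in for the server round trip
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            onCompleted.accept(Arrays.asList("a", "b", "c"));
        }).start();
    }

    // Blocks until the callback has fired, then returns the collected items.
    static List<String> fetchBlocking() {
        final List<String> items = new ArrayList<>();
        final Semaphore done = new Semaphore(0); // no permit until the callback releases one
        fetchAsync(result -> {
            items.addAll(result);
            done.release();
        });
        done.acquireUninterruptibly(); // waits here until release() has been called
        return items;
    }

    public static void main(String[] args) {
        System.out.println(fetchBlocking()); // prints [a, b, c]
    }
}
```

Because the semaphore starts with zero permits, the acquire cannot return before the callback has run, and everything the callback wrote before release() is visible to the waiting thread afterwards. A CountDownLatch with a count of 1 would work the same way.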

Huge time accessing database from Java

I'm a junior Java programmer and I've finally made my first program, all by myself, apart from school :).
The basics are: you can store data in it and retrieve it anytime. The main thing is, I want to be able to run this program on another computer (as a runnable .jar file).
Therefore I had to install the JRE and the Microsoft Access 2010 drivers (both 32-bit), and the program works perfectly, but there is 1 small problem.
It takes ages (literally, 17 seconds) to store or delete something in the database.
What is the cause of this? Can I change it?
Edit:
Here's the code to insert an object of the class Woord into the database.
public static void ToevoegenWoord(Woord woord) {
    try (Connection conn = DriverManager.getConnection("jdbc:odbc:DatabaseSenne")) {
        PreparedStatement addWoord =
            conn.prepareStatement("INSERT INTO Woorden VALUES (?)");
        addWoord.setString(1, woord.getWoord());
        addWoord.executeUpdate();
    } catch (SQLException ex) {
        for (Throwable t : ex) {
            System.out.println("Het woord kond niet worden toegevoegd aan de databank.");
            t.printStackTrace();
        }
    }
}

Most likely, creating a Connection every time is the slow operation in your case (especially when using the JDBC-ODBC bridge). To confirm this, put print statements with timestamps before and after the line that gets the Connection from DriverManager. If that is the case, consider opening the connection once and reusing it instead of opening it on every request, or better yet use some sort of connection pooling; there are plenty of options available.
If that's not the case, then the actual insert could be slow as well. Again, simple profiling with print statements should help you discover where your code is spending most of its time.

First of all, congrats on your first independent foray. To answer your question / elaborate on maximdim's answer, the concern is that calling:
try (Connection conn = DriverManager.getConnection("jdbc:odbc:DatabaseSenne")) {
every time you're using this function may be a major bottleneck (or perhaps another section of your code is.) Most importantly, you will want to understand the concept of using logging or even standard print statements to help diagnose where you are seeing an issue. Wrapping individual lines of code like so:
System.out.println("Before Connection retrieval: " + new Date().getTime());
try (Connection conn = DriverManager.getConnection("jdbc:odbc:DatabaseSenne")) {
System.out.println("AFTER Connection retrieval: " + new Date().getTime());
...to see how many milliseconds pass for each call can help you determine exactly where your bottleneck lies.
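If you end up timing several steps, a tiny helper saves some typing. This is only a sketch: StepTimer, millis and pause are made-up names, and the sleeps stand in for the real getConnection and executeUpdate calls. It uses System.nanoTime() rather than new Date().getTime() because nanoTime is intended for measuring elapsed time.

```java
import java.util.concurrent.TimeUnit;

class StepTimer {
    // Runs the step and returns the elapsed wall-clock time in milliseconds.
    static long millis(Runnable step) {
        long start = System.nanoTime();
        step.run();
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    // Hypothetical placeholder for real work such as opening a connection.
    static void pause(long ms) {
        try {
            Thread.sleep(ms);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        long connect = millis(() -> pause(50)); // stand-in for DriverManager.getConnection(...)
        long insert = millis(() -> pause(5));   // stand-in for executeUpdate()
        System.out.println("connect: " + connect + " ms, insert: " + insert + " ms");
    }
}
```

Whichever step dominates the total is the one worth fixing first.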

Advice: use another database, such as Derby or HSQLDB. They are not so different from MS Access (they can also use a file-based DB) but perform better than going through JDBC/ODBC. They can even be embedded in the application, without a separate installation of the DB.
