I'm making a series of connections asynchronously via MySQL, and I have a class which contains a bunch of easily accessible static methods to update/remove/clear/get/etc. data.
The issue I'm confronted with is that the getter methods won't return the proper value (practically ever) because they are returned before the async connection gets a chance to update the value to be returned.
Example:
public static int getSomething(final UUID user)
{
    Connection c = StatsMain.getInstance().getSQL().getConnection();
    try (PreparedStatement ps = c.prepareStatement("select something from stats where uuid=?"))
    {
        ps.setString(1, user.toString());
        try (ResultSet result = ps.executeQuery())
        {
            // Move the cursor to the first row before reading; -1 means "no row found".
            return result.next() ? result.getInt("something") : -1;
        }
    }
    catch (SQLException e)
    {
        return -1; // the method returns int, so "return false" would not compile
    }
}
(Not copy & pasted, but pretty close)
I realize I can use a 'callback' effect by passing an interface to the method and doing such, but that becomes very tedious when the database stores 10 values for a key.
Sounds like you're looking for Future, which has been available since Java 5, or CompletableFuture, which is new in Java 8.
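A minimal sketch of the CompletableFuture route, assuming Java 8 or newer. The class name AsyncGetterSketch and the method queryDatabase are hypothetical stand-ins for the blocking JDBC call in the question; they are not part of any library.

```java
import java.util.UUID;
import java.util.concurrent.CompletableFuture;

class AsyncGetterSketch {

    // Hypothetical stand-in for the blocking JDBC query in getSomething().
    static int queryDatabase(UUID user) {
        return 42; // pretend this value came from the "stats" table
    }

    // Returns immediately; the query itself runs on a pool thread.
    static CompletableFuture<Integer> getSomethingAsync(UUID user) {
        return CompletableFuture.supplyAsync(() -> queryDatabase(user));
    }

    public static void main(String[] args) {
        UUID user = UUID.randomUUID();

        // Non-blocking use: react when the value arrives.
        CompletableFuture<Void> printed = getSomethingAsync(user)
                .thenAccept(value -> System.out.println("something = " + value));

        // Blocking use: join() waits for the result when you really need it.
        int value = getSomethingAsync(user).join();
        System.out.println("joined value = " + value);

        printed.join(); // make sure the callback ran before the JVM exits
    }
}
```

With ten values per key, each getter can return its own future, and the caller decides per call whether to chain a callback or to block.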
Solution 1:
The best method I've come up with is have a thread with a loop in it that waits for MySQL to return values and responds to each value. This is rather like the callback in the get routine, but you only have the one loop. Of course, the loop has to know what to do with each possible returned piece of data.
This means rethinking a bit how your program works. Instead of: ask a question, get an answer, use the answer, you have two completely independent operations. The first is: ask a question, then forget about it. The second is: get an answer, then, knowing nothing about the question, use the answer. It's a completely different approach, and you need to get your head around it before using it.
(One possible further advantage of this approach is that the MySQL end can now send data without being prompted. You have the option of feeding changes made by another user to your user in real time.)
Solution 2:
My other solution is simpler in some ways, but it can have you firing off lots of threads. Just have your getSomething method block until it has the answer and returns. To keep your program from hanging, just put the whole block of code that calls the method in its own thread.
Hybrid:
You can use both solutions together. The first one makes for cleaner code, but the second lets you answer a specific question when you get the reply. (If you get a "Customer Name" from the DB, and you have a dozen fields it could go in, it might help to know that you did ask for this field specifically, and that you asked because the user pushed a button to put the value in a specific text box on the screen.)
Lastly:
You can avoid a lot of multithreading headaches by using EventQueue.invokeLater to put all changes to your data structures on the event queue. This can nicely limit the synchronization problems. (On the other hand, having 20 or 30 threads going at once can make good use of all your computer's cores, if you like to live dangerously.)
You may want to stick with synchronized calls, but if you do want to go asynchronous, this is how I'd do it. It's not too bad once you get some basic tools written and get your brain to stop thinking synchronously.
I am making an ancestor query inside of a transaction like this:
Task task = OfyService.ofy().load().type(Task.class)
.ancestor(jobKey)
.filter("locationKey", locationKey)
.first().now();
Later in the transaction I create and save a new entity that uses the key I used in ancestor() as a Ref<?> property:
Task newTask = new Task(jobKey);
// Task POJO with the following property and constructor:
@Parent
private Ref<Job> jobKey;
public Task(Key<Job> jobKey) {
    this.jobKey = Ref.create(jobKey);
}
When my whole method runs several times in a second I get a ConcurrentModificationException on jobKey. This is strange to me because all I am doing with it is creating a reference and setting it as a property. I looked at the description of Ref<?> and it says:
Note that the methods might or might not throw runtime exceptions
related to datastore operations; ConcurrentModificationException,
DatastoreTimeoutException, DatastoreFailureException, and
DatastoreNeedIndexException. Some Refs hide datastore operations that
could throw these exceptions.
Can someone explain to me what is going on with Ref<?> and why it is throwing me a ConcurrentModificationException? It looks like it is the culprit here.
It's the API of Objectify messing up and misusing the exception system, in order to convey a retry.
Transactional systems have three major ways to solve a fundamental problem. Imagine this series of commands, all part of a single transaction (written in SQL-flavoured pseudocode, assuming it is readable and familiar enough to understand; it's just an example):
// transfer 10 bucks from speedy to me
int rBalance = [SELECT balance FROM accounts WHERE user = 'rzwitserloot']
int sBalance = [SELECT balance FROM accounts WHERE user = 'Speedy']
if (sBalance < 10) throw new BalanceInsufficientException();
sBalance -= 10;
rBalance += 10;
[UPDATE accounts SET balance = %rBalance% WHERE user = 'rzwitserloot']
[UPDATE accounts SET balance = %sBalance% WHERE user = 'Speedy']
COMMIT;
Seems safe enough right?
No, actually, this is really tricky. Imagine that right in the middle, around sBalance -= 10;, you withdraw 50 bucks from your account from an ATM (and your account had 50 bucks to start with).
You are now 50 bucks richer, and your account balance ought to be -10, but it is actually 40.
Whooooops.
Horrible.
There are 3 ways to solve this problem:
Locking
Imagine that the transaction locks down the entire accounts table the moment I read from it. Nothing else on the planet can write to this table until this transaction is committed. That would solve the problem: Your ATM will just hang for a bit, wait for this balance transfer to complete, and will then do its thing. Actually, it can't even read. What if you read, then this transaction writes a new value? The same problem could occur. So, lock an entire table, for everything, globally.
Solves the problem, but this does not scale.
Eh, sod it. Who cares?
Just, don't care about this. Have basic R/W locks or row locks and the bank just loses 50 bucks here. Sounds nuts but many transactional systems work like this. i.e. are broken.
Retry
And here comes the magic. A devious way to get the best of both worlds, where the bank cannot possibly mess up and give you 50 free bucks, whilst avoiding lock-the-planet scenarios, is to rerun all queries and doublecheck that the results would be the same.
In this hypothetical scenario, the transactional system has the task of realizing that the [SELECT balance FROM accounts WHERE user = 'Speedy'] command would now return a different result compared to what it returned earlier, and that this means that the entire transaction is now invalid, and needs to be rerun from the top. This solves the problem: The whole block reruns, realizes you now have a balance of 0, and correctly aborts the attempt to transfer the funds by throwing a BalanceInsufficientException. We avoided world locks, at the cost of some bookkeeping and an atomic 'do a quick check if any queries have touched anything that has changed since then' operation on any commit.
And that is exactly what you are running into here: that is what Objectify means when it throws ConcurrentModificationException. This is bad API design: that is not the right exception, and in general you shouldn't reuse existing exceptions just because the name sounds like it vaguely matches. But, anyway, you're going to have to live with the fact that Objectify made a mistake in this regard.
The general fix is extremely convoluted if you haven't programmed the right way from the start, and it sounds like you haven't.
See, there is this massive problem: That code is not just primitives in the db/persistence layer. The db engine cannot replay the block. The block contains a bunch of java code, after all!
No, the code itself needs to be told to just start over.
This is then even more complicated. Computers are very reliable machines. If 2 separate processes (Say, the bank web interface where you ordered a fund transfer of 10 bucks to me, and that ATM machine) clash and are both forced to start the command over from scratch, with some bad luck both machines reliably retry and reliably get in each others way a second time, retry yet again, and will continually dovetail together, always forcing each other into retry, forever stuck.
The solution is dice. No really. Daddy needs a new pair of shoes dice. The solution is: If a conflict occurs, wait a random amount of time (but choose from a larger and larger potential pause for each conflict that occurs until something succeeds), thus ensuring that 2 systems will eventually stop dovetailing. Sounds nuts, but without this, you wouldn't be reading this page - this algorithm is a fundamental part of ethernet, which is at the very least powering something at Stack Overflow and/or your house's internet services.
The problem thus becomes that you can't just solve this problem with a while loop. The 'whoops, retry needed' code is complicated.
The only solution is closures. All code that interacts with your transactional system must be put inside a lambda, and must be idempotent outside of those parts that modify the data in the storage system (there is no difference between running it once and running it more than once). That way, the framework itself can catch the retry issue, apply the appropriate random exponential backoff, and then start over.
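The closure-plus-backoff idea can be sketched in plain Java as follows. This is an illustration under assumptions, not Objectify's or any other library's API: RetrySketch and runWithRetry are hypothetical names, and the "transaction" is simulated by a lambda that conflicts twice before succeeding.

```java
import java.util.ConcurrentModificationException;
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.Supplier;

class RetrySketch {

    // Runs 'work' and retries on ConcurrentModificationException, sleeping a random
    // time drawn from a window that doubles after every conflict.
    static <T> T runWithRetry(Supplier<T> work, int maxAttempts, long baseDelayMillis) {
        long window = baseDelayMillis;
        for (int attempt = 1; ; attempt++) {
            try {
                return work.get();
            } catch (ConcurrentModificationException conflict) {
                if (attempt >= maxAttempts) {
                    throw conflict; // give up and let the caller decide
                }
                sleepQuietly(ThreadLocalRandom.current().nextLong(window + 1));
                window *= 2;
            }
        }
    }

    private static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while backing off", e);
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated transaction body: conflicts twice, then succeeds.
        String result = runWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new ConcurrentModificationException("simulated conflict");
            }
            return "committed";
        }, 5, 10);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

Because the whole body lives inside the lambda, the helper can rerun it from the top, which is why the body must be safe to execute more than once.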
SQL abstractions like JDBI get this right (this is a very large reason why you should never ever write raw JDBC for real applications; always use JDBI or JOOQ or something similar). I don't know whether Objectify has such an API. If not, you'll have to write it yourself.
I have a concurrency problem. The flow is: a client (Oracle Forms) submits a request (called a concurrent program), which calls a PL/SQL procedure, and this procedure eventually calls a Java static method.
What I find is that when I submit two requests at the same time or within a very short interval (like 1 second), a concurrency problem shows up.
The Java method is the starting point of a job that searches the database and suggests which records should be inserted into the database.
The problem is that the two requests produce duplicated records, since at query time both requests find that it is fine to insert new records.
I tried adding synchronized to the static Java method, but this does not solve the problem. Why?
What I do is:
public static synchronized void execute
Please note that the insert is called in PL/SQL, which means that synchronizing only the Java method is not sufficient. But when I look into the log, it shows the two requests running in the same second, which I do not think is normal, since querying the database and computing the suggestion is time consuming.
To make the Java method really time consuming, I added a call to Thread.sleep(5000), and logged the time after this call along with the thread id.
I was surprised to see that the thread id is 1! Also, both requests pass the sleep at the same time. Why is that?
What can I do to solve the problem? Is there any lock I can put on the Java method or in PL/SQL?
PS: I am now trying DBMS_LOCK, which seems to be working, but I would still like to know why the Java method is not synchronized.
I have no idea how the JVM inside the Oracle DB is implemented, but since (at least in some common configurations) every database connection gets its own server process, then if a separate JVM is embedded into each of those, a synchronized block won't do you much good. You'd need to use database locks.
Assuming that the calls to the Java static method are done within the same classloader, then synchronized is all you need.
Your logging may be faulty. How exactly are you logging?
Your statement about the database lookup being "time consuming" is not convincing. Databases tend to cache data, for example.
In a nutshell: if, by your definition, an "atomic operation" is a combination of lookup + insert, then you should "synchronize" over both. Acquiring a database lock seems like a reasonable way to go about it.
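To illustrate what "synchronize over both" means, here is a single-JVM sketch in which the lookup and the insert share one lock. Everything in it is hypothetical: the in-memory set stands in for the database table, and the lock only helps when all callers run in the same JVM and classloader. If each database session gets its own JVM, as suspected above, the same idea has to be expressed with a database lock instead.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

class CheckThenInsertSketch {

    private static final Set<String> table = new HashSet<>(); // stand-in for the DB table
    private static final Object lock = new Object();

    // The lookup and the insert happen under one lock, so they form one atomic step.
    static boolean insertIfAbsent(String key) {
        synchronized (lock) {
            if (table.contains(key)) {
                return false; // another request already inserted it
            }
            table.add(key);
            return true;
        }
    }

    // Fires 'requests' concurrent attempts for the same key and counts the inserts.
    static int simulate(String key, int requests) {
        AtomicInteger inserts = new AtomicInteger();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] threads = new Thread[requests];
        for (int i = 0; i < requests; i++) {
            threads[i] = new Thread(() -> {
                try {
                    start.await(); // line all requests up so they race
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                if (insertIfAbsent(key)) {
                    inserts.incrementAndGet();
                }
            });
            threads[i].start();
        }
        start.countDown();
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return inserts.get();
    }

    public static void main(String[] args) {
        System.out.println("inserts = " + simulate("order-1", 8));
    }
}
```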
I'm not quite sure exactly how to go about this...so it may take me a few tries to get this question right. I have an annotation for caching the results of a method. My code is a private fork for now, but the part I'm working on starts from here:
https://code.google.com/p/cache4guice/source/browse/trunk/src/org/cache4guice/aop/CacheInterceptor.java#46
I have annotated a method that I want cached, that runs a VERY slow query, sometimes takes a few minutes to run. The problem is, that my async web app keeps getting new users coming and asking for the same data. However, the getSlowData() method hasn't completed yet.
So something like this:
@Cached
public getSlowData() {
...
}
Inside the interceptor, we check the cache and find that it's not cached, which passes us down to:
return getResultAndCache(methodInvocation, cacheKey);
I've never gotten comfortable with the whole concept of concurrency. I think what I need is to mark that the getResultAndCache() method, for the given getSlowData(), has already been kicked off and subsequent requests should wait for the result.
Thanks for any thoughts or advice!
Most cache implementations synchronize calls to 'get' and 'set', but that's only half of the equation. What you really need to do is make sure that only one thread enters the 'check if loaded and load if not there' part. For most situations, the cost of serializing thread access may not be worth it if there's
1) no risk
2) little cost
to loading the data multiple times through parallel threads (comment here if you need more clarification on this). Since this annotation is used universally, I would suggest creating a second annotation, something like '@ThreadSafeCached', whose invoke method would look like this:
Object cacheElement = cache.get(cacheKey);
if (cacheElement != null) {
LOG.debug("Returning element in cache: {}", cacheElement);
} else {
synchronized(<something>) {
// double-check locking, works in Java SE 5 and newer
if ((cacheElement = cache.get(cacheKey)) == null) {
// a second check to make sure a previous thread didn't load it
cacheElement = getResultAndCache(methodInvocation, cacheKey);
} else {
LOG.debug("Returning element in cache: {}", cacheElement);
}
}
}
return cacheElement;
Now, I left the part about what you synchronize on. It'd be most optimal to lock down on the item being cached since you won't make any threads not specifically loading this cache item wait. If that's not possible, another crude approach may be to lock down on the annotation class itself. This is obviously less efficient but if you have no control over the cache loading logic (seems like you do), it's an easy way out!
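One way to "lock down on the item being cached" without picking a lock object by hand is to store a future per cache key. The sketch below is an alternative technique under assumptions, not the cache4guice API: SingleLoadCacheSketch and its methods are hypothetical, and the slow query is simulated with a sleep.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

class SingleLoadCacheSketch {

    private final ConcurrentMap<String, CompletableFuture<Object>> cache = new ConcurrentHashMap<>();

    // The first caller for a key registers a future and starts the load; every later
    // caller for the same key finds that future and waits on it instead of reloading.
    Object get(String cacheKey, Function<String, Object> loader) {
        return cache
                .computeIfAbsent(cacheKey, k -> CompletableFuture.supplyAsync(() -> loader.apply(k)))
                .join();
    }

    // Calls get() from several threads at once and reports how often the loader ran.
    static int countLoads(int callers) {
        SingleLoadCacheSketch cache = new SingleLoadCacheSketch();
        AtomicInteger loads = new AtomicInteger();
        Function<String, Object> slowLoader = key -> {
            loads.incrementAndGet();
            try {
                Thread.sleep(200); // stand-in for the slow query
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "data for " + key;
        };
        Thread[] threads = new Thread[callers];
        for (int i = 0; i < callers; i++) {
            threads[i] = new Thread(() -> cache.get("getSlowData", slowLoader));
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return loads.get();
    }

    public static void main(String[] args) {
        System.out.println("loader ran " + countLoads(5) + " time(s)");
    }
}
```

Callers asking for other keys are not held up by the slow load, since only the quick registration of the future happens inside computeIfAbsent. One caveat of this sketch: a load that fails stays cached as a failed future, so a real implementation would want to evict it.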
I have a piece of code that looks like this:
Algorithm a = null;
while(a == null)
{
a = grid.getAlgorithm();
}
getAlgorithm() in my Grid class returns some subtype of Algorithm depending on what the user chooses from some options.
My problem is that even after an algorithm is selected, the loop never terminates. However, that's not the tricky bit. If I simply place a System.out.println("Got here"); after my call to getAlgorithm(), the program runs perfectly fine and the loop terminates as intended.
My question is: why does adding that magic print statement suddenly make the loop terminate?
Moreover, this issue first came up when I started using my new laptop, I doubt that's related, but I figured it would be worth mentioning.
Edit: The program in question is NOT multithreaded. The code for getAlgorithm() is:
public Algorithm getAlgorithm ()
{
return algorithm;
}
Where algorithm is initially null, but will change value upon some user input.
I believe the issue has to do with how grid.getAlgorithm is executed. If there is very little cost associated with executing the method, then your while loop will cycle very quickly as long as the method continues to return null. That is often referred to as a busy wait.
Now it sounds like your new laptop is encountering a starvation issue which didn't manifest on your old computer. It is hard to say why but if you look at the link I included above, the Wikipedia article does indicate that busy waits do have unpredictable behavior. Maybe your old computer handles user IO better than your new laptop. Regardless, on your new laptop, that loop is taking resources away from whatever is handling your user IO hence it is starving the process that is responsible for breaking the loop.
You are doing active polling. This is a bad practice. You should at least let the polling thread sleep (with Thread.sleep). Since println does some io, it probably does just that. If your app is not multithreaded it is unlikely to work at all.
If this loop is to wait for user input in a GUI then ouch. Bad, bad idea and even with Thread.sleep() added I'd never recommend it. Instead, you most likely want to register an event listener on the component in question, and only have the validation code fire off when the contents change.
It's more than likely your program is locking up because you've reached some form of deadlock, especially if your application is multithreaded. Rather than try to solve this issue and hack your way round it, I'd seriously consider redesigning how this part of the application works.
You should check getAlgorithm(), there must be something wrong in the method.
There are two scenarios:
Your code is really not meant to be multi-threaded. In this case you need to insert some sort of user input in the loop. Otherwise you might as well leave it as Algorithm a = grid.getAlgorithm(); and prevent the infinite loop.
Your code is multi-threaded in which case you have some sort of 'visibility' problem. Go to Atomicity, Visibility and Ordering or read Java Concurrency in Practice to learn more about visibility. Essentially it means that without some sort of synchronization between threads, the thread you are looping in may never find out that the value has changed due to optimizations the JVM may perform.
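The visibility scenario can be sketched as below, assuming the value is written by a second thread (for example a GUI event thread). VisibilitySketch and the value "Dijkstra" are hypothetical. Marking the field volatile is what guarantees that the looping thread eventually sees the write; println synchronizes internally, which is a plausible reason why adding the print statement appeared to fix the original loop, although that is a side effect and not a guarantee.

```java
class VisibilitySketch {

    // 'volatile' guarantees that a write by one thread becomes visible to readers in
    // other threads; without it the JIT may hoist the read out of the loop.
    private static volatile String algorithm = null;

    // Spins until another thread publishes a value, then returns it.
    static String waitForAlgorithm() {
        String a = null;
        while (a == null) {
            a = algorithm;
        }
        return a;
    }

    static String demo() {
        algorithm = null;
        Thread user = new Thread(() -> {
            try {
                Thread.sleep(100); // simulated delay before the user chooses
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            algorithm = "Dijkstra"; // hypothetical choice
        });
        user.start();
        return waitForAlgorithm();
    }

    public static void main(String[] args) {
        System.out.println("Selected: " + demo());
    }
}
```

This only fixes visibility. The loop is still a busy wait, so an event listener or a blocking hand-off remains the better design.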
You did not mention any context around how this code is run. If it is a console based application and you started from a 'main' function, you would know if there was multi-threading. I am assuming this is not the case since you say there is no multithreading. Another option would be that this is a swing application in which case you should read Multithreaded Swing Applications. It might be a web application in which case a similar case to swing might apply.
In any case you could always debug the application to see which thread is writing to the 'algorithm' variable, then see which thread is reading from it.
I hope this is helpful. In any case, you may find more help if you give a little more context in your question. Especially for a question with such an intriguing title as 'Weird Java problem, while loop termination'.
I really want to abuse @Asynchronous to speed up my web application, so I want to understand it a bit better in order to avoid using this annotation incorrectly.
So I know the business logic inside an annotated method will be handled in a separate thread, so the user won't have to wait. I have two methods that persist data:
public void persist(Object object) {
em.persist(object);
}
@Asynchronous
public void asynPersist(Object object) {
em.persist(object);
}
I have a couple of scenarios, and I want to ask which of them are not OK:
1. B is not depend on A
A a = new A();
asynPersist(a); //Is it risky to `asynPersist(a) here?`
B b = new B();
persist(b);
//Cant asynPersist(B) here because I need the `b` to immediately
//reflect to the view, or should I `asynPersist(b)` as well?
2. Same as first scenario but B now depend on A. Should I `asynPersist(a)`
3. A and B are not related
A a = new A();
persist(a); //Since I need `a` content to reflect on the view
B b = new B();
asynPersist(b); //If I dont need content of b to immediately reflect on the view. Can I use asyn here?
EDIT: hi @arjan, thank you so much for your post. Here is another scenario I want to ask your expertise about. Please let me know if my case does not make any sense to you.
4. Assume User has an attribute call `count` type `int`
User user = null;
public void incVote(){
user = userDAO.getUserById(userId);
user.setCount(user.getCount() + 1);
userDAO.merge(user);
}
public User getUser(){ //Accessor method of user
return user;
}
If I understand you correctly, if my method getUserById uses @Asynchronous, then the line user.setCount(user.getCount() + 1); will block until the user is returned, is that correct? So in this case, using @Asynchronous is useless, correct?
If the method merge (which merges all changes to user back to the database) uses @Asynchronous, and in my JSF page I have something like this
<p:commandButton value="Increment" actionListener="#{myBean.incVote}" update="cnt"/>
<h:outputText id="cnt" value="#{myBean.user.count}" />
So the button will invoke the method incVote(), then send an Ajax request to tell the outputText to update itself. Will this create a race condition (remember we made merge asynchronous)? When the button tells the outputText to update itself, it invokes the accessor method getUser(). Will the line return user; block to wait for the asynchronous userDAO.merge(user), or might there be a race condition here (so that count might not display the correct result), in which case this is not recommended?
There are quite a few places where you can take advantage of @Asynchronous. With this annotation, you can write your application as intended by the Java EE specification: don't use explicit multi-threading, but let the work be done by managed thread pools.
In the first place you can use this for "fire and forget" actions. E.g. sending an email to a user could be done in an @Asynchronous annotated method. The user does not need to wait for your code to contact the mail-server, negotiate the protocol, etc. It's a waste of everyone's time to let the main request processing thread wait for this.
Likewise, maybe you do some audit logging when a user logs in to your application and logs off again. Both these two persist actions are perfect candidates to put in asynchronous methods. It's senseless to let the user wait for such backend administration.
Then there is a class of situations where you need to fetch multiple data items that can't be (easily) fetched using a single query. For instance, I often see apps that do:
User user = userDAO.getByID(userID);
Invoice invoice = invoiceDAO.getByUserID(userID);
PaymentHistory paymentHistory = paymentDAO.getHistoryByuserID(userID);
List<Order> orders = orderDAO.getOpenOrdersByUserID(userID);
If you execute this as-is, your code will first go the DB and wait for the user to be fetched. It sits idle in between. Then it will go fetch the invoice and waits again. Etc etc.
This can be sped up by doing these individual calls asynchronously:
Future<User> futureUser = userDAO.getByID(userID);
Future<Invoice> futureInvoice = invoiceDAO.getByUserID(userID);
Future<PaymentHistory> futurePaymentHistory = paymentDAO.getHistoryByuserID(userID);
Future<List<Order>> futureOrders = orderDAO.getOpenOrdersByUserID(userID);
As soon as you actually need one of those objects, you call get() on its Future, which blocks only if the result isn't there yet. This allows you to overlap fetching of individual items and even overlap other processing with fetching. For example, your JSF life cycle might already go through a few phases until you really need any of those objects.
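The overlap can be demonstrated outside a container with plain java.util.concurrent. This is a sketch under assumptions and not EJB code: OverlapSketch and slowFetch are hypothetical, an ExecutorService plays the role of the managed thread pool, and each "DAO call" is simulated by a 300 ms sleep.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class OverlapSketch {

    // Hypothetical slow lookup standing in for a DAO call.
    static String slowFetch(String what) {
        try {
            Thread.sleep(300);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return what;
    }

    // Starts both fetches before waiting on either, so their waits overlap.
    static String fetchBoth() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Future<String> futureUser = pool.submit(() -> slowFetch("user"));
            Future<String> futureInvoice = pool.submit(() -> slowFetch("invoice"));
            // get() blocks only if the result is not ready yet.
            return futureUser.get() + "+" + futureInvoice.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        String result = fetchBoth();
        long millis = (System.nanoTime() - start) / 1_000_000;
        System.out.println(result + " in about " + millis + " ms");
    }
}
```

Since both tasks are submitted before the first get(), the total time should be close to one fetch, not the sum of both.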
The usual advice that multi threaded programming is hard to debug doesn't really apply here. You're not doing any explicit communication between threads and you're not even creating any threads yourself (which are the two main issues this historical advice is based upon).
For the following case, using asynchronous execution would be useless:
Future<User> futureUser = userDAO.getUserById(userId);
User user = futureUser.get(); // block happens here
user.setCount(user.getCount() + 1);
If you do something asynchronously and right thereafter wait for the result, the net effect is a sequential call.
will the line return user; block to wait for the asynchronous userDAO.merge(user)
I'm afraid you're not totally getting it yet. The return statement has no knowledge about any operation going on for the instance being processed in another context. This is not how Java works.
In my previous example, the getUserByID method returned a Future. The code automatically blocks on the get() operation.
So if you have something like:
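public class SomeBean {
    Future<User> futureUser;
    public String doStuff() {
        futureUser = dao.getByID(someID);
        return "";
    }
    // Future.get() declares checked exceptions, so the accessor has to declare or handle them
    public User getUser() throws InterruptedException, ExecutionException {
        return futureUser.get(); // blocks in case result is not there
    }
}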
Then in case of the button triggering an AJAX request and the outputText rendering itself with a binding to someBean.user, there is no race condition. If the dao already did its thing, futureUser.get() will immediately return an instance of type User. Otherwise it will automatically block until the User instance is available.
Regarding doing the merge() operation asynchronous in your example; this might run into race conditions. If your bean is in view scope and the user quickly presses the button again (e.g. perhaps having double clicked the first time) before the original merge is done, an increment might happen on the same instance that the first merge invocation is still persisting.
In this case you have to clone the User object first before sending it to the asynchronous merge operation.
The simple examples I started this answer with are pretty safe, as they are about doing an isolated action or about doing reads with an immutable type (the userID, assume it is an int or a String) as input.
As soon as you start passing mutable data into asynchronous methods you'll have to be absolutely certain that there is no mutation being done to that data afterwards, otherwise stick to the simple rule to only pass in immutable data.
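The "clone first" rule can be sketched like this. All names are hypothetical: User is a minimal stand-in for the entity in the question, and mergeAsync simulates an asynchronous merge with a CompletableFuture that reports the count it ends up persisting.

```java
import java.util.concurrent.CompletableFuture;

class SnapshotSketch {

    // Minimal mutable entity, a hypothetical stand-in for the User in the question.
    static final class User {
        private int count;
        User(int count) { this.count = count; }
        User(User other) { this.count = other.count; } // copy constructor
        int getCount() { return count; }
        void setCount(int count) { this.count = count; }
    }

    // Stand-in for an asynchronous merge: reports the count it ends up persisting.
    static CompletableFuture<Integer> mergeAsync(User user) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                Thread.sleep(100); // the "database write" takes a while
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return user.getCount();
        });
    }

    static int persistedCountAfterDoubleClick() {
        User user = new User(0);
        user.setCount(user.getCount() + 1);
        // Hand the async call its own snapshot, not the live instance.
        CompletableFuture<Integer> merge = mergeAsync(new User(user));
        user.setCount(user.getCount() + 1); // second click mutates the live object
        return merge.join();
    }

    public static void main(String[] args) {
        System.out.println("first merge persisted count = " + persistedCountAfterDoubleClick());
    }
}
```

Passing the live instance instead of the copy would let the second increment leak into the first merge, which is the race described above.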
You should not use async this way if any process that follows the async piece depends on its outcome. If you persist data that a later thread needs, you'll have a race condition.
I think you should take a step back before you go this route. Write your app as recommended by Java EE: single threaded, with threads handled by the container. Profile your app and find out where the time is being spent. Make a change, reprofile, and see if it had the desired effect.
Multi-threaded apps are hard to write and debug. Don't do this unless you have a good reason and solid data to support your changes.