I am working on a project to create a simple auction server that multiple clients connect to. The server class implements Runnable and so creates a new thread for each client that connects.
I am trying to store the current highest bid in a variable that every client can see. I found answers saying to use AtomicInteger, but when I called methods such as atomicVariable.intValue() on it I got a NullPointerException.
How can I manipulate the AtomicInteger without getting this error, or is there another relatively simple way to have a shared variable?
Any help would be appreciated, thanks.
Update
I have the AtomicInteger working. The problem now is that only the most recent client to connect to the server seems to be able to interact with it. The other clients just seem to freeze.
Would I be correct in saying this is a problem with locking?
Well, most likely you forgot to initialize it:
private final AtomicInteger highestBid = new AtomicInteger();
However, working with highestBid correctly without any locking takes some care. For example, if you want to update it with a new highest bid:
public boolean saveIfHighest(int bid) {
int currentBid = highestBid.get();
while (currentBid < bid) {
if (highestBid.compareAndSet(currentBid, bid)) {
return true;
}
currentBid = highestBid.get();
}
return false;
}
or in a more compact way:
for(int currentBid = highestBid.get(); currentBid < bid; currentBid = highestBid.get()) {
if (highestBid.compareAndSet(currentBid, bid)) {
return true;
}
}
return false;
You might wonder why it is so hard. Imagine two threads (requests) bidding at the same time. The current highest bid is 10. One thread bids 11, the other 12. Both threads compare their bid with the current highestBid and see that theirs is bigger. Now the second thread happens to go first and updates it to 12. Unfortunately, the first request then steps in and reverts it to 11 (because it had already checked the condition).
This is a typical race condition that you can avoid either by explicit synchronization or by using atomic variables with implicit compare-and-set low-level support.
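On Java 8 or later, the compare-and-set retry loop can also be delegated to the atomic class itself. The following is only a sketch under that assumption; the class name HighestBidTracker is made up for illustration. AtomicInteger.getAndAccumulate stores max(current, bid) atomically and returns the previous value, which tells us whether our bid won.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical holder for the shared highest bid; the class name is illustrative.
class HighestBidTracker {
    private final AtomicInteger highestBid = new AtomicInteger(); // starts at 0

    /** Returns true only if this bid strictly exceeded the previous highest bid. */
    boolean saveIfHighest(int bid) {
        // getAndAccumulate retries internally until max(current, bid) has been stored
        // atomically, then returns the value that was there before the update.
        int previous = highestBid.getAndAccumulate(bid, Math::max);
        return previous < bid;
    }

    int current() {
        return highestBid.get();
    }
}
```

With the two racing bids from the example (11 and 12), the stored value ends up as 12 regardless of which thread runs first, because the maximum is computed inside the atomic update.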
Seeing the complexity introduced by the much more performant lock-free atomic integer, you might want to resort to classic synchronization (with highestBid declared as a plain int field):
public synchronized boolean saveIfHighest(int bid) {
if (highestBid < bid) {
highestBid = bid;
return true;
} else {
return false;
}
}
I wouldn't look at the problem like that. I would simply store all the bids in a ConcurrentSkipListSet, which is a thread-safe SortedSet. With the correct implementation of compareTo(), which determines the ordering, the first element of the Set will automatically be the highest bid.
Here's some sample code:
import java.util.Date;
import java.util.SortedSet;
import java.util.concurrent.ConcurrentSkipListSet;
public class Bid implements Comparable<Bid> {
    String user;
    int amountInCents;
    Date created;
    @Override
    public int compareTo(Bid o) {
        if (amountInCents == o.amountInCents) {
            return created.compareTo(o.created); // earlier bids sort first
        }
        return Integer.compare(o.amountInCents, amountInCents); // larger bids sort first, without int overflow
    }
}
public class Auction {
    private final SortedSet<Bid> bids = new ConcurrentSkipListSet<Bid>();
    public Bid getHighestBid() {
        return bids.isEmpty() ? null : bids.first();
    }
    public void addBid(Bid bid) {
        bids.add(bid);
    }
}
Doing this has the following advantages:
Automatically provides a bidding history
Allows a simple way to save any other bid info you need
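As a variation on the same idea, the ordering can be supplied as a Comparator instead of implementing compareTo(). This is a sketch only: SketchBid, SketchAuction and the sequence counter are illustrative names that do not come from the answer. The sequence number stands in for the creation date so that two bids never compare as equal, since a sorted set silently drops elements that do.

```java
import java.util.Comparator;
import java.util.concurrent.ConcurrentSkipListSet;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative immutable bid; the sequence number replaces the creation date
// so that no two bids ever compare as equal.
final class SketchBid {
    private static final AtomicLong NEXT_SEQUENCE = new AtomicLong();

    final String user;
    final int amountInCents;
    final long sequence = NEXT_SEQUENCE.getAndIncrement();

    SketchBid(String user, int amountInCents) {
        this.user = user;
        this.amountInCents = amountInCents;
    }

    int getAmountInCents() { return amountInCents; }
    long getSequence() { return sequence; }
}

class SketchAuction {
    // Larger amounts sort first; among equal amounts, the earlier bid sorts first.
    private static final Comparator<SketchBid> HIGHEST_FIRST =
            Comparator.comparingInt(SketchBid::getAmountInCents).reversed()
                      .thenComparingLong(SketchBid::getSequence);

    private final ConcurrentSkipListSet<SketchBid> bids =
            new ConcurrentSkipListSet<>(HIGHEST_FIRST);

    void addBid(SketchBid bid) { bids.add(bid); }

    /** Returns the highest bid, or null if nobody has bid yet. */
    SketchBid getHighestBid() {
        // Safe only because this sketch never removes bids from the set.
        return bids.isEmpty() ? null : bids.first();
    }

    /** The set doubles as the bidding history. */
    int bidCount() { return bids.size(); }
}
```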
You could also consider this method:
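/**
 * @param bid the bid to place
 * @return true if the bid was successful
 */
public synchronized boolean makeBid(Bid bid) { // synchronized so the check and the add happen as one step
    if (bids.isEmpty()) {
        bids.add(bid);
        return true;
    }
    // larger bids sort first, so a result >= 0 means the bid does not beat the current highest
    if (bid.compareTo(bids.first()) >= 0) {
        return false;
    }
    bids.add(bid);
    return true;
}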
Using an AtomicInteger is fine, provided you initialise it as Tomasz has suggested.
What you might like to think about, however, is whether all you will literally ever need to store is just the highest bid as an integer. Will you never need to store associated information, such as the bidding time, user ID of the bidder etc? Because if at a later stage you do, you'll have to start undoing your AtomicInteger code and replacing it.
I would be tempted from the outset to set things up to store arbitrary information associated with the bid. For example, you can define a "Bid" class with the relevant field(s). Then on each bid, use an AtomicReference to store an instance of "Bid" with the relevant information. To be thread-safe, make all the fields on your Bid class final.
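A minimal sketch of that AtomicReference approach might look like the following. The class and field names (ImmutableBid, AtomicBidHolder, userId, placedAtMillis) are assumptions made for illustration, and a tie is treated as a losing bid.

```java
import java.util.concurrent.atomic.AtomicReference;

// Illustrative immutable value class: all fields are final, as suggested above.
final class ImmutableBid {
    final String userId;
    final int amountInCents;
    final long placedAtMillis;

    ImmutableBid(String userId, int amountInCents, long placedAtMillis) {
        this.userId = userId;
        this.amountInCents = amountInCents;
        this.placedAtMillis = placedAtMillis;
    }
}

class AtomicBidHolder {
    // null means that nobody has bid yet
    private final AtomicReference<ImmutableBid> highest = new AtomicReference<>();

    /** Returns true if the candidate became the new highest bid. */
    boolean offer(ImmutableBid candidate) {
        while (true) {
            ImmutableBid current = highest.get();
            if (current != null && current.amountInCents >= candidate.amountInCents) {
                return false; // an equal or higher bid is already stored
            }
            if (highest.compareAndSet(current, candidate)) {
                return true;
            }
            // Another thread replaced the reference in between; re-read and retry.
        }
    }

    ImmutableBid highest() {
        return highest.get();
    }
}
```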
You could also consider using an explicit Lock (e.g. see the ReentrantLock class) to control access to the highest bid. As Tomasz mentions, even with an AtomicInteger (or AtomicReference: the logic is essentially the same) you need to be a little careful about how you access it. The atomic classes are really designed for cases where they are very frequently accessed (as in thousands of times per second, not every few minutes as on a typical auction site). They won't really give you any performance benefit here, and an explicit Lock object might be more intuitive to program with.
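For comparison, here is a sketch of the explicit-lock variant. LockedBidHolder and its fields are hypothetical names; the point is only the lock()/try/finally/unlock() shape, which keeps the check and the update together.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical holder guarded by an explicit lock; names are illustrative.
class LockedBidHolder {
    private final ReentrantLock lock = new ReentrantLock();
    private int highestBid;        // guarded by lock
    private String highestBidder;  // guarded by lock

    /** Returns true if the bid became the new highest bid. */
    boolean offer(String bidder, int bid) {
        lock.lock();
        try {
            if (bid <= highestBid) {
                return false;
            }
            highestBid = bid;
            highestBidder = bidder;
            return true;
        } finally {
            lock.unlock(); // always release, even if the body throws
        }
    }

    /** Reads both fields under the lock so they are consistent with each other. */
    String describe() {
        lock.lock();
        try {
            return highestBidder + ":" + highestBid;
        } finally {
            lock.unlock();
        }
    }
}
```

Because several fields are updated under one lock, readers never see a bidder paired with somebody else's amount.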
Related
Every time I place a new order with IB, I first have to request the next valid orderId and then call Thread.sleep(500) to wait for the IB API's callback nextValidId to return the latest orderId. If I want to place multiple orders, I naively have to sleep multiple times. This is not a good way to handle it: the orderId may have been updated earlier, in which case the new order could have been placed sooner, and if the update takes longer than the sleep time, the result is an error.
Is there a more efficient and elegant way to do this ?
Ideally, I want the program to prevent running placeNewOrder until the latest available orderID is updated and notify the program to run placeNewOrder.
I do not know much about Java data synchronization but I reckon there might be a better solution using synchronized or wait-notify or locking or blocking.
my code:
// place first order
ib_client.reqIds(-1);
Thread.sleep(500);
int currentOrderId = ib_wrapper.getCurrentOrderId();
placeNewOrder(currentOrderId, orderDetails); // my order placement method
// place 2nd order
ib_client.reqIds(-1);
Thread.sleep(500);
currentOrderId = ib_wrapper.getCurrentOrderId(); // reuse the variable declared for the first order
placeNewOrder(currentOrderId, orderDetails); // my order placement method
IB EWrapper:
public class EWrapperImpl implements EWrapper {
...
protected volatile int currentOrderId = -1; // volatile: written by the callback thread, read by the ordering thread
...
public int getCurrentOrderId() {
return currentOrderId;
}
public void nextValidId(int orderId) {
System.out.println("Next Valid Id: ["+orderId+"]");
currentOrderId = orderId;
}
...
}
You never need to ask for IDs. Just increment by one for every order.
When you first connect, nextValidId is the first or second message to be received; just keep track of the ID and keep incrementing.
The only rules for orderId are to use an integer and to always increment by some amount. This is per clientId, so if you connect with a new clientId then the last orderId is something else.
I always use max(1000, nextValidId) to make sure my IDs start at 1000 or more, since I use IDs below 1000 for data requests. It just helps with errors that have IDs.
You can also reset the sequence somehow.
https://interactivebrokers.github.io/tws-api/order_submission.html
This means that if there is a single client application submitting
orders to an account, it does not have to obtain a new valid
identifier every time it needs to submit a new order. It is enough to
increase the last value received from the nextValidId method by one.
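Putting the two ideas together (wait once for the first nextValidId, then increment locally), a small helper could replace the Thread.sleep calls. This is a sketch, not part of the IB API: OrderIdSource, onNextValidId and next are hypothetical names, and the wrapper's nextValidId(int) callback is assumed to forward to onNextValidId.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not part of the IB API.
class OrderIdSource {
    private final CountDownLatch firstIdReceived = new CountDownLatch(1);
    private final AtomicInteger nextId = new AtomicInteger(Integer.MIN_VALUE);

    /** Assumed to be called from the wrapper's nextValidId(int) callback. */
    void onNextValidId(int orderId) {
        // Never move backwards if the callback fires again with an older value.
        nextId.accumulateAndGet(orderId, Math::max);
        firstIdReceived.countDown();
    }

    /** Blocks until the first id has arrived, then hands out consecutive ids. */
    int next(long timeoutMillis) {
        try {
            if (!firstIdReceived.await(timeoutMillis, TimeUnit.MILLISECONDS)) {
                throw new IllegalStateException(
                        "no nextValidId received within " + timeoutMillis + " ms");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
            throw new IllegalStateException("interrupted while waiting for nextValidId", e);
        }
        return nextId.getAndIncrement();
    }
}
```

The ordering code would then call something like placeNewOrder(ids.next(5000), orderDetails) for each order. The first call waits only as long as the callback actually takes, and later calls return immediately.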
You should not mess around with order ID, it's automatically tracked and being set by the API. Otherwise you will get the annoying "Duplicate order id" error 103. From ApiController class:
public void placeOrModifyOrder(Contract contract, final Order order, final IOrderHandler handler) {
if (!checkConnection())
return;
// when placing new order, assign new order id
if (order.orderId() == 0) {
order.orderId( m_orderId++);
if (handler != null) {
m_orderHandlers.put( order.orderId(), handler);
}
}
m_client.placeOrder( contract, order);
sendEOM();
}
We have server APIs that support clients running on ten million devices. Clients normally call the server once a day, which works out to about 116 clients per second. Each client (identified by a unique ID) may make several API calls concurrently. The server then needs to sequence the API calls from the same client, because they update the same document in the MongoDB database, for example the last-seen time and other embedded documents.
Therefore, I need a synchronization mechanism based on the client's unique ID. After some research, I found the String pool appealing and easy to use. However, someone commented that locking on the String pool may conflict with other libraries or modules that also use it, and that the String pool should therefore never be used for synchronization. Is that statement true? Or should I implement my own "String pool" with a WeakHashMap, as mentioned in the link below?
Good explanation of String Pool implementation in Java:
http://java-performance.info/string-intern-in-java-6-7-8/
Article stating that the String pool should not be used for synchronization:
http://www.journaldev.com/1061/thread-safety-in-java
==================================
Thanks to BeeOnRope's suggestion, I will use Guava's Interner to explain the solution. This way, clients that don't send multiple requests at the same time will not be blocked, and it is guaranteed that only one API request from a given client is processed at a time. Note that we need a wrapper class, since it is a bad idea to lock on a String object, as explained by BeeOnRope and the link he provided in his answer.
public class Key {
private String id;
public Key(String id) {
this.id = id;
}
public String getId() {
return id;
}
@Override
public int hashCode() {
final int prime = 31;
int result = 1;
result = prime * result + ( (id == null) ? 0 : id.hashCode());
return result;
}
@Override
public boolean equals(Object obj) {
if (this == obj) return true;
if (obj == null) return false;
if (getClass() != obj.getClass()) return false;
Key other = (Key)obj;
if (id == null) {
if (other.id != null) return false;
} else if (!id.equals(other.id)) return false;
return true;
}
}
Interner<Key> myIdInterner = Interners.newWeakInterner();
public void processApi1(String clientUniqueId, RequestType1 request) {
synchronized(myIdInterner.intern(new Key(clientUniqueId))) {
// code to process request
}
}
public void processApi2(String clientUniqueId, RequestType2 request) {
synchronized(myIdInterner.intern(new Key(clientUniqueId))) {
// code to process request
}
}
Well, if your strings are unique enough (e.g., generated via a cryptographic hash [1]), synchronizing on client IDs will probably work, as long as you call String.intern() on them first. Since the IDs are unique, you aren't likely to run into conflicts with other modules, unless you happen to pass your IDs in to them and they follow the bad practice of locking on them.
That said, it is probably a bad idea. In addition to the small chance of one day running into unnecessary contention if someone else locks on the same String instance, the main problem is that you have to intern() all your String objects, and this often suffers from poor performance because of the native implementation of the string intern table, its fixed size, etc. If you really need to lock based only on a String, you are better off using Guava's Interners.newWeakInterner() interner implementation, which is likely to perform much better. Wrap your string in another class to avoid clashing on the built-in String lock. More details on that approach in this answer.
Besides that, there is often another natural object to lock on, such as a lock in a session object, etc.
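If pulling in Guava is not an option, a different technique, lock striping, gives per-client mutual exclusion with plain JDK classes. To be clear, this is not the interner approach described above but a swapped-in alternative, and StripedLocks and ClientRequestHandler are hypothetical names. Each client ID is hashed onto a fixed array of lock objects, so memory stays bounded even with ten million clients. The price is that two different clients occasionally share a stripe and block each other unnecessarily.

```java
// Hypothetical helper showing lock striping with plain JDK classes.
class StripedLocks {
    private final Object[] stripes;

    StripedLocks(int stripeCount) {
        stripes = new Object[stripeCount];
        for (int i = 0; i < stripeCount; i++) {
            stripes[i] = new Object(); // private objects, so nobody else can lock on them
        }
    }

    /** The same client id always maps to the same lock object. */
    Object lockFor(String clientId) {
        // floorMod keeps the index non-negative even for negative hash codes
        return stripes[Math.floorMod(clientId.hashCode(), stripes.length)];
    }
}

class ClientRequestHandler {
    private final StripedLocks locks = new StripedLocks(1024);

    /** Runs the work while holding the lock that belongs to this client id. */
    void process(String clientUniqueId, Runnable work) {
        synchronized (locks.lockFor(clientUniqueId)) {
            work.run();
        }
    }
}
```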
This is quite similar to this question which has more fleshed out answers.
[1] ... or, at a minimum, have enough bits to make collisions unlikely, provided your client IDs aren't part of your attack surface.
So I have been having a go at using method references in Java 8 (Object::method). What I am trying to do, which I have done before but have forgotten (the last time I used a method reference was about 4 months ago), is find the players that are not online using a method reference.
public static Set<Friend> getOnlineFriends(UUID playerUUID)
{
Set<Friend> friends = new HashSet<>(Arrays.asList(ZMFriends.getFriends(playerUUID)));
return friends.stream().filter(Friend::isOnline).collect(Collectors.toSet());
}
public static Set<Friend> getOfflineFriends(UUID playerUUID)
{
Set<Friend> friends = new HashSet<>(Arrays.asList(ZMFriends.getFriends(playerUUID)));
return friends.stream().filter(Friend::isOnline).collect(Collectors.toSet());
}
As you can see, I managed to do it when the player (friend) is online, but I cannot figure out how to filter through the Set and collect the offline players. I'm missing something obvious, but what is it?
Thanks,
Duke.
In your code
public static Set<Friend> getOnlineFriends(UUID playerUUID)
{
Set<Friend> friends = new HashSet<>(Arrays.asList(ZMFriends.getFriends(playerUUID)));
return friends.stream().filter(Friend::isOnline).collect(Collectors.toSet());
}
you are creating a List view of the array returned by ZMFriends.getFriends(playerUUID) and copying its contents to a HashSet, just to call stream() on it.
That’s a waste of resources, as the source type is irrelevant to the subsequent stream operation. You don’t need to have a Set source to get a Set result. So you can implement your operation simply as
public static Set<Friend> getOnlineFriends(UUID playerUUID)
{
return Arrays.stream(ZMFriends.getFriends(playerUUID))
.filter(Friend::isOnline).collect(Collectors.toSet());
}
Further, you should consider whether you really need both, getOnlineFriends and getOfflineFriends in your actual implementation. Creating utility methods in advance, just because you might need them, rarely pays off. See also “You aren’t gonna need it”.
But if you really need both operations, it’s still an unnecessary code duplication. Just consider:
public static Set<Friend> getFriends(UUID playerUUID, boolean online)
{
return Arrays.stream(ZMFriends.getFriends(playerUUID))
.filter(f -> f.isOnline()==online).collect(Collectors.toSet());
}
which solves both tasks. It still wastes resources if the application really needs both Sets, as the same operation has to be performed twice to get them. Consider:
public static Map<Boolean,Set<Friend>> getOnlineFriends(UUID playerUUID)
{
return Arrays.stream(ZMFriends.getFriends(playerUUID))
.collect(Collectors.partitioningBy(Friend::isOnline, Collectors.toSet()));
}
This provides you both Sets at once, the online friends being associated to true, the offline friends being associated to false.
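A small self-contained sketch shows how the partitioned result is used. DemoFriend is a stand-in for the question's Friend type, and its fields are assumptions. Note that partitioningBy always puts both keys in the map, so get(true) and get(false) never return null.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Minimal stand-in for the Friend type in the question; its fields are assumptions.
class DemoFriend {
    private final String name;
    private final boolean online;

    DemoFriend(String name, boolean online) {
        this.name = name;
        this.online = online;
    }

    String getName() { return name; }
    boolean isOnline() { return online; }
}

class PartitionDemo {
    /** Splits the friends into online (key true) and offline (key false) in one pass. */
    static Map<Boolean, Set<DemoFriend>> byOnlineStatus(DemoFriend[] friends) {
        return Arrays.stream(friends)
                .collect(Collectors.partitioningBy(DemoFriend::isOnline, Collectors.toSet()));
    }
}
```

A caller would read the two groups as byOnlineStatus(friends).get(true) and byOnlineStatus(friends).get(false), ideally keeping the map in a local variable so the stream runs only once.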
There are 2 ways I can think of:
friends.stream().filter(i -> !i.isOnline()).collect(Collectors.toSet());
But I guess that's not what you want, since it's not using a method reference. So maybe something like this:
public static <T> Predicate<T> negation(Predicate<T> predicate) {
return predicate.negate();
}
...
friends.stream().filter(negation(Friend::isOnline)).collect(Collectors.toSet());
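Assuming Java 11 or later, the helper above is no longer needed, because the JDK ships the same thing as the static method Predicate.not. The sketch below uses a made-up SampleFriend class in place of the question's Friend type.

```java
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Minimal stand-in for the Friend type in the question; its fields are assumptions.
class SampleFriend {
    private final String name;
    private final boolean online;

    SampleFriend(String name, boolean online) {
        this.name = name;
        this.online = online;
    }

    String getName() { return name; }
    boolean isOnline() { return online; }
}

class OfflineFilterDemo {
    /** Collects the friends that are not online, using a negated method reference. */
    static Set<SampleFriend> offline(List<SampleFriend> friends) {
        return friends.stream()
                .filter(Predicate.not(SampleFriend::isOnline)) // Java 11+
                .collect(Collectors.toSet());
    }
}
```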
Assume that we have a given interface:
public interface StateKeeper {
public abstract void negateWithoutCheck();
public abstract void negateWithCheck();
}
and following implementations:
class StateKeeperForPrimitives implements StateKeeper {
private boolean b = true;
public void negateWithCheck() {
if (b == true) {
this.b = false;
}
}
public void negateWithoutCheck() {
this.b = false;
}
}
class StateKeeperForObjects implements StateKeeper {
private Boolean b = true;
@Override
public void negateWithCheck() {
if (b == true) {
this.b = false;
}
}
@Override
public void negateWithoutCheck() {
this.b = false;
}
}
Moreover, assume that the negate*Check() methods can be called one or more times, and that it is hard to say what the upper bound on the number of calls is.
The question is which method in both implementations is 'better'
according to execution speed, garbage collection, memory allocation, etc. -
negateWithCheck or negateWithoutCheck?
Does the answer depend on which of the two proposed
implementations we use, or does it not matter?
Does the answer depend on the estimated number of calls? For what number of calls is it better to use one method or the other?
There might be a slight performance benefit in using the one with the check. I highly doubt that it matters in any real life application.
premature optimization is the root of all evil (Donald Knuth)
You could measure the difference between the two. Let me emphasize that these kinds of things are notoriously difficult to measure reliably.
Here is a simple-minded way to do this. You can hope for performance benefits if the check recognizes that the value doesn't have to be changed, saving you an expensive write into the memory. So I have changed your code accordingly.
interface StateKeeper {
public abstract void negateWithoutCheck();
public abstract void negateWithCheck();
}
class StateKeeperForPrimitives implements StateKeeper {
private boolean b = true;
public void negateWithCheck() {
if (b == false) {
this.b = true;
}
}
public void negateWithoutCheck() {
this.b = true;
}
}
class StateKeeperForObjects implements StateKeeper {
private Boolean b = true;
public void negateWithCheck() {
if (b == false) {
this.b = true;
}
}
public void negateWithoutCheck() {
this.b = true;
}
}
public class Main {
public static void main(String args[]) {
StateKeeper[] array = new StateKeeper[10_000_000];
for (int i=0; i<array.length; ++i)
//array[i] = new StateKeeperForObjects();
array[i] = new StateKeeperForPrimitives();
long start = System.nanoTime();
for (StateKeeper e : array)
e.negateWithCheck();
//e.negateWithoutCheck();
long end = System.nanoTime();
System.err.println("Time in milliseconds: "+((end-start)/1000000));
}
}
I get the following:
             check    no check
primitive    17ms     24ms
Object       21ms     24ms
In the opposite case, where the check is always superfluous because the value always has to be changed, I didn't find any performance penalty for the check.
Two things: (1) These timings are unreliable. (2) This benchmark is far from any real life application; I had to make an array of 10 million elements to actually see something.
I would simply pick the function with no check. I highly doubt that in any real application you would get a measurable performance benefit from the function with the check, and the check is error-prone and harder to read.
Short answer: the Without check will always be faster.
An assignment takes a lot less computation time than a comparison. Therefore: an IF statement is always slower than an assignment.
When comparing 2 variables, your CPU will fetch the first variable, fetch the second variable, compare those 2 and store the result into a temporary register. That's 2 fetches, 1 compare and 1 store.
When you assign a value, your CPU will fetch the value on the right hand of the '=' and store it into the memory. That's 1 fetch and 1 store.
In general, if you need to set some state, just set the state. If, on the other hand, you have to do something more (like log the change, inform about the change, etc.), then you should first inspect the old value.
But, in the case when methods like the ones you provided are called very intensely, there may be some performance difference in checking vs non-checking (whether the new value is different). Possible outcomes are:
1-a) check returns false
1-b) check returns true, value is assigned
2) value is assigned without check
As far as I know, writing is always slower than reading (all the way down to register level), so the fastest outcome is 1-a. If in your case the value usually does not change (a simple 'more than 50%' rule is not good enough; the exact break-even percentage has to be figured out empirically), then you should go with checking, as this eliminates the redundant write (the value assignment). If, on the other hand, the value differs more often than not, assign it without checking.
You should test your concrete cases, do some profiling, and based on the result determine the best implementation. There is no general "best way" for this case (apart from "just set the state").
As for boolean vs Boolean here, I would say (off the top of my head) that there should be no performance difference.
Only today I've seen few answers and comments repeating that
Premature optimization is the root of all evil
Well obviously one if statement more is one thing more to do, but... it doesn't really matter.
And garbage collection and memory allocation... not an issue here.
I would generally consider negateWithCheck to be slightly slower due to there always being a comparison. Also notice that in StateKeeperForObjects you are introducing some autoboxing: 'true' and 'false' are primitive boolean values.
Assuming you fix StateKeeperForObjects to use objects throughout, there is potentially a difference, but most likely not a noticeable one.
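To make the autoboxing remark concrete, here is a small sketch (BoxingDemo is an illustrative name). Assigning a boolean literal to a Boolean field boxes it, but the language guarantees that boxing true or false always yields the same shared instance, so no new object is allocated. Comparing a Boolean with a literal goes the other way and unboxes, which fails if the reference is null.

```java
class BoxingDemo {
    /** True if autoboxing handed back the shared Boolean.TRUE instance. */
    static boolean boxedTrueIsShared() {
        Boolean boxed = true;          // boxing conversion; javac emits Boolean.valueOf(true)
        return boxed == Boolean.TRUE;  // both operands are references: identity comparison
    }

    /** Mirrors the 'if (b == true)' check from StateKeeperForObjects. */
    static boolean comparisonUnboxes(Boolean flag) {
        // One operand is a primitive, so flag is unboxed via booleanValue();
        // a null flag therefore throws NullPointerException here.
        return flag == true;
    }
}
```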
The speed will depend slightly on the number of calls, but in general the speed should be considered to be the same whether you call it once or many times (ignoring secondary effects such as caching, jit, etc).
It seems to me, a better question is whether or not the performance difference is noticeable. I work on a scientific project that involves millions of numerical computations done in parallel. We started off using Objects (e.g. Integer, Double) and had less than desirable performance, both in terms of memory and speed. When we switched all of our computations to primitives (e.g. int, double) and went over the code to make sure we were not introducing anything funky through autoboxing, we saw a huge performance increase (both memory and speed).
I am a huge fan of avoiding premature optimization, unless it is something that is "simple" to implement. Just be wary of the consequences. For example, do you have to represent null values in your data model? If so, how do you do that using a primitive? Doubles can be done easily with NaN, but what about Booleans?
negateWithoutCheck() is preferable: it performs a single operation (this.b = false;), whereas negateWithCheck() performs that assignment plus an extra comparison.
I have the following code:
Note: I simplified the code as much as possible for readability.
If I forgot any critical pieces let me know.
public class User {
private Relations relations;
public User(){
relations = new Relations(this);
}
public Relations getRelations(){
return relations;
}
}
public class Relations {
private User user;
public Relations(User user){
this.user = user;
}
public synchronized void setRelation(User user2){
Relations relations2 = user2.getRelations();
synchronized(relations2){
storeRelation(user2);
if(!relations2.hasRelation(user))
relations2.setRelation(user);
}
}
public synchronized boolean hasRelation(User user2){
... // Checks if this relation is present in some kind of collection
}
/*Store this relation, unless it is already present*/
private void storeRelation(User user2){
... // Stores this relation in some kind of collection
}
}
This implementation should make sure that for all Relations x, y with:
x.user = u_x
y.user = u_y
the following invariant holds:
x.hasRelation( u_y ) <=> y.hasRelation( u_x )
I believe that holds for the code stated above?
Note: It does of course not hold during the execution of setRelation(..),
but at that moment the locks for both relations involved are
held by the executing thread so no other thread can read the
hasRelation(..) of one of the relations involved.
Assuming that this holds, I believe there is still a potential deadlock risk.
Is that correct? And if it is, how can I solve it?
I think I would need to obtain both locks in setRelation(..) atomically somehow.
You are correct on both points: your invariant does hold (assuming that I understand correctly what your method-names mean and so on, and assuming that by if(!relations.hasRelation(user)) relations2.setRelation(user2); you meant to write if(!relations2.hasRelation(user)) relations2.setRelation(user);), but you do have the risk of a deadlock: if one thread needs to obtain a lock on x and then on y, and another thread needs to obtain a lock on y and then on x, then there's a risk that each thread will succeed in getting its first lock, and thereby prevent the other from getting its second lock.
One solution is to enforce a strict universal ordering for getting locks on Relations instances. What you do is, you add a constant integer field lockOrder:
private final int lockOrder;
and a static integer field currentLockOrder:
private static int currentLockOrder = 0;
and every time you create a Relations instance, you set its lockOrder to the current value of currentLockOrder, and then increment currentLockOrder:
public Relations(User user)
{
    this.user = user; // keep the owner, as in the question's constructor
    synchronized(Relations.class) // a lock on currentLockOrder
    {
        lockOrder = currentLockOrder;
        ++currentLockOrder;
    }
}
such that every instance of Relations will have a distinct, immutable value for lockOrder. Your setRelation method would then obtain locks in the specified order:
public void setRelation(final User thatUser)
{
final Relations that = thatUser.getRelations();
synchronized(lockOrder < that.lockOrder ? this : that)
{
synchronized(lockOrder < that.lockOrder ? that : this)
{
storeRelation(thatUser);
if(! that.hasRelation(user))
that.storeRelation(user);
}
}
}
thereby ensuring that if two threads both need to get locks on both x and y, then either they'll both first get locks on x, or they'll both first get locks on y. Either way, no deadlock will occur.
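The whole scheme can be condensed into a runnable sketch. The names Account and AccountRelations are stand-ins for the question's User and Relations, the relation store is assumed to be a plain HashSet, and an AtomicInteger replaces the synchronized static counter. None of that comes from the original code.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

// Stand-in for the question's User class.
class Account {
    final String name;
    final AccountRelations relations = new AccountRelations(this);

    Account(String name) { this.name = name; }
}

// Stand-in for the question's Relations class, with a global lock order.
class AccountRelations {
    private static final AtomicInteger NEXT_ORDER = new AtomicInteger();

    private final int lockOrder = NEXT_ORDER.getAndIncrement(); // distinct per instance
    private final Account owner;
    private final Set<Account> related = new HashSet<>();        // guarded by this

    AccountRelations(Account owner) { this.owner = owner; }

    void setRelation(Account other) {
        AccountRelations that = other.relations;
        // Always lock the instance with the smaller lockOrder first.
        AccountRelations first = lockOrder < that.lockOrder ? this : that;
        AccountRelations second = (first == this) ? that : this;
        synchronized (first) {
            synchronized (second) {
                related.add(other);       // Set.add ignores duplicates
                that.related.add(owner);  // keeps the relation symmetric
            }
        }
    }

    synchronized boolean hasRelation(Account other) {
        return related.contains(other);
    }
}
```

Two threads calling a.relations.setRelation(b) and b.relations.setRelation(a) at the same time both lock a's relations first here, because a was created first and has the smaller lockOrder, so neither can hold one lock while waiting for the other.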
Note, by the way, that I changed setRelation to storeRelation. setRelation would work, but why add that complexity?
Also, there's still one thing I don't get: how come x.setRelation(u_y) calls x.storeRelation(u_y) unconditionally, but calls y.setRelation(u_x) (or y.storeRelation(u_x)) only if y doesn't already have the relationship? It doesn't make sense. It seems like either both checks are needed, or neither check is. (Without seeing the implementation of Relations.storeRelation(...), I can't guess which of those is the case.)