I am using JeroMQ in a multithreaded environment as shown below. In my code, the constructor of SocketManager first connects to all the available sockets and puts them into the liveSocketsByDatacenter map (in the connectToZMQSockets method). The same constructor then starts a background thread which runs every 30 seconds and calls the updateLiveSockets method to ping every socket already present in the liveSocketsByDatacenter map, updating the map with whether each socket is alive or not.
The getNextSocket() method is called by multiple reader threads concurrently to get the next available live socket, and we then use that socket to send data. So my question is: are we using JeroMQ correctly in a multithreaded environment? We just saw the exception below in our production environment while trying to send data on a live socket, so I am not sure whether it's a bug or something else.
java.lang.ArrayIndexOutOfBoundsException: 256
at zmq.YQueue.push(YQueue.java:97)
at zmq.YPipe.write(YPipe.java:47)
at zmq.Pipe.write(Pipe.java:232)
at zmq.LB.send(LB.java:83)
at zmq.Push.xsend(Push.java:48)
at zmq.SocketBase.send(SocketBase.java:590)
at org.zeromq.ZMQ$Socket.send(ZMQ.java:1271)
at org.zeromq.ZFrame.send(ZFrame.java:131)
at org.zeromq.ZFrame.sendAndKeep(ZFrame.java:146)
at org.zeromq.ZMsg.send(ZMsg.java:191)
at org.zeromq.ZMsg.send(ZMsg.java:163)
Below is my code:
public class SocketManager {
private static final Random random = new Random();
private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
private final Map<Datacenters, List<SocketHolder>> liveSocketsByDatacenter = new ConcurrentHashMap<>();
private final ZContext ctx = new ZContext();
private static class Holder {
private static final SocketManager instance = new SocketManager();
}
public static SocketManager getInstance() {
return Holder.instance;
}
private SocketManager() {
connectToZMQSockets();
scheduler.scheduleAtFixedRate(this::updateLiveSockets, 30, 30, TimeUnit.SECONDS);
}
// during startup, making a connection and populate once
private void connectToZMQSockets() {
Map<Datacenters, List<String>> socketsByDatacenter = Utils.SERVERS;
for (Map.Entry<Datacenters, List<String>> entry : socketsByDatacenter.entrySet()) {
List<SocketHolder> addedColoSockets = connect(entry.getValue(), ZMQ.PUSH);
liveSocketsByDatacenter.put(entry.getKey(), addedColoSockets);
}
}
private List<SocketHolder> connect(List<String> addresses, int socketType) {
List<SocketHolder> socketList = new ArrayList<>();
for (String address : addresses) {
try {
Socket client = ctx.createSocket(socketType);
// Set random identity to make tracing easier
String identity = String.format("%04X-%04X", random.nextInt(), random.nextInt());
client.setIdentity(identity.getBytes(ZMQ.CHARSET));
client.setTCPKeepAlive(1);
client.setSendTimeOut(7);
client.setLinger(0);
client.connect(address);
SocketHolder zmq = new SocketHolder(client, ctx, address, true);
socketList.add(zmq);
} catch (Exception ex) {
// log error
}
}
return socketList;
}
// this method will be called by multiple threads concurrently to get the next live socket
// is there any concurrency or thread safety issue or race condition here?
public Optional<SocketHolder> getNextSocket() {
for (Datacenters dc : Datacenters.getOrderedDatacenters()) {
Optional<SocketHolder> liveSocket = getLiveSocket(liveSocketsByDatacenter.get(dc));
if (liveSocket.isPresent()) {
return liveSocket;
}
}
return Optional.absent();
}
private Optional<SocketHolder> getLiveSocket(final List<SocketHolder> listOfEndPoints) {
if (!CollectionUtils.isEmpty(listOfEndPoints)) {
// The list of live sockets
List<SocketHolder> liveOnly = new ArrayList<>(listOfEndPoints.size());
for (SocketHolder obj : listOfEndPoints) {
if (obj.isLive()) {
liveOnly.add(obj);
}
}
if (!liveOnly.isEmpty()) {
// The list is not empty so we pick a random element
return Optional.of(liveOnly.get(random.nextInt(liveOnly.size()))); // just pick one
}
}
return Optional.absent();
}
// runs every 30 seconds to ping all the sockets and check whether they are alive or not
private void updateLiveSockets() {
Map<Datacenters, List<String>> socketsByDatacenter = Utils.SERVERS;
for (Map.Entry<Datacenters, List<String>> entry : socketsByDatacenter.entrySet()) {
List<SocketHolder> liveSockets = liveSocketsByDatacenter.get(entry.getKey());
List<SocketHolder> liveUpdatedSockets = new ArrayList<>();
for (SocketHolder liveSocket : liveSockets) { // LINE A
Socket socket = liveSocket.getSocket();
String endpoint = liveSocket.getEndpoint();
Map<byte[], byte[]> holder = populateMap();
Message message = new Message(holder, Partition.COMMAND);
// pinging to see whether a socket is live or not
boolean status = SendToSocket.getInstance().execute(message.getAdd(), holder, socket);
boolean isLive = status;
SocketHolder zmq = new SocketHolder(socket, liveSocket.getContext(), endpoint, isLive);
liveUpdatedSockets.add(zmq);
}
liveSocketsByDatacenter.put(entry.getKey(), Collections.unmodifiableList(liveUpdatedSockets));
}
}
}
And here is how I am using the getNextSocket() method of the SocketManager class concurrently from multiple reader threads:
// this method will be called from multiple threads
public boolean sendAsync(final long addr, final byte[] reco) {
Optional<SocketHolder> liveSockets = SocketManager.getInstance().getNextSocket();
return sendAsync(addr, reco, liveSockets.get().getSocket(), false);
}
public boolean sendAsync(final long addr, final byte[] reco, final Socket socket,
final boolean messageA) {
ZMsg msg = new ZMsg();
msg.add(reco);
boolean sent = msg.send(socket);
msg.destroy();
retryHolder.put(addr, reco);
return sent;
}
public boolean send(final long address, final byte[] encodedRecords, final Socket socket) {
boolean sent = sendAsync(address, encodedRecords, socket, true);
// if the record was sent successfully, then only sleep for timeout period
if (sent) {
try {
TimeUnit.MILLISECONDS.sleep(500);
} catch (InterruptedException ex) {
Thread.currentThread().interrupt();
}
}
// ...
return sent;
}
I don't think this is correct. It seems getNextSocket() could return a 0MQ socket to thread A. Concurrently, the timer thread may access the same 0MQ socket to ping it. In that case thread A and the timer thread are mutating the same 0MQ socket, which will lead to problems. So what is the best and efficient way to fix this issue?
Note: SocketHolder is an immutable class
Update:
I just noticed the same issue happened on another of my boxes with the same ArrayIndexOutOfBoundsException, but this time at line 71 of the "YQueue" file. The only consistent thing is that it is always 256. So there must be something related to 256, and I am not able to figure out what this 256 is here.
java.lang.ArrayIndexOutOfBoundsException: 256
at zmq.YQueue.backPos(YQueue.java:71)
at zmq.YPipe.write(YPipe.java:51)
at zmq.Pipe.write(Pipe.java:232)
at zmq.LB.send(LB.java:83)
at zmq.Push.xsend(Push.java:48)
at zmq.SocketBase.send(SocketBase.java:590)
at org.zeromq.ZMQ$Socket.send(ZMQ.java:1271)
at org.zeromq.ZFrame.send(ZFrame.java:131)
at org.zeromq.ZFrame.sendAndKeep(ZFrame.java:146)
at org.zeromq.ZMsg.send(ZMsg.java:191)
at org.zeromq.ZMsg.send(ZMsg.java:163)
Fact #0: ZeroMQ is not thread-safe -- by definition
While the ZeroMQ documentation and Pieter HINTJENS' excellent book "Code Connected. Volume 1" do not forget to remind us of this fact wherever possible, the idea of returning or even sharing a ZeroMQ socket instance among threads appears from time to time. Sure, class instances' methods may deliver this almost "hidden" inside their internal methods and attributes, but proper design efforts ought to prevent any such side-effects, with no exceptions, no excuses.
Sharing, if reasonably supported by quantitative facts, may be a way to go for a common instance of the zmq.Context(), but a crystal-clear distributed system design may instead live on a truly multi-agent scheme, where each agent operates its own Context() engine, fine-tuned to its respective mix of configuration and performance preferences.
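For illustration only, here is a minimal sketch of that multi-agent scheme, assuming the same JeroMQ API the question already uses (ZContext, ZMQ.PUSH, ZMQ.CHARSET); the endpoint, payload and thread name are made-up placeholders:
// Hedged sketch: this thread owns its own ZContext and socket; nothing is shared with other threads.
Runnable agent = () -> {
    try (ZContext ownCtx = new ZContext()) {
        ZMQ.Socket push = ownCtx.createSocket(ZMQ.PUSH);
        push.setLinger(0);
        push.connect("tcp://localhost:5555");              // placeholder endpoint
        push.send("hello from a private socket".getBytes(ZMQ.CHARSET));
    }                                                      // closing the context also releases the socket
};
new Thread(agent, "agent-1").start();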
So what is the best and efficient way to fix this issue?
Never share a ZeroMQ socket. Never, indeed, even if the newest development has started to promise some near-future changes in this direction. It is a bad habit to pollute any high-performance, low-latency distributed system design with sharing. Share-nothing is the best design principle for this domain.
Yeah, I can see we should not share sockets between threads, but in my code what do you think is the best way to resolve this?
Yeah, the best and efficient way to fix this issue is to never share a ZeroMQ socket.
This means never returning any object whose attributes are ZeroMQ sockets (which you actively build and return in large numbers from the .connect(){...} class-method). In your case, all the class-methods seem to be kept private, which may defuse the problem of allowing "other threads" to touch the class-private socket instances, but the same principle must also be enforced at the attribute level to be effective. Finally, this "fuse" gets short-circuited and violated by the
public static SocketManager getInstance(),
which promiscuously offers any external caller straight access to sharing the class-private instances of the ZeroMQ sockets.
If some documentation explicitly warns in almost every chapter not to share things, one should rather not share the things.
So, redesign the methods so that the SocketManager gains more functionality as its own class-methods, executing the must-have operations internally, so as to explicitly prevent any external-world thread from touching a non-shareable instance, as documented in the ZeroMQ publications.
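One way to follow that advice is sketched below: instead of handing a SocketHolder out of getNextSocket(), let SocketManager expose a send-style method and keep every touch of the ZeroMQ socket behind one lock (or one dedicated owner thread). This is only a sketch; pickLiveSocket() is a hypothetical helper standing in for the existing selection logic:
// Hedged sketch: sockets never leave SocketManager, and all socket use is serialized here.
public synchronized boolean send(byte[] payload) {
    Optional<SocketHolder> holder = pickLiveSocket();      // same selection logic as getNextSocket()
    if (!holder.isPresent()) {
        return false;                                      // no live socket available right now
    }
    ZMsg msg = new ZMsg();
    msg.add(payload);
    boolean sent = msg.send(holder.get().getSocket());     // only this locked code path touches the socket
    msg.destroy();
    return sent;
}
Note that the 30-second ping in updateLiveSockets() would have to go through the same lock (or the same owning thread), otherwise the pinger and the senders would still be sharing sockets.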
Next comes the inventory of resources: your code seems to re-check every 30 seconds the state of the world in all the DataCenters-of-Interest. This actually creates new List objects twice a minute. While you may speculatively let the Java garbage collector tidy up all the trash that is no longer referenced from anywhere, that is not a good idea for the ZeroMQ-related objects embedded inside the Lists from your previous re-check runs. ZeroMQ objects are still referenced from inside the ZContext() - the ZeroMQ Context()-core-factory-instantiated I/O thread(s), which can also be viewed as the ZeroMQ socket-inventory resources manager. So, all the newly created socket instances get not only the external handle from the Java side, but also an internal handle from inside the (Z)Context(). So far so good. But what is not seen anywhere in the code is any method that would decommission any and all the ZeroMQ sockets in object instances that have become dereferenced on the Java side but still remain referenced from the (Z)Context() side. Explicit decommissioning of allocated resources is fair design-side practice, all the more for resources that are limited or otherwise constrained. The way to do this may differ for { "cheap" | "expensive" }-maintenance costs of such resources-management processing (ZeroMQ socket instances being remarkably expensive to handle as some lightweight "consumable/disposable" ... but that is another story).
So, also add a set of proper resource-reuse / resource-dismantling methods that would bring the total number of newly created sockets back under your control (your code is responsible for how many socket handlers inside the (Z)Context() domain of resources control may get created and must remain managed, be it knowingly or not).
One may object that there might be some "promises" from automated detection and (potentially well-deferred) garbage collection, but still, your code is responsible for proper resources management, and even the LMAX guys would never have got such brave performance if they had relied on "promises" from the standard gc. Your problem is way worse than what LMAX top performance had to fight with. Your code (as published so far) does nothing to .close() and .term() the ZeroMQ-associated resources at all. That is a straight-out impossible practice inside an ecosystem with uncontrolled (distributed-demand-driven) consumption. You have to protect your boat from getting overloaded beyond a limit you know it can safely handle, and dynamically unload each and every box that has no recipient on the "opposite coast".
That is the Captain's ( your code designer's ) responsibility.
If you do not explicitly tell the sailor in charge of inventory management on the lowest level (the ZeroMQ Context() floor) that some boxes are to be unloaded, the problem is still yours. The standard gc chain of command will not do this "automatically", whatever the "promises" might look like; it will not. So be explicit towards your ZeroMQ resources management, evaluate the return codes from ordering these steps to be taken, and handle appropriately any and all exceptions raised while doing these resources-management operations under your code's explicit control.
Lower (if not the lowest achievable) resource-utilisation envelopes and higher (if not the highest achievable) performance are the bonus for doing this job right. The LMAX guys are a good example of doing this remarkably well, beyond the standard Java "promises", so one can learn from the best of the best.
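As a rough sketch of such explicit decommissioning, written against the SocketManager fields shown in the question (the first version, where liveSocketsByDatacenter is a plain Map) and assuming the jeromq ZContext API (destroySocket(...) and destroy() exist on recent ZContext versions; verify against the version you run); shutdown() itself is a hypothetical new method:
// Hedged sketch of a shutdown path that hands every socket back to the context.
public void shutdown() {
    scheduler.shutdownNow();                               // stop the 30-second pinger first
    for (List<SocketHolder> holders : liveSocketsByDatacenter.values()) {
        for (SocketHolder holder : holders) {
            ctx.destroySocket(holder.getSocket());         // decommission each socket explicitly
        }
    }
    liveSocketsByDatacenter.clear();
    ctx.destroy();                                         // finally dismantle the context itself
}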
Call signatures declared, vs. used, do not seem to match:
While I may be wrong on this point, as most of my design efforts are not in Java polymorphic call-interfaces, there seems to be a mismatch in a signature, published as:
private List<SocketHolder> connect( Datacenters dc, // 1-st
List<String> addresses, // 2-nd
int socketType // 3-rd
) {
... /* implementation */
}
and the actual method invocation inside the connectToZMQSockets() method, which is called just by:
List<SocketHolder> addedColoSockets = connect( entry.getValue(), // 1-st
ZMQ.PUSH // 2-nd
);
I developed an application using a Java socket. I am exchanging messages with this application with the help of byte arrays. I have a message named M1, 1979 bytes long. My socket buffer length is 512 bytes, so I read this message in 4 parts, each of 512 bytes, except the last one which is of course 443 bytes. I will name these parts A, B, C, and D; so ABCD, in that order, is a valid message of mine.
I have a thread with a loop like the one below.
BlockingQueue<Chunk> queue = new LinkedBlockingQueue<>();
InputStream in = socket.getInputStream();
byte[] buffer = new byte[512];
while(true) {
int readResult = in.read(buffer);
if(readResult != -1) {
byte[] arr = Arrays.copyOf(buffer, readResult);
Chunk c = new Chunk(arr);
queue.put(c);
}
}
I'm filling the queue with the code above. When the message sending starts, I see the queue fill up in ABCD form, but sometimes the data ends up in the queue as BACD. But I know that this is impossible because the TCP connection guarantees the order.
I looked at the dumps with Wireshark. This message arrives correctly in a single TCP packet, so there is no problem on the sender side. I am 100% sure that the message has arrived correctly, but the read method does not seem to read in the correct order, and this situation doesn't always happen. I could not find a valid reason for what causes this.
When I tried the same code on two different computers I noticed that the problem occurred on only one of them. The JDK versions on these computers are different, so I looked at the differences between the two versions. With JDK 8u202 I get the situation where it works incorrectly; with JDK 8u271 there was no problem. Maybe it is related to that, but I am not sure, because I have no solid evidence.
I am open to all kinds of ideas and suggestions. It's really on its way to being the most interesting problem I've ever encountered.
Thank you for your help.
EDIT: I found similar question.
Blocking Queue Take out of Order
EDIT:
Ok, I have read all the answers given below. Thank you for providing different perspectives for me. I will try to supplement some missing information.
Actually I have 2 threads. Thread 1 (SocketReader) is responsible for reading the socket. It wraps the information it reads in a Chunk class and puts it on the queue that lives in Thread 2. So the queue is in Thread 2. Thread 2 (MessageDecoder) consumes the blocking queue. There are no threads other than these. Actually this is a simple example of the producer-consumer design pattern.
And yes, other messages are sent, but those messages take up less than 512 bytes, so I can read them in one go and I do not encounter any ordering problem with them.
MessageDecoder.java
public class MessageDecoder implements Runnable{
private BlockingQueue<Chunk> queue = new LinkedBlockingQueue<>();
public MessageDecoder() {
}
public void run() {
while(true) {
try {
Chunk c = queue.take();
System.out.println(c.toString());
decodeMessageChunk(c);
} catch (InterruptedException e) {
e.printStackTrace();
}
}
}
public void put(Chunk c) {
try {
queue.put(c);
} catch (InterruptedException e) {
e.printStackTrace();
}
}
}
SocketReader.java
public class SocketReader implements Runnable{
private final MessageDecoder msgDec;
private final InputStream in;
byte[] buffer = new byte[512];
public SocketReader(InputStream in, MessageDecoder msgDec) {
this.in = in;
this.msgDec = msgDec;
}
public void run() {
while(true) {
try {
int readResult = in.read(buffer);
if(readResult != -1) {
byte[] arr = Arrays.copyOf(buffer, readResult);
Chunk c = new Chunk(arr);
msgDec.put(c);
}
} catch (IOException e) {
e.printStackTrace();
}
}
}
}
Even if it's a FIFO queue, the locking of the LinkedBlockingQueue is unfair, so you can't guarantee the ordering of elements. More info regarding this here
I'd suggest using an ArrayBlockingQueue instead. Like the LinkedBlockingQueue, the order is not guaranteed by default, but it offers a slightly different locking mechanism.
This class supports an optional fairness policy for ordering waiting
producer and consumer threads. By default, this ordering is not
guaranteed. However, a queue constructed with fairness set to true
grants threads access in FIFO order. Fairness generally decreases
throughput but reduces variability and avoids starvation.
In order to set fairness, you must initialize it using the ArrayBlockingQueue(int capacity, boolean fair) constructor:
So, for example:
ArrayBlockingQueue<Chunk> fairQueue = new ArrayBlockingQueue<>(1000, true);
/*.....*/
Chunk c = new Chunk(arr);
fairQueue.add(c);
As the docs state, this should grant thread access in FIFO order, guaranteeing that elements are retrieved consistently while avoiding the possible lock barging (unfair lock acquisition) that can happen with the LinkedBlockingQueue's locking mechanism.
I have a class in which I populate a map, liveSocketsByDatacenter, from a single background thread every 30 seconds inside the updateLiveSockets() method, and I have a method getNextSocket() which will be called by multiple reader threads to get an available live socket; it uses the same map to get that information.
public class SocketManager {
private static final Random random = new Random();
private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
private final AtomicReference<Map<Datacenters, List<SocketHolder>>> liveSocketsByDatacenter =
new AtomicReference<>(Collections.unmodifiableMap(new HashMap<>()));
private final ZContext ctx = new ZContext();
// Lazy Loaded Singleton Pattern
private static class Holder {
private static final SocketManager instance = new SocketManager();
}
public static SocketManager getInstance() {
return Holder.instance;
}
private SocketManager() {
connectToZMQSockets();
scheduler.scheduleAtFixedRate(new Runnable() {
public void run() {
updateLiveSockets();
}
}, 30, 30, TimeUnit.SECONDS);
}
// during startup, making a connection and populate once
private void connectToZMQSockets() {
Map<Datacenters, ImmutableList<String>> socketsByDatacenter = Utils.SERVERS;
// The map in which I put all the live sockets
Map<Datacenters, List<SocketHolder>> updatedLiveSocketsByDatacenter = new HashMap<>();
for (Map.Entry<Datacenters, ImmutableList<String>> entry : socketsByDatacenter.entrySet()) {
List<SocketHolder> addedColoSockets = connect(entry.getKey(), entry.getValue(), ZMQ.PUSH);
updatedLiveSocketsByDatacenter.put(entry.getKey(),
Collections.unmodifiableList(addedColoSockets));
}
// Update the map content
this.liveSocketsByDatacenter.set(Collections.unmodifiableMap(updatedLiveSocketsByDatacenter));
}
private List<SocketHolder> connect(Datacenters colo, List<String> addresses, int socketType) {
List<SocketHolder> socketList = new ArrayList<>();
for (String address : addresses) {
try {
Socket client = ctx.createSocket(socketType);
// Set random identity to make tracing easier
String identity = String.format("%04X-%04X", random.nextInt(), random.nextInt());
client.setIdentity(identity.getBytes(ZMQ.CHARSET));
client.setTCPKeepAlive(1);
client.setSendTimeOut(7);
client.setLinger(0);
client.connect(address);
SocketHolder zmq = new SocketHolder(client, ctx, address, true);
socketList.add(zmq);
} catch (Exception ex) {
// log error
}
}
return socketList;
}
// this method will be called by multiple threads to get the next live socket
// is there any concurrency or thread safety issue or race condition here?
public Optional<SocketHolder> getNextSocket() {
// For the sake of consistency make sure to use the same map instance
// in the whole implementation of my method by getting my entries
// from the local variable instead of the member variable
Map<Datacenters, List<SocketHolder>> liveSocketsByDatacenter =
this.liveSocketsByDatacenter.get();
Optional<SocketHolder> liveSocket = Optional.absent();
List<Datacenters> dcs = Datacenters.getOrderedDatacenters();
for (Datacenters dc : dcs) {
liveSocket = getLiveSocket(liveSocketsByDatacenter.get(dc));
if (liveSocket.isPresent()) {
break;
}
}
return liveSocket;
}
// is there any concurrency or thread safety issue or race condition here?
private Optional<SocketHolder> getLiveSocketX(final List<SocketHolder> endpoints) {
if (!CollectionUtils.isEmpty(endpoints)) {
// The list of live sockets
List<SocketHolder> liveOnly = new ArrayList<>(endpoints.size());
for (SocketHolder obj : endpoints) {
if (obj.isLive()) {
liveOnly.add(obj);
}
}
if (!liveOnly.isEmpty()) {
// The list is not empty so we shuffle it and return the first element
Collections.shuffle(liveOnly);
return Optional.of(liveOnly.get(0));
}
}
return Optional.absent();
}
// Added the modifier synchronized to prevent concurrent modification
// it is needed because to build the new map we first need to get the
// old one, so both must be done atomically to prevent consistency issues
private synchronized void updateLiveSockets() {
Map<Datacenters, ImmutableList<String>> socketsByDatacenter = Utils.SERVERS;
// Initialize my new map with the current map content
Map<Datacenters, List<SocketHolder>> liveSocketsByDatacenter =
new HashMap<>(this.liveSocketsByDatacenter.get());
for (Entry<Datacenters, ImmutableList<String>> entry : socketsByDatacenter.entrySet()) {
List<SocketHolder> liveSockets = liveSocketsByDatacenter.get(entry.getKey());
List<SocketHolder> liveUpdatedSockets = new ArrayList<>();
for (SocketHolder liveSocket : liveSockets) { // LINE A
Socket socket = liveSocket.getSocket();
String endpoint = liveSocket.getEndpoint();
Map<byte[], byte[]> holder = populateMap();
Message message = new Message(holder, Partition.COMMAND);
boolean status = SendToSocket.getInstance().execute(message.getAdd(), holder, socket);
boolean isLive = status;
// is there any problem the way I am using `SocketHolder` class?
SocketHolder zmq = new SocketHolder(socket, liveSocket.getContext(), endpoint, isLive);
liveUpdatedSockets.add(zmq);
}
liveSocketsByDatacenter.put(entry.getKey(),
Collections.unmodifiableList(liveUpdatedSockets));
}
this.liveSocketsByDatacenter.set(Collections.unmodifiableMap(liveSocketsByDatacenter));
}
}
As you can see in my class:
From a single background thread which runs every 30 seconds, I populate liveSocketsByDatacenter map with all the live sockets in updateLiveSockets() method.
And then from multiple threads, I call the getNextSocket() method to give me a live socket available which uses a liveSocketsByDatacenter map to get the required information.
I have my code working fine without any issues and wanted to see if there is any better or more efficient way to write this. I also wanted to get an opinion on thread-safety issues or any race conditions; so far I haven't seen any, but I could be wrong.
I am mostly worried about the updateLiveSockets() method and the getLiveSocketX() method. At LINE A I am iterating over liveSockets, which is a List of SocketHolder, and then making a new SocketHolder object and adding it to another new list. Is this OK here?
Note: SocketHolder is an immutable class.
Neither code B nor code C is thread-safe.
Code B
When you are iterating over the endpoints list to copy it, nothing prevents another thread from modifying it, i.e. from adding and/or removing elements.
Code C
Assuming endpoints is not null, you are doing three calls to the list object: isEmpty, size, and get. There are several problems from a concurrency perspective:
Based on the type List<SocketHolder> of the argument, there is no guarantee that these methods enforce internal changes to the list to be propagated to other threads (memory visibility), let alone race conditions (if the list is modified while your thread executes one of these functions).
Let's suppose that the list endpoints provides the guarantee described just before - e.g. it has been wrapped with Collections.synchronizedList(). In this case, thread safety is still missing because between each of the calls to isEmpty, size, and get, the list can be modified while your thread executes the getLiveSocketX method. This could make your code use an outdated state of the list. For instance, you could use a size returned by endpoints.size() which is no longer valid because an element has been added to or removed from the list.
Edit - after code update
In the code you provided, it seems at first sight that:
You are indeed not co-modifying the endpoints list we were discussing before in the method getLiveSocketX, because the method updateLiveSockets creates a new list liveUpdatedSockets which you populate from the existing liveSockets.
You use an AtomicReference to keep a map of Datacenters to the lists of sockets of interest. The consequence of this AtomicReference is to force memory visibility from this map down to all the lists and their elements. This means that, as a side-effect, you are protected from memory inconsistencies between your "producer" and "consumer" threads (executing updateLiveSockets and getLiveSocket respectively). You are still exposed to race conditions, though - imagine updateLiveSockets and getLiveSocket running at the same time, and consider a socket S whose status has just switched from alive to closed. updateLiveSockets will see the status of socket S as not alive and create a new SocketHolder accordingly. However, getLiveSocket, which is running at the exact same time, will see an outdated state of S, since it will still use the list of sockets which updateLiveSockets is re-creating.
The synchronized keyword used on the method updateLiveSockets does not provide you any guarantee here, because no other part of the code is also synchronized.
To summarize, I would say:
The code of getLiveSocketX as it is written is not inherently thread-safe;
However, the way you copy the lists prevents concurrent modifications, and you are benefiting from a side-effect of the AtomicReference to have the minimal memory-visibility guarantee one would expect, so that getNextSocket sees a consistent list of sockets after it has been generated from another thread;
You are still exposed to the race conditions described in (2), but this may be fine depending on the specification you wish to give the getLiveSocket and getNextSocket methods - you may accept that a socket returned by getLiveSocket turns out to be unavailable and have a retry mechanism.
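For instance, a minimal retry sketch along those lines, reusing the question's sendAsync(long, byte[], Socket, boolean) overload; sendWithRetry and maxAttempts are made-up names:
// Hedged sketch: if the chosen socket turns out to be stale, just pick another one and retry.
public boolean sendWithRetry(long addr, byte[] reco, int maxAttempts) {
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
        Optional<SocketHolder> holder = SocketManager.getInstance().getNextSocket();
        if (holder.isPresent() && sendAsync(addr, reco, holder.get().getSocket(), false)) {
            return true;                                   // sent on a socket that was live at selection time
        }
    }
    return false;                                          // all attempts hit dead or failing sockets
}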
All of that being said, I would thoroughly review and refactor the code to exhibit a more readable and explicit thread-safe consumer/producer pattern. Extra care should be taken with the use of the AtomicReference and the single synchronized, which seem to me to be improperly used - although in the end the AtomicReference does help you, as discussed before.
Goal: To know, as I fork off a thread, which processor it's going to land on. Is that possible? Regardless of whether the underlying approach is valid, is there a good answer to that narrow question? Thanks.
(Right now I need to make a copy of one of our classes for each thread, write to it in that thread and merge them all later. Using a synchronized approach is not possible because my Java expert boss thinks it's a bad idea, and after a lot of discussion I agree. If I knew which processor each thread would land on, I would only need to make as many copies of that class as there are processors.)
We use Apache Spark to spread our jobs across a cluster, but in our application it makes sense to run one big executor and then do some multi-threading of our own on each machine in the cluster.
I could save a lot of deep copying if I knew which processor a thread is being sent to. Is that possible? I threw in our code, but it's probably more of a conceptual question:
When I get down to the "do task" part of compute(), can I know which processor it's running on?
public class TholdExecutor extends RecursiveTask<TholdDropEvaluation> {
final static Logger logger = LoggerFactory.getLogger(TholdExecutor.class);
private List<TholdDropResult> partitionOfN = new ArrayList<>();
private int coreCount;
private int desiredPartitionSize; // will be updated by whatever is passed into the constructor per-chromosome
private TholdDropEvaluation localDropEvaluation; // this DropEvaluation
private TholdDropResult mSubI_DR;
public TholdExecutor(List<TholdDropResult> subsetOfN, int cores, int partSize, TholdDropEvaluation passedDropEvaluation, TholdDropResult mDrCopy) {
partitionOfN = subsetOfN;
coreCount = cores;
desiredPartitionSize = partSize;
// the TholdDropEvaluation needs to be a copy for each thread? It can't be the same one passed to threads ... so ...
localDropEvaluation = makeDECopy(passedDropEvaluation); // THIS NEEDS TO BE A DEEP COPY OF THE DROP EVAL!!! NOT THE ORIGINAL!!
// we never modify the TholdDropResult that is passed in, we just need to read it all on the same JVM/worker, so
mSubI_DR = mDrCopy; // this is purely a reference and can point to the passed in value (by reference, right?)
}
// this makes a deep copy of the TholdDropEvaluation for each thread, we copy the SharingRun's startIndex and endIndex only,
// as LEG events will be calculated during the subsequent dropComparison. The constructor for TholdDropEvaluation must set
// LEG events to zero.
private TholdDropEvaluation makeDECopy(TholdDropEvaluation passedDropEvaluation) {
TholdDropEvaluation tholdDropEvaluation = new TholdDropEvaluation();
// iterate through the SharingRuns in the SharingRunList from the TholdDropEval that was passed in
for (SharingRun sr : passedDropEvaluation.getSharingRunList()) {
SharingRun ourSharingRun = new SharingRun();
ourSharingRun.startIndex = sr.startIndex;
ourSharingRun.endIndex = sr.endIndex;
tholdDropEvaluation.addSharingRun(ourSharingRun);
}
return tholdDropEvaluation;
}
@Override
protected TholdDropEvaluation compute() {
int simsToDo = partitionOfN.size();
UUID tag = UUID.randomUUID();
long computeStartTime = System.nanoTime();
if (simsToDo <= desiredPartitionSize) {
logger.debug("IN MULTI-THREAD compute() --- UUID {}:Evaluating partitionOfN sublist length", tag, simsToDo);
// job within size limit, do the task and return the completed TholdDropEvaluation
// iterate through each TholdDropResult in the sub-partition and do the dropComparison to the reference mSubI_DR,
// writing to the copy of the DropEval in tholdDropEvaluation
for (TholdDropResult currentResult : partitionOfN) {
mSubI_DR.dropComparison(currentResult, localDropEvaluation);
}
} else {
// job too large, subdivide and call this recursively
int half = simsToDo / 2;
logger.info("Splitting UUID = {}, half is {} and simsToDo is {}", tag, half, simsToDo );
TholdExecutor nextExec = new TholdExecutor(partitionOfN.subList(0, half), coreCount, desiredPartitionSize, localDropEvaluation, mSubI_DR);
TholdExecutor futureExec = new TholdExecutor(partitionOfN.subList(half, simsToDo), coreCount, desiredPartitionSize, localDropEvaluation, mSubI_DR);
nextExec.fork();
TholdDropEvaluation futureEval = futureExec.compute();
TholdDropEvaluation nextEval = nextExec.join();
localDropEvaluation.merge(futureEval);
localDropEvaluation.merge(nextEval);
}
logger.info("{} Compute time is {} ns",tag, System.nanoTime() - computeStartTime);
// NOTE: this was inside the else block in Rob's example, but don't we want it outside the block so it's returned
// whether or not the work was split?
return localDropEvaluation;
}
}
Even if you could figure out where a thread would run initially there's no reason to assume it would live on that processor/core for the rest of its life. In all probability for any task big enough to be worth the cost of spawning a thread it won't, so you'd need to control where it ran completely to offer that level of assurance.
As far as I know there's no standard mechanism for controlling mappings from threads to processor cores inside Java. Typically that's known as "thread affinity" or "processor affinity". On Windows and Linux for example you can control that using:
Windows: SetThreadAffinityMask
Linux: sched_setaffinity or pthread_setaffinity_np
so in theory you could write some C and JNI code that allowed you to abstract this enough on the Java hosts you cared about to make it work.
That feels like the wrong solution to the real problem you seem to be facing, because you end up withdrawing options from the OS scheduler, which potentially doesn't allow it to make the smartest scheduling decisions causing total runtime to increase. Unless you're pushing an unusual workload and modelling/querying processor information/topology down to the level of NUMA and shared caches it ought to do a better job of figuring out where to run threads for most workloads than you could. Your JVM typically runs a large number of additional threads besides just the ones you explicitly create from after main() gets called. Additionally I wouldn't like to promise anything about what the JVM you run today (or even tomorrow) might decide to do on its own about thread affinity.
Having said that it seems like the underlying problem is that you want to have one instance of an object per thread. Typically that's much easier than predicting where a thread will run and then manually figuring out a mapping between N processors and M threads at any point in time. Usually you'd use "thread local storage" (TLS) to solve this problem.
Most languages provide this concept in one form or another. In Java it is provided via the ThreadLocal class. There's an example given in the linked documentation:
public class ThreadId {
// Atomic integer containing the next thread ID to be assigned
private static final AtomicInteger nextId = new AtomicInteger(0);
// Thread local variable containing each thread's ID
private static final ThreadLocal<Integer> threadId =
new ThreadLocal<Integer>() {
@Override protected Integer initialValue() {
return nextId.getAndIncrement();
}
};
// Returns the current thread's unique ID, assigning it if necessary
public static int get() {
return threadId.get();
}
}
Essentially there are two things you care about:
When you call get() it returns the value (Object) belonging to the current thread
If you call get() in a thread which currently has no value, it will call the initialValue() method you implement, which allows you to construct or obtain a new object.
So in your scenario you'd probably want to deep copy the initial version of some local state from a read-only global version.
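Concretely, here is a minimal sketch of that idea reusing the question's own makeDECopy(...) deep-copy helper (GLOBAL_EVAL is a hypothetical read-only master copy, and this assumes makeDECopy is made accessible to the initializer, e.g. by making it static):
// Hedged sketch: each thread lazily gets its own private deep copy on first access.
private static final TholdDropEvaluation GLOBAL_EVAL = new TholdDropEvaluation();
private static final ThreadLocal<TholdDropEvaluation> LOCAL_EVAL =
        ThreadLocal.withInitial(() -> makeDECopy(GLOBAL_EVAL));
// inside compute(), write only to this thread's copy:
TholdDropEvaluation myEval = LOCAL_EVAL.get();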
One final point of note: if your goal is to divide and conquer; do some work on lots of threads and then merge all their results to one answer the merging part is often known as a reduction. In that case you might be looking for MapReduce which is probably the most well known form of parallelism using reductions.
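As a toy illustration of that map-then-reduce shape in plain Java (nothing here is specific to the question's classes), each element is transformed independently, possibly on different threads, and the partial results are then merged into a single answer:
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class ReduceDemo {
    public static void main(String[] args) {
        // "map": square each number independently; .parallel() lets the runtime spread the work
        List<Integer> squares = IntStream.rangeClosed(1, 10)
                .parallel()
                .map(i -> i * i)
                .boxed()
                .collect(Collectors.toList());
        // "reduce": merge all partial results into one value
        int sum = squares.stream().mapToInt(Integer::intValue).sum();
        System.out.println(sum); // 385
    }
}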
I want to do a task that I've already completed except this time using multithreading. I have to read a lot of data from a file (line by line), grab some information from each line, and then add it to a Map. The file is over a million lines long so I thought it may benefit from multithreading.
I'm not sure about my approach here since I have never used multithreading in Java before.
I want to have the main method do the reading, and then give each line that has been read to another thread, which will format a String and then give it to yet another thread to put into a map.
public static void main(String[] args)
{
//Some information read from file
BufferedReader br = null;
String line = "";
try {
br = new BufferedReader(new FileReader("somefile.txt"));
while((line = br.readLine()) != null) {
// Pass line to another task
}
// Here I want to get a total from B, but I'm not sure how to go about doing that
} catch (IOException e) {
e.printStackTrace();
}
}
public class Parser extends Thread
{
private Mapper m1;
// Some reference to B
public Parser(Mapper m) {
m1 = m;
}
public void parse(String s, int i) {
// Do some work on s
String key = DoSomethingWithString(s);
m1.add(key, i);
}
}
public class Mapper extends Thread
{
private SortedMap<String, Integer> sm;
private String key;
private int value;
boolean hasNewItem;
public Mapper() {
sm = new TreeMap<String, Integer>();
hasNewItem = false;
}
public void add(String s, int i) {
hasNewItem = true;
key = s;
value = i;
}
public void run() {
while (!Thread.currentThread().isInterrupted()) {
if (hasNewItem) {
// Find if street name exists in map
sm.put(key, value);
hasNewItem = false;
}
}
// I'm not sure how to give the Map back to main.
}
}
I'm not sure if I am taking the right approach. I also do not know how to terminate the Mapper thread and retrieve the map in the main. I will have multiple Mapper threads but I have only instantiated one in the code above.
I also just realized that my Parser class is not really a thread but just another class, since it does not override the run() method, so I am thinking that the Parser class should be some sort of queue.
Any ideas? Thanks.
EDIT:
Thanks for all of the replies. It seems that since I/O will be the major bottleneck, there would be little efficiency benefit from parallelizing this. However, for demonstration purposes, am I going down the right track? I'm still a bit bothered by not knowing how to use multithreading.
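For demonstration purposes only, and with the caveat from the answers below that this may not actually run faster, one conventional shape is a single reader that hands off batches of lines to an ExecutorService, where each task builds a partial map that is merged at the end. parseBatch(...) is a hypothetical helper, and the snippet assumes the enclosing method declares throws Exception:
// Hedged sketch: reader thread + worker pool returning partial maps, merged in main.
ExecutorService pool = Executors.newFixedThreadPool(4);
List<Future<Map<String, Integer>>> partials = new ArrayList<>();
try (BufferedReader br = new BufferedReader(new FileReader("somefile.txt"))) {
    List<String> batch = new ArrayList<>();
    String line;
    while ((line = br.readLine()) != null) {
        batch.add(line);
        if (batch.size() == 10_000) {                          // hand work off in chunks, not per line
            List<String> work = batch;
            partials.add(pool.submit(() -> parseBatch(work))); // parseBatch is hypothetical
            batch = new ArrayList<>();
        }
    }
    List<String> work = batch;
    partials.add(pool.submit(() -> parseBatch(work)));
}
SortedMap<String, Integer> merged = new TreeMap<>();
for (Future<Map<String, Integer>> f : partials) {
    f.get().forEach((k, v) -> merged.merge(k, v, Integer::sum)); // combine the partial counts
}
pool.shutdown();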
Why do you need multiple threads? You only have one disk and it can only go so fast. Multithreading won't help in this case, almost certainly. And if it does, the gain will be very minimal from a user's perspective. Multithreading isn't your problem; reading from a huge file is your bottleneck.
Frequently I/O will take much longer than the in-memory tasks. We refer to such work as I/O-bound. Parallelism may have a marginal improvement at best, and can actually make things worse.
You certainly don't need a different thread to put something into a map. Unless your parsing is unusually expensive, you don't need a different thread for it either.
If you had other threads for these tasks, they might spend most of their time sitting around waiting for the next line to be read.
Even parallelizing the I/O won't necessarily help, and may hurt. Even if your CPUs support parallel threads, your hard drive might not support parallel reads.
EDIT:
All of us who commented on this assumed the task was probably I/O-bound -- because that's frequently true. However, from the comments below, this case turned out to be an exception. A better answer would have included the fourth comment below:
Measure the time it takes to read all the lines in the file without processing them. Compare to the time it takes to both read and process them. That will give you a loose upper bound on how much time you could save. This may be decreased by a new cost for thread synchronization.
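A quick way to take that baseline measurement, as a sketch (the file name is a placeholder and the snippet assumes the enclosing method declares throws IOException):
// Hedged sketch: time a pure read pass, to compare later against read-plus-process.
long start = System.nanoTime();
long lines = 0;
try (BufferedReader br = new BufferedReader(new FileReader("somefile.txt"))) {
    while (br.readLine() != null) {
        lines++;                                           // read only, no parsing and no map insertion
    }
}
System.out.printf("Read %d lines in %d ms%n", lines, (System.nanoTime() - start) / 1_000_000);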
You may wish to read Amdahl's Law. Since the majority of your work is strictly serial (the IO) you will get negligible improvements by multi-threading the remainder. Certainly not worth the cost of creating watertight multi-threaded code.
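To make that concrete with illustrative (not measured) numbers: Amdahl's Law gives speedup = 1 / ((1 - p) + p / n), where p is the parallelizable fraction of the work and n the number of threads. If reading the file accounts for 90% of the total time, then p = 0.1, and even with unlimited threads the ceiling is 1 / 0.9, roughly a 1.1x speedup.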
Perhaps you should look for a new toy-example to parallelise.
I have a Java method that performs two computations over an input set: an estimated and an accurate answer. The estimate can always be computed cheaply and in reliable time. The accurate answer can sometimes be computed in acceptable time and sometimes not (not known a priori ... have to try and see).
What I want to set up is some framework where if the accurate answer takes too long (a fixed timeout), the pre-computed estimate is used instead. I figured I'd use a thread for this. The main complication is that the code for computing the accurate answer relies on an external library, and hence I cannot "inject" Interrupt support.
A standalone test-case for this problem is here, demonstrating my problem:
package test;
import java.util.Random;
public class InterruptableProcess {
public static final int TIMEOUT = 1000;
public static void main(String[] args){
for(int i=0; i<10; i++){
getAnswer();
}
}
public static double getAnswer(){
long b4 = System.currentTimeMillis();
// have an estimate pre-computed
double estimate = Math.random();
//try to get accurate answer
//can take a long time
//if longer than TIMEOUT, use estimate instead
AccurateAnswerThread t = new AccurateAnswerThread();
t.start();
try{
t.join(TIMEOUT);
} catch(InterruptedException ie){
;
}
if(!t.isFinished()){
System.err.println("Returning estimate: "+estimate+" in "+(System.currentTimeMillis()-b4)+" ms");
return estimate;
} else{
System.err.println("Returning accurate answer: "+t.getAccurateAnswer()+" in "+(System.currentTimeMillis()-b4)+" ms");
return t.getAccurateAnswer();
}
}
public static class AccurateAnswerThread extends Thread{
private boolean finished = false;
private double answer = -1;
public void run(){
//call to external, non-modifiable code
answer = accurateAnswer();
finished = true;
}
public boolean isFinished(){
return finished;
}
public double getAccurateAnswer(){
return answer;
}
// not modifiable, emulate an expensive call
// in practice, from an external library
private double accurateAnswer(){
Random r = new Random();
long b4 = System.currentTimeMillis();
long wait = r.nextInt(TIMEOUT*2);
//don't want to use .wait() since
//external code doesn't support interruption
while(b4+wait>System.currentTimeMillis()){
;
}
return Math.random();
}
}
}
This works fine outputting ...
Returning estimate: 0.21007465651836377 in 1002 ms
Returning estimate: 0.5303547292361411 in 1001 ms
Returning accurate answer: 0.008838428149438915 in 355 ms
Returning estimate: 0.7981717302567681 in 1001 ms
Returning estimate: 0.9207406241557682 in 1000 ms
Returning accurate answer: 0.0893839926072787 in 175 ms
Returning estimate: 0.7310211480220586 in 1000 ms
Returning accurate answer: 0.7296754467596422 in 530 ms
Returning estimate: 0.5880164300851529 in 1000 ms
Returning estimate: 0.38605296260291233 in 1000 ms
However, I have a very large input set (in the order of billions of items) to run my analysis over, and I'm uncertain as to how to clean up the threads that do not finish (I do not want them running in the background).
I know that various methods to destroy threads are deprecated with good reason. I also know that the typical way to stop a thread is to use interrupts. However, in this case, I don't see that I can use an interrupt since the run() method passes a single call to an external library.
How can I kill/clean-up threads in this case?
If you know enough about the external library, such as:
never acquires any locks;
never opens any files/network connections;
never involves any I/O whatsoever, not even logging;
then it may be safe to use Thread#stop on it. You could try it and do extensive stress testing. Any resource leaks should manifest themselves soon enough.
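Applied to the question's own test case, that would look roughly like the sketch below. Note that Thread.stop() is deprecated and newer JDKs may throw UnsupportedOperationException instead of stopping the thread, so treat this purely as an illustration of the suggestion above:
// Hedged sketch: fall back to the estimate and forcibly stop the straggler.
try {
    t.join(TIMEOUT);
} catch (InterruptedException ie) {
    Thread.currentThread().interrupt();
}
if (!t.isFinished()) {
    t.stop();                 // only defensible if the library holds no locks, files or sockets
    return estimate;
}
return t.getAccurateAnswer();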
I'd try it and see if it responds to a Thread.interrupt(). Reduce your data, of course, so it doesn't run forever, but if it responds to an interrupt() then you're home free. If the code locks anything, or performs a wait() or sleep(), it will have to handle the InterruptedException, and it's possible the author did the right thing. They may swallow it and continue, but it's possible they didn't.
While technically you can call Thread.stop(), you'll need to know everything about that code to know for sure whether it's safe and whether you won't leak resources. However, doing that research will also clue you into how you could easily modify the code to look for interrupt(). You'll pretty much have to have the source code to audit it to know for sure, which means you could just as easily do the right thing and add the interrupt checks there, with less research than is needed to know whether it's safe to call Thread.stop().
The other option is to cause a RuntimeException in the thread. Try nulling a reference it might use, closing some I/O it depends on (a socket, a file handle, etc.), or modifying the array of data it's walking over by changing the size or nulling out the data. There's probably something you can do to cause it to throw an unhandled exception so that it shuts down.
Extending on the answer by chubbsondubs, if the third-party library uses some well-defined API (such as java.util.List or some library-specific API) to access the input data set, you could wrap the input data set that you pass to the third-party code with a wrapper class that will throw exceptions, e.g. in the List.get method, after a cancel flag is set.
For instance, if you pass a List to your third-party library, then it might be possible to do something along the lines of:
class CancelList<T> implements List<T> {
private final List<T> wrappedList;
private volatile boolean canceled = false;
public CancelList(List<T> wrapped) { this.wrappedList = wrapped; }
public void cancel() { this.canceled = true; }
public T get(int index) {
if (canceled) { throw new RuntimeException("Canceled!"); }
return wrappedList.get(index);
}
// Other List method implementations here...
}
public double getAnswer(List<MyType> inputList) {
CancelList<MyType> cancelList = new CancelList<MyType>(inputList);
AccurateAnswerThread t = new AccurateAnswerThread(cancelList);
t.start();
try {
t.join(TIMEOUT);
} catch (InterruptedException ie) {
Thread.currentThread().interrupt();
}
if (!t.isFinished()) {
cancelList.cancel(); // make the worker fail fast on its next access to the list
}
// Get the result of your calculation here...
}
Of course, this approach depends on a few things:
You must know the third-party code well-enough to know what methods it calls that you can control through input parameters.
The third-party code would need to make frequent calls to these methods throughout the computation process (i.e. it won't work if it copies all the data at once into an internal structure and does its computation there).
Obviously this won't work if the library catches and handles runtime exceptions and continues processing.