While hunting a performance problem in a Java web app, I would like to profile (via sampling) a single web session. The web app is deployed on a Java EE application server (either JBoss or WLS, so no OpenJDK-specific tooling is possible).
Using a traditional sampler like JVisualVM or YourKit only lets me profile all running threads at once, so it will not reveal the CPU usage of that web session separately from the base load that is already on the server. Part of the problem is that there is (obviously) no technical connection between a web session and a single thread, i.e. every request might be served by a different thread from the application server's thread pool.
My idea is to implement a sampler and register the thread associated with the observed web session every time the server receives a new request for that session (deregistering the thread as soon as the request is finished).
First question: Do you have to do this by hand or is there a tool already available to do it for you?
Second question (as I've found none and assume that this is a rather specific problem): What's the best approach?
Obviously one would try to minimize the impact of the profiling on application performance. Just as obviously, since I need to connect the (technical) thread information to application-specific data (the web-session ID), this does not appear to be something that can be done via JVMTI alone.
That leaves the option of coding an in-app profiler, where a dedicated thread does the sampling via Thread.getStackTrace() or ThreadMXBean.getThreadInfo(). Which is better? Is there another, better option?
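To illustrate what I mean by "registering" the thread, the hook I have in mind is a plain servlet filter along these lines (just a sketch with made-up names, not production code):
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import javax.servlet.*;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpSession;

public class SessionProfilingFilter implements Filter {

    // hypothetical registry that the sampler thread would read from
    static final Map<String, Thread> THREADS_BY_SESSION = new ConcurrentHashMap<>();

    @Override
    public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
            throws IOException, ServletException {
        HttpSession session = ((HttpServletRequest) req).getSession(false);
        String sessionId = (session != null) ? session.getId() : null;
        if (sessionId != null) {
            THREADS_BY_SESSION.put(sessionId, Thread.currentThread()); // register on request start
        }
        try {
            chain.doFilter(req, res);
        } finally {
            if (sessionId != null) {
                THREADS_BY_SESSION.remove(sessionId); // deregister when the request finishes
            }
        }
    }

    @Override
    public void init(FilterConfig filterConfig) { }

    @Override
    public void destroy() { }
}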
JProfiler can split the call tree for different URLs or query parameters.
If you can modify the URL or the query parameters for that purpose, then you will be able to separate the session from the background load. For example, you might add a query parameter profile=true that you only pass for requests that should be profiled.
Disclaimer: My company develops JProfiler.
Have a look at:
http://messadmin.sourceforge.net/
This should help with what you want to profile.
Also, http://www.appdynamics.com/ and http://newrelic.com/ are very nice tools for profiling a web app.
If you don't have JProfiler, coding an in-app profiler is certainly an option. Simply printing stack traces at specific intervals during a single method call already gives a nice snapshot view of what's going on.
Here's an example helper class that does that:
import java.io.Closeable;
import java.io.IOException;
import java.lang.management.ManagementFactory;
public class StackDumper extends Thread implements Closeable {
static final String pid = ManagementFactory.getRuntimeMXBean().getName().replaceFirst("@.*", ""); // getName() returns "pid@hostname"
final Thread t;
final int step;
public StackDumper(int timeStepMs) {
setName("Profiler");
this.t = Thread.currentThread();
this.step = timeStepMs;
start();
}
@Override
public void run() {
StringBuilder buf = new StringBuilder(4096);
for (int i = 0;; i += step) {
try {
Thread.sleep(step);
} catch (InterruptedException e) {
break;
}
buf.setLength(0);
for (StackTraceElement e : t.getStackTrace()) {
buf.append(" ").append(e).append('\n');
}
System.out.print(pid + " #" + i + "ms: [" + t.getName() + "]:\n" + buf);
}
}
@Override
public void close() throws IOException {
interrupt();
}
}
Use it as:
try (StackDumper p = new StackDumper(100)) { // prints stack every 100 ms ...
methodUnderTest();
}
I use a piece of software (AnyLogic) to export runnable jar files that themselves repeatedly re-run a set of simulations with different parameters (so-called parameter variation experiments). The simulations I'm running are very RAM intensive, so I have to limit the number of cores available to the jar file. In AnyLogic, the number of available cores is easily set, but from the Linux command line on the servers, the only way I know to do this is to use the taskset command to manually specify which cores to use (via a CPU affinity "mask"). This has worked very well so far, but since you have to specify individual cores, I'm learning that there can be pretty substantial differences in performance depending on which cores you select. For example, you want to maximize the use of the CPU cache levels, so if you choose cores that share too much cache, you'll get much slower performance.
Since AnyLogic is written in Java, I can use Java code to specify how the simulations run. I'm looking at using the Java ExecutorService to build a pool of individual runs, so that I can just set the pool size to whatever number of cores matches the RAM of the machine I'm using. I'm thinking this would offer a number of benefits, perhaps most importantly that the computer's scheduler can do a better job of selecting the cores to minimize runtime.
In my tests, I built a small AnyLogic model that takes about 10 seconds to run (it just switches between 2 statechart states repeatedly). Then I created a custom experiment with this simple code:
ExecutorService service = Executors.newFixedThreadPool(2);
for (int i=0; i<10; i++)
{
Simulation experiment = new Simulation();
experiment.variable = i;
service.execute( () -> experiment.run() );
}
What I would hope to see is that only 2 Simulation objects start up at a time, since that's the size of the thread pool. But I see all 10 start up and run in parallel over the 2 threads. This makes me think that context switching is happening, which I assume is pretty inefficient.
When, instead of calling the AnyLogic Simulation, I just call a custom Java class (below) in the service.execute function, it seems to work fine, showing only 2 Tasks running at a time.
public class Task implements Runnable, Serializable {
public void run() {
traceln("Starting task on thread " + Thread.currentThread().getName());
try {
TimeUnit.SECONDS.sleep(5);
} catch (InterruptedException e) {
e.printStackTrace();
}
traceln("Ending task on thread " + Thread.currentThread().getName());
}
}
Does anyone know why the AnyLogic function seems to be setting up all the simulations at once?
I'm guessing Simulation extends from ExperimentParamVariation. The key to achieve what you want would be to determine when the experiment has ended.
The documentation shows some interesting methods like getProgress() and getState(), but you would have to poll those methods until the progress is 1 or the state is FINISHED or ERROR. There are also the methods onAfterExperiment() and onError() that should be called by the engine to indicate that the experiment has ended or there was an error. I think you could use these last two methods with a Semaphore to control how many experiments run at once:
import java.util.concurrent.Semaphore;
import com.anylogic.engine.ExperimentParamVariation;
public class Simulation extends ExperimentParamVariation</* Agent */> {
private final Semaphore semaphore;
public Simulation(Semaphore semaphore) {
this.semaphore = semaphore;
}
public void onAfterExperiment() {
this.semaphore.release();
super.onAfterExperiment();
}
public void onError(Throwable error) {
this.semaphore.release();
super.onError(error);
}
// run() cannot be overridden because it is final
// You could create another run method or acquire a permit from the semaphore elsewhere
public void runWithSemaphore() throws InterruptedException {
// This acquire() will block until a permit is available or the thread is interrupted
this.semaphore.acquire();
this.run();
}
}
Then you will have to configure a semaphore with the desired number of permits and pass it to the Simulation instances:
import java.util.concurrent.Semaphore;
// ...
Semaphore semaphore = new Semaphore(2);
for (int i = 0; i < 10; i++)
{
Simulation experiment = new Simulation(semaphore);
// ...
// Handle the InterruptedException thrown here
experiment.runWithSemaphore();
/* Alternative to runWithSemaphore(): acquire the permit and call run().
semaphore.acquire();
experiment.run();
*/
}
Firstly, this whole question has been nullified by what I think is a relatively new addition to AnyLogic's functionality. You can supply an ini file that sets the number of "parallel workers".
https://help.anylogic.com/index.jsp?topic=%2Fcom.anylogic.help%2Fhtml%2Frunning%2Fexport-java-application.html&cp=0_3_9&anchor=customize-settings
But I had managed to find a workable solution just before finding this (better) option. Hernan's answer was almost enough. I think it was hampered by some vagaries of AnyLogic's engine (as I detailed in a comment).
The best version I could muster myself was using ExecutorService. In a Custom Experiment, I put this code:
ExecutorService service = Executors.newFixedThreadPool(2);
List<Callable<Integer>> tasks = new ArrayList<>();
for (int i=0; i<10; i++)
{
int t = i;
tasks.add( () -> simulate(t) );
}
try{
traceln("starting setting up service");
List<Future<Integer>> futureResults = service.invokeAll(tasks);
traceln("finished setting up service");
List<Integer> res = futureResults.stream().parallel().map(
f -> {
try {
return f.get();
} catch (InterruptedException e) {
e.printStackTrace();
} catch (ExecutionException e) {
e.printStackTrace();
}
return null;
}).collect(Collectors.toList());
System.out.println("----- Future Results are ready -------");
System.out.println("----- Finished -------");
} catch (InterruptedException e) {
e.printStackTrace();
}
service.shutdown();
The key here was using the Java Future. Also, to use the invokeAll function, I created a function in the Additional class code block:
public int simulate(int variable){
// Create Engine, initialize random number generator:
Engine engine = createEngine();
// Set stop time
engine.setStopTime( 100000 );
// Create new root object:
Main root = new Main( engine, null, null );
root.parameter = variable;
// Prepare Engine for simulation:
engine.start( root );
// Start simulation in fast mode:
//traceln("attempting to acquire 1 permit on run "+variable);
//s.acquireUninterruptibly(1);
traceln("starting run "+variable);
engine.runFast();
traceln("ending run "+variable);
//s.release();
// Destroy the model:
engine.stop();
traceln( "Finished, run "+variable);
return 1;
}
The only limitation I could see with this approach is that I don't have a waiting loop to output progress every few minutes. But instead of finding a solution to that, I must abandon this work in favor of the much better settings-file solution in the link up top.
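For anyone who does need the progress output, one sketch (untested) would be to submit the tasks individually instead of using invokeAll, and then poll the futures:
List<Future<Integer>> futureResults = new ArrayList<>();
for (Callable<Integer> task : tasks) {
    futureResults.add(service.submit(task));
}
try {
    while (true) {
        long done = futureResults.stream().filter(Future::isDone).count();
        traceln("Finished " + done + " of " + futureResults.size() + " runs");
        if (done == futureResults.size()) break;
        Thread.sleep(60000); // report progress roughly once a minute
    }
} catch (InterruptedException e) {
    e.printStackTrace();
}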
I'm building an interactive LED table with a 14x14 matrix consisting of addressable LED strips for a university assignment. Those are controlled by 2 Arduinos that get the data about which LED should have which RGB value from a Pi running a server, which runs several games that should be playable on the LED table. To control the games I send the respective int codes from an Android app to the server running on the Raspi.
The serial communication is realized with jSerialComm. The problem I'm facing is that I don't want to send data over the serial port permanently, but only when a new array that specifies the matrix is available.
Therefore I don't want to busy-wait and permanently check whether the matrix got updated, nor do I want to poll for an update with
while(!matrixUpdated) {
try {
Thread.sleep(100);
} catch (InterruptedException e) {}
}
So what I've been trying is running a while(true) loop in which I call wait(), so the thread pauses until I wake it up by calling notify() when an updated matrix is available.
My run() method in the serial thread looks like this at the moment:
@Override
public void run() {
arduino1.setComPortTimeouts(SerialPort.TIMEOUT_SCANNER, 0, 0);
arduino2.setComPortTimeouts(SerialPort.TIMEOUT_SCANNER, 0, 0);
try {
Thread.sleep(100);
} catch (Exception e) {}
PrintWriter outToArduino1 = new PrintWriter(arduino1.getOutputStream());
PrintWriter outToArduino2 = new PrintWriter(arduino2.getOutputStream());
while(true) {
try {
wait();
} catch (InterruptedException e) {}
System.out.println("Matrix received");
outToArduino1.print(matrix);
outToArduino2.print(matrix);
}
}
I wake the thread up by this method which is nested in the same class:
public void setMatrix(int[][][] pixelIdentifier) {
matrix = pixelIdentifier;
notify();
}
I also tried notifyAll() which didn't change the outcome.
In one of the games (Tic Tac Toe) I call this method after every game turn to update and send the matrix to the arduinos:
private void promptToMatrix() {
synchronized (GameCenter.serialConnection) {
GameCenter.serialConnection.setMatrix(matrix);
}
}
I previously called it without the synchronized block, but after reading through many articles on that topic on StackOverflow I learned that one should use synchronized for this. I have also read that using wait() and notify() is not recommended, but since the assignment needs to get done quite quickly I don't know whether any other approach makes sense: I don't want to restructure my whole application, as I run up to 5 threads while a game is being played (for communication and so on).
If it is possible to solve this using wait() and notify(), I would be really grateful to hear how, as I have not yet fully understood how to work properly with synchronized blocks.
However, if such a solution is not possible or would also mean restructuring the whole application, I'm open to different suggestions. Simply pointing out that wait() and notify() are not recommended doesn't help me; I've read that often enough and am aware of it, but I'd prefer to use them in this case if possible!
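For what it's worth, my rough understanding of the guarded-block pattern is something like this (just a sketch with made-up field names, not yet wired into my code):
private final Object lock = new Object();
private int[][][] matrix;
private boolean matrixUpdated = false;

public void setMatrix(int[][][] pixelIdentifier) {
    synchronized (lock) {
        matrix = pixelIdentifier;
        matrixUpdated = true;
        lock.notifyAll();               // wake the serial thread
    }
}

@Override
public void run() {
    while (true) {
        int[][][] toSend;
        synchronized (lock) {
            while (!matrixUpdated) {    // guard against spurious wakeups
                try {
                    lock.wait();        // releases the lock while waiting
                } catch (InterruptedException e) {
                    return;
                }
            }
            matrixUpdated = false;
            toSend = matrix;
        }
        // send toSend to both Arduinos outside the lock
    }
}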
EDIT:
The application executes like this:
Main Thread
|--> SerialCommunication Thread --> waiting for updated data
|--> NetworkController Thread
|--> Client Thread --> interacting with the game thread
|--> Game Thread --> sending updated data to the waiting SerialCommunication Thread
Really appreciate any help and thanks in advance for your time!
You are dealing with asynchronous updates possibly coming from different threads; in my opinion the best match is RxJava.
You could use a Subject to receive matrix events and then subscribe to it to update the LEDs.
You can write something like this (don't take it too literally).
public static void main(String[] args) {
int[][] initialValue = new int[32][32];
BehaviorSubject<int[][]> matrixSubject = BehaviorSubject.createDefault(initialValue);
SerialPort arduino1 = initSerial("COM1");
SerialPort arduino2 = initSerial("COM2");
PrintWriter outToArduino1 = new PrintWriter(arduino1.getOutputStream());
PrintWriter outToArduino2 = new PrintWriter(arduino2.getOutputStream());
Observable<String> serializedMatrix = matrixSubject.map(Sample::toChars);
serializedMatrix.observeOn(Schedulers.io()).subscribe(mat -> {
// Will run on a newly created thread
outToArduino1.println(mat);
});
serializedMatrix.observeOn(Schedulers.io()).subscribe(mat -> {
// Will run on a newly created thread
outToArduino2.println(mat);
});
// Wait forever
while(true) {
try {
// get your matrix somehow ...
// then publish it on your subject
// your subscribers will receive the data and use it.
matrixSubject.onNext(matrix);
Thread.sleep(100);
} catch (InterruptedException e) {
// SWALLOW error
}
}
}
public static String toChars(int[][] data) {
// Serialize data
return null;
}
There are many operators that you can use to make it do what you need, and you can use different schedulers to choose among thread policies.
You can also transform your input before publishing it on the subject; an observable or a subject can be created directly from your input.
I am using Jeromq in a multithreaded environment as shown below. In my code the constructor of SocketManager first connects to all the available sockets and puts them in the liveSocketsByDatacenter map (in the connectToZMQSockets method). The same constructor then starts a background thread which runs every 30 seconds and calls the updateLiveSockets method to ping all the sockets already in the liveSocketsByDatacenter map and update that map with whether each socket is alive or not.
The getNextSocket() method is called concurrently by multiple reader threads to get the next live available socket, and we then use that socket to send data. So my question is: are we using Jeromq correctly in a multithreaded environment? We just saw the exception below in our production environment while trying to send data on a live socket, and I am not sure whether it's a bug or something else.
java.lang.ArrayIndexOutOfBoundsException: 256
at zmq.YQueue.push(YQueue.java:97)
at zmq.YPipe.write(YPipe.java:47)
at zmq.Pipe.write(Pipe.java:232)
at zmq.LB.send(LB.java:83)
at zmq.Push.xsend(Push.java:48)
at zmq.SocketBase.send(SocketBase.java:590)
at org.zeromq.ZMQ$Socket.send(ZMQ.java:1271)
at org.zeromq.ZFrame.send(ZFrame.java:131)
at org.zeromq.ZFrame.sendAndKeep(ZFrame.java:146)
at org.zeromq.ZMsg.send(ZMsg.java:191)
at org.zeromq.ZMsg.send(ZMsg.java:163)
Below is my code:
public class SocketManager {
private static final Random random = new Random();
private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
private final Map<Datacenters, List<SocketHolder>> liveSocketsByDatacenter = new ConcurrentHashMap<>();
private final ZContext ctx = new ZContext();
private static class Holder {
private static final SocketManager instance = new SocketManager();
}
public static SocketManager getInstance() {
return Holder.instance;
}
private SocketManager() {
connectToZMQSockets();
scheduler.scheduleAtFixedRate(this::updateLiveSockets, 30, 30, TimeUnit.SECONDS);
}
// during startup, making a connection and populate once
private void connectToZMQSockets() {
Map<Datacenters, List<String>> socketsByDatacenter = Utils.SERVERS;
for (Map.Entry<Datacenters, List<String>> entry : socketsByDatacenter.entrySet()) {
List<SocketHolder> addedColoSockets = connect(entry.getValue(), ZMQ.PUSH);
liveSocketsByDatacenter.put(entry.getKey(), addedColoSockets);
}
}
private List<SocketHolder> connect(List<String> addresses, int socketType) {
List<SocketHolder> socketList = new ArrayList<>();
for (String address : addresses) {
try {
Socket client = ctx.createSocket(socketType);
// Set random identity to make tracing easier
String identity = String.format("%04X-%04X", random.nextInt(), random.nextInt());
client.setIdentity(identity.getBytes(ZMQ.CHARSET));
client.setTCPKeepAlive(1);
client.setSendTimeOut(7);
client.setLinger(0);
client.connect(address);
SocketHolder zmq = new SocketHolder(client, ctx, address, true);
socketList.add(zmq);
} catch (Exception ex) {
// log error
}
}
return socketList;
}
// this method will be called by multiple threads concurrently to get the next live socket
// is there any concurrency or thread safety issue or race condition here?
public Optional<SocketHolder> getNextSocket() {
for (Datacenters dc : Datacenters.getOrderedDatacenters()) {
Optional<SocketHolder> liveSocket = getLiveSocket(liveSocketsByDatacenter.get(dc));
if (liveSocket.isPresent()) {
return liveSocket;
}
}
return Optional.absent();
}
private Optional<SocketHolder> getLiveSocket(final List<SocketHolder> listOfEndPoints) {
if (!CollectionUtils.isEmpty(listOfEndPoints)) {
// The list of live sockets
List<SocketHolder> liveOnly = new ArrayList<>(listOfEndPoints.size());
for (SocketHolder obj : listOfEndPoints) {
if (obj.isLive()) {
liveOnly.add(obj);
}
}
if (!liveOnly.isEmpty()) {
// The list is not empty, so pick a random element from it
return Optional.of(liveOnly.get(random.nextInt(liveOnly.size()))); // just pick one
}
}
return Optional.absent();
}
// runs every 30 seconds to ping all the socket to make sure whether they are alive or not
private void updateLiveSockets() {
Map<Datacenters, List<String>> socketsByDatacenter = Utils.SERVERS;
for (Map.Entry<Datacenters, List<String>> entry : socketsByDatacenter.entrySet()) {
List<SocketHolder> liveSockets = liveSocketsByDatacenter.get(entry.getKey());
List<SocketHolder> liveUpdatedSockets = new ArrayList<>();
for (SocketHolder liveSocket : liveSockets) { // LINE A
Socket socket = liveSocket.getSocket();
String endpoint = liveSocket.getEndpoint();
Map<byte[], byte[]> holder = populateMap();
Message message = new Message(holder, Partition.COMMAND);
// pinging to see whether a socket is live or not
boolean status = SendToSocket.getInstance().execute(message.getAdd(), holder, socket);
boolean isLive = status;
SocketHolder zmq = new SocketHolder(socket, liveSocket.getContext(), endpoint, isLive);
liveUpdatedSockets.add(zmq);
}
liveSocketsByDatacenter.put(entry.getKey(), Collections.unmodifiableList(liveUpdatedSockets));
}
}
}
And here is how I am using getNextSocket() method of SocketManager class concurrently from multiple reader threads:
// this method will be called from multiple threads
public boolean sendAsync(final long addr, final byte[] reco) {
Optional<SocketHolder> liveSockets = SocketManager.getInstance().getNextSocket();
return sendAsync(addr, reco, liveSockets.get().getSocket(), false);
}
public boolean sendAsync(final long addr, final byte[] reco, final Socket socket,
final boolean messageA) {
ZMsg msg = new ZMsg();
msg.add(reco);
boolean sent = msg.send(socket);
msg.destroy();
retryHolder.put(addr, reco);
return sent;
}
public boolean send(final long address, final byte[] encodedRecords, final Socket socket) {
boolean sent = sendAsync(address, encodedRecords, socket, true);
// if the record was sent successfully, then only sleep for timeout period
if (sent) {
try {
TimeUnit.MILLISECONDS.sleep(500);
} catch (InterruptedException ex) {
Thread.currentThread().interrupt();
}
}
// ...
return sent;
}
I don't think this is correct. It seems getNextSocket() could return a 0MQ socket to thread A while, concurrently, the timer thread accesses the same 0MQ socket to ping it. In that case thread A and the timer thread are mutating the same 0MQ socket, which will lead to problems. So what is the best and most efficient way to fix this issue?
Note: SocketHolder is an immutable class
Update:
I just noticed that the same issue happened on another box, with the same ArrayIndexOutOfBoundsException but this time at line 71 of the YQueue file. The only consistent thing is the 256. So something is definitely related to 256, and I cannot figure out what this 256 is.
java.lang.ArrayIndexOutOfBoundsException: 256
at zmq.YQueue.backPos(YQueue.java:71)
at zmq.YPipe.write(YPipe.java:51)
at zmq.Pipe.write(Pipe.java:232)
at zmq.LB.send(LB.java:83)
at zmq.Push.xsend(Push.java:48)
at zmq.SocketBase.send(SocketBase.java:590)
at org.zeromq.ZMQ$Socket.send(ZMQ.java:1271)
at org.zeromq.ZFrame.send(ZFrame.java:131)
at org.zeromq.ZFrame.sendAndKeep(ZFrame.java:146)
at org.zeromq.ZMsg.send(ZMsg.java:191)
at org.zeromq.ZMsg.send(ZMsg.java:163)
Fact #0: ZeroMQ is not thread-safe -- by definition
While the ZeroMQ documentation and Pieter HINTJENS' excellent book "Code Connected, Volume 1" repeat this fact wherever possible, the idea of returning or even sharing a ZeroMQ socket instance among threads appears from time to time. Sure, a class instance's methods may deliver this almost "hidden" inside their internal methods and attributes, but proper design effort ought to prevent any such side effects, with no exceptions and no excuses.
Sharing, if reasonably supported by quantitative facts, may be a way for a common instance of the zmq.Context(), but a crystal-clear distributed system design may live on a truly multi-agent scheme, where each agent operates its own Context()-engine, fine-tuned to the respective mix of configuration and performance preferences.
So what is the best and efficient way to fix this issue?
Never share a ZeroMQ socket. Never, indeed. Even if the newest development started to promise some near future changes in this direction. It is a bad habit to pollute any high-performance, low-latency distributed system design with sharing. Share nothing is the best design principle for this domain.
Yeah, I can see we should not share sockets between threads, but in my code what do you think is the best way to resolve this?
Yeah, the best and efficient way to fix this issue is to never share a ZeroMQ socket.
This means never returning any object whose attributes are ZeroMQ sockets ( which you actively build and return en masse from the .connect(){...} class-method ). In your case, all the class-methods seem to be kept private, which may defuse the problem of allowing "other threads" to touch the class-private socket instances, but the same principle must also be applied at the attribute level to be effective. Finally, this "fusing" gets short-circuited and violated by
public static SocketManager getInstance(),
which promiscuously offers any external caller straight access to the class-private instances of the ZeroMQ sockets.
If some documentation explicitly warns in almost every chapter not to share things, one rather should not share the things.
So, re-design the methods so that SocketManager gains more functionality as its own class-methods, which execute the must-have operations internally, so as to explicitly prevent any external thread from touching a non-shareable socket instance, as documented in the ZeroMQ publications.
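A brief sketch of that direction ( an illustration only, reusing the SocketHolder / getNextSocket() pieces from the question and confining all socket I/O to one dedicated thread behind a queue; the queue-based hand-off is just one of several possible designs ):
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import org.zeromq.ZMsg;

public class SocketManager {
    // fields, connectToZMQSockets(), getNextSocket() etc. stay as in the question,
    // but nothing that holds a ZMQ Socket is ever handed to a caller.

    private final BlockingQueue<byte[]> outbound = new LinkedBlockingQueue<>();

    private SocketManager() {
        connectToZMQSockets();                               // sockets never leave this class
        Thread ioThread = new Thread(this::drainQueue, "zmq-io");
        ioThread.setDaemon(true);
        ioThread.start();
    }

    // the only public entry point: no Socket in the signature
    public boolean sendAsync(byte[] encodedRecords) {
        return outbound.offer(encodedRecords);
    }

    // runs on the single thread that owns the sockets, so no socket is ever shared
    private void drainQueue() {
        while (!Thread.currentThread().isInterrupted()) {
            try {
                byte[] record = outbound.take();
                Optional<SocketHolder> holder = getNextSocket();   // existing selection logic
                if (holder.isPresent()) {
                    ZMsg msg = new ZMsg();
                    msg.add(record);
                    msg.send(holder.get().getSocket());
                    msg.destroy();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }
    // the 30-second ping in updateLiveSockets() should likewise be executed by this
    // thread ( e.g. triggered via a special queue entry ), so the timer never touches a socket directly
}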
Next comes the inventory of resources: your code seems to re-check the state of the world in all DataCenters-of-Interest every 30 seconds. This actually creates new List objects twice a minute. While you may speculatively let the java garbage collector tidy up all the trash that is no longer referenced from anywhere, that is not a good idea for the ZeroMQ-related objects embedded inside the List-s from your previous re-check runs. ZeroMQ objects are still referenced from inside the ZContext() - the ZeroMQ Context()-core-factory instantiated I/O-thread(s), which can also be viewed as the ZeroMQ socket-inventory resources-manager. So, every newly created socket instance gets not only an external handle on the java side, but also an internal handle inside the (Z)Context(). So far so good. But what is not seen anywhere in the code is any method that would de-commission the ZeroMQ sockets in object instances that have become dissociated on the java side yet remain referenced from the (Z)Context() side. Explicit decommissioning of allocated resources is fair design-side practice, all the more so for resources that are limited or otherwise constrained. The way to do this may differ with the { "cheap" | "expensive" }-maintenance costs of such resources-management processing ( ZeroMQ socket instances being remarkably expensive to handle as lightweight "consumable/disposable" items ... but that is another story ).
So, also add a set of proper resource-reuse / resource-dismantling methods that bring the total number of newly created sockets back under your control ( your code is responsible for how many socket handles get created inside the (Z)Context()-domain of resource control, and they must remain managed -- knowingly or not ).
One may object that there are some "promises" of automated detection and ( potentially well-deferred ) garbage collection, but still, your code is responsible for proper resource management, and even the LMAX guys would never have reached such brave performance if they had relied on "promises" from the standard gc. Your problem is way worse than what LMAX top performance had to fight with. Your code ( as published so far ) does nothing to .close() and .term() the ZeroMQ-associated resources at all. That is a straight-out impossible practice inside an ecosystem with uncontrolled (distributed-demand-driven) consumption. You have to protect your boat from getting overloaded beyond a limit you know it can safely handle, and dynamically unload each and every box that has no recipient on the "opposite coast".
That is the Captain's ( your code designer's ) responsibility.
If you do not explicitly tell the sailor in charge of inventory management on the lowest level ( the ZeroMQ Context()-floor ) that some boxes are to be unloaded, the problem remains yours. The standard gc chain-of-command will not do this "automatically", whatever the "promises" might look like. So be explicit with your ZeroMQ resource management: evaluate the return codes from ordering these steps, and handle appropriately any and all exceptions raised from these resource-management operations under your code's explicit control.
Lower ( if not the lowest achievable at all ) resource-utilisation envelopes and higher ( if not the highest achievable at all ) performance are the bonus for doing this job right. The LMAX guys are a good example of doing this remarkably well beyond the standard java "promises", so one can learn from the best of the best.
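A sketch of an explicit tear-down path the class currently lacks ( method names follow jeromq's ZContext API as I understand it -- verify against your version ):
public void shutdown() {
    scheduler.shutdownNow();                                // stop the 30-second ping task first
    for (List<SocketHolder> holders : liveSocketsByDatacenter.values()) {
        for (SocketHolder holder : holders) {
            try {
                holder.getSocket().setLinger(0);            // do not block on undelivered messages
                ctx.destroySocket(holder.getSocket());      // hand the handle back to the (Z)Context inventory
            } catch (Exception ex) {
                // log and continue: one bad socket must not prevent the rest from being closed
            }
        }
    }
    ctx.destroy();                                          // terminate the Context and its I/O thread(s)
}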
Call signatures declared, vs. used, do not seem to match:
While I may be wrong on this point, as most of my design efforts are not in java polymorphic call interfaces, there seems to be a mismatch in a signature, published as:
private List<SocketHolder> connect( Datacenters dc, // 1-st
List<String> addresses, // 2-nd
int socketType // 3-rd
) {
... /* implementation */
}
and the actual method invocation, called inside the connectToZMQSockets() method simply as:
List<SocketHolder> addedColoSockets = connect( entry.getValue(), // 1-st
ZMQ.PUSH // 2-nd
);
I do not have much experience making multi-threaded applications but I feel like my program is at a point where it may benefit from having multiple threads. I am doing a larger scale project that involves using a classifier (as in machine learning) to classify roughly 32000 customers. I have debugged the program and discovered that it takes about a second to classify each user. So in other words this would take 8.8 hours to complete!
Is there any way that I can run 4 threads handling 8000 users each? The first thread would handle users 1-8000, the second 8001-16000, the third 16001-24000, and the fourth 24001-32000. Also, as of now each classification is done by calling a static function from another class...
Then, when they are done, the threads other than the main one should end. Is something like this feasible? If so, I would greatly appreciate it if someone could provide tips or steps on how to do this. I am familiar with the idea of critical sections (wait/signal) but have little experience with them.
Again, any help would be very much appreciated! Tips and suggestions on how to handle a situation like this are welcome! Not sure it matters, but I have a Core 2 Duo PC with a 2.53 GHz processor.
This is too lightweight for Apache Hadoop, which is geared toward roughly 64 MB chunks of data per server... but it's a perfect opportunity for Akka Actors, and Akka just happens to support Java!
http://doc.akka.io/docs/akka/2.1.4/java/untyped-actors.html
Basically, you can have 4 actors doing the work. As each finishes classifying a user (or, probably better, a batch of users), it either passes the result to a "receiver" actor that puts the info into a data structure or a file for output, or you can do concurrent I/O by having each actor write to its own file; the files can then be examined/combined when they're all done.
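Roughly, with the classic untyped-actor API that might look like this (a sketch only -- check the exact Props/actor API against your Akka version; all the domain classes here are placeholders):
import akka.actor.ActorRef;
import akka.actor.ActorSystem;
import akka.actor.Props;
import akka.actor.UntypedActor;

// Worker actor: classifies each customer it receives and forwards the result.
// Customer, Result, Classifier and ResultWriter stand in for your own classes.
public class ClassifierWorker extends UntypedActor {
    @Override
    public void onReceive(Object message) {
        if (message instanceof Customer) {
            Result r = Classifier.classify((Customer) message); // your existing static call
            getSender().tell(r, getSelf());                     // hand the result to the "receiver" actor
        } else {
            unhandled(message);
        }
    }
}

// Wiring it up (e.g. in main):
//   ActorSystem system = ActorSystem.create("classification");
//   ActorRef receiver = system.actorOf(Props.create(ResultWriter.class), "receiver");
//   ActorRef worker   = system.actorOf(Props.create(ClassifierWorker.class), "worker");
//   for (Customer c : customers) {
//       worker.tell(c, receiver);   // "receiver" appears as the sender, so replies go to it
//   }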
If you want to get even more fancy/powerful, you can put the actors on remote servers. It's still really easy to communicate with them, and you'd be leveraging the CPU/resources of multiple servers.
I wrote an article myself on Akka actors, but it's in Scala, so I'll spare you that. But if you google "akka actors", you'll get lots of hand-holding examples on how to use it. Be brave, dive right in and experiment. The "actor system" is such an easy concept to pick up. I know you can do it!
Split the data up into objects that implement Runnable, then pass them to new threads.
Having more than four threads in this case won't kill you, but you cannot get more parallel work than you have cores (as mentioned in the comments): if there are more threads than cores, the system has to decide who gets to go when.
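Rather than hard-coding 4, one option is to size the thread count to the machine:
int threads = Runtime.getRuntime().availableProcessors(); // 2 on a Core 2 Duo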
If I had a class Customer and I wanted to issue a thread to process 8000 customers out of a larger collection, I might do something like this:
public class CustomerClassifier implements Runnable {
    private final Customer[] customers;

    public CustomerClassifier(Customer[] customers) {
        this.customers = customers;
    }

    @Override
    public void run() {
        for (int i = 0; i < customers.length; i++) {
            classify(customers[i]); // critical that this classify function does not
                                    // attempt to modify a resource outside this class
                                    // unless it handles locking, or is talking to a database
                                    // or something that won't throw fits about resource locking
        }
    }
}
then to issue these threads elsewhere
int jobSize = 8000;
Customer[] customers = new Customer[jobSize];
int j = 0;
for (int i = 0; i + j < fullCustomerArray.length; i++) {
    customers[i] = fullCustomerArray[i + j];
    if (i == jobSize - 1) {
        new Thread(new CustomerClassifier(customers)).start(); // run() will be invoked by the thread
        customers = new Customer[jobSize];
        j += jobSize;
        i = -1; // restart the chunk index; the loop increment brings it back to 0
        // (32000 divides evenly by 8000; a partial last chunk would still need to be dispatched after the loop)
    }
}
If your classify method affects a shared resource somewhere, you will have to implement locking, which will also reduce the advantage gained to some degree.
Concurrency is extremely complicated and requires a lot of thought. I also recommend looking at the Oracle docs: http://docs.oracle.com/javase/tutorial/essential/concurrency/index.html
(I know links are bad, but hopefully the oracle docs don't move around too much?)
Disclaimer: I am no expert in concurrent design or in multithreading (different topics).
If you split the input array into 4 equal subarrays for 4 threads, there is no guarantee that all threads finish at the same time. It is better to put all the data in a single queue and let all worker threads feed from that common queue. Use a thread-safe BlockingQueue implementation so you don't have to write low-level synchronize/wait/notify code. See the sketch below.
Since Java 6 we have had some handy utilities for concurrency. You might want to consider using thread pools for a cleaner implementation.
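A rough sketch of that shared-queue idea (Customer and Classifier.classify() stand in for your own classes):
import java.util.Arrays;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class QueueClassification {

    static void classifyAll(Customer[] fullCustomerArray, int workers) throws InterruptedException {
        BlockingQueue<Customer> queue = new LinkedBlockingQueue<>(Arrays.asList(fullCustomerArray));
        Thread[] pool = new Thread[workers];
        for (int w = 0; w < workers; w++) {
            pool[w] = new Thread(() -> {
                Customer c;
                while ((c = queue.poll()) != null) {   // each worker pulls the next unprocessed customer
                    Classifier.classify(c);            // your existing static call
                }
            });
            pool[w].start();
        }
        for (Thread t : pool) {
            t.join();                                  // main thread waits until every worker is done
        }
    }
}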
package com.threads;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
public class ParalleliseArrayConsumption {
private int[] itemsToBeProcessed ;
public ParalleliseArrayConsumption(int size){
itemsToBeProcessed = new int[size];
}
/**
* @param args
*/
public static void main(String[] args) {
(new ParalleliseArrayConsumption(32)).processUsers(4);
}
public void processUsers(int numOfWorkerThreads){
ExecutorService threadPool = Executors.newFixedThreadPool(numOfWorkerThreads);
int chunk = itemsToBeProcessed.length/numOfWorkerThreads;
int start = 0;
List<Future> tasks = new ArrayList<Future>();
for(int i=0;i<numOfWorkerThreads;i++){
tasks.add(threadPool.submit(new WorkerThread(start, start+chunk)));
start = start+chunk;
}
// join all worker threads to main thread
for(Future f:tasks){
try {
f.get();
} catch (InterruptedException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (ExecutionException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
threadPool.shutdown();
// busy-wait until every task has finished; threadPool.awaitTermination(...) would be a cleaner alternative
while (!threadPool.isTerminated()) {
}
}
private class WorkerThread implements Callable{
private int startIndex;
private int endIndex;
public WorkerThread(int startIndex, int endIndex){
this.startIndex = startIndex;
this.endIndex = endIndex;
}
@Override
public Object call() throws Exception {
for(int currentUserIndex = startIndex;currentUserIndex<endIndex;currentUserIndex++){
// process the user. Add your logic here
System.out.println(currentUserIndex+" is the user being processed in thread " +Thread.currentThread().getName());
}
return null;
}
}
}
Is there any way to reboot the JVM? As in don't actually exit, but close and reload all classes, and run main from the top?
Your best bet is probably to run the java interpreter within a loop, and just exit. For example:
#!/bin/sh
while true
do
java MainClass
done
If you want the ability to reboot or shut down entirely, you could test the exit status:
#!/bin/sh
STATUS=0
while [ $STATUS -eq 0 ]
do
java MainClass
STATUS=$?
done
Within the java program, you can use System.exit(0) to indicate that you want to "reboot," and System.exit(1) to indicate that you want to stop and stay stopped.
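Inside the Java program, that convention could be captured with something like this (a trivial sketch):
// exit codes agreed with the wrapper script above
static final int REBOOT = 0;   // the shell loop starts us again
static final int HALT   = 1;   // the shell loop stops

void requestReboot()   { System.exit(REBOOT); }
void requestShutdown() { System.exit(HALT); }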
IBM's JVM has a feature called "resettable" which allows you to effectively do what you are asking.
http://publib.boulder.ibm.com/infocenter/cicsts/v3r1/index.jsp?topic=/com.ibm.cics.ts31.doc/dfhpj/topics/dfhpje9.htm
Other than the IBM JVM, I don't think it is possible.
Not a real "reboot" but:
You can build your own class loader and load all your classes (except a bootstrap) with it. Then, when you want to "reboot", make sure you do the following:
End any threads that you've opened and are using your classes.
Dispose any Window / Dialog / Applet you've created (UI application).
Close / dispose of any other resource-hungry peered objects that are reachable from a GC root or that hold OS resources (database connections, etc.).
Throw away your customized class loader, create another instance of it and reload all the classes. You can probably optimize this step by pre-processing the classes from files so you won't have to access the codebase again.
Call your main point of entry.
This procedure is used (to some extent) while "hot-swapping" webapps in web servers.
Note though, static class members and JVM "global" objects (ones that are reachable from a GC root that isn't under your control) will stay. For example, Locale.setDefault() affects a static member of Locale. Since the Locale class is loaded by the system class loader, it will not be "restarted". That means the old Locale object passed to Locale.setDefault() will still be in effect afterward if not explicitly cleaned up.
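A minimal sketch of that loop with a URLClassLoader (paths and class names are made up; a real bootstrap would also wait for a "reboot requested" signal instead of looping immediately):
import java.io.File;
import java.lang.reflect.Method;
import java.net.URL;
import java.net.URLClassLoader;

public class Rebooter {
    public static void main(String[] args) throws Exception {
        // hypothetical location of the application classes (everything except this bootstrap)
        URL[] codebase = { new File("app-classes").toURI().toURL() };
        while (true) {
            // parent that cannot see the app classes, so each iteration loads fresh copies
            ClassLoader parent = ClassLoader.getSystemClassLoader().getParent();
            try (URLClassLoader loader = new URLClassLoader(codebase, parent)) {
                Class<?> entry = loader.loadClass("com.example.Main");   // hypothetical entry point
                Method main = entry.getMethod("main", String[].class);
                main.invoke(null, (Object) args);                        // "run main from the top"
            }
            // when main returns (after the app has shut its threads/windows down),
            // the loop discards the loader and reloads everything
        }
    }
}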
Yet another route to take is instrumentation of classes. However, since I know little of it, I'm hesitant to offer advice.
Explanation about hot deploy with some examples
If you're working in an application server, they typically come with built-in hot deployment mechanisms that'll reload all classes in your application (web app, enterprise app) when you redeploy it.
Otherwise, you'll have to look into commercial solutions. Java Rebel (http://www.zeroturnaround.com/javarebel/) is one such option.
AFAIK there is no such way.
Notice that if there were a way to do that, it would depend heavily on the currently loaded code properly releasing all held resources in order to provide a graceful restart (think of files, socket/tcp/http/database connections, threads, etc.).
Some applications, like Jboss AS, capture Ctrl+C on the console and provide a graceful shutdown, closing all resources, but this is application-specific code and not a JVM feature.
I do something similar using JMX: I 'unload' a module via JMX and then 'reload' it. Behind the scenes I am sure it uses a different class loader.
Well, I currently have this; it works perfectly and is completely OS-independent. The only thing that must work is executing the java process without any path etc., but I think that can also be fixed.
The little code pieces are all from stackoverflow except RunnableWithObject and restartMinecraft() :)
You need to call it like this:
restartMinecraft(getCommandLineArgs());
So what it basically does, is:
Spawns a new Process and stores it in the p variable
Makes two RunnableWithObject instances, fills the process object into their data value, then starts two threads that just print the input stream and error stream whenever data is available, until the process has exited
Waits for the process to exit
prints debug message about process exit
Terminates with the exit value of the process (not necessary)
And yes it is directly pulled from my minecraft project:)
The code:
Tools.isProcessExited() method:
public static boolean isProcessExited(Process p) {
try {
p.exitValue();
} catch (IllegalThreadStateException e) {
return false;
}
return true;
}
Tools.restartMinecraft() method:
public static void restartMinecraft(String args) throws IOException, InterruptedException {
//Here you can do shutdown code etc
Process p = Runtime.getRuntime().exec(args);
RunnableWithObject<Process> inputStreamPrinter = new RunnableWithObject<Process>() {
@Override
public void run() {
// TODO Auto-generated method stub
while (!Tools.isProcessExited(data)) {
try {
while (data.getInputStream().available() > 0) {
System.out.print((char) data.getInputStream().read());
}
} catch (IOException e) {
}
}
}
};
RunnableWithObject<Process> errorStreamPrinter = new RunnableWithObject<Process>() {
@Override
public void run() {
// TODO Auto-generated method stub
while (!Tools.isProcessExited(data)) {
try {
while (data.getErrorStream().available() > 0) {
System.err.print((char) data.getErrorStream().read());
}
} catch (IOException e) {
}
}
}
};
inputStreamPrinter.data = p;
errorStreamPrinter.data = p;
new Thread(inputStreamPrinter).start();
new Thread(errorStreamPrinter).start();
p.waitFor();
System.out.println("Minecraft exited. (" + p.exitValue() + ")");
System.exit(p.exitValue());
}
Tools.getCommandLineArgs() method:
public static String getCommandLineArgs() {
String cmdline = "";
List<String> l = ManagementFactory.getRuntimeMXBean().getInputArguments();
cmdline += "java ";
for (int i = 0; i < l.size(); i++) {
cmdline += l.get(i) + " ";
}
cmdline += "-cp " + System.getProperty("java.class.path") + " " + System.getProperty("sun.java.command");
return cmdline;
}
Aaaaand finally the RunnableWithObject class:
package generic.minecraft.infinityclient;
public abstract class RunnableWithObject<T> implements Runnable {
public T data;
}
Good luck :)
It's easy in JavaX: You can use the standard functions nohupJavax() or restart().