I'm working on a multi-user Java webapp, where it is possible for clients to use the webapp API to do potentially naughty things, by passing code which will execute on our server in a sandbox.
For example, it is possible for a client to write a tight while(true) loop that impacts the performance of other clients.
Can you guys think of ways to limit the damage caused by these sorts of behaviors to other clients' performance?
We are using Glassfish for our application server.
The halting problem shows that there is no way for a computer to reliably identify code that will not terminate.
The only reliable way to do this is to execute the client code in a separate JVM, which you then ask the operating system to kill when it exceeds its time limit. A JVM that does not time out can be reused for further tasks.
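A minimal sketch of that approach (the jar path, heap cap and timeout below are placeholders, and it assumes Java 8's Process.waitFor(timeout) and destroyForcibly()):

```java
import java.util.concurrent.TimeUnit;

// Sketch: run untrusted client code in a child JVM and kill it if it overruns its budget.
public class SandboxRunner {
    public static int runWithTimeout(String jarPath, long timeoutSeconds) throws Exception {
        Process child = new ProcessBuilder("java", "-Xmx64m", "-jar", jarPath)
                .redirectErrorStream(true)
                .start();
        if (!child.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            child.destroyForcibly();   // ask the OS to kill the runaway JVM
            return -1;                 // signal "timed out" to the caller
        }
        return child.exitValue();      // the child finished within its budget
    }
}
```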
One more idea would be byte-code instrumentation. Before you load the code sent by your client, manipulate it so it adds a short sleep in every loop and for every method call (or method entry).
This avoids clients hogging a whole CPU until they are done. Of course, they still tie up a Thread object (which costs some memory), and the slowdown applies to every client, not only the malicious ones. Maybe make the first few tries free, then scale the waiting time up with each subsequent try (and scale it back down if the thread has to wait for other reasons).
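A rough sketch of that instrumentation idea with the ASM library (assuming ASM is on the classpath; the sandbox/Throttle helper and instrumenting only method entries are simplifications for illustration, since loop headers would need similar treatment at backward branches):

```java
import org.objectweb.asm.ClassVisitor;
import org.objectweb.asm.MethodVisitor;
import org.objectweb.asm.Opcodes;

// Sketch: inject a call to Throttle.pause() at the entry of every method in client bytecode.
public class ThrottlingVisitor extends ClassVisitor {
    public ThrottlingVisitor(ClassVisitor next) {
        super(Opcodes.ASM9, next);
    }

    @Override
    public MethodVisitor visitMethod(int access, String name, String desc,
                                     String signature, String[] exceptions) {
        MethodVisitor mv = super.visitMethod(access, name, desc, signature, exceptions);
        return new MethodVisitor(Opcodes.ASM9, mv) {
            @Override
            public void visitCode() {
                super.visitCode();
                // call a tiny static helper that sleeps briefly and swallows InterruptedException
                visitMethodInsn(Opcodes.INVOKESTATIC, "sandbox/Throttle", "pause", "()V", false);
            }
        };
    }
}
```

Throttle.pause() would just be a static method doing Thread.sleep(1) inside a try/catch, and feeding the rewritten class through a ClassWriter created with COMPUTE_FRAMES recomputes the max stack sizes and stack map frames for you.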
Modern day app servers use Thread Pooling for better performance. The problem is that one bad apple can spoil the bunch. What you need is an app server with one thread, or maybe one process, per request. Of course there are going to be trade-offs, but the OS will take care of making sure that processing time gets allocated evenly.
NOTE: After researching a little more, what you really need is an engine that creates a separate process per request. Otherwise a user can cripple your servlet engine by writing servlets with infinite loops and then posting multiple requests, or he could simply call System.exit in his code and bring everybody down.
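If the client code does have to stay in-process, a SecurityManager can at least veto the System.exit case (a sketch only, not a full sandbox; note that SecurityManager is deprecated in recent JDKs, but it matches the era of this question):

```java
import java.security.Permission;

// Sketch: block untrusted in-process code from calling System.exit.
public class NoExitSecurityManager extends SecurityManager {
    @Override
    public void checkExit(int status) {
        throw new SecurityException("System.exit is not allowed in client code");
    }

    @Override
    public void checkPermission(Permission perm) {
        // allow everything else for the purpose of this sketch
    }
}
```

It would be installed once at startup with System.setSecurityManager(new NoExitSecurityManager()).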
You could use a parent thread to launch each request in a separate thread, as suggested already, but then monitor the CPU time used by the threads via the ThreadMXBean class. The parent thread could then kill any threads that are misbehaving. This assumes, of course, that you can establish some reasonable criteria for how much CPU time a thread should or should not be using. Maybe the rule could be that a certain initial amount of time, plus a certain additional amount per second of wall-clock time, is OK?
I would make these client request threads have lower priority than the thread responsible for monitoring them.
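A sketch of what such a monitor could look like (the CPU budget and polling interval are made-up numbers, and the "kill" here is a cooperative interrupt, since forcibly stopping a thread has no safe general solution):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: a watchdog that interrupts request threads exceeding a CPU-time budget.
public class CpuWatchdog implements Runnable {
    private final ThreadMXBean mx = ManagementFactory.getThreadMXBean();
    private final Map<Thread, Long> startCpu = new ConcurrentHashMap<>();
    private final long budgetNanos;

    public CpuWatchdog(long budgetMillis) {
        this.budgetNanos = budgetMillis * 1_000_000L;
    }

    public void watch(Thread t) {
        startCpu.put(t, mx.getThreadCpuTime(t.getId()));   // record CPU time at registration
    }

    @Override
    public void run() {
        while (!Thread.currentThread().isInterrupted()) {
            for (Map.Entry<Thread, Long> e : startCpu.entrySet()) {
                long cpu = mx.getThreadCpuTime(e.getKey().getId());
                if (cpu < 0 || !e.getKey().isAlive()) {
                    startCpu.remove(e.getKey());            // thread finished, stop tracking it
                } else if (cpu - e.getValue() > budgetNanos) {
                    e.getKey().interrupt();                 // cooperative "kill" of the misbehaving thread
                    startCpu.remove(e.getKey());
                }
            }
            try { Thread.sleep(250); } catch (InterruptedException ie) { return; }
        }
    }
}
```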
Related
I have read that each core can have two threads. So, if my application in prod uses three octa core servers, does that mean my application can only handle 48 concurrent requests? Or am I mixing two different things here?
Would appreciate any clarity here.
In Java you can have as many threads as you like; you are not limited by the number of CPU cores you have. Even if you only had one processor with a single core, you could still write a multi-threaded application.
The scheduler performs context switching: it runs Thread 1 for some time, then Thread 2 for some time, then maybe Thread 1 again, and so on. These switches between threads can occur every few milliseconds, which gives the illusion that the threads are running in parallel.
For some applications it is actually faster to use a single thread, because context switching adds its own overhead.
I did actually write a small multi-threaded application the other day though. It had about 30 threads, and this was a use case where multithreading did make the app more efficient.
I had about 30 URLs that I needed to hit and retrieve some data from. If I had done this in a single thread, the application would have blocked while waiting for each response before making the next request. With multiple threads, other requests could proceed while one thread was waiting on the network.
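Roughly what that looked like (the URL list and timeout are placeholders, and I am using an ExecutorService here rather than 30 hand-rolled threads):

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Sketch: fetch many URLs in parallel so the threads overlap their network waits.
public class ParallelFetcher {
    public static void fetchAll(List<String> urls) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(urls.size()); // ~30 in my case
        for (String u : urls) {
            pool.submit(() -> {
                try (BufferedReader in = new BufferedReader(
                        new InputStreamReader(new URL(u).openStream()))) {
                    String line;
                    while ((line = in.readLine()) != null) {
                        // process each response line here
                    }
                } catch (Exception e) {
                    e.printStackTrace();    // a real app would log and retry
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES); // wait for all fetches to finish
    }
}
```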
I hope this makes sense. It'll be worth reading up on Java Context Switching for more info.
This is a good source on the topic: https://docs.oracle.com/javase/tutorial/essential/concurrency/index.html
I am writing a server side application using Java.
The server holds a number of users of the system. For each user, I want to synchronize their disk space with a remote network storage. Because the synchronizations are independent of each other, I am thinking of doing them concurrently.
I am thinking of creating one thread for each user and letting the synchronization tasks fire at the same time.
But the system can have tens of thousands of users. That means creating tens of thousands of threads and firing them all at the same time. I am not sure that is something the JVM can handle.
And even if it can, will it be memory efficient? Each thread has its own stack, and that could be a big memory hit!
Please let me know your opinion.
Many thanks.
You could look at a fixed-size thread pool, which gives you a bounded set of threads to execute your tasks. That would give the benefit of multithreading with a sensible limit.
Check out Executors.newFixedThreadPool()
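A minimal sketch (User, users and syncUser are placeholders for your own types, and the pool size of 50 is an arbitrary starting point to tune):

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch: cap concurrency with a fixed pool instead of one thread per user.
public class SyncScheduler {
    private final ExecutorService pool = Executors.newFixedThreadPool(50);

    public void scheduleAll(List<User> users) {
        for (User user : users) {
            pool.submit(() -> syncUser(user));   // queued until a pool thread is free
        }
    }

    private void syncUser(User user) {
        // placeholder: synchronize this user's disk space with the remote storage
    }

    interface User { }                           // placeholder type
}
```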
You should look into Non-blocking IO.
Here is a "random" article about it, courtesy of Google:
http://www.developer.com/java/article.php/3837316/Non-Blocking-IO-Made-Possible-in-Java.htm
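To give a flavour of the idea in plain java.nio (a minimal sketch on a placeholder port; frameworks such as MINA and Netty wrap this pattern for you):

```java
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

// Sketch: a single thread servicing many connections through a Selector.
public class NioServer {
    public static void main(String[] args) throws Exception {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(9000));        // placeholder port
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);

        ByteBuffer buffer = ByteBuffer.allocate(4096);
        while (true) {
            selector.select();                           // blocks until some channel is ready
            Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
            while (keys.hasNext()) {
                SelectionKey key = keys.next();
                keys.remove();
                if (key.isAcceptable()) {
                    SocketChannel client = server.accept();
                    client.configureBlocking(false);
                    client.register(selector, SelectionKey.OP_READ);
                } else if (key.isReadable()) {
                    SocketChannel client = (SocketChannel) key.channel();
                    buffer.clear();
                    if (client.read(buffer) < 0) {       // peer closed the connection
                        client.close();
                    }
                    // hand whatever was read to your protocol handler here
                }
            }
        }
    }
}
```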
Personally I wouldn't have tens of thousands of users on a single machine. You won't be able to do much per user with this many users active. You should be able to afford more than one machine.
You can have this many threads in Java, but as you say it is not efficient. You can use an NIO library to manage multiple connections with each thread.
Libraries like the following are suitable:
http://mina.apache.org/
http://www.jboss.org/netty
Also interesting: http://code.google.com/p/nfs-rpc/
I'm using Java and RMI in order to execute 100k Montecarlo Simulations on a cluster of hundreds of cores.
The approach I'm using is to have a client app that invokes RMI processes and divides the simulations among the available (RMI) processes on the grid.
Once the simulations have run, I have to re-aggregate the results.
The only limit I have is that all this has to happen in less than 500ms.
The process is actually in place, BUT randomly, from time to time, one of the RMI calls takes 200ms longer to execute.
I've added loads of logs and timings all over the place, and I've already ruled out the following possible causes:
1) Simulations taking extra time
2) Data transfer (it works consistently; the slowdown only appears sometimes, and only on a subset of RMI calls)
3) Transferring results back (I can clearly time how long it takes from the last RMI call returning to the end of the process)
The only thing I cannot measure is whether any of the RMI calls is taking extra time to be initialized (and honestly that is the only thing I can think of). The reason is that, unfortunately, the clocks are not synchronized :(
Is it possible that the RMI remote process got passivated/detached/collected even though I keep a (Remote) reference to it from the client?
Hope the question is clear enough (I'm pretty much sure it isn't).
Thanks a mil and do not hesitate to make more questions if it is not clear enough.
Regards,
Giovanni
Is it possible that the RMI remote process got passivated/detached/collected even though I keep a (Remote) reference to it from the client?
Unlikely, but possible. The RMI remote process should not be collected (as the RMI FAQ indicates for VM exit conditions). It could, however, be paged to disk if the OS desired.
Do you have a way to rule out GC calls (other than writing a monitor with JVM TI)?
Also, is your code structured in such a way that you send off all calls from your aggregator asynchronously, have the replies appended to a list, and aggregate the results when your critical time is up, even if some processors haven't returned results? I'm assuming that each processor's result is an independent, random sample and that it's safe to ignore a few of them. If not, disregard.
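A sketch of that aggregation pattern (the SimulationWorker interface, pool size and 500ms budget are placeholders for your RMI stubs and timing constraints):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Sketch: fire all remote calls asynchronously, then aggregate whatever has
// returned when the 500ms budget is spent, ignoring stragglers.
public class Aggregator {
    public static List<double[]> collect(List<SimulationWorker> workers) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(workers.size());
        ExecutorCompletionService<double[]> done = new ExecutorCompletionService<>(pool);
        for (SimulationWorker w : workers) {
            done.submit(w::simulate);                     // each task wraps a blocking RMI call
        }

        List<double[]> results = new ArrayList<>();
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(500);
        for (int i = 0; i < workers.size(); i++) {
            Future<double[]> f = done.poll(deadline - System.nanoTime(), TimeUnit.NANOSECONDS);
            if (f == null) break;                         // budget exhausted, drop the stragglers
            results.add(f.get());
        }
        pool.shutdownNow();
        return results;
    }

    interface SimulationWorker {                          // placeholder for your RMI remote interface
        double[] simulate() throws Exception;
    }
}
```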
I finally figured out the issue. After ensuring that the stub wasn't getting deallocated and that the GC wasn't being triggered behind the scenes, I used Wireshark to understand whether there was any network issue.
What I found was that occasionally a packet got lost, and on our network TCP needed 120ms (41 retransmissions) to correctly re-transfer the data.
When we switched to JDK 7, SDP and InfiniBand, we didn't experience the issue any more.
So basically the answer to my question was... PACKET LOSS!
Thanks to everyone who replied to the post; it helped me focus on the right path!
Gio
I might have a problem with my application. There is a client running multiple threads which may execute rather time-consuming calls to the server over Java RMI. Of course a time-consuming call from one client should not block everyone else.
I tested it, and it works on my machine: I created two threads on the client and a dummy call on the server. On startup, both threads call the dummy method, which just does a huge number of sysouts. You can see that these calls are handled in parallel, without blocking.
I was very satisfied until a colleague pointed out that the RMI spec does not necessarily guarantee that behavior.
And indeed, a text on the home page of the University of Lancaster states that
“A method dispatched by the RMI runtime to a remote object
implementation (a server) may or may not execute in a separate thread.
Calls originating from different clients Virtual Machines will execute
in different threads. From the same client machine it is not
guaranteed that each method will run in a separate thread” [1]
What can I do about that? Is it possible that it just won't work in practice?
In theory, yes, you may have to worry about this. In reality, all mainstream RMI implementations multi-thread incoming calls, so unless you are running against some obscure JVM, you don't have anything to worry about.
What that wording means is that you can't assume the calls will all execute in the same thread, so you are responsible for any required synchronization.
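In practice, that means writing the remote implementation as if every call could arrive on its own thread, for example by guarding shared state (a sketch; the counter is just an illustrative piece of shared state):

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;

// Sketch: treat every incoming RMI call as potentially concurrent and guard shared state.
interface Counter extends Remote {
    long increment() throws RemoteException;
}

public class CounterImpl extends UnicastRemoteObject implements Counter {
    private long count;                                  // shared across all client calls

    public CounterImpl() throws RemoteException {
        super();
    }

    @Override
    public synchronized long increment() throws RemoteException {
        return ++count;                                  // synchronized: safe however RMI threads the calls
    }
}
```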
Based on my testing on a Mac laptop, every client request received in parallel seems to be executed on a separate thread (I tried up to a thousand threads without any issues; I don't know if there is an upper bound, though. My guess is that the maximum number of threads will be limited only by memory).
These threads then hang around for some time (a minute or two), in case they can service more clients. If they are unused for some time, they get GC'ed.
Note that I used Thread.sleep() on the server to hold up every request, hence none of the threads could finish the task and move on to another request.
The point is that, if required, the JVM can even allocate a separate thread for each client request. If work is done and threads are free, it could reuse existing threads without creating new ones.
I don't see a situation where any client request would be stuck waiting due to RMI constraints. No matter how many threads on the server are "busy" processing existing requests, new client requests will be received.
These things obviously require close inspection, and availability of the code, to analyze thoroughly and give good suggestions. Nevertheless, that is not always possible, and I hope you can still give me good tips based on the information I provide below.
I have a server application that uses a listener thread to listen for incoming data. The incoming data is interpreted into application specific messages and these messages then give rise to events.
Up to that point I don't really have any control over how things are done.
Because this is a legacy application, these events were previously taken care of by that same listener thread (largely a single-threaded application). The events are sent to a blackbox and out comes a result that should be written to disk.
To improve throughput, I wanted to employ a threadpool to take care of the events. The idea being that the listener thread could just spawn new tasks every time an event is created and the threads would take care of the blackbox invocation. Finally, I have a background thread performing the writing to disk.
With just the previous setup and the background writer, everything works OK and the throughput is ~1.6 times what it was previously.
When I add the thread pool, however, performance degrades. At the start everything seems to run smoothly, but then after a while everything gets very slow and finally I get OutOfMemoryErrors. The weird thing is that when I print the number of active threads each time a task is added to the pool (along with info on how many tasks are queued and so on), it looks as if the thread pool has no problem keeping up with the producer (the listener thread).
Using top -H to check for CPU usage, it's quite evenly spread out at the outset, but at the end the worker threads are barely ever active and only the listener thread is active. Yet it doesn't seem to be submitting more tasks...
Can anyone hypothesize a reason for these symptoms? Do you think it's more likely that there's something in the legacy code (that I have no control over) that just goes bad when multiple threads are added? Presumably the out-of-memory issue is because some queue somewhere grows too large, but since the thread pool almost never contains queued tasks, it can't be that.
Any ideas are welcome, especially ideas on how to diagnose a situation like this more efficiently. How can I get a better profile of what my threads are doing, etc.?
Thanks.
Slowing down then out of memory implies a memory leak.
So I would start by using some Java memory analyzer tools to identify if there is a leak and what is being leaked. Sometimes you get lucky and the leaked object is well-known and it becomes pretty clear who is hanging on to things that they should not.
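If attaching a profiler to the server is awkward, one option (a sketch; the dump path is a placeholder) is to trigger a heap dump from inside the application on a HotSpot JVM and open it afterwards in VisualVM or Eclipse MAT:

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Sketch: programmatic heap dump on HotSpot, for offline analysis of a suspected leak.
public class HeapDumper {
    public static void dump(String filePath) throws Exception {
        HotSpotDiagnosticMXBean diagnostics = ManagementFactory.newPlatformMXBeanProxy(
                ManagementFactory.getPlatformMBeanServer(),
                "com.sun.management:type=HotSpotDiagnostic",
                HotSpotDiagnosticMXBean.class);
        diagnostics.dumpHeap(filePath, true);   // true = dump live objects only, keeps the file smaller
    }
}
```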
Thank you for the answers. I read up on Java VisualVM and used that as a tool. The results and conclusions are detailed below. Hopefully the pictures will work long enough.
I first ran the program and created some heap dumps, thinking I could just analyze the dumps and see what was taking up all the memory. This would probably have worked, except that the dump file got so large that my workstation was of little use in trying to open it. After waiting two hours for one operation, I realized I couldn't do this.
So my next option was something I, stupidly enough, hadn't thought about. I could just reduce the number of messages sent to the application, and the trend of increasing memory usage should still be there. Also, the dump file will be smaller and faster to analyze.
It turns out that when sending messages at a slower rate, no out-of-memory issue occurred! A graph of the memory usage can be seen below.
The peaks are results of cumulative memory allocations and the troughs that follow are after the garbage collector has run. Although the amount of memory usage certainly is quite alarming and there are probably issues there, no long term trend of memory leakage can be observed.
I started to incrementally increase the rate of messages sent per second to see where the application hits the wall. The image below shows a very different scenario than the previous one...
Because this happens when the rate of messages sent is increased, my guess is that freeing up the listener thread lets it accept a lot of messages very quickly, which causes more and more allocations. The garbage collector can't keep up and memory usage hits a wall.
There's of course more to this issue but given what I have found out today I have a fairly good idea of where to go from here. Of course, any additional suggestions/comments are welcome.
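One direction I am considering (the sizes below are placeholders, and this assumes the pool is a plain ThreadPoolExecutor) is to put backpressure on the listener so it cannot outrun the workers:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Sketch: a bounded work queue plus CallerRunsPolicy. When the queue is full the
// listener thread runs the task itself instead of queuing it, which slows message
// intake down to what the workers can actually absorb.
public class BoundedEventPool {
    private final ThreadPoolExecutor pool = new ThreadPoolExecutor(
            8, 8,                               // placeholder worker count
            0L, TimeUnit.MILLISECONDS,
            new ArrayBlockingQueue<>(1000),     // placeholder queue capacity
            new ThreadPoolExecutor.CallerRunsPolicy());

    public void submitEvent(Runnable blackBoxInvocation) {
        pool.execute(blackBoxInvocation);
    }
}
```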
This question should probably be recategorized as dealing with memory usage rather than thread pools... The thread pool wasn't the problem at all.
I agree with #djna.
The thread pool from the java.util.concurrent package works: it does not create threads it does not need, and you can see that the number of threads is as expected. This means that something in your legacy code is probably not ready for multithreading. For example, some code fragment is not synchronized; as a result, some element is not removed from a collection, or extra elements are stored in a collection, so memory usage keeps growing.
BTW, I did not understand exactly which part of the application uses the thread pool now. Did you previously have one thread that processes events, and do you now have several threads doing this? Did you perhaps change the inter-thread communication mechanism? Add queues? This may be yet another direction for your investigation.
Good luck!
As mentioned by djna, it's likely some type of memory leak. My guess would be that you're keeping a reference to the request around somewhere:
In the dispatcher thread that's queuing the requests
In the threads that deal with the requests
In the black box that's handling the requests
In the writer thread that writes to disk.
Since you said everything works fine before you add the thread pool into the mix, my guess would be that the threads in the pool are keeping a reference to the request somewhere. The idea being that, without the thread pool, you aren't reusing threads, so the information goes away.
As recommended by djna, you can use a Java memory analyzer to help figure out where the data is stacking up.