The motivation behind this question is to see whether we can make a theoretical load balancer more efficient for edge cases: it first applies its regular strategy of nominating a particular node to route an HTTP request to (say, via round robin), and then "peeks" into the internal state of the system to see whether that node is undergoing garbage collection. If so, the load balancer avoids the node altogether and moves on to the next one.
In the ideal scenario, each node would "emit" its internal state every few seconds via UDP to some message queue, letting the load balancer know which nodes are potentially "radioactive" because they're going through GC (I'm visualizing it as a simple boolean).
The question here is: can I tweak my application to tap into its JVM's internal state and (a) figure out whether we're in GC mode right this instant, and (b) emit this result via some protocol (UDP/HTTP) over the wire to some other entity (like an MQ)?
There are a whole bunch of ways to monitor and report on a VM remotely. A well-known protocol, for example, is SNMP. But this is a very complicated subject.
Implementation sort of depends on your requirements. If you need to be really sure a VM is in a good state, you might need to wrap your application in a wrapper VM that controls the actual VM. This is pretty involved.
Many implementations use the built-in monitoring and profiling interfaces that are exposed as beans to participating applications via JMX. Again, this requires a fair amount of tweaking.
I suppose you could create a worker thread that simply acts as a canary. It broadcasts a ping every X seconds, and if the pinged service misses two or three pings, it assumes the VM is not ready to serve.
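A minimal sketch of such a canary thread (the class name, "alive" payload, and interval handling are my own choices for illustration, not a standard API):

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Daemon thread that sends a small "alive" datagram at a fixed interval. */
public class Heartbeat {

    public static ScheduledExecutorService start(InetSocketAddress monitor,
                                                 long periodMillis) throws Exception {
        DatagramSocket socket = new DatagramSocket();
        byte[] ping = "alive".getBytes(StandardCharsets.UTF_8);
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "heartbeat");
            t.setDaemon(true); // never keep the JVM alive just to ping
            return t;
        });
        scheduler.scheduleAtFixedRate(() -> {
            try {
                socket.send(new DatagramPacket(ping, ping.length, monitor));
            } catch (Exception ignored) {
                // a missed ping is exactly the signal the monitor is watching for
            }
        }, 0, periodMillis, TimeUnit.MILLISECONDS);
        return scheduler;
    }
}
```

The monitor then only needs to track the timestamp of the last ping per node and mark a node as "not ready" after two or three missed intervals.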
The problem is deciding what to do when a VM never seems to come back. Is it the VM, the network, or something else? How do you keep track of the VMs? These are not intractable problems, but they combine in interesting ways to make your life equally interesting.
There are a lot of ways to approach this problem, and each has subtle implications.
Can you do it? Yes.
The GarbageCollectorMXBean can provide notifications of GC events to application code. (For instance, see this article which includes example code for configuring a notification listener and processing the events.)
Given this, you could easily code your application so that key GC events were sent out as UDP messages, and/or regular UDP messages were sent to report the current GC state.
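As a sketch of that idea, something like the following registers a GC-notification listener on every collector and forwards each event as a UDP datagram (the class name, message format, and target-address handling are illustrative assumptions):

```java
import com.sun.management.GarbageCollectionNotificationInfo;

import javax.management.NotificationEmitter;
import javax.management.openmbean.CompositeData;
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

/** Registers a listener on every collector and forwards each GC event as a UDP datagram. */
public class GcUdpReporter {

    public static void start(InetSocketAddress target) throws Exception {
        DatagramSocket socket = new DatagramSocket();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // every platform GarbageCollectorMXBean is also a NotificationEmitter
            ((NotificationEmitter) gc).addNotificationListener((notification, handback) -> {
                if (GarbageCollectionNotificationInfo.GARBAGE_COLLECTION_NOTIFICATION
                        .equals(notification.getType())) {
                    GarbageCollectionNotificationInfo info = GarbageCollectionNotificationInfo
                            .from((CompositeData) notification.getUserData());
                    String msg = info.getGcName() + " " + info.getGcAction()
                            + " " + info.getGcInfo().getDuration() + "ms";
                    byte[] payload = msg.getBytes(StandardCharsets.UTF_8);
                    try {
                        socket.send(new DatagramPacket(payload, payload.length, target));
                    } catch (Exception e) {
                        // best effort: dropping a report is acceptable for a health signal
                    }
                }
            }, null, null);
        }
    }
}
```

Note that the notification arrives *after* the collection, which matters for the caveat below.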
However, if the GC performs a "stop the world" collection, then your code to send out messages will also be stopped, and there is no way around that1. If this is a problem then you probably need to take the "canary" approach ... or switch to a low-pause collector. The "canary" or "heart-beat" approach also detects other kinds of unavailability, which will be relevant to a load balancer. However, the flip-side is that you can also get false positives; e.g. the "heart" is still "beating" but the "patient" is "comatose".
Whether this is actually going to be useful for load balancing purposes is a different question entirely. There is certainly scope for additional failure modes. For instance, if the load balancer misses a UDP message saying that a JVM's GC has finished, then that JVM could effectively drop out of the load balancer's pool.
1 - At least, not within Java. You could conceivably build something on the outside of the JVM that (for example) reads the GC log file and checks OS-level process usage information.
You can write an external application that instruments the JVM, e.g. via dtrace probes, and either sends the events to the load balancer or is queryable by the load balancer.
There isn't a single answer for this, but I don't know where else to ask this question.
I work on a large enterprise system that uses Tomcat to run REST services in containers, managed by Kubernetes.
Tomcat, like really any request processor, has a "max threads" property: if enough requests come in to drive the number of created threads up to that defined limit, additional requests are put into a queue (whose size is limited by another property), and once that queue is full, further requests may be rejected.
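For concreteness, these limits are attributes on Tomcat's HTTP connector in server.xml (the values below are placeholders, not recommendations):

```xml
<!-- server.xml: make the capacity limits explicit rather than effectively infinite.
     maxThreads:     request-processing threads the connector may create
     acceptCount:    backlog of connections queued once all threads are busy
     maxConnections: connections accepted and tracked before the backlog is used -->
<Connector port="8080" protocol="HTTP/1.1"
           connectionTimeout="20000"
           maxThreads="200"
           acceptCount="100"
           maxConnections="8192" />
```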
It's reasonable to consider whether this property should be set to a value that could possibly be reached, or whether it should be set to effective infinity.
There are many scenarios to consider, although the only interesting ones are when traffic is far higher than normal, either from real customer traffic or from malicious DDoS traffic.
In managed container environments, and other similar cases, this also raises the question of how many instances, pods, or containers should be running copies of the service. I would assume you want as few of these as possible, to reduce the duplication of resources for each pod. That would increase the average number of threads in each container, but I would assume that's better than spreading the threads thinly across a larger set of containers.
Some members of my team think it's better to set the "max threads" property to effective infinity.
What are some reasonable thoughts about this?
As a general rule, I'd suggest trying to scale by running more pods (which can easily be scheduled on multiple hosts) rather than by running more threads. It's also easier for the cluster to schedule 16 1-core pods than to schedule 1 16-core pod.
In terms of thread count, it depends a little bit on how much work your process is doing. A typical Web application spends most of its time talking to the database, and does a little bit of local computation, so you could often set it to run 50 or 100 threads but still with a limit of 1.0 CPU, and be effectively using resources. If it's very computation-heavy (it's doing real image-processing or machine-learning work, say) you might be limited to 1 thread per CPU. The bad case is where your process is allocating 16 threads, but the system only actually has 4 cores available, in which case your process will get throttled but you really want it to scale up.
The other important bad state to be aware of is the thread pool filling up. If it does, requests will get queued up, as you note, but if some of those requests are Kubernetes health-check probes, that can result in the cluster recording your service as unhealthy. This can actually lead to a bad spiral where an overloaded replica gets killed off (because it's not answering health checks promptly), so its load gets sent to other replicas, which also become overloaded and stop answering health checks. You can escape this by running more pods, or more threads. (...or by rewriting your application in a runtime which doesn't have a fixed upper capacity like this.)
It's also worth reading about the horizontal pod autoscaler. If you can connect some metric (CPU utilization, thread pool count) to say "I need more pods", then Kubernetes can automatically create more for you, and scale them down when they're not needed.
I have a Java application named 'X'. In a Windows environment, at any given point in time there might be more than one instance of the application running.
I want a common piece of code to be executed sequentially in application 'X', no matter how many instances of the application are running. Is that possible, and how can it be achieved? Any suggestions will help.
Example: I have a class named Executor where a method execute() will be invoked. Assuming there might be two or more instances of the application at any given point in time, how can I have the method execute() run sequentially across the different instances?
Is there something like a lock that can be accessed from both instances, so each can see whether the lock is currently held? Any help?
I think what you are looking for is a distributed lock (i.e. a lock which is visible and controllable from many processes). There are quite a few 3rd party libraries that have been developed with this in mind and some of them are discussed on this page.
Distributed Lock Service
There are also some other suggestions in this post which use a file on the underlying system as a synchronization mechanism.
Cross process synchronization in Java
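As a rough sketch of the file-based approach, java.nio's FileLock gives you an OS-level lock that is visible across processes (the class and lock-file name here are my own invention):

```java
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Path;

import static java.nio.file.StandardOpenOption.CREATE;
import static java.nio.file.StandardOpenOption.WRITE;

/** Serializes a critical section across JVM instances using an OS-level file lock. */
public class CrossProcessLock {

    private final Path lockFile;

    public CrossProcessLock(Path lockFile) {
        this.lockFile = lockFile;
    }

    /** Blocks until no other process holds the lock, then runs the task. */
    public void runExclusively(Runnable task) throws Exception {
        try (FileChannel channel = FileChannel.open(lockFile, CREATE, WRITE);
             FileLock lock = channel.lock()) {   // inter-process, not inter-thread
            task.run();
        }
    }
}
```

One caveat: FileLock guards against *other processes*, not other threads in the same JVM (acquiring it twice in one JVM throws OverlappingFileLockException), so combine it with ordinary synchronization if execute() can also be called concurrently in-process.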
To my knowledge, you cannot do this easily. You could implement TCP calls between the processes... but I wouldn't advise it.
A better approach is to create an external process in charge of executing the task, and to request each task's execution by sending a message to a JMS queue that this executor process consumes.
...Or maybe you don't really need several processes running at the same time; what you might require is just one application with several threads performing things at the same time, with one thread dedicated to the Executor. That way, synchronizing the execute() method (or the whole Executor) would be enough, and would spare you some time.
You cannot achieve this with Executors or anything like that, because the Java virtual machines are separate.
If you really need to synchronize between multiple independent instances, one approach is to dedicate an internal port and implement a simple internal server within the application (look into ServerSocket, or RMI as a full-blown solution if you need extensive communication). The first instance binds to the dedicated application port and becomes the master node. All later instances find the application port taken, but can then use it to make an HTTP (or plain TCP/IP) call to the master node, reporting the activities they need done.
Since you only need to execute some action sequentially, any slave node can ask the master to do it rather than executing it itself.
A potential problem with this approach is that if the user shuts down the master node, it is not trivial for another running node to take its place. If only one node is active at any time (receiving input from the user), it can take over the master role after discovering that the master is not responding and that the port is no longer occupied.
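A rough sketch of the bind-to-become-master idea (the port number, line-based protocol, class name, and the EXECUTED counter are illustrative assumptions, and error handling is minimal):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.atomic.AtomicInteger;

/** First instance to bind the port becomes the master; later instances forward work to it. */
public class SingleExecutor {

    public static int PORT = 45678;   // a port reserved by convention for this application
    public static final AtomicInteger EXECUTED = new AtomicInteger(); // for demonstration only

    private static ServerSocket master; // non-null only in the master instance

    /** Call once at startup; returns true if this instance became the master. */
    public static synchronized boolean start() {
        try {
            master = new ServerSocket(PORT, 50, InetAddress.getLoopbackAddress());
        } catch (IOException portTaken) {
            return false;               // another instance already owns the port
        }
        Thread acceptor = new Thread(() -> {
            while (!master.isClosed()) {
                try (Socket slave = master.accept();
                     BufferedReader in = new BufferedReader(
                             new InputStreamReader(slave.getInputStream()))) {
                    String task;
                    while ((task = in.readLine()) != null) {
                        execute(task);  // every task funnels through the master
                    }
                } catch (IOException ignored) {
                    // a dropped slave connection is not fatal to the master
                }
            }
        }, "master-acceptor");
        acceptor.setDaemon(true);
        acceptor.start();
        return true;
    }

    /** Runs the task here if we are the master, otherwise hands it to the master. */
    public static void submit(String task) throws IOException {
        if (master != null) {
            execute(task);
        } else {
            try (Socket s = new Socket(InetAddress.getLoopbackAddress(), PORT);
                 PrintWriter out = new PrintWriter(s.getOutputStream(), true)) {
                out.println(task);
            }
        }
    }

    private static synchronized void execute(String task) {
        EXECUTED.incrementAndGet();
        System.out.println("executing " + task);
    }
}
```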
A distributed queue could be used for this type of load balancing. You put one or more 'request messages' into a queue, and the next available consumer application picks one up and processes it. Each such request message could describe the task to process.
This type of queue could be implemented as JMS queue (e.g. using ActiveMQ http://activemq.apache.org/), or on Windows there is also MSMQ: https://msdn.microsoft.com/en-us/library/ms711472(v=vs.85).aspx.
If performance is an issue and you have C/C++ developers available, a 'shared memory queue' could also be interesting: shmemq API
I'm using Java and RMI in order to execute 100k Monte Carlo simulations on a cluster of hundreds of cores.
The approach I'm using is to have a client app that invokes RMI processes and divides the simulations across the number of available (RMI) processes on the grid.
Once the simulations have been run, I have to re-aggregate the results.
The only limit I have is that all this has to happen in less than 500ms.
The process is actually in place BUT, randomly, from time to time, one of the RMI calls takes 200ms longer to execute.
I've added loads of logs and timings all over the place, and as possible reasons I've already ruled out:
1) Simulations taking extra time
2) Data transfer (it consistently works; the slowdown only occurs sometimes, and only on a subset of the RMI calls)
3) Transferring results back (I can precisely time the span from the last RMI call's return to the end of the process)
The only thing I cannot measure is whether any of the RMI calls is taking extra time to be initialized (and honestly that's the only cause I can think of). The reason is that, unfortunately, the clocks are not synchronized :(
Is it possible that the RMI remote process gets passivated/detached/collected even though I keep a (Remote) reference to it from the client?
Hope the question is clear enough (I'm pretty much sure it isn't).
Thanks a mil and do not hesitate to make more questions if it is not clear enough.
Regards,
Giovanni
Is it possible that the RMI remote process gets passivated/detached/collected even though I keep a (Remote) reference to it from the client?
Unlikely, but possible. The RMI remote process should not be collected (as the RMI FAQ indicates for VM exit conditions). It could, however, be paged to disk if the OS desired.
Do you have a way to rule out GC calls (other than writing a monitor with JVM TI)?
Also, is your code structured in such a way that you send off all calls from your aggregator asynchronously, have the replies append to a list, and aggregate the results when your critical time is up, even if some processors haven't returned results? I'm assuming that each processor is an independent, random event and that it's safe to ignore some results. If not, disregard.
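That structure can be sketched as scatter-gather with a deadline, here using a plain ExecutorService in place of the actual RMI calls (the class and method names are mine; it assumes each simulation can be wrapped as a Callable and that late results may be safely ignored):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Fires all simulation calls in parallel and aggregates whatever returns in time. */
public class DeadlineAggregator {

    public static double averageWithin(List<Callable<Double>> calls, long budgetMillis)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, calls.size()));
        try {
            // invokeAll cancels any task still running when the budget expires
            List<Future<Double>> futures =
                    pool.invokeAll(calls, budgetMillis, TimeUnit.MILLISECONDS);
            List<Double> results = new ArrayList<>();
            for (Future<Double> f : futures) {
                if (!f.isCancelled()) {
                    try {
                        results.add(f.get());
                    } catch (Exception failed) {
                        // a failed call is treated the same as a late one: ignored
                    }
                }
            }
            double sum = 0;
            for (double r : results) sum += r;
            return results.isEmpty() ? Double.NaN : sum / results.size();
        } finally {
            pool.shutdownNow();
        }
    }
}
```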
I finally tracked down the issue. After ensuring that the stub wasn't getting deallocated and that the GC wasn't being triggered behind the scenes, I used Wireshark to find out whether there was any network issue.
What I found was that, randomly, one of the packets got lost, and TCP needed 120ms (41 retransmissions) on our network to correctly re-transfer the data.
When switching to jdk7, SDP and infiniband, we didn't experience the issue anymore.
So basically the answer to my question was... PACKET LOSS!
Thanks to everyone who replied to the post; it helped me focus on the right path!
Gio
I'm using the synchronous implementation of JRedis, but I'm planning to switch to the asynchronous way of communicating with the Redis server.
But before that, I would like to ask the community whether the JRedisFuture implementation of alphazero's jredis is stable enough for production use.
Is there anybody out there who is using it or have experience with it?
Thanks!
When JRedis gets support for transaction semantics (Redis 1.3.n, JRedis master branch) then certainly, it should be "stable" enough.
The Redis protocol for non-transactional commands (which are themselves atomic) allows a window of unrecoverable failure when a destructive command has been sent and the connection faults during the read phase. The client has NO WAY of knowing whether Redis in fact processed the last request and the response got dropped due to network failure (for example). Even the basic request/reply client is susceptible to this (and I think this is not limited to Java, per se).
Since Redis's protocol does not require any metadata (at all) with the DML- and DDL-type commands (e.g. no command sequence number), this window of failure is opened.
With pipelining, there is no longer a sequential association between the command that is being written and the response that is being read. (The pipe is sending a command that is N commands behind the one that caused Redis to issue the response being read at the same time. If anything goes kaput, there are a LOT of dishes in air :)
That said, every single future object in the pipe will be flagged as faulted and you will know precisely at which response the fault occurred.
Does that qualify as "unstable"? In my opinion, no. That is an issue with pipelining.
Again, Redis 1.3.n with transaction semantics completely addresses this issue.
Outside of that issue, with asynchronous (pipelines), there is a great deal of responsibility on your part for making sure you do not excessively overload the input to the connector. To a huge extent JRedis pipelines protect you from this (since the caller's thread is used to make the network write thus naturally damping the input load on the pending response queue).
But you still need to run tests -- you did say "Production", right? )) -- and size your boxes and put a cap on the number of loading threads on the front end.
I would also tentatively recommend not running more than one JRedis pipeline per multi-core machine. In the existing implementation (which does not chunk the write buffer) there are efficiencies to be gained, in terms of full bandwidth utilization and maximum throughput, by running multiple pipelines to the same server: while one pipeline is busy creating buffers to write, the other is writing, etc.
But these two pipelines will interfere with one another through their inevitable synchronization (remember they are queues, so some form of synchronization must occur) and its periodic cache invalidation (on each dequeue/enqueue in the worst case -- but in Doug Lea we trust). So if pipeline A's average latency is d1 in isolation, then so is pipe B's. Regrettably, running two of them on the same cores results in a system-wide cache-invalidation period that is HALF of the original, so TWICE as many cache invalidations occur (on average). So it is self-defeating. But test under your load conditions, and on your projected production deployment platform.
My Java application provides reference data: it loads lots of data from XML files into hashmaps, and we then request an item from a hashmap by its ID; we have several such hashmaps for different sets of business data.
The problem is that when I execute the same request multiple times, the response times differ considerably: 31ms, 48ms, 72ms, 120ms, 63ms, etc., so there is a large gap between the minimum and maximum execution times. Ideally, I would expect response times like 63ms, 65ms, 61ms, 70ms, 61ms, but in my case the variation for the same request is huge. I used an open-source profiler to check for extra method executions or a memory leak, but as far as I can tell there is no problem there. Please let me know what the reasons could be and how I can address this.
There could be many causes:
Is your Java application restarted for each run? If not, the garbage collector could be kicking in at an unfortunate time. If so, JVM startup time could be responsible for the variations.
Is anything else running on that machine?
Is the disk cache "warmed up" in some cases, but not in others? That is, have the files been recently accessed so that they are still in memory?
If this is a networked application, is there any network activity during the measurements?
If there is a remote machine involved (e.g. a database server or a file server), do the above apply to that machine as well?
Use a profiler to find out which piece of code is responsible for the variations in time.
If you aren't running a real-time system, then you can't be sure your code will execute within a certain time.
OSes constantly do other things, mostly housekeeping tasks and providing services to the rest of the system. This can easily slow the rest of your system down by 50ms.
There also might be time spent waiting for I/O, such as hard disks or network communication.
Besides that, your JVM doesn't make any real-time promises either. This means the garbage collector can run at any moment. Its effect is very small on a normal application, but can be large if you create and discard lots of objects (as you might do when loading many or large files).
Finally, it can be your algorithm (do you run the same data each time?): with different data, you can get different execution times.
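One way to see these effects is to measure the same request many times, discard a warm-up phase (so the JIT and caches have settled), and look at the spread rather than a single sample. A simple sketch, where the class name and the hashmap workload are illustrative:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Measures the same request many times: discards warm-up, reports min/median/max. */
public class LatencySample {

    public static long[] sample(Runnable request, int warmup, int measured) {
        for (int i = 0; i < warmup; i++) {
            request.run();                       // let the JIT compile and caches fill
        }
        long[] samples = new long[measured];
        for (int i = 0; i < measured; i++) {
            long start = System.nanoTime();
            request.run();
            samples[i] = System.nanoTime() - start;
        }
        Arrays.sort(samples);
        // {min, median, max}: the spread shows how noisy the environment is
        return new long[] { samples[0], samples[measured / 2], samples[measured - 1] };
    }

    public static void main(String[] args) {
        Map<String, String> reference = new HashMap<>();
        for (int i = 0; i < 100_000; i++) reference.put("id" + i, "value" + i);
        long[] stats = sample(() -> reference.get("id42"), 10_000, 1_000);
        System.out.printf("min=%dns median=%dns max=%dns%n", stats[0], stats[1], stats[2]);
    }
}
```

If the median is stable but the maximum jumps around, the variation is coming from the environment (GC, OS scheduling, I/O) rather than from your lookup code.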