Spring RabbitMQ Java application correct shutdown

I have a standalone java application based on spring and spring-rabbit libraries. I start it like this:
nohup java -jar myapp.jar &
But sometimes I have to restart the application to upgrade it. Right now I use killall -9 java, but that's not the best way. How do I stop it correctly, and make sure that any requests that reach this app's Rabbit listeners during that period are not partially processed, but are simply rejected and go to another Rabbit consumer?

First of all - do not use killall -9 - it sends a SIGKILL signal to the JVM which cannot be intercepted and will not allow for an orderly shutdown. Instead, use killall or killall -15 (15 is the default signal) that sends SIGTERM which is intercepted by the JVM and allows for an orderly shutdown.
Second of all - do not ACK the RabbitMQ message prematurely - only do so once the message has actually been processed. Until you ACK the message, RabbitMQ will keep it in the "unacked" state. If the consumer dies without ACKing, the message will be put back on the queue for another consumer to pick it up.
Depending on the framework you are using, you may need to register a shutdown hook to close your application in a clean way. For example, if you are using standalone Spring, you should call ConfigurableApplicationContext#registerShutdownHook on the ApplicationContext that you create to make sure all the beans (including RabbitMQ consumers) are closed correctly.
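For illustration, here is a minimal sketch of that pattern for a standalone (non-Boot) Spring application; AppConfig stands in for whatever @Configuration class your app actually uses:
import org.springframework.context.ConfigurableApplicationContext;
import org.springframework.context.annotation.AnnotationConfigApplicationContext;

public class App {
    public static void main(String[] args) {
        // AppConfig is a placeholder for your @Configuration class
        ConfigurableApplicationContext ctx =
                new AnnotationConfigApplicationContext(AppConfig.class);
        // On SIGTERM the JVM runs this hook, which closes the context and
        // stops the Spring AMQP listener containers before the process exits.
        ctx.registerShutdownHook();
    }
}
With this in place, kill -15 triggers a clean close instead of an abrupt death.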

First of all, don't use auto-ack if you are afraid of data loss. Set autoAck = false, then in your consumer ack (or nack) the message only when you have really finished processing it. That way you will not lose data even if your Java process shuts down abruptly (even if the machine goes down): RabbitMQ will keep the message until your client acks it.
Read here: https://www.rabbitmq.com/tutorials/tutorial-two-java.html
in the 'Message acknowledgment' part.
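A minimal sketch of manual acking with the plain Java client (the host and queue name are assumptions; Spring AMQP's listener containers do the equivalent for you under MANUAL/AUTO acknowledge mode):
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;

public class ManualAckConsumer {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost"); // assumed broker host
        Connection connection = factory.newConnection();
        Channel channel = connection.createChannel();

        boolean autoAck = false; // broker keeps the message "unacked" until we ack
        channel.basicConsume("work-queue", autoAck, (consumerTag, delivery) -> {
            try {
                process(delivery.getBody()); // your business logic
                // ack only after processing has fully succeeded
                channel.basicAck(delivery.getEnvelope().getDeliveryTag(), false);
            } catch (RuntimeException e) {
                // nack with requeue=true so another consumer can pick it up
                channel.basicNack(delivery.getEnvelope().getDeliveryTag(), false, true);
            }
        }, consumerTag -> { });
    }

    private static void process(byte[] body) { /* ... */ }
}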
To run your Java process properly, write a bash script that can stop/start/restart your program.

Related

Cloud Run graceful shutdown

I am following up on:
https://cloud.google.com/blog/topics/developers-practitioners/graceful-shutdowns-cloud-run-deep-dive
How to process SIGTERM signal gracefully in Java?
I have a Cloud Run service which runs some while loops that currently do not end after the Cloud Run revision is replaced. I guess the Cloud Run manager waits for the revision to end gracefully, which unfortunately never happens.
I tried adding:
Runtime.getRuntime().addShutdownHook(new Thread(...))
And end the loops with this listener. Unfortunately there are two issues:
It is not possible to register more shutdown hooks (one per running loop) - I get "Hook already running" errors. This is an implementation issue, and I wonder if there is a different way to do this.
When I send SIGTERM or SIGINT to the locally running service, it ends immediately and the shutdown hook is never called. This is a logical issue, and I am not sure this is how Cloud Run ends revisions - it seems not, otherwise the loops would end immediately like on my localhost.
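One common way around the one-hook-per-loop problem (a sketch, not Cloud Run specific) is to register a single hook that flips a shared flag which every loop checks, and have the hook wait for the main thread to drain:
public class Worker {
    private static volatile boolean running = true;

    public static void main(String[] args) {
        Thread mainThread = Thread.currentThread();
        // A single hook serves every loop: it flips the flag and then waits
        // for the work to drain, since the JVM exits once all hooks return.
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            running = false;
            try {
                mainThread.join(10_000); // give the loops up to 10s to finish
            } catch (InterruptedException ignored) { }
        }));

        while (running) {
            // ... one unit of work per iteration ...
        }
        // flush/close resources here; the hook waits for this to complete
    }
}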

Linux Shutdown Order

Gracefully shutting down a system (using the shutdown command) terminates all the services registered under systemd in order, and also sends a kill signal to all other running processes to give them a chance to shut down gracefully.
Is there any specific order in which the kill signal is sent to processes that are not registered as systemd services?
Is there any ordering between systemd service shutdown and the kill signal sent to other processes?
I have a Java application process running on a VM and want it terminated only after a particular service registered under systemd has terminated. Is there any other way to achieve this?
I would not count on any ordering for non-service processes, because it might not exist or may depend on OS flavor/version.
How about creating your own service which controls the process start/stop order and adding it to the system?
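A sketch of that idea (other.service is a placeholder for the service you depend on): wrap the Java process in its own unit and order it Before= the other service. Since systemd stops units in the reverse of their start order, this unit is only stopped after other.service has stopped:
# /etc/systemd/system/myjava.service
[Unit]
Description=Java worker
# Started before other.service; shutdown runs in reverse order,
# so this unit is stopped only after other.service has stopped.
Before=other.service

[Service]
ExecStart=/usr/bin/java -jar /opt/myapp/myapp.jar
# systemd sends SIGTERM by default; allow time for a clean shutdown
TimeoutStopSec=90

[Install]
WantedBy=multi-user.target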

wait with systemd until a service socket becomes available and then start a depended service

Currently I have a slow-starting Java service in systemd which takes about 60 seconds until it opens its HTTP port and can serve clients.
Another service is a client of this one and expects it to be available, otherwise it dies after a certain number of retries. It is also started with systemd, and it uses the former service like a database.
Can I configure systemd to wait until the first service has made its socket available? (Something like: only start the second, client service once the socket is actually listening.)
Initialization Process Requires Forking
systemd waits for a daemon to initialize itself if the daemon forks. In your situation, that's pretty much the only way you have to do this.
The daemon offering the HTTP service must do all of its initialization in the main thread, once that initialization is done and the socket is listening for connections, it will fork(). The main process then exits. At that point systemd knows that your process was successfully initialized (exit 0) or not (exit 1).
Such a service uses the Type=... value of forking, as follows:
[Service]
Type=forking
...
Note: If you are writing new code, consider not using fork. systemd already creates a new process for you so you do not have to fork. That was an old System V boot requirement for services.
"Requires" will make sure the process waits
The other services have to wait, so they have to require the first one to be started - and also be ordered after it, since Requires= alone does not imply ordering. Say your first service is called A; you would add:
[Unit]
...
Requires=A
After=A
...
Program with Patience in Mind
Of course, there is always another way which is for the other services to know to be patient. That means try to connect to the HTTP port, if it fails, sleep for a bit (in your case, 1 or 2 seconds would be just fine) then try again, until it works.
I have developed both methods and they both work very well.
Note: a powerful aspect of this method is that if service A gets restarted, you get a new socket. The client can then auto-reconnect to the new socket when it detects that the old one has gone down. This means you don't have to restart the other services when restarting service A. I like this method, but it's a bit more work to make sure it's all properly implemented.
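A minimal sketch of that patient-client loop (host, port and delays are assumptions):
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PatientClient {
    // Retry the TCP connect until the slow-starting service is listening.
    static Socket connectWithRetry(String host, int port) throws InterruptedException {
        while (true) {
            Socket s = new Socket();
            try {
                s.connect(new InetSocketAddress(host, port), 2_000);
                return s; // connected: the service is up
            } catch (IOException e) {
                try { s.close(); } catch (IOException ignored) { }
                Thread.sleep(2_000); // not listening yet; wait and try again
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Socket socket = connectWithRetry("127.0.0.1", 8080);
        System.out.println("connected: " + socket);
    }
}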
Use the systemd Auto-Restart Feature?
Another way, maybe, would be to use restart on failure. If the child attempts to connect to that HTTP service and fails, it should exit with a failure, right? systemd can then automatically restart your process over and over again until it succeeds. It's sucky, but if you have no control over the code of those daemons, it's probably the easiest way.
[Service]
...
Restart=on-failure
RestartSec=10
# if success is not always just 0:
#SuccessExitStatus=3 7
...
This example waits 10 seconds after a failure before attempting to restart.
Hack (last resort, not recommended)
You could attempt a hack, although I never recommend such things because something could happen that breaks it... change the services so that they sleep 60 before starting the main process. For that, just write a script like so:
#!/bin/sh
sleep 60
exec "$@"  # replace the shell with the service so systemd tracks the right PID
Then in the .service files, call that script as in:
ExecStart=/path/to/script /path/to/service args to service
This will run the script instead of your code directly. The script first sleeps for 60 seconds and then tries to run your service. So if for some reason the HTTP service takes 90 seconds this time... it will still fail.
Still, this can be useful to know since that script could do all sorts of things, such as use the nc tool to probe the port before actually starting the service process. You could even write your own probing tool.
#!/bin/sh
while true
do
    sleep 1
    if probe
    then
        break
    fi
done
exec "$@"
However, notice that such a loop is blocking until probe returns with exit code 0.
You have several options here.
Use a socket unit
The most elegant solution is to let systemd manage the socket for you. If you control the source code of the Java service, change it to use System.inheritedChannel() instead of allocating its own socket, and then use systemd units like this:
# example.socket
[Socket]
ListenStream=%t/example
[Install]
WantedBy=sockets.target
# example.service
[Service]
ExecStart=/usr/bin/java ...
StandardInput=socket
StandardOutput=socket
StandardError=journal
systemd will create the socket immediately (%t is the runtime directory, so in a system unit, the socket will be /run/example), and start the service as soon as the first connection attempt is made. (If you want the service to be started unconditionally, add an Install section to it as well, with WantedBy=multi-user.target.) When your client program connects to the socket, it will be queued by the kernel and block until the server is ready to accept connections on the socket. One additional benefit from this is that you can restart the service without any downtime on the socket – connection attempts will be queued until the restarted service is ready to accept connections again.
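For illustration, a sketch of the server side under the inetd-style setup above; note that with a Unix socket like %t/example this assumes a JDK with Unix-domain channel support (Java 16+):
import java.nio.channels.Channel;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

public class ActivatedServer {
    public static void main(String[] args) throws Exception {
        // With StandardInput=socket, systemd hands the listening socket to
        // the JVM, and System.inheritedChannel() returns it.
        Channel inherited = System.inheritedChannel();
        if (!(inherited instanceof ServerSocketChannel)) {
            throw new IllegalStateException("expected a listening socket from systemd");
        }
        ServerSocketChannel server = (ServerSocketChannel) inherited;
        while (true) {
            try (SocketChannel client = server.accept()) {
                // ... handle one connection ...
            }
        }
    }
}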
Make the service signal readiness to systemd
Alternatively, you can set up the service so that it signals to systemd when it is ready, and order the client after it. (Note that this requires After=example.service, not just Requires=example.service! Dependencies and ordering are orthogonal – without After=, both will be started in parallel.) There are two main service types that might make this possible:
Type=forking: systemd will consider the service to be ready as soon as the main program exits. Since you can’t fork in Java, I think you would have to write a small shell script which starts the server in the background and then waits until the socket is available (while ! test -S /run/example; do sleep 1s; done). Once the script exits, the service is considered ready.
Type=notify: systemd will wait for a message from the service before it is considered ready. Ideally, the message should be sent from the service PID itself: check if you can call the sd_notify function from libsystemd via JNI/JNA/whatever (specifically, sd_notify(0, "READY=1")). If that’s not possible, you can use the systemd-notify command-line tool (--ready option), but then you need to set NotifyAccess=all in the service unit (by default, only the main process may send notifications), and even then it likely will not work (systemd needs to process the message before systemd-notify exits, otherwise it will not be able to verify which cgroup the message came from).
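A sketch of the small wrapper script described for the Type=forking option (paths are assumptions): it backgrounds the server, waits for the socket to appear, then exits 0 so systemd considers the service ready:
#!/bin/sh
# start the Java server in the background
/usr/bin/java -jar /opt/example/server.jar &
# wait until the Unix socket exists, then exit so systemd marks us ready
while ! test -S /run/example
do
    sleep 1
done
exit 0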

is there a java pattern for a process to constantly run to poll or listen for messages off a queue and process them?

We are planning on moving a lot of our single-threaded, synchronous batch processing jobs to a more distributed architecture with workers. The thought is to have a master process read records off the database and send them to a queue, then have multiple workers read off the queue to process the records in parallel.
Is there any well-known Java pattern for a simple CLI/batch job that constantly runs to poll/listen for messages on queues? I would like to use that for all the workers. Or is there a better way to do this? Should the listener/worker be deployed in an app container, or can it just be a standalone program?
Thanks.
Edit: also note that I'm not looking to use JavaEE/JMS, but rather hosted solutions like SQS, a hosted RabbitMQ, or IronMQ.
If you're using a JavaEE application server (and if not, you should), you don't have to program that logic by hand since the application server does it for you.
You then implement and deploy a message driven bean that listens to a queue and processes the message received. The application server will manage a connection pool to listen to queue messages and create a thread with an instance of your message driven bean which will receive the message and be able to process it.
The messages will be processed concurrently since the application server will have a connection pool and a thread pool available to listen to the queue.
All JavaEE-featured application servers like IBM WebSphere or JBoss have configuration available in their admin consoles to create message queue listeners, depending on the message queue implementation, and to bind these listeners to your message-driven bean.
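A minimal sketch of such a message-driven bean (the queue name is an assumption):
import javax.ejb.ActivationConfigProperty;
import javax.ejb.MessageDriven;
import javax.jms.JMSException;
import javax.jms.Message;
import javax.jms.MessageListener;
import javax.jms.TextMessage;

@MessageDriven(activationConfig = {
    @ActivationConfigProperty(propertyName = "destinationType", propertyValue = "javax.jms.Queue"),
    @ActivationConfigProperty(propertyName = "destination", propertyValue = "jms/workQueue")
})
public class WorkListener implements MessageListener {
    @Override
    public void onMessage(Message message) {
        try {
            if (message instanceof TextMessage) {
                String body = ((TextMessage) message).getText();
                // ... process the record ...
            }
        } catch (JMSException e) {
            throw new RuntimeException(e); // lets the container redeliver
        }
    }
}
The container instantiates this bean, manages the connection and thread pools, and calls onMessage concurrently as messages arrive.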
I don't know a lot about this, and maybe I'm not really answering your question, but I tried something a few months ago that might interest you for dealing with message queues.
You can have a look at this: http://www.rabbitmq.com/getstarted.html
It seems the Work Queues tutorial could fit your requirements.

How do I stop losing messages on MQ

I am writing a Java application, running in a Linux environment, that does transactions on a queue using SYNCPOINT. It uses the WebSphere MQ classes for Java to interact with the MQ service. What I am doing in my code is the following (pseudocode):
MQGetMessageOptions gmo = new MQGetMessageOptions();
gmo.options = MQConstants.MQGMO_FAIL_IF_QUIESCING | MQConstants.MQGMO_SYNCPOINT;
MQMessage message = new MQMessage();
queue.get(message, gmo);
// process the message, save to database
databaseConnection.commit();
queueManager.commit();
I basically grab the message, process it, persist to database, then call a commit on the queueManager. The process listens for a message on TIBRV in order to do a graceful shutdown.
I've been testing the process to make sure no messages are lost. I place 20k messages on a queue, then run the process. I perform a graceful shutdown call in the middle of processing. I then compare the amount of messages on the queue versus the amount of messages in database. When a graceful shutdown occurs via TIBRV message, the number of MQ messages + the number of DB messages = total messages originally on the queue.
However, when I do a kill or kill -9, I see that a message is lost. I always end up with a total of 19,999 messages.
Is there a way I can investigate how I am losing this message? Is there anything that occurs on the Websphere App Server that I would need to be aware of?
There is no reason to expect the numbers to reconcile when using single-phase commit. The program will always be between the WMQ and DB Commit call or else between the DB and WMQ Commit call when you kill it.
What you are asking for would require 2-Phase (XA) Commit. For WMQ, 2PC would require the application to be using bindings mode and for WMQ to be the resource coordinator. You would then call MQBEGIN, execute your WMQ and DB updates, then call MQCOMMIT. This way both the WMQ and DB transactions would succeed or fail together.
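A heavily hedged sketch of that flow with the WebSphere MQ classes for Java (the queue manager name and the saveToDatabase helper are placeholders; for the database work to actually be coordinated, the JDBC connection must be obtained through the queue manager, e.g. via its getJDBCConnection support, which requires bindings mode):
MQQueueManager qmgr = new MQQueueManager("QMGR1"); // placeholder name
qmgr.begin(); // MQBEGIN: start a global unit of work coordinated by MQ
try {
    MQGetMessageOptions gmo = new MQGetMessageOptions();
    gmo.options = MQConstants.MQGMO_FAIL_IF_QUIESCING | MQConstants.MQGMO_SYNCPOINT;
    MQMessage message = new MQMessage();
    queue.get(message, gmo);
    saveToDatabase(message); // placeholder: DB work enlisted in the same UOW
    qmgr.commit();           // MQCMIT: WMQ and DB commit together
} catch (Exception e) {
    qmgr.backout();          // roll both back on failure
}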
Are you connecting to MQ in bindings mode or client mode? In my experience, transactions didn't work out of the box in client mode, but they did in bindings mode.
