I am working on a Java Spring web service that uses @SqsListener to handle items on an SQS queue. My cluster is configured to start and stop tasks based on the depth of the SQS queue. This part works fine.
I am worried about a task being terminated while the previously dequeued item is still being processed.
Does Spring support this in any way? Is there a preferred way of handling this scenario?
You have to implement that yourself in your tasks. It is done through ECS task scale-in protection, and AWS provides examples of how to do it:
Elastic Container Service (ECS) Task Protection Examples
The examples are not written for Spring, but you have to implement something similar yourself in your application.
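As a rough illustration of what "something similar" could look like in Java, the sketch below toggles protection through the ECS agent's task protection endpoint (a PUT to $ECS_AGENT_URI/task-protection/v1/state with a ProtectionEnabled flag). The TaskProtection class and its method names are hypothetical, and the sketch assumes only one message is processed at a time per task; with concurrent listeners you would need to count in-flight messages before clearing protection.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper: enables ECS task scale-in protection while a message is processed.
final class TaskProtection {
    private final HttpClient client = HttpClient.newHttpClient();
    private final String agentUri; // expected to be the value of the ECS_AGENT_URI env variable

    TaskProtection(String agentUri) {
        this.agentUri = agentUri;
    }

    // Builds the PUT request for the agent's task protection endpoint.
    static HttpRequest buildRequest(String agentUri, boolean enabled) {
        String body = "{\"ProtectionEnabled\":" + enabled + "}";
        return HttpRequest.newBuilder(URI.create(agentUri + "/task-protection/v1/state"))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    // Sends the request and reports whether the agent answered with HTTP 200.
    boolean setProtection(boolean enabled) throws Exception {
        HttpResponse<String> response =
                client.send(buildRequest(agentUri, enabled), HttpResponse.BodyHandlers.ofString());
        return response.statusCode() == 200;
    }

    // Runs one unit of work with protection enabled and always clears it afterwards.
    void runProtected(Runnable work) throws Exception {
        if (!setProtection(true)) {
            throw new IllegalStateException("could not enable task protection");
        }
        try {
            work.run();
        } finally {
            setProtection(false);
        }
    }
}
```

Inside the listener method, the message handling would then be wrapped as `new TaskProtection(System.getenv("ECS_AGENT_URI")).runProtected(() -> process(message))`, where `process` stands for your own handler.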
I'm coming from the PHP/Python/JS environment where it's a standard to run multiple instances of web application as separate processes and asynchronous tasks like queue processing as separate scripts.
E.g. in a k8s environment, there would be:
N instances of web server only, each running in separate pod
For each queue, dynamic number of consumers, each in separate pod
Cron scheduling using the k8s CronJob functionality, leaving the scheduling process to k8s
Such an approach suits the cloud well, where the workload can be scheduled across either a small number of powerful machines or many less powerful ones, and it allows very fine control of autoscaling (based on the number of messages in a specific queue, for example).
Also, there is a clear separation between the developer and DevOps responsibility.
Recently, I tried to replicate the same setup with a Java Spring Boot application and failed miserably.
Even though Java frameworks claim to be "cloud native", it seems like all the documentation is still built around a monolithic application that handles all consumers and cron scheduling in separate threads.
The clear answer to this problem is microservices, but that's way out of scope.
What I need is to deploy separate parts of application (like 1 queue listener only) per pod in the cloud yet keep the monolith code architecture.
So, the question is:
How do I design my Spring Boot application so that:
I can run the webserver separately without queue listeners and scheduled jobs
I can run one queue listener per pod in the k8s
I can use k8s cron scheduling instead of App level Spring scheduler?
I found several ways to achieve something like this but I expect there must be some "more or less standard way".
Alternative solutions that came to my mind:
Having separate module with separate Application definition so that each "command" is built separately
Using Spring Profiles to instantiate specific services only according to some environment variables
Implement custom command line runner which would parse command name/queue name and dynamically create appropriate services (this seems to be the most similar approach to the way how it's done in "scripting languages")
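To make the third alternative concrete, here is a minimal, framework-free sketch of such a dispatcher. The CommandDispatcher class and the command names used in the usage note are hypothetical; in a Spring Boot application the same dispatch could live inside a CommandLineRunner, which is not shown here.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical entry-point helper: starts exactly one role, chosen by the first argument.
final class CommandDispatcher {
    private final Map<String, Consumer<String[]>> commands = new LinkedHashMap<>();

    // Registers a handler under a command name and returns this for chaining.
    CommandDispatcher register(String name, Consumer<String[]> handler) {
        commands.put(name, handler);
        return this;
    }

    // Runs the command named by args[0], passing the remaining arguments to it.
    // Returns a process exit code: 0 on success, 2 for a missing or unknown command.
    int dispatch(String... args) {
        if (args.length == 0 || !commands.containsKey(args[0])) {
            System.err.println("usage: <command> [args], known commands: " + commands.keySet());
            return 2;
        }
        String[] rest = Arrays.copyOfRange(args, 1, args.length);
        commands.get(args[0]).accept(rest);
        return 0;
    }
}
```

Each pod would then run the same artifact with a different argument, for example `web` in one Deployment and `listen orders` in another, with the handlers registered as `register("listen", a -> startListener(a[0]))` (where `startListener` is your own code).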
What I mainly want to achieve with such setup is:
To be able to run the application on a lot of weak hardware instead of one machine with 32 CPU cores
Easier scaling per workload
Removing one layer from already complex monitoring infrastructure (k8s already allows very fine resource monitoring, application level task scheduling and parallelism makes this way more difficult)
Am I missing something, or is it just not standard to write Java server apps this way?
Thank you!
What I need is to deploy separate parts of application (like 1 queue listener only) per pod in the cloud yet keep the monolith code architecture.
I agree with @jacky-neo's answer in terms of the appropriate architecture/best practice, but that may require you to break up your monolithic application.
To solve this without breaking up your monolithic application, deploy multiple instances of your monolith to Kubernetes each as a separate Deployment. Each deployment can have its own configuration. Then you can utilize feature flags and define the environment variables for each deployment based on the functionality you would like to enable.
In application.properties:
myapp.queue.listener.enabled=${QUEUE_LISTENER_ENABLED:false}
In your Deployment for the queue listener, enable the feature flag:
env:
  - name: 'QUEUE_LISTENER_ENABLED'
    value: 'true'
You would then just need to configure your monolithic application to use this myapp.queue.listener.enabled property and only enable the queue listener when the property is set to true.
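In Spring Boot the usual way to do this is to put @ConditionalOnProperty on the listener bean so that it is only created when the property is true. To show the decision itself without pulling in the framework, here is a small plain-Java sketch; the FeatureFlags class and the WEB_ENABLED variable are hypothetical, and only QUEUE_LISTENER_ENABLED comes from the example above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper: decides which parts of the monolith start in this process.
final class FeatureFlags {
    private final Map<String, String> env;

    FeatureFlags(Map<String, String> env) {
        this.env = env;
    }

    // Mirrors the ${NAME:false} placeholder: a missing variable counts as disabled.
    boolean isEnabled(String name) {
        return Boolean.parseBoolean(env.getOrDefault(name, "false"));
    }

    // Lists the roles this instance should run, based on the flags that are set.
    List<String> rolesToStart() {
        List<String> roles = new ArrayList<>();
        if (isEnabled("WEB_ENABLED")) {            // assumed flag, not from the original answer
            roles.add("web");
        }
        if (isEnabled("QUEUE_LISTENER_ENABLED")) { // flag from the Deployment above
            roles.add("queue-listener");
        }
        return roles;
    }
}
```

At startup this would be used as `new FeatureFlags(System.getenv()).rolesToStart()`, so the same image behaves differently depending on the Deployment's environment.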
Similarly, you could also apply this logic to the Spring profile to only run certain features in your app based on the profile defined in your ConfigMap.
This Baeldung article explains the process I'm presenting here in detail.
For the scheduled task, just set up a CronJob using a curl container which can invoke the service you want to perform the work.
Edit
Another option based on your comments below -- split the shared logic out into a shared module (using Gradle or Maven), and have two other runnable modules like web and listener that depend on the shared module. This will allow you to keep your shared logic in the same repository, and keep you from having to build/maintain an extra library which you would like to avoid.
This would be a good step in the right direction, and it would lend well to breaking the app into smaller pieces later down the road.
Here's some additional info about multi-module Spring Boot projects using Maven or Gradle.
In my experience, I would resolve these issues as described below. I hope this is what you want.
I can run the webserver separately without queue listeners and scheduled jobs
Develop a Spring Boot app to do it and deploy it as service-A in Kubernetes. In this app, you use spring-mvc to define the controller or REST controller that receives requests. Then use a Kubernetes NodePort service or define an ingress gateway to make the service accessible from outside the Kubernetes cluster. If you use sessions, you should store them in Redis or a similar shared place so that multiple instances of the service (pods) can share the same session values.
I can run one queue listener per pod in the k8s
Develop a new Spring Boot app to do it and deploy it as service-B in Kubernetes. This service only processes queue messages from RabbitMQ or another broker, which can be sent from service-A or another source. In most cases it should not be accessible from outside the Kubernetes cluster.
I can use k8s cron scheduling instead of App level Spring scheduler?
Personally, I would define a new Spring Boot app with spring-scheduler, called service-C, in Kubernetes. It will have only one instance and will not be scaled. It will invoke a service-A method at the scheduled time, and it will not be accessible from outside the Kubernetes cluster. But if you prefer a Kubernetes CronJob, you can just write a shell script that uses service-A's DNS name in Kubernetes to access its REST endpoint.
The above three services can each be configured with different resources such as CPU and memory usage.
I do not get the essence of your post.
You want to have an application with "monolithic code architecture".
And then deploy it to several pods, but only parts of the application are actually running.
Why don't you separate the parts you want to be special to be applications in their own right?
Perhaps this is because I come from a Java background and haven't deployed monolithic scripting apps.
I'm exposing functionality to access user details via a rest call.
According to this post (Is Spring Boot MVC controller multithreaded?), Spring Boot REST services are multithreaded. Does this mean that using Akka to multi-thread web services serves no purpose?
Using Java Akka will not offer any multi-threaded advantages but will offer:
If a REST call fails with an error (e.g. 404), Akka can be used to restart the REST call or kill the thread, thereby stopping the service.
If a certain REST call is taking too long to complete, Akka can be used to kill the call after a set duration.
Akka can be used to throttle requests to the REST client, which is useful if the service allows a maximum number of requests per period of time.
Are my assertions correct? If I'm not concerned with the points above, should I still use Akka, or just use the functionality to access the user details without wrapping it in Akka? Could Java futures also be used for these points?
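For reference, the retry and timeout points can at least be expressed with plain Java futures. The following is a minimal sketch under stated assumptions: the ResilientCall class is hypothetical, it requires Java 9+ for orTimeout, and it does not cover throttling. Note that a timeout only fails the future; it does not interrupt the thread that is still running the call.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical helper: retries a call and bounds each attempt with a timeout.
final class ResilientCall {

    // Runs the call asynchronously up to maxAttempts times.
    // Each attempt fails with a TimeoutException (wrapped in CompletionException)
    // if it does not finish within timeoutMillis.
    static <T> T callWithRetry(Supplier<T> call, int maxAttempts, long timeoutMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        CompletionException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return CompletableFuture.supplyAsync(call)
                        .orTimeout(timeoutMillis, TimeUnit.MILLISECONDS)
                        .join();
            } catch (CompletionException e) {
                last = e; // covers both a failed call and a timed-out call
            }
        }
        throw last; // every attempt failed; surface the last failure
    }
}
```

A call to the user-details service would then look like `ResilientCall.callWithRetry(() -> client.fetchUser(id), 3, 2000)`, where `client.fetchUser` stands for your own REST client code.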
I am trying to determine the best way to handle long-running batch jobs in Spring MVC. I came across Akka in my searching as a non-blocking framework for async processing, which is preferred because I don't want the batch processing to eat up all the threads from the thread pool.
Essentially what I will be doing is have a job that needs to run on some set schedule that will go out and call various web services, process the data, and persist it.
I have seen some code examples that use it with Spring, but I've never seen it used with a cron-type scheduler. It always seems to use a fixed time period.
I'm not sure if this is even the best approach to handling large scale batch processing within Spring. Any suggestions or links to good Akka Spring resources are welcome.
I would suggest you look into the Spring Integration and Spring Batch projects. The first one allows you to configure chains of services using EIP. We used it in our project to fetch files from FTP, deserialize and process them, import them into the DB, send emails if required, etc., all on a schedule. The second one is more straightforward and basically provides a framework for working on rows of data. Both are configurable with Quartz and integrate nicely into a Spring MVC project.
I need to develop an IMAP poller which pings an email server every few seconds and fetches every new email which arrives.
I've done it once for another application, but there I used an inbound mail channel from Spring Integration.
I just started "playing" with Play, and am not sure what the best way to achieve this is. I know that JavaMail already offers the possibility to fetch mails, but I am not sure how to actually package this. Should this be a separate module, a separate plugin, a service, or something else?
Should the polling functionality be implemented as a job?
NOTE: It is a web application BTW, although the description above may suggest it is not.
There are a few options to solve this:
1) Use Java in a Job to poll the IMAP server at regular intervals
Documentation on creating a Job is available and is pretty straightforward: just set up the job to run every minute or every 5 minutes, and then add the code that actually checks for new emails.
http://www.playframework.org/documentation/1.2.4/jobs
If you're looking for how to check for new emails over IMAP, have a look through Stack Exchange. For example, to poll Gmail, check out this question: Getting mail from GMail into Java application using IMAP
2) Use the Camel module to poll the IMAP server with a custom route/processor
This is a heavyweight solution and only recommended if you want to make use of other features of Apache Camel.
The module is available here: http://www.playframework.org/modules/camel
Using Camel to poll for IMAP messages is fairly easy once you get your head around how to use Camel; the specific info for the IMAP route is here: http://camel.apache.org/mail.html
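Whichever option you pick, the core of option 1 is a periodic "fetch new messages, hand each one to a handler" pass. The sketch below shows that loop with the JDK scheduler rather than Play's Job API; the MailPoller class is hypothetical, and the fetchNew supplier is a stand-in for the JavaMail code that actually talks to the IMAP server.

```java
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical poller: periodically fetches new messages and passes them to a handler.
final class MailPoller implements AutoCloseable {
    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    private final Supplier<List<String>> fetchNew; // placeholder for the real IMAP fetch
    private final Consumer<String> handler;

    MailPoller(Supplier<List<String>> fetchNew, Consumer<String> handler) {
        this.fetchNew = fetchNew;
        this.handler = handler;
    }

    // One polling pass. Failures are logged and swallowed, because an exception
    // escaping a scheduled task would cancel all of its future runs.
    void pollOnce() {
        try {
            for (String message : fetchNew.get()) {
                handler.accept(message);
            }
        } catch (RuntimeException e) {
            System.err.println("poll failed: " + e);
        }
    }

    // Starts polling immediately, then waits the given delay between passes.
    void start(long delay, TimeUnit unit) {
        scheduler.scheduleWithFixedDelay(this::pollOnce, 0, delay, unit);
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

A fixed delay (rather than a fixed rate) is used so that a slow mail server cannot cause polling passes to pile up behind each other.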
In my opinion you shouldn't use Play at all for this, if I understand your requirements correctly. Play is a web framework intended to handle HTTP requests. Your requirements say nothing about HTTP at all, so a large part of Play would be useless.
You could use Play's server runtime and Job (and cron) architecture to run this, but you would be misusing the facilities of the framework for something for which they were never intended. You may also be inheriting requirements from Play that you wouldn't ever actually need for an application/service like the one you want to build (for example the Python runtime).
I think you should not use Play for this, but rather create this as a simple, straight-forward Java application using Spring. With Spring's scheduling capabilities you can just as easily implement what you want.
Naturally, when you intend to build a web front-end on top of this in the future, that would make it a completely different story.
The premise is this: For asynchronous job processing I have a homemade framework that:
Stores jobs in database
Has a simple Java API to create more jobs and processors for them
Processing can be embedded in a web application or can run by itself on different machines for scaling out
Web UI for monitoring the queue and canceling queue items
I would like to replace this with a ready-made library because I would expect more robustness from one and I don't want to maintain this. I've been researching the issue and figured I could use JMS for something similar. But I would still have to build a simple Java API, figure out a runtime to host the processing when I want to scale out, and build a monitoring UI. I feel like the only benefit I would get from JMS is that I would not have to do the database stuff.
Is there something similar to this that is ready made?
UPDATE
Basically this is the setup I would want to do:
Web application runs in a Servlet container or Application Server
Web application uses a client API to create jobs
X machines process those jobs
Monitor and manage jobs from a UI
You can use Quartz:
http://www.quartz-scheduler.org/
Check out Spring Batch.
Link to the Spring Batch website: http://projects.spring.io/spring-batch/