I am a beginner to web development. In my project I was given an R&D task: I am deploying the same WAR to two servers and running the servers in cluster mode. The code schedules some jobs, and since there are two servers, these jobs run on both servers independently, which results in duplicated data in the DB. Can anyone please help me solve this situation?
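One common fix, assuming the jobs happen to be Quartz-scheduled (if they are plain Spring `@Scheduled` methods, a library like ShedLock serves a similar purpose), is to switch Quartz to its clustered JDBC job store: both nodes then share one schedule in the database, and each trigger fires on exactly one node. An illustrative `quartz.properties` fragment, with the data source name as a placeholder:

```properties
# Each node gets a unique, auto-generated instance id
org.quartz.scheduler.instanceId = AUTO

# Store jobs and triggers in the shared database instead of RAM
org.quartz.jobStore.class = org.quartz.impl.jdbcjobstore.JobStoreTX
org.quartz.jobStore.driverDelegateClass = org.quartz.impl.jdbcjobstore.StdJDBCDelegate
# Illustrative data source name; must point at the same DB from both servers
org.quartz.jobStore.dataSource = myDS

# Cluster mode: nodes coordinate via row locks, so each trigger fires on only one node
org.quartz.jobStore.isClustered = true
org.quartz.jobStore.clusterCheckinInterval = 20000
```

Both servers run the same configuration; Quartz's row-level locking on the trigger tables is what prevents the duplicate execution.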
I am new to Flink and Kubernetes. I am planning to create a Flink streaming job that streams data from a filesystem to Kafka.
I have a Flink job JAR that works fine (tested locally). Now I am trying to host this job on Kubernetes, and would like to use EKS on AWS.
I have read through official flink documentation on how to setup flink cluster.
https://ci.apache.org/projects/flink/flink-docs-release-1.5/ops/deployment/kubernetes.html
I tried setting it up locally using minikube, brought up a session cluster, and submitted the job, which works fine.
My questions:
1) Of the two options, job cluster and session cluster: since this is a streaming job that should keep monitoring the filesystem and stream any new files to the destination, can I use a job cluster in this case? Per the documentation, a job cluster executes the job and terminates once it is completed; if the job monitors a folder, does it ever complete?
2) I have a Maven project that builds the Flink JAR. What is the ideal way to spin up a session/job cluster with this JAR in production, and what is the normal CI/CD process? Should I build a session cluster initially and submit jobs whenever needed, or spin up a job cluster with the built JAR?
First off, the link that you provided is for Flink 1.5. If you are starting fresh, I'd recommend using Flink 1.9 or the upcoming 1.10.
For your questions:
1) A job with a file monitor never terminates: it cannot know when no more files will arrive, so you have to cancel it manually. A job cluster is fine for that.
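For reference, the never-terminating monitor described above typically comes from `readFile` with `FileProcessingMode.PROCESS_CONTINUOUSLY`. A minimal sketch of such a file-to-Kafka job; the input path, broker address, and topic name are illustrative placeholders:

```java
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.api.java.io.TextInputFormat;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.source.FileProcessingMode;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer;

public class FileToKafkaJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        String inputDir = "s3://my-bucket/input";  // illustrative path
        TextInputFormat format = new TextInputFormat(new Path(inputDir));

        // PROCESS_CONTINUOUSLY re-scans the directory (here every 10s),
        // so the job keeps running and never finishes on its own
        DataStream<String> lines = env.readFile(
                format, inputDir, FileProcessingMode.PROCESS_CONTINUOUSLY, 10_000L);

        // Illustrative broker and topic
        lines.addSink(new FlinkKafkaProducer<>(
                "kafka:9092", "my-topic", new SimpleStringSchema()));

        env.execute("file-to-kafka");
    }
}
```

Because the source never reaches end-of-input, the only ways the job ends are a manual cancel or a failure, which is exactly why a job cluster dedicated to this one job is a reasonable fit.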
2) There is no clear answer to that, and it's not Flink-specific either. Everyone has a different solution with different drawbacks.
I'd aim for a semi-automatic approach, where everything is automated but you need to explicitly press a deploy button (rather than deploying on every git push). Often, these CI/CD pipelines deploy to a test cluster first and run a smoke test before allowing a deploy to production.
If you are starting completely fresh, you could check out AWS CodeDeploy. However, I've had good experiences with GitLab CI and AWS runners.
The normal process would be something like:
build
integration/e2e tests on build machine (dockerized)
deploy on test cluster/preprod cluster
run smoke tests
deploy on prod
I have also seen processes that go to prod quickly and invest the time saved in better monitoring and fast rollback instead of a preprod cluster and smoke tests. That's usually viable for business-uncritical processes, depending on how expensive reprocessing is.
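The staged process above, including the explicit "deploy button" as a manual gate before production, might look roughly like this as a GitLab CI pipeline (all stage names and scripts are illustrative, not a specific recommendation):

```yaml
stages: [build, test, deploy-preprod, smoke, deploy-prod]

build:
  stage: build
  script: mvn -B package

e2e-tests:
  stage: test
  # Dockerized integration/e2e tests on the build machine
  script: docker compose -f docker-compose.test.yml up --abort-on-container-exit

deploy-preprod:
  stage: deploy-preprod
  script: ./deploy.sh preprod   # e.g. submits the JAR to the preprod Flink cluster

smoke-tests:
  stage: smoke
  script: ./smoke.sh preprod

deploy-prod:
  stage: deploy-prod
  script: ./deploy.sh prod
  when: manual   # the explicit deploy button: a human approves the prod rollout
```

The `when: manual` keyword is what turns the last stage into a button in the GitLab UI instead of an automatic step.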
I am using Spring Batch local partitioning to process my job. In local partitioning, multiple slaves are created in the same instance, i.e., in the same JVM. How is remote partitioning different from local partitioning? My assumption is that in remote partitioning, each slave is executed on a different machine. Is my understanding correct? If so, how do I start the slaves on different machines without using Cloud Foundry? I have seen Michael Minella's talk on remote partitioning (https://www.youtube.com/watch?v=CYTj5YT7CZU), and I am curious how remote partitioning works without Cloud Foundry. How can I start slaves on different machines?
While that video uses Cloud Foundry, the premise of how it works applies off Cloud Foundry as well. In that video I launch multiple JVM processes (web apps in that case). Some are configured as slaves, so they listen for work. The other is configured as the master, and it's the one I use to do the actual launching of the job.
Off Cloud Foundry, this would be no different from deploying WAR files onto Tomcat instances on multiple servers. You could also use Spring Boot to package executable JAR files that run your Spring applications in an embedded web container. In fact, the code for that video (available on GitHub here: https://github.com/mminella/Spring-Batch-Talk-2.0) can be used the same way it was on CF. The only change you'd need to make is to not use the CF-specific connection factories and to use traditional configuration for your services.
In the end, the deployment model is the same on or off Cloud Foundry: you launch multiple JVM processes on multiple machines (connected by the middleware of your choice) and Spring Batch handles the rest.
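Concretely, that deployment model could look like the commands below, assuming a Spring Boot executable JAR with hypothetical `master`/`worker` profiles (names are illustrative) and a shared message broker such as RabbitMQ that every process is configured to connect to:

```shell
# On each worker machine (as many as you like) - the JVM starts,
# connects to the shared middleware, and listens for partition messages:
java -jar batch-app.jar --spring.profiles.active=worker

# On the master machine - this JVM partitions the job and sends one
# message per partition out over the middleware to the workers:
java -jar batch-app.jar --spring.profiles.active=master
```

Nothing here is Cloud Foundry-specific: the machines can be VMs, bare metal, or containers, as long as all JVMs can reach the broker and the shared job repository database.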
We are creating a test automation framework for a web application. For test scenarios involving scheduled jobs, we need to advance the time to let the jobs trigger, but this is making the server (Tomcat) very slow. What could be the reason, and what is the solution?
I have some jobs that run in my application (these jobs are created and managed by the application), and the application is deployed on a cluster of 2 managed servers.
We have distributed the load between these 2 managed servers based on even and odd job numbers.
Now, if one of the instances goes down, we want to create its jobs on the other instance.
How do we know that the other instance has gone down in a WebLogic server cluster? My application is built with Java and Spring.
Thanks
I searched around for my situation but found many threads about making multiple Quartz schedulers on different machines run a job only once. My situation is the opposite. We have multiple web servers behind a load balancer; all of them use Quartz and connect to the same database. One of the jobs loads log files from a third-party app into the database. When the job is triggered, only one of the web servers picks it up. I am looking for a solution with one scheduled job where, when it is triggered, all of the attached web servers pick it up and each starts processing that machine's logs from the third-party app.