I am writing test cases for an IaaC project. The use case is as follows:
Given I do not have an AKS cluster
And I call an api to create a cluster
Then I have a cluster up and running
In the above scenario, for the 3rd step, I would like to implement two things:
First, wait for the AKS resource to be created, and
Once the resource is created, wait for its provisioning state to indicate success
I searched the Azure documentation to see if there are any out-of-the-box watchers that Azure provides as part of its azure-sdk-for-java, but did not find any. Could somebody share their knowledge or experience in solving such a problem?
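One way to approach both waits without a built-in watcher is a plain polling loop with a timeout. The sketch below is not an Azure SDK API: `ProvisioningWaiter`, `waitForState` and the state supplier are hypothetical names, and the supplier stands in for whatever SDK call you use to look up the cluster and read its provisioning state. It assumes the supplier returns null while the resource does not exist yet, and that "Succeeded" and "Failed" are the terminal state names.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.function.Supplier;

class ProvisioningWaiter {

    /**
     * Polls stateSupplier until it returns targetState or the timeout elapses.
     * A null state means "resource not created yet", so one loop covers both
     * waits: resource exists, then provisioning reaches the target state.
     */
    static String waitForState(Supplier<String> stateSupplier, String targetState,
                               Duration timeout, Duration pollInterval)
            throws InterruptedException {
        Instant deadline = Instant.now().plus(timeout);
        String lastState = null;
        while (Instant.now().isBefore(deadline)) {
            lastState = stateSupplier.get();
            if (targetState.equalsIgnoreCase(lastState)) {
                return lastState;
            }
            // Assumption: "Failed" is terminal, so there is no point in waiting longer.
            if ("Failed".equalsIgnoreCase(lastState)) {
                throw new IllegalStateException("Provisioning failed");
            }
            Thread.sleep(pollInterval.toMillis());
        }
        throw new IllegalStateException(
                "Timed out waiting for state " + targetState + ", last seen: " + lastState);
    }

    public static void main(String[] args) throws InterruptedException {
        // Hypothetical stand-in for an SDK lookup: absent, then creating, then done.
        String[] fake = {null, "Creating", "Succeeded"};
        int[] call = {0};
        String state = waitForState(
                () -> fake[Math.min(call[0]++, fake.length - 1)],
                "Succeeded", Duration.ofSeconds(5), Duration.ofMillis(50));
        System.out.println("Final state: " + state);
    }
}
```

In a test step, the supplier would wrap the real lookup and the timeout would be set to however long you are willing to let cluster creation take.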
Related
I've got Java code that performs some interactions with web pages, using Selenium.
Now I'd like to have this code executed every hour, and I thought it would be a great occasion to discover the cloud world.
I've created an account on Google Cloud.
Because my app needs a driver to use Selenium (geckodriver for Firefox), I'll have to create a Docker image with everything it needs inside.
In Google Cloud services, there is "Cloud Scheduler", which allows me to run code when I want to.
But here are my questions:
What kind of target should I configure (HTTP, Pub/Sub, HTTP App Engine)?
Because I'm not using Google Cloud Functions, my container will always be up, which doesn't seem like a great idea for pricing reasons. I would like to have my container up only for the duration of the execution.
Also, I was thinking of using the Quarkus framework to wrap my application, since I've seen it was made for the cloud and starts very quickly. Is that the best option for me?
I'd be very glad if someone could help me see this a little more clearly. I'm not a total beginner: I have worked as a Java / JavaScript developer for 5 years and have dockerized some applications, but the cloud is a big topic and it is not easy to know where to start.
So you:
are using docker images
run your workload occasionally
aren't willing to use Cloud Function
==> Cloud Run is your best bet. Here is Google Cloud Run Quick start : https://cloud.google.com/run/docs/quickstarts/prebuilt-deploy
Keep in mind that your containerised application needs to listen for HTTP requests, so take a look at the Cloud Run container runtime contract.
Finally, you can indeed trigger Cloud Run from Cloud Scheduler; here is detailed documentation on how to do it: https://cloud.google.com/run/docs/triggering/using-scheduler
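To make the "listening for HTTP requests" part concrete, here is a minimal sketch using only the JDK's built-in HTTP server, so it is independent of whether you pick Quarkus or anything else. Cloud Run passes the port to listen on in the PORT environment variable (8080 by default). `ScheduledJobServer` and the placeholder job are hypothetical; the real Selenium code would go where the `Runnable` is.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

class ScheduledJobServer {

    /** Starts an HTTP server that runs the job once per incoming request. */
    static HttpServer start(int port, Runnable job) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/", exchange -> {
            int status = 200;
            String body = "done";
            try {
                job.run();
            } catch (RuntimeException e) {
                // Report the failure to the caller so the scheduler can log or retry it.
                status = 500;
                body = "failed: " + e.getMessage();
            }
            byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(status, bytes.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(bytes);
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        // Cloud Run provides the port through the PORT environment variable.
        String port = System.getenv().getOrDefault("PORT", "8080");
        // Placeholder job; the Selenium interactions would be called here.
        start(Integer.parseInt(port), () -> System.out.println("Selenium job would run here"));
    }
}
```

The job runs inside the request handler, so the response is only sent once the work is finished; this is a design choice for the sketch, made so the container is not considered idle while the job is still running.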
As @MBHAPhoenix says, Cloud Run is your best option. You can then trigger the job from Cloud Scheduler. We have this exact scenario currently running for one of our projects, but our container is Python. We wrote an article about it here
You should note that to trigger your Cloud Run job from Cloud Scheduler, you'll have to 'secure it'. This means you won't be able to just type the URL in a web browser. A service account will be responsible for running the Cloud Run job, and you'll then need to grant your Cloud Scheduler service access to this service account so it can invoke the Cloud Run job. I've been meaning to put up a post about the exact steps for doing this (will try to get it done this weekend).
In terms of cost, we have this snippet from our article:
...Cloud Run only runs when it receives an HTTP request. It plays dead and comes alive to execute your code when an HTTP request comes in. When it is done executing the request, it goes 'dead' again till the next request comes in. This means you're not paying for time spent idling i.e. when it is not doing anything.....
We have several standalone Java applications (in the form of JAR files) running on multiple servers. These applications mainly read and stream data between systems. We mainly use Java 8 in our development. I was recently put in charge; my main function is to manage and maintain these apps.
Currently, I check these apps manually by accessing the servers, checking whether each app is running, and sometimes running database queries to see if the app has started pulling data. My problem is that in many cases some of these apps fail and shut down due to data issues or edge cases without anyone noticing. We need some monitoring and application recovery in place.
We don't have docker infrastructure in place. We plan to implement docker in the future, but for now this is not an option.
After research, the following are options I thought of or solutions I tried:
Have the apps create a socket client which sends a heartbeat to a monitoring app (which needs to be developed). I am keeping this as my last option.
I tried to use Eclipse Vert.x to wrap the apps into Verticles and then create a web view that can show me status and other info. After several tries, the apps failed to parse the data correctly (this might be due to my lack of understanding of the Vert.x library).
Use a third-party solution that does this, but I have no idea what solutions are out there. I am open to suggestions.
My requirements are:
Proper monitoring of the apps running and their status.
In case of failure, the app should start again while notifying the admin/developer.
I am willing to develop a solution or implement a third-party one. I need your guidance on this.
Thank you.
You could use spring-boot-actuator (see health). It comes with a built-in endpoint that has some health checks (depending on your spring-boot project), but you can create your own as well.
Then, by making an HTTP request to http://{host}:{port}/{context}/actuator/health (replace with yours), you can see the status of those health checks and also use the response status code to monitor your application.
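As a rough illustration of using the response status code for monitoring, the following sketch polls a health URL with plain JDK classes. `HealthProbe` is a hypothetical helper, and the URL in `main` is only an example; it assumes the endpoint answers 200 when the application is up and a non-200 code (or nothing at all) otherwise.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

class HealthProbe {

    /** Returns true only if the endpoint answers HTTP 200 within the timeouts. */
    static boolean isHealthy(String healthUrl, int timeoutMillis) {
        HttpURLConnection connection = null;
        try {
            connection = (HttpURLConnection) new URL(healthUrl).openConnection();
            connection.setRequestMethod("GET");
            connection.setConnectTimeout(timeoutMillis);
            connection.setReadTimeout(timeoutMillis);
            return connection.getResponseCode() == 200;
        } catch (IOException e) {
            // Connection refused, timeout, or malformed URL: treat as unhealthy.
            return false;
        } finally {
            if (connection != null) {
                connection.disconnect();
            }
        }
    }

    public static void main(String[] args) {
        // Hypothetical host, port and context path; replace with your own.
        String url = args.length > 0 ? args[0] : "http://localhost:8080/actuator/health";
        System.out.println(url + " healthy: " + isHealthy(url, 2000));
    }
}
```

A scheduled task on a monitoring host could call `isHealthy` for each application and raise an alert when it returns false.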
Have you heard of Java Service Wrappers? They do not offer full management functionality, but they would monitor for JVM crashes and out-of-memory conditions and restart your application. Alerting should also be possible.
There is a small comparison table here: https://yajsw.sourceforge.io/#mozTocId284533
So some basic monitoring and management is included already. If you need more, I suggest using JMX (https://www.oracle.com/java/technologies/javase/javamanagement.html) or Prometheus (https://prometheus.io/ and https://github.com/prometheus/client_java)
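To show the restart-and-notify idea in its simplest form, here is a sketch of a supervisor built on `ProcessBuilder`. It is not how any particular service wrapper is implemented, only an illustration of the principle: `ProcessSupervisor`, the jar name and the notifier are hypothetical, and a real wrapper also handles hangs, logging and OS service integration.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

class ProcessSupervisor {

    /**
     * Runs the command and restarts it whenever it exits with a non-zero code,
     * up to maxRestarts times. Returns the number of restarts performed.
     */
    static int supervise(List<String> command, int maxRestarts, Consumer<String> notifier)
            throws IOException, InterruptedException {
        int restarts = 0;
        while (true) {
            Process process = new ProcessBuilder(command).inheritIO().start();
            int exitCode = process.waitFor();
            if (exitCode == 0) {
                return restarts; // clean shutdown, nothing to recover
            }
            notifier.accept("Process exited with code " + exitCode
                    + " (restarts so far: " + restarts + ")");
            if (restarts >= maxRestarts) {
                return restarts; // give up so a crash loop does not run forever
            }
            restarts++;
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical jar name; the notifier would send mail or a chat message.
        List<String> command = args.length > 0
                ? Arrays.asList(args)
                : Arrays.asList("java", "-jar", "stream-app.jar");
        int restarts = supervise(command, 3, message -> System.err.println("ALERT: " + message));
        System.out.println("Supervisor finished after " + restarts + " restart(s)");
    }
}
```

The restart limit is there so that an app failing on the same bad record does not restart endlessly; the admin is notified on every failure, including the last one.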
I have a question related to Apache Spark. I write my client code in Java, but my question can be answered in any language.
The title may look like a general question that a simple search would answer, but my question is different, and my searches have not turned up anything about this topic and my requirement. Similar topics that usually come up when searching, but are not my question, are:
Multiple SparkSession for one SparkContext
Multiple SparkSessions in single JVM
...
My question is not the same as the questions above, although it seems similar. I will first explain my question. After stating it, I will describe the higher-level requirement that led me to ask. My goal is to satisfy that requirement, either through an answer to the question or through another solution.
The problem I am trying to solve
I wrote a REST server component in which I used the Spark Java library. This REST server can receive a series of requests in a specific format, form a query based on the requests, and submit a job to the Spark cluster (my own cluster) through the Spark library functions. It also returns the query result as an asynchronous response (when it is ready and the user requests it).
I use code like this to create the Spark session (a summary of it):
SparkConf sparkConf = new SparkConf()
    .setMaster("spark://localhost:7077")
    .setAppName("test");
SparkSession session = SparkSession.builder()
    .config(sparkConf)
    .getOrCreate();
...
As far as I know, when I run the above code, Spark creates an application named test for me and allocates some resources from my Spark cluster (I use Spark in standalone mode). For example, assume it uses all my resources, so there are no resources left for a new application.
Now I have just one REST server. It cannot be scaled at all, and if it goes down, users can no longer work with the REST server API. So I want to scale it to at least two instances on different machines and in different JVMs. (This is the part where my question differs from the others.)
If I bring up another instance of my REST server with the same code as above, it will create a new Spark session (because it is a different JVM on another machine), and it will also create another application named test in Spark. But since, as I said, all my resources have been used by the first Spark session, this application is on standby and can do nothing until resources become free.
Notes about the problem:
I do not want to split the cluster resources and add some to the first rest server and some to the second rest server.
I want both instances (or any other number of instances) to share a single Spark application. In other words, I want the same SparkContext across different JVMs. Also note that I submit my Spark queries in cluster mode, so my application is not a worker and one of the nodes in the cluster becomes the driver.
Requirement
As is clear from the description above, I want my REST server to be highly available in active-active mode, so that both Spark clients are connected to the same application and requests can be sent to either REST server. This is my higher-level need, and there may be another way to meet it.
I would be very grateful for pointers to a similar application, specific documentation, or experience, because my searches always ended at the questions I listed at the beginning, which have nothing to do with my problem. Apologies for any typos; English is not my strong suit. Thanks.
I like your idea a lot (probably because I had to implement quite a few similar things in the past).
In short, I am 95% sure that there is no way to share a JVM or a SparkContext between machines, executions, etc. I tried to share dataframes between SparkContexts, and it was a huge fiasco ;).
The way I would approach that:
If your REST server connects to a cluster, once the Spark session is available, register the server to a load balancer.
If you submit your REST server as a Spark job, you can have it register to the load balancer.
You can submit multiple jobs or start multiple servers. They can pick any port to advertise, which they will share with the load balancer.
Your REST client would interact with the load balancer, not directly with the Spark REST server. Your REST server will have to have healthcheck endpoints so that the load balancer can do its job.
If one of your REST servers goes down, the load balancer could start a new one. You will lose the dataframes of that one application, but not those of the others.
If multiple REST servers need to exchange data, I would use Delta as a "cache" or staging zone.
Does that make sense? It should not be too hard to implement and should provide good HA.
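To illustrate the registration and healthcheck part, here is a small in-memory sketch of a round-robin selector that skips unhealthy servers. In practice you would more likely use an existing load balancer than write one; `HealthAwareBalancer`, the addresses and the simulated health check are all hypothetical, and the predicate stands in for a call to each server's healthcheck endpoint.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

class HealthAwareBalancer {
    private final List<String> backends = new ArrayList<>();
    private final Predicate<String> healthCheck;
    private int nextIndex = 0;

    HealthAwareBalancer(Predicate<String> healthCheck) {
        this.healthCheck = healthCheck;
    }

    /** Called by a REST server once its Spark session is available. */
    synchronized void register(String backendUrl) {
        backends.add(backendUrl);
    }

    /** Returns the next healthy backend in round-robin order, if any. */
    synchronized Optional<String> next() {
        for (int attempt = 0; attempt < backends.size(); attempt++) {
            String candidate = backends.get(nextIndex);
            nextIndex = (nextIndex + 1) % backends.size();
            if (healthCheck.test(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Hypothetical addresses; here the second server is simulated as down.
        HealthAwareBalancer balancer =
                new HealthAwareBalancer(url -> !url.contains("rest-2"));
        balancer.register("http://rest-1:8080");
        balancer.register("http://rest-2:8080");
        balancer.register("http://rest-3:8080");
        for (int i = 0; i < 4; i++) {
            System.out.println(balancer.next().orElse("no healthy backend"));
        }
    }
}
```

With the second server simulated as down, requests alternate between the first and third servers, which is the behaviour the REST client would see through the load balancer.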
I have an application (Spring Boot + Hibernate + Postgres) that executes an ETL process. The application is deployed in OpenShift with a scale of n > 1, so it always has more than one replica. But if every replica launched its own ETL run against the same database, the data would not be consistent.
Therefore, I think the process should be launched by something external.
I see the solution as an API method, such as "doEtl()", that can be called by a Kubernetes (OpenShift) 'schedule' or another Kubernetes (OpenShift) tool. However, I can't figure out how to search for it. I tried looking for 'kubernetes custom schedule', but the results explain how scheduling works or how to write a custom scheduler for auto-scaling.
Can someone advise me whether this is generally possible and, if so, how to search for it or what it is called?
You might be looking for the CronJob object, which can be used to execute a certain action on a regular schedule.
For OpenShift, you can find more information in the documentation: https://docs.openshift.com/container-platform/4.3/nodes/jobs/nodes-nodes-jobs.html
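What the scheduled job actually runs can be very small. As a sketch, assuming the application exposes the ETL as an HTTP endpoint behind a Service, the job's container could run a client like the one below. `EtlTrigger`, the service name `etl-app` and the path `/api/etl` are hypothetical; a one-line curl command in the job would do the same thing.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

class EtlTrigger {

    /** Sends an empty POST to the ETL endpoint and returns the HTTP status code. */
    static int trigger(String endpointUrl) throws IOException {
        HttpURLConnection connection = (HttpURLConnection) new URL(endpointUrl).openConnection();
        try {
            connection.setRequestMethod("POST");
            connection.setDoOutput(true);
            connection.setConnectTimeout(5_000);
            connection.getOutputStream().close(); // empty request body
            return connection.getResponseCode();
        } finally {
            connection.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical service name and path; the Service routes to one replica.
        String url = args.length > 0 ? args[0] : "http://etl-app:8080/api/etl";
        int status = trigger(url);
        System.out.println("ETL trigger returned HTTP " + status);
        // A non-zero exit code marks the scheduled run as failed.
        System.exit(status >= 200 && status < 300 ? 0 : 1);
    }
}
```

Because the request goes through the Service, only one replica receives each scheduled call, which is what keeps the replicas from each starting their own ETL run.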
I have an application that uses a GemFire locator and servers. I would like to write an integration test that starts a locator and a server within the JVM and shuts them down when the tests end. I could not find any documentation that could help me do this.
I have tried starting a locator and a server when the tests start, using LocatorLauncher and ServerLauncher. It starts the locator but throws an exception stating IllegalStateException: A connection to a distributed system already exists in this VM.
I am not very experienced with GemFire and do not understand what I am missing here, or whether I am heading in a completely wrong direction.
It would be useful to know a bit more about what you're trying to test exactly. We have different levels of testing in the Geode codebase. If you can get away with just a server, I'd suggest using the ServerStarterRule in your JUnits. Here is an example of that: https://github.com/apache/geode/blob/f12055ae3ae4b1f4731c0447af0c4cb9abdd4159/geode-core/src/integrationTest/java/org/apache/geode/management/internal/cli/commands/AlterRegionCommandIntegrationTest.java
This rule will start up a server as part of the JUnit JVM. This means that you won't be able to use a ClientCache at the same time (you cannot have both a ClientCache and a Cache instance in the same JVM instance).
The next level of test is called DUnit testing. This framework allows you to spin up multiple JVMs and form an actual cluster. The best way to use this is with the ClusterStartupRule together with the GfshCommandRule. An example of this would be: https://github.com/apache/geode/blob/10d89ede6f90f046c15e12e3d16aed259d7044b0/geode-cq/src/distributedTest/java/org/apache/geode/management/internal/cli/commands/ListClientCommandDUnitTest.java
Here, various components are being started up including a client VM. The nice thing about using these rules is that they will handle startup and teardown for you in a consistent and safe manner.