Stop specific running Kettle Job in Java

How would it be possible to stop a specific running job in Kettle?
I'm using the following code:
KettleEnvironment.init();
JobMeta jobmeta = new JobMeta(
        "C://Users//Admin//DBTOOL//EDW_Testing_Tool - 1.8(VersionUpgraded)//data-integration//Regress_bug//Start_Validation.kjb",
        null);
Job job = new Job(null, jobmeta);
job.initializeVariablesFrom(null);
job.setVariable("Internal.Job.Filename.Directory", Constants.JOB_EXECUTION_KJB_FILE_PATH);
job.setVariable("jobId", jobId.toString());
job.getJobMeta().setInternalKettleVariables(job);
job.stopAll();
How would I ensure that the job I want to stop actually gets stopped and is not executed after setting the flag?
I'm using a REST API to stop the job, and I'm not able to get hold of the Job object.
If I use CarteSingleton and store the object in a map, I'm not able to execute the job: it gives a driver error (could not connect to database, e.g. with jTDS; the URL is not working).
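One way to make sure the stop call reaches the right Job instance (a minimal sketch, assuming the job launcher and the REST endpoint run in the same JVM, and that runningJobs is an application-level ConcurrentHashMap you maintain yourself, not part of the Kettle API):
// Registry of running jobs, keyed by your own jobId (hypothetical helper).
static final Map<String, Job> runningJobs = new ConcurrentHashMap<>();

// When launching the job:
runningJobs.put(jobId.toString(), job);
job.start();

// In the REST endpoint that stops a specific job:
Job running = runningJobs.get(jobId);
if (running != null) {
    running.stopAll();           // sets the stop flag on the job and its entries
    running.waitUntilFinished(); // block until the job has actually stopped
    runningJobs.remove(jobId);
}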

Related

Dataproc Job Submit Via API

I have a streaming job running which will run forever and execute a query on a Kafka topic. I am going through the Dataproc documentation for submitting a job via Java; here is the link.
// Submit an asynchronous request to execute the job.
OperationFuture<Job, JobMetadata> submitJobAsOperationAsyncRequest =
jobControllerClient.submitJobAsOperationAsync(projectId, region, job);
Job response = submitJobAsOperationAsyncRequest.get();
For the above line of code I am not able to get the response; the code keeps on running. Is it because it's a streaming job and it's running forever?
How can I get a response, so that I can give the end user some job information, like a URL where they can see their jobs or some monitoring dashboard?
The OperationFuture<Job, JobMetadata> class has a getMetadata() method which returns a com.google.api.core.ApiFuture<JobMetadata>. You can get the job metadata before the job finishes by calling jobMetadataApiFuture.get().
See more details about the OperationFuture.getMetadata() method and the ApiFuture class.
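A minimal sketch of that approach, using the client from the question (the getJobId() getter on JobMetadata is an assumption to check against the generated class):
// Submit the job without blocking on its completion.
OperationFuture<Job, JobMetadata> future =
        jobControllerClient.submitJobAsOperationAsync(projectId, region, job);

// As noted above, the metadata can be read before the job finishes, so this
// does not hang the way future.get() does for a streaming job.
JobMetadata metadata = future.getMetadata().get();
System.out.println("Submitted Dataproc job: " + metadata.getJobId()); // getJobId() assumed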

Getting current AWS Data Pipeline status from Java

I am trying to access the current status of a data pipeline from the Java Data Pipeline client. My use case is to activate a pipeline and wait till it's in the completed state.
I tried the answer from this thread: AWS Data Pipeline - Components, Instances and Attempts and Pipeline Status, but I am only getting the current state as Scheduled even though the pipeline is in the running state. This is my code snippet:
DescribePipelinesRequest describePipelinesRequest = new DescribePipelinesRequest();
describePipelinesRequest.setPipelineIds(Arrays.asList(pipelineId));
final DescribePipelinesResult describePipelinesResult =
        dataPipelineClient.describePipelines(describePipelinesRequest);
final List<Field> testPipeline =
        describePipelinesResult.getPipelineDescriptionList().get(0).getFields();
for (Field field : testPipeline) {
    log.debug("Field: {} and {}", field.getKey(), field.getStringValue());
    if (field.getKey().equals("#pipelineState")) {
        log.debug("Pipeline state current: {}", field.getStringValue());
    }
}
Has anyone faced issues like this before? Btw, this pipeline has been set up as an on-demand (trigger) pipeline scheduled to run every 100 years; we need to trigger it manually.
I'm not sure this does exactly what you want, but it should help point you in the right direction. You'll need to query the objects in the pipeline and get their status; those are what is actually running.
Java code
String pipelineid = "df-06036888777666777"; // replace with your pipeline id
DataPipelineClient client = new DataPipelineClient();
QueryObjectsResult tasks = client.queryObjects(
        new QueryObjectsRequest().withPipelineId(pipelineid).withSphere("INSTANCE"));
DescribeObjectsResult results = client.describeObjects(
        new DescribeObjectsRequest().withObjectIds(tasks.getIds()).withPipelineId(pipelineid));
for (PipelineObject obj : results.getPipelineObjects()) {
    for (Field field : obj.getFields()) {
        if (field.getKey().equals("#status") && !field.getStringValue().equals("FINISHED")) {
            System.out.println(obj.getName() + " is still running...");
        }
    }
}
OUTPUT:
#CliActivity_2020-01-11T21:34:45 is still running...
#Ec2Instance_2020-01-11T21:34:45 is still running...
What you're doing currently is getting the pipeline description, which will only show that the pipeline has been created successfully and scheduled.
We need to trigger this pipeline manually.
To do this, activate the pipeline again, as in the sketch below. This will create new task objects which Data Pipeline will start to process. As currently described, this is an on-demand pipeline which will only create new tasks when it is activated manually.
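A minimal sketch of the activation call, assuming the same Data Pipeline client as above:
// Re-activate the on-demand pipeline; Data Pipeline will create a fresh set
// of INSTANCE objects that you can poll with the loop shown above.
client.activatePipeline(new ActivatePipelineRequest().withPipelineId(pipelineid));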

Spring batch execute last step even get an exception

I want to run a Spring Batch job which has a set of steps, and finally I want to send a notification to Redis containing the status of the job execution. Let's say if all the steps are executed, I should send "Pass". If there was any exception or any error, I want to send "Fail". So my last step will be a notification to Redis updating the status, regardless of whether the job finished fine or got an exception.
My question is:
Can I achieve this in Spring Batch?
Can I use a notification function as a last step, or should I use any specific method for this?
How can I get the status of jobs?
I know I can get the job status like :
JobExecution execution = jobLauncher.run(job, params);
System.out.println("Exit Status : " + execution.getStatus());
But I launch the job from the command line, like java -jar app.jar --spring.batch.job.names=myjobnamehere, so I don't have a JobExecution object to use.
You can use a JobExecutionListener for that. In the afterJob method, you have a reference to the JobExecution from which you can get the status of the job and send the notification as required.
You can find an example in the getting started guide (See JobCompletionNotificationListener).
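A minimal sketch of such a listener, assuming Spring Data Redis is on the classpath (the StringRedisTemplate usage and the Redis key are illustrative assumptions):
import org.springframework.batch.core.BatchStatus;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobExecutionListener;
import org.springframework.data.redis.core.StringRedisTemplate;
import org.springframework.stereotype.Component;

@Component
public class RedisNotificationListener implements JobExecutionListener {

    private final StringRedisTemplate redisTemplate; // assumed Redis client

    public RedisNotificationListener(StringRedisTemplate redisTemplate) {
        this.redisTemplate = redisTemplate;
    }

    @Override
    public void beforeJob(JobExecution jobExecution) {
        // nothing to do before the job starts
    }

    @Override
    public void afterJob(JobExecution jobExecution) {
        // Runs whether the job completed or failed, so the notification is
        // always sent, even when a step threw an exception.
        String result = jobExecution.getStatus() == BatchStatus.COMPLETED ? "Pass" : "Fail";
        redisTemplate.opsForValue()
                .set("batch-status:" + jobExecution.getJobInstance().getJobName(), result);
    }
}
Register the listener on the job definition (for example via the job builder's listener(...) method) and it will be invoked even when the job is launched from the command line.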

Google Cloud Platform blocking BatchRequest request - Java

Is it possible to wait until a batch job (BatchRequest object) in GCP is completed?
E.g. you can do it with a normal Job:
final Job job = createJob(jobId, projectId, datasetId, tableId, destinationBucket);
service.jobs().insert(projectId, job).execute();
final Get request = service.jobs().get(projectId, jobId);
JobStatus response;
while (true) {
    Thread.sleep(500); // improve this sleep policy
    response = request.execute().getStatus();
    if (response.getState().equals("DONE") || response.getState().equals("FAILED"))
        break;
}
Something like the above code works fine. The problem with BatchRequest is that the execute() method does not return a response object.
When you execute it, the batch request returns after it has initialised all the jobs in its queue, but it does not wait until all of them have really finished. Indeed, execute() returns, yet jobs can still fail later on (e.g. errors due to quota issues, schema issues, etc.), so I can't notify the client in time with the right information.
You can check the status of all the created jobs in the web UI with the job history button in the BigQuery view, but you can't return an error message to a client.
Any ideas on how to deal with that?
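One possible way to keep track of the batch's jobs (a sketch, assuming the google-api-client batch API and the service/job/projectId variables from the snippet above): queue each insert with a JsonBatchCallback that records the job reference, then poll each recorded job with jobs().get() exactly as in the non-batch case.
final List<String> submittedJobIds = new ArrayList<>();
BatchRequest batch = service.batch();

// Queue each insert; the callback fires per request when the batch executes.
service.jobs().insert(projectId, job).queue(batch, new JsonBatchCallback<Job>() {
    @Override
    public void onSuccess(Job insertedJob, HttpHeaders responseHeaders) {
        submittedJobIds.add(insertedJob.getJobReference().getJobId());
    }

    @Override
    public void onFailure(GoogleJsonError e, HttpHeaders responseHeaders) {
        // The insert itself was rejected; report this to the client right away.
    }
});

batch.execute(); // returns once the queued inserts have been sent

// Poll each submitted job until it is DONE, then inspect its errorResult.
for (String id : submittedJobIds) {
    Job polled;
    do {
        Thread.sleep(500); // improve this sleep policy
        polled = service.jobs().get(projectId, id).execute();
    } while (!"DONE".equals(polled.getStatus().getState()));
    if (polled.getStatus().getErrorResult() != null) {
        // surface polled.getStatus().getErrorResult().getMessage() to the client
    }
}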

submit a job from eclipse to a running cluster on amazon EMR

I want to add jobs from my Java code in Eclipse to a running EMR cluster to save startup time (creating EC2 instances, bootstrapping, ...).
I know how to run a new cluster from Java code, but it terminates after all the jobs are done.
RunJobFlowRequest runFlowRequest = new RunJobFlowRequest()
        .withName("Some name")
        .withInstances(instances)
        // .withBootstrapActions(bootstrapActions)
        .withJobFlowRole("EMR_EC2_DefaultRole")
        .withServiceRole("EMR_DefaultRole")
        .withSteps(firstJobStep, secondJobStep, thirdJobStep)
        .withLogUri("s3n://path/to/logs");

// Run the jobs
RunJobFlowResult runJobFlowResult = mapReduce.runJobFlow(runFlowRequest);
String jobFlowId = runJobFlowResult.getJobFlowId();
You have to set the KeepJobFlowAliveWhenNoSteps parameter to TRUE, otherwise the cluster will be terminated after executing all the steps. With this property set, the cluster will stay in the WAITING state after executing all the steps.
Add .withKeepJobFlowAliveWhenNoSteps(true) to the JobFlowInstancesConfig (the instances object) in the existing code.
Refer to this doc for further details.
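Once the cluster stays alive, later jobs can be pushed to it from Java. A minimal sketch, assuming the same EMR client (mapReduce) as above and a hypothetical extra StepConfig named fourthJobStep:
// Submit an additional step to the already-running cluster instead of
// starting a new one.
AddJobFlowStepsRequest addRequest = new AddJobFlowStepsRequest()
        .withJobFlowId(jobFlowId)   // id of the running cluster
        .withSteps(fourthJobStep);  // hypothetical StepConfig
AddJobFlowStepsResult addResult = mapReduce.addJobFlowSteps(addRequest);
System.out.println("Submitted step ids: " + addResult.getStepIds());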
