I am trying to get an oozie job status using oozie java API. Currently it is failing with the message
Exception in thread "main" HTTP error code: 401 : Unauthorized
We are using a kerberos authentication in our cluster with a keytab file.
Please guide as how to proceed to implement the authentication.
My current program is:
import org.apache.oozie.client.OozieClient;
public class oozieCheck
{
public static void main(String[] args)
{
// get a OozieClient for local Oozie
OozieClient wc = new OozieClient(
"http://myserver:11000/oozie");
System.out.println(wc.getJobInfo(args[1]));
}
}
I figured a way to use kerberos in my java api.
First obtain kerberos tgt.
Then the below code works:
import java.io.BufferedReader;
import java.io.FileReader;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.Properties;
import org.apache.log4j.Logger;
import org.apache.oozie.client.AuthOozieClient;
import org.apache.oozie.client.WorkflowJob.Status;
public class Wrapper
{
public static AuthOozieClient wc = null;
public static String OOZIE_SERVER_URL="http://localhost:11000/oozie";
public Wrapper ( String oozieUrlStr ) throws MalformedURLException
{
URL oozieUrl = new URL(oozieUrlStr);
// get a OozieClient for local Oozie
wc = new AuthOozieClient(oozieUrl.toString());
}
public static void main ( String [] args )
{
String lineCommon;
String jobId = args[0]; // The first argument is the oozie jobid
try
{
Wrapper client = new Wrapper(OOZIE_SERVER_URL);
Properties conf = wc.createConfiguration();
if(wc != null)
{
// get status of jobid from CLA
try
{
while (wc.getJobInfo(jobId).getStatus() == Status.RUNNING)
{
logger.info("Workflow job running ...");
logger.info("Workflow job ID:["+jobId+"]");
}
if(wc.getJobInfo(jobId).getStatus() == Status.SUCCEEDED)
{
// print the final status of the workflow job
logger.info("Workflow job completed ...");
logger.info(wc.getJobInfo(jobId));
}
else
{
// print the final status of the workflow job
logger.info("Workflow job Failed ...");
logger.info(wc.getJobInfo(jobId));
}
}
catch(Exception e)
{
e.printStackTrace();
}
}
else
{
System.exit(9999);
}
}
catch (Exception e)
{
e.printStackTrace();
}
}
}
You have to patch the oozie client if docs do not mention Kerberos.
Related
I was trying to get this section of code to submit a Hadoop job request based on this code sample:
import com.google.api.gax.longrunning.OperationFuture;
import com.google.cloud.dataproc.v1.Job;
import com.google.cloud.dataproc.v1.JobControllerClient;
import com.google.cloud.dataproc.v1.JobControllerSettings;
import com.google.cloud.dataproc.v1.JobMetadata;
import com.google.cloud.dataproc.v1.JobPlacement;
import com.google.cloud.dataproc.v1.SparkJob;
import com.google.cloud.storage.Blob;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageOptions;
import java.io.IOException;
import java.util.concurrent.ExecutionException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class SubmitJob {
public static void submitJob() throws IOException, InterruptedException {
// TODO(developer): Replace these variables before running the sample.
String projectId = "your-project-id";
String region = "your-project-region";
String clusterName = "your-cluster-name";
submitJob(projectId, region, clusterName);
}
public static void submitJob(
String projectId, String region, String clusterName)
throws IOException, InterruptedException {
String myEndpoint = String.format("%s-dataproc.googleapis.com:443", region);
// Configure the settings for the job controller client.
JobControllerSettings jobControllerSettings =
JobControllerSettings.newBuilder().setEndpoint(myEndpoint).build();
// Create a job controller client with the configured settings. Using a try-with-resources
// closes the client,
// but this can also be done manually with the .close() method.
try (JobControllerClient jobControllerClient =
JobControllerClient.create(jobControllerSettings)) {
// Configure cluster placement for the job.
JobPlacement jobPlacement = JobPlacement.newBuilder().setClusterName(clusterName).build();
// Configure Spark job settings.
HadoopJob hadJob =
HadoopJob.newBuilder()
.setMainClass("my jar file")
.addArgs("input")
.addArgs("output")
.build();
Job job = Job.newBuilder().setPlacement(jobPlacement).setHadoopJob(hadJob).build();
// Submit an asynchronous request to execute the job.
OperationFuture<Job, JobMetadata> submitJobAsOperationAsyncRequest =
jobControllerClient.submitJobAsOperationAsync(projectId, region, job);
// THIS IS WHERE IT SEEMS TO TIMEOUT VVVVVVVV
Job response = submitJobAsOperationAsyncRequest.get();
// Print output from Google Cloud Storage.
Matcher matches =
Pattern.compile("gs://(.*?)/(.*)").matcher(response.getDriverOutputResourceUri());
matches.matches();
Storage storage = StorageOptions.getDefaultInstance().getService();
Blob blob = storage.get(matches.group(1), String.format("%s.000000000", matches.group(2)));
System.out.println(
String.format("Job finished successfully: %s", new String(blob.getContent())));
} catch (ExecutionException e) {
// If the job does not complete successfully, print the error message.
System.err.println(String.format("submitJob: %s ", e.getMessage()));
}
}
}
When running this sample, the code seems to timeout on Job response = submitJobAsOperationAsyncRequest.get(), and the Job is never submitted to my Google Cloud. I've checked all my project, region, and cluster names and I'm sure that is not the issue. I also have the following dependencies installed for the sample:
jar files
I believe I am not missing any .jar files.
Any suggestions? I appreciate any and all help.
I'm not able to find a way to read messages from pub/sub using java.
I'm using this maven dependency in my pom
<dependency>
<groupId>com.google.cloud</groupId>
<artifactId>google-cloud-pubsub</artifactId>
<version>0.17.2-alpha</version>
</dependency>
I implemented this main method to create a new topic:
public static void main(String... args) throws Exception {
// Your Google Cloud Platform project ID
String projectId = ServiceOptions.getDefaultProjectId();
// Your topic ID
String topicId = "my-new-topic-1";
// Create a new topic
TopicName topic = TopicName.create(projectId, topicId);
try (TopicAdminClient topicAdminClient = TopicAdminClient.create()) {
topicAdminClient.createTopic(topic);
}
}
The above code works well and, indeed, I can see the new topic I created using the google cloud console.
I implemented the following main method to write a message to my topic:
public static void main(String a[]) throws InterruptedException, ExecutionException{
String projectId = ServiceOptions.getDefaultProjectId();
String topicId = "my-new-topic-1";
String payload = "Hellooooo!!!";
PubsubMessage pubsubMessage =
PubsubMessage.newBuilder().setData(ByteString.copyFromUtf8(payload)).build();
TopicName topic = TopicName.create(projectId, topicId);
Publisher publisher;
try {
publisher = Publisher.defaultBuilder(
topic)
.build();
publisher.publish(pubsubMessage);
System.out.println("Sent!");
} catch (IOException e) {
System.out.println("Not Sended!");
e.printStackTrace();
}
}
Now I'm not able to verify if this message was really sent.
I would like to implement a message reader using a subscription to my topic.
Could someone show me a correct and working java example about reading messages from a topic?
Anyone can help me?
Thanks in advance!
Here is the version using the google cloud client libraries.
package com.techm.data.client;
import com.google.cloud.pubsub.v1.AckReplyConsumer;
import com.google.cloud.pubsub.v1.MessageReceiver;
import com.google.cloud.pubsub.v1.Subscriber;
import com.google.cloud.pubsub.v1.SubscriptionAdminClient;
import com.google.common.util.concurrent.MoreExecutors;
import com.google.pubsub.v1.ProjectSubscriptionName;
import com.google.pubsub.v1.ProjectTopicName;
import com.google.pubsub.v1.PubsubMessage;
import com.google.pubsub.v1.PushConfig;
/**
* A snippet for Google Cloud Pub/Sub showing how to create a Pub/Sub pull
* subscription and asynchronously pull messages from it.
*/
public class CreateSubscriptionAndConsumeMessages {
private static String projectId = "projectId";
private static String topicId = "topicName";
private static String subscriptionId = "subscriptionName";
public static void createSubscription() throws Exception {
ProjectTopicName topic = ProjectTopicName.of(projectId, topicId);
ProjectSubscriptionName subscription = ProjectSubscriptionName.of(projectId, subscriptionId);
try (SubscriptionAdminClient subscriptionAdminClient = SubscriptionAdminClient.create()) {
subscriptionAdminClient.createSubscription(subscription, topic, PushConfig.getDefaultInstance(), 0);
}
}
public static void main(String... args) throws Exception {
ProjectSubscriptionName subscription = ProjectSubscriptionName.of(projectId, subscriptionId);
createSubscription();
MessageReceiver receiver = new MessageReceiver() {
#Override
public void receiveMessage(PubsubMessage message, AckReplyConsumer consumer) {
System.out.println("Received message: " + message.getData().toStringUtf8());
consumer.ack();
}
};
Subscriber subscriber = null;
try {
subscriber = Subscriber.newBuilder(subscription, receiver).build();
subscriber.addListener(new Subscriber.Listener() {
#Override
public void failed(Subscriber.State from, Throwable failure) {
// Handle failure. This is called when the Subscriber encountered a fatal error
// and is
// shutting down.
System.err.println(failure);
}
}, MoreExecutors.directExecutor());
subscriber.startAsync().awaitRunning();
// In this example, we will pull messages for one minute (60,000ms) then stop.
// In a real application, this sleep-then-stop is not necessary.
// Simply call stopAsync().awaitTerminated() when the server is shutting down,
// etc.
Thread.sleep(60000);
} finally {
if (subscriber != null) {
subscriber.stopAsync().awaitTerminated();
}
}
}
}
This is working fine for me.
The Cloud Pub/Sub Pull Subscriber Guide has sample code for reading messages from a topic.
I haven't used google cloud client libraries but used the api client libraries. Here is how I created a subscription.
package com.techm.datapipeline.client;
import java.io.IOException;
import java.security.GeneralSecurityException;
import com.google.api.client.googleapis.json.GoogleJsonResponseException;
import com.google.api.client.http.HttpStatusCodes;
import com.google.api.services.pubsub.Pubsub;
import com.google.api.services.pubsub.Pubsub.Projects.Subscriptions.Create;
import com.google.api.services.pubsub.Pubsub.Projects.Subscriptions.Get;
import com.google.api.services.pubsub.Pubsub.Projects.Topics;
import com.google.api.services.pubsub.model.ExpirationPolicy;
import com.google.api.services.pubsub.model.Subscription;
import com.google.api.services.pubsub.model.Topic;
import com.techm.datapipeline.factory.PubsubFactory;
public class CreatePullSubscriberClient {
private final static String PROJECT_NAME = "yourProjectId";
private final static String TOPIC_NAME = "yourTopicName";
private final static String SUBSCRIPTION_NAME = "yourSubscriptionName";
public static void main(String[] args) throws IOException, GeneralSecurityException {
Pubsub pubSub = PubsubFactory.getService();
String topicName = String.format("projects/%s/topics/%s", PROJECT_NAME, TOPIC_NAME);
String subscriptionName = String.format("projects/%s/subscriptions/%s", PROJECT_NAME, SUBSCRIPTION_NAME);
Topics.Get listReq = pubSub.projects().topics().get(topicName);
Topic topic = listReq.execute();
if (topic == null) {
System.err.println("Topic doesn't exist...run CreateTopicClient...to create the topic");
System.exit(0);
}
Subscription subscription = null;
try {
Get getReq = pubSub.projects().subscriptions().get(subscriptionName);
subscription = getReq.execute();
} catch (GoogleJsonResponseException e) {
if (e.getStatusCode() == HttpStatusCodes.STATUS_CODE_NOT_FOUND) {
System.out.println("Subscription " + subscriptionName + " does not exist...will create it");
}
}
if (subscription != null) {
System.out.println("Subscription already exists ==> " + subscription.toPrettyString());
System.exit(0);
}
subscription = new Subscription();
subscription.setTopic(topicName);
subscription.setPushConfig(null); // indicating a pull
ExpirationPolicy expirationPolicy = new ExpirationPolicy();
expirationPolicy.setTtl(null); // never expires;
subscription.setExpirationPolicy(expirationPolicy);
subscription.setAckDeadlineSeconds(null); // so defaults to 10 sec
subscription.setRetainAckedMessages(true);
Long _week = 7L * 24 * 60 * 60;
subscription.setMessageRetentionDuration(String.valueOf(_week)+"s");
subscription.setName(subscriptionName);
Create createReq = pubSub.projects().subscriptions().create(subscriptionName, subscription);
Subscription createdSubscription = createReq.execute();
System.out.println("Subscription created ==> " + createdSubscription.toPrettyString());
}
}
And once you create the subscription (pull type)...this is how you pull the messages from the topic.
package com.techm.datapipeline.client;
import java.io.IOException;
import java.security.GeneralSecurityException;
import java.util.ArrayList;
import java.util.List;
import com.google.api.client.googleapis.json.GoogleJsonResponseException;
import com.google.api.client.http.HttpStatusCodes;
import com.google.api.client.util.Base64;
import com.google.api.services.pubsub.Pubsub;
import com.google.api.services.pubsub.Pubsub.Projects.Subscriptions.Acknowledge;
import com.google.api.services.pubsub.Pubsub.Projects.Subscriptions.Get;
import com.google.api.services.pubsub.Pubsub.Projects.Subscriptions.Pull;
import com.google.api.services.pubsub.model.AcknowledgeRequest;
import com.google.api.services.pubsub.model.Empty;
import com.google.api.services.pubsub.model.PullRequest;
import com.google.api.services.pubsub.model.PullResponse;
import com.google.api.services.pubsub.model.ReceivedMessage;
import com.techm.datapipeline.factory.PubsubFactory;
public class PullSubscriptionsClient {
private final static String PROJECT_NAME = "yourProjectId";
private final static String SUBSCRIPTION_NAME = "yourSubscriptionName";
private final static String SUBSCRIPTION_NYC_NAME = "test";
public static void main(String[] args) throws IOException, GeneralSecurityException {
Pubsub pubSub = PubsubFactory.getService();
String subscriptionName = String.format("projects/%s/subscriptions/%s", PROJECT_NAME, SUBSCRIPTION_NAME);
//String subscriptionName = String.format("projects/%s/subscriptions/%s", PROJECT_NAME, SUBSCRIPTION_NYC_NAME);
try {
Get getReq = pubSub.projects().subscriptions().get(subscriptionName);
getReq.execute();
} catch (GoogleJsonResponseException e) {
if (e.getStatusCode() == HttpStatusCodes.STATUS_CODE_NOT_FOUND) {
System.out.println("Subscription " + subscriptionName
+ " does not exist...run CreatePullSubscriberClient to create");
}
}
PullRequest pullRequest = new PullRequest();
pullRequest.setReturnImmediately(false); // wait until you get a message
pullRequest.setMaxMessages(1000);
Pull pullReq = pubSub.projects().subscriptions().pull(subscriptionName, pullRequest);
PullResponse pullResponse = pullReq.execute();
List<ReceivedMessage> msgs = pullResponse.getReceivedMessages();
List<String> ackIds = new ArrayList<String>();
int i = 0;
if (msgs != null) {
for (ReceivedMessage msg : msgs) {
ackIds.add(msg.getAckId());
//System.out.println(i++ + ":===:" + msg.getAckId());
String object = new String(Base64.decodeBase64(msg.getMessage().getData()));
System.out.println("Decoded object String ==> " + object );
}
//acknowledge all the received messages
AcknowledgeRequest content = new AcknowledgeRequest();
content.setAckIds(ackIds);
Acknowledge ackReq = pubSub.projects().subscriptions().acknowledge(subscriptionName, content);
Empty empty = ackReq.execute();
}
}
}
Note: This client only waits until it receives at least one message and terminates if it's receives one (up to a max of value - set in MaxMessages) at once.
Let me know if this helps. I'm going to try the cloud client libraries soon and will post an update once I get my hands on them.
And here's the missing factory class ...if you plan to run it...
package com.techm.datapipeline.factory;
import java.io.IOException;
import java.security.GeneralSecurityException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.logging.Level;
import java.util.logging.Logger;
import com.google.api.client.googleapis.auth.oauth2.GoogleCredential;
import com.google.api.client.googleapis.javanet.GoogleNetHttpTransport;
import com.google.api.client.http.HttpTransport;
import com.google.api.client.json.JsonFactory;
import com.google.api.client.json.jackson2.JacksonFactory;
import com.google.api.services.pubsub.Pubsub;
import com.google.api.services.pubsub.PubsubScopes;
public class PubsubFactory {
private static Pubsub instance = null;
private static final Logger logger = Logger.getLogger(PubsubFactory.class.getName());
public static synchronized Pubsub getService() throws IOException, GeneralSecurityException {
if (instance == null) {
instance = buildService();
}
return instance;
}
private static Pubsub buildService() throws IOException, GeneralSecurityException {
logger.log(Level.FINER, "Start of buildService");
HttpTransport transport = GoogleNetHttpTransport.newTrustedTransport();
JsonFactory jsonFactory = new JacksonFactory();
GoogleCredential credential = GoogleCredential.getApplicationDefault(transport, jsonFactory);
// Depending on the environment that provides the default credentials (for
// example: Compute Engine, App Engine), the credentials may require us to
// specify the scopes we need explicitly.
if (credential.createScopedRequired()) {
Collection<String> scopes = new ArrayList<>();
scopes.add(PubsubScopes.PUBSUB);
credential = credential.createScoped(scopes);
}
logger.log(Level.FINER, "End of buildService");
// TODO - Get the application name from outside.
return new Pubsub.Builder(transport, jsonFactory, credential).setApplicationName("Your Application Name/Version")
.build();
}
}
The message reader is injected on the subscriber. This part of the code will handle the messages:
MessageReceiver receiver =
new MessageReceiver() {
#Override
public void receiveMessage(PubsubMessage message, AckReplyConsumer consumer) {
// handle incoming message, then ack/nack the received message
System.out.println("Id : " + message.getMessageId());
System.out.println("Data : " + message.getData().toStringUtf8());
consumer.ack();
}
};
I am using CuratorFramework (I'm still a newbie) in order to connect to a Zookeeper instance. I would like to import a configuration but before that I would like to test that my program is able to connect to Zookeeper. So far I have something like that:
public Boolean zookeeperRunning() {
CuratorFramework curatorFramework =
CuratorFrameworkFactory.newClient(zookeeperConn, new RetryOneTime(1));
curatorFramework.start();
CuratorZookeeperClient zkClient = curatorFramework.getZookeeperClient();
return zkClient.isConnected();
}
I've already started ZooKeeper on my local machine and I checked the connection with zkCli and the client is able to connect to it. The zookeeperCon variable is set to "127.0.0.1:2181" (I tried with localhost:2181 as well). The problem is that the above method always returns false despite the fact that zkServer is up n running. Most probably, the syntax is not correct but I could not find a solution online. Could you please help me with why the above code cannot find the zkServer which is up and running?
You can use a builder to create a configured client and setup a listener to monitor your zk instance's state:
// start client
client = CuratorFrameworkFactory.builder()
.connectString("localhost:2181")
.retryPolicy(new ExponentialBackoffRetry(1000, 3))
.namespace("heavenize")
.build();
client.getConnectionStateListenable().addListener(new ConnectionStateListener() {
#Override
public void stateChanged(CuratorFramework client, ConnectionState newState)
{
log.info("State changed to: "+newState);
}
});
}
You should first connect to zookeeper after you get the zkClient, if success, then check the isConnected status. Demo code below(Refer: here):
private static CuratorFramework buildConnection(String url) {
CuratorFramework curatorFramework = CuratorFrameworkFactory.newClient(url, new ExponentialBackoffRetry(100, 6));
// start connection
curatorFramework.start();
// wait 3 second to establish connect
try {
curatorFramework.blockUntilConnected(3, TimeUnit.SECONDS);
if (curatorFramework.getZookeeperClient().isConnected()) {
return curatorFramework.usingNamespace("");
}
} catch (InterruptedException ignored) {
Thread.currentThread().interrupt();
}
// fail situation
curatorFramework.close();
throw new RuntimeException("failed to connect to zookeeper service : " + url);
}
you should connect to zookeeper server then check it. for example:
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.test.TestingServer;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;
import java.io.IOException;
import java.util.concurrent.TimeUnit;
import static org.junit.Assert.assertTrue;
public class ZkClientTest {
TestingServer zkServer;
#Before
public void startZookeeper() throws Exception {
zkServer = new TestingServer(2181);
zkServer.start();
}
#After
public void stopZookeeper() throws IOException {
zkServer.stop();
}
#Test
public void should_connect_to_zookeeper_server_when_config_use_default_localhost_2181()
throws InterruptedException {
CuratorFramework client = ZkClient.getInstance().getClient();
try {
client.blockUntilConnected(3, TimeUnit.SECONDS);
assertTrue(ZkClient.getInstance().getClient().getZookeeperClient().isConnected());
} finally {
ZkClient.getInstance().close();
}
}
}
I have some housekeeping tasks within an Elastic Beanstalk Java application running on Tomcat, and I need to run them every so often. I want these tasks run only on the leader node (or, more correctly, on a single node, but the leader seems like an obvious choice).
I was looking at running cron jobs within Elastic Beanstalk, but it feels like this should be more straightforward than what I've come up with. Ideally, I'd like one of these two options within my web app:
Some way of testing within the current JRE whether or not this server is the leader node
Some some way to hit a specific URL (wget?) to trigger the task, but also restrict that URL to requests from localhost.
Suggestions?
It is not possible, by design (leaders are only assigned during deployment, and not needed on other contexts). However, you can tweak and use the EC2 Metadata for this exact purpose.
Here's an working example about how to achieve this result (original source). Once you call getLeader, it will find - or assign - an instance to be set as a leader:
package br.com.ingenieux.resource;
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import javax.inject.Inject;
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import org.apache.commons.io.IOUtils;
import com.amazonaws.services.ec2.AmazonEC2;
import com.amazonaws.services.ec2.model.CreateTagsRequest;
import com.amazonaws.services.ec2.model.DeleteTagsRequest;
import com.amazonaws.services.ec2.model.DescribeInstancesRequest;
import com.amazonaws.services.ec2.model.Filter;
import com.amazonaws.services.ec2.model.Instance;
import com.amazonaws.services.ec2.model.Reservation;
import com.amazonaws.services.ec2.model.Tag;
import com.amazonaws.services.elasticbeanstalk.AWSElasticBeanstalk;
import com.amazonaws.services.elasticbeanstalk.model.DescribeEnvironmentsRequest;
#Path("/admin/leader")
public class LeaderResource extends BaseResource {
#Inject
AmazonEC2 amazonEC2;
#Inject
AWSElasticBeanstalk elasticBeanstalk;
#GET
public String getLeader() throws Exception {
/*
* Avoid running if we're not in AWS after all
*/
try {
IOUtils.toString(new URL(
"http://169.254.169.254/latest/meta-data/instance-id")
.openStream());
} catch (Exception exc) {
return "i-FFFFFFFF/localhost";
}
String environmentName = getMyEnvironmentName();
List<Instance> environmentInstances = getInstances(
"tag:elasticbeanstalk:environment-name", environmentName,
"tag:leader", "true");
if (environmentInstances.isEmpty()) {
environmentInstances = getInstances(
"tag:elasticbeanstalk:environment-name", environmentName);
Collections.shuffle(environmentInstances);
if (environmentInstances.size() > 1)
environmentInstances.removeAll(environmentInstances.subList(1,
environmentInstances.size()));
amazonEC2.createTags(new CreateTagsRequest().withResources(
environmentInstances.get(0).getInstanceId()).withTags(
new Tag("leader", "true")));
} else if (environmentInstances.size() > 1) {
DeleteTagsRequest deleteTagsRequest = new DeleteTagsRequest().withTags(new Tag().withKey("leader").withValue("true"));
for (Instance i : environmentInstances.subList(1,
environmentInstances.size())) {
deleteTagsRequest.getResources().add(i.getInstanceId());
}
amazonEC2.deleteTags(deleteTagsRequest);
}
return environmentInstances.get(0).getInstanceId() + "/" + environmentInstances.get(0).getPublicIpAddress();
}
#GET
#Produces("text/plain")
#Path("am-i-a-leader")
public boolean isLeader() {
/*
* Avoid running if we're not in AWS after all
*/
String myInstanceId = null;
String environmentName = null;
try {
myInstanceId = IOUtils.toString(new URL(
"http://169.254.169.254/latest/meta-data/instance-id")
.openStream());
environmentName = getMyEnvironmentName();
} catch (Exception exc) {
return false;
}
List<Instance> environmentInstances = getInstances(
"tag:elasticbeanstalk:environment-name", environmentName,
"tag:leader", "true", "instance-id", myInstanceId);
return (1 == environmentInstances.size());
}
protected String getMyEnvironmentHost(String environmentName) {
return elasticBeanstalk
.describeEnvironments(
new DescribeEnvironmentsRequest()
.withEnvironmentNames(environmentName))
.getEnvironments().get(0).getCNAME();
}
private String getMyEnvironmentName() throws IOException,
MalformedURLException {
String instanceId = IOUtils.toString(new URL(
"http://169.254.169.254/latest/meta-data/instance-id"));
/*
* Grab the current environment name
*/
DescribeInstancesRequest request = new DescribeInstancesRequest()
.withInstanceIds(instanceId)
.withFilters(
new Filter("instance-state-name").withValues("running"));
for (Reservation r : amazonEC2.describeInstances(request)
.getReservations()) {
for (Instance i : r.getInstances()) {
for (Tag t : i.getTags()) {
if ("elasticbeanstalk:environment-name".equals(t.getKey())) {
return t.getValue();
}
}
}
}
return null;
}
public List<Instance> getInstances(String... args) {
Collection<Filter> filters = new ArrayList<Filter>();
filters.add(new Filter("instance-state-name").withValues("running"));
for (int i = 0; i < args.length; i += 2) {
String key = args[i];
String value = args[1 + i];
filters.add(new Filter(key).withValues(value));
}
DescribeInstancesRequest req = new DescribeInstancesRequest()
.withFilters(filters);
List<Instance> result = new ArrayList<Instance>();
for (Reservation r : amazonEC2.describeInstances(req).getReservations())
result.addAll(r.getInstances());
return result;
}
}
You can keep a secret URL (a long URL is un-guessable, almost as safe as a password), hit this URL from somewhere. On this you can execute the task.
One problem however is that if the task takes too long, then during that time your server capacity will be limited. Another approach would be for the URL hit to post a message to the AWS SQS. The another EC2 can have a code which waits on SQS and execute the task. You can also look into http://aws.amazon.com/swf/
Another approach if you're running on the Linux-type EC2 instance:
Write a shell script that does (or triggers) your periodic task
Leveraging the .ebextensions feature to customize your Elastic Beanstalk instance, create a container command that specifies the parameter leader_only: true -- this command will only run on an instance that is designated the leader in your Auto Scaling group
Have your container command copy your shell script into /etc/cron.hourly (or daily or whatever).
The result will be that your "leader" EC2 instance will have a cron job running hourly (or daily or whatever) to do your periodic task and the other instances in your Auto Scaling group will not.
Constantly monitor a http request which if returns code 200 then no action is taken but if a 404 is returned then the administrator should be alerted via warning or mail.
I wanted to know how to approach it from a Java perspective. The codes available are not very useful.
First of all, you should consider using an existing tool designed for this job (e.g. Nagios or the like). Otherwise you'll likely find yourself rewriting many of the same features. You probably want to send only one email once a problem has been detected, otherwise you'll spam the admin. Likewise you might want to wait until the second or third failure before sending an alert, otherwise you could be sending false alarms. Existing tools do handle these things and more for you.
That said, what you specifically asked for isn't too difficult in Java. Below is a simple working example that should help you get started. It monitors a URL by making a request to it every 30 seconds. If it detects a status code 404 it'll send out an email. It depends on the JavaMail API and requires Java 5 or higher.
public class UrlMonitor implements Runnable {
public static void main(String[] args) throws Exception {
URL url = new URL("http://www.example.com/");
Runnable monitor = new UrlMonitor(url);
ScheduledExecutorService service = Executors.newScheduledThreadPool(1);
service.scheduleWithFixedDelay(monitor, 0, 30, TimeUnit.SECONDS);
}
private final URL url;
public UrlMonitor(URL url) {
this.url = url;
}
public void run() {
try {
HttpURLConnection con = (HttpURLConnection) url.openConnection();
if (con.getResponseCode() == HttpURLConnection.HTTP_NOT_FOUND) {
sendAlertEmail();
}
} catch (IOException e) {
e.printStackTrace();
}
}
private void sendAlertEmail() {
try {
Properties props = new Properties();
props.setProperty("mail.transport.protocol", "smtp");
props.setProperty("mail.host", "smtp.example.com");
Session session = Session.getDefaultInstance(props, null);
Message message = new MimeMessage(session);
message.setFrom(new InternetAddress("me#example.com", "Monitor"));
message.addRecipient(Message.RecipientType.TO,
new InternetAddress("me#example.com"));
message.setSubject("Alert!");
message.setText("Alert!");
Transport.send(message);
} catch (Exception e) {
e.printStackTrace();
}
}
}
I'd start with the quartz scheduler, and create a SimpleTrigger. The SimpleTrigger would use httpclient to create connection and use the JavaMail api to send the mail if an unexpected answer occurred. I'd probably wire it using spring as that has good quartz integration and would allow simple mock implementations for testing.
A quick and dirt example without spring combining Quartz and HttpClient (for JavaMail see How do I send an e-mail in Java?):
imports (so you know where I got the classes from):
import java.io.IOException;
import org.apache.http.HttpResponse;
import org.apache.http.StatusLine;
import org.apache.http.client.ClientProtocolException;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.DefaultHttpClient;
import org.quartz.Job;
import org.quartz.JobExecutionContext;
import org.quartz.JobExecutionException;
code:
public class CheckJob implements Job {
public static final String PROP_URL_TO_CHECK = "URL";
public void execute(JobExecutionContext context)
throws JobExecutionException {
String url = context.getJobDetail().getJobDataMap()
.getString(PROP_URL_TO_CHECK);
System.out.println("Starting execution with URL: " + url);
if (url == null) {
throw new IllegalStateException("No URL in JobDataMap");
}
HttpClient client = new DefaultHttpClient();
HttpGet get = new HttpGet(url);
try {
processResponse(client.execute(get));
} catch (ClientProtocolException e) {
mailError("Got a protocol exception " + e);
return;
} catch (IOException e) {
mailError("got an IO exception " + e);
return;
}
}
private void processResponse(HttpResponse response) {
StatusLine status = response.getStatusLine();
int statusCode = status.getStatusCode();
System.out.println("Received status code " + statusCode);
// You may wish a better check with more valid codes!
if (statusCode <= 200 || statusCode >= 300) {
mailError("Expected OK status code (between 200 and 300) but got " + statusCode);
}
}
private void mailError(String message) {
// See https://stackoverflow.com/questions/884943/how-do-i-send-an-e-mail-in-java
}
}
and the main class which runs forever and checks every 2 minutes:
imports:
import org.quartz.JobDetail;
import org.quartz.SchedulerException;
import org.quartz.SchedulerFactory;
import org.quartz.SimpleScheduleBuilder;
import org.quartz.SimpleTrigger;
import org.quartz.TriggerBuilder;
import org.quartz.impl.StdSchedulerFactory;
code:
public class Main {
public static void main(String[] args) {
JobDetail detail = JobBuilder.newJob(CheckJob.class)
.withIdentity("CheckJob").build();
detail.getJobDataMap().put(CheckJob.PROP_URL_TO_CHECK,
"http://www.google.com");
SimpleTrigger trigger = TriggerBuilder.newTrigger()
.withSchedule(SimpleScheduleBuilder
.repeatMinutelyForever(2)).build();
SchedulerFactory fac = new StdSchedulerFactory();
try {
fac.getScheduler().scheduleJob(detail, trigger);
fac.getScheduler().start();
} catch (SchedulerException e) {
throw new RuntimeException(e);
}
}
}