(JAVA) How to check the status of a step in AWS EMR?

I am using the Java EMR API to run a Pig job on an EMR cluster. I use the following code to add steps to the job flow:
String jobFlowId = "j-assdasd";
AmazonElasticMapReduceClient client = new AmazonElasticMapReduceClient(
credentials);
StepFactory stepFactory = new StepFactory();
StepConfig executePig = new StepConfig()
.withName("Execute Pig")
.withActionOnFailure(ActionOnFailure.CANCEL_AND_WAIT)
.withHadoopJarStep(
stepFactory
.newRunPigScriptStep("s3://bucket/script/load.pig"));
AddJobFlowStepsRequest pig = new AddJobFlowStepsRequest(jobFlowId)
.withSteps( executePig);
AddJobFlowStepsResult result = client.addJobFlowSteps(pig);
How can I get the status of the "Execute Pig" step? I want to make the program wait until the step finishes on EMR.

I found a way to do it in Java:
List<String> id = result.getStepIds();
DescribeStepResult res = client.describeStep(new DescribeStepRequest().withClusterId(jobFlowId).withStepId(id.get(0)));
StepStatus status = res.getStep().getStatus();
String state = status.getState();
But here we still need to loop on the status until it returns COMPLETED.
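The loop mentioned above could look roughly like the sketch below. To keep it self-contained and runnable without the AWS SDK, the state lookup is passed in as a Supplier<String>; in real code that supplier would wrap the describeStep call. The class name StepPoller, the method name waitForTerminalState, and the poll interval and timeout values are illustrative assumptions, not part of the EMR API. The set of terminal states assumes the step states EMR reports (COMPLETED, CANCELLED, FAILED, INTERRUPTED).

```java
import java.time.Duration;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import java.util.function.Supplier;

public class StepPoller {
    // States after which an EMR step is assumed not to change any more.
    private static final Set<String> TERMINAL_STATES =
            Set.of("COMPLETED", "CANCELLED", "FAILED", "INTERRUPTED");

    /**
     * Calls stateSupplier repeatedly until it returns a terminal state.
     * Throws IllegalStateException if the timeout elapses first.
     */
    public static String waitForTerminalState(Supplier<String> stateSupplier,
                                              Duration pollInterval,
                                              Duration timeout) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            String state = stateSupplier.get();
            if (TERMINAL_STATES.contains(state)) {
                return state;
            }
            if (System.nanoTime() >= deadline) {
                throw new IllegalStateException("Timed out; last state: " + state);
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop waiting.
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting", e);
            }
        }
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the real lookup, which would be something like:
        //   () -> client.describeStep(new DescribeStepRequest()
        //           .withClusterId(jobFlowId).withStepId(stepId))
        //           .getStep().getStatus().getState()
        Iterator<String> fakeStates = List.of("PENDING", "RUNNING", "COMPLETED").iterator();
        String finalState = waitForTerminalState(
                fakeStates::next, Duration.ofMillis(10), Duration.ofSeconds(5));
        System.out.println(finalState);
    }
}
```

Against a real cluster a poll interval of several seconds is probably more appropriate, since each check is a remote API call. The caller still has to inspect the returned state, because FAILED and CANCELLED end the loop just like COMPLETED does.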

As Ajay mentioned in his own answer, you need a loop that keeps checking the statuses of the cluster, bootstrap actions, and steps. This post shows how to create such a loop, which keeps the program waiting until a certain status is reached.

Related

ChromeDevTools in selenium, waiting for response bodies

I need to work on an AJAX response, which is one of the responses received when visiting a page. I use Selenium DevTools and Java. I create a listener that intercepts a specific request, and then I want to work on the response it brings. However, I need to set up a static wait, or else Selenium doesn't have time to save the RequestId. I have read the Chrome DevTools documentation, but it's new to me. I wonder if there is a way to wait for this call to complete other than the static wait.
Here is my code:
@Test(groups = "test")
public void x() throws InterruptedException, JsonProcessingException {
User user = User.builder();
ManageAccountStep manageAccountStep = new ManageAccountStep(getDriver());
DashboardPO dashboardPO = new DashboardPO(getDriver());
manageAccountStep.login(user);
DevTools devTools = ((HasDevTools) getDriver()).maybeGetDevTools().orElseThrow();
devTools.createSessionIfThereIsNotOne();
devTools.send(Network.enable(Optional.empty(), Optional.empty(), Optional.empty()));
// end of boilerplate
final RequestId[] id = new RequestId[1];
devTools.addListener(Network.responseReceived(), response -> {
log.info(response.getResponse().getUrl());
if (response.getResponse().getUrl().contains(DESIRED_URL)){
id[0] = response.getRequestId();
}
});
dashboardPO
.clickLink(); // here is when my DESIRED_URL happens
Utils.sleep(5000); // Something like Thread.sleep(5000)
String responseBody = devTools.send(Network.getResponseBody(id[0])).getBody();
// some operations on responseBody
devTools.clearListeners();
devTools.disconnectSession();
}
If I don't use the 5-second wait, the id variable never gets assigned and I get a NullPointerException: requestId is required. During these 5 seconds, log.info prints all the API calls that are happening, and it almost always finds my id. I would like to avoid the static wait, though. I am thinking about something similar to jQuery.active()==0, but my page doesn't use jQuery.

You could try a custom explicit wait, something like this:
public String getResponseBody(WebDriver driver, DevTools devTools, RequestId[] id) {
    return new WebDriverWait(driver, Duration.ofSeconds(5))
            .ignoring(NullPointerException.class)
            .until(d -> devTools.send(Network.getResponseBody(id[0])).getBody());
}
This way it won't always wait the full 5 seconds: as soon as the data is available, it returns from the until method. Also add to ignoring whichever other exception was coming up.
I put these lines in a separate method because anything used inside the lambda has to be final or effectively final; as method parameters, devTools and id are.

I run into this issue when running tests in parallel (and headless) while trying to capture the requests and responses. I get:
{"No data found for resource with given identifier"},"sessionId" ...
However, .until now seems to accept only an ExpectedCondition.
So I use a solution similar to the accepted answer, but without WebDriverWait.until:
public static String getResponseBody(DevTools devTools, RequestId id) throws InterruptedException {
    String requestPostData = "";
    LocalDateTime then = LocalDateTime.now();
    int it = 0;
    while (true) {
        String err = "";
        try {
            requestPostData = devTools.send(Network.getResponseBody(id)).getBody();
        } catch (Exception e) {
            err = String.valueOf(e.getMessage());
        }
        if (requestPostData != null && !requestPostData.equals("")) { break; }
        if (err.equals("")) { break; } // no error message, so the response body may really be an empty string
        long timeTaken = ChronoUnit.SECONDS.between(then, LocalDateTime.now());
        if (timeTaken >= 5) { requestPostData = err + ", timeTaken:" + timeTaken; break; }
        if (it > 0) { TimeUnit.SECONDS.sleep(it); } // I prefer waiting longer and longer between retries
        it++;
    }
    return requestPostData;
}
It just loops until the call no longer errors and returns the string as soon as it can (in practice I set timeTaken >= 60 because of the many parallel requests).
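Another way to avoid the static wait for the RequestId is to have the listener hand the id over through a CompletableFuture, so that the test thread blocks only until the listener has seen the matching response. The sketch below shows just that hand-over with the standard library; the class name FirstMatch and its methods offer and await are hypothetical helpers, and the background thread in main merely simulates the DevTools listener.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class FirstMatch<T> {
    private final CompletableFuture<T> future = new CompletableFuture<>();

    // Called from the event listener; only the first offered value is kept.
    public void offer(T value) {
        future.complete(value);
    }

    // Blocks until a value has been offered or the timeout elapses.
    public T await(long timeout, TimeUnit unit) {
        try {
            return future.get(timeout, unit);
        } catch (TimeoutException e) {
            throw new IllegalStateException("No matching response within timeout", e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }

    public static void main(String[] args) {
        FirstMatch<String> requestId = new FirstMatch<>();
        // Simulates the listener thread reporting a matching response after 50 ms.
        new Thread(() -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException ignored) {
                return;
            }
            requestId.offer("request-1");
        }).start();
        System.out.println(requestId.await(5, TimeUnit.SECONDS));
    }
}
```

In the test from the question, the listener would call offer(response.getRequestId()) where it currently assigns id[0], and await(5, TimeUnit.SECONDS) would take the place of Utils.sleep(5000). Note that a received response does not necessarily mean the body has finished loading, so the call to Network.getResponseBody may still need a retry such as the ones shown in the answers.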

Apache Storm 2.1.0 local DRPC does not return any response although a tuple is well emitted to the collector by the last bolt

I have a problem running a DRPC topology that contains a single bolt and querying it through a local cluster. Debugging with IntelliJ shows that the bolt is indeed executed, but afterwards the JCQueue is stuck in an infinite loop until a timeout is sent to the server.
Here is the code used to build the topology builder:
public static LinearDRPCTopologyBuilder createBuilder()
{
var bolt = new MRedisLookupBolt(createRedisConfiguration(), new RedisTurnoverMapper());
var builder = new LinearDRPCTopologyBuilder("sales");
builder.addBolt(bolt, 1).localOrShuffleGrouping();
return builder;
}
The MRedisLookupBolt is just a very simple implementation of IBasicBolt executing a hget command against Jedis. The execute method of the MRedisLookupBolt is just emitting an instance of Values containing the value for two fields that are declared like this:
declarer.declare(new Fields("id", "Value"));
The topology is built and queried in a unit test like this:
Config conf = new Config();
conf.setDebug(true);
conf.setNumWorkers(1);
try(LocalDRPC drpc = new LocalDRPC())
{
LocalCluster cluster = new LocalCluster();
var builder = BasicRedisRPCTopology.createBuilder();
LocalCluster.LocalTopology topo = cluster.submitTopology(
"Sales-fetch", conf, builder.createLocalTopology(drpc));
var result = drpc.execute("sales", "XXXXX");
System.out.println("################ Result: " + result);
}
catch (Exception e)
{
e.printStackTrace();
}
Reading the logs, I am sure that the data is read correctly by the bolt and that everything is emitted.
But in the end, my test method prints a stack trace. Of course, no value is assigned to the result variable, and the process never reaches the final print statement.
There is something that I am missing here. As I understand it, the JCQueue used by BoltExecutor to retrieve the id of the bolt to execute never finishes, although only one parameter is sent to the local DRPC and only one bolt is declared in the topology. I have already tried adding more bolts to the topology and changing the builder implementation used to create it, with no success.

I found a solution suitable for my use case using Apache Storm 2.1.0.
It seems that, in version 2.1.0, invoking the submitTopology method of the local cluster as proposed by the documentation does not end the executor correctly when the topology is built with the LinearDRPCTopologyBuilder.
Looking more closely at the source code made it possible to understand how to apply the LinearDRPCTopologyBuilder logic directly with the TopologyBuilder.
Here is the change applied to the createBuilder method:
public static TopologyBuilder createBuilder(ILocalDRPC localDRPC)
{
var spout = Optional.ofNullable(localDRPC)
.map(drpc -> new DRPCSpout("sales", drpc))
.orElse(new DRPCSpout("sales"));
var bolt = new MRedisLookupBolt(createRedisConfiguration(), new RedisTurnoverMapper());
var builder = new TopologyBuilder();
builder.setSpout("drpc", spout);
builder.setBolt("redisLookup", bolt, 1)
.shuffleGrouping("drpc");
builder.setBolt("return", new ReturnResults())
.shuffleGrouping("redisLookup");
return builder;
}
And here is an example of execution:
Config conf = new Config();
conf.setDebug(true);
conf.setNumWorkers(1);
try(LocalDRPC drpc = new LocalDRPC())
{
LocalCluster cluster = new LocalCluster();
var builder = BasicRedisRPCTopology.createBuilder(drpc);
cluster.submitTopology("Sales-fetch", conf, builder.createTopology());
var result = drpc.execute("sales", "XXXXX");
System.out.println("################ Result: " + result);
}
catch (Exception e)
{
e.printStackTrace();
}
Unfortunately, this solution does not allow using all the built-in tools of the LinearDRPCTopologyBuilder and means building the whole topology flow 'by hand'. It is also necessary to change the mapper behavior, as the fields are not exposed in the same order as before.

JEST Bulk Request Issue

I am trying to run a Bulk request through JEST. I want to append my data (say, "bills") one item at a time and then execute everything at once. However, when I run the following code on 10 bills, only the last bill gets executed. Can someone please correct this code so that all 10 bills are executed (by executing it outside the for loop, i.e. using a Bulk request)?
for(JSONObject bill : bills) {
bulkRequest = new Bulk.Builder()
.addAction(new Index.Builder(bill.toString()).index(index).type(type).id(id).build())
.build();
}
bulkResponse = Client.execute(bulkRequest);

You need to create the Bulk.Builder outside the loop and then use it to add all the bills:
Bulk.Builder bulkBuilder = new Bulk.Builder();
for (JSONObject bill : bills) {
    bulkBuilder.addAction(new Index.Builder(bill.toString()).index(index).type(type).id(id).build());
}
bulkResponse = Client.execute(bulkBuilder.build());

I know it's an old question, but in case someone stumbles across this, here is a Java 8 (lambdas) way of doing the same thing.
Client.execute(new Bulk.Builder()
        .addAction(bills.stream()
                .map(bill -> new Index.Builder(bill.toString()).index(index).type(type).id(id).build())
                .collect(Collectors.toList()))
        .build());

JT400 - Get spool file generated by a command

I am developing a Java class with JT400 and trying to get the result of the command "dspmsg qsysopr" with:
AS400 as400 = new AS400(system, user, password);
CommandCall cmd = new CommandCall(as400);
cmd.runCommand("dspmsg qsysopr");
I found out that the command runs in a job under the user QUSER, but a spool file with the result is generated under the user "user" specified when I instantiate the AS400 object.
I can successfully run the command, but instead of the messages in the queue I have as result:
"Printer output created."
I get this result with the code:
Job job = cmd.getServerJob();
AS400Message[] messageList = cmd.getMessageList();
for (int i = 0; i < messageList.length; i++) {
System.out.println(messageList[i].getText());
}
Question 1: Is there a way to not receive the messages in a spool file, but to have them returned to me as an AS400Message or something similar?
Not having been able to do so, I am using the following method to get the spool file:
public String getSpoolFile (
String splfname, // splf name
String splfnumbert, // splf number
String jobname, // job name
String jobuser, // job user
String jobfnumber // job number
) throws Exception {
int splno = Integer.parseInt(splfnumbert);
SpooledFile sf = new SpooledFile( as400, // system
splfname, // splf name
splno, // splf number
jobname, // job name
jobuser, // job user
jobfnumber );
PrintParameterList printParms = new PrintParameterList();
printParms.setParameter(PrintObject.ATTR_WORKSTATION_CUST_OBJECT, "/QSYS.LIB/QWPDEFAULT.WSCST");
printParms.setParameter(PrintObject.ATTR_MFGTYPE, "*WSCST");
    // Create a page input stream from the spooled file and read it line by line
    StringBuilder response = new StringBuilder();
    try (BufferedReader reader = new BufferedReader(
            new InputStreamReader(sf.getPageInputStream(printParms)))) {
        String line;
        while ((line = reader.readLine()) != null) {
            response.append(line).append("\n");
        }
    }
    return response.toString();
}
The problem is: I don't have the parameters to call the method "getSpoolFile".
If I manually log in to the AS400, check the spool file details, and call the method with the manually obtained parameters, I successfully get the spool file.
But the Job object I receive from:
Job job = cmd.getServerJob();
After running:
cmd.runCommand("dspmsg qsysopr");
Is not the same Job that created the spool file. For example, If I check:
System.out.println(job.getUser());
I have "QUSER" as result, but the spool file is generated under “user” output queue.
Question 2: How can I get the JOB related to the generation of that spool file?
Question 3: can I also get the parameters related to the spool file generated like the spool file number and spool file name?
I need the following information in order to call the "getSpoolFile" method:
Spool file name
Spool file number
Job name
Job user
Job number
thanks,
Carlos

You probably don't want to "print" the messages at all. You haven't said what you want to do with any messages once you get them (and getting QSYSOPR messages is probably not a good idea in the first place).
You might review AS/400 Message queue filtering - JT400 and then begin thinking how you want to proceed. There is a lot that can be done with messages.

This is not specifically how to retrieve the spool file information, but it will get you the QSYSOPR messages, if that is your ultimate goal.
AS400 sys = new AS400(as400system,username,password);
//Get the user object for QSYSOPR
User u = new User(sys,"qsysopr");
//Get the path to the user's message queue
String qpath = u.getMessageQueue();
//Retrieve the message queue object
MessageQueue queue = new MessageQueue(sys, qpath);
// Get the list of messages currently in this user's queue.
queue.setListDirection(false);
//Get the first 15 messages
QueuedMessage[] qm = queue.getMessages(0,15);
//Loop through the messages
for (int i = qm.length -1; i >=0; i--)
{
System.out.println(qm[i].getText());
}
Of course there are other properties, date, reply status, user that you can retrieve from the QueuedMessage class. No parsing needed.

How to perform Amazon Cloud Search with .net code?

I am learning Amazon CloudSearch, but I couldn't find any sample code in either C# or Java (I am working in C#, but if I can get code in Java I can try converting it to C#).
This is the only C# code I found: https://github.com/Sitefinity-SDK/amazon-cloud-search-sample/tree/master/SitefinityWebApp.
This is one method I found in that code:
public IResultSet Search(ISearchQuery query)
{
AmazonCloudSearchDomainConfig config = new AmazonCloudSearchDomainConfig();
config.ServiceURL = "http://search-index2-cdduimbipgk3rpnfgny6posyzy.eu-west-1.cloudsearch.amazonaws.com/";
AmazonCloudSearchDomainClient domainClient = new AmazonCloudSearchDomainClient("YOUR_ACCESS_KEY_ID", "YOUR_SECRET_ACCESS_KEY", config);
SearchRequest searchRequest = new SearchRequest();
List<string> suggestions = new List<string>();
StringBuilder highlights = new StringBuilder();
highlights.Append("{\'");
if (query == null)
throw new ArgumentNullException("query");
foreach (var field in query.HighlightedFields)
{
if (highlights.Length > 2)
{
highlights.Append(", \'");
}
highlights.Append(field.ToUpperInvariant());
highlights.Append("\':{} ");
SuggestRequest suggestRequest = new SuggestRequest();
Suggester suggester = new Suggester();
suggester.SuggesterName = field.ToUpperInvariant() + "_suggester";
suggestRequest.Suggester = suggester.SuggesterName;
suggestRequest.Size = query.Take;
suggestRequest.Query = query.Text;
SuggestResponse suggestion = domainClient.Suggest(suggestRequest);
foreach (var suggest in suggestion.Suggest.Suggestions)
{
suggestions.Add(suggest.Suggestion);
}
}
highlights.Append("}");
if (query.Filter != null)
{
searchRequest.FilterQuery = this.BuildQueryFilter(query.Filter);
}
if (query.OrderBy != null)
{
searchRequest.Sort = string.Join(",", query.OrderBy);
}
if (query.Take > 0)
{
searchRequest.Size = query.Take;
}
if (query.Skip > 0)
{
searchRequest.Start = query.Skip;
}
searchRequest.Highlight = highlights.ToString();
searchRequest.Query = query.Text;
searchRequest.QueryParser = QueryParser.Simple;
var result = domainClient.Search(searchRequest).SearchResult;
//var result = domainClient.Search(searchRequest).SearchResult;
return new AmazonResultSet(result, suggestions);
}
I have already created a domain in Amazon CloudSearch using the AWS console and uploaded documents using Amazon's predefined configuration option, the IMDb movie JSON file that Amazon provides for demos.
But I don't understand how to use this method. For example, if I want to search for a director's name, how do I pass it in, given that the method's parameter is of type ISearchQuery?

I'd suggest using the official AWS CloudSearch .NET SDK. The library you were looking at seems fine (although I haven't looked at it in any detail), but the official version is more likely to expose new CloudSearch features as soon as they're released, will be supported if you need to talk to AWS support, and so on.
Specifically, take a look at the SearchRequest class -- all its params are strings so I think that obviates your question about ISearchQuery.
I wasn't able to find an example of a query in .NET but this shows someone uploading docs using the AWS .NET SDK. It's essentially the same procedure as querying: creating and configuring a Request object and passing it to the client.
EDIT:
Since you're still having a hard time, here's an example. Bear in mind that I am unfamiliar with C# and have not attempted to run or even compile this but I think it should at least be close to working. It's based off looking at the docs at http://docs.aws.amazon.com/sdkfornet/v3/apidocs/
// Configure the Client that you'll use to make search requests
string queryUrl = @"http://search-<domainname>-xxxxxxxxxxxxxxxxxxxxxxxxxx.us-east-1.cloudsearch.amazonaws.com";
AmazonCloudSearchDomainClient searchClient = new AmazonCloudSearchDomainClient(queryUrl);
// Configure a search request with your query
SearchRequest searchRequest = new SearchRequest();
searchRequest.Query = "potato";
// TODO Set your other params like parser, suggester, etc
// Submit your request via the client and get back a response containing search results
SearchResponse searchResponse = searchClient.Search(searchRequest);
