I am not finding evidence of NodeInitializationAction for Dataproc having run - java

I am specifying a NodeInitializationAction for Dataproc as follows:
ClusterConfig clusterConfig = new ClusterConfig();
clusterConfig.setGceClusterConfig(...);
clusterConfig.setMasterConfig(...);
clusterConfig.setWorkerConfig(...);
List<NodeInitializationAction> initActions = new ArrayList<>();
NodeInitializationAction action = new NodeInitializationAction();
action.setExecutableFile("gs://mybucket/myExecutableFile");
initActions.add(action);
clusterConfig.setInitializationActions(initActions);
Then later:
Cluster cluster = new Cluster();
cluster.setProjectId("wide-isotope-147019");
cluster.setConfig(clusterConfig);
cluster.setClusterName("cat");
Then finally, I invoke the dataproc.create operation with the cluster. I can see the cluster being created, but when I ssh into the master machine ("cat-m" in us-central1-f), I see no evidence of the script I specified having been copied over or run.
So this leads to my questions:
What should I expect in terms of evidence? (edit: I found the script itself in /etc/google-dataproc/startup-scripts/dataproc-initialization-script-0).
Where does the script get invoked from? I know it runs as the user root, but beyond that, I am not sure where to find it. I did not find it in the root directory.
At what point does the Operation returned from the Create call change from "CREATING" to "RUNNING"? Does this happen before or after the script gets invoked, and does it matter if the exit code of the script is non-zero?
Thanks in advance.

Dataproc makes a number of guarantees about init actions:
Each script is downloaded and stored locally in /etc/google-dataproc/startup-scripts/dataproc-initialization-script-0.
The output of each script is captured in a "staging bucket" (either the bucket specified via the --bucket option, or a Dataproc auto-generated bucket). Assuming your cluster is named my-cluster, if you describe the master instance via gcloud compute instances describe my-cluster-m, the exact location is in the dataproc-agent-output-directory metadata key.
The cluster does not enter the RUNNING state (and the Operation does not complete) until all init actions have executed on all nodes. If an init action exits with a non-zero code, or exceeds its specified timeout, it is reported as such (see the sketch after this list).
Similarly, if you resize a cluster, new workers are guaranteed not to join the cluster until each worker is fully configured in isolation.
If you still don't believe me :) inspect the Dataproc agent log in /var/log/google-dataproc-agent-0.log and look for entries from BootstrapActionRunner.
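A minimal sketch of how the timeout above could be attached to the init action from the question, assuming the generated Dataproc Java client exposes executionTimeout as a duration string (verify the field against your client version):

NodeInitializationAction action = new NodeInitializationAction();
action.setExecutableFile("gs://mybucket/myExecutableFile");
// Assumption: executionTimeout takes a duration string such as "600s". If the
// script runs longer than this, the create Operation reports a failure instead
// of waiting indefinitely.
action.setExecutionTimeout("600s");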

How do I use transactions in Spring Data Redis Reactive?

I'm trying to use ReactiveRedisOperations from spring-data-redis 2.1.8 to do transactions, for example:
WATCH mykey
val = GET mykey
val = val + 1
MULTI
SET mykey $val
EXEC
But I cannot seem to find a way to do this when browsing the docs or the ReactiveRedisOperations. Is this not available in the reactive client, or how can you achieve this?
TL;DR: There's no proper support for Redis Transactions using the Reactive API
The reason lies in the execution model: How Redis executes transactions and how the reactive API is supposed to work.
When using transactions, a connection enters transactional state, then commands are queued and finally executed with EXEC. Executing queued commands with exec makes the execution of the individual commands conditional on the EXEC command.
Consider the following snippet (Lettuce code):
RedisReactiveCommands<String, String> commands = …;
commands.multi().then(commands.set("key", "value")).then(commands.exec());
This sequence shows command invocation in a somewhat linear fashion:
Issue MULTI
Once MULTI completes, issue a SET command
Once SET completes, call EXEC
The caveat is with SET: SET only completes after EXEC is called. This means we have a forward reference to the EXEC command, and we cannot listen to a command that is going to be executed in the future.
You could apply a workaround:
RedisReactiveCommands<String, String> commands = …
Mono<TransactionResult> tx = commands.multi()
        .flatMap(ignore -> {
            commands.set("key", "value").doOnNext(…).subscribe();
            return commands.exec();
        });
The workaround would incorporate command subscription within your code (Attention: This is an anti-pattern in reactive programming). After calling exec(), you get the TransactionResult in return.
Please also note: Although you can retrieve results via Mono<TransactionResult>, the actual SET command also emits its result (see doOnNext(…)).
That being said, circling back to the actual question: because these concepts do not work well together, there is no API for transactional use in Spring Data Redis' reactive API.
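If you do need the WATCH/MULTI/EXEC pattern from the question, one option outside the reactive API is Lettuce's synchronous commands, where command queuing maps naturally onto imperative code. A minimal sketch, assuming Lettuce 5.x, a local Redis, and that mykey already holds an integer value:

import io.lettuce.core.RedisClient;
import io.lettuce.core.TransactionResult;
import io.lettuce.core.api.StatefulRedisConnection;
import io.lettuce.core.api.sync.RedisCommands;

public class WatchMultiExecExample {
    public static void main(String[] args) {
        RedisClient client = RedisClient.create("redis://localhost:6379");
        try (StatefulRedisConnection<String, String> connection = client.connect()) {
            RedisCommands<String, String> commands = connection.sync();

            commands.watch("mykey");                                  // WATCH mykey
            long val = Long.parseLong(commands.get("mykey")) + 1;     // val = GET mykey + 1
            commands.multi();                                         // MULTI
            commands.set("mykey", Long.toString(val));                // SET mykey $val (queued)
            TransactionResult result = commands.exec();               // EXEC

            // If mykey changed after WATCH, the transaction is discarded.
            System.out.println("discarded: " + result.wasDiscarded());
        } finally {
            client.shutdown();
        }
    }
}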

Spark: Job restart and retries

Suppose you have Spark with the Standalone cluster manager. You open a Spark session with some configs and want to launch SomeSparkJob 40 times in parallel with different arguments.
Questions
How to set the number of retries on job failure?
How to restart jobs programmatically on failure? This could be useful if jobs fail due to a lack of resources; then I can launch, one by one, all the jobs that require extra resources.
How to restart the Spark application on job failure? This could be useful if jobs lack resources even when launched simultaneously; then, to change cores, CPU, etc. configs, I need to relaunch the application in the Standalone cluster manager.
My workarounds
1) I'm pretty sure the 1st point is possible, since it's possible in Spark local mode. I just don't know how to do that in standalone mode.
2-3) It's possible to attach a listener to the Spark context via spark.sparkContext().addSparkListener(new SparkListener() { ... }). But it seems SparkListener lacks failure callbacks.
Also, there is a bunch of methods with very poor documentation. I've never used them, but perhaps they could help solve my problem.
spark.sparkContext().dagScheduler().runJob();
spark.sparkContext().runJob()
spark.sparkContext().submitJob()
spark.sparkContext().taskScheduler().submitTasks();
spark.sparkContext().dagScheduler().handleJobCancellation();
spark.sparkContext().statusTracker()
You can use SparkLauncher and control the flow.
import org.apache.spark.launcher.SparkLauncher;

public class MyLauncher {
    public static void main(String[] args) throws Exception {
        Process spark = new SparkLauncher()
                .setAppResource("/my/app.jar")
                .setMainClass("my.spark.app.Main")
                .setMaster("local")
                .setConf(SparkLauncher.DRIVER_MEMORY, "2g")
                .launch();
        spark.waitFor();
    }
}
See API for more details.
Since it creates a Process, you can check the Process status and retry, e.g. via:
public boolean isAlive()
If the Process is not alive, start it again; see the API for more details.
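A minimal sketch of such a retry loop, assuming success is signalled by a zero exit code from the launched application (the app resource and class names are placeholders):

import org.apache.spark.launcher.SparkLauncher;

public class RetryingLauncher {
    public static void main(String[] args) throws Exception {
        int maxRetries = 3;
        for (int attempt = 1; attempt <= maxRetries; attempt++) {
            Process spark = new SparkLauncher()
                    .setAppResource("/my/app.jar")       // placeholder path
                    .setMainClass("my.spark.app.Main")   // placeholder class
                    .setMaster("local")
                    .setConf(SparkLauncher.DRIVER_MEMORY, "2g")
                    .launch();
            int exitCode = spark.waitFor(); // blocks until the launched JVM exits
            if (exitCode == 0) {
                break; // job succeeded, stop retrying
            }
            System.err.println("Attempt " + attempt + " failed with exit code " + exitCode);
        }
    }
}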
Hopefully this gives a high-level idea of how to achieve what you mentioned in your question. There could be more ways to do the same thing, but I thought I'd share this approach.
Cheers!
Check your spark.sql.broadcastTimeout and spark.broadcast.blockSize properties and try increasing them.
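If you go down that route, a minimal sketch of raising these properties when building the session (SparkSession from org.apache.spark.sql; the values shown are placeholders, not recommendations):

SparkSession spark = SparkSession.builder()
        .appName("my-app")                              // placeholder name
        .config("spark.sql.broadcastTimeout", "1200")   // seconds; placeholder value
        .config("spark.broadcast.blockSize", "8m")      // placeholder value
        .getOrCreate();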

How to decrease heartbeat time of slave nodes in Hadoop

I am working on AWS EMR.
I want to get information about a dead task node as soon as possible, but per the default settings in Hadoop, the heartbeat is shared only every 10 minutes.
This is the default key-value pair in mapred-default: mapreduce.jobtracker.expire.trackers.interval : 600000 ms
I tried to modify the default value to 6000 ms using this link.
After that, whenever I terminate an EC2 machine from the EMR cluster, I am not able to see the state change that fast (within 6 seconds).
Resource manager REST API - http://MASTER_DNS_NAME:8088/ws/v1/cluster/nodes
Questions-
What is the command to check the mapreduce.jobtracker.expire.trackers.interval value in a running EMR (Hadoop) cluster?
Is this the right key to use to get the state change? If it is not, please suggest another solution.
What is the difference between the DECOMMISSIONING, DECOMMISSIONED, and LOST node states in the Resource Manager UI?
Update
I tried a number of times, but the behaviour is ambiguous. Sometimes the node moves to the DECOMMISSIONING/DECOMMISSIONED state, and sometimes it moves directly to the LOST state after 10 minutes.
I need a quick state change so that I can trigger some event.
Here is my sample code -
List<Configuration> configurations = new ArrayList<Configuration>();
Configuration mapredSiteConfiguration = new Configuration();
mapredSiteConfiguration.setClassification("mapred-site");
Map<String, String> mapredSiteConfigurationMapper = new HashMap<String, String>();
mapredSiteConfigurationMapper.put("mapreduce.jobtracker.expire.trackers.interval", "7000");
mapredSiteConfiguration.setProperties(mapredSiteConfigurationMapper);
Configuration hdfsSiteConfiguration = new Configuration();
hdfsSiteConfiguration.setClassification("hdfs-site");
Map<String, String> hdfsSiteConfigurationMapper = new HashMap<String, String>();
hdfsSiteConfigurationMapper.put("dfs.namenode.decommission.interval", "10");
hdfsSiteConfiguration.setProperties(hdfsSiteConfigurationMapper);
Configuration yarnSiteConfiguration = new Configuration();
yarnSiteConfiguration.setClassification("yarn-site");
Map<String, String> yarnSiteConfigurationMapper = new HashMap<String, String>();
yarnSiteConfigurationMapper.put("yarn.resourcemanager.nodemanagers.heartbeat-interval-ms", "5000");
yarnSiteConfiguration.setProperties(yarnSiteConfigurationMapper);
configurations.add(mapredSiteConfiguration);
configurations.add(hdfsSiteConfiguration);
configurations.add(yarnSiteConfiguration);
These are the settings that I changed in AWS EMR (internally Hadoop) to reduce the time for the state change from RUNNING to another state (DECOMMISSIONING/DECOMMISSIONED/LOST).
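For context, a minimal sketch of how such a Configuration list is typically attached to the cluster request with the AWS EMR Java SDK (classes from com.amazonaws.services.elasticmapreduce.model; the request name and the rest of the cluster spec are placeholders):

// Assumes 'configurations' is the List<Configuration> built above.
RunJobFlowRequest request = new RunJobFlowRequest()
        .withName("my-cluster")                // placeholder name
        .withConfigurations(configurations);   // mapred-site / hdfs-site / yarn-site overrides
// Add instances, roles, release label, etc. as in your usual cluster setup.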
You can use "hdfs getconf". Please refer to this post: Get a yarn configuration from commandline.
These links give info about node manager health-check and the properties you have to check:
https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/ClusterSetup.html
https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/NodeManager.html
Refer "yarn.resourcemanager.nodemanagers.heartbeat-interval-ms" in the below link:
https://hadoop.apache.org/docs/r2.7.1/hadoop-yarn/hadoop-yarn-common/yarn-default.xml
Your queries are answered in this link:
https://issues.apache.org/jira/browse/YARN-914
Refer to the "attachments" and "sub-tasks" areas.
In simple terms, if the currently running application master and task containers get shut down properly (and/or re-initiated on other nodes), then the node manager is said to be DECOMMISSIONED (gracefully); otherwise it is LOST.
Update:
"dfs.namenode.decommission.interval" is for HDFS data node decommissioning, it does not matter if you are concerned only about node manager.
In exceptional cases, data node need not be a compute node.
Try yarn.nm.liveness-monitor.expiry-interval-ms (default 600000 - that is why you reported that the state changed to LOST in 10 minutes, set it to a smaller value as you require) instead of mapreduce.jobtracker.expire.trackers.interval.
You have set "yarn.resourcemanager.nodemanagers.heartbeat-interval-ms" as 5000, which means, the heartbeat goes to resource manager once in 5 seconds, whereas the default is 1000. Set it to a smaller value as you require.
1. hdfs getconf -confKey mapreduce.jobtracker.expire.trackers.interval
2. As mentioned in the other answer, yarn.resourcemanager.nodemanagers.heartbeat-interval-ms should be set based on your network; if your network has high latency, you should set a bigger value.
3. It is in DECOMMISSIONING when there are running containers and it is waiting for them to complete so that those nodes can be decommissioned.
It is in LOST when it is stuck in this process for too long. This state is reached after the set timeout has passed and the decommissioning of the node(s) could not be completed.
DECOMMISSIONED is when the decommissioning of the node(s) completes.
Reference: Resize a Running Cluster
For YARN NodeManager decommissioning, you can manually adjust the time a node waits for decommissioning by setting yarn.resourcemanager.decommissioning.timeout inside /etc/hadoop/conf/yarn-site.xml; this setting is dynamically propagated.
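To act on the state change quickly, one option is to poll the Resource Manager REST endpoint mentioned in the question and trigger your event when a node leaves RUNNING. A rough sketch in plain Java (the host name and polling interval are placeholders; the JSON check here is deliberately naive and should be replaced with real parsing):

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class NodeStatePoller {
    public static void main(String[] args) throws Exception {
        // Placeholder host; same endpoint as in the question.
        URL url = new URL("http://MASTER_DNS_NAME:8088/ws/v1/cluster/nodes");
        while (true) {
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestProperty("Accept", "application/json");
            StringBuilder body = new StringBuilder();
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(conn.getInputStream()))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    body.append(line);
                }
            }
            // Naive check; in real code parse the JSON and inspect each node's "state".
            if (body.indexOf("LOST") >= 0 || body.indexOf("DECOMMISSIONED") >= 0) {
                System.out.println("A node left the RUNNING state - trigger the event here");
                break;
            }
            Thread.sleep(5000); // poll every 5 seconds (placeholder interval)
        }
    }
}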

submit a job from eclipse to a running cluster on amazon EMR

I want to add jobs from my Java code in Eclipse to a running EMR cluster, to save the startup time (creating EC2 instances, bootstrapping, ...).
I know how to run a new cluster from Java code, but it terminates after all jobs are done.
RunJobFlowRequest runFlowRequest = new RunJobFlowRequest()
.withName("Some name")
.withInstances(instances)
// .withBootstrapActions(bootstrapActions)
.withJobFlowRole("EMR_EC2_DefaultRole")
.withServiceRole("EMR_DefaultRole")
.withSteps(firstJobStep, secondJobStep, thirdJobStep)
.withLogUri("s3n://path/to/logs");
// Run the jobs
RunJobFlowResult runJobFlowResult = mapReduce
.runJobFlow(runFlowRequest);
String jobFlowId = runJobFlowResult.getJobFlowId();
You have to set the KeepJobFlowAliveWhenNoSteps parameter to TRUE; otherwise the cluster will be terminated after executing all the steps. If this property is set, the cluster will remain in the WAITING state after executing all the steps.
Add .withKeepJobFlowAliveWhenNoSteps(true) to the existing code.
Refer to this doc for further details.
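Once the cluster stays alive in the WAITING state, you can submit further steps to it later with the same client. A minimal sketch, assuming the AWS SDK's AddJobFlowStepsRequest/AddJobFlowStepsResult (com.amazonaws.services.elasticmapreduce.model) and that someJobStep is a StepConfig built like the ones above:

// 'mapReduce' is the same AmazonElasticMapReduce client used for runJobFlow,
// and 'jobFlowId' is the id returned by getJobFlowId() above.
AddJobFlowStepsRequest addRequest = new AddJobFlowStepsRequest()
        .withJobFlowId(jobFlowId)
        .withSteps(someJobStep);   // placeholder StepConfig, built like firstJobStep

AddJobFlowStepsResult addResult = mapReduce.addJobFlowSteps(addRequest);
System.out.println("Submitted step ids: " + addResult.getStepIds());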

Where is jdoc or samples for Carbonado ReplicatedRepositoryBuilder

Other than bare-bones autogen stuff, I cannot find any documentation or samples that show how to actually use the ReplicatedRepositoryBuilder in Carbonado. In particular, the coding pattern to wire up on both the master and replica process sides.
I am using BDB-JE 5.0.58 with Carbonado 1.2.3. Here is the typical doc I found.
Questions, assuming I have a master producing process and a second client replica process which continually reads from the one-way replicated repository side:
are both processes supposed to have an instance of ReplicatedRepositoryBuilder with roles somehow reversed?
can either side execute resync(), even though I only want that driven from the master side?
how is replication being done, given these are simple library databases? Is there a listener under the covers on each end, doing what amounts to an in-process thread replaying changes? Is this going on at both ends?
is the replica client side supposed to do Repository actions through ReplicatedRepositoryBuilder.getReplicaRepositoryBuilder.build(), or just a normal BDBRepositoryBuilder.build()? I don't see how it is supposed to get a replication-friendly replica Repository handle to work from.
I would be ecstatic to see just a simple Java code example of both the producer and consumer sides, as they would look sitting in different processes on localhost, showing a make-change => resync => other-side-sees-it sequence.
// master producing process
RepositoryBuilder masterBuilder = new BDBRepositoryBuilder(); // ... set some props
RepositoryBuilder replicaBuilder = new BDBRepositoryBuilder(); // ... set some props
ReplicatedRepositoryBuilder builder = new ReplicatedRepositoryBuilder();
builder.setMaster(true);
builder.setMasterRepositoryBuilder(masterBuilder);
builder.setReplicaRepositoryBuilder(replicaBuilder);
Repository repository = builder.build(); // master CRUD and query done via this handle
... do some CRUD to 'repository' ...
ResyncCapability capability = repository.getCapability(ResyncCapability.class);
capability.resync(MyStorableRecord.class, 1.0, null);
// client reader process
// ?
