I am trying to run the following code on my local Mac, where a Spark cluster with a master and slaves is running:
    public void run(String inputFilePath) {
        String master = "spark://192.168.1.199:7077";
        SparkConf conf = new SparkConf()
                .setAppName(WordCountTask.class.getName())
                .setMaster(master);
        JavaSparkContext context = new JavaSparkContext(conf);
        context.textFile(inputFilePath)
                .flatMap(text -> Arrays.asList(text.split(" ")).iterator())
                .mapToPair(word -> new Tuple2<>(word, 1))
                .reduceByKey((a, b) -> a + b)
                .foreach(result -> LOGGER.info(
                        String.format("Word [%s] count [%d].", result._1(), result._2)));
    }
}
However, I get the following exception in the master console:
Error while invoking RpcHandler#receive() on RPC id
5655526795459682754 java.io.EOFException
and the following in the program console:
18/07/01 22:35:19 WARN StandaloneAppClient$ClientEndpoint: Failed to
connect to master 192.168.1.199:7077 org.apache.spark.SparkException:
Exception thrown in awaitResult
This runs fine when I set the master to "local[*]", as given in this example.
I have seen examples where the jar is submitted with the spark-submit command, but I am trying to run it programmatically.
I just realised that the Spark version on the master/slaves was different from the one in the code's POM file. I bumped the version in pom.xml to match the Spark cluster, and it worked.
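A mismatch like this can be caught before the driver tries to connect. The following is only a minimal sketch using plain JDK code: the class and method names are hypothetical, both version strings are placeholders you would supply yourself (for example the version from pom.xml and the one reported by the cluster), and treating "same major.minor" as compatible is an assumption. An exact match, as in the fix above, is the safest choice.

```java
final class SparkVersionCheck {

    // Extracts "major.minor" from a version string such as "2.3.1".
    static String majorMinor(String version) {
        String[] parts = version.trim().split("\\.");
        if (parts.length < 2) {
            throw new IllegalArgumentException("Expected at least major.minor: " + version);
        }
        return parts[0] + "." + parts[1];
    }

    // Assumption: client and cluster are considered compatible when major.minor agree.
    static boolean compatible(String clientVersion, String clusterVersion) {
        return majorMinor(clientVersion).equals(majorMinor(clusterVersion));
    }

    public static void main(String[] args) {
        // Placeholder values; substitute the versions from your build and your cluster.
        System.out.println(compatible("2.3.1", "2.3.0"));
        System.out.println(compatible("2.2.0", "2.3.1"));
    }
}
```

Failing fast on a check like this gives a clearer message than the EOFException / awaitResult errors shown in the question.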
I am using Temporal for running workflows. I have built a jar with my app and am running the following command from the terminal: java -jar build/libs/app-0.0.1-SNAPSHOT.jar
I get the following error when running that command:
Exception in thread "main" io.grpc.StatusRuntimeException: UNKNOWN
at io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:271)
at io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:252)
at io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:165)
at io.temporal.api.workflowservice.v1.WorkflowServiceGrpc$WorkflowServiceBlockingStub.getSystemInfo(WorkflowServiceGrpc.java:4139)
at io.temporal.serviceclient.SystemInfoInterceptor.getServerCapabilitiesOrThrow(SystemInfoInterceptor.java:95)
at io.temporal.serviceclient.ChannelManager.lambda$getServerCapabilities$3(ChannelManager.java:330)
at io.temporal.internal.retryer.GrpcRetryer.retryWithResult(GrpcRetryer.java:60)
at io.temporal.serviceclient.ChannelManager.connect(ChannelManager.java:297)
at io.temporal.serviceclient.WorkflowServiceStubsImpl.connect(WorkflowServiceStubsImpl.java:161)
at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
at java.base/java.lang.reflect.Method.invoke(Method.java:577)
at io.temporal.internal.WorkflowThreadMarker.lambda$protectFromWorkflowThread$1(WorkflowThreadMarker.java:83)
at jdk.proxy1/jdk.proxy1.$Proxy0.connect(Unknown Source)
at io.temporal.worker.WorkerFactory.start(WorkerFactory.java:210)
at com.hok.furlenco.workflow.refundStatusSync.RefundStatusSyncSaga.createWorkFlow(RefundStatusSyncSaga.java:41)
at com.hok.furlenco.workflow.refundStatusSync.RefundStatusSyncSaga.main(RefundStatusSyncSaga.java:17)
Caused by: java.nio.channels.UnsupportedAddressTypeException
at java.base/sun.nio.ch.Net.checkAddress(Net.java:146)
at java.base/sun.nio.ch.Net.checkAddress(Net.java:157)
at java.base/sun.nio.ch.SocketChannelImpl.checkRemote(SocketChannelImpl.java:816)
at java.base/sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:839)
at io.grpc.netty.shaded.io.netty.util.internal.SocketUtils$3.run(SocketUtils.java:91)
at io.grpc.netty.shaded.io.netty.util.internal.SocketUtils$3.run(SocketUtils.java:88)
at java.base/java.security.AccessController.doPrivileged(AccessController.java:569)
at io.grpc.netty.shaded.io.netty.util.internal.SocketUtils.connect(SocketUtils.java:88)
at io.grpc.netty.shaded.io.netty.channel.socket.nio.NioSocketChannel.doConnect(NioSocketChannel.java:322)
at io.grpc.netty.shaded.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.connect(AbstractNioChannel.java:248)
at io.grpc.netty.shaded.io.netty.channel.DefaultChannelPipeline$HeadContext.connect(DefaultChannelPipeline.java:1342)
at io.grpc.netty.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeConnect(AbstractChannelHandlerContext.java:548)
at io.grpc.netty.shaded.io.netty.channel.AbstractChannelHandlerContext.connect(AbstractChannelHandlerContext.java:533)
at io.grpc.netty.shaded.io.netty.channel.ChannelDuplexHandler.connect(ChannelDuplexHandler.java:54)
at io.grpc.netty.shaded.io.grpc.netty.WriteBufferingAndExceptionHandler.connect(WriteBufferingAndExceptionHandler.java:157)
at io.grpc.netty.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeConnect(AbstractChannelHandlerContext.java:548)
at io.grpc.netty.shaded.io.netty.channel.AbstractChannelHandlerContext.access$1000(AbstractChannelHandlerContext.java:61)
at io.grpc.netty.shaded.io.netty.channel.AbstractChannelHandlerContext$9.run(AbstractChannelHandlerContext.java:538)
at io.grpc.netty.shaded.io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174)
at io.grpc.netty.shaded.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:167)
at io.grpc.netty.shaded.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470)
at io.grpc.netty.shaded.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:503)
at io.grpc.netty.shaded.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
at io.grpc.netty.shaded.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.grpc.netty.shaded.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:833)
The app works fine when I run it from the IDE.
The Temporal server is running as a Docker container on my local machine.
**RefundStatusSyncSaga.java**
// gRPC stubs wrapper that talks to the local Docker instance of the Temporal service.
WorkflowServiceStubs service = WorkflowServiceStubs.newLocalServiceStubs();
// client that can be used to start and signal workflows
WorkflowClient client = WorkflowClient.newInstance(service);
// worker factory that can be used to create workers for specific task queues
WorkerFactory factory = WorkerFactory.newInstance(client);
// Worker that listens on a task queue and hosts both workflow and activity implementations.
Worker worker = factory.newWorker(TASK_QUEUE);
// Workflows are stateful. So you need a type to create instances.
worker.registerWorkflowImplementationTypes(RefundSyncWorkflowImpl.class);
// Activities are stateless and thread safe. So a shared instance is used.
RefundStatusActivities tripBookingActivities = new RefundStatusActivitiesImpl();
worker.registerActivitiesImplementations(tripBookingActivities);
// Start all workers created by this factory.
factory.start();
System.out.println("Worker started for task queue: " + TASK_QUEUE);
// now we can start running instances of our saga - its state will be persisted
WorkflowOptions options = WorkflowOptions.newBuilder()
        .setTaskQueue(TASK_QUEUE)
        .setWorkflowId("1")
        .setWorkflowIdReusePolicy(WorkflowIdReusePolicy.WORKFLOW_ID_REUSE_POLICY_REJECT_DUPLICATE)
        .setCronSchedule("* * * * *")
        .build();
RefundSyncWorkflow refundSyncWorkflow = client.newWorkflowStub(RefundSyncWorkflow.class, options);
refundSyncWorkflow.syncRefundStatus();
The complete code can be seen here -> https://github.com/iftekharkhan09/temporal-sample
I also came across this and debugged into the jar. I found that in the check public static InetSocketAddress checkAddress(SocketAddress sa), the SocketAddress becomes /xxx:443 (my original address is xxx:443), and the validation then fails. I still don't know how to solve it.
update: one solution could be found here https://community.temporal.io/t/unable-to-run-temporal-workflow-from-jar/6607
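To see what kind of address actually reaches a check like checkAddress, a small diagnostic can help. This is a sketch using only JDK classes; the class and method names are hypothetical, and the host names and ports in main are placeholders. It does not fix the problem, it only reports whether the address is an InetSocketAddress at all and whether it has been resolved.

```java
import java.net.InetSocketAddress;
import java.net.SocketAddress;

final class AddressDiagnostics {

    // Describes an address in the terms that matter when opening a socket channel.
    static String describe(SocketAddress address) {
        if (address == null) {
            return "null address";
        }
        if (!(address instanceof InetSocketAddress)) {
            return "not an InetSocketAddress: " + address.getClass().getName();
        }
        InetSocketAddress inet = (InetSocketAddress) address;
        if (inet.isUnresolved()) {
            return "unresolved host " + inet.getHostString() + ":" + inet.getPort();
        }
        return "resolved " + inet.getAddress().getHostAddress() + ":" + inet.getPort();
    }

    public static void main(String[] args) {
        // createUnresolved never performs a DNS lookup, so this address stays unresolved.
        System.out.println(describe(InetSocketAddress.createUnresolved("xxx", 443)));
        // An IP literal is resolved without any lookup.
        System.out.println(describe(new InetSocketAddress("127.0.0.1", 7233)));
    }
}
```

If the jar and the IDE run print different descriptions for the same target, that points at how the address is produced in each environment rather than at the server.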
public static void main(String[] args) {
    Properties config = new Properties();
    config.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "mybroker.ip.address:9092");
    AdminClient admin = AdminClient.create(config);
    // names() returns a KafkaFuture<Set<String>>, so get() yields the set of topic names
    Set<String> topicNames = admin.listTopics().names().get();
}
I am catching an ExecutionException with the error messages: org.apache.kafka.common.errors.TimeoutException: Call(callName:listTopics, deadlineMs=1599813311360, tries=1, nextAllowedTryMs=-9223372034707292162) timed out at 9223372036854775807 after 1 attempt(s)
The stack trace points to wrapAndThrow in the class KafkaFutureImpl.
I can't really paste the entire error since I am writing all of this through my mobile phone.
I am using Kafka Clients 2.6.0, JDK 1.8.0_191
This is very weird, since the timeout happens instantly. I have also tried passing arguments to get(time, timeunit), and I get the same result.
EDIT:
I was missing some dependencies in the POM. That resolved the issue.
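When debugging a failure like the one above, it helps to separate "the future completed with an error" from "my own wait ran out". The sketch below uses the JDK's CompletableFuture purely as a stand-in for KafkaFuture (an assumption made so the example runs without Kafka on the classpath); the class name, method name, and the simulated error are hypothetical. An ExecutionException carries the real error as its cause and is thrown as soon as the future has failed, which would be consistent with changing the get(time, timeunit) arguments making no difference.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class FutureCauseDemo {

    // Waits for the future and reports either success, the underlying failure, or a local timeout.
    static String outcome(CompletableFuture<?> future, long timeout, TimeUnit unit) {
        try {
            future.get(timeout, unit);
            return "ok";
        } catch (ExecutionException e) {
            // The future itself failed; the real error is the cause.
            Throwable cause = e.getCause();
            return "failed: " + cause.getClass().getSimpleName() + ": " + cause.getMessage();
        } catch (TimeoutException e) {
            // The future is still pending; only our wait expired.
            return "no result within " + timeout + " " + unit;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        }
    }

    public static void main(String[] args) {
        CompletableFuture<String> failed = new CompletableFuture<>();
        failed.completeExceptionally(new IllegalStateException("simulated broker error"));
        System.out.println(outcome(failed, 1, TimeUnit.SECONDS));

        CompletableFuture<String> pending = new CompletableFuture<>();
        System.out.println(outcome(pending, 10, TimeUnit.MILLISECONDS));
    }
}
```

Logging the cause's class and message, rather than just the ExecutionException, usually makes it easier to tell a configuration or classpath problem from a genuine network timeout.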
I am having a problem executing the deployment at the final stage of my Jenkins job.
Caused by: org.jboss.as.cli.CommandFormatException: Undeploy failed: {"WFLYCTL0062:
Composite operation failed and was rolled back. Steps that failed:" => {"Operation step-1"
=> "WFLYDC0043: Cannot remove deployment abc-web-1.0.101.war from the domain as it is
still used by server groups [abc-demo-latest]"}}
at org.jboss.as.cli.handlers.UndeployHandler.doHandle(UndeployHandler.java:231)
at org.jboss.as.cli.handlers.CommandHandlerWithHelp.handle(CommandHandlerWithHelp.java:86)
at org.jboss.as.cli.impl.CommandContextImpl.handle(CommandContextImpl.java:581)
I do not have access to the admin console at the moment, but I would appreciate any hints or help as to what could be wrong with my configuration.
Below is the Gradle task that is failing:
task removeUnusedArtifactsFromJbossRepository(dependsOn: ['outputTenantSettings', 'ensureValidTenant']) << {
confirmToProceed("This command will remove unused artifacts from the Jboss Repository. Although this won't affect running systems, it may make manual rollbacks more difficult. Are you sure you wish to proceed?");
def serverGroup = getTenantProperties('config.properties').server_group_name;
ModelNode node = new ModelNode();
node.get(ClientConstants.OP).set(ClientConstants.READ_RESOURCE_OPERATION);
node.get(ClientConstants.OP_ADDR).add("/deployment");
ModelNode result = getCommandHelper().getModelControllerClient().execute(node);
def deployments = result.get("result").get("deployment");
if(deployments.getType() == ModelType.UNDEFINED) {
println "No artifacts to remove"
return;
}
for(def deployment : deployments.asList()) {
def deploymentName = deployment.asProperty().name
if(deploymentName.contains("abc-web")) {
println "Attempting to remove Deployment '${deploymentName}'"
try {
executeCliCommand("undeploy ${deploymentName}")
println "Deployment '${deploymentName}' removed"
} catch(java.lang.IllegalArgumentException exception) {
// JBAS014653 means that the file is deployed to servers and can't be removed
// it's ok to ignore this
if(exception.getMessage().contains("JBAS014653")) {
println "Deployment '${deploymentName}' cannot be removed as it's currently in use"
} else {
throw exception;
}
}
}
}
}
After further investigation, I think the script is trying to delete a deployment that is still used by another server group, i.e. abc-demo-latest, which must remain untouched (it should not be undeployed).
Is there a way I can change the script to only undeploy from the newly created server group, before deploying the new release?
I have tried the following:
if(serverGroup.equals("abc-demo-latest")) {
println("Undeploying customer-abc-latest server group deployment ${deploymentName}")
executeCliCommand("undeploy ${deploymentName}")
}else{
println("Undeploying customer-demo-ams-stable server group deployment ${deploymentName}")
executeCliCommand("undeploy ${deploymentName} --server-groups=other-server-group --keep-content")
}
But got the following error:
Failed to execute goal org.apache.maven.plugins:maven-deploy-plugin:2.7:
deploy (default-deploy) on project abc-parent: Failed to deploy artifacts:
Could not transfer artifact abc-parent:pom:1.0.26 from/to deployment (http://x.y.z:8080/nexus/content/repositories/releases/):
Failed to transfer file: http://x.y.z:8080/nexus/content/repositories/releases/abc-parent/1.0.26/abc-parent-1.0.26.pom.
Return code is: 400, ReasonPhrase: Bad Request
The key is in the error you posted there:
WFLYDC0043: Cannot remove deployment abc-web-1.0.101.war from the
domain as it is still used by server groups [abc-demo-latest]
You might not have undeployed for all of the server groups and that's why you can't undeploy from the domain. See the docs for undeploying. There is a command where you can undeploy from all of the relevant groups:
undeploy * --all-relevant-server-groups
Or just one server group:
undeploy * --server-groups=other-server-group
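Applied to the script in the question, the idea could look roughly like the sketch below. It is plain Java with hypothetical class and method names: it only builds the CLI command string for a single server group and recognises the "still in use" message. The --keep-content flag is taken from the asker's own attempt, and the two error codes are the ones that appear in the question (WFLYDC0043 in the stack trace, JBAS014653 in the script comment). Whether your setup needs --keep-content is an assumption to verify against the undeploy documentation.

```java
final class UndeployCommands {

    // Builds an undeploy command scoped to one server group.
    static String undeployFromGroup(String deploymentName, String serverGroup, boolean keepContent) {
        StringBuilder command = new StringBuilder("undeploy ")
                .append(deploymentName)
                .append(" --server-groups=")
                .append(serverGroup);
        if (keepContent) {
            // Leaves the content in the repository so other groups can keep using it.
            command.append(" --keep-content");
        }
        return command.toString();
    }

    // True when the message says the deployment is still referenced elsewhere.
    static boolean isStillInUse(String message) {
        return message != null
                && (message.contains("WFLYDC0043") || message.contains("JBAS014653"));
    }

    public static void main(String[] args) {
        System.out.println(undeployFromGroup("abc-web-1.0.101.war", "other-server-group", true));
        System.out.println(isStillInUse("WFLYDC0043: Cannot remove deployment abc-web-1.0.101.war"));
    }
}
```

Note that the script in the question catches IllegalArgumentException and looks for JBAS014653, while the failure shown is a CommandFormatException carrying WFLYDC0043, so the existing "ignore if in use" branch would not match this error.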
So I tried to configure my job to be submitted to YARN, but instead it runs locally:
config.set("yarn.resourcemanager.address", "ADDRESS:8032");
config.set("mapreduce.framework.name", "yarn");
config.set("fs.default.name", "hdfs://ADDRESS:8020");
If I set mapred.job.tracker, it fails with:
Exception in thread "main" org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.RpcServerException): Unknown rpc kind in rpc headerRPC_WRITABLE
because it's not MR1.
So why is the app not submitted to YARN?
Solved it by doing this:
config.set("yarn.resourcemanager.address", "ADDRESS:8032");
config.set("yarn.resourcemanager.scheduler.address", "ADDRESS:8030");
config.set("yarn.resourcemanager.resource-tracker.address", "ADDRESS:8031");
config.set("yarn.resourcemanager.admin.address", "ADDRESS:8033");
instead of:
config.set("yarn.resourcemanager.address", "ADDRESS:8032");
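To avoid repeating the host in four places, the settings can be generated from a single value. This is a minimal sketch in plain Java with a hypothetical class and method name; the property keys and ports are exactly the ones from the fix above, and "ADDRESS" remains a placeholder for the ResourceManager host.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class YarnClientSettings {

    // Returns the four ResourceManager endpoints used in the fix, keyed by property name.
    static Map<String, String> resourceManagerAddresses(String host) {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("yarn.resourcemanager.address", host + ":8032");
        settings.put("yarn.resourcemanager.scheduler.address", host + ":8030");
        settings.put("yarn.resourcemanager.resource-tracker.address", host + ":8031");
        settings.put("yarn.resourcemanager.admin.address", host + ":8033");
        return settings;
    }

    public static void main(String[] args) {
        // "ADDRESS" is the placeholder from the question; use your ResourceManager host.
        resourceManagerAddresses("ADDRESS")
                .forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

Each entry can then be applied to the job configuration with config.set(key, value), the same call used in the snippets above.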
I am trying to write a YARN application master that registers itself in YARN's registry (Hadoop 2.6).
In essence this is what the application master is trying to do:
ApplicationId id = ...
String path = ...
YarnConfiguration conf = new YarnConfiguration();
RegistryOperations registryOperations = RegistryOperationsFactory.createInstance(conf);
ServiceRecord record = new ServiceRecord();
record.set(YarnRegistryAttributes.YARN_ID, id);
record.set(YarnRegistryAttributes.YARN_PERSISTENCE, PersistencePolicies.APPLICATION_ATTEMPT);
registryOperations.bind(path, record, BindFlags.CREATE | BindFlags.OVERWRITE);
When submitting this code to Hadoop 2.6, I get the following exception:
org.apache.hadoop.service.ServiceStateException: Service RegistryOperations is in wrong state: INITED
at org.apache.hadoop.registry.client.impl.zk.CuratorService.checkServiceLive(CuratorService.java:184)
at org.apache.hadoop.registry.client.impl.zk.CuratorService.zkSet(CuratorService.java:633)
at org.apache.hadoop.registry.client.impl.zk.RegistryOperationsService.bind(RegistryOperationsService.java:114)
...
Googling the problem yielded no usable results, so I tried inspecting the relevant YARN source code, so far without success.
Is anyone else having this problem? Any ideas about what is causing it or how to solve it?
According to the RegistryOperationsFactory javadoc, calling any of the create* functions initializes the resulting RegistryOperations instance.
What I didn't know is that although RegistryOperationsFactory initializes it, it still needs to be started. So this code works:
ApplicationId id = ...
String path = ...
YarnConfiguration conf = new YarnConfiguration();
RegistryOperations registryOperations = RegistryOperationsFactory.createInstance(conf);
registryOperations.start();
ServiceRecord record = new ServiceRecord();
record.set(YarnRegistryAttributes.YARN_ID, id);
record.set(YarnRegistryAttributes.YARN_PERSISTENCE, PersistencePolicies.APPLICATION_ATTEMPT);
registryOperations.bind(path, record, BindFlags.CREATE | BindFlags.OVERWRITE);
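The difference between "initialized" and "started" is easier to see in a toy model. The class below is not Hadoop code: it is a hypothetical, JDK-only sketch of a service with an INITED and a STARTED state, written to mirror the "is in wrong state: INITED" message from the exception. The state names and the path are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.List;

final class ToyRegistryService {

    enum State { NOTINITED, INITED, STARTED }

    private State state = State.NOTINITED;
    private final List<String> boundPaths = new ArrayList<>();

    // Corresponds to what a factory would do before handing the service over.
    void init() {
        state = State.INITED;
    }

    // Must be called explicitly; initialization alone is not enough.
    void start() {
        if (state != State.INITED) {
            throw new IllegalStateException("Cannot start from state " + state);
        }
        state = State.STARTED;
    }

    // Operations are only allowed once the service is STARTED.
    int bind(String path) {
        if (state != State.STARTED) {
            throw new IllegalStateException("Service is in wrong state: " + state);
        }
        boundPaths.add(path);
        return boundPaths.size();
    }

    public static void main(String[] args) {
        ToyRegistryService service = new ToyRegistryService();
        service.init();
        try {
            service.bind("/users/demo/app");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
        service.start();
        System.out.println(service.bind("/users/demo/app"));
    }
}
```

In the real code the same ordering applies: create the instance, call start(), and only then call bind().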