I successfully built Tez 0.6.0 against Hadoop 2.5.2.
Then I configured Tez 0.6.0 as described in http://tez.apache.org/install.html
I moved the Tez lib package to an HDFS location and updated my tez-site.xml:
<property>
<name>tez.lib.uris</name>
<value>${fs.default.name}/apps/Tez/,${fs.default.name}/apps/Tez/lib/</value>
</property>
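As an aside, the run log below warns that fs.default.name is deprecated in favor of fs.defaultFS, and shows the URIs resolving to hdfs://HA-Cluster/... . A hedged alternative is to spell the nameservice out explicitly rather than relying on variable substitution, for example:
<property>
<name>tez.lib.uris</name>
<value>hdfs://HA-Cluster/apps/Tez/,hdfs://HA-Cluster/apps/Tez/lib/</value>
</property>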
After that I tried the sample test for Tez:
hadoop jar tez-examples-0.6.0.jar orderedwordcount <input> <output>
But I get the following error while running this command:
Running OrderedWordCount
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/C:/Hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/C:/Tez/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
15/04/15 10:47:57 INFO client.TezClient: Tez Client Version: [ component=tez-api, version=0.6.0, revision=${buildNumber}, SCM-URL=scm:git:https://git-wip-us.apache.org/repos/asf/tez.git, buildTime=2015-04-15T01:13:02Z ]
15/04/15 10:48:00 INFO client.TezClient: Submitting DAG application with id: application_1429073725727_0005
15/04/15 10:48:00 INFO Configuration.deprecation: fs.default.name is deprecated. Instead, use fs.defaultFS
15/04/15 10:48:00 INFO client.TezClientUtils: Using tez.lib.uris value from configuration: hdfs://HA-Cluster/apps/Tez/,hdfs://HA-Cluster/apps/Tez/lib/
15/04/15 10:48:01 INFO client.TezClient: Stage directory /tmp/app/tez/staging doesn't exist and is created
15/04/15 10:48:01 INFO client.TezClient: Tez system stage directory hdfs://HA-cluster/tmp/app/tez/staging/.tez/application_1429073725727_0005 doesn't exist and is created
15/04/15 10:48:02 INFO client.TezClient: Submitting DAG to YARN, applicationId=application_1429073725727_0005, dagName=OrderedWordCount
15/04/15 10:48:03 INFO impl.YarnClientImpl: Submitted application application_1429073725727_0005
15/04/15 10:48:03 INFO client.TezClient: The url to track the Tez AM: http://syncserver34:8088/proxy/application_1429073725727_0005/
15/04/15 10:48:03 INFO client.DAGClientImpl: Waiting for DAG to start running
15/04/15 10:48:09 INFO client.DAGClientImpl: DAG completed. FinalState=FAILED
OrderedWordCount failed with diagnostics: [Application application_1429073725727_0005 failed 2 times due to AM Container for appattempt_1429073725727_0005_000002 exited with exitCode: -1073741515 due to: Exception from container-launch: ExitCodeException exitCode=-1073741515:
ExitCodeException exitCode=-1073741515:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
at org.apache.hadoop.util.Shell.run(Shell.java:455)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:300)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
1 file(s) moved.
Container exited with a non-zero exit code -1073741515
.Failing this attempt.. Failing the application.]
Looking at the ResourceManager log:
15/04/15 12:56:15 ERROR scheduler.SchedulerApplicationAttempt: Error trying to assign container token and NM token to an allocated container container_1429082271173_0001_01_000001
java.lang.IllegalArgumentException: java.net.UnknownHostException: MasterNode
at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:373)
at org.apache.hadoop.yarn.server.utils.BuilderUtils.newContainerToken(BuilderUtils.java:247)
at org.apache.hadoop.yarn.server.resourcemanager.security.RMContainerTokenSecretManager.createContainerToken(RMContainerTokenSecretManager.java:199)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.pullNewlyAllocatedContainersAndNMTokens(SchedulerApplicationAttempt.java:425)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.getAllocation(FiCaSchedulerApp.java:248)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocate(CapacityScheduler.java:736)
at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:816)
at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:809)
at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:649)
at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:104)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:761)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:742)
at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
at java.lang.Thread.run(Thread.java:744)
Caused by: java.net.UnknownHostException: MasterNode
... 19 more
The problem might be that, while connecting to the NodeManager, it is unable to handshake with the ResourceManager.
If I try it on a single-node Hadoop cluster, it works correctly.
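Since the ResourceManager log ends in java.net.UnknownHostException: MasterNode, it is also worth confirming that every machine can resolve the hostname MasterNode. A hedged sketch of a hosts-file entry (C:\Windows\System32\drivers\etc\hosts on Windows, /etc/hosts on Linux; the IP address below is made up for illustration):
10.10.50.1   MasterNode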
Try adding the following property (in yarn-site.xml) so the container launch directory is kept around for debugging:
<property>
<name>yarn.nodemanager.delete.debug-delay-sec</name>
<value>1200</value>
</property>
One thing: while running the "launchcontainer.cmd" located under the Hadoop \tmp..\appcache location, an issue arises in accessing a DLL needed for running MapReduce on the Windows platform, i.e. MSVCR100.dll is missing to handle the Tez job, as below:
"The program can't start because MSVCR100.dll is missing from your
computer. Try reinstalling the program to fix this issue"
Give full privileges to the hadoop-tmp directory and make sure msvcr100.dll is present in C:\Windows\System32 on the Windows machine so the MapReduce program for the Tez job can run.
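A minimal sketch of that check from an elevated Command Prompt (the source path in the copy is a made-up example; msvcr100.dll ships with the Microsoft Visual C++ 2010 Redistributable, so installing that package is the cleaner route):
:: check whether the runtime DLL is already visible on this machine
where msvcr100.dll
:: if not, copy it into System32 from wherever the VC++ 2010 runtime files were unpacked (example path)
copy "C:\vcredist2010\msvcr100.dll" "C:\Windows\System32\msvcr100.dll"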
Related
I want to deploy Alluxio on a cluster with HA. My CDH version is 3.0.0+cdh6.3.2.
I built Alluxio against a specific Hadoop release version:
mvn install -Phadoop-3 -Dhadoop.version=3.0.0 -DskipTests
I put alluxio-assembly-server-2.4.1-2-SNAPSHOT-jar-with-dependencies.jar and alluxio-underfs-hdfs-2.4.1-2-SNAPSHOT-jar-with-dependencies.jar into the lib/ folder of Alluxio on every node.
/opt/alluxio-2.4.1-1/conf/alluxio-site.properties:
alluxio.master.mount.table.root.ufs=hdfs://nameservice1/alluxio/data
alluxio.master.journal.type=UFS
alluxio.master.journal.folder=hdfs://nameservice1/alluxio/journal/
alluxio.master.security.impersonation.root.users=*
alluxio.worker.tieredstore.level0.dirs.quota=10GB
alluxio.worker.tieredstore.level1.dirs.quota=10GB
alluxio.worker.tieredstore.level2.dirs.quota=10GB
alluxio.zookeeper.enabled=true
alluxio.zookeeper.address=test-cdh001:2181,test-cdh002:2181,test-cdh003:2181
alluxio.underfs.hdfs.configuration=/etc/hadoop/conf/core-site.xml:/etc/hadoop/conf/hdfs-site.xml
When I format the Alluxio cluster with the following command on one of the master nodes:
./bin/alluxio format
I get an error:
Executing the following command on all worker nodes and logging to /opt/alluxio-2.4.1-1/logs/task.log: /opt/alluxio-2.4.1-1/bin/alluxio formatWorker
Waiting for tasks to finish...
All tasks finished
Formatting Alluxio Master # test-cdh001
2021-01-07 18:35:58,766 INFO Format - Formatting master journal: hdfs://nameservice1/alluxio/journal/
2021-01-07 18:35:58,806 INFO ExtensionFactoryRegistry - Loading core jars from /opt/alluxio-2.4.1-1/lib
2021-01-07 18:35:58,869 INFO ExtensionFactoryRegistry - Loading extension jars from /opt/alluxio-2.4.1-1/extensions
2021-01-07 18:35:58,886 WARN ExtensionFactoryRegistry - No factory implementation supports the path hdfs://nameservice1/alluxio/journal/BlockMaster
2021-01-07 18:35:58,887 INFO ExtensionFactoryRegistry - Loading core jars from /opt/alluxio-2.4.1-1/lib
2021-01-07 18:35:58,906 INFO ExtensionFactoryRegistry - Loading extension jars from /opt/alluxio-2.4.1-1/extensions
2021-01-07 18:35:58,915 WARN ExtensionFactoryRegistry - No factory implementation supports the path hdfs://nameservice1/alluxio/journal/BlockMaster
2021-01-07 18:35:58,915 ERROR Format - Failed to format
java.lang.IllegalArgumentException: No Under File System Factory found for: hdfs://nameservice1/alluxio/journal/BlockMaster
at alluxio.underfs.UnderFileSystem$Factory.create(UnderFileSystem.java:95)
at alluxio.master.journal.ufs.UfsJournal.<init>(UfsJournal.java:149)
at alluxio.master.journal.ufs.UfsJournalSystem.createJournal(UfsJournalSystem.java:73)
at alluxio.master.journal.ufs.UfsJournalSystem.createJournal(UfsJournalSystem.java:47)
at alluxio.cli.Format.format(Format.java:120)
at alluxio.cli.Format.main(Format.java:97)
Any help would be much appreciated.
Add your Hadoop version to the mount command with "--option alluxio.underfs.version=".
Ex. alluxio fs mount --option alluxio.underfs.version=3.2 /mnt/hdfs/emr hdfs://hostname:8020/tmp/emr
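Since this question configures the root UFS in alluxio-site.properties rather than mounting it with alluxio fs mount, the equivalent would be to pass the same option to the root mount. A hedged sketch, assuming Alluxio 2.x's root-mount option syntax (the version string is an assumption; use whichever value matches your Hadoop/CDH line and your Alluxio build):
# hedged: check which alluxio.underfs.version values your Alluxio build supports
alluxio.master.mount.table.root.option.alluxio.underfs.version=3.0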
The command below resolves all required dependencies to execute my program:
mvn exec:java -Dexec.mainClass='com.cnrg.cdproc.App'
but I can't connect to MongoDB with it, whereas the same code works when executed from the jar with: java -jar target/cdproc-1.0.0-SNAPSHOT-shaded.jar.
Here's the error I get when starting the app via Maven:
[Thread-1] INFO org.mongodb.driver.cluster - Cluster created with settings {hosts=[localhost:9032], mode=SINGLE, requiredClusterType=UNKNOWN, serverSelectionTimeout='30000 ms'}
[cluster-ClusterId{value='5fc969c67d495439ce9a3796', description='null'}-localhost:9032] INFO org.mongodb.driver.cluster - Exception in monitor thread while connecting to server localhost:9032
com.mongodb.MongoInterruptedException: Interrupted acquiring a permit to retrieve an item from the pool
at com.mongodb.internal.connection.ConcurrentPool.acquirePermit(ConcurrentPool.java:203)
at com.mongodb.internal.connection.ConcurrentPool.get(ConcurrentPool.java:140)
at com.mongodb.internal.connection.ConcurrentPool.get(ConcurrentPool.java:123)
at com.mongodb.internal.connection.PowerOfTwoBufferPool.getByteBuffer(PowerOfTwoBufferPool.java:82)
at com.mongodb.internal.connection.PowerOfTwoBufferPool.getBuffer(PowerOfTwoBufferPool.java:77)
at com.mongodb.internal.connection.SocketStream.getBuffer(SocketStream.java:93)
at com.mongodb.internal.connection.InternalStreamConnection.getBuffer(InternalStreamConnection.java:684)
at com.mongodb.internal.connection.ByteBufferBsonOutput.getByteBufferAtIndex(ByteBufferBsonOutput.java:93)
at com.mongodb.internal.connection.ByteBufferBsonOutput.getCurrentByteBuffer(ByteBufferBsonOutput.java:82)
at com.mongodb.internal.connection.ByteBufferBsonOutput.writeByte(ByteBufferBsonOutput.java:77)
at org.bson.io.OutputBuffer.write(OutputBuffer.java:150)
at org.bson.io.OutputBuffer.writeInt32(OutputBuffer.java:56)
at com.mongodb.internal.connection.RequestMessage.writeMessagePrologue(RequestMessage.java:158)
at com.mongodb.internal.connection.RequestMessage.encode(RequestMessage.java:137)
at com.mongodb.internal.connection.CommandMessage.encode(CommandMessage.java:59)
at com.mongodb.internal.connection.InternalStreamConnection.sendAndReceive(InternalStreamConnection.java:269)
at com.mongodb.internal.connection.CommandHelper.sendAndReceive(CommandHelper.java:83)
at com.mongodb.internal.connection.CommandHelper.executeCommand(CommandHelper.java:33)
at com.mongodb.internal.connection.InternalStreamConnectionInitializer.initializeConnectionDescription(InternalStreamConnectionInitializer.java:107)
at com.mongodb.internal.connection.InternalStreamConnectionInitializer.initialize(InternalStreamConnectionInitializer.java:62)
at com.mongodb.internal.connection.InternalStreamConnection.open(InternalStreamConnection.java:144)
at com.mongodb.internal.connection.DefaultServerMonitor$ServerMonitorRunnable.lookupServerDescription(DefaultServerMonitor.java:188)
at com.mongodb.internal.connection.DefaultServerMonitor$ServerMonitorRunnable.run(DefaultServerMonitor.java:144)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.lang.InterruptedException
at java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1343)
at java.base/java.util.concurrent.Semaphore.acquire(Semaphore.java:318)
at com.mongodb.internal.connection.ConcurrentPool.acquirePermit(ConcurrentPool.java:199)
... 23 more
What's the difference between starting the program through the jar and starting it through Maven? How can I troubleshoot this issue, if possible, and start the program from Maven?
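One difference worth noting: exec:java runs the main class inside the Maven JVM and by default interrupts any daemon threads still alive when main() returns, which is consistent with the MongoInterruptedException thrown in the driver's monitor thread above. A hedged thing to try is turning that cleanup off via the plugin's exec.cleanupDaemonThreads switch:
mvn exec:java -Dexec.mainClass='com.cnrg.cdproc.App' -Dexec.cleanupDaemonThreads=false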
I am trying to show all the services using the jps command, but when I run it on the console only the nodes below are showing:
3633 SecondaryNameNode
4228 Jps
3493 DataNode
4198 NodeManager
4088 ResourceManager
I am trying to start all the services using start-dfs.sh and start-yarn.sh, but even after that the result is the same. I went into the logs to find the exception and saw the one below:
2018-06-29 16:02:31,414 INFO org.mortbay.log: Stopped HttpServer2$SelectChannelConnectorWithSafeStartup#0.0.0.0:50070
2018-06-29 16:02:31,414 WARN org.apache.hadoop.http.HttpServer2: HttpServer Acceptor: isRunning is false. Rechecking.
2018-06-29 16:02:31,416 WARN org.apache.hadoop.http.HttpServer2: HttpServer Acceptor: isRunning is false
2018-06-29 16:02:31,423 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping NameNode metrics system...
2018-06-29 16:02:31,425 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system stopped.
2018-06-29 16:02:31,425 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system shutdown complete.
2018-06-29 16:02:31,425 FATAL org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode.
java.io.IOException: Failed to load an FSImage file!
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:673)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:281)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1006)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:736)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:531)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:587)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:754)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:738)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1427)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1493)
2018-06-29 16:02:31,428 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2018-06-29 16:02:31,454 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at ubuntu/127.0.1.1
************************************************************/
I have no clue how to solve this, please help. I am using hadoop-2.5.0-cdh5.3.2.
Follow these steps:
Check the path to your FSImage, i.e. where the NameNode is storing the FSImage. In my case it is /hadoop/hdfs/namenode/current
Check the last created FSImage in the NameNode and the Secondary NameNode, and find the latest FSImage available.
Copy the latest FSImage from the Secondary NameNode to the NameNode with the same permissions it had on the Secondary NameNode (by default, hdfs:hadoop in my case); a sketch of this step follows this list.
After copying, try restarting all the services.
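A minimal sketch of the copy step, with a hypothetical checkpoint directory and transaction id (adjust dfs.namenode.name.dir / dfs.namenode.checkpoint.dir and the file names to your own cluster):
# copy the newest checkpoint image (and its .md5 companion) from the Secondary NameNode
scp secondarynn:/hadoop/hdfs/namesecondary/current/fsimage_0000000000000001234* /hadoop/hdfs/namenode/current/
# restore the ownership the image had on the Secondary NameNode
chown hdfs:hadoop /hadoop/hdfs/namenode/current/fsimage_0000000000000001234*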
Format the namenode: "hdfs namenode -format"
Now, ensure the clusterID of the namenode and the datanode is the same. If not, replace one with the other.
In my case, the VERSION files are:
/Path_installation_dir/hdata/dfs/name/current/VERSION
/Path_installation_dir/hdata/dfs/data/current/VERSION
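A quick way to compare the two (a hedged sketch using the paths above):
# the clusterID lines in both VERSION files must match
grep clusterID /Path_installation_dir/hdata/dfs/name/current/VERSION
grep clusterID /Path_installation_dir/hdata/dfs/data/current/VERSION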
All done. Start DFS and YARN.
In my case, I had 2 namenodes running and after a server reboot the data got corrupted. I was getting "Failed to load image from FSImageFile" in the logs.
In my case, namenode-0 was still healthy and namenode-1 was the one having the problem.
I proceeded as follows (a kubectl sketch follows this list):
scale the namenodes down to 1: leave only namenode-0
delete the namenode-1 PVC
make sure the volume is not there with kubectl get pvc -n hadoop
scale the namenodes back to 2
namenode-0 took care of the data corruption and made it available to namenode-1
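A hedged kubectl sketch of those steps, assuming the namenodes run as a StatefulSet named "namenode" in a "hadoop" namespace (both names and the PVC name are assumptions, not taken from the answer):
# keep only namenode-0
kubectl scale statefulset namenode -n hadoop --replicas=1
# find and delete namenode-1's volume claim (the exact PVC name depends on your deployment)
kubectl get pvc -n hadoop
kubectl delete pvc data-namenode-1 -n hadoop
# bring namenode-1 back; it syncs from the healthy namenode-0
kubectl scale statefulset namenode -n hadoop --replicas=2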
I have set some MapReduce configuration in my main method like so:
configuration.set("mapreduce.jobtracker.address", "localhost:54311");
configuration.set("mapreduce.framework.name", "yarn");
configuration.set("yarn.resourcemanager.address", "localhost:8032");
Now when I launch the MapReduce task, the process is tracked (I can see it in my cluster dashboard, the one listening on port 8088), but the process never finishes. It remains blocked at the following line:
15/06/30 15:56:17 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/06/30 15:56:17 INFO client.RMProxy: Connecting to ResourceManager at localhost/127.0.0.1:8032
15/06/30 15:56:18 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
15/06/30 15:56:18 INFO input.FileInputFormat: Total input paths to process : 1
15/06/30 15:56:18 INFO mapreduce.JobSubmitter: number of splits:1
15/06/30 15:56:18 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1435241671439_0008
15/06/30 15:56:19 INFO impl.YarnClientImpl: Submitted application application_1435241671439_0008
15/06/30 15:56:19 INFO mapreduce.Job: The url to track the job: http://10.0.0.10:8088/proxy/application_1435241671439_0008/
15/06/30 15:56:19 INFO mapreduce.Job: Running job: job_1435241671439_0008
Does anyone have an idea?
Edit: in my YARN NodeManager log, I have this message:
org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Event EventType: KILL_CONTAINER sent to absent container container_1435241671439_0003_03_000001
2015-06-30 15:44:38,396 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Event EventType: KILL_CONTAINER sent to absent container container_1435241671439_0002_04_000001
Edit 2:
I also have, in the YARN NodeManager log, an exception that happened earlier (for a previous MapReduce call):
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.net.BindException: Problem binding to [0.0.0.0:8040] java.net.BindException: Address already in use; For more details see:
Solution: I killed all the daemon processes and restarted Hadoop again! In fact, when I ran jps, I was still seeing Hadoop daemons even though I had stopped them. This was related to a mismatch of HADOOP_PID_DIR.
The default port of the YARN NodeManager is 8040. The error says that the port is already in use. Stop all the Hadoop processes and, if you don't have data, maybe format the namenode once and try running the job again. From both of your edits, the issue is surely with the node manager.
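A quick, hedged way to see which process is holding the port before restarting anything (8040 comes from the BindException in Edit 2):
# either command shows the owning process on Linux
sudo lsof -i :8040
sudo netstat -tlnp | grep 8040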
I installed a Cloudera cluster with a Vagrant box.
I get an error when I launch the following example:
hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar grep input output23 'dfs[a-z.]+'
I went to check the logs in /var/log/hadoop-yarn.
There are several log files; in yarn-yarn-nodemanager-cdh-master.log, there is the following stack trace:
2015-06-17 11:42:42,398 INFO SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for appattempt_1434535025160_0001_000001 (auth:SIMPLE)
2015-06-17 11:42:42,597 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Start request for container_1434535025160_0001_01_000001 by user vagrant
2015-06-17 11:42:42,762 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Creating a new application reference for app application_1434535025160_0001
2015-06-17 11:42:42,776 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Application application_1434535025160_0001 transitioned from NEW to INITING
2015-06-17 11:42:42,778 INFO org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=vagrant IP=10.10.50.5 OPERATION=Start Container Request TARGET=ContainerManageImpl RESULT=SUCCESS APPID=application_1434535025160_0001 CONTAINERID=container_1434535025160_0001_01_000001
2015-06-17 11:42:43,997 FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
java.lang.IllegalArgumentException: Wrong FS: hdfs://var/log/hadoop-yarn, expected: hdfs://cdh-master:8020
at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:645)
at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:193)
at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:105)
at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1128)
at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1124)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1124)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.verifyAndCreateRemoteLogDir(LogAggregationService.java:192)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.initApp(LogAggregationService.java:319)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:443)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:67)
at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
at java.lang.Thread.run(Thread.java:744)
2015-06-17 11:42:44,000 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Adding container_1434535025160_0001_01_000001 to application application_1434535025160_0001
2015-06-17 11:42:44,001 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Exiting, bbye..
2015-06-17 11:42:44,034 INFO org.mortbay.log: Stopped HttpServer2$SelectChannelConnectorWithSafeStartup#0.0.0.0:8042
2015-06-17 11:42:44,035 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Applications still running : [application_14345350
I've seen this error
java.lang.IllegalArgumentException: Wrong FS:
hdfs://var/log/hadoop-yarn, expected: hdfs://cdh-master:8020
in the following post: Failed to start Jobtracker and Tasktracker in CDH pseudo cluster, but it did not help me much.
Does anyone have an idea?
Thanks
Change the property yarn.nodemanager.remote-app-log-dir in the yarn-site.xml config file either to:
<property>
<name>yarn.nodemanager.remote-app-log-dir</name>
<value>hdfs://cdh-master:8020/var/log/hadoop-yarn/apps</value>
</property>
or
<property>
<name>yarn.nodemanager.remote-app-log-dir</name>
<value>/var/log/hadoop-yarn/apps</value>
</property>
The second option will use the default filesystem that should be set to HDFS anyway.
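If in doubt, verify that fs.defaultFS in core-site.xml points at the namenode; a hedged example matching the host expected in the error message above:
<property>
<name>fs.defaultFS</name>
<value>hdfs://cdh-master:8020</value>
</property>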