I am trying to run Oryx with Hadoop 2.4. Hadoop starts successfully with a warning:
WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable.
Oryx also starts successfully. But when I ingest data into it, the following exception is thrown:
2014-08-22 14:35:05,835 ERROR [IPC Server handler 3 on 37788] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1408697508855_0002_m_000000_0 - exited : org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy()Z
2014-08-22 14:35:05,835 INFO [IPC Server handler 3 on 37788] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Diagnostics report from attempt_1408697508855_0002_m_000000_0: Error: org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy()Z
2014-08-22 14:35:05,837 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1408697508855_0002_m_000000_0: Error: org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy()Z
2014-08-22 14:35:05,840 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1408697508855_0002_m_000000_0 TaskAttempt Transitioned from RUNNING to FAIL_CONTAINER_CLEANUP
Has anyone faced this kind of issue before? Any help would be appreciated!
I'm copying a few items from your thread on the mailing list:
This may be a problem with the installation's Snappy libraries, but that seems to have been resolved
The YARN containers are being killed for running past virtual memory limits. See the FAQ -- this may be a Java issue you can work around by changing YARN config.
Final problem seems to be another issue with YARN config, although it's not clear. I suggest starting from fresh config and/or a distribution that is preconfigured and known to work, if possible.
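For reference, the usual workaround for containers killed on virtual memory limits is to relax or disable YARN's vmem check in yarn-site.xml. This is only a sketch, and the exact value (or whether to disable the check at all) depends on your cluster:
<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
</property>
or, instead of disabling the check, raise the ratio (4 here is just an illustrative value; the default is 2.1):
<property>
<name>yarn.nodemanager.vmem-pmem-ratio</name>
<value>4</value>
</property>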
Related
I used the following command to run the Spark Java word-count example:
time spark-submit --deploy-mode cluster --master spark://192.168.0.7:7077 --class org.apache.spark.examples.JavaWordCount /home/pi/Desktop/example/new/target/javaword.jar /books_500.txt
I have copied the same jar file onto all nodes in the same location. (Copying it into HDFS didn't work for me.) When I run it, the following is the output:
Running Spark using the REST application submission protocol.
16/07/14 16:32:18 INFO rest.RestSubmissionClient: Submitting a request to launch an application in spark://192.168.0.7:7077.
16/07/14 16:32:30 WARN rest.RestSubmissionClient: Unable to connect to server spark://192.168.0.7:7077.
Warning: Master endpoint spark://192.168.0.7:7077 was not a REST server. Falling back to legacy submission gateway instead.
16/07/14 16:32:30 WARN util.Utils: Your hostname, master02 resolves to a loopback address: 127.0.1.1; using 192.168.0.7 instead (on interface wlan0)
16/07/14 16:32:30 WARN util.Utils: Set SPARK_LOCAL_IP if you need to bind to another address
16/07/14 16:32:31 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
It just stops there, quits the job, and waits for the next command at the terminal. I don't understand what is failing, since there is no error message. Any help would be appreciated.
I have set some MapReduce configuration in my main method as follows:
configuration.set("mapreduce.jobtracker.address", "localhost:54311");
configuration.set("mapreduce.framework.name", "yarn");
configuration.set("yarn.resourcemanager.address", "localhost:8032");
Now when I launch the MapReduce task, the job is tracked (I can see it in my cluster dashboard, the one listening on port 8088), but it never finishes. It remains blocked at the following lines:
15/06/30 15:56:17 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/06/30 15:56:17 INFO client.RMProxy: Connecting to ResourceManager at localhost/127.0.0.1:8032
15/06/30 15:56:18 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
15/06/30 15:56:18 INFO input.FileInputFormat: Total input paths to process : 1
15/06/30 15:56:18 INFO mapreduce.JobSubmitter: number of splits:1
15/06/30 15:56:18 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1435241671439_0008
15/06/30 15:56:19 INFO impl.YarnClientImpl: Submitted application application_1435241671439_0008
15/06/30 15:56:19 INFO mapreduce.Job: The url to track the job: http://10.0.0.10:8088/proxy/application_1435241671439_0008/
15/06/30 15:56:19 INFO mapreduce.Job: Running job: job_1435241671439_0008
Does anyone have an idea?
Edit: in my YARN NodeManager log, I have this message:
org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Event EventType: KILL_CONTAINER sent to absent container container_1435241671439_0003_03_000001
2015-06-30 15:44:38,396 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Event EventType: KILL_CONTAINER sent to absent container container_1435241671439_0002_04_000001
Edit 2:
I also have, in the YARN NodeManager log, an exception that happened earlier (for a previous MapReduce call):
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.net.BindException: Problem binding to [0.0.0.0:8040] java.net.BindException: Address already in use; For more details see:
Solution: I killed all the daemon processes and restarted Hadoop. In fact, when I ran jps, I was still seeing Hadoop daemons even though I had stopped them. This was caused by a mismatch of HADOOP_PID_DIR.
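If stale daemons keep reappearing, one thing worth checking (an assumption based on the HADOOP_PID_DIR mismatch mentioned above) is that hadoop-env.sh points every node's start/stop scripts at the same PID directory, for example:
export HADOOP_PID_DIR=/var/run/hadoop
The path is only illustrative; what matters is that the scripts that start the daemons and the scripts that stop them agree on it.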
The default port of the YARN NodeManager is 8040. The error says that the port is already in use. Stop all the Hadoop processes and, if you don't have data, maybe format the namenode once and try running the job again. From both of your edits, the issue is surely with the NodeManager.
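If it helps, a minimal restart sequence along those lines (assuming a plain Apache Hadoop 2.x install with the sbin scripts on the PATH; note that formatting the namenode wipes HDFS, so only do it if you have no data to keep):
stop-yarn.sh
stop-dfs.sh
jps                      # should no longer list NameNode/DataNode/ResourceManager/NodeManager
# kill -9 <pid> for any stale daemon that jps still shows
hdfs namenode -format    # optional, only if losing HDFS data is acceptable
start-dfs.sh
start-yarn.sh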
I installed a Cloudera cluster with a Vagrant box.
I get an error when I launch the following example:
hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar grep input output23 'dfs[a-z.]+'
I went to check the logs in /var/log/hadoop-yarn.
There are several log files; in yarn-yarn-nodemanager-cdh-master.log, there is the following stack trace:
2015-06-17 11:42:42,398 INFO SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for appattempt_1434535025160_0001_000001 (auth:SIMPLE)
2015-06-17 11:42:42,597 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Start request for container_1434535025160_0001_01_000001 by user vagrant
2015-06-17 11:42:42,762 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Creating a new application reference for app application_1434535025160_0001
2015-06-17 11:42:42,776 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Application application_1434535025160_0001 transitioned from NEW to INITING
2015-06-17 11:42:42,778 INFO org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=vagrant IP=10.10.50.5 OPERATION=Start Container Request TARGET=ContainerManageImpl RESULT=SUCCESS APPID=application_1434535025160_0001 CONTAINERID=container_1434535025160_0001_01_000001
2015-06-17 11:42:43,997 FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
java.lang.IllegalArgumentException: Wrong FS: hdfs://var/log/hadoop-yarn, expected: hdfs://cdh-master:8020
at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:645)
at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:193)
at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:105)
at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1128)
at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1124)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1124)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.verifyAndCreateRemoteLogDir(LogAggregationService.java:192)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.initApp(LogAggregationService.java:319)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:443)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:67)
at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
at java.lang.Thread.run(Thread.java:744)
2015-06-17 11:42:44,000 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Adding container_1434535025160_0001_01_000001 to application application_1434535025160_0001
2015-06-17 11:42:44,001 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Exiting, bbye..
2015-06-17 11:42:44,034 INFO org.mortbay.log: Stopped HttpServer2$SelectChannelConnectorWithSafeStartup#0.0.0.0:8042
2015-06-17 11:42:44,035 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Applications still running : [application_14345350
I've seen this error
java.lang.IllegalArgumentException: Wrong FS:
hdfs://var/log/hadoop-yarn, expected: hdfs://cdh-master:8020
in the following post: Failed to start Jobtracker and Tasktracker in CDH pseudo cluster, but this did not help me much.
Does anyone have an idea?
Thanks!
Change the property yarn.nodemanager.remote-app-log-dir in yarn-site.xml config file either to:
<property>
<name>yarn.nodemanager.remote-app-log-dir</name>
<value>hdfs://cdh-master:8020/var/log/hadoop-yarn/apps</value>
</property>
or
<property>
<name>yarn.nodemanager.remote-app-log-dir</name>
<value>/var/log/hadoop-yarn/apps</value>
</property>
The second option will use the default filesystem that should be set to HDFS anyway.
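For context, the "default filesystem" is whatever core-site.xml declares as fs.defaultFS. Judging from the "expected:" part of the exception above, in this cluster it would presumably look like this (host and port taken from that message, not verified against your config):
<property>
<name>fs.defaultFS</name>
<value>hdfs://cdh-master:8020</value>
</property>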
I successfully built Tez 0.6.0 against Hadoop 2.5.2.
Then I configured Tez 0.6.0 as described in http://tez.apache.org/install.html:
I moved the Tez lib package to an HDFS location and updated my tez-site.xml:
<property>
<name>tez.lib.uris</name>
<value>${fs.default.name}/apps/Tez/,${fs.default.name}/apps/Tez/lib/</value>
</property>
After that I tried the Tez sample test:
hadoop jar tez-examples-0.6.0.jar orderedwordcount <input> <output>
But I get the following error while running this command:
Running OrderedWordCount
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/C:/Hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/C:/Tez/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
15/04/15 10:47:57 INFO client.TezClient: Tez Client Version: [ component=tez-api, version=0.6.0, revision=${buildNumber}, SCM-URL=scm:git:https://git-wip-us.apache.org/repos/asf/tez.git, buildTime=2015-04-15T01:13:02Z ]
15/04/15 10:48:00 INFO client.TezClient: Submitting DAG application with id: application_1429073725727_0005
15/04/15 10:48:00 INFO Configuration.deprecation: fs.default.name is deprecated. Instead, use fs.defaultFS
15/04/15 10:48:00 INFO client.TezClientUtils: Using tez.lib.uris value from configuration: hdfs://HA-Cluster/apps/Tez/,hdfs://HA-Cluster/apps/Tez/lib/
15/04/15 10:48:01 INFO client.TezClient: Stage directory /tmp/app/tez/staging doesn't exist and is created
15/04/15 10:48:01 INFO client.TezClient: Tez system stage directory hdfs://HA-cluster/tmp/app/tez/staging/.tez/application_1429073725727_0005 doesn't exist and is created
15/04/15 10:48:02 INFO client.TezClient: Submitting DAG to YARN, applicationId=application_1429073725727_0005, dagName=OrderedWordCount
15/04/15 10:48:03 INFO impl.YarnClientImpl: Submitted application application_1429073725727_0005
15/04/15 10:48:03 INFO client.TezClient: The url to track the Tez AM: http://syncserver34:8088/proxy/application_1429073725727_0005/
15/04/15 10:48:03 INFO client.DAGClientImpl: Waiting for DAG to start running
15/04/15 10:48:09 INFO client.DAGClientImpl: DAG completed. FinalState=FAILED
OrderedWordCount failed with diagnostics: [Application application_1429073725727_0005 failed 2 times due to AM Container for appattempt_1429073725727_0005_000002 exited with exitCode: -1073741515 due to: Exception from container-launch: ExitCodeException exitCode=-1073741515:
ExitCodeException exitCode=-1073741515:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
at org.apache.hadoop.util.Shell.run(Shell.java:455)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:300)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
1 file(s) moved.
Container exited with a non-zero exit code -1073741515
.Failing this attempt.. Failing the application.]
Looking at the ResourceManager log:
15/04/15 12:56:15 ERROR scheduler.SchedulerApplicationAttempt: Error trying to assign container token and NM token to an allocated container container_1429082271173_0001_01_000001
java.lang.IllegalArgumentException: java.net.UnknownHostException: MasterNode
at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:373)
at org.apache.hadoop.yarn.server.utils.BuilderUtils.newContainerToken(BuilderUtils.java:247)
at org.apache.hadoop.yarn.server.resourcemanager.security.RMContainerTokenSecretManager.createContainerToken(RMContainerTokenSecretManager.java:199)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.pullNewlyAllocatedContainersAndNMTokens(SchedulerApplicationAttempt.java:425)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.getAllocation(FiCaSchedulerApp.java:248)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocate(CapacityScheduler.java:736)
at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:816)
at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:809)
at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:649)
at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:104)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:761)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:742)
at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
at java.lang.Thread.run(Thread.java:744)
Caused by: java.net.UnknownHostException: MasterNode
... 19 more
The problem might be that, while connecting to the NodeManager, it is unable to handshake with the ResourceManager.
If I try this on a single-node Hadoop cluster, it works correctly.
Try adding the following property to yarn-site.xml:
<property>
<name>yarn.nodemanager.delete.debug-delay-sec</name>
<value>1200</value>
</property>
One more thing: while running the "launchcontainer.cmd" located under Hadoop's \tmp..\appcache location, an issue arises in accessing a DLL needed to run MapReduce on the Windows platform, i.e. MSVCR100.dll is missing, so the Tez job cannot be handled. The error is as below:
"The program can't start because MSCVR100.dll is missing from your
computer. Try reinstalling the program to fix this issue"
Provide full privileges to the Hadoop tmp directory, and try replacing/moving the msvcr100.dll file (from C:\Windows\System32) on the Windows machine so that the MapReduce program for the Tez job can run.
I want to run Nutch 1.9 in Eclipse on Windows. I followed the tutorial from http://wiki.apache.org/nutch/RunNutchInEclipse and opened the project in Eclipse.
But when I run Nutch, I get the following error:
2014-09-19 17:45:48,039 INFO crawl.Injector (Injector.java:inject(283)) - Injector: starting at 2014-09-19 17:45:48
2014-09-19 17:45:48,043 INFO crawl.Injector (Injector.java:inject(284)) - Injector: crawlDb: K:/kumar/Nutch/apache-nutch-1.9/crawlresult
2014-09-19 17:45:48,043 INFO crawl.Injector (Injector.java:inject(285)) - Injector: urlDir: K:/kumar/Nutch/apache-nutch-1.9/urls
2014-09-19 17:45:48,043 INFO crawl.Injector (Injector.java:inject(294)) - Injector: Converting injected urls to crawl db entries.
2014-09-19 17:45:48,207 INFO jvm.JvmMetrics (JvmMetrics.java:init(71)) - Initializing JVM Metrics with processName=JobTracker, sessionId=
2014-09-19 17:45:48,252 WARN mapred.JobClient (JobClient.java:configureCommandLineOptions(661)) - No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
2014-09-19 17:45:48,268 INFO mapred.FileInputFormat (FileInputFormat.java:listStatus(192)) - Total input paths to process : 1
2014-09-19 17:45:48,485 INFO mapred.JobClient (JobClient.java:monitorAndPrintJob(1275)) - Running job: job_local_0001
2014-09-19 17:45:48,487 INFO mapred.FileInputFormat (FileInputFormat.java:listStatus(192)) - Total input paths to process : 1
2014-09-19 17:45:48,526 INFO mapred.MapTask (MapTask.java:runOldMapper(347)) - numReduceTasks: 0
2014-09-19 17:45:48,565 INFO plugin.PluginRepository (PluginManifestParser.java:parsePluginFolder(87)) - Plugins: looking in: K:\Nutch\apache-nutch-1.9\plugins
2014-09-19 17:45:48,566 WARN plugin.PluginRepository (PluginManifestParser.java:parsePluginFolder(101)) - java.io.FileNotFoundException: K:\Nutch\apache-nutch-1.9\plugins\creativecommons\plugin.xml (The system cannot find the file specified)
It seems that Hadoop is causing the error. I don't know how to solve this problem. I know Nutch requires a Unix environment, but I want to run Nutch in Eclipse on Windows.
Can anybody help me solve this?
Download Cygwin, then add it to your PATH environment variable. I think your problem is caused by the fact that Windows can't invoke a Unix native command. That is what I did; however, as soon as I got past that problem, I encountered other problems.
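For example (the install path below is an assumption, adjust it to wherever Cygwin actually lives on your machine), you can prepend the Cygwin bin directory to PATH in the same command prompt you launch Eclipse from:
rem hypothetical Cygwin location
set PATH=C:\cygwin64\bin;%PATH%
Alternatively, add it permanently under System Properties > Environment Variables so Eclipse picks it up at startup.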