I am trying Hadoop MapReduce on Linux (an Ubuntu virtual machine) by following the link.
I ran the wordcount example on a sample file, and the process gets killed unexpectedly. How can I debug this?
Initially I was getting an insufficient memory error on a large data set:
15/11/28 19:24:27 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
15/11/28 19:24:27 INFO mapred.MapTask: Processing split: hdfs://localhost:54310/user/hduser/eg2/a.txt:0+1538
Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x00000000e6093000, 104861696, 0) failed; error='Cannot allocate memory' (errno=12)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 104861696 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /usr/local/hadoop/hs_err_pid7516.log
So I reduced the size of my files and tried again, which resulted in unexpected termination:
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar wordcount /user/hduser/eg2/ /user/hduser/eg2/eg2-output2
......
......
15/11/28 18:55:44 INFO mapred.LocalJobRunner: Waiting for map tasks
15/11/28 18:55:44 INFO mapred.LocalJobRunner: Starting task: attempt_local1996683170_0001_m_000000_0
15/11/28 18:55:44 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
15/11/28 18:55:44 INFO mapred.MapTask: Processing split: hdfs://localhost:54310/user/hduser/eg2/a.txt:0+1538
15/11/28 18:55:45 INFO mapreduce.Job: Job job_local1996683170_0001 running in uber mode : false
15/11/28 18:55:45 INFO mapreduce.Job: map 0% reduce 0%
Killed
Why is the process getting terminated?
Try listing the running jobs:
hadoop job -list
Kill all jobs and rerun yours:
hadoop job -kill <JobID>
Then check the JobTracker logs for errors:
http://localhost:50070/ – web UI of the NameNode daemon
http://localhost:50030/ – web UI of the JobTracker daemon
http://localhost:50060/ – web UI of the TaskTracker daemon
The size of the data set didn't matter. Hadoop didn't have enough memory to start. I tried increasing the memory of my VM and the issue got fixed.
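A bare "Killed" line with no Java stack trace often means the operating system itself ended the process for lack of memory, which fits the fix above. As a rough way to compare the memory the VM has left with the allocation that failed in the first log (104861696 bytes), here is a minimal sketch. The class and method names are made up for illustration, and it assumes a Linux kernel that reports MemAvailable in /proc/meminfo.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper (not part of Hadoop): reads Linux /proc/meminfo.
final class MemCheck {

    // Returns the value in kB for a key such as "MemAvailable", or -1 if absent.
    static long valueKb(List<String> lines, String key) {
        for (String line : lines) {
            if (line.startsWith(key + ":")) {
                // Lines look like "MemAvailable:    51200 kB".
                String[] parts = line.substring(key.length() + 1).trim().split("\\s+");
                return Long.parseLong(parts[0]);
            }
        }
        return -1;
    }

    // True if the requested number of bytes is no more than the available memory.
    static boolean fits(long availableKb, long requestedBytes) {
        return availableKb * 1024L >= requestedBytes;
    }

    public static void main(String[] args) throws IOException {
        List<String> lines = Files.readAllLines(Paths.get("/proc/meminfo"));
        long availableKb = valueKb(lines, "MemAvailable");
        long requested = 104861696L; // size of the failed allocation in the log above
        System.out.println("MemAvailable: " + availableKb + " kB, allocation fits: "
                + fits(availableKb, requested));
    }
}
```

If the key is missing (older kernels), valueKb returns -1 and the check reports that the allocation does not fit, so treat that case as "unknown" rather than as a measurement.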
I'm running a single-node application with Spark on a machine with 32 GB RAM.
More than 12 GB of memory is available at the time I'm running the application.
But from the Spark UI and logs, I see that it is using 3.8 GB of RAM (which gradually decreases as the jobs run).
At the time this is logged, 5 GB more memory is available, whereas Spark is using 3.8 GB.
UPDATE
I set these parameters in conf/spark-env.sh, but each time I run the application it still uses exactly 3.8 GB:
export SPARK_WORKER_MEMORY=6g
export SPARK_MEM=6g
export SPARK_DAEMON_MEMORY=6g
Log
2015-11-19 13:05:41,701 INFO org.apache.spark.SparkEnv.logInfo:59 - Registering MapOutputTracker
2015-11-19 13:05:41,716 INFO org.apache.spark.SparkEnv.logInfo:59 - Registering BlockManagerMaster
2015-11-19 13:05:41,735 INFO org.apache.spark.storage.DiskBlockManager.logInfo:59 - Created local directory at /usr/local/TC_SPARCDC_COM/temp/blockmgr-8513cd3b-ac03-4c0a-b291-65aba4cbc395
2015-11-19 13:05:41,746 INFO org.apache.spark.storage.MemoryStore.logInfo:59 - MemoryStore started with capacity 3.8 GB
2015-11-19 13:05:41,777 INFO org.apache.spark.HttpFileServer.logInfo:59 - HTTP File server directory is /usr/local/TC_SPARCDC_COM/temp/spark-b86380c2-4cbd-43d6-a3b7-aa03d9a05a84/httpd-ceaffbd0-eac4-447e-9d3f-c452627a28cb
2015-11-19 13:05:41,781 INFO org.apache.spark.HttpServer.logInfo:59 - Starting HTTP Server
2015-11-19 13:05:41,842 INFO org.spark-project.jetty.server.Server.doStart:272 - jetty-8.y.z-SNAPSHOT
2015-11-19 13:05:41,854 INFO org.spark-project.jetty.server.AbstractConnector.doStart:338 - Started SocketConnector@0.0.0.0:5279
2015-11-19 13:05:41,855 INFO org.apache.spark.util.Utils.logInfo:59 - Successfully started service 'HTTP file server' on port 5279.
2015-11-19 13:05:41,867 INFO org.apache.spark.SparkEnv.logInfo:59 - Registering OutputCommitCoordinator
2015-11-19 13:05:42,013 INFO org.spark-project.jetty.server.Server.doStart:272 - jetty-8.y.z-SNAPSHOT
2015-11-19 13:05:42,039 INFO org.spark-project.jetty.server.AbstractConnector.doStart:338 - Started SelectChannelConnector@0.0.0.0:4040
2015-11-19 13:05:42,039 INFO org.apache.spark.util.Utils.logInfo:59 - Successfully started service 'SparkUI' on port 4040.
2015-11-19 13:05:42,041 INFO org.apache.spark.ui.SparkUI.logInfo:59 - Started SparkUI at http://103.252.184.181:4040
2015-11-19 13:05:42,114 WARN org.apache.spark.metrics.MetricsSystem.logWarning:71 - Using default name DAGScheduler for source because spark.app.id is not set.
2015-11-19 13:05:42,117 INFO org.apache.spark.executor.Executor.logInfo:59 - Starting executor ID driver on host localhost
2015-11-19 13:05:42,307 INFO org.apache.spark.util.Utils.logInfo:59 - Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 31334.
2015-11-19 13:05:42,308 INFO org.apache.spark.network.netty.NettyBlockTransferService.logInfo:59 - Server created on 31334
2015-11-19 13:05:42,309 INFO org.apache.spark.storage.BlockManagerMaster.logInfo:59 - Trying to register BlockManager
2015-11-19 13:05:42,312 INFO org.apache.spark.storage.BlockManagerMasterEndpoint.logInfo:59 - Registering block manager localhost:31334 with 3.8 GB RAM, BlockManagerId(driver, localhost, 31334)
2015-11-19 13:05:42,313 INFO org.apache.spark.storage.BlockManagerMaster.logInfo:59 - Registered BlockManager
If you are using spark-submit, you can use the --executor-memory and --driver-memory flags. Otherwise, set spark.executor.memory and spark.driver.memory either directly in your program or in spark-defaults.conf.
Note that you should not set memory too high. As a rule of thumb, aim for ~75% of available memory. That will leave enough memory for other processes (like your OS) running on your machines.
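To make the rule of thumb concrete, here is a small arithmetic sketch. The helper is hypothetical (it is not a Spark API); it only turns an amount of available memory and a fraction into a size string of the kind the memory flags accept.

```java
// Hypothetical helper illustrating the ~75% rule of thumb; not a Spark API.
final class MemoryBudget {

    // Returns a size string such as "9g" covering the given fraction of available memory.
    static String suggest(long availableMb, double fraction) {
        long budgetMb = (long) Math.floor(availableMb * fraction);
        if (budgetMb >= 1024) {
            return (budgetMb / 1024) + "g"; // round down to whole gigabytes
        }
        return budgetMb + "m";
    }

    public static void main(String[] args) {
        // 12 GB available, as in the question; 0.75 is the rule of thumb above.
        System.out.println("--driver-memory " + suggest(12 * 1024, 0.75));
    }
}
```

With 12 GB available this suggests 9g, which leaves about 3 GB for the OS and other processes.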
It is correctly stated by @Glennie Helles Sindholt, but setting driver flags while submitting jobs on a standalone machine won't affect the usage, as the JVM has already been initialized. Check out this linked discussion:
How to set Apache Spark Executor memory
If you are using the spark-submit command to submit a job, the following is an example of how to set parameters while submitting it:
spark-submit --master spark://127.0.0.1:7077 \
--num-executors 2 \
--executor-cores 8 \
--executor-memory 3g \
--class <Class name> \
<JAR file name or path> \
/path-to-input \
/path-to-output
By varying these parameters, you can see how RAM usage changes. There is also a Linux utility named htop, which is useful for watching the instantaneous usage of memory, CPU cores, and swap space to understand what is happening. To install htop, use the following:
sudo apt-get install htop
For more information you can check out the following links:
https://spark.apache.org/docs/latest/configuration.html
I am using the command below to export an HBase table into HDFS:
hbase org.apache.hadoop.hbase.mapreduce.Driver export "Table-name" "hdfs-path"
This command works for small tables but fails to export large tables.
Error Logs:
2015-09-22 14:48:58,814 INFO [main] mapreduce.Job: Task Id : attempt_1442911480092_0002_m_000000_2, Status : FAILED
Container [pid=3575,containerID=container_1442911480092_0002_01_000004] is running beyond virtual memory limits. Current usage: 23.6 MB of 1 GB physical memory used; 4.8 GB of 2.1 GB virtual memory used. Killing container.
Dump of the process-tree for container_1442911480092_0002_01_000004 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
|- 3575 3573 3575 3575 (bash) 0 0 108609536 334 /bin/bash -c /opt/java/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx3024m -Djava.io.tmpdir=/opt/hadoop/tmp/nm-local-dir/usercache/root/appcache/application_1442911480092_0002/container_1442911480092_0002_01_000004/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/opt/hadoop/logs/userlogs/application_1442911480092_0002/container_1442911480092_0002_01_000004 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA org.apache.hadoop.mapred.YarnChild 10.127.128.149 44859 attempt_1442911480092_0002_m_000000_2 4 1>/opt/hadoop/logs/userlogs/application_1442911480092_0002/container_1442911480092_0002_01_000004/stdout 2>/opt/hadoop/logs/userlogs/application_1442911480092_0002/container_1442911480092_0002_01_000004/stderr
|- 3591 3575 3575 3575 (java) 10 2 5018701824 5704 /opt/java/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx3024m -Djava.io.tmpdir=/opt/hadoop/tmp/nm-local-dir/usercache/root/appcache/application_1442911480092_0002/container_1442911480092_0002_01_000004/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/opt/hadoop/logs/userlogs/application_1442911480092_0002/container_1442911480092_0002_01_000004 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA org.apache.hadoop.mapred.YarnChild 10.127.128.149 44859 attempt_1442911480092_0002_m_000000_2 4
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
2015-09-22 14:49:07,892 INFO [main] mapreduce.Job: map 100% reduce 0%
2015-09-22 14:49:07,911 INFO [main] mapreduce.Job: Job job_1442911480092_0002 failed with state FAILED due to: Task failed task_1442911480092_0002_m_000000
Job failed as tasks failed. failedMaps:1 failedReduces:0
2015-09-22 14:49:08,105 INFO [main] mapreduce.Job: Counters: 12
Job Counters
Failed map tasks=4
Launched map tasks=4
Other local map tasks=3
Rack-local map tasks=1
Total time spent by all maps in occupied slots (ms)=13484
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=13484
Total vcore-seconds taken by all map tasks=13484
Total megabyte-seconds taken by all map tasks=13807616
Map-Reduce Framework
CPU time spent (ms)=0
Physical memory (bytes) snapshot=0
Virtual memory (bytes) snapshot=0
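The numbers in the "running beyond virtual memory limits" message can be reproduced from the process-tree dump. The sketch below assumes the 2.1 GB limit is the 1 GB container size multiplied by a virtual-to-physical ratio of 2.1 (the yarn.nodemanager.vmem-pmem-ratio setting, whose usual default is 2.1); the class name is illustrative only.

```java
import java.util.Locale;

// Arithmetic sketch for the "running beyond virtual memory limits" message.
final class VmemCheck {

    static final double GIB = 1024.0 * 1024.0 * 1024.0;

    // Virtual memory limit in bytes = container physical memory * vmem-to-pmem ratio.
    static double virtualLimitBytes(double containerGib, double vmemPmemRatio) {
        return containerGib * vmemPmemRatio * GIB;
    }

    // Formats usage the same way the log line does.
    static String summary(long usedBytes, double limitBytes) {
        return String.format(Locale.ROOT, "%.1f GB of %.1f GB virtual memory used",
                usedBytes / GIB, limitBytes / GIB);
    }

    public static void main(String[] args) {
        // VMEM_USAGE(BYTES) of the bash and java processes in the dump above.
        long usedBytes = 108609536L + 5018701824L;
        // 1 GB container; 2.1 is assumed to be the configured ratio.
        double limitBytes = virtualLimitBytes(1.0, 2.1);
        System.out.println(summary(usedBytes, limitBytes));
        System.out.println("over limit: " + (usedBytes > limitBytes));
    }
}
```

Note that the dump also shows the task JVM started with -Xmx3024m inside a 1 GB container, so the heap setting and the container size are worth checking against each other.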
I have set some MapReduce configuration in my main method, like so:
configuration.set("mapreduce.jobtracker.address", "localhost:54311");
configuration.set("mapreduce.framework.name", "yarn");
configuration.set("yarn.resourcemanager.address", "localhost:8032");
Now when I launch the MapReduce job, it is tracked (I can see it in my cluster dashboard, the one listening on port 8088), but it never finishes. It remains blocked at the last line of the following output:
15/06/30 15:56:17 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/06/30 15:56:17 INFO client.RMProxy: Connecting to ResourceManager at localhost/127.0.0.1:8032
15/06/30 15:56:18 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
15/06/30 15:56:18 INFO input.FileInputFormat: Total input paths to process : 1
15/06/30 15:56:18 INFO mapreduce.JobSubmitter: number of splits:1
15/06/30 15:56:18 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1435241671439_0008
15/06/30 15:56:19 INFO impl.YarnClientImpl: Submitted application application_1435241671439_0008
15/06/30 15:56:19 INFO mapreduce.Job: The url to track the job: http://10.0.0.10:8088/proxy/application_1435241671439_0008/
15/06/30 15:56:19 INFO mapreduce.Job: Running job: job_1435241671439_0008
Does anyone have an idea?
Edit: in my YARN NodeManager log, I have this message:
org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Event EventType: KILL_CONTAINER sent to absent container container_1435241671439_0003_03_000001
2015-06-30 15:44:38,396 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Event EventType: KILL_CONTAINER sent to absent container container_1435241671439_0002_04_000001
Edit 2:
I also see an exception in the YARN manager log that happened earlier (for a previous MapReduce call):
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.net.BindException: Problem binding to [0.0.0.0:8040] java.net.BindException: Address already in use; For more details see:
Solution: I killed all the daemon processes and restarted Hadoop. In fact, when I ran jps, I still saw Hadoop daemons even though I had stopped them. This was related to a mismatch of HADOOP_PID_DIR.
The default port of the YARN NodeManager is 8040, and the error says that this port is already in use. Stop all the Hadoop processes; if you don't have data, maybe format the NameNode once, and then try running the job again. From both of your edits, the issue is surely with the NodeManager.
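The BindException means something was still listening on port 8040 when the daemon tried to start. A quick way to confirm that before restarting is to try binding the port yourself. This is a minimal sketch with a made-up class name, not a Hadoop tool.

```java
import java.io.IOException;
import java.net.ServerSocket;

// Hypothetical diagnostic, not part of Hadoop: probes whether a TCP port can be bound.
final class PortCheck {

    // Returns true if this process can bind the port, false if binding fails
    // (for example with "Address already in use").
    static boolean isPortFree(int port) {
        try (ServerSocket socket = new ServerSocket(port)) {
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        int port = 8040; // port from the BindException above
        System.out.println("port " + port + (isPortFree(port) ? " is free" : " is in use"));
    }
}
```

If the port is reported as in use after the daemons were supposedly stopped, a leftover process is the likely holder, and jps can help find it.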
Cloudera CDH5.2 Quickstart VM
Cloudera Manager showing all nodes state = GREEN
In Eclipse, I've packaged an MR job as a jar, with all relevant Cloudera jars in the build path:
avro-1.7.6-cdh5.2.0.jar,
avro-mapred-1.7.6-cdh5.2.0-hadoop2.jar,
hadoop-common-2.5.0-cdh5.2.0.jar,
hadoop-mapreduce-client-core-2.5.0-cdh5.2.0.jar
I've run the following job:
hadoop jar jproject1.jar avro00.AvroUserPrefCount -libjars ${LIBJARS} avro/00/in avro/00/out
I get the following error. Is it a Java heap problem? Any comments? Thank you in advance.
14/11/14 01:02:40 INFO client.RMProxy: Connecting to ResourceManager at quickstart.cloudera/127.0.0.1:8032
14/11/14 01:02:43 INFO input.FileInputFormat: Total input paths to process : 1
14/11/14 01:02:43 INFO mapreduce.JobSubmitter: number of splits:1
14/11/14 01:02:44 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1415950730849_0001
14/11/14 01:02:45 INFO impl.YarnClientImpl: Submitted application application_1415950730849_0001
14/11/14 01:02:45 INFO mapreduce.Job: The url to track the job: http://quickstart.cloudera:8088/proxy/application_1415950730849_0001/
14/11/14 01:02:45 INFO mapreduce.Job: Running job: job_1415950730849_0001
14/11/14 01:03:04 INFO mapreduce.Job: Job job_1415950730849_0001 running in uber mode : false
14/11/14 01:03:04 INFO mapreduce.Job: map 0% reduce 0%
14/11/14 01:03:11 INFO mapreduce.Job: Task Id : attempt_1415950730849_0001_m_000000_0, Status : FAILED
Error: java.io.IOException: Unable to initialize any output collector
at org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:412)
at org.apache.hadoop.mapred.MapTask.access$100(MapTask.java:81)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:695)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:767)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
...
...
Checking the full task log of the failed attempt attempt_1415950730849_0001_m_000000_0 will help tell why you ran into the given exception.
The most common cause of this error is a misconfigured io.sort.mb value in your job. It must never be close to (or higher than) the configured map task heap size, and it currently must not exceed ~2000 MB (the Java array maximum size).
An upstream improvement that makes the error clearer about the true failure was also filed and resolved recently, via MAPREDUCE-6194.
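The two constraints on io.sort.mb can be written down as a small pre-flight check. This is only a sketch: the class and method are hypothetical, the upper bound of 2047 MB is an assumption standing in for the "~2000 MB" figure, and the "at most half of the heap" margin is an arbitrary illustration of "not anywhere close to" the heap size.

```java
// Hypothetical pre-flight check; the class, method and thresholds are illustrative only.
final class SortBufferCheck {

    // Upper bound standing in for the "~2000 MB" note; assumed to be 2047 here.
    static final int MAX_SORT_MB = 2047;

    // Returns null if the settings look sane, otherwise a short description of the problem.
    static String problem(int ioSortMb, int mapHeapMb) {
        if (ioSortMb <= 0 || ioSortMb > MAX_SORT_MB) {
            return "io.sort.mb=" + ioSortMb + " is outside 1.." + MAX_SORT_MB;
        }
        // Assumed safety margin: keep the buffer at or below half of the heap.
        if (ioSortMb > mapHeapMb / 2) {
            return "io.sort.mb=" + ioSortMb + " is too close to the heap size of "
                    + mapHeapMb + " MB";
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(problem(100, 1024));  // null: 100 MB buffer in a 1 GB heap
        System.out.println(problem(900, 1024));  // too close to the heap
        System.out.println(problem(4096, 8192)); // above the upper bound
    }
}
```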
I encountered the same issue yesterday. I checked the syslog for the failing map task, which showed that another exception in that task was triggering this error. In my case it was a parsing error, and once I corrected it, this error went away.
Closer examination of the log for the failed task should give you the root cause for the issue.
I am just getting started with linux/java/hadoop/EMR.
I am following this neat book.
The assignment is to run:
bin/hadoop jar hadoop-cookbook-chapter1.jar chapter1.WordCount input output
And this is the response that I get:
alex@HadoopMachine:/usr/share/hadoop$ sudo hadoop jar hadoop-cookbook-chapter1.jar chapter1.WordCount input output
13/05/01 01:01:08 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
13/05/01 01:01:08 INFO input.FileInputFormat: Total input paths to process : 1
13/05/01 01:01:08 WARN snappy.LoadSnappy: Snappy native library not loaded
13/05/01 01:01:09 INFO mapred.JobClient: Running job: job_local_0001
13/05/01 01:01:09 INFO util.ProcessTree: setsid exited with exit code 0
13/05/01 01:01:09 INFO mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@1c04d881
13/05/01 01:01:09 INFO mapred.MapTask: io.sort.mb = 100
13/05/01 01:01:09 WARN mapred.LocalJobRunner: job_local_0001
java.lang.OutOfMemoryError: Java heap space
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:949)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:674)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:756)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
13/05/01 01:01:10 INFO mapred.JobClient: map 0% reduce 0%
13/05/01 01:01:10 INFO mapred.JobClient: Job complete: job_local_0001
13/05/01 01:01:10 INFO mapred.JobClient: Counters: 0
Frankly, since I have almost no java background, I do not even know where to start debugging.
I would be most grateful for any guidance on how to tackle this issue.
update
After following greedybuddha's advice, I am getting:
alex@HadoopMachine:/usr/share/hadoop$ sudo hadoop jar hadoop-cookbook-chapter1.jar chapter1.WordCount -Dmapred.child.java.opts=-Xmx1G input output
[sudo] password for alex:
13/05/01 11:03:54 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
13/05/01 11:03:54 INFO input.FileInputFormat: Total input paths to process : 1
13/05/01 11:03:54 WARN snappy.LoadSnappy: Snappy native library not loaded
13/05/01 11:03:54 INFO mapred.JobClient: Running job: job_local_0001
13/05/01 11:03:54 INFO util.ProcessTree: setsid exited with exit code 0
13/05/01 11:03:54 INFO mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@35756b65
13/05/01 11:03:54 INFO mapred.MapTask: io.sort.mb = 100
13/05/01 11:03:54 WARN mapred.LocalJobRunner: job_local_0001
java.lang.OutOfMemoryError: Java heap space
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:949)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:674)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:756)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
13/05/01 11:03:55 INFO mapred.JobClient: map 0% reduce 0%
13/05/01 11:03:55 INFO mapred.JobClient: Job complete: job_local_0001
13/05/01 11:03:55 INFO mapred.JobClient: Counters: 0
Java needs a certain amount of memory to run programs. When a program needs more heap than the JVM is allowed to use, the JVM throws the error you are seeing. The solution is to tell Java to allocate more memory for the program. In this case you should be able to tell Hadoop to allocate the memory for you. Try the following:
bin/hadoop jar hadoop-cookbook-chapter1.jar chapter1.WordCount -Dmapred.child.java.opts=-Xmx1G input output
The option -Xmx1G allows the heap to grow up to 1 gigabyte.
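To see whether a heap option actually reached a JVM, you can ask the JVM for its limit. The sketch below uses a made-up class name and is not a Hadoop utility; it prints the maximum heap of the JVM it runs in and compares it with the 100 MB io.sort.mb buffer from the log. Keep in mind that the job_local_0001 and LocalJobRunner lines suggest the job ran in local mode, where the map task appears to run inside the client JVM, so an option aimed at child JVMs may not change the heap that matters.

```java
// Minimal sketch: reports the heap limit of the JVM it runs in.
final class HeapCheck {

    // Maximum heap in MB that this JVM will attempt to use (reflects -Xmx).
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    // True if a buffer of bufferMb could fit in a heap of heapMb at all.
    static boolean bufferFits(long heapMb, long bufferMb) {
        return bufferMb < heapMb;
    }

    public static void main(String[] args) {
        long heapMb = maxHeapMb();
        System.out.println("max heap: " + heapMb + " MB");
        // 100 is the io.sort.mb value printed in the log above.
        System.out.println("room for a 100 MB sort buffer: " + bufferFits(heapMb, 100));
    }
}
```

Running it once as java HeapCheck and once as java -Xmx1G HeapCheck shows how the reported limit follows the flag.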
This other Stack Overflow question is also very similar:
out of Memory Error in Hadoop