I using below command for export a HBase table into HDFS.
hbase org.apache.hadoop.hbase.mapreduce.Driver export "Table-name" "hdfs-path"
This command well executing for small size tables. But fail to export large data tables.
Error Logs:
2015-09-22 14:48:58,814 INFO [main] mapreduce.Job: Task Id : attempt_1442911480092_0002_m_000000_2, Status : FAILED
Container [pid=3575,containerID=container_1442911480092_0002_01_000004] is running beyond virtual memory limits. Current usage: 23.6 MB of 1 GB physical memory used; 4.8 GB of 2.1 GB virtual memory used. Killing container.
Dump of the process-tree for container_1442911480092_0002_01_000004 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
|- 3575 3573 3575 3575 (bash) 0 0 108609536 334 /bin/bash -c /opt/java/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx3024m -Djava.io.tmpdir=/opt/hadoop/tmp/nm-local-dir/usercache/root/appcache/application_1442911480092_0002/container_1442911480092_0002_01_000004/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/opt/hadoop/logs/userlogs/application_1442911480092_0002/container_1442911480092_0002_01_000004 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA org.apache.hadoop.mapred.YarnChild 10.127.128.149 44859 attempt_1442911480092_0002_m_000000_2 4 1>/opt/hadoop/logs/userlogs/application_1442911480092_0002/container_1442911480092_0002_01_000004/stdout 2>/opt/hadoop/logs/userlogs/application_1442911480092_0002/container_1442911480092_0002_01_000004/stderr
|- 3591 3575 3575 3575 (java) 10 2 5018701824 5704 /opt/java/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx3024m -Djava.io.tmpdir=/opt/hadoop/tmp/nm-local-dir/usercache/root/appcache/application_1442911480092_0002/container_1442911480092_0002_01_000004/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/opt/hadoop/logs/userlogs/application_1442911480092_0002/container_1442911480092_0002_01_000004 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA org.apache.hadoop.mapred.YarnChild 10.127.128.149 44859 attempt_1442911480092_0002_m_000000_2 4
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
2015-09-22 14:49:07,892 INFO [main] mapreduce.Job: map 100% reduce 0%
2015-09-22 14:49:07,911 INFO [main] mapreduce.Job: Job job_1442911480092_0002 failed with state FAILED due to: Task failed task_1442911480092_0002_m_000000
Job failed as tasks failed. failedMaps:1 failedReduces:0
2015-09-22 14:49:08,105 INFO [main] mapreduce.Job: Counters: 12
Job Counters
Failed map tasks=4
Launched map tasks=4
Other local map tasks=3
Rack-local map tasks=1
Total time spent by all maps in occupied slots (ms)=13484
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=13484
Total vcore-seconds taken by all map tasks=13484
Total megabyte-seconds taken by all map tasks=13807616
Map-Reduce Framework
CPU time spent (ms)=0
Physical memory (bytes) snapshot=0
Virtual memory (bytes) snapshot=0
Related
I'm trying to submit a teragen job to YARN like this:
yarn jar $YARN_EXAMPLES/hadoop-mapreduce-examples-3.3.1.jar teragen 1000 /teragen
It all goes well until it errors out:
2021-11-04 23:45:20,540 INFO mapreduce.Job: Running job: job_1636069364859_0003
2021-11-04 23:45:25,629 INFO mapreduce.Job: Job job_1636069364859_0003 running in uber mode : false
2021-11-04 23:45:25,630 INFO mapreduce.Job: map 0% reduce 0%
2021-11-04 23:45:27,658 INFO mapreduce.Job: Task Id : attempt_1636069364859_0003_m_000000_0, Status : FAILED
[2021-11-04 23:45:26.200]Exception from container-launch.
Container id: container_1636069364859_0003_01_000002
Exit code: 127
[2021-11-04 23:45:26.201]Container exited with a non-zero exit code 127. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
/bin/bash: line 1: m: command not found
I have no clue what the problem is. I've tried looking into the logs, especially the prelaunch.err file but it is empty. The stderr file has:
/bin/bash: line 1: m: command not found
Checking the node manager logs, I found this:
2021-11-04 23:44:05,765 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: s3a-file-system metrics system started
2021-11-04 23:44:06,423 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1636069364859_0001_01_000002 transitioned from LOCALIZING to SCHEDULED
2021-11-04 23:44:06,423 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler.ContainerScheduler: Starting container [container_1636069364859_0001_01_000002]
2021-11-04 23:44:06,453 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1636069364859_0001_01_000002 transitioned from SCHEDULED to RUNNING
2021-11-04 23:44:06,453 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Starting resource-monitoring for container_1636069364859_0001_01_000002
2021-11-04 23:44:06,457 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: launchContainer: [bash, /opt/yarn/local/usercache/vagrant/appcache/application_1636069364859_0001/container_1636069364859_0001_01_000002/default_container_executor.sh]
2021-11-04 23:44:06,477 WARN org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Exit code from container container_1636069364859_0001_01_000002 is : 127
2021-11-04 23:44:06,478 WARN org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Exception from container-launch with container ID: container_1636069364859_0001_01_000002 and exit code: 127
ExitCodeException exitCode=127:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:1008)
at org.apache.hadoop.util.Shell.run(Shell.java:901)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1213)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:309)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.launchContainer(ContainerLaunch.java:585)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:373)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:103)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2021-11-04 23:44:06,479 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Exception from container-launch.
2021-11-04 23:44:06,479 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Container id: container_1636069364859_0001_01_000002
2021-11-04 23:44:06,479 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Exit code: 127
2021-11-04 23:44:06,479 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Container launch failed : Container exited with a non-zero exit code 127.
2021-11-04 23:44:06,501 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1636069364859_0001_01_000002 transitioned from RUNNING to EXITED_WITH_FAILURE
2021-11-04 23:44:06,503 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerCleanup: Cleaning up container container_1636069364859_0001_01_000002
2021-11-04 23:44:06,515 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping s3a-file-system metrics system...
2021-11-04 23:44:06,515 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: s3a-file-system metrics system stopped.
2021-11-04 23:44:06,515 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: s3a-file-system metrics system shutdown complete.
2021-11-04 23:44:06,525 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting absolute path : /opt/yarn/local/usercache/vagrant/appcache/application_1636069364859_0001/container_1636069364859_0001_01_000002
2021-11-04 23:44:06,526 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1636069364859_0001_01_000002 transitioned from EXITED_WITH_FAILURE to DONE
2021-11-04 23:44:06,526 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl: Removing container_1636069364859_0001_01_000002 from application application_1636069364859_0001
2021-11-04 23:44:06,526 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Stopping resource-monitoring for container_1636069364859_0001_01_000002
I've read other responses and when they mention that Java is missing or JAVA_HOME is not set. That's not my case, my JAVA_HOME is set to /usr/lib/jvm/java-8-openjdk-amd64.
Any idea what could be going on here? Thanks :)
The problem was with the memory allocated for each container. Some containers were not living long enough to actually log the error apparently.
But after several attempts I actually got an error that looked like this:
Error occurred during initialization of VM
Too small initial heap
For some reason, the memory configuration for YARN and MapReduce jobs I was using was not correct. I ended up using Ambari's HDP yarn-util.py to get the appropriate values for my setup.
I am trying Hadoop map-reduce in Linux (Ubuntu Virtual Machine) by following the link
I ran the wordcount example on a sample file. The process gets killed unexpectedly. How can I debug this ?
Initially I was getting an insufficient memory error on large data set.
15/11/28 19:24:27 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
15/11/28 19:24:27 INFO mapred.MapTask: Processing split: hdfs://localhost:54310/user/hduser/eg2/a.txt:0+1538
Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x00000000e6093000, 104861696, 0) failed; error='Cannot allocate memory' (errno=12)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 104861696 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /usr/local/hadoop/hs_err_pid7516.log
So I reduced the size of my files and tried again which resulted in unexpected termination.
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar wordcount /user/hduser/eg2/ /user/hduser/eg2/eg2-output2
......
......
15/11/28 18:55:44 INFO mapred.LocalJobRunner: Waiting for map tasks
15/11/28 18:55:44 INFO mapred.LocalJobRunner: Starting task: attempt_local1996683170_0001_m_000000_0
15/11/28 18:55:44 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
15/11/28 18:55:44 INFO mapred.MapTask: Processing split: hdfs://localhost:54310/user/hduser/eg2/a.txt:0+1538
15/11/28 18:55:45 INFO mapreduce.Job: Job job_local1996683170_0001 running in uber mode : false
15/11/28 18:55:45 INFO mapreduce.Job: map 0% reduce 0%
Killed
Why is the process getting terminated ?
Try:
Hadoop job -list
Kill all jobs and rerun it:
Hadoop job –kill <JobID>
Try checking the logs of job tracker for error
http://localhost:50070/ – web UI of the NameNode daemon
http://localhost:50030/ – web UI of the JobTracker daemon
http://localhost:50060/ – web UI of the TaskTracker daemon
The size of the data set didn't matter. Hadoop didn't have enough memory to start. I tried increasing the memory of my VM and the issue got fixed.
My question may sound redundant here but the solution to the earlier questions were all ad-hoc. few I have tried but no luck yet.
Acutally, I am working on hadoop-1.2.1(on ubuntu 14), Initially I had single node set-up and there I ran the WordCount program succesfully. Then I added one more node to it according to this tutorial. It started successfully, without any errors, But now when I am running the same WordCount program it is hanging in reduce phase. I looked at task-tracker logs, they are as given below :-
INFO org.apache.hadoop.mapred.TaskTracker: LaunchTaskAction (registerTask): attempt_201509110037_0001_m_000002_0 task's state:UNASSIGNED
INFO org.apache.hadoop.mapred.TaskTracker: Trying to launch : attempt_201509110037_0001_m_000002_0 which needs 1 slots
INFO org.apache.hadoop.mapred.TaskTracker: In TaskLauncher, current free slots : 2 and trying to launch attempt_201509110037_0001_m_000002_0 which needs 1 slots
INFO org.apache.hadoop.mapred.JobLocalizer: Initializing user hadoopuser on this TT.
INFO org.apache.hadoop.mapred.JvmManager: In JvmRunner constructed JVM ID: jvm_201509110037_0001_m_18975496
INFO org.apache.hadoop.mapred.JvmManager: JVM Runner jvm_201509110037_0001_m_18975496 spawned.
INFO org.apache.hadoop.mapred.TaskController: Writing commands to /app/hadoop/tmp/mapred/local/ttprivate/taskTracker/hadoopuser/jobcache/job_201509110037_0001/attempt_201509110037_0001_m_000002_0/taskjvm.sh
INFO org.apache.hadoop.mapred.TaskTracker: JVM with ID: jvm_201509110037_0001_m_18975496 given task: attempt_201509110037_0001_m_000002_0
INFO org.apache.hadoop.mapred.TaskTracker: attempt_201509110037_0001_m_000002_0 0.0% hdfs://HadoopMaster:54310/input/file02:25+3
INFO org.apache.hadoop.mapred.TaskTracker: Task attempt_201509110037_0001_m_000002_0 is done.
INFO org.apache.hadoop.mapred.TaskTracker: reported output size for attempt_201509110037_0001_m_000002_0 was 6
INFO org.apache.hadoop.mapred.TaskTracker: addFreeSlot : current free slots : 2
INFO org.apache.hadoop.mapred.JvmManager: JVM : jvm_201509110037_0001_m_18975496 exited with exit code 0. Number of tasks it ran: 1
INFO org.apache.hadoop.mapred.TaskTracker: LaunchTaskAction (registerTask): attempt_201509110037_0001_r_000000_0 task's state:UNASSIGNED
INFO org.apache.hadoop.mapred.TaskTracker: Trying to launch : attempt_201509110037_0001_r_000000_0 which needs 1 slots
INFO org.apache.hadoop.mapred.TaskTracker: In TaskLauncher, current free slots : 2 and trying to launch attempt_201509110037_0001_r_000000_0 which needs 1 slots
INFO org.apache.hadoop.io.nativeio.NativeIO: Initialized cache for UID to User mapping with a cache timeout of 14400 seconds.
INFO org.apache.hadoop.io.nativeio.NativeIO: Got UserName hadoopuser for UID 10 from the native implementation
INFO org.apache.hadoop.mapred.JvmManager: In JvmRunner constructed JVM ID: jvm_201509110037_0001_r_18975496
INFO org.apache.hadoop.mapred.JvmManager: JVM Runner jvm_201509110037_0001_r_18975496 spawned.
INFO org.apache.hadoop.mapred.TaskController: Writing commands to /app/hadoop/tmp/mapred/local/ttprivate/taskTracker/hadoopuser/jobcache/job_201509110037_0001/attempt_201509110037_0001_r_000000_0/taskjvm.sh
INFO org.apache.hadoop.mapred.TaskTracker: JVM with ID: jvm_201509110037_0001_r_18975496 given task: attempt_201509110037_0001_r_000000_0
INFO org.apache.hadoop.mapred.TaskTracker.clienttrace: src: 127.0.1.1:500, dest: 127.0.0.1:55946, bytes: 6, op: MAPRED_SHUFFLE, cliID: attempt_201509110037_0001_m_000002_0, duration: 7129894
INFO org.apache.hadoop.mapred.TaskTracker: attempt_201509110037_0001_r_000000_0 0.11111112% reduce > copy (1 of 3 at 0.00 MB/s) >
INFO org.apache.hadoop.mapred.TaskTracker: attempt_201509110037_0001_r_000000_0 0.11111112% reduce > copy (1 of 3 at 0.00 MB/s) >
INFO org.apache.hadoop.mapred.TaskTracker: attempt_201509110037_0001_r_000000_0 0.11111112% reduce > copy (1 of 3 at 0.00 MB/s) >
INFO org.apache.hadoop.mapred.TaskTracker: attempt_201509110037_0001_r_000000_0 0.11111112% reduce > copy (1 of 3 at 0.00 MB/s) >
INFO org.apache.hadoop.mapred.TaskTracker: attempt_201509110037_0001_r_000000_0 0.11111112% reduce > copy (1 of 3 at 0.00 MB/s) >
INFO org.apache.hadoop.mapred.TaskTracker: attempt_201509110037_0001_r_000000_0 0.11111112% reduce > copy (1 of 3 at 0.00 MB/s) >
Also on the console where I am running the program It hangs at -
00:39:24 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
00:39:24 INFO util.NativeCodeLoader: Loaded the native-hadoop library
00:39:24 WARN snappy.LoadSnappy: Snappy native library not loaded
00:39:24 INFO mapred.FileInputFormat: Total input paths to process : 2
00:39:24 INFO mapred.JobClient: Running job: job_201509110037_0001
00:39:25 INFO mapred.JobClient: map 0% reduce 0%
00:39:28 INFO mapred.JobClient: map 100% reduce 0%
00:39:35 INFO mapred.JobClient: map 100% reduce 11%
and my configuration files are as follows :-
//core-site.xml
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/app/hadoop/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://HadoopMaster:54310</value>
<description>The name of the default file system. A URI whose
scheme and authority determine the FileSystem implementation. The
uri's scheme determines the config property (fs.SCHEME.impl) naming
the FileSystem implementation class. The uri's authority is used to
determine the host, port, etc. for a filesystem.</description>
</property>
</configuration>
//hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
<description>Default block replication.
The actual number of replications can be specified when the file is created.
The default is used if replication is not specified in create time.
</description>
</property>
</configuration>
//mapred-site.xml
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>HadoopMaster:54311</value>
<description>The host and port that the MapReduce job tracker runs
at. If "local", then jobs are run in-process as a single map
and reduce task.
</description>
</property>
<property>
<name>mapred.reduce.slowstart.completed.maps</name>
<value>0.80</value>
</property>
</configuration>
/etc/hosts
127.0.0.1 localhost
127.0.1.1 M-1947
#HADOOP CLUSTER SETUP
172.50.88.54 HadoopMaster
172.50.88.60 HadoopSlave1
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
/etc/hostname
M-1947
//masters
HadoopMaster
//slaves
HadoopMaster
HadoopSlave1
I have been struggling with it for long, any help is appreciated. Thanks !
Got it fixed.. although, the same issue has multiple questions on the forums but the verified solution according to me is that hostname resolution for the any node in the cluster should be correct (moreover this issue doesnot depend upon the size of cluster).
Actually it is the issue with dns-lookup, ensure one make the below changes to resolve the above issue -
try printing hostname on each machine using '$ hostname'
check that the hostname printed for each machine is same as the entry made in master/slaves file for respective machine.
If it doesn't matches then rename the host by making changes in the /etc/hostname file and reboot the system.
Example :-
in /etc/hosts file (let's say on Master machine of hadoop cluster)
127.0.0.1 localhost
127.0.1.1 john-machine
#Hadoop cluster
172.50.88.21 HadoopMaster
172.50.88.22 HadoopSlave1
172.50.88.23 HadoopSlave2
then it's -> /etc/hostname file (on master machine) should contain the following entry (for the above issue to be resolved)
HadoopMaster
similarly verify the /etc/hostname files of the each slave node.
Cloudera CDH5.2 Quickstart VM
Cloudera Manager showing all nodes state = GREEN
I've jared on Eclipse a MR job including all relevant cloudera jars in the Build Path:
avro-1.7.6-cdh5.2.0.jar,
avro-mapred-1.7.6-cdh5.2.0-hadoop2.jar,
hadoop-common-2.5.0-cdh5.2.0.jar,
hadoop-mapreduce-client-core-2.5.0-cdh5.2.0.jar
I've run the following job
hadoop jar jproject1.jar avro00.AvroUserPrefCount -libjars ${LIBJARS} avro/00/in avro/00/out
I get the following error, is it a Java heap problem, any comments ? Thank you in advance
14/11/14 01:02:40 INFO client.RMProxy: Connecting to ResourceManager at quickstart.cloudera/127.0.0.1:8032
14/11/14 01:02:43 INFO input.FileInputFormat: Total input paths to process : 1
14/11/14 01:02:43 INFO mapreduce.JobSubmitter: number of splits:1
14/11/14 01:02:44 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1415950730849_0001
14/11/14 01:02:45 INFO impl.YarnClientImpl: Submitted application application_1415950730849_0001
14/11/14 01:02:45 INFO mapreduce.Job: The url to track the job: http://quickstart.cloudera:8088/proxy/application_1415950730849_0001/
14/11/14 01:02:45 INFO mapreduce.Job: Running job: job_1415950730849_0001
14/11/14 01:03:04 INFO mapreduce.Job: Job job_1415950730849_0001 running in uber mode : false
14/11/14 01:03:04 INFO mapreduce.Job: map 0% reduce 0%
14/11/14 01:03:11 INFO mapreduce.Job: Task Id : attempt_1415950730849_0001_m_000000_0, Status : FAILED
Error: java.io.IOException: Unable to initialize any output collector
at org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:412)
at org.apache.hadoop.mapred.MapTask.access$100(MapTask.java:81)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:695)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:767)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
...
...
Checking the full task log of the failed attempt attempt_1415950730849_0001_m_000000_0 will help tell why you ran into the given exception.
The most common reason of observing such an error is a misconfigured value of io.sort.mb in your job. Its value must never be anywhere close to (or higher than) the configured map task heap size, and must also not currently exceed ~2000 MB (Java array maximum size).
An upstream improvement of making the error more clear on the true failure was also filed and resolved recently, via MAPREDUCE-6194.
I encountered the same issue yesterday. I checked the syslog for the particular map task which was failing, which suggested that I was getting another exception in that task which was triggering this error. In my case this was an invalid parsing, and when I corrected that issue, this error was fixed.
Closer examination of the log for the failed task should give you the root cause for the issue.
I am just getting started with linux/java/hadoop/EMR.
I am following this neat book.
The assignment is to run:
bin/hadoop jar hadoop-cookbook-chapter1.jar chapter1.WordCount input output
And this is the response that I get:
alex#HadoopMachine:/usr/share/hadoop$ sudo hadoop jar hadoop-cookbook-chapter1.jar chapter1.WordCount input output
13/05/01 01:01:08 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
13/05/01 01:01:08 INFO input.FileInputFormat: Total input paths to process : 1
13/05/01 01:01:08 WARN snappy.LoadSnappy: Snappy native library not loaded
13/05/01 01:01:09 INFO mapred.JobClient: Running job: job_local_0001
13/05/01 01:01:09 INFO util.ProcessTree: setsid exited with exit code 0
13/05/01 01:01:09 INFO mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin#1c04d881
13/05/01 01:01:09 INFO mapred.MapTask: io.sort.mb = 100
13/05/01 01:01:09 WARN mapred.LocalJobRunner: job_local_0001
java.lang.OutOfMemoryError: Java heap space
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:949)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:674)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:756)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
13/05/01 01:01:10 INFO mapred.JobClient: map 0% reduce 0%
13/05/01 01:01:10 INFO mapred.JobClient: Job complete: job_local_0001
13/05/01 01:01:10 INFO mapred.JobClient: Counters: 0
Frankly, since I have almost no java background, I do not even know where to start debugging.
I would be most grateful for any guidance on how to tackle this issue.
update
after following greedybuddha's advice i am getting:
alex#HadoopMachine:/usr/share/hadoop$ sudo hadoop jar hadoop-cookbook-chapter1.jar chapter1.WordCount -Dmapred.child.java.opts=-Xmx1G input output
[sudo] password for alex:
13/05/01 11:03:54 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
13/05/01 11:03:54 INFO input.FileInputFormat: Total input paths to process : 1
13/05/01 11:03:54 WARN snappy.LoadSnappy: Snappy native library not loaded
13/05/01 11:03:54 INFO mapred.JobClient: Running job: job_local_0001
13/05/01 11:03:54 INFO util.ProcessTree: setsid exited with exit code 0
13/05/01 11:03:54 INFO mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin#35756b65
13/05/01 11:03:54 INFO mapred.MapTask: io.sort.mb = 100
13/05/01 11:03:54 WARN mapred.LocalJobRunner: job_local_0001
java.lang.OutOfMemoryError: Java heap space
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:949)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:674)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:756)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
13/05/01 11:03:55 INFO mapred.JobClient: map 0% reduce 0%
13/05/01 11:03:55 INFO mapred.JobClient: Job complete: job_local_0001
13/05/01 11:03:55 INFO mapred.JobClient: Counters: 0
Java needs a certain amount of memory to run programs. When a program uses too much, it will throw the error you are having. The solution is to tell java to allocate more memory for the program. In this case you should be able to tell hadoop to allocate you the memory. Try the following.
bin/hadoop jar hadoop-cookbook-chapter1.jar chapter1.WordCount -Dmapred.child.java.opts=-Xmx1G input output
the option -Xmx1G says allow up 1 Gigabyte.
This other stackoverflow question is also very similar.
out of Memory Error in Hadoop