I am running Hadoop 0.21.0 on a single-node cluster to process a single large file (> 200 GB). To decrease the execution time, I have tried different HDFS block sizes (128, 256, and 512 MB; 1, 1.5, and 1.75 GB). However, I get the following exception when using a block size >= 2 GB.
Note: I am using java-8-oracle.
2015-08-05 12:02:12,524 WARN org.apache.hadoop.mapred.Child: Exception running child : java.lang.IndexOutOfBoundsException
at org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:186)
at org.apache.hadoop.hdfs.BlockReader.read(BlockReader.java:113)
at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:466)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:517)
at java.io.DataInputStream.readFully(DataInputStream.java:195)
at java.io.DataInputStream.readFully(DataInputStream.java:169)
at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1518)
at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1483)
at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1451)
at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1432)
at org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader.initialize(SequenceFileRecordReader.java:60)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:460)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:651)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:328)
at org.apache.hadoop.mapred.Child$4.run(Child.java:217)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:742)
at org.apache.hadoop.mapred.Child.main(Child.java:211)
For the Hadoop version you are using (0.21.0), that seems to be the case.
The issue you are hitting was fixed in the next version; see more here: https://issues.apache.org/jira/browse/HDFS-96
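In the meantime, here is a minimal sketch of keeping the block size under the 2 GB limit when loading the file. The property key dfs.block.size and the paths are illustrative assumptions, not taken from the question:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class LoadWithBlockSize {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // 1.75 GB per block -- anything >= 2 GB overflows the signed int
        // arithmetic in the 0.21.0 read path (see HDFS-96)
        conf.setLong("dfs.block.size", 1792L * 1024 * 1024);
        FileSystem fs = FileSystem.get(conf);
        // Illustrative paths, not from the question
        fs.copyFromLocalFile(new Path("/local/bigfile.seq"), new Path("/user/input/bigfile.seq"));
        fs.close();
    }
}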
I'm running a JMeter .jmx test plan from the command line:
JVM_ARGS="-Xms2048m -Xmx4096m -XX:NewSize=4096m -XX:MaxNewSize=4096m" && export JVM_ARGS && ./jmeter.sh -n -t ./jmeter-ec2.jmx -l ./scriptresults.jtl
but at some point I got an out-of-memory error. Looking into jmeter.log, I found this error:
ERROR o.a.j.JMeter: Uncaught exception: java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:3236) ~[?:1.8.0_91]
    at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:118) ~[?:1.8.0_91]
    at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93) ~[?:1.8.0_91]
    at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:153) ~[?:1.8.0_91]
    at org.apache.jmeter.protocol.http.sampler.HTTPSamplerBase.readResponse(HTTPSamplerBase.java:1833) ~[ApacheJMeter_http.jar:3.3 r1808647]
    at org.apache.jmeter.protocol.http.sampler.HTTPAbstractImpl.readResponse(HTTPAbstractImpl.java:440) ~[ApacheJMeter_http.jar:3.3 r1808647]
    at org.apache.jmeter.protocol.http.sampler.HTTPHC4Impl.sample(HTTPHC4Impl.java:474) ~[ApacheJMeter_http.jar:3.3 r1808647]
    at org.apache.jmeter.protocol.http.sampler.HTTPSamplerProxy.sample(HTTPSamplerProxy.java:74) ~[ApacheJMeter_http.jar:3.3 r1808647]
    at org.apache.jmeter.protocol.http.sampler.HTTPSamplerBase.sample(HTTPSamplerBase.java:1189) ~[ApacheJMeter_http.jar:3.3 r1808647]
    at org.apache.jmeter.protocol.http.sampler.HTTPSamplerBase.sample(HTTPSamplerBase.java:1178) ~[ApacheJMeter_http.jar:3.3 r1808647]
    at org.apache.jmeter.threads.JMeterThread.executeSamplePackage(JMeterThread.java:498) ~[ApacheJMeter_core.jar:3.3 r1808647]
    at org.apache.jmeter.threads.JMeterThread.processSampler(JMeterThread.java:424) ~[ApacheJMeter_core.jar:3.3 r1808647]
    at org.apache.jmeter.threads.JMeterThread.run(JMeterThread.java:255) ~[ApacheJMeter_core.jar:3.3 r1808647]
    at java.lang.Thread.run(Thread.java:745) [?:1.8.0_91]
2018-01-26 02:03:55,731 INFO o.a.j.e.StandardJMeterEngine: Notifying test listeners of end of test
2018-01-26 02:03:55,732 INFO o.a.j.r.Summariser: summary = 0 in 00:00:00 = ******/s Avg: 0 Min: 9223372036854775807 Max: -9223372036854775808 Err: 0 (0.00%)
what I"M doing wrong here ? I cant solve it:(
Your JVM arguments are wrong, just keep:
-Xms2048m -Xmx4096m
You don't say how many threads this occurs with, nor whether you're running in GUI or non-GUI mode, so:
Don't run in GUI mode, it's an anti-pattern
Ensure you have enough memory for your threads
Finally, you can reduce the memory footprint of large responses by adjusting this property in user.properties:
httpsampler.max_bytes_to_store_per_request
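For example (the value below is illustrative, in bytes, and should be tuned to what your assertions actually need):
httpsampler.max_bytes_to_store_per_request=262144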
Another option is to compute only a hash of the response by configuring the HTTP Request sampler as described in http://jmeter.apache.org/usermanual/component_reference.html#HTTP_Request.
Well, given you have a 1.5 GB file, you will be able to run no more than 3 virtual users, which doesn't look like a "load test" to me.
If you are not interested in the downloaded file's content and just want to stress your server, you can consider switching to a JSR223 Sampler, which will send the request and discard the response data using the underlying Apache HttpComponents library methods. The relevant Groovy code would be something like:
import org.apache.http.client.methods.HttpGet
import org.apache.http.impl.client.HttpClientBuilder
import org.apache.http.util.EntityUtils
// Build a default HttpClient, issue the GET, then fully consume and discard
// the response body instead of buffering it in memory
def client = HttpClientBuilder.create().build()
def get = new HttpGet('http://example.com')
def response = client.execute(get)
EntityUtils.consume(response.getEntity())
References:
HttpClient Tutorial
HttpClient Quick Start
Apache Groovy - Why and How You Should Use It
I have a 2-node test cluster on AWS with spark-2.0.0-bin-hadoop2.7 installed.
This is the command I'm using to launch the cluster:
./spark-ec2 -k blah -i blah.pem -r us-west-1 -s 1 -t r3.2xlarge launch --copy-aws-credentials blah
Viewing port 8080 shows 58.8 GB (0.0 B used) of memory after running these two lines in RStudio.
Sys.setenv(SPARK_HOME="/root/spark")
library(SparkR, lib.loc = c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib")))
When I run the following and refresh the page on port 8080, the memory usage changes to 58.8 GB (53.8 GB used).
sparkR.session(master = "spark://[ip]:7077",
sparkHome = '/root/spark',
enableHiveSupport = FALSE)
When I try to create a Spark DataFrame from an R data frame that should consume 0.04857268 GB of memory, I get this error:
acquisition <- as.DataFrame(orig)
17/11/04 14:27:23 WARN TaskSetManager: Stage 0 contains a task of very large size (166360 KB). The maximum recommended task size is 100 KB.
Exception in thread "dispatcher-event-loop-1" java.lang.OutOfMemoryError: Java heap space
I tried adding this but get the same error.
options(java.parameters = "-Xmx2048m")
install.packages("rJava")
library(rJava)
I'm stuck. I've spent three weekends googling this issue and can't figure it out.
Thanks.
I have a large RDD of objects, about 10 GB in size. I want to convert it to a lookup table to be used in Spark with this command:
val lookupTable = sparkContext.broadcast(entitiesRDD.collect)
but it fails with:
17/02/27 17:33:25 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, d1): org.apache.spark.SparkException: Kryo serialization failed: Buffer overflow. Available: 0, required: 2. To avoid this, increase spark.kryoserializer.buffer.max value.
at org.apache.spark.serializer.KryoSerializerInstance.serialize(KryoSerializer.scala:299)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:240)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
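The message suggests raising spark.kryoserializer.buffer.max. For reference, a minimal sketch (Java API, illustrative app name, not code from the question) of where that setting normally goes; the same value can also be passed with --conf on spark-submit:
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class KryoBufferConfig {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
                .setAppName("lookup-table-example")  // illustrative name
                .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
                .set("spark.kryoserializer.buffer.max", "1g");  // hard limit is just under 2048m
        JavaSparkContext sc = new JavaSparkContext(conf);
        // ... build and broadcast the lookup table here ...
        sc.stop();
    }
}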
I cannot increase spark.kryoserializer.buffer.max past 2048 MB, or I get this error:
Caused by: java.lang.IllegalArgumentException: spark.kryoserializer.buffer.max must be less than 2048 mb, got: + 2048 mb.
at org.apache.spark.serializer.KryoSerializer.<init>(KryoSerializer.scala:66)
How do other people build large lookup tables in Spark?
Please, before marking this as a duplicate, read this: I have gone through all the answers provided for this error and nothing helped in my scenario.
I am doing a server migration where the same setup works well on 32-bit but runs out of memory on 64-bit.
I have a Windows service that internally points to an .exe which spawns a Java process. I have made all the possible memory adjustments in the config file of my .exe, shown below.
I am not sure what different behavior is causing this out-of-memory error on the 64-bit server (my Java version is 1.8.xx).
#Java Additional Parameters
wrapper.java.additional.1=-XX:+UseConcMarkSweepGC
wrapper.java.additional.2=-XX:+UseParNewGC
wrapper.java.additional.3=-XX:ParallelGCThreads=8
wrapper.java.additional.4=-verbose:gc
# wrapper.java.additional.!!! should be sequence !!!=-Xloggc:D:\apps\Logs\gc.log
# wrapper.java.additional.!!! should be sequence !!!=-XX:+PrintGCDetails
# wrapper.java.additional.!!! should be sequence !!!=-XX:+PrintGCTimeStamps
wrapper.java.additional.5=-XX:MaxDirectMemorySize=128m
wrapper.java.additional.6=-XX:+HeapDumpOnOutOfMemoryError
wrapper.java.additional.7=-Dcom.sun.management.jmxremote.port=34001
wrapper.java.additional.8=-Dcom.sun.management.jmxremote.ssl=false
wrapper.java.additional.9=-Dcom.sun.management.jmxremote.authenticate=false
wrapper.java.additional.10=-XX:CMSInitiatingOccupancyFraction=55
wrapper.java.additional.11=-XX:NewSize=474m
wrapper.java.additional.12=-XX:MaxNewSize=474m
#wrapper.java.additional.13=-XX:PermSize=128m
#wrapper.java.additional.14=-XX:MaxPermSize=128m
wrapper.java.additional.15=-Xss128k
wrapper.java.additional.16=-XX:+CMSIncrementalMode
wrapper.java.additional.17=-XX:+UseCompressedOops
# Initial Java Heap Size (in MB)
wrapper.java.initmemory=1638
# Maximum Java Heap Size (in MB)
wrapper.java.maxmemory=1638
Still, I end up with:
[severe 2016/10/24 06:27:46.192 java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Unknown Source)
at com.gemstone.gemfire.internal.SocketCreator.asyncClose(SocketCreator.java:688)
I have done some reading on the concept here:
Error reading
I am not much into Java, but I have tried everything I can from my side. Any help on this would be highly appreciated; I have spent a huge amount of time on this but have not been able to reach any conclusion.
***********Update***************
So basically I figured out that this problem was caused by excessive thread creation from GemFire, which exceeds a threshold of ~800 threads for the GemFire Java process.
The JConsole tool helped me count the threads; I could see around 200-300 threads from different pools being created with no apparent purpose beyond the usual threads, with descriptions like:
Name: pool-9-thread-1
State: WAITING on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject#163b285
Total blocked: 0 Total waited: 2
Stack trace:
sun.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.park(Unknown Source)
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(Unknown Source)
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(Unknown Source)
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(Unknown Source)
java.util.concurrent.ThreadPoolExecutor.getTask(Unknown Source)
java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
java.lang.Thread.run(Unknown Source)
I'll add more details if I can find more on this!
*******Update 2 : ************
I managed to see all the threads created by GemFire using JConsole.
This number keeps increasing, and after a certain point I see the OOM issue. Is there any way I can stop this unnecessary thread creation and memory consumption?
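As a side note, here is a minimal sketch (plain JDK, no GemFire specifics) of logging the live thread count programmatically. It reports the threads of the JVM it runs in, so it would have to run inside the service process; JConsole gets the same numbers remotely over JMX:
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class ThreadCountLogger {
    public static void main(String[] args) throws InterruptedException {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        while (true) {
            System.out.println("Live threads: " + threads.getThreadCount());
            // Dump names/states to spot which pools keep growing (e.g. "pool-9-thread-1")
            for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
                System.out.println("  " + info.getThreadName() + " - " + info.getThreadState());
            }
            Thread.sleep(60_000);
        }
    }
}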
I am confronted with a weird problem. I have a MapReduce class which looks for patterns in a file (the pattern file goes into the DistributedCache). Now I wanted to reuse this class to run it for 1000 pattern files. I just had to extend the pattern-matching class and override its main and run functions. In the run() of the child class I modify the command-line arguments and feed them to the parent's run() function. Everything goes well up until iteration 45-50. Suddenly all tasktrackers start to fail until no progress is made. I checked HDFS, but 70% of the space is still free. Does anybody have any idea why launching 50 jobs one by one causes difficulties for Hadoop?
@Override
public int run(String[] args) throws Exception {
    // -patterns patternsDIR input/ output/
    List<String> files = getFiles(args[1]);
    String inputDataset = args[2];
    String outputDir = args[3];

    for (int i = 0; i < files.size(); i++) {
        String[] newArgs = modifyArgs(args);
        super.run(newArgs);
    }
    return 0;
}
EDIT: Just checked the job logs, this is the first error occurring:
2013-11-12 09:03:01,665 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hduser cause:java.io.IOException: java.lang.OutOfMemoryError: Java heap space
2013-11-12 09:03:32,971 INFO org.apache.hadoop.mapred.JobInProgress: Task 'attempt_201311120807_0053_m_000053_0' has completed task_201311120807_0053_m_000053 successfully.
2013-11-12 09:07:51,717 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hduser cause:java.io.IOException: java.lang.OutOfMemoryError: Java heap space
2013-11-12 09:08:05,973 INFO org.apache.hadoop.mapred.JobInProgress: Task 'attempt_201311120807_0053_m_000128_0' has completed task_201311120807_0053_m_000128 successfully.
2013-11-12 09:08:16,571 INFO org.apache.hadoop.mapred.JobInProgress: Task 'attempt_201311120807_0053_m_000130_0' has completed task_201311120807_0053_m_000130 successfully.
2013-11-12 09:08:16,571 WARN org.apache.hadoop.hdfs.LeaseRenewer: Failed to renew lease for [DFSClient_NONMAPREDUCE_1595161181_30] for 30 seconds. Will retry shortly ...
2013-11-12 09:08:27,175 INFO org.apache.hadoop.mapred.JobInProgress: Task 'attempt_201311120807_0053_m_000138_0' has completed task_201311120807_0053_m_000138 successfully.
2013-11-12 09:08:25,241 ERROR org.mortbay.log: EXCEPTION
java.lang.OutOfMemoryError: Java heap space
2013-11-12 09:08:25,241 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 54311, call heartbeat(org.apache.hadoop.mapred.TaskTrackerStatus#7fcb9c0a, false, false, true, 9834) from 10.1.1.13:55028: error: java.io.IOException: java.lang.OutOfMemoryError: Java heap space
java.io.IOException: java.lang.OutOfMemoryError: Java heap space
at java.lang.AbstractStringBuilder.<init>(AbstractStringBuilder.java:62)
at java.lang.StringBuilder.<init>(StringBuilder.java:97)
at org.apache.hadoop.util.StringUtils.escapeString(StringUtils.java:435)
at org.apache.hadoop.mapred.Counters.escape(Counters.java:768)
at org.apache.hadoop.mapred.Counters.access$000(Counters.java:52)
at org.apache.hadoop.mapred.Counters$Counter.makeEscapedCompactString(Counters.java:111)
at org.apache.hadoop.mapred.Counters$Group.makeEscapedCompactString(Counters.java:221)
at org.apache.hadoop.mapred.Counters.makeEscapedCompactString(Counters.java:648)
at org.apache.hadoop.mapred.JobHistory$MapAttempt.logFinished(JobHistory.java:2276)
at org.apache.hadoop.mapred.JobInProgress.completedTask(JobInProgress.java:2636)
at org.apache.hadoop.mapred.JobInProgress.updateTaskStatus(JobInProgress.java:1222)
at org.apache.hadoop.mapred.JobTracker.updateTaskStatuses(JobTracker.java:4471)
at org.apache.hadoop.mapred.JobTracker.processHeartbeat(JobTracker.java:3306)
at org.apache.hadoop.mapred.JobTracker.heartbeat(JobTracker.java:3001)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:587)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1432)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1428)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:416)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1426)
2013-11-12 09:08:16,571 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 54311, call heartbeat(org.apache.hadoop.mapred.TaskTrackerStatus#3269c671, false, false, true, 9841) from 10.1.1.23:42125: error: java.io.IOException: java.lang.OutOfMemoryError: Java heap space
java.io.IOException: java.lang.OutOfMemoryError: Java heap space
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$Packet.<init>(DFSClient.java:2875)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.writeChunk(DFSClient.java:3806)
at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunk(FSOutputSummer.java:150)
at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:132)
at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:121)
at org.apache.hadoop.fs.FSOutputSummer.write1(FSOutputSummer.java:112)
at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:86)
at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:49)
at java.io.DataOutputStream.write(DataOutputStream.java:107)
at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:220)
at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:290)
at sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:294)
at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:140)
at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:229)
at java.io.BufferedWriter.flush(BufferedWriter.java:253)
at java.io.PrintWriter.flush(PrintWriter.java:293)
at java.io.PrintWriter.checkError(PrintWriter.java:330)
at org.apache.hadoop.mapred.JobHistory.log(JobHistory.java:847)
at org.apache.hadoop.mapred.JobHistory$MapAttempt.logStarted(JobHistory.java:2225)
at org.apache.hadoop.mapred.JobInProgress.completedTask(JobInProgress.java:2632)
at org.apache.hadoop.mapred.JobInProgress.updateTaskStatus(JobInProgress.java:1222)
at org.apache.hadoop.mapred.JobTracker.updateTaskStatuses(JobTracker.java:4471)
at org.apache.hadoop.mapred.JobTracker.processHeartbeat(JobTracker.java:3306)
at org.apache.hadoop.mapred.JobTracker.heartbeat(JobTracker.java:3001)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:587)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1432)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1428)
at java.security.AccessController.doPrivileged(Native Method)
And after that we see a bunch of:
2013-11-12 09:13:48,204 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201311120807_0053_m_000033_0: Lost task tracker: tracker_n144-06b.wall1.ilabt.iminds.be:localhost/127.0.0.1:47567
EDIT2: Some ideas?
The heap space error is kind of unexpected since the mappers hardly require any memory.
I am calling the base class with super.run(); should I use a ToolRunner call for that? (See the sketch after this list.)
In every iteration a file with approximately 1000 words + scores is added to the DistributedCache. I am not sure whether I should reset the cache somewhere. (Every job in super.run() runs with job.waitForCompletion(); is the cache cleared then?)
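On the ToolRunner point, here is a hedged sketch (class and helper names are illustrative, not from the original code) of driving each iteration through ToolRunner.run() with a fresh Configuration instead of calling super.run() directly:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.util.ToolRunner;

public class PatternJobDriver {
    public static void main(String[] args) throws Exception {
        // args: patternsDIR inputDataset outputDir -- mirrors the original layout
        String inputDataset = args[1];
        String outputDir = args[2];
        int i = 0;
        for (String patternFile : getFiles(args[0])) {  // getFiles() as in the original code
            String[] jobArgs = {"-patterns", patternFile, inputDataset, outputDir + "/" + i++};
            // Passing a new Configuration per run gives each submitted job a clean
            // client-side config, including its own DistributedCache entries
            int rc = ToolRunner.run(new Configuration(), new PatternMatchJob(), jobArgs);  // PatternMatchJob: hypothetical Tool implementation
            if (rc != 0) System.err.println("Iteration failed for " + patternFile);
        }
    }

    private static java.util.List<String> getFiles(String dir) {
        // Placeholder: list the pattern files under 'dir', as the original getFiles() does
        throw new UnsupportedOperationException("fill in as in the original code");
    }
}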
EDIT3:
#Donald: I haven't resized the memory for the Hadoop daemons, so they should have a heap of 1 GB each. The map tasks have 800 MB of heap, of which 450 MB is used for io.sort.
#Chris: I haven't modified anything on the counters; I am using the regular ones. There are 1764 map tasks with 16 counters each, and the job itself will have another 20 or so. This might indeed add up after 50 consecutive jobs, but I would think it is not kept in the heap when you are running multiple consecutive jobs?
#Extra information:
The map tasks are extremely fast; it only takes 3-5 seconds per task, and I have jvm.reuse=-1. A map task processes a file with 10 records (the file is much smaller than the block size). Because of the small files, I could consider making input files with 100 records to reduce the mapping overhead.
The first thing I tried was to add a unit reducer (1 reduce task) to reduce the number of files created in HDFS (otherwise there would be one per pattern and therefore 1000 per job, which might create overhead for the datanodes).
The number of records per job is rather low; I am looking for specific words in 1764 files, and the number of matches against one of the 1000 patterns is around 5000 map output records in total.
#All: Thanks for helping me out guys!