My Kafka Streams Java application goes into the ERROR state due to an out-of-memory problem.
I use windowed aggregations, mainly to calculate median values:
a 1-second window:
.windowedBy(TimeWindows.of(Duration.ofSeconds(1)).advanceBy(Duration.ofMillis(999)).grace(Duration.ofMillis(1)))
with
.suppress(Suppressed.untilWindowCloses(Suppressed.BufferConfig.unbounded().withMaxBytes(10000).withLoggingDisabled()))
a 30-second window without suppress:
.windowedBy(TimeWindows.of(Duration.ofSeconds(30)).advanceBy(Duration.ofSeconds(2)).grace(Duration.ofMillis(1)))
I also have a state store:
StoreBuilder<KeyValueStore<String, Gateway>> kvStoreBuilder = Stores.keyValueStoreBuilder(
    Stores.persistentKeyValueStore(AppConfigs.GW_STATE_STORE_NAME),
    Serdes.String(),
    JsonSerdes.getGatewaySerde()
);
// add the state store to the StreamsBuilder
builder.addStateStore(kvStoreBuilder);
Eclipse Memory Analyzer says:
One instance of ‘org.apache.kafka.streams.state.internals.InMemoryTimeOrderedKeyValueBuffer’ loaded by ‘jdk.internal.loader.ClassLoaders$AppClassLoader # 0xf00d8558’ occupies 238,753,712 (90.51%) bytes. The memory is accumulated in one instance of ‘java.util.HashMap$Node[]’, loaded by ‘’, which occupies 238,749,768 (90.51%) bytes.
Can anyone explain what the root cause might be?
The error comes from suppress(), which uses an in-memory store (InMemoryTimeOrderedKeyValueBuffer). suppress() does not support RocksDB at the moment (cf. https://issues.apache.org/jira/browse/KAFKA-7224).
Your suppress() config seems to be incorrect:
Suppressed.BufferConfig.unbounded().withMaxBytes(10000).withLoggingDisabled()
The configs unbounded() and withMaxBytes() contradict each other: do you want an unbounded or a bounded buffer? In your case, the later withMaxBytes() overrides the earlier unbounded(), so you only provide 10,000 bytes for the suppress buffer. Because you use untilWindowCloses(), Kafka Streams has to shut down when it runs out of that space: it is neither allowed to emit early (untilWindowCloses()) nor allowed to use more memory (withMaxBytes(...)).
For untilWindowCloses() you should use unbounded(). If you want to bound memory, you should not use untilWindowCloses().
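For illustration, here is a minimal sketch of both choices on a windowed count (count() stands in for your median aggregation, and the topic name and serdes are assumptions, not your actual code):
import java.time.Duration;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Suppressed;
import org.apache.kafka.streams.kstream.TimeWindows;
import org.apache.kafka.streams.kstream.Windowed;

StreamsBuilder builder = new StreamsBuilder();

KTable<Windowed<String>, Long> oneSecondWindows = builder.<String, String>stream("input-topic")
    .groupByKey()
    .windowedBy(TimeWindows.of(Duration.ofSeconds(1))
        .advanceBy(Duration.ofMillis(999))
        .grace(Duration.ofMillis(1)))
    .count()
    // Option 1: keep the final-result semantics and let the buffer use as much heap as it needs.
    .suppress(Suppressed.untilWindowCloses(Suppressed.BufferConfig.unbounded()));
    // Option 2 (alternative): bound the buffer and accept early, non-final emits when it fills up:
    // .suppress(Suppressed.untilTimeLimit(Duration.ofSeconds(1),
    //     Suppressed.BufferConfig.maxBytes(10_000).emitEarlyWhenFull()));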
You need to tune the RocksDB configuration; please read this: https://medium.com/@grinfeld_433/kafka-streams-and-rocksdb-in-the-space-time-continuum-and-a-little-bit-of-configuration-40edb5ee9ed7
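In Kafka Streams that tuning is done through a custom org.apache.kafka.streams.state.RocksDBConfigSetter registered under StreamsConfig.ROCKSDB_CONFIG_SETTER_CLASS_CONFIG. The class below is only a hedged sketch with illustrative values (it assumes Kafka Streams 2.3+, where the close() callback and the default BlockBasedTableConfig exist), not a recommendation tuned to your workload:
import java.util.Map;
import org.apache.kafka.streams.state.RocksDBConfigSetter;
import org.rocksdb.BlockBasedTableConfig;
import org.rocksdb.Cache;
import org.rocksdb.LRUCache;
import org.rocksdb.Options;

public class BoundedMemoryRocksDBConfig implements RocksDBConfigSetter {

    private Cache blockCache;

    @Override
    public void setConfig(final String storeName, final Options options,
                          final Map<String, Object> configs) {
        // Fewer/smaller memtables mean less off-heap memory per state store.
        options.setWriteBufferSize(8 * 1024 * 1024L);  // 8 MB per memtable
        options.setMaxWriteBufferNumber(2);            // at most 2 memtables per store

        // Smaller block cache for reads; Streams 2.3+ installs a BlockBasedTableConfig by default.
        final BlockBasedTableConfig tableConfig = (BlockBasedTableConfig) options.tableFormatConfig();
        blockCache = new LRUCache(16 * 1024 * 1024L);  // 16 MB block cache
        tableConfig.setBlockCache(blockCache);
        options.setTableFormatConfig(tableConfig);
    }

    @Override
    public void close(final String storeName, final Options options) {
        // Release the cache created for this store.
        if (blockCache != null) {
            blockCache.close();
        }
    }
}

// Register it in the Streams properties:
// props.put(StreamsConfig.ROCKSDB_CONFIG_SETTER_CLASS_CONFIG, BoundedMemoryRocksDBConfig.class);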
If you are using Java >= 8, cap the Metaspace (e.g. -XX:MaxMetaspaceSize), otherwise it will eat all your server RAM: http://karunsubramanian.com/websphere/one-important-change-in-memory-management-in-java-8/
If running in Docker, limit the container's maximum memory.
There is a bug in older Kafka versions, and upgrading is recommended:
https://issues.apache.org/jira/browse/KAFKA-8637
Related
In my pipeline I have around 4 million records, and the flow is as follows:
Read all records from BigQuery
Transform to proto
Combine globally and create a sorted key-value-based SST file, which is later used by RocksDB
This pipeline works for up to about 1.5 million records, but beyond that it fails with this error:
Shutting down JVM after 8 consecutive periods of measured GC
thrashing. Memory is used/total/max = 2481/2492/2492 MB, GC last/max =
97.50/97.50 %, #pushbacks=0, gc thrashing=true. Heap dump not written.
The error doesn't change even after I applied several optimizations suggested in various other threads, such as:
Changing the machine type to high-memory
Decreasing the accumulators (reduced the worker count to 1)
Using an SSD disk
--experiments=shuffle_mode=service
Current stats
I can't use a custom file sink, as the underlying SST writer doesn't support writing through a byte channel, as described here.
Any insight on resolving this would be helpful.
I noticed that the current memory was still 3.75 GB; upgrading the worker machine type to n1-standard-2 worked.
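For reference, a minimal sketch of how that machine type can be requested at launch time, assuming the Beam Dataflow runner (the option names come from DataflowPipelineOptions; the rest of the pipeline is omitted):
import org.apache.beam.runners.dataflow.DataflowRunner;
import org.apache.beam.runners.dataflow.options.DataflowPipelineOptions;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.options.PipelineOptionsFactory;

public class RunOnBiggerWorkers {
    public static void main(String[] args) {
        DataflowPipelineOptions options = PipelineOptionsFactory.fromArgs(args)
                .withValidation()
                .as(DataflowPipelineOptions.class);
        options.setRunner(DataflowRunner.class);
        // n1-standard-2 gives 7.5 GB RAM per worker instead of 3.75 GB on n1-standard-1.
        options.setWorkerMachineType("n1-standard-2");

        Pipeline pipeline = Pipeline.create(options);
        // ... build the BigQuery -> proto -> combine/SST steps here ...
        pipeline.run();
    }
}
The same thing can be done on the command line with --workerMachineType=n1-standard-2.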
In Spark, I'm getting a java.lang.OutOfMemoryError: Java heap space error when reading a String of around 1 GB from HDFS from within a function, even though the executor memory I use is 6 GB. To increase the user memory, I even decreased spark.memory.fraction to just 0.3, but I am still getting the same error; it seems as though decreasing that value had no effect. I am using Spark 1.6.1 and compiling with the Spark 1.6 core library. Am I doing something wrong here?
Please see SparkConf
Spark Executor OOM: How to set Memory Parameters on Spark
Once an app is running, the next most likely error you will see is an OOM on a Spark executor. Spark is an extremely powerful tool for doing in-memory computation, but its power comes with some sharp edges. The most common cause for an executor OOM'ing is that the application is trying to cache or load too much information into memory. Depending on your use case, there are several solutions to this:
Increase the storage fraction variable, spark.storage.memoryFraction. This can be set as above, on either the command line or in the SparkConf object. This variable sets exactly how much of the JVM heap will be dedicated to the caching and storage of RDDs. You can set it as a value between 0 and 1, describing what portion of executor JVM memory will be dedicated to caching RDDs. If you have a job that requires very little shuffle memory but will utilize a lot of cached RDDs, increase this variable (example: caching an RDD and then performing aggregates on it).
If all else fails, you may just need additional RAM on each worker.
Then increase the amount of RAM the application requests by setting the spark.executor.memory variable, either on the command line or in the SparkConf object.
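A minimal sketch of both settings in a SparkConf (values and app name are examples only; the master and deployment details are assumed to come from spark-submit):
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

// Illustrative values only; tune them for your own job.
SparkConf conf = new SparkConf()
        .setAppName("example-app")                   // assumed application name
        .set("spark.storage.memoryFraction", "0.6")  // fraction of the heap for cached RDDs
        .set("spark.executor.memory", "6g");         // heap requested per executor
JavaSparkContext sc = new JavaSparkContext(conf);    // master expected from spark-submit
Or on the command line: spark-submit --executor-memory 6g --conf spark.storage.memoryFraction=0.6 ...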
In your case it somehow seems like the memory fraction setting was not applied. As advised in the comments, you can print all applied settings like this to cross-check:
logger.info(sparkContext.getConf.getAll.mkString("\n"))
If it's not applied, you can set it programmatically and try to see the effect:
val conf = new SparkConf()
.set("spark.memory.fraction", "1")
.set("spark.testing.memory", maxOnHeapExecutionMemory.toString)
…
as described in the test
UPDATE :
Please go through this nice post to understand more in detail
The gist of the above post is:
You can see 3 main memory regions on the diagram:
1) Reserved Memory: memory reserved by the system; its size is hard-coded.
2) User Memory (in Spark 1.6: ("Java Heap" - "Reserved Memory") * (1.0 - spark.memory.fraction)): this is the memory pool that remains after the allocation of Spark Memory, and it is completely up to you to use it in any way you like. Spark makes no accounting of what you store in this RAM or whether you respect this boundary; not respecting it in your code might cause an OOM error.
3) Spark Memory (("Java Heap" - "Reserved Memory") * spark.memory.fraction): the memory pool managed by Spark, further divided into:
|--> Storage Memory
|--> Execution Memory
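To tie this back to the question, here is a small arithmetic sketch of those regions, assuming the Spark 1.6 constant of roughly 300 MB reserved memory plus the 6 GB executor and spark.memory.fraction = 0.3 from the question:
// Rough arithmetic sketch of the Spark 1.6 memory regions described above.
public class SparkMemoryRegions {
    public static void main(String[] args) {
        long heapMb = 6 * 1024;        // --executor-memory 6g
        long reservedMb = 300;         // reserved memory, hard-coded in Spark 1.6
        double memoryFraction = 0.3;   // spark.memory.fraction from the question

        double usableMb = heapMb - reservedMb;
        double sparkMemoryMb = usableMb * memoryFraction;         // storage + execution
        double userMemoryMb = usableMb * (1.0 - memoryFraction);  // user memory

        System.out.printf("Spark memory: %.0f MB, user memory: %.0f MB%n",
                sparkMemoryMb, userMemoryMb);
        // With these numbers, user memory is roughly 4 GB, which should easily hold a
        // 1 GB String; this suggests the fraction setting was never actually applied.
    }
}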
I need Java code to monitor Redis's memory usage, because Redis stores all its data in RAM and it will crash if memory is full.
It looks like Redis can use the whole OS memory, so using Java's Runtime methods is not correct, because they only count memory inside the JVM.
Is there any Java method to monitor the whole OS's memory usage, or is there some magic Redis method?
You could make periodic requests to Redis, sending an INFO command, and parse the result to get the value of used_memory, which is the number of bytes allocated by the Redis memory allocator.
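For example, a minimal sketch using the Jedis client (any Redis client with an INFO call would work; host and port are placeholders):
import redis.clients.jedis.Jedis;

public class RedisMemoryMonitor {

    /** Returns used_memory in bytes, or -1 if the field was not found. */
    public static long usedMemoryBytes(String host, int port) {
        try (Jedis jedis = new Jedis(host, port)) {
            String info = jedis.info("memory");          // only the "# Memory" section of INFO
            for (String line : info.split("\r\n")) {
                if (line.startsWith("used_memory:")) {
                    return Long.parseLong(line.substring("used_memory:".length()).trim());
                }
            }
        }
        return -1L;
    }

    public static void main(String[] args) {
        System.out.println("used_memory = " + usedMemoryBytes("localhost", 6379) + " bytes");
    }
}
The same loop can also pull used_memory_rss if you want to watch for the swapping condition described below.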
Update: Redis won't crash, it will swap, and its performance will then fall dramatically. You may detect swapping by comparing used_memory_rss to used_memory: used_memory_rss much greater than used_memory means swapping has occurred. But even before that you can be warned: swapping is likely to occur once used_memory gets close to the total memory available for Redis.
If you are using Redis as a cache, you may limit its memory consumption by adding these lines to the config file:
maxmemory 2mb
maxmemory-policy allkeys-lru
In this example it is limited to 2 MB.
Update: when the limit is reached, maxmemory either makes Redis respond to new write operations with an error, or, with a policy such as allkeys-lru, it starts deleting keys according to the LRU policy, which is appropriate for a cache.
I'm working with DataStax Enterprise 3.1 on a single node with 4 GB of RAM.
I have not changed anything in cassandra-env.sh and cassandra.yaml except -Xss (because my Java version requires a little more stack).
So by default Cassandra sets my -Xms and -Xmx parameters to 1 GB: -Xms1024M -Xmx1024M.
But while inserting my data, after around 200,000 rows (across 3 different column families), the Solr and Cassandra logs keep repeating this kind of warning:
WARN StorageService Flushing CFS(Keyspace='OpsCenter',
ColumnFamily='rollups60') to relieve memory pressure 17:58:07
WARN GCInspector Heap is 0.8825103486201678 full. You may need to reduce
memtable and/or cache sizes. Cassandra will now flush up to the two
largest memtables to free up memory. Adjust flush_largest_memtables_at
threshold in cassandra.yaml if you don't want Cassandra to do this
automatically
So, OK, my heap is full, but why is it still full after flushing?
If I stop inserting data at this point, the warnings keep repeating.
If I stop and restart Cassandra, no problem arises.
It looks like a memory leak issue, right?
So where should I look?
Thanks in advance for any help.
One thing that's a memory hog is Solr's caches. Take a look at your solrconfig.xml file inside the "conf" dir of each of your Solr cores, and look at the value configured for caches such as:
<filterCache class="solr.FastLRUCache"
size="100"
initialSize="0"
autowarmCount="0"/>
There may be multiple entries like this one. Make sure that at least autowarmCount and initialSize are set to 0. Furthermore, lower the size value to something small, like 100. All these values refer to the number of entries in the cache.
Another thing that may help is configuring Solr to do hard-commits more often. Look for an entry such as:
<!-- stuff omitted for brevity -->
<autoCommit>
<maxDocs>5000</maxDocs>
<maxTime>15000</maxTime>
<openSearcher>false</openSearcher>
</autoCommit>
The above settings will commit to disk each time 5000 documents have been added or 15 seconds have passed since the last commit, whichever comes first. Also set openSearcher to false.
Finally, look for these entries and set them as follows:
<ramBufferSizeMB>16</ramBufferSizeMB>
<maxBufferedDocs>5000</maxBufferedDocs>
Now, making all these modifications to Solr at once will surely make it run a lot slower. Try instead to make them incrementally, until you get rid of the memory error. Also, it may simply be that you need to allocate more memory to your Java process. If the machine has 4 GB of RAM, why not try your test with -Xmx2g or -Xmx3g?
Cassandra is trying to free up heap space; however, flushing memtables doesn't free Solr's heap data structures.
For the index size you have, combined with queries that possibly load the Lucene field caches, there is not enough heap space allocated. The best advice is to allocate more heap space.
To view the field cache memory usage:
http://www.datastax.com/docs/datastax_enterprise3.1/solutions/dse_search_core_status
I have written a parser class for a particular binary format (nfdump, if anyone is interested) which uses java.nio's MappedByteBuffer to read through files of a few GB each. The binary format is just a series of headers and mostly fixed-size binary records, which are fed out to the caller by calling nextRecord(), which pushes on the state machine and returns null when it's done. It performs well. It works on a development machine.
On my production host, it can run for a few minutes or hours, but always seems to throw "java.lang.InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code", pointing at one of the map.getInt()/getShort() methods, i.e. a read operation on the map.
The uncontroversial (?) code that sets up the map is this:
/** Set up the map from the given filename and position */
protected void open() throws IOException {
// Set up buffer, is this all the flexibility we'll need?
channel = new FileInputStream(file).getChannel();
MappedByteBuffer map1 = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
map1.load(); // we want the whole thing, plus seems to reduce frequency of crashes?
map = map1;
// assumes the host writing the files is little-endian (x86), ought to be configurable
map.order(java.nio.ByteOrder.LITTLE_ENDIAN);
map.position(position);
}
and then I use the various map.get* methods to read shorts, ints, longs and other sequences of bytes, before hitting the end of the file and closing the map.
I've never seen the exception thrown on my development host. But the significant point of difference between my production host and development is that on the former, I am reading sequences of these files over NFS (probably 6-8TB eventually, still growing). On my dev machine, I have a smaller selection of these files locally (60GB), but when it blows up on the production host it's usually well before it gets to 60GB of data.
Both machines are running java 1.6.0_20-b02, though the production host is running Debian/lenny, the dev host is Ubuntu/karmic. I'm not convinced that will make any difference. Both machines have 16GB RAM, and are running with the same java heap settings.
I take the view that even if there is a bug in my code, the JVM is buggy enough not to throw me a proper exception! But I think it is just a particular JVM implementation bug due to interactions between NFS and mmap, possibly a recurrence of 6244515, which is officially fixed.
I already tried adding a load() call to force the MappedByteBuffer to load its contents into RAM - this seemed to delay the error in the one test run I've done, but not prevent it. Or it could just be a coincidence that that was the longest it had gone before crashing!
If you've read this far and have done this kind of thing with java.nio before, what would your instinct be? Right now mine is to rewrite it without nio :)
I would rewrite it without using mapped NIO. If you're dealing with more than one file, there is the problem that mapped memory is never released, so you will run out of virtual memory. NB: this isn't necessarily just an OutOfMemoryError that interacts with the garbage collector; it would be a failure to allocate a new mapped buffer. I would use a FileChannel.
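As a rough illustration of that alternative, here is a sketch of a channel-based reader; the record size, class name, and return type are assumptions, not the asker's actual parser API:
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ChannelRecordReader implements AutoCloseable {
    private static final int RECORD_SIZE = 64;          // assumed fixed record size
    private final FileChannel channel;
    private final ByteBuffer buf = ByteBuffer.allocate(RECORD_SIZE)
                                             .order(ByteOrder.LITTLE_ENDIAN);

    public ChannelRecordReader(Path file, long position) throws IOException {
        channel = FileChannel.open(file, StandardOpenOption.READ);
        channel.position(position);
    }

    /** Returns the next record's buffer, or null at end of file. */
    public ByteBuffer nextRecord() throws IOException {
        buf.clear();
        while (buf.hasRemaining()) {
            if (channel.read(buf) < 0) {
                return null;    // EOF; a real parser would also flag a truncated last record
            }
        }
        buf.flip();
        return buf;             // caller uses getInt()/getShort() etc., as with the mapped buffer
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }
}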
Having said that, large-scale operations on NFS files are always extremely problematic. You would be much better off redesigning the system so that each file is read by its local CPU. You will also get immense speed improvements this way, far more than the 20% you will lose by not using mapped buffers.