Spark on mesos Executors failing with OOM Errors - java

We are using spark 2.0.2 managed by a DCOS system that fetch data from a Kafka 1.0.0 messaging service and writes parquet in a hdfs system.
Every thing was working ok, but when we increase the number of topics in Kafka, our spark executors began to crash constantly with OOM errors:
java.lang.OutOfMemoryError: Java heap space
at org.apache.parquet.column.values.dictionary.IntList.initSlab(IntList.java:90)
at org.apache.parquet.column.values.dictionary.IntList.<init>(IntList.java:86)
at org.apache.parquet.column.values.dictionary.DictionaryValuesWriter.<init>(DictionaryValuesWriter.java:93)
at org.apache.parquet.column.values.dictionary.DictionaryValuesWriter$PlainDoubleDictionaryValuesWriter.<init>(DictionaryValuesWriter.java:422)
at org.apache.parquet.column.ParquetProperties.dictionaryWriter(ParquetProperties.java:139)
at org.apache.parquet.column.ParquetProperties.dictWriterWithFallBack(ParquetProperties.java:178)
at org.apache.parquet.column.ParquetProperties.getValuesWriter(ParquetProperties.java:203)
at org.apache.parquet.column.impl.ColumnWriterV1.<init>(ColumnWriterV1.java:83)
at org.apache.parquet.column.impl.ColumnWriteStoreV1.newMemColumn(ColumnWriteStoreV1.java:68)
at org.apache.parquet.column.impl.ColumnWriteStoreV1.getColumnWriter(ColumnWriteStoreV1.java:56)
at org.apache.parquet.io.MessageColumnIO$MessageColumnIORecordConsumer.<init>(MessageColumnIO.java:183)
at org.apache.parquet.io.MessageColumnIO.getRecordWriter(MessageColumnIO.java:375)
at org.apache.parquet.hadoop.InternalParquetRecordWriter.initStore(InternalParquetRecordWriter.java:109)
at org.apache.parquet.hadoop.InternalParquetRecordWriter.<init>(InternalParquetRecordWriter.java:99)
at org.apache.parquet.hadoop.ParquetWriter.<init>(ParquetWriter.java:217)
at org.apache.parquet.hadoop.ParquetWriter.<init>(ParquetWriter.java:175)
at org.apache.parquet.hadoop.ParquetWriter.<init>(ParquetWriter.java:146)
at org.apache.parquet.hadoop.ParquetWriter.<init>(ParquetWriter.java:113)
at org.apache.parquet.hadoop.ParquetWriter.<init>(ParquetWriter.java:87)
at org.apache.parquet.hadoop.ParquetWriter.<init>(ParquetWriter.java:62)
at org.apache.parquet.avro.AvroParquetWriter.<init>(AvroParquetWriter.java:47)
at npm.parquet.ParquetMeasurementWriter.ensureOpenWriter(ParquetMeasurementWriter.java:91)
at npm.parquet.ParquetMeasurementWriter.write(ParquetMeasurementWriter.java:75)
at npm.ingestion.spark.StagingArea$Measurements.store(StagingArea.java:100)
at npm.ingestion.spark.StagingArea$StagingAreaStorage.store(StagingArea.java:80)
at npm.ingestion.spark.StagingArea.add(StagingArea.java:40)
at npm.ingestion.spark.Kafka2HDFSPM$SubsetProcessor.sendToStagingArea(Kafka2HDFSPM.java:207)
at npm.ingestion.spark.Kafka2HDFSPM$SubsetProcessor.consumeRecords(Kafka2HDFSPM.java:193)
at npm.ingestion.spark.Kafka2HDFSPM$SubsetProcessor.process(Kafka2HDFSPM.java:169)
at npm.ingestion.spark.Kafka2HDFSPM$FetchSubsetsAndStore.call(Kafka2HDFSPM.java:133)
at npm.ingestion.spark.Kafka2HDFSPM$FetchSubsetsAndStore.call(Kafka2HDFSPM.java:111)
at org.apache.spark.api.java.JavaRDDLike$$anonfun$foreachPartition$1.apply(JavaRDDLike.scala:218)
18/03/20 18:41:13 ERROR [Executor task launch worker-0] SparkUncaughtExceptionHandler: Uncaught exception in thread Thread[Executor task launch worker-0,5,main]
java.lang.OutOfMemoryError: Java heap space
at org.apache.parquet.column.values.dictionary.IntList.initSlab(IntList.java:90)
at org.apache.parquet.column.values.dictionary.IntList.<init>(IntList.java:86)
at org.apache.parquet.column.values.dictionary.DictionaryValuesWriter.<init>(DictionaryValuesWriter.java:93)
at org.apache.parquet.column.values.dictionary.DictionaryValuesWriter$PlainDoubleDictionaryValuesWriter.<init>(DictionaryValuesWriter.java:422)
at org.apache.parquet.column.ParquetProperties.dictionaryWriter(ParquetProperties.java:139)
at org.apache.parquet.column.ParquetProperties.dictWriterWithFallBack(ParquetProperties.java:178)
at org.apache.parquet.column.ParquetProperties.getValuesWriter(ParquetProperties.java:203)
at org.apache.parquet.column.impl.ColumnWriterV1.<init>(ColumnWriterV1.java:83)
at org.apache.parquet.column.impl.ColumnWriteStoreV1.newMemColumn(ColumnWriteStoreV1.java:68)
at org.apache.parquet.column.impl.ColumnWriteStoreV1.getColumnWriter(ColumnWriteStoreV1.java:56)
at org.apache.parquet.io.MessageColumnIO$MessageColumnIORecordConsumer.<init>(MessageColumnIO.java:183)
at org.apache.parquet.io.MessageColumnIO.getRecordWriter(MessageColumnIO.java:375)
at org.apache.parquet.hadoop.InternalParquetRecordWriter.initStore(InternalParquetRecordWriter.java:109)
at org.apache.parquet.hadoop.InternalParquetRecordWriter.<init>(InternalParquetRecordWriter.java:99)
at org.apache.parquet.hadoop.ParquetWriter.<init>(ParquetWriter.java:217)
at org.apache.parquet.hadoop.ParquetWriter.<init>(ParquetWriter.java:175)
at org.apache.parquet.hadoop.ParquetWriter.<init>(ParquetWriter.java:146)
at org.apache.parquet.hadoop.ParquetWriter.<init>(ParquetWriter.java:113)
at org.apache.parquet.hadoop.ParquetWriter.<init>(ParquetWriter.java:87)
at org.apache.parquet.hadoop.ParquetWriter.<init>(ParquetWriter.java:62)
at org.apache.parquet.avro.AvroParquetWriter.<init>(AvroParquetWriter.java:47)
at npm.parquet.ParquetMeasurementWriter.ensureOpenWriter(ParquetMeasurementWriter.java:91)
at npm.parquet.ParquetMeasurementWriter.write(ParquetMeasurementWriter.java:75)
at npm.ingestion.spark.StagingArea$Measurements.store(StagingArea.java:100)
at npm.ingestion.spark.StagingArea$StagingAreaStorage.store(StagingArea.java:80)
at npm.ingestion.spark.StagingArea.add(StagingArea.java:40)
at npm.ingestion.spark.Kafka2HDFSPM$SubsetProcessor.sendToStagingArea(Kafka2HDFSPM.java:207)
at npm.ingestion.spark.Kafka2HDFSPM$SubsetProcessor.consumeRecords(Kafka2HDFSPM.java:193)
at npm.ingestion.spark.Kafka2HDFSPM$SubsetProcessor.process(Kafka2HDFSPM.java:169)
at npm.ingestion.spark.Kafka2HDFSPM$FetchSubsetsAndStore.call(Kafka2HDFSPM.java:133)
at npm.ingestion.spark.Kafka2HDFSPM$FetchSubsetsAndStore.call(Kafka2HDFSPM.java:111)
at org.apache.spark.api.java.JavaRDDLike$$anonfun$foreachPartition$1.apply(JavaRDDLike.scala:218)
We tried to increase the available the executors memory, review the code, but we couldn't find anything wrong.
Another info: we are using RDDs in spark.
Have someone encountered a similar problem, that already been solved

What is the heap configuration for the executor? By default, Java will autotune its heap according to machine memory. You need to change it to fit in your container with -Xmx setting.
See this article about running Java in the container
https://github.com/fabianenardon/docker-java-issues-demo/tree/master/memory-sample

Related

Simple Spring Boot Microservice - Memory Usage Consumption over 300 MB [duplicate]

I'm using spring boot to develop a client application.
and when run the spring boot application(using a fully executable jar), the memory usage is about 190M in x64 server, and 110M in x86 server.
My JVM options are (-Xmx64M -Xms64M -XX:MaxPermSize=64M -server),
why is it that in the x64 server, memory usage is so big?
how to reduce memory usage below 150M?
thanks.
Little late to the game here, but I suffered the same issue with a containerised Spring Boot application on Docker. The bare minimum you'll get away with is around 72M total memory on the simplest of Spring Boot applications with a single controller and embedded Tomcat. Throw in Spring Data REST, Spring Security and a few JPA entities and you'll be looking at 200M-300M minimum. You can get a simple Spring Boot app down to around 72M total by using the following JVM options.
With -XX:+UseSerialGC This will perform garbage collection inline with the thread allocating the heap memory instead of a dedicated GC thread(s)
With -Xss512k This will limit each threads stack memory to 512KB instead of the default 1MB
With -XX:MaxRAM=72m This will restrict the JVM's calculations for the heap and non heap managed memory to be within the limits of this value.
In addition to the above JVM options you can also use the following property inside your application.properties file:
server.tomcat.max-threads = 1 This will limit the number of HTTP request handler threads to 1 (default is 200)
Here is an example of docker stats running a very simple Spring Boot application with the above limits and with the docker -m 72m argument. If I decrease the values any lower than this I cannot get the app to start.
83ccc9b2156d: Mem Usage: 70.36MiB / 72MiB | Mem Percentage: 97.72%
And here you can see a breakdown of all the native and java heap memory on exit.
Native Memory Tracking:
Total: reserved=1398681KB, committed=112996KB
- Java Heap (reserved=36864KB, committed=36260KB)
(mmap: reserved=36864KB, committed=36260KB)
- Class (reserved=1086709KB, committed=43381KB)
(classes #7548)
( instance classes #7049, array classes #499)
(malloc=1269KB #19354)
(mmap: reserved=1085440KB, committed=42112KB)
( Metadata: )
( reserved=36864KB, committed=36864KB)
( used=36161KB)
( free=703KB)
( waste=0KB =0.00%)
( Class space:)
( reserved=1048576KB, committed=5248KB)
( used=4801KB)
( free=447KB)
( waste=0KB =0.00%)
- Thread (reserved=9319KB, committed=938KB)
(thread #14)
(stack: reserved=9253KB, committed=872KB)
(malloc=50KB #74)
(arena=16KB #26)
- Code (reserved=248678KB, committed=15310KB)
(malloc=990KB #4592)
(mmap: reserved=247688KB, committed=14320KB)
- GC (reserved=400KB, committed=396KB)
(malloc=272KB #874)
(mmap: reserved=128KB, committed=124KB)
- Compiler (reserved=276KB, committed=276KB)
(malloc=17KB #409)
(arena=260KB #6)
- Internal (reserved=660KB, committed=660KB)
(malloc=620KB #1880)
(mmap: reserved=40KB, committed=40KB)
- Symbol (reserved=11174KB, committed=11174KB)
(malloc=8417KB #88784)
(arena=2757KB #1)
- Native Memory Tracking (reserved=1858KB, committed=1858KB)
(malloc=6KB #80)
(tracking overhead=1852KB)
- Arena Chunk (reserved=2583KB, committed=2583KB)
(malloc=2583KB)
- Logging (reserved=4KB, committed=4KB)
(malloc=4KB #179)
- Arguments (reserved=17KB, committed=17KB)
(malloc=17KB #470)
- Module (reserved=137KB, committed=137KB)
(malloc=137KB #1616)
Don't expect to get any decent performance out of this either, as I would imagine the GC would be running frequently with this setup as it doesn't have a lot of spare memory to play with
After search, i found it's already have answer in stackoveflow.
Spring Boot memory consumption increases beyond -Xmx option
1. Number of http threads (Undertow starts around 50 threads per default, but you can increase / decrease via property the amount of threads needed)
2. Access to native routines (.dll, .so) via JNI
3. Static variables
4. Use of cache (memcache, ehcache, etc)
If a VM is 32 bit or 64 bit, 64 bit uses more memory to run the same application, so if you don't need a heap bigger than 1.5GB, so keep your application runnnig over 32 bit to save memory.
because spring boot starts around 50 threads per default for http service(Tomcat or Undertow, Jetty), and its use 1 MB per thread(64bit jvm default setting).
SO the in 64bit jvm, the memory usage is
heap(64M) + Permgen(max 64M) + thread stacks(1M x 50+) + native handles.
references:
https://dzone.com/articles/how-to-decrease-jvm-memory-consumption-in-docker-u
http://trustmeiamadeveloper.com/2016/03/18/where-is-my-memory-java/
https://developers.redhat.com/blog/2017/04/04/openjdk-and-containers/
You can use -XX:+UseSerialGC as JVM Argument to specify Serial Garbage Collector which is best choice to reduce Memory Heap .

Incompatible event loop type with native transport enabled on linux

We are seeing high CPU Usage in vertx with eventloop jstack stuck on "at sun.nio.ch.EPollArrayWrapper.epollWait". So we were trying to use native transport
on linux using "setPreferNativeTransport()" but we are getting following error.
java.lang.IllegalStateException: incompatible event loop type:
io.netty.channel.epoll.EpollEventLoop
at io.netty.channel.AbstractChannel$AbstractUnsafe.register(AbstractChannel.java:469)
at io.netty.channel.SingleThreadEventLoop.register(SingleThreadEventLoop.java:80)
at io.netty.channel.SingleThreadEventLoop.register(SingleThreadEventLoop.java:74)
at io.netty.bootstrap.AbstractBootstrap.initAndRegister(AbstractBootstrap.java:331)
at io.netty.bootstrap.Bootstrap.doResolveAndConnect(Bootstrap.java:163)
at io.netty.bootstrap.Bootstrap.connect(Bootstrap.java:145)
at com.github.mauricio.async.db.mysql.codec.MySQLConnectionHandler.connect(MySQLConnectionHandler.scala:87)
at com.github.mauricio.async.db.mysql.MySQLConnection.connect(MySQLConnection.scala:84)
at io.vertx.ext.asyncsql.impl.pool.AsyncConnectionPool.createConnection(AsyncConnectionPool.java:63)
at io.vertx.ext.asyncsql.impl.pool.AsyncConnectionPool.createOrWaitForAvailableConnection(AsyncConnectionPool.java:84)
at io.vertx.ext.asyncsql.impl.pool.AsyncConnectionPool.take(AsyncConnectionPool.java:93)
at io.vertx.ext.asyncsql.impl.BaseSQLClient.getConnection(BaseSQLClient.java:68)
at io.vertx.ext.asyncsql.impl.AsyncSQLClientImpl.getConnection(AsyncSQLClientImpl.java:54)
at io.vertx.ext.asyncsql.impl.ClientWrapper.getConnection(ClientWrapper.java:52)
at io.vertx.ext.sql.SQLClient.queryWithParams(SQLClient.java:109)
at io.vertx.reactivex.ext.sql.SQLClient.queryWithParams(SQLClient.java:218)
at io.vertx.reactivex.ext.sql.SQLClient.lambda$rxQueryWithParams$6(SQLClient.java:231)
at io.vertx.reactivex.core.impl.AsyncResultSingle.subscribeActual(AsyncResultSingle.java:42)
We are using vertx-mysql-postgresql-client and it seems to not support native transport. Has anyone seen this problem ?
We are using
Vertx: 3.5.3
vertx-mysql-postgresql-client: 3.5.3

Ehcache Initial table allocation failed

One of our web application runs within Tomcat 7 which is deployed on AS400 server, and it is using Ehcache as cache component swap data into disk and reduce memory usage.
Few weeks ago, when we try to deploy this application for one of our customer, it fails at startup. And log shows:
Caused by: java.lang.IllegalStateException: Cache 'data' creation in EhcacheManager failed.
at org.ehcache.core.EhcacheManager.createCache(EhcacheManager.java:288)
at org.ehcache.core.EhcacheManager.init(EhcacheManager.java:567)
... 7 more
Caused by: org.ehcache.StateTransitionException: Initial table allocation failed.
Initial Table Size (slots) : 64
Allocation Will Require : 1KB
Table Page Source : org.terracotta.offheapstore.disk.paging.MappedPageSource#bc8a4ca2
at org.ehcache.core.StatusTransitioner$Transition.succeeded(StatusTransitioner.java:209)
at org.ehcache.core.Ehcache.init(Ehcache.java:567)
at org.ehcache.core.EhcacheManager.createCache(EhcacheManager.java:261)
... 8 more
Caused by: java.lang.IllegalArgumentException: Initial table allocation failed.
Initial Table Size (slots) : 64
Allocation Will Require : 1KB
Table Page Source : org.terracotta.offheapstore.disk.paging.MappedPageSource#bc8a4ca2
at org.terracotta.offheapstore.OffHeapHashMap.<init>(OffHeapHashMap.java:219)
at org.terracotta.offheapstore.AbstractLockedOffHeapHashMap.<init>(AbstractLockedOffHeapHashMap.java:71)
at org.terracotta.offheapstore.AbstractOffHeapClockCache.<init>(AbstractOffHeapClockCache.java:76)
at org.terracotta.offheapstore.disk.persistent.AbstractPersistentOffHeapCache.<init>(AbstractPersistentOffHeapCache.java:43)
at org.terracotta.offheapstore.disk.persistent.PersistentReadWriteLockedOffHeapClockCache.<init>(PersistentReadWriteLockedOffHeapClockCache.java:36)
at org.ehcache.impl.internal.store.disk.factories.EhcachePersistentSegmentFactory$EhcachePersistentSegment.<init>(EhcachePersistentSegmentFactory.java:73)
at org.ehcache.impl.internal.store.disk.factories.EhcachePersistentSegmentFactory.newInstance(EhcachePersistentSegmentFactory.java:60)
at org.ehcache.impl.internal.store.disk.factories.EhcachePersistentSegmentFactory.newInstance(EhcachePersistentSegmentFactory.java:37)
at org.terracotta.offheapstore.concurrent.AbstractConcurrentOffHeapMap.<init>(AbstractConcurrentOffHeapMap.java:106)
at org.terracotta.offheapstore.concurrent.AbstractConcurrentOffHeapCache.<init>(AbstractConcurrentOffHeapCache.java:48)
at org.terracotta.offheapstore.disk.persistent.AbstractPersistentConcurrentOffHeapCache.<init>(AbstractPersistentConcurrentOffHeapCache.java:52)
at org.ehcache.impl.internal.store.disk.EhcachePersistentConcurrentOffHeapClockCache.<init>(EhcachePersistentConcurrentOffHeapClockCache.java:52)
at org.ehcache.impl.internal.store.disk.OffHeapDiskStore.createBackingMap(OffHeapDiskStore.java:279)
at org.ehcache.impl.internal.store.disk.OffHeapDiskStore.getBackingMap(OffHeapDiskStore.java:167)
at org.ehcache.impl.internal.store.disk.OffHeapDiskStore.access$600(OffHeapDiskStore.java:95)
at org.ehcache.impl.internal.store.disk.OffHeapDiskStore$Provider.init(OffHeapDiskStore.java:460)
at org.ehcache.impl.internal.store.disk.OffHeapDiskStore$Provider.initStore(OffHeapDiskStore.java:456)
at org.ehcache.impl.internal.store.disk.OffHeapDiskStore$Provider.initAuthoritativeTier(OffHeapDiskStore.java:507)
at org.ehcache.impl.internal.store.tiering.TieredStore$Provider.initStore(TieredStore.java:472)
at org.ehcache.core.EhcacheManager$8.init(EhcacheManager.java:499)
at org.ehcache.core.StatusTransitioner.runInitHooks(StatusTransitioner.java:135)
at org.ehcache.core.StatusTransitioner.access$000(StatusTransitioner.java:33)
at org.ehcache.core.StatusTransitioner$Transition.succeeded(StatusTransitioner.java:194)
this code triggered this is:
CacheConfiguration<String, String[]> dconf = CacheConfigurationBuilder
.newCacheConfigurationBuilder(String.class, String[].class, ResourcePoolsBuilder.heap(11)
.disk(3, MemoryUnit.GB, false))
.withExpiry(Expirations.timeToLiveExpiration(Duration.of(30, TimeUnit.MINUTES)))
.build();
dataCacheManager = CacheManagerBuilder.newCacheManagerBuilder()
.with(CacheManagerBuilder.persistence(new File(cacheFolder, "requestdata"))) //$NON-NLS-1$
.withCache(CACHE_NAME_DATA,dconf)
.build(true);
which surprised us because it has never happened before, we have deployed it for some other customers' server (Windows, As400, linux), none of them has this issues.
This is really a headache, we spend weeks try to figure it out, read source code, tuning jvm parameters, googling around..., nothing except one unanswered post: https://groups.google.com/forum/#!topic/ehcache-users/ApFAe5nYxuA
Is there anyone can help us one this? thanks ahead!
The Ehcache 3 disk store uses java.nio.MappedByteBuffer which require access to direct memory.
There is no documented default MaxDirectMemorySize in Java and the same JVM on different OS can behave differently.
If you have not already set the flag -XX:MaxDirectMemorySize=3G when launching your application, it could be the cause of that exception you see.

Apache-Flink Quickstart - reading CSV file error : Futures timed out after [10000 milliseconds]

I want to read CSV file using Flink-API locally, by the following code:
csvPath="data/weather.csv";
List<Tuple2<String, Double>> csv= env.readCsvFile(csvPath)
.types(String.class,Double.class).collect();
I tried some files in different size(from 800mb to 6gb). Sometimes the operation is completed successfully and sometimes it is not, because of the following timeout exception:
Exception in thread "main" java.util.concurrent.TimeoutException: Futures timed out after [10000 milliseconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:153)
at scala.concurrent.Await$$anonfun$ready$1.apply(package.scala:169)
at scala.concurrent.Await$$anonfun$ready$1.apply(package.scala:169)
at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
at scala.concurrent.Await$.ready(package.scala:169)
at org.apache.flink.runtime.minicluster.FlinkMiniCluster.shutdown(FlinkMiniCluster.scala:439)
at org.apache.flink.runtime.minicluster.FlinkMiniCluster.stop(FlinkMiniCluster.scala:408)
at org.apache.flink.client.LocalExecutor.stop(LocalExecutor.java:127)
at org.apache.flink.client.LocalExecutor.executePlan(LocalExecutor.java:195)
at org.apache.flink.api.java.LocalEnvironment.execute(LocalEnvironment.java:91)
at org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:923)
at org.apache.flink.api.java.DataSet.collect(DataSet.java:410)
at org.apache.flink.simpleCSV.run(simpleCSV.java:83)
how can I fix this problem? increase this timeout programmatically? Or should I put a config file somewhere? Is there a specific heap size that I should set based on the file size?
collect() transfers the data from the cluster to the local client. This does only work for very small data sets (< 10 MB).
If you have larger data sets, you need to process them on the cluster and emit the results through an output format, e.g., write it to a file.
If you are debugging this program, you can set a break point at the constructor of org.apache.flink.api.java.LocalEnvironment (the constructor with config) and run the following command to change the timeout to 200 seconds (Alt+F8 in IntelliJ Idea):
config.setString("akka.ask.timeout", "200 s")
To find LocalEnvironment class in IntelliJ Idea, press Ctr+n, and check "Include non-project classes in the pop-up window, then type "LocalEnvironment" in the edit box.

error when opening a lucene index: java.io.IOException: Map failed

I have a problem which I think is the same as that described here:
Error when opening a lucene index: Map failed
However the solution does not apply in this case so I am providing more details and asking again.
The index is created using Solr 5.3
The line of code causing the exception is:
IndexReader indexReader = DirectoryReader.open(FSDirectory.open(Paths.get("the_path")));
The exception stacktrace is:
Exception in thread "main" java.io.IOException: Map failed: MMapIndexInput(path="/mnt/fastdata/ac1zz/JATE/solr-5.3.0/server/solr/jate/data_aclrd/index/_5t.tvd") [this may be caused by lack of enough unfragmented virtual address space or too restrictive virtual memory limits enforced by the operating system, preventing us to map a chunk of 434505698 bytes. Please review 'ulimit -v', 'ulimit -m' (both should return 'unlimited'), and 'sysctl vm.max_map_count'. More information: http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html]
at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:907)
at org.apache.lucene.store.MMapDirectory.map(MMapDirectory.java:265)
at org.apache.lucene.store.MMapDirectory.openInput(MMapDirectory.java:239)
at org.apache.lucene.codecs.compressing.CompressingTermVectorsReader.<init>(CompressingTermVectorsReader.java:144)
at org.apache.lucene.codecs.compressing.CompressingTermVectorsFormat.vectorsReader(CompressingTermVectorsFormat.java:91)
at org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:120)
at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:65)
at org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:58)
at org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:50)
at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:731)
at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:50)
at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:63)
at uk.ac.shef.dcs.jate.app.AppATTF.extract(AppATTF.java:39)
at uk.ac.shef.dcs.jate.app.AppATTF.main(AppATTF.java:33)
The suggested solutions as in the exception message do not work in this case because I am running the application on a server and I do not have permissions to change those.
Namely,
ulimit -v unlimited
prints: "-bash: ulimit: virtual memory: cannot modify limit: Operation not permitted"
and
sysctl -w vm.max_map_count=10000000
gives:"error: permission denied on key 'vm.max_map_count'"
Is there anyway I can solve this?
Thanks
I have found a solution and so I am answering myself.
If you really cannot set ulimit or vm.max_map_count, the only solution, according to http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html, is to configure your solr (or if you work with Lucene api, choose explicitly) to use SimpleFSDirectory (if windows) or NIOFSDirectory, both are slower than the default.
For example
DirectoryReader.open(new NIOFSDirectory(Paths.get("path_to_index"), FSLockFactory.getDefault()))

Categories