Lucene Out Of Memory - java

I'm using Lucene v4.10.4. I have a pretty big index; it could be over a few GB. So I get an OutOfMemoryError when initializing the IndexSearcher:
try (Directory dir = FSDirectory.open(new File(indexPath))) {
    // OutOfMemoryError is thrown here!
    IndexSearcher searcher = new IndexSearcher(DirectoryReader.open(dir));
How do I tell Lucene's DirectoryReader not to load more than 256 MB into memory at once?
Log
Caused by: java.lang.OutOfMemoryError: Java heap space
at org.apache.lucene.util.fst.BytesStore.<init>(BytesStore.java:68)
at org.apache.lucene.util.fst.FST.<init>(FST.java:386)
at org.apache.lucene.util.fst.FST.<init>(FST.java:321)
at org.apache.lucene.codecs.blocktree.FieldReader.<init>(FieldReader.java:85)
at org.apache.lucene.codecs.blocktree.BlockTreeTermsReader.<init>(BlockTreeTermsReader.java:192)
at org.apache.lucene.codecs.lucene41.Lucene41PostingsFormat.fieldsProducer(Lucene41PostingsFormat.java:441)
at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsReader.<init>(PerFieldPostingsFormat.java:197)
at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat.fieldsProducer(PerFieldPostingsFormat.java:254)
at org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:120)
at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:108)
at org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:62)
at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:923)
at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:53)
at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:67)

First, you should check the current heap size of your JVM:
java -XX:+PrintFlagsFinal -version | grep MaxHeapSize
If this number is not reasonable for your use case, increase it by running your program with the -Xmx option of the java command. A sample command assigning 8 GB of heap memory would look like:
java -Xmx8g -jar your_jar_file
Hope this helps.

Related

Java.lang.OutOfMemoryError: Java heap space org.apache.poi.ss.formula.FormulaCellCacheEntrySet.add(FormulaCellCacheEntrySet.java:63)

I am using switch(evaluator.evaluateInCell(cellIn).getCellType()) in my Java program to evaluate formulas in Excel. I have around 15,000 rows in my Excel file. Here is my code:
Cell cellDB_AcctNumber = row1.createCell(lastCell1);
cellDB_AcctNumber.setCellType(Cell.CELL_TYPE_FORMULA);
cellDB_AcctNumber.setCellFormula("VLOOKUP($E" + k + ",'SQL_AMS_DATA'!$C$2:$F$" + lastrowDB + ",2,FALSE)");
Cell cellDB_RoutNumber = row1.createCell(lastCell1 + 1);
cellDB_RoutNumber.setCellType(Cell.CELL_TYPE_FORMULA);
cellDB_RoutNumber.setCellFormula("VLOOKUP($E" + k + ",'SQL_AMS_DATA'!$C$2:$F$" + lastrowDB + ",3,FALSE)");
Cell cellDB_AcctType = row1.createCell(lastCell1 + 2);
cellDB_AcctType.setCellType(Cell.CELL_TYPE_FORMULA);
cellDB_AcctType.setCellFormula("VLOOKUP($E" + k + ",'SQL_AMS_DATA'!$C$2:$F$" + lastrowDB + ",4,FALSE)");
When I run it in Eclipse I get the error below. Could you please help me with this error?
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at org.apache.poi.ss.formula.FormulaCellCacheEntrySet.add(FormulaCellCacheEntrySet.java:63)
at org.apache.poi.ss.formula.CellCacheEntry.addConsumingCell(CellCacheEntry.java:85)
at org.apache.poi.ss.formula.FormulaCellCacheEntry.changeConsumingCells(FormulaCellCacheEntry.java:80)
at org.apache.poi.ss.formula.FormulaCellCacheEntry.setSensitiveInputCells(FormulaCellCacheEntry.java:60)
at org.apache.poi.ss.formula.FormulaCellCacheEntry.updateFormulaResult(FormulaCellCacheEntry.java:109)
at org.apache.poi.ss.formula.CellEvaluationFrame.updateFormulaResult(CellEvaluationFrame.java:75)
at org.apache.poi.ss.formula.EvaluationTracker.updateCacheResult(EvaluationTracker.java:93)
at org.apache.poi.ss.formula.WorkbookEvaluator.evaluateAny(WorkbookEvaluator.java:294)
at org.apache.poi.ss.formula.WorkbookEvaluator.evaluate(WorkbookEvaluator.java:229)
at org.apache.poi.hssf.usermodel.HSSFFormulaEvaluator.evaluateFormulaCellValue(HSSFFormulaEvaluator.java:354)
at org.apache.poi.hssf.usermodel.HSSFFormulaEvaluator.evaluateInCell(HSSFFormulaEvaluator.java:243)
at org.apache.poi.hssf.usermodel.HSSFFormulaEvaluator.evaluateInCell(HSSFFormulaEvaluator.java:46)
at com.sentry.comparison.convert.xls2xlsx.xls2xlsxConvert(xls2xlsx.java:74)
at com.sentry.comparison.main.ComparisonMain.main(ComparisonMain.java:41)
You can increase the memory settings in the eclipse.ini file by adding these lines at the bottom of the file:
-XX:PermSize=256M
-XX:MaxPermSize=512M
Note that these raise the permanent generation, not the heap; for the heap itself use -Xms/-Xmx, and for a program launched from Eclipse set those in the Run Configuration's VM arguments rather than in eclipse.ini.
But if you are running a web application in a web container or web server, you should increase the memory of your server instead.
If you are running in Tomcat, create a .sh or .bat file (depending on your OS; conventionally named setenv.sh or setenv.bat), put it inside your bin directory, and write these instructions inside it:
set CATALINA_OPTS=-server -Xms256m -Xmx2048m -XX:PermSize=512m -XX:MaxPermSize=512m
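The Unix equivalent would be a bin/setenv.sh along these lines, which catalina.sh picks up automatically on startup (a sketch; the sizes are illustrative, not tuned):
# bin/setenv.sh -- sourced by catalina.sh if present
CATALINA_OPTS="-server -Xms256m -Xmx2048m -XX:PermSize=512m -XX:MaxPermSize=512m"
export CATALINA_OPTS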

How do I increase Heap Memory Size Programmatically

I have sample code to increase the heap memory, but it is not increasing the memory.
Sample code:
int mb = 1024 * 1024;
long rt = Runtime.getRuntime().totalMemory();
int heapsize = (int) (rt / mb);
System.out.println("Heap Size : " + heapsize);
String[] cmd = {"cmd.exe", "/c", "cd/C C:\\Users\\xxxxxx\\Documents\\NetBeansProjects\\MultiThreadSample\\src\\multithreadsample && java -Xms61m -Xmx128m"};
Process exec = Runtime.getRuntime().exec(cmd);
exec.destroy();
SpawnAndChangeHeap is the class name. Can you please suggest a fix?
The Java heap is allocated contiguously when the JVM initializes. The heap size cannot be extended or modified programmatically; changes to -Xmx or -Xms only take effect with a JVM restart.
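Since the heap is fixed for the lifetime of the JVM, the usual workaround is to re-launch the work in a child JVM started with a larger -Xmx. A minimal sketch (the 512 MB threshold is illustrative; the class name SpawnAndChangeHeap is taken from the question):
import java.io.File;
import java.io.IOException;

public class SpawnAndChangeHeap {
    public static void main(String[] args) throws IOException, InterruptedException {
        long maxMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        System.out.println("Max heap: " + maxMb + " MB");
        if (maxMb < 512 && args.length == 0) { // guard so the child does not respawn again
            String javaBin = System.getProperty("java.home")
                    + File.separator + "bin" + File.separator + "java";
            ProcessBuilder pb = new ProcessBuilder(javaBin, "-Xmx512m",
                    "-cp", System.getProperty("java.class.path"),
                    SpawnAndChangeHeap.class.getName(), "respawned");
            pb.inheritIO();                  // forward the child's output to this console
            int exit = pb.start().waitFor(); // wait for the child instead of destroy()
            System.exit(exit);
        }
        // work that needs the larger heap goes here
    }
}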

Fuseki GC overhead limit exceeded during data import

I'm trying to import LinkedMDB (6.1m triples) into my local version of jena-fuseki at startup:
/path/to/fuseki-server --file=/path/to/linkedmdb.nt /ds
and that runs for a minute, then dies with the following error:
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
at com.hp.hpl.jena.graph.Node$3.construct(Node.java:318)
at com.hp.hpl.jena.graph.Node.create(Node.java:344)
at com.hp.hpl.jena.graph.NodeFactory.createURI(NodeFactory.java:48)
at org.apache.jena.riot.system.RiotLib.createIRIorBNode(RiotLib.java:80)
at org.apache.jena.riot.system.ParserProfileBase.createURI(ParserProfileBase.java:107)
at org.apache.jena.riot.system.ParserProfileBase.create(ParserProfileBase.java:156)
at org.apache.jena.riot.lang.LangNTriples.tokenAsNode(LangNTriples.java:97)
at org.apache.jena.riot.lang.LangNTriples.parseOne(LangNTriples.java:90)
at org.apache.jena.riot.lang.LangNTriples.runParser(LangNTriples.java:54)
at org.apache.jena.riot.lang.LangBase.parse(LangBase.java:42)
at org.apache.jena.riot.RDFParserRegistry$ReaderRIOTFactoryImpl$1.read(RDFParserRegistry.java:142)
at org.apache.jena.riot.RDFDataMgr.process(RDFDataMgr.java:818)
at org.apache.jena.riot.RDFDataMgr.parse(RDFDataMgr.java:679)
at org.apache.jena.riot.RDFDataMgr.read(RDFDataMgr.java:211)
at org.apache.jena.riot.RDFDataMgr.read(RDFDataMgr.java:104)
at org.apache.jena.fuseki.FusekiCmd.processModulesAndArgs(FusekiCmd.java:251)
at arq.cmdline.CmdArgModule.process(CmdArgModule.java:51)
at arq.cmdline.CmdMain.mainMethod(CmdMain.java:100)
at arq.cmdline.CmdMain.mainRun(CmdMain.java:63)
at arq.cmdline.CmdMain.mainRun(CmdMain.java:50)
at org.apache.jena.fuseki.FusekiCmd.main(FusekiCmd.java:141)
Is there a way I can bump up the memory limit, or import the data in a less memory-intensive way?
For comparison's sake, when I used a 1-million-triple source file, it imported in less than 10 seconds.
Increase the heap memory: java -Xmx2048M -jar fuseki-sys.jar ...
Open the fuseki-server script in an editor and you'll find the line JVM_ARGS=${JVM_ARGS:--Xmx1200M}; modify it to JVM_ARGS=${JVM_ARGS:--Xmx2048M}.
Set JVM_ARGS when using the fuseki-server script.
Also note that --file=... reads the whole file into memory; the dataset may simply be too big to handle that way. If so, load it into TDB and use a TDB database with Fuseki.
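A minimal sketch of that route, assuming the Jena TDB command-line tools are on the PATH and /path/to/tdb is an empty directory (the paths are illustrative):
tdbloader --loc=/path/to/tdb /path/to/linkedmdb.nt
/path/to/fuseki-server --loc=/path/to/tdb /ds
The bulk loader streams triples to disk as it parses, so it avoids building the whole graph in the heap the way --file= does.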

How to take heap dump?

I want to collect a heap dump on a JVM crash, so I wrote a simple piece of code:
import java.util.HashMap;
import java.util.Map;

public class Test {
    private String name;

    public Test(String name) {
        this.name = name;
    }

    public void execute() {
        Map<String, String> randomData = new HashMap<String, String>();
        for (int i = 0; i < 1000000000; i++) {
            randomData.put("Key:" + i, "Value:" + i);
        }
    }

    public void addData() {
    }

    public static void main(String[] args) {
        String myName = "Aniket";
        Test tStart = new Test(myName);
        tStart.execute();
    }
}
and I am running it as follows
[aniket#localhost Desktop]$ java -cp . -Xms2m -Xmx2m Test
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at Test.execute(Test.java:15)
at Test.main(Test.java:25)
I got the OutOfMemoryError I wanted, but there is no heap dump in the working directory (I expected something like hs_err_pidXXXX.log). What am I missing? How do I get a heap dump?
Update:
I tried -XX:ErrorFile=. with no luck. If the above is not the way to get a heap dump (by crashing the JVM), how can I crash my JVM to get those logs?
You are confusing an exception or error being thrown with a JVM crash.
A JVM crash occurs due to an internal error in the JVM; you cannot trigger one by writing a normal Java program (or should not be able to, unless you find a bug).
What you are doing is triggering an Error, which means the program continues to run until all the non-daemon threads exit.
The simplest tool for examining the heap is VisualVM, which comes with the JDK. If you want to trigger a heap dump on an OutOfMemoryError, you can use -XX:+HeapDumpOnOutOfMemoryError.
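For the Test class above, the full command would look something like this (the dump path /tmp is illustrative); when the OutOfMemoryError is thrown, the JVM writes a java_pid<pid>.hprof file there:
java -cp . -Xms2m -Xmx2m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp Test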
Use jmap:
jmap [options] pid
where pid is the process id of the application.
When you see the below:
Exception in thread "main" java.lang.OutOfMemoryError
it means your error or exception was handled by the exception handler. This is not a crash.
Eclipse has an awesome heap analyzer (the Eclipse Memory Analyzer).
Also, you can use jps to get the PID, and then jmap for the heap dump itself.
In case you want to crash the JVM, your best bet would be native code.
Find the process id for which you want to take the heap dump:
ps -ef | grep java
Once you get the PID by running the above command, run the command below to generate the heap dump:
jmap -dump:format=b,file=<fileName> <java PID>
You can pass the following JVM arguments to your application:
-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=<file-path>
These arguments automatically trigger a heap dump at the specified file path when your application experiences an OutOfMemoryError. There are 7 different options for taking heap dumps from your application:
jmap
-XX:+HeapDumpOnOutOfMemoryError
jcmd
JVisualVM
JMX
Programmatic Approach (see the sketch below)
Administrative consoles
Details about each option can be found in this article. Once you have captured a heap dump, you can use tools like the Eclipse Memory Analyzer or HeapHero to analyze it.
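As an illustration of the "Programmatic Approach" option, a minimal sketch using HotSpot's diagnostic MXBean (the class name HeapDumper and the file name heap.hprof are illustrative; this relies on the com.sun.management package, so it is HotSpot-specific):
import java.io.IOException;
import java.lang.management.ManagementFactory;
import com.sun.management.HotSpotDiagnosticMXBean;

public class HeapDumper {
    public static void main(String[] args) throws IOException {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // Writes an hprof-format dump; fails if the file already exists.
        // true = dump only live (reachable) objects.
        bean.dumpHeap("heap.hprof", true);
    }
}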

java.lang.outofmemory exception jvm heap size insufficient

I tried to read a text file which has only one line. The file size is over 50 MB. When I try to read it using the following code, it gives:
java.lang.outofmemory exception jvm heap size insufficient
I changed the Java heap memory to 1 GB, but it still gives the same exception:
set JAVA_OPTS=-Xms1024m -Xmx1024m
I use the following code fragment to read the file:
BufferedReader Filein1 = new BufferedReader(new FileReader(new File("C:\\ABC\\MsgStream.txt")));
s = Filein1.readLine();
Can someone please tell me how to overcome this problem? Thanks in advance.
The JAVA_OPTS environment variable is only respected by certain applications (for example the wrapper scripts that are typically used to launch Tomcat). The java command doesn't pay any attention to it.
You need to put the options on the java command line, before the class name, e.g.:
java -Xms1024m -Xmx1024m ... some.pkg.MainClass ...
(A 1 GB heap should be more than adequate for buffering a 50 MB file.)
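A quick way to confirm that the -Xmx setting actually took effect is to print the max heap from inside the program (a minimal sketch; the class name HeapCheck is illustrative):
public class HeapCheck {
    public static void main(String[] args) {
        long maxMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        System.out.println("Max heap: " + maxMb + " MB"); // roughly 1024 with -Xmx1024m
    }
}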
Is JAVA_OPTS actually being picked up? You may need to put the options on the command line you're running, or include $JAVA_OPTS in it.
It's _JAVA_OPTIONS, not JAVA_OPTS.
If you have a class like
public class Test {
    public static void main(String[] args) {
        System.out.println("value : " + System.getProperty("foo"));
    }
}
then you should get
> set _JAVA_OPTIONS=-Dfoo=bar
> java Test
> value : bar
