java.lang.OutOfMemoryError: Java heap space?

I am writing list objects into a CSV file using a StringBuffer object. When the list contains a small amount of data our logic works perfectly, but when the list holds a large amount of data I get the error: java.lang.OutOfMemoryError: Java heap space.
Code snippet as follows:
StringBuffer report = new StringBuffer();
String[] column = null;
StringReader stream = null;
for (MassDetailReportDto dto : newList.values()) {
    int i = 0;
    column = new String[REPORT_INDEX];
    column[i++] = dto.getCommodityCode() == null ? " " : dto.getCommodityCode();
    column[i++] = dto.getOaId() == null ? " " : dto.getOaId();
    // like this we are calling some other getter methods
    // after all getter methods we append the columns to the StringBuffer object
    report.append(StringUtils.join(column, PIPE));
    report.append(NEW_LINE);
    // now we write the StringBuffer object to the file
    stream = new StringReader(report.toString());
    int count;
    char[] buffer = new char[4096];
    while ((count = stream.read(buffer)) > -1) {
        // writing into file
        writer.write(buffer, 0, count);
    }
    writer.flush();
    // clearing the buffer
    report.delete(0, report.length());
}
Error is:
java.lang.OutOfMemoryError: Java heap space
at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:99)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:393)
at java.lang.StringBuilder.append(StringBuilder.java:120)
Could you please look into the code snippet above and help me? It would be a great help.

Where does column get initialized? I don't see it, but it seems a likely culprit. You are building a string array without clearing it out (column[i++]). Where do you clear out that array? It should be scoped to the loop body, not declared outside of it. So inside the loop, declare your String[] column and use it within that scope, as sketched below.
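A minimal sketch of that change, reusing REPORT_INDEX, PIPE, and NEW_LINE from the question; as a side note, it also writes each row straight to the writer, so the full report never sits in a StringBuffer:
for (MassDetailReportDto dto : newList.values()) {
    String[] column = new String[REPORT_INDEX]; // fresh, loop-scoped array per row
    int i = 0;
    column[i++] = dto.getCommodityCode() == null ? " " : dto.getCommodityCode();
    column[i++] = dto.getOaId() == null ? " " : dto.getOaId();
    // ... remaining getters ...
    writer.write(StringUtils.join(column, PIPE));
    writer.write(NEW_LINE);
}
writer.flush();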

This seems a logical out-of-memory error to get when the list is big enough. Increasing the JVM heap size (using the -Xmx and -Xms JVM args) would resolve the issue temporarily. However, ideally you should use paged access to the source of the items in the list. If the list is populated from a database or web service, it can easily be accessed in a paged way, as sketched below.
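A rough sketch of that idea, assuming a hypothetical dao.fetchPage(offset, pageSize) method (not in the original code) that returns the next page of DTOs from the database, and a hypothetical buildColumns(dto) helper standing in for the getter calls above:
final int PAGE_SIZE = 1000; // tune to the available heap
int offset = 0;
List<MassDetailReportDto> page;
while (!(page = dao.fetchPage(offset, PAGE_SIZE)).isEmpty()) {
    for (MassDetailReportDto dto : page) {
        String[] column = buildColumns(dto);          // same getter logic as the question
        writer.write(StringUtils.join(column, PIPE)); // write each row straight through
        writer.write(NEW_LINE);
    }
    writer.flush();      // each page goes to disk before the next is fetched
    offset += PAGE_SIZE; // only one page is ever held in memory
}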

Related

for-loop is neglecting several indices in an ArrayList dataset

In the snippet of code below, I am attempting to write the contents of the stocks2 ArrayList into the stock_train.csv file. I am using a for-loop that should iterate through every element of the stocks2 ArrayList (the debugger indicates that the size of stocks2 is 2955).
However, I am tracking how many lines of data actually get written to the file with the variable r. At the end of the for-loop, r's value is only 390. I have reviewed this code thoroughly and am struggling to see why more than 80% of my data is not getting written to the file. (My stock_train.csv file only shows 390 lines of data, rather than 2955.) Are there any memory-allocation or syntax issues preventing this for-loop from writing all of stocks2's data to the CSV file? Thanks in advance for your time.
CSVWriter cd = new CSVWriter(new FileWriter("src/in/stock_train.csv"), ',', CSVWriter.NO_QUOTE_CHARACTER);
int r = 0;
int dd = 0; // Tracker variables
for (int g = 0; g < stocks2.size(); g++) {
    Stock q = stocks2.get(g); // stocks2: size = 2955
    String[] temp2 = new String[4];
    if (q.getTimestamp().startsWith("a")) {
        dd++;  // dd: 1
        break; // This code is included to neglect any data whose timestamp begins with 'a'. As evidenced by the value of 'dd', it only happens once.
    }
    temp2[0] = q.getTimestamp();
    temp2[1] = Double.toString(q.getPrice());
    temp2[2] = Double.toString(q.getVWAP(pv, v));
    temp2[3] = Integer.toString(q.getStatus()); // Data I want written to the stock_train.csv file
    r++; // r: 390
    System.out.println(g + " " + temp2);
    cd.writeNext(temp2);
}
cd.close();
/* Comments depict values of variables after the for-loop's run, based on debugger information */
Your comment suggests that you want to skip an entry if the corresponding timestamp starts with an "a". You actually use the break; keyword, which terminates the whole loop. This also explains why dd has a value of exactly 1.
What you want is a continue; instead of the break;. This has the effect that program execution continues with the next iteration of the loop, as in the sketch below.
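A minimal sketch of the corrected check (everything else in the loop stays the same):
if (q.getTimestamp().startsWith("a")) {
    dd++;
    continue; // skip only this entry; the loop proceeds with index g + 1
}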

Java data structure for providing random <String><Float> pair based on a large data set at run-time

Is there a smart way to create a 'JSON-like' structure of String - Float pairs? A 'key' is not needed, as data will be grabbed randomly, although an incremented key from 0 to n might aid random retrieval of associated data. Due to the size of the data set (10k pairs of values), I need this to be saved out to an external file.
The reason is how my data will be compiled. To save someone entering data into an array manually, the data will be Excel-based, saved out to CSV, and parsed by a temporary Java program into a file format (for example JSON) that can be added to my project's resources folder. I can then retrieve data from this set without my application having to load a huge array into memory when it starts. I could quite easily parse the CSV to fill up an array (or similar) at run-time, but I fear that on a mobile device the memory overhead would be significant.
I have reviewed the answers to Suitable Java data structure for parsing large data file and Data structure options for efficiently storing sets of integer pairs on disk? and have not been able to draw a definitive conclusion.
I have tried saving to a .json file; however, I am not sure whether I can request a random entry from it, and it seems quite cumbersome for holding such a simple structure. Is a TreeMap or Hashtable where I should focus my search?
To provide some context to my query: my application will run on Android and needs to reference a definition (an approx. 500-character String) and a conversion factor (a Float). I need to retrieve a random data entry, and the user may only make 2 or 3 requests during a session, so I see no point in loading a 10k-element array into memory. Query: perhaps modern Android phones will easily munch through this kind of lookup, and it is only an issue if I am parsing millions of entries at run-time?
I am open to using SQLite to hold my data if that provides the functionality required. Please note that the data set must be derived from a file format easily exported from Excel (CSV, TXT, etc.).
Any advice you can give me would be much appreciated.
Here's one possible design that requires a minimal memory footprint while providing fast access:
Start with a data file of comma-separated or tab-separated values so you have line breaks between your data pairs.
Keep an array of long values corresponding to the indexes of the lines in the data file. When you know where the lines are, you can use InputStream.skip() to advance to the desired line. This leverages the fact that skip() is typically quite a bit faster than read() for InputStreams.
You would have some setup code that would run at initialization time to index the lines.
An enhancement would be to only index every nth line so that the array is smaller. So if n is 100 and you're accessing line 1003, you take the 10th index to skip to line 1000, then read past two more lines to get to line 1003. This allows you to tune the size of the array to use less memory.
I thought this was an interesting problem, so I put together some code to test my idea. It uses a sample 4MB CSV file that I downloaded from some big data website that has about 36K lines of data. Most of the lines are longer than 100 chars.
Here's a code snippet for the setup phase:
long start = SystemClock.elapsedRealtime();
int lineCount = 0;
// Assumes fields declared elsewhere in the class: long[] mLines = new long[100];
// and a constant MULTIPLE (10 in my test); mLines[0] stays 0, the start of the file.
try (InputStream in = getResources().openRawResource(R.raw.fl_insurance_sample)) {
    int index = 0;
    int charCount = 0;
    int cIn;
    while ((cIn = in.read()) != -1) {
        charCount++;
        char ch = (char) cIn; // this was for debugging
        if (ch == '\n' || ch == '\r') {
            lineCount++;
            if (lineCount % MULTIPLE == 0) {
                index = lineCount / MULTIPLE;
                if (index == mLines.length) {
                    mLines = Arrays.copyOf(mLines, mLines.length + 100);
                }
                mLines[index] = charCount; // byte offset of the start of 1-based line (lineCount + 1)
            }
        }
    }
    mLines = Arrays.copyOf(mLines, index + 1); // trim unused slots
} catch (IOException e) {
    Log.e(TAG, "error reading raw resource", e);
}
long elapsed = SystemClock.elapsedRealtime() - start;
I discovered my data file was actually separated by carriage returns rather than line feeds. It must have been created on an Apple computer. Hence the test for '\r' as well as '\n'.
Here's a snippet from the code to access the line:
long start = SystemClock.elapsedRealtime();
int ch;
int line = Integer.parseInt(editText.getText().toString().trim());
// 'in' is an InputStream freshly opened on the same raw resource
if (line < 1 || line > mLines.length * MULTIPLE) { // the index covers at most this many lines; EOF is also caught below
    mTextView.setText("invalid line: " + line);
    return;
}
line--;                      // switch to zero-based
int index = line / MULTIPLE; // nearest indexed offset at or before the line
in.skip(mLines[index]);
int rem = line % MULTIPLE;   // remaining newlines to read past
while (rem > 0) {
    ch = in.read();
    if (ch == -1) {
        return; // reached EOF; readLine() would fail
    } else if (ch == '\n' || ch == '\r') {
        rem--;
    }
}
BufferedReader reader = new BufferedReader(new InputStreamReader(in));
String text = reader.readLine();
long elapsed = SystemClock.elapsedRealtime() - start;
My test program used an EditText so that I could input the line number.
So to give you some idea of performance, the first phase averaged around 1600ms to read through the entire file. I used a MULTIPLE value of 10. Accessing the last record in the file averaged about 30ms.
To get down to 30ms access with only a 29312-byte memory footprint is pretty good, I think.
You can see the sample project on GitHub.

javaml java.lang.OutOfMemoryError: Java heap space

I'm using javaml to train a classifier. The instances in my data contain vectors in a format like this:
1 0:5 1:9 24:2 ......
So when I read these from a file I'm using String.split(), and then putting the values into a SparseInstance, which then gets added to the dataset for the classifier.
However, I'm getting a heap-space out-of-memory error. I've read about String.split() causing memory leaks (substrings used to share the original string's backing char array), so I've used new String(...) to avoid that. However, I'm still facing the heap space problem.
The code is as follows:
////////////////////////////////////////
BufferedReader br = new BufferedReader(new FileReader("Repository\\IMDB Data\\Train.feat"));
Dataset data = new DefaultDataset();
String TrainLine;
int j = 0;
while ((TrainLine = br.readLine()) != null && j < 20000) {
    //TrainLine.replaceAll(":", " ");
    String[] arr = TrainLine.split("\\D+");
    double[] nums = new double[arr.length];
    for (int i = 0; i < nums.length; i++) {
        nums[i] = Double.parseDouble(new String(arr[i]));
    }
    // vector has one less element than arr (85527)
    String label;
    if (nums[0] == 1) {
        label = "positive";
    } else {
        label = "negative";
    }
    System.out.println(label);
    Instance instance = new SparseInstance(85527, label);
    for (int i = 1; i + 1 < arr.length; i += 2) { // i + 1 < arr.length guards the nums[i + 1] access
        instance.put((int) nums[i], nums[i + 1]);
        // Strings have been converted to new strings to overcome the substring memory leak
    }
    data.add(instance);
    j++;
}
knn = new KNearestNeighbors(5);
knn.buildClassifier(data);
svm = new LibSVM();
svm.buildClassifier(data);
////////////////////////////////////////
Exception in thread "AWT-EventQueue-0" java.lang.OutOfMemoryError: Java heap space
at java.util.TreeMap.put(Unknown Source)
at java.util.TreeSet.add(Unknown Source)
at java.util.AbstractCollection.addAll(Unknown Source)
at java.util.TreeSet.addAll(Unknown Source)
at net.sf.javaml.core.SparseInstance.keySet(SparseInstance.java:144)
at net.sf.javaml.core.SparseInstance.keySet(SparseInstance.java:27)
at libsvm.LibSVM.transformDataset(LibSVM.java:80)
at libsvm.LibSVM.buildClassifier(LibSVM.java:127)
at backend.ShubhamKNN.<init>(ShubhamKNN.java:55)
I also get this error; it happens when the dataset is too big. If you run your code with only 1000 records, I guess it runs OK. High memory use is a known problem with LibSVM; it frequently fails with:
java.lang.OutOfMemoryError: Java heap space
If your computer has enough memory (mine has 8 GB), you can raise the heap limit for the class in Eclipse:
choose the class which calls the libsvm lib in the Package Explorer view
on the menu: Run -> Run Configurations... -> Arguments tab -> in the VM arguments box, enter -Xmx1024M. This means the class may use at most 1024 MB of heap; I set the param to 3072M and my class runs OK.
rerun the class.
The above is my solution; for more detail see:
http://blog.csdn.net/felomeng/article/details/4688414
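Outside Eclipse, the same flag can be passed directly on the java command line; the classpath and class name here are just placeholders:
java -Xmx3072M -cp bin yourpackage.YourClassifierMain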

Maximum number of items in a J2ME List

I'm working on a J2ME project that involves getting a list of users from an online database. I then intend to populate a List with the names of the users, and the number can be very large. My question is: are there limits to the number of items you can append to a List?
HttpConnection hc = (HttpConnection) Connector.open("http://www.xxxxxxxxxxxx.com/......?xx=xx");
String reply = "";
InputStream is = hc.openInputStream();
int ch;
// Check the Content-Length first
long len = hc.getLength();
if (len != -1) {
    for (int i = 0; i < len; i++) {
        if ((ch = is.read()) != -1) {
            reply += (char) ch;
        }
    }
} else {
    // if the content-length is not available
    while ((ch = is.read()) != -1) {
        reply += (char) ch;
    }
}
is.close();
hc.close();
DataParser parser = new DataParser(reply); // This is a custom class I created to process the XML data returned from the server, split it into groups, and put it in an array.
List list = new List("Users", List.IMPLICIT); // LCDUI List requires a list type
if (parser.moveToNext()) {
    do {
        list.append(parser.get(), null);
    } while (parser.moveToNext());
}
This code seems to be working fine, but my problem is: if I keep calling list.append("", null), will it get to a point where some exception is thrown, say in the case of 50,000 names (list items)?
There is no limitation on the number of items in a List. You could as well use StringItems appended to a Form, then add item commands to them, as sketched below. I hope this helps.
J2ME tutorial at http://www.tutorialmasterng.blogspot.com
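A minimal sketch of that alternative, where names (the parsed user names) and listener (an ItemCommandListener) are assumed to exist:
Form form = new Form("Users");
Command select = new Command("Select", Command.ITEM, 1);
for (int i = 0; i < names.length; i++) {
    StringItem item = new StringItem(null, names[i]);
    item.setDefaultCommand(select);        // the "item command" on each entry
    item.setItemCommandListener(listener); // assumed ItemCommandListener
    form.append(item);
}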
Some implementations may have a limit; older Sony Ericsson phones, for instance, have a limit of 256 items in a List. Anyway, as Meier pointed out, lists with really many items can be slow or difficult to use, and 50k strings may easily cause an OOM on low-heap devices (1 - 2 MB).

Hash Table Memory Usage in Java

I am using Java to read data from a file, copy the data to smaller arrays, and put these arrays in Hashtables. I noticed that the HashMap consumes more memory (about double) than what is in the original file! Any idea why?
Here is my code:
// imports needed: java.io.*, java.util.HashMap
public static void main(final String[] args) throws IOException {
    final PrintWriter writer = new PrintWriter(new FileWriter("test.txt", true));
    for (int i = 0; i < 1000000; i++)
        writer.println("This is just a dummy text!");
    writer.close();
    final BufferedReader reader = new BufferedReader(new FileReader("test.txt"));
    final HashMap<Integer, String> testMap = new HashMap<Integer, String>();
    String line = reader.readLine();
    int k = 0;
    while (line != null) {
        testMap.put(k, line);
        k++;
        line = reader.readLine();
    }
    reader.close(); // close the reader when done
}
This is not a problem with HashMap, it's a problem with Java objects in general. Each object has a certain memory overhead, including the arrays and the entries in your HashMap.
But more importantly: character data consumes double the space in memory. The reason is that Java uses 16 bits for each character, whereas the file is probably encoded in ASCII or UTF-8, which uses only 7 or 8 bits per character.
Update: There is not much you can do about this. The code you posted is fine in principle; it just doesn't work with huge files. You might be able to do a little better if you tune your HashMap carefully, or you might use a byte array instead of a String to store your characters (assuming everything is ASCII or one-byte UTF-8), as sketched below.
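A minimal sketch of the byte-array idea, assuming the lines really are ASCII or one-byte UTF-8 (requires java.nio.charset.StandardCharsets):
final HashMap<Integer, byte[]> testMap = new HashMap<Integer, byte[]>();
// one byte per ASCII character instead of two:
testMap.put(k, line.getBytes(StandardCharsets.UTF_8));
// decode on the way back out:
String back = new String(testMap.get(k), StandardCharsets.UTF_8);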
But in the end, to solve your out-of-memory problems, the right way to go is to rethink your program so that you don't have to read the whole file into memory at once.
Whatever you're doing with the content of that file, think about whether you can do it while reading the file from disk (this is called streaming), or maybe extract the relevant parts and store only those. You could also try random access into the file; a streaming sketch follows.
I suggest you read up on those things a bit, try something, and come back and ask a new question specific to your application, because this thread is getting too long.
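For illustration, a streaming version of the test program above; the "work" here is just counting characters, standing in for whatever you actually do per line:
// requires: java.io.BufferedReader, java.io.FileReader, java.io.IOException
long totalChars = 0;
try (BufferedReader reader = new BufferedReader(new FileReader("test.txt"))) {
    String line;
    while ((line = reader.readLine()) != null) {
        totalChars += line.length(); // process and discard; only one line is in memory at a time
    }
}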
A map is an "extendable" structure: when it reaches its capacity it gets resized. So it is possible that, say, 40% of the space used by your map is actually empty. If you know how many entries your map will hold, you can use the dedicated constructors to size your map optimally:
Map<xx,yy> map = new HashMap<>(length, 1);
Even if you do that, the map will still use more space than the actual size of the contained items.
In more detail: a HashMap's capacity is doubled when the number of entries reaches (capacity * loadFactor). The default load factor for a HashMap is 0.75.
Example:
Imagine your map has a capacity of 10,000 entries.
You then put 7,501 entries in the map. capacity * loadFactor = 10,000 * 0.75 = 7,500.
So your map has passed its resize threshold and gets resized to (capacity * 2) = 20,000, although you are only holding 7,501 entries. That wastes a lot of space.
EDIT
This simple code gives you an idea of what happens in practice - the output is:
threshold of empty map = 8192
size of empty map = 35792
threshold of filled map = 8192
size of filled map = 1181712
threshold with one more entry = 16384
size with one more entry = 66640
which shows that if the last item you add happens to force the map to resize, it can artificially increase the size of your map. Admittedly, that does not account for the whole effect that you are observing.
// requires: import java.lang.reflect.Field;
public static void main(String[] args) throws java.lang.Exception {
    Field f = HashMap.class.getDeclaredField("threshold");
    f.setAccessible(true);
    long mem = Runtime.getRuntime().freeMemory();
    Map<String, String> map = new HashMap<>(2 << 12, 1); // 2 << 12 = 8,192
    System.out.println("threshold of empty map = " + f.get(map));
    System.out.println("size of empty map = " + (mem - Runtime.getRuntime().freeMemory()));
    mem = Runtime.getRuntime().freeMemory();
    for (int i = 0; i < 8192; i++) {
        map.put(String.valueOf(i), String.valueOf(i));
    }
    System.out.println("threshold of filled map = " + f.get(map));
    System.out.println("size of filled map = " + (mem - Runtime.getRuntime().freeMemory()));
    mem = Runtime.getRuntime().freeMemory();
    map.put("a", "a");
    System.out.println("threshold with one more entry = " + f.get(map));
    System.out.println("size with one more entry = " + (mem - Runtime.getRuntime().freeMemory()));
}
There are lots of things internal to the implementation of HashMap (and arrays) that need to be stored; array lengths are one such example. Not sure if this would account for double, but it could certainly account for some.
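As a rough worked example, assuming a 64-bit HotSpot JVM with compressed oops (exact numbers vary by JVM and version): each line of the test file is 27 bytes on disk ("This is just a dummy text!" plus a newline), while in the map the 26 characters alone occupy 52 bytes inside a char[], and the array header, the String object, the boxed Integer key, and the HashMap entry add very roughly another 90 bytes of per-line overhead. A factor of two over the file size is therefore reached easily, before the bucket table itself is even counted.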
