BufferedReader no longer buffering after a while?

BufferedReader no longer buffering after a while? - java

Sorry I can't post code but I have a bufferedreader with 50000000 bytes set as the buffer size. It works as you would expect for half an hour, the HDD light flashing every two minutes or so, reading in the big chunk of data, and then going quiet again as the CPU processes it. But after about half an hour (this is a very big file), the HDD starts thrashing as if it is reading one byte at a time. It is still in the same loop and I think I checked free ram to rule out swapping (heap size is default).
Probably won't get any helpful answers, but worth a try.
OK I have changed heap size to 768mb and still nothing. There is plenty of free memory and java.exe is only using about 300mb.
Now I have profiled it and heap stays at about 200MB, well below what is available. CPU stays at 50%. Yet the HDD starts thrashing like crazy. I have.. no idea. I am going to rewrite the whole thing in c#, that is my solution.
Here is the code (it is just a throw-away script, not pretty):
BufferedReader s = null;
HashMap<String, Integer> allWords = new HashMap<String, Integer>();
HashSet<String> pageWords = new HashSet<String>();
long[] pageCount = new long[78592];
long pages = 0;
Scanner wordFile = new Scanner(new BufferedReader(new FileReader("allWords.txt")));
while (wordFile.hasNext()) {
allWords.put(wordFile.next(), Integer.parseInt(wordFile.next()));
}
s = new BufferedReader(new FileReader("wikipedia/enwiki-latest-pages-articles.xml"), 50000000);
StringBuilder words = new StringBuilder();
String nextLine = null;
while ((nextLine = s.readLine()) != null) {
if (a.matcher(nextLine).matches()) {
continue;
}
else if (b.matcher(nextLine).matches()) {
continue;
}
else if (c.matcher(nextLine).matches()) {
continue;
}
else if (d.matcher(nextLine).matches()) {
nextLine = s.readLine();
if (e.matcher(nextLine).matches()) {
if (f.matcher(s.readLine()).matches()) {
pageWords.addAll(Arrays.asList(words.toString().toLowerCase().split("[^a-zA-Z]")));
words.setLength(0);
pages++;
for (String word : pageWords) {
if (allWords.containsKey(word)) {
pageCount[allWords.get(word)]++;
}
else if (!word.isEmpty() && allWords.containsKey(word.substring(0, word.length() - 1))) {
pageCount[allWords.get(word.substring(0, word.length() - 1))]++;
}
}
pageWords.clear();
}
}
}
else if (g.matcher(nextLine).matches()) {
continue;
}
words.append(nextLine);
words.append(" ");
}

Have you tried removing the buffer size and trying it out with the defaults?

It may be not that the file buffering isn't working, but that your program is using up enough memory that your virtual memory system is page swapping to disk. What happens if you try with a smaller buffer size? What about larger?

I'd bet that you are running out of heap space and you are getting stuck doing back to back GC's. Have you profiled the app to see what is going on during that time? Also, try running with -verbose:gc to see garbage collection as it happens. You could also try starting with a larger heap like"
-Xms1000m -Xmx1000m
That will give you 1gb of heap so if you do use that all up, it should be much later than it is currently happening.

It appears to me that if the file you are reading is very large, then the following lines could result in a large portion of the file being copied to memory via a StringBuilder. If the process' memory footprint becomes too large, you will likely swap and/or throw your garbage collector into a spin.
...
words.append(nextLine);
words.append(" ");

Hopefully this may help: http://www.velocityreviews.com/forums/t131734-bufferedreader-and-buffer-size.html

Before you assume there is something wrong with Java and reading IO, I suggest you write a simple program which just reads the file as fast as it can. You should be able to read the file at 20 MB/s or more regardless of file size with default buffering. You should be able to do this by stripping down your application to just read the file. Then you can prove to yourself how long it takes to read the file.
You have used quite a lot of expensive operations. Perhaps you should look at how you can make your parser more efficient using a profiler. e.g.
word.substring(0, word.length() - 1)
is the same as
word
so the first if clause and the second are the same.

Related

Java outOfMemory exception in string.split

I have a big txt file with integers in it. Each line in file has two integer numbers separated by whitespace. Size of a file is 63 Mb.
Pattern p = Pattern.compile("\\s");
try (BufferedReader reader = new BufferedReader(new FileReader(filePath))) {
String line;
while ((line = reader.readLine()) != null) {
String[] tokens = p.split(line);
String s1 = new String(tokens[0]);
String s2 = new String(tokens[1]);
int startLabel = Integer.valueOf(s1) - 1;
int endLabel = Integer.valueOf(s2) - 1;
Vertex fromV = vertices.get(startLabel);
Vertex toV = vertices.get(endLabel);
Edge edge = new Edge(fromV, toV);
fromV.addEdge(edge);
toV.addEdge(edge);
edges.add(edge);
System.out.println("Edge from " + fromV.getLabel() + " to " + toV.getLabel());
}
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOfRange(Arrays.java:2694)
at java.lang.String.<init>(String.java:203)
at java.lang.String.substring(String.java:1913)
at java.lang.String.subSequence(String.java:1946)
at java.util.regex.Pattern.split(Pattern.java:1202)
at java.util.regex.Pattern.split(Pattern.java:1259)
at SCC.main(SCC.java:25)
Why am I getting this exception? How can I change my code to avoid it?
EDIT:
I've already increase heap size to 2048m.
What is consuming it? That's what I would want to know also.
For all I know jvm should allocate memory to list of vertices, set of edges, buffer for buffered reader and one small string "line". I don't see where this outOfMemory coming from.
I read about string.split() method. I think it's causing memory leak, but I don't know what should I do about it.

What you should try first is reduce the file to small enough that it works. That will allow you to appraise just how large a problem you have.
Second, your problem is definitely unrelated to String#split since you are using it on just one line at a time. What is consuming your heap are the Vertex and Edge instances. You'll have to redesign this towards a smaller footprint, or completely overhaul your algorithms to be able to work with only a part of the graph in memory, the rest on the disk.
P.S. Just a general Java note: don't write
String s1 = new String(tokens[0]);
String s2 = new String(tokens[1]);
you just need
String s1 = tokens[0];
String s2 = tokens[1];
or even just use tokens[0] directly instead of s1, since it's about as clear.

Easiest way: increase your heap size:
Add -Xmx512m -Xms512m (or even more) arguments to jvm

Increase the heap memory limit, using the -Xmx JVM option.
More info here.

You are getting this exception because your program is storing too much data in the java heap.
Although your exception is showing up in the Pattern.split() method, the actual culprit could be any large memory user in your code, such as the graph you are building. Looking at what you provided, I suspect the graph data structure is storing much redundant data. You may want to research a more space-efficient graph structure.
If you are using the Sun JVM, try the JVM option -XX:+HeapDumpOnOutOfMemoryError to create a heap dump and analyze that for any heavy memory users, and use that analysis to optimize your code. See Using HeapDumpOnOutOfMemoryError parameter for heap dump for JBoss for more info.
If that's too much work for you, as others have indicated, try increasing the JVM heap space to a point where your program no longer crashes.

When ever you get an OOM while trying to parse stuff, its just that the method you are using is not scalable. Even though increasing the heap might solve the issue temporarily, it is not scalable. Example, if tomorrow your file size increases by an order or magnitude, you would be back in square one.
I would recommend trying to read the file in pieces, cache x lines of the file, read off it, clear the cache and re-do the process.
You can use either ehcache or guava cache.

The way you parse the string could be changed.
try (Scanner scanner = new Scanner(new FileReader(filePath))) {
while (scanner.hasNextInt()) {
int startLabel = scanner.nextInt();
int endLabel = scanner.nextInt();
scanner.nextLine(); // discard the rest of the line.
// use start and end.
}
I suspect the memory consumption is actually in the data structure you build rather than how you read the data, but this should make it more obvious.

Free unused RAM in Java

I have a Java class that allocates all files within a directory (6GB). Then for each file, does some text processing. When I check the ram usage, I can see that when I finish from a file and start to the next file, RAM does not get rid of the previous file - bad garbage collection, I guess. Is there a way to programatically free the finished file and its data?
public void fromDirectory(String path) {
File folder = new File(path);
disFile = path + "/dis.txt";
if (folder.isDirectory()) {
File[] listOfFiles = folder.listFiles();
for (int i = 0; i < listOfFiles.length; i++) {
File file = listOfFiles[i];
if (file.isFile() && file.getName().contains("log")) {
System.out.println("The file will be processed is: "
+ file.getPath());
forEachFile(file.getPath());
//Runtime.getRuntime().exec("purge");
//System.gc();
} else
System.out.println("The file " + file.getName()
+ " doesn't contain log");
}
} else {
System.out.println("The path: " + path + " is not a directory");
}
}
private void forEachFile(String filePath) {
File in = new File(filePath);
File out = new File(disFile);
try {
out.createNewFile();
FileWriter fw = new FileWriter(out.getAbsoluteFile());
BufferedWriter bw = new BufferedWriter(fw);
BufferedReader reader = new BufferedReader(new FileReader(in));
String line = null;
while ((line = reader.readLine()) != null) {
if (line.toLowerCase().contains("keyword")) {
bw.write(line);
bw.newLine();
numberOfLines++;
}
}
reader.close();
bw.close();
} catch (IOException e) {
e.printStackTrace();
}
}

You can strongly suggests the VM to do a garbage collection by calling System.gc() . It is generally considered a code-smell to do so.

I think you are mistaking two things here: JVMs memory allocation and realy memory usage within allocated space.
JVM may allocate a lot of memory and not free it even after the objects that were using it were garbaged internally. It may be freed after some time or not freed at all.
You could try to reduce the memory footprint of your application, for example by not using toLowerCase, since it creates a new object. Maybe a precompiled regex search would be faster?
Using System.gc() as you did it, in your case, in my opinion, is acceptable. Whether it helps anything - I don't know.
As long as you have a lot of memory available and Java doesn't slow down because of too not being able to allocate more, I would leave it as it is. The code looks fine.

Even if you are right about checking the memory from some profiler and deducing "correctly" that the file remains in memory why do you think it should be released immediately?
The JVM will garbage collect when the memory is running out (depending on the JVM configuration) not when developers think it should.
Also judging from your question I doubt if you used a profiler or a similar tool gauge JVM memory usage. Instead it's more likely you checked the memory being used by the JVM as a whole.
Also you shouldn't worry about these things unless you are encountering out of memory errors.

As stated, the garbage collector runs when there is no more memory available. If you have 10 files of 100MB each, and you set your heap to 4GB, then chances are that you simply won't ever get any GC.
Now, for the "free the finished file and its data" part, you cannot really do this by yourself, and should not try to do so.
If you want your application to be memory-efficient, then you can just set the maximum heap size to a small value.
On the other hand, if you want your application to be really fast, then you don't want to suffer from any GC, therefore eliminating every System.gc() call and giving your heap as much memory as possible.
Trying to free memory yourself mean giving too much memory to your heap (your app is not memory-efficient) and triggering GC yourself (your app is not time-efficient either).
Note that in some cases, the JVM can give back memory to the OS. For instance, with G1, it will, but with CMS, it won't. See this article for more details.
Finally, if you use Java7, you should wrap your InputStream/OutputStream in a try-with-resources. Or, at least, wrap the .close() in a finally block.
Hope that helps !

getting Java OutOfMemoryError: Java heap space error that I can't debug

I am struggling to figure out what's causing this OutofMemory Error. Making more memory available isn't the solution, because my system doesn't have enough memory. Instead I have to figure out a way of re-writing my code.
I've simplified my code to try to isolate the error. Please take a look at the following:
File[] files = new File(args[0]).listFiles();
int filecnt = 0;
LinkedList<String> urls = new LinkedList<String>();
for (File f : files) {
if (filecnt > 10) {
System.exit(1);
}
System.out.println("Doing File " + filecnt + " of " + files.length + " :" + f.getName());
filecnt++;
FileReader inputStream = null;
StringBuilder builder = new StringBuilder();
try {
inputStream = new FileReader(f);
int c;
char d;
while ((c = inputStream.read()) != -1) {
d = (char)c;
builder.append(d);
}
}
finally {
if (inputStream != null) {
inputStream.close();
}
}
inputStream.close();
String mystring = builder.toString();
String temp[] = mystring.split("\\|NEWandrewLINE\\|");
for (String s : temp) {
String temp2[] = s.split("\\|NEWandrewTAB\\|");
if (temp2.length == 22) {
urls.add(temp2[7].trim());
}
}
}
I know this code is probably pretty confusing :) I have loads of text files in the directory that is specified in args[0]. These text files were created by me. I used |NEWandrewLINE| to indicate a new row in the text file, and |NEWandrewTAB| to indicate a new column. In this code snippet, I am trying to access the URL of each stored row (which is in the 8th column of each row). So, I read in the whole text file. String split on |NEWandrewLINE| and then string split again on the substrings on |NEWandrewTAB|. I add the URL to the LinkedList (called "urls") with the line: urls.add(temp2[7].trim())
Now, the output of running this code is:
Doing File 0 of 973 :results1322453406319.txt
Doing File 1 of 973 :results1322464193519.txt
Doing File 2 of 973 :results1322337493419.txt
Doing File 3 of 973 :results1322347332053.txt
Doing File 4 of 973 :results1322330379488.txt
Doing File 5 of 973 :results1322369464720.txt
Doing File 6 of 973 :results1322379574296.txt
Doing File 7 of 973 :results1322346981999.txt
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2882)
at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:572)
at java.lang.StringBuilder.append(StringBuilder.java:203)
at Twitter.main(Twitter.java:86)
Where main line 86 relates to the line builder.append(d); in this example.
But the thing I don't understand is that if I comment out the line urls.add(temp2[7].trim()); I don't get any error. So the error seems to be caused by the linkedlist "urls" overfilling. But why then does the reported error relate to the StringBuilder?

Try to replace urls.add(temp2[7].trim()); with urls.add(new String(temp2[7].trim()));.
I suppose that your problem is that you are in fact storing the entire file content and not just the extracted URL field in your urls list, although that's not really obvious. It is actually an implementation specific issue with the String class, but usually String#split and String#trim return new String objects, which contain the same internal char array as the original string and only differs in their offset and length fields. Using the new String(String) constructor makes sure that you only keep the relevant part of the original data.

The linked list is using more memory each time you add a string. This means you can be left it not enough memory to build your StringBuilder.
The way to avoid this issue to write the results to a file instead of to a List as you don't appear to have enough memory to keep the List in memory.

Because this is
out of memory and not out of heap
you have LOTS of small temporary objects
I would suggest you give your JVM a -X maximum heap size limit that fits in your RAM.
To use less memory I would use a buffered reader to pull in the entire line and save on the temporary object creation.

The simple answer is: you should not load all the URLs from the text files into memory. You are surely doing this because you want to process them in a next step. So instead of adding them to a List in memory do the next step (maybe storing in a database or check if it is reachable) and forget that URL.

How many URLS do you have? Looks like you're just storing more of them than you can handle.
As far as I can see, the linked list is the only object that is not scoped inside the loop, so cannot be collected.
For an OOM error, it doesn't really matter where it is thrown.
To check this properly, use a profiler (look at JVisualVM for a free one, and you probably already have it). You'll see which objects are in the heap. You can also have the JVM dump its memory into a file when it crashes, then analyse that file with visualvm. You should see that one thing is grabbing all of your memory. I'm suspecting it's all the URLs.

There are several experts in here already, so, I'l be brief to the problems:
Inappropriate use of String Builder:
StringBuilder builder = new StringBuilder();
try {
inputStream = new FileReader(f);
int c;
char d;
while ((c = inputStream.read()) != -1) {
d = (char)c;
builder.append(d);
}
}
Java is beautiful when you process small amounts of data at a time, remember the garbage collector.
Instead, I would recommend that you read the file (Text file) 1 line at a time, process the line, and move on, never create a huge memory ball of StringBuilder just to get a String,
Immagine of your text file is 1 GB in size, you are done mate.
Add the real process while reading the file (as in item #1)
You dont need to close InputStream again, the code in finally block is good enough.
regards

if the linkedlist eats your memory every command which allocates memory afterwards may fail with an OOM error. So this looks like your problem.

You're reading the files into memory. At least one file is simply too big to fit into the default JVM heap. You can allow it use a lot more memory with an arg like -Xmx1g on the command line after java.
By the way this is really inefficient to read a file one character at a time!

Instead of trying to split the string (which basically creates an array of substrings based on the split) - thereby using more than double the memory each time you use the slpit, you should try to do regex based matching of the start and end patterns, extract individual sub-strings one by one and then extract the URL from that.
Also, if your file is large, I would suggest that you not even load all of that into memory at once ... stream its contents to a buffer (of manageable size) and use the pattern based search on that (and keep removing / adding more to the buffer as you progress through the file contents).
The implementation will slow down the program a bit but will use a considerably lesser amount of memory.

One major problem in your code is that you read whole file into a string builder, then convert it into string and then split it into smaller parts. So if file size is large you will get into trouble. As suggested by others process the file line by line as that should save a lot of memory.
Also you should check what is the size of your list after processing each file. If the size is very large you may want to use different approach or increase the memory for your process via -Xmx option.

Java: Filling in-memory sorted batches

So I'm using Java to do multi-way external merge sorts of large on-disk files of line-delimited tuples. Batches of tuples are read into a TreeSet, which are then dumped into on-disk sorted batches. Once all of the data have been exhausted, these batches are then merge-sorted to the output.
Currently I'm using magic numbers for figuring out how many tuples we can fit into memory. This is based on a static figure indicating how may tuples can be roughly fit per MB of heap space, and how much heap space is available using:
long max = Runtime.getRuntime().maxMemory();
long used = Runtime.getRuntime().totalMemory();
long free = Runtime.getRuntime().freeMemory();
long space = free + (max - used);
However, this does not always work so well since we may be sorting different length tuples (for which the static tuple-per-MB figure might be too conservative) and I now want to use flyweight patterns to jam more in there, which may make the figure even more variable.
So I'm looking for a better way to fill the heap-space to the brim. Ideally the solution should be:
reliable (no risk of heap-space exceptions)
flexible (not based on static numbers)
efficient (e.g., not polling runtime memory estimates after every tuple)
Any ideas?

Filling the heap to the brim might be a bad idea due to garbage collector trashing. (As the memory gets nearly full, the efficiency of garbage collection approaches 0, because the effort for collection depends on heap size, but the amount of memory freed depends on the size of the objects identified as unreachable).
However, if you must, can't you simply do it as follows?
for (;;) {
long freeSpace = getFreeSpace();
if (freeSpace < 1000000) break;
for (;;freeSpace > 0) {
treeSet.add(readRecord());
freeSpace -= MAX_RECORD_SIZE;
}
}
The calls to discover the free memory will be rare, so shouldn't tax performance much. For instance, if you have 1 GB heap space, and leave 1MB empty, and MAX_RECORD_SIZE is ten times average record size, getFreeSpace() will be invoked a mere log(1000) / -log(0.9) ~= 66 times.

Why bother with calculating how many items you can hold? How about letting java tell you when you've used up all your memory, catching the exception and continuing. For example,
// prepare output medium now so we don't need to worry about having enough
// memory once the treeset has been filled.
BufferedWriter writer = new BufferedWriter(new FileWriter("output"));
Set<?> set = new TreeSet<?>();
int linesRead = 0;
{
BufferedReader reader = new BufferedReader(new FileReader("input"));
try {
String line = reader.readLine();
while (reader != null) {
set.add(parseTuple(line));
linesRead += 1;
line = reader.readLine();
}
// end of file reached
linesRead = -1;
} catch (OutOfMemoryError e) {
// while loop broken
} finally {
reader.close();
}
// since reader and line were declared in a block their resources will
// now be released
}
// output treeset to file
for (Object o: set) {
writer.write(o.toString());
}
writer.close();
// use linesRead to find position in file for next pass
// or continue on to next file, depending on value of linesRead
If you still have trouble with memory, just make the reader's buffer extra large so as to reserve more memory.
The default size for the buffer in a BufferedReader is 4096 bytes. So when finishing reading you will release upwards of 4k of memory. After this your additional memory needs will be minimal. You need enough memory to create an iterator for the set, let's be generous and assume 200 bytes. You will also need memory to store the string output of your tuples (but only temporarily). You say the tuples contain about 200 characters. Let's double that to take account for separators -- 400 characters, which is 800 bytes. So all you really need is an additional 1k bytes. So you're fine as you've just released 4k bytes.
The reason you don't need to worry about the memory used to store the string output of your tuples is because they are short lived and only referred to within the output for loop. Note that the Writer will copy the contents into its buffer and then discard the string. Thus, the next time the garbage collector runs the memory can be reclaimed.
I've checked and, a OOME in add will not leave a TreeSet in an inconsistent state, and the memory allocation for a new Entry (the internal implementation for storing a key/value pair) happens before the internal representation is modified.

You can really fill the heap to the brim using direct memory writing (it does exist in Java!). It's in sun.misc.Unsafe, but isn't really recommended for use. See here for more details. I'd probably advise writing some JNI code instead, and using existing C++ algorithms.

I'll add this as an idea I was playing around with, involving using a SoftReference as a "sniffer" for low memory.
SoftReference<Byte[]> sniffer = new SoftReference<String>(new Byte[8192]);
while(iter.hasNext()){
tuple = iter.next();
treeset.add(tuple);
if(sniffer.get()==null){
dump(treeset);
treeset.clear();
sniffer = new SoftReference<String>(new Byte[8192]);
}
}
This might work well in theory, but I don't know the exact behaviour of SoftReference.
All soft references to softly-reachable objects are guaranteed to have been cleared before the virtual machine throws an OutOfMemoryError. Otherwise no constraints are placed upon the time at which a soft reference will be cleared or the order in which a set of such references to different objects will be cleared. Virtual machine implementations are, however, encouraged to bias against clearing recently-created or recently-used soft references.
Would like to hear feedback as it seems to me like an elegant solution, although behaviour might vary between VMs?
Testing on my laptop, I found that it the soft-reference is cleared infrequently, but sometimes is cleared too early, so I'm thinking to combine it with meriton's answer:
SoftReference<Byte[]> sniffer = new SoftReference<String>(new Byte[8192]);
while(iter.hasNext()){
tuple = iter.next();
treeset.add(tuple);
if(sniffer.get()==null){
free = MemoryManager.estimateFreeSpace();
if(free < MIN_SAFE_MEMORY){
dump(treeset);
treeset.clear();
sniffer = new SoftReference<String>(new Byte[8192]);
}
}
}
Again, thoughts welcome!

Fastest Java way to remove the first/top line of a file (like a stack)

I am trying to improve an external sort implementation in java.
I have a bunch of BufferedReader objects open for temporary files. I repeatedly remove the top line from each of these files. This pushes the limits of the Java's Heap.
I would like a more scalable method of doing this without loosing speed because of a bunch of constructor calls.
One solution is to only open files when they are needed, then read the first line and then delete it. But I am afraid that this will be significantly slower.
So using Java libraries what is the most efficient method of doing this.
--Edit--
For external sort, the usual method is to break a large file up into several chunk files. Sort each of the chunks. And then treat the sorted files like buffers, pop the top item from each file, the smallest of all those is the global minimum. Then continue until for all items.
http://en.wikipedia.org/wiki/External_sorting
My temporary files (buffers) are basically BufferedReader objects. The operations performed on these files are the same as stack/queue operations (peek and pop, no push needed).
I am trying to make these peek and pop operations more efficient. This is because using many BufferedReader objects takes up too much space.

I'm away from my compiler at the moment, but I think this will work. Edit: works fine.
I urge you to profile it and see. I bet the constructor calls are going to be nothing compared to the file I/O and your comparison operations.
public class FileStack {
private File file;
private long position = 0;
private String cache = null;
public FileStack(File file) {
this.file = file;
}
public String peek() throws IOException {
if (cache != null) {
return cache;
}
BufferedReader r = new BufferedReader(new FileReader(file));
try {
r.skip(position);
cache = r.readLine();
return cache;
} finally {
r.close();
}
}
public String pop() throws IOException {
String r = peek();
if (r != null) {
// if you have \r\n line endings, you may need +2 instead of +1
// if lines could end either way, you'll need something more complicated
position += r.length() + 1;
cache = null;
}
return r;
}
}

If heap space is the main concern, use the [2nd form of the BufferedReader constructor][1] and specify a small buffer size.
[1]: http://java.sun.com/j2se/1.5.0/docs/api/java/io/BufferedReader.html#BufferedReader(java.io.Reader, int)

I have a bunch of BufferedReader objects open for temporary files. I repeatedly remove the top line from each of these files. This pushes the limits of the Java's Heap.
This is a really surprising claim. Unless you have thousands files open at the same time, there is no way that that should stress the heap. The default buffer size for a BufferedReader is 8192 bytes, and there should be little extra space required. 8192 * 1000 is only ~8Mbytes, and that is tiny compared with a typical Java application's memory usage.
Consider the possibility that something else is causing the heap problems. For example, if your program retained references to each line that it read, THAT would lead to heap problems.
(Or maybe your notion of what is "too much space" is unrealistic.)
One solution is to only open files when they are needed, then read the first line and then delete it. But I am afraid that this will be significantly slower.
There is no doubt that it would be significantly slower! There is simply no efficient way to delete the first line from a file. Not in Java, or in any other language. Deleting characters from the beginning or middle of a file entails copying the file to a new one while skipping over the characters that need to be removed. There is no faster alternative.

We Keep Coding

Java is a programming language and computing platform first released by Sun Microsystems in 1995.

BufferedReader no longer buffering after a while? - java

Have you tried removing the buffer size and trying it out with the defaults?

It may be not that the file buffering isn't working, but that your program is using up enough memory that your virtual memory system is page swapping to disk. What happens if you try with a smaller buffer size? What about larger?

Hopefully this may help: http://www.velocityreviews.com/forums/t131734-bufferedreader-and-buffer-size.html

Related

Java outOfMemory exception in string.split

Free unused RAM in Java

getting Java OutOfMemoryError: Java heap space error that I can't debug

Java: Filling in-memory sorted batches

Fastest Java way to remove the first/top line of a file (like a stack)

Categories

Resources