I have a problem with the Java heap space of BlueJ.
I have written a program which reads a .txt file into a String and then goes through all the characters of the string and does some stuff (I guess this is not really important). Some of the .txt files are really large (around 200 million characters).
If I try to execute the program with these .txt files I get this "Exception in thread "AWT-EventQueue-0" java.lang.OutOfMemoryError: Java heap space" error. I increased bluej.windows.vm.args in the bluej.def to 8 GB, and it still does not work. But I would actually guess that even a 200 million character String would not exceed this limit.
Here is the code I use to read in the .txt:
try
{
    FileReader reader = new FileReader(input.getText());
    BufferedReader bReader = new BufferedReader(reader);
    String parcour = "";
    String line = bReader.readLine();
    while (line != null)
    {
        parcour += line;
        line = bReader.readLine();
    }
}
catch (IOException e) // catch block assumed; the original snippet was cut off here
{
    e.printStackTrace();
}
input.getText() gets the file path.
I would be really grateful for an answer. Thanks :)
- Cyaena
In the explanation below, only the plain memory for the data is in scope; all additional memory needed for the structures is left out. It's more an overview than an in-depth view.
The memory is eaten at these lines:
String parcour = "";
...
String line = bReader.readLine();
...
parcour += line;
The line parcour += line is compiled into the class file as
new StringBuilder().append(parcour).append(line).toString()
Assume parcour contains a string of size 10 MB and line is of size 2 MB. Then the memory allocated during parcour += line; would be (roughly):
// creates a StringBuilder object of size 12 MB
new StringBuilder().append(parcour).append(line)
// the `.toString()` would generate a String object of size 12 MB
new StringBuilder().append(parcour).append(line).toString()
Before the newly created String is assigned to parcour, your code needs around 34 MB:

parcour                            = 10 MB
the temporary StringBuilder object = 12 MB
the String from the StringBuilder  = 12 MB
------------------------------------------
total                                34 MB
A small demo snippet shows that the OutOfMemoryError is thrown much earlier than you currently expect.
OOMString.java
class OOMString {
    public static void main(String[] args) throws Exception {
        String parcour = "";
        char[] chars = new char[1_000];
        String line = new String(chars);
        while (line != null) {
            System.out.println("length = " + parcour.length());
            parcour += line;
        }
    }
}
OOMStringBuilder.java
class OOMStringBuilder {
    public static void main(String[] args) throws Exception {
        StringBuilder parcour = new StringBuilder();
        char[] chars = new char[1_000];
        String line = new String(chars);
        while (line != null) {
            System.out.println("length = " + parcour.length());
            parcour.append(line);
        }
    }
}
Both snippets do the same: they add a 1,000 character string to parcour until the OutOfMemoryError is thrown. To speed it up, we limit the heap size to 10 MB.
output of java -Xmx10m OOMString
length = 1048000
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
output of java -Xmx10m OOMStringBuilder
length = 2052000
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
When you execute the code you will notice that OOMString takes much more time to fail (even though it fails at a shorter length) than OOMStringBuilder.
You also need to keep in mind that a single character is two bytes long. If your file contains 100 ASCII characters, they consume 200 bytes in memory.
Maybe this small demonstration explains it a bit for you.
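Applied to your reading loop, the fix is to accumulate into a StringBuilder and build the String once at the end. A minimal sketch (input.getText() is your text field; exception handling kept as in your snippet):

StringBuilder builder = new StringBuilder();
try (BufferedReader bReader = new BufferedReader(new FileReader(input.getText()))) {
    // each line is copied once into the builder instead of re-copying
    // the whole accumulated text on every iteration
    String line;
    while ((line = bReader.readLine()) != null) {
        builder.append(line);
    }
} catch (IOException e) {
    e.printStackTrace();
}
String parcour = builder.toString();

Note that toString() still needs one final copy, so the peak memory is roughly twice the text size, rather than the quadratic churn of +=.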
I've had some problems with BlueJ and heap space errors as well. In my case, opening the terminal crashed the entire application. I suspect this had something to do with generating a lot of output, similar to your large String: I had accidentally created an endless loop somewhere, which broke the terminal window.
I had to remove all property files. Now BlueJ works again and gives no more OutOfMemoryErrors.
I hope this might be helpful in other cases as well.
Related
I have a Java program that uploads files from local storage to the Minio browser. The file size is around 900 MB. When I execute the Java program I get:

java.lang.OutOfMemoryError: Java heap space

I tried increasing the heap size both in eclipse.ini and under Run --> Configurations --> Project to -Xms4096M -Xmx8192M.

After increasing the heap size, when I executed the program I received:
java.lang.OutOfMemoryError: Requested array size exceeds VM limit
How can I upload large files to Minio using Java?

This is what my Java program looks like:
StringBuilder stringBuilder = new StringBuilder();
File[] files = new File(path).listFiles();
showFiles(files);
System.out.println(pathList);
ListIterator<String> itr = pathList.listIterator();
while (itr.hasNext()) {
    String relativePath = itr.next();
    if (relativePath != null) {
        String absolutePath = path + (relativePath).replaceFirst("minio_files", "");
        System.out.println(absolutePath);
        System.out.println(relativePath);
        File f = new File(absolutePath);
        BufferedReader reader = new BufferedReader(new FileReader(f));
        String line = null;
        String ls = System.getProperty("line.separator");
        while ((line = reader.readLine()) != null) {
            stringBuilder.append(line);
            stringBuilder.append(ls);
        }
        if (stringBuilder.length() != 0) {
            // delete the last new line separator
            stringBuilder.deleteCharAt(stringBuilder.length() - 1);
        }
        reader.close();
        // Create an InputStream for object upload.
        ByteArrayInputStream bais = new ByteArrayInputStream(stringBuilder.toString().getBytes("UTF-8"));
Do you absolutely need to remove a trailing line separator from your text file?
If this is not absolutely required, you could let the minio client libraries handle the upload transparently:
String absolutePath=path+(relativePath).replaceFirst("minio_files", "");
File f =new File(absolutePath);
minio.putObject("bucketName", f.getName(), absolutePath);
According to the minio docs this allows uploads of up to 5 GB. This is easier to implement and faster than any other solution.
If you absolutely need to remove a trailing line separator, you should at least pre-size the StringBuilder (and use the correct code to remove the trailing line separator):
File f = new File(absolutePath);
stringBuilder.ensureCapacity((int) f.length() + 2);
try (BufferedReader reader = new BufferedReader(new FileReader(f))) {
    String line;
    String ls = System.getProperty("line.separator");
    while ((line = reader.readLine()) != null) {
        stringBuilder.append(line);
        stringBuilder.append(ls);
    }
    if (stringBuilder.length() != 0) {
        // delete the last new line separator
        stringBuilder.setLength(stringBuilder.length() - ls.length());
    }
}
Please beware that this code can never upload files larger than about 2 GB (a streaming alternative that sidesteps the String entirely is sketched below):
arrays in Java cannot be larger than Integer.MAX_VALUE-5 elements
therefore a StringBuilder cannot be used to create strings with more than Integer.MAX_VALUE-5 characters
transforming the string into a UTF-8 encoded byte array cannot produce a byte array longer than Integer.MAX_VALUE-5 bytes
since UTF-8 is a multibyte encoding, transforming a string with Integer.MAX_VALUE-5 characters into a byte array might not be possible
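If the trailing separator really must go, it can also be removed without ever holding the content in memory: stream the lines into a temporary file, writing each separator only when the next line arrives, then upload the temporary file with the file-based putObject call from the first snippet. A rough sketch under those assumptions (stripTrailingSeparator is a hypothetical helper, standard java.nio only):

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: copies the file without a trailing line separator,
// line by line, so memory use stays constant regardless of file size.
static Path stripTrailingSeparator(Path source) throws IOException {
    Path tmp = Files.createTempFile("upload-", ".tmp");
    String ls = System.getProperty("line.separator");
    try (BufferedReader reader = Files.newBufferedReader(source);
         BufferedWriter writer = Files.newBufferedWriter(tmp)) {
        String line = reader.readLine();
        boolean first = true;
        while (line != null) {
            if (!first) writer.write(ls); // separator only *between* lines
            writer.write(line);
            first = false;
            line = reader.readLine();
        }
    }
    return tmp;
}

The returned path then goes to the same file-based upload shown above, e.g. minio.putObject("bucketName", f.getName(), tmp.toString()); this keeps the client's 5 GB limit instead of the 2 GB String limit.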
I am trying to preprocess a large txt file (10 GB) and store it in a binary file for future use. As the code runs it slows down and ends with:

Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
The input file has the following structure:
200020000000008;0;2
200020000000004;0;2
200020000000002;0;2
200020000000007;1;2
This is the code I am using:
String strLine;
FileInputStream fstream = new FileInputStream(args[0]);
BufferedReader br = new BufferedReader(new InputStreamReader(fstream));
// Read file line by line
HMbicnt map = new HMbicnt("-1");
ObjectOutputStream outputStream = null;
outputStream = new ObjectOutputStream(new FileOutputStream(args[1]));
int sepIndex = 15;
int sepIndex2 = 0;
String str_i = "";
String bb = "";
String bbBlock = "init";
int cnt = 0;
lineCnt = 0;
while ((strLine = br.readLine()) != null) {
    // parse the line
    str_i = strLine.substring(0, sepIndex);
    sepIndex2 = strLine.substring(sepIndex + 1).indexOf(';');
    bb = strLine.substring(sepIndex + 1, sepIndex + 1 + sepIndex2);
    cnt = Integer.parseInt(strLine.substring(sepIndex + 1 + sepIndex2 + 1));
    if (!bb.equals(bbBlock)) {
        outputStream.writeObject(map);
        outputStream.flush();
        map = new HMbicnt(bb);
        map.addNew(str_i + ";" + bb, cnt);
        bbBlock = bb;
    } else {
        map.addNew(str_i + ";" + bb, cnt);
    }
}
outputStream.writeObject(map);
// Close the input stream
br.close();
outputStream.writeObject(map = null);
outputStream.close();
Basically, it goes through the input file and stores the data in the object HMbicnt (which is a hash map). Once it encounters a new value in the second column, it should write the object to the output file, free the memory, and continue.
Thanks for any help.
I think the problem is not that the 10 GB is in memory, but that you are creating too many HashMaps. Maybe you could clear the HashMap instead of re-creating it once you don't need it anymore.
There seems to have been a similar problem in java.lang.OutOfMemoryError: GC overhead limit exceeded; it is also about HashMaps.
Simply put, you're using too much memory. Since, as you said, your file is 10 GB, there is no way you're going to be able to fit it all into memory (unless, of course, you happen to have over 10 GB of RAM and have configured Java to use it).
From what I can tell from your code and description of it, you're reading the entire file into memory and adding it to one huge in-RAM map as you're doing so, then writing your result to output. This is not feasible. You'll need to redesign your code to work in-place (i.e. only keep a small portion of the file in memory at any given time).
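One more thing worth checking with this pattern: ObjectOutputStream keeps an internal reference to every object it has ever written (so it can encode back-references), which means none of the written maps can be garbage collected until the stream is reset or closed. A minimal sketch of the block-switch branch with that fix (HMbicnt is the asker's own class; everything else unchanged):

if (!bb.equals(bbBlock)) {
    outputStream.writeObject(map);
    outputStream.reset();   // drop the stream's handle table so old maps can be GC'd
    outputStream.flush();
    map = new HMbicnt(bb);  // the previous map is now garbage-collectable
    map.addNew(str_i + ";" + bb, cnt);
    bbBlock = bb;
}

Without the reset(), the stream alone can pin a 10 GB input's worth of maps in memory, which matches the "GC overhead limit exceeded" symptom.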
I'm trying to parse a very large file (~1.2 GB). Some lines of the file are bigger than the maximum allowed String size.
FileReader fileReader = new FileReader(filePath);
BufferedReader bufferedReader = new BufferedReader(fileReader);
String line;
while ((line = bufferedReader.readLine()) != null) {
    // Do something
}
bufferedReader.close();
Error:
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:3332)
    at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:137)
    at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:121)
    at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:569)
    at java.lang.StringBuffer.append(StringBuffer.java:369)
    at java.io.BufferedReader.readLine(BufferedReader.java:370)
    at java.io.BufferedReader.readLine(BufferedReader.java:389)
    at sax.parser.PrettyPrintXML.format(PrettyPrintXML.java:30)
line 30 :
while ((line = bufferedReader.readLine()) != null) {
Can anyone suggest an alternative approach for this case?
You are using readLine() on a file that doesn't have lines, so it tries to read the entire file as a single line. This does not scale.
Solution: don't. Read a chunk at a time, or maybe even a character at a time: whatever is dictated by the unstated structure of your file. A chunked version is sketched below.
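A minimal sketch of chunked reading (the file path is assumed to come in as an argument); no single String ever has to hold an oversized "line":

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class ChunkReader {
    public static void main(String[] args) throws IOException {
        char[] buffer = new char[8192];
        try (BufferedReader reader = new BufferedReader(new FileReader(args[0]))) {
            int read;
            // read() fills the buffer with up to 8192 chars and returns -1 at EOF
            while ((read = reader.read(buffer, 0, buffer.length)) != -1) {
                // process buffer[0..read) here, e.g. scan it for your delimiters
            }
        }
    }
}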
I believe the maximum String length is 2^31-1 [2,147,483,647] characters, and a 1.2 GB txt file (assuming it is a txt file) can store about 1,200,000,000 characters. Why do you need to read all the data? What are you using it for? Can you split the file up into several files, or read and parse it as smaller strings? Need more info.
You can use Apache Commons IO:
https://commons.apache.org/proper/commons-io/description.html
example:
InputStream in = new URL("http://commons.apache.org").openStream();
try {
    System.out.println(IOUtils.toString(in));
} finally {
    IOUtils.closeQuietly(in);
}
This question already has answers here:
Java IO implementation of unix/linux "tail -f"
(9 answers)
Closed 8 years ago.
I have a text file that I first want to print the last 6 lines of, and then to detect when a new line has been added so that it will keep updating the screen with recent activity. The idea is that I'm trying to display six recent transactions made in my program.
The problem I am currently encountering is that it keeps printing the first (not last) six lines in the text file, when I want it to be the other way around.
Here is my sample code:
try {
    BufferedReader in = new BufferedReader(new FileReader("transaction-list.txt"));
    System.out.println();
    System.out.println("SIX MOST RECENT TRANSACTIONS:");
    System.out.println();
    String line;
    for (int i = 0; i < 6; i++) {
        line = in.readLine();
        System.out.println(line);
    }
    in.close();
} catch (IOException e) {
    e.printStackTrace();
}
break;
You have to save the lines into a String array and, after reading the whole file, print the array. Just remember where to start reading within the saved array:
BufferedReader in = new BufferedReader(new FileReader("transaction-list.txt"));
System.out.println();
System.out.println("SIX MOST RECENT TRANSACTIONS:");
System.out.println();
String[] last6 = new String[6];
int count = 0;
while (in.ready()) {
    last6[count++ % 6] = in.readLine();
}
for (int i = 0; i < 6; i++) {
    System.out.println(last6[(i + count) % 6]);
}
in.close();
Your current logic only reads the first 6 lines and prints them. Basically, you can read all lines into a list and drop the lines you don't need. Check the following post:
How to read last 5 lines of a .txt file into java
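A minimal sketch of that idea (fine when the file comfortably fits in memory, since readAllLines loads every line):

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

public class LastSixLines {
    public static void main(String[] args) throws IOException {
        List<String> lines = Files.readAllLines(Paths.get("transaction-list.txt"));
        // print only the last six lines (or all of them if there are fewer)
        for (String line : lines.subList(Math.max(0, lines.size() - 6), lines.size())) {
            System.out.println(line);
        }
    }
}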
While there are 4 other answers, I don't think any of them addresses both your points: (1) printing the last 6 lines and (2) then monitoring the file and printing new lines.
I also think you should keep it simple to better convey your code's intent and remove bug risk:
just use a BufferedReader rather than RandomAccessFile - this is what BufferedReader is for
instead of using an array just use a FIFO Queue like ArrayDeque<String> - this is a perfect use case for it and the "ringbuffer" implementation is fully encapsulated inside ArrayDeque
A barebones implementation which does all this would be something like:
// requires: import java.io.*; import java.util.ArrayDeque; import java.util.Queue;
public static void MonitorFile(String filePath)
        throws FileNotFoundException, IOException, InterruptedException {
    // Used for demo only: count lines after init to exit function after n new lines
    int newLineCount = 0;
    // constants
    final int INITIAL_LINE_LIMIT = 6;
    final int POLLING_INTERVAL = 1000;
    // file readers
    FileReader file = new FileReader(filePath);
    BufferedReader fr = new BufferedReader(file);
    // read-and-monitor loop
    boolean initialising = true;
    Queue<String> lineBuffer = new ArrayDeque<String>(INITIAL_LINE_LIMIT);
    int lineCount = 0;
    while (true) {
        String line = fr.readLine();
        if (line != null) {
            if (initialising) { // buffer
                lineBuffer.add(line);
                if (++lineCount > INITIAL_LINE_LIMIT) lineBuffer.remove();
            } else { // print
                System.out.printf("%d %s%n", ++lineCount, line);
                newLineCount++;
            }
        } else {
            // No more lines, so dump buffer and/or start monitoring
            if (initialising) {
                initialising = false;
                // reset the line numbers for printing
                lineCount = Math.max(0, lineCount - INITIAL_LINE_LIMIT);
                // print out the buffered lines
                while ((line = lineBuffer.poll()) != null)
                    System.out.printf("%d %s%n", ++lineCount, line);
                System.out.println("finished pre-loading file: now monitoring changes");
            }
            // Wait and try to read again.
            if (newLineCount > 2) break; // demo only: terminate after 2 new lines
            else Thread.sleep(POLLING_INTERVAL);
        }
    }
}
Points to consider:
For what it's worth, I would pass the BufferedReader in as a parameter so this becomes more generalised,
This needs some kind of cancellation so it doesn't monitor forever.
Rather than polling and sleeping your thread, you could also use file change monitoring, but that code would be more complex than is suitable for this answer; a rough sketch follows after the sample output below.
The above code gives the following output
2 test line b
3 test line c
4 test line d
5 test line e
6 test line f
7 test line g
finished pre-loading file: now monitoring changes
8 test line h
9 test line i
10 test line j
11 test line k
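For completeness, here is the rough sketch of the file-change-monitoring alternative mentioned in the points above, using java.nio.file.WatchService (the file name is assumed; the watch must be registered on the parent directory, and events carry the changed file's relative path):

import java.io.IOException;
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;

public class FileWatchSketch {
    public static void main(String[] args) throws IOException, InterruptedException {
        Path file = Paths.get("transaction-list.txt").toAbsolutePath();
        try (WatchService watcher = FileSystems.getDefault().newWatchService()) {
            file.getParent().register(watcher, StandardWatchEventKinds.ENTRY_MODIFY);
            while (true) {
                WatchKey key = watcher.take(); // blocks until the directory changes
                for (WatchEvent<?> event : key.pollEvents()) {
                    if (file.getFileName().equals(event.context())) {
                        // the file changed: read and print its new tail here
                    }
                }
                if (!key.reset()) break; // directory no longer accessible
            }
        }
    }
}

This removes the polling interval but, as noted, still needs cancellation handling for real use.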
I have a big file (about 30 MB), and here is the code I use to read data from the file:
StringBuilder sb = new StringBuilder();
BufferedReader br = new BufferedReader(new FileReader(file));
try {
    String line = br.readLine();
    while (line != null) {
        sb.append(line).append("\n");
        line = br.readLine();
    }
} finally {
    br.close();
}
Then I need to split the content I read, so I use:
String[] inst = sb.toString().split("GO");
The problem is that sometimes the sub-string is over the maximum String length and I can't get all the data inside the string. How can I get rid of this?
Thanks
Use Scanner s = new Scanner(input).useDelimiter("GO"); and read each piece with s.next().
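A minimal, self-contained sketch of that approach (the file name is assumed): each next() returns one chunk between "GO" delimiters, so no single String has to hold the whole content:

import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

public class GoSplitter {
    public static void main(String[] args) throws FileNotFoundException {
        // the delimiter is a regex; plain "GO" matches the literal text GO
        try (Scanner s = new Scanner(new File("input.txt")).useDelimiter("GO")) {
            while (s.hasNext()) {
                String chunk = s.next();
                // process one chunk at a time here
            }
        }
    }
}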
WHY PART: The erroneous result may be the outcome of a non-contiguous heap segment, as the CMS collector doesn't defragment memory.
(It does not answer your "how to solve" part, though.)
You may opt for loading the whole string part-wise, i.e. using substring.