Existing file slowing down java program - java

I'm running a few methods together that take in many text files, read their contents, then write things about their contents to a new file. The problem I have is that when the output file already exists, the program is very slow. If I delete the file and then run the program, it is very fast. I'm using BufferedReader and BufferedWriter for my I/O. I feel like there's a simple answer that I'm just not finding. Thanks in advance! I'd rather not post code if possible, sorry!
EDIT:
here's very generally what's going on
File path= new File("some path");
try {
BufferedWriter writer = new BufferedWriter(new FileWriter(path, false));
//do some string manipulation
writer.append(string);
writer.newLine();
...
//once done
writer.close();
}catch(IOException e) {
//... handle this ...
}
The problem is that when this file exists, everything is slow. If it doesn't then it is fast.

I would revisit whatever it is you're doing when you say " //do some string manipulation".
Here is what I noticed with > 1000 iterations:
the time it takes to get the file handle and close the writer generally remain the same
the inner loop operation with the string "ABCDEFGHIJKLMNOPQRSTUVWXYZ" has a mean variance of 98ms
the same inner loop operation with a string four times that size shows much larger variation in run time. Sometimes the program finished in 2 seconds, sometimes in 20 seconds.
I also did a version of this test where the file was always deleted first. It made no difference. Here's the code I ran:
public static void main(String[] args) {
long s = System.currentTimeMillis();
File path = new File("output.txt");
long stop = System.currentTimeMillis();
System.out.println("handle acquired " + (stop - s) );
try {
BufferedWriter writer = new BufferedWriter(new FileWriter(path, false));
//do some string manipulation
s = System.currentTimeMillis();
for (int i =0; i < 10000000; i++) {
String string = "ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZ";
writer.append(string);
writer.newLine();
}
stop = System.currentTimeMillis();
System.out.println("loop end " + (stop - s) );
s = System.currentTimeMillis();
writer.close();
stop = System.currentTimeMillis();
System.out.println("writer closed " + (stop - s) );
}catch(IOException e) {
//... handle this ...
}
}
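The "file exists" versus "file deleted first" comparison described above could be sketched roughly as follows. This is only an illustration: the class name OverwriteTiming, the helper timeWrite and the temp-file location are made up for the example, and the timings it prints will vary from machine to machine and run to run.

```java
import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class OverwriteTiming {
    // Writes `lines` copies of `text` to `path`, truncating any existing file,
    // and returns the elapsed time in nanoseconds (open + write + close).
    static long timeWrite(Path path, String text, int lines) throws IOException {
        long start = System.nanoTime();
        try (BufferedWriter writer = new BufferedWriter(new FileWriter(path.toFile(), false))) {
            for (int i = 0; i < lines; i++) {
                writer.append(text);
                writer.newLine();
            }
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical scratch file; createTempFile leaves an empty file on disk.
        Path path = Files.createTempFile("output", ".txt");
        String text = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";

        long overwrite = timeWrite(path, text, 1_000_000); // file already exists
        Files.delete(path);
        long fresh = timeWrite(path, text, 1_000_000);     // file was deleted first

        System.out.println("existing file ms: " + overwrite / 1_000_000);
        System.out.println("deleted first ms: " + fresh / 1_000_000);
        Files.delete(path);
    }
}
```

If both numbers come out similar, the slowdown is more likely in the string manipulation than in the file handling.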

Related

Read from a large file and write to multiple files with java

I have an A.txt file of 100,000,000 records from 1 to 100000000, one record per line. I have to read file A and write to files B and C, so that even records go to file B and odd records go to file C.
Required read and write time must be less than 40 seconds.
Below is the code that I already have but the runtime takes more than 50 seconds.
Does anyone have any other solution to reduce runtime?
Threading.java
import java.io.*;
import java.util.concurrent.LinkedBlockingQueue;
public class Threading implements Runnable {
LinkedBlockingQueue<String> queue = new LinkedBlockingQueue<>();
String file;
Boolean stop = false;
public Threading(String file) {
this.file = file;
}
public void addQueue(String row) {
queue.add(row);
}
public void Stop() {
stop = true;
}
public void run() {
try {
BufferedWriter bw = new BufferedWriter(new FileWriter(file));
while(!stop) {
try {
String row = queue.take();
bw.write(row + "\n");
} catch (Exception e) {
e.printStackTrace();
}
}
bw.close();
} catch (Exception e) {
e.printStackTrace();
}
}
}
ThreadCreate.java
// I used 2 threads to write to 2 files B and C
import java.io.*;
import java.util.List;
public class ThreadCreate {
public void startThread(File file) {
Threading t1 = new Threading("B.txt");
Threading t2 = new Threading("C.txt");
Thread td1 = new Thread(t1);
Thread td2 = new Thread(t2);
td1.start();
td2.start();
try {
BufferedReader br = new BufferedReader(new FileReader(file));
String line;
long start = System.currentTimeMillis();
while ((line = br.readLine()) != null) {
if (Integer.parseInt(line) % 2 == 0) {
t1.addQueue(line);
} else {
t2.addQueue(line);
}
}
t1.Stop();
t2.Stop();
br.close();
long end = System.currentTimeMillis();
System.out.println("Time to read file A and write file B, C: " + ((end - start)/1000) + "s");
} catch (Exception e) {
e.printStackTrace();
}
}
}
Main.java
import java.io.*;
public class Main {
public static void main(String[] args) throws IOException {
File file = new File("A.txt");
//Write file B, C
ThreadCreate t = new ThreadCreate();
t.startThread(file);
}
}
Why are you making threads? That just slows things down. Threads are useful if the bottleneck is either the calculation itself or the blocking nature of the operation, and they only hurt if it is not. Here, it isn't: The CPU is just idling (the bottleneck will be the disk), and the nature of what it is blocking on means that multithreading does not help either: Telling a single SSD to write 2 boatloads of bytes in parallel is probably no faster (only slower, as it needs to bounce back and forth). If the target disk is a spinning disk, it is way slower - the write head cannot make clones of itself to go any faster, and by making it multithreaded, you are wasting a ton of time by asking the write head to bounce back and forth between the different write locations.
There's nothing that immediately strikes me as ripe for significant speedups.
Sometimes, writing a ton of data to a disk just takes 50 seconds. If that's not acceptable, buy a faster disk.
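For comparison, a single-threaded version of the split might look something like the sketch below. It is not a guaranteed speedup, just the simplest shape without queues or threads. The class name SplitEvenOdd is invented for the example, and it assumes every record is a plain non-negative decimal integer with no trailing whitespace, so that the last digit alone decides parity and Integer.parseInt can be skipped.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class SplitEvenOdd {
    // Copies even records to `even` and odd records to `odd` on the calling thread.
    // Assumption: each non-empty line is a plain decimal integer.
    static void split(Path in, Path even, Path odd) throws IOException {
        try (BufferedReader br = Files.newBufferedReader(in, StandardCharsets.US_ASCII);
             BufferedWriter evenOut = Files.newBufferedWriter(even, StandardCharsets.US_ASCII);
             BufferedWriter oddOut = Files.newBufferedWriter(odd, StandardCharsets.US_ASCII)) {
            String line;
            while ((line = br.readLine()) != null) {
                if (line.isEmpty()) {
                    continue; // skip blank lines
                }
                // Parity of a decimal number is the parity of its last digit.
                int lastDigit = line.charAt(line.length() - 1) - '0';
                BufferedWriter out = (lastDigit % 2 == 0) ? evenOut : oddOut;
                out.write(line);
                out.write('\n');
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.out.println("usage: SplitEvenOdd <input> <evenOutput> <oddOutput>");
            return;
        }
        long start = System.currentTimeMillis();
        split(Paths.get(args[0]), Paths.get(args[1]), Paths.get(args[2]));
        System.out.println("elapsed ms: " + (System.currentTimeMillis() - start));
    }
}
```

Called as split(Paths.get("A.txt"), Paths.get("B.txt"), Paths.get("C.txt")), this mirrors the question's file names. Whether it beats 40 seconds still depends on the disk.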
Try memory-mapped files:
byte[] buffer = "foo bar foo bar text\n".getBytes();
int number_of_lines = 100000000;
FileChannel file = new RandomAccessFile("writeFIle.txt", "rw").getChannel();
ByteBuffer wrBuf = file.map(FileChannel.MapMode.READ_WRITE, 0, buffer.length * number_of_lines);
for (int i = 0; i < number_of_lines; i++)
{
wrBuf.put(buffer);
}
file.close();
On my computer (Dell, i7 processor, SSD, 32 GB RAM) this code took a little over half a minute to run.

java.io.IOException: Stream closed. what is the best way to write stream data to multiple files?

My Java code receives stream data, such as from Twitter. I need to store the data in files of e.g. 10000 records each. So I need to recreate the FileWriter and BufferedWriter to create a new file and then write data to it.
// global variables
String stat;
long counter = 0;
boolean first = true;
Date date;
SimpleDateFormat format;
String currentTime;
String fileName;
BufferedWriter bw = null;
FileWriter fw = null;
public static void main(String[] args) {
String dirToSave = args[0];
String fileIdentifier = args[1];
createFile(dirToSave, fileIdentifier);
StatusListener listener = new StatusListener() {
@Override
public void onStatus(Status status) {
stat = TwitterObjectFactory.getRawJSON(status);
try {
if(bw!=null){
bw.write(stat + "\n");
}
} catch (IOException ex) {
System.out.println(ex.getMessage());
}
counter++;
if (counter == 10000) {
createFile(dirToSave, fileIdentifier);
try {
TimeUnit.SECONDS.sleep(5);
} catch (InterruptedException ex) {
System.out.println(ex.getMessage());
}
counter = 0;
}
}
};
TwitterStream twitterStream = new TwitterStreamFactory(confBuild.build()).getInstance();
twitterStream.addListener(listener);
// twitterStream.filter(filQuery);
}
public static void createFile(String path, String fileIdentifier) {
date = new Date();
format = new SimpleDateFormat("yyyyMMddHHmm");
currentTime = format.format(date);
fileName = path + "/" + fileIdentifier + currentTime + ".json";
// if there was buffer before, flush & close it first before creating new file
if (!first) {
try {
bw.flush();
bw.close();
fw.close();
} catch (IOException ex) {
Logger.getLogger(LocalFile_All_en.class
.getName()).log(Level.SEVERE, null, ex);
}
} else {
first = false;
}
// create a new file
try {
fw = new FileWriter(fileName);
bw = new BufferedWriter(fw);
} catch (IOException ex) {
Logger.getLogger(Stack.class
.getName()).log(Level.SEVERE, null, ex);
}
}
However, I always get an error after some hours.
SEVERE: null
java.io.IOException: Stream closed
EDIT: The error message says that this code throws the error:
if (counter == 10000) {
createFile(dirToSave, fileIdentifier);
...
and
bw.flush();
What is the problem with my code? Or is there a better way to write stream data like this?
If this error only comes up every now and then, and writing works again afterwards, I think bw can be closed and not yet reopened at the moment onStatus() tries to write to it or flush it.
So bw can be non-null but closed. You need to synchronize the closing and reopening somehow.
For example, have onStatus() write through a callback that handles closing the old file and opening the new one, rather than writing directly to bw.
Update: I am assuming here that twitterStream can call onStatus() without waiting for the previous call to finish. The first call has just closed the stream and the second one writes to it right after that. Rare, but it will happen over a long period of time.
Update2: this also applies to the flush() part.
I already added this as a short comment, but people often advise getting rid of static, and especially global static, variables in Java, arguing that they cause big problems later that are hard to resolve and debug. This might be a good case of that.
Read also:
Why are static variables considered evil?
Volatile Vs Static in java
The latter has an example of how to synchronize concurrent requests.
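One way the close/reopen synchronization could look is sketched below, under the assumption that onStatus() may be called from several threads. The class RotatingWriter and its naming scheme (prefix plus a running index instead of the question's timestamp) are invented for illustration and are not part of Twitter4J.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class RotatingWriter implements AutoCloseable {
    private final Path dir;
    private final String prefix;
    private final int recordsPerFile;
    private BufferedWriter current; // guarded by "this"
    private int written;            // records written to the current file
    private int fileIndex;          // number of files opened so far

    RotatingWriter(Path dir, String prefix, int recordsPerFile) {
        this.dir = dir;
        this.prefix = prefix;
        this.recordsPerFile = recordsPerFile;
    }

    // Writing and rotating happen under the same lock, so no caller can
    // ever observe a writer that has been closed but not yet replaced.
    synchronized void write(String record) throws IOException {
        if (current == null || written == recordsPerFile) {
            rotate();
        }
        current.write(record);
        current.newLine();
        written++;
    }

    // Only called while holding the lock.
    private void rotate() throws IOException {
        if (current != null) {
            current.close(); // close() flushes buffered data first
        }
        Path next = dir.resolve(prefix + "-" + fileIndex++ + ".json");
        current = Files.newBufferedWriter(next, StandardCharsets.UTF_8);
        written = 0;
    }

    @Override
    public synchronized void close() throws IOException {
        if (current != null) {
            current.close();
            current = null;
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("tweets"); // hypothetical output directory
        try (RotatingWriter w = new RotatingWriter(dir, "stream", 2)) {
            for (int i = 0; i < 5; i++) {
                w.write("{\"id\":" + i + "}");
            }
        }
        System.out.println("files written to " + dir);
    }
}
```

In the listener, onStatus() would then only call write(stat) on a single shared instance, with no static bw, fw or counter fields left to race on.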

Java Process Builder

I have a project where the program has to open a file in Notepad; after the user enters text and saves the file, the program should display the number of words in that file and then delete the entered content from the file.
I am getting this error after running the program: Error not derjava.lang.NullPointerException
This happens even though I entered some text in Mytext.txt and saved it.
My question is: why is BufferedReader reading an empty file even though I am saving the file with some content?
I appreciate the help.
public class Notepad_Example {
public static void main(String[] jfb) {
try {
ProcessBuilder proc = new ProcessBuilder("notepad.exe", "C:\\Java Projects\\Reverse String\\src\\Mytext.txt");
proc.start();
BufferedReader br;
String s;
br = new BufferedReader(new FileReader("C:\\Java Projects\\Reverse String\\src\\Mytext.txt"));
s = br.readLine();
char c[] = new char[s.length()];
int j = 0;
for (int i = 0; i < s.length(); i++) {
if (s.charAt(i) != ' ') {
c[i] = s.charAt(i);
} else {
j++;
}
}
System.out.println("number of words are " + (j + 1));
br.close();
} catch (Exception hj) {
System.out.println("Error not der" + hj);
}
try {
FileWriter fw = new FileWriter("C:\\Java Projects\\Reverse String\\src\\Mytext.txt");
fw.close();
} catch (Exception hj) {
System.out.println("Error not der" + hj);
}
}
}
The issue you are having is here:
ProcessBuilder proc=new ProcessBuilder("notepad.exe","C:\\Java Projects\\Reverse String\\src\\Mytext.txt");
proc.start();
proc.start() is returning the freshly started process. You'll have to give the user the chance to edit and save the file and close the editor before you can read from that file. That is you have to wait for that process to finish before you can start using the results (the saved file) of that process.
So do instead something like this:
Process process = proc.start();
int result = process.waitFor();
if (result == 0) {
// Do your rest here
} else {
// give error message as the process did not finish without error.
}
Some further remarks:
The rest of your code also appears to have some issues.
You are only reading one line of that file. What if the user is using new lines?
The exception handling is not very good; at the very least, print the stack trace of the exception, which will give you further hints about where the exception occurred.
If you are using Java 7, read up on try-with-resources; if you are using Java 6, add finally blocks to make sure your resources (the streams) are getting closed.
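As a rough illustration of the last remarks (read every line, let try-with-resources close the reader), the counting part could be written along these lines. WordCount and countWords are names made up for this sketch, and it assumes the file is saved as UTF-8; a file that Notepad saved in another encoding may need a different Charset.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class WordCount {
    // Counts whitespace-separated words over all lines of the file.
    static int countWords(Path file) throws IOException {
        int words = 0;
        // try-with-resources closes the reader even if readLine() throws.
        try (BufferedReader br = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = br.readLine()) != null) {
                String trimmed = line.trim();
                if (!trimmed.isEmpty()) {
                    words += trimmed.split("\\s+").length;
                }
            }
        }
        return words;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("usage: WordCount <file>");
            return;
        }
        System.out.println("number of words: " + countWords(Paths.get(args[0])));
    }
}
```

Unlike the original loop, an empty file gives 0 here instead of a NullPointerException, because a null result from readLine() simply ends the loop.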
When you run proc.start(), it does not block and wait for the process to end; your program continues running.
You will need to call waitFor() on the Process returned by proc.start() to block until it has finished.
NOTE
we have had some weird behaviour when using the process builder...
we used to start the process with a
new ProcessBuilder("notepad.exe", "C:\\Java Projects\\Reverse String\\src\\Mytext.txt");
but that started to fail when we upgraded to Win7 and Java 7. We were not sure where this problem really originated, but we changed our code like this:
String[] cmd = new String[]{"notepad.exe", "C:\\Java Projects\\Reverse String\\src\\Mytext.txt"};
new ProcessBuilder(cmd);
and since then it has worked correctly!

Java Process.waitFor() and Readline hangs

First, this is my code :
import java.io.*;
import java.util.Date;
import com.banctecmtl.ca.vlp.shared.exceptions.*;
public class PowershellTest implements Runnable {
public static final String PATH_TO_SCRIPT = "C:\\Scripts\\ScriptTest.ps1";
public static final String SERVER_IP = "XX.XX.XX.XXX";
public static final String MACHINE_TO_MOD = "MachineTest";
/**
* @param args
* @throws OperationException
*/
public static void main(String[] args) throws OperationException {
new PowershellTest().run();
}
public PowershellTest(){}
@Override
public synchronized void run() {
String input = "";
String error = "";
boolean isHanging = false;
try {
Runtime runtime = Runtime.getRuntime();
Process proc = runtime.exec("powershell -file " + PATH_TO_SCRIPT +" "+ SERVER_IP +" "+ MACHINE_TO_MOD);
proc.getOutputStream().close();
InputStream inputstream = proc.getInputStream();
InputStreamReader inputstreamreader = new InputStreamReader(inputstream);
BufferedReader bufferedreader = new BufferedReader(inputstreamreader);
proc.waitFor();
String line;
while (!isHanging && (line = bufferedreader.readLine()) != null) {
input += (line + "\n");
Date date = new Date();
while(!bufferedreader.ready()){
this.wait(1000);
//if its been more then 1 minute since a line has been read, its hanging.
if(new Date().getTime() - date.getTime() >= 60000){
isHanging = true;
break;
}
}
}
inputstream.close();
inputstream = proc.getErrorStream();
inputstreamreader = new InputStreamReader(inputstream);
bufferedreader = new BufferedReader(inputstreamreader);
isHanging = false;
while (!isHanging && (line = bufferedreader.readLine()) != null) {
error += (line + "\n");
Date date = new Date();
while(!bufferedreader.ready()){
this.wait(1000);
//if its been more then 1 minute since a line has been read, its hanging.
if(new Date().getTime() - date.getTime() >= 60000){
isHanging = true;
break;
}
}
}
inputstream.close();
proc.destroy();
} catch (IOException e) {
//throw new OperationException("File IO problem.", e);
} catch (InterruptedException e) {
//throw new OperationException("Script thread problem.",e);
}
System.out.println("Error : " + error + "\nInput : " + input);
}
}
I'm currently trying to run a PowerShell script that will start/stop a VM (VMware) on a remote server. The script works from the command line and so does this code. The thing is, I hate having to use a thread (and make it wait for the script to respond, as explained further below) for such a job. I had to do it because both BufferedReader.readLine() and proc.waitFor() hang forever.
The script, when run from cmd, takes a long time to execute. It stalls for 30 seconds to 1 minute between validating authentication with the server and executing the actual script. From what I saw while debugging, readLine() hangs when it starts hitting those delays from the script.
I'm also pretty sure it's not a memory problem, since I never had any OOM error in any debugging session.
Now I understand that Process.waitFor() requires me to empty the buffers of both the error stream and the regular stream in order to work, and that's mainly why I don't use it (I need the output to manage VM-specific errors, certificate issues, etc.).
I would like to know if someone could explain to me why it hangs, and whether there is a way to just use the typical readLine() without it hanging so hard. Even when the script should have ended long ago, it still hangs (I tried running both the Java application and a cmd command doing exactly the same thing at the same time and left them running for 1 hour; nothing worked). It is not just stuck in the while loop; readLine() is where it hangs.
Also this is a test version, nowhere close to the final code, so please spare me the : this should be a constant, this is useless, etc. I will clean the code later. Also the IP is not XX.XX.XX.XXX in my code, obviously.
Either explanation or suggestion on how to fix would be greatly appreciated.
Oh, by the way, here is the script I currently use:
Add-PSSnapin vmware.vimautomation.core
Connect-VIServer -server $args[0]
Start-VM -VM "MachineTest"
If you need more details I will try to give as much as I can.
Thanks in advance for your help!
EDIT: I also previously tested the code with a less demanding script, whose job was to get the content of a file and print it. Since no waiting was needed to get the information, readLine() worked well. I'm thus fairly certain that the problem lies in the wait time coming from the script execution.
Also, forgive my errors, English is not my main language.
EDIT2 : Since I cannot answer to my own Question :
Here is my "final" code, after using threads :
import java.io.*;
public class PowershellTest implements Runnable {
public InputStream is;
public PowershellTest(InputStream newIs){
this.is = newIs;
}
@Override
public synchronized void run() {
String input = "";
String error = "";
try {
InputStreamReader inputstreamreader = new InputStreamReader(is);
BufferedReader bufferedreader = new BufferedReader(inputstreamreader);
String line;
while ((line = bufferedreader.readLine()) != null) {
input += (line + "\n");
}
is.close();
} catch (IOException e) {
//throw new OperationException("File IO problem.", e);
}
System.out.println("Error : " + error + "\nInput : " + input);
}
}
And the main simply creates and starts 2 threads (PowershellTest instances), one with the errorStream and one with the inputStream.
I believe I made a dumb error when I first coded the app and fixed it somehow as I reworked the code over and over. It still takes a good 5-6 minutes to run, which is similar to, if not longer than, my previous code (which is logical, since the errorStream and inputStream get their information sequentially in my case).
Anyway, thanks for all your answers, and especially to Miserable Variable for his hint on threading.
First, don't call waitFor() until after you've finished reading the streams. I would highly recommend you look at ProcessBuilder instead of simply using Runtime.exec, and split the command up yourself rather than relying on Java to do it for you:
ProcessBuilder pb = new ProcessBuilder("powershell", "-file", PATH_TO_SCRIPT,
SERVER_IP, MACHINE_TO_MOD);
pb.redirectErrorStream(true); // merge stdout and stderr
Process proc = pb.start();
redirectErrorStream merges the error output into the normal output, so you only have to read proc.getInputStream(). You should then be able to just read that stream until EOF, then call proc.waitFor().
You are currently waiting to complete reading from inputStream before starting to read from errorStream. If the process writes to its stderr before stdout maybe you are getting into a deadlock situation.
Try reading from both streams from concurrently running threads. While you are at it, also remove proc.getOutputStream().close();. It shouldn't affect the behavior, but it is not required either.
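Putting the first suggestion together (merge stderr into stdout, read to EOF, and only then call waitFor()), a minimal sketch could look like this. RunAndCapture and its Result holder are hypothetical names for the example. It assumes the child process eventually closes its output, and it has no timeout, so a child that never exits will still block the read loop.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.util.Arrays;
import java.util.List;

class RunAndCapture {
    // Simple holder for the exit code and the merged stdout/stderr text.
    static final class Result {
        final int exitCode;
        final String output;

        Result(int exitCode, String output) {
            this.exitCode = exitCode;
            this.output = output;
        }
    }

    static Result run(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.redirectErrorStream(true); // stderr goes into the same pipe as stdout
        Process proc = pb.start();

        StringBuilder out = new StringBuilder();
        // Drain the single merged stream until EOF so the child can never
        // block on a full pipe buffer.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(proc.getInputStream(), Charset.defaultCharset()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                out.append(line).append('\n');
            }
        }
        // Only after the output is fully read do we wait for the exit code.
        int exitCode = proc.waitFor();
        return new Result(exitCode, out.toString());
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        if (args.length == 0) {
            System.out.println("usage: RunAndCapture <command> [args...]");
            return;
        }
        Result r = run(Arrays.asList(args));
        System.out.println("exit code: " + r.exitCode);
        System.out.print(r.output);
    }
}
```

For the script in the question, the command list would be the same arguments as in the ProcessBuilder call above: "powershell", "-file", PATH_TO_SCRIPT, SERVER_IP, MACHINE_TO_MOD.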

input.read() func. lock in the while loop

After connecting to the server, I run some commands on it and then try to print the server's output to the console with:
while(i!=-1){
String c="";
String line = "";
try {
while ((i = input.read()) != 10 && i!=-1) {
bx[0] = (byte) i;
c = new String(bx);
line = line + c ;
System.out.print(c);
}
} catch (IOException e2) {
e2.printStackTrace();
}
File outfile = new File("calltrak.txt");
boolean append = true;
try
{
if (!outfile.exists())
{
append = false;
}
FileWriter fout1 = new FileWriter("calltrak.txt",append);
PrintWriter fileout = new PrintWriter(fout1,true);
fileout.println(line);
fileout.flush();
fileout.close();
} catch (IOException e1) {
e1.printStackTrace();
}
disp.append(line);
}
But the problem is that once the program has read all the lines from the server window, the server waits for new input while my program is still trying to read a line, so it locks up. How can I solve this problem? (Note: using a timer isn't a solution, because the program may read 100 lines or 100000, and sometimes the server can be slow.) (In the code, "disp" is the name of a JPanel.)
I solved this problem by using a parallel thread. When starting the InputStream read method, I also started another thread with a timer inside it. If the read method waits more than 5 seconds, the other thread sends -1 to the first loop, and so the loop terminates.
There are several performance problems with your code, but to answer your question: you should have the server send an End of Text (0x03) or End of Transmission (0x04) character at the end (see an ASCII table); that way you know when to stop reading.
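A reader that stops on such a sentinel might be sketched as below. This assumes the server really does send 0x04 after its last line and that the output is plain ASCII. SentinelReader and readUntilEot are invented names, and the ByteArrayInputStream in main only stands in for the real connection.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class SentinelReader {
    static final int EOT = 0x04; // ASCII End of Transmission

    // Reads lines until the stream ends or an EOT byte arrives,
    // whichever comes first. Bytes after EOT are left unread.
    static List<String> readUntilEot(InputStream in) throws IOException {
        List<String> lines = new ArrayList<>();
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1 && b != EOT) {
            if (b == '\n') {
                lines.add(new String(line.toByteArray(), StandardCharsets.US_ASCII));
                line.reset();
            } else if (b != '\r') { // drop carriage returns from CRLF line endings
                line.write(b);
            }
        }
        if (line.size() > 0) { // last line had no trailing newline
            lines.add(new String(line.toByteArray(), StandardCharsets.US_ASCII));
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        byte[] fake = "a\nbc\n\u0004ignored".getBytes(StandardCharsets.US_ASCII);
        System.out.println(readUntilEot(new ByteArrayInputStream(fake))); // prints [a, bc]
    }
}
```

Collecting the lines first and writing them to calltrak.txt once at the end would also avoid reopening the file for every line.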
