Multiple threads writing to one OutputStream and data loss - Java

I run five threads that generate random string data and then write to a single output stream. After the program finishes, some of the data is lost.
Here is a simplified version of my code:
new Thread(() ->
    stream.write(RANDOM_STRING + "\n")
).start();

class Stream {
    String buffer = "";

    Stream() {
        new Thread(() -> {
            try {
                BufferedOutputStream bs
                    = new BufferedOutputStream(new FileOutputStream("PATH"));
                bs.write(buffer.getBytes()); // point 1
                buffer = "";                 // point 2
                bs.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }).start();
    }

    public void write(String input) {
        buffer += input;
    }
}
I think the cause of the data loss is between point 1 and point 2. If I used an indexing data structure to check which data has already been consumed, I think it could be solved, but is there a better way to solve this problem? Please help me. Thanks.

Try using a ConcurrentLinkedQueue<String> for the buffer, with its offer and poll methods instead of += and = "" on a String reference.
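A minimal sketch of that idea (assuming the writer thread is meant to run in a loop; shutdown handling and the busy-wait are left unhandled to keep it short):

import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.concurrent.ConcurrentLinkedQueue;

class Stream {
    // Thread-safe queue: producer threads offer() lines, the writer thread poll()s them.
    private final ConcurrentLinkedQueue<String> buffer = new ConcurrentLinkedQueue<>();

    Stream() {
        new Thread(() -> {
            try (BufferedOutputStream bs =
                     new BufferedOutputStream(new FileOutputStream("PATH"))) {
                while (true) {
                    // poll() atomically removes exactly one element, or returns null
                    String line = buffer.poll();
                    if (line != null) {
                        bs.write(line.getBytes());
                    }
                    // a BlockingQueue would avoid this busy-spin; kept simple here
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
        }).start();
    }

    public void write(String input) {
        buffer.offer(input);
    }
}

Unlike += followed by = "", each element is either still in the queue or has already been handed to the writer thread, so nothing can be overwritten in between.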

Related

Can a Java InputStream continuously read data from a method?

I have a piece of code
...
InputStream inputStream = new BufferedInputStream(new ByteArrayInputStream("test".getBytes()));
...
This line makes the string "test" the input for an InputStream; however, this is a static InputStream.
Is there any way, without a Scanner, System.in, or other external user input, to make this InputStream dynamic?
What I need is something like this:
...
InputStream inputStream = new BufferedInputStream(new
ByteArrayInputStream(generateContinuousDynamicString().getBytes()));
// So, basically input stream will be blocked until generateContinuousDynamicString()
// returns a result?
...
I've tried something like this:
private static byte[] generateContinuousDynamicString(String s) {
    String t = "";
    // here comes the realization
    // that the source for an input stream
    // cannot be generated dynamically on the
    // fly; it can only be read from an already
    // existing (fully generated and available)
    // resource. Am I right? Otherwise, how
    // can I adjust this method in such a way that
    // the input stream would continuously have a new
    // string to read from?
    for (int i = 0; i < 1000; i++) {
        t += "<str>" + s + i + "</str>";
    }
    return ("<test>" + t + "</test>").getBytes();
}
So, if we have
...
InputStream inputStream = new BufferedInputStream(readFromADatabaseStream());
...
This is also not a dynamic input stream, as the resource already exists in the database.
You want a pipe. Specifically, you want one of the following pairs of classes:
PipedInputStream and PipedOutputStream
PipedReader and PipedWriter
Your question asks for an InputStream, but since you’re dealing with text, you probably should use a Reader, which is intended for characters. In particular, note that getBytes() will return different values on Windows systems compared to non-Windows systems, for any String with non-ASCII characters. Using a Reader and Writer will remove the need to worry about that.
Either way, the approach is the same: create the readable end of the pipe, then create and feed the writable end of the pipe in another thread.
Using a PipedReader and PipedWriter:
PipedReader pipedReader = new PipedReader();
Reader reader = new BufferedReader(pipedReader);

ExecutorService executor = Executors.newSingleThreadExecutor();
Future<?> pipeFeeder = executor.submit(
        () -> generateContinuousDynamicString(pipedReader));

// ...

private Void generateContinuousDynamicString(PipedReader pipedReader)
        throws IOException {
    try (Writer writer = new PipedWriter(pipedReader)) {
        writer.write("<test>");
        for (int i = 0; i < 1000; i++) {
            writer.write("<str>" + i + "</str>");
        }
        writer.write("</test>");
    }
    return null;
}
Using a PipedInputStream and PipedOutputStream:
PipedInputStream pipedInputStream = new PipedInputStream();
InputStream inputStream = new BufferedInputStream(pipedInputStream);

ExecutorService executor = Executors.newSingleThreadExecutor();
Future<?> pipeFeeder = executor.submit(
        () -> generateContinuousDynamicString(pipedInputStream));

// ...

private Void generateContinuousDynamicString(PipedInputStream pipedInputStream)
        throws IOException {
    try (Writer writer = new OutputStreamWriter(
            new PipedOutputStream(pipedInputStream),
            StandardCharsets.UTF_8)) {
        writer.write("<test>");
        for (int i = 0; i < 1000; i++) {
            writer.write("<str>" + i + "</str>");
        }
        writer.write("</test>");
    }
    return null;
}
Sure. But you have a bit of an issue: whatever code is generating the endless stream of dynamic data cannot just sit in the method that 'returns the inputstream' by itself; that's what your realisation is about.
You have two major options:
Threads
Instead, you could fire off a thread which is continually generating data. Note that whatever it 'generates' needs to be cached; this is not a good fit if, say, you want to dynamically generate an inputstream that just serves up an endless amount of 0 bytes, for example. It's a good fit if the data is coming from, say, a USB connected arduino that from time to time sends information about a temperature sensor that it's connected to. Note that you need the thread to store the data it receives someplace, and then have an inputstream that will 'pull' from this queue of data you're making. To make an inputstream that pulls from a queue, see the next section. As this will involve threads, use something from java.util.concurrent, such as ArrayBlockingQueue - this has the double benefit that you won't get infinite buffers, either (the act of putting something in the buffer will block if the buffer is full).
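A minimal sketch of an InputStream that pulls from a bounded queue (the class and method names here are assumptions, not part of the original answer; a byte-per-element queue is wasteful, but it keeps the idea visible):

import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class QueueBackedInputStream extends InputStream {
    // Bounded buffer: put() blocks the producer when it is full.
    private final BlockingQueue<Byte> queue = new ArrayBlockingQueue<>(8192);

    // Called by the producer thread (e.g. the one reading the Arduino).
    public void feed(byte[] data) throws InterruptedException {
        for (byte b : data) {
            queue.put(b);
        }
    }

    @Override
    public int read() throws IOException {
        try {
            return queue.take() & 0xFF; // blocks until the producer delivers data
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IOException("interrupted while waiting for data", e);
        }
    }
}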
Subclassing
What you can also do is take the code that can generate new values and put it in an envelope: a thing you can pass around. You want to make some code but not run it yet; you want to run it later, when the thing you hand the InputStream to calls .read().
One easy way to do that is to extend InputStream and implement your own read() method. It looks something like this:
class InfiniteZeroesInputStream extends InputStream {
    @Override
    public int read() {
        return 0;
    }
}
It's that simple. Given:
try (InputStream in = new InfiniteZeroesInputStream()) {
    in.read();  // returns 0... and will always do so.
    byte[] b = new byte[65536];
    in.read(b); // fills the whole array with zeroes.
}

Reading all process output before process closes

I'm reading a separate process's console output using this standard code:
ProcessBuilder b = new ProcessBuilder(exeArgs);
b.redirectErrorStream(true);
Process process = b.start();

try (BufferedReader inputReader = new BufferedReader(
        new InputStreamReader(process.getInputStream()))) {
    String output;
    while ((output = inputReader.readLine()) != null) {
        // do something with output
    }
}
The problem is that if the process ends faster than I read output, a portion of the output is lost.
The easiest way to reproduce the problem is to put a breakpoint on the try expression and wait for a second before continuing program execution. Doing this will prevent you from getting any output at all. E.g. inputReader.readLine() will never return a string.
Is there a way to cache the process output to ensure it's always read completely?
It seems that on Windows the console buffer size is limited and you should read it as quickly as possible to avoid buffer overflow and losing a portion of the data. The only reliable solution I found, after reading several answers here and blog posts on the topic, is to read the console output in a separate thread, as quickly as possible.
Here is a simple code implementing the task in Kotlin:
typealias GobblerCallback = ((e: String) -> Unit)

class StreamGobbler(private val stream: InputStream, private val lineReadCallback: GobblerCallback? = null) : Thread() {
    override fun run() {
        try {
            val reader = BufferedReader(InputStreamReader(stream, StandardCharsets.UTF_8))
            while (true) {
                val line = safeReadLine(reader) ?: break
                lineReadCallback?.invoke(line)
            }
        } catch (e: Exception) {
            // Handle exception if necessary
        }
    }
}
// Sample code that reads full process output
val gobbler1 = StreamGobbler(process.inputStream, callback)
gobbler1.start()
val gobbler2 = StreamGobbler(process.errorStream, callback)
gobbler2.start()
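For a Java codebase, a roughly equivalent sketch (the class shape follows the Kotlin version above; it uses a plain readLine() where the original calls a safeReadLine helper that is not shown):

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;

class StreamGobbler extends Thread {
    private final InputStream stream;
    private final Consumer<String> lineReadCallback;

    StreamGobbler(InputStream stream, Consumer<String> lineReadCallback) {
        this.stream = stream;
        this.lineReadCallback = lineReadCallback;
    }

    @Override
    public void run() {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(stream, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lineReadCallback.accept(line);
            }
        } catch (IOException e) {
            // handle exception if necessary
        }
    }
}

// Sample usage, mirroring the Kotlin snippet above:
StreamGobbler gobbler1 = new StreamGobbler(process.getInputStream(), System.out::println);
gobbler1.start();
StreamGobbler gobbler2 = new StreamGobbler(process.getErrorStream(), System.out::println);
gobbler2.start();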

Heap size issue - Memory management using Java

I have the following code in my application which does two things:
Parse the file, which contains 'n' records.
For each record in the file, make two web service calls.
public static List<String> parseFile(String fileName) {
    List<String> idList = new ArrayList<String>();
    try {
        BufferedReader cfgFile = new BufferedReader(new FileReader(new File(fileName)));
        String line = null;
        cfgFile.readLine(); // skip the first line
        while ((line = cfgFile.readLine()) != null) {
            if (!line.trim().equals("")) {
                String[] fields = line.split("\\|");
                idList.add(fields[0]);
            }
        }
        cfgFile.close();
    } catch (IOException e) {
        System.out.println(e + " Unexpected File IO Error.");
    }
    return idList;
}
When I try to parse a file with 1 million lines of records, the Java process fails after processing a certain amount of data with java.lang.OutOfMemoryError: Java heap space. I can partly figure out that the Java process stops because of this huge amount of data. Kindly suggest how I should proceed with this much data.
EDIT: Does this part of the code, new BufferedReader(new FileReader(new File(fileName))), read the whole file, so that it is affected by the size of the file?
The problem you have is that you are accumulating all the data in the list. The best way to approach this is to do it in a streaming fashion. This means do not accumulate all the ids in the list; instead, call your web service on each row, or accumulate a smaller buffer and then make the call.
Opening the file and creating the BufferedReader will have no impact on memory consumption, as the bytes from the file will be read (more or less) line by line. The problem is at this point in the code: idList.add(fields[0]). The list will grow as large as the file, because you keep accumulating all of the file's data in it.
Your code should do something like this:
while ((line = cfgFile.readLine()) != null) {
    if (!line.trim().equals("")) {
        String[] fields = line.split("\\|");
        callToRemoteWebService(fields[0]);
    }
}
Increase your Java heap size using the -Xms and -Xmx options. If not set explicitly, the JVM sets the heap size to ergonomic defaults, which in your case are not enough. Read this paper to find out more about tuning memory in the JVM: http://www.oracle.com/technetwork/java/javase/tech/memorymanagement-whitepaper-1-150020.pdf
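For example (the sizes here are illustrative, not a recommendation):

java -Xms512m -Xmx2g com.example.Main

-Xms sets the initial heap size and -Xmx sets the maximum heap size.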
EDIT: Here is an alternative way of doing this in a producer-consumer style, to exploit parallel processing. The general idea is to create a producer thread that reads the file and queues tasks for processing, and n consumer threads that consume them. A very general idea (for illustrative purposes) is the following:
// blocking queue holding the tasks to be executed
final SynchronousQueue<Callable<Void>> queue = new SynchronousQueue<>();

// reads the file and submits tasks for processing
final Runnable producer = new Runnable() {
    public void run() {
        BufferedReader in = null;
        try {
            in = new BufferedReader(new FileReader(new File(fileName)));
            String line = null;
            while ((line = in.readLine()) != null) {
                if (!line.trim().equals("")) {
                    final String[] fields = line.split("\\|");
                    // this will block if there are no consumer threads available to process it...
                    queue.put(new Callable<Void>() {
                        public Void call() {
                            process(fields);
                            return null;
                        }
                    });
                }
            }
        } catch (IOException e) {
            // handle I/O errors here...
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            // close the buffered reader here...
        }
    }
};

// Consumes the tasks submitted by the producer. Consumers can be pooled
// for parallel processing.
final Runnable consumer = new Runnable() {
    public void run() {
        try {
            while (true) {
                // this method blocks if there are no items left for processing in the queue...
                Callable<Void> task = queue.take();
                task.call();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } catch (Exception e) {
            // handle task failures here...
        }
    }
};
Of course you have to write code that manages the lifecycle of the consumer and producer threads. The right way to do this would be by implementing it using an Executor.
When you want to work with big data, you have two choices:
use a big enough heap to fit all the data. This will "work" for a while, but if your data size is unbounded, it will eventually fail.
work with the data incrementally, keeping only part of the data (of a bounded size) in memory at any one time. This is the ideal solution, as it will scale to any amount of data; a sketch follows below.
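A minimal sketch of the second, incremental approach applied to the question's parseFile (BATCH_SIZE and processBatch are illustrative assumptions, not part of the original answer):

public static void processFileInBatches(String fileName) throws IOException {
    final int BATCH_SIZE = 1000; // bounded memory: at most 1000 ids at a time
    List<String> batch = new ArrayList<>(BATCH_SIZE);
    try (BufferedReader cfgFile = new BufferedReader(new FileReader(fileName))) {
        cfgFile.readLine(); // skip the first line, as in the original code
        String line;
        while ((line = cfgFile.readLine()) != null) {
            if (!line.trim().equals("")) {
                batch.add(line.split("\\|")[0]);
                if (batch.size() == BATCH_SIZE) {
                    processBatch(batch);
                    batch.clear();
                }
            }
        }
        if (!batch.isEmpty()) {
            processBatch(batch); // flush the final partial batch
        }
    }
}

// Hypothetical stand-in for the two web service calls per id.
static void processBatch(List<String> batch) {
    // call the web services here...
}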

How can I print .exe printf() messages from a Java program

I have an application that prints messages from Test.exe in the console. My Java program creates a process by executing this Test.exe.
The application prints the messages by reading from the input stream of that process.
The problem that I am facing is this. I have two scenarios:
1) When I double-click Test.exe, the messages ("Printing : %d") are printed every second.
2) But when I run my Java application, the whole output is printed at the end (not every second), just before Test.exe terminates. If the .exe has a very large amount of output to print, then it prints those messages earlier (I think whenever the buffer becomes full) and flushing is done.
But how can I print the messages as in the first case?
Help from anyone would be appreciated. :)
Here is the code for this Test.exe.
#include <stdio.h>
#include <windows.h>

int main(void)
{
    int i = 0;
    while (1)
    {
        Sleep(500);
        printf("\nPrinting : %d", i);
        i++;
        if (i == 10)
        //if (i == 100)
        {
            return 0;
        }
    }
}
And my Java application is below:
public class MainClass {
    public static void main(String[] args) {
        String str = "G:\\Charan\\Test\\Debug\\Test.exe";
        try {
            Process testProcess = Runtime.getRuntime().exec(str);
            InputStream inputStream = new BufferedInputStream(
                    testProcess.getInputStream());
            int read = 0;
            byte[] bytes = new byte[1000];
            String text;
            while (read >= 0) {
                if (inputStream.available() > 0) {
                    read = inputStream.read(bytes);
                    if (read > 0) {
                        text = new String(bytes, 0, read);
                        System.out.println(text);
                    }
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
Is it possible in the reverse direction? If I type some text in the console, Java should read it and pass that String to the .exe (the testProcess). How can the .exe scan something from the Java program?
Could anyone help me?
Given that you're trying to print stdout from that process line by line, I would create a BufferedReader object using the process's input stream and use the readLine() method on that. You can get a BufferedReader object using the following chain of constructors:
BufferedReader testProcessReader = new BufferedReader(new InputStreamReader(testProcess.getInputStream()));
And to read line by line:
String line;
while ((line = testProcessReader.readLine()) != null) {
    System.out.println(line);
}
The assumption here is that Test.exe is flushing its output, which is required by any read from the Java side. You can flush the output from C by calling fflush(stdout) after every call to printf().
If you don't flush, the data only lives in a buffer. When considering performance, it's a trade-off, how often you want the data to be written vs. how many writes / flush operations you want to save. If performance is critical, you can consider looking into a more efficient inter-process communication mechanism to pass data between the processes instead of stdout. Since you are on Windows, the first step might be to take a look at the Microsoft IPC help page.
This seems to have something to do with not flushing. I guess it's on both sides: the C library you use seems to flush output automatically only when writing to a terminal. Flush manually after calling printf.
On the Java side, try reading from a non-buffered stream.
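As for the reverse direction asked in the question, here is a minimal sketch (it assumes Test.exe actually reads its stdin, e.g. with fgets() or scanf(); the string sent is just an example, and this would sit inside the existing try/catch reusing testProcess from the question's code):

// Writable end of the pipe connected to Test.exe's stdin.
BufferedWriter toProcess = new BufferedWriter(
        new OutputStreamWriter(testProcess.getOutputStream()));
toProcess.write("hello from Java");
toProcess.newLine();
toProcess.flush(); // flush here too, or the child may never see the data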

Java: Pause thread and get position in file

I'm writing a multithreaded application in Java which I want to be able to pause and resume.
The thread reads a file line by line, finding lines that match a pattern, and it has to continue from the place where I paused it. To read the file I use a BufferedReader in combination with an InputStreamReader and a FileInputStream:
fip = new FileInputStream(new File(file));
fileBuffer = new BufferedReader(new InputStreamReader(fip));
I use this FileInputStream because I need the file pointer for the position in the file.
When processing the lines, it writes the matching lines to a MySQL database. To share the MySQL connection between the threads, I use a ConnectionPool to make sure just one thread is using one connection at a time.
The problem is that when I pause the threads and resume them, a few matching lines simply disappear. I also tried subtracting the buffer size from the offset, but it still has the same problem.
What is a decent way to solve this problem, or what am I doing wrong?
Some more details:
The loop
// Regex engine
RunAutomaton ra = new RunAutomaton(this.conf.getAuto(), true);
lw = new LogWriter();
while ((line = fileBuffer.readLine()) != null) {
    if (line.length() > 0) {
        if (ra.run(line)) {
            // Write to LogWriter
            lw.write(line, this.file.getName());
            lw.execute();
        }
    }
}
// Loop when paused.
while (pause) { }
}
Calculating place in file
// Get the position in the file
public long getFilePosition() throws IOException {
    long position = fip.getChannel().position() - bufferSize + fileBuffer.getNextChar();
    return position;
}
Putting it into the database
// Get the connector
ConnectionPoolManager cpl = ConnectionPoolManager.getManager();
Connector con = null;
while (con == null)
    con = cpl.getConnectionFromPool();

// Insert the query
con.executeUpdate(this.sql.toString());
cpl.returnConnectionToPool(con);
Here's an example of what I believe you're looking for. You didn't show much of your implementation so it's hard to debug what might be causing gaps for you. Note that the position of the FileInputStream is going to be a multiple of 8192 because the BufferedReader is using a buffer of that size. If you want to use multiple threads to read the same file you might find this answer helpful.
public class ReaderThread extends Thread {
    private final FileInputStream fip;
    private final BufferedReader fileBuffer;
    private volatile boolean paused;

    public ReaderThread(File file) throws FileNotFoundException {
        fip = new FileInputStream(file);
        fileBuffer = new BufferedReader(new InputStreamReader(fip));
    }

    public void setPaused(boolean paused) {
        this.paused = paused;
    }

    public long getFilePos() throws IOException {
        return fip.getChannel().position();
    }

    public void run() {
        try {
            String line;
            while ((line = fileBuffer.readLine()) != null) {
                // process your line here
                System.out.println(line);
                while (paused) {
                    sleep(10);
                }
            }
        } catch (IOException e) {
            // handle I/O errors
        } catch (InterruptedException e) {
            // handle interrupt
        }
    }
}
I think the root of the problem is that you shouldn't be subtracting bufferSize. Rather you should be subtracting the number of unread characters in the buffer. And I don't think there's a way to get this.
The easiest solution I can think of is to create a custom subclass of FilterReader that keeps track of the number of characters read. Then stack the streams as follows:
FileReader
< BufferedReader
< custom filter reader
< BufferedReader(sz == 1)
The final BufferedReader is there so that you can use readLine ... but you need to set the buffer size to 1 so that the character count from your filter matches the position that the application has reached.
Alternatively, you could implement your own readLine() method in the custom filter reader.
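A minimal sketch of such a counting filter (the class name and the getCharsRead() accessor are assumptions):

import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;

class CountingReader extends FilterReader {
    private long charsRead = 0;

    CountingReader(Reader in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int c = super.read();
        if (c != -1) charsRead++;
        return c;
    }

    @Override
    public int read(char[] cbuf, int off, int len) throws IOException {
        int n = super.read(cbuf, off, len);
        if (n > 0) charsRead += n;
        return n;
    }

    public long getCharsRead() {
        return charsRead;
    }
}

Stacked as described above, it would be used like this:

CountingReader counting = new CountingReader(new BufferedReader(new FileReader(file)));
BufferedReader lineReader = new BufferedReader(counting, 1); // sz == 1, so the count stays in step with readLine()
// ... lineReader.readLine() ...
long position = counting.getCharsRead(); // characters consumed so far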
After a few days of searching I found out that subtracting the buffer size and adding the position in the buffer was indeed not the right way to do it. The position was never right, and I was always missing some lines.
While searching for a new way to do the job, I didn't count the number of characters, because there are just too many characters to count and it would decrease performance a lot. But I found something else. Software engineer Mark S. Kolich created a class, JumpToLine, which uses the Apache IO library to jump to a given line. It can also provide the last line it has read, so this is really what I need.
There are some examples on his homepage for those interested.
