Can't understand piped inputstream - java

I am trying to understand piped streams.
Instead of piped streams, why can't we use other streams and pipe them into each other, as below?
final ByteArrayOutputStream pos = new ByteArrayOutputStream();
final ByteArrayInputStream pis = new ByteArrayInputStream(pos.toByteArray());
And when will a piped stream deadlock? I tried reading and writing from the single main thread, and it ran smoothly.

The difficulty here is that the process must be spread over several threads, because each write to one end of the pipe must be matched by a read at the other end.
It is certainly not difficult to create a thread that watches for data arriving at the end of one pipe and pushes it through another pipe, but this cannot be done with a single thread.
Have you looked at this question?

Piped streams allow for efficient byte-by-byte processing with minimal effort.
I could very well be wrong, but I believe toByteArray() does not do what you think it does. It copies only the current contents of the output stream; anything written afterwards never reaches the ByteArrayInputStream built from that copy.
So to keep the two streams in step you would have to manage this yourself, which is more difficult: you would have to poll the output stream constantly and rebuild the input stream each time. On top of that, every call to toByteArray allocates memory (the documentation says it "Creates a newly allocated byte array").
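A minimal sketch of the snapshot behaviour described above. The class and method names (SnapshotDemo, bytesVisibleAfterLateWrite) are made up for illustration; only the java.io calls are real.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

class SnapshotDemo {
    /** Returns how many bytes the reader can see after a later write to the output stream. */
    static int bytesVisibleAfterLateWrite() {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(1);
        out.write(2);

        // toByteArray() copies the two bytes written so far into a new array.
        ByteArrayInputStream in = new ByteArrayInputStream(out.toByteArray());

        // This later write lands in the output stream's own buffer, not in the copy.
        out.write(3);

        // The input stream still only knows about the two copied bytes.
        return in.available();
    }

    public static void main(String[] args) {
        System.out.println("bytes visible to the reader: " + bytesVisibleAfterLateWrite());
    }
}
```

The third byte stays invisible to the reader unless a new ByteArrayInputStream is built from a fresh toByteArray() copy.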
How I suspect deadlocks may happen in a single thread:
If you do a blocking read from an input stream that has no data yet, the read can never complete: the data can only come from the output stream, which would have to be written from the same thread, and that thread is stuck waiting for data. The same applies in the other direction, since the pipe's internal buffer is small (1024 bytes by default) and a write blocks once it is full and nothing is reading.
So a single thread will deadlock if you are not very careful. It should be possible to use both ends in the same thread without deadlocking, as long as every read finds data already written and the writes fit in the buffer, but why would you want to? I think another data structure, such as a linked list or a simple circular array, may be better suited.
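For comparison, here is a sketch of the usual two-thread arrangement, assuming one writer thread and the calling thread as reader. The names PipeDemo and transfer are hypothetical. It sends more bytes than the pipe's buffer holds, so the writer has to block and wait for the reader several times.

```java
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.UncheckedIOException;

class PipeDemo {
    /** Writes byteCount bytes in one thread, reads them in the caller, returns the number read. */
    static int transfer(int byteCount) {
        try {
            PipedOutputStream pos = new PipedOutputStream();
            PipedInputStream pis = new PipedInputStream(pos);

            Thread writer = new Thread(() -> {
                // Closing the output end is what lets the reader see end-of-stream.
                try (PipedOutputStream out = pos) {
                    for (int i = 0; i < byteCount; i++) {
                        out.write(i & 0xFF); // blocks while the pipe buffer is full
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            writer.start();

            int total = 0;
            try (PipedInputStream in = pis) {
                while (in.read() != -1) { // blocks while the pipe buffer is empty
                    total++;
                }
            }
            writer.join();
            return total;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("bytes transferred: " + transfer(5000));
    }
}
```

Doing the same 5000 writes and then the reads from a single thread would hang at the first write that finds the buffer full.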

Related

Using a file as a long-term buffer

I am receiving an infinite stream of data as input and I am writing it to a file "out.dat". At the same time, I have to process the data with a slow function process_data(). What I would like to do is to have a thread that reads data from the file "out.dat" and processes it with process_data(). So, in the main thread, I receive the input data and write it to "out.dat", in the second I read the data from "out.dat" and process it.
(The reason for not using an exchange buffer between the two threads is that it should be quite big as the input stream is fast and process_data() is slow. Over time, the exchange buffer will grow too much.)
It seems that using FileChannel would be appropriate. In the documentation I read "The view of a file provided by an instance of this class is guaranteed to be consistent with other views of the same file provided by other instances in the same program.".
However, I am not sure since it does not specify what "consistent" really means. My need is to write data to "out.dat" and, later, from another thread, read (FileChannel#read) the same data again. Does "consistent" mean that once I write the data, say at position 100000, when I do read(byteBuffer, 100000) I get the same data?
Do I have to call FileChannel#force to force the writes to be saved to disk?
Thank you in advance.
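The scenario in the question can be tried out with a small sketch like the one below. It is only an illustration under assumptions: it uses a temporary file rather than "out.dat", runs the write and the read in one thread to stay deterministic, and the names ChannelRoundTrip and writeThenRead are invented. It does not call force(), which as far as I can tell from the documentation concerns getting updates onto the storage device rather than making them visible to other channels in the same program.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class ChannelRoundTrip {
    /** Writes text at the given position through one channel and reads it back through another. */
    static String writeThenRead(String text, long position) {
        try {
            Path file = Files.createTempFile("out", ".dat");
            try {
                try (FileChannel writer = FileChannel.open(file, StandardOpenOption.WRITE);
                     FileChannel reader = FileChannel.open(file, StandardOpenOption.READ)) {
                    byte[] payload = text.getBytes(StandardCharsets.UTF_8);

                    // Positional write: does not depend on the channel's own position.
                    ByteBuffer src = ByteBuffer.wrap(payload);
                    long writePos = position;
                    while (src.hasRemaining()) {
                        writePos += writer.write(src, writePos);
                    }

                    // Positional read from a second channel on the same file.
                    ByteBuffer dst = ByteBuffer.allocate(payload.length);
                    long readPos = position;
                    while (dst.hasRemaining()) {
                        int n = reader.read(dst, readPos);
                        if (n < 0) {
                            break; // reached end of file early
                        }
                        readPos += n;
                    }
                    return new String(dst.array(), 0, dst.position(), StandardCharsets.UTF_8);
                }
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(writeThenRead("hello", 100000L));
    }
}
```

With two threads, the reader would additionally need to know how far the writer has got, for example through a shared counter of bytes written; that coordination is not shown here.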

What is the use of flush method in serializing an object after writing the object in to the stream? [duplicate]

In Java, the flush() method is used with streams, but I don't understand what the purpose of this method is.
fin.flush();
Any suggestions?
From the docs of the flush method:
Flushes the output stream and forces any buffered output bytes to be written out. The general contract of flush is that calling it is an indication that, if any bytes previously written have been buffered by the implementation of the output stream, such bytes should immediately be written to their intended destination.
The buffering is mainly done to improve the I/O performance. More on this can be read from this article: Tuning Java I/O Performance.
When you write data to a stream, it is often not written out immediately; it is buffered first. So use flush() when you need to be sure that all the data in the buffer has been written.
We need to be sure that all the writes are completed before we close the stream, and that is why flush() is called in file/buffered writer's close().
But if you have a requirement that all your writes be saved anytime before you close the stream, use flush().
When we write output, the data is first stored in a temporary memory location called a buffer. When the buffer is full it is written out automatically; calling flush() writes out whatever the buffer currently holds and frees the space for new data.
I hope this helps.
If the buffer is full, everything buffered in it is saved to disk. Buffers are used to avoid the overhead of many small writes.
The BufferedWriter class in the Java standard library contains a line like this:
private static int defaultCharBufferSize = 8192;
If you do want to send data before the buffer is full, you do have control. Just flush it. Calls to writer.flush() say, "send whatever's in the buffer, now!"
Reference: Head First Java, page 453, https://www.amazon.com/Head-First-Java-Kathy-Sierra/dp/0596009208
In addition to other good answers here, this explanation made it very clear for me:
A buffer is a portion in memory that is used to store a stream of data
(characters). These characters sometimes will only get sent to an
output device (e.g. monitor) when the buffer is full or meets a
certain number of characters. This can cause your system to lag if you
just have a few characters to send to an output device. The flush()
method will immediately flush the contents of the buffer to the output
stream.
Source: https://www.youtube.com/watch?v=MjK3dZTc0Lg
Streams are often accessed by threads that periodically empty their content and, for example, display it on the screen, send it to a socket or write it to a file. This is done for performance reasons. Flushing an output stream means that you want to stop, wait for the content of the stream to be completely transferred to its destination, and then resume execution with the stream empty and the content sent.
For performance reasons, data is first written into a buffer. When the buffer is full, the data is written to the output (file, console, etc.). When the buffer is only partially filled and you want to send it to the output anyway, you need to call flush() yourself.
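The effect can be observed with a small sketch, assuming a BufferedOutputStream wrapped around an in-memory ByteArrayOutputStream so the destination can be inspected. The names FlushDemo and sizesBeforeAndAfterFlush are made up for the example.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class FlushDemo {
    /** Returns the size of the destination before and after flush(). */
    static int[] sizesBeforeAndAfterFlush() {
        ByteArrayOutputStream destination = new ByteArrayOutputStream();
        BufferedOutputStream buffered = new BufferedOutputStream(destination);
        try {
            byte[] data = "hello".getBytes(StandardCharsets.US_ASCII);
            buffered.write(data, 0, data.length);

            // The five bytes are still sitting in the BufferedOutputStream's buffer.
            int before = destination.size();

            // flush() pushes the partially filled buffer to the destination.
            buffered.flush();
            int after = destination.size();

            return new int[] {before, after};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] sizes = sizesBeforeAndAfterFlush();
        System.out.println("before flush: " + sizes[0] + ", after flush: " + sizes[1]);
    }
}
```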

FileReader or FileWriter object flush method invocation in java

I read that invoking the flush method guarantees that the last of the data you thought you had already written actually gets out to the file. I didn't understand this statement. Can anyone explain clearly what a flush method invocation actually does?
Writers are usually buffered, so they wait for the buffer to fill before writing it to the file. Flush tells the writer to write the buffer out even though it might not be full yet. It is mostly useful when you finish writing, since the last buffer may not be full but you still want it written.
Many streams have internal buffers which they use to store data before it is passed on. This prevents a file stream from having to continually write each individual byte to disk (which can be quite expensive). The flush command forces a stream to clear its internal buffers so that, in this case, everything is forced to disk.
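The "last buffer" point can be illustrated with a sketch that checks the file size while the writer is still open and again after it is closed. Some assumptions: it uses Files.newBufferedWriter with UTF-8 on a temporary file instead of a FileWriter, so the byte count is predictable, and the names CloseFlushesDemo and fileSizeBeforeAndAfterClose are invented.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class CloseFlushesDemo {
    /** Returns the file size while the writer is still open and after it has been closed. */
    static long[] fileSizeBeforeAndAfterClose() {
        try {
            Path file = Files.createTempFile("flush-demo", ".txt");
            try {
                long[] sizes = new long[2];
                try (BufferedWriter writer = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
                    writer.write("last line\n");
                    // The text is still in the writer's buffer, so the file is empty.
                    sizes[0] = Files.size(file);
                } // close() flushes the partially filled buffer
                sizes[1] = Files.size(file);
                return sizes;
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] sizes = fileSizeBeforeAndAfterClose();
        System.out.println("while open: " + sizes[0] + " bytes, after close: " + sizes[1] + " bytes");
    }
}
```

Calling writer.flush() before the size check would make the data appear in the file without closing the writer.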

Reading a file by multiple threads

I have a 250 MB file to read, and the application is multithreaded. If I allow all threads to read the file, the application runs out of memory.
I get an out of memory error.
To avoid this, I want to have only one copy of the String (which is read from the stream) in memory, and I want all the threads to use it.
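while (true) {
    String str;
    synchronized (buffer) {
        num = is.read(buffer);
        if (num < 0) {
            break; // end of stream, nothing more to send
        }
        str = new String(buffer, 0, num);
    }
    // str is declared outside the synchronized block so it is still in scope here
    sendToPC(str);
}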
Basically, I want to have only one copy of the string. When all threads have finished sending it, I want to read the second string, and so on.
Why multiple threads? You only have one disk and it can only go so fast. Multithreading it won't help, almost certainly. And any software design that relies on having an entire file in memory is seriously flawed in the first place.
Suppose you define your problem?
I realize this is kind of late, but I think what you want here is the map function in the FileChannel class. Once you map a region of the file into memory, all of your threads can read or write that block of memory, and the OS will synchronize the memory region with the file periodically (or when you call MappedByteBuffer.force()). If you want each thread to work with a different part of the file, you can create several maps, each covering a specific region of the file, and use one map per thread.
See the Javadoc for FileChannel, RandomAccessFile, and MappedByteBuffer.
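A rough sketch of the mapping idea, under several assumptions: the file is small enough to map in one piece, the threads only read, and each thread gets its own duplicate() view of the mapping because buffers are not documented as safe for concurrent use. The names MappedSharedRead and sumWithThreads are hypothetical, and summing bytes merely stands in for real processing.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MappedSharedRead {
    /** Writes content to a temp file, maps it once, and sums its bytes using several threads. */
    static long sumWithThreads(byte[] content, int threadCount) {
        try {
            Path file = Files.createTempFile("mapped", ".bin");
            file.toFile().deleteOnExit(); // a mapped file may not be deletable right away
            Files.write(file, content);

            try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
                // One mapping shared by all threads: the file data is not copied per thread.
                MappedByteBuffer map = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());

                long[] partial = new long[threadCount];
                Thread[] threads = new Thread[threadCount];
                int length = content.length;
                int chunk = (length + threadCount - 1) / threadCount;

                for (int t = 0; t < threadCount; t++) {
                    final int id = t;
                    final int start = Math.min(t * chunk, length);
                    final int end = Math.min(start + chunk, length);
                    // duplicate() shares the mapped memory but has its own position and limit.
                    final ByteBuffer view = map.duplicate();
                    threads[t] = new Thread(() -> {
                        long sum = 0;
                        for (int i = start; i < end; i++) {
                            sum += view.get(i) & 0xFF; // absolute read of this thread's region
                        }
                        partial[id] = sum;
                    });
                    threads[t].start();
                }

                long total = 0;
                for (int t = 0; t < threadCount; t++) {
                    threads[t].join(); // join makes the thread's result visible here
                    total += partial[t];
                }
                return total;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sumWithThreads(new byte[] {1, 2, 3, 4, 5}, 2));
    }
}
```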
Could you use streams directly instead of reading the whole file into memory?
You can register all threads as callbacks in the file-reading class. So have something like an array or list of classes implementing an interface StringReaderThread, which has the method processString(String input). After reading each line from the file, iterate over this array/list and call processString() on every registered thread. Would this solve your problem?
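The callback suggestion might look roughly like the sketch below. StringReaderThread and processString come from the answer above; the LineBroadcaster class, its register and broadcast methods, and the choice of a Reader as input are assumptions made for the example. Each line exists once in memory and is handed to every listener in turn, on the reading thread.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Callback interface named in the answer above. */
interface StringReaderThread {
    void processString(String input);
}

class LineBroadcaster {
    private final List<StringReaderThread> listeners = new ArrayList<>();

    void register(StringReaderThread listener) {
        listeners.add(listener);
    }

    /** Reads the source line by line, passes each line to every listener, returns the line count. */
    int broadcast(Reader source) {
        int lines = 0;
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                // Only the current line is held in memory; all listeners share it.
                for (StringReaderThread listener : listeners) {
                    listener.processString(line);
                }
                lines++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        LineBroadcaster broadcaster = new LineBroadcaster();
        broadcaster.register(line -> System.out.println("first listener: " + line));
        broadcaster.register(line -> System.out.println("second listener: " + line));
        broadcaster.broadcast(new StringReader("alpha\nbeta\n"));
    }
}
```

For a real file, a java.io.FileReader could be passed to broadcast in place of the StringReader.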
