I want to write a Java program that supports Unix pipelines. The problem is that my input files are images, and I need some way to separate them from one another.
I thought there would be no problem, because I can read from an InputStream using ImageIO.read() without resetting the position. But it isn't that simple. ImageIO.read() closes the stream every time an image is read, so I can't read more than one file from stdin. Do you have a solution for this?
The API for read() mentions, "This method does not close the provided InputStream after the read operation has completed; it is the responsibility of the caller to close the stream, if desired." You might also check the result for null and verify that a suitable ImageReader is available.
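A minimal sketch of the null check suggested above. The class and method names (ImageReadCheck, tryRead, hasReaderFor) are made up for illustration, and the image is generated in memory rather than read from stdin:

```java
import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

class ImageReadCheck {
    /** Returns the decoded image, or null when no registered ImageReader accepts the data. */
    static BufferedImage tryRead(InputStream in) throws IOException {
        // Per the API documentation, read() leaves 'in' open; closing is the caller's job.
        return ImageIO.read(in);
    }

    /** True if at least one ImageReader is registered for the given format name. */
    static boolean hasReaderFor(String formatName) {
        return ImageIO.getImageReadersByFormatName(formatName).hasNext();
    }

    public static void main(String[] args) throws IOException {
        // Encode a tiny image in memory so the example needs no files.
        ByteArrayOutputStream png = new ByteArrayOutputStream();
        ImageIO.write(new BufferedImage(4, 3, BufferedImage.TYPE_INT_RGB), "png", png);

        BufferedImage ok = tryRead(new ByteArrayInputStream(png.toByteArray()));
        System.out.println(ok == null ? "not decoded" : ok.getWidth() + "x" + ok.getHeight());

        // Bytes that match no known image signature: read() returns null, not an exception.
        BufferedImage bad = tryRead(new ByteArrayInputStream("this is not an image".getBytes()));
        System.out.println(bad == null ? "no suitable ImageReader" : "decoded");

        System.out.println("png reader available: " + hasReaderFor("png"));
    }
}
```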
My understanding is that this is a common scenario, but Java doesn't have a baked in solution and I've been searching on and off for more than a day now. I have tried the CircularCharBuffer from the Ostermiller library, but that uses some sort of reader that constantly waits for new input, so I couldn't get readline() to detect the end of the content (it would just hang).
So could someone tell me how I could do a conversion? For what it's worth, I'm converting multiple (potentially many) PDF files to raw text using the PDFBox lib. The PDFBox API puts the content onto a Writer, after which I need to get at the content for further processing (so BufferedReader/Writer is not actually essential, but I need some kind of Reader/Writer). I know that this is possible using StringReader/Writer, but I'm not sure that this is efficient, plus I lose the readLine() method.
This is a bit like asking how to convert a pig into an elephant ... :-)
OK, there are two ways to address this problem (using the Java libraries):
You can capture the data written to a buffered writer so that it can then be read using a buffered reader. Basically, you do this by:
using your BufferedWriter to write to a StringWriter or CharArrayWriter,
closing it,
extracting the resulting stuff from the SW / CAW as a String, and
wrapping the String in a StringReader,
wrapping the StringReader in a BufferedReader.
You can create a PipedReader / PipedWriter pair and wrap them with BufferedReader and BufferedWriter respectively.
The two approaches both have disadvantages:
The first one requires you to complete the writing before constructing the read side. That means you need space to hold the entire stream content in memory, and you can't do producer-side and consumer-side processing in parallel.
The second one requires you to produce and consume in separate threads ... or risk having the pipeline block permanently.
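The first approach could look roughly like this. It is only a sketch: CaptureThenRead and roundTrip are illustrative names, and the written lines stand in for whatever the real producer puts on the Writer:

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;

class CaptureThenRead {
    /** Writes the lines through a BufferedWriter, then reads them back with readLine(). */
    static List<String> roundTrip(String... lines) throws IOException {
        StringWriter sink = new StringWriter();
        try (BufferedWriter writer = new BufferedWriter(sink)) {
            for (String line : lines) {
                writer.write(line);
                writer.newLine();
            }
        } // close() flushes the buffer into the StringWriter

        // Only now, with writing finished, is the read side constructed.
        List<String> result = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(sink.toString()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                result.add(line);
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip("first page", "second page"));
    }
}
```

Note that readLine() is still available this way; the cost is that the whole content sits in memory as a String.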
Conceptually speaking, the Ostermiller library is really a reimplementation of PipedReader / PipedWriter. (And some of the advantages of his reimplementation were rendered moot in Java 1.6 ... which allows you to specify the pipe's buffer size. Mark support is interesting, but I can imagine some problems, depending on how you use it.)
You might also be able to find a PipedReader / PipedWriter replacement that uses a flexible buffer that grows and contracts as required. (At least ... this is conceptually possible.)
The CircularCharBuffer from the Ostermiller lib has two methods getWriter() and getReader() to get a reader on the content of a writer, and vice versa. The reason the Reader was hanging at the final readLine() was because I wasn't calling close() on the writer after I had finished writing to it. So the final readLine() was waiting for new content on the writer that was never going to arrive.
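The same close() requirement applies to the JDK's own pipe classes. Here is a sketch using PipedWriter / PipedReader as a stand-in for the Ostermiller buffer (whose API is not shown here); PipedLines and transfer are illustrative names:

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.PipedReader;
import java.io.PipedWriter;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PipedLines {
    /** Sends lines through a pipe from a producer thread and collects them on the caller's thread. */
    static List<String> transfer(List<String> lines) throws IOException, InterruptedException {
        PipedWriter pipeOut = new PipedWriter();
        PipedReader pipeIn = new PipedReader(pipeOut, 8192); // buffer size is settable since Java 1.6

        Thread producer = new Thread(() -> {
            // try-with-resources closes the writer; without that close(),
            // the final readLine() below would wait forever for more input.
            try (BufferedWriter writer = new BufferedWriter(pipeOut)) {
                for (String line : lines) {
                    writer.write(line);
                    writer.newLine();
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
        producer.start();

        List<String> received = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(pipeIn)) {
            String line;
            while ((line = reader.readLine()) != null) { // null once the writer is closed
                received.add(line);
            }
        }
        producer.join();
        return received;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transfer(Arrays.asList("page 1 text", "page 2 text")));
    }
}
```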
The Ostermiller library can be found here.
I need to pass an InputStream to an object which reads data that I previously stored in a File. I'm looking for a more efficient approach than storing everything in a File and then passing the FileInputStream. I'd like to do it on the fly.
Could someone point me to the correct approach?
The idea would be to pass a custom InputStream which internally produces every line I was going to store in the file. I guess I need buffering. I've ruled out storing everything in a String and then building an InputStream on it, because that leaves us in the same situation: waiting for all the lines to be output before reading them back.
There already is a stream for this. It's the PipedInputStream. You'll need to have one thread write to the PipedOutputStream, and pass the PipedInputStream to the object that will be reading in another thread.
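A rough sketch of that setup. The countLines method is a hypothetical stand-in for the object that consumes the InputStream, and the generated lines stand in for the data that would have gone to the file:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

class OnTheFlyInput {
    /** Hypothetical consumer: stands in for the object that needs an InputStream. */
    static int countLines(InputStream in) throws IOException {
        BufferedReader reader = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        int count = 0;
        while (reader.readLine() != null) {
            count++;
        }
        return count;
    }

    /** Produces lines in one thread while the caller's thread consumes them. */
    static int produceAndConsume(int lineCount) throws IOException, InterruptedException {
        PipedOutputStream out = new PipedOutputStream();
        PipedInputStream in = new PipedInputStream(out);

        Thread producer = new Thread(() -> {
            // Closing the writer closes the pipe, which signals end of stream to the reader.
            try (Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
                for (int i = 0; i < lineCount; i++) {
                    writer.write("line " + i + "\n");
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
        producer.start();

        int count;
        try {
            count = countLines(in);
        } finally {
            in.close();
        }
        producer.join();
        return count;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(produceAndConsume(10_000));
    }
}
```

The pipe's buffer is small by default, so the producer simply blocks when it gets ahead of the consumer; nothing is accumulated in a file or a String.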
I need to check if a position in a random access file has not been written to. The problem with this is, when the position actually hasn't been written to, I get (as anticipated) an EOFException. I have been reading RandomAccessFile documentation to try to solve this problem, and tried researching online.
Things I've tried:
Using a try-catch block and catching the EOFException every time (using try-catch as a conditional statement). It works, but it is horrible practice, and it is very inefficient, since in my case it is EOF the majority of the time.
Using a BufferedReader to loop through and check the position. I ended up running into many problems and decided that there must be a better way.
I don't want to copy one file over to another or use any other workaround. I know there has to be a direct way of doing this; I just can't seem to find the correct solution.
Are you trying to write a "tailer"?
What you need to do is have one thread which reads only the data which is there using FileChannel.size() to check for more data. This data is passed to a piped input. This allows you to have a second thread which reads the piped stream continuously e.g. using BufferedReader and blocks where more data is needed.
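The size check on its own might look like the sketch below. It only covers the "read what is there" half, not the piped hand-off to a second thread; ReadAvailable and readFrom are illustrative names, and the 64 KiB cap per call is an arbitrary choice:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class ReadAvailable {
    /** Returns the bytes currently present from 'position' onward (empty if none yet). */
    static byte[] readFrom(FileChannel channel, long position) throws IOException {
        long available = channel.size() - position;
        if (available <= 0) {
            return new byte[0]; // nothing written at this position yet; no EOFException involved
        }
        ByteBuffer buffer = ByteBuffer.allocate((int) Math.min(available, 64 * 1024));
        while (buffer.hasRemaining()) {
            // Absolute read: does not move the channel's own position.
            int n = channel.read(buffer, position + buffer.position());
            if (n <= 0) {
                break;
            }
        }
        buffer.flip();
        byte[] data = new byte[buffer.remaining()];
        buffer.get(data);
        return data;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("tail-demo", ".log");
        Files.write(file, "hello".getBytes(StandardCharsets.UTF_8));
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            System.out.println(new String(readFrom(channel, 0), StandardCharsets.UTF_8));
            System.out.println(readFrom(channel, 5).length); // position at end of file
        } finally {
            Files.delete(file);
        }
    }
}
```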
I'm working with a library that I have to provide an InputStream and a PrintStream. It uses the InputStream to gather data for processing and the PrintStream to provide results. I'm stuck using this library and its API cannot be altered.
There are two issues with this that I think have related solutions.
First, the data that needs to be read via the InputStream is not available upfront. Instead, the data is dynamically created by a different part of the application and given to my code as a String via method call. My code's job is to somehow allow the library to read this data through the InputStream provided as I get it.
Second, I need to somehow get the result that is written to the PrintStream and send it to another part of the application as a String. This needs to happen as soon as possible after the data is put into the PrintStream.
What it looks like I need are two stream objects that behave more or less like buffers. I need an InputStream that I can shove data into whenever I have it and a PrintStream whose contents I can grab whenever it has some. This seems a little awkward to me, but I'm not sure how else to do it.
I'm wondering if anything already exists that allows this kind of behavior or if there is a different (better) solution that will work in the situation I've described. The only thing I can come up with is to try to implement streams with this behavior, but that can become complicated fast (especially since the InputStream needs to block until data is available).
Any ideas?
Edit: To be clear, I'm not writing the library. I'm writing code that is supposed to provide the library with an InputStream to read data from and a PrintStream to write data to.
It looks like both streams need to be constantly read from and written to, so you'll need two threads that are independent of each other. The pattern resembles JMS a little bit: you feed information to a "queue" or "topic", wait for it to be processed, and then it is put on an "output" queue/topic. This may introduce additional moving parts, but you could write a simple client to place info onto a JMS queue, then have a listener that grabs messages and feeds them to the input stream constantly. Then another piece of code reads from the output stream and does what you need with it.
Hope this helps.
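If JMS is more machinery than you want, a plain-JDK variant of the same two-thread idea is sketched below. It swaps the queues for a piped stream on the input side and a flush callback on the output side. Everything named here (StreamBridge, send, finish, fakeLibrary) is made up for illustration, and fakeLibrary merely stands in for the real library:

```java
import java.io.BufferedReader;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.PrintStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;

class StreamBridge {
    private final PipedOutputStream feed = new PipedOutputStream();
    private final PipedInputStream libraryInput;
    private final PrintStream libraryOutput;

    StreamBridge(Consumer<String> onOutput) throws IOException {
        libraryInput = new PipedInputStream(feed);
        // Collects bytes and hands them to the callback whenever the PrintStream flushes.
        ByteArrayOutputStream collector = new ByteArrayOutputStream() {
            @Override
            public synchronized void flush() {
                if (size() > 0) {
                    onOutput.accept(new String(toByteArray(), StandardCharsets.UTF_8));
                    reset();
                }
            }
        };
        libraryOutput = new PrintStream(collector, true, "UTF-8"); // autoflush, e.g. on println
    }

    InputStream input() { return libraryInput; }

    PrintStream output() { return libraryOutput; }

    /** Makes a String available to whoever is blocked reading input(). */
    void send(String data) throws IOException {
        feed.write(data.getBytes(StandardCharsets.UTF_8));
        feed.flush();
    }

    /** Signals end of input to the reader. */
    void finish() throws IOException { feed.close(); }

    /** Stand-in for the real library: upper-cases each input line. */
    static void fakeLibrary(InputStream in, PrintStream out) {
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                out.println(line.toUpperCase());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        StreamBridge bridge = new StreamBridge(System.out::print);
        Thread library = new Thread(() -> fakeLibrary(bridge.input(), bridge.output()));
        library.start();
        bridge.send("hello\n");
        bridge.send("world\n");
        bridge.finish();
        library.join();
    }
}
```

One caveat with this sketch: the piped streams keep track of the thread that last wrote to them, so send() is best called from a thread that stays alive for as long as the library is reading. The callback also runs on the library's thread, so it should return quickly.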
I'm trying to use a BufferedInputStream to load an external DICOM file, but it eventually runs out of memory. When I used an InputStream, this never came up (I did this when I was loading the file through the assets folder).
I created my own producer-consumer threads to buffer the file, so I don't actually need the BufferedInputStream, but I DO need to use mark() and reset() which is not available in FileInputStream.
How should I get around this? Is there another kind of InputStream that I can use with a File and that supports mark()/reset()? Can I empty the buffer somehow before the BufferedInputStream throws the error? Or should I find a way to avoid using mark() instead?
Thanks for your input.
For mark and reset to work with buffered input, the file contents between the mark and the reset point need to remain in memory.
Workarounds depend on what you're actually trying to do; if you just need to start reading from a known location, perhaps a RandomAccessFile.
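For example, if the goal is just to re-read from a remembered position, a sketch with RandomAccessFile could look like this. SeekInsteadOfMark and readTwice are illustrative names, and the temp file in main is a placeholder for the real DICOM file:

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

class SeekInsteadOfMark {
    /** Reads 'length' bytes at 'offset', seeks back, and reads the same bytes again. */
    static String[] readTwice(File file, long offset, int length) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            raf.seek(offset);
            long marked = raf.getFilePointer(); // plays the role of mark()

            byte[] first = new byte[length];
            raf.readFully(first);               // throws EOFException if the file is too short

            raf.seek(marked);                   // plays the role of reset(); nothing is buffered
            byte[] second = new byte[length];
            raf.readFully(second);

            return new String[] {
                new String(first, StandardCharsets.US_ASCII),
                new String(second, StandardCharsets.US_ASCII)
            };
        }
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("seek-demo", ".bin");
        try {
            Files.write(file.toPath(), "0123456789".getBytes(StandardCharsets.US_ASCII));
            String[] reads = readTwice(file, 4, 3);
            System.out.println(reads[0] + " " + reads[1]);
        } finally {
            file.delete();
        }
    }
}
```

Because the position is just a long, the distance between the "mark" and the "reset" costs no memory, unlike BufferedInputStream's read-ahead limit.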