I'm using a BufferedWriter with the default size of 8192 characters to write lines to a local file. The lines are read from a socket InputStream using BufferedReader's readLine method (blocking I/O).
Average line length is 50 characters. It all works well and fast enough (over 1 million lines per second); however, if the client stops writing, the lines currently stored in the BufferedWriter's buffer won't be flushed to disk. In fact, the buffered characters won't be flushed to disk until the client resumes writing or the connection is closed. This translates into a delay between the time a line is transmitted by the client and the time that line is committed to the file, so long-tail latency goes up.
Is there a way to flush an incomplete BufferedWriter buffer on a timeout, e.g. within 100 milliseconds?
What about something like this? It's not a real BufferedWriter, but it is a Writer. It works by periodically checking the time of the last write to the underlying (hopefully unbuffered) writer, then flushing the BufferedWriter if it's been longer than the timeout.
public class PeriodicFlushingBufferedWriter extends Writer {
protected final MonitoredWriter monitoredWriter;
protected final BufferedWriter writer;
protected final long timeout;
protected final Thread thread;
public PeriodicFlushingBufferedWriter(Writer out, long timeout) {
this(out, 8192, timeout);
}
public PeriodicFlushingBufferedWriter(Writer out, int sz, final long timeout) {
monitoredWriter = new MonitoredWriter(out);
writer = new BufferedWriter(monitoredWriter, sz);
this.timeout = timeout;
thread = new Thread(new Runnable() {
@Override
public void run() {
long deadline = System.currentTimeMillis() + timeout;
while (!Thread.interrupted()) {
try {
Thread.sleep(Math.max(deadline - System.currentTimeMillis(), 0));
} catch (InterruptedException e) {
return;
}
synchronized (PeriodicFlushingBufferedWriter.this) {
if (Thread.interrupted()) {
return;
}
long lastWrite = monitoredWriter.getLastWrite();
if (System.currentTimeMillis() - lastWrite >= timeout) {
try {
writer.flush();
} catch (IOException e) {
// ignored: nothing sensible can be done from the background flusher thread
}
}
deadline = lastWrite + timeout;
}
}
}
});
thread.start();
}
@Override
public synchronized void write(char[] cbuf, int off, int len) throws IOException {
this.writer.write(cbuf, off, len);
}
@Override
public synchronized void flush() throws IOException {
this.writer.flush();
}
@Override
public synchronized void close() throws IOException {
try {
thread.interrupt();
} finally {
this.writer.close();
}
}
private static class MonitoredWriter extends FilterWriter {
protected final AtomicLong lastWrite = new AtomicLong();
protected MonitoredWriter(Writer out) {
super(out);
}
@Override
public void write(int c) throws IOException {
lastWrite.set(System.currentTimeMillis());
super.write(c);
}
@Override
public void write(char[] cbuf, int off, int len) throws IOException {
lastWrite.set(System.currentTimeMillis());
super.write(cbuf, off, len);
}
@Override
public void write(String str, int off, int len) throws IOException {
lastWrite.set(System.currentTimeMillis());
super.write(str, off, len);
}
@Override
public void flush() throws IOException {
lastWrite.set(System.currentTimeMillis());
super.flush();
}
public long getLastWrite() {
return this.lastWrite.get();
}
}
}
@copeg is right - flush it after every line. It is easy to flush on a timer, but what is the point of having only half a record on disk if you cannot process it?
You might apply Observer, Manager, and Factory patterns here and have a central BufferedWriterManager produce your BufferedWriters and maintain a list of active instances. An internal thread might wake periodically and flush the active instances. This might also be an opportunity for Weak references so there is no requirement for your consumers to explicitly free the object. Instead, the GC will do the work and your Manager simply needs to handle the case when its internal reference becomes null (i.e. when all strong references are dropped).
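Something along these lines could serve as a starting point. It's only a minimal sketch of that idea; BufferedWriterManager, the flush interval and the create method are names I made up, and real code would also need a way to stop the flusher thread. (Uses java.io.*, java.lang.ref.WeakReference and java.util.*.)
public class BufferedWriterManager {

    // Weak references: entries disappear once consumers drop their strong references.
    private final List<WeakReference<BufferedWriter>> active =
            Collections.synchronizedList(new ArrayList<>());

    public BufferedWriterManager(long flushIntervalMillis) {
        Thread flusher = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                try {
                    Thread.sleep(flushIntervalMillis);
                } catch (InterruptedException e) {
                    return;
                }
                synchronized (active) {
                    Iterator<WeakReference<BufferedWriter>> it = active.iterator();
                    while (it.hasNext()) {
                        BufferedWriter w = it.next().get();
                        if (w == null) {
                            it.remove();        // writer was garbage collected, forget it
                        } else {
                            try {
                                w.flush();      // periodic flush of every live writer
                            } catch (IOException ignored) {
                            }
                        }
                    }
                }
            }
        });
        flusher.setDaemon(true);
        flusher.start();
    }

    public BufferedWriter create(Writer out) {
        BufferedWriter writer = new BufferedWriter(out);
        active.add(new WeakReference<>(writer));
        return writer;
    }
}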
Don't try this complex scheme; it's too hard. Just reduce the size of the buffer by specifying it when constructing the BufferedWriter. Reduce it until you find the balance between performance and latency that you need.
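For example, trading a bit of throughput for lower latency (the 512-character size and file name are just placeholders):
// A smaller buffer bounds how much data can sit unflushed in memory.
BufferedWriter writer = new BufferedWriter(new FileWriter("lines.log"), 512);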
I have a BufferedWriter as shown below:
BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(
new GZIPOutputStream( hdfs.create(filepath, true ))));
String line = "text";
writer.write(line);
I want to find out the bytes written to the file without querying the file like
hdfs = FileSystem.get( new URI( "hdfs://localhost:8020" ), configuration );
filepath = new Path("path");
hdfs.getFileStatus(filepath).getLen();
as it will add overhead and I don't want that.
Also I can't do this:
line.getBytes().length;
as it gives the size before compression.
You can use the CountingOutputStream from the Apache Commons IO library.
Place it between the GZIPOutputStream and the file OutputStream (hdfs.create(..)).
After writing the content to the file, you can read the number of written bytes from the CountingOutputStream instance.
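A minimal sketch of that wiring, reusing the hdfs, filepath and line variables from the question (CountingOutputStream is org.apache.commons.io.output.CountingOutputStream):
// Count the compressed bytes by placing the counter between GZIP and the HDFS stream.
CountingOutputStream countingStream = new CountingOutputStream(hdfs.create(filepath, true));
BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(
        new GZIPOutputStream(countingStream)));
writer.write(line);
writer.close();                                        // flushes and finishes the gzip stream
long compressedBytes = countingStream.getByteCount();  // bytes actually written out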
If this isn't too late, you are using Java 1.7+, and you don't want to pull in an entire library like Guava or Commons IO, you can just extend GZIPOutputStream and obtain the data from the associated Deflater, like so:
public class MyGZIPOutputStream extends GZIPOutputStream {
public MyGZIPOutputStream(OutputStream out) throws IOException {
super(out);
}
public long getBytesRead() {
return def.getBytesRead();
}
public long getBytesWritten() {
return def.getBytesWritten();
}
public void setLevel(int level) {
def.setLevel(level);
}
}
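Possible usage, again reusing the question's hdfs and filepath; note the counts come from the Deflater, so they exclude the gzip header and trailer:
MyGZIPOutputStream gzipOut = new MyGZIPOutputStream(hdfs.create(filepath, true));
BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(gzipOut));
writer.write("text");
writer.flush();
gzipOut.finish();                              // let the deflater finish so the counts are final
long uncompressed = gzipOut.getBytesRead();    // bytes fed into the deflater
long compressed = gzipOut.getBytesWritten();   // deflated bytes produced so far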
You can make your own descendant of OutputStream and count how many times the write method was invoked.
This is similar to the response by Olaseni, but I moved the counting into the BufferedOutputStream rather than the GZIPOutputStream, which is more robust, since def.getBytesRead() in Olaseni's answer is not available after the stream has been closed.
With the implementation below, you can supply your own AtomicLong to the constructor, so that you can assign the CountingBufferedOutputStream in a try-with-resources block but still retrieve the count after the block has exited (i.e. after the file is closed).
public static class CountingBufferedOutputStream extends BufferedOutputStream {
private final AtomicLong bytesWritten;
public CountingBufferedOutputStream(OutputStream out) throws IOException {
super(out);
this.bytesWritten = new AtomicLong();
}
public CountingBufferedOutputStream(OutputStream out, int bufSize) throws IOException {
super(out, bufSize);
this.bytesWritten = new AtomicLong();
}
public CountingBufferedOutputStream(OutputStream out, int bufSize, AtomicLong bytesWritten)
throws IOException {
super(out, bufSize);
this.bytesWritten = bytesWritten;
}
@Override
public void write(byte[] b) throws IOException {
// FilterOutputStream.write(byte[]) delegates to write(b, 0, b.length), which is overridden
// below and already counts the bytes, so delegate directly to avoid double-counting.
write(b, 0, b.length);
}
@Override
public void write(byte[] b, int off, int len) throws IOException {
super.write(b, off, len);
bytesWritten.addAndGet(len);
}
@Override
public synchronized void write(int b) throws IOException {
super.write(b);
bytesWritten.incrementAndGet();
}
public long getBytesWritten() {
return bytesWritten.get();
}
}
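A possible usage matching that description, with hdfs and filepath assumed from the question; because the AtomicLong is supplied from outside, the count is still readable after the try-with-resources block has closed the file:
AtomicLong bytesWritten = new AtomicLong();
try (OutputStream out = new GZIPOutputStream(
        new CountingBufferedOutputStream(hdfs.create(filepath, true), 8192, bytesWritten))) {
    out.write("text".getBytes(StandardCharsets.UTF_8));
}
long compressedSize = bytesWritten.get();   // compressed bytes that reached the HDFS stream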
I use the AsyncHttpClient library for async non-blocking requests.
My case: write data to a file as it is received over the network.
To download a file from a remote host and save it to a file, I used the default ResponseBodyPartFactory.EAGER and AsynchronousFileChannel, so as not to block the Netty thread as data arrives. But as my measurements showed, compared with LAZY the memory consumption in the Java heap increases many times over.
So I decided to go straight to LAZY, but did not consider the consequences for the files.
This code will help to reproduce the problem:
public static class AsyncChannelWriter {
private final CompletableFuture<Integer> startPosition;
private final AsynchronousFileChannel channel;
public AsyncChannelWriter(AsynchronousFileChannel channel) throws IOException {
this.channel = channel;
this.startPosition = CompletableFuture.completedFuture((int) channel.size());
}
public CompletableFuture<Integer> getStartPosition() {
return startPosition;
}
public CompletableFuture<Integer> write(ByteBuffer byteBuffer, CompletableFuture<Integer> currentPosition) {
return currentPosition.thenCompose(position -> {
CompletableFuture<Integer> writenBytes = new CompletableFuture<>();
channel.write(byteBuffer, position, null, new CompletionHandler<Integer, ByteBuffer>() {
@Override
public void completed(Integer result, ByteBuffer attachment) {
writenBytes.complete(result);
}
@Override
public void failed(Throwable exc, ByteBuffer attachment) {
writenBytes.completeExceptionally(exc);
}
});
return writenBytes.thenApply(writenBytesLength -> writenBytesLength + position);
});
}
public void close(CompletableFuture<Integer> currentPosition) {
currentPosition.whenComplete((position, throwable) -> IOUtils.closeQuietly(channel));
}
}
public static void main(String[] args) throws IOException {
final String filepath = "/media/veracrypt4/files/1.jpg";
final String downloadUrl = "https://m0.cl/t/butterfly-3000.jpg";
final AsyncHttpClient client = Dsl.asyncHttpClient(Dsl.config().setFollowRedirect(true)
.setResponseBodyPartFactory(AsyncHttpClientConfig.ResponseBodyPartFactory.LAZY));
final AsynchronousFileChannel channel = AsynchronousFileChannel.open(Paths.get(filepath), StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING, StandardOpenOption.CREATE);
final AsyncChannelWriter asyncChannelWriter = new AsyncChannelWriter(channel);
final AtomicReference<CompletableFuture<Integer>> atomicReferencePosition = new AtomicReference<>(asyncChannelWriter.getStartPosition());
client.prepareGet(downloadUrl)
.execute(new AsyncCompletionHandler<Response>() {
@Override
public State onBodyPartReceived(HttpResponseBodyPart content) throws Exception {
// if EAGER, content.getBodyByteBuffer() returns a HeapByteBuffer; if LAZY, a DirectByteBuffer
final ByteBuffer bodyByteBuffer = content.getBodyByteBuffer();
final CompletableFuture<Integer> currentPosition = atomicReferencePosition.get();
final CompletableFuture<Integer> newPosition = asyncChannelWriter.write(bodyByteBuffer, currentPosition);
atomicReferencePosition.set(newPosition);
return State.CONTINUE;
}
@Override
public Response onCompleted(Response response) {
asyncChannelWriter.close(atomicReferencePosition.get());
return response;
}
});
}
In this case, the picture comes out broken. But if I use FileChannel instead of AsynchronousFileChannel, the files come out fine in both cases. Can there be any nuances when working with a DirectByteBuffer (in the case of LazyResponseBodyPart.getBodyByteBuffer()) and AsynchronousFileChannel?
What could be wrong with my code, given that everything works fine with EAGER?
UPDATE
As I noticed, if I use LAZY and, for example, I add the line
Thread.sleep(10) in the method onBodyPartReceived, like this:
@Override
public State onBodyPartReceived(HttpResponseBodyPart content) throws Exception {
final ByteBuffer bodyByteBuffer = content.getBodyByteBuffer();
final CompletableFuture<Integer> currentPosition = atomicReferencePosition.get();
final CompletableFuture<Integer> newPosition = finalAsyncChannelWriter.write(bodyByteBuffer, currentPosition);
atomicReferencePosition.set(newPosition);
Thread.sleep(10);
return State.CONTINUE;
}
the file is saved to disk in a non-broken state.
As I understand it, the reason is that during these 10 milliseconds the asynchronous thread in AsynchronousFileChannel manages to write the data to disk from this DirectByteBuffer.
It turns out that the file gets broken because this asynchronous thread uses the buffer for writing at the same time as the Netty thread.
If we take a look at the source code of EagerResponseBodyPart, we see the following:
private final byte[] bytes;
public EagerResponseBodyPart(ByteBuf buf, boolean last) {
super(last);
bytes = byteBuf2Bytes(buf);
}
@Override
public ByteBuffer getBodyByteBuffer() {
return ByteBuffer.wrap(bytes);
}
Thus, when a piece of data arrives, it is immediately copied into a byte array. Then we can safely wrap it in a HeapByteBuffer and hand it over to the asynchronous file channel thread.
But if you look at the code of LazyResponseBodyPart:
private final ByteBuf buf;
public LazyResponseBodyPart(ByteBuf buf, boolean last) {
super(last);
this.buf = buf;
}
@Override
public ByteBuffer getBodyByteBuffer() {
return buf.nioBuffer();
}
As you can see, the asynchronous file channel thread actually works with Netty's ByteBuf (in this case always a PooledSlicedByteBuf) via the nioBuffer() call.
What can I do in this situation? How can I safely pass a DirectByteBuffer to an async thread without copying the buffer to the Java heap?
I talked to the maintainer of AsyncHttpClient.
You can see the discussion here.
The main problem was that I didn't use the Netty ByteBuf methods retain and release.
In the end, I came to two solutions.
First: write the bytes to the file sequentially, tracking the position with a CompletableFuture.
Define a wrapper class for AsynchronousFileChannel:
@Log4j2
public class AsyncChannelNettyByteBufWriter implements Closeable {
private final AtomicReference<CompletableFuture<Long>> positionReference;
private final AsynchronousFileChannel channel;
public AsyncChannelNettyByteBufWriter(AsynchronousFileChannel channel) {
this.channel = channel;
try {
this.positionReference = new AtomicReference<>(CompletableFuture.completedFuture(channel.size()));
} catch (IOException e) {
throw new UncheckedIOException(e);
}
}
public CompletableFuture<Long> write(ByteBuf byteBuffer) {
final ByteBuf byteBuf = byteBuffer.retain();
return positionReference.updateAndGet(x -> x.thenCompose(position -> {
final CompletableFuture<Integer> writenBytes = new CompletableFuture<>();
channel.write(byteBuf.nioBuffer(), position, byteBuf, new CompletionHandler<Integer, ByteBuf>() {
@Override
public void completed(Integer result, ByteBuf attachment) {
attachment.release();
writenBytes.complete(result);
}
@Override
public void failed(Throwable exc, ByteBuf attachment) {
attachment.release();
log.error(exc);
writenBytes.completeExceptionally(exc);
}
});
return writenBytes.thenApply(writenBytesLength -> writenBytesLength + position);
}));
}
public void close() {
positionReference.updateAndGet(x -> x.whenComplete((position, throwable) -> {
try {
channel.close();
} catch (IOException e) {
throw new UncheckedIOException(e);
}
}));
}
}
In fact, the AtomicReference probably isn't needed here if writing happens from a single thread; if it happens from several threads, synchronization needs to be approached seriously.
And the main usage:
public static void main(String[] args) throws IOException {
final String filepath = "1.jpg";
final String downloadUrl = "https://m0.cl/t/butterfly-3000.jpg";
final AsyncHttpClient client = Dsl.asyncHttpClient(Dsl.config().setFollowRedirect(true)
.setResponseBodyPartFactory(AsyncHttpClientConfig.ResponseBodyPartFactory.LAZY));
final AsynchronousFileChannel channel = AsynchronousFileChannel.open(Paths.get(filepath), StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING, StandardOpenOption.CREATE);
final AsyncChannelNettyByteBufWriter asyncChannelNettyByteBufWriter = new AsyncChannelNettyByteBufWriter(channel);
client.prepareGet(downloadUrl)
.execute(new AsyncCompletionHandler<Response>() {
@Override
public State onBodyPartReceived(HttpResponseBodyPart content) {
final ByteBuf byteBuf = ((LazyResponseBodyPart) content).getBuf();
asyncChannelNettyByteBufWriter.write(byteBuf);
return State.CONTINUE;
}
@Override
public Response onCompleted(Response response) {
asyncChannelNettyByteBufWriter.close();
return response;
}
});
}
The second solution: track the position based on the received size of bytes.
public static void main(String[] args) throws IOException {
final String filepath = "1.jpg";
final String downloadUrl = "https://m0.cl/t/butterfly-3000.jpg";
final AsyncHttpClient client = Dsl.asyncHttpClient(Dsl.config().setFollowRedirect(true)
.setResponseBodyPartFactory(AsyncHttpClientConfig.ResponseBodyPartFactory.LAZY));
final ExecutorService executorService = Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors() * 2);
final AsynchronousFileChannel channel = AsynchronousFileChannel.open(Paths.get(filepath), new HashSet<>(Arrays.asList(StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING, StandardOpenOption.CREATE)), executorService);
client.prepareGet(downloadUrl)
.execute(new AsyncCompletionHandler<Response>() {
private long position = 0;
@Override
public State onBodyPartReceived(HttpResponseBodyPart content) {
final ByteBuf byteBuf = ((LazyResponseBodyPart) content).getBuf().retain();
long currentPosition = position;
position+=byteBuf.readableBytes();
channel.write(byteBuf.nioBuffer(), currentPosition, byteBuf, new CompletionHandler<Integer, ByteBuf>() {
@Override
public void completed(Integer result, ByteBuf attachment) {
attachment.release();
if(content.isLast()){
try {
channel.close();
} catch (IOException e) {
throw new UncheckedIOException(e);
}
}
}
@Override
public void failed(Throwable exc, ByteBuf attachment) {
attachment.release();
try {
channel.close();
} catch (IOException e) {
throw new UncheckedIOException(e);
}
}
});
return State.CONTINUE;
}
#Override
public Response onCompleted(Response response) {
return response;
}
});
}
In the second solution, because we don't wait until the bytes are written to the file, AsynchronousFileChannel can create a lot of threads (if you use Linux, because Linux does not implement non-blocking asynchronous file I/O; on Windows the situation is much better).
As my measurements showed, when writing to a slow USB flash drive the number of threads can reach tens of thousands, so you need to limit the number of threads by creating your own ExecutorService and passing it to AsynchronousFileChannel.
Are there obvious advantages and disadvantages to the first and second solutions? It's hard for me to say. Maybe someone can tell which is more effective.
Is it possible to stop the copy from within the bytesTransferred callback of the Apache Util.copyStream function?
long bytesTransferred = Util.copyStream(inputStream, outputStream, 32768, CopyStreamEvent.UNKNOWN_STREAM_SIZE, new CopyStreamListener() {
@Override
public void bytesTransferred(CopyStreamEvent event) {
bytesTransferred(event.getTotalBytesTransferred(), event.getBytesTransferred(), event.getStreamSize());
}
@Override
public void bytesTransferred(long totalBytesTransferred, int bytesTransferred,
long streamSize) {
try {
if(true) {
log.info("Stopping");
return; //Cancel
} else {
log.info("Still going");
}
} catch (InterruptedException e) {
// this should not happen!
}
}
});
In this case, what happens is that I keep getting a Stopping message in my logs. I also tried throwing a new RuntimeException instead of returning, and again I get endless Stopping messages. How would I cancel the transfer from bytesTransferred in this case?
You could try wrapping the input stream and overriding the read methods to check for a stop flag; if it is set, throw an IOException. Example class:
/**
* Wrapped input stream that can be cancelled.
*/
public class WrappedStoppableInputStream extends InputStream
{
private InputStream m_wrappedInputStream;
private volatile boolean m_stop = false;   // volatile so the copying thread sees the update
/**
* Constructor.
* @param inputStream original input stream
*/
public WrappedStoppableInputStream(InputStream inputStream)
{
m_wrappedInputStream = inputStream;
}
/**
* Call to stop reading stream.
*/
public void cancelTransfer()
{
m_stop = true;
}
@Override
public int read() throws IOException
{
if (m_stop)
{
throw new IOException("Stopping stream");
}
return m_wrappedInputStream.read();
}
@Override
public int read(byte[] b) throws IOException
{
if (m_stop)
{
throw new IOException("Stopping stream");
}
return m_wrappedInputStream.read(b);
}
@Override
public int read(byte[] b, int off, int len) throws IOException
{
if (m_stop)
{
throw new IOException("Stopping stream");
}
return m_wrappedInputStream.read(b, off, len);
}
}
I am assuming that the file copy is running inside a thread. So you wrap your input stream with WrappedStoppableInputStream and pass that to your copy function, to be used instead of the original input stream.
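A rough sketch of that wiring; listener stands for the CopyStreamListener from the question, and the copy is assumed to run on its own worker thread:
WrappedStoppableInputStream stoppable = new WrappedStoppableInputStream(inputStream);
Thread copyThread = new Thread(() -> {
    try {
        Util.copyStream(stoppable, outputStream, 32768,
                CopyStreamEvent.UNKNOWN_STREAM_SIZE, listener);
    } catch (IOException e) {
        // thrown by the wrapper once cancelTransfer() has been called
    }
});
copyThread.start();
// later, from another thread:
stoppable.cancelTransfer();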
I am using Apache HttpClient 4 to communicate with a REST API, and most of the time I do lengthy PUT operations. Since these may happen over an unstable Internet connection, I need to detect if the connection is interrupted and possibly retry (with a resume request).
To try my routines in the real world, I started a PUT operation and then flipped the Wi-Fi switch of my laptop, causing an immediate total interruption of any data flow. However, it takes a looong time (maybe 5 minutes or so) until eventually a SocketException is thrown.
How can I speed up this process? I'd like to set a timeout of maybe something around 30 seconds.
Update:
To clarify, my request is a PUT operation. So for a very long time (possibly hours) the only operation is a write() operation and there are no read operations. There is a timeout setting for read() operations, but I could not find one for write operations.
I am using my own Entity implementation and thus I write directly to an OutputStream, which will pretty much immediately block once the Internet connection is interrupted. If OutputStreams had a timeout parameter, so I could write out.write(nextChunk, 30000);, I could detect such a problem myself. Actually, I tried that:
public class TimeoutHttpEntity extends HttpEntityWrapper {
public TimeoutHttpEntity(HttpEntity wrappedEntity) {
super(wrappedEntity);
}
@Override
public void writeTo(OutputStream outstream) throws IOException {
try(TimeoutOutputStreamWrapper wrapper = new TimeoutOutputStreamWrapper(outstream, 30000)) {
super.writeTo(wrapper);
}
}
}
public class TimeoutOutputStreamWrapper extends OutputStream {
private final OutputStream delegate;
private final long timeout;
private final ExecutorService executorService = Executors.newSingleThreadExecutor();
public TimeoutOutputStreamWrapper(OutputStream delegate, long timeout) {
this.delegate = delegate;
this.timeout = timeout;
}
@Override
public void write(int b) throws IOException {
executeWithTimeout(() -> {
delegate.write(b);
return null;
});
}
@Override
public void write(byte[] b) throws IOException {
executeWithTimeout(() -> {
delegate.write(b);
return null;
});
}
@Override
public void write(byte[] b, int off, int len) throws IOException {
executeWithTimeout(() -> {
delegate.write(b, off, len);
return null;
});
}
@Override
public void close() throws IOException {
try {
executeWithTimeout(() -> {
delegate.close();
return null;
});
} finally {
executorService.shutdown();
}
}
private void executeWithTimeout(final Callable<?> task) throws IOException {
try {
executorService.submit(task).get(timeout, TimeUnit.MILLISECONDS);
} catch (TimeoutException e) {
throw new IOException(e);
} catch (ExecutionException e) {
final Throwable cause = e.getCause();
if (cause instanceof IOException) {
throw (IOException)cause;
}
throw new Error(cause);
} catch (InterruptedException e) {
throw new Error(e);
}
}
}
public class TimeoutOutputStreamWrapperTest {
private static final byte[] DEMO_ARRAY = new byte[]{1,2,3};
private TimeoutOutputStreamWrapper streamWrapper;
private OutputStream delegateOutput;
public void setUp(long timeout) {
delegateOutput = mock(OutputStream.class);
streamWrapper = new TimeoutOutputStreamWrapper(delegateOutput, timeout);
}
@AfterMethod
public void teardown() throws Exception {
streamWrapper.close();
}
@Test
public void write_writesByte() throws Exception {
// Setup
setUp(Long.MAX_VALUE);
// Execution
streamWrapper.write(DEMO_ARRAY);
// Evaluation
verify(delegateOutput).write(DEMO_ARRAY);
}
@Test(expectedExceptions = DemoIOException.class)
public void write_passesThruException() throws Exception {
// Setup
setUp(Long.MAX_VALUE);
doThrow(DemoIOException.class).when(delegateOutput).write(DEMO_ARRAY);
// Execution
streamWrapper.write(DEMO_ARRAY);
// Evaluation performed by expected exception
}
@Test(expectedExceptions = IOException.class)
public void write_throwsIOException_onTimeout() throws Exception {
// Setup
final CountDownLatch executionDone = new CountDownLatch(1);
setUp(100);
doAnswer(new Answer<Void>() {
@Override
public Void answer(InvocationOnMock invocation) throws Throwable {
executionDone.await();
return null;
}
}).when(delegateOutput).write(DEMO_ARRAY);
// Execution
try {
streamWrapper.write(DEMO_ARRAY);
} finally {
executionDone.countDown();
}
// Evaluation performed by expected exception
}
public static class DemoIOException extends IOException {
}
}
This is somewhat complicated, but it works quite well in my unit tests. And it works in real life as well, except that HttpRequestExecutor catches the exception at line 127 and tries to close the connection. However, when trying to close the connection it first tries to flush it, which again blocks.
I might be able to dig deeper into HttpClient and figure out how to prevent this flush operation, but it is already a not-too-pretty solution, and it is just about to get even worse.
UPDATE:
It looks like this can't be done on the Java level. Can I do it on another level? (I am using Linux).
Java blocking I/O does not support a socket timeout for write operations. You are entirely at the mercy of the OS / JRE to unblock a thread blocked by a write operation. Moreover, this behavior tends to be OS / JRE specific.
This might be a legitimate case to consider using an HTTP client based on non-blocking I/O (NIO), such as Apache HttpAsyncClient.
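For illustration only, a minimal sketch with Apache HttpAsyncClient 4.x; the URL and entity are placeholders, and timeouts would still be tuned via RequestConfig / IOReactorConfig:
CloseableHttpAsyncClient client = HttpAsyncClients.createDefault();
client.start();
HttpPut put = new HttpPut("https://example.com/upload");
put.setEntity(new StringEntity("payload", ContentType.TEXT_PLAIN));
Future<HttpResponse> future = client.execute(put, new FutureCallback<HttpResponse>() {
    @Override
    public void completed(HttpResponse response) { /* upload finished */ }

    @Override
    public void failed(Exception ex) { /* I/O errors are reported here */ }

    @Override
    public void cancelled() { }
});
// future.get() would block until the exchange finishes
client.close();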
You can configure the socket timeout using RequestConfig:
RequestConfig myRequestConfig = RequestConfig.custom()
.setSocketTimeout(5000) // 5 seconds
.build();
Then, when you make the call, just assign your new configuration. For instance:
HttpPut httpPut = new HttpPut("...");
httpPut.setConfig(requestConfig);
...
HttpClientContext context = HttpClientContext.create();
....
httpclient.execute(httpPut, context);
For more information regarding timeout configurations, there is a good explanation here.
Here is one of the links I came across which talks about connection eviction policy: here
public static class IdleConnectionMonitorThread extends Thread {
private final HttpClientConnectionManager connMgr;
private volatile boolean shutdown;
public IdleConnectionMonitorThread(HttpClientConnectionManager connMgr) {
super();
this.connMgr = connMgr;
}
@Override
public void run() {
try {
while (!shutdown) {
synchronized (this) {
wait(5000);
// Close expired connections
connMgr.closeExpiredConnections();
// Optionally, close connections
// that have been idle longer than 30 sec
connMgr.closeIdleConnections(30, TimeUnit.SECONDS);
}
}
} catch (InterruptedException ex) {
// terminate
}
}
public void shutdown() {
shutdown = true;
synchronized (this) {
notifyAll();
}
}
}
I think you might want to look at this.
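For completeness, one way the monitor might be wired up with a pooling connection manager (class names from HttpClient 4.3+):
PoolingHttpClientConnectionManager connMgr = new PoolingHttpClientConnectionManager();
CloseableHttpClient httpClient = HttpClients.custom()
        .setConnectionManager(connMgr)
        .build();

IdleConnectionMonitorThread monitor = new IdleConnectionMonitorThread(connMgr);
monitor.start();
// ... execute requests with httpClient ...
monitor.shutdown();   // stop the monitor when the client is no longer needed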
I have an InputStream of a file, and I use Apache POI components to read from it like this:
POIFSFileSystem fileSystem = new POIFSFileSystem(inputStream);
The problem is that I need to use the same stream multiple times, and POIFSFileSystem closes the stream after use.
What is the best way to cache the data from the input stream and then serve more input streams to different POIFSFileSystems?
EDIT 1:
By cache I meant store for later use, not as a way to speed up the application. Also, is it better to just read the input stream into an array or string and then create input streams for each use?
EDIT 2:
Sorry to reopen the question, but the conditions are somewhat different when working in a desktop versus a web application.
First of all, the InputStream I get from org.apache.commons.fileupload.FileItem in my Tomcat web app doesn't support mark, and thus cannot be reset.
Second, I'd like to be able to keep the file in memory for faster access and fewer I/O problems when dealing with files.
You can decorate the InputStream being passed to POIFSFileSystem with a version that responds to close() by calling reset():
class ResetOnCloseInputStream extends InputStream {
private final InputStream decorated;
public ResetOnCloseInputStream(InputStream anInputStream) {
if (!anInputStream.markSupported()) {
throw new IllegalArgumentException("marking not supported");
}
anInputStream.mark( 1 << 24); // magic constant: BEWARE
decorated = anInputStream;
}
@Override
public void close() throws IOException {
decorated.reset();
}
@Override
public int read() throws IOException {
return decorated.read();
}
}
Test case:
static void closeAfterInputStreamIsConsumed(InputStream is)
throws IOException {
int r;
while ((r = is.read()) != -1) {
System.out.println(r);
}
is.close();
System.out.println("=========");
}
public static void main(String[] args) throws IOException {
InputStream is = new ByteArrayInputStream("sample".getBytes());
ResetOnCloseInputStream decoratedIs = new ResetOnCloseInputStream(is);
closeAfterInputStreamIsConsumed(decoratedIs);
closeAfterInputStreamIsConsumed(decoratedIs);
closeAfterInputStreamIsConsumed(is);
}
EDIT 2
You can read the entire file into a byte[] (slurp mode) and then pass it to a ByteArrayInputStream.
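For example (IOUtils.toByteArray is from Commons IO; the variable names are just illustrative):
byte[] data = IOUtils.toByteArray(originalInputStream);   // slurp the whole stream once
POIFSFileSystem fs1 = new POIFSFileSystem(new ByteArrayInputStream(data));
POIFSFileSystem fs2 = new POIFSFileSystem(new ByteArrayInputStream(data));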
Try BufferedInputStream, which adds mark and reset functionality to another input stream, and just override its close method:
public class UnclosableBufferedInputStream extends BufferedInputStream {
public UnclosableBufferedInputStream(InputStream in) {
super(in);
super.mark(Integer.MAX_VALUE);
}
@Override
public void close() throws IOException {
super.reset();
}
}
So:
UnclosableBufferedInputStream bis = new UnclosableBufferedInputStream (inputStream);
and use bis wherever inputStream was used before.
This works correctly:
byte[] bytes = getBytes(inputStream);
POIFSFileSystem fileSystem = new POIFSFileSystem(new ByteArrayInputStream(bytes));
where getBytes is like this:
private static byte[] getBytes(InputStream is) throws IOException {
byte[] buffer = new byte[8192];
ByteArrayOutputStream baos = new ByteArrayOutputStream(2048);
int n;
baos.reset();
while ((n = is.read(buffer, 0, buffer.length)) != -1) {
baos.write(buffer, 0, n);
}
return baos.toByteArray();
}
Use the implementation below for more customized use:
public class ReusableBufferedInputStream extends BufferedInputStream
{
private int totalUse;
private int used;
public ReusableBufferedInputStream(InputStream in, Integer totalUse)
{
super(in);
if (totalUse > 1)
{
super.mark(Integer.MAX_VALUE);
this.totalUse = totalUse;
this.used = 1;
}
else
{
this.totalUse = 1;
this.used = 1;
}
}
@Override
public void close() throws IOException
{
if (used < totalUse)
{
super.reset();
++used;
}
else
{
super.close();
}
}
}
What exactly do you mean by "cache"? Do you want the different POIFSFileSystems to start at the beginning of the stream? If so, there's absolutely no point caching anything in your Java code; it will be done by the OS. Just open a new stream.
Or do you want to continue reading at the point where the first POIFSFileSystem stopped? That's not caching, and it's very difficult to do. The only way I can think of, if you can't avoid the stream getting closed, would be to write a thin wrapper that counts how many bytes have been read and then open a new stream and skip that many bytes. But that could fail when POIFSFileSystem internally uses something like a BufferedInputStream.
If the file is not that big, read it into a byte[] array and give POI a ByteArrayInputStream created from that array.
If the file is big, then you shouldn't care, since the OS will do the caching for you as best as it can.
[EDIT] Use Apache Commons IO to read the file into a byte array in an efficient way. Do not use int read(), since it reads the file byte by byte, which is very slow!
If you want to do it yourself, use a File object to get the length, create the array, and then loop, reading bytes from the file. You must loop, since read(byte[], int offset, int len) can read fewer than len bytes (and usually does).
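For example, with Commons IO (the file name is a placeholder):
byte[] content = FileUtils.readFileToByteArray(new File("workbook.xls"));  // efficient bulk read
InputStream streamForPoi = new ByteArrayInputStream(content);              // cheap to recreate per use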
This is how I would implement it, to be safely used with any InputStream (a rough sketch follows the steps below):
write your own InputStream wrapper where you create a temporary file to mirror the original stream content
dump everything read from the original input stream into this temporary file
when the stream was completely read you will have all the data mirrored in the temporary file
use InputStream.reset to switch (initialize) the internal stream to a FileInputStream(mirrored_content_file)
from now on you will lose the reference to the original stream (it can be collected)
add a new method release() which will remove the temporary file and release any open stream.
you can even call release() from finalize to be sure the temporary file is released in case you forget to call release() (most of the time you should avoid using finalize; always call a method to release object resources). See Why would you ever implement finalize()?
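A rough sketch of those steps, with all names hypothetical and error handling kept minimal; reset() here assumes the original stream has already been read to the end before switching over to the mirror:
public class TempFileCachedInputStream extends InputStream {

    private InputStream in;               // current source: the original stream, later the mirror
    private final File mirror;            // temporary file holding everything read so far
    private final OutputStream mirrorOut;
    private boolean mirroring = true;     // still reading (and copying) from the original stream?

    public TempFileCachedInputStream(InputStream original) throws IOException {
        this.in = original;
        this.mirror = File.createTempFile("stream-mirror", ".tmp");
        this.mirrorOut = new BufferedOutputStream(new FileOutputStream(mirror));
    }

    @Override
    public int read() throws IOException {
        int b = in.read();
        if (mirroring && b != -1) {
            mirrorOut.write(b);           // dump everything read into the temporary file
        }
        return b;
    }

    @Override
    public synchronized void reset() throws IOException {
        if (mirroring) {                  // first reset: switch the internal stream to the mirror
            mirrorOut.flush();
            mirrorOut.close();
            in.close();                   // the original stream is no longer needed
            mirroring = false;
        } else {
            in.close();
        }
        in = new FileInputStream(mirror); // subsequent reads come from the mirrored content
    }

    public void release() throws IOException {
        in.close();
        if (mirroring) {
            mirrorOut.close();
        }
        mirror.delete();                  // remove the temporary file
    }
}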
public static void main(String[] args) throws IOException {
BufferedInputStream inputStream = new BufferedInputStream(IOUtils.toInputStream("Foobar"));
inputStream.mark(Integer.MAX_VALUE);
System.out.println(IOUtils.toString(inputStream));
inputStream.reset();
System.out.println(IOUtils.toString(inputStream));
}
This works. IOUtils is part of commons IO.
This answer iterates on previous ones (1 | 2) based on BufferedInputStream. The main changes are that it allows infinite reuse, and it takes care of closing the original source input stream to free up system resources. Your OS defines a limit on those, and you don't want the program to run out of file handles (that's also why you should always 'consume' responses, e.g. with Apache's EntityUtils.consumeQuietly()). EDIT: Updated the code to handle greedy consumers that use read(buffer, offset, length); in that case BufferedInputStream may try hard to look at the source, and this code protects against that use.
public class CachingInputStream extends BufferedInputStream {
public CachingInputStream(InputStream source) {
super(new PostCloseProtection(source));
super.mark(Integer.MAX_VALUE);
}
@Override
public synchronized void close() throws IOException {
if (!((PostCloseProtection) in).decoratedClosed) {
in.close();
}
super.reset();
}
private static class PostCloseProtection extends InputStream {
private volatile boolean decoratedClosed = false;
private final InputStream source;
public PostCloseProtection(InputStream source) {
this.source = source;
}
@Override
public int read() throws IOException {
return decoratedClosed ? -1 : source.read();
}
@Override
public int read(byte[] b) throws IOException {
return decoratedClosed ? -1 : source.read(b);
}
@Override
public int read(byte[] b, int off, int len) throws IOException {
return decoratedClosed ? -1 : source.read(b, off, len);
}
@Override
public long skip(long n) throws IOException {
return decoratedClosed ? 0 : source.skip(n);
}
@Override
public int available() throws IOException {
return source.available();
}
@Override
public void close() throws IOException {
decoratedClosed = true;
source.close();
}
@Override
public void mark(int readLimit) {
source.mark(readLimit);
}
@Override
public void reset() throws IOException {
source.reset();
}
@Override
public boolean markSupported() {
return source.markSupported();
}
}
}
To reuse it, just close it first if it wasn't closed already.
One limitation, though, is that if the stream is closed before the whole content of the original stream has been read, then this decorator will have incomplete data, so make sure the whole stream is read before closing.
I'll just add my solution here, as it works for me. It is basically a combination of the top two answers :)
private String convertStreamToString(InputStream is) {
Writer w = new StringWriter();
char[] buf = new char[1024];
Reader r;
is.mark(1 << 24);
try {
r = new BufferedReader(new InputStreamReader(is, "UTF-8"));
int n;
while ((n=r.read(buf)) != -1) {
w.write(buf, 0, n);
}
is.reset();
} catch(UnsupportedEncodingException e) {
Logger.debug(this.getClass(), "Cannot convert stream to string.", e);
} catch(IOException e) {
Logger.debug(this.getClass(), "Cannot convert stream to string.", e);
}
return w.toString();
}