extracting contents of ZipFile entries when read from byte[] (Java)


I have a zip file whose contents are presented as byte[] but the original file object is not accessible. I want to read the contents of each of the entries. I am able to create a ZipInputStream from a ByteArrayInputStream of the bytes and can read the entries and their names. However I cannot see an easy way to extract the contents of each entry.
(I have looked at Apache Commons but cannot see an easy way there either).
UPDATE: @Rich's code seems to solve the problem, thanks.
QUERY: why do both examples use a multiplier of 4 (128/512 and 1024 * 4)?

If you want to process nested zip entries from a stream, see this answer for ideas. Because the inner entries are listed sequentially they can be processed by getting the size of each entry and reading that many bytes from the stream.
Updated with an example that copies each entry to Standard out:
ZipInputStream is;//obtained earlier
ZipEntry entry = is.getNextEntry();
while(entry != null) {
copyStream(is, out, entry);
entry = is.getNextEntry();
}
...
private static void copyStream(InputStream in, OutputStream out,
ZipEntry entry) throws IOException {
byte[] buffer = new byte[1024 * 4];
int n;
// ZipInputStream.read() returns -1 at the end of the current entry, so
// entry.getSize() (which can be -1 when the size is unknown) is not
// needed to find the entry boundary.
while (-1 != (n = in.read(buffer))) {
out.write(buffer, 0, n);
}
}

It actually uses the ZipInputStream as the InputStream (but don't close it at the end of each entry).
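For the original case, where the zip arrives as a byte[], the same loop can collect each entry into its own byte array. The following is a minimal sketch, not a definitive implementation: the class name ZipBytes and the method extractAll are hypothetical names chosen for illustration, and it assumes the whole archive fits comfortably in memory.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

class ZipBytes {
    // Hypothetical helper: maps each entry name to that entry's uncompressed bytes.
    static Map<String, byte[]> extractAll(byte[] zipBytes) throws IOException {
        Map<String, byte[]> contents = new LinkedHashMap<>();
        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            byte[] buffer = new byte[4096];
            ZipEntry entry;
            while ((entry = zis.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue; // directory entries carry no data
                }
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                int n;
                // read() returns -1 at the end of the current entry,
                // not at the end of the whole archive.
                while ((n = zis.read(buffer)) != -1) {
                    out.write(buffer, 0, n);
                }
                contents.put(entry.getName(), out.toByteArray());
            }
        }
        return contents;
    }
}
```

The ZipInputStream is closed only once, after the last entry, which matches the advice above about not closing it per entry.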

It's a little bit tricky to calculate the start of next ZipEntry. Please see this example included in JDK 6,
public static void main(String[] args) {
try {
ZipInputStream is = new ZipInputStream(System.in);
ZipEntry ze;
byte[] buf = new byte[128];
int len;
while ((ze = is.getNextEntry()) != null) {
System.out.println("----------- " + ze);
// Determine the number of bytes to skip and skip them.
int skip = (int)ze.getSize() - 128;
while (skip > 0) {
skip -= is.skip(Math.min(skip, 512));
}
// Read the remaining bytes and if it's printable, print them.
out: while ((len = is.read(buf)) >= 0) {
for (int i=0; i<len; i++) {
if ((buf[i]&0xFF) >= 0x80) {
System.out.println("**** UNPRINTABLE ****");
// This isn't really necessary since getNextEntry()
// automatically calls it.
is.closeEntry();
// Get the next zip entry.
break out;
}
}
System.out.write(buf, 0, len);
}
}
is.close();
} catch (Exception e) {
e.printStackTrace();
}
}

Related

How to read a File character-by-character in reverse without running out-of-memory?

The Story
I've been having a problem lately...
I have to read a file in reverse character by character without running out of memory.
I can't read it line-by-line and reverse it with StringBuilder because it's a one-line file that takes up to a gig (GB) of I/O space.
Hence it would take up too much of the JVM's (and the System's) Memory.
I've decided to just read it character by character from end-to-start (back-to-front) so that I could process as much as I can without consuming too much memory.
What I've Tried
I know how to read a file in one go:
(MappedByteBuffer+FileChannel+Charset which gave me OutOfMemoryExceptions)
and read a file character-by-character with UTF-8 character support
(FileInputStream+InputStreamReader).
The problem is that FileInputStream's #read() only calls #read0() which is a native method!
Because of that I have no idea about the underlying code...
Which is why I'm here today (or at least until this is done)!

This will do it (but as written it is not very efficient).
Just skip to the last location read less one, then read and print the character.
Then reset the location to the mark, adjust the size, and continue.
File f = new File("Some File name");
int size = (int) f.length();
int bsize = 1;
byte[] buf = new byte[bsize];
try (BufferedInputStream b =
new BufferedInputStream(new FileInputStream(f))) {
while (size > 0) {
b.mark(size);
b.skip(size - bsize);
int k = b.read(buf);
System.out.print((char) buf[0]);
size -= k;
b.reset();
}
} catch (IOException ioe) {
ioe.printStackTrace();
}
This could be improved by increasing the buffer size and making equivalent adjustments in the mark and skip arguments.
Updated Version
I wasn't fully satisfied with my answer so I made it more general. Some variables could have served double duty but using meaningful names helps clarify how they are used.
Mark must be used so reset can be used. However, it only needs to be set once and is set to position 0 outside of the main loop. I do not know if marking closer to the read point is more efficient or not.
skipCnt - initially set to fileLength, it is the number of bytes to skip before reading. If the number of bytes remaining is greater than the buffer size, then the skip count will be skipCnt - bsize. Otherwise it will be 0.
remainingBytes - a running total of how many bytes are still to be read. It is updated by subtracting the current readCnt.
readCnt - how many bytes to read. If remainingBytes is greater than bsize then set to bsize, else set to remainingBytes
The while loop continuously reads the file starting near the end and then prints the just read information in reverse order. All variables are updated and the process repeats until the remainingBytes reaches 0.
File f = new File("some file");
int bsize = 16;
int fileSize = (int)f.length();
int remainingBytes = fileSize;
int skipCnt = fileSize;
byte[] buf = new byte[bsize];
try (BufferedInputStream b =
new BufferedInputStream(new FileInputStream(f))) {
b.mark(0);
while(remainingBytes > 0) {
skipCnt = skipCnt > bsize ? skipCnt - bsize : 0;
b.skip(skipCnt);
int readCnt = remainingBytes > bsize ? bsize : remainingBytes;
b.read(buf,0,readCnt);
for (int i = readCnt-1; i >= 0; i--) {
System.out.print((char) buf[i]);
}
remainingBytes -= readCnt;
b.reset();
}
} catch (IOException ioe) {
ioe.printStackTrace();
}

This doesn't support multi byte UTF-8 characters
Using a RandomAccessFile you can easily read a file in chunks from the end to the beginning, and reverse each of the chunks.
Here's a simple example:
import java.io.FileWriter;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.stream.IntStream;
class Test {
private static final int BUF_SIZE = 10;
private static final int FILE_LINE_COUNT = 105;
public static void main(String[] args) throws Exception {
// create a large file
try (FileWriter fw = new FileWriter("largeFile.txt")) {
IntStream.range(1, FILE_LINE_COUNT).mapToObj(Integer::toString).forEach(s -> {
try {
fw.write(s + "\n");
} catch (IOException e) {
throw new RuntimeException(e);
}
});
}
// reverse the file
try (RandomAccessFile raf = new RandomAccessFile("largeFile.txt", "r")) {
long size = raf.length();
byte[] buf = new byte[BUF_SIZE];
for (long i = size - BUF_SIZE; i > -BUF_SIZE; i -= BUF_SIZE) {
long offset = Math.max(0, i);
long readSize = Math.min(i + BUF_SIZE, BUF_SIZE);
raf.seek(offset);
raf.read(buf, 0, (int) readSize);
for (int j = (int) readSize - 1; j >= 0; j--) {
System.out.print((char) buf[j]);
}
}
}
}
}
This uses a very small file and very small chunks so that you can test it easily. Increase those constants to see it work on a larger scale.
The input file contains newlines to make it easy to read the output, but the reversal doesn't depend on the file "having lines".
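Casting each byte to a char breaks multi-byte UTF-8 characters. One way to handle them while still reading backwards is to use the fact that UTF-8 continuation bytes always have the bit pattern 10xxxxxx: walking backwards, collect bytes until a non-continuation (lead) byte is found, then decode that group as one character. The sketch below assumes well-formed UTF-8 input; the class name Utf8Reverse and its reverse method are hypothetical names for illustration. It reverses by code point, so combining marks are not kept with their base characters.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;

class Utf8Reverse {
    // Appends the file's characters to 'out' in reverse order, reading
    // 'chunkSize' bytes at a time from the end of the file.
    static void reverse(Path file, Appendable out, int chunkSize) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            byte[] chunk = new byte[chunkSize];
            byte[] pending = new byte[4]; // bytes of one character, right-aligned
            int pendingLen = 0;
            long end = raf.length();
            while (end > 0) {
                int len = (int) Math.min(chunkSize, end);
                long start = end - len;
                raf.seek(start);
                raf.readFully(chunk, 0, len);
                for (int i = len - 1; i >= 0; i--) {
                    byte b = chunk[i];
                    pendingLen++;
                    pending[4 - pendingLen] = b;
                    boolean continuation = (b & 0xC0) == 0x80;
                    // A lead byte completes the character; the length cap
                    // only guards against malformed input.
                    if (!continuation || pendingLen == 4) {
                        out.append(new String(pending, 4 - pendingLen, pendingLen,
                                StandardCharsets.UTF_8));
                        pendingLen = 0;
                    }
                }
                end = start;
            }
            if (pendingLen > 0) { // leftover bytes from malformed input
                out.append(new String(pending, 4 - pendingLen, pendingLen,
                        StandardCharsets.UTF_8));
            }
        }
    }
}
```

Because the pending bytes survive across chunk boundaries, a character split between two chunks is still decoded as a whole.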

How to get the new content from the file

Scenario:
1. Create the fromX.txt and toY.txt files (content will be appended to fromX.txt by other logic).
2. Check fromX.txt every second for new additions; if there are any, write them to toY.txt.
How do I get just the new content from the fromX.txt file?
I have tried implementing it by counting number of lines and looking for any change in it.
public static int countLines(String filename) throws IOException {
InputStream is = new BufferedInputStream(new FileInputStream(filename));
try {
byte[] c = new byte[1024];
int count = 0;
int readChars = 0;
boolean empty = true;
while ((readChars = is.read(c)) != -1) {
empty = false;
for (int i = 0; i < readChars; ++i) {
if (c[i] == '\n') {
++count;
}
}
}
return (count == 0 && !empty) ? 1 : count;
} finally {
is.close();
}
}

You implement it like this:
Open the file using RandomAccessFile
Seek to where the end-of-file was last time. (If this is the first time, seek to the start of the file.)
Read until you reach the new end-of-file.
Record where the end-of-file is.
Close the RandomAccessFile
Record the position as a byte offset from the start of the file, and use the same value for seeking.
You can modify the above to reuse the RandomAccessFile object rather than opening / closing it each time.
UPDATE - The javadocs for RandomAccessFile are here. Look for the seek and getFilePointer methods.
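The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not a definitive implementation: the class name FileTail and its methods are hypothetical, the file is reopened on every poll, and the amount appended between two polls is assumed to fit in an int-sized array. Treating a shrunken file as "start over" is also an assumption.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class FileTail {
    private final Path source;
    private long lastPosition = 0; // byte offset where the previous read stopped

    FileTail(Path source) {
        this.source = source;
    }

    // Returns the bytes appended since the previous call (empty if none).
    byte[] readNew() throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(source.toFile(), "r")) {
            long end = raf.length();
            if (end < lastPosition) {
                lastPosition = 0; // assumption: the file was truncated, so start over
            }
            byte[] fresh = new byte[(int) (end - lastPosition)];
            raf.seek(lastPosition);
            raf.readFully(fresh);
            lastPosition = raf.getFilePointer(); // remember the new end-of-file
            return fresh;
        }
    }

    // Appends any new bytes to the target file and returns how many were copied.
    int copyNewTo(Path target) throws IOException {
        byte[] fresh = readNew();
        if (fresh.length > 0) {
            Files.write(target, fresh,
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        }
        return fresh.length;
    }
}
```

Calling copyNewTo once per second, for example from a ScheduledExecutorService, would cover the polling part of the scenario.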

Servlet getContentLength() returns > 0 but getInputStream().available() returns 0 [duplicate]

How do I read an entire InputStream into a byte array?

You can use Apache Commons IO to handle this and similar tasks.
The IOUtils type has a static method to read an InputStream and return a byte[].
InputStream is;
byte[] bytes = IOUtils.toByteArray(is);
Internally this creates a ByteArrayOutputStream and copies the bytes to the output, then calls toByteArray(). It handles large files by copying the bytes in blocks of 4KiB.

You need to read each byte from your InputStream and write it to a ByteArrayOutputStream.
You can then retrieve the underlying byte array by calling toByteArray():
InputStream is = ...
ByteArrayOutputStream buffer = new ByteArrayOutputStream();
int nRead;
byte[] data = new byte[16384];
while ((nRead = is.read(data, 0, data.length)) != -1) {
buffer.write(data, 0, nRead);
}
return buffer.toByteArray();

Finally, after twenty years, there’s a simple solution without the need for a 3rd party library, thanks to Java 9:
InputStream is;
…
byte[] array = is.readAllBytes();
Note also the convenience methods readNBytes(byte[] b, int off, int len) and transferTo(OutputStream) addressing recurring needs.
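As a small illustration of readNBytes, here is a sketch that reads a fixed-size prefix (for example a header) and leaves the rest of the stream for a later readAllBytes() call. It requires Java 9 or newer; the class name HeaderThenBody and the method readPrefix are hypothetical names used only for this example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

class HeaderThenBody {
    // Reads up to 'max' bytes. Unlike a single read(), readNBytes keeps
    // reading until 'max' bytes have arrived or the stream ends, and
    // returns the number of bytes actually read.
    static byte[] readPrefix(InputStream in, int max) throws IOException {
        byte[] buf = new byte[max];
        int n = in.readNBytes(buf, 0, max);
        return n == max ? buf : Arrays.copyOf(buf, n);
    }
}
```

After readPrefix returns, in.readAllBytes() yields whatever remains in the stream.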

Use vanilla Java's DataInputStream and its readFully Method (exists since at least Java 1.4):
...
byte[] bytes = new byte[(int) file.length()];
DataInputStream dis = new DataInputStream(new FileInputStream(file));
dis.readFully(bytes);
...
There are some other flavors of this method, but I use this all the time for this use case.

If you happen to use Google Guava, it'll be as simple as using ByteStreams:
byte[] bytes = ByteStreams.toByteArray(inputStream);

Safe solution (close streams correctly):
Java 9 and newer:
final byte[] bytes;
try (inputStream) {
bytes = inputStream.readAllBytes();
}
Java 8 and older:
public static byte[] readAllBytes(InputStream inputStream) throws IOException {
final int bufLen = 4 * 0x400; // 4KB
byte[] buf = new byte[bufLen];
int readLen;
IOException exception = null;
try {
try (ByteArrayOutputStream outputStream = new ByteArrayOutputStream()) {
while ((readLen = inputStream.read(buf, 0, bufLen)) != -1)
outputStream.write(buf, 0, readLen);
return outputStream.toByteArray();
}
} catch (IOException e) {
exception = e;
throw e;
} finally {
if (exception == null) inputStream.close();
else try {
inputStream.close();
} catch (IOException e) {
exception.addSuppressed(e);
}
}
}
Kotlin (when Java 9+ isn't accessible):
@Throws(IOException::class)
fun InputStream.readAllBytes(): ByteArray {
val bufLen = 4 * 0x400 // 4KB
val buf = ByteArray(bufLen)
var readLen: Int = 0
ByteArrayOutputStream().use { o ->
this.use { i ->
while (i.read(buf, 0, bufLen).also { readLen = it } != -1)
o.write(buf, 0, readLen)
}
return o.toByteArray()
}
}
To avoid nested use see here.
Scala (when Java 9+ isn't accessible) (By @Joan. Thx):
def readAllBytes(inputStream: InputStream): Array[Byte] =
Stream.continually(inputStream.read).takeWhile(_ != -1).map(_.toByte).toArray

As always, also Spring framework (spring-core since 3.2.2) has something for you: StreamUtils.copyToByteArray()

public static byte[] getBytesFromInputStream(InputStream is) throws IOException {
ByteArrayOutputStream os = new ByteArrayOutputStream();
byte[] buffer = new byte[0xFFFF];
for (int len = is.read(buffer); len != -1; len = is.read(buffer)) {
os.write(buffer, 0, len);
}
return os.toByteArray();
}

In case someone is still looking for a solution without dependencies, and you have a file:
DataInputStream
byte[] data = new byte[(int) file.length()];
DataInputStream dis = new DataInputStream(new FileInputStream(file));
dis.readFully(data);
dis.close();
ByteArrayOutputStream
InputStream is = new FileInputStream(file);
ByteArrayOutputStream buffer = new ByteArrayOutputStream();
int nRead;
byte[] data = new byte[(int) file.length()];
while ((nRead = is.read(data, 0, data.length)) != -1) {
buffer.write(data, 0, nRead);
}
RandomAccessFile
RandomAccessFile raf = new RandomAccessFile(file, "r");
byte[] data = new byte[(int) raf.length()];
raf.readFully(data);

Do you really need the image as a byte[]? What exactly do you expect in the byte[] - the complete content of an image file, encoded in whatever format the image file is in, or RGB pixel values?
Other answers here show you how to read a file into a byte[]. Your byte[] will contain the exact contents of the file, and you'd need to decode that to do anything with the image data.
Java's standard API for reading (and writing) images is the ImageIO API, which you can find in the package javax.imageio. You can read in an image from a file with just a single line of code:
BufferedImage image = ImageIO.read(new File("image.jpg"));
This will give you a BufferedImage, not a byte[]. To get at the image data, you can call getRaster() on the BufferedImage. This will give you a Raster object, which has methods to access the pixel data (it has several getPixel() / getPixels() methods).
Lookup the API documentation for javax.imageio.ImageIO, java.awt.image.BufferedImage, java.awt.image.Raster etc.
ImageIO supports a number of image formats by default: JPEG, PNG, BMP, WBMP and GIF. It's possible to add support for more formats (you'd need a plug-in that implements the ImageIO service provider interface).
See also the following tutorial: Working with Images

If you don't want to use the Apache commons-io library, this snippet is taken from the sun.misc.IOUtils class. It's nearly twice as fast as the common implementation using ByteBuffers:
public static byte[] readFully(InputStream is, int length, boolean readAll)
throws IOException {
byte[] output = {};
if (length == -1) length = Integer.MAX_VALUE;
int pos = 0;
while (pos < length) {
int bytesToRead;
if (pos >= output.length) { // Only expand when there's no room
bytesToRead = Math.min(length - pos, output.length + 1024);
if (output.length < pos + bytesToRead) {
output = Arrays.copyOf(output, pos + bytesToRead);
}
} else {
bytesToRead = output.length - pos;
}
int cc = is.read(output, pos, bytesToRead);
if (cc < 0) {
if (readAll && length != Integer.MAX_VALUE) {
throw new EOFException("Detect premature EOF");
} else {
if (output.length != pos) {
output = Arrays.copyOf(output, pos);
}
break;
}
}
pos += cc;
}
return output;
}

ByteArrayOutputStream out = new ByteArrayOutputStream();
byte[] buffer = new byte[1024];
while (true) {
int r = in.read(buffer);
if (r == -1) break;
out.write(buffer, 0, r);
}
byte[] ret = out.toByteArray();

@Adamski: You can avoid buffer entirely.
Code copied from http://www.exampledepot.com/egs/java.io/File2ByteArray.html (Yes, it is very verbose, but needs half the size of memory as the other solution.)
// Returns the contents of the file in a byte array.
public static byte[] getBytesFromFile(File file) throws IOException {
InputStream is = new FileInputStream(file);
// Get the size of the file
long length = file.length();
// You cannot create an array using a long type.
// It needs to be an int type.
// Before converting to an int type, check
// to ensure that file is not larger than Integer.MAX_VALUE.
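if (length > Integer.MAX_VALUE) {
// File is too large to fit in a single byte array
is.close();
throw new IOException("File is too large: " + file.getName());
}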
// Create the byte array to hold the data
byte[] bytes = new byte[(int)length];
// Read in the bytes
int offset = 0;
int numRead = 0;
while (offset < bytes.length
&& (numRead=is.read(bytes, offset, bytes.length-offset)) >= 0) {
offset += numRead;
}
// Ensure all the bytes have been read in
if (offset < bytes.length) {
throw new IOException("Could not completely read file "+file.getName());
}
// Close the input stream and return bytes
is.close();
return bytes;
}

InputStream in = ...;
ByteArrayOutputStream bos = new ByteArrayOutputStream();
int next = in.read();
while (next > -1) {
bos.write(next);
next = in.read();
}
bos.flush();
byte[] result = bos.toByteArray();
bos.close();

Java 9 finally gives you a nice method:
InputStream in = ...;
ByteArrayOutputStream bos = new ByteArrayOutputStream();
in.transferTo( bos );
byte[] bytes = bos.toByteArray();

We are seeing some delay for a few AWS transactions while converting an S3 object to a byte array.
Note: the S3 object is a PDF document (max size is 3 MB).
We are using option #1 (org.apache.commons.io.IOUtils) to convert the S3 object to a byte array. We have noticed that the AWS SDK provides its own built-in IOUtils method for the same conversion. Please confirm which is the best way to convert the S3 object to a byte array to avoid the delay.
Option #1:
import org.apache.commons.io.IOUtils;
is = s3object.getObjectContent();
content =IOUtils.toByteArray(is);
Option #2:
import com.amazonaws.util.IOUtils;
is = s3object.getObjectContent();
content =IOUtils.toByteArray(is);
Also let me know if there is any better way to convert the S3 object to a byte array.

I know it's too late but here I think is cleaner solution that's more readable...
/**
* method converts {@link InputStream} Object into byte[] array.
*
* @param stream the {@link InputStream} Object.
* @return the byte[] array representation of received {@link InputStream} Object.
* @throws IOException if an error occurs.
*/
public static byte[] streamToByteArray(InputStream stream) throws IOException {
byte[] buffer = new byte[1024];
ByteArrayOutputStream os = new ByteArrayOutputStream();
int line = 0;
// read bytes from stream, and store them in buffer
while ((line = stream.read(buffer)) != -1) {
// Writes bytes from byte array (buffer) into output stream.
os.write(buffer, 0, line);
}
stream.close();
os.flush();
os.close();
return os.toByteArray();
}

I tried to edit @numan's answer with a fix for writing garbage data but the edit was rejected. While this short piece of code is nothing brilliant, I can't see any other better answer. Here's what makes most sense to me:
ByteArrayOutputStream out = new ByteArrayOutputStream();
byte[] buffer = new byte[1024]; // you can configure the buffer size
int length;
while ((length = in.read(buffer)) != -1) out.write(buffer, 0, length); //copy streams
in.close(); // call this in a finally block
byte[] result = out.toByteArray();
btw ByteArrayOutputStream need not be closed. try/finally constructs omitted for readability

See the InputStream.available() documentation:
It is particularly important to realize that you must not use this
method to size a container and assume that you can read the entirety
of the stream without needing to resize the container. Such callers
should probably write everything they read to a ByteArrayOutputStream
and convert that to a byte array. Alternatively, if you're reading
from a file, File.length returns the current length of the file
(though assuming the file's length can't change may be incorrect,
reading a file is inherently racy).

Wrap it in a DataInputStream. If that is off the table for some reason, just use read to hammer on it until it gives you a -1 or the entire block you asked for.
public int readFully(InputStream in, byte[] data) throws IOException {
int offset = 0;
int bytesRead;
boolean read = false;
while ((bytesRead = in.read(data, offset, data.length - offset)) != -1) {
read = true;
offset += bytesRead;
if (offset >= data.length) {
break;
}
}
return (read) ? offset : -1;
}

Java 8 way (thanks to BufferedReader and Adam Bien)
private static byte[] readFully(InputStream input) throws IOException {
try (BufferedReader buffer = new BufferedReader(new InputStreamReader(input))) {
return buffer.lines().collect(Collectors.joining("\n")).getBytes(<charset_can_be_specified>);
}
}
Note that this solution wipes carriage return ('\r') and can be inappropriate.

Another case: getting the correct byte array from a stream after sending a request to a server and waiting for the response.
/**
* Begin setup TCP connection to PC app
* to open integrate connection between mobile app and pc app (or mobile app)
*/
mSocket = new Socket(IP, port);
// mSocket.setSoTimeout(30000);
DataOutputStream mDos = new DataOutputStream(mSocket.getOutputStream());
String str = "MobileRequest#" + params[0] + "#<EOF>";
mDos.write(str.getBytes());
try {
Thread.sleep(1000);
} catch (InterruptedException e) {
e.printStackTrace();
}
/* Since data are accepted as byte, all of them will be collected in the
following byte array which initialised with accepted data length. */
DataInputStream mDis = new DataInputStream(mSocket.getInputStream());
byte[] data = new byte[mDis.available()];
// Collecting data into byte array
for (int i = 0; i < data.length; i++)
data[i] = mDis.readByte();
// Converting collected data in byte array into String.
String RESPONSE = new String(data);

You're doing an extra copy if you use ByteArrayOutputStream. If you know the length of the stream before you start reading it (e.g. the InputStream is actually a FileInputStream, and you can call file.length() on the file, or the InputStream is a zipfile entry InputStream, and you can call zipEntry.getSize()), then it's far better to write directly into the byte[] array -- it uses half the memory, and saves time.
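// Read the file contents into a byte[] array
byte[] buf = new byte[inputStreamLength];
int bytesRead = 0;
// A single read() may return fewer bytes than requested, so loop until
// the array is full or the stream ends
while (bytesRead < buf.length) {
int n = inputStream.read(buf, bytesRead, buf.length - bytesRead);
if (n == -1) break;
bytesRead += n;
}
// If needed: for safety, truncate the array if the file may somehow get
// truncated during the read operation
byte[] contents = bytesRead == inputStreamLength ? buf
: Arrays.copyOf(buf, bytesRead);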
N.B. the last line above deals with files getting truncated while the stream is being read, if you need to handle that possibility, but if the file gets longer while the stream is being read, the contents in the byte[] array will not be lengthened to include the new file content, the array will simply be truncated to the old length inputStreamLength.

I use this.
public static byte[] toByteArray(InputStream is) throws IOException {
ByteArrayOutputStream output = new ByteArrayOutputStream();
try {
byte[] b = new byte[4096];
int n = 0;
while ((n = is.read(b)) != -1) {
output.write(b, 0, n);
}
return output.toByteArray();
} finally {
output.close();
}
}

This is my copy-paste version:
@SuppressWarnings("empty-statement")
public static byte[] inputStreamToByte(InputStream is) throws IOException {
if (is == null) {
return null;
}
// Define a size if you have an idea of it.
ByteArrayOutputStream r = new ByteArrayOutputStream(2048);
byte[] read = new byte[512]; // Your buffer size.
for (int i; -1 != (i = is.read(read)); r.write(read, 0, i));
is.close();
return r.toByteArray();
}

Java 7 and later:
import sun.misc.IOUtils;
...
InputStream in = ...;
byte[] buf = IOUtils.readFully(in, -1, false);

You can try Cactoos:
byte[] array = new BytesOf(stream).bytes();

Here is an optimized version, that tries to avoid copying data bytes as much as possible:
private static byte[] loadStream (InputStream stream) throws IOException {
int available = stream.available();
int expectedSize = available > 0 ? available : -1;
return loadStream(stream, expectedSize);
}
private static byte[] loadStream (InputStream stream, int expectedSize) throws IOException {
int basicBufferSize = 0x4000;
int initialBufferSize = (expectedSize >= 0) ? expectedSize : basicBufferSize;
byte[] buf = new byte[initialBufferSize];
int pos = 0;
while (true) {
if (pos == buf.length) {
int readAhead = -1;
if (pos == expectedSize) {
readAhead = stream.read(); // test whether EOF is at expectedSize
if (readAhead == -1) {
return buf;
}
}
int newBufferSize = Math.max(2 * buf.length, basicBufferSize);
buf = Arrays.copyOf(buf, newBufferSize);
if (readAhead != -1) {
buf[pos++] = (byte)readAhead;
}
}
int len = stream.read(buf, pos, buf.length - pos);
if (len < 0) {
return Arrays.copyOf(buf, pos);
}
pos += len;
}
}

Solution in Kotlin (will work in Java too, of course), which includes both cases of when you know the size or not:
fun InputStream.readBytesWithSize(size: Long): ByteArray? {
return when {
size < 0L -> this.readBytes()
size == 0L -> ByteArray(0)
size > Int.MAX_VALUE -> null
else -> {
val sizeInt = size.toInt()
val result = ByteArray(sizeInt)
readBytesIntoByteArray(result, sizeInt)
result
}
}
}
fun InputStream.readBytesIntoByteArray(byteArray: ByteArray,bytesToRead:Int=byteArray.size) {
var offset = 0
while (true) {
val read = this.read(byteArray, offset, bytesToRead - offset)
if (read == -1)
break
offset += read
if (offset >= bytesToRead)
break
}
}
If you know the size, this avoids briefly using double the memory that the other solutions need. Those have to read the entire stream to the end into a growable buffer and then copy it into a byte array of the final size (similar to an ArrayList that you convert to a plain array).
So, if you are on Android, for example, and you got some Uri to handle, you can try to get the size using this:
fun getStreamLengthFromUri(context: Context, uri: Uri): Long {
context.contentResolver.query(uri, arrayOf(MediaStore.MediaColumns.SIZE), null, null, null)?.use {
if (!it.moveToNext())
return@use
val fileSize = it.getLong(it.getColumnIndex(MediaStore.MediaColumns.SIZE))
if (fileSize > 0)
return fileSize
}
//if you wish, you can also get the file-path from the uri here, and then try to get its size, using this: https://stackoverflow.com/a/61835665/878126
FileUtilEx.getFilePathFromUri(context, uri, false)?.use {
val file = it.file
val fileSize = file.length()
if (fileSize > 0)
return fileSize
}
context.contentResolver.openInputStream(uri)?.use { inputStream ->
if (inputStream is FileInputStream)
return inputStream.channel.size()
else {
var bytesCount = 0L
while (true) {
val available = inputStream.available()
if (available == 0)
break
val skip = inputStream.skip(available.toLong())
if (skip < 0)
break
bytesCount += skip
}
if (bytesCount > 0L)
return bytesCount
}
}
return -1L
}

You can use the cactoos library, which provides reusable object-oriented Java components.
OOP is emphasized by this library, so no static methods, NULLs, and so on, only real objects and their contracts (interfaces).
A simple operation like reading InputStream, can be performed like that
final InputStream input = ...;
final Bytes bytes = new BytesOf(input);
final byte[] array = bytes.asBytes();
Assert.assertArrayEquals(
array,
new byte[]{65, 66, 67}
);
Having a dedicated type Bytes for working with data structure byte[] enables us to use OOP tactics for solving tasks at hand.
Something that a procedural "utility" method will forbid us to do.
For example, say you need to encode the bytes you've read from this InputStream to Base64.
In this case you will use Decorator pattern and wrap Bytes object within implementation for Base64.
cactoos already provides such implementation:
final Bytes encoded = new BytesBase64(
new BytesOf(
new InputStreamOf("XYZ")
)
);
Assert.assertEquals(new TextOf(encoded).asString(), "WFla");
You can decode them in the same manner, by using Decorator pattern
final Bytes decoded = new Base64Bytes(
new BytesBase64(
new BytesOf(
new InputStreamOf("XYZ")
)
)
);
Assert.assertEquals(new TextOf(decoded).asString(), "XYZ");
Whatever your task is you will be able to create own implementation of Bytes to solve it.

How to read bytes from a file, whereas the result byte[] is exactly as long

I want the result byte[] to be exactly as long as the file content. How can I achieve that?
I am thinking of ArrayList<Byte>, but it does not seem to be efficient.

Personally I'd go the Guava route:
File f = ...
byte[] content = Files.toByteArray(f);
Apache Commons IO has similar utility methods if you want.
If that's not what you want, it's not too hard to write that code yourself:
public static byte[] toByteArray(File f) throws IOException {
if (f.length() > Integer.MAX_VALUE) {
throw new IllegalArgumentException(f + " is too large!");
}
int length = (int) f.length();
byte[] content = new byte[length];
int off = 0;
int read = 0;
InputStream in = new FileInputStream(f);
try {
while (off < length) {
read = in.read(content, off, (length - off));
if (read == -1) {
break; // end of file reached early; do not add -1 to the offset
}
off += read;
}
if (off != length) {
// file size has shrunken since check, handle appropriately
} else if (in.read() != -1) {
// file size has grown since check, handle appropriately
}
return content;
} finally {
in.close();
}
}

I'm pretty sure File#length() doesn't iterate through the file. (Assuming this is what you meant by length()) Each OS provides efficient enough mechanisms to find file size without reading it all.

Allocate an adequate buffer (if necessary, resize it while reading) and keep track of how many bytes read. After finishing reading, create a new array with the exact length and copy the content of the reading buffer.
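A minimal sketch of that approach, assuming the data fits in memory and stays well below the maximum array size (the doubling step does not guard against overflow). The class name ExactRead and the method readExactly are hypothetical names for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

class ExactRead {
    static byte[] readExactly(InputStream in) throws IOException {
        byte[] buf = new byte[8192]; // initial guess; grows when full
        int used = 0;                // how many bytes of buf hold real data
        int n;
        while ((n = in.read(buf, used, buf.length - used)) != -1) {
            used += n;
            if (used == buf.length) {
                // Buffer is full: double it so the next read has room.
                buf = Arrays.copyOf(buf, buf.length * 2);
            }
        }
        // Final copy trims the buffer to exactly the number of bytes read.
        return Arrays.copyOf(buf, used);
    }
}
```

The returned array is exactly as long as the content, at the cost of one final copy.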

Small function that you can use :
// Returns the contents of the file in a byte array.
public static byte[] getBytesFromFile(File file) throws IOException {
InputStream is = new FileInputStream(file);
// Get the size of the file
long length = file.length();
// You cannot create an array using a long type.
// It needs to be an int type.
// Before converting to an int type, check
// to ensure that file is not larger than Integer.MAX_VALUE.
if (length > Integer.MAX_VALUE) {
throw new RuntimeException(file.getName() + " is too large");
}
// Create the byte array to hold the data
byte[] bytes = new byte[(int)length];
// Read in the bytes
int offset = 0;
int numRead = 0;
while (offset < bytes.length
&& (numRead=is.read(bytes, offset, bytes.length-offset)) >= 0) {
offset += numRead;
}
// Ensure all the bytes have been read in
if (offset < bytes.length) {
throw new IOException("Could not completely read file "+file.getName());
}
// Close the input stream and return bytes
is.close();
return bytes;
}

Storing a large binary file

Are there any ways to store a large binary file, say 50 MB, as ten files of 5 MB each?
Thanks.
Are there any special classes for doing this?

Use a FileInputStream to read the file and a FileOutputStream to write it.
Here is a simple (incomplete) example (missing error handling, writes 1K chunks):
public static int split(File file, String name, int size) throws IOException {
FileInputStream input = new FileInputStream(file);
FileOutputStream output = null;
byte[] buffer = new byte[1024];
int count = 0;
boolean done = false;
while (!done) {
output = new FileOutputStream(String.format(name, count));
count += 1;
for (int written = 0; written < size; ) {
int len = input.read(buffer);
if (len == -1) {
done = true;
break;
}
output.write(buffer, 0, len);
written += len;
}
output.close();
}
input.close();
return count;
}
and called like
File input = new File("C:/data/in.gz");
String name = "C:/data/in.gz.part%02d"; // %02d will be replaced by segment number
split(input, name, 5000 * 1024);

Yes, there are. Basically just count the bytes which you write to file and if it hits a certain limit, then stop writing, reset the counter and continue writing to another file using a certain filename pattern so that you can correlate the files with each other. You can do that in a loop. You can learn here how to write to files in Java and for the remnant just apply the primary school maths.
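The counting approach described above might look like the following sketch. It is an illustration, not a definitive implementation: the class name FileSplitter is hypothetical, and the part names are assumed to come from a String.format pattern such as "in.gz.part%02d". Each part holds at most partSize bytes, and a new part file is opened only when there is data to put in it.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class FileSplitter {
    // Splits 'source' into numbered part files and returns the part count.
    static int split(Path source, String pattern, long partSize) throws IOException {
        if (partSize <= 0) {
            throw new IllegalArgumentException("partSize must be positive");
        }
        byte[] buffer = new byte[8192];
        int parts = 0;
        try (InputStream in = Files.newInputStream(source)) {
            OutputStream out = null;
            long written = 0; // bytes written to the current part
            try {
                int n;
                while ((n = in.read(buffer)) != -1) {
                    int off = 0;
                    while (off < n) {
                        if (out == null) { // start the next part lazily
                            out = Files.newOutputStream(
                                    Paths.get(String.format(pattern, parts++)));
                            written = 0;
                        }
                        int chunk = (int) Math.min(n - off, partSize - written);
                        out.write(buffer, off, chunk);
                        off += chunk;
                        written += chunk;
                        if (written == partSize) { // limit hit: close this part
                            out.close();
                            out = null;
                        }
                    }
                }
            } finally {
                if (out != null) {
                    out.close();
                }
            }
        }
        return parts;
    }
}
```

Concatenating the parts in index order reproduces the original file.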
