My current situation is: I have to read a file and put the contents into an InputStream. Afterwards I need to place the contents of the InputStream into a byte array, which (as far as I know) requires the size of the InputStream. Any ideas?
As requested, I will show the input stream that I am creating from an uploaded file
InputStream uploadedStream = null;
FileItemFactory factory = new DiskFileItemFactory();
ServletFileUpload upload = new ServletFileUpload(factory);
java.util.List items = upload.parseRequest(request);
java.util.Iterator iter = items.iterator();
while (iter.hasNext()) {
    FileItem item = (FileItem) iter.next();
    if (!item.isFormField()) {
        uploadedStream = item.getInputStream();
        //CHANGE uploadedStreambyte = item.get()
    }
}
The request is an HttpServletRequest object; FileItemFactory and ServletFileUpload come from the Apache Commons FileUpload package.
This is a REALLY old thread, but it was still the first thing to pop up when I googled the issue. So I just wanted to add this:
InputStream inputStream = conn.getInputStream();
int length = inputStream.available();
Worked for me. And MUCH simpler than the other answers here.
Warning: this solution does not provide reliable results regarding the total size of a stream. Excerpt from the JavaDoc:
Note that while some implementations of {@code InputStream} will return
the total number of bytes in the stream, many will not.
I would read into a ByteArrayOutputStream and then call toByteArray() to get the resultant byte array. You don't need to define the size in advance (although knowing it is possibly an optimisation; in many cases you won't).
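A minimal sketch of that approach (the helper name and the 8 KB buffer size are my own choices, not from the original post):

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Copy the stream into a ByteArrayOutputStream, then ask the resulting array for its length.
public static byte[] readFully(InputStream in) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buffer = new byte[8192];              // 8 KB is just a common default
    int read;
    while ((read = in.read(buffer)) != -1) {     // read() returns -1 at end of stream
        out.write(buffer, 0, read);              // write only the bytes actually read
    }
    return out.toByteArray();                    // result.length is the total size in bytes
}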
You can't determine the amount of data in a stream without reading it; you can, however, ask for the size of a file:
http://java.sun.com/javase/6/docs/api/java/io/File.html#length()
If that isn't possible, you can write the bytes you read from the input stream to a ByteArrayOutputStream which will grow as required.
I just wanted to add that Apache Commons IO has stream support utilities to perform the copy. (Btw, what do you mean by placing the file into an InputStream? Can you show us your code?)
Edit:
Okay, what do you want to do with the contents of the item?
There is an item.get() which returns the entire thing in a byte array.
Edit2
item.getSize() will return the uploaded file size.
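For instance, inside the upload loop from the question it could look roughly like this (a sketch, assuming the uploaded item fits comfortably in memory):

if (!item.isFormField()) {
    byte[] uploadedBytes = item.get();   // the whole upload as a byte array
    long size = item.getSize();          // the uploaded file size in bytes
    // size == uploadedBytes.length, so no separate InputStream length is needed
}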
For InputStream:
org.apache.commons.io.IOUtils.toByteArray(inputStream).length
For Optional<MultipartFile>:
Stream.of(multipartFile.get()).mapToLong(file -> file.getSize()).findFirst().getAsLong()
You can get the size of an InputStream using getBytes(inputStream) from Utils.java; check the following link:
Get Bytes from Inputstream
The function below should work with any InputStream. As other answers have hinted, you can't reliably find the length of an InputStream without reading through it; but unlike other answers, you should not attempt to hold the entire stream in memory in a ByteArrayOutputStream, and there is no need to. Where another API can give you the size directly, for example the File API for files, prefer that; otherwise the function below simply reads the stream in chunks and counts the bytes, discarding the data as it goes.
public static int length(InputStream inputStream, int chunkSize) throws IOException {
    byte[] buffer = new byte[chunkSize];
    int chunkBytesRead = 0;
    int length = 0;
    while ((chunkBytesRead = inputStream.read(buffer)) != -1) {
        length += chunkBytesRead;
    }
    return length;
}
Choose a reasonable value for chunkSize appropriate to the kind of InputStream; e.g. when reading from disk, too small a chunkSize would not be efficient.
When explicitly dealing with a ByteArrayInputStream, then contrary to some of the comments on this page, you can use the .available() method to get the size. You just have to call it before you start reading from the stream.
From the JavaDocs:
Returns the number of remaining bytes that can be read (or skipped
over) from this input stream. The value returned is count - pos, which
is the number of bytes remaining to be read from the input buffer.
https://docs.oracle.com/javase/7/docs/api/java/io/ByteArrayInputStream.html#available()
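A tiny illustration (the sample bytes are my own; assumes the usual java.io and java.nio.charset imports):

byte[] data = "hello".getBytes(StandardCharsets.UTF_8);
ByteArrayInputStream bais = new ByteArrayInputStream(data);
int size = bais.available();   // 5 here: for a ByteArrayInputStream this is the full remaining byte count, as long as nothing has been read yet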
If you need to stream the data to another object that doesn't allow you to directly determine the size (e.g. javax.imageio.ImageIO), then you can wrap your InputStream within a CountingInputStream (Apache Commons IO), and then read the size:
CountingInputStream countingInputStream = new CountingInputStream(inputStream);
// ... process the whole stream ...
int size = countingInputStream.getCount();
If you know that your InputStream is a FileInputStream or a ByteArrayInputStream, you can use a little reflection to get at the stream size without reading the entire contents. Here's an example method:
static long getInputLength(InputStream inputStream) {
    try {
        if (inputStream instanceof FilterInputStream) {
            FilterInputStream filtered = (FilterInputStream) inputStream;
            Field field = FilterInputStream.class.getDeclaredField("in");
            field.setAccessible(true);
            InputStream internal = (InputStream) field.get(filtered);
            return getInputLength(internal);
        } else if (inputStream instanceof ByteArrayInputStream) {
            ByteArrayInputStream wrapper = (ByteArrayInputStream) inputStream;
            Field field = ByteArrayInputStream.class.getDeclaredField("buf");
            field.setAccessible(true);
            byte[] buffer = (byte[]) field.get(wrapper);
            return buffer.length;
        } else if (inputStream instanceof FileInputStream) {
            FileInputStream fileStream = (FileInputStream) inputStream;
            return fileStream.getChannel().size();
        }
    } catch (NoSuchFieldException | IllegalAccessException | IOException exception) {
        // Ignore all errors and just return -1.
    }
    return -1;
}
This could be extended to support additional input streams, I am sure.
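A possible call site, just to illustrate (the file name is made up, and the enclosing method is assumed to declare throws IOException):

try (InputStream in = new BufferedInputStream(new FileInputStream("upload.bin"))) {
    long length = getInputLength(in);   // unwraps the BufferedInputStream, then asks the file channel
    if (length < 0) {
        // unknown stream type: fall back to counting bytes while reading
    }
}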
Add to your pom.xml:
<dependency>
    <groupId>commons-io</groupId>
    <artifactId>commons-io</artifactId>
    <version>2.5</version>
</dependency>
Use it to get the content length of an InputStream file:
IOUtils.toByteArray(file).length
Use this method; you just have to pass the InputStream:
public String readIt(InputStream is) throws IOException {
    if (is != null) {
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(is, StandardCharsets.UTF_8), 8);
        StringBuilder sb = new StringBuilder();
        String line;
        while ((line = reader.readLine()) != null) {
            sb.append(line).append("\n");
        }
        is.close();
        return sb.toString();
    }
    return "error: ";
}
try {
    InputStream connInputStream = connection.getInputStream();
    int size = connInputStream.available();
} catch (IOException e) {
    e.printStackTrace();
}
int available()
Returns an estimate of the number of bytes that can be read (or skipped over) from this input stream without blocking by the next invocation of a method for this input stream. The next invocation might be the same thread or another thread. A single read or skip of this many bytes will not block, but may read or skip fewer bytes.
InputStream - Android SDK | Android Developers
Related
I have a servlet which receives via POST method a large JSON string (> 10,000 characters).
If I read the content of the request like this:
try (Reader reader = new InputStreamReader(new BufferedInputStream(request.getInputStream()), StandardCharsets.UTF_8))
{
    char[] buffer = new char[request.getContentLength()];
    reader.read(buffer);
    System.out.println(new String(buffer));
}
I don't get the entire content! The buffer size is correct, but the length of the created string is not.
But if I do it like this:
try (BufferedInputStream input = new BufferedInputStream(request.getInputStream()))
{
    byte[] buffer = new byte[request.getContentLength()];
    input.read(buffer);
    System.out.println(new String(buffer, StandardCharsets.UTF_8));
}
it works perfectly.
So where am I wrong in the first case?
The way you are using InputStreamReader is not really the intended way. A call to read is not guaranteed to fill the buffer or to read any specific number of characters (it depends on the stream you are reading from), which is why the return value of the method is the number of characters that were actually read. You need to keep reading from the stream and buffering until it indicates it has reached the end (it will return -1 as the number of characters read). Some good examples of how to do this can be found here: Convert InputStream to byte array in Java
But since you want this as character data, you should probably use request.getReader() instead. A good example of how to do this can be found here: Retrieving JSON Object Literal from HttpServletRequest
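A minimal sketch of the getReader() route (the buffer size is my own choice; the servlet method already declares IOException):

StringBuilder json = new StringBuilder();
try (BufferedReader reader = request.getReader()) {    // the container decodes using the request's character encoding
    char[] buf = new char[8192];
    int read;
    while ((read = reader.read(buf)) != -1) {          // keep reading until -1, appending only what was read
        json.append(buf, 0, read);
    }
}
System.out.println(json);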
I'm working with Amazon S3 and would like to upload an InputStream (which requires counting the number of bytes I'm sending).
public static boolean uploadDataTo(String bucketName, String key, String fileName, InputStream stream) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buffer = new byte[1];
    try {
        while (stream.read(buffer) != -1) { // copy from stream to buffer
            out.write(buffer);              // copy from buffer to byte array
        }
    } catch (Exception e) {
        UtilityFunctionsObject.writeLogException(null, e);
    }
    byte[] result = out.toByteArray(); // we needed all that just for length
    int bytes = result.length;
    IO.close(out);
    InputStream uploadStream = new ByteArrayInputStream(result);
    ....
}
I was told copying a byte at a time is highly inefficient (obviously so for large files). I can't make the buffer any bigger because that would add padding to the ByteArrayOutputStream, which I can't strip out. I could strip it out of result, but how can I do that safely? If I use an 8 KB buffer, can I just strip out the rightmost bytes where buffer[i] == 0? Or is there a better way to do this? Thanks!
Using Java 7 on Windows 7 x64.
You can do something like this:
int read = 0;
while ((read = stream.read(buffer)) != -1) {
    out.write(buffer, 0, read);
}
stream.read() returns the number of bytes that have been written into buffer. You can pass this information to the len parameter of out.write(). So you make sure that you write only the bytes you have read from the stream.
Use Jakarta Commons IOUtils to copy from the input stream to the byte array stream in a single step. It will use an efficient buffer, and not write any excess bytes.
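Roughly, assuming Commons IO (IOUtils) is on the classpath and the surrounding code handles IOException:

byte[] result = IOUtils.toByteArray(stream);                 // buffered copy in one call, no trailing zero padding
int bytes = result.length;                                   // the length Amazon S3 needs
InputStream uploadStream = new ByteArrayInputStream(result);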
If you want efficiency you could process the file as you read it. I would replace uploadStream with stream and remove the rest of the code.
If you need some buffering you can do this
InputStream uploadStream = new BufferedInputStream(stream);
the default buffer size is 8 KB.
If you want the length use File.length();
long length = new File(fileName).length();
I'm relatively new to Java and I'm attempting to write a simple Android app. I have a large text file with about 3500 lines in the assets folder of my application and I need to read it into a string. I found a good example of how to do this, but I have a question about why the byte array is initialized to 1024. Wouldn't I want to initialize it to the length of my text file? Also, wouldn't I want to use char, not byte? Here is the code:
private void populateArray() {
    AssetManager assetManager = getAssets();
    InputStream inputStream = null;
    try {
        inputStream = assetManager.open("3500LineTextFile.txt");
    } catch (IOException e) {
        Log.e("IOException populateArray", e.getMessage());
    }
    String s = readTextFile(inputStream);
    // Add more code here to populate array from string
}

private String readTextFile(InputStream inputStream) {
    ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
    byte buf[] = new byte[1024];
    int len;
    try {
        while ((len = inputStream.read(buf)) != -1) {
            outputStream.write(buf, 0, len);
        }
        outputStream.close();
        inputStream.close();
    } catch (IOException e) {
        Log.e("IOException readTextFile", e.getMessage());
    }
    return outputStream.toString();
}
EDIT: Based on your suggestions, I tried this approach. Is it any better? Thanks.
private void populateArray() {
    AssetManager assetManager = getAssets();
    InputStream inputStream = null;
    InputStreamReader iStreamReader = null;
    try {
        inputStream = assetManager.open("List.txt");
        iStreamReader = new InputStreamReader(inputStream, "UTF-8");
    } catch (IOException e) {
        Log.e("IOException populateArray", e.getMessage());
    }
    String string = readTextFile(iStreamReader);
    // more code here
}

private String readTextFile(InputStreamReader inputStreamReader) {
    StringBuilder sb = new StringBuilder();
    char buf[] = new char[2048];
    int read;
    try {
        do {
            read = inputStreamReader.read(buf, 0, buf.length);
            if (read > 0) {
                sb.append(buf, 0, read);
            }
        } while (read >= 0);
    } catch (IOException e) {
        Log.e("IOException readTextFile", e.getMessage());
    }
    return sb.toString();
}
This example is not good at all. It's full of bad practices (hiding exceptions, not closing streams in finally blocks, not specifying an explicit encoding, etc.). It uses a 1024-byte buffer because it doesn't have any way of knowing the length of the input stream.
Read the Java IO tutorial to learn how to read text from a file.
You are reading the file into a buffer of 1024 bytes.
Then those 1024 bytes are written to outputStream.
This process repeats until the whole file has been read into the outputStream.
As JB Nizet mentioned the example is full of bad practices.
Wouldn't I want to initialize it to the length of my text file? Also, wouldn't I want to use char, not byte?
Yes, and yes ... and as other answers have said, you've picked an example with a number of errors in it.
However, there is a theoretical problem doing both; i.e. setting the buffer length to the file length and using a character buffer rather than a byte buffer. The problem is that the file size is measured in bytes, but the size of the buffer needs to be measured in characters. This is normally fine, but it is theoretically possible that you will need more characters than the file size in bytes; e.g. if the input file used a 6 bit character set and packed 4 characters into 3 bytes.
To read from a file I usually use a Scanner and a StringBuilder.
Scanner scan = new Scanner(new BufferedInputStream(new FileInputStream(filename)), "UTF-8");
StringBuilder sb = new StringBuilder();
while (scan.hasNextLine()) {
    sb.append(scan.nextLine());
    sb.append("\n");
}
scan.close();
return sb.toString();
Try to throw your exceptions instead of swallowing them. The caller must know there was a problem reading your file.
Edit: Also note that using a BufferedInputStream is important. Otherwise it will read byte by byte, which can be slow.
I would like to know how I can read a file byte by byte and then perform some operation every n bytes.
for example:
Say I have a file of 50 bytes; I want to divide it into blocks of n bytes each. Then each block is sent to a function for some operations to be done on those bytes. The blocks are to be created during the read process and sent to the function when a block reaches n bytes, so that I don't use much memory storing all the blocks.
I want the output of the function to be written/appended on a new file.
This is what I've got so far for reading, yet I don't know if it is right:
fc = new JFileChooser();
File f = fc.getSelectedFile();
FileInputStream in = new FileInputStream(f);
byte[] b = new byte[16];
in.read(b);
I haven't done anything yet for the write process.
You're on the right lines. Consider wrapping your FileInputStream with a BufferedInputStream, which improves I/O efficiency by reading the file in larger chunks.
The next step is to check the number of bytes read (returned by your call to read) and to hand off the array to the processing function, as sketched below. Obviously you'll need to pass the number of bytes read to this method too, in case the array was only partially populated.
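Roughly like this (processBlock is a stand-in for your own processing function, not an existing API):

try (InputStream in = new BufferedInputStream(new FileInputStream(f))) {
    byte[] block = new byte[n];
    int read;
    while ((read = in.read(block)) != -1) {   // read() reports how much of the array is valid
        processBlock(block, read);            // hand over the block together with its actual length
    }
}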
So far your code looks OK. For reading binary files (as opposed to text files) you should indeed use FileInputStream (for reading text files, you should use a Reader, such as FileReader).
Note that you should check the return value from in.read(b);, because it might read less than 16 bytes if there are less than 16 bytes left at the end of the file.
Of course, you should add a loop to the program that keeps reading blocks of bytes until you reach the end of the file.
To write data to a binary file, use FileOutputStream. That class has a constructor that you can pass a flag to indicate that you want to append to an existing file:
FileOutputStream out = new FileOutputStream("output.bin", true);
Also, don't forget to call close() on the FileInputStream and FileOutputStream when you are done.
See the Java API documentation, especially the classes in the java.io package.
I believe that this will work:
final int blockSize = // some calculation
byte[] block = new byte[blockSize];
InputStream is = new FileInputStream(f);
try {
int ret = -1;
do {
int bytesRead = 0;
while (bytesRead < blockSize) {
ret = is.read(block, bytesRead, blockSize - bytesRead);
if (ret < 0)
break; // no more data
bytesRead += ret;
}
myFunction(block, bytesRead);
} while (0 <= ret);
}
finally {
is.close();
}
This code will call myFunction with blockSize bytes for all but possibly the last invocation.
It's a start.
You should check what read() returns. It can read fewer bytes than the size of the array, and also indicate that the end of the file is reached.
Obviously, you need to read() in a loop...
It might be a good idea to reuse the array, but that requires that the part that reads the array copies what it needs, rather than just keeping a reference to the array.
I think this is what you might need:
void readFile(String path, int n) {
    try {
        File f = new File(path);
        FileInputStream fis = new FileInputStream(f);
        int ret = 0;
        byte[] array = new byte[n];
        while (ret > -1) {
            ret = fis.read(array);
            if (ret > 0) {            // only hand off bytes that were actually read
                doSomething(array, ret);
            }
        }
        fis.close();
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
}
I have this piece of code which I'm hoping will be able to tell me how much data I have downloaded (and soon put it in a progress bar), and then parse the results through my SAX parser. If I comment out basically everything above the //xr.parse(new InputSource(request.getInputStream())); line and swap the xr.parse calls over, it works fine. But at the moment, my SAX parser tells me I have nothing. Is it something to do with the is.read(buffer) section?
Also, just as a note, request is an HttpURLConnection with various signatures.
/* Input stream to read from our connection */
InputStream is = request.getInputStream();
/* We make a 2 KB buffer to accelerate the download, instead of reading the file one byte at a time */
byte[] buffer = new byte[2048];
/* How many bytes have we already downloaded? */
int totBytes, bytes, sumBytes = 0;
totBytes = request.getContentLength();
while (true) {
    /* How many bytes we got */
    bytes = is.read(buffer);
    /* If no more bytes, we're done with the download */
    if (bytes <= 0) break;
    sumBytes += bytes;
    Log.v("XML", sumBytes + " of " + totBytes + " " + ((float) sumBytes / (float) totBytes) * 100 + "% done");
}
/* Parse the xml-data from our URL. */
// OLD, and works if I comment out all of the above
//xr.parse(new InputSource(request.getInputStream()));
xr.parse(new InputSource(is));
/* Parsing has finished. */
Can anyone help me at all??
Kind regards,
Andy
'I could only find a way to do that with bytes, unless you know another method?'
But you haven't found a method. You've just written code that doesn't work. And you don't want to save the input to a String either. You want to count the bytes while you're parsing them. Otherwise you're just adding latency, i.e. wasting time and slowing everything down. For an example of how to do it right, see javax.swing.ProgressMonitorInputStream. You don't have to use that, but you certainly do have to use a FilterInputStream of some sort, probably one you write yourself, that is wrapped around the request input stream and passed to the parser.
Your while loop is consuming the input stream and leaving nothing for the parser to read.
For what you're trying to do, you might want to look into implementing a FilterInputStream subclass wrapping the input stream.
You are building an InputStream on top of another InputStream that has already consumed its data.
If you want to avoid reading single bytes at a time, you could use a BufferedInputStream or something like a BufferedReader.
In any case it's better to obtain the whole content before parsing it, unless you need to parse it on the fly.
If you really want to keep going the way you are, you could create two piped streams:
PipedOutputStream pipeOut = new PipedOutputStream();
PipedInputStream pipeIn = new PipedInputStream();
pipeIn.connect(pipeOut);
pipeOut.write(yourBytes);
xr.parse(pipeIn);
Streams in Java, as their name suggests, don't have a definite size, and you don't know when they will end; so once you have read from an InputStream you cannot pass that same InputStream to another object, because its data has already been consumed.
If you want to do both things (downloading and parsing) together, you have to hook into the data received from the HttpURLConnection. You should:
first know the length of the data being downloaded, which can be obtained from the HttpURLConnection headers;
then use a custom InputStream that decorates the connection's stream (this is how streams work in Java, see here), updating the progress bar as data passes through.
Something like:
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

class MyInputStream extends FilterInputStream
{
    private final int total;

    MyInputStream(InputStream is, int total)
    {
        super(is);          // delegate the actual reading to the wrapped stream
        this.total = total; // expected length, e.g. from the Content-Length header
    }

    public int read() throws IOException
    {
        stepProgress(1);
        return super.read();
    }

    public int read(byte[] b) throws IOException
    {
        int l = super.read(b);
        stepProgress(l);
        return l;
    }

    public int read(byte[] b, int off, int len) throws IOException
    {
        int l = super.read(b, off, len);
        stepProgress(l);
        return l;
    }

    private void stepProgress(int bytes)
    {
        // update the progress bar here using bytes and total
    }
}
InputStream mis = new MyInputStream(request.getInputStream(), length);
..
xr.parse(mis);
You can save your data to a file, and then read it back.
InputStream is = request.getInputStream();
if (is != null) {
    File file = new File(path, "someFile.txt");
    FileOutputStream os = new FileOutputStream(file);
    byte[] buffer = new byte[2048];
    int bufferLength = 0;
    while ((bufferLength = is.read(buffer)) > 0)
        os.write(buffer, 0, bufferLength);
    os.flush();
    os.close();

    XmlPullParserFactory factory = XmlPullParserFactory.newInstance();
    factory.setNamespaceAware(true);
    XmlPullParser xpp = factory.newPullParser();
    FileInputStream fis = new FileInputStream(file);
    xpp.setInput(new InputStreamReader(fis));
}