I have a servlet that receives a large JSON string (> 10,000 characters) via POST.
If I read the content of the request like this:
try (Reader reader = new InputStreamReader(new BufferedInputStream(request.getInputStream()), StandardCharsets.UTF_8))
{
    char[] buffer = new char[request.getContentLength()];
    reader.read(buffer);
    System.out.println(new String(buffer));
}
I don't get the entire content! The buffer size is correct, but the length of the created string is not.
But if I do it like this:
try (BufferedInputStream input = new BufferedInputStream(request.getInputStream()))
{
    byte[] buffer = new byte[request.getContentLength()];
    input.read(buffer);
    System.out.println(new String(buffer, StandardCharsets.UTF_8));
}
it works perfectly.
So where am I going wrong in the first case?
This is not quite how InputStreamReader is meant to be used. A single call to read is not guaranteed to fill the buffer (it depends on the stream you are reading from), which is why the method returns the number of characters that were actually read. You need to keep reading and accumulating until read returns -1, which signals the end of the stream. The same applies to InputStream.read in your second snippet: it happened to return everything in one call, but that is not guaranteed either. Some good examples of how to do this can be found here: Convert InputStream to byte array in Java
But since you want this as character data, you should probably use request.getReader() instead. A good example of how to do this can be found here: Retrieving JSON Object Literal from HttpServletRequest
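For illustration, here is a minimal sketch of the read loop described above, applied to a Reader. The class and method names (ReaderDrain, readAll) are made up for this example; in the servlet you would pass in request.getReader(), assuming the request's character encoding is set correctly.

```java
import java.io.IOException;
import java.io.Reader;

final class ReaderDrain {
    /**
     * Reads everything from the reader, however many read() calls that takes.
     * Each call may return fewer characters than the buffer can hold.
     */
    static String readAll(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buffer = new char[4096];
        int n;
        // read() returns the number of chars actually read, or -1 at end of stream
        while ((n = reader.read(buffer)) != -1) {
            sb.append(buffer, 0, n); // append only the part that was filled
        }
        return sb.toString();
    }
}
```

The important detail is that only the first n characters of the buffer are appended after each call, so a short read never leaves stale or empty characters in the result.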
I have a file named mark_method.json containing ABCDE in it and I am reading this file using the InputStream class.
By definition, the InputStream class reads an input stream of bytes. How does this work? I don't have bytes in the file, but characters?
I am trying to understand how a stream reading bytes is reading characters from the file?
public class MarkDemo {
    public static void main(String args[]) throws Exception {
        InputStream is = null;
        try {
            is = new FileInputStream("C:\\Users\\s\\Documents\\EB\\EB_02_09_2020_with_page_number_and_quote_number\\Old_images\\mark_method.json");
        }
        catch(Exception e) {
            e.printStackTrace();
        } finally {
            if(is != null) {
                is.close();
            }
        }
    }
}
All data on a computer is stored as bits and bytes, and the content of a file is no exception.
Programs convert these bytes into a human-readable form, which is why we see mark_method.json as containing characters rather than bytes.
A character is a byte (at least in ASCII).
Each byte value from 0 to 127 maps to a character. For example, 0 is the null character, 0x0A is \n, 0x0D is \r, 0x41 is 'A', and so on.
The stream implementation only knows bytes. It doesn't know that the char 0x2709 is ✉. In UTF-16, for instance, it only sees two bytes: 0x27 and 0x09 (in UTF-8 the same character is three bytes: 0xE2, 0x9C and 0x89).
Only the text editor interprets the bytes and shows the matching symbol or letter.
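To make the byte/character relationship concrete, here is a small sketch that prints the bytes behind a character under different encodings. The class name EnvelopeBytes and the hex helper are made up for this example.

```java
import java.nio.charset.StandardCharsets;

final class EnvelopeBytes {
    /** Formats bytes as space-separated upper-case hex, e.g. "E2 9C 89". */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF)); // mask to treat the byte as unsigned
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String envelope = "\u2709"; // the envelope character U+2709
        System.out.println(hex(envelope.getBytes(StandardCharsets.UTF_16BE))); // 27 09
        System.out.println(hex(envelope.getBytes(StandardCharsets.UTF_8)));    // E2 9C 89
        System.out.println(hex("A".getBytes(StandardCharsets.US_ASCII)));      // 41
    }
}
```

The same character turns into different bytes depending on the encoding, which is why the reader of a file has to know (or assume) which encoding was used to write it.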
I think what you are actually asking here is how to convert the bytes you read from a file using FileInputStream into a Java String object that you can print and manipulate.
FileInputStream does not have any read methods for directly producing a String object so if that is what you want, you need to further manipulate the input you get.
Option one is to use the Scanner class:
Scanner scanner = new Scanner(is);
String word = scanner.next();
Another option is to read the bytes and use the constructor of the String class that works with byte array:
byte [] bytes = new byte[10];
is.read(bytes);
String text = new String(bytes);
Note that for simplicity I just assumed you can read 10 valid bytes from your file.
In real code you would need some logic to make sure you are reading correct number of bytes.
Also, if your file is not stored using your system default character set, you will need to specify the character set as a parameter to the String constructor.
Finally, you can use another wrapper class, BufferedReader that has a readLine function which takes care of all the logic needed to read bytes representing a line of text from a file and return them in a String.
BufferedReader in = new BufferedReader(new FileReader("foo.in"));
String line = in.readLine();
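As a possible variation on the BufferedReader approach, here is a sketch that names the character set explicitly instead of relying on the system default. The class and method names (Utf8Lines, readLines) are made up for this example, and UTF-8 is an assumption about how the file was written.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

final class Utf8Lines {
    /** Reads all lines of a file, decoding its bytes as UTF-8 (assumed encoding). */
    static List<String> readLines(Path file) throws IOException {
        List<String> lines = new ArrayList<>();
        // try-with-resources closes the reader even if readLine() throws
        try (BufferedReader in = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = in.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }
}
```

Files.newBufferedReader reports malformed input with an exception rather than silently substituting characters, so a wrong guess about the encoding tends to show up early.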
Say we have a file like so:
one
two
three
(but this file got encrypted)
My crypto method returns the whole file in memory, as a byte[] type.
I know byte arrays don't have a concept of "lines", that's something a Scanner (for example) could have.
I would like to traverse each line, convert it to string and perform my operation on it but I don't know
how to:
Find lines in a byte array
Slice the original byte array to "lines" (I would convert those slices to String, to send to my other methods)
Correctly traverse a byte array, where each iteration is a new "line"
Also: do I need to consider the different OS the file might have been composed in? I know that there is some difference between new lines in Windows and Linux and I don't want my method to work only with one format.
Edit: Following some tips from answers here, I was able to write some code that gets the job done. I still wonder if this code is worthy of keeping or I am doing something that can fail in the future:
byte[] decryptedBytes = doMyCrypto(fileName, accessKey);
ByteArrayInputStream byteArrInStrm = new ByteArrayInputStream(decryptedBytes);
InputStreamReader inStrmReader = new InputStreamReader(byteArrInStrm);
BufferedReader buffReader = new BufferedReader(inStrmReader);
String delimRegex = ",";
String line;
String[] values = null;
while ((line = buffReader.readLine()) != null) {
    values = line.split(delimRegex);
    if (Objects.equals(values[0], tableKey)) {
        return values;
    }
}
System.out.println(String.format("No entry with key %s in %s", tableKey, fileName));
return values;
In particular, I was advised to explicitly set the encoding but I was unable to see exactly where?
If you want to stream this, I'd suggest:
Create a ByteArrayInputStream to wrap your array
Wrap that in an InputStreamReader to convert binary data to text - I suggest you explicitly specify the text encoding being used
Create a BufferedReader around that to read a line at a time
Then you can just use:
String line;
while ((line = bufferedReader.readLine()) != null)
{
    // Do something with the line
}
BufferedReader handles line breaks from all operating systems.
So something like this:
byte[] data = ...;
ByteArrayInputStream stream = new ByteArrayInputStream(data);
InputStreamReader streamReader = new InputStreamReader(stream, StandardCharsets.UTF_8);
BufferedReader bufferedReader = new BufferedReader(streamReader);
String line;
while ((line = bufferedReader.readLine()) != null)
{
    System.out.println(line);
}
Note that in general you'd want to use try-with-resources blocks for the streams and readers - but it doesn't matter in this case, because it's just in memory.
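To illustrate the point about line breaks, here is a small sketch that wraps the same three classes in a helper and feeds it mixed line endings. The names LineSplitter and lines are made up for this example.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

final class LineSplitter {
    /** Decodes the bytes with the given charset and splits them into lines. */
    static List<String> lines(byte[] data, Charset charset) throws IOException {
        List<String> result = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(data), charset))) {
            String line;
            // readLine() treats \n, \r and \r\n as line terminators and strips them
            while ((line = reader.readLine()) != null) {
                result.add(line);
            }
        }
        return result;
    }
}
```

Calling lines("one\ntwo\r\nthree\rfour".getBytes(StandardCharsets.UTF_8), StandardCharsets.UTF_8) yields the four lines one, two, three and four, regardless of which convention each line ending follows.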
As Scott states, I would like to see what you came up with so we can help you alter it to fit your needs.
Regarding your last comment about the OS: if you want to support multiple file types, you should consider writing several functions that support those different file extensions. As far as I know, you do need to specify which file and what type of file you are reading in your code.
So you know you can use AsynchronousFileChannel to read an entire file to a String:
AsynchronousFileChannel fileChannel = AsynchronousFileChannel.open(filePath, StandardOpenOption.READ);
long len = fileChannel.size();
ReadAttachment readAttachment = new ReadAttachment();
readAttachment.byteBuffer = ByteBuffer.allocate((int) len);
readAttachment.asynchronousChannel = fileChannel;
CompletionHandler<Integer, ReadAttachment> completionHandler = new CompletionHandler<Integer, ReadAttachment>() {
    @Override
    public void completed(Integer result, ReadAttachment attachment) {
        String content = new String(attachment.byteBuffer.array());
        try {
            attachment.asynchronousChannel.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
        completeCallback.accept(content);
    }
    @Override
    public void failed(Throwable exc, ReadAttachment attachment) {
        exc.printStackTrace();
        exceptionError(errorCallback, completeCallback, String.format("error while reading file [%s]: %s", path, exc.getMessage()));
    }
};
fileChannel.read(
        readAttachment.byteBuffer,
        0,
        readAttachment,
        completionHandler);
Suppose that now I don't want to allocate one ByteBuffer for the entire file, but read line by line instead. I could use a fixed-size ByteBuffer and call read repeatedly, copying and appending to a StringBuffer until I reach a newline. My only concern is this: because the file's encoding may use multiple bytes per character (UTF something), a read may end in the middle of a character. How can I make sure that I'm converting the right bytes into strings and not messing up the encoding?
UPDATE: answer is in the comment of the selected answer, but it basically points to CharsetDecoder.
If you have a clear ASCII separator, which you do in your case (\n), you don't need to worry about incomplete characters: in UTF-8 (and other ASCII-compatible encodings) this character maps to a single byte and vice versa, and in UTF-8 that byte never occurs inside a multi-byte sequence.
So just search for the '\n' byte in your input, then read and convert everything before it into a String. Loop until no more newlines are found, then compact the buffer and reuse it for the next read. If you don't find a newline, you'll have to allocate a bigger buffer, copy over the content of the old one, and only then call read again.
EDIT: As mentioned in the comment, you can pass the ByteBuffer to a CharsetDecoder on the fly and translate it into a CharBuffer (then append to a StringBuilder or whatever your preferred solution is).
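Here is a rough sketch of the CharsetDecoder idea. It feeds a byte array to the decoder in fixed-size chunks to simulate successive reads; bytes of a character that is split across two chunks stay in the input buffer until the rest arrives. The names ChunkedDecoder and decodeInChunks are made up, UTF-8 is an assumed encoding, and in the asynchronous case each loop iteration would correspond to one completed read.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class ChunkedDecoder {
    static String decodeInChunks(byte[] data, int chunkSize) {
        if (chunkSize < 1) throw new IllegalArgumentException("chunkSize must be positive");
        if (data.length == 0) return "";
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPLACE)       // substitute U+FFFD instead of stalling
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        // A few spare bytes hold the leftover prefix of a character split across chunks.
        ByteBuffer in = ByteBuffer.allocate(chunkSize + 4);
        CharBuffer out = CharBuffer.allocate(chunkSize + 4);
        StringBuilder sb = new StringBuilder();
        int pos = 0;
        while (pos < data.length) {
            int n = Math.min(chunkSize, data.length - pos);
            in.put(data, pos, n);          // simulate one read() filling the buffer
            pos += n;
            in.flip();
            boolean endOfInput = (pos == data.length);
            decoder.decode(in, out, endOfInput); // an incomplete trailing sequence is left in 'in'
            out.flip();
            sb.append(out);
            out.clear();
            in.compact();                  // keep undecoded bytes for the next round
        }
        decoder.flush(out);
        out.flip();
        sb.append(out);
        return sb.toString();
    }
}
```

With this approach the chunk size no longer has to line up with character boundaries, and line splitting can then be done on the decoded characters rather than on raw bytes.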
Try Scanner:
Scanner sc = new Scanner(FileChannel.open(filePath, StandardOpenOption.READ));
String line = sc.nextLine();
FileChannel is an InterruptibleChannel.
My app reads a text file line by line and records the offset of each line until the end of the file. The offset changes when readLine is first executed, but it does not change any more after that. What is wrong with my code? I use RandomAccessFile instead of FileInputStream because seek() is faster than skip() when the file is big.
String buffer;
long offset;
RandomAccessFile raf = new RandomAccessFile("data.txt", "r");
FileInputStream is = new FileInputStream(raf.getFD());
BufferedReader br = new BufferedReader(new InputStreamReader(is));
while (true) {
    offset = raf.getFilePointer(); // offset remains the same after 1st readLine. why?
    if ((buffer = br.readLine()) == null) // buffer has correct value.
        return;
    ………………………………
}
Because BufferedReader is buffered. It reads a chunk of data into its buffer the first time and then serves lines from that buffer until it needs more data, so the file pointer only moves when the buffer is refilled.
If you want to use a smaller buffer for testing purposes, try new BufferedReader(new InputStreamReader(is), 1000); or something. Your pointer should now increment by 1000 occasionally.
If you want your counter to work properly, you can do one of two things. Either you can count the characters you are receiving and then do some converting to byte lengths which you can use to make your own counter or you can use a FileReader with no buffering which will increment the counter the way you expect.
Update: It seems FileReader does some buffering behind the scenes as well. I'd use something like new CountingInputStream(new BufferedInputStream(new FileInputStream(raf.getFD()))), loop through the data in byte form, and manually identify line endings while collecting the bytes into a String. Not the prettiest way, but the only way I can think of given Reader's internal buffering. CountingInputStream is provided by Apache Commons IO, and it exposes the number of bytes read through getCount() or getByteCount().
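The byte-level approach can also be sketched without any library: walk the stream byte by byte, keep a running position, and note the position at which each line starts. The names LineOffsets and lineOffsets are made up for this example. It assumes lines end in \n or \r\n (a lone \r is not treated as a terminator), and that the caller passes in a buffered stream so the single-byte reads stay cheap.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

final class LineOffsets {
    /** Returns the byte offset at which each line of the stream begins. */
    static List<Long> lineOffsets(InputStream in) throws IOException {
        List<Long> offsets = new ArrayList<>();
        long pos = 0;               // number of bytes consumed so far
        boolean atLineStart = true; // true until the first byte of a new line is seen
        int b;
        while ((b = in.read()) != -1) {
            if (atLineStart) {
                offsets.add(pos);
                atLineStart = false;
            }
            pos++;
            if (b == '\n') {        // also covers \r\n, which ends in \n
                atLineStart = true;
            }
        }
        return offsets;
    }
}
```

Because the offsets are counted in bytes rather than characters, they can be handed directly to RandomAccessFile.seek() later, even when the file contains multi-byte characters.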
My current situation is: I have to read a file and put the contents into InputStream. Afterwards I need to place the contents of the InputStream into a byte array which requires (as far as I know) the size of the InputStream. Any ideas?
As requested, I will show the input stream that I am creating from an uploaded file
InputStream uploadedStream = null;
FileItemFactory factory = new DiskFileItemFactory();
ServletFileUpload upload = new ServletFileUpload(factory);
java.util.List items = upload.parseRequest(request);
java.util.Iterator iter = items.iterator();
while (iter.hasNext()) {
    FileItem item = (FileItem) iter.next();
    if (!item.isFormField()) {
        uploadedStream = item.getInputStream();
        //CHANGE uploadedStreambyte = item.get()
    }
}
The request is an HttpServletRequest object; FileItemFactory and ServletFileUpload are from the Apache Commons FileUpload package.
This is a REALLY old thread, but it was still the first thing to pop up when I googled the issue. So I just wanted to add this:
InputStream inputStream = conn.getInputStream();
int length = inputStream.available();
Worked for me. And MUCH simpler than the other answers here.
Warning: This solution does not provide reliable results regarding the total size of a stream. Excerpt from the JavaDoc:
Note that while some implementations of {@code InputStream} will return
the total number of bytes in the stream, many will not.
I would read into a ByteArrayOutputStream and then call toByteArray() to get the resultant byte array. You don't need to define the size in advance (although it's possibly an optimisation if you know it; in many cases you won't).
You can't determine the amount of data in a stream without reading it; you can, however, ask for the size of a file:
http://java.sun.com/javase/6/docs/api/java/io/File.html#length()
If that isn't possible, you can write the bytes you read from the input stream to a ByteArrayOutputStream which will grow as required.
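A minimal sketch of the ByteArrayOutputStream approach follows. The names StreamBytes and toByteArray are made up for this example. On Java 9 and later, InputStream.readAllBytes() does the same job in a single call.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

final class StreamBytes {
    /** Copies the whole stream into a byte array without knowing its size up front. */
    static byte[] toByteArray(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream(); // grows as required
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n); // write only the bytes actually read
        }
        return out.toByteArray();
    }
}
```

The length of the returned array is then the size of the stream, at the cost of holding the entire content in memory.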
I just wanted to add, Apache Commons IO has stream support utilities to perform the copy. (Btw, what do you mean by placing the file into an inputstream? Can you show us your code?)
Edit:
Okay, what do you want to do with the contents of the item?
There is an item.get() which returns the entire thing in a byte array.
Edit2
item.getSize() will return the uploaded file size.
For InputStream:
org.apache.commons.io.IOUtils.toByteArray(inputStream).length
For Optional<MultipartFile>:
Stream.of(multipartFile.get()).mapToLong(file -> file.getSize()).findFirst().getAsLong()
You can get the size of an InputStream using getBytes(inputStream) from Utils.java; see the following link:
Get Bytes from Inputstream
The function below should work with any InputStream. As other answers have hinted, you can't reliably find the length of an InputStream without reading through it, but unlike other answers, you should not attempt to hold the entire stream in memory by reading into a ByteArrayOutputStream, nor is there any reason to. Instead of reading the stream, you should ideally rely on other API for stream sizes, for example getting the size of a file using the File API.
public static long length(InputStream inputStream, int chunkSize) throws IOException {
    byte[] buffer = new byte[chunkSize];
    int chunkBytesRead;
    long length = 0; // long, so streams larger than 2 GB do not overflow the counter
    while ((chunkBytesRead = inputStream.read(buffer)) != -1) {
        length += chunkBytesRead;
    }
    return length;
}
Choose a reasonable value for chunkSize appropriate to the kind of InputStream. E.g. reading from disk it would not be efficient to have too small a value for chunkSize.
When explicitly dealing with a ByteArrayInputStream then contrary to some of the comments on this page you can use the .available() function to get the size. Just have to do it before you start reading from it.
From the JavaDocs:
Returns the number of remaining bytes that can be read (or skipped
over) from this input stream. The value returned is count - pos, which
is the number of bytes remaining to be read from the input buffer.
https://docs.oracle.com/javase/7/docs/api/java/io/ByteArrayInputStream.html#available()
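A quick sketch of that behaviour, with made-up names (AvailableDemo, beforeAndAfter): available() reports the full length before anything is read and shrinks as bytes are consumed. This only holds for ByteArrayInputStream specifically, not for streams in general.

```java
import java.io.ByteArrayInputStream;

final class AvailableDemo {
    /** Returns {available before reading, available after reading two bytes}. */
    static int[] beforeAndAfter(byte[] data) {
        ByteArrayInputStream in = new ByteArrayInputStream(data);
        int before = in.available(); // count - pos, i.e. the whole array at this point
        in.read();
        in.read();
        return new int[] { before, in.available() };
    }
}
```

For a five-byte array this returns 5 and 3, which is why the call has to happen before any reading starts if the total size is what you are after.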
If you need to stream the data to another object that doesn't allow you to directly determine the size (e.g. javax.imageio.ImageIO), then you can wrap your InputStream within a CountingInputStream (Apache Commons IO), and then read the size:
CountingInputStream countingInputStream = new CountingInputStream(inputStream);
// ... process the whole stream ...
int size = countingInputStream.getCount();
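If you would rather not add a dependency just for this, a counting wrapper is short enough to sketch by hand. The class below (ByteCountingStream, a made-up name) is not the Commons IO implementation, only an illustration of the same idea; it ignores mark/reset, so the count would be off if the consumer rewinds the stream.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

final class ByteCountingStream extends FilterInputStream {
    private long count; // bytes handed to the consumer so far

    ByteCountingStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) count++;
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        // read(byte[]) in FilterInputStream delegates here, so it is counted too
        int n = super.read(buf, off, len);
        if (n > 0) count += n;
        return n;
    }

    @Override
    public long skip(long n) throws IOException {
        long skipped = super.skip(n);
        if (skipped > 0) count += skipped;
        return skipped;
    }

    long getCount() {
        return count;
    }
}
```

Usage mirrors the Commons IO class: wrap the original stream, let the consumer process it to the end, then call getCount().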
If you know that your InputStream is a FileInputStream or a ByteArrayInputStream, you can use a little reflection to get at the stream size without reading the entire contents. Here's an example method:
static long getInputLength(InputStream inputStream) {
    try {
        if (inputStream instanceof FilterInputStream) {
            FilterInputStream filtered = (FilterInputStream)inputStream;
            Field field = FilterInputStream.class.getDeclaredField("in");
            field.setAccessible(true);
            InputStream internal = (InputStream) field.get(filtered);
            return getInputLength(internal);
        } else if (inputStream instanceof ByteArrayInputStream) {
            ByteArrayInputStream wrapper = (ByteArrayInputStream)inputStream;
            Field field = ByteArrayInputStream.class.getDeclaredField("buf");
            field.setAccessible(true);
            byte[] buffer = (byte[])field.get(wrapper);
            return buffer.length;
        } else if (inputStream instanceof FileInputStream) {
            FileInputStream fileStream = (FileInputStream)inputStream;
            return fileStream.getChannel().size();
        }
    } catch (NoSuchFieldException | IllegalAccessException | IOException exception) {
        // Ignore all errors and just return -1.
    }
    return -1;
}
This could be extended to support additional input streams, I am sure.
Add to your pom.xml:
<dependency>
    <groupId>commons-io</groupId>
    <artifactId>commons-io</artifactId>
    <version>2.5</version>
</dependency>
Use this to get the content length (InputStream file):
IOUtils.toByteArray(file).length
Use this method; you just have to pass in the InputStream:
public String readIt(InputStream is) throws IOException {
    if (is == null) {
        return "error: ";
    }
    // try-with-resources closes the reader and the underlying stream
    try (BufferedReader reader = new BufferedReader(new InputStreamReader(is, StandardCharsets.UTF_8))) {
        StringBuilder sb = new StringBuilder();
        String line;
        while ((line = reader.readLine()) != null) {
            sb.append(line).append("\n");
        }
        return sb.toString();
    }
}
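try {
    InputStream connInputStream = connection.getInputStream();
    // Must be inside the try block, where connInputStream is in scope;
    // available() is only an estimate (see the documentation quoted below)
    int size = connInputStream.available();
} catch (IOException e) {
    e.printStackTrace();
}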
int available ()
Returns an estimate of the number of bytes that can be read (or skipped over) from this input stream without blocking by the next invocation of a method for this input stream. The next invocation might be the same thread or another thread. A single read or skip of this many bytes will not block, but may read or skip fewer bytes.
InputStream - Android SDK | Android Developers