Convert Stream to String Java/Groovy - java

I stole this snippet off the web. But it looks to be limited to 4096 bytes and is quite ugly IMO. Anyone know of a better approach? I'm actually using Groovy btw...
String streamToString(InputStream input) {
StringBuffer out = new StringBuffer();
byte[] b = new byte[4096];
for (int n; (n = input.read(b)) != -1;) {
out.append(new String(b, 0, n));
}
return out.toString();
}
EDIT:
I found a better solution in Groovy:
InputStream exportTemplateStream = getClass().getClassLoader().getResourceAsStream("export.template")
assert exportTemplateStream: "[export.template stream] resource not found"
String exportTemplate = exportTemplateStream.text

Some good and fast answers. However, I think the best one is that Groovy has added a getText method to InputStream, so all I had to do was call stream.text. And good call on the 4096 comment.

For Groovy
filePath = ... //< a FilePath object
stream = filePath.read() //< InputStream object
// Specify the encoding, and get the String object
//content = stream.getText("UTF-16")
content = stream.getText("UTF-8")
The InputStream class reference
Without an encoding argument, getText() uses the platform's default encoding (for example, UTF-8).

Try IOUtils from Apache Commons:
String s = IOUtils.toString(inputStream, "UTF-8");

It's reading the input in chunks of 4096 bytes (4 KB), but the size of the resulting string is not limited, because it keeps reading more chunks and appending them to the StringBuffer.

You can do it fairly easily using the Scanner class:
String streamToString(InputStream input) {
Scanner s = new Scanner(input);
StringBuilder builder = new StringBuilder();
while (s.hasNextLine()) {
builder.append(s.nextLine() +"\n");
}
return builder.toString();
}

That snippet has a bug: if the input uses a multi-byte character encoding, there's a good chance that a single character will span two reads (and not be convertible). It also has the semi-bug that it relies on the platform's default encoding.
Instead, use Jakarta Commons IO. In particular, the version of IOUtils.toString() that takes an InputStream and applies an encoding to it.
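For anyone who cannot add Commons IO, here is a minimal standard-library sketch of the same idea: let a Reader do the decoding with an explicit charset, so that a character split across two reads is still decoded correctly. The class and method names are made up for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;

final class StreamText {
    // Decodes the whole stream with an explicit charset. The InputStreamReader
    // keeps incomplete byte sequences between reads, so a character that spans
    // two chunks is still decoded correctly.
    static String streamToString(InputStream input, Charset charset) throws IOException {
        StringBuilder out = new StringBuilder();
        char[] buf = new char[4096];
        try (Reader reader = new InputStreamReader(input, charset)) {
            int n;
            while ((n = reader.read(buf)) != -1) {
                out.append(buf, 0, n);
            }
        }
        return out.toString();
    }
}
```

Called as StreamText.streamToString(input, StandardCharsets.UTF_8). Note that this sketch closes the stream when it is done, which may or may not be what the caller wants.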

For future reviewers who have similar problems, please note that both IOUtils from Apache and Groovy's InputStream.getText() method require the stream to complete or be closed before returning. If you are working with a persistent stream, you will need to deal with the "ugly" example that Phil originally posted, or work with non-blocking IO.
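One way to handle a persistent stream, assuming the data is line-oriented (that is an assumption, not something stated in the question), is to hand each line to a callback as it arrives instead of collecting one big String. A rough sketch with made-up names:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.util.function.Consumer;

final class LineStreamer {
    // Hands each complete line to the callback as soon as it has been read,
    // instead of waiting for end-of-stream. readLine() still blocks while
    // waiting for data; it returns null only when the stream ends.
    static void forEachLine(InputStream input, Charset charset, Consumer<String> onLine)
            throws IOException {
        BufferedReader reader = new BufferedReader(new InputStreamReader(input, charset));
        String line;
        while ((line = reader.readLine()) != null) {
            onLine.accept(line);
        }
    }
}
```

The stream is deliberately not closed here, since with a long-lived connection the caller usually owns it.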

You can try something similar to this
new FileInputStream( new File("c:/tmp/file.txt") ).eachLine { println it }

Related

How to write a reader into a file using nio?

Given a Reader, a Charset, and a Path, how do I correctly and efficiently write the reader's content into a file?
The total size of the reader's content is not known in advance.
This is my current solution:
CharBuffer charBuffer = CharBuffer.allocate(1024);
try (FileChannel fc = (FileChannel) Files.newByteChannel(path, StandardOpenOption.WRITE, StandardOpenOption.CREATE_NEW)) {
while (true) {
int size = reader.read(charBuffer);
if (size < 0) break;
charBuffer.flip();
ByteBuffer bytes = charset.encode(charBuffer);
fc.write(bytes);
charBuffer.clear(); // reset position and limit so the next read can fill the whole buffer
}
}
It works but it allocates a new ByteBuffer in every loop. I could try to reuse the byte buffer, but I would actually prefer a solution that uses only one buffer in total.
Using ByteBuffer#toCharBuffer is not an option because it does not consider the charset.
I also don't like the type cast in the try-statement, is there a cleaner solution?
The simplest way to write a reader's content to a path is to use the built-in methods of Files together with Reader.transferTo (available since Java 10):
try(var out = Files.newBufferedWriter(path, charset, StandardOpenOption.WRITE, StandardOpenOption.CREATE_NEW)) {
reader.transferTo(out);
}
This does not need the CharBuffer and simplifies the logic of the code you need to write for this often needed task.
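Wrapped up as a complete method, the approach could look like the sketch below. The class name and method signature are only illustrative.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class ReaderToFile {
    // Writes everything the reader produces to a new file, encoded with the
    // given charset. CREATE_NEW makes the call fail if the file already exists.
    // Reader.transferTo requires Java 10 or later.
    static long write(Reader reader, Charset charset, Path path) throws IOException {
        try (Writer out = Files.newBufferedWriter(path, charset,
                StandardOpenOption.WRITE, StandardOpenOption.CREATE_NEW)) {
            return reader.transferTo(out);
        }
    }
}
```

The return value is the number of characters transferred, not the number of bytes written to the file.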

Using the String(bytes[]) constructor for an InputStream

I'm wondering what the objections are to using what I'll call the 'String constructor method' to convert an InputStream into a String.
Edit: added emphasis. In particular, I'm wondering why we have to mess with Streams and Buffers and Scanners and whatnot when this method seems to work fine.
private String readStream(InputStream in) {
byte[] buffer = new byte[1024];
try {
return new String(buffer, 0, in.read(buffer));
} catch (IOException e) {
Log.d(DEBUG_TAG, "Error reading input stream!");
return "";
}
}
I've seen this other helpful post and tried the methods I could:
Method 1, Apache commons, is a no-go, since I can't use and don't want libraries right now.
Method 2, The Scanner one, looks promising, but then you'd have to be able to set delimiters in the stream, which isn't always possible, right? E.g. right now I'm using an InputStream from a web API.
Method 3, the InputStreamReader in the slurp function, didn't work either - it gives me a bunch of numbers, where I'm sending a string with all types of characters, so I may be messing something up in my encoding.
But after many Google searches, I finally found the String constructor method, which is the only one that works for me.
From comments on the thread I linked, I know there are issues with encoding in the method I'm using. I've been coding for a while now and know what encodings are and why they're around. But I still lack any knowledge about what kinds of encodings are used where, and how to detect and handle them. Any resources/help on that topic would also be very appreciated!
Here is one method using only standard libraries:
use a ByteArrayOutputStream and copy all the bytes you receive in it;
wrap this ByteArrayOutputStream's bytes into a ByteBuffer;
use a CharsetDecoder to decode the ByteBuffer into a CharBuffer;
call toString() on the CharBuffer; decode() returns it with its position already at zero, so no flip or rewind is needed.
Code (note: doesn't handle closing the input):
// Step 1: read all the bytes
final ByteArrayOutputStream out = new ByteArrayOutputStream();
final byte[] buffer = new byte[8196];
int count;
while ((count = in.read(buffer)) != -1)
    out.write(buffer, 0, count);
// Step 2: wrap the array
final ByteBuffer byteBuffer = ByteBuffer.wrap(out.toByteArray());
// Step 3: decode
final CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
.onUnmappableCharacter(CodingErrorAction.REPORT)
.onMalformedInput(CodingErrorAction.REPORT);
final CharBuffer charBuffer = decoder.decode(byteBuffer);
// decode() returns a buffer that is ready to read; calling flip() here would set its limit to zero
return charBuffer.toString();
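To illustrate the main objection to the "String constructor method", here is a small sketch that contrasts a single read() with reading to end-of-stream. It assumes Java 9 or later for InputStream.readAllBytes, and the names are invented for the example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;

final class ReadAll {
    // Reads until end-of-stream, then decodes all bytes in one step, so no
    // multi-byte character can be split across chunks.
    // InputStream.readAllBytes requires Java 9 or later.
    static String readFully(InputStream in, Charset charset) throws IOException {
        return new String(in.readAllBytes(), charset);
    }

    // The "String constructor method" from the question: a single read() call,
    // which may return fewer bytes than the stream will eventually deliver.
    static String readOnce(InputStream in, Charset charset) throws IOException {
        byte[] buffer = new byte[1024];
        int n = in.read(buffer);
        return n < 0 ? "" : new String(buffer, 0, n, charset);
    }
}
```

With a network stream, readOnce typically returns only whatever has arrived so far, which is why it appears to work in small tests and then truncates larger responses. Unlike the decoder configured with CodingErrorAction.REPORT, the String constructor silently substitutes a replacement character for malformed input.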

How to load a classpath resource to an array of byte?

I know how to get the InputStream for a given classpath resource and read from it until I reach the end, but this looks like a very common problem, and I wonder if there is an API I don't know of, or a library, that would make things as simple as
byte[] data = ResourceUtils.getResourceAsBytes("/assets/myAsset.bin")
or
byte[] data = StreamUtils.readStreamToEnd(myInputStream)
for example!
Java 9 native implementation (note that a resource name passed to a ClassLoader must not start with a slash):
byte[] data = this.getClass().getClassLoader().getResourceAsStream("assets/myAsset.bin").readAllBytes();
Have a look at Google Guava's ByteStreams.toByteArray(INPUTSTREAM); this might be what you want.
Although I agree with Andrew Thompson, here is a native implementation that works since Java 7 and uses the NIO API. Note that it only works when the resource is a regular file on disk, not an entry inside a JAR:
byte[] data = Files.readAllBytes(Paths.get(this.getClass().getClassLoader().getResource("assets/myAsset.bin").toURI()));
Take a look at Apache IOUtils - it has a bunch of methods to work with streams
I usually use the following two approaches to convert Resource into byte[] array.
1 - approach
What you need is to first call getInputStream() on Resource object, and then pass that to convertStreamToByteArray method like below....
InputStream stream = resource.getInputStream();
long size = resource.getFile().length();
byte[] byteArr = convertStreamToByteArray(stream, size);
public byte[] convertStreamToByteArray(InputStream stream, long size) throws IOException {
// check to ensure that file size is not larger than Integer.MAX_VALUE.
if (size > Integer.MAX_VALUE) {
return new byte[0];
}
byte[] buffer = new byte[(int)size];
ByteArrayOutputStream os = new ByteArrayOutputStream();
int line = 0;
// read bytes from stream, and store them in buffer
while ((line = stream.read(buffer)) != -1) {
// Writes bytes from byte array (buffer) into output stream.
os.write(buffer, 0, line);
}
stream.close();
os.flush();
os.close();
return os.toByteArray();
}
2 - approach
As Konstantin V. Salikhov suggested, you could use org.apache.commons.io.IOUtils and call its IOUtils.toByteArray(stream) static method and pass to it InputStream object like this...
byte[] byteArr = IOUtils.toByteArray(stream);
Note: under the hood, toByteArray(...) checks that the size does not exceed Integer.MAX_VALUE, so you don't have to check for this yourself.
Java methods commonly accept an InputStream. In the majority of those cases, I would recommend passing the stream directly to the method of interest.
Many of those same methods will also accept an URL (e.g. obtained from getResource(String)). That can sometimes be better, since a variety of the methods will require a repositionable InputStream and there are times that the stream returned from getResourceAsStream(String) will not be repositionable.
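Pulling the suggestions in this thread together, a small helper along these lines would give the one-liner the question asks for. This is only a sketch: the class name is hypothetical, and it assumes Java 9 or later for readAllBytes.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;

final class ClasspathBytes {
    // Loads a classpath resource fully into memory. Names passed to a
    // ClassLoader are always absolute and must not start with '/'.
    // InputStream.readAllBytes requires Java 9 or later.
    static byte[] read(ClassLoader loader, String name) throws IOException {
        try (InputStream in = loader.getResourceAsStream(name)) {
            if (in == null) {
                throw new FileNotFoundException("classpath resource not found: " + name);
            }
            return in.readAllBytes();
        }
    }
}
```

The explicit null check is there because getResourceAsStream returns null for a missing resource rather than throwing, which otherwise surfaces as a NullPointerException further down.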

streams to strings: merging multiple files into a single string

I've got two text files that I want to grab as a stream and convert to a string. Ultimately, I want the two separate files to merge.
So far, I've got
//get the input stream of the files.
InputStream is =
cts.getClass().getResourceAsStream("/files/myfile.txt");
// convert the stream to string
System.out.println(cts.convertStreamToString(is));
getResourceAsStream doesn't take multiple strings as arguments. So what do I need to do? Separately convert them and merge together?
Can anyone show me a simple way to do that?
It sounds like you want to concatenate streams. You can use a SequenceInputStream to create a single stream from multiple streams. Then read the data from this single stream and use it as you need.
Here's an example:
String encoding = "UTF-8"; /* You need to know the right character encoding. */
InputStream s1 = ..., s2 = ..., s3 = ...;
Enumeration<InputStream> streams =
Collections.enumeration(Arrays.asList(s1, s2, s3));
Reader r = new InputStreamReader(new SequenceInputStream(streams), encoding);
char[] buf = new char[2048];
StringBuilder str = new StringBuilder();
while (true) {
int n = r.read(buf);
if (n < 0)
break;
str.append(buf, 0, n);
}
r.close();
String contents = str.toString();
You can utilize commons-io which has the ability to read a Stream into a String
http://commons.apache.org/io/api-release/org/apache/commons/io/IOUtils.html#toString%28java.io.InputStream%29
Off hand I can think of a couple ways
Create a StringBuilder, then convert each stream to a string and append to the stringbuilder.
Or, create a writable in-memory stream (such as a ByteArrayOutputStream), copy each input stream into it, then convert the result to a string.
Create a loop that for each file loads the text into a StringBuilder. Then once each file's data is appended, call toString() on the builder.
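The StringBuilder loop could be sketched roughly as follows. The names are invented, the encoding has to be supplied by the caller, and readAllBytes assumes Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.util.List;

final class MergeStreams {
    // Reads each stream to its end, in order, and appends the decoded text.
    // Decoding each stream separately assumes no character is split across
    // two files, which holds for complete, well-formed text files.
    static String merge(List<? extends InputStream> streams, Charset charset) throws IOException {
        StringBuilder merged = new StringBuilder();
        for (InputStream in : streams) {
            try (InputStream s = in) {
                merged.append(new String(s.readAllBytes(), charset));
            }
        }
        return merged.toString();
    }
}
```

Each stream is closed once it has been read. If the files need a separator between them (a newline, say), it can be appended inside the loop.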

How to read one stream into another? [duplicate]

This question already has answers here:
Easy way to write contents of a Java InputStream to an OutputStream
(24 answers)
Closed 3 years ago.
FileInputStream in = new FileInputStream(myFile);
ByteArrayOutputStream out = new ByteArrayOutputStream();
Question: How can I read everything from in into out in a way which is not a hand-crafted loop with my own byte buffer?
Java 9 (and later) answer:
in.transferTo(out);
Seems they finally realized that this functionality is so commonly needed that it’d better be built in. The method returns the number of bytes copied in case you need to know.
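As a complete example, the call can be wrapped like this to get the bytes of any InputStream. The helper name is made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

final class TransferDemo {
    // Copies everything from in to a fresh in-memory buffer and returns it.
    // transferTo does not close either stream; InputStream.transferTo
    // requires Java 9 or later.
    static byte[] drain(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        in.transferTo(out);
        return out.toByteArray();
    }
}
```

For the FileInputStream in the question, wrapping it in try-with-resources around the drain call takes care of closing it.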
Write one method to do this, and call it from everywhere which needs the functionality. Guava already has code for this, in ByteStreams.copy. I'm sure just about any other library with "general" IO functionality has it too, but Guava's my first "go-to" library where possible. It rocks :)
In Apache Commons / IO, you can do it using IOUtils.copy(in, out):
InputStream in = new FileInputStream(myFile);
OutputStream out = new ByteArrayOutputStream();
IOUtils.copy(in, out);
But I agree with Jon Skeet, I'd rather use Guava's ByteStreams.copy(in, out)
So what Guava's ByteStreams.copy(in, out) does:
private static final int BUF_SIZE = 0x1000; // 4K
public static long copy(InputStream from, OutputStream to)
throws IOException {
checkNotNull(from);
checkNotNull(to);
byte[] buf = new byte[BUF_SIZE];
long total = 0;
while (true) {
int r = from.read(buf);
if (r == -1) {
break;
}
to.write(buf, 0, r);
total += r;
}
return total;
}
In my project I used this method:
private static void copyData(InputStream in, OutputStream out) throws IOException {
byte[] buffer = new byte[8 * 1024];
int len;
// read() returns -1 at end of stream
while ((len = in.read(buffer)) != -1) {
out.write(buffer, 0, len);
}
}
As an alternative to Guava, one could use Apache Commons IO (old) or Apache Commons IOUtils (new, as advised in the comment).
I'd use the loop, instead of importing new classes, or adding libraries to my project. The library function is probably also implemented with a loop. But that's just my personal taste.
However, my question to you: what are you trying to do? Think of the "big picture": if you want to put the entire contents of a file into a byte array, why not just do that? The size of the array is file.length(), and you don't need it to grow dynamically, hidden behind a ByteArrayOutputStream (unless your file is shared and its contents can change while you read).
Another alternative: could you use a FileChannel and a ByteBuffer (java.nio)?
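A rough sketch of the FileChannel and ByteBuffer variant, with invented names. It assumes the file fits into a single byte array.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class ChannelRead {
    // Reads a whole file into one byte array through a FileChannel. The buffer
    // is sized from the channel, so this assumes the file fits in an int-sized
    // array and does not grow while it is being read.
    static byte[] readAll(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = channel.size();
            if (size > Integer.MAX_VALUE - 8) {
                throw new IOException("file too large for a byte array: " + size);
            }
            ByteBuffer buffer = ByteBuffer.allocate((int) size);
            // read() may fill the buffer only partially, so loop until it is
            // full or the channel reports end-of-file.
            while (buffer.hasRemaining() && channel.read(buffer) != -1) {
                // keep reading
            }
            byte[] data = new byte[buffer.position()];
            buffer.flip();
            buffer.get(data);
            return data;
        }
    }
}
```

For plain files, Files.readAllBytes(path) does much the same in one call. The channel version is mostly of interest when only part of the file is needed or when the buffer is reused.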
