UTF-8 byte[] to String - java

Let's suppose I have just used a BufferedInputStream to read the bytes of a UTF-8 encoded text file into a byte array. I know that I can use the following routine to convert the bytes to a string, but is there a more efficient/smarter way of doing this than just iterating through the bytes and converting each one?
public String openFileToString(byte[] _bytes)
{
String file_string = "";
for(int i = 0; i < _bytes.length; i++)
{
file_string += (char)_bytes[i];
}
return file_string;
}

Look at the constructor for String
String str = new String(bytes, StandardCharsets.UTF_8);
And if you're feeling lazy, you can use the Apache Commons IO library to convert the InputStream to a String directly:
String str = IOUtils.toString(inputStream, StandardCharsets.UTF_8);
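If you're on Java 11 or newer you can also skip the byte[] step entirely; a minimal sketch (this decodes as UTF-8 by default):
import java.nio.file.Files;
import java.nio.file.Path;

String str = Files.readString(Path.of("file.txt")); // reads and decodes the whole file as UTF-8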

The Java String class has a built-in constructor for converting a byte array to a string.
byte[] byteArray = new byte[] {87, 79, 87, 46, 46, 46};
String value = new String(byteArray, "UTF-8");

To convert UTF-8 data, you can't assume a 1:1 correspondence between bytes and characters.
Try this:
String file_string = new String(bytes, "UTF-8");
(Bah. I see I'm way too slow in hitting the Post Your Answer button.)
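To see why the 1:1 assumption breaks, here is a small self-contained demo ('é' is just an example character; it encodes to two bytes in UTF-8):
import java.nio.charset.StandardCharsets;

public class Utf8CastDemo {
    public static void main(String[] args) {
        byte[] bytes = "é".getBytes(StandardCharsets.UTF_8); // two bytes: 0xC3 0xA9

        // The naive loop: each byte is cast (and sign-extended) to its own char,
        // so the two bytes of 'é' turn into two bogus characters.
        StringBuilder naive = new StringBuilder();
        for (byte b : bytes) {
            naive.append((char) b);
        }

        // The String constructor decodes the whole byte sequence as UTF-8.
        String decoded = new String(bytes, StandardCharsets.UTF_8);

        System.out.println(naive + " vs " + decoded); // garbage vs "é"
    }
}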
To read an entire file as a String, do something like this:
public String openFileToString(String fileName) throws IOException
{
    InputStream is = new BufferedInputStream(new FileInputStream(fileName));
    try {
        InputStreamReader rdr = new InputStreamReader(is, "UTF-8");
        StringBuilder contents = new StringBuilder();
        char[] buff = new char[4096];
        int len = rdr.read(buff);
        while (len >= 0) {
            contents.append(buff, 0, len);
            len = rdr.read(buff); // read the next chunk, otherwise the loop never ends
        }
        return contents.toString(); // return the accumulated contents, not buff.toString()
    } finally {
        try {
            is.close();
        } catch (Exception e) {
            // log error in closing the file
        }
    }
}

You can use the String(byte[] bytes) constructor for that. See this link for details.
EDIT: You also have to consider your platform's default charset, as per the Javadoc:
Constructs a new String by decoding the specified array of bytes using
the platform's default charset. The length of the new String is a
function of the charset, and hence may not be equal to the length of
the byte array. The behavior of this constructor when the given bytes
are not valid in the default charset is unspecified. The
CharsetDecoder class should be used when more control over the
decoding process is required.
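For illustration, a minimal sketch of the CharsetDecoder route the Javadoc mentions, configured here (my choice, not a requirement) to fail loudly on malformed input instead of silently replacing it; bytes is assumed to be the UTF-8 data you read:
import java.nio.ByteBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
        .onMalformedInput(CodingErrorAction.REPORT)
        .onUnmappableCharacter(CodingErrorAction.REPORT);
String text = decoder.decode(ByteBuffer.wrap(bytes)).toString(); // throws CharacterCodingException on invalid UTF-8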

You could use the methods described in this question (especially since you start off with an InputStream): Read/convert an InputStream to a String
In particular, if you don't want to rely on external libraries, you can try this answer, which reads the InputStream via an InputStreamReader into a char[] buffer and appends it into a StringBuilder.

Knowing that you are dealing with a UTF-8 byte array, you'll definitely want to use the String constructor that accepts a charset name. Otherwise you may leave yourself open to some charset encoding based security vulnerabilities. Note that it throws UnsupportedEncodingException which you'll have to handle. Something like this:
public String openFileToString(byte[] _bytes)
{
    String file_string;
    try {
        file_string = new String(_bytes, "UTF-8");
    } catch (UnsupportedEncodingException e) {
        // this should never happen because "UTF-8" is hard-coded
        throw new IllegalStateException(e);
    }
    return file_string;
}

Here's a simplified function that will read in bytes and create a string. It assumes you probably already know what encoding the file is in (and otherwise defaults).
static final int BUFF_SIZE = 2048;
static final String DEFAULT_ENCODING = "utf-8";
public static String readFileToString(String filePath, String encoding) throws IOException {
if (encoding == null || encoding.length() == 0)
encoding = DEFAULT_ENCODING;
StringBuffer content = new StringBuffer();
FileInputStream fis = new FileInputStream(new File(filePath));
byte[] buffer = new byte[BUFF_SIZE];
int bytesRead = 0;
while ((bytesRead = fis.read(buffer)) != -1)
content.append(new String(buffer, 0, bytesRead, encoding)); // caveat: a multi-byte character can be split across two reads and decoded incorrectly
fis.close();
return content.toString();
}

String has a constructor that takes byte[] and charsetname as parameters :)

This also involves iterating, but it is much better than concatenating strings, which is very costly.
public String openFileToString(byte[] _bytes)
{
    StringBuilder s = new StringBuilder(_bytes.length);
    for (int i = 0; i < _bytes.length; i++)
    {
        s.append((char) _bytes[i]);
    }
    return s.toString();
}

Why not get what you are looking for from the get go and read a string from the file instead of an array of bytes? Something like:
BufferedReader in = new BufferedReader(new InputStreamReader(new FileInputStream("foo.txt"), Charset.forName("UTF-8")));
then readLine from in until it's done.
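A rough sketch of that loop (the '\n' re-append is my own addition, since readLine strips line terminators):
StringBuilder sb = new StringBuilder();
String line;
while ((line = in.readLine()) != null) {
    sb.append(line).append('\n');
}
in.close();
String contents = sb.toString();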

I use this way
String strIn = new String(_bytes, 0, numBytes);

Related

How to convert String variable back in byte[] in JAVA [duplicate]

This question already has answers here:
How to convert Java String into byte[]?
(9 answers)
Closed 4 years ago.
I have the following code to zip and unzip the String:
public static void main(String[] args) {
// TODO code application logic here
String Source = "hello world";
byte[] a = ZIP(Source);
System.out.format("answer:");
System.out.format(a.toString());
System.out.format("\n");
byte[] Source2 = a.toString().getBytes();
System.out.println("\nsource 2:" + Source2.toString() + "\n");
String b = unZIP(Source2);
System.out.println("\nunzip answer:");
System.out.format(b);
System.out.format("\n");
}
public static byte[] ZIP(String source) {
ByteArrayOutputStream bos= new ByteArrayOutputStream(source.length()* 4);
try {
GZIPOutputStream outZip= new GZIPOutputStream(bos);
outZip.write(source.getBytes());
outZip.flush();
outZip.close();
} catch (Exception Ex) {
}
return bos.toByteArray();
}
public static String unZIP(byte[] Source) {
ByteArrayInputStream bins= new ByteArrayInputStream(Source);
byte[] buf= new byte[2048];
StringBuffer rString= new StringBuffer("");
int len;
try {
GZIPInputStream zipit= new GZIPInputStream(bins);
while ((len = zipit.read(buf)) > 0) {
rString.append(new String(buf).substring(0, len));
}
return rString.toString();
} catch (Exception Ex) {
return "";
}
}
When "Hello World" have been zipped, it's will become [B#7bdecdec in byte[] and convert into String and display on the screen. However, if I'm trying to convert the string back into byte[] with the following code:
byte[] Source2 = a.toString().getBytes();
the value of variable a will become to [B#60a1807c instead of [B#7bdecdec . Does anyone know how can I convert the String (a value of byte but been convert into String) back in byte[] in JAVA?
Why are you doing byte[] Source2 = a.toString().getBytes();?
It seems like a double conversion: you convert a byte[] to a string, then back to a byte[].
The real conversion of a byte[] to a string is new String(byte[]), hoping that you're in the same charset.
Source2 should be an exact copy of a, hence you should just do byte[] Source2 = a;
Your unzip is wrong because you are converting back a string which might be in some other encoding (let's say UTF-8):
public static String unZIP(byte[] source) throws IOException {
    ByteArrayOutputStream bos = new ByteArrayOutputStream(source.length * 2);
    try (ByteArrayInputStream in = new ByteArrayInputStream(source);
         GZIPInputStream zis = new GZIPInputStream(in)) {
        byte[] buffer = new byte[4096];
        for (int n; (n = zis.read(buffer)) > 0; ) {
            bos.write(buffer, 0, n);
        }
    }
    return new String(bos.toByteArray(), StandardCharsets.UTF_8);
}
This one, not tested, will:
Store the bytes from the gzip stream into a ByteArrayOutputStream
Close the gzip/ByteArrayInputStream using try-with-resources
Convert the whole thing into a String using UTF-8 (you should always specify an encoding, and except in rare cases, UTF-8 is the way to go).
You must not use StringBuffer, for two reasons:
The most important one: this will not behave well with multi-byte encodings such as UTF-8 or UTF-16.
And second, StringBuffer is synchronized: you should use StringBuilder whenever possible, and reserve StringBuffer for cases where you share it between several threads (which is not the case here); otherwise it is useless.
With those changes, you will also need to change ZIP, as per David Conrad's comment and because unZIP uses UTF-8:
public static byte[] ZIP(String source) throws IOException {
    ByteArrayOutputStream bos = new ByteArrayOutputStream(source.length() * 4);
    try (GZIPOutputStream zip = new GZIPOutputStream(bos)) {
        zip.write(source.getBytes(StandardCharsets.UTF_8));
    }
    return bos.toByteArray();
}
As for the main, printing a byte[] will result in the default toString.
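If you do need a printable String that can be turned back into exactly the same byte[], one option (my suggestion, not part of the original code) is java.util.Base64, available since Java 8:
import java.util.Base64;

byte[] a = ZIP("hello world");
String printable = Base64.getEncoder().encodeToString(a); // safe to print or store as text
byte[] restored = Base64.getDecoder().decode(printable);  // byte-for-byte identical to a
String b = unZIP(restored);                               // "hello world" again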

GZIP decompress string and byte conversion

I have a problem in code:
private static String compress(String str)
{
String str1 = null;
ByteArrayOutputStream bos = null;
try
{
bos = new ByteArrayOutputStream();
BufferedOutputStream dest = null;
byte b[] = str.getBytes();
GZIPOutputStream gz = new GZIPOutputStream(bos,b.length);
gz.write(b,0,b.length);
bos.close();
gz.close();
}
catch(Exception e) {
System.out.println(e);
e.printStackTrace();
}
byte b1[] = bos.toByteArray();
return new String(b1);
}
private static String deCompress(String str)
{
String s1 = null;
try
{
byte b[] = str.getBytes();
InputStream bais = new ByteArrayInputStream(b);
GZIPInputStream gs = new GZIPInputStream(bais);
ByteArrayOutputStream baos = new ByteArrayOutputStream();
int numBytesRead = 0;
byte [] tempBytes = new byte[6000];
try
{
while ((numBytesRead = gs.read(tempBytes, 0, tempBytes.length)) != -1)
{
baos.write(tempBytes, 0, numBytesRead);
}
s1 = new String(baos.toByteArray());
s1= baos.toString();
}
catch(ZipException e)
{
e.printStackTrace();
}
}
catch(Exception e) {
e.printStackTrace();
}
return s1;
}
public String test() throws Exception
{
String str = "teststring";
String cmpr = compress(str);
String dcmpr = deCompress(cmpr);
}
This code throws java.io.IOException: unknown format (magic number ef1f) at:
GZIPInputStream gs = new GZIPInputStream(bais);
It turns out that when converting with new String(b1) and byte b[] = str.getBytes(), the bytes get "spoiled": at the output we already have more bytes than we started with. If you avoid the conversion to a string and work directly with the bytes, everything works. Sorry for my English.
public String unZip(String zipped) throws DataFormatException, IOException {
    byte[] bytes = zipped.getBytes("WINDOWS-1251");
    Inflater decompressed = new Inflater();
    decompressed.setInput(bytes);
    byte[] result = new byte[100];
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    int count;
    while ((count = decompressed.inflate(result)) != 0)
        buffer.write(result, 0, count); // write only the bytes actually inflated
    decompressed.end();
    return new String(buffer.toByteArray(), "WINDOWS-1251");
}
I use this function to decompress the server response. Thanks for the help.
You have two problems:
You're using the default character encoding to convert the original string into bytes. That will vary by platform. It's better to specify an encoding - UTF-8 is usually a good idea.
You're trying to represent the opaque binary data of the result of the compression as a string by just calling the String(byte[]) constructor. That constructor is only meant for data which is encoded text... which this isn't. You should use base64 for this. There's a public domain base64 library which makes this easy. (Alternatively, don't convert the compressed data to text at all - just return a byte array.)
Fundamentally, you need to understand how different text and binary data are - when you want to convert between the two, you should do so carefully. If you want to represent "non text" binary data (i.e. bytes which aren't the direct result of encoding text) in a string you should use something like base64 or hex. When you want to encode a string as binary data (e.g. to write some text to disk) you should carefully consider which encoding to use. If another program is going to read your data, you need to work out what encoding it expects - if you have full control over it yourself, I'd usually go for UTF-8.
Additionally, the exception handling in your code is poor:
You should almost never catch Exception; catch more specific exceptions
You shouldn't just catch an exception and continue as if it had never happened. If you can't really handle the exception and still complete your method successfully, you should let the exception bubble up the stack (or possibly catch it and wrap it in a more appropriate exception type for your abstraction)
When you GZIP compress data, you always get binary data. This data cannot be converted into a string, as it is not valid character data (in any encoding).
So your compress method should return a byte array and your decompress method should take a byte array as its parameter.
Furthermore, I recommend you use an explicit encoding when you convert the string into a byte array before compression and when you turn the decompressed data into a string again.
When you GZIP compress data, you always get binary data. This data
cannot be converted into a string, as it is not valid character data (in
any encoding).
Codo is right, thanks a lot for enlightening me. I was trying to decompress a string (converted from the binary data). What I amended was to use InflaterInputStream directly on the input stream returned by my HTTP connection, roughly as sketched below. (My app was retrieving a large JSON of strings.)
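A rough sketch of that approach (the connection variable and the UTF-8 charset are my assumptions; if the server sends gzip rather than raw deflate data, use GZIPInputStream instead of InflaterInputStream):
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.InflaterInputStream;

try (InputStream raw = connection.getInputStream();        // hypothetical HttpURLConnection
     InflaterInputStream in = new InflaterInputStream(raw);
     ByteArrayOutputStream out = new ByteArrayOutputStream()) {
    byte[] buf = new byte[8192];
    int n;
    while ((n = in.read(buf)) != -1) {
        out.write(buf, 0, n);
    }
    String json = new String(out.toByteArray(), StandardCharsets.UTF_8);
}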

How can I read a .txt file into a single Java string while maintaining line breaks?

Virtually every code example out there reads a TXT file line-by-line and stores it in a String array. I do not want line-by-line processing because I think it's an unnecessary waste of resources for my requirements: All I want to do is quickly and efficiently dump the .txt contents into a single String. The method below does the job, however with one drawback:
private static String readFileAsString(String filePath) throws java.io.IOException{
byte[] buffer = new byte[(int) new File(filePath).length()];
BufferedInputStream f = null;
try {
f = new BufferedInputStream(new FileInputStream(filePath));
f.read(buffer);
if (f != null) try { f.close(); } catch (IOException ignored) { }
} catch (IOException ignored) { System.out.println("File not found or invalid path.");}
return new String(buffer);
}
... the drawback is that the line breaks are converted into long spaces e.g. " ".
I want the line breaks to be converted from \n or \r to <br> (HTML tag) instead.
Thank you in advance.
What about using a Scanner and adding the linefeeds yourself:
java.util.Scanner sc = new java.util.Scanner(new java.io.File("sample.txt"));
StringBuilder buf = new StringBuilder();
while (sc.hasNextLine()) {
    buf.append(sc.nextLine());
    buf.append("<br />");
}
I don't see where you get your long spaces from.
You can read directly into the buffer and then create a String from the buffer:
File f = new File(filePath);
FileInputStream fin = new FileInputStream(f);
byte[] buffer = new byte[(int) f.length()];
new DataInputStream(fin).readFully(buffer);
fin.close();
String s = new String(buffer, "UTF-8");
You could add this code:
return new String(buffer).replaceAll("(\r\n|\r|\n|\n\r)", "<br>");
Is this what you are looking for?
The code will read the file contents as they appear in the file - including line breaks.
If you want to change the breaks into something else like displaying in html etc, you will either need to post process it or do it by reading the file line by line. Since you do not want the latter, you can replace your return by following which should do the conversion -
return (new String(buffer)).replaceAll("\r[\n]?", "<br>");
StringBuilder sb = new StringBuilder();
try {
InputStream is = getAssets().open("myfile.txt");
byte[] bytes = new byte[1024];
int numRead = 0;
try {
while((numRead = is.read(bytes)) != -1)
sb.append(new String(bytes, 0, numRead));
}
catch(IOException e) {
}
is.close();
}
catch(IOException e) {
}
your resulting String: String result = sb.toString();
then replace whatever you want in this result.
I agree with the general approach by @Sanket Patel, but using Commons I/O you would likely want FileUtils.
So your code would look like:
String myString = FileUtils.readFileToString(new File(filePath));
There is also another version to specify an alternate character encoding.
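That overload looks like this (UTF-8 here is just an example; recent Commons IO versions also accept a java.nio.charset.Charset instead of a charset name):
String myString = FileUtils.readFileToString(new File(filePath), "UTF-8");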
You should try org.apache.commons.io.IOUtils.toString(InputStream is) to get the file content as a String. You can pass the InputStream object that you get from
getAssets().open("xml2json.txt")   <-- this belongs to Android and returns an InputStream
in your Activity. To get the String, use this:
String xml = IOUtils.toString(getAssets().open("xml2json.txt"));
So,
String xml = IOUtils.toString(pass_your_InputStream_object_here);

Android Reading from an Input stream efficiently

I am making an HTTP get request to a website for an android application I am making.
I am using a DefaultHttpClient and using HttpGet to issue the request. I get the entity response and from this obtain an InputStream object for getting the html of the page.
I then cycle through the reply doing as follows:
BufferedReader r = new BufferedReader(new InputStreamReader(inputStream));
String x = "";
x = r.readLine();
String total = "";
while(x!= null){
total += x;
x = r.readLine();
}
However this is horrendously slow.
Is this inefficient? I'm not loading a big web page - www.cokezone.co.uk so the file size is not big. Is there a better way to do this?
Thanks
Andy
The problem in your code is that it's creating lots of heavy String objects, copying their contents and performing operations on them. Instead, you should use StringBuilder to avoid creating new String objects on each append and to avoid copying the char arrays. The implementation for your case would be something like this:
BufferedReader r = new BufferedReader(new InputStreamReader(inputStream));
StringBuilder total = new StringBuilder();
for (String line; (line = r.readLine()) != null; ) {
total.append(line).append('\n');
}
You can now use total without converting it to String, but if you need the result as a String, simply add:
String result = total.toString();
I'll try to explain it better...
a += b (or a = a + b), where a and b are Strings, copies the contents of both a and b to a new object (note that you are also copying a, which contains the accumulated String), and you are doing those copies on each iteration.
a.append(b), where a is a StringBuilder, directly appends b contents to a, so you don't copy the accumulated string at each iteration.
Have you tried the built-in method to convert a stream to a string? It's part of the Apache Commons library (org.apache.commons.io.IOUtils).
Then your code would be this one line:
String total = IOUtils.toString(inputStream);
The documentation for it can be found here:
http://commons.apache.org/io/api-1.4/org/apache/commons/io/IOUtils.html#toString%28java.io.InputStream%29
The Apache Commons IO library can be downloaded from here:
http://commons.apache.org/io/download_io.cgi
Another possibility with Guava:
dependency: compile 'com.google.guava:guava:11.0.2'
import com.google.common.io.ByteStreams;
...
String total = new String(ByteStreams.toByteArray(inputStream ));
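Note that the line above decodes with the platform default charset; if you know the stream is UTF-8, it seems safer to say so explicitly (a small variant of the same idea):
String total = new String(ByteStreams.toByteArray(inputStream), StandardCharsets.UTF_8);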
I believe this is efficient enough... To get a String from an InputStream, I'd call the following method:
public static String getStringFromInputStream(InputStream stream) throws IOException
{
int n = 0;
char[] buffer = new char[1024 * 4];
InputStreamReader reader = new InputStreamReader(stream, "UTF8");
StringWriter writer = new StringWriter();
while (-1 != (n = reader.read(buffer))) writer.write(buffer, 0, n);
return writer.toString();
}
I always use UTF-8. You could, of course, set charset as an argument, besides InputStream.
What about this? It seems to give better performance.
byte[] bytes = new byte[1000];
StringBuilder x = new StringBuilder();
int numRead = 0;
while ((numRead = is.read(bytes)) >= 0) {
x.append(new String(bytes, 0, numRead));
}
Edit: Actually, this sort of encompasses both steelbytes's and Maurice Perry's answers.
Possibly somewhat faster than Jaime Soriano's answer, and without the multi-byte encoding problems of Adrian's answer, I suggest:
File file = new File("/tmp/myfile");
try {
FileInputStream stream = new FileInputStream(file);
int count;
byte[] buffer = new byte[1024];
ByteArrayOutputStream byteStream =
new ByteArrayOutputStream(stream.available());
while (true) {
count = stream.read(buffer);
if (count <= 0)
break;
byteStream.write(buffer, 0, count);
}
String string = byteStream.toString();
System.out.format("%d bytes: \"%s\"%n", string.length(), string);
} catch (IOException e) {
e.printStackTrace();
}
Maybe rather than read 'one line at a time' and join the strings, try 'read all available' so as to avoid the scanning for end of line, and to also avoid string joins.
i.e., InputStream.available() and InputStream.read(byte[] b, int offset, int length)
Reading one line of text at a time, and appending said line to a string individually is time-consuming both in extracting each line and the overhead of so many method invocations.
I was able to get better performance by allocating a decent-sized byte array to hold the stream data, and which is iteratively replaced with a larger array when needed, and trying to read as much as the array could hold.
For some reason, Android repeatedly failed to download the entire file when the code used the InputStream returned by HTTPUrlConnection, so I had to resort to using both a BufferedReader and a hand-rolled timeout mechanism to ensure I would either get the whole file or cancel the transfer.
private static final int kBufferExpansionSize = 32 * 1024;
private static final int kBufferInitialSize = kBufferExpansionSize;
private static final int kMillisecondsFactor = 1000;
private static final int kNetworkActionPeriod = 12 * kMillisecondsFactor;
private String loadContentsOfReader(Reader aReader)
{
BufferedReader br = null;
char[] array = new char[kBufferInitialSize];
int bytesRead;
int totalLength = 0;
String resourceContent = "";
long stopTime;
long nowTime;
try
{
br = new BufferedReader(aReader);
nowTime = System.nanoTime();
stopTime = nowTime + ((long)kNetworkActionPeriod * kMillisecondsFactor * kMillisecondsFactor);
while(((bytesRead = br.read(array, totalLength, array.length - totalLength)) != -1)
&& (nowTime < stopTime))
{
totalLength += bytesRead;
if(totalLength == array.length)
array = Arrays.copyOf(array, array.length + kBufferExpansionSize);
nowTime = System.nanoTime();
}
if(bytesRead == -1)
resourceContent = new String(array, 0, totalLength);
}
catch(Exception e)
{
e.printStackTrace();
}
try
{
if(br != null)
br.close();
}
catch(IOException e)
{
// TODO Auto-generated catch block
e.printStackTrace();
}
return resourceContent;
}
EDIT: It turns out that if you don't need to have the content re-encoded (ie, you want the content AS IS) you shouldn't use any of the Reader subclasses. Just use the appropriate Stream subclass.
Replace the beginning of the preceding method with the corresponding lines of the following to speed it up an extra 2 to 3 times.
String loadContentsFromStream(InputStream aStream)
{
BufferedInputStream br = null;
byte[] array;
int bytesRead;
int totalLength = 0;
String resourceContent;
long stopTime;
long nowTime;
resourceContent = "";
try
{
br = new BufferedInputStream(aStream);
array = new byte[kBufferInitialSize];
If the file is long, you can optimize your code by appending to a StringBuilder instead of using a String concatenation for each line.
byte[] buffer = new byte[1024]; // buffer store for the stream
int bytes; // bytes returned from read()
// Keep listening to the InputStream until an exception occurs
while (true) {
try {
// Read from the InputStream
bytes = mmInStream.read(buffer);
String TOKEN_ = new String(buffer, "UTF-8");
String xx = TOKEN_.substring(0, bytes);
To convert the InputStream to a String we use the
BufferedReader.readLine() method. We iterate until the BufferedReader returns null, which means there's no more data to read. Each line is appended to a StringBuilder and the result returned as a String.
public static String convertStreamToString(InputStream is) {
BufferedReader reader = new BufferedReader(new InputStreamReader(is));
StringBuilder sb = new StringBuilder();
String line = null;
try {
while ((line = reader.readLine()) != null) {
sb.append(line + "\n");
}
} catch (IOException e) {
e.printStackTrace();
} finally {
try {
is.close();
} catch (IOException e) {
e.printStackTrace();
}
}
return sb.toString();
}
And finally, from any class where you want to convert, call the function:
String dataString = Utils.convertStreamToString(in);
I use this to read the full data:
// inputStream is one instance InputStream
byte[] data = new byte[inputStream.available()];
inputStream.read(data);
String dataString = new String(data);
Note that this applies to files stored on disk, and not to streams whose size is not known up front.

How to read a file into string in java?

I have read a file into a String. The file contains various names, one name per line. Now the problem is that I want those names in a String array.
For that I have written the following code:
String [] names = fileString.split("\n"); // fileString is the string representation of the file
But I am not getting the desired results: the array obtained after splitting the string has length 1. It means that fileString doesn't contain the "\n" character, even though the file does.
So how do I get around this problem?
What about using Apache Commons (Commons IO and Commons Lang)?
String[] lines = StringUtils.split(FileUtils.readFileToString(new File("...")), '\n');
The problem is not with how you're splitting the string; that bit is correct.
You have to review how you are reading the file to the string. You need something like this:
private String readFileAsString(String filePath) throws IOException {
StringBuffer fileData = new StringBuffer();
BufferedReader reader = new BufferedReader(
new FileReader(filePath));
char[] buf = new char[1024];
int numRead=0;
while((numRead=reader.read(buf)) != -1){
String readData = String.valueOf(buf, 0, numRead);
fileData.append(readData);
}
reader.close();
return fileData.toString();
}
Personally, I love this one using the java.nio.file package, also described here.
You can optionally include the Charset as a second argument in the String constructor.
String content = new String(Files.readAllBytes(Paths.get("/path/to/file")));
Cool huhhh!
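With the charset spelled out (UTF-8 chosen here as an example), that becomes:
String content = new String(Files.readAllBytes(Paths.get("/path/to/file")), StandardCharsets.UTF_8);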
As suggested by Garrett Rowe and Stan James you can use java.util.Scanner:
try (Scanner s = new Scanner(file).useDelimiter("\\Z")) {
String contents = s.next();
}
or
try (Scanner s = new Scanner(file).useDelimiter("\\n")) {
while(s.hasNext()) {
String line = s.next();
}
}
This code does not have external dependencies.
WARNING: you should specify the charset encoding as the second parameter of the Scanner's constructor. In this example I am using the platform's default, but this is most certainly wrong.
Here is an example of how to use java.util.Scanner with correct resource and error handling:
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;
import java.util.Iterator;
class TestScanner {
public static void main(String[] args)
throws FileNotFoundException {
File file = new File(args[0]);
System.out.println(getFileContents(file));
processFileLines(file, new LineProcessor() {
@Override
public void process(int lineNumber, String lineContents) {
System.out.println(lineNumber + ": " + lineContents);
}
});
}
static String getFileContents(File file)
throws FileNotFoundException {
try (Scanner s = new Scanner(file).useDelimiter("\\Z")) {
return s.next();
}
}
static void processFileLines(File file, LineProcessor lineProcessor)
throws FileNotFoundException {
try (Scanner s = new Scanner(file).useDelimiter("\\n")) {
for (int lineNumber = 1; s.hasNext(); ++lineNumber) {
lineProcessor.process(lineNumber, s.next());
}
}
}
static interface LineProcessor {
void process(int lineNumber, String lineContents);
}
}
You could read your file into a List instead of a String and then convert to an array:
//Setup a BufferedReader here
List<String> list = new ArrayList<String>();
String line = reader.readLine();
while (line != null) {
list.add(line);
line = reader.readLine();
}
String[] arr = list.toArray(new String[0]);
There is no built-in method in Java which can read an entire file. So you have the following options:
Use a non-standard library method, such as Apache Commons, see the code example in romaintaz's answer.
Loop around some read method (e.g. FileInputStream.read, which reads bytes, or FileReader.read, which reads chars; both read into a preallocated array). Both classes use system calls, so you'll have to speed them up with buffering (BufferedInputStream or BufferedReader) if you are reading just a small amount of data (say, less than 4096 bytes) at a time.
Loop around BufferedReader.readLine. This has a fundamental problem: it discards the information about whether there was a '\n' at the end of the file -- so, e.g., it is unable to distinguish an empty file from a file containing just a newline.
I'd use this code:
// charsetName can be null to use the default charset.
public static String readFileAsString(String fileName, String charsetName)
throws java.io.IOException {
java.io.InputStream is = new java.io.FileInputStream(fileName);
try {
final int bufsize = 4096;
int available = is.available();
byte[] data = new byte[available < bufsize ? bufsize : available];
int used = 0;
while (true) {
if (data.length - used < bufsize) {
byte[] newData = new byte[data.length << 1];
System.arraycopy(data, 0, newData, 0, used);
data = newData;
}
int got = is.read(data, used, data.length - used);
if (got <= 0) break;
used += got;
}
return charsetName != null ? new String(data, 0, used, charsetName)
: new String(data, 0, used);
} finally {
is.close();
}
}
The code above has the following advantages:
It's correct: it reads the whole file, not discarding any byte.
It lets you specify the character set (encoding) the file uses.
It's fast (no matter how many newlines the file contains).
It doesn't waste memory (no matter how many newlines the file contains).
FileReader fr = new FileReader(filename);
BufferedReader br = new BufferedReader(fr);
String strline;
String arr[] = new String[10]; // 10 is the no. of strings
int i = 0;
while ((strline = br.readLine()) != null)
{
    arr[i++] = strline;
}
The simplest solution for reading a text file line by line and putting the results into an array of strings without using third party libraries would be this:
ArrayList<String> names = new ArrayList<String>();
Scanner scanner = new Scanner(new File("names.txt"));
while(scanner.hasNextLine()) {
names.add(scanner.nextLine());
}
scanner.close();
String[] namesArr = names.toArray(new String[0]);
I always use this way:
String content = "";
String line;
BufferedReader reader = new BufferedReader(new FileReader(...));
while ((line = reader.readLine()) != null)
{
content += "\n" + line;
}
// Cut of the first newline;
content = content.substring(1);
// Close the reader
reader.close();
You can also use java.nio.file.Files to read an entire file into a String List then you can convert it to an array etc. Assuming a String variable named filePath, the following 2 lines will do that:
List<String> strList = Files.readAllLines(Paths.get(filePath), Charset.defaultCharset());
String[] strarray = strList.toArray(new String[0]);
A simpler (without loops), but less correct way, is to read everything to a byte array:
FileInputStream is = new FileInputStream(file);
byte[] b = new byte[(int) file.length()];
is.read(b, 0, (int) file.length());
String contents = new String(b);
Also note that this has serious performance issues.
If you have only InputStream, you can use InputStreamReader.
SmbFileInputStream in = new SmbFileInputStream("smb://host/dir/file.ext");
InputStreamReader r=new InputStreamReader(in);
char buf[] = new char[5000];
int count=r.read(buf);
String s=String.valueOf(buf, 0, count);
You can add a loop and a StringBuffer if needed, for example:
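A sketch of that loop, reading until end of stream (the StringBuilder and the explicit UTF-8 charset are my own choices):
InputStreamReader r = new InputStreamReader(in, StandardCharsets.UTF_8);
StringBuilder sb = new StringBuilder();
char[] buf = new char[5000];
int count;
while ((count = r.read(buf)) != -1) {
    sb.append(buf, 0, count);
}
String s = sb.toString();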
You can try Cactoos:
import org.cactoos.io.TextOf;
import java.io.File;
new TextOf(new File("a.txt")).asString().split("\n")
Fixed Version of @Anoyz's answer:
import java.io.FileInputStream;
import java.io.File;

public class App {
    public static void main(String[] args) throws Exception {
        File f = new File("file.txt");
        FileInputStream is = new FileInputStream(f);
        byte[] b = new byte[(int) f.length()];
        is.read(b, 0, (int) f.length());
        is.close();
        String contents = new String(b);
    }
}
