How do I open and read an ASCII file? I'm working on opening a file, retrieving its contents, and analyzing them with graphs.
Text-based files should be opened with a java.io.Reader. The easiest way is to use a BufferedReader to read the file line by line in a loop.
Here's a kickoff example:
BufferedReader reader = null;
try {
    reader = new BufferedReader(new FileReader("/path/to/file.txt"));
    for (String line; (line = reader.readLine()) != null;) {
        // Do your thing with the line. This example is just printing it.
        System.out.println(line);
    }
} finally {
    // Always close resources in finally!
    if (reader != null) try { reader.close(); } catch (IOException ignore) {}
}
To break the file content down further into tokens, you may find a Scanner more useful.
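As a rough illustration of the Scanner suggestion, here is a minimal sketch that pulls every numeric token out of a text file, which is the kind of data you might later feed into a graph. The class name, the method name, and the sample data are made up for the example; the US-ASCII charset comes from the question.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Scanner;

class ScannerTokens {
    // Collects every whitespace-separated token that parses as a number;
    // all other tokens are consumed and ignored.
    static List<Double> readNumbers(Path file) throws IOException {
        List<Double> values = new ArrayList<>();
        try (Scanner scanner = new Scanner(file, "US-ASCII")) {
            // Fix the locale so "1.5" parses the same way on every machine.
            scanner.useLocale(Locale.ROOT);
            while (scanner.hasNext()) {
                if (scanner.hasNextDouble()) {
                    values.add(scanner.nextDouble());
                } else {
                    scanner.next(); // not a number: skip it
                }
            }
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        // Sample data written to a temp file so the sketch runs on its own.
        Path sample = Files.createTempFile("sample", ".txt");
        Files.write(sample, "t 0.5 1.25\nt 2\n".getBytes(StandardCharsets.US_ASCII));
        System.out.println(readNumbers(sample)); // prints [0.5, 1.25, 2.0]
    }
}
```

Scanner is convenient for this kind of tokenizing, but for plain line-by-line reading the BufferedReader loop above is usually the simpler choice.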
See also:
Java IO tutorial
Scanner tutorial
Just open the file via the java.io methods. Show us what you've tried first, eh?
Using Guava, you could:
String s = Files.toString(new File("/path/to/file"), Charsets.US_ASCII);
More information in the javadoc.
It's probably enormous overkill to include the library for this one thing. However, there are lots of other useful things in there. This approach also has the downside that it reads the entire file into memory at once, which might be unpalatable depending on the size of your file. There are alternative APIs in Guava that you can use to stream lines in a slightly more convenient way than using the java.io readers directly.
Can someone tell me whether the code below is a good way to read a whole file? Maybe the first code block uses the ready() method in the wrong way.
try (var br = new BufferedReader(new FileReader("/some-file.txt"))) {
    while (br.ready()) {
        System.out.println(br.readLine());
    }
}
Or is it a better approach to read the file without the ready() method?
try (var br = new BufferedReader(new FileReader("/some-file.txt"))) {
    while (true) {
        var line = br.readLine();
        if (line == null) break;
        System.out.println(line);
    }
}
I tested the two blocks and both print the whole file content, but I've never seen the first way anywhere on the internet.
Here's the documentation of BufferedReader#ready():
Tells whether this stream is ready to be read. A buffered character stream is ready if the buffer is not empty, or if the underlying character stream is ready.
[...]
Returns:
True if the next read() is guaranteed not to block for input, false otherwise. Note that returning false does not guarantee that the next read will block.
So, this method is about whether or not the next read will block. You're trying to read the whole file in one go, which means you don't really care if the next read will block. Worse, what if the reader is not ready? Your loop will break, you'll close the file, and the code will continue on without having read the whole source.
A typical way to code what you're doing is:
try (var reader = new BufferedReader(new FileReader("/some-file.txt"))) {
    String line;
    while ((line = reader.readLine()) != null) {
        System.out.println(line);
    }
}
This works because readLine() is contracted to return null only when the end of the stream is reached.
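To make the ready() pitfall concrete, here is a small sketch. NeverReadyReader is a hypothetical reader invented for this example: it has data available but never promises a non-blocking read, which is a legal answer under the documentation quoted above. The ready()-based loop stops before reading anything, while the null-check loop reads every line.

```java
import java.io.BufferedReader;
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class ReadyPitfall {
    // Hypothetical reader for illustration only: it wraps a real source
    // but always reports that the next read might block.
    static class NeverReadyReader extends FilterReader {
        NeverReadyReader(Reader in) { super(in); }
        @Override public boolean ready() { return false; }
    }

    // Loop guarded by ready(): ends as soon as ready() returns false,
    // whether or not the end of the stream has been reached.
    static int countLinesWithReady(Reader source) throws IOException {
        int count = 0;
        try (BufferedReader br = new BufferedReader(source)) {
            while (br.ready()) {
                br.readLine();
                count++;
            }
        }
        return count;
    }

    // Loop guarded by the readLine() contract: ends only at end of stream.
    static int countLinesUntilNull(Reader source) throws IOException {
        int count = 0;
        try (BufferedReader br = new BufferedReader(source)) {
            while (br.readLine() != null) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        String text = "a\nb\nc\n";
        System.out.println(countLinesWithReady(new NeverReadyReader(new StringReader(text)))); // prints 0
        System.out.println(countLinesUntilNull(new NeverReadyReader(new StringReader(text)))); // prints 3
    }
}
```

With a local file the ready() loop often happens to work, which is probably why both versions printed the whole file in your test; the difference shows up with sources such as sockets or pipes.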
A note of caution, though: the code above does not specify a charset, which means the default charset is used (see documentation for Java 17 and before and Java 18+). You can use FileReader(String,Charset) to specify a charset.
There's also the java.nio.file.* API that you can use to do the same thing. For example:
try (var stream = Files.lines(Path.of("/some-file.txt"))) {
    stream.forEachOrdered(System.out::println);
}
The above uses the UTF-8 charset. You can use Files#lines(Path,Charset) to use a specific charset.
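For instance, a minimal sketch of the Files#lines(Path,Charset) overload might look like this. The class name, helper name, and sample data are made up for the example, and ISO-8859-1 is only an example charset.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ExplicitCharsetLines {
    // Reads all lines, decoding the bytes with the given charset.
    // The stream is closed by try-with-resources, which closes the file.
    static List<String> readLines(Path file, Charset charset) throws IOException {
        try (Stream<String> lines = Files.lines(file, charset)) {
            return lines.collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Sample file written in ISO-8859-1 so the sketch runs on its own.
        Path sample = Files.createTempFile("latin1", ".txt");
        Files.write(sample, Arrays.asList("caf\u00e9", "bar"), StandardCharsets.ISO_8859_1);
        readLines(sample, StandardCharsets.ISO_8859_1).forEach(System.out::println);
    }
}
```

As far as I know, bytes that are not valid in the chosen charset surface as an exception here rather than being silently replaced, so a wrong charset tends to fail loudly.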
I think everyone does it in a different way (you could also read each byte using loops), but there is a way to read the whole file at once without loops:
String file = new String(Files.readAllBytes(Paths.get("file")));
You can also use Files.readAllLines depending on what you want to do with data in file.
edit. First comment under your question also shows a better way
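One caveat with the one-liner above is that new String(byte[]) decodes with the platform default charset. A small sketch of both whole-file variants with an explicit charset follows; the class name, method names, and sample data are invented for the example. Both hold the entire file in memory, so they only suit files that comfortably fit.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

class WholeFileRead {
    // Whole file as one String, decoded with an explicit charset.
    static String readAll(Path file, Charset charset) throws IOException {
        return new String(Files.readAllBytes(file), charset);
    }

    // Whole file as a list of lines (line terminators are stripped).
    static List<String> readAllLines(Path file, Charset charset) throws IOException {
        return Files.readAllLines(file, charset);
    }

    public static void main(String[] args) throws IOException {
        // Sample data written to a temp file so the sketch runs on its own.
        Path sample = Files.createTempFile("whole", ".txt");
        Files.write(sample, "one\ntwo\n".getBytes(StandardCharsets.US_ASCII));
        System.out.println(readAll(sample, StandardCharsets.US_ASCII).length()); // prints 8
        System.out.println(readAllLines(sample, StandardCharsets.US_ASCII));     // prints [one, two]
    }
}
```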
I'm actually not that great and am relatively new at Java. I wish to receive input from the user, and want to input this data into an external application.
This application processes the data and provides an output. I wish to retrieve this output using the Java code.
I have attempted this, but I haven't got the slightest idea how to start.
Nothin' on the internet seems to answer this question. If you have any idea or any new functions that can be useful, please help me in doing so.
Since I'm starting from ground zero, any help is appreciated.
Thanks so much.
To communicate with an external application, you first need to define the means of communication. For example:
Will this application read its input from a file?
If so, you need to learn about serialization.
Will this application read its input from standard input (like a command-line application)?
If so, you can send the data with System.out.print() and pipe your program's output into it.
Will this application get the data over HTTP?
If so, you need to learn about REST and/or RPC architectures.
Assuming that it will be a command-line application, then you could use something like this:
public class App
{
    public static void main(String... args)
    {
        // You need to implement your business logic here. Not just print whatever the user passes as arguments of the command-line.
        for(String arg : args)
        {
            System.out.print(arg);
        }
    }
}
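If the external application is a command-line program that reads standard input and writes standard output, another option is to start it from Java and talk to it directly. The following is only a sketch using java.lang.ProcessBuilder; the class and method names are invented, UTF-8 is an assumed encoding, and the `sort` command merely stands in for your real application (it is assumed to be on the PATH).

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ExternalProcessDemo {
    // Starts the command, feeds it the input lines on stdin,
    // and returns everything it prints, one entry per line.
    static List<String> runWithInput(List<String> command, List<String> inputLines)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .redirectErrorStream(true) // merge stderr into stdout
                .start();

        // Write all input, then close stdin so the child sees end-of-input.
        // Caveat: for a child that prints a lot before it has consumed its
        // input, writing and reading should happen on separate threads,
        // otherwise both sides can block on full pipe buffers.
        try (BufferedWriter toChild = new BufferedWriter(
                new OutputStreamWriter(process.getOutputStream(), StandardCharsets.UTF_8))) {
            for (String line : inputLines) {
                toChild.write(line);
                toChild.newLine();
            }
        }

        List<String> output = new ArrayList<>();
        try (BufferedReader fromChild = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = fromChild.readLine()) != null) {
                output.add(line);
            }
        }
        process.waitFor(); // wait until the external program has exited
        return output;
    }

    public static void main(String[] args) throws Exception {
        // "sort" is a stand-in for the real external application.
        List<String> result = runWithInput(
                Arrays.asList("sort"), Arrays.asList("pear", "apple", "fig"));
        System.out.println(result);
    }
}
```

Whether this fits depends on how the external application expects its input; if it only reads files or speaks HTTP, the other routes listed above apply instead.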
There's a lot going on here but I'll suggest an example for each part of this question and assume this is just going to be written in Java, and suggesting an iterative design/development approach.
receive input from the user::getting arguments from the command line can work, but I think most users want to use familiar user interfaces like Excel to input large amounts of data. Have them export files to .csv, or look into reading Excel files directly with Apache POI. The latter is not for beginners, but it's not terrible to figure out or find examples for. The former should be easy to figure out if you look into reading files line by line and splitting each line on the delimiter. Here's an example of that:
try (BufferedReader reader = new BufferedReader(new FileReader(new File("user_input.csv")))) {
    String currentLine = reader.readLine();
    while (currentLine != null) {
        String[] splitLine = currentLine.split(","); //choose delimiter here
        //process cells as needed
        //write output somewhere so other program can read it later
        currentLine = reader.readLine();
    }
}
catch (IOException ex) {
    System.out.println(ex.getMessage()); //maybe write to an error log
    System.exit(1);
}
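One detail worth knowing when splitting CSV lines this way: String.split(",") silently drops trailing empty cells, so rows with blank last columns come back shorter than expected. A small sketch (the class and helper names are made up) showing the difference:

```java
class CsvSplitDemo {
    // A negative limit tells split() to keep trailing empty strings,
    // so every row yields one entry per column.
    static String[] splitCells(String line) {
        return line.split(",", -1);
    }

    public static void main(String[] args) {
        String line = "id,name,,"; // last two cells are empty
        System.out.println(line.split(",").length);  // prints 2
        System.out.println(splitCells(line).length); // prints 4
    }
}
```

Note that neither form understands quoted fields that contain commas; for files exported from a spreadsheet with such fields, a dedicated CSV parser is the safer route.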
"input" data to other app::you can use pipes if you're at the command line, but I'd recommend you write to a file and have the other app read it. Here's an expansion of the previous code snippet showing how to write to a file, as that might be more practical and easier to log/archive/debug.
try (BufferedReader reader = new BufferedReader(new FileReader(new File("user_input.csv")));
     BufferedWriter writer = new BufferedWriter(new FileWriter(new File("process_me.csv")))) {
    String currentLine = reader.readLine();
    while (currentLine != null) {
        String[] splitLine = currentLine.split(","); //choose delimiter here
        //process cells as needed
        String processedStuff = String.join(",", splitLine); //placeholder: build your real output line here
        writer.write(processedStuff);
        writer.newLine(); //keep one record per line in the output file
        currentLine = reader.readLine();
    }
}
catch (IOException ex) {
    System.err.println(ex.getMessage());
    System.exit(1);
}
Then retrieving output::can just be reading another file with another Java program. This way you're communicating between programs using the file system. You must agree upon file formats and directories though. And you'll be limited to having both programs on the same server.
To make this at scale, you could use web services assuming the other program you're making requests to is a web service or has one wrapped around it. You can send your file and receive some response using URLConnection. This is where things will get much more complex, but now everything in your new program is just one Java program and the other code can live on another server.
Building the app first with those "intermediate" files between the user input code, the external code, and the final code will help you focus on perfecting the business logic, then you can worry about just communication over the network.
I see some posts on StackOverflow that contradict each other, and I would like to get a definite answer.
I started with the assumption that using a Java InputStream would allow me to stream bytes out of a file, and thus save on memory, as I would not have to consume the whole file at once. And that is exactly what I read here:
Loading all bytes to memory is not a good practice. Consider returning the file and opening an input stream to read it, so your application won't crash when handling large files. – andrucz
Download file to stream instead of File
But then I used an InputStream to read a very large Microsoft Excel file (using the Apache POI library) and I ran into this error:
java.lang.outofmemory exception while reading excel file (xlsx) using POI
I got an OutOfMemory error.
And this crucial bit of advice saved me:
One thing that'll make a small difference is when opening the file to start with. If you have a file, then pass that in! Using an InputStream requires buffering of everything into memory, which eats up space. Since you don't need to do that buffering, don't!
I got rid of the InputStream and just used a bare java.io.File, and then the OutOfMemory error went away.
So using java.io.File is better than an InputStream when it comes to memory use? That doesn't make any sense.
What is the real answer?
So you are saying that an InputStream would typically help?
It entirely depends on how the application (or library) >>uses<< the InputStream
With what kind of follow up code? Could you offer an example of memory efficient Java?
For example:
// Efficient use of memory
try (InputStream is = new FileInputStream(largeFileName);
     BufferedReader br = new BufferedReader(new InputStreamReader(is))) {
    String line;
    while ((line = br.readLine()) != null) {
        // process one line
    }
}
// Inefficient use of memory
try (InputStream is = new FileInputStream(largeFileName);
     BufferedReader br = new BufferedReader(new InputStreamReader(is))) {
    StringBuilder sb = new StringBuilder();
    String line;
    while ((line = br.readLine()) != null) {
        sb.append(line).append("\n");
    }
    String everything = sb.toString();
    // process the entire string
}
// Very inefficient use of memory
try (InputStream is = new FileInputStream(largeFileName);
     BufferedReader br = new BufferedReader(new InputStreamReader(is))) {
    String everything = "";
    String line;
    while ((line = br.readLine()) != null) {
        everything += line + "\n";
    }
    // process the entire string
}
(Note that there are more efficient ways of reading a file into memory. The above examples are purely to illustrate the principles.)
The general principles here are:
avoid holding the entire file in memory, all at the same time
if you have to hold the entire file in memory, then be careful about how you "accumulate" the characters.
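As a concrete instance of the first principle, here is a sketch that computes two summary values over a file while keeping only the current line and two counters in memory. The class name, method name, and the choice of statistics are invented for the example; UTF-8 is an assumption.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

class StreamingStats {
    // Returns {lineCount, longestLineLength}. Memory use does not grow with
    // the file size, because nothing is accumulated except two numbers.
    static long[] scan(Path file) throws IOException {
        long lines = 0;
        long longest = 0;
        try (BufferedReader br = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = br.readLine()) != null) {
                lines++;
                longest = Math.max(longest, line.length());
            }
        }
        return new long[] { lines, longest };
    }

    public static void main(String[] args) throws IOException {
        // Sample data written to a temp file so the sketch runs on its own.
        Path sample = Files.createTempFile("stats", ".txt");
        Files.write(sample, Arrays.asList("ab", "cdef", ""));
        System.out.println(Arrays.toString(scan(sample))); // prints [3, 4]
    }
}
```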
The posts that you linked to above:
The first one is not really about memory efficiency. Rather, it is talking about a limitation of the AWS client-side library. Apparently, the API doesn't provide an easy way to stream an object while reading it. You have to save the object to a file, then open the file as a stream. Whether that is memory efficient or not depends on what the application does with the stream; see above.
The second one is specific to the POI APIs. Apparently, the POI library itself reads the stream contents into memory if you use a stream. That would be an implementation limitation of that particular library. (But there could be a good reason; e.g. maybe because POI needs to be able to "seek" or "rewind" the stream.)
I am trying to save and load files in a project that is coded in libGDX, which means that I can't use a BufferedReader because Android won't read it, and I can't move the project to android because it has to be in the core. After days and days of trying to understand it all, I am now trying file handling, which should work, right? But I can't get it to read line by line; it puts all the text in one string. Please help. Also, is my understanding correct that saving and loading is way more complicated than it should be? Here is the code:
FileHandle handle = Gdx.files.local("words.txt");
String text = handle.readString();
words.add(text);
There are several ways to read this line by line. The LibGDX FileHandle API can read a file in as a string, as a byte array, or through various readers. I am assuming you have some form of dictionary in this file, with the words in a list separated by newlines? If so, you can take your existing string and split it on the newline terminator.
FileHandle handle = Gdx.files.local("words.txt");
String text = handle.readString();
String[] wordsArray = text.split("\\r?\\n");
for (String word : wordsArray) {
    words.add(word);
}
There are really only two newline conventions (UNIX and Windows) that you need to worry about.
FileHandle API
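The splitting step can be tried out without libGDX at all. Below is a plain-Java sketch of it; the class and method names are made up, and the trimming and blank-line skipping are additions that a word list may or may not want. In the game, the text argument would be the string returned by handle.readString().

```java
import java.util.ArrayList;
import java.util.List;

class WordListParser {
    // Splits on UNIX (\n) and Windows (\r\n) line endings, trims each
    // entry, and skips blank lines so they don't end up as empty words.
    static List<String> parseWords(String text) {
        List<String> words = new ArrayList<>();
        for (String line : text.split("\\r?\\n")) {
            String word = line.trim();
            if (!word.isEmpty()) {
                words.add(word);
            }
        }
        return words;
    }

    public static void main(String[] args) {
        String text = "apple\r\nbanana\n\n cherry \n";
        System.out.println(parseWords(text)); // prints [apple, banana, cherry]
    }
}
```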
This is for all of you out there who are new to saving and loading and tired of looking for answers. Let me save you the trouble and days of research.
If you start a project in libGDX and want to save and load on Android, do not follow the BufferedReader, InputStreamReader, or similar tutorials. THEY WILL NOT WORK, because for some reason Android cannot read inside the assets folder that way; it will work on your desktop version only.
If you are using Android Studio alone, then go ahead with the try/catch BufferedReader, File, or InputStreamReader approach.
Also, the Context and AssetManager route WILL NOT WORK, because the project has to be in your android folder, not core, to use those libraries.
ELSE FOLLOW THE ABOVE METHOD.
Use classpath, internal, external, or local, depending on where you store your file! You're welcome.
StringBuilder buf = new StringBuilder();
FileHandle file = Gdx.files.internal("text.txt");
BufferedReader reader = null;
try {
    reader = new BufferedReader(new InputStreamReader(file.read()));
    String str;
    while ((str = reader.readLine()) != null) {
        buf.append(str).append("\n");
    }
} finally {
    // close the reader (the original closed an undefined variable "is")
    try { if (reader != null) reader.close(); } catch (Throwable ignore) {}
}
I have Java code that reads through an input file using a BufferedReader until the readLine() method returns null. I need to use the contents of the file again an indefinite number of times. How can I read this file from the beginning again?
You can close and reopen it again. Another option: if it is not too large, put its content into, say, a List.
BufferedReader supports reset() only back to a marked position that is still within the read-ahead limit, so it can't go back to the beginning of a file that is larger than that limit.
Solutions:
1. Reopen
2. Use RandomAccessFile
A single Reader should be used once to read the file. If you want to read the file again, create a new Reader based on it.
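A minimal sketch of that "new Reader per pass" idea follows. The class name, method names, and the line-counting work are placeholders for whatever you actually do with the file, and UTF-8 is an assumption.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

class ReopenPerPass {
    // One full pass over the file with its own Reader, closed at the end.
    static int countLines(Path file) throws IOException {
        int count = 0;
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            while (reader.readLine() != null) {
                count++;
            }
        }
        return count;
    }

    // Repeats the pass as often as needed; each iteration reopens the
    // file, so every pass starts from the beginning.
    static int totalLinesOverPasses(Path file, int passes) throws IOException {
        int total = 0;
        for (int i = 0; i < passes; i++) {
            total += countLines(file);
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Sample data written to a temp file so the sketch runs on its own.
        Path sample = Files.createTempFile("passes", ".txt");
        Files.write(sample, Arrays.asList("first", "second"));
        System.out.println(totalLinesOverPasses(sample, 3)); // prints 6
    }
}
```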
Using Guava's IO utilities, you can create a nice abstraction that lets you read the file as many times as you want using Files.newReaderSupplier(File, Charset). This gives you an InputSupplier<InputStreamReader> that you can retrieve a new Reader from by calling getInput() at any time. (InputSupplier and newReaderSupplier were removed in later Guava releases; there, Files.asCharSource(File, Charset) plays the same role.)
Even better, Guava has many utility methods that make use of InputSuppliers directly... this saves you from having to worry about closing the supplied Reader yourself. The CharStreams class contains most of the text-related IO utilities. A simple example:
public void doSomeStuff(InputSupplier<? extends Reader> readerSupplier) throws IOException {
    boolean needToDoMoreStuff = true;
    while (needToDoMoreStuff) {
        // this handles creating, reading, and closing the Reader!
        List<String> lines = CharStreams.readLines(readerSupplier);
        // do some stuff with the lines you read
        // (and set needToDoMoreStuff to false once you are done)
    }
}
Given a File, you could call this method like:
File file = ...;
doSomeStuff(Files.newReaderSupplier(file, Charsets.UTF_8)); // or whatever charset
If you want to do some processing for each line without reading every line into memory first, you could alternatively use the readLines overload that takes a LineProcessor.
You can do this by calling the run() function recursively once no more lines can be read (note that each reload adds a stack frame, so a loop is safer if the file is re-read many times). Here's a sample:
// Reload the file when you reach the end (i.e. when you can't read anymore strings)
if ((sCurrentLine = br.readLine()) == null) {
    run();
}
If you want to do this, you may want to consider a random access file. With that you can explicitly set the position back to the beginning and start reading again from there.
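A short sketch of that approach, using java.io.RandomAccessFile and seek(0) to rewind. The class and method names are made up. One assumption to be aware of: RandomAccessFile.readLine() turns each byte into one char, so this is only suitable for ASCII (or Latin-1) text, which matches the ASCII files discussed here; it is also unbuffered, so it can be slow on large files.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class RewindWithRandomAccessFile {
    // Reads the whole file twice through the same open handle,
    // rewinding with seek(0) before each pass.
    static List<String> readTwice(String path) throws IOException {
        List<String> seen = new ArrayList<>();
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            for (int pass = 0; pass < 2; pass++) {
                raf.seek(0); // jump back to the first byte
                String line;
                while ((line = raf.readLine()) != null) {
                    seen.add(line);
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) throws IOException {
        // Sample data written to a temp file so the sketch runs on its own.
        Path sample = Files.createTempFile("rewind", ".txt");
        Files.write(sample, "a\nb\n".getBytes(StandardCharsets.US_ASCII));
        System.out.println(readTwice(sample.toString())); // prints [a, b, a, b]
    }
}
```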
I would suggest using the Apache Commons libraries:
http://commons.apache.org/io/api-release/org/apache/commons/io/FileUtils.html
I think there is a call that just reads the file into a byte array, which might be an alternative approach.
Not sure if you have considered the mark() and reset() methods on BufferedReader.
That can be an option if your files are only a few MB in size: you can set the mark at the beginning of the file and keep calling reset() once you hit the end of the file. It also appears that subsequent reads on the same file will be served entirely from the buffer without having to go to the disk.
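Here is a sketch of the mark()/reset() idea for a small file. The class and method names are invented, and UTF-8 is an assumption. The read-ahead limit passed to mark() has to be larger than the number of characters read before reset(); the file size in bytes plus one is used as a safe upper bound. BufferedReader may allocate a buffer of that size, so this effectively keeps the whole file in memory.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class MarkResetDemo {
    // Reads the file twice through one BufferedReader by marking the
    // start and calling reset() after each pass.
    static List<String> readTwice(Path file) throws IOException {
        long size = Files.size(file);
        if (size >= Integer.MAX_VALUE) {
            throw new IOException("file too large for mark()/reset()");
        }
        List<String> seen = new ArrayList<>();
        try (BufferedReader br = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            // The limit must exceed everything read before reset().
            br.mark((int) size + 1);
            for (int pass = 0; pass < 2; pass++) {
                String line;
                while ((line = br.readLine()) != null) {
                    seen.add(line);
                }
                br.reset(); // back to the marked start of the file
            }
        }
        return seen;
    }

    public static void main(String[] args) throws IOException {
        // Sample data written to a temp file so the sketch runs on its own.
        Path sample = Files.createTempFile("mark", ".txt");
        Files.write(sample, "a\nb\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(readTwice(sample)); // prints [a, b, a, b]
    }
}
```

For anything beyond small files, reopening the file is usually the simpler and cheaper choice.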
I faced the same issue and came across this question.
1. Using mark() and reset() methods:
A BufferedReader can be created from a FileReader and also from a FileInputStream. FileReader itself doesn't support the mark and reset methods; I got an exception when I tried to do this. Even when I tried with FileInputStream I wasn't able to do it, because my file was large (yours is too, I guess). If the file is longer than the read-ahead limit passed to mark(), the mark and reset methods won't work, with either FileReader or FileInputStream. More on this in this answer by #jtahlborn.
2. Closing and reopening the file
When I closed and reopened the file and created a new BufferedReader, it worked well.
The ideal way, I guess, is to reopen the file and construct a new BufferedReader, as a FileReader or FileInputStream should be used only once to read the file.
try (BufferedReader br = new BufferedReader(new FileReader(input))) {
    String line;
    while ((line = br.readLine()) != null) {
        // do something
    }
} catch (IOException e) {
    System.err.println("Error: " + e.getMessage());
}