Read any file efficiently in Java as a String

I'm working on a simple implementation of Huffman coding. It works fine for any file that uses some form of text encoding, but when I try to read any other format (e.g. .mp4, .png, .exe) it still works but becomes extremely slow
(minutes instead of less than a second for a file of the same size).
My question: is there another method I should be using to read these files, so that the read speed depends on the size of the file and not on its format? If so, what is it? Thanks.
This is my IO class. It uses a FileReader wrapped in a BufferedReader to read files based on a path entered in the console.
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
public class IO {
public String readFile(String path, boolean includeNewLine) {
String returnString = "";
try {
FileReader fileReader = new FileReader(path);
BufferedReader bufferedReader = new BufferedReader(fileReader);
String line;
int nLines = 0;
while((line = bufferedReader.readLine()) != null) {
if(nLines > 0 && includeNewLine) {
returnString += "\n";
}
returnString += line;
nLines++;
}
bufferedReader.close();
} catch(FileNotFoundException e) {
System.out.println("Unable to open file '" + path + "'");
} catch(IOException e) {
System.out.println("Error reading file '" + path + "'");
}
return returnString;
}
}

Maybe this will help: FileInputStream vs FileReader
And, of course, change your method to use StringBuilder (but that's another issue).

With returnString you are creating a new String instance every time you append a line to the previous content. Instead, I would suggest you use a StringBuilder as follows:
StringBuilder fileContent = new StringBuilder();
//do your stuff
fileContent.append(line);
This way, you keep reusing the same builder object. Also, if you are reading binary content, it is better to use a class from the InputStream hierarchy.
The Files class from the java.nio.file package can be used to read the lines instead, as below:
try (Stream<String> stream = Files.lines( Paths.get(filePath), StandardCharsets.UTF_8)) {
stream.forEach(s -> fileContent.append(s).append("\n"));
}
Another way would be to use the already tested code provided by the Apache Commons IO API: FileUtils.readFileToString.

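As long as you are trying to interpret the file as a String, you'll be running into problems with efficiency. A binary file has no meaningful line structure: a byte that happens to be read as an end-of-line character ('\n') may occur very often or never, so readLine() can return a huge number of tiny lines or one enormous line. (A Java String is not limited to 64K characters; that limit applies only to string constants in class files. At run time a String can hold up to about 2^31 - 1 characters, memory permitting.) Decoding arbitrary bytes as characters can also silently alter the data.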
You should interpret your file as a sequence of bytes. Use a memory mapped ByteBuffer for maximum efficiency.
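As a rough sketch of what treating the file as bytes could look like for Huffman coding, the class below counts how often each byte value occurs, once through a plain byte array and once through a memory-mapped buffer. The class and method names are made up for illustration and are not from the question; a single mapping is limited to 2 GB, so larger files would need to be mapped in chunks.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper: byte frequencies are the usual input to a Huffman tree builder.
final class ByteFrequencies {

    // Counts how often each of the 256 possible byte values occurs in the buffer.
    static long[] count(ByteBuffer buffer) {
        long[] frequencies = new long[256];
        while (buffer.hasRemaining()) {
            // Java bytes are signed, so mask to get an index in 0..255.
            frequencies[buffer.get() & 0xFF]++;
        }
        return frequencies;
    }

    // Simple variant: load the whole file into a byte[] and wrap it.
    static long[] countSmallFile(Path path) throws IOException {
        return count(ByteBuffer.wrap(Files.readAllBytes(path)));
    }

    // Memory-mapped variant: the OS pages the file in on demand.
    // Assumes the file is smaller than 2 GB (the limit of one mapping).
    static long[] countMapped(Path path) throws IOException {
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
            return count(channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size()));
        }
    }
}
```

Either variant runs in time proportional to the file size, regardless of whether the content is text or binary, because no character decoding or line splitting takes place.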

Related

How to load a text file to a string variable in java

I'm pretty new to the programming world, and I can't find a good explanation of how to load a txt file into a String variable in Java using Eclipse.
So far, from what I have been able to understand, I am supposed to use the StdIn class, and I know that the txt file needs to be located in my Eclipse workspace (outside the source folder), but I don't know exactly what I need to write in the code to load the given file into the variable.
I could really use some help with this.
Although I'm not a Java expert, I'm pretty sure this is the information you're looking for. It looks like this:
static String readFile(String path, Charset encoding)
throws IOException
{
byte[] encoded = Files.readAllBytes(Paths.get(path));
return new String(encoded, encoding);
}
Basically all languages provide you with some methods to read from the file system you're in. Hope that does it for you!
Good luck with your project!
To read a file and store it in a String, you can build the result with either String or StringBuilder:
Create a BufferedReader around a FileReader constructed with the name of the file, so it is ready to read from the file.
Use a StringBuilder to append every line that is read.
When the reading has finished, store the result in the String data.
public static void main(String[] args) {
String data = "";
// try-with-resources closes the reader even if an exception is thrown
try (BufferedReader br = new BufferedReader(new FileReader("filename"))) {
StringBuilder sb = new StringBuilder();
String line = br.readLine();
while (line != null) {
sb.append(line);
sb.append("\n");
line = br.readLine();
}
data = sb.toString();
} catch (Exception e) {
e.printStackTrace();
}
}

How to replace a line with a new line using Java

Using a BufferedReader, I parse through a file. If the Oranges: pattern is found, I want to replace it with ApplesAndOranges.
try (BufferedReader br = new BufferedReader(new FileReader(resourcesFilePath))) {
String line;
while ((line = br.readLine()) != null) {
if (line.startsWith("Oranges:")){
int startIndex = line.indexOf(":");
line = line.substring(startIndex + 2);
String updatedLine = "ApplesAndOranges";
updateLine(line, updatedLine);
I call a method updateLine and I pass my original line as well as the updated line value.
private static void updateLine(String toUpdate, String updated) throws IOException {
BufferedReader file = new BufferedReader(new FileReader(resourcesFilePath));
PrintWriter writer = new PrintWriter(new File(resourcesFilePath+".out"), "UTF-8");
String line;
while ((line = file.readLine()) != null)
{
line = line.replace(toUpdate, updated);
writer.println(line);
}
file.close();
if (writer.checkError())
throw new IOException("Can't Write To File"+ resourcesFilePath);
writer.close();
}
To get the file to update, I have to save it under a different name (resourcesFilePath+".out"). If I use the original file name, the saved version becomes blank.
So here is my question: how can I replace a line with any value in the original file without losing any data?
For this you can use regular expressions (RegExp) like this:
str = str.replaceAll("^Orange:(.*)", "OrangeAndApples:$1");
This is only an example and may not be exactly what you want. In the first parameter, the expression in parentheses is called a capturing group. The matched text is replaced by the second parameter, in which $1 is replaced by the value of the capturing group. In our example, Orange:Hello at the beginning of a line is replaced by OrangeAndApples:Hello.
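To make the capturing group concrete, here is a tiny sketch that applies the same pattern to single lines. The class and method names are invented for the illustration.

```java
// Hypothetical demo class; the pattern and replacement are the ones from the answer.
final class CapturingGroupDemo {

    // "^Orange:" anchors the match at the start of the string,
    // "(.*)" captures the rest, and "$1" re-inserts the captured text.
    static String rename(String line) {
        return line.replaceAll("^Orange:(.*)", "OrangeAndApples:$1");
    }
}
```

With this, rename("Orange:Hello") gives "OrangeAndApples:Hello", while a line such as "Banana:Hi" is returned unchanged because the pattern does not match.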
In your code, it seems you create one file per line... maybe inlining the sub-method would be better.
try (
BufferedReader br = new BufferedReader(new FileReader(resourcesFilePath));
BufferedWriter writer = Files.newBufferedWriter(outputFilePath, charset);
) {
String line;
while ((line = br.readLine()) != null) {
String repl = line.replaceAll("Orange:(.*)","OrangeAndApples:$1");
// BufferedWriter has no writeln(); write the text, then a line separator
writer.write(repl);
writer.newLine();
}
}
The easiest way to overwrite everything in your original file would be to read in everything, change whatever you want to change, and close the stream. Afterwards, open the file again and overwrite it and all its lines with the data you want.
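A minimal sketch of that read-everything, then overwrite approach, assuming the file is small enough to hold in memory and is encoded as UTF-8. The class name, method name, and parameters are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the original question or answers.
final class InPlaceLineReplace {

    // Replaces every line that starts with the given prefix by the replacement text.
    static void replaceLines(Path file, String prefix, String replacement) throws IOException {
        // Step 1: read all lines; the file is closed again when this call returns.
        List<String> lines = Files.readAllLines(file, StandardCharsets.UTF_8);

        // Step 2: change whatever needs changing, in memory.
        List<String> updated = new ArrayList<>(lines.size());
        for (String line : lines) {
            updated.add(line.startsWith(prefix) ? replacement : line);
        }

        // Step 3: reopen the same file and overwrite it (the default options truncate it).
        Files.write(file, updated, StandardCharsets.UTF_8);
    }
}
```

Because reading has finished before writing starts, the original file name can be reused without ending up with a blank file. Note that Files.write ends every line with the platform line separator.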
You can use RandomAccessFile to write to the file, and nio.Files to read the bytes from it. In this case, I put it as a string.
You can also read the file with RandomAccessFile, but it is easier to do it this way, in my opinion.
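import java.io.RandomAccessFile;
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.*;
public void replace(File file){
// try-with-resources closes the file even if an exception is thrown
try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
Path p = Paths.get(file.toURI());
// Assumes the file is UTF-8 text
String content = new String(Files.readAllBytes(p), StandardCharsets.UTF_8);
if(content.startsWith("Oranges:")){
// Strings are immutable: replaceAll returns a new String, so keep the result
content = content.replaceAll("Oranges:", "ApplesandOranges:");
// Truncate first so no stale bytes remain, then write the raw bytes.
// writeUTF would prepend a 2-byte length and use modified UTF-8, which corrupts a text file.
raf.setLength(0);
raf.write(content.getBytes(StandardCharsets.UTF_8));
}
} catch (IOException e) {
e.printStackTrace();
}
}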

Bufferedreader explanation?

BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
System.in(Standard input stream)- gets the input from keyboard in bytes
InputStreamReader: Converts the bytes into Unicode characters/ converts the standard input into reader object to be used with BufferedReader
Finally BufferedReader: Used to read from character input stream(Input stream reader)
String c = br.ReadLine(); -- a method used to read characters from input stream and put them in the string in one go not byte by byte.
Is everything above right ? Please correct if anything wrong !
Nearly there, but this:
String c = br.readLine(); -- a method used to read characters from input stream and put them in the string in one go not byte by byte.
It reads characters from the input reader (BufferedReader doesn't know about streams) and returns a whole line in one go, not character by character. Think of it in layers, and "above" the InputStreamReader layer, the concept of "bytes" doesn't exist any more.
Also, note that you can read blocks of characters with a Reader without reading a line: read(char[], int, int) - the point of readLine() is that it will do the line ending detection for you.
(As noted in comments, it's also readLine, not ReadLine :)
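To illustrate the read(char[], int, int) point, here is a small sketch that drains any Reader in blocks of characters instead of lines. The class name and buffer size are arbitrary choices for the example.

```java
import java.io.IOException;
import java.io.Reader;

// Hypothetical helper showing block-wise reading from a Reader.
final class BlockRead {

    // Reads everything from the reader, up to 4096 characters at a time.
    // Unlike readLine(), this keeps line terminators exactly as they appear.
    static String readAll(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buffer = new char[4096];
        int n;
        // read() returns the number of chars placed in the buffer, or -1 at end of input.
        while ((n = reader.read(buffer, 0, buffer.length)) != -1) {
            sb.append(buffer, 0, n);
        }
        return sb.toString();
    }
}
```

It can be called with any Reader, for example BlockRead.readAll(new InputStreamReader(System.in)). At this layer only characters are visible; the byte-to-character decoding has already happened inside the InputStreamReader.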
What is the purpose of BufferedReader, explanation?
BufferedReader is a Java class. The following is the hierarchy of this class:
java.lang.Object ==> java.io.Reader ==> java.io.BufferedReader
BufferedReader also provides an efficient way to read content, and it is very simple to use.
Let's have a look at the following example to understand.
import java.io.BufferedReader;
import java.io.FileReader;
public class Main {
public static void main(String[] args) {
BufferedReader contentReader = null;
int total = 0; // running total of the numbers read from the file
// Create an instance of class BufferedReader.
// FileReader is a built-in class that takes care of the details of reading content from a file.
// BufferedReader adds buffering on top of that to make reading from a file more efficient.
try{
contentReader = new BufferedReader(new FileReader("c:\\Numbers.txt"));
String line = null;
while((line = contentReader.readLine()) != null)
total += Integer.valueOf(line);
System.out.println("Total: " + total);
}
catch(Exception e)
{
System.out.println(e.getMessage());
}
finally{
try{
if(contentReader != null)
contentReader.close();
}
catch (Exception e)
{
System.out.println(e.getMessage());
}
}
}
}

Search and Replace in a file using arraylist, Java

I wrote the code below, but I couldn't connect the ArrayList to the search and replace.
My CSV file looks like this:
1/1/1;7/6/1
1/1/2;7/7/1
I want to search the file 1.cfg for 1/1/1 and change it to 7/6/1, change 1/1/2 to 7/7/1, and so on.
Thank you all in advance.
Right now it only writes the last line of the old file to the new file.
import java.io.*;
import java.util.ArrayList;
import java.util.List;
public class ChangeConfiguration {
/**
* @param args
* @throws IOException
*/
public static void main(String[] args)
{
try{
// Open the file that is the first
// command line parameter
FileInputStream degistirilecek = new FileInputStream("c:/Config_Changer.csv");
FileInputStream config = new FileInputStream("c:/1.cfg");
// Get the object of DataInputStream
DataInputStream in = new DataInputStream(config);
DataInputStream degistir = new DataInputStream(degistirilecek);
BufferedReader br = new BufferedReader(new InputStreamReader(in));
BufferedReader brdegis = new BufferedReader(new InputStreamReader(degistir));
List<Object> arrayLines = new ArrayList<Object>();
Object contents;
while ((contents = brdegis.readLine()) != null)
{
arrayLines.add(contents);
}
System.out.println(arrayLines + "\n");
String strLine;
//Read File Line By Line
while ((strLine = br.readLine()) != null) {
//Couldn't modify this part error is here :(
BufferedWriter out = new BufferedWriter(new FileWriter("c:/1_new.cfg"));
out.write(strLine);
out.close();
}
in.close();
degistir.close();
}catch (Exception e){//Catch exception if any
System.err.println("Error: " + e.getMessage());
}
}
}
You are opening the file for reading when you declare:
BufferedReader br = new BufferedReader(new InputStreamReader(in));
If you know the entire file will fit in memory, I recommend doing the following:
Open the file and read its contents into memory as one giant string, then close the file.
Apply your replacements in one shot to the giant string.
Open the file and write out the contents of the giant string (e.g. using a BufferedWriter), then close the file.
As a side note, your code as posted will not compile. The quality of the responses you receive is correlated with the quality of the question asked. Always include an SSCCE with your question to increase the chance of getting a precise answer.
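The three steps in this answer could be sketched roughly as follows for the CSV-driven case in the question. The class and method names are hypothetical, UTF-8 is an assumed encoding, and the replacement is plain substring replacement applied in CSV order, so a key such as 1/1/1 would also match inside 1/1/10.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not the asker's code.
final class ConfigRewriter {

    // Parses "old;new" lines into an ordered map; lines without a ';' are skipped.
    static Map<String, String> parseMappings(Iterable<String> csvLines) {
        Map<String, String> mappings = new LinkedHashMap<>();
        for (String line : csvLines) {
            String[] parts = line.split(";", 2);
            if (parts.length == 2 && !parts[0].isEmpty()) {
                mappings.put(parts[0], parts[1]);
            }
        }
        return mappings;
    }

    // Applies every mapping to the whole content, one after another.
    static String applyAll(String content, Map<String, String> mappings) {
        for (Map.Entry<String, String> e : mappings.entrySet()) {
            content = content.replace(e.getKey(), e.getValue());
        }
        return content;
    }

    // Step 1: read everything, step 2: replace in memory, step 3: write the result out.
    static void rewrite(Path csv, Path config, Path output) throws IOException {
        Map<String, String> mappings = parseMappings(Files.readAllLines(csv, StandardCharsets.UTF_8));
        String content = new String(Files.readAllBytes(config), StandardCharsets.UTF_8);
        Files.write(output, applyAll(content, mappings).getBytes(StandardCharsets.UTF_8));
    }
}
```

Because the output is written once, after all replacements, the new file contains the whole configuration rather than only its last line.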
Can you elaborate on the purpose of the program?
If it is a simple content replacement in a file,
then just read a line and store it in a String, then use the String replace method to replace text in it.
E.g.:
newString = oldString.replace(oldValue, newValue);

Java: How to read a text file

I want to read a text file containing space separated values. Values are integers.
How can I read it and put it in an array list?
Here is an example of contents of the text file:
1 62 4 55 5 6 77
I want to have it in an arraylist as [1, 62, 4, 55, 5, 6, 77]. How can I do it in Java?
You can use Files#readAllLines() to get all lines of a text file into a List<String>.
for (String line : Files.readAllLines(Paths.get("/path/to/file.txt"))) {
// ...
}
Tutorial: Basic I/O > File I/O > Reading, Writing and Creating text files
You can use String#split() to split a String in parts based on a regular expression.
for (String part : line.split("\\s+")) {
// ...
}
Tutorial: Numbers and Strings > Strings > Manipulating Characters in a String
You can use Integer#valueOf() to convert a String into an Integer.
Integer i = Integer.valueOf(part);
Tutorial: Numbers and Strings > Strings > Converting between Numbers and Strings
You can use List#add() to add an element to a List.
numbers.add(i);
Tutorial: Interfaces > The List Interface
So, in a nutshell (assuming that the file doesn't have empty lines nor trailing/leading whitespace).
List<Integer> numbers = new ArrayList<>();
for (String line : Files.readAllLines(Paths.get("/path/to/file.txt"))) {
for (String part : line.split("\\s+")) {
Integer i = Integer.valueOf(part);
numbers.add(i);
}
}
If you happen to be at Java 8 already, then you can even use Stream API for this, starting with Files#lines().
List<Integer> numbers = Files.lines(Paths.get("/path/to/test.txt"))
.map(line -> line.split("\\s+")).flatMap(Arrays::stream)
.map(Integer::valueOf)
.collect(Collectors.toList());
Tutorial: Processing data with Java 8 streams
Java 1.5 introduced the Scanner class for handling input from files and streams.
It can be used to get integers from a file, and would look something like this:
List<Integer> integers = new ArrayList<Integer>();
Scanner fileScanner = new Scanner(new File("c:\\file.txt"));
while (fileScanner.hasNextInt()){
integers.add(fileScanner.nextInt());
}
Check the API though. There are many more options for dealing with different types of input sources, differing delimiters, and differing data types.
This example code shows you how to read file in Java.
import java.io.*;
/**
* This example code shows you how to read file in Java
*
* IN MY CASE RAILWAY IS MY TEXT FILE WHICH I WANT TO DISPLAY YOU CHANGE WITH YOUR OWN
*/
public class ReadFileExample
{
public static void main(String[] args)
{
System.out.println("Reading File from Java code");
//Name of the file
String fileName="RAILWAY.txt";
try{
//Create object of FileReader
FileReader inputFile = new FileReader(fileName);
//Instantiate the BufferedReader Class
BufferedReader bufferReader = new BufferedReader(inputFile);
//Variable to hold the one line data
String line;
// Read file line by line and print on the console
while ((line = bufferReader.readLine()) != null) {
System.out.println(line);
}
//Close the buffer reader
bufferReader.close();
}catch(Exception e){
System.out.println("Error while reading file line by line:" + e.getMessage());
}
}
}
Look at this example, and try to do your own:
import java.io.*;
public class ReadFile {
public static void main(String[] args){
String string = "";
String file = "textFile.txt";
// Reading
try{
InputStream ips = new FileInputStream(file);
InputStreamReader ipsr = new InputStreamReader(ips);
BufferedReader br = new BufferedReader(ipsr);
String line;
while ((line = br.readLine()) != null){
System.out.println(line);
string += line + "\n";
}
br.close();
}
catch (Exception e){
System.out.println(e.toString());
}
// Writing
try {
FileWriter fw = new FileWriter (file);
BufferedWriter bw = new BufferedWriter (fw);
PrintWriter fileOut = new PrintWriter (bw);
fileOut.println (string+"\n test of read and write !!");
fileOut.close();
System.out.println("the file " + file + " is created!");
}
catch (Exception e){
System.out.println(e.toString());
}
}
}
Just for fun, here's what I'd probably do in a real project, where I'm already using all my favourite libraries (in this case Guava, formerly known as Google Collections).
String text = Files.toString(new File("textfile.txt"), Charsets.UTF_8);
List<Integer> list = Lists.newArrayList();
for (String s : text.split("\\s")) {
list.add(Integer.valueOf(s));
}
Benefit: Not much own code to maintain (contrast with e.g. this). Edit: Although it is worth noting that in this case tschaible's Scanner solution doesn't have any more code!
Drawback: you obviously may not want to add new library dependencies just for this. (Then again, you'd be silly not to make use of Guava in your projects. ;-)
Use Apache Commons (IO and Lang) for simple/common things like this.
Imports:
import org.apache.commons.io.FileUtils;
import org.apache.commons.lang3.ArrayUtils;
Code:
String contents = FileUtils.readFileToString(new File("path/to/your/file.txt"));
String[] array = ArrayUtils.toArray(contents.split(" "));
Done.
Using Java 7 to read files with NIO.2
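Import these classes:
import java.io.BufferedReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.StringTokenizer;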
This is the process to read a file:
Path file = Paths.get("C:\\Java\\file.txt");
if(Files.exists(file) && Files.isReadable(file)) {
try {
// File reader
BufferedReader reader = Files.newBufferedReader(file, Charset.defaultCharset());
String line;
// read each line
while((line = reader.readLine()) != null) {
System.out.println(line);
// tokenize each number
StringTokenizer tokenizer = new StringTokenizer(line, " ");
while (tokenizer.hasMoreElements()) {
// parse each integer in file
int element = Integer.parseInt(tokenizer.nextToken());
}
}
reader.close();
} catch (Exception e) {
e.printStackTrace();
}
}
To read all lines of a file at once:
Path file = Paths.get("C:\\Java\\file.txt");
List<String> lines = Files.readAllLines(file, StandardCharsets.UTF_8);
All the answers so far given involve reading the file line by line, taking the line in as a String, and then processing the String.
There is no question that this is the easiest approach to understand, and if the file is fairly short (say, tens of thousands of lines), it'll also be acceptable in terms of efficiency. But if the file is long, it's a very inefficient way to do it, for two reasons:
Every character gets processed twice, once in constructing the String, and once in processing it.
The garbage collector will not be your friend if there are lots of lines in the file. You're constructing a new String for each line, and then throwing it away when you move to the next line. The garbage collector will eventually have to dispose of all these String objects that you don't want any more. Someone's got to clean up after you.
If you care about speed, you are much better off reading a block of data and then processing it byte by byte rather than line by line. Every time you come to the end of a number, you add it to the List you're building.
It will come out something like this:
private List<Integer> readIntegers(File file) throws IOException {
List<Integer> result = new ArrayList<>();
RandomAccessFile raf = new RandomAccessFile(file, "r");
byte buf[] = new byte[16 * 1024];
final FileChannel ch = raf.getChannel();
int fileLength = (int) ch.size();
final MappedByteBuffer mb = ch.map(FileChannel.MapMode.READ_ONLY, 0,
fileLength);
int acc = 0;
while (mb.hasRemaining()) {
int len = Math.min(mb.remaining(), buf.length);
mb.get(buf, 0, len);
for (int i = 0; i < len; i++)
if ((buf[i] >= 48) && (buf[i] <= 57))
acc = acc * 10 + buf[i] - 48;
else {
result.add(acc);
acc = 0;
}
}
ch.close();
raf.close();
return result;
}
The code above assumes that this is ASCII (though it could be easily tweaked for other encodings), and that anything that isn't a digit (in particular, a space or a newline) represents a boundary between numbers. As written, every non-digit byte appends the current value, so two separators in a row (for example "\r\n" or a double space) add a spurious 0 to the list. It also assumes that the file ends with a non-digit (in practice, that the last line ends with a newline), though, again, it could be tweaked to deal with the case where it doesn't.
It's much, much faster than any of the String-based approaches also given as answers to this question. There is a detailed investigation of a very similar issue in this question. You'll see there that there's the possibility of improving it still further if you want to go down the multi-threaded line.
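One possible tweak for the end-of-file assumption, sketched here on a plain byte array so it stays independent of how the bytes were obtained: a flag tracks whether a number is in progress, which also keeps runs of separators from producing extra zeros. The class name is made up, and the sketch still assumes non-negative ASCII integers that fit in an int.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical variant of the inner parsing loop.
final class DigitRuns {

    // Collects every run of ASCII digits in the data as one integer.
    static List<Integer> parse(byte[] data) {
        List<Integer> result = new ArrayList<>();
        int acc = 0;
        boolean inNumber = false; // true while we are in the middle of a run of digits
        for (byte b : data) {
            if (b >= '0' && b <= '9') {
                acc = acc * 10 + (b - '0');
                inNumber = true;
            } else if (inNumber) {
                // First non-digit after a number: the number is complete.
                result.add(acc);
                acc = 0;
                inNumber = false;
            }
            // Any further non-digits in a row are ignored.
        }
        if (inNumber) {
            // The data ended in the middle of a number, so flush it.
            result.add(acc);
        }
        return result;
    }
}
```

The same loop body could replace the inner for loop of the memory-mapped version, provided acc and the flag are kept outside the block loop so that numbers spanning two blocks are still joined.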
Read the file into a list of lines and then do whatever you want with it.
With Java 8:
Files.lines(Paths.get("c://lines.txt")).collect(Collectors.toList());
