Reading a Text File in Java and Converting to Multiple Vectors

I am fairly new to Java, and I am trying to read a file with it.
I have a text data file that needs to be imported into Java as arrays.
The first row of the data file holds the names: every variable has its name in the first row, and its data sit in the column below it. I want to import this into Java and be able to access each variable by name, just as if they were vectors in MATLAB. So basically we acquire the data and tag each vector. I need the code to be as generic as possible, so it should handle a variable number of columns and rows. I was able to create the array, though I believe with an inefficient method. Now I need to split that array into multiple arrays and convert them to numbers, grouping the numbers according to the vector they belong to.
The text file is created from an Excel spreadsheet, so it has one column per measurement, and each column becomes a vector containing the data from its rows.
I searched through a lot of code and tried to implement it, but I have reached a point where I cannot proceed without help. Can someone tell me how to proceed? Improvements to the reading part are welcome too, because I know this is not the best way to do it in Java. Here is what I have so far:
package Testing;
import java.io.*;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.*;
public class Read1 {
public static void main(String[] args) {
try {
FileReader fin = new FileReader("C:/jade/test/Winter_Full_clean.txt");
BufferedReader in = new BufferedReader(fin);
String str = "";
int count = 0;
String line;
while ((line=in.readLine()) != null) {
if (count==0) {
str = line;
}
else if (in.readLine()!=null) {
str = str + line;
}
count++;
}
in.close();
//System.out.printf(str);
System.out.print(tokens);
} catch (Exception e) {
System.out.println("error crap" + e.getClass());
}
}
//{
// Path yourFile = Paths.get("C:/jade/test/Winter_Full_clean.txt");
// Charset charset = Charset.forName("UTF-8");
// List<String> lines = null;
// try {
// lines = Files.readAllLines(yourFile, charset);
// } catch (IOException e) {
// TODO Auto-generated catch block
// e.printStackTrace();
// }
// String[] arr = lines.toArray(new String[lines.size()]);
// System.out.println(arr);
//}
}

I have some functional code that reads a text file line by line and splits each line on tabs (it's a TSV file). You should be able to adapt this as a starting point to read in your initial data and then, based on the data contained in your array, alter the logic to suit:
try (BufferedReader reader = new BufferedReader(new FileReader(path))) { //path is a String pointing to the file
int lineNum = 0;
String readLine;
while ((readLine = reader.readLine()) != null) { //read until end of stream
if (lineNum == 0) { //Skip headers, i.e. start at line 2 of your excel sheet
lineNum++;
continue;
}
String[] nextLine = readLine.split("\t"); //Split on tab
for (int x = 0; x < nextLine.length; x++) { //do something with ALL the lines.
nextLine[x] = nextLine[x].replace("\'", "");
nextLine[x] = nextLine[x].replace("/'", "");
nextLine[x] = nextLine[x].replace("\"", " ");
nextLine[x] = nextLine[x].replace("\\xFFFD", "");
}
//In this example, to access data for a certain column (arrays are zero-based),
//use nextLine[0] for col 1, nextLine[1] for col 2 etc.
} //close your loop and go to the next line in your file
} //close your resources.
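To get from "one array per line" to "one named vector per column", the header row can be used as the keys of a map. The following is only a sketch under some assumptions that are not in the question: the file is tab-separated, every data cell parses as a double, and column names are unique. The class name ColumnReader and the method parseColumns are made up for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ColumnReader {

    // Turns the lines of a tab-separated file into one list of numbers per column,
    // keyed by the name found in the header (first) row.
    static Map<String, List<Double>> parseColumns(List<String> lines) {
        // LinkedHashMap keeps the columns in the order they appear in the file
        Map<String, List<Double>> columns = new LinkedHashMap<>();
        if (lines.isEmpty()) {
            return columns;
        }
        String[] names = lines.get(0).split("\t");
        for (String name : names) {
            columns.put(name.trim(), new ArrayList<>());
        }
        for (String line : lines.subList(1, lines.size())) {
            if (line.trim().isEmpty()) {
                continue; // skip blank rows
            }
            String[] cells = line.split("\t");
            // Assumption: a row shorter than the header just contributes to fewer columns.
            for (int c = 0; c < names.length && c < cells.length; c++) {
                columns.get(names[c].trim()).add(Double.parseDouble(cells[c].trim()));
            }
        }
        return columns;
    }

    public static void main(String[] args) throws IOException {
        // The path comes from the command line here; it is not taken from the question.
        List<String> lines = Files.readAllLines(Paths.get(args[0]), StandardCharsets.UTF_8);
        for (Map.Entry<String, List<Double>> e : parseColumns(lines).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

A vector is then fetched by its header name, e.g. columns.get("temp") for a column that happens to be called temp. Non-numeric cells would make Double.parseDouble throw a NumberFormatException, so real data may need extra handling.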

Related

BufferedReader sometimes read text, sometimes doesn't

I'm trying to read the methods and classes out of a .java file, using tags to identify and store them. The problem is that BufferedReader sometimes just doesn't work: the buffer skips a lot of lines for a reason I can't understand. Sometimes, when editing the file by hand, I put random blank lines between lines and that fixes some parts, but I can't get the reader to go through all my text without skipping anything. My code so far looks like this:
public class ReadFile {
public static void main(String[] args) {
int numclas=0,numbase=0,numbaseagr=0,numbmet=0,numag=0;
String mt="//MT";
String[] nomclass2 = new String[10];
String[] nommetodo2 = new String[50];
boolean metodo=false;
BufferedReader in = null;
try {
in = new BufferedReader(new FileReader("\\Program.java"));
String read = null;
while ((read = in.readLine()) != null) {
read = in.readLine();
String[] splited = read.trim().split("\\s+");
for(int i=0;i<splited.length;i++){
System.out.println(splited[i]);
if(splited[i].equals("class")){
nomclass2[numclas]=splited[i+1];
numclas=numclas+1;
}
if (splited[i].equals(mt)){
metodo=true;
}
if (splited[i].equals("public")){
if (splited[i+1].equals("static")){
nommetodo2[numbmet]=splited[i+3];
numbmet=numbmet+1;
}
if (splited[i+1].equals("int")||splited[i+1].equals("double")||splited[i+1].equals("String")||splited[i+1].equals("boolean")){
nommetodo2[numbmet]=splited[i+2];
numbmet=numbmet+1;
}
if (splited[i].equals("int")||splited[i].equals("double")||splited[i].equals("String")||splited[i].equals("boolean")){
nommetodo2[numbmet]=splited[i+1];
numbmet=numbmet+1;
}
metodo=false;
}
if ((splited[i].equals("int")||splited[i].equals("double")||splited[i].equals("String")||splited[i].equals("boolean"))&&metodo){
nommetodo2[numbmet]=splited[i+1];
numbmet=numbmet+1;
metodo=false;
}
}
}
} catch (IOException e) {
System.out.println("There was a problem: " + e);
e.printStackTrace();
} finally {
try {
in.close();
} catch (Exception e) {
}
}
Now let me show you the .java I'm trying to read:
import java.text.DecimalFormat;
import java.io.*;
//Main file of the program 1
public class Program1 {
//MT
public static void main (String args []) {
DecimalFormat format=new DecimalFormat("##.##");
System.out.println("How many data do you want to insert?");
int num=Leer.Int();
Fila lista=new Fila();
Fila lista2=new Fila();
double x=0.0;
for(int i=0;i<num;i++){
x=Leer.Double();
lista.addNum(x);
}
double prom=0.0;
double desv=0.0;
prom=lista.getprom();
desv=lista.getdevst();
System.out.println("The mean for column 1 is: "+format.format(prom));
System.out.println("The Std.Dev for column 1 is: "+format.format(desv));
System.out.println("How many data do you want to insert?");
num=Leer.Int();
x=0.0;
for(int i=0;i<num;i++) {
x=Leer.Double();
lista2.addNum(x);
}
prom=0.0;
desv=0.0;
prom=lista2.getprom();
desv=lista2.getdevst();
System.out.println("The mean for column 2 is: "+format.format(prom));
System.out.println("The Std.Dev for column 2 is: "+format.format(desv));
}
}
And the result when I print the array
Date:
12/12/12
import
java.text.DecimalFormat;
//Main
file
of
the
program
1
//MT
DecimalFormat
format=new
DecimalFormat("##.##");
so on...
See how the buffer skips a lot of lines around //MT; this happens a lot (note how it also ignores the first lines of the program). I don't know how to fix it, because sometimes when I try to "fix" it by adding blank lines, I get a NullPointerException and the program ends.
Any help will be appreciated, thank you.
This is just a partial answer - at the very least your program is skipping every other line:
while ((read = in.readLine()) != null)
will read a line from the file. The line is immediately discarded because the immediately following statement:
read = in.readLine();
reads and processes the next line from the file.
(also, 'splited' should be 'split', along with numerous other spelling mistakes, but they're not really affecting your program, just its readability :-))
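A sketch of the read loop with a single readLine() call per iteration is shown below. It would also explain the NullPointerException mentioned in the question: with the extra call inside the loop, a file with an odd number of lines makes that second readLine() return null, and read.trim() then fails. The class and method names (LineTokens, readTokens) are invented for this example, and the input comes from a Reader so the sketch can be tried on a string instead of Program.java.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class LineTokens {

    // Reads every line exactly once and returns all whitespace-separated tokens.
    static List<String> readTokens(Reader source) {
        List<String> tokens = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(source)) {
            String read;
            // readLine() is called only in the loop condition,
            // so no line is consumed and then thrown away
            while ((read = in.readLine()) != null) {
                String trimmed = read.trim();
                if (trimmed.isEmpty()) {
                    continue; // "".split("\\s+") would yield one empty token
                }
                for (String token : trimmed.split("\\s+")) {
                    tokens.add(token);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return tokens;
    }

    public static void main(String[] args) {
        String sample = "import java.io.*;\n//MT\npublic static void main (String args []) {\n";
        System.out.println(readTokens(new StringReader(sample)));
    }
}
```

For a real file, passing new FileReader("Program.java") instead of the StringReader should work the same way, since try-with-resources closes whatever reader is handed in.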

Java, problems with string array of (string) array (maybe dynamic)

To speed up lookups in a multi-record file, I wish to store its elements in a String array of arrays, so that I can search for a string like "AF" only among similar strings ("AA", "AB", ... , "AZ") and not in the whole file.
The original file is like this:
AA
ABC
AF
(...)
AP
BE
BEND
(...)
BZ
(...)
SHORT
VERYLONGRECORD
ZX
which I want to translate into
AA ABC AF (...) AP
BE BEND (...) BZ
(...)
SHORT
VERYLONGRECORD
ZX
I don't know how many records there are or how many "elements" each "row" will have, as the source file can change over time (although, once read into memory, the array is only read).
I tried this solution:
in a class I declared the string array of (string) arrays, without defining its dimensions
public static String[][] tldTabData;
then, in another class, I read the file:
public static void tldLoadTable() {
String rec = null;
int previdx = 0;
int rowidx = 0;
// this will hold each row
ArrayList<String> mVector = new ArrayList<String>();
FileInputStream fStream;
BufferedReader bufRead = null;
try {
fStream = new FileInputStream(eVal.appPath+eVal.tldTabDataFilename);
// Use DataInputStream to read binary NOT text.
bufRead = new BufferedReader(new InputStreamReader(fStream));
} catch (Exception er1) {
/* if we fail the 1.st try maybe we're working into some "package" (e.g. debugging)
* so we'll try a second time with a modified path (e.g. adding "bin\") instead of
* raising an error and exiting.
*/
try {
fStream = new FileInputStream(eVal.appPath +
"bin"+ File.separatorChar + eVal.tldTabDataFilename);
// Use DataInputStream to read binary NOT text.
bufRead = new BufferedReader(new InputStreamReader(fStream));
} catch (FileNotFoundException er2) {
System.err.println("Error: " + er2.getMessage());
er2.printStackTrace();
System.exit(1);
}
}
try {
while((rec = bufRead.readLine()) != null) {
// strip comments and short (empty) rows
if(!rec.startsWith("#") && rec.length() > 1) {
// work with uppercase only (maybe unuseful)
//rec.toUpperCase();
// use the 1st char as a row index
rowidx = rec.charAt(0);
// if row changes (e.g. A->B and is not the 1.st line we read)
if(previdx != rowidx && previdx != 0)
{
// store the (completed) collection into the Array
eVal.tldTabData[previdx] = mVector.toArray(new String[mVector.size()]);
// clear the collection itself
mVector.clear();
// and restart to fill it from scratch
mVector.add(rec);
} else
{
// continue filling the collection
mVector.add(rec);
}
// and sync the indexes
previdx = rowidx;
}
}
bufRead.close();
// globally flag the table as loaded
eVal.tldTabLoaded = true;
} catch (Exception er2) {
System.err.println("Error: " + er2.getMessage());
er2.printStackTrace();
System.exit(1);
}
}
When executing the program, it correctly accumulates the strings into mVector but, when trying to copy them into the eVal.tldTabData I get a NullPointerException.
I bet I have to create/initialize the array at some point but having problems to figure where and how.
First time I'm coding in Java... helloworld apart. :-)
You can use a Map to store your strings per row.
Here is something along the lines of what you'll need:
//Assuming that mVector already holds all you input strings
Map<String,List<String>> map = new HashMap<String,List<String>>();
for (String str : mVector){
List<String> storedList;
if (map.containsKey(str.substring(0, 1))){
storedList = map.get(str.substring(0, 1));
}else{
storedList = new ArrayList<String>();
map.put(str.substring(0, 1), storedList);
}
storedList.add(str);
}
Set<String> unOrdered = map.keySet();
List<String> orderedIndexes = new ArrayList<String>(unOrdered);
Collections.sort(orderedIndexes);
for (String key : orderedIndexes){//get strings for every row
List<String> values = map.get(key);
for (String value : values){//writing strings on the same row
System.out.print(value + "\t"); // change this to writing to some file
}
System.out.println(); // add new line at the end of the row
}
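The same idea can be folded directly into the reading loop, which also sidesteps the NullPointerException from the question (there, tldTabData is declared but never allocated with new, so indexing into it fails). The sketch below assumes Java 8 or later and uses made-up names (PrefixIndex, groupByFirstChar); a TreeMap keeps the rows sorted by their first character without a separate sort step.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class PrefixIndex {

    // Groups records by their first character; TreeMap keeps the keys sorted.
    static Map<Character, List<String>> groupByFirstChar(List<String> records) {
        Map<Character, List<String>> index = new TreeMap<>();
        for (String rec : records) {
            // same filter as in the question: skip comments and rows of length <= 1
            if (rec.startsWith("#") || rec.length() <= 1) {
                continue;
            }
            // create the row's list on first use, then append the record
            index.computeIfAbsent(rec.charAt(0), k -> new ArrayList<>()).add(rec);
        }
        return index;
    }

    public static void main(String[] args) {
        List<String> records = Arrays.asList("AA", "ABC", "AF", "BE", "BEND", "ZX");
        Map<Character, List<String>> index = groupByFirstChar(records);
        // Look up "AF" only among the records that start with 'A'.
        System.out.println(index.get('A').contains("AF"));
        System.out.println(index);
    }
}
```

Because the map grows as needed, neither the number of rows nor the number of elements per row has to be known in advance.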

Read one line of a csv file in Java

I have a csv file that currently has 20 lines of data.
The data contains employee info and is in the following format:
first name, last name, Employee ID
So one line would look like this: Emma, Nolan, 2
I know how to write to the file in Java and print all 20 lines to the console, but I'm not sure how to get Java to print one specific line to the console.
I also want to take the employee ID from the last entry and have Java add 1 to it once I add new employees. I think this needs to be done with a counter; I'm just not sure how.
You can do something like this:
BufferedReader reader = new BufferedReader(new FileReader(<<your file>>));
List<String> lines = new ArrayList<>();
String line = null;
while ((line = reader.readLine()) != null) {
lines.add(line);
}
System.out.println(lines.get(0));
With BufferedReader you are able to read lines directly. This example reads the file line by line and stores the lines in an array list. You can access the lines after that by using lines.get(lineNumber).
You can read text from a file one line at a time and then do whatever you want to with that line, print it, compare it, etc...
// Construct a BufferedReader object from the input file
BufferedReader r = new BufferedReader(new FileReader("employeeData.txt"));
int i = 1;
try {
// "Prime" the while loop
String line = r.readLine();
while (line != null) {
// Print a single line of input file to console
System.out.println("Line "+i+": "+line);
// Prepare for next loop iteration
line = r.readLine();
i++;
}
} finally {
// Free up file descriptor resources
r.close();
}
// Remember the next available employee number in a one-up scheme
int nextEmployeeId = i;
BufferedReader reader =new BufferedReader(new FileReader("yourfile.csv"));
String line = "";
while((line=reader.readLine())!=null){
String [] employee =line.trim().split(",");
// if you want to check either it contains some name
//index 0 is first name, index 1 is last name, index 2 is ID
}
Alternatively, if you want more control over reading CSV files, you can look at CsvBeanReader (from the Super CSV library), which maps each row onto a bean.
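Putting the pieces from the answers above together, both parts of the question (print one specific line, and derive the next employee ID from the last entry) might look roughly like this once the lines are in a list. This is a sketch that assumes the ID is always the third comma-separated field and that the last line of the file holds the highest ID; the names EmployeeLines, lineAt and nextEmployeeId are invented for the example.

```java
import java.util.Arrays;
import java.util.List;

class EmployeeLines {

    // Returns the line at the given 1-based position.
    static String lineAt(List<String> lines, int lineNumber) {
        return lines.get(lineNumber - 1);
    }

    // Reads the ID (third field) from the last line and adds 1.
    static int nextEmployeeId(List<String> lines) {
        String last = lines.get(lines.size() - 1);
        String[] fields = last.split(",");
        return Integer.parseInt(fields[2].trim()) + 1;
    }

    public static void main(String[] args) {
        // In a real program these lines would come from the file,
        // e.g. via the BufferedReader loop shown in the answers above.
        // The first entry is made-up sample data.
        List<String> lines = Arrays.asList("Ann, Lee, 1", "Emma, Nolan, 2");
        System.out.println(lineAt(lines, 2));
        System.out.println("Next ID: " + nextEmployeeId(lines));
    }
}
```

Reading the ID from the last record, rather than counting lines, keeps working if the IDs in the file do not start at 1 or have gaps.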
Here is an algorithm which I use for reading csv files. The most effective way is to read all the data in the csv file into a 2D array first. It just makes it a lot more flexible to manipulate the data.
That way you can choose which line of the file to print to the console through the array index, e.g. System.out.println(employee_Data[1][y]); for record 1, where y is the index variable for fields. You would of course need a for loop to print every element of that line.
By the way, if you want to use the employee data in a larger program, in which it may for example store the data in a database or write to another file, I'd recommend encapsulating this entire code block into a function named Read_CSV_File(), which will return a 2D String array.
My Code
// The return type of this function is a 2D String array.
// The CSVFile_path can be for example "employeeData.csv".
// Requires: import java.io.*; import java.util.*;
public static String[][] Read_CSV_File(String CSVFile_path){
    // Records are collected in a list first, because the number of
    // lines in the file is not known in advance.
    List<String[]> records = new ArrayList<String[]>();
    int noofFields = 0;
    // try-with-resources frees up the BufferedReader file descriptor resources
    try (BufferedReader in = new BufferedReader(new FileReader(CSVFile_path))) {
        String line;
        // The program keeps looping until readLine() returns null, i.e. the end of the file.
        while ((line = in.readLine()) != null) {
            String[] current_Record = line.split(",");
            if (records.isEmpty()) {
                // Counts the number of fields in the csv file.
                noofFields = current_Record.length;
            }
            records.add(current_Record);
        }
    /* If an error occurs, it is caught by the catch statement and an error message
     * is generated and displayed to the user.
     */
    } catch (IOException ioException) {
        System.out.println("Exception: " + ioException);
    }
    // This assigns the data to the 2D array
    String[][] employee_Data = records.toArray(new String[records.size()][]);
    // This prints to console the specific line of your choice
    if (employee_Data.length > 1) {
        System.out.println("Employee 1:");
        // Prints out all fields of record 1
        for (int y = 0; y < noofFields && y < employee_Data[1].length; y++) {
            System.out.print(employee_Data[1][y] + ", ");
        }
    }
    return employee_Data;
}
For reading a large file:
log.debug("****************Start Reading CSV File*******");
copyFile(inputCSVFile);
StringBuilder stringBuilder = new StringBuilder();
String line= "";
BufferedReader brOldFile = null;
try {
String inputfile = inputCSVFile;
log.info("inputfile:" + inputfile);
brOldFile = new BufferedReader(new FileReader(inputfile));
while ((line = brOldFile.readLine()) != null) {
//line = replaceSpecialChar(line);
/*do your stuff here*/
stringBuilder.append(line);
stringBuilder.append("\n");
}
log.debug("****************End reading CSV File**************");
} catch (Exception e) {
log.error(" exception in readStaffInfoCSVFile ", e);
}finally {
if(null != brOldFile) {
try {
brOldFile.close();
} catch (IOException e) {
}
}
}
return stringBuilder.toString();

Count Word Pairs Java

I have this programming assignment and it is the first time in our class that we are writing code in Java. I have asked my instructor and could not get any help.
The program needs to count word pairs from a file, and display them like this:
abc:
hec, 1
That means that there was only one time in the text file that "abc" was followed by "hec". I have to use the Collections Framework in java. Here is what I have so far.
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import java.io.FileNotFoundException;
import java.util.Scanner;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
import java.util.ArrayList;
// By default, this code will get its input data from the Java standard input,
// java.lang.System.in. To allow input to come from a file instead, which can be
// useful when debugging your code, you can provide a file name as the first
// command line argument. When you do this, the input data will come from the
// named file instead. If the input file is in the project directory, you will
// not need to provide any path information.
//
// In BlueJ, specify the command line argument when you call main().
//
// In Eclipse, specify the command line argument in the project's "Run Configuration."
public class Assignment1
{
// returns an InputStream that gets data from the named file
private static InputStream getFileInputStream(String fileName)
{
InputStream inputStream;
try {
inputStream = new FileInputStream(new File(fileName));
}
catch (FileNotFoundException e) { // no file with this name exists
System.err.println(e.getMessage());
inputStream = null;
}
return inputStream;
}
public static void main(String[] args)
{
// Create an input stream for reading the data. The default is
// System.in (which is the keyboard). If there is an arg provided
// on the command line then we'll use the file instead.
InputStream in = System.in;
if (args.length >= 1) {
in = getFileInputStream(args[0]);
}
// Now that we know where the data is coming from we'll start processing.
// Notice that getFileInputStream could have generated an error and left "in"
// as null. We should check that here and avoid trying to process the stream
// data if there was an error.
if (in != null) {
// Using a Scanner object to read one word at a time from the input stream.
@SuppressWarnings("resource")
Scanner sc = new Scanner(in);
String word;
System.out.printf("CS261 - Assignment 1 - Matheus Konzen Iser%n%n");
// Continue getting words until we reach the end of input
List<String> inputWords = new ArrayList<String>();
Map<String, List<String>> result = new HashMap<String, List<String>>();
while (sc.hasNext()) {
word = sc.next();
if (!word.equals("---")) {
// do something with each word in the input
// replace this line with your code (probably more than one line of code)
inputWords.add(word);
}
for(int i = 0; i < inputWords.size() - 1; i++){
// Create references to this word and next word:
String thisWord = inputWords.get(i);
String nextWord = inputWords.get(i+1);
// If this word is not in the result Map yet,
// then add it and create a new empty list for it.
if(!result.containsKey(thisWord)){
result.put(thisWord, new ArrayList<String>());
}
// Add nextWord to the list of adjacent words to thisWord:
result.get(thisWord).add(nextWord);
}
//OUTPUT
for(Entry e : result.entrySet()){
System.out.println(e.getKey() + ":");
// Count the number of unique instances in the list:
Map<String, Integer>count = new HashMap<String, Integer>();
List<String>words = (List)e.getValue();
for(String s : words){
if(!count.containsKey(s)){
count.put(s, 1);
}
else{
count.put(s, count.get(s) + 1);
}
}
// Print the occurrences of following symbols:
for(Entry f : count.entrySet()){
System.out.println(" " + f.getKey() + ", " + f.getValue());
}
}
}
System.out.printf("%nbye...%n");
}
}
}
The problem that I'm having now is that it is running through the loop below way too many times:
if (!word.equals("---")) {
// do something with each word in the input
// replace this line with your code (probably more than one line of code)
inputWords.add(word);
}
Does anyone have any ideas or tips on this?
I find this part confusing:
while (sc.hasNext()) {
word = sc.next();
if (!word.equals("---")) {
// do something with each word in the input
// replace this line with your code (probably more than one line of code)
inputWords.add(word);
}
for(int i = 0; i < inputWords.size() - 1; i++){
I think you probably mean something more like this:
// Add all words (other than "---") into inputWords
while (sc.hasNext()) {
word = sc.next();
if (!word.equals("---")) {
inputWords.add(word);
}
}
// Now iterate over inputWords and process each word one-by-one
for (int i = 0; i < inputWords.size(); i++) {
It looks like you're trying to read all the words into inputWords first and then process them, while your code iterates through the list after every word that you add.
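Note also that the - 1 in your for condition means the loop never visits the last word. That is fine as long as the body reads inputWords.get(i+1), since the last word has no successor; if you remove the - 1 to get an index for every word, guard that lookup on the final iteration, or it will throw an IndexOutOfBoundsException.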
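For illustration, the pair counting itself could be done in a single pass with a map of maps, once all words have been read into the list. This is a sketch rather than the assignment's expected solution: it assumes Java 8 or later, and the names WordPairs and countPairs are made up.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class WordPairs {

    // For each word, counts how often each other word directly follows it.
    static Map<String, Map<String, Integer>> countPairs(List<String> words) {
        Map<String, Map<String, Integer>> pairs = new TreeMap<>();
        // stop one short of the end: the last word has no successor
        for (int i = 0; i < words.size() - 1; i++) {
            Map<String, Integer> followers =
                    pairs.computeIfAbsent(words.get(i), k -> new TreeMap<>());
            // add 1 to the existing count, or start at 1
            followers.merge(words.get(i + 1), 1, Integer::sum);
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("abc", "hec", "abc", "xyz");
        for (Map.Entry<String, Map<String, Integer>> e : countPairs(words).entrySet()) {
            System.out.println(e.getKey() + ":");
            for (Map.Entry<String, Integer> f : e.getValue().entrySet()) {
                System.out.println("  " + f.getKey() + ", " + f.getValue());
            }
        }
    }
}
```

Using typed Map.Entry values in the output loop also avoids the raw Entry and the (List) cast from the question's code.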

Java: How to read a text file

I want to read a text file containing space separated values. Values are integers.
How can I read it and put it in an array list?
Here is an example of contents of the text file:
1 62 4 55 5 6 77
I want to have it in an arraylist as [1, 62, 4, 55, 5, 6, 77]. How can I do it in Java?
You can use Files#readAllLines() to get all lines of a text file into a List<String>.
for (String line : Files.readAllLines(Paths.get("/path/to/file.txt"))) {
// ...
}
Tutorial: Basic I/O > File I/O > Reading, Writing and Creating text files
You can use String#split() to split a String in parts based on a regular expression.
for (String part : line.split("\\s+")) {
// ...
}
Tutorial: Numbers and Strings > Strings > Manipulating Characters in a String
You can use Integer#valueOf() to convert a String into an Integer.
Integer i = Integer.valueOf(part);
Tutorial: Numbers and Strings > Strings > Converting between Numbers and Strings
You can use List#add() to add an element to a List.
numbers.add(i);
Tutorial: Interfaces > The List Interface
So, in a nutshell (assuming that the file doesn't have empty lines nor trailing/leading whitespace).
List<Integer> numbers = new ArrayList<>();
for (String line : Files.readAllLines(Paths.get("/path/to/file.txt"))) {
for (String part : line.split("\\s+")) {
Integer i = Integer.valueOf(part);
numbers.add(i);
}
}
If you happen to be at Java 8 already, then you can even use Stream API for this, starting with Files#lines().
List<Integer> numbers = Files.lines(Paths.get("/path/to/test.txt"))
.map(line -> line.split("\\s+")).flatMap(Arrays::stream)
.map(Integer::valueOf)
.collect(Collectors.toList());
Tutorial: Processing data with Java 8 streams
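If the file may contain empty lines or leading/trailing whitespace after all, a small variation can trim and filter before splitting. The following is a sketch assuming Java 8 or later; the class name IntegerLines and the method parseIntegers are invented for the example, and the file path is taken from the command line.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class IntegerLines {

    // Parses whitespace-separated integers, tolerating blank lines
    // and leading/trailing whitespace.
    static List<Integer> parseIntegers(List<String> lines) {
        return lines.stream()
                .map(String::trim)
                .filter(line -> !line.isEmpty())
                .flatMap(line -> Arrays.stream(line.split("\\s+")))
                .map(Integer::valueOf)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to the text file
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        System.out.println(parseIntegers(lines));
    }
}
```

Tokens that are not valid integers still make Integer.valueOf throw a NumberFormatException, which is probably the desired behaviour for malformed input.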
Java 1.5 introduced the Scanner class for handling input from file and streams.
It can be used for getting integers from a file and would look something like this:
List<Integer> integers = new ArrayList<Integer>();
Scanner fileScanner = new Scanner(new File("c:\\file.txt"));
while (fileScanner.hasNextInt()){
integers.add(fileScanner.nextInt());
}
Check the API though. There are many more options for dealing with different types of input sources, differing delimiters, and differing data types.
This example code shows you how to read file in Java.
import java.io.*;
/**
* This example code shows you how to read file in Java
*
* IN MY CASE RAILWAY IS MY TEXT FILE WHICH I WANT TO DISPLAY YOU CHANGE WITH YOUR OWN
*/
public class ReadFileExample
{
public static void main(String[] args)
{
System.out.println("Reading File from Java code");
//Name of the file
String fileName="RAILWAY.txt";
try{
//Create object of FileReader
FileReader inputFile = new FileReader(fileName);
//Instantiate the BufferedReader Class
BufferedReader bufferReader = new BufferedReader(inputFile);
//Variable to hold the one line data
String line;
// Read file line by line and print on the console
while ((line = bufferReader.readLine()) != null) {
System.out.println(line);
}
//Close the buffer reader
bufferReader.close();
}catch(Exception e){
System.out.println("Error while reading file line by line:" + e.getMessage());
}
}
}
Look at this example, and try to do your own:
import java.io.*;
public class ReadFile {
public static void main(String[] args){
String string = "";
String file = "textFile.txt";
// Reading
try{
InputStream ips = new FileInputStream(file);
InputStreamReader ipsr = new InputStreamReader(ips);
BufferedReader br = new BufferedReader(ipsr);
String line;
while ((line = br.readLine()) != null){
System.out.println(line);
string += line + "\n";
}
br.close();
}
catch (Exception e){
System.out.println(e.toString());
}
// Writing
try {
FileWriter fw = new FileWriter (file);
BufferedWriter bw = new BufferedWriter (fw);
PrintWriter fileOut = new PrintWriter (bw);
fileOut.println (string+"\n test of read and write !!");
fileOut.close();
System.out.println("the file " + file + " is created!");
}
catch (Exception e){
System.out.println(e.toString());
}
}
}
Just for fun, here's what I'd probably do in a real project, where I'm already using all my favourite libraries (in this case Guava, formerly known as Google Collections).
String text = Files.toString(new File("textfile.txt"), Charsets.UTF_8);
List<Integer> list = Lists.newArrayList();
for (String s : text.trim().split("\\s+")) {
list.add(Integer.valueOf(s));
}
Benefit: Not much own code to maintain (contrast with e.g. this). Edit: Although it is worth noting that in this case tschaible's Scanner solution doesn't have any more code!
Drawback: you obviously may not want to add new library dependencies just for this. (Then again, you'd be silly not to make use of Guava in your projects. ;-)
Use Apache Commons (IO and Lang) for simple/common things like this.
Imports:
import org.apache.commons.io.FileUtils;
import org.apache.commons.lang3.ArrayUtils;
Code:
String contents = FileUtils.readFileToString(new File("path/to/your/file.txt"));
String[] array = ArrayUtils.toArray(contents.split(" "));
Done.
Using Java 7 to read files with NIO.2
Import these packages:
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
This is the process to read a file:
Path file = Paths.get("C:\\Java\\file.txt");
if(Files.exists(file) && Files.isReadable(file)) {
try {
// File reader
BufferedReader reader = Files.newBufferedReader(file, Charset.defaultCharset());
String line;
// read each line
while((line = reader.readLine()) != null) {
System.out.println(line);
// tokenize each number
StringTokenizer tokenizer = new StringTokenizer(line, " ");
while (tokenizer.hasMoreElements()) {
// parse each integer in file
int element = Integer.parseInt(tokenizer.nextToken());
}
}
reader.close();
} catch (Exception e) {
e.printStackTrace();
}
}
To read all lines of a file at once:
Path file = Paths.get("C:\\Java\\file.txt");
List<String> lines = Files.readAllLines(file, StandardCharsets.UTF_8);
All the answers so far given involve reading the file line by line, taking the line in as a String, and then processing the String.
There is no question that this is the easiest approach to understand, and if the file is fairly short (say, tens of thousands of lines), it'll also be acceptable in terms of efficiency. But if the file is long, it's a very inefficient way to do it, for two reasons:
Every character gets processed twice, once in constructing the String, and once in processing it.
The garbage collector will not be your friend if there are lots of lines in the file. You're constructing a new String for each line, and then throwing it away when you move to the next line. The garbage collector will eventually have to dispose of all these String objects that you don't want any more. Someone's got to clean up after you.
If you care about speed, you are much better off reading a block of data and then processing it byte by byte rather than line by line. Every time you come to the end of a number, you add it to the List you're building.
It will come out something like this:
private List<Integer> readIntegers(File file) throws IOException {
List<Integer> result = new ArrayList<>();
RandomAccessFile raf = new RandomAccessFile(file, "r");
byte buf[] = new byte[16 * 1024];
final FileChannel ch = raf.getChannel();
int fileLength = (int) ch.size();
final MappedByteBuffer mb = ch.map(FileChannel.MapMode.READ_ONLY, 0,
fileLength);
int acc = 0;
while (mb.hasRemaining()) {
int len = Math.min(mb.remaining(), buf.length);
mb.get(buf, 0, len);
for (int i = 0; i < len; i++)
if ((buf[i] >= 48) && (buf[i] <= 57))
acc = acc * 10 + buf[i] - 48;
else {
result.add(acc);
acc = 0;
}
}
ch.close();
raf.close();
return result;
}
The code above assumes that this is ASCII (though it could be easily tweaked for other encodings), and that anything that isn't a digit (in particular, a space or a newline) represents a boundary between digits. It also assumes that the file ends with a non-digit (in practice, that the last line ends with a newline), though, again, it could be tweaked to deal with the case where it doesn't.
It's much, much faster than any of the String-based approaches also given as answers to this question. There is a detailed investigation of a very similar issue in this question. You'll see there that there's the possibility of improving it still further if you want to go down the multi-threaded line.
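The tweaks mentioned above (runs of several separators such as "\r\n" or double spaces, and a file that does not end with a non-digit) can be handled with one extra flag. The sketch below shows only the digit-accumulating part on a plain byte array, so it can be tried without a file; it still assumes ASCII input, non-negative numbers and no int overflow. DigitScanner and parse are made-up names.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class DigitScanner {

    // Extracts non-negative integers from ASCII bytes; any non-digit is a separator.
    static List<Integer> parse(byte[] data) {
        List<Integer> result = new ArrayList<>();
        int acc = 0;
        boolean inNumber = false; // true while digits of the current number are being read
        for (byte b : data) {
            if (b >= '0' && b <= '9') {
                acc = acc * 10 + (b - '0');
                inNumber = true;
            } else if (inNumber) {
                // first separator after a number: emit it once
                result.add(acc);
                acc = 0;
                inNumber = false;
            }
            // further separators in a row are ignored
        }
        if (inNumber) {
            result.add(acc); // input ended in the middle of a number
        }
        return result;
    }

    public static void main(String[] args) {
        byte[] data = "1 62  4\r\n55 5 6 77".getBytes(StandardCharsets.US_ASCII);
        System.out.println(parse(data));
    }
}
```

The same loop body could replace the inner for loop of readIntegers, with acc and inNumber kept across buffer refills and the final check done after the while loop.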
Read the file into a list of lines and then do whatever you want with them.
Java 8:
Files.lines(Paths.get("c://lines.txt")).collect(Collectors.toList());
