I am trying to read a text file that contains:
hello James!
How are you today!
I am using the below code:
int readedValue;
do
{
while((readedValue = bufferReader.read()) != 10)
{
//readedValue = bufferReader.read();
char ch = (char) readedValue;
System.out.print(ch);
}
}
while ((readedValue = bufferReader.read()) != -1);
When I read the file now, the output is: hello James!ow are you today!
The 'H' in "How" is missing. How can I change this to get the complete text?
You're losing a character in your do-while loop's condition:
do {
...
// ends when the first new line \n character is reached
}
while ((readedValue = bufferReader.read()) != -1);
^ never printed
// the character that is read but never printed is the first one after the \n, i.e. 'H'
Use a single loop that stores the value it reads in readedValue (readValue) and does every comparison on that one variable.
I think you need this one...
int readedValue;
while ((readedValue = bufferReader.read()) != -1)
{
if(readedValue != 10)
{
System.out.print((char) readedValue);
}
}
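In your example you call read() twice whenever a line feed (ASCII 10) is encountered: once in the inner loop, which consumes the line feed, and once more in the outer condition, which consumes the next character without printing it.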
do
{
while((readedValue = bufferReader.read()) != 10) // Here
{
//readedValue = bufferReader.read();
char ch = (char) readedValue;
System.out.print(ch);
}
}
while ((readedValue = bufferReader.read()) != -1); // Again here
What you should do is read it only once
while ((readedValue = bufferReader.read()) != -1)
{
if(readedValue != 10)
{
char ch = (char) readedValue;
System.out.print(ch);
}
}
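The single-read loop can also be wrapped in a small helper. This is only a sketch: the class and method names are made up for illustration, and it additionally drops '\r' on the assumption that the file may have Windows line endings.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

final class LineBreakStripper {
    /** Reads every character exactly once and keeps all of them except '\n' and '\r'. */
    static String stripLineBreaks(Reader reader) {
        StringBuilder out = new StringBuilder();
        try {
            int value;
            // One read() per iteration, so no character can be consumed without being examined.
            while ((value = reader.read()) != -1) {
                if (value != '\n' && value != '\r') {
                    out.append((char) value);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // A StringReader stands in for the file here.
        System.out.println(stripLineBreaks(new StringReader("hello James!\nHow are you today!")));
    }
}
```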
This would help you:
String line;
int readedValue;
String s = "hello James!\r\nHow are you today!";
StringReader input = new StringReader(s);
BufferedReader lineReader= new BufferedReader (input);
while((line=lineReader.readLine())!=null) {
StringReader input2 = new StringReader(line);
BufferedReader charReader= new BufferedReader (input2);
while((readedValue = charReader.read()) != -1) {
char ch = (char) readedValue;
System.out.print(ch);
}
}
Related
In short: how do you alter the StreamTokenizer so that it splits each character in an input file into its own token?
For example, if I have the following input:
1023021023584
How can this be read so that each individual character can be saved to a specific index of an array?
To read characters individually from a file as "tokens", use a Reader:
try (BufferedReader in = Files.newBufferedReader(Paths.get("test.txt"))) {
for (int charOrEOF; (charOrEOF = in.read()) != -1; ) {
String token = String.valueOf((char) charOrEOF);
// Use token here
}
}
For full support of Unicode characters from the supplementary planes, e.g. emoji, we need to read surrogate pairs:
try (BufferedReader in = Files.newBufferedReader(Paths.get("test.txt"))) {
for (int char1, char2; (char1 = in.read()) != -1; ) {
String token = (Character.isHighSurrogate((char) char1) && (char2 = in.read()) != -1)
? String.valueOf(new char[] { (char) char1, (char) char2 })
: String.valueOf((char) char1);
// Use token here
}
}
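To get each character at a specific index, as the question asks, the tokens can be collected into a list. The sketch below is one way to do that; the class and method names are hypothetical, and like the snippet above it assumes a high surrogate is always followed by its matching low surrogate.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

final class CharTokens {
    /** Returns one String per character; a surrogate pair is kept together as one token. */
    static List<String> tokens(Reader in) {
        List<String> result = new ArrayList<>();
        try {
            int c1;
            while ((c1 = in.read()) != -1) {
                StringBuilder token = new StringBuilder().append((char) c1);
                if (Character.isHighSurrogate((char) c1)) {
                    int c2 = in.read(); // assumed to be the low surrogate
                    if (c2 != -1) {
                        token.append((char) c2);
                    }
                }
                result.add(token.toString());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> digits = tokens(new StringReader("1023021023584"));
        System.out.println(digits.get(2)); // third character of the input
    }
}
```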
You have to call the StreamTokenizer.resetSyntax() method, as below:
public static void main(String[] args) {
try (FileReader fileReader = new FileReader("C:\\test.txt");){
StreamTokenizer st = new StreamTokenizer(fileReader);
st.resetSyntax();
int token =0;
while((token = st.nextToken()) != StreamTokenizer.TT_EOF) {
if(st.ttype == StreamTokenizer.TT_NUMBER) {
System.out.println("Number: "+st.nval);
} else if(st.ttype == StreamTokenizer.TT_WORD) {
System.out.println("Word: "+st.sval);
}else {
System.out.println("Ordinary Char: "+(char)token);
}
}
} catch (Exception e) {
e.printStackTrace();
}
}
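A shorter sketch of the same idea, collecting the tokens into an array instead of printing them. The names are hypothetical, and it assumes the input only contains characters below code 256: as far as I know StreamTokenizer treats anything above that as alphabetic even after resetSyntax(), so those would come back as word tokens.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;

final class SingleCharTokenizer {
    /** Splits the input into one token per character and returns them as a char array. */
    static char[] splitChars(Reader reader) {
        StreamTokenizer st = new StreamTokenizer(reader);
        st.resetSyntax(); // every character is now "ordinary", i.e. a token of its own
        StringBuilder chars = new StringBuilder();
        try {
            int token;
            while ((token = st.nextToken()) != StreamTokenizer.TT_EOF) {
                chars.append((char) token); // for ordinary characters the token type is the character
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return chars.toString().toCharArray();
    }

    public static void main(String[] args) {
        char[] digits = splitChars(new StringReader("1023021023584"));
        System.out.println(digits[3]); // fourth character of the input
    }
}
```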
I'm trying to read a text file with one number per line into an ArrayList.
When I execute the following function, it always skips the last element.
Can somebody help me out? I don't see the problem: since it reads until the buffer is empty, it should stop when the end of the file is reached, correct?
List<Double> lines = new ArrayList<>();
ByteBuffer buffer = ByteBuffer.allocateDirect(1024);
StringBuilder oneLine = new StringBuilder();
try (SeekableByteChannel byteChannel = Files.newByteChannel(Paths.get(fileName))) {
while (byteChannel.read(buffer) > 0) {
buffer.flip();
for (int i = 0; i < buffer.limit(); i++) {
char c = (char) buffer.get();
if (c == '\r') {
//Skip it
}
if (c == '\n') {
System.out.println(oneLine.toString()); //Test Output to see what he got
lines.add(Double.parseDouble(oneLine.toString().replace(',', '.')));
oneLine.setLength(0);
}
else {
if (c != '\r') {
oneLine.append(c);
}
}
}
buffer.clear();
}
System.out.println("Anzahl zeilen: " + (lines.size()));
System.out.println("Finished");
}
catch (IOException e) {
System.out.println("The File that was defined could not be found");
e.printStackTrace();
}
return lines;
}
The text file, with one number on each line:
999973
22423
999974
999975
999976
999977
573643
999978
999979
999980
999981
34322
999982
999983
999984
999985
999986
999987
999988
3
67
84,000
7896575543
8.0
100001
9999991
8.0
When you reach the end of the file and there is no line break after the last line, the characters of that line are appended to the StringBuilder but never added to the list. You can either add a newline to the end of your text file or, once the read loop has finished, call lines.add(...) for whatever is still left in oneLine.
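A minimal sketch of the second option. The class and helper names are hypothetical, it takes a ReadableByteChannel so that it can be tried without a file, and like the question's code it assumes single-byte (ASCII) content.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class NumberLines {
    static List<Double> readNumbers(ReadableByteChannel channel) {
        List<Double> numbers = new ArrayList<>();
        ByteBuffer buffer = ByteBuffer.allocate(1024);
        StringBuilder oneLine = new StringBuilder();
        try {
            while (channel.read(buffer) != -1) {
                buffer.flip();
                while (buffer.hasRemaining()) {
                    char c = (char) buffer.get(); // assumes one byte per character
                    if (c == '\n') {
                        addIfPresent(numbers, oneLine);
                    } else if (c != '\r') {
                        oneLine.append(c);
                    }
                }
                buffer.clear();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The last line may not end with '\n', so flush whatever is left after the loop.
        addIfPresent(numbers, oneLine);
        return numbers;
    }

    private static void addIfPresent(List<Double> numbers, StringBuilder line) {
        if (line.length() > 0) {
            numbers.add(Double.parseDouble(line.toString().replace(',', '.')));
            line.setLength(0);
        }
    }

    public static void main(String[] args) {
        byte[] data = "999973\n84,000\n8.0".getBytes(StandardCharsets.US_ASCII);
        System.out.println(readNumbers(Channels.newChannel(new ByteArrayInputStream(data))));
    }
}
```

Flushing after the loop rather than inside it matters because a line can be split across two buffer fills; adding inside the loop would parse half a number.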
I am using BufferedReader to read a text file line by line, and then I use a method to normalize each line. But something is wrong with my normalization method: after the call to it, the BufferedReader stops reading the file. Can someone help me with this?
Here is my code:
public static void main(String[] args) {
String string = "";
try (BufferedReader br = new BufferedReader(new FileReader("file.txt"))) {
String line;
while ((line = br.readLine()) != null) {
string += normalize(line);
}
} catch (Exception e) {
}
System.out.println(string);
}
public static String normalize(String string) {
StringBuilder text = new StringBuilder(string.trim());
for(int i = 0; i < text.length(); i++) {
if(text.charAt(i) == ' ') {
removeWhiteSpaces(i + 1, text);
}
}
if(text.charAt(text.length() - 1) != '.') {
text.append('.');
}
text.append("\n");
return text.toString();
}
public static void removeWhiteSpaces(int index, StringBuilder text) {
int j = index;
while(text.charAt(j) == ' ') {
text.deleteCharAt(j);
}
}
And here is the text file that I use:
abc .
asd.
dasd.
I think the problem is in your string processing, around removeWhiteSpaces(i + 1, text);. If that processing throws an exception, control leaves the while loop, so the reader never gets to read the next line.
You don't check for an empty string before you call text.charAt(text.length() - 1), which is a problem too.
Print the exception; change your catch block to write it out:
} catch (Exception e) {
e.printStackTrace();
}
The reason is in your while(text.charAt(j) == ' ') {: you never check j against the length of the StringBuilder, yet you keep deleting characters from it...
Try this:
while ((line = br.readLine()) != null) {
if(line.trim().isEmpty()) {
continue;
}
string += normalize(line);
}
Try Scanner:
Scanner scan = new Scanner(is);
int rowCount = 0;
while (scan.hasNextLine()) {
String temp = scan.nextLine();
if(temp.trim().length()==0){
continue;
}
}
//rest of your logic
The normalize function is causing this.
The following tweak to it should fix this:
public static String normalize(String string) {
if(string.length() < 1) {
return "";
}
StringBuilder text = new StringBuilder(string.trim());
if(text.length() < 1){
return "";
}
for(int i = 0; i < text.length(); i++) {
if(text.charAt(i) == ' ') {
removeWhiteSpaces(i + 1, text);
}
}
if(text.charAt(text.length() - 1) != '.') {
text.append('.');
}
text.append("\n");
return text.toString();
}
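For comparison, here is a more compact sketch of a normalize() that avoids manual index handling altogether. It is not the original author's method: it swaps the character loop for a regular expression, and the class name is made up. It collapses runs of spaces to a single space, returns an empty string for blank lines, and appends the final '.' and newline as before.

```java
final class LineNormalizer {
    static String normalize(String line) {
        // Trim first, then collapse every run of two or more spaces into one.
        String text = line.trim().replaceAll(" {2,}", " ");
        if (text.isEmpty()) {
            return ""; // nothing to normalize, and no charAt() call that could throw
        }
        if (!text.endsWith(".")) {
            text += ".";
        }
        return text + "\n";
    }

    public static void main(String[] args) {
        System.out.print(normalize("  abc   def "));
        System.out.print(normalize(""));
        System.out.print(normalize("asd."));
    }
}
```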
The problem lies in how your code handles what readLine() returns for an empty line. The documentation states:
Reads a line of text. A line is considered to be terminated by any one of a line feed ('\n'), a carriage return ('\r'), or a carriage return followed immediately by a linefeed.
https://docs.oracle.com/javase/7/docs/api/java/io/BufferedReader.html#readLine()
So for an empty line the method returns an empty string (""), not null; null is returned only when the end of the stream has been reached.
The code proposed by @tijn167 works around this while still using BufferedReader. If you are not restricted to BufferedReader, use Scanner as @Abhishek Soni suggested.
Also, your removeWhiteSpaces() method only looks for spaces, and an empty line contains none: readLine() has already stripped the \r or \n terminator, so the condition text.charAt(j) == ' ' is never even evaluated for it. The call text.charAt(text.length() - 1) in normalize() then throws a StringIndexOutOfBoundsException, which your empty catch block swallows.
The second line of your file is empty, therefore the while loop stops there.
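A quick way to check what readLine() actually returns for an empty line is to collect every line from an in-memory source. This is just a sketch with hypothetical names; a StringReader stands in for the file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

final class ReadLineDemo {
    /** Collects every line readLine() returns until it reports end of stream with null. */
    static List<String> readAllLines(Reader source) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            while ((line = br.readLine()) != null) {
                lines.add(line); // an empty line arrives here as ""
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        List<String> lines = readAllLines(new StringReader("abc .\n\nasd.\n"));
        System.out.println(lines.size() + " lines, second is empty: " + lines.get(1).isEmpty());
    }
}
```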
This function reads from a file into a String (input), then copies input into a StringBuffer (textfile), and then prints textfile.
For example, if the file contains ab, the print statement outputs a, a newline, then b.
Any suggestions on how I can fix this?
StringBuffer textfile = new StringBuffer();
StringBuffer decodedfile=new StringBuffer();
String output;
String input = "";
int i;
FileInputStream fin;
FileOutputStream fout;
fin = new FileInputStream(args[0]); // open input file
fout = new FileOutputStream(args[1]); // open output file
do {
i = fin.read();
if (((char) i != ' ') && (i != -1)&&((char)i!='\n')) {
input += (char) i;
}
} while (i != -1);
input=input.replace('\n', '.');
for(int j=0;j<input.length();j++) // fill textfile
{
textfile.append(input.charAt(j));
}
for(int j=0;j<textfile.length();j++) // test output
{
System.out.println(textfile.charAt(j));
}
You are calling println for every character; it prints the char and then appends a line separator.
Just change the System.out.println(textfile.charAt(j)); to:
System.out.print(textfile.charAt(j));
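Alternatively, the filtered text can be built once and printed in a single call, which avoids the per-character loop entirely. A sketch under the same assumptions as the question's code (single-byte characters; the class and method names are hypothetical):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class CompactText {
    /** Returns the stream's content without spaces and line breaks. */
    static String withoutSpacesAndLineBreaks(InputStream in) {
        StringBuilder text = new StringBuilder();
        try {
            int b;
            while ((b = in.read()) != -1) {
                char c = (char) b; // assumes one byte per character
                if (c != ' ' && c != '\n' && c != '\r') {
                    text.append(c);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        InputStream in = new ByteArrayInputStream("a b\n".getBytes(StandardCharsets.US_ASCII));
        // print (not println) writes the text without adding a line separator
        System.out.print(withoutSpacesAndLineBreaks(in));
    }
}
```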
I'm having issues with BufferedWriter/BufferedReader.
Basically, whenever I read a file with BufferedReader.readLine(), it reads everything up to the newline character (i.e. the newline character is omitted).
For instance:
String temp;
File f = new File(path.toURI());
BufferedReader reader = new BufferedReader(new FileReader(f));
while ((temp = reader.readLine()) != null) {
//Work with temp
}
I know about the existence of BufferedWriter#newLine(), but it appears that it does not write back exactly the newline (delimiter?) that was previously omitted.
From my understanding, if I were to call readLine() on the following:
abcd\n
efgh\r\n
ijkl\r
It will return:
abcd\n
efgh\n
ijkl\n
What I am asking is: is there any class that can read characters without omitting them, like BufferedInputStream, while retaining the ability to read a line, like BufferedReader#readLine()?
\n is the Linux/Unix line ending, while \r\n is the Windows line ending.
If a file has both line endings, it should be reformatted.
My suggestion: if you ever come across such a file, just reformat it to use either \n or \r\n (depending on your OS, not that it matters much nowadays). It makes your life easier, and also the life of the next person who is going to use it.
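Reformatting can be done in code as well. The sketch below (hypothetical class and method names) converts every \r\n and every lone \r to \n, assuming the whole text fits in memory as a String:

```java
final class LineEndings {
    /** Converts Windows (\r\n) and lone-\r line endings to Unix (\n). */
    static String toUnix(String text) {
        // Regex: a carriage return, optionally followed by a line feed.
        return text.replaceAll("\r\n?", "\n");
    }

    public static void main(String[] args) {
        String mixed = "abcd\nefgh\r\nijkl\r";
        System.out.println(toUnix(mixed).equals("abcd\nefgh\nijkl\n"));
    }
}
```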
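Alternatively (please don't use this :/) you can change BufferedReader.readLine(boolean ignoreLF) to this (the method is package-private, so it cannot be overridden from a subclass outside java.io; you would need your own copy of the class):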
String readLine(boolean ignoreLF) throws IOException {
StringBuffer s = null;
int startChar;
synchronized (lock) {
ensureOpen();
boolean omitLF = ignoreLF || skipLF;
bufferLoop:
for (;;) {
if (nextChar >= nChars)
fill();
if (nextChar >= nChars) { /* EOF */
if (s != null && s.length() > 0){
if(skipLF){ // skipLF is a boolean; comparing it with '\r' does not compile
return s.toString() + "\r\n";
}else{
return s.toString() + "\n";
}
}
else
return null;
}
boolean eol = false;
char c = 0;
int i;
/* Skip a leftover '\n', if necessary */
if (omitLF && (cb[nextChar] == '\n'))
nextChar++;
skipLF = false;
omitLF = false;
charLoop:
for (i = nextChar; i < nChars; i++) {
c = cb[i];
if ((c == '\n') || (c == '\r')) {
eol = true;
break charLoop;
}
}
startChar = nextChar;
nextChar = i;
if (eol) {
String str;
if (s == null) {
str = new String(cb, startChar, i - startChar);
} else {
s.append(cb, startChar, i - startChar);
str = s.toString();
}
nextChar++;
if (c == '\r') {
skipLF = true;
}
if(skipLF){ // true here exactly when the terminator just seen was '\r'
return str + "\r\n";
}else{
return str + "\n";
}
}
if (s == null)
s = new StringBuffer(defaultExpectedLineLength);
s.append(cb, startChar, i - startChar);
}
}
}
SOURCE CODE edited from:
http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/io/BufferedReader.java#BufferedReader.readLine%28boolean%29
It probably won't be too much trouble to extend BufferedReader to include a \n or \r in the return from readLine(). In fact, the package-private readLine(boolean ignoreLF) function is the one whose behavior you'd need to change:
Reads a line of text. A line is considered to be terminated by any one of a line feed ('\n'), a carriage return ('\r'), or a carriage return followed immediately by a linefeed.
Parameters: ignoreLF - If true, the next '\n' will be skipped
Returns: A String containing the contents of the line, not including any line-termination characters, or null if the end of the stream has been reached
Throws: IOException - If an I/O error occurs
See also: LineNumberReader.readLine()
One solution could be to extend from BufferedReader and override the readLine() method (as it was already proposed in other answers).
Take this simplified example only as a PoC.
class MyReader extends BufferedReader {
    int size = 8192;

    public MyReader(Reader in) {
        super(in);
    }

    public MyReader(Reader in, int sz) {
        super(in, sz);
        this.size = sz;
    }

    @Override
    public String readLine() throws IOException {
        StringBuilder sb = new StringBuilder(this.size);
        int read = super.read();
        if (read < 0) {
            // end of stream: return null like BufferedReader does, so read loops terminate
            return null;
        }
        for (; read >= 0 && read != '\n'; read = super.read()) {
            sb.append((char) read);
        }
        // in case you want also to preserve the line feed character
        // sb.append('\n');
        return sb.toString();
    }
}
public class MyReaderDemo{
public static void main(String[] args) throws FileNotFoundException, IOException {
String text = "abcd\n"
+ "efgh\r\n"
+ "ijkl\r";
ByteArrayInputStream bis = new ByteArrayInputStream(
text.getBytes(StandardCharsets.ISO_8859_1)
);
// BufferedReader in = new BufferedReader(new InputStreamReader(bis));
BufferedReader in = new MyReader(new InputStreamReader(bis));
System.out.println(Arrays.toString(in.readLine().getBytes()));
System.out.println(Arrays.toString(in.readLine().getBytes()));
System.out.println(Arrays.toString(in.readLine().getBytes()));
}
}
output with BufferedReader
[97, 98, 99, 100]
[101, 102, 103, 104]
[105, 106, 107, 108]
output with MyReader
[97, 98, 99, 100]
[101, 102, 103, 104, 13]
[105, 106, 107, 108, 13]
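If the goal is to get every line back together with its exact terminator (\n, \r\n or \r), a small wrapper around PushbackReader is another option. The following is a sketch, not a drop-in BufferedReader replacement: the class name is made up, it reads one character at a time, and for real files the source should probably be a BufferedReader so those reads stay cheap.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

final class TerminatorKeepingReader {
    private final PushbackReader in;

    TerminatorKeepingReader(Reader source) {
        this.in = new PushbackReader(source, 1); // one character of lookahead is enough
    }

    /** Returns the next line including its terminator, or null at end of stream. */
    String readLine() {
        StringBuilder sb = new StringBuilder();
        try {
            int c;
            while ((c = in.read()) != -1) {
                sb.append((char) c);
                if (c == '\n') {
                    break;
                }
                if (c == '\r') {
                    int next = in.read();
                    if (next == '\n') {
                        sb.append('\n');  // "\r\n" counts as a single terminator
                    } else if (next != -1) {
                        in.unread(next);  // lone '\r': give the lookahead character back
                    }
                    break;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.length() == 0 ? null : sb.toString();
    }

    public static void main(String[] args) {
        TerminatorKeepingReader r = new TerminatorKeepingReader(new StringReader("abcd\nefgh\r\nijkl\r"));
        for (String line = r.readLine(); line != null; line = r.readLine()) {
            System.out.println(java.util.Arrays.toString(line.toCharArray()).length());
        }
    }
}
```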