I have a questions file that I'd like to read. While reading it, I want to pick out the questions from the answers and print them. Each question is preceded by a line of "#" characters. For some reason the code keeps skipping question one. What am I missing here?
Here is the code:
try {
// Open the file that is the first
// command line parameter
FileInputStream fstream = new FileInputStream(path);
BufferedReader br = new BufferedReader(new InputStreamReader(fstream));
String strLine;
strLine = br.readLine();
System.out.println(strLine);
// Read File Line By Line
while ((strLine ) != null) {
strLine = strLine.trim();
if ((strLine.length()!=0) && (strLine.charAt(0)=='#' && strLine.charAt(1)=='#')) {
strLine = br.readLine();
System.out.println(strLine);
//questions[q] = strLine;
}
strLine = br.readLine();
}
// Close the input stream
fstream.close();
// System.out.println(questions[0]);
} catch (Exception e) {// Catch exception if any
System.err.println("Error: " + e.getMessage());
}
I suspect that the file you are reading is UTF-8 with a BOM.
The BOM (byte order mark) is a code placed before the first character that helps identify the encoding of a text file.
The problem with the BOM is that it is invisible and disturbs the reading. A text file with a BOM is arguably no longer a plain text file. In particular, when you read the first line, its first character is no longer a #, but something different, because the line now starts with BOM+#.
Try loading the file with the encoding specified explicitly. Note that Java's UTF-8 decoder does not drop the BOM; it decodes it to the character U+FEFF, which you then have to skip yourself.
BufferedReader br = new BufferedReader(new InputStreamReader(fstream, "UTF-8"));
Otherwise, use a decent text editor such as Notepad++ and change the encoding to UTF-8 without BOM, or to ANSI (yuck).
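If changing the file is not an option, the BOM can also be dropped in code. Below is a minimal sketch of that idea, not the asker's actual program: the class name QuestionExtractor and both method names are made up for illustration, and it assumes the reader already decodes the file as UTF-8, so the BOM shows up as a single leading U+FEFF character. Note that String.trim() does not remove U+FEFF, which is why the "##" check in the question fails on the first line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
final class QuestionExtractor {

    /** Removes a single leading U+FEFF (the decoded BOM), if present. */
    static String stripBom(String line) {
        return (line != null && line.startsWith("\uFEFF")) ? line.substring(1) : line;
    }

    /** Returns every line that directly follows a line starting with "##". */
    static List<String> questions(Reader source) {
        List<String> found = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(source)) {
            // Only the very first line of a file can carry the BOM.
            String line = stripBom(br.readLine());
            while (line != null) {
                if (line.trim().startsWith("##")) {
                    String question = br.readLine();
                    if (question != null) {
                        found.add(question);
                    }
                }
                line = br.readLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return found;
    }

    public static void main(String[] args) {
        // Sample data with a leading BOM character, standing in for the real file.
        String sample = "\uFEFF####\nQuestion one?\nAnswer one\n####\nQuestion two?\nAnswer two\n";
        System.out.println(questions(new StringReader(sample)));
    }
}
```

With a real file, the Reader passed in would be something like new InputStreamReader(new FileInputStream(path), "UTF-8").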
Notice that whether or not you enter the if statement inside the while loop, you then call strLine = br.readLine(), which overwrites the line you read when you initialized strLine.
This question already has answers here:
Java read file got a leading BOM [ ï»¿ ]
(7 answers)
Closed 9 years ago.
If I write this code, I get this as output --> This first: ï»¿
and then the other lines
try {
BufferedReader br = new BufferedReader(new FileReader(
"myFile.txt"));
String line;
while ((line = br.readLine()) != null) {
System.out.println(line);
}
br.close();
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
How can I avoid it?
You are getting the characters ï»¿ on the first line because this sequence is the UTF-8 byte order mark (BOM), the bytes EF BB BF, displayed in a single-byte encoding. If a text file begins with a BOM, it's likely it was generated by a Windows program like Notepad.
To solve your problem, read the file explicitly as UTF-8 instead of relying on whatever the default system character encoding is (US-ASCII, etc.):
BufferedReader in = new BufferedReader(
new InputStreamReader(
new FileInputStream("myFile.txt"),
"UTF-8"));
Then, in UTF-8, the byte sequence ï»¿ decodes to one character, which is U+FEFF. This character is optional: a legal UTF-8 file may or may not begin with it. So we skip the first character only if it is U+FEFF:
in.mark(1);
if (in.read() != 0xFEFF)
in.reset();
And now you can continue with the rest of your code.
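Put together, the two steps above might look like the following sketch. The class name BomSkipper and its method names are invented for this example, and the bytes in main stand in for the contents of myFile.txt.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
final class BomSkipper {

    /** Wraps the stream in a UTF-8 reader positioned after an optional BOM. */
    static BufferedReader openUtf8(InputStream raw) {
        try {
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(raw, StandardCharsets.UTF_8));
            in.mark(1);                  // remember the start of the stream
            if (in.read() != 0xFEFF) {   // first character is not a BOM
                in.reset();              // so put it back
            }
            return in;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the first line of the given bytes, with any BOM skipped. */
    static String firstLine(byte[] bytes) {
        try (BufferedReader in = openUtf8(new ByteArrayInputStream(bytes))) {
            return in.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'h', 'i', '\n'};
        byte[] withoutBom = {'h', 'i', '\n'};
        System.out.println(firstLine(withBom));
        System.out.println(firstLine(withoutBom));
    }
}
```

For a file on disk, openUtf8(new FileInputStream("myFile.txt")) would be the equivalent call.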
The problem could be the encoding used.
Try this:
BufferedReader in = new BufferedReader(new InputStreamReader(
new FileInputStream("yourfile"), "UTF-8"));
So below is my code. I am having it read from a CSV file with values (each one on a new line):
54232
65
6564
6232
67413
26
The values are completely meaningless, but when I call System.out.println after reading a line, it prints
��5 followed by newlines
I can, however, use this ArrayList to save the file, and it saves just as before, except that the first value has some Chinese characters attached to the start. I have absolutely no idea why.
BufferedReader buffer = new BufferedReader(new FileReader(file));
ArrayList<String> lines = new ArrayList<>();
String line = "";
while ((line = buffer.readLine()) != null) {
System.out.println(line);
lines.add(line);
}
buffer.close();
return lines;
Solved
There was a BOM in my CSV.
I'm not used to LibreOffice. I fiddled about with the save settings and then it worked just fine.
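When the file cannot be re-saved, one option is to look at the first bytes and pick the charset from the BOM. The sketch below is only an illustration under stated assumptions: the class name BomSniffer and the fallback parameter are made up, it recognizes only the UTF-8 and UTF-16 marks (not UTF-32), and a file without a BOM is simply decoded with the fallback charset.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
final class BomSniffer {

    /** Reads all lines, choosing the charset from a leading BOM if there is one. */
    static List<String> readLines(InputStream raw, Charset fallback) {
        try (PushbackInputStream in = new PushbackInputStream(raw, 3)) {
            // Read up to three bytes; a single read() call may return fewer.
            byte[] head = new byte[3];
            int n = 0;
            while (n < 3) {
                int r = in.read(head, n, 3 - n);
                if (r < 0) break;
                n += r;
            }
            Charset charset = fallback;
            int bomLength = 0;
            if (n >= 3 && (head[0] & 0xFF) == 0xEF && (head[1] & 0xFF) == 0xBB
                    && (head[2] & 0xFF) == 0xBF) {
                charset = StandardCharsets.UTF_8;
                bomLength = 3;
            } else if (n >= 2 && (head[0] & 0xFF) == 0xFE && (head[1] & 0xFF) == 0xFF) {
                charset = StandardCharsets.UTF_16BE;
                bomLength = 2;
            } else if (n >= 2 && (head[0] & 0xFF) == 0xFF && (head[1] & 0xFF) == 0xFE) {
                charset = StandardCharsets.UTF_16LE;
                bomLength = 2;
            }
            // Push back everything that was read but is not part of the BOM.
            if (n > bomLength) {
                in.unread(head, bomLength, n - bomLength);
            }
            List<String> lines = new ArrayList<>();
            BufferedReader br = new BufferedReader(new InputStreamReader(in, charset));
            String line;
            while ((line = br.readLine()) != null) {
                lines.add(line);
            }
            return lines;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Encoding U+FEFF as UTF-16LE produces the bytes FF FE, i.e. a UTF-16LE BOM.
        byte[] csv = "\uFEFF54232\n65\n6564\n".getBytes(StandardCharsets.UTF_16LE);
        System.out.println(readLines(new ByteArrayInputStream(csv), StandardCharsets.UTF_8));
    }
}
```

The PushbackInputStream is there so that a file without a BOM loses none of its first bytes.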
In my Java application, I have to read a file. The problem I am facing is that, after reading the file, the result comes out in an unreadable format: strange characters are displayed and none of the letters are readable. How can I make it display the text properly?
try {
// Open the file that is the first
// command line parameter
FileInputStream fstream = new FileInputStream("c:\\hello.txt");
// Get the object of DataInputStream
DataInputStream in = new DataInputStream(fstream);
BufferedReader br = new BufferedReader(new InputStreamReader(in));
String strLine;
// Read File Line By Line
while ((strLine = br.readLine()) != null) {
// Print the content on the console
System.out.println(strLine);
}
// Close the input stream
in.close();
} catch (Exception e) {// Catch exception if any
System.err.println("Error: " + e.getMessage());
}
Perhaps you have an encoding error. The constructor you are using for an InputStreamReader uses the default character encoding; if your file contains UTF-8 text outside the ASCII range, you will get garbage. Also, you don't need a DataInputStream, since you aren't reading any data objects from the stream. Try this code:
FileInputStream fstream = null;
try {
fstream = new FileInputStream("c:\\hello.txt");
// Decode data using UTF-8
BufferedReader br = new BufferedReader(new InputStreamReader(fstream, "UTF-8"));
String strLine;
// Read File Line By Line
while ((strLine = br.readLine()) != null) {
// Print the content on the console
System.out.println(strLine);
}
} catch (Exception e) {// Catch exception if any
System.err.println("Error: " + e.getMessage());
} finally {
if (fstream != null) {
try { fstream.close(); }
catch (IOException e) {
// log failure to close file
}
}
}
The output you are getting is an ASCII value, so you need to cast it to char or String before printing it. Hope this helps.
You have to handle it this way:
BufferedReader br = new BufferedReader(new InputStreamReader(in, encodingformat));
encodingformat: change it according to the type of encoding issue you encounter.
Examples: UTF-8, UTF-16, and so on.
Refer to the Supported Encodings list for Java SE 6 for more info.
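As a rough sketch of what passing the encoding looks like in practice (the class name CharsetReadDemo and method readAll are invented here, and the sample bytes stand in for a real file): the charset name is resolved with Charset.forName, which throws an unchecked exception for names the JVM does not support, and try-with-resources takes care of closing the stream.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
final class CharsetReadDemo {

    /** Reads all lines from the stream, decoding with the named charset. */
    static List<String> readAll(InputStream in, String encodingName) {
        Charset charset = Charset.forName(encodingName);
        List<String> lines = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = br.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        // "héllo" encoded as UTF-16; \u00e9 is used to keep the source ASCII-only.
        byte[] data = "h\u00e9llo\n".getBytes(StandardCharsets.UTF_16);
        // Decoded with the matching charset the text is readable ...
        System.out.println(readAll(new ByteArrayInputStream(data), "UTF-16"));
        // ... decoded with the wrong one it comes out as garbage.
        System.out.println(readAll(new ByteArrayInputStream(data), "ISO-8859-1"));
    }
}
```

For the file in the question, the stream would be new FileInputStream("c:\\hello.txt"); which encoding name is right depends on how the file was saved.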
My problem got solved, though I don't know how. I copied the contents of hello.txt to another file and ran the Java program, and I could read all the letters. I don't know what the problem was.
Since you don't know which encoding the file is in, use jchardet to detect it, and then read the file with that encoding as others have already suggested. This is not 100% foolproof, but it works for your scenario.
Also, use of DataInputStream is unnecessary.
Currently I am trying something very simple. I am searching an XML document for a certain phrase, which I then try to replace. The problem is that I read each line and store it in a StringBuffer, and when I write it to a document, everything ends up on a single line.
Here is my code:
File xmlFile = new File("abc.xml");
BufferedReader br = new BufferedReader(new FileReader(xmlFile));
String line = null;
while((line = br.readLine())!= null)
{
if(line.indexOf("abc") != -1)
{
line = line.replaceAll("abc","xyz");
}
sb.append(line);
}
br.close();
BufferedWriter bw = new BufferedWriter(new FileWriter(xmlFile));
bw.write(sb.toString());
bw.close();
I am assuming I need a newline character when I call sb.append, but unfortunately I don't know which character to use, as "\n" does not work.
Thanks in advance!
P.S. I figured there must be a way to use Xalan to format the XML file after I write to it or something. Not sure how to do that though.
readLine returns everything between the newline characters without the newline itself, so when you write the lines back out, the newline characters are missing. These characters depend on the OS: Windows uses two characters for a newline, while Unix uses one, for example. To be OS-agnostic, retrieve the system property "line.separator":
String newline = System.getProperty("line.separator");
and append it to your StringBuffer:
sb.append(line).append(newline);
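For completeness, here is a small self-contained sketch of the loop with the separator added back. The class name LineReplacer and method replaceInLines are made up, the sample XML is invented, and System.lineSeparator() (Java 7+) is used as a shorthand for the system property above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical helper, for illustration only.
final class LineReplacer {

    /** Replaces target in every line and re-adds the line break readLine() strips. */
    static String replaceInLines(Reader source, String target, String replacement) {
        String newline = System.lineSeparator();
        StringBuilder sb = new StringBuilder();
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            while ((line = br.readLine()) != null) {
                // replace() treats target literally; replaceAll() would treat it as a regex.
                sb.append(line.replace(target, replacement)).append(newline);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String xml = "<root>\n<item>abc</item>\n</root>\n";
        System.out.print(replaceInLines(new StringReader(xml), "abc", "xyz"));
    }
}
```

Writing the returned string with a BufferedWriter, as in the question, then keeps one element per line.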
Modified as suggested by Brel, your text-substituting approach should work, and it will work well enough for simple applications.
If things start to get a little hairier, and you end up wanting to select elements based on their position in the XML structure, or you need to be sure to change element text but not tag names (think <abc>abc</abc>), then you'll want to call in the cavalry and process the XML with an XML parser.
Essentially you read in a Document using a DocumentBuilder, hop around the document's nodes doing whatever you need to, and then write the Document back to file (in the standard JAXP API, a Transformer does the writing). Most XML toolkits have a handful of options that let you format the XML output: you can specify indentation (or not) and maybe newlines for every opening tag, that kind of thing, to make your XML look pretty.
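A minimal sketch of that parser-based route, using only the JAXP classes that ship with the JDK. The class name XmlTextReplacer and its methods are invented for this example, the XML is passed as a string to keep it self-contained, and the exact indentation of the output depends on the JDK's Transformer implementation.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, for illustration only.
final class XmlTextReplacer {

    /** Replaces target inside text nodes only, leaving tag names untouched. */
    static String replaceText(String xml, String target, String replacement) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            replaceInTextNodes(doc.getDocumentElement(), target, replacement);

            // The Transformer serializes the DOM tree; INDENT asks for pretty-printing.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("XML processing failed", e);
        }
    }

    /** Walks the tree recursively and rewrites the value of each text node. */
    private static void replaceInTextNodes(Node node, String target, String replacement) {
        if (node.getNodeType() == Node.TEXT_NODE) {
            node.setNodeValue(node.getNodeValue().replace(target, replacement));
            return;
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            replaceInTextNodes(children.item(i), target, replacement);
        }
    }

    public static void main(String[] args) {
        System.out.println(replaceText("<root><abc>abc</abc></root>", "abc", "xyz"));
    }
}
```

To work on abc.xml directly, builder.parse(xmlFile) and new StreamResult(xmlFile) could take the place of the string-based source and result.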
sb would be the StringBuffer object, which has not been instantiated in this example. This can be added before the while loop:
StringBuffer sb = new StringBuffer();
Scanner scan = new Scanner(System.in);
String filePath = scan.next();
String oldString = "old_string";
String newString = "new_string";
String oldContent = "";
BufferedReader br = null;
FileWriter writer = null;
File xmlFile = new File(filePath);
try {
br = new BufferedReader(new FileReader(xmlFile));
String line = br.readLine();
while (line != null) {
oldContent = oldContent + line + System.lineSeparator();
line = br.readLine();
}
String newContent = oldContent.replaceAll(oldString, newString);
writer = new FileWriter(xmlFile);
writer.write(newContent);
} catch (IOException e) {
e.printStackTrace();
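} finally {
scan.close();
try {
// br or writer is still null if opening the file failed
if (br != null) {
br.close();
}
if (writer != null) {
writer.close();
}
} catch (IOException e) {
e.printStackTrace();
}
}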