I have created a FileReader object.
public String getFileContent() {
    StringBuilder filecontent = new StringBuilder();
    String line;
    FileReader fileReader = new FileReader("D:/myfile");
    BufferedReader bufferedReader = new BufferedReader(fileReader);
    while ((line = bufferedReader.readLine()) != null) {
        filecontent.append(bufferedReader.readLine());
    }
    return filecontent.toString();
}
The problem I face is that the function always returns the same string even if the file content has changed.
Can anyone help?
Append the variable line, since that is where you store each line. The problem is that you're calling readLine() twice per iteration, so every other line is skipped. The loop should look like this:
while ((line = bufferedReader.readLine()) != null) {
    filecontent.append(line);
}
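For completeness, here's a minimal sketch of the whole method with that fix applied, using try-with-resources (Java 7+) so the reader is always closed; the file path and the thrown IOException mirror the question:
public String getFileContent() throws IOException {
    StringBuilder filecontent = new StringBuilder();
    // try-with-resources closes the reader even if readLine() throws
    try (BufferedReader bufferedReader = new BufferedReader(new FileReader("D:/myfile"))) {
        String line;
        while ((line = bufferedReader.readLine()) != null) {
            filecontent.append(line); // readLine() is called exactly once per iteration
        }
    }
    return filecontent.toString();
}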
You received an answer already, but it seems that what you're doing could be achieved way better in Java 8:
Path filePath = Paths.get("D:/myfile");
byte[] fileBytes = Files.readAllBytes(filePath);
Charset fileEncoding = StandardCharsets.UTF_8;
String fileContents = new String(fileBytes, fileEncoding);
If you're not on Java 8, you can use Apache Commons IO's FileUtils to get the String:
File file = new File("D:/myfile");
String fileContents = FileUtils.readFileToString(file, "UTF-8"); // specify the encoding explicitly; the charset-less overload is deprecated
My advice here is to leverage the JDK and existing libraries as much as possible; it leads to cleaner code that is easier to maintain.
Related
I'm reading a file through a FileReader - the file is UTF-8 encoded (with BOM). Now my problem is: I read the file and output a string, but sadly the BOM marker is output too. Why does this occur?
fr = new FileReader(file);
br = new BufferedReader(fr);
String tmp = null;
while ((tmp = br.readLine()) != null) {
    String text = new String(tmp.getBytes(), "UTF-8");
    content += text + System.getProperty("line.separator");
}
Output after the first line:
?<style>
In Java, you have to consume the UTF-8 BOM manually if present. This behaviour is documented in the Java bug database, here and here. There will be no fix for now because it would break existing tools like JavaDoc or XML parsers. Apache Commons IO provides a BOMInputStream to handle this situation.
Take a look at this solution: Handle UTF8 file with BOM
The easiest fix is probably just to remove the resulting \uFEFF from the string, since it is extremely unlikely to appear for any other reason.
tmp = tmp.replace("\uFEFF", "");
Also see this Guava bug report
Use the Apache Commons library.
Class: org.apache.commons.io.input.BOMInputStream
Example usage:
String defaultEncoding = "UTF-8";
InputStream inputStream = new FileInputStream(someFileWithPossibleUtf8Bom);
try {
    BOMInputStream bOMInputStream = new BOMInputStream(inputStream);
    ByteOrderMark bom = bOMInputStream.getBOM();
    String charsetName = bom == null ? defaultEncoding : bom.getCharsetName();
    InputStreamReader reader = new InputStreamReader(new BufferedInputStream(bOMInputStream), charsetName);
    // use reader
} finally {
    inputStream.close();
}
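For example, where the // use reader comment sits, the reader can be wrapped in a BufferedReader and consumed line by line (a small sketch; the BOM has already been stripped by BOMInputStream):
BufferedReader br = new BufferedReader(reader);
String line;
while ((line = br.readLine()) != null) {
    System.out.println(line); // or collect the lines as needed
}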
Here's how I use the Apache BOMInputStream; it uses a try-with-resources block. The "false" argument tells the object to exclude the listed BOMs from the stream (we use "BOM-less" text files for safety reasons, haha):
try (BufferedReader br = new BufferedReader(
        new InputStreamReader(new BOMInputStream(new FileInputStream(file),
            false, ByteOrderMark.UTF_8,
            ByteOrderMark.UTF_16BE, ByteOrderMark.UTF_16LE,
            ByteOrderMark.UTF_32BE, ByteOrderMark.UTF_32LE))))
{
    // use br here
} catch (Exception e) {
    e.printStackTrace(); // handle as appropriate
}
Consider UnicodeReader from Google, which does all this work for you.
Charset utf8 = StandardCharsets.UTF_8; // default if no BOM present
try (Reader r = new UnicodeReader(new FileInputStream(file), utf8.name())) {
....
}
Maven Dependency:
<dependency>
    <groupId>com.google.gdata</groupId>
    <artifactId>core</artifactId>
    <version>1.47.1</version>
</dependency>
Use Apache Commons IO.
For example, let's take a look at my code (used for reading a text file with both Latin and Cyrillic characters) below:
String defaultEncoding = "UTF-16";
InputStream inputStream = new FileInputStream(new File("/temp/1.txt"));
BOMInputStream bomInputStream = new BOMInputStream(inputStream);
ByteOrderMark bom = bomInputStream.getBOM();
String charsetName = bom == null ? defaultEncoding : bom.getCharsetName();
InputStreamReader reader = new InputStreamReader(new BufferedInputStream(bomInputStream), charsetName);
List<String> ari = new ArrayList<>();
int data = reader.read();
while (data != -1) {
    char theChar = (char) data;
    data = reader.read();
    ari.add(Character.toString(theChar));
}
reader.close();
As a result we have an ArrayList named "ari" with all characters from the file "1.txt" except the BOM.
If somebody wants to do it with just the standard library, this would be a way:
public static String cutBOM(String value) {
    // UTF-8 BOM is EF BB BF, see https://en.wikipedia.org/wiki/Byte_order_mark
    // (assumes the raw BOM bytes survive encoding the first chars of the string)
    String bom = String.format("%x", new BigInteger(1, value.substring(0, 3).getBytes()));
    if (bom.equals("efbbbf"))
        // UTF-8
        return value.substring(3, value.length());
    else if (bom.substring(0, 4).equals("feff") || bom.substring(0, 4).equals("fffe"))
        // UTF-16BE or UTF-16LE
        return value.substring(2, value.length());
    else
        return value;
}
It's mentioned here that this is usually a problem with files on Windows.
One possible solution would be running the file through a tool like dos2unix first.
The easiest way I found to bypass the BOM:
BufferedReader br = new BufferedReader(new InputStreamReader(fis));
while ((currentLine = br.readLine()) != null) {
    // just in case: strip the UTF-8 BOM character if present
    currentLine = currentLine.replace("\uFEFF", "");
}
Using a BufferedReader I parse through a file. If the Oranges: pattern is found, I want to replace it with ApplesAndOranges.
try (BufferedReader br = new BufferedReader(new FileReader(resourcesFilePath))) {
    String line;
    while ((line = br.readLine()) != null) {
        if (line.startsWith("Oranges:")) {
            int startIndex = line.indexOf(":");
            line = line.substring(startIndex + 2);
            String updatedLine = "ApplesAndOranges";
            updateLine(line, updatedLine);
        }
    }
}
I call the method updateLine, passing the original line as well as the updated value.
private static void updateLine(String toUpdate, String updated) throws IOException {
    BufferedReader file = new BufferedReader(new FileReader(resourcesFilePath));
    PrintWriter writer = new PrintWriter(new File(resourcesFilePath + ".out"), "UTF-8");
    String line;
    while ((line = file.readLine()) != null) {
        line = line.replace(toUpdate, updated);
        writer.println(line);
    }
    file.close();
    if (writer.checkError())
        throw new IOException("Can't Write To File " + resourcesFilePath);
    writer.close();
}
To get the file to update I have to save it with a different name (resourcesFilePath + ".out"). If I use the original file name, the saved version becomes blank.
So here is my question: how can I replace a line with any value in the original file without losing any data?
For this you need to use regular expressions, like this:
str = str.replaceAll("^Orange:(.*)", "OrangeAndApples:$1");
It's an example and maybe it's not exactly what you want, but here, in the first parameter, the expression in parentheses is called a capturing group. The matched text will be replaced by the second parameter, and the $1 will be replaced by the value of the capturing group. In our example, Orange:Hello at the beginning of a line will be replaced by OrangeAndApples:Hello.
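A minimal, self-contained sketch of that replacement (the input string is made up for illustration):
String str = "Orange:Hello";
// ^ anchors at the start of the input, (.*) captures the rest as $1
str = str.replaceAll("^Orange:(.*)", "OrangeAndApples:$1");
System.out.println(str); // prints OrangeAndApples:Hello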
In your code, it seems you create one output file per matching line ... maybe inlining the sub-method would be better:
try (
    BufferedReader br = new BufferedReader(new FileReader(resourcesFilePath));
    BufferedWriter writer = Files.newBufferedWriter(outputFilePath, charset);
) {
    String line;
    while ((line = br.readLine()) != null) {
        String repl = line.replaceAll("Orange:(.*)", "OrangeAndApples:$1");
        writer.write(repl);
        writer.newLine(); // BufferedWriter has no writeln()
    }
}
The easiest way to write over everything in your original file is to read everything in, change whatever you want to change, and close the stream; afterwards open the file again and overwrite it with the modified lines. (Writing while you are still reading doesn't work, because opening the file for writing truncates it before you've finished reading it.)
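A minimal sketch of that approach using java.nio.file (the file name is a placeholder, and the pattern just follows the question):
import java.nio.charset.StandardCharsets;
import java.nio.file.*;
import java.util.List;
import java.util.stream.Collectors;

Path path = Paths.get("resources.txt"); // hypothetical file name
List<String> lines = Files.readAllLines(path, StandardCharsets.UTF_8);
List<String> updated = lines.stream()
        .map(l -> l.replaceAll("^Oranges:(.*)", "ApplesAndOranges:$1"))
        .collect(Collectors.toList());
// Files.write truncates by default, so the old contents are fully replaced
Files.write(path, updated, StandardCharsets.UTF_8);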
You can use RandomAccessFile to write to the file, and java.nio.file.Files to read the bytes from it. In this case, I put them into a string.
You can also read the file with RandomAccessFile, but it is easier to do it this way, in my opinion.
import java.io.RandomAccessFile;
import java.io.File;
import java.io.IOException;
import java.nio.file.*;

public void replace(File file) {
    try {
        RandomAccessFile raf = new RandomAccessFile(file, "rw");
        Path p = Paths.get(file.toURI());
        String line = new String(Files.readAllBytes(p));
        if (line.startsWith("Oranges:")) {
            // replaceAll returns a new string, so the result must be assigned
            line = line.replaceAll("Oranges:", "ApplesandOranges:");
            // note: writeUTF prefixes the data with a two-byte length and uses modified UTF-8
            raf.writeUTF(line);
        }
        raf.close();
    } catch (IOException e) {
        e.printStackTrace();
    }
}
I have an XML file and want to send its content to the caller as a string. This is what I'm using:
return FileUtils.readFileToString(xmlFile);
but this (or for that matter all the other ways I tried, like reading line by line) escapes the XML elements and encloses the whole XML in <string> tags, like this
<string>><.....</string>
but I want to return
<a>....</a>
I'd advise using a different file reader, maybe something like this:
private String readFile(String file) throws IOException {
    StringBuilder stringBuilder = new StringBuilder();
    String ls = System.getProperty("line.separator");
    try (BufferedReader reader = new BufferedReader(new FileReader(file))) {
        String line;
        while ((line = reader.readLine()) != null) {
            stringBuilder.append(line);
            stringBuilder.append(ls);
        }
    }
    return stringBuilder.toString();
}
The <string> wrapper is probably added by whatever serializes your return value (e.g. a web service framework), not by FileUtils itself.
According to your question, you just want to read the file. You can use FileReader and BufferedReader to do that:
File f = new File("demo.xml");
FileReader fr = new FileReader(f);
BufferedReader br = new BufferedReader(fr);
String line;
while ((line = br.readLine()) != null) {
    System.out.println(line);
}
Hope this answer helps you
IOUtils works well. It's in package org.apache.commons.io. The toString method takes an InputStream as a parameter and returns the contents as a string maintaining format.
InputStream is = getClass().getResourceAsStream("foo.xml");
String str = IOUtils.toString(is);
BufferedReader br = new BufferedReader(new FileReader(new File(filename)));
String line;
StringBuilder sb = new StringBuilder();
while ((line = br.readLine()) != null) {
    sb.append(line.trim());
}
The following code seems to write only a small part of the file into the StringBuilder - why?
Reader rdr = new BufferedReader(new InputStreamReader(new FileInputStream(...)));
StringBuilder buf = new StringBuilder();
CharBuffer cbuff = CharBuffer.allocate(1024);
while (rdr.read(cbuff) != -1) {
    buf.append(cbuff);
    cbuff.clear();
}
rdr.close();
Some more information: the file is bigger than the CharBuffer; also, I can see from the debugger that the CharBuffer is indeed filled as expected. The only part that makes its way to the StringBuilder seems to be from somewhere in the middle of the file. I am using OpenJDK 7.
I wonder why it shows such behavior and how it can be fixed.
As Peter Lawrey mentioned, you need to call cbuff.flip() between the read and the append. append reads from the buffer's current position, which sits at the end of the data unless we call cbuff.flip(). The reason a part from somewhere in the middle is still written is that on the last read the buffer won't be completely filled, so some "old" characters remain between the buffer's position and its end.
Mystery solved :-)
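A corrected sketch of the loop from the question, with the flip added (the file name is a placeholder):
Reader rdr = new BufferedReader(new InputStreamReader(new FileInputStream("some_file")));
StringBuilder buf = new StringBuilder();
CharBuffer cbuff = CharBuffer.allocate(1024);
while (rdr.read(cbuff) != -1) {
    cbuff.flip();      // limit = what was read, position = 0
    buf.append(cbuff); // append consumes from position to limit
    cbuff.clear();     // reset for the next read
}
rdr.close();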
All those classes have been part of the JDK since 1.0. I doubt that any of them needs to be fixed.
Your code is a long way from the usual idiom. Was this intended as a learning exercise, one that's gone awry? Or did you really want to put this into an application?
Here's how I would expect to see those classes used:
public static final String NEWLINE = System.getProperty("line.separator");

public String readContents(File f) throws IOException {
    StringBuilder builder = new StringBuilder(1024);
    BufferedReader br = null;
    try {
        br = new BufferedReader(new FileReader(f));
        String line;
        while ((line = br.readLine()) != null) {
            builder.append(line).append(NEWLINE);
        }
    } finally {
        closeQuietly(br); // a null-safe close helper, e.g. Commons IO's IOUtils.closeQuietly
    }
    return builder.toString();
}
My program must read text files, line by line.
The files are in UTF-8.
I am not sure that the files are correct - they can contain unprintable characters.
Is it possible to check for this without going down to the byte level?
Thanks.
Open the file with a FileInputStream, then use an InputStreamReader with the UTF-8 Charset to read characters from the stream, and use a BufferedReader to read lines, e.g. via BufferedReader#readLine, which will give you a string. Once you have the string, you can check for characters that aren't what you consider to be printable.
E.g. (without error checking), using try-with-resources (available in any vaguely modern Java version):
String line;
try (
    InputStream fis = new FileInputStream("the_file_name");
    InputStreamReader isr = new InputStreamReader(fis, Charset.forName("UTF-8"));
    BufferedReader br = new BufferedReader(isr);
) {
    while ((line = br.readLine()) != null) {
        // Deal with the line
    }
}
While it's not hard to do this manually using BufferedReader and InputStreamReader, I'd use Guava:
List<String> lines = Files.readLines(file, Charsets.UTF_8);
You can then do whatever you like with those lines.
EDIT: Note that this will read the whole file into memory in one go. In most cases that's actually fine - and it's certainly simpler than reading it line by line, processing each line as you read it. If it's an enormous file, you may need to do it that way as per T.J. Crowder's answer.
Just found out that with the Java NIO (java.nio.file.*) you can easily write:
List<String> lines = Files.readAllLines(Paths.get("/tmp/test.csv"), StandardCharsets.UTF_8);
for (String line : lines) {
    System.out.println(line);
}
instead of dealing with FileInputStreams and BufferedReaders...
If you want to check whether a string has unprintable characters, you can use a regular expression:
[^\p{Print}]
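A small sketch of that check; note that by default Java's \p{Print} is the POSIX (ASCII-only) printable class, so for non-ASCII UTF-8 text you probably want the UNICODE_CHARACTER_CLASS flag (Java 7+):
import java.util.regex.Pattern;

// Matches any character outside the printable class.
Pattern unprintable = Pattern.compile("[^\\p{Print}]", Pattern.UNICODE_CHARACTER_CLASS);

boolean hasUnprintable = unprintable.matcher(line).find();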
How about the following:
FileReader fileReader = new FileReader(new File("test.txt"));
BufferedReader br = new BufferedReader(fileReader);
String line = null;
// readLine() returns null when there are no more lines
while ((line = br.readLine()) != null) {
    // reading lines until the end of the file
}
Source: http://devmain.blogspot.co.uk/2013/10/java-quick-way-to-read-or-write-to-file.html
I can find the following ways to do it:
private static final String fileName = "C:/Input.txt";

public static void main(String[] args) throws IOException {
    // 1. Java 8 streams
    Stream<String> lines = Files.lines(Paths.get(fileName));
    lines.toArray(String[]::new);

    // 2. Read all lines at once
    List<String> readAllLines = Files.readAllLines(Paths.get(fileName));
    readAllLines.forEach(s -> System.out.println(s));

    // 3. Scanner
    File file = new File(fileName);
    Scanner scanner = new Scanner(file);
    while (scanner.hasNext()) {
        System.out.println(scanner.next());
    }
}
The answer by @T.J.Crowder is Java 6; in Java 7 the valid answer is the one by @McIntosh, though using Charset.forName for "UTF-8" is discouraged in favour of StandardCharsets.UTF_8:
List<String> lines = Files.readAllLines(Paths.get("/tmp/test.csv"),
StandardCharsets.UTF_8);
for(String line: lines){ /* DO */ }
This reminds a lot of the Guava way posted by Skeet above - and of course the same caveats apply. That is, for big files (Java 7):
BufferedReader reader = Files.newBufferedReader(path, StandardCharsets.UTF_8);
for (String line = reader.readLine(); line != null; line = reader.readLine()) {
    // process the line
}
If every char in the file is properly encoded in UTF-8, you won't have any problem reading it using a reader with the UTF-8 encoding. It's up to you to check every char of the file and see whether you consider it printable or not.
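A minimal sketch of such a per-character check; what counts as "printable" is up to you, and the policy below (no ISO control characters, only assigned code points) is just one possibility:
// One possible definition of "printable"; adjust to your needs.
static boolean isPrintable(int codePoint) {
    return !Character.isISOControl(codePoint) && Character.isDefined(codePoint);
}

static boolean lineIsPrintable(String line) {
    return line.codePoints().allMatch(MyClass::isPrintable);
}
Here MyClass stands for whatever class these helpers live in.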