File encoding : saved content is different than when read - java

I have a slight problem trying to save a file in Java.
For some reason the content I get after saving my file is different from what I had when I read it.
I guess this is related to file encoding, but I am not sure.
Here is test code I put together. The idea is basically to read a file, and save it again.
When I open both files, they are different.
package workspaceFun;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import org.apache.commons.codec.DecoderException;
public class FileSaveTest {
public static void main(String[] args) throws IOException, DecoderException{
String location = "test.location";
File locationFile = new File(location);
FileInputStream fis = new FileInputStream(locationFile);
InputStreamReader r = new InputStreamReader(fis, Charset.forName("UTF-8"));
System.out.println(r.getEncoding());
StringBuilder builder = new StringBuilder();
int ch;
while((ch = fis.read()) != -1){
builder.append((char)ch);
}
String fullLocationString = builder.toString();
//Now we want to save back
FileOutputStream fos = new FileOutputStream("C:/Users/me/Desktop/test");
byte[] b = fullLocationString.getBytes();
fos.write(b);
fos.close();
r.close();
}
}
An extract from the input file (opened as plain text using Sublime 2):
40b1 8b81 23bc 0014 1a25 96e7 a393 be1e
and from the output file :
40c2 b1c2 8bc2 8123 c2bc 0014 1a25 c296
The getEncoding method returns "UTF8". Trying to save the output file using the same charset does not seem to solve the issue.
What puzzles me is that when I try to read the input file using Hex from apache.commons.codec like this :
String hexLocationString2 = Hex.encodeHexString(fullLocationString.getBytes("UTF-8"));
The String already looks like my output file, not the input.
Would you have any idea on what can go wrong?
Thanks
Extra info for those being interested, I am trying to read an eclipse .location file.
EDIT: I placed the file online so that you can test the code

I believe the problem is the way you are reading the stream.
You are reading from the FileInputStream directly instead of reading through the InputStreamReader that wraps it.
The InputStreamReader is where you specify which Charset to use.
Keep in mind that the Charset passed to the InputStreamReader must match the file's actual encoding: the reader does not detect the charset, it simply decodes the bytes using the one you give it.
Try the following changes:
InputStreamReader r = new InputStreamReader(new FileInputStream(locationFile), StandardCharsets.UTF_8);
Then, instead of fis.read(), use r.read().
Finally, when writing the String, get the bytes in the same Charset as your Reader:
FileOutputStream fos = new FileOutputStream("C:/Users/me/Desktop/test");
fos.write(fullLocationString.getBytes(StandardCharsets.UTF_8));
fos.close();
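The hex extracts in the question hint at what happened: every byte of 0x80 or above gains a c2 prefix in the output, which is what UTF-8 produces when a raw byte is cast to a char and then encoded. Here is a minimal sketch of that effect, using the first bytes from the question's extract. The class name `ByteRoundTrip` and its helpers are made up for illustration. If the file is really binary rather than text (an assumption on my part), no charset will make a String round trip safe in general, and copying the bytes directly is the safer route.

```java
import java.nio.charset.StandardCharsets;

public class ByteRoundTrip {

    // Mimics the loop in the question: each raw byte (0..255) is cast to a char.
    static String bytesToCharsByCast(byte[] raw) {
        StringBuilder sb = new StringBuilder();
        for (byte b : raw) {
            sb.append((char) (b & 0xFF));
        }
        return sb.toString();
    }

    // Formats bytes as lowercase hex so they can be compared with the extracts.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // First six bytes of the input extract from the question.
        byte[] input = {0x40, (byte) 0xb1, (byte) 0x8b, (byte) 0x81, 0x23, (byte) 0xbc};
        String s = bytesToCharsByCast(input);

        // UTF-8 encodes chars U+0080..U+00FF as two bytes, so the data grows.
        System.out.println(hex(s.getBytes(StandardCharsets.UTF_8)));      // prints 40c2b1c28bc28123c2bc

        // ISO-8859-1 maps chars 0..255 back to single bytes, so the data survives.
        System.out.println(hex(s.getBytes(StandardCharsets.ISO_8859_1))); // prints 40b18b8123bc
    }
}
```

The first printed line matches the start of the output file in the question, the second matches the input file.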

Try to read and write back as below:
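import java.io.BufferedReader;
import java.io.FileOutputStream;
import java.io.FileReader;
import java.io.IOException;
public class FileSaveTest {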
public static void main(String[] args) throws IOException {
String location = "D:\\test.txt";
BufferedReader br = new BufferedReader(new FileReader(location));
StringBuilder sb = new StringBuilder();
try {
String line = br.readLine();
while (line != null) {
sb.append(line);
line = br.readLine();
if (line != null)
sb.append(System.lineSeparator());
}
} finally {
br.close();
}
FileOutputStream fos = new FileOutputStream("D:\\text_created.txt");
byte[] b = sb.toString().getBytes();
fos.write(b);
fos.close();
}
}
The test file contains both Cyrillic and Latin characters:
SDFASDF
XXFsd1
12312
іва

Related

How to replace a line with a new line using Java

Using a BufferedReader, I parse through a file. If a line starting with Oranges: is found, I want to replace it with ApplesAndOranges.
try (BufferedReader br = new BufferedReader(new FileReader(resourcesFilePath))) {
String line;
while ((line = br.readLine()) != null) {
if (line.startsWith("Oranges:")){
int startIndex = line.indexOf(":");
line = line.substring(startIndex + 2);
String updatedLine = "ApplesAndOranges";
updateLine(line, updatedLine);
I call a method updateLine and I pass my original line as well as the updated line value.
private static void updateLine(String toUpdate, String updated) throws IOException {
BufferedReader file = new BufferedReader(new FileReader(resourcesFilePath));
PrintWriter writer = new PrintWriter(new File(resourcesFilePath+".out"), "UTF-8");
String line;
while ((line = file.readLine()) != null)
{
line = line.replace(toUpdate, updated);
writer.println(line);
}
file.close();
if (writer.checkError())
throw new IOException("Can't Write To File"+ resourcesFilePath);
writer.close();
}
To get the file to update, I have to save it under a different name (resourcesFilePath+".out"). If I use the original file name, the saved version becomes blank.
So here is my question: how can I replace a line with any value in the original file without losing any data?
For this you can use a regular expression, like this:
str = str.replaceAll("^Orange:(.*)", "OrangeAndApples:$1");
It's an example and maybe not exactly what you want. In the first parameter, the expression in parentheses is called a capturing group. The text that matches the whole pattern is replaced by the second parameter, in which $1 stands for the value of the capturing group. In our example, Orange:Hello at the beginning of a line will be replaced by OrangeAndApples:Hello.
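To make the capturing group concrete, here is a small sketch that applies the same pattern to two sample lines. The class name `CapturingGroupDemo` and the sample strings are made up for illustration.

```java
public class CapturingGroupDemo {

    // Applies the pattern from the answer: (.*) captures everything after
    // "Orange:" and $1 puts it back behind the new prefix.
    static String rename(String line) {
        return line.replaceAll("^Orange:(.*)", "OrangeAndApples:$1");
    }

    public static void main(String[] args) {
        System.out.println(rename("Orange:Hello")); // prints OrangeAndApples:Hello
        System.out.println(rename("Banana:Hello")); // prints Banana:Hello (no match, unchanged)
    }
}
```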
In your code, it seems you create one file per line; inlining the sub-method might work better:
try (
BufferedReader br = new BufferedReader(new FileReader(resourcesFilePath));
BufferedWriter writer = Files.newBufferedWriter(outputFilePath, charset);
) {
String line;
while ((line = br.readLine()) != null) {
String repl = line.replaceAll("Orange:(.*)","OrangeAndApples:$1");
// BufferedWriter has no writeln(); write the text, then the line separator
writer.write(repl);
writer.newLine();
}
}
The easiest way to overwrite everything in your original file would be to read in everything, change whatever you want to change, and close the stream. Afterwards, open the file again and overwrite all of its lines with the data you want.
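A minimal sketch of that read-everything-then-overwrite approach, assuming the file is small enough to fit in memory and is UTF-8 text. The class name `ReplaceLineInPlace` and both method names are made up for illustration. Note that `Files.write` with a list of lines ends each line with the platform line separator, so the original line endings may be normalized.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class ReplaceLineInPlace {

    // Returns a copy of the lines in which every line starting with oldPrefix
    // has that prefix swapped for newPrefix; other lines are kept as they are.
    static List<String> replacePrefix(List<String> lines, String oldPrefix, String newPrefix) {
        List<String> result = new ArrayList<>();
        for (String line : lines) {
            result.add(line.startsWith(oldPrefix)
                    ? newPrefix + line.substring(oldPrefix.length())
                    : line);
        }
        return result;
    }

    // Reads the whole file first (the read is finished and closed before the
    // write begins), then overwrites the same file with the modified lines.
    static void rewriteFile(Path file, String oldPrefix, String newPrefix) {
        try {
            List<String> lines = Files.readAllLines(file, StandardCharsets.UTF_8);
            Files.write(file, replacePrefix(lines, oldPrefix, newPrefix), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            System.out.println("usage: ReplaceLineInPlace <file>");
            return;
        }
        rewriteFile(Paths.get(args[0]), "Oranges:", "ApplesAndOranges:");
    }
}
```

Because the file is fully read before it is reopened for writing, the blank-file problem from the question (reading and writing the same file at once) does not arise.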
You can use RandomAccessFile to write to the file, and nio.Files to read the bytes from it. In this case, I put it as a string.
You can also read the file with RandomAccessFile, but it is easier to do it this way, in my opinion.
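import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
public void replace(File file) {
    // try-with-resources closes the file even if an exception is thrown
    try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
        // read the whole file into a string (assumes UTF-8 content)
        String content = new String(Files.readAllBytes(file.toPath()), StandardCharsets.UTF_8);
        if (content.startsWith("Oranges:")) {
            // Strings are immutable, so the result of the replacement must be assigned
            content = content.replace("Oranges:", "ApplesandOranges:");
            // drop the old content, then write raw bytes;
            // writeUTF would prepend a two-byte length and use modified UTF-8
            raf.setLength(0);
            raf.write(content.getBytes(StandardCharsets.UTF_8));
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}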

Read file in java

I have a file on my computer with a .file extension, and I want to read it 9 characters at a time. I know that I can read a file with this code, but what should I do when my file is not .txt? Does Java support reading .file files with this code?
InputStream is = null;
InputStreamReader isr = null;
BufferedReader br = null;
is = new FileInputStream("c:/test.txt");
// create new input stream reader
isr = new InputStreamReader(is);
// create new buffered reader
br = new BufferedReader(isr);
// creates buffer
char[] cbuf = new char[is.available()];
for (int i = 0; i < 90000000; i += 9) {
// reads characters to buffer, offset i, len 9
br.read(cbuf, i, 9);}
The extension of a file is totally irrelevant. Extensions like .txt are mere conventions to help your operating system choose the right program when you open it.
So you can store text in any file (.txt, .file, .foobar if you are so inclined...), provided you know what kind of data it contains, and read it accordingly from your program.
So yes, Java can read .file files, and your code will work fine if that file contains text.
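For the 9-characters-at-a-time part of the question, here is a minimal sketch that works on any Reader regardless of the file's extension. The class name `ChunkReader` and the method `readChunks` are made up for illustration. The demo reads from a StringReader so that it runs on its own.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class ChunkReader {

    // Splits the reader's content into strings of chunkSize characters;
    // the last chunk may be shorter if the input length is not a multiple.
    static List<String> readChunks(Reader reader, int chunkSize) {
        List<String> chunks = new ArrayList<>();
        char[] buf = new char[chunkSize];
        try {
            int filled = 0;
            int n;
            // read() may return fewer chars than requested, so keep filling
            // the buffer until it is full or the end of input is reached.
            while ((n = reader.read(buf, filled, chunkSize - filled)) != -1) {
                filled += n;
                if (filled == chunkSize) {
                    chunks.add(new String(buf, 0, filled));
                    filled = 0;
                }
            }
            if (filled > 0) {
                chunks.add(new String(buf, 0, filled));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return chunks;
    }

    public static void main(String[] args) {
        System.out.println(readChunks(new StringReader("abcdefghijklmnopqrstu"), 9));
        // prints [abcdefghi, jklmnopqr, stu]
    }
}
```

For a real file, the reader could be something like `new InputStreamReader(new FileInputStream("c:/test.file"), StandardCharsets.UTF_8)`, where UTF-8 is an assumption about how the file was written.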
does java support to read .file s with this code?
No, since c:/test.txt is hard-coded. If the path were not hard-coded, then yes, it would.
Yes it's possible if you write is = new FileInputStream("c:/test.file");
Yes, it reads any file you give it the same way. You can pass any file path with any extension to the FileInputStream constructor.
You can read any file you want, since a file is just a sequence of bytes. The extension only hints at the format in which the bytes should be interpreted, so when we have a .txt file we expect it to contain a sequence of characters.
When you have a file format called .file, we know that it should be (according to you) a 9x9 set of characters. That tells us what to read and how.
Since the .file format is characters, I would say yes, you can read it with your code, or for instance with this:
public String[] readFileFormat (final File file) throws IOException {
if (file.exists()) {
final String[] lines = new String[9];
final BufferedReader reader = new BufferedReader ( new FileReader( file ) );
for ( int i = 0; i < lines.length; i++ ) {
lines[i] = reader.readLine();
if (lines[i] == null || lines[i].isEmpty() || lines[i].length() < 9)
throw new RuntimeException ("Line is empty when it should be filled!");
else if (lines[i].length() > 9)
throw new RuntimeException ("Line does not have exactly 9 characters!");
}
reader.close();
return lines;
}
return null;
}
The extension is totally irrelevant, so it can be .file, .txt or whatever you want it to be.
Here is an example that reads a file of type .file with BufferedInputStream. This is part of a larger guide that discusses 15 ways to read files in Java.
import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
public class ReadFile_BufferedInputStream_Read {
public static void main(String [] pArgs) throws FileNotFoundException, IOException {
String fileName = "c:\\temp\\sample-10KB.file";
File file = new File(fileName);
FileInputStream fileInputStream = new FileInputStream(file);
try (BufferedInputStream bufferedInputStream = new BufferedInputStream(fileInputStream)) {
int singleCharInt;
char singleChar;
while((singleCharInt = bufferedInputStream.read()) != -1) {
singleChar = (char) singleCharInt;
System.out.print(singleChar);
}
}
}
}

Find and replace in Java using regular expression without changing file format

I have code which replaces 10:A with 12:A in a text file called sample.txt. However, the code I have now also changes the file format, which it shouldn't. Can someone please let me know how to do the same using a regular expression in Java without changing the file format? The file has the original format below:
10:A
14:Saxws
But after executing the code, the output is:
10:A 14:Saxws
import java.io.*;
import java.util.*;
public class FileReplace
{
List<String> lines = new ArrayList<String>();
String line = null;
public void doIt()
{
try
{
File f1 = new File("sample.txt");
FileReader fr = new FileReader(f1);
BufferedReader br = new BufferedReader(fr);
while ((line = br.readLine()) != null)
{
if (line.contains("10:A"))
line = line.replaceAll("10:A", "12:A") + System.lineSeparator();
lines.add(line);
}
fr.close();
br.close();
FileWriter fw = new FileWriter(f1);
BufferedWriter out = new BufferedWriter(fw);
for(String s : lines)
out.write(s);
out.flush();
out.close();
}
catch (Exception ex)
{
ex.printStackTrace();
}
}
public static void main(String[] args)
{
FileReplace fr = new FileReplace();
fr.doIt();
}
}
It looks like your OS or editor is not able to display correctly the line separators generated by System.lineSeparator(). In that case consider:
- reading the content of the entire file into a string (including the original line separators),
- then replacing the part you are interested in,
- and writing the replaced string back to your file.
You can do it using this code:
Path file = Paths.get("sample.txt");
//read all bytes from file (they will include the bytes representing the line separators used)
byte[] bytesFromFile = Files.readAllBytes(file);
//convert them to string
String textFromFile = new String(bytesFromFile, StandardCharsets.UTF_8);//use proper charset
//replace what you need (line separators will stay the same)
textFromFile = textFromFile.replaceAll("10:A", "12:A");
//write data back to file; TRUNCATE_EXISTING discards old content in case the new text is shorter
Files.write(file, textFromFile.getBytes(StandardCharsets.UTF_8), StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING);

Search and Replace in a file using arraylist, Java

I wrote the code below, but I couldn't connect the ArrayList to the search and replace.
My CSV file looks like this:
1/1/1;7/6/1
1/1/2;7/7/1
I want to search the file 1.cfg for 1/1/1 and change it to 7/6/1, change 1/1/2 to 7/7/1, and so on.
Thank you all in advance.
Currently it only writes the last line of the old file to the new file.
import java.io.*;
import java.util.ArrayList;
import java.util.List;
public class ChangeConfiguration {
/**
* @param args
* @throws IOException
*/
public static void main(String[] args)
{
try{
// Open the file that is the first
// command line parameter
FileInputStream degistirilecek = new FileInputStream("c:/Config_Changer.csv");
FileInputStream config = new FileInputStream("c:/1.cfg");
// Get the object of DataInputStream
DataInputStream in = new DataInputStream(config);
DataInputStream degistir = new DataInputStream(degistirilecek);
BufferedReader br = new BufferedReader(new InputStreamReader(in));
BufferedReader brdegis = new BufferedReader(new InputStreamReader(degistir));
List<Object> arrayLines = new ArrayList<Object>();
Object contents;
while ((contents = brdegis.readLine()) != null)
{
arrayLines.add(contents);
}
System.out.println(arrayLines + "\n");
String strLine;
//Read File Line By Line
while ((strLine = br.readLine()) != null) {
//Couldn't modify this part error is here :(
BufferedWriter out = new BufferedWriter(new FileWriter("c:/1_new.cfg"));
out.write(strLine);
out.close();
}
in.close();
degistir.close();
}catch (Exception e){//Catch exception if any
System.err.println("Error: " + e.getMessage());
}
}
}
You are opening the file for reading when you declare:
BufferedReader br = new BufferedReader(new InputStreamReader(in));
If you know the entire file will fit in memory, I recommend doing the following:
1. Open the file and read its contents into memory as one giant string, then close the file.
2. Apply your replacements in one shot to the giant string.
3. Open the file and write out the contents of the giant string (e.g. using a BufferedWriter), then close the file.
As a side note, your code as posted will not compile. The quality of the responses you receive is correlated with the quality of the question asked. Always include an SSCCE with your question to increase the chance of getting a precise answer.
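The replace step of the approach above can be sketched as follows, assuming each CSV line has the form old;new as in the question. The class name `ConfigReplacer`, the method `applyReplacements`, and the sample config text are made up for illustration. Reading and writing the files is left out so that the example runs on its own.

```java
import java.util.Arrays;
import java.util.List;

public class ConfigReplacer {

    // Applies every "old;new" mapping to the whole content, one after another.
    static String applyReplacements(String content, List<String> mappingLines) {
        String result = content;
        for (String mapping : mappingLines) {
            String[] parts = mapping.split(";", 2);
            // Skip malformed lines that lack a separator or have an empty key.
            if (parts.length == 2 && !parts[0].isEmpty()) {
                result = result.replace(parts[0], parts[1]);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> mappings = Arrays.asList("1/1/1;7/6/1", "1/1/2;7/7/1");
        String config = "port 1/1/1\nport 1/1/2\n"; // hypothetical config content
        System.out.print(applyReplacements(config, mappings));
        // prints:
        // port 7/6/1
        // port 7/7/1
    }
}
```

This is plain substring replacement applied in sequence, so a key such as 1/1/1 would also match inside 1/1/10, and a later mapping can rewrite text produced by an earlier one. Whether that matters depends on the actual config format.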
Can you elaborate on the purpose of the program?
If it is a simple content replacement in a file, then just read a line and store it in a string, then use the String replace method to replace the text in it.
E.g.:
newString = oldString.replace(oldValue, newValue);

GZIPInputStream reading line by line

I have a file in .gz format. The java class for reading this file is GZIPInputStream.
However, this class doesn't extend the BufferedReader class of java. As a result, I am not able to read the file line by line. I need something like this
reader = new MyGZInputStream( some constructor of GZInputStream)
reader.readLine()...
I thought of creating my own class which extends the Reader or BufferedReader class of Java and uses a GZIPInputStream as one of its fields.
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.Reader;
import java.util.zip.GZIPInputStream;
public class MyGZFilReader extends Reader {
private GZIPInputStream gzipInputStream = null;
char[] buf = new char[1024];
@Override
public void close() throws IOException {
gzipInputStream.close();
}
public MyGZFilReader(String filename)
throws FileNotFoundException, IOException {
gzipInputStream = new GZIPInputStream(new FileInputStream(filename));
}
@Override
public int read(char[] cbuf, int off, int len) throws IOException {
// TODO Auto-generated method stub
return gzipInputStream.read((byte[])buf, off, len);
}
}
But, this doesn't work when I use
BufferedReader in = new BufferedReader(
new MyGZFilReader("F:/gawiki-20090614-stub-meta-history.xml.gz"));
System.out.println(in.readLine());
Can someone advise how to proceed?
The basic setup of decorators is like this:
InputStream fileStream = new FileInputStream(filename);
InputStream gzipStream = new GZIPInputStream(fileStream);
Reader decoder = new InputStreamReader(gzipStream, encoding);
BufferedReader buffered = new BufferedReader(decoder);
The key issue in this snippet is the value of encoding. This is the character encoding of the text in the file. Is it "US-ASCII", "UTF-8", "SHIFT-JIS", "ISO-8859-9", …? There are hundreds of possibilities, and the correct choice usually cannot be determined from the file itself. It must be specified through some out-of-band channel.
For example, maybe it's the platform default. In a networked environment, however, this is extremely fragile. The machine that wrote the file might sit in the neighboring cubicle, but have a different default file encoding.
Most network protocols use a header or other metadata to explicitly note the character encoding.
In this case, it appears from the file extension that the content is XML. XML includes the "encoding" attribute in the XML declaration for this purpose. Furthermore, XML should really be processed with an XML parser, not as text. Reading XML line-by-line seems like a fragile, special case.
Failing to explicitly specify the encoding is against the second commandment. Use the default encoding at your peril!
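A minimal, self-contained sketch of the decorator chain with an explicit encoding. The class name `GzipLines` and both helper methods are made up for illustration, and UTF-8 is an assumption. The demo compresses a small string in memory so that it runs without the file from the question.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipLines {

    // Decompresses the stream, decodes it with the given charset, and
    // returns its lines. Closing the BufferedReader closes the whole chain.
    static List<String> readLines(InputStream compressed, Charset encoding) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new GZIPInputStream(compressed), encoding))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    // Test helper: gzip-compresses text in memory using the given charset.
    static byte[] gzip(String text, Charset encoding) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
            out.write(text.getBytes(encoding));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        byte[] data = gzip("h\u00e9llo\nw\u00f6rld\n", StandardCharsets.UTF_8);
        List<String> lines = readLines(new ByteArrayInputStream(data), StandardCharsets.UTF_8);
        System.out.println(lines.size()); // prints 2
    }
}
```

For the file in the question, the input stream would be a FileInputStream on the .gz path instead of the in-memory array.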
GZIPInputStream gzip = new GZIPInputStream(new FileInputStream("F:/gawiki-20090614-stub-meta-history.xml.gz"));
BufferedReader br = new BufferedReader(new InputStreamReader(gzip));
br.readLine();
BufferedReader in = new BufferedReader(new InputStreamReader(
new GZIPInputStream(new FileInputStream("F:/gawiki-20090614-stub-meta-history.xml.gz"))));
String content;
while ((content = in.readLine()) != null)
System.out.println(content);
You can use the following method in a util class, and use it whenever necessary...
public static List<String> readLinesFromGZ(String filePath) {
List<String> lines = new ArrayList<>();
File file = new File(filePath);
try (GZIPInputStream gzip = new GZIPInputStream(new FileInputStream(file));
BufferedReader br = new BufferedReader(new InputStreamReader(gzip));) {
String line = null;
while ((line = br.readLine()) != null) {
lines.add(line);
}
} catch (FileNotFoundException e) {
e.printStackTrace(System.err);
} catch (IOException e) {
e.printStackTrace(System.err);
}
return lines;
}
Here it is as a single statement:
try (BufferedReader br = new BufferedReader(
new InputStreamReader(
new GZIPInputStream(
new FileInputStream(
"F:/gawiki-20090614-stub-meta-history.xml.gz")))))
{br.readLine();}
