Read and Write Text in ANSI format - java

Please have a look at the following code
import java.io.*;
public class CSVConverter
{
private File csvFile;
private BufferedReader reader;
private StringBuffer strBuffer;
private BufferedWriter writer;
int startNumber = 0;
private String strString[];
public CSVConverter(String location, int startNumber)
{
csvFile = new File(location);
strBuffer = new StringBuffer("");
this.startNumber = startNumber;
//Read
try
{
reader = new BufferedReader(new FileReader(csvFile));
String line = "";
while((line=reader.readLine())!=null)
{
String[] array = line.split(",");
String inputQuery = "insertQuery["+startNumber+"] = \"insert into WordList_Table ('Engl','Port','EnglishH','PortugueseH','Numbe','NumberOf','NumberOfTime','NumberOfTimesPor')values('"+array[0]+"','"+array[2]+"','"+array[1]+"','"+array[3]+"',0,0,0,0)\"";
strBuffer.append(inputQuery+";"+"\r\n");
startNumber++;
}
}
catch(Exception e)
{
e.printStackTrace();
}
System.out.println(strBuffer.toString());
//Write
try
{
File file = new File("C:/Users/list.txt");
FileWriter filewrite = new FileWriter(file);
if(!file.exists())
{
file.createNewFile();
}
writer = new BufferedWriter(filewrite);
writer.write(strBuffer.toString());
writer.flush();
writer.close();
}
catch(Exception e)
{
e.printStackTrace();
}
}
public static void main(String[]args)
{
new CSVConverter("C:/Users/list.csv",90);
}
}
I am trying to read a CSV file, edit the text in code, and write it back to a .txt file. My issue is that I have Portuguese words, so the file should be read and written using ANSI format. Right now some Portuguese words are replaced with symbols in the output file.
How can I read and write text data into a file in ANSI format in Java?

To read a text file with a specific encoding, you can use a FileInputStream in conjunction with an InputStreamReader. The right Java encoding for Windows ANSI on a Western European system is Cp1252.
reader = new BufferedReader(new InputStreamReader(new FileInputStream(csvFile), "Cp1252"));
To write a text file with a specific character encoding, you can use a FileOutputStream together with an OutputStreamWriter.
writer = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(file), "Cp1252"));
The classes InputStreamReader and OutputStreamWriter translate between byte-oriented streams and text with a specific character encoding.
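Putting the two lines together, a minimal sketch of a Cp1252 copy could look like the following. The class name AnsiCopy, the method copy and the temporary file names are illustrative and not part of the original code; the sketch also assumes the JVM provides the windows-1252 charset (the canonical name for Cp1252), which standard JDKs do.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.nio.charset.Charset;
import java.nio.file.Files;

class AnsiCopy {
    // "windows-1252" is the canonical name of the charset also known as Cp1252.
    static final Charset ANSI = Charset.forName("windows-1252");

    // Copies a text file line by line, decoding and encoding with Cp1252.
    static void copy(File source, File target) throws IOException {
        try (BufferedReader reader = new BufferedReader(
                 new InputStreamReader(new FileInputStream(source), ANSI));
             BufferedWriter writer = new BufferedWriter(
                 new OutputStreamWriter(new FileOutputStream(target), ANSI))) {
            String line;
            while ((line = reader.readLine()) != null) {
                writer.write(line);
                writer.write("\r\n"); // readLine() strips the terminator, so add one back
            }
        } // try-with-resources flushes and closes both streams
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical temp files stand in for list.csv and list.txt.
        File in = File.createTempFile("list", ".csv");
        File out = File.createTempFile("list", ".txt");
        // "\u00e7\u00e3" is "çã", written as escapes so the source file encoding does not matter.
        Files.write(in.toPath(), "ma\u00e7\u00e3,apple\r\n".getBytes(ANSI));
        copy(in, out);
        System.out.print(new String(Files.readAllBytes(out.toPath()), ANSI));
    }
}
```

Passing a Charset object instead of the string "Cp1252" avoids the checked UnsupportedEncodingException, but either form selects the same encoding.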

Related

Unable to read clear data from a PDF file when the language is not English

I am trying to copy some data from a PDF to a txt file. Here is the code:
public void readPDFFile() throws IOException {
InputStreamReader reader;
OutputStreamWriter writer;
FileInputStream inputstream;
FileOutputStream outputStream;
BufferedReader bufferedReader = null;
BufferedWriter bufferedWriter = null;
String str;
File rfile = new File(
"C://Documents and Settings/Administrator/My Documents/EGDownloads/source.pdf");
File wFile = new File("C://Documents and Settings/Administrator/My Documents/Folder/destination.txt");
try {
inputstream = new FileInputStream(rfile);
outputStream = new FileOutputStream(wFile);
reader = new InputStreamReader(inputstream, "UTF-8");
writer = new OutputStreamWriter(outputStream, "UTF-8");
bufferedReader = new BufferedReader(reader);
bufferedWriter = new BufferedWriter(writer);
while ((str = bufferedReader.readLine()) != null) {
writer.write(str);
}
} catch (IOException es) {
System.out.println(es.getMessage());
es.printStackTrace(System.out);
} finally {
if (bufferedReader != null) {
bufferedReader.close();
}
if (bufferedWriter != null)
bufferedWriter.close();
}
}
The expected output should be in another language, but all I get is random boxes. I have tried both UTF-16 and UTF-8.
I also tried PDFBox, but it still does not work: all I get is the original language's accents and English text.
Note:
1 I'm not trying to print the data on the console, only to copy it from the PDF to a txt file.
2 The other file contains non-English words.
Can anyone help me solve this?
Or point me to a link that might help?
Thanks.
The PDF format is a binary format, so it cannot be read line by line as text. You must have a really special PDF, as all the ones I know of are compressed in some way. Use a proper library to read it, such as PDFBox or iText. Be aware that for some PDFs it is impossible to extract the text. You can check with Acrobat: if Acrobat can't do it, nobody can.

get Data from text file in java [duplicate]

How do you read and display data from .txt files?
BufferedReader in = new BufferedReader(new FileReader("<Filename>"));
Then, you can use in.readLine(); to read a single line at a time. To read until the end, write a while loop as such:
String line;
while((line = in.readLine()) != null)
{
System.out.println(line);
}
in.close();
If your file is strictly text, I prefer to use the java.util.Scanner class.
You can create a Scanner out of a file by:
Scanner fileIn = new Scanner(new File(thePathToYourFile));
Then, you can read text from the file using the methods:
fileIn.nextLine(); // Reads one line from the file
fileIn.next(); // Reads one word from the file
And, you can check if there is any more text left with:
fileIn.hasNext(); // Returns true if there is another word in the file
fileIn.hasNextLine(); // Returns true if there is another line to read from the file
Once you have read the text, and saved it into a String, you can print the string to the command line with:
System.out.print(aString);
System.out.println(aString);
The posted link contains the full specification for the Scanner class. It will be helpful to assist you with what ever else you may want to do.
In general:
Create a FileInputStream for the file.
Create an InputStreamReader wrapping the input stream, specifying the correct encoding
Optionally create a BufferedReader around the InputStreamReader, which makes it simpler to read a line at a time.
Read until there's no more data (e.g. readLine returns null)
Display data as you go or buffer it up for later.
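The steps above can be sketched as follows. This is only an illustration: the class name TextFileLines and the method readLines are made up for the example, and the caller is assumed to know the file's encoding.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class TextFileLines {
    // Reads every line of the file, decoding bytes with the given charset.
    static List<String> readLines(File file, Charset charset) throws IOException {
        List<String> lines = new ArrayList<>();
        // FileInputStream -> InputStreamReader (decoding) -> BufferedReader (readLine)
        try (BufferedReader reader = new BufferedReader(
                 new InputStreamReader(new FileInputStream(file), charset))) {
            String line;
            while ((line = reader.readLine()) != null) { // null means end of file
                lines.add(line);                         // buffer it up for later
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Assumes the file name is passed as the first argument and the file is UTF-8.
        for (String line : readLines(new File(args[0]), StandardCharsets.UTF_8)) {
            System.out.println(line);
        }
    }
}
```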
If you need more help than that, please be more specific in your question.
I love this piece of code, use it to load a file into one String:
File file = new File("/my/location");
String contents = new Scanner(file).useDelimiter("\\Z").next();
Below is code you can try to read a file and display it in Java. It uses the Scanner class to read the file name from the user and then prints the file's contents (plain text files such as those created with Notepad or Vim).
import java.io.*;
import java.util.Scanner;
public class TestRead
{
public static void main(String[] input)
{
String fname;
Scanner scan = new Scanner(System.in);
/* enter filename with extension to open and read its content */
System.out.print("Enter File Name to Open (with extension like file.txt) : ");
fname = scan.nextLine();
/* this will reference only one line at a time */
String line = null;
try
{
/* FileReader reads text files in the default encoding */
FileReader fileReader = new FileReader(fname);
/* always wrap the FileReader in BufferedReader */
BufferedReader bufferedReader = new BufferedReader(fileReader);
while((line = bufferedReader.readLine()) != null)
{
System.out.println(line);
}
/* always close the file after use */
bufferedReader.close();
}
catch(IOException ex)
{
System.out.println("Error reading file named '" + fname + "'");
}
}
}
If you want to take some shortcuts you can use Apache Commons IO:
import org.apache.commons.io.FileUtils;
String data = FileUtils.readFileToString(new File("..."), "UTF-8");
System.out.println(data);
:-)
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.io.UnsupportedEncodingException;
public class PassdataintoFile {
public static void main(String[] args) throws IOException {
// Write the file as UTF-8; try-with-resources closes the writer.
try (PrintWriter pw = new PrintWriter("C:/new/hello.txt", "UTF-8")) {
pw.println("Hi chinni");
pw.print("your succesfully entered text into file");
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (UnsupportedEncodingException e) {
e.printStackTrace();
}
// Read the file back with the same encoding it was written in.
BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream("C:/new/hello.txt"), "UTF-8"));
String line;
while((line = br.readLine())!= null)
{
System.out.println(line);
}
br.close();
}
}
Since Java 7 (java.nio.file), you can read a whole file simply with:
public String read(String file) throws IOException {
return new String(Files.readAllBytes(Paths.get(file)));
}
or, if it's a resource (this uses Guava's Resources and Charsets classes):
public String read(String file) throws IOException {
URL url = Resources.getResource(file);
return Resources.toString(url, Charsets.UTF_8);
}
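One caveat about the first variant: new String(bytes) without a charset decodes with the platform default encoding, so the result can differ between machines. A small sketch that passes the charset explicitly is shown below; WholeFile and its read method are illustrative names, and the caller is assumed to know the file's encoding.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Paths;

class WholeFile {
    // Reads the entire file into memory and decodes it with the given charset.
    static String read(String file, Charset charset) throws IOException {
        return new String(Files.readAllBytes(Paths.get(file)), charset);
    }
}
```

A call would then look like WholeFile.read("notes.txt", StandardCharsets.UTF_8), where notes.txt is a placeholder file name.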
You most likely will want to use the FileInputStream class:
int character;
StringBuffer buffer = new StringBuffer("");
FileInputStream inputStream = new FileInputStream(new File("/home/jessy/file.txt"));
while( (character = inputStream.read()) != -1)
buffer.append((char) character);
inputStream.close();
System.out.println(buffer);
You will also want to catch some of the exceptions thrown by the read() method and FileInputStream constructor, but those are implementation details specific to your project.

How to write binary to a file in Java

I am trying to get input from a file, convert the characters to binary and then output the binary to another output file.
I used Integer.toBinaryString() in order to make the conversion.
Everything is working as it should but for some reason nothing is written to the output file, but when I use System.out.println() it outputs fine.
import java.io.*;
public class Binary {
FileReader fRead = null;
FileWriter fWrite = null;
byte[] bFile = null;
String fileIn;
private String binaryString(int bString) {
String binVal = Integer.toBinaryString(bString);
while (binVal.length() < 8) {
binVal = "0" + binVal;
}
return binVal;
}
public void input() throws IOException, UnsupportedEncodingException {
try {
fRead = new FileReader("in.txt");
BufferedReader reader = new BufferedReader(fRead);
fileIn = reader.readLine();
bFile = fileIn.getBytes("UTF-8");
fWrite = new FileWriter("out.txt");
BufferedWriter writer = new BufferedWriter(fWrite);
for (byte b: bFile) {
writer.write(binaryString(b));
System.out.println(binaryString(b));
}
System.out.println("Done.");
} catch (Exception e) {
e.printStackTrace();
}
}
public Binary() {
}
public static void main(String[] args) throws UnsupportedEncodingException, IOException {
Binary b = new Binary();
b.input();
}
}
I know my code is not very good. I'm relatively new to Java, so I don't know many other ways to accomplish this.
Use an OutputStream instead of a Writer, as a Writer is not supposed to be used for writing binary content:
FileOutputStream fos = new FileOutputStream(new File("output.txt"));
BufferedOutputStream bos = new BufferedOutputStream(fos);
bos.write(b); // in loop probably
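Whichever class is used, buffered output has to be flushed or closed before the program exits. The question's code never closes its BufferedWriter, which on its own can leave out.txt empty. A minimal sketch, assuming the goal is a text file of '0' and '1' characters (BinaryText, toBits and writeBits are illustrative names, not from the question):

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class BinaryText {
    // Formats one byte as exactly 8 binary digits. The 0xFF mask keeps
    // negative bytes from expanding to 32 digits.
    static String toBits(byte b) {
        String bits = Integer.toBinaryString(b & 0xFF);
        StringBuilder padded = new StringBuilder();
        for (int i = bits.length(); i < 8; i++) {
            padded.append('0');
        }
        return padded.append(bits).toString();
    }

    // Writes the UTF-8 bytes of the text as a string of binary digits.
    static void writeBits(String text, Path out) throws IOException {
        try (BufferedWriter writer = Files.newBufferedWriter(out, StandardCharsets.US_ASCII)) {
            for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
                writer.write(toBits(b));
            }
        } // leaving the try block closes the writer, which flushes the buffer to disk
    }
}
```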

Find and replace in Java using regular expression without changing file format

I have code which replaces 10:A with 12:A in a text file called sample.txt. However, the code I have now also changes the file format, which it shouldn't. Can someone please let me know how to do the same using a regular expression in Java without changing the file format? The file has the original format as below 10:A 14:Saxws But after executing the code it outputs as 10:A 14:Saxws.
import java.io.*;
import java.util.*;
public class FileReplace
{
List<String> lines = new ArrayList<String>();
String line = null;
public void doIt()
{
try
{
File f1 = new File("sample.txt");
FileReader fr = new FileReader(f1);
BufferedReader br = new BufferedReader(fr);
while ((line = br.readLine()) != null)
{
if (line.contains("10:A"))
line = line.replaceAll("10:A", "12:A") + System.lineSeparator();
lines.add(line);
}
fr.close();
br.close();
FileWriter fw = new FileWriter(f1);
BufferedWriter out = new BufferedWriter(fw);
for(String s : lines)
out.write(s);
out.flush();
out.close();
}
catch (Exception ex)
{
ex.printStackTrace();
}
}
public static void main(String[] args)
{
FileReplace fr = new FileReplace();
fr.doIt();
}
}
It looks like your OS or editor is not able to correctly display the line separators generated by System.lineSeparator(). In that case consider:
reading the content of the entire file into a string (including the original line separators),
then replacing the part you are interested in,
and writing the replaced string back to your file.
You can do it using this code:
Path file = Paths.get("sample.txt");
//read all bytes from the file (they include the bytes of the original line separators)
byte[] bytesFromFile = Files.readAllBytes(file);
//convert them to a string
String textFromFile = new String(bytesFromFile, StandardCharsets.UTF_8);//use the file's actual charset
//replace what you need (line separators stay the same)
textFromFile = textFromFile.replaceAll("10:A", "12:A");
//write the data back; TRUNCATE_EXISTING discards leftover bytes if the new text is shorter
Files.write(file, textFromFile.getBytes(StandardCharsets.UTF_8), StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING);
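The same idea as a reusable method, as a sketch only. It assumes the search text is a literal rather than a regular expression, so it uses String.replace; ReplaceInFile and replaceLiteral are illustrative names. Files.write is called without options here, which creates the file or truncates an existing one.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;

class ReplaceInFile {
    // Replaces every literal occurrence of target and leaves all other bytes,
    // including the original line separators, untouched.
    static void replaceLiteral(Path file, String target, String replacement, Charset charset)
            throws IOException {
        String text = new String(Files.readAllBytes(file), charset);
        String replaced = text.replace(target, replacement); // literal match, no regex
        // With no options, Files.write behaves as CREATE + TRUNCATE_EXISTING + WRITE.
        Files.write(file, replaced.getBytes(charset));
    }
}
```

Because the file is never split into lines, a file that used \r\n keeps \r\n and a file that used \n keeps \n.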

GZIPInputStream reading line by line

I have a file in .gz format. The java class for reading this file is GZIPInputStream.
However, this class doesn't extend the BufferedReader class of java. As a result, I am not able to read the file line by line. I need something like this
reader = new MyGZInputStream( some constructor of GZInputStream)
reader.readLine()...
I thought of creating my own class which extends the Reader or BufferedReader class of Java and uses a GZIPInputStream as one of its fields.
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.Reader;
import java.util.zip.GZIPInputStream;
public class MyGZFilReader extends Reader {
private GZIPInputStream gzipInputStream = null;
char[] buf = new char[1024];
@Override
public void close() throws IOException {
gzipInputStream.close();
}
public MyGZFilReader(String filename)
throws FileNotFoundException, IOException {
gzipInputStream = new GZIPInputStream(new FileInputStream(filename));
}
@Override
public int read(char[] cbuf, int off, int len) throws IOException {
// TODO Auto-generated method stub
return gzipInputStream.read((byte[])buf, off, len);
}
}
But, this doesn't work when I use
BufferedReader in = new BufferedReader(
new MyGZFilReader("F:/gawiki-20090614-stub-meta-history.xml.gz"));
System.out.println(in.readLine());
Can someone advise how to proceed?
The basic setup of decorators is like this:
InputStream fileStream = new FileInputStream(filename);
InputStream gzipStream = new GZIPInputStream(fileStream);
Reader decoder = new InputStreamReader(gzipStream, encoding);
BufferedReader buffered = new BufferedReader(decoder);
The key issue in this snippet is the value of encoding. This is the character encoding of the text in the file. Is it "US-ASCII", "UTF-8", "SHIFT-JIS", "ISO-8859-9", …? There are hundreds of possibilities, and the correct choice usually cannot be determined from the file itself. It must be specified through some out-of-band channel.
For example, maybe it's the platform default. In a networked environment, however, this is extremely fragile. The machine that wrote the file might sit in the neighboring cubicle, but have a different default file encoding.
Most network protocols use a header or other metadata to explicitly note the character encoding.
In this case, it appears from the file extension that the content is XML. XML includes the "encoding" attribute in the XML declaration for this purpose. Furthermore, XML should really be processed with an XML parser, not as text. Reading XML line-by-line seems like a fragile, special case.
Failing to explicitly specify the encoding is against the second commandment. Use the default encoding at your peril!
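As a small illustration of the decorator chain with the encoding passed explicitly, here is a sketch. The class name GzipLines and the method firstLine are made up for the example, and the caller is assumed to know the encoding of the compressed text.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.GZIPInputStream;

class GzipLines {
    // Returns the first line of a gzip-compressed text file, or null if it is empty.
    static String firstLine(Path gzFile, Charset encoding) throws IOException {
        // file bytes -> gunzip -> decode with the given charset -> buffered lines
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                 new GZIPInputStream(Files.newInputStream(gzFile)), encoding))) {
            return reader.readLine();
        }
    }
}
```

Closing the outermost BufferedReader also closes the wrapped GZIPInputStream and the underlying file stream.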
GZIPInputStream gzip = new GZIPInputStream(new FileInputStream("F:/gawiki-20090614-stub-meta-history.xml.gz"));
BufferedReader br = new BufferedReader(new InputStreamReader(gzip));
br.readLine();
BufferedReader in = new BufferedReader(new InputStreamReader(
new GZIPInputStream(new FileInputStream("F:/gawiki-20090614-stub-meta-history.xml.gz"))));
String content;
while ((content = in.readLine()) != null)
System.out.println(content);
You can use the following method in a util class, and use it whenever necessary...
public static List<String> readLinesFromGZ(String filePath) {
List<String> lines = new ArrayList<>();
File file = new File(filePath);
try (GZIPInputStream gzip = new GZIPInputStream(new FileInputStream(file));
BufferedReader br = new BufferedReader(new InputStreamReader(gzip));) {
String line = null;
while ((line = br.readLine()) != null) {
lines.add(line);
}
} catch (FileNotFoundException e) {
e.printStackTrace(System.err);
} catch (IOException e) {
e.printStackTrace(System.err);
}
return lines;
}
Here it is as a single statement:
try (BufferedReader br = new BufferedReader(
new InputStreamReader(
new GZIPInputStream(
new FileInputStream(
"F:/gawiki-20090614-stub-meta-history.xml.gz")))))
{br.readLine();}
