read greek characters from xls file into java - java

I am trying to read an xls file in Java and convert it to CSV. The problem is that it contains Greek characters. I have tried several different methods with no success.
br = new BufferedReader(new InputStreamReader(
        new FileInputStream(saveDir + "/" + fileName + ".xls"), "UTF-8"));
FileWriter writer1 = new FileWriter(saveDir + "/A" + fileName + ".csv");
byte[] bytes = thisLine.getBytes("UTF-8");
writer1.append(new String(bytes, "UTF-8"));
I used that with different encodings, like UTF-16 and windows-1253, and of course without using the bytes array. None worked. Any ideas?

Use "ISO-8859-7" instead of "UTF-8". It is for latin and greek. See documentation
InputStream in = new BufferedInputStream(new FileInputStream(new File(myfile)));
result = new Scanner(in,"ISO-8859-7").useDelimiter("\\A").next();
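For context, a minimal sketch of that suggestion applied to the read/write from the question (saveDir and fileName are the question's variables, and ISO-8859-7 actually matching the file's encoding is an assumption):
// Read the whole file with the Greek/Latin charset suggested above,
// then write the result out as a UTF-8 CSV.
String result;
try (InputStream in = new BufferedInputStream(
        new FileInputStream(saveDir + "/" + fileName + ".xls"))) {
    result = new Scanner(in, "ISO-8859-7").useDelimiter("\\A").next();
}
try (Writer out = new OutputStreamWriter(
        new FileOutputStream(saveDir + "/A" + fileName + ".csv"), "UTF-8")) {
    out.write(result);
}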

A Byte Order Mark (BOM) should be written at the start of the CSV file.
Can you try this code?
PrintWriter writer1 = new PrintWriter(saveDir+"/A"+fileName+".csv");
writer1.print('\ufeff');
....
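For context, a minimal standalone sketch of that idea: write the BOM first, and make sure the writer itself encodes as UTF-8 (the file name and the Greek rows are placeholders).
import java.io.FileOutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;

// Write a UTF-8 CSV with a leading BOM so spreadsheet tools detect the encoding.
try (PrintWriter out = new PrintWriter(new OutputStreamWriter(
        new FileOutputStream("output.csv"), "UTF-8"))) {
    out.print('\ufeff');            // UTF-8 BOM
    out.println("στήλη;τιμή");      // placeholder Greek header row
    out.println("δοκιμή;123");      // placeholder data row
}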

Related

Java, Reading a file that has UCS-2 Little Endian encoding

I'm trying to read a txt file that has the UCS-2 LE encoding; I have the code below. The ??? is the encoding name I need, but I am not sure what it is supposed to be.
InputStream HostFile = new FileInputStream(Location + FileName);
Reader file = new InputStreamReader(HostFile, Charset.forName(???));
PrintWriter writer = new PrintWriter(outLocation, "UTF-8");
Any ideas would be appreciated.
Reader file = new InputStreamReader(HostFile, Charset.forName("UTF-16LE"));
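For reference, a minimal sketch of reading such a file and re-encoding it (the file names are placeholders; UCS-2 LE text decodes with the UTF-16LE charset for characters in the Basic Multilingual Plane):
import java.io.*;
import java.nio.charset.StandardCharsets;

// Decode the UCS-2/UTF-16LE input and write it back out as UTF-8.
try (BufferedReader in = new BufferedReader(new InputStreamReader(
        new FileInputStream("hostfile.txt"), StandardCharsets.UTF_16LE));
     PrintWriter out = new PrintWriter(new OutputStreamWriter(
        new FileOutputStream("converted.txt"), StandardCharsets.UTF_8))) {
    String line;
    while ((line = in.readLine()) != null) {
        out.println(line);
    }
}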

How to convert a UTF-8 file to UTF-16 format in Java?

I have a file with text in it, which is in UTF-8 format. I want to convert the content of that file to UTF-16, but I want to keep the special characters. How can I do this?
My attempt:
reader = new BufferedReader(new InputStreamReader(
        new FileInputStream(file), StandardCharsets.UTF_8));
// ...
// read content from file
// ...
writer = new PrintWriter(new OutputStreamWriter(
        new FileOutputStream(file), StandardCharsets.UTF_16));
// ...
// write content to file
// ...
However, the special characters are lost with this approach: "ÜÖIJ³§`´" came out as "������`�". I also tried replacing the characters in the Java string, but the read characters are already malformed at that point. How can this be done?
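For context, a minimal conversion sketch, assuming the input and output are separate files (opening a FileOutputStream on the file that is still being read would truncate it before anything is read):
import java.io.*;
import java.nio.charset.StandardCharsets;

// Decode as UTF-8 and re-encode as UTF-16 (with BOM) into a different file.
try (Reader reader = new InputStreamReader(
        new FileInputStream("input-utf8.txt"), StandardCharsets.UTF_8);
     Writer writer = new OutputStreamWriter(
        new FileOutputStream("output-utf16.txt"), StandardCharsets.UTF_16)) {
    char[] buffer = new char[8192];
    int n;
    while ((n = reader.read(buffer)) != -1) {
        writer.write(buffer, 0, n);
    }
}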

Special character in txt file not being passed in the InputStream

I have an
InputStream inputStream = this.getClass().getClassLoader().getResourceAsStream("templates/createUser/new-user.txt");
and the content of new-user.txt is:
Hello™ how r u ®
but when they are displayed in the output they are displayed as
Hello��� how r u��
Can you tell me what changes I should make to my txt file so that the data is displayed correctly?
UPDATE
So here is the code:
Handlebars handlebars = new Handlebars();
InputStream txtInputStream = this.getClass().getClassLoader()
.getResourceAsStream("templates/createUser/new-user.txt");
Template textTemplate = handlebars.compileInline(IOUtils.toString(txtInputStream));
String emailText = textTemplate.apply(vars);
The problem does not lie in the InputStream object. InputStreams are just streams of bytes; they do not differentiate between encodings. The problem is that you should use this as your reader:
Reader reader = new InputStreamReader(inputStream, "UTF-8");
as opposed to using this:
Reader reader = new InputStreamReader(inputStream); // does not specify encoding
You can then get the string with:
String theString = IOUtils.toString(inputStream, "UTF-8");
Edit:
I did not realize you had posted the full code in the comments. Just change your second-to-last line to:
Template textTemplate = handlebars.compileInline(IOUtils.toString(txtInputStream, "UTF-8"));
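Putting it together, a minimal sketch of the fixed template loading (IOUtils is assumed to be Apache Commons IO, Handlebars is the handlebars.java library from the question, and vars is the question's variable):
Handlebars handlebars = new Handlebars();
try (InputStream txtInputStream = this.getClass().getClassLoader()
        .getResourceAsStream("templates/createUser/new-user.txt")) {
    // Decode the resource explicitly as UTF-8 so characters like ™ and ® survive.
    Template textTemplate = handlebars.compileInline(
            IOUtils.toString(txtInputStream, "UTF-8"));
    String emailText = textTemplate.apply(vars);
}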

How do I get a FileInputStream from a FileItem in Java?

I am trying to avoid FileItem.getInputStream(), because it will get the wrong encoding; for that I need a FileInputStream instead. Is there any way to get a FileInputStream without using this method? Or can I transform my FileItem into a File?
if (this.strEncoding != null && !this.strEncoding.isEmpty()) {
    br = new BufferedReader(new InputStreamReader(clsFile.getInputStream(), this.strEncoding));
}
else {
    // br = ?????
}
You can try FileItem#getString(encoding), which returns the contents of the file item as a String, using the specified encoding.
You can use the write method here.
File file = new File("/path/to/file");
fileItem.write(file);
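If the goal is the FileInputStream from the question, a minimal follow-up sketch (the path is a placeholder; fileItem and strEncoding are the names used in the question):
// Copy the uploaded data to disk, then reopen it with an explicit encoding.
File file = new File("/path/to/tempfile");   // placeholder path
fileItem.write(file);
BufferedReader br = new BufferedReader(new InputStreamReader(
        new FileInputStream(file), this.strEncoding));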
An InputStream is binary data, bytes. It must be converted to text by specifying the encoding of those bytes.
Internally, Java uses Unicode to represent all text scripts. For text it uses String/char/Reader/Writer;
for binary data it uses byte[]/InputStream/OutputStream.
So you could use a bridging class, like InputStreamReader:
String encoding = "UTF-8"; // Or "Windows-1252" ...
BufferedReader in = new BufferedReader(
        new InputStreamReader(fileItem.getInputStream(), encoding));
Or if you read the bytes:
String s = new String(bytes, encoding);
The encoding is often an optional parameter (there is then an overloaded method without the encoding).

Java reading hex file

As part of a larger program, I need to read values from a hex file and print the decimal values.
It seems to be working fine; however, all hex values from 80 to 9f give wrong values.
For example, hex 80 gives a decimal value of 8364.
Please help.
This is my code:
String filename = "pidno5.txt";
FileInputStream ist = new FileInputStream("sb3os2tm1r01897.032");
BufferedReader istream = new BufferedReader(new InputStreamReader(ist));
int[] b = new int[160];
for (int i = 0; i < 160; i++)
    b[i] = istream.read();
for (int i = 0; i < 160; i++)
    System.out.print(b[i] + " ");
If you were trying to read raw bytes, this is not what you are doing.
You are using a Reader, which reads characters in an encoding you did not specify, so the platform default is used. That is why bytes 80 to 9f come out wrong: 8364, for example, is the code point of '€', which is what byte 0x80 maps to in windows-1252.
To read bytes, use an InputStream (and do not wrap it in a Reader).
You may also use a different encoding:
BufferedReader istream = new BufferedReader(new InputStreamReader(ist, "ISO-8859-15"));
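For reference, a minimal sketch of the raw-byte approach (the file name is taken from the question):
import java.io.FileInputStream;
import java.io.InputStream;

// InputStream.read() returns each byte as 0-255 (or -1 at end of stream),
// so no character decoding can alter the values.
try (InputStream in = new FileInputStream("sb3os2tm1r01897.032")) {
    int[] b = new int[160];
    for (int i = 0; i < 160; i++) {
        b[i] = in.read();
    }
    for (int i = 0; i < 160; i++) {
        System.out.print(b[i] + " ");
    }
}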
