Java reading hex file


As part of a larger program, I need to read values from a hex file and print their decimal values.
It seems to be working fine; however, all hex values from 80 to 9f give wrong values.
For example, 80 hex gives a decimal value of 8364.
Please help.
This is my code:
String filename = "pidno5.txt";
FileInputStream ist = new FileInputStream("sb3os2tm1r01897.032");
BufferedReader istream = new BufferedReader(new InputStreamReader(ist));
int b[]=new int[160];
for(int i=0;i<160;i++)
b[i]=istream.read();
for(int i=0;i<160;i++)
System.out.print((b[i])+" ");

If you are trying to read raw bytes, this is not what you are doing.
You are using a Reader, which reads characters (in an encoding you did not specify, so it falls back to the platform default). The value 8364 is U+20AC, the euro sign, which is what windows-1252 decodes byte 80 to, so that is probably your default.
To read bytes, use an InputStream (and do not wrap it in a Reader).
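
A minimal sketch of the byte-oriented approach, assuming you only need each byte as an unsigned decimal value. The class name RawByteDump and the helper readUnsigned are made up for illustration, and an in-memory sample stands in for the file; for the real file you would pass a FileInputStream (ideally wrapped in a BufferedInputStream) instead.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

class RawByteDump {
    // Reads up to max bytes; InputStream.read() returns each byte as 0..255, or -1 at end of stream.
    static int[] readUnsigned(InputStream in, int max) throws IOException {
        int[] values = new int[max];
        int count = 0;
        int b;
        while (count < max && (b = in.read()) != -1) {
            values[count++] = b;
        }
        // Trim to the number of bytes actually read.
        return Arrays.copyOf(values, count);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample data standing in for the file contents.
        byte[] sample = {(byte) 0x7f, (byte) 0x80, (byte) 0x9f, (byte) 0xa4};
        try (InputStream in = new ByteArrayInputStream(sample)) {
            for (int v : readUnsigned(in, 160)) {
                System.out.print(v + " "); // prints 127 128 159 164
            }
        }
    }
}
```

No charset is involved anywhere, so 80 hex comes out as 128 rather than 8364.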

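You may also keep the Reader and pass an encoding explicitly. ISO-8859-1 maps every byte to the char with the same value; ISO-8859-15 leaves 80 to 9f alone but remaps eight other bytes (a4 becomes 8364, for instance):
BufferedReader istream = new BufferedReader(new InputStreamReader(ist, "ISO-8859-1"));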

Related

Base64 Encoded to Decoded File Conversion Problem

I am processing very large files (> 2 GB). Each input file is Base64 encoded, and I am outputting to new files after decoding. Depending on the buffer size (LARGE_BUF) and for a given input file, my input to output conversion either works fine, is missing one or more bytes, or throws an exception at the outStream.write line (IllegalArgumentException: Last unit does not have enough bits). Here is the code snippet (I could not cut and paste, so it may not be perfect):
.
.
final int LARGE_BUF = 1024;
byte[] inBuf = new byte[LARGE_BUF];
try (InputStream inputStream = new FileInputStream(inFile); OutputStream outStream = new FileOutputStream(outFile)) {
for (int len; (len = inputStream.read(inBuf)) > 0; ) {
String out = new String(inBuf, 0, len);
outStream.write(Base64.getMimeDecoder().decode(out.getBytes()));
}
}
For instance, for my sample input file: if LARGE_BUF is 1024, the output file is 4 bytes too small; if it is 2*1024, I get the exception mentioned above; if it is 7*1024, it works correctly. Grateful for any ideas. Thank you.

First, you are converting bytes into a String, then immediately back into bytes. So, remove the use of String entirely.
Second, base64 encoding turns each sequence of three bytes into four bytes, so when decoding, you need four bytes to properly decode three bytes of original data. It is not safe to create a new decoder for each arbitrarily read sequence of bytes, which may or may not have a length which is an exact multiple of four.
Finally, Base64.Decoder has a wrap(InputStream) method which makes this considerably easier:
try (InputStream inputStream = Base64.getDecoder().wrap(
new BufferedInputStream(
Files.newInputStream(Paths.get(inFile))))) {
Files.copy(inputStream, Paths.get(outFile));
}
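
A self-contained sketch of the same wrap(InputStream) idea, assuming the input is MIME-style Base64 (with line breaks), as the question's use of getMimeDecoder() suggests. The class name StreamingBase64 and the in-memory streams are placeholders for illustration; for real files you would pass file streams instead.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Arrays;
import java.util.Base64;

class StreamingBase64 {
    // Decodes the whole stream through ONE decoder, so the read buffer size
    // no longer has to line up with 4-character Base64 units.
    static void decode(InputStream encoded, OutputStream out, int bufSize) throws IOException {
        try (InputStream decoded = Base64.getMimeDecoder().wrap(encoded)) {
            byte[] buf = new byte[bufSize];
            for (int len; (len = decoded.read(buf)) != -1; ) {
                out.write(buf, 0, len);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample data: 1000 bytes, MIME-encoded with line breaks.
        byte[] original = new byte[1000];
        for (int i = 0; i < original.length; i++) {
            original[i] = (byte) (i * 31);
        }
        byte[] encoded = Base64.getMimeEncoder().encode(original);

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        decode(new ByteArrayInputStream(encoded), out, 7); // deliberately awkward buffer size
        System.out.println(Arrays.equals(original, out.toByteArray()));
    }
}
```

Because the decoder keeps its state across reads, any buffer size should give the same output.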

read greek characters from xls file into java

I am trying to read an xls file in java and convert it to csv. The problem is that it contains greek characters. I have used various different methods with no success.
br = new BufferedReader(new InputStreamReader(
new FileInputStream(saveDir+"/"+fileName+".xls"), "UTF-8"));
FileWriter writer1 = new FileWriter(saveDir+"/A"+fileName+".csv");
byte[] bytes = thisLine.getBytes("UTF-8");
writer1.append(new String(bytes, "UTF-8"));
I tried that with different encodings, like UTF-16 and windows-1253, and of course without using the bytes array. None worked. Any ideas?

Use "ISO-8859-7" instead of "UTF-8". It covers Latin and Greek characters.
InputStream in = new BufferedInputStream(new FileInputStream(new File(myfile)));
result = new Scanner(in,"ISO-8859-7").useDelimiter("\\A").next();

A byte order mark (BOM) should be written at the start of the CSV file.
Can you try this code?
PrintWriter writer1 = new PrintWriter(saveDir+"/A"+fileName+".csv", "UTF-8");
writer1.print('\ufeff');
....
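
A rough sketch of the BOM idea, assuming the CSV is meant to be written as UTF-8. The class name BomCsv and the helper withBom are invented for illustration, and the bytes go to memory rather than to a file so the result is easy to inspect.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

class BomCsv {
    // Encodes the CSV text as UTF-8, preceded by U+FEFF, which UTF-8 encodes as EF BB BF.
    static byte[] withBom(String csv) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(bytes, StandardCharsets.UTF_8)) {
            w.write('\uFEFF'); // the byte order mark
            w.write(csv);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Greek alpha and beta, written as escapes so the source file encoding does not matter.
        byte[] out = withBom("\u03b1;\u03b2\n");
        for (byte b : out) {
            System.out.printf("%02X ", b & 0xff);
        }
        System.out.println();
    }
}
```

The charset is passed to the writer explicitly, so the result does not depend on the platform default.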

java convert utf-8 2 byte char to 1 byte char

There are many similar questions, but none of them helped me.
UTF-8 can be 1 byte or 2, 3, 4.
ISO-8859-15 is always 2 bytes.
But I need a 1 byte character, like code page 863 (IBM863).
http://en.wikipedia.org/wiki/Code_page_863
For example, "é" is code point 233 and is 2 bytes long in UTF-8; how can I convert it to IBM863 (1 byte) in Java?
Is that possible when running on a JVM with -Dfile.encoding=UTF-8?
Of course that conversion would mean that some characters can be lost, because IBM863 is smaller.
But I need the language-specific characters, like the French è, é, etc.
Edit1:
String text = "text with é";
Socket socket = getPrinterSocket( printer);
BufferedWriter bwOut = getPrinterWriter(printer,socket);
...
bwOut.write("PRTXT \"" + text + "\n");
...
if (socket != null)
{
bwOut.close();
socket.close();
}
else
{
bwOut.flush();
}
It is going to a label printer running Fingerprint 8.2.
Edit 2:
private BufferedWriter getPrinterWriter(PrinterLocal printer, Socket socket)
throws IOException
{
return new BufferedWriter(new OutputStreamWriter(socket.getOutputStream()));
}

First of all: there is no such thing as "1 byte char" or, in fact, "n byte char" for whatever n.
In Java, a char is a UTF-16 code unit; depending on the (Unicode) code point, either one, or two chars, are necessary to represent a code point.
You can use the following methods:
Character.toChars() to turn a Unicode code point into a char array representing this code point;
a CharsetEncoder to perform the char[] to byte[] conversion;
a CharsetDecoder to perform the byte[] to char[] conversion.
You obtain the two latter from a Charset's .new{Encoder,Decoder}() methods.
It is crucially important here to know what your input is exactly: is it a code point, is it an encoded byte array? You'll have to adapt your code depending on this.
Final note: the file.encoding setting defines the default charset used when you don't specify one, for instance in the FileReader constructors; you should always specify a charset explicitly to begin with!
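
The pieces named above can be put together roughly as follows. This is only a sketch: the class name SingleByteEncode and the helper encode are invented for illustration, and since IBM863 is not one of the charsets every JVM must provide, main falls back to ISO-8859-1 when it is missing.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;

class SingleByteEncode {
    // Code point -> char[] -> bytes in the target charset.
    // Throws instead of silently substituting '?' when the charset lacks the character.
    static byte[] encode(int codePoint, Charset target) throws CharacterCodingException {
        char[] chars = Character.toChars(codePoint);
        ByteBuffer encoded = target.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .encode(CharBuffer.wrap(chars));
        // Copy only the bytes actually produced, not the whole backing array.
        byte[] out = new byte[encoded.remaining()];
        encoded.get(out);
        return out;
    }

    public static void main(String[] args) throws CharacterCodingException {
        // Assumption: fall back to ISO-8859-1 if this JVM does not ship IBM863.
        String name = Charset.isSupported("IBM863") ? "IBM863" : "ISO-8859-1";
        byte[] bytes = encode(233, Charset.forName(name)); // 233 is the code point of é
        System.out.println(name + ": " + bytes.length + " byte(s), first value " + (bytes[0] & 0xff));
    }
}
```

Reporting unmappable characters makes the "some characters can be lost" case from the question visible as an exception rather than as a wrong glyph on the label.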

byte[] someUtf8Bytes = ...
String decoded = new String(someUtf8Bytes, StandardCharsets.UTF_8);
byte[] someIso15Bytes = decoded.getBytes("ISO-8859-15");
byte[] someCp863Bytes = decoded.getBytes("cp863");
If you start with a string, use just getBytes with a proper encoding.
If you want to write strings with a proper encoding to a socket, you can either use OutputStream instead of PrintStream or Writer and send byte arrays, or you can do:
new BufferedWriter(new OutputStreamWriter(socket.getOutputStream(), "cp863"))

Trying to Change the Encoding of a File in Java is Doubling the Contents of the File

I have a FileOutputStream in java that is reading the contents of UDP packets and saving them to a file. At the end of reading them, I sometimes want to convert the encoding of the file. The problem is that currently when doing this, it just ends up doubling all the contents of the file. The only workaround that I could think to do would be to create a temp file with the new encoding and then save it as the original file, but this seems too hacky.
I must be just overlooking something in my code:
if(mode.equals("netascii")){
byte[] convert = new byte[(int)file.length()];
FileInputStream input = new FileInputStream(file);
input.read(convert);
String temp = new String(convert);
convert = Charset.forName("US-ASCII").encode(temp).array();
fos.write(convert);
}
JOptionPane.showMessageDialog(frame, "Read Successful!");
fos.close();
}
Is there anything suspect?
Thanks in advance for any help!

The problem is that the array of bytes you've read from the InputStream is converted as if it were ASCII chars (more precisely, it is decoded with the platform default charset), which I'm assuming it is not. Specify the InputStream's encoding when converting its bytes to a String and you'll get a standard Java string.
I've assumed UTF-16 as the InputStream's encoding here:
byte[] convert = new byte[(int)file.length()];
FileInputStream input = new FileInputStream(file);
// read file bytes until the buffer is full or EOF is reached
int off = 0;
while (off < convert.length) {
int r = input.read(convert, off, convert.length - off);
if (r == -1) break;
off += r;
}
String temp = new String(convert, Charset.forName("UTF-16"));
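
As a sketch of the decode-then-encode step in isolation, assuming (as above) that the source is UTF-16 and the target is US-ASCII. The class name Transcode and its helper are made up for illustration. String.getBytes returns an array of exactly the encoded length, whereas the array() of the ByteBuffer returned by Charset.encode is the whole backing array and may be longer than the encoded data.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class Transcode {
    // Decodes the input with an explicitly named source charset, then re-encodes it.
    static byte[] transcode(byte[] input, Charset from, Charset to) {
        String text = new String(input, from);
        return text.getBytes(to);
    }

    public static void main(String[] args) {
        // Hypothetical sample: "hi" encoded as UTF-16 (BOM plus two bytes per char).
        byte[] utf16 = "hi".getBytes(StandardCharsets.UTF_16);
        byte[] ascii = transcode(utf16, StandardCharsets.UTF_16, StandardCharsets.US_ASCII);
        System.out.println(utf16.length + " bytes in, " + ascii.length + " bytes out"); // 6 bytes in, 2 bytes out
    }
}
```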

Java OutputStreamWriter UTF-16 CSV wrong character on StartLine

I need to write a simple CSV file using OutputStreamWriter. Everything works OK, but I have a problem: in the first header of the CSV, at the far left of the line, a stray character or sequence of characters seems to be added to the string. Here is my Java code:
private final Character SEPARATOR=';';
private final Character LINE_FEED='\n';
public void createCSV(final String fileName)//......
{
try
(final OutputStream outputStream = new FileOutputStream(fileName);
final OutputStreamWriter writer=new OutputStreamWriter(outputStream,StandardCharsets.UTF_16);)
{
final StringBuilder builder = new StringBuilder().append("Fecha").append(SEPARATOR)
.append("NºExp").append(SEPARATOR)
.append("NºFactura").append(SEPARATOR).append(LINE_FEED);
writer.append(builder.toString());
writer.append(builder.toString());
writer.flush();
}catch (IOException e){e.printStackTrace();}
}
Unfortunately, I am receiving this output. It always happens in the first line; if I repeat the same output on the second line of the CSV, everything works smoothly. Is this a Java problem, or is my Excel giving me nightmares? Thanks a lot.
OUTPUT

This is a superfluous byte order mark (BOM), \uFEFF, a zero-width no-break space, whose byte encoding is used to determine whether the data is UTF-16LE (little endian) or UTF-16BE (big endian). Java's UTF-16 charset writes it, in big-endian order, at the start of the output.
Write "UTF-16LE", which has the Windows/Intel ordering of least significant byte, most significant byte, and does not emit a BOM:
StandardCharsets.UTF_16LE
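
To see the difference in the raw bytes, here is a small sketch. The class name Utf16Bom and the hex helper are invented for illustration; the last line shows one way to add a little-endian BOM by hand in case the consuming program turns out to expect one, which is an assumption, not something the question confirms.

```java
import java.nio.charset.StandardCharsets;

class Utf16Bom {
    // Formats bytes as space-separated upper-case hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xff));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // UTF_16 encodes big-endian and prepends a BOM.
        System.out.println(hex("A".getBytes(StandardCharsets.UTF_16)));         // FE FF 00 41
        // UTF_16LE encodes little-endian and writes no BOM.
        System.out.println(hex("A".getBytes(StandardCharsets.UTF_16LE)));       // 41 00
        // A BOM written explicitly as the first character comes out little-endian.
        System.out.println(hex("\uFEFFA".getBytes(StandardCharsets.UTF_16LE))); // FF FE 41 00
    }
}
```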
