I'm currently doing an assignment and I have a byte array that I write to a file.
I want to get the byte array back out of the file by reversing what I did, but I don't think I'm getting the right value back.
This is the code that writes to the text file:
PrintWriter fw = new PrintWriter(new BufferedWriter(new FileWriter("tc.txt",true)));
KeyPair keyPair = generateKeyPair();
byte[] publicKey = keyPair.getPublic().getEncoded();
byte[] privateKey = keyPair.getPrivate().getEncoded();
fw.println(email +" " + publicKey + " " + privateKey); //adds the site into the text file which contains all the blacklisted sites
fw.flush();
And I try to get the information back as a string and use .getBytes() to convert it back to a byte array, like this:
tempPublicKey = (blScanner.next()).getBytes();
It doesn't seem to be right. Does something happen in between that goes wrong?
Since you are writing an arbitrary byte sequence to a text file, consider encoding your bytes as Base64 when writing and decoding them when you read the text file back.
Text files are not suitable for storing and retrieving arbitrary byte sequences, because some bytes may be interpreted as formatting characters, line terminators, and so on.
Encoding your bytes as Base64 and decoding them back preserves the byte sequence when writing to and reading from a text file.
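Note also that concatenating a byte[] into a String (as the println call above does) only writes the array's toString() value, something like [B@1e643faf, not the bytes themselves. A minimal sketch of the Base64 round trip, assuming java.util.Base64 (Java 8+) and the email, publicKey and privateKey variables from the question:
import java.io.*;
import java.util.Base64;
import java.util.Scanner;

// Writing: encode each key to a Base64 string before printing it.
try (PrintWriter fw = new PrintWriter(new BufferedWriter(new FileWriter("tc.txt", true)))) {
    String pub = Base64.getEncoder().encodeToString(publicKey);
    String priv = Base64.getEncoder().encodeToString(privateKey);
    fw.println(email + " " + pub + " " + priv); // Base64 contains no spaces, so the tokens stay separable
}

// Reading: decode each Base64 token back into the original bytes.
try (Scanner blScanner = new Scanner(new File("tc.txt"))) {
    String storedEmail = blScanner.next();
    byte[] tempPublicKey = Base64.getDecoder().decode(blScanner.next());
    byte[] tempPrivateKey = Base64.getDecoder().decode(blScanner.next());
}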
I am receiving an image from a JSP, converting it into a byte array, then a Blob, and saving it in my database. On a different page I then retrieve it, and when I retrieve the image I get the following String.
'x*??{[?Y?>YE *?????_/???????~%?+y?`??uH????#??\t?????|B??k?-??Z4V?U7?F???m+(?? ? I??pq^?Q???????18?R???-???>0~?sXxCI?;[;t???9?fBX?Bp?A??^M?k? ??G?S?u???????r?U&‚w*??8????`??> Y?2?????1?j$??\??DR[??t0? pps?_Ex? ???_o?*?? xV)?6D8?$??!?9??~???N?`???}W?s?gNUf?Mn>?s?3?r?3M???X???Q????N!pr~?W????Mjq5??????2m???8????x??V?????????[???"??*,I?/#s?V?d?B?/?Vb?&R?n|?>??2????)?r??1??%7?Q??^f?R?C?????mvm??%6?K?p??;O?Z?&?????u?????\???R"ZOex???VkE???????_??????K?M#=??o?Z[?[hb?H?V????
I have cut this down manually on here.
This is the tag that I have used.
<img src="data:image/jpeg;base64,${img}" width="100" height="100"></img>
I'm not quite sure what has gone wrong here. Here is where I have taken the file and turned it into a byte[] and then a Blob.
byte[] byteData = file.getBytes();
Blob blobs = new SerialBlob(byteData);
and this is how I've then converted it into a Base64 string.
String base64DataString = new String(byteData , "UTF-8");
System.out.println(base64DataString);
model.addAttribute("img", base64DataString);
If anyone has any idea how I can turn this string into a normal base64 string which can be used to reproduce an image, that would be very helpful.
Jim
String base64DataString = Base64.getEncoder().encodeToString(byteData);
Binary data should never be converted to a String like that: a String holds Unicode text (internally mostly as UTF-16 chars, so every byte costs two bytes, i.e. one char), the conversion takes time, and for arbitrary binary data it will almost certainly corrupt the bytes.
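A minimal sketch of the corrected flow, assuming java.util.Base64 (Java 8+) and the same file, model and img names as in the question:
import java.sql.Blob;
import java.util.Base64;
import javax.sql.rowset.serial.SerialBlob;

byte[] byteData = file.getBytes();        // raw image bytes from the upload
Blob blob = new SerialBlob(byteData);     // store the raw bytes in the database

// Encode the raw bytes as Base64 text for the data: URI in the JSP.
String base64DataString = Base64.getEncoder().encodeToString(byteData);
model.addAttribute("img", base64DataString);

// When reading the Blob back later, re-encode its bytes the same way.
byte[] fromDb = blob.getBytes(1, (int) blob.length()); // Blob positions are 1-based
String imgForPage = Base64.getEncoder().encodeToString(fromDb);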
So I read a file into a byte array and then I break it up into chunks and send it across the network with UDP.
Path path = Paths.get("files_upload/music.mp3");
byte[] objectBytes = Files.readAllBytes(path);
On the server I read all the chunks into a buffer and I end up with the same byte[] objectBytes as I had on the client. Now I want to write the file to disk using the original file name which is music.mp3 in this case. So how can I get the file name from the array of bytes?
The array of bytes doesn't contain the file name. You'd have to send it separately. You can call getFileName() on your Path (which returns another Path), convert that to a String with toString(), and then turn it into a byte array with getBytes().
String fileName = path.getFileName().toString(); // getFileName() returns a Path, not a String
byte[] fileNameBytes = fileName.getBytes(StandardCharsets.UTF_8); // explicit charset so both ends agree
You can then send this first and read it on the other end. Note, this won't contain the whole path, only the name of the file (music.mp3 in your case).
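A minimal sketch of that idea, assuming the name is sent as its own datagram before the data chunks; the server address, port and serverSocket are hypothetical, and with UDP this first packet can still be lost or arrive out of order (see the caveat below):
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

// Client: send the file name as the first datagram.
DatagramSocket socket = new DatagramSocket();
InetAddress server = InetAddress.getByName("localhost");
int port = 9999;
byte[] nameBytes = path.getFileName().toString().getBytes(StandardCharsets.UTF_8);
socket.send(new DatagramPacket(nameBytes, nameBytes.length, server, port));
// ... then send the objectBytes chunks as before ...

// Server: the first datagram received carries the file name.
byte[] buf = new byte[1024];
DatagramPacket namePacket = new DatagramPacket(buf, buf.length);
serverSocket.receive(namePacket);
String receivedName = new String(namePacket.getData(), 0, namePacket.getLength(), StandardCharsets.UTF_8);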
By the way, are you sure you want to be using UDP? What if you lose a packet or two when the data is being transferred? How do you detect that on the server?
I am developing a Java application where I am consuming a web service. The web service is created using a SAP server, which encodes the data automatically in Unicode. I get a Unicode string from the web service.
"
倥䙄ㄭ㌮쿣ී㈊〠漠橢圯湩湁楳湅潣楤杮湥潤橢″‰扯൪㰊഼┊敄瑶灹佐呓′†䘠湯⁴佃剕䕉⁒渠牯慭慌杮䔠ൎ⼊祔数⼠潆瑮匯扵祴数⼠祔数റ⼊慂敳潆瑮⼠潃牵敩൲⼊慎敭⼠う䔯据摯湩′‰㸊ാ攊摮扯൪㐊〠漠橢㰼䰯湥瑧‵‰㸊ാ猊牴慥൭ 䘯〰‱⸱2
"
Above is the response.
I want to convert it to a readable text format, like a String. I am using core Java.
That's a PDF file that has been interpreted as UTF-16LE.
You need to look at what component is receiving the response and how it's dealing with the input to stop it being decoded as UTF-16LE, but ultimately there isn't a 'readable' version of it as such, as it's a binary file. Extracting the document text out of a PDF file is a much bigger problem!
(Note: Unicode is a character set, UTF-16LE is an encoding of that set into bytes. Microsoft call the UTF-16LE encoding "Unicode" due to a historical accident, but that's misleading.)
If you have byte[] or an InputStream (both binary data) you can get a String or a Reader (both text) with:
final String encoding = "UTF-8"; // or "UTF-16LE" or "UTF-16BE"
byte[] b = ...;
String s = new String(b, encoding);

InputStream is = ...;
BufferedReader reader = new BufferedReader(new InputStreamReader(is, encoding));
String line;
while ((line = reader.readLine()) != null) {
    // process each line of text
}
The reverse process uses:
byte[] b = s.getBytes(encoding);

OutputStream os = ...;
BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(os, encoding));
writer.write(s);
writer.newLine(); // BufferedWriter has no println(); use write() plus newLine()
Unicode is a numbering system for all characters. The UTF variants implement Unicode as bytes.
Your problem:
Normally (with a web service) you would already have received a String. You could write that String to a file using the Writer above, for instance, either to check it yourself with a full Unicode font or to pass the file on for a check.
You may need to check which UTF variant the text is in. For Asian scripts, UTF-16 (little endian or big endian) is optimal. In XML the encoding would already be declared.
Addition:
FileWriter writes to a file using the default encoding (from the operating system on your machine). Instead use:
new OutputStreamWriter(new FileOutputStream(new File("...")), "UTF-8")
If it is a binary PDF, as @bobince said, just use a FileOutputStream on the byte[] or InputStream.
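A minimal sketch of both cases, assuming the response is available either as a String s (text) or as a byte[] b (the raw PDF); the file names are made up:
import java.io.*;
import java.nio.charset.StandardCharsets;

// Text: write the String with an explicit encoding instead of FileWriter's platform default.
try (Writer out = new OutputStreamWriter(new FileOutputStream("response.txt"), StandardCharsets.UTF_8)) {
    out.write(s);
}

// Binary (the PDF): write the raw bytes untouched, with no Reader/Writer involved.
try (OutputStream out = new FileOutputStream("response.pdf")) {
    out.write(b);
}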
This is definitely not a valid string. This looks like mangled UTF-16.
UPDATE
Indeed, @bobince is right: this is a PDF file (most probably in UTF-8 or plain ASCII) displayed as UTF-16. When displayed as UTF-8, this string indeed shows PDF source code. Good catch.
My goal is to encrypt a String with AES.
I am using Base64 for the encryption, because AES needs a byte array as input.
Moreover, I want every possible character (including Chinese and German symbols) to be stored correctly.
byte[] encryptedBytes = Base64.decodeBase64 ("some input");
System.out.println(new Base64().encodeToString(encryptedBytes));
I thought "some input" should be printed. Instead "someinpu" is printed.
It is impossible for me to use sun.misc.*; instead I am using Apache Commons Codec.
Does anyone have a clue what's going wrong?
Yes - "some input" isn't a valid base64 encoded string.
The idea of base64 is that you encode binary data into text. You then decode that text data to a byte array. You can't just decode any arbitrary text as if it were a complete base64 message any more than you can try to decode an mp3 as a jpeg image.
Encrypting a string should be this process (a sketch of both directions follows the two lists below):
Encode the string to binary data, e.g. using UTF-8 (text.getBytes("UTF-8"))
Encrypt the binary data using AES
Encode the cyphertext using Base64 to get text
Decryption is then a matter of:
Decode the base64 text to the binary cyphertext
Decrypt the cyphertext to get the binary plaintext
Decode the binary plaintext into a string using the same encoding as the first step above, e.g. new String(bytes, "UTF-8")
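A minimal sketch of those steps, assuming Apache Commons Codec for the Base64 part (as in the question) and a freshly generated AES key; note that Cipher.getInstance("AES") defaults to ECB/PKCS5Padding, which keeps the sketch short but is not what you'd want in production:
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import org.apache.commons.codec.binary.Base64;

// In practice you would load or derive the key rather than generate a throwaway one.
SecretKey key = KeyGenerator.getInstance("AES").generateKey();

// Encrypt: String -> UTF-8 bytes -> AES ciphertext -> Base64 text.
Cipher cipher = Cipher.getInstance("AES");
cipher.init(Cipher.ENCRYPT_MODE, key);
byte[] cipherBytes = cipher.doFinal("some input, 中文, äöü".getBytes("UTF-8"));
String cipherText = Base64.encodeBase64String(cipherBytes);

// Decrypt: Base64 text -> AES ciphertext -> UTF-8 bytes -> String.
cipher.init(Cipher.DECRYPT_MODE, key);
byte[] plainBytes = cipher.doFinal(Base64.decodeBase64(cipherText));
String plainText = new String(plainBytes, "UTF-8"); // "some input, 中文, äöü" again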
You cannot use Base64 to turn arbitrary text into bytes; that's not what it's designed to do.
Instead, you should use UTF-8:
byte[] plainTextBytes = inputString.getBytes("UTF-8");
String output = new String(plainTextBytes, "UTF-8");
I have a text file and it can be ANSI (with ISO-8859-2 charset), UTF-8, UCS-2 Big or Little Endian.
Is there any way to detect the encoding of the file to read it properly?
Or is it possible to read a file without giving the encoding? (and it reads the file as it is)
(There are several programs that can detect and convert the encoding/format of text files.)
Yes, there are a number of libraries for character encoding detection, specifically in Java. Take a look at jchardet, which is based on the Mozilla algorithm. There are also cpdetector and a project by IBM called ICU4j. I'd take a look at the latter, as it seems to be more reliable than the other two. They work by statistical analysis of the binary file; ICU4j will also provide a confidence level for the character encoding it detects, so you can use this in the case above. It works pretty well.
UTF-8 and UCS-2/UTF-16 can be distinguished reasonably easily via a byte order mark at the start of the file. If this exists then it's a pretty good bet that the file is in that encoding - but it's not a dead certainty. You may well also find that the file is in one of those encodings, but doesn't have a byte order mark.
I don't know much about ISO-8859-2, but I wouldn't be surprised if almost every file is a valid text file in that encoding. The best you'll be able to do is check it heuristically. Indeed, the Wikipedia page talking about it would suggest that only byte 0x7f is invalid.
There's no way of reading a file "as it is" and getting text out: a file is a sequence of bytes, so you have to apply a character encoding in order to decode those bytes into characters.
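A minimal sketch of the BOM check described above; detectByBom is a made-up helper name and the fallback charset is up to the caller:
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Look at the first bytes of the file for a byte order mark.
static String detectByBom(Path path, String fallback) throws IOException {
    byte[] head = new byte[3];
    int n;
    try (InputStream in = Files.newInputStream(path)) {
        n = in.read(head);
    }
    if (n >= 3 && (head[0] & 0xFF) == 0xEF && (head[1] & 0xFF) == 0xBB && (head[2] & 0xFF) == 0xBF) {
        return "UTF-8";
    }
    if (n >= 2 && (head[0] & 0xFF) == 0xFE && (head[1] & 0xFF) == 0xFF) {
        return "UTF-16BE";
    }
    if (n >= 2 && (head[0] & 0xFF) == 0xFF && (head[1] & 0xFF) == 0xFE) {
        return "UTF-16LE"; // also matches UCS-2 Little Endian
    }
    return fallback; // no BOM: fall back to a heuristic or a default such as "ISO-8859-2"
}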
You can use ICU4J (http://icu-project.org/apiref/icu4j/)
Here is my code:
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import com.ibm.icu.text.CharsetDetector;
import com.ibm.icu.text.CharsetMatch;

String charset = "ISO-8859-1"; // default charset, put whatever you want

/*
 * Create a byte array large enough to hold the content of the file.
 * Use File.length() to determine the size of the file in bytes.
 */
byte[] fileContent = new byte[(int) file.length()];

/*
 * Read the content of the file into the byte array using
 * int read(byte[] byteArray) of java.io.FileInputStream.
 * (A single read() may return fewer bytes than requested for large files;
 * Files.readAllBytes(file.toPath()) is a simpler alternative.)
 */
try (FileInputStream fin = new FileInputStream(file)) {
    fin.read(fileContent);
} catch (IOException e) {
    e.printStackTrace();
}

CharsetDetector detector = new CharsetDetector();
detector.setText(fileContent);
CharsetMatch cm = detector.detect();
if (cm != null) {
    int confidence = cm.getConfidence();
    System.out.println("Encoding: " + cm.getName() + " - Confidence: " + confidence + "%");
    // Here you have the encoding name and the confidence.
    // In my case, if the confidence is > 50 I return the encoding, else I return the default value.
    if (confidence > 50) {
        charset = cm.getName();
    }
}
Remember to add any further try/catch handling you need.
I hope this works for you.
If your text file is a properly created Unicode text file, then the byte order mark (BOM) should tell you all the information you need. See here for more details about the BOM.
If it's not, then you'll have to use an encoding-detection library.