How to read binary file to binary characters java - java

I have a binary file that I need to read and save as characters or a string of 0's and 1's in the same order that they are in the binary file. I am currently able to read in the binary file, but am unable to obtain the 0's and 1's. Here is the code I am currently using:
public void read()
{
try
{
byte[] buffer = new byte[(int)infile.length()];
FileInputStream inputStream = new FileInputStream(infile);
int total = 0;
int nRead = 0;
while((nRead = inputStream.read(buffer)) != -1)
{
System.out.println(new String(buffer));
total += nRead;
}
inputStream.close();
System.out.println(total);
}
catch(FileNotFoundException ex)
{
System.out.println("File not found.");
}
catch(IOException ex)
{
System.out.println(ex);
}
}
and the output from running this with the binary file:
�, �¨Ã �¨ÊÃ
�!Cˇ¯åaÃ!Dˇ¸åÇÃ�"( ≠EÃ!J�H���û�������
����������������������������������������������������������������������������������������
156
Thanks for any help you can give.

Check out String to binary output in Java. Basically you need to take your String, convert it to a byte array, and print out each byte as a binary string.

Instead of converting the bytes directly into characters and then printing them, convert each byte into a binary string and print them out. In other words, replace
System.out.println(new String(buffer));
with
for (int i = 0; i<nRead; i++) {
String bin=Integer.toBinaryString(0xFF & buffer[i] | 0x100).substring(1);
System.out.println(bin);
}
Note, though, that this prints the bits of each byte most significant bit first. A file is read as a sequence of bytes, so the program has no way to observe how the bits within a byte are physically ordered on disk.
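As a minimal sketch of the same idea, assuming the whole file fits in memory, the bits can also be collected into one string instead of being printed byte by byte. The class name BitDump, the helper toBitString and the path "input.bin" are placeholders for illustration, not names from the question.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class BitDump {
    // Converts each byte to exactly eight '0'/'1' characters,
    // most significant bit first, in file order.
    static String toBitString(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 8);
        for (byte b : data) {
            for (int bit = 7; bit >= 0; bit--) {
                sb.append(((b >> bit) & 1) == 1 ? '1' : '0');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // "input.bin" is a hypothetical default path.
        Path path = Paths.get(args.length > 0 ? args[0] : "input.bin");
        byte[] data = Files.readAllBytes(path);
        System.out.println(toBitString(data));
    }
}
```

For a 156-byte file like the one in the question, this would produce a string of 1248 characters.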

With JBBP, such an operation is very easy:
public static final void main(final String ... args) throws Exception {
try (InputStream inStream = ClassLoader.getSystemClassLoader().getResourceAsStream("somefile.txt")) {
class Bits { @Bin(type = BinType.BIT_ARRAY) byte [] bits; }
for(final byte b : JBBPParser.prepare("bit [_] bits;",JBBPBitOrder.MSB0).parse(inStream).mapTo(Bits.class).bits)
System.out.print(b != 0 ? "1" : "0");
}
}
But it will not work with huge files, because the parsed data is cached in memory during the operation.

Even though this response is in C++, you can use JNI to call it natively from a Java program.
Since the data is in a binary format, you cannot read it as text. I would do it like this:
// Requires #include <fstream>
std::fstream fs;
int value; // Change the type to match the size of the values you are reading.
fs.open(fileName, std::ios::in | std::ios::binary);
// Test the read itself, so the loop stops as soon as a read fails or hits end of file.
while (fs.read(reinterpret_cast<char *>(&value), sizeof(value)))
{
// Print or do something with value
}

Related

IO Image reading and writing: Is writing array of bytes different from writing byte at a time using write(int b) method?

I am new to java IO and I tried to simply copy and paste a photo. I used two ways to achieve this the first works nicely but the second doesn't.
This Code works fine.
try (BufferedInputStream input = new BufferedInputStream(new FileInputStream("photoOriginal.jpg"));
BufferedOutputStream output =new BufferedOutputStream(new FileOutputStream("photoCopy.jpg"))) {
try {
int n =0;
byte[] buf = new byte[4092];
while((n = input.read(buf))!=-1){
output.write(buf, 0, n);
output.flush();
}
}
} catch (IOException e) {
System.out.println("Error: " + e.getMessage());
e.printStackTrace();
}
But the second doesn't work. After the program finishes, the copy has exactly the same size as the original, but trying to open it gives a "format not supported" error.
try (BufferedInputStream input = new BufferedInputStream(new FileInputStream("photoOriginal.jpg"));
BufferedOutputStream output =new BufferedOutputStream(new FileOutputStream("photoCopy.jpg"))) {
try {
int byteRead = input.read();
while (byteRead != -1) {
byteRead = input.read();
output.write(byteRead);
output.flush();
}
}
}
} catch (IOException e) {
System.out.println("Error: " + e.getMessage());
e.printStackTrace();
}
I don't understand where the problem is; the two samples seem to be doing the same thing.
Is reading into and writing from a byte array different from reading and writing a single byte at a time?
Doesn't write(int b) write only the lowest 8 bits of the int (and vice versa for read), as the documentation says?
write
public abstract void write(int b)
throws IOException
Writes the specified byte to this output stream. The general contract for write is that one byte is written to the output stream. The byte to be written is the eight low-order bits of the argument b. The 24 high-order bits of b are ignored.
hope someone will help.
You're not writing out the first byte - you call input.read(), check that it's not -1, but then call input.read() again:
// Broken code
int byteRead = input.read();
while (byteRead != -1) {
byteRead = input.read();
output.write(byteRead);
output.flush();
}
If you just move the next input.read() call to the end of the loop, it will work:
// Working code with duplication
int byteRead = input.read();
while (byteRead != -1) {
output.write(byteRead);
output.flush();
byteRead = input.read();
}
Or you could combine the "read and test" to avoid duplication:
// Working code without duplication
int byteRead;
while ((byteRead = input.read()) != -1) {
output.write(byteRead);
output.flush();
}
However, this is still a very inefficient way of copying a stream. Copying a chunk at a time, as per your first code, is much more efficient (or using the built-in transferTo method if you're using Java 9 or higher, as rostamn79 notes).
Baeldung.com provides information on the InputStream.transferTo() method, which does not require you to manage a copy buffer yourself:
https://www.baeldung.com/java-inputstream-to-outputstream
Example code
@Test
public void givenUsingJavaNine_whenCopyingInputStreamToOutputStream_thenCorrect() throws IOException {
String initialString = "Hello World!";
try (InputStream inputStream = new ByteArrayInputStream(initialString.getBytes());
ByteArrayOutputStream targetStream = new ByteArrayOutputStream()) {
inputStream.transferTo(targetStream);
assertEquals(initialString, new String(targetStream.toByteArray()));
}
}
Note that transferTo is called on the input stream, with the output stream as its argument.
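Applied to the photo copy from the question, a sketch might look like the following. It assumes Java 9 or later; the class name TransferToCopy and the helper copy are made up for illustration, and the helper wraps IOException in UncheckedIOException only to keep the call site short.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;

class TransferToCopy {
    // Copies everything left in `in` to `out` and returns the number of bytes copied.
    static long copy(InputStream in, OutputStream out) {
        try {
            return in.transferTo(out); // InputStream.transferTo exists since Java 9
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // File names taken from the question; adjust as needed.
        try (InputStream in = Files.newInputStream(Paths.get("photoOriginal.jpg"));
             OutputStream out = Files.newOutputStream(Paths.get("photoCopy.jpg"))) {
            System.out.println("Copied " + copy(in, out) + " bytes");
        }
    }
}
```

Because every byte that is read is also written, the "skipped first byte, extra -1 at the end" mistake from the single-byte loop cannot occur here.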

How to convert a character into an integer in Java?

I am a beginner at Java, trying to figure out how to convert characters from a text file into integers. In the process, I wrote a program which generates a text file showing what characters are generated by what integers.
package numberchars;
import java.io.FileWriter;
import java.io.IOException;
import java.io.FileReader;
import java.lang.Character;
public class Numberchars {
public static void main(String[] args) throws IOException {
FileWriter outputStream = new FileWriter("NumberChars.txt");
//Write to the output file the char corresponding to the decimal
// from 1 to 255
int counter = 1;
while (counter <256)
{
outputStream.write(counter);
outputStream.flush();
counter++;
}
outputStream.close();
This generated NumberChars.txt, which had all the numbers, all the letters both upper and lower case, surrounded at each end by other symbols and glyphs.
Then I tried to read this file and convert its characters back into integers:
FileReader inputStream = new FileReader("NumberChars.txt");
FileWriter outputStream2 = new FileWriter ("CharNumbers.txt");
int c;
while ((c = inputStream.read()) != -1)
{
outputStream2.write(Character.getNumericValue(c));
outputStream2.flush();
}
}
}
The resulting file, CharNumbers.txt, began with the same glyphs as NumberChars.txt but then was blank. Opening the files in MS Word, I found NumberChars had 248 characters (including 5 spaces) and CharNumbers had 173 (including 8 spaces).
So why didn't the Character.getNumericValue(c) result in an integer written to CharNumbers.txt? And given that it didn't, why at least didn't it write an exact copy of NumberChars.txt? Any help much appreciated.
Character.getNumericValue doesn't do what you think it does. If you read the Javadoc:
Returns the int value that the specified character (Unicode code point) represents. For example, the character '\u216C' (the Roman numeral fifty) will return an int with a value of 50.
On error it returns -1 (which looks like 0xFF_FF_FF_FF in 2s complement).
Most characters don't have such a "numeric value," so you write the ints out, each padded to 2 bytes (more on that later), read them back in the same way, and then start writing a whole lot of 0xFFFF (-1 truncated to 2 bytes) courtesy of a misplaced Character.getNumericValue. I'm not sure what MS Word is doing, but it's probably getting confused what the encoding of your file is and glomming all those bytes into 0xFF_FF_FF_FF (because the high bits of each byte are set) and treating that as one character. (Use a text editor more suited to this kind of stuff like Notepad++, btw.) If you were to measure your file's size on disk in bytes it will probably still be 256 chars * 2 bytes/chars = 512 bytes.
I'm not sure what you meant to do here, so I'll note that InputStreamReader and OutputStreamWriter work on a (Unicode) character basis, with an encoder that defaults to the system one. That's why your ints are padded/truncated to 2 bytes. If you wanted pure byte IO, use FileInputStream/FileOutputStream. If you wanted to read and write the ints as Strings, you need to use FileWriter/FileReader, but not like you did.
// Just bytes
// This is a try-with-resources. It executes the code with the decls in it
// but is also like an implicit finally block that calls `close()` on each resource.
try(FileOutputStream fos = new FileOutputStream("bytes.bin")) {
for(int b = 0; b < 256; b++) { // Bytes are signed so we use int.
// This takes an int and truncates it for the lowest byte
fos.write(b);
// Can also fill a byte[] and dump it all at once with overloaded write.
}
}
byte[] bytes = new byte[256];
try(FileInputStream fis = new FileInputStream("bytes.bin")) {
// Reads up to bytes.length bytes into bytes
fis.read(bytes);
}
// Foreach loop. If you don't know what this does, I think you can figure out from the name.
for(byte b : bytes) {
System.out.println(b);
}
// As Strings
try(FileWriter fw = new FileWriter("strings.txt")) {
for(int i = 0; i < 256; i++) {
// You need a delimiter lest you not be able to tell 12 from 1,2 when you read
// Uses system default encoding
fw.write(Integer.toString(i) + "\n");
}
}
byte[] bytes = new byte[256];
try(
FileReader fr = new FileReader("strings.txt");
// FileReaders can't do stuff like "read one line to String" so we wrap it
BufferedReader br = new BufferedReader(fr);
) {
for(int i = 0; i < 256; i++) {
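// Parse as int and narrow: Byte.valueOf would throw NumberFormatException for 128..255.
bytes[i] = (byte) Integer.parseInt(br.readLine());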
}
}
for(byte b : bytes) {
System.out.println(b);
}
public class MyClass {
public static void main(String[] args)
{
char x = 'b';
// A unary plus promotes the char to int, so this prints its character code: 98.
System.out.println(+x);
}
}
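To make the difference between a character's code and its "numeric value" concrete, here is a small sketch. The class name CharToInt and its two helpers are hypothetical; the Character methods are standard JDK calls.

```java
class CharToInt {
    // Character code (UTF-16 code unit) of c, e.g. 'b' -> 98.
    static int codeOf(char c) {
        return c; // implicit widening from char to int
    }

    // Decimal digit value of c, e.g. '7' -> 7; -1 if c is not a decimal digit.
    static int digitOf(char c) {
        return Character.digit(c, 10);
    }

    public static void main(String[] args) {
        System.out.println(codeOf('b'));                    // prints 98
        System.out.println(digitOf('7'));                   // prints 7
        // getNumericValue treats the Latin letters a-z as 10..35,
        // and returns -1 for characters with no numeric value.
        System.out.println(Character.getNumericValue('b')); // prints 11
        System.out.println(Character.getNumericValue('!')); // prints -1
    }
}
```

This is why passing every character through Character.getNumericValue produced mostly -1 values rather than a copy of the input.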

Why do i have to typecast to byte when reading from a fileinput stream? [duplicate]

I'm currently working my way through an online Java lesson about reading and writing using input streams. Specifically, the lesson demonstrates using a shift entered by the user to encrypt an image file, and then decrypting the same image file using a negative shift.
However, in the code provided, I do not understand what the critical line actually does, and why. From what I can make of it, it reads a byte from the FileInputStream, typecasts it to a byte, and then adds the shift to it before writing it out via the FileOutputStream. However, since I am already reading a byte from the FileInputStream, why do I have to typecast it to a byte again?
I would really appreciate anyone shedding some light on this.
Thanks!
import java.io.*;
import java.util.Scanner;
public class ReadingAndWritingStreamsNonText {
public static String imgFilePath = "C:\\JavaProjects\\BinaryStreams\\src\\MIM_BINARY_MEME.jpg";
public static String imgFilePath2 = "C:\\JavaProjects\\BinaryStreams\\src\\data.bin";
public static String imgFilePath3 = "C:\\JavaProjects\\BinaryStreams\\src\\MIM_BINARY_MEME_Decrypted.jpg";
public static void main(String[] args) {
Scanner input = new Scanner(System.in);
System.out.println("Please enter a shift to encrypt/decrypt the file:");
int shift = Integer.parseInt(input.nextLine());
try {
FileInputStream fis = null;
FileOutputStream fos = null;
PrintStream ps = null;
if (shift > 0) {
fis = new FileInputStream(imgFilePath);
fos = new FileOutputStream(imgFilePath2);
ps = new PrintStream(fos);
}
else {
fis = new FileInputStream(imgFilePath2);
fos = new FileOutputStream(imgFilePath3);
ps = new PrintStream(fos);
}
boolean done = false;
while (!done) {
//read in the file
int next = fis.read();
if (next == -1) {
done = true;
}
else {
//encrypt or decrypt based on shift
ps.write(((byte) next) + shift); // <--- this line
}
}
ps.close();
ps = null;
fos.close();
fos = null;
fis.close();
fis = null;
}
catch (IOException ioex) {
ioex.printStackTrace();
}
System.out.println("Operation Completed");
}
}
Because InputStream.read() returns an int instead of a byte.
Note that this method will return -1 when the end of the stream is reached, and a value in the range 0 to 255 if a byte was read, as the API documentation says:
Reads the next byte of data from the input stream. The value byte is returned as an int in the range 0 to 255. If no byte is available because the end of the stream has been reached, the value -1 is returned. This method blocks until input data is available, the end of the stream is detected, or an exception is thrown.
An int needs casting to be converted to a byte because an int is 32 bits, while a byte is only 8 bits. You can't do a narrowing conversion (which throws away the upper 24 bits of the int) without a cast.
it reads a byte from the fileinputstream and typecasts it to a byte, and then adds the shift to it before writing it out via the file outputstream. However, since i am already reading a byte from the fileinputstream, why do i have to typecast it to a byte again?
Because read() returns an int, which might be -1, indicating end of stream. If it isn't -1, it is a value in the range 0..255, which you have to typecast to byte to get the byte, in the range -128..127. If this process wasn't followed it wouldn't be possible to indicate end of stream via the return value.
FileInputStream.read() returns an int, not byte, because of this: Why does InputStream#read() return an int and not a byte?
But you want to shift exactly 8 bits of data, and int is larger (32 bits). So you need to cast it to byte.
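A small sketch may help show the two ranges involved. It uses in-memory streams instead of files so that it runs anywhere; the class name ShiftCipher and the method shift are invented for this illustration and are not the lesson's code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

class ShiftCipher {
    // Adds `shift` to every byte of `data`, wrapping around modulo 256.
    static byte[] shift(byte[] data, int shift) {
        ByteArrayInputStream in = new ByteArrayInputStream(data);
        ByteArrayOutputStream out = new ByteArrayOutputStream(data.length);
        int next;
        // read() yields 0..255 for data and -1 for end of stream,
        // which is why its return type is int rather than byte.
        while ((next = in.read()) != -1) {
            // write(int) keeps only the low 8 bits, so the sum wraps by itself.
            out.write(next + shift);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        int next = 200;               // a value as returned by read()
        byte b = (byte) next;         // narrowing keeps the low 8 bits
        System.out.println(b);        // prints -56
        System.out.println(b & 0xFF); // prints 200
    }
}
```

Since write(int) discards the upper 24 bits, ((byte) next) + shift and next + shift differ only by a multiple of 256 and therefore write the same byte. The cast matters when the value is stored in a byte variable or a byte[].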

Failure encoding files in base64 java

I have this class to encode and decode a file. When I run it with .txt files, it works. But when I run it with .jpg or .doc files, I cannot open the resulting file, or it is not equal to the original. I don't know why this is happening. I modified this class:
http://myjeeva.com/convert-image-to-string-and-string-to-image-in-java.html. But I want to change this line
byte imageData[] = new byte[(int) file.length()];
to
byte example[] = new byte[1024];
and read the file as many times as needed. Thanks.
import java.io.*;
import java.util.*;
public class Encode {
// input = input file path, output = output file path, imageDataString = encoded string
String input;
String output;
String imageDataString;
public void setFileInput(String input){
this.input=input;
}
public void setFileOutput(String output){
this.output=output;
}
public String getFileInput(){
return input;
}
public String getFileOutput(){
return output;
}
public String getEncodeString(){
return imageDataString;
}
public String processCode(){
StringBuilder sb= new StringBuilder();
try{
File fileInput= new File( getFileInput() );
FileInputStream imageInFile = new FileInputStream(fileInput);
// I have seen examples where people create a byte[] with the same length as the file. I don't want that, because I will not know how long the file will be.
byte buff[] = new byte[1024];
int r = 0;
while ( ( r = imageInFile.read( buff)) > 0 ) {
String imageData = encodeImage(buff);
sb.append( imageData);
if ( imageInFile.available() <= 0 ) {
break;
}
}
} catch (FileNotFoundException e) {
System.out.println("File not found" + e);
} catch (IOException ioe) {
System.out.println("Exception while reading the file " + ioe);
}
imageDataString = sb.toString();
return imageDataString;
}
public void processDecode(String str) throws IOException{
byte[] imageByteArray = decodeImage(str);
File fileOutput= new File( getFileOutput());
FileOutputStream imageOutFile = new FileOutputStream( fileOutput);
imageOutFile.write(imageByteArray);
imageOutFile.close();
}
public static String encodeImage(byte[] imageByteArray) {
return Base64.getEncoder().withoutPadding().encodeToString( imageByteArray);
}
public static byte[] decodeImage(String imageDataString) {
return Base64.getDecoder().decode( imageDataString);
}
public static void main(String[] args) throws IOException {
Encode a = new Encode();
a.setFileInput( "C://Users//xxx//Desktop//original.doc");
a.setFileOutput("C://Users//xxx//Desktop//original-copied.doc");
a.processCode( );
a.processDecode( a.getEncodeString());
System.out.println("C O P I E D");
}
}
I tried changing
String imageData = encodeImage(buff);
to
String imageData = encodeImage(buff,r);
and the method encodeImage
public static String encodeImage(byte[] imageByteArray, int r) {
byte[] aux = new byte[r];
for ( int i = 0; i < aux.length; i++) {
aux[i] = imageByteArray[i];
if ( aux[i] <= 0 ) {
break;
}
}
return Base64.getDecoder().decode( aux);
}
But i have the error:
Exception in thread "main" java.lang.IllegalArgumentException: Last unit does not have enough valid bits
You have two problems in your program.
The first, as mentioned by @Joop Eggen, is that you are not handling your input correctly.
In fact, Java does not promise you that even in the middle of the file, you'll be reading the entire 1024 bytes. It could just read 50 bytes, and tell you it read 50 bytes, and then the next time it will read 50 bytes more.
Suppose you read 1024 bytes in the previous round. And now, in the current round, you're only reading 50. Your byte array now contains 50 of the new bytes, and the rest are the old bytes from the previous read!
So you always need to copy exactly the number of bytes that were read into a new array, and pass that on to your encoding function.
So, to fix this particular problem, you'll need to do something like:
while ( ( r = imageInFile.read( buff)) > 0 ) {
byte[] realBuff = Arrays.copyOf( buff, r );
String imageData = encodeImage(realBuff);
...
}
However, this is not the only problem here. Your real problem is with the Base64 encoding itself.
What Base64 does is take your bytes, break them into 6-bit chunks, and then treat each of those chunks as a number N between 0 and 63. Then it takes the Nth character from its character table to represent that chunk.
But this means it can't just encode a single byte or two bytes. A byte contains 8 bits, which means one chunk of 6 bits and 2 leftover bits. Two bytes have 16 bits. That's 2 chunks of 6 bits and 4 leftover bits.
To solve this problem, Base64 always encodes 3 consecutive bytes. If the input does not divide evenly by three, it adds additional zero bits.
Here is a little program that demonstrates the problem:
package testing;
import java.util.Base64;
public class SimpleTest {
public static void main(String[] args) {
// An array containing six bytes to encode and decode.
byte[] fullArray = { 0b01010101, (byte) 0b11110000, (byte)0b10101010, 0b00001111, (byte)0b11001100, 0b00110011 };
// The same array broken into three chunks of two bytes.
byte[][] threeTwoByteArrays = {
{ 0b01010101, (byte) 0b11110000 },
{ (byte)0b10101010, 0b00001111 },
{ (byte)0b11001100, 0b00110011 }
};
Base64.Encoder encoder = Base64.getEncoder().withoutPadding();
// Encode the full array
String encodedFullArray = encoder.encodeToString(fullArray);
// Encode the three chunks consecutively
StringBuilder encodedStringBuilder = new StringBuilder();
for ( byte [] twoByteArray : threeTwoByteArrays ) {
encodedStringBuilder.append(encoder.encodeToString(twoByteArray));
}
String encodedInChunks = encodedStringBuilder.toString();
System.out.println("Encoded full array: " + encodedFullArray);
System.out.println("Encoded in chunks of two bytes: " + encodedInChunks);
// Now decode the two resulting strings
Base64.Decoder decoder = Base64.getDecoder();
byte[] decodedFromFull = decoder.decode(encodedFullArray);
System.out.println("Byte array decoded from full: " + byteArrayBinaryString(decodedFromFull));
byte[] decodedFromChunked = decoder.decode(encodedInChunks);
System.out.println("Byte array decoded from chunks: " + byteArrayBinaryString(decodedFromChunked));
}
/**
* Convert a byte array to a string representation in binary
*/
public static String byteArrayBinaryString( byte[] bytes ) {
StringBuilder sb = new StringBuilder();
sb.append('[');
for ( byte b : bytes ) {
sb.append(Integer.toBinaryString(Byte.toUnsignedInt(b))).append(',');
}
if ( sb.length() > 1) {
sb.setCharAt(sb.length() - 1, ']');
} else {
sb.append(']');
}
return sb.toString();
}
}
So, imagine my 6-byte array is your image file. And imagine that your buffer is not reading 1024 bytes but 2 bytes each time. This is going to be the output of the encoding:
Encoded full array: VfCqD8wz
Encoded in chunks of two bytes: VfAqg8zDM
As you can see, the encoding of the full array gave us 8 characters. Each group of three bytes is converted into four chunks of 6 bits, which in turn are converted into four characters.
But the encoding of the three two-byte arrays gave you a string of 9 characters. It's a completely different string! Each group of two bytes was extended to three chunks of 6 bits by padding with zeros. And since you asked for no padding, it produces only 3 characters, without the extra = that usually marks when the number of bytes is not divisible by 3.
The output from the part of the program that decodes the 8-character, correct encoded string is fine:
Byte array decoded from full: [1010101,11110000,10101010,1111,11001100,110011]
But the result from attempting to decode the 9-character, incorrect encoded string is:
Exception in thread "main" java.lang.IllegalArgumentException: Last unit does not have enough valid bits
at java.util.Base64$Decoder.decode0(Base64.java:734)
at java.util.Base64$Decoder.decode(Base64.java:526)
at java.util.Base64$Decoder.decode(Base64.java:549)
at testing.SimpleTest.main(SimpleTest.java:34)
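Not good! A padded Base64 string always has a length that is a multiple of 4. Without padding, the length can leave a remainder of 2 or 3 when divided by 4, but never 1, and our 9 characters leave a remainder of 1.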
Since you chose a buffer size of 1024, which is not a multiple of 3, that problem will happen. You need to encode a multiple of 3 bytes each time to produce the proper string. So in fact, you need to create a buffer sized 3072 or something like that.
But because of the first problem, be very careful about what you pass to the encoder. A read can always return fewer than 3072 bytes, and if that number is not divisible by three, the same problem will occur.
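One way to combine both fixes is sketched below. It assumes Java 9 or later for InputStream.readNBytes, which keeps reading until the buffer is full or the stream ends, so only the last chunk can be short. The class name ChunkedBase64 and the method encode are hypothetical, and the default padded encoder is used instead of withoutPadding().

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.Base64;

class ChunkedBase64 {
    // 3072 is a multiple of 3, so every full chunk encodes with no leftover bits.
    private static final int CHUNK = 3072;

    // Encodes the whole stream chunk by chunk and returns one Base64 string.
    static String encode(InputStream in) {
        Base64.Encoder encoder = Base64.getEncoder();
        StringBuilder sb = new StringBuilder();
        byte[] buf = new byte[CHUNK];
        try {
            int n;
            // readNBytes returns 0 at end of stream; only the final chunk
            // can contain fewer than CHUNK bytes.
            while ((n = in.readNBytes(buf, 0, buf.length)) > 0) {
                // Encode only the n bytes that were actually read.
                sb.append(encoder.encodeToString(Arrays.copyOf(buf, n)));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The same six bytes as in the demonstration program above.
        byte[] six = {0x55, (byte) 0xF0, (byte) 0xAA, 0x0F, (byte) 0xCC, 0x33};
        System.out.println(encode(new ByteArrayInputStream(six))); // prints VfCqD8wz
    }
}
```

Because only the final chunk can have a length that is not a multiple of 3, the concatenated output is the same string that encoding the whole file at once would give.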
Look at:
while ( ( r = imageInFile.read( buff)) > 0 ) {
String imageData = encodeImage(buff);
read returns -1 on end-of-file or the actual number of bytes that were read.
So the last buff might not be totally read, and even contain garbage from any prior read. So you need to use r.
As this is an assignment, the rest is up to you.
By the way:
byte[] array = new byte[1024]
is more conventional in Java. The syntax:
byte array[] = ...
was for compatibility with C/C++.

BufferedOutputStream writes zero bytes when merging files

I am trying to merge n pieces of a file into a single file, but my function behaves strangely. The function is called x times in n seconds. Say I have 100 files to merge: in the first second I take 5 files and merge them, and in the next second the number doubles to 10, where files 1-5 are the same as before and the rest are new. It works normally, but at some point the output is zero bytes, while at other times it has the right size.
Could you help me spot the mistake in my function below?
public void mergeFile(List<String> fileList, int x) {
int count = 0;
BufferedOutputStream out = null;
try {
out = new BufferedOutputStream(new FileOutputStream("Test.doc"));
for (String file : fileList) {
InputStream in = new BufferedInputStream(new FileInputStream(file));
byte[] buff = new byte[1024];
in.read(buff);
out.write(buff);
in.close();
count++;
if (count == x) {
break;
}
}
out.flush();
out.close();
} catch (IOException e) {
e.printStackTrace();
}
}
*sorry for my English
in.read(buff);
Check the Javadoc. That method isn't guaranteed to fill the buffer. It returns a value which tells you how many bytes it read. You're supposed to use that, and in this situation you are supposed to use it when deciding how many bytes, if any, to write.
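A read loop that respects the return value could be sketched like this. The class name MergeStreams, the helper append and the command-line handling are assumptions made for the example; they are not part of the question's code.

```java
import java.io.BufferedOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

class MergeStreams {
    // Appends everything from `in` to `out` and returns the number of bytes copied.
    static long append(InputStream in, OutputStream out) {
        byte[] buff = new byte[1024];
        long total = 0;
        try {
            int n;
            // read() may fill only part of the buffer; n says how much is valid.
            while ((n = in.read(buff)) != -1) {
                out.write(buff, 0, n); // write only the bytes just read
                total += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: MergeStreams <output> <input>...");
            return;
        }
        try (OutputStream out = new BufferedOutputStream(new FileOutputStream(args[0]))) {
            for (int i = 1; i < args.length; i++) {
                try (InputStream in = new FileInputStream(args[i])) {
                    append(in, out);
                }
            }
        }
    }
}
```

Unlike the single in.read(buff) followed by out.write(buff), this copies each input file completely and never writes stale or unfilled parts of the buffer.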
You do not read the full file; you read at most 1024 bytes from each file. You need to loop the read as long as it returns data (or use something like Files.copy()).
BTW: you don't need a BufferedOutputStream if you copy with large buffers.
public void mergeFile(List<String> fileList, int x) throws IOException {
try (OutputStream out = new FileOutputStream("Test.doc");) {
int count=0;
for (String file : fileList) {
Files.copy(new File(file).toPath(), out);
count++;
if (count == x) {
break;
}
}
}
}
You also do not need to flush() if you close. I am using try-with-resources here, so I don't need to close the stream explicitly. It is best to propagate the exceptions.
