I'm writing a binary file header from Java, and I had been using fixed values for the file size in the header. That was easy:
OutputStream os = new FileOutputStream(filename);
os.write(0x36);//LSB
os.write(0x10);
os.write(0x0E);
os.write(0x00);//MSB
But now I want to be more dynamic and write whatever size buffer I have to a file. So I might get the size of my array as say 4054; I want to take that and either break it apart and do four os.writes, or maybe there's a way to write it all at once.
OutputStream seems to only take one byte at a time, but I'd like to still use it as all the rest of my header code is already using it.
Use a ByteBuffer, so you can control whether it writes LSB or MSB first.
ByteBuffer buf = ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN);
buf.putInt(value); // an int is 4 bytes; putLong needs 8 and would overflow this 4-byte buffer
os.write(buf.array());
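A minimal sketch of the approach above, assuming the size fits in a 32-bit int. The class and method names (LittleEndianHeader, writeIntLE, writeIntLEShift) are made up for illustration. The second method produces the same four bytes with plain shifts, which fits header code that already calls os.write one byte at a time.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class LittleEndianHeader {
    // Writes a 32-bit value least significant byte first, using a ByteBuffer.
    static void writeIntLE(OutputStream os, int value) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN);
        buf.putInt(value);
        os.write(buf.array());
    }

    // Same bytes with shifts; OutputStream.write(int) keeps only the low 8 bits.
    static void writeIntLEShift(OutputStream os, int value) throws IOException {
        os.write(value);
        os.write(value >>> 8);
        os.write(value >>> 16);
        os.write(value >>> 24);
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream os = new ByteArrayOutputStream();
        writeIntLE(os, 4054); // 4054 == 0x00000FD6, so the bytes are D6 0F 00 00
        for (byte b : os.toByteArray()) {
            System.out.printf("%02X ", b);
        }
        System.out.println();
    }
}
```

Switching to ByteOrder.BIG_ENDIAN (the ByteBuffer default) would write the MSB first instead.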
What are some practical areas where ByteArrayInputStream and/or ByteArrayOutputStream are used? Examples are also welcome.
If one searches for examples, one finds usually something like:
byte[] buf = { 16, 47, 12 };
ByteArrayInputStream byt = new ByteArrayInputStream(buf);
That does not explain where or why one should use it. I know that they are used when working with images, ZIP files, or writing to ServletOutputStream.
ByteArrayInputStream: every time you need an InputStream (typically because an API takes that as argument), and you have all the data in memory already, as a byte array (or anything that can be converted to a byte array).
ByteArrayOutputStream: every time you need an OutputStream (typically because an API writes its output to an OutputStream) and you want to store the output in memory, and not in a file or on the network.
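As a hedged illustration of both cases: GZIPOutputStream only writes to an OutputStream and GZIPInputStream only reads from an InputStream, so the byte-array streams let you use them entirely in memory. The class and method names (InMemoryStreams, gzip, gunzip) are invented for this sketch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class InMemoryStreams {
    // The API wants an OutputStream; ByteArrayOutputStream collects its output in memory.
    static byte[] gzip(byte[] data) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(sink)) {
            gz.write(data);
        }
        return sink.toByteArray();
    }

    // The API wants an InputStream; ByteArrayInputStream serves bytes we already hold.
    static byte[] gunzip(byte[] compressed) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                sink.write(buf, 0, n);
            }
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] original = "hello and good evening".getBytes(StandardCharsets.UTF_8);
        byte[] restored = gunzip(gzip(original));
        System.out.println(new String(restored, StandardCharsets.UTF_8));
    }
}
```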
I'm working on a string compressor for a school assignment, and there's one bug that I can't seem to work out. The compressed data, represented by a byte array, is being written to a file using a FileWriter. The compression algorithm returns an input stream, so the data flows as follows:
piped input stream
-> input stream reader
-> data stored in char buffer
-> data written to file with file writer.
Now, the bug is that with some very specific strings, the second-to-last byte in the byte array is written wrong, and it always ends up with the same bit values, "11111100".
Every time it's these bit values, and always the second-to-last byte.
Here are some samples from the code:
InputStream compress(InputStream){
//...
//...
PipedInputStream pin = new PipedInputStream();
PipedOutputStream pout = new PipedOutputStream(pin);
ObjectOutputStream oos = new ObjectOutputStream(pout);
oos.writeObject(someobject);
oos.flush();
DataOutputStream dos = new DataOutputStream(pout);
dos.writeFloat(//);
dos.writeShort(//);
dos.write(SomeBytes); // ---Here
dos.flush();
dos.close();
return pin;
}
void write(char[] cbuf, int off, int len){
//....
//....
InputStreamReader s = new InputStreamReader(
c.compress(new ByteArrayInputStream(str.getBytes())));
s.read(charbuffer);
out.write(charbuffer);
}
A string which triggers it is "hello and good evenin" for example.
I have tried to iterate over the byte array and write them one by one, it didn't help.
It's also worth noting that when I tried to write to a file using the output stream in the algorithm itself it worked fine. This design was not my choice btw.
So I'm not really sure what I'm doing wrong here.
Considering that you're saying:
Now, the bug is, that with some very specific strings, the second to
last byte in the byte array is written wrong. and it's always the same
bit values "11111100".
You are taking a
binary stream (the compressed data)
-> reading it as chars
-> then writing it as chars.
And you are converting bytes to chars without clearly defining the encoding.
I'd say that the problem is that your InputStreamReader is translating some byte sequences in a way that you're not expecting.
Remember that in encodings like UTF-8, two or three bytes may become one single char.
It can't be a coincidence that the very byte pattern you pointed out (11111100) is one of the UTF-8 lead-byte patterns (1111110x). Check the UTF-8 table on Wikipedia and you'll see that UTF-8 is destructive for arbitrary binary data, since if a byte starts with 1111110x, the next must start with 10xxxxxx.
Meaning that if using utf-8 to convert
bytes1[] -> chars[] -> bytes2[]
in some cases bytes2 will be different from bytes1.
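I recommend changing your code to remove those readers, or specifying a single-byte encoding explicitly to see if that prevents the translation. Note that ASCII still replaces bytes above 0x7F when decoding, whereas ISO-8859-1 maps every byte to exactly one char.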
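The bytes1[] -> chars[] -> bytes2[] effect can be reproduced in a few lines. This is only a sketch: the class name CharsetRoundTrip and the sample bytes are made up, and it uses String conversion rather than the reader and writer from the question, which apply the same charset decoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class CharsetRoundTrip {
    // Decodes bytes to chars and encodes them back, like a Reader followed by a Writer.
    static byte[] roundTrip(byte[] data, Charset charset) {
        return new String(data, charset).getBytes(charset);
    }

    public static void main(String[] args) {
        // 0xFC is the bit pattern 11111100 from the question.
        byte[] data = {0x01, (byte) 0xFC, 0x02};
        for (Charset cs : new Charset[] {
                StandardCharsets.UTF_8, StandardCharsets.US_ASCII, StandardCharsets.ISO_8859_1}) {
            boolean intact = Arrays.equals(data, roundTrip(data, cs));
            System.out.println(cs + " preserves the bytes: " + intact);
        }
    }
}
```

Only the ISO-8859-1 round trip returns the original bytes here; the other two substitute a replacement for the byte they cannot decode.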
I solved this by encoding and decoding the bytes with Base64.
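A rough sketch of what that Base64 fix could look like, using java.util.Base64 (Java 8+). The class and method names (Base64Transport, toText, fromText) are hypothetical; the original code was not posted.

```java
import java.util.Arrays;
import java.util.Base64;

final class Base64Transport {
    // Turns arbitrary bytes into plain ASCII text that survives any Reader/Writer.
    static String toText(byte[] binary) {
        return Base64.getEncoder().encodeToString(binary);
    }

    // Recovers the exact original bytes from the Base64 text.
    static byte[] fromText(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        byte[] compressed = {0x01, (byte) 0xFC, 0x00, 0x7F};
        String text = toText(compressed);
        System.out.println(text + " restores the bytes: "
                + Arrays.equals(compressed, fromText(text)));
    }
}
```

The trade-off is size: Base64 output is about a third larger than the binary input, which works against a compressor.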
The question may be quite vague, so let me expand on it here.
I'm developing an application in which I'll be reading data from a file. I have a FileReader class which opens the file in the following fashion:
currentFileStream = new FileInputStream(currentFile);
fileChannel = currentFileStream.getChannel();
Data is read as follows:
bytesRead = fileChannel.read(buffer); // Data is buffered using a ByteBuffer
I'm processing the data in one of two forms: binary or character.
If it's processed as character, I do an additional step of decoding this ByteBuffer into a CharBuffer:
CoderResult result = decoder.decode(byteBuffer, charBuffer, false);
Now my problem is that I need to resume reading from some offset during recovery, in case of a failure or crash in the application.
For this, I maintain a byteOffset, which keeps track of the number of bytes processed in binary mode, and I persist this variable.
If something happens, I reposition the file like this:
fileChannel.position(byteOffset);
which is pretty straightforward.
But if the processing mode is character, I maintain a recordOffset, which keeps track of the character position/offset in the file. During recovery I call read() internally until I reach the character offset equal to the persisted recordOffset+1.
Is there any way to get the number of bytes that were needed to decode the characters? For instance, if recordOffset is 400, its corresponding byteOffset might be 410 or 480 (depending on the charset). Then while repositioning I could do this:
fileChannel.position(recordOffset); //recordOffset equivalent value in number of bytes
instead of making repeated calls internally in my application.
The other approach I could think of was using InputStreamReader's skip method.
If there is a better approach for this, or if it is possible to get a byte-to-character mapping, please let me know.
I think I'm missing something very simple. I have a byte array holding deflated data written into it using a Deflater:
deflate(outData, 0, BLOCK_SIZE, SYNC_FLUSH)
The reason I didn't just use GZIPOutputStream is that there were 4 threads (the number is variable), each given a block of data, and each thread compressed its own block before storing the compressed data into a global byte array. If I used GZIPOutputStream, it would mess up the format, because each little block would have a header and trailer and be its own gzip stream (I only want to compress it).
So in the end, I've got this byte array, outData, that's holding all of my compressed data, but I'm not really sure how to wrap it. GZIPOutputStream writes from a buffer with uncompressed data, but this array is all set. It's already compressed, and I'm just hitting a wall trying to figure out how to get it into a form gzip can read.
EDIT: Ok, bad wording on my part. I'm writing it to output, not a file, so that it could be redirected if needed. A really simple example is that
cat file.txt | java Jzip | gzip -d | cmp file.txt
should return 0. The problem right now is if I write this byte array as is to output, it's just "raw" compressed data. I think gzip needs all this extra information.
If there's an alternative method, that would be fine too. The whole reason it's like this is that I needed to use multiple threads. Otherwise I would just call GZIPOutputStream.
DOUBLE EDIT: Since the comments provide a lot of good insight, here is another method. I have a bunch of uncompressed blocks of data that were originally one long stream. If gzip can read concatenated streams, I could take those blocks (keeping them in order), give each one to a thread that calls GZIPOutputStream on its own block, and then concatenate the results. In essence, each block would then have a header, the compressed data, and a trailer. Would gzip recognize that if I concatenated them?
Example:
cat file.txt
Hello world! How are you? I'm ready to set fire to this assignment.
java Testcase < file.txt > file.txt.gz
So I accept it from input. Inside the program, the stream is split up into
"Hello world!" "How are you?" "I'm ready to set fire to this assignment" (they're not strings, it's just an array of bytes! this is just illustration)
So I've got these three blocks of bytes, all uncompressed. I give each of these blocks to a thread, which uses
public static class DGZIPOutputStream extends GZIPOutputStream
{
public DGZIPOutputStream(OutputStream out, boolean flush) throws IOException
{
super(out, flush);
}
public void setDictionary(byte[] b)
{
def.setDictionary(b);
}
public void updateCRC(byte[] input)
{
crc.update(input);
}
}
As you can see, the only thing here is that I've set the flush to SYNC_FLUSH so I can get the alignment right and have the ability to set the dictionary. If each thread were to use DGZIPOutputStream (which I've tested and it works for one long continuous input), and I concatenated those three blocks (now compressed each with a header and trailer), would gzip -d file.txt.gz work?
If that's too weird, ignore the dictionary completely. It doesn't really matter. I just added it in while I was at it.
If you set nowrap true when using the Deflater (sic) constructor, then the result is raw deflate. Otherwise it's zlib, and you would have to strip the zlib header and trailer. For the rest of the answer, I am assuming nowrap is true.
To wrap a complete, terminated deflate stream to be a gzip stream, you need to prepend ten bytes:
"\x1f\x8b\x08\0\0\0\0\0\0\xff"
(sorry -- C format, you'll need to convert to Java octal). You need to also append the four byte CRC in little endian order, followed by the four-byte total uncompressed length modulo 2^32, also in little endian order. Given what is available in the standard Java API, you'll need to compute the CRC in serial. It can't be done in parallel. zlib does have a function to combine separate CRCs that are computed in parallel, but that is not exposed in Java.
Note that I said a complete, terminated deflate stream. It takes some care to make one of those with parallel deflate tasks. You would need to make n-1 unterminated deflate streams and one final terminated deflate stream and concatenate those. The last one is made normally. The other n-1 need to be terminated using sync flush in order to end each on a byte boundary and to not mark it as the end of the stream. To do that, you use deflate with the flush parameter SYNC_FLUSH. Don't use finish() on those.
For better compression, you can use setDictionary on each chunk with the last 32K of the previous chunk.
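Here is a sketch of the wrapping steps described above, under some assumptions: the chunks are compressed one after another rather than in threads to keep it short, there is at least one chunk, the dictionary step is left out, and all names (RawDeflateToGzip, deflateChunk, gzipWrap, writeIntLE) are invented for illustration. Each chunk gets its own Deflater with nowrap set to true; only the last one calls finish().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.Deflater;
import java.util.zip.GZIPInputStream;

final class RawDeflateToGzip {
    // Raw-deflates one chunk. Non-final chunks end with a sync flush (byte-aligned,
    // not marked as end of stream); the final chunk is terminated with finish().
    static byte[] deflateChunk(byte[] chunk, boolean last) {
        Deflater def = new Deflater(Deflater.DEFAULT_COMPRESSION, true); // nowrap = true
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            def.setInput(chunk);
            if (last) {
                def.finish();
                while (!def.finished()) {
                    out.write(buf, 0, def.deflate(buf));
                }
            } else {
                int n;
                do { // a completely filled buffer means the flush is not complete yet
                    n = def.deflate(buf, 0, buf.length, Deflater.SYNC_FLUSH);
                    out.write(buf, 0, n);
                } while (n == buf.length);
            }
            return out.toByteArray();
        } finally {
            def.end();
        }
    }

    static void writeIntLE(ByteArrayOutputStream out, int v) {
        out.write(v);
        out.write(v >>> 8);
        out.write(v >>> 16);
        out.write(v >>> 24);
    }

    // Ten-byte header, concatenated deflate data, then CRC-32 and length, both little endian.
    static byte[] gzipWrap(byte[][] chunks) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(new byte[] {0x1f, (byte) 0x8b, 8, 0, 0, 0, 0, 0, 0, (byte) 0xff});
        CRC32 crc = new CRC32(); // computed serially over the uncompressed data
        long total = 0;
        for (int i = 0; i < chunks.length; i++) {
            out.write(deflateChunk(chunks[i], i == chunks.length - 1));
            crc.update(chunks[i]);
            total += chunks[i].length;
        }
        writeIntLE(out, (int) crc.getValue());
        writeIntLE(out, (int) total); // the cast keeps the length modulo 2^32
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[][] chunks = {
            "Hello world! ".getBytes(StandardCharsets.UTF_8),
            "How are you? ".getBytes(StandardCharsets.UTF_8),
            "I'm ready.".getBytes(StandardCharsets.UTF_8)
        };
        byte[] gz = gzipWrap(chunks);
        // Read it back with the standard gzip reader as a sanity check.
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(gz))) {
            ByteArrayOutputStream plain = new ByteArrayOutputStream();
            byte[] buf = new byte[256];
            int n;
            while ((n = in.read(buf)) != -1) {
                plain.write(buf, 0, n);
            }
            System.out.println(new String(plain.toByteArray(), StandardCharsets.UTF_8));
        }
    }
}
```

In a threaded version, each call to deflateChunk could run on its own thread, as long as the results are concatenated in the original order.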
If you are looking to write outData to a file, you can write it like this:
GZIPOutputStream outStream= new GZIPOutputStream(new FileOutputStream("fileName"));
outStream.write(outData, 0, outData.length);
outStream.close();
Or simply use java.io.FileOutputStream to write:
FileOutputStream outStream= new FileOutputStream("fileName");
outStream.write(outData, 0, outData.length);
outStream.close();
You just want to write a byte array - as is - to a file?
You can use Apache Commons:
FileOutputStream fos = new FileOutputStream("yourFilename");
fos.write(outData);
fos.close();
Or plain old Java:
// try-with-resources (Java 7+) closes the stream exactly once, even if write() throws
try (BufferedOutputStream bs = new BufferedOutputStream(
        new FileOutputStream(new File("yourFilename")))) {
    bs.write(outData);
} catch (IOException e) {
    //please handle this
}
I'm currently trying to build a save editor for a video game. I figured out how to write to the binary file with an output stream rather than a writer, but I'm running into a problem. I'm trying to overwrite certain hexadecimal values, but every time I try, I end up replacing the whole file. There's probably an easy explanation for this. I also wanted advice on how to replace the hex values: converting the hex values (e.g. 5acd) from a string only gives me the byte data for the string's characters. Here's what I'm doing:
String textToWrite = inputField.getText();
byte[] charsToWrite = textToWrite.getBytes();
FileOutputStream out = new FileOutputStream(theFile);
out.write(charsToWrite, 23, charsToWrite.length);
Use a RandomAccessFile. This has the methods that you are looking for. FileOutputStream will only allow you to overwrite the whole file or append to it. However, note that, as Murali VP alluded to, this will only allow you to perform direct replacements (byte-for-byte), not removal or insertion of bytes.
For converting from a hex string to a byte array (which is essentially what you need), see this SO post.
HTH
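A minimal sketch combining both suggestions, assuming the input is an even-length string of hex digits and the offset lies inside the file. The names SavePatcher, hexToBytes and patch are hypothetical, and the offset 23 in the usage line is taken from the question's code.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

final class SavePatcher {
    // Converts a hex string such as "5acd" into the bytes {0x5A, 0xCD}.
    static byte[] hexToBytes(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("odd number of hex digits");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("not a hex digit near index " + 2 * i);
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    // Overwrites bytes in place, starting at offset; the rest of the file is untouched.
    static void patch(String path, long offset, byte[] replacement) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "rw")) {
            raf.seek(offset);
            raf.write(replacement);
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage: java SavePatcher.java <file> <hex>, e.g. a save file path and 5acd
        patch(args[0], 23, hexToBytes(args[1]));
    }
}
```

Writing past the current end of the file would extend it, so a real editor would probably check the offset against the file length first.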