I need to encrypt and send data over TCP (from a few hundred bytes to a few hundred megabytes per message) in chunks from Java to a C++ program, and I need to send the size of the data ahead of time so the recipient knows when to stop reading the current message, process it, and then wait for the next one. The connection stays open, so there is no other way to indicate end-of-message; and since the data is binary, I can't use a sentinel flag to mark the end, because the encrypted bytes might randomly happen to match any flag I choose.
My issue is calculating the encrypted message size before encrypting it, which will in general differ from the input length due to padding etc.
Say I have initialized as follows:
AlgorithmParameterSpec paramSpec = new IvParameterSpec(initv);
encipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
mac = Mac.getInstance("HmacSHA512");
encipher.init(Cipher.ENCRYPT_MODE, key, paramSpec);
mac.init(key);
buf = new byte[encipher.getOutputSize(blockSize)];
Then I send the data as such (and also have an analogous function that uses a stream for input instead of byte[]):
public void writeBytes(DataOutputStream out, byte[] input) {
try {
//mac.reset(); // Needed ?
int left = input.length;
int offset = 0;
while (left > 0)
{
int chunk = Math.min(left, blockSize);
int ctLength = encipher.update(input, offset, chunk, buf, 0);
mac.update(input, offset, chunk);
out.write(buf, 0, ctLength);
left -= chunk;
offset += chunk;
}
out.write(encipher.doFinal(mac.doFinal()));
out.flush();
} catch (Exception e) {
e.printStackTrace();
}
}
But how do I precalculate the output size that will be sent to the receiving computer?
Basically, I want to out.writeInt(messageSize) before the loop. But how do I calculate messageSize? The documentation for Cipher's getOutputSize() says that "This call takes into account any unprocessed (buffered) data from a previous update call, and padding." This seems to imply that the value might change for the same argument over multiple calls to update() or doFinal()... Can I assume that if blockSize is a multiple of the AES CBC block size, so that no mid-stream padding occurs, I get a constant output size for each block? That is, simply check up front that blockSize % encipher.getBlockSize() == 0, and then in the write function,
int messageSize = (input.length / blockSize) * encipher.getOutputSize(blockSize) +
encipher.getOutputSize(input.length % blockSize + mac.getMacLength());
??
If not, what alternatives do I have?
When using PKCS5 padding, the size of the message after padding will be:
padded_size = original_size + BLOCKSIZE - (original_size % BLOCKSIZE);
The above gives the complete size of the entire message (up to the doFinal() call), given the complete size of the input message. It occurs to me that you actually want to know just the length of the final portion: all you need to do is store the output byte array of the doFinal() call and use the .length field on that array (Java arrays have a length field, not a length() method).
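To make that arithmetic concrete, here is a minimal sketch, assuming AES/CBC/PKCS5Padding (16-byte blocks), a 64-byte HmacSHA512 tag, and the question's scheme of encrypting the MAC in the final doFinal(); the class and method names are invented for illustration:

```java
public class EncSize {
    static final int AES_BLOCK = 16;

    // Total ciphertext bytes produced for a plaintext of plaintextLen bytes
    // followed by an encrypted macLen-byte MAC, under PKCS5 padding.
    public static int encryptedSize(int plaintextLen, int macLen) {
        int rem = plaintextLen % AES_BLOCK;        // bytes buffered by the last update()
        int full = plaintextLen - rem;             // emitted unchanged by the update() calls
        int tail = rem + macLen;                   // processed by the final doFinal()
        int padded = tail + AES_BLOCK - (tail % AES_BLOCK); // PKCS5 always adds >= 1 pad byte
        return full + padded;
    }

    public static void main(String[] args) {
        // e.g. 100 plaintext bytes + 64-byte HMAC-SHA512 -> 176 ciphertext bytes
        System.out.println(encryptedSize(100, 64));
    }
}
```

The sender could then emit out.writeInt(encryptedSize(input.length, mac.getMacLength())) before the chunk loop.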
Related
In my application, I'm trying to compress/decompress byte arrays using Java's Inflater/Deflater classes.
Here's part of the code I used at first:
ByteArrayOutputStream outputStream = new ByteArrayOutputStream(data.length);
byte[] buffer = new byte[1024];
while (!inflater.finished()) {
int count = inflater.inflate(buffer);
outputStream.write(buffer, 0, count);
}
Then after I deployed the code, it would randomly (very rarely) cause the whole application to hang; when I took a thread dump, I could identify one thread hanging:
at java.util.zip.Inflater.inflateBytes(Native Method)
at java.util.zip.Inflater.inflate(Inflater.java:259)
- locked java.util.zip.ZStreamRef@fc71443
at java.util.zip.Inflater.inflate(Inflater.java:280)
It doesn't happen very often. Then I googled everywhere and found out it could be empty byte data being passed to the inflater, in which case finished() will never return true.
So I used a workaround, instead of using
while (!inflater.finished())
to determine if it's finished, I used
while (inflater.getRemaining() > 0)
But it happened again.
Now it makes me wonder what's the real reason that causes the issue. There shouldn't be any empty array passed in the inflater, even if it did, how come getRemaining() method did not break the while loop?
Can anybody help pls? It's really bugging me.
Confused by the same problem, I found this page.
This is my workaround for this; it may help:
ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
byte[] buffer = new byte[1024];
while (!inflater.finished()) {
int i = inflater.inflate(buffer);
if (i == 0) {
break;
}
byteArrayOutputStream.write(buffer, 0, i);
}
The javadoc of inflate:
Uncompresses bytes into specified buffer. Returns actual number of bytes uncompressed. A return value of 0 indicates that needsInput() or needsDictionary() should be called in order to determine if more input data or a preset dictionary is required. In the latter case, getAdler() can be used to get the Adler-32 value of the dictionary required.
So @Wildo Luo was certainly right to check for 0 being returned.
byte[] buffer = new byte[1024];
while (!inflater.finished()) {
int count = inflater.inflate(buffer);
if (count != 0 ) {
outputStream.write(buffer, 0, count);
} else {
if (inflater.needsInput()) { // Not everything read
inflater.setInput(...);
} else if (inflater.needsDictionary()) { // Dictionary to be loaded
inflater.setDictionary(...);
}
}
}
inflater.end();
I can only imagine that elsewhere the code is not entirely right, maybe on the compression side. Better first check the general code: there is the Inflater(boolean nowrap) constructor requiring an extra byte, the end() call, exception handling (try-finally), etcetera.
For unknown data and unknown occurrences: use a try-catch, capture the compressed data to check whether it is a data-based error, and to test any solution.
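To make those points concrete, here is a hedged round-trip sketch (the class and method names are mine): end() is called in a finally block, and a zero return from inflate() with needsInput() pending is treated as a truncated or corrupt stream instead of looping forever:

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class InflateSafe {
    public static byte[] deflate(byte[] data) {
        Deflater deflater = new Deflater();
        try {
            deflater.setInput(data);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[1024];
            while (!deflater.finished()) {
                out.write(buf, 0, deflater.deflate(buf));
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // release native zlib memory
        }
    }

    public static byte[] inflate(byte[] compressed) throws DataFormatException {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[1024];
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n > 0) {
                    out.write(buf, 0, n);
                } else if (!inflater.finished()
                        && (inflater.needsInput() || inflater.needsDictionary())) {
                    // Truncated or corrupt stream: fail loudly instead of spinning.
                    throw new DataFormatException("incomplete deflate stream");
                }
            }
            return out.toByteArray();
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] original = "round-trip me".getBytes("UTF-8");
        byte[] back = inflate(deflate(original));
        System.out.println(new String(back, "UTF-8"));
    }
}
```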
Having the same problem...
What I'm sure about:
I'm having an infinite loop, assured with logs printed.
inflater.inflate returns 0, and the output buffer size is 0.
My loop is like this (Hive ORC code):
while (!(inflater.finished() || inflater.needsDictionary() ||
inflater.needsInput())) {
try {
int count = inflater.inflate(out.array(),
out.arrayOffset() + out.position(),
out.remaining());
out.position(count + out.position());
} catch (DataFormatException dfe) {
throw new IOException("Bad compression data", dfe);
}
}
After the out buffer is consumed and its remaining size is 0, the loop will infinitely run.
But I'm not sure whether it's ORC or zlib that caused this. On the ORC side, it fills the original data into a buffer of the same compression buffer size and then compresses it, so theoretically I shouldn't get a compressed chunk larger than the buffer size. The cause may be zlib or hardware.
That being said, breaking the loop when count == 0 is dangerous, since there may still be uncompressed data in the inflater.
I'm trying to send a byte array containing 16 items over sockets using DataOutputStream on the client and DataInputStream on the server.
These are the methods I am using for sending/receiving.
public void sendBytes(byte[] myByteArray) throws IOException {
sendBytes(myByteArray, 0, myByteArray.length);
}
public void sendBytes(byte[] myByteArray, int start, int len) throws IOException {
if (len < 0)
throw new IllegalArgumentException("Negative length not allowed");
if (start < 0 || start >= myByteArray.length)
throw new IndexOutOfBoundsException("Out of bounds: " + start);
dOutput.writeInt(len);
if (len > 0) {
dOutput.write(myByteArray, start, len);
dOutput.flush();
}
}
public byte[] readBytes() throws IOException {
int len = dInput.readInt();
System.out.println("Byte array length: " + len); //prints '16'
byte[] data = new byte[len];
if (len > 0) {
dInput.readFully(data);
}
return data;
}
It all works fine, and I can print the byte array length, the byte array (ciphertext), and then decrypt the byte array and print out the original plaintext I sent, but immediately after it prints in the console, the program crashes with an OutOfMemoryError: Java heap space.
I have read this is usually because of not flushing the DataOutputStream, but I am calling it inside the sendBytes method so it should clear it after every array is sent.
The stack trace tells me the error is occurring inside readBytes on the line byte[] data = new byte[len]; and also where I call readBytes() in the main method.
Any help will be greatly appreciated!
Edit
I am actually getting some unexpected results.
17:50:14 Server waiting for Clients on port 1500.
Thread trying to create Object Input/Output Streams
17:50:16 Client[0.7757499147242042] just connected.
17:50:16 Server waiting for Clients on port 1500.
Byte array length: 16
Server recieved ciphertext: 27 10 -49 -83 127 127 84 -81 48 -85 -57 -38 -13 -126 -88 6
Server decrypted ciphertext to: asd
17:50:19 Client[0.7757499147242042]
Byte array length: 1946157921
I am calling readBytes() in a while loop, so the server will be listening for anything being transmitted over the socket. I guess it's trying to run a second time even though nothing else has been sent, and the len variable is somehow being set to 1946157921. What logic could be behind this?
You must be sending something else over the socket, not reading it the same way you wrote it, and so getting out of sync. The effect is that you're reading a length word that isn't a real length; it is too big, and you run out of memory when you try to allocate it. The fault isn't in this code. Except of course that if len == 0 you shouldn't allocate the byte array when reading.
I have read this is usually because of not flushing the DataOutputStream
It isn't.
len variable is somehow being set to 1946157921.
Exactly as predicted. QED
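One cheap defence, sketched below with invented names, is to sanity-check the length prefix before allocating, so a desynchronized stream fails fast with a clear exception rather than an OutOfMemoryError (the 64 MB cap is an arbitrary choice for illustration):

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.StreamCorruptedException;

public class SafeRead {
    static final int MAX_MESSAGE = 64 * 1024 * 1024; // arbitrary 64 MB cap

    public static byte[] readBytes(DataInputStream in) throws IOException {
        int len = in.readInt();
        if (len < 0 || len > MAX_MESSAGE) {
            // A wild length almost always means the reader is out of sync.
            throw new StreamCorruptedException("implausible message length: " + len);
        }
        byte[] data = new byte[len];
        in.readFully(data); // loops internally until exactly len bytes arrive
        return data;
    }

    public static void main(String[] args) throws IOException {
        // Frame: 4-byte big-endian length prefix followed by the payload.
        byte[] frame = {0, 0, 0, 3, 'a', 'b', 'c'};
        byte[] msg = readBytes(new DataInputStream(new ByteArrayInputStream(frame)));
        System.out.println(new String(msg));
    }
}
```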
You are running out of the available heap. A quick solution would be increasing (or specifying, if missing) the -Xmx parameter in your JVM startup parameters to a level where the application is able to complete the task at hand.
Run your application with -Xmx1500m in the console; in NetBeans you can find it in project properties -> Run -> VM options.
I faced this out-of-memory problem today, and after tweaking with the heap settings for a while I was able to fix it. Check if that works for you; if your data is really bigger than that, you will have to look at how to improve your code.
Check discussion here
Currently, I am relying on the ObjectInputStream.available() method to tell me how many bytes are left in a stream. Reason for this -- I am writing some unit/integration tests on certain functions that deal with streams and I am just trying to ensure that the available() method returns 0 after I am done.
Unfortunately, upon testing for failure (i.e., I have sent about 8 bytes down the stream) my assertion for available() == 0 is coming up true when it should be false. It should show >0 or 8 bytes!
I know that the available() method is classically unreliable, but I figured it would show something at least > 0!
Is there a more reliable way of checking if a stream is empty or not (this is my main goal here, after all)? Perhaps in the Apache IO domain or some other library out there?
Does anyone know why the available() method is so profoundly unreliable; what is the point of it? Or, is there a specific, proper way of using it?
Update:
So, as many of you can read from the comments, the main issue I am facing is that on one end of a stream, I am sending a certain number of bytes but on the other end, not all the bytes are arriving!
Specifically, I am sending 205498 bytes on one end and consistently getting only 204988 on the other. I am controlling both sides of this operation between threads in a socket, but that shouldn't matter.
Here is the code I have written to collect all the bytes.
public static int copyStream(InputStream readFrom, OutputStream writeTo, int bytesToRead)
throws IOException {
int bytesReadTotal = 0, bytesRead = 0, countTries = 0, available = 0, bufferSize = 1024 * 4;
byte[] buffer = new byte[bufferSize];
while (bytesReadTotal < bytesToRead) {
if (bytesToRead - bytesReadTotal < bufferSize)
buffer = new byte[bytesToRead - bytesReadTotal];
if (0 < (available = readFrom.available())) {
bytesReadTotal += (bytesRead = readFrom.read(buffer));
writeTo.write(buffer, 0, bytesRead);
countTries = 0;
} else if (countTries < 1000)
try {
countTries++;
Thread.sleep(1L);
} catch (InterruptedException ignore) {}
else
break;
}
return bytesReadTotal;
}
I put the countTries variable in there just to see what happens. Even without countTries, it will block forever before it reaches bytesToRead.
What would cause the stream to suddenly block indefinitely like that? I know the other end fully sends the bytes over (it actually uses the same method, and I can see it complete the function with the full bytesToRead matching bytesReadTotal in the end), but the receiver doesn't. In fact, when I look at the arrays, they match up perfectly up till the end as well.
UPDATE2
I noticed that when I added a writeTo.flush() at the end of my copyStream method, it seems to work again. Hmm... why are flushes so vital in this situation? That is, why would not flushing cause a stream to block indefinitely?
The available() method only returns how many bytes can be read without blocking (which may be 0). In order to see if there are any bytes left in the stream, you have to read() or read(byte[]) which will return the number of bytes read. If the return value is -1 then you have reached the end of file.
This little code snippet will loop through an InputStream until it gets to the end (read() returns -1). I don't think it can ever return 0 because it should block until it can either read 1 byte or discover there is nothing left to read (and therefore return -1)
int currentBytesRead=0;
int totalBytesRead=0;
byte[] buf = new byte[1024];
while ((currentBytesRead = in.read(buf)) > 0) {
totalBytesRead+=currentBytesRead;
}
I'm developing a little program to encrypt/decrypt a binary file using AES-256 and HMAC to check the results.
My code is based on AESCrypt implementation in Java, but I wanted to modify it to allow multiple threads to do the job simultaneously.
I get the size of the original data and calculate the number of 16-byte blocks per thread, then I start the threads with information about the offsets to apply for reading and writing (because there is a header for the encrypted file, so offset_write = offset_read + header_length).
When the encryption finishes, I pass the output content (without the header) through the HMAC to generate the checksum.
The problem is that some bytes get corrupted in the zone between two threads' work.
Code of main:
//..
// Initialization and creation of iv, aesKey
//..
in = new FileInputStream(fromPath);
out = new FileOutputStream(toPath);
//..
// Some code for generate the header and write it to out
//..
double totalBytes = new Long(archivo.length()).doubleValue();
int bloquesHilo = new Double(Math.ceil(totalBytes/(AESCrypt.NUM_THREADS*AESCrypt.BLOCK_SIZE))).intValue();
int offset_write = new Long((out.getChannel()).position()).intValue();
for (int i = 0; i < AESCrypt.NUM_THREADS; i++)
{
int offset = bloquesHilo*AESCrypt.BLOCK_SIZE*i;
HiloCrypt hilo = new HiloCrypt(fromPath, toPath, ivSpec, aesKey, offset, offsetInicio, bloquesHilo, this);
hilo.start();
}
Code for a thread (class HiloCrypt):
public class HiloCrypt extends Thread {
private RandomAccessFile in;
private RandomAccessFile out;
private Cipher cipher;
private Mac hmac;
private IvParameterSpec ivSpec2;
private SecretKeySpec aesKey2;
private Integer num_blocks;
private Integer offset_read;
private Integer offset_write;
private AESCrypt parent;
public HiloCrypt(String input, String output, IvParameterSpec ivSpec, SecretKeySpec aesKey, Integer offset_thread, Integer offset_write, Integer blocks, AESCrypt parent2)
{
try
{
// If I don't use RandomAccessFile there is a problem copying data
this.in = new RandomAccessFile(input, "r");
this.out = new RandomAccessFile(output, "rw");
int total_offset_write = offset_write + offset_thread;
// Adjust the offset for reading and writing
this.out.seek(total_offset_write);
this.in.seek(offset_thread);
this.ivSpec2 = ivSpec;
this.aesKey2 = aesKey;
this.cipher = Cipher.getInstance(AESCrypt.CRYPT_TRANS);
this.hmac = Mac.getInstance(AESCrypt.HMAC_ALG);
this.num_blocks = blocks;
this.offset_read = offset_thread;
this.offset_write = total_offset_write;
this.parent = parent2;
} catch (Exception e)
{
System.err.println(e);
return;
}
}
public void run()
{
int len, last = 0, block_counter = 0, total = 0;
byte[] text = new byte[AESCrypt.BLOCK_SIZE];
try{
// Start encryption objects
this.cipher.init(Cipher.ENCRYPT_MODE, this.aesKey2, this.ivSpec2);
this.hmac.init(new SecretKeySpec(this.aesKey2.getEncoded(), AESCrypt.HMAC_ALG));
while ((len = this.in.read(text)) > 0 && block_counter < this.num_blocks)
{
this.cipher.update(text, 0, AESCrypt.BLOCK_SIZE, text);
this.hmac.update(text);
// Write the block
this.out.write(text);
last = len;
total+=len;
block_counter++;
}
if (len < 0) // If it's the last block, calculate the HMAC
{
last &= 0x0f;
this.out.write(last);
this.out.seek(this.offset_write-this.offset_read);
while ((len = this.out.read(text)) > 0)
{
this.hmac.update(text);
}
// write last block of HMAC
text=this.hmac.doFinal();
this.out.write(text);
}
// Close streams
this.in.close();
this.out.close();
// Code to notify the end of the thread
}
catch(Exception e)
{
System.err.println("Hola!");
System.err.println(e);
}
}
}
With this code, if I execute only one thread the encryption/decryption is perfect, but with two or more threads the bytes in the zone between the threads' jobs get corrupted, and the checksum also fails.
I'm trying to do this with threads because it gets nearly 2x faster than with one thread; I think that should be due to processing rather than file access.
As an aside, it processes 250 MB of data in 43 seconds on a MacBook Air. Is that a good time?
AESCrypt is not thread safe. You cannot use multiple threads with it.
Generally speaking, encryption code is rarely thread safe, as it requires complex mathematics to generate secure output. AES by itself is relatively fast, if you need better speed from it, consider vertical scaling or hardware accelerators as a first step. Later, you can add more servers to encrypt different files concurrently (horizontal scaling).
You basically want to multithread an operation that is intrinsically sequential.
A chained cipher mode like CBC cannot be parallelized, because each block depends on the completion of the previous block. So you can encrypt multiple files in parallel independently, with a slight performance increase (especially if the files are in memory rather than on disk), but you cannot encrypt a single file using multiple cores.
As I can see, you use an update method. I'm not an expert in Java cryptography, but even the name of that method tells me that the encryption algorithm holds state: "multithreading" and "state" are not friends, and you have to deal with state management across threads.
Race condition explains why you get blocks damaged.
It makes absolutely no sense to use more than one thread for the HMAC, because 1) it has to be computed sequentially and 2) read/write I/O access is much slower than the actual HMAC computation.
For AES it can be a good idea to use multiple threads when using CTR mode or other modes that don't require knowledge of previous data blocks.
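For the record, a hedged sketch of what the CTR approach can look like: since CTR turns AES into a keystream indexed by a 128-bit big-endian block counter, each segment that starts on a 16-byte boundary can be encrypted by an independent Cipher initialized at the right counter offset. The identifiers are invented, and this assumes the JCE's AES/CTR/NoPadding counter layout:

```java
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class CtrSegments {
    // Return the 16-byte CTR counter advanced by blockIndex blocks (big-endian add).
    public static byte[] counterFor(byte[] iv, long blockIndex) {
        byte[] c = iv.clone();
        long carry = blockIndex;
        for (int i = 15; i >= 0 && carry != 0; i--) {
            long sum = (c[i] & 0xFF) + (carry & 0xFF);
            c[i] = (byte) sum;
            carry = (carry >>> 8) + (sum >>> 8);
        }
        return c;
    }

    // Encrypt one segment that starts at a 16-byte block boundary.
    public static byte[] encryptSegment(SecretKeySpec key, byte[] iv,
                                        long startBlock, byte[] segment) throws Exception {
        Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(counterFor(iv, startBlock)));
        return cipher.doFinal(segment);
    }

    public static void main(String[] args) throws Exception {
        SecureRandom rnd = new SecureRandom();
        byte[] keyBytes = new byte[16], iv = new byte[16], data = new byte[48];
        rnd.nextBytes(keyBytes); rnd.nextBytes(iv); rnd.nextBytes(data);
        SecretKeySpec key = new SecretKeySpec(keyBytes, "AES");

        // Whole-message encryption in one go...
        byte[] whole = encryptSegment(key, iv, 0, data);
        // ...matches two independently encrypted segments (blocks 0-1 and block 2).
        byte[] a = encryptSegment(key, iv, 0, java.util.Arrays.copyOfRange(data, 0, 32));
        byte[] b = encryptSegment(key, iv, 2, java.util.Arrays.copyOfRange(data, 32, 48));
        byte[] joined = new byte[48];
        System.arraycopy(a, 0, joined, 0, 32);
        System.arraycopy(b, 0, joined, 32, 16);
        System.out.println(java.util.Arrays.equals(whole, joined));
    }
}
```

The usual CTR caveat applies: a (key, counter) pair must never be reused across messages.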
what about moving the question to crypto-stackexchange?
This very well may just be a KISS moment, but I feel like I should ask anyway.
I have a thread reading from a socket's InputStream. Since I am dealing in particularly small data sizes (the data I can expect to receive is on the order of 100 - 200 bytes), I set the buffer array size to 256. As part of my read function I have a check to ensure that when I read from the InputStream, I got all of the data. If I didn't, I recursively call the read function again, and for each recursive call I merge the two buffer arrays back together.
My problem is, while I never anticipate using more than the 256-byte buffer, I want to be safe. But if sheep begin to fly and the data is significantly larger than the buffer, the read function will (by my estimation) take sharply more time to complete with each recursive merge.
How can I increase the efficiency of the read function and/or the buffer merging?
Here is the read function as it stands.
int BUFFER_AMOUNT = 256;
private int read(byte[] buffer) throws IOException {
int bytes = mInStream.read(buffer); // Read the input stream
if (bytes == -1) { // If bytes == -1 then we didn't get all of the data
byte[] newBuffer = new byte[BUFFER_AMOUNT]; // Try to get the rest
int newBytes;
newBytes = read(newBuffer); // Recurse until we have all the data
byte[] oldBuffer = new byte[bytes + newBytes]; // make the final array size
// Merge buffer into the beginning of old buffer.
// We do this so that once the method finishes, we can just add the
// modified buffer to a queue later in the class for processing.
for (int i = 0; i < bytes; i++)
oldBuffer[i] = buffer[i];
for (int i = bytes; i < bytes + newBytes; i++) // Merge newBuffer into the latter half of old Buffer
oldBuffer[i] = newBuffer[i];
// Used for the recursion
buffer = oldBuffer; // And now we set buffer to the new buffer full of all the data.
return bytes + newBytes;
}
return bytes;
}
EDIT: Am I being paranoid (unjustifiedly) and should just set the buffer to 2048 and call it done?
Use BufferedInputStream, as noted by Roland, and DataInputStream.readFully(), which replaces all the looping code.
int BUFFER_AMOUNT = 256;
Should be final if you don't want it changing at runtime.
if (bytes == -1) {
Should be !=
Also, I'm not entirely clear on what you're trying to accomplish with this code. Do you mind shedding some light on that?
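The readFully() suggestion can be sketched like this (with a ByteArrayInputStream standing in for the socket stream):

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

public class ReadFullyDemo {
    public static void main(String[] args) throws IOException {
        byte[] wire = "exactly twenty bytes".getBytes(); // 20 bytes "on the wire"
        DataInputStream in = new DataInputStream(
                new BufferedInputStream(new ByteArrayInputStream(wire)));
        byte[] buf = new byte[wire.length];
        in.readFully(buf); // blocks until buf is completely filled, or throws EOFException
        System.out.println(new String(buf));
    }
}
```

No recursion, no manual merging: readFully() loops internally until the requested number of bytes has arrived.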
I have no idea what you mean by "small data sizes". You should measure whether the time is spent in kernel mode (then you are issuing too many reads directly on the socket) or in user mode (then your algorithm is too complicated).
In the former case, just wrap the input with a BufferedInputStream with 4096 bytes of buffer and read from it.
In the latter case, just use this code:
/**
* Reads as much as possible from the stream.
* @return The number of bytes read into the buffer, or -1
* if nothing has been read because the end of file has been reached.
*/
static int readGreedily(InputStream is, byte[] buf, int start, int len) {
int nread;
int ptr = start; // index at which the data is put into the buffer
int rest = len; // number of bytes that we still want to read
while ((nread = is.read(buf, ptr, rest)) > 0) {
ptr += nread;
rest -= nread;
}
int totalRead = len - rest;
return (nread == -1 && totalRead == 0) ? -1 : totalRead;
}
This code completely avoids creating new objects and calling unnecessary methods; furthermore, it is straightforward.