I am writing a Java program for fun which stores sensitive information from users.
For this reason I want to ensure that the garbage collection does not touch it, so that in the future when I am finished I can wipe it from memory.
So far I have this line of code, which allocates 2048 bytes, more than enough to store any user's password:
ByteBuffer pass = ByteBuffer.allocateDirect(2048);
My question is: how do I store a String such as "secret123" in it, and delete it afterwards? This is a very basic question, I know, but I could not find it in the documentation. I am probably making this more difficult than it is in my head, but better safe than sorry.
I am aware of other risks, such as swap/page files, cold-boot attacks on the machine, etc.
Thanks!
EDIT:
In response to the first answer: I mean to fill the memory with '0' characters afterwards, not to free it.
You can't explicitly free the allocated memory, but you can clear the buffer and then write zeros (or random bytes) to the buffer when you are done. This will destroy any data that was previously stored in the buffer, reducing the window of attack.
// Overwrite the whole buffer with zeros.
pass.clear();
while (pass.hasRemaining()) {
    pass.put((byte) 0);
}
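As for the storage half of the question: a minimal sketch, assuming you can obtain the password as a char[] rather than a String (a String is immutable and GC-managed, so it can never be reliably wiped):
char[] password = {'s', 'e', 'c', 'r', 'e', 't', '1', '2', '3'};
for (char c : password) {
    pass.putChar(c); // each char occupies 2 bytes in the buffer
}
Arrays.fill(password, '\0'); // wipe the intermediate array as well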
As an alternative to #erickson's approach, if you allocate the byte array yourself and create the ByteBuffer by wrapping, then you can clear the array with a call to Arrays.fill().
byte[] byteArray = new byte[2048];
ByteBuffer bb = ByteBuffer.wrap(byteArray);
//... do your thing here
Arrays.fill(byteArray, (byte)0);
As long as you maintain a reference to either the byteArray or the ByteBuffer, garbage collection won't touch the byte array. You can also get the array back later by calling ByteBuffer.array() and then zeroing it out. (NB: You are not guaranteed an actual array if you try this with a ByteBuffer created by allocateDirect().)
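Putting that caveat into code, a minimal sketch that guards on hasArray() before touching the backing array:
if (bb.hasArray()) {
    // Heap/wrapped buffer: zero the backing array directly.
    Arrays.fill(bb.array(), (byte) 0);
} else {
    // Direct buffer: no accessible array, so overwrite through the buffer itself.
    bb.clear();
    while (bb.hasRemaining()) {
        bb.put((byte) 0);
    }
}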
I'm new to Java ByteBuffers and was wondering what the correct way is to write to a ByteBuffer after it has been flipped.
In my use case, I am writing an outputBuffer to a socket:
outBuffer.flip();
//Non-blocking SocketChannel
int bytesWritten = getSocket().write(outBuffer);
After this, the output buffer has to be written to again. Also, not all of the bytes in the outBuffer may have been written to the socket.
Since it is currently flipped, how can I make it writable again, without overwriting any data that is still in the buffer and wasn't written to the socket?
If I am right, outBuffer.position() == bytesWritten, and the limit should be at how much data there was to write.
So would the following be right in order to reuse the output buffer?
int limit = outBuffer.limit();
outBuffer.limit(outBuffer.capacity());
outBuffer.position(limit);
Again, from the API spec:
The following loop copies bytes from one channel to another via the buffer buf:
while (in.read(buf) >= 0 || buf.position() != 0) {
    buf.flip();
    out.write(buf);
    buf.compact(); // In case of partial write
}
since it is currently flipped
It will stay flipped. The write doesn't change that.
how can I make it writable again, without overwriting any data that is still in the buffer and wasn't written to the socket?
You don't have to do anything, but if you want to read before you write again, you should do flip/write/compact. If you just want to repeat the write, just call write() again with the buffer still in its current state.
But I prefer to always keep these buffers ready for reading, so there is no possibility of a slip-up, and to flip/write/compact (or flip/get/compact) when those operations are necessary, atomically as it were.
Note that you should not use clear(), unless you are certain that the write was complete and the buffer is now empty. In that case compact and clear are equivalent. But it is simpler to just always compact.
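Concretely, a minimal sketch of the flip/write/compact pattern, using the names from the question:
outBuffer.flip();    // switch from filling to draining
int bytesWritten = getSocket().write(outBuffer); // may write fewer bytes than remaining()
outBuffer.compact(); // move unwritten bytes to the front; buffer is ready for filling again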
If you're copying in blocking mode, use the loop quoted by #zlakad.
Is there an approach that avoids having to copy a byte[] out of a ByteBuffer with the ByteBuffer.get() operation?
I was looking at this post, Java: Converting String to and from ByteBuffer and associated problems, and that approach creates an intermediate CharBuffer, which I also don't want.
I would like it to go from ByteBuffer to String.
When I know I have an underlying byte[], this is easy with code like this:
new String(data, offset, length, charSet);
I was hoping for something similar with ByteBuffer. I am beginning to think this may not be possible? I need to decode N bytes of my ByteBuffer really.
This may be a bit of premature optimization but I am really just curious and wanted to test out the performance and squeeze every little bit out. (personal project really).
thanks,
Dean
Not really for a direct ByteBuffer, no. You need something intermediate, because String doesn't take a ByteBuffer as a constructor argument, and you can't wrap one (or even a char[]). If the buffer is non-direct, you can use the array() method to get a reference to the backing array (which isn't an intermediate array) and create a String out of that.
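For the non-direct case, a minimal sketch (buffer and charSet mirror the names in the question; note arrayOffset(), since a wrapped buffer's data need not start at index 0 of its backing array):
if (buffer.hasArray()) {
    // Decode the N remaining bytes without an intermediate copy.
    String s = new String(buffer.array(),
            buffer.arrayOffset() + buffer.position(),
            buffer.remaining(),
            charSet);
}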
On the plus side, there are probably a lot more performance-sensitive places in your project.
I'm working on a Java program where I'm reading from a file in dynamic, unknown blocks. That is, each block of data will not always be the same size and the size is determined as data is being read. For I/O I'm using a MappedByteBuffer (the file inputs are on the order of MB).
My goal:
Find an efficient way to store each complete block during the input phase so that I can process it.
My constraints:
I am reading one byte at a time from the buffer
My processing method takes a primitive byte array as input
Each block gets processed before the next block is read
What I've tried:
I played around with dynamic structures like Lists, but they don't expose backing arrays, and the conversion time to a primitive array concerns me
I also thought about using a String to store each block and then getBytes() to get the byte[], but it's so slow
Reading the file multiple times in order to find the block size first, and then grab the relevant bytes
I am trying to find an approach that doesn't defeat the purpose of fast I/O. Any advice would be greatly appreciated.
Additional Info:
I'm using a rolling hash to decide where blocks should end
Here's a bit of pseudo-code:
circular_buffer[] = read first 128 bytes
rolling_hash = hash(circular_buffer[])
block_storage = ??? // this is the data structure I'd like to use
while file has more text
    b = next byte
    add b to block_storage
    add b to next index in circular_buffer (if reached end, start adding/overwriting front)
    shift rolling_hash one byte to the right
    if hash has a certain characteristic
        process block_storage as a byte[] // should contain the entire block of data
As you can see, I'm reading one byte at a time, and storing/overwriting that one byte repeatedly. However, once I get to the processing stage, I want to be able to access all of the info in the block. There is no predetermined max size of a block either, so I can't pre-allocate.
It seems to me that you require a dynamically growing buffer. You can use the built-in ByteArrayOutputStream to achieve that. It will automatically grow to store all data written to it. You can use write(int b) and toByteArray() to realize "add b to block_storage" and "process block_storage as a byte[]".
But take care: this stream will grow unbounded. You should implement some sanity checks around it to avoid using up all memory (e.g. count the bytes written to it and break by throwing an exception when the count exceeds a reasonable amount). Also make sure to close the stream and throw away the reference to it after consuming each block, to allow the GC to free up the memory.
Edit: As #marcman pointed out, the buffer can also simply be reset().
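A minimal sketch of how block_storage from the pseudo-code could look; hasMoreBytes(), nextByte() and hashHasCharacteristic() are hypothetical stand-ins for the file reading and rolling-hash logic:
ByteArrayOutputStream blockStorage = new ByteArrayOutputStream();
while (hasMoreBytes()) {
    int b = nextByte();
    blockStorage.write(b);                        // "add b to block_storage"
    if (hashHasCharacteristic()) {
        processBlock(blockStorage.toByteArray()); // the complete block as a byte[]
        blockStorage.reset();                     // reuse the internal array for the next block
    }
}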
Basically, my situation is this:
Server streams data from the client connection into a ByteBuffer object called inQueue, which holds whatever the most recent chunk of data is
Server must process the data in each of these streams and expects a packet of data in a specific format
The payload is to be read into a byte[] object and then processed separately
Now my question boils down to this: is copying the remaining buffer data (the payload) to a byte[] array bad for performance?
Here's what it would look like:
// pretend we're reading the packet ID and length
// int len = LENGTH OF PACKET PAYLOAD
/*
* Mark the starting position of the packet's payload.
*/
int pos = inQueue.position();
byte[] payload = new byte[len];
inQueue.get(payload);
// Process the packet's payload here
/*
* Set the inQueue buffer to the length added to the previous position
* so as to move onto the next packet to process.
*/
inQueue.position(pos + len);
As you can see, I'm essentially doing this:
Mark the position of the complete buffer as it were just before the payload
Copy the contents of inQueue as far as the payload goes to a separate byte[] object
Set the complete buffer's position to after the payload we just read so we can read more packets
My concern is that, in doing this, I'm wasting memory by copying the buffer. Keep in mind the packets used will never exceed 500 bytes and are often under 100 bytes.
Is my concern valid, or am I being performance-paranoid? :p
You should avoid it. That's the whole reason for the ByteBuffer design: to avoid data copies.
What exactly do you mean by 'process payload here'?
With a little rearrangement of whatever happens in there, you should be able to do that directly in the ByteBuffer: call flip() first, then one or more get()s to fetch the data you require, and compact() afterwards (or clear() if you're sure the buffer is empty), without an intermediate copy step into yet another byte[] array.
Not only is this unnecessary but, to answer your question, no you won't notice a performance change even when scaling up.
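For illustration, a minimal sketch of that in-place approach, assuming a hypothetical packet format with an int length prefix:
inQueue.flip();             // switch to draining what the channel has written
int len = inQueue.getInt(); // hypothetical header: the payload length
for (int i = 0; i < len; i++) {
    byte b = inQueue.get(); // consume the payload directly, no byte[] copy
    // ... process b ...
}
inQueue.compact();          // preserve any bytes of the next packet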
Has anyone has ever seen an implementation of java.nio.ByteBuffer that will grow dynamically if a putX() call overruns the capacity?
The reason I want to do it this way is twofold:
I don't know how much space I need ahead of time.
I'd rather not do a new ByteBuffer.allocate() and then a bulk put() every time I run out of space.
In order for asynchronous I/O to work, you must have contiguous memory. In C you can attempt to realloc an array, but in Java you must allocate new memory. You could write to a ByteArrayOutputStream and then convert it to a ByteBuffer at the time you are ready to send it. The downside is that you are copying memory, and one of the keys to efficient I/O is reducing the number of times memory is copied.
A ByteBuffer cannot really work this way, as its design concept is to be just a view of a specific array, which you may also have a direct reference to. It could not try to swap that array for a larger array without weirdness happening.
What you want to use is a DataOutput. The most convenient way is to use the (pre-release) Guava library:
ByteArrayDataOutput out = ByteStreams.newDataOutput();
out.write(someBytes);
out.writeInt(someInt);
// ...
return out.toByteArray();
But you could also create a DataOutputStream from a ByteArrayOutputStream manually, and just deal with the spurious IOExceptions by chaining them into AssertionErrors.
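That manual alternative might look like this (a sketch; the catch block exists only because DataOutputStream declares IOException, which an in-memory stream never actually throws):
ByteArrayOutputStream bytes = new ByteArrayOutputStream();
DataOutputStream out = new DataOutputStream(bytes);
try {
    out.write(someBytes);
    out.writeInt(someInt);
} catch (IOException e) {
    throw new AssertionError(e); // cannot happen with an in-memory stream
}
return bytes.toByteArray();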
Another option is to use direct memory with a large buffer. This consumes virtual memory but only uses as much physical memory as you actually touch (per page, which is typically 4 KB).
So if you allocate a buffer of 1 MB, it consumes 1 MB of virtual memory, but the OS only gives physical pages to the application for the memory it actually uses.
The effect is that you see your application using a lot of virtual memory but a relatively small amount of resident memory.
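For instance, a one-line sketch of such an over-allocation:
ByteBuffer big = ByteBuffer.allocateDirect(1024 * 1024); // 1 MB of virtual address space; physical pages are committed only as they are touched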
Have a look at Mina IOBuffer https://mina.apache.org/mina-project/userguide/ch8-iobuffer/ch8-iobuffer.html which is a drop in replacement (it wraps the ByteBuffer)
However, I suggest you allocate more than you need and don't worry about it too much. If you allocate a buffer (especially a direct buffer), the OS gives it virtual memory, but it only uses physical memory when it is actually used. Virtual memory should be very cheap.
It may also be worth having a look at Netty's DynamicChannelBuffer. Things that I find handy are:
slice(int index, int length)
unsigned operations
separated writer and reader indexes
Indeed, auto-extending buffers are so much more intuitive to work with. If you can afford the performance luxury of reallocation, why wouldn't you!?
Netty's ByteBuf gives you exactly this. It's like they've taken java.nio's ByteBuffer and scraped away the edges, making it much easier to use.
Furthermore, it's on Maven as an independent netty-buffer package, so you don't need to include the full Netty suite to use it.
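A minimal sketch with that package (assuming an io.netty:netty-buffer dependency; someBytes and someInt are placeholders):
ByteBuf buf = Unpooled.buffer(); // grows automatically as you write past its initial capacity
buf.writeBytes(someBytes);
buf.writeInt(someInt);
byte[] result = new byte[buf.readableBytes()];
buf.readBytes(result);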
I'd suggest using an input stream to receive data from a file (with a separate thread if you need non-blocking behavior), then reading the bytes into a ByteArrayOutputStream, which gives you the ability to get the data as a byte array. Here's a simple example without too many workarounds:
try (InputStream inputStream = Files.newInputStream(
        Paths.get("filepath"), StandardOpenOption.READ)) {
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    int byteRead;
    // Read until EOF; checking before writing avoids storing the -1 sentinel.
    while ((byteRead = inputStream.read()) != -1) {
        baos.write(byteRead);
    }
    ByteBuffer byteBuffer = ByteBuffer.allocate(baos.size());
    byteBuffer.put(baos.toByteArray());
    // ... use the buffer however you want
} catch (InvalidPathException pathException) {
    System.out.println("Path exception: " + pathException);
} catch (IOException exception) {
    System.out.println("I/O exception: " + exception);
}
Another solution for this would be to allocate more than enough memory, fill the ByteBuffer and then only return the occupied byte array:
Initialize a big ByteBuffer:
ByteBuffer byteBuffer = ByteBuffer.allocate(1000);
After you're done putting things into it:
private static byte[] getOccupiedArray(ByteBuffer byteBuffer)
{
    // Copy only the bytes written so far, i.e. indices 0 to position().
    int position = byteBuffer.position();
    return Arrays.copyOfRange(byteBuffer.array(), 0, position);
}
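Usage might then look like this (hypothetical example):
byteBuffer.put("foo".getBytes(StandardCharsets.UTF_8));
byte[] occupied = getOccupiedArray(byteBuffer); // length 3, not 1000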
However, using an org.apache.commons.io.output.ByteArrayOutputStream from the start would probably be the best solution.
Netty's ByteBuf is pretty good at that.
A Vector allows for continuous growth:
Vector<Byte> bFOO = new Vector<Byte>();
bFOO.add((byte) 0x00);
To serialize something, you need the objects up front. What you can do is put your objects into a collection, then loop over its iterator and write them into a byte array. Then call ByteBuffer.allocate(byteArray.length). That is what I did, and it worked for me.