Is there an approach that avoids having to copy a byte[] out of a ByteBuffer with the ByteBuffer.get() operation?
I was looking at this post Java: Converting String to and from ByteBuffer and associated problems
and that causes an intermediary CharBuffer which I don't want as well.
I would like it to go from ByteBuffer to String.
When I know I have a byte[] underlying, this is easy with the code like so
new String(data, offset, length, charSet);
I was hoping for something similar with ByteBuffer. I am beginning to think this may not be possible? I need to decode N bytes of my ByteBuffer really.
This may be a bit of premature optimization but I am really just curious and wanted to test out the performance and squeeze every little bit out. (personal project really).
thanks,
Dean
Not really for a direct ByteBuffer, no. You need something intermediate, because String doesn't take a ByteBuffer as a constructor argument, and you can't wrap one (or even a char[]). If the buffer is non-direct, you can use the array() method to get a reference to the backing array (which isn't an intermediate array) and create a String out of that.
On the plus side, there are probably many more performance-sensitive places in your project.
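A minimal sketch of that idea, assuming a hypothetical helper class named BufferStrings (not part of the JDK): for a heap buffer it hands the backing array straight to the String constructor, and only for a direct or read-only buffer does it fall back to one temporary byte[].

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;

// Hypothetical helper for illustration only.
final class BufferStrings {
    private BufferStrings() {}

    /** Decodes the next {@code length} bytes of {@code buf} and advances its position. */
    static String decode(ByteBuffer buf, int length, Charset charset) {
        if (length < 0 || length > buf.remaining()) {
            throw new IllegalArgumentException("length out of range: " + length);
        }
        if (buf.hasArray()) {
            // Heap buffer: read the backing array in place.
            // arrayOffset() matters when the buffer is a slice of a larger array.
            String result = new String(
                    buf.array(), buf.arrayOffset() + buf.position(), length, charset);
            buf.position(buf.position() + length);
            return result;
        }
        // Direct or read-only buffer: no accessible array, so one copy is needed.
        byte[] tmp = new byte[length];
        buf.get(tmp); // advances the position by length
        return new String(tmp, charset);
    }
}
```

Calling BufferStrings.decode(buffer, n, StandardCharsets.UTF_8) then reads N bytes from the current position in either case.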
I am writing a Java program for fun which stores sensitive information from users.
For this reason I want to ensure that the garbage collection does not touch it, so that in the future when I am finished I can wipe it from memory.
So far I have this line of code creating 2048 bytes which is more than enough to store any user's passwords.
My question is how do I store a String such as "secret123", and afterwards delete it? This is a very basic question I know but I could not see it in the documentation. I am probably making this more difficult than it is in my head, but better safe than sorry.
ByteBuffer pass = ByteBuffer.allocateDirect(2048);
I am aware of other risks such as swap page files, the computer being coldboot attacked etc...
Thanks!
EDIT:
In response to first answer - I mean to fill memory with '0' characters afterwards, not to free it.
You can't explicitly free the allocated memory, but you can clear the buffer and then write zeros (or random bytes) to the buffer when you are done. This will destroy any data that was previously stored in the buffer, reducing the window of attack.
pass.clear();
while (pass.hasRemaining())
pass.put((byte) 0);
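To cover the "store" half of the question as well, here is a rough sketch. It assumes the secret arrives as a char[] rather than a String (a String such as "secret123" cannot be wiped once it exists), and the class name SecretBuffer and its methods are made up for this example. It encodes directly into the direct buffer with a CharsetEncoder, so no intermediate byte[] holding the secret is created by this code.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper for illustration only.
final class SecretBuffer {
    private SecretBuffer() {}

    /** Encodes the secret as UTF-8 into pass, then zeroes the caller's char[]. */
    static void store(char[] secret, ByteBuffer pass) {
        pass.clear();
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();
        CoderResult result = encoder.encode(CharBuffer.wrap(secret), pass, true);
        if (result.isUnderflow()) {
            result = encoder.flush(pass);
        }
        Arrays.fill(secret, '\0'); // wipe the source copy whether or not encoding worked
        if (!result.isUnderflow()) {
            wipe(pass);
            throw new IllegalArgumentException("secret could not be stored: " + result);
        }
        pass.flip(); // position 0, limit = number of secret bytes
    }

    /** Overwrites the whole buffer with zeros, as in the loop above. */
    static void wipe(ByteBuffer pass) {
        pass.clear();
        while (pass.hasRemaining()) {
            pass.put((byte) 0);
        }
        pass.clear();
    }
}
```

After store(), pass.remaining() is the length of the secret in bytes; call wipe(pass) as soon as the secret is no longer needed.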
As an alternative to @erickson's approach, if you allocate the byte array yourself and create the ByteBuffer by wrapping, then you can clear the array with a call to Arrays.fill().
byte[] byteArray = new byte[2048];
ByteBuffer bb = ByteBuffer.wrap(byteArray);
//... do your thing here
Arrays.fill(byteArray, (byte)0);
As long as you maintain a reference to either the byteArray or the ByteBuffer, garbage collection won't touch the byte array. You can also get the array back later by calling ByteBuffer.array() and then zeroing it out. (NB: You are not guaranteed an actual array if you try this with a ByteBuffer created by allocateDirect().)
String.getBytes(Charset ch) allocates a new buffer; in fact, it returns a new byte[]. Is there a way to avoid this? I'd like to have a reusable byte array and have the strings encoded in this buffer.
You can use the Charset and CharsetEncoder APIs directly, in particular calling encode(CharBuffer, ByteBuffer, boolean). However, I wouldn't expect it to end up being particularly pleasant code.
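For what it's worth, a sketch of how that could look. The wrapper class ReusableEncoder and its fixed UTF-8 charset are assumptions made for this example; the JDK calls used are CharsetEncoder.encode(CharBuffer, ByteBuffer, boolean), flush and reset.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

// Hypothetical wrapper for illustration only; not thread-safe.
final class ReusableEncoder {
    private final CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();
    private final byte[] bytes;
    private final ByteBuffer out;

    ReusableEncoder(int capacity) {
        bytes = new byte[capacity];
        out = ByteBuffer.wrap(bytes); // a view over the same array, no copy
    }

    /** Encodes text into the shared array and returns the number of bytes written. */
    int encode(String text) {
        out.clear();
        encoder.reset();
        CoderResult result = encoder.encode(CharBuffer.wrap(text), out, true);
        if (result.isUnderflow()) {
            result = encoder.flush(out);
        }
        if (!result.isUnderflow()) {
            // Overflow (array too small) or unencodable input.
            throw new IllegalArgumentException("cannot encode: " + result);
        }
        return out.position();
    }

    /** The reusable array; only the first encode(...) bytes are valid. */
    byte[] array() {
        return bytes;
    }
}
```

The same array instance is returned on every call, so the caller has to consume the bytes before encoding the next string.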
If you're like me and haven't mastered ByteBuffer, to complement Jon's answer, you could also create your own OutputStream implementation wrapping your byte array, and use an OutputStreamWriter to write the String to this custom OutputStream.
You can use
getChars(int srcBegin, int srcEnd, char[] dst, int dstBegin)
//Copies characters from this string into the destination character array.
and manage the array by yourself.
This may sound foolish, but I'm wondering all the same...
Is it possible to take a string composed of a given character set and compress it by using a bigger character set, or by converting it into a number and then converting it back again?
For example, if you had a string that you know would be composed of [a-z][A-Z][0-9]-_+=, could you turn that into a number, then swap it back using more characters in order to compress it?
This is an area I'm not familiar with, I still want to keep it as a string, just a shorter one. (for displaying/echoing/etc, not memory)
I wouldn't bother doing that, unless the string is huge. You can then try to compress it with commons-compress or java.util.zip
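As a rough illustration of the java.util.zip route, here is a sketch using Deflater and Inflater. The class name ZipStrings is invented for this example. Note that the result is a byte[], not a String; turning it back into printable text (for example with Base64) adds about a third to its size, so this only pays off for long or repetitive input.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper for illustration only.
final class ZipStrings {
    private ZipStrings() {}

    static byte[] compress(String text) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        try {
            deflater.setInput(text.getBytes(StandardCharsets.UTF_8));
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[256];
            while (!deflater.finished()) {
                int n = deflater.deflate(chunk);
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // release native resources
        }
    }

    static String decompress(byte[] compressed) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[256];
            while (!inflater.finished()) {
                int n = inflater.inflate(chunk);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or unsupported input");
                }
                out.write(chunk, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("not valid deflate data", e);
        } finally {
            inflater.end();
        }
    }
}
```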
A String internally keeps an array of 16-bit characters, which is wasteful for Western European languages. You can convert to UTF-8, which should give you roughly a 50% reduction for mostly-ASCII text, by doing
String myString = .....
ByteArrayOutputStream baos = new ByteArrayOutputStream();
baos.write(myString.getBytes("UTF-8"));
byte[] data = baos.toByteArray();
and hold onto it as a byte array.
Of course this is rather inconvenient if you actually want to use them as Strings, but if the point is long-term storage without much access, this would save you a bunch.
You would have to do the reverse to recreate a String.
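Both directions can be written without the stream, as in this small sketch. The class name Utf8Storage is made up, and StandardCharsets.UTF_8 is used in place of the "UTF-8" string so no checked UnsupportedEncodingException is involved.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only.
final class Utf8Storage {
    private Utf8Storage() {}

    /** Compact form for storage: one byte per ASCII character. */
    static byte[] toBytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /** The reverse step: rebuild the String from the stored bytes. */
    static String fromBytes(byte[] data) {
        return new String(data, StandardCharsets.UTF_8);
    }
}
```

Characters outside ASCII take two or more bytes each in UTF-8, so the saving depends on the text.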
String is a core built-in type (though not strictly a primitive). You are unlikely to regain any space by converting unless you use Java's zip library, and even that will not yield the performance benefits you are presumably seeking.
Java newbie here. Are there any helper functions to serialize data in and out of byte arrays? I am writing a Java package that implements a network protocol. So I have to write some typical variables like a version (1byte), sequence Number (long) and binary data (bytes) in a loop. How do I do this in Java? Coming from C I am thinking of creating a byte array of the required size and then since there is no memcpy() I am converting the long into a temporary byte array and then copying it into the actual byte array. It seems so inefficient and also really error prone. Is there a class I could use to marshall and unmarshall parameters to a byte array?
Also, why do all the Socket classes only deal with char[] and not byte[]? A socket by definition has to deal with binary data also. How is this done in Java?
I am sure what I am missing is the Java mindset. Appreciate it if some one can point it to me.
EDIT: I did look at DataOutputStream and DataInputStream, but I can only convert the bytes to a String, not to a byte[], which means information might be lost in the conversion when writing to a socket.
Pav
Have a look at DataInputStream, DataOutputStream, ObjectInputStream and ObjectOutputStream. Check first if the layout of the data is acceptable to you. Also, Serialization.
Sockets neither deal with char[] nor with byte[] but with InputStream and OutputStream which are used to read and write bytes.
If you are sending the data over a socket, then you don't need a temporary byte array at all; you can wrap the socket's OutputStream with DataOutputStream or ObjectOutputStream and just write what you want to write.
There might be an aspect I've missed that means you do actually need temporary byte arrays. If so, look at ByteArrayOutputStream. Also, there's no memcpy(), sure, but there is System.arraycopy.
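A sketch of how the fields from the question (a 1-byte version, a long sequence number, and binary data) might be marshalled and unmarshalled with these classes. The Packet class and the 4-byte length prefix in front of the payload are assumptions made for this example, not part of any real protocol. DataOutputStream writes multi-byte values big-endian (network byte order).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical message type for illustration only.
final class Packet {
    final int version;     // sent as one unsigned byte
    final long sequence;   // sent as 8 bytes, big-endian
    final byte[] payload;  // sent as a 4-byte length followed by the bytes (assumed layout)

    Packet(int version, long sequence, byte[] payload) {
        this.version = version;
        this.sequence = sequence;
        this.payload = payload;
    }

    byte[] toBytes() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeByte(version);
            out.writeLong(sequence);
            out.writeInt(payload.length);
            out.write(payload);
        } catch (IOException e) {
            throw new UncheckedIOException(e); // not expected for an in-memory stream
        }
        return bytes.toByteArray();
    }

    static Packet fromBytes(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int version = in.readUnsignedByte();
            long sequence = in.readLong();
            int length = in.readInt();
            if (length < 0 || length > in.available()) {
                throw new IllegalArgumentException("bad payload length: " + length);
            }
            byte[] payload = new byte[length];
            in.readFully(payload);
            return new Packet(version, sequence, payload);
        } catch (IOException e) {
            throw new UncheckedIOException(e); // e.g. EOFException on truncated input
        }
    }
}
```

When writing to a real socket, the same write calls can go to new DataOutputStream(socket.getOutputStream()) directly, with no temporary array.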
As above, DataInputStream and DataOutputStream are exactly what you are looking for. Re your comment about String, if you're planning to use Java Strings over the wire, you're not designing a network protocol, you're designing a Java protocol. There are readUTF() and writeUTF() if you're sure the other end is Java or if you can code the other end to understand these formats. Or you can send as bytes along with the appropriate charset, or predefine the charset for the entire protocol if that makes sense.
Has anyone ever seen an implementation of java.nio.ByteBuffer that will grow dynamically if a putX() call overruns the capacity?
The reason I want to do it this way is twofold:
I don't know how much space I need ahead of time.
I'd rather not do a new ByteBuffer.allocate() then a bulk put() every time I run out of space.
In order for asynchronous I/O to work, you must have contiguous memory. In C you can attempt to realloc an array, but in Java you must allocate new memory. You could write to a ByteArrayOutputStream, and then convert it to a ByteBuffer at the time you are ready to send it. The downside is that you are copying memory, and one of the keys to efficient I/O is reducing the number of times memory is copied.
A ByteBuffer cannot really work this way, as its design concept is to be just a view of a specific array, which you may also have a direct reference to. It could not try to swap that array for a larger array without weirdness happening.
What you want to use is a DataOutput. The most convenient way is to use the (pre-release) Guava library:
ByteArrayDataOutput out = ByteStreams.newDataOutput();
out.write(someBytes);
out.writeInt(someInt);
// ...
return out.toByteArray();
But you could also create a DataOutputStream from a ByteArrayOutputStream manually, and just deal with the spurious IOExceptions by chaining them into AssertionErrors.
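The manual variant might look something like this sketch. The class name GrowingOutput is invented, and only two write methods are shown. The IOException cannot really happen with an in-memory ByteArrayOutputStream, which is why it is chained into an AssertionError.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;

// Hypothetical wrapper for illustration only.
final class GrowingOutput {
    private final ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    private final DataOutputStream out = new DataOutputStream(bytes);

    GrowingOutput write(byte[] data) {
        try {
            out.write(data);
        } catch (IOException impossible) {
            throw new AssertionError("in-memory stream failed", impossible);
        }
        return this;
    }

    GrowingOutput writeInt(int value) {
        try {
            out.writeInt(value); // big-endian, 4 bytes
        } catch (IOException impossible) {
            throw new AssertionError("in-memory stream failed", impossible);
        }
        return this;
    }

    /** Copy of everything written so far. */
    byte[] toByteArray() {
        return bytes.toByteArray();
    }

    /** Same content as a ByteBuffer, ready for reading from position 0. */
    ByteBuffer toByteBuffer() {
        return ByteBuffer.wrap(bytes.toByteArray());
    }
}
```

The underlying array grows as needed, so the caller never has to guess a capacity; the price is one copy in toByteArray().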
Another option is to use direct memory with a large buffer. This consumes virtual memory but only uses as much physical memory as you use (by page which is typically 4K)
So if you allocate a buffer of 1 MB, it consumes 1 MB of virtual memory, but the OS only gives the application physical pages for the parts it actually uses.
The effect is that you see your application using a lot of virtual memory but a relatively small amount of resident memory.
Have a look at Mina IOBuffer https://mina.apache.org/mina-project/userguide/ch8-iobuffer/ch8-iobuffer.html which is a drop in replacement (it wraps the ByteBuffer)
However, I suggest you allocate more than you need and don't worry about it too much. If you allocate a buffer (especially a direct buffer), the OS gives it virtual memory, but it only uses physical memory when it's actually used. Virtual memory should be very cheap.
It may also be worth having a look at Netty's DynamicChannelBuffer. Things that I find handy are:
slice(int index, int length)
unsigned operations
separated writer and reader indexes
Indeed, auto-extending buffers are so much more intuitive to work with. If you can afford the performance luxury of reallocation, why wouldn't you!?
Netty's ByteBuf gives you exactly this. It's like they've taken java.nio's ByteBuffer and scraped away the edges, making it much easier to use.
Furthermore, it's on Maven in an independent netty-buffer package so you don't need to include the full Netty suite to use.
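If pulling in a library is not an option, the reallocate-and-copy idea can be sketched on top of plain java.nio. The class GrowableBuffer below is hypothetical, covers only a few put methods, and ignores integer overflow for very large capacities.

```java
import java.nio.ByteBuffer;

// Hypothetical wrapper for illustration only; not thread-safe.
final class GrowableBuffer {
    private ByteBuffer buf;

    GrowableBuffer(int initialCapacity) {
        buf = ByteBuffer.allocate(Math.max(1, initialCapacity));
    }

    /** Reallocates (at least doubling) when fewer than needed bytes remain. */
    private void ensureRemaining(int needed) {
        if (buf.remaining() >= needed) {
            return;
        }
        int newCapacity = Math.max(buf.capacity() * 2, buf.position() + needed);
        ByteBuffer bigger = ByteBuffer.allocate(newCapacity);
        buf.flip();      // expose the bytes written so far
        bigger.put(buf); // the bulk copy the question hoped to avoid
        buf = bigger;
    }

    GrowableBuffer put(byte[] src) {
        ensureRemaining(src.length);
        buf.put(src);
        return this;
    }

    GrowableBuffer putInt(int value) {
        ensureRemaining(4);
        buf.putInt(value);
        return this;
    }

    GrowableBuffer putLong(long value) {
        ensureRemaining(8);
        buf.putLong(value);
        return this;
    }

    int capacity() {
        return buf.capacity();
    }

    /** Read-only view of the bytes written so far, positioned at 0. */
    ByteBuffer view() {
        ByteBuffer v = buf.duplicate();
        v.flip();
        return v.asReadOnlyBuffer();
    }
}
```

Because the underlying ByteBuffer is swapped on growth, callers must not hold on to an old view() across writes, which is the "weirdness" an in-place growing ByteBuffer would run into.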
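I'd suggest using an input stream to receive data from a file (with a separate thread if you need non-blocking), then reading the bytes into a ByteArrayOutputStream, which gives you the ability to get them as a byte array. Here's a simple example without adding too many workarounds.
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

try (InputStream inputStream = Files.newInputStream(
        Paths.get("filepath"), StandardOpenOption.READ)) {
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    int byteRead;
    // read() returns -1 at end of stream; test before writing so the sentinel is never stored
    while ((byteRead = inputStream.read()) != -1) {
        baos.write(byteRead);
    }
    ByteBuffer byteBuffer = ByteBuffer.allocate(baos.size());
    byteBuffer.put(baos.toByteArray());
    byteBuffer.flip(); // switch the buffer from writing to reading
    //. . . . use the buffer however you want
} catch (InvalidPathException pathException) {
    System.out.println("Path exception: " + pathException);
} catch (IOException exception) {
    System.out.println("I/O exception: " + exception);
}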
Another solution for this would be to allocate more than enough memory, fill the ByteBuffer and then only return the occupied byte array:
Initialize a big ByteBuffer:
ByteBuffer byteBuffer = ByteBuffer.allocate(1000);
After you're done putting things into it:
private static byte[] getOccupiedArray(ByteBuffer byteBuffer)
{
int position = byteBuffer.position();
return Arrays.copyOfRange(byteBuffer.array(), 0, position);
}
However, using a org.apache.commons.io.output.ByteArrayOutputStream from the start would probably be the best solution.
Netty ByteBuf is pretty good on that.
A Vector allows for continuous growth
Vector<Byte> bFOO = new Vector<Byte>();
bFOO.add((byte) 0x00);
To serialize something you need an object as input. What you can do is put your objects in a collection, iterate over it, and write them into a byte array. Then call ByteBuffer.allocate() with that array's length. That is what I did and it worked for me.