In Java, is there a library that will, given a byte sequence (preferably expressed as hex), translate it to another byte sequence while copying an InputStream to an OutputStream? For example:
InputStream input = new FileInputStream(new File(...));
OutputStream output = new FileOutputStream(new File(...));
String fromHex = "C3BEAB";
String toHex = "EAF6";
MyMagicLibrary.translate(fromHex, toHex, input, output);
So if the input file (in hex) looked like
00 00 12 18 33 C3 BE AB 00 23 C3 BE AB 00
after translation, the result would be
00 00 12 18 33 EA F6 00 23 EA F6 00
Once I did something like this (for trivially patching exe-files) using regexes. I read the whole input into a byte[] and converted into String using latin1, then did the substitution and converted back. It wasn't efficient but it didn't matter at all. You don't need regexes, simple String.replace would do.
But in your case it can be done quite simply and very efficiently:
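int count = 0; // number of pattern bytes matched so far
while (true) {
    int n = input.read();
    if (n == (fromAsByteArray[count] & 255)) {
        ++count;
        if (count == fromAsByteArray.length) { // match found
            output.write(toAsByteArray);
            count = 0;
        }
    } else { // mismatch (also reached at end of stream, where n == -1)
        output.write(fromAsByteArray, 0, count); // flush the bytes matched so far
        count = 0;
        if (n == -1) break;
        // Note: n is not re-tested against the start of the pattern, so a match
        // that begins on the mismatching byte (e.g. C3 C3 BE AB) is missed.
        output.write(n);
    }
}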
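If the pattern can overlap with itself (for example the input C3 C3 BE AB), one way around the limitation of the loop above is to keep the last few bytes in a small window and only emit a byte once it can no longer be part of a match. This is only a sketch: the class name StreamReplace is made up for illustration, and it does O(pattern length) work per input byte, which should be fine for short patterns.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Arrays;

class StreamReplace {
    /** Copies in to out, replacing every occurrence of from with to. */
    static void translate(byte[] from, byte[] to, InputStream in, OutputStream out)
            throws IOException {
        if (from.length == 0) {
            throw new IllegalArgumentException("pattern must not be empty");
        }
        byte[] window = new byte[from.length]; // bytes read but not yet written
        int size = 0;
        int n;
        while ((n = in.read()) != -1) {
            window[size++] = (byte) n;
            if (size < from.length) {
                continue; // window not full yet, nothing to compare
            }
            if (Arrays.equals(window, from)) {
                out.write(to); // full match: emit the replacement
                size = 0;
            } else {
                out.write(window[0]); // oldest byte cannot start a match
                System.arraycopy(window, 1, window, 0, size - 1);
                size--;
            }
        }
        out.write(window, 0, size); // flush whatever is left at end of stream
    }
}
```

Wrapping the streams in BufferedInputStream and BufferedOutputStream keeps the byte-at-a-time reads cheap.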
If you mean that you want a class which translates to and from hex,
here are two methods I usually use. You can put them inside a class and reuse it anywhere you want:
public static String toHex(byte buf[]) {
StringBuffer strbuf = new StringBuffer(buf.length * 2);
int i;
for (i = 0; i < buf.length; i++) {
if (((int) buf[i] & 0xff) < 0x10) {
strbuf.append("0");
}
strbuf.append(Long.toString((int) buf[i] & 0xff, 16));
}
return strbuf.toString();
}
public static byte[] fromHexString(String s) {
int len = s.length();
byte[] data = new byte[len / 2];
for (int i = 0; i < len; i += 2) {
data[i / 2] = (byte) ((Character.digit(s.charAt(i), 16) << 4)
+ Character.digit(s.charAt(i + 1), 16));
}
return data;
}
Actually I don't understand each line in the code, but I usually reuse them.
Since your input can have spaces, you first need to scrub it to remove them. After reading a pair of characters, parse it with Integer.parseInt(twoCharString, 16) and cast the result to byte (Byte.parseByte(twoCharString, 16) throws a NumberFormatException for values above 7F), then use String.format to convert back to a String.
Doing it byte by byte would most likely be VERY inefficient, though easy to test. Once you get the result you want, you can tweak it by reading and parsing a whole buffer and writing out more than one resulting byte at a time, maybe 16 "byte" characters per line for formatting. It is all up to you at that point.
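For the parsing step, a minimal sketch could look like this. The class name HexParse is made up for illustration; it parses each pair with Integer.parseInt because values above 7F do not fit in a signed byte.

```java
class HexParse {
    /** Parses hex text such as "00 C3 be ab" into bytes; whitespace is ignored. */
    static byte[] parse(String text) {
        String hex = text.replaceAll("\\s+", "");
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("odd number of hex digits");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            // parseInt accepts 00..FF; the cast keeps the low 8 bits
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }
}
```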
One way to implement this is to use IOUtils (from Apache Commons IO) and the String.replace method.
public static void translate(byte[] fromHex, byte[] toHex, InputStream input, OutputStream output) throws IOException {
IOUtils.write(translate(fromHex, toHex, IOUtils.toByteArray(input)), output);
}
public static byte[] translate(byte[] fromHex, byte[] toHex, byte[] inBytes) throws UnsupportedEncodingException {
String inputText = new String(inBytes, "ISO-8859-1");
String outputText = inputText.replace(new String(fromHex, "ISO-8859-1"), new String(toHex, "ISO-8859-1"));
return outputText.getBytes("ISO-8859-1");
}
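The String trick only works because ISO-8859-1 maps every byte value 0 to 255 to exactly one char and back, so nothing is lost in the round trip. A quick way to convince yourself (the class name Latin1RoundTrip is just for illustration):

```java
import java.nio.charset.Charset;
import java.util.Arrays;

class Latin1RoundTrip {
    /** Returns true if all 256 byte values survive bytes -> String -> bytes. */
    static boolean survives(Charset charset) {
        byte[] all = new byte[256];
        for (int i = 0; i < all.length; i++) {
            all[i] = (byte) i;
        }
        byte[] back = new String(all, charset).getBytes(charset);
        return Arrays.equals(all, back);
    }
}
```

With StandardCharsets.ISO_8859_1 this returns true. With StandardCharsets.UTF_8 it returns false, because byte sequences that are not valid UTF-8 are replaced during decoding, so do not swap in another charset here.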
I have a problem with CharsetDecoder class.
First example of code (which works):
final CharsetDecoder dec = Charset.forName("UTF-8").newDecoder();
final ByteBuffer b = ByteBuffer.allocate(3);
final byte[] tab = new byte[]{(byte)-30, (byte)-126, (byte)-84}; //char €
for (int i=0; i<tab.length; i++){
b.put(tab, i, 1);
}
try {
b.flip();
System.out.println("a" + dec.decode(b).toString() + "a");
} catch (CharacterCodingException e1) {
e1.printStackTrace();
}
The result is a€a
But when I execute this code:
final CharsetDecoder dec = Charset.forName("UTF-8").newDecoder();
final CharBuffer chars = CharBuffer.allocate(3);
final byte[] tab = new byte[]{(byte)-30, (byte)-126, (byte)-84}; //char €
for (int i=0; i<tab.length; i++){
ByteBuffer buffer = ByteBuffer.wrap(tab, i, 1);
dec.decode(buffer, chars, i == 2);
}
dec.flush(chars);
System.out.println("a" + chars.toString() + "a");
The result is a
Why is the result not the same?
How should I use the method decode(ByteBuffer, CharBuffer, endOfInput) of the CharsetDecoder class to get the result a€a?
-- EDIT --
So, based on Jesper's code, I wrote the following. It's not perfect, but it works with step = 1, 2 and 3:
final CharsetDecoder dec = Charset.forName("UTF-8").newDecoder();
final CharBuffer chars = CharBuffer.allocate(6);
final byte[] tab = new byte[]{(byte)97, (byte)-30, (byte)-126, (byte)-84, (byte)97, (byte)97}; //char €
final ByteBuffer buffer = ByteBuffer.allocate(10);
final int step = 3;
for (int i = 0; i < tab.length; i++) {
// Add the next byte to the buffer
buffer.put(tab, i, step);
i+=step-1;
// Remember the current position
final int pos = buffer.position();
int l=chars.position();
// Try to decode
buffer.flip();
final CoderResult result = dec.decode(buffer, chars, i >= tab.length -1);
System.out.println(result);
if (result.isUnderflow() && chars.position() == l) {
// Underflow, prepare the buffer for more writing
buffer.position(pos);
}else{
if (buffer.position() == buffer.limit()){
//ByteBuffer decoded
buffer.clear();
buffer.position(0);
}else{
//a part of ByteBuffer is decoded. We keep only bytes which are not decoded
final byte[] b = buffer.array();
final int f = buffer.position();
final int g = buffer.limit() - buffer.position();
buffer.clear();
buffer.position(0);
buffer.put(b, f, g);
}
}
buffer.limit(buffer.capacity());
}
dec.flush(chars);
chars.flip();
System.out.println(chars.toString());
The method decode(ByteBuffer, CharBuffer, boolean) returns a result, but you are ignoring it. If you print the result in your second code fragment:
for (int i = 0; i < tab.length; i++) {
ByteBuffer buffer = ByteBuffer.wrap(tab, i, 1);
System.out.println(dec.decode(buffer, chars, i == 2));
}
you'll see this output:
UNDERFLOW
MALFORMED[1]
MALFORMED[1]
a a
Apparently it does not work correctly if you start decoding in the middle of a character. The decoder expects that the first thing it reads is the start of a valid UTF-8 sequence.
edit - When the decoder reports UNDERFLOW, it expects you to add more data to the input buffer and then call decode() again, but you must re-offer it the data from the start of the UTF-8 sequence that you are trying to decode. You can't continue in the middle of a UTF-8 sequence.
Here is a version that works, adding one byte from tab in every iteration of the loop:
final CharsetDecoder dec = Charset.forName("UTF-8").newDecoder();
final CharBuffer chars = CharBuffer.allocate(3);
final byte[] tab = new byte[]{(byte) -30, (byte) -126, (byte) -84}; //char €
final ByteBuffer buffer = ByteBuffer.allocate(10);
for (int i = 0; i < tab.length; i++) {
// Add the next byte to the buffer
buffer.put(tab[i]);
// Remember the current position
final int pos = buffer.position();
// Try to decode
buffer.flip();
final CoderResult result = dec.decode(buffer, chars, i == 2);
System.out.println(result);
if (result.isUnderflow()) {
// Underflow, prepare the buffer for more writing
buffer.limit(buffer.capacity());
buffer.position(pos);
}
}
dec.flush(chars);
chars.flip();
System.out.println("a" + chars.toString() + "a");
The decoder does not internally cache the data from partial characters, but this does not mean that you have to do complicated things to figure out what data to re-feed the decoder. You gave it a clear way to represent what data it actually consumed, i.e. the input ByteBuffer and its position. In the second example, by giving it a new ByteBuffer every time, the OP failed to pass the decoder back what it reported it had not yet consumed.
The standard pattern for using NIO Buffers is input, flip, output, compact, loop. Short of optimization (which may be premature), there is no reason to re-implement compact yourself. You might just get it wrong, like @Jesper and @lecogiteur did (if more than a single character was ever presented). You should NOT be resetting to the position from before the decode call.
The second example should have read something like:
final CharsetDecoder dec = Charset.forName("UTF-8").newDecoder();
final CharBuffer chars = CharBuffer.allocate(3);
final byte[] tab = new byte[]{(byte)-30, (byte)-126, (byte)-84}; //char €
final ByteBuffer buffer = ByteBuffer.wrap(new byte[3]);
for (int i=0; i<tab.length; i++){
    buffer.put(tab, i, 1); // In actual usage some type of IO read/transfer would occur here
    buffer.flip();
    dec.decode(buffer, chars, i == 2);
    buffer.compact(); // keep the bytes the decoder has not consumed yet
}
dec.flush(chars);
chars.flip(); // switch chars from writing to reading before printing
System.out.println("a" + chars.toString() + "a");
NOTE: The above does not check the return value to detect malformed input or other error handling for running safely on arbitrary input/IO conditions.
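For completeness, here is a sketch of the same input, flip, decode, compact loop with the result checked and an arbitrary chunk size. The class and method names (IncrementalDecode, decodeInChunks) are made up for illustration, and the buffer sizes rely on the assumption that UTF-8 never produces more chars than input bytes and never leaves more than a few bytes of an incomplete sequence behind.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

class IncrementalDecode {
    /** Decodes UTF-8 data, feeding the decoder at most step bytes at a time. */
    static String decodeInChunks(byte[] data, int step) throws CharacterCodingException {
        if (step <= 0) {
            throw new IllegalArgumentException("step must be positive");
        }
        if (data.length == 0) {
            return "";
        }
        CharsetDecoder dec = StandardCharsets.UTF_8.newDecoder();
        // Room for one chunk plus the leftover bytes of a partial sequence
        ByteBuffer in = ByteBuffer.allocate(step + 8);
        CharBuffer out = CharBuffer.allocate(data.length);
        for (int off = 0; off < data.length; off += step) {
            int len = Math.min(step, data.length - off);
            in.put(data, off, len);              // input
            in.flip();                           // flip
            boolean last = off + len >= data.length;
            CoderResult result = dec.decode(in, out, last); // output
            if (result.isError()) {
                result.throwException();         // malformed or unmappable input
            }
            in.compact();                        // keep unconsumed bytes for the next round
        }
        dec.flush(out);
        out.flip();
        return out.toString();
    }
}
```

Calling decodeInChunks with the three bytes of € and step = 1 goes through two UNDERFLOW rounds before the character comes out on the third.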
This may have been asked before, but I have not found a satisfactory answer yet, in particular because this conversion has always been done in C or C++.
How do you convert a hexadecimal file (200 MB) into its UINT32 big-endian representation in Java?
This is an example of what I am trying to achieve:
54 00 00 00 -> 84
55 F1 2E 04 -> 70185301
A2 3F 32 01 -> 20070306
and so on
EDIT
File fileInputString = new File(inputFileField.getText());
FileInputStream fin = new FileInputStream(fileInputString);
FileOutputStream out = new FileOutputStream(fileDirectoryFolder.getText() +"/"+ fileInputString.getName());
byte[] fileContent = new byte[(int)fileInputString.length()];
fin.read(fileContent);
System.out.println("File Length" + fileContent.length);
for(int i = 0; i < fileContent.length; i++){
Byte b = fileContent[i]; // Boxing conversion converts `byte` to `Byte`
int value = b.intValue();
out.write(value);
}
close();
System.out.println("Done");
EDIT 2
File fileInputString = new File(inputFileField.getText());
FileInputStream fin = new FileInputStream(fileInputString);
FileOutputStream out = new FileOutputStream(fileDirectoryFolder.getText() +"/"+ fileInputString.getName());
ByteArrayOutputStream bos = new ByteArrayOutputStream();
byte[] fileContent = new byte[(int)fileInputString.length()];
System.out.println("File Length" + fileContent.length);
int bytesRead;
while (( bytesRead = fin.read(fileContent)) != -1) {
ByteBuffer.wrap(fileContent).order(ByteOrder.LITTLE_ENDIAN).getLong();
bos.write(fileContent, 0, bytesRead);
}
out.write(bos.toByteArray());
System.out.println("Done");
EDIT 3
try (DataOutputStream out = new DataOutputStream(new FileOutputStream(output));
     DataInputStream in = new DataInputStream(new FileInputStream(input))) {
    int count = 0;
    while (count < input.length() - 4) {
        in.readFully(buffer, 4, 4);
        String s=Long.toString(ByteBuffer.wrap(buffer).order(ByteOrder.LITTLE_ENDIAN).getLong());
        out.writeBytes( s + " ");
        count += 4;
    }
}
Thanks
The following code should hopefully suffice. It uses long values to ensure we can fully represent the range of positive values that four bytes can represent.
Note: this code assumes the hex input is four bytes. You may want to add some more checks and measures in production code.
private static long toLong(String hex) {
hex = hex.replace(" ", "") + "00000000";
byte[] data = DatatypeConverter.parseHexBinary(hex);
return ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN).getLong();
}
public static void main(String[] args) throws Exception {
System.out.println(toLong("54 00 00 00"));
System.out.println(toLong("55 F1 2E 04"));
System.out.println(toLong("A2 3F 32 01"));
System.out.println(toLong("FF FF FF FF"));
}
Output:
84
70185301
20070306
4294967295
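Based on your recent edits, I propose some code such as the following. Note that it assumes your input is a multiple of four bytes in length. Any left-over bytes are ignored:
import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

File input = new File("whatever");
byte[] buffer = new byte[8]; // bytes 4..7 stay zero, so the long is never negative
List<Long> result = new ArrayList<>();
try (DataInputStream in = new DataInputStream(new FileInputStream(input))) {
    long count = 0;
    // Note: any trailing bytes are ignored
    while (count + 4 <= input.length()) {
        // Little endian puts the least significant byte first, so the four
        // data bytes go at the start of the buffer
        in.readFully(buffer, 0, 4);
        result.add(ByteBuffer.wrap(buffer)
                .order(ByteOrder.LITTLE_ENDIAN).getLong());
        count += 4;
    }
}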
You need to switch the byte order within the 4 bytes that form an int. The conversion is symmetric, so when the input is little endian, output becomes big endian and vice versa.
Big Endian: 12 34 56 78
Little Endian: 78 56 34 12
So if you were doing that while processing an InputStream, read four bytes, and write them to output in reverse order.
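A minimal sketch of that idea, using DataInputStream.readInt (which reads big endian) together with Integer.reverseBytes. The class name EndianSwap and both method names are made up for illustration, and trailing bytes that do not fill a complete 4-byte group are dropped.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class EndianSwap {
    /** Copies in to out, reversing the byte order of every 4-byte group. */
    static void swap32(InputStream in, OutputStream out) throws IOException {
        DataInputStream din = new DataInputStream(in);
        DataOutputStream dout = new DataOutputStream(out);
        try {
            while (true) {
                dout.writeInt(Integer.reverseBytes(din.readInt()));
            }
        } catch (EOFException end) {
            // fewer than four bytes were left: stop, ignoring the remainder
        }
        dout.flush();
    }

    /** Interprets four bytes as an unsigned little-endian 32-bit value. */
    static long uint32LittleEndian(byte b0, byte b1, byte b2, byte b3) {
        return (b0 & 0xFFL) | (b1 & 0xFFL) << 8 | (b2 & 0xFFL) << 16 | (b3 & 0xFFL) << 24;
    }
}
```

For a 200 MB file, wrap the file streams in BufferedInputStream and BufferedOutputStream first, otherwise every readInt hits the underlying stream four times.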
while (!bStop) {
    byte[] buffer = new byte[256];
    if (inputStream.available() > 0) {
        inputStream.read(buffer);
        int i = 0;
        for (i = 0; i < buffer.length && buffer[i] != 0; i++) {
        }
        final String strInput = new String(buffer, 0, i);
        System.out.println(strInput);
    }
}
The InputStream data arrives as encrypted bytes. When I print the data I get funny characters. How can I convert the InputStream directly to hexadecimal, in the form 01 2A 03 AA?
Please help.
Try it like this (ByteStreams is from Guava):
byte[] array = ByteStreams.toByteArray(inputStream);
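Once you have the bytes, each one still has to be formatted as two hex digits. A sketch with the JDK only, reading the stream in chunks (the class name HexDump is made up for illustration):

```java
import java.io.IOException;
import java.io.InputStream;

class HexDump {
    /** Reads the whole stream and returns its bytes as "01 2A 03 AA". */
    static String toHex(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
            for (int i = 0; i < n; i++) {
                if (sb.length() > 0) {
                    sb.append(' ');
                }
                // & 0xFF turns the signed byte into 0..255 before formatting
                sb.append(String.format("%02X", buf[i] & 0xFF));
            }
        }
        return sb.toString();
    }
}
```

Unlike the loop in the question, this uses the count returned by read instead of scanning for a zero byte, so zero bytes in encrypted data do not cut the output short.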
I am working on the client side of a Socket connection.
I have gone through the discussion "Socket pass value as Hex". I need to send a hex value followed by a String (for example 0x01 followed by "Ravi"); the server expects hex values like 1 72 61 76 69. I tried converting the String Ravi to a hex String, prepending "1", and converting the result to a byte array, but I get a StringIndexOutOfBoundsException.
update:
public static byte[] hexStringToByteArray(String s) {
int len = s.length();
byte[] data = new byte[len / 2];
for (int i = 0; i < len; i += 2) {
data[i / 2] = (byte) ((Character.digit(s.charAt(i), 16) << 2)
+ Character.digit(s.charAt(i+1), 16));
}
return data;
}
public String toHex(String arg) {
return String.format("%x", new BigInteger(arg.getBytes()));
}
I used these two methods to convert the 1Ravi string to a byte array, but I get the exception in the hexStringToByteArray method.
Try this:
Socket sock = new Socket("host", port);
OutputStream out = sock.getOutputStream();
out.write(0x01); // the leading hex value, sent as a single byte
String s = "ravi";
byte[] bytes = s.getBytes("UTF-8"); // 72 61 76 69
out.write(bytes);
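There is no need to go through a hex String at all: "1" plus the hex of the text has an odd number of digits, which is why a two-digits-at-a-time parser runs off the end. If you prefer to build the whole message first and write it in one call, a sketch could look like this (the class name Frame and the method build are made up for illustration):

```java
import java.nio.charset.StandardCharsets;

class Frame {
    /** Returns one leading type byte followed by the UTF-8 bytes of text. */
    static byte[] build(int type, String text) {
        byte[] payload = text.getBytes(StandardCharsets.UTF_8);
        byte[] frame = new byte[payload.length + 1];
        frame[0] = (byte) type;
        System.arraycopy(payload, 0, frame, 1, payload.length);
        return frame;
    }
}
```

Then out.write(Frame.build(0x01, "ravi")) sends the bytes 01 72 61 76 69.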
I have been experimenting with using UUIDs as database keys. I want to take up as few bytes as possible, while still keeping the UUID representation human readable.
I think that I have gotten it down to 22 bytes using base64 and removing the trailing "==", which seems unnecessary to store for my purposes. Are there any flaws with this approach?
Basically, my test code does a series of conversions to get the UUID down to a 22-byte String, then converts it back into a UUID.
import java.io.IOException;
import java.util.UUID;
public class UUIDTest {
public static void main(String[] args){
UUID uuid = UUID.randomUUID();
System.out.println("UUID String: " + uuid.toString());
System.out.println("Number of Bytes: " + uuid.toString().getBytes().length);
System.out.println();
byte[] uuidArr = asByteArray(uuid);
System.out.print("UUID Byte Array: ");
for(byte b: uuidArr){
System.out.print(b +" ");
}
System.out.println();
System.out.println("Number of Bytes: " + uuidArr.length);
System.out.println();
try {
// Convert a byte array to base64 string
String s = new sun.misc.BASE64Encoder().encode(uuidArr);
System.out.println("UUID Base64 String: " +s);
System.out.println("Number of Bytes: " + s.getBytes().length);
System.out.println();
String trimmed = s.split("=")[0];
System.out.println("UUID Base64 String Trimmed: " +trimmed);
System.out.println("Number of Bytes: " + trimmed.getBytes().length);
System.out.println();
// Convert base64 string to a byte array
byte[] backArr = new sun.misc.BASE64Decoder().decodeBuffer(trimmed);
System.out.print("Back to UUID Byte Array: ");
for(byte b: backArr){
System.out.print(b +" ");
}
System.out.println();
System.out.println("Number of Bytes: " + backArr.length);
byte[] fixedArr = new byte[16];
for(int i= 0; i<16; i++){
fixedArr[i] = backArr[i];
}
System.out.println();
System.out.print("Fixed UUID Byte Array: ");
for(byte b: fixedArr){
System.out.print(b +" ");
}
System.out.println();
System.out.println("Number of Bytes: " + fixedArr.length);
System.out.println();
UUID newUUID = toUUID(fixedArr);
System.out.println("UUID String: " + newUUID.toString());
System.out.println("Number of Bytes: " + newUUID.toString().getBytes().length);
System.out.println();
System.out.println("Equal to Start UUID? "+newUUID.equals(uuid));
if(!newUUID.equals(uuid)){
System.exit(0);
}
} catch (IOException e) {
}
}
public static byte[] asByteArray(UUID uuid) {
long msb = uuid.getMostSignificantBits();
long lsb = uuid.getLeastSignificantBits();
byte[] buffer = new byte[16];
for (int i = 0; i < 8; i++) {
buffer[i] = (byte) (msb >>> 8 * (7 - i));
}
for (int i = 8; i < 16; i++) {
buffer[i] = (byte) (lsb >>> 8 * (7 - i));
}
return buffer;
}
public static UUID toUUID(byte[] byteArray) {
long msb = 0;
long lsb = 0;
for (int i = 0; i < 8; i++)
msb = (msb << 8) | (byteArray[i] & 0xff);
for (int i = 8; i < 16; i++)
lsb = (lsb << 8) | (byteArray[i] & 0xff);
UUID result = new UUID(msb, lsb);
return result;
}
}
output:
UUID String: cdaed56d-8712-414d-b346-01905d0026fe
Number of Bytes: 36
UUID Byte Array: -51 -82 -43 109 -121 18 65 77 -77 70 1 -112 93 0 38 -2
Number of Bytes: 16
UUID Base64 String: za7VbYcSQU2zRgGQXQAm/g==
Number of Bytes: 24
UUID Base64 String Trimmed: za7VbYcSQU2zRgGQXQAm/g
Number of Bytes: 22
Back to UUID Byte Array: -51 -82 -43 109 -121 18 65 77 -77 70 1 -112 93 0 38 -2 0 38
Number of Bytes: 18
Fixed UUID Byte Array: -51 -82 -43 109 -121 18 65 77 -77 70 1 -112 93 0 38 -2
Number of Bytes: 16
UUID String: cdaed56d-8712-414d-b346-01905d0026fe
Number of Bytes: 36
Equal to Start UUID? true
I was also trying to do something similar. I am working with a Java application which uses UUIDs of the form 6fcb514b-b878-4c9d-95b7-8dc3a7ce6fd8 (which are generated with the standard UUID lib in Java). In my case I needed to be able to get this UUID down to 30 characters or less. I used Base64 and these are my convenience functions. Hopefully they will be helpful for someone as the solution was not obvious to me right away.
Usage:
String uuid_str = "6fcb514b-b878-4c9d-95b7-8dc3a7ce6fd8";
String uuid_as_64 = uuidToBase64(uuid_str);
System.out.println("as base64: "+uuid_as_64);
System.out.println("as uuid: "+uuidFromBase64(uuid_as_64));
Output:
as base64: b8tRS7h4TJ2Vt43Dp85v2A
as uuid: 6fcb514b-b878-4c9d-95b7-8dc3a7ce6fd8
Functions:
import org.apache.commons.codec.binary.Base64;
private static String uuidToBase64(String str) {
Base64 base64 = new Base64();
UUID uuid = UUID.fromString(str);
ByteBuffer bb = ByteBuffer.wrap(new byte[16]);
bb.putLong(uuid.getMostSignificantBits());
bb.putLong(uuid.getLeastSignificantBits());
return base64.encodeBase64URLSafeString(bb.array());
}
private static String uuidFromBase64(String str) {
Base64 base64 = new Base64();
byte[] bytes = base64.decodeBase64(str);
ByteBuffer bb = ByteBuffer.wrap(bytes);
UUID uuid = new UUID(bb.getLong(), bb.getLong());
return uuid.toString();
}
You can safely drop the padding "==" in this application. If you were to decode the base-64 text back to bytes, some libraries would expect it to be there, but since you are just using the resulting string as a key, it's not a problem.
I'd use Base-64 because its encoding characters can be URL-safe, and it looks less like gibberish. But there's also Base-85. It uses more symbols and codes 4 bytes as 5 characters, so you could get your text down to 20 characters.
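The lengths follow from simple arithmetic on the 16 bytes (128 bits) of a UUID. A small sketch, with made-up names, in case you want to check other encodings:

```java
class EncodedLength {
    /** Characters needed for nBytes when each character carries bitsPerChar bits. */
    static int charsNeeded(int nBytes, int bitsPerChar) {
        return (nBytes * 8 + bitsPerChar - 1) / bitsPerChar; // ceiling division
    }

    /** Base-85 maps each group of 4 bytes to 5 characters (assumes nBytes % 4 == 0). */
    static int base85Chars(int nBytes) {
        return nBytes / 4 * 5;
    }
}
```

For 16 bytes that gives 32 hex digits (36 with the four hyphens), 22 characters for unpadded base-64, and 20 for base-85.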
Here's my code, it uses org.apache.commons.codec.binary.Base64 to produce url-safe unique strings that are 22 characters in length (and that have the same uniqueness as UUID).
private static Base64 BASE64 = new Base64(true);
public static String generateKey(){
UUID uuid = UUID.randomUUID();
byte[] uuidArray = KeyGenerator.toByteArray(uuid);
byte[] encodedArray = BASE64.encode(uuidArray);
String returnValue = new String(encodedArray);
returnValue = StringUtils.removeEnd(returnValue, "\r\n");
return returnValue;
}
public static UUID convertKey(String key){
UUID returnValue = null;
if(StringUtils.isNotBlank(key)){
// Convert base64 string to a byte array
byte[] decodedArray = BASE64.decode(key);
returnValue = KeyGenerator.fromByteArray(decodedArray);
}
return returnValue;
}
private static byte[] toByteArray(UUID uuid) {
byte[] byteArray = new byte[(Long.SIZE / Byte.SIZE) * 2];
ByteBuffer buffer = ByteBuffer.wrap(byteArray);
LongBuffer longBuffer = buffer.asLongBuffer();
longBuffer.put(new long[] { uuid.getMostSignificantBits(), uuid.getLeastSignificantBits() });
return byteArray;
}
private static UUID fromByteArray(byte[] bytes) {
ByteBuffer buffer = ByteBuffer.wrap(bytes);
LongBuffer longBuffer = buffer.asLongBuffer();
return new UUID(longBuffer.get(0), longBuffer.get(1));
}
I have an application where I'm doing almost exactly this. 22 char encoded UUID. It works fine. However, the main reason I'm doing it this way is that the IDs are exposed in the web app's URIs, and 36 characters is really quite big for something that appears in a URI. 22 characters is still kinda long, but we make do.
Here's the Ruby code for this:
# Make an array of 64 URL-safe characters
CHARS64 = ("a".."z").to_a + ("A".."Z").to_a + ("0".."9").to_a + ["-", "_"]
# Return a 22 byte URL-safe string, encoded six bits at a time using 64 characters
def to_s22
integer = self.to_i # UUID as a raw integer
rval = ""
22.times do
c = (integer & 0x3F)
rval += CHARS64[c]
integer = integer >> 6
end
return rval.reverse
end
It's not exactly the same as base64 encoding because base64 uses characters that would have to be escaped if they appeared in a URI path component. The Java implementation is likely to be quite different since you're more likely to have an array of raw bytes instead of a really big integer.
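For what it's worth, a fairly direct Java port is possible with BigInteger. This is a sketch only: the class name Uuid22 is made up, and it assumes the "raw integer" is the 128-bit value with the most significant bits first, which may not match what the Ruby to_i above returns.

```java
import java.math.BigInteger;
import java.nio.ByteBuffer;
import java.util.UUID;

class Uuid22 {
    // Same 64 URL-safe characters, in the same order, as CHARS64 in the Ruby code
    private static final String CHARS64 =
            "abcdefghijklmnopqrstuvwxyz" + "ABCDEFGHIJKLMNOPQRSTUVWXYZ" + "0123456789-_";
    private static final BigInteger MASK = BigInteger.valueOf(0x3F);

    static String encode(UUID uuid) {
        byte[] bytes = ByteBuffer.allocate(16)
                .putLong(uuid.getMostSignificantBits())
                .putLong(uuid.getLeastSignificantBits())
                .array();
        BigInteger value = new BigInteger(1, bytes); // unsigned 128-bit value
        StringBuilder sb = new StringBuilder(22);
        for (int i = 0; i < 22; i++) {
            sb.append(CHARS64.charAt(value.and(MASK).intValue())); // low six bits
            value = value.shiftRight(6);
        }
        return sb.reverse().toString();
    }

    static UUID decode(String text) {
        if (text.length() != 22) {
            throw new IllegalArgumentException("expected 22 characters");
        }
        BigInteger value = BigInteger.ZERO;
        for (int i = 0; i < text.length(); i++) {
            int digit = CHARS64.indexOf(text.charAt(i));
            if (digit < 0) {
                throw new IllegalArgumentException("invalid character: " + text.charAt(i));
            }
            value = value.shiftLeft(6).or(BigInteger.valueOf(digit));
        }
        // longValue() keeps the low 64 bits of each half
        return new UUID(value.shiftRight(64).longValue(), value.longValue());
    }
}
```

Because 22 characters carry 132 bits, the first character only ever uses two of its six bits.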
Here is an example with java.util.Base64 introduced in JDK8:
import java.nio.ByteBuffer;
import java.util.Base64;
import java.util.Base64.Encoder;
import java.util.UUID;
public class Uuid64 {
private static final Encoder BASE64_URL_ENCODER = Base64.getUrlEncoder().withoutPadding();
public static void main(String[] args) {
// String uuidStr = UUID.randomUUID().toString();
String uuidStr = "eb55c9cc-1fc1-43da-9adb-d9c66bb259ad";
String uuid64 = uuidHexToUuid64(uuidStr);
System.out.println(uuid64); //=> 61XJzB_BQ9qa29nGa7JZrQ
System.out.println(uuid64.length()); //=> 22
String uuidHex = uuid64ToUuidHex(uuid64);
System.out.println(uuidHex); //=> eb55c9cc-1fc1-43da-9adb-d9c66bb259ad
}
public static String uuidHexToUuid64(String uuidStr) {
UUID uuid = UUID.fromString(uuidStr);
byte[] bytes = uuidToBytes(uuid);
return BASE64_URL_ENCODER.encodeToString(bytes);
}
public static String uuid64ToUuidHex(String uuid64) {
byte[] decoded = Base64.getUrlDecoder().decode(uuid64);
UUID uuid = uuidFromBytes(decoded);
return uuid.toString();
}
public static byte[] uuidToBytes(UUID uuid) {
ByteBuffer bb = ByteBuffer.wrap(new byte[16]);
bb.putLong(uuid.getMostSignificantBits());
bb.putLong(uuid.getLeastSignificantBits());
return bb.array();
}
public static UUID uuidFromBytes(byte[] decoded) {
ByteBuffer bb = ByteBuffer.wrap(decoded);
long mostSigBits = bb.getLong();
long leastSigBits = bb.getLong();
return new UUID(mostSigBits, leastSigBits);
}
}
The UUID encoded in Base64 is URL safe and without padding.
You don't say what DBMS you're using, but it seems that RAW would be the best approach if you're concerned about saving space. You just need to remember to convert for all queries, or you'll risk a huge performance drop.
But I have to ask: are bytes really that expensive where you live?
This is not exactly what you asked for (it isn't Base64), but worth looking at, because of added flexibility: there is a Clojure library that implements a compact 26-char URL-safe representation of UUIDs (https://github.com/tonsky/compact-uuids).
Some highlights:
Produces strings that are 30% smaller (26 chars vs traditional 36 chars)
Supports full UUID range (128 bits)
Encoding-safe (uses only readable characters from ASCII)
URL/file-name safe
Lowercase/uppercase safe
Avoids ambiguous characters (i/I/l/L/1/O/o/0)
Alphabetical sort on encoded 26-char strings matches default UUID sort order
These are rather nice properties. I've been using this encoding in my applications both for database keys and for user-visible identifiers, and it works very well.
The codecs Base64Codec and Base64UrlCodec can encode UUIDs efficiently to base-64 and base-64-url.
// Returns a base-64 string
// input: 01234567-89AB-4DEF-A123-456789ABCDEF
// output: ASNFZ4mrTe+hI0VniavN7w
String base64 = Base64Codec.INSTANCE.encode(uuid);
// Returns a base-64-url string
// input: 01234567-89AB-4DEF-A123-456789ABCDEF
// output: ASNFZ4mrTe-hI0VniavN7w
String base64Url = Base64UrlCodec.INSTANCE.encode(uuid);
There are codecs for other encodings in the same package of uuid-creator.
Below is what I use for a UUID (Comb style). It includes code for converting a uuid string or uuid type to base64. I do it per 64 bits, so I don't deal with any equal signs:
JAVA
import java.nio.ByteBuffer;
import java.util.Calendar;
import java.util.UUID;
import org.apache.commons.codec.binary.Base64;
public class UUIDUtil {
    public static UUID combUUID() {
        UUID srcUUID = UUID.randomUUID();
        java.sql.Timestamp ts = new java.sql.Timestamp(Calendar.getInstance().getTime().getTime());
        // Keep the upper 16 random bits and put the time in the lower 48 bits
        long upper16OfLowerUUID = zeroLower48BitsOfLong(srcUUID.getLeastSignificantBits());
        long lower48Time = zeroUpper16BitsOfLong(ts.getTime());
        long lowerLongForNewUUID = upper16OfLowerUUID | lower48Time;
        return new UUID(srcUUID.getMostSignificantBits(), lowerLongForNewUUID);
    }
    public static String base64URLSafeOfUUIDObject(UUID uuid) {
        // Note: the least significant bits are written first here
        byte[] bytes = ByteBuffer.allocate(16).putLong(0, uuid.getLeastSignificantBits()).putLong(8, uuid.getMostSignificantBits()).array();
        return Base64.encodeBase64URLSafeString(bytes);
    }
    public static String base64URLSafeOfUUIDString(String uuidString) {
        UUID uuid = UUID.fromString(uuidString);
        return UUIDUtil.base64URLSafeOfUUIDObject(uuid);
    }
    private static long zeroLower48BitsOfLong(long longVar) {
        long upper16BitMask = -281474976710656L; // 0xFFFF000000000000
        return longVar & upper16BitMask;
    }
    private static long zeroUpper16BitsOfLong(long longVar) {
        long lower48BitMask = 281474976710656L - 1L; // 0x0000FFFFFFFFFFFF
        return longVar & lower48BitMask;
    }
}
Surprised no one mentioned uuidToByteArray(…) from commons-lang3.
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-lang3</artifactId>
<version>3.12.0</version>
</dependency>
And then the code will be
import org.apache.commons.lang3.Conversion;
import java.util.*;
public static byte[] asByteArray(UUID uuid) {
return Conversion.uuidToByteArray(uuid, new byte[16], 0, 16);
}
Here's my approach in kotlin:
val uuid: UUID = UUID.randomUUID()
val uid = BaseEncoding.base64Url().encode(
ByteBuffer.allocate(16)
.putLong(uuid.mostSignificantBits)
.putLong(uuid.leastSignificantBits)
.array()
).trimEnd('=')