Fastest way to permute bits in a Java array

What is the fastest way to randomly (but repeatedly) permute all the bits within a Java byte array? I've done it successfully with a BitSet, but is there a faster way? Clearly the for-loop consumes the majority of the CPU time.
I've just done some profiling in my IDE, and the for-loop accounts for 64% of the CPU time within the entire permute() method.
To clarify, the array (preRound) contains the numbers going into the procedure. I want the individual set bits of that array to be mixed up in a random manner. This is the reason for P[]: it contains a random list of bit positions. So, for example, if bit 13 of preRound is set, it is transferred to position P[13] of postRound, which might be position 20555. The whole thing is part of a substitution-permutation network, and I'm looking for the fastest way to permute the incoming bits.
My code so far...
private byte[] permute(byte[] preRound) {
    BitSet beforeBits = BitSet.valueOf(preRound);
    BitSet afterBits = new BitSet(blockSize * 8);
    for (int i = 0; i < blockSize * 8; i++) {
        assert i != P[i];
        if (beforeBits.get(i)) {
            afterBits.set(P[i]);
        }
    }
    byte[] postRound = afterBits.toByteArray();
    postRound = Arrays.copyOf(postRound, blockSize); // Pad with 0s to the specified length
    assert postRound.length == blockSize;
    return postRound;
}
FYI, blockSize is about 60,000 and P is a random lookup table.
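As a side note, in case it helps to reproduce the setup, here is a minimal sketch (not from the question) of one way to build such a table, assuming P must be a true permutation of the bit positions 0..blockSize*8-1. Note that the assert i != P[i] in the code above additionally demands a derangement, which a plain Fisher-Yates shuffle does not guarantee:

import java.security.SecureRandom;

// Hypothetical setup code: a Fisher-Yates shuffle of the identity mapping.
int n = blockSize * 8;
int[] P = new int[n];
for (int i = 0; i < n; i++) P[i] = i;          // identity mapping
SecureRandom rnd = new SecureRandom();
for (int i = n - 1; i > 0; i--) {
    int j = rnd.nextInt(i + 1);                // uniform in [0, i]
    int tmp = P[i]; P[i] = P[j]; P[j] = tmp;   // swap
}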

I didn't perform any performance tests, but you may want to consider the following:
To avoid the call to Arrays.copyOf (which copies the byte[] that was itself just copied from the internal long[], which is kind of annoying), just set the last bit in case it wasn't set before and unset it afterwards.
Furthermore, there is a nice idiom to iterate over the set bits in the input permutation.
private byte[] permute(final byte[] preRound) {
    final BitSet beforeBits = BitSet.valueOf(preRound);
    final BitSet afterBits = new BitSet(blockSize * 8);
    for (int i = beforeBits.nextSetBit(0); i >= 0; i = beforeBits.nextSetBit(i + 1)) {
        final int to = P[i];
        assert i != to;
        afterBits.set(to);
    }
    final int lastIndex = blockSize * 8 - 1;
    if (afterBits.get(lastIndex)) {
        return afterBits.toByteArray();
    }
    afterBits.set(lastIndex);
    final byte[] postRound = afterBits.toByteArray();
    postRound[blockSize - 1] &= 0x7F;
    return postRound;
}
If that doesn't cut it, and in case you use the same P for lots of iterations, it may be worthwhile to transform the permutation into cycle notation and perform the transformation in-place (a sketch follows below).
This way you can iterate linearly over P, which may let you exploit caching better (P is 32 times as large as the byte array, assuming it's an int array).
Yet, you will lose the advantage that you only have to look at 1s and end up shifting around every single bit in the byte array, set or not.
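A rough sketch of the cycle-notation idea (illustrative only, assuming P is a true permutation of 0..blockSize*8-1): decompose P into its cycles once up front, so that each round can move bits along every cycle in place.

import java.util.ArrayList;
import java.util.List;

// Hedged sketch: precompute the cycles of P; each round then saves the bit
// at the start of a cycle, shifts the remaining bits one step along the
// cycle, and drops the saved bit into the last slot.
static int[][] toCycles(int[] P) {
    boolean[] visited = new boolean[P.length];
    List<int[]> cycles = new ArrayList<>();
    for (int start = 0; start < P.length; start++) {
        if (visited[start]) continue;
        List<Integer> cycle = new ArrayList<>();
        for (int i = start; !visited[i]; i = P[i]) {
            visited[i] = true;
            cycle.add(i);
        }
        if (cycle.size() > 1) {                // fixed points move nothing
            int[] c = new int[cycle.size()];
            for (int k = 0; k < c.length; k++) c[k] = cycle.get(k);
            cycles.add(c);
        }
    }
    return cycles.toArray(new int[0][]);
}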
If you want to avoid using the BitSet, you can just do it by hand:
private byte[] permute(final byte[] preRound) {
    final byte[] result = new byte[blockSize];
    for (int i = 0; i < blockSize; i++) {
        final byte b = preRound[i];
        // if 1s are sparse, you may want to use this:
        // if ((byte) 0 == b) continue;
        for (int j = 0; j < 8; ++j) {
            if (0 != (b & (1 << j))) {
                final int loc = P[i * 8 + j];
                result[loc / 8] |= (1 << (loc % 8));
            }
        }
    }
    return result;
}

Related

Fetch variable sized arrays type Array<T> from Observable<T>

I'm working with an observable that receives a stream of bytes, one by one, which can be put together to represent serialized data. I want to transform this stream into a stream of serialized packets, each an Array<Byte>. The data are MQTT packets.
The size of each array is variable, depending on the value indicated at positions 1..m. The longer the array, the more bytes are needed to indicate the size. The high bit of each size byte indicates whether another size byte follows. After calculating the remaining size n from those bytes, I only need to await the following n bytes, put them together in an Array<Byte>, and restart the process.
Is there any operator I can use to handle this situation?
EDIT 1
This code seems to be doing the trick without any operators, but it doesn't look like the Rx way to do it:
ArrayList<Byte> buffer = new ArrayList<>();
int n = 0;
boolean fetchSize = false;
// note: for the lambda to mutate these, they must be instance fields rather than locals
byteStream.subscribe(b -> {
    buffer.add(b);
    if (n == 0) {
        n = 1;
        fetchSize = true;
    } else if (fetchSize) {
        n *= b % 0x80;
        if ((b & 0x80) == 0) fetchSize = false;
    } else if (n > 1) {
        n--;
    } else {
        n = 0;
        dataObservable.onNext(buffer);
        buffer.clear();
    }
});
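For comparison, here is a minimal sketch of the remaining-length rule as the MQTT spec defines it: each size byte contributes its low 7 bits as a base-128 digit, and a set high bit means another size byte follows. nextByte() is a hypothetical stand-in for pulling the next byte off the stream.

// Sketch only: decode the MQTT remaining length; the caller then collects
// exactly remainingLength further bytes into the packet.
int multiplier = 1;
int remainingLength = 0;
byte digit;
do {
    digit = nextByte();                        // hypothetical helper
    remainingLength += (digit & 0x7F) * multiplier;
    multiplier *= 128;
} while ((digit & 0x80) != 0);                 // high bit set: more to come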

Reverse long array to string algorithm

I need to reverse the following algorithm, which converts a long array into a string:
public final class LongConverter {
    private final long[] l;

    public LongConverter(long[] paramArrayOfLong) {
        this.l = paramArrayOfLong;
    }

    private void convertLong(long paramLong, byte[] paramArrayOfByte, int paramInt) {
        int i = Math.min(paramArrayOfByte.length, paramInt + 8);
        while (paramInt < i) {
            paramArrayOfByte[paramInt] = ((byte) (int) paramLong);
            paramLong >>= 8;
            paramInt++;
        }
    }

    public final String toString() {
        int i = this.l.length;
        byte[] arrayOfByte = new byte[8 * (i - 1)];
        long l1 = this.l[0];
        Random localRandom = new Random(l1);
        for (int j = 1; j < i; j++) {
            long l2 = localRandom.nextLong();
            convertLong(this.l[j] ^ l2, arrayOfByte, 8 * (j - 1));
        }
        String str;
        try {
            str = new String(arrayOfByte, "UTF8");
        } catch (UnsupportedEncodingException localUnsupportedEncodingException) {
            throw new AssertionError(localUnsupportedEncodingException);
        }
        int k = str.indexOf(0);
        if (-1 == k) {
            return str;
        }
        return str.substring(0, k);
    }
}
So when I do the following call
System.out.println(new LongConverter(new long[]{-6567892116040843544L, 3433539276790523832L}).toString());
it prints 400 as the result.
It would be great if anyone could say what algorithm this is or how I could reverse it.
Thanks for your help
This is not a solvable problem as stated, because:
you only use l[0] as the seed, so the additional long values could be almost anything.
it is guaranteed that there are many seeds for any reachable output. While the seed for Random is 64-bit, the value used internally is only 48-bit. This means that if there is any solution at all, there are at least 2^16 long seeds which produce it (every seed that differs only in its top 16 bits).
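A quick way to see the 48-bit point: java.util.Random masks the seed to 48 bits when it is set, so two seeds that differ only in their top 16 bits behave identically.

import java.util.Random;

// The top 16 bits of the seed are discarded internally, so these two
// generators produce the same sequence.
Random a = new Random(0x0000123456789ABCL);
Random b = new Random(0xFFFF123456789ABCL);
System.out.println(a.nextLong() == b.nextLong()); // prints true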
What you can do is:
find the smallest seed which would generate the string, using brute force. For short strings this won't take long; however, with 5-6 characters it will take a while, and for 7+ characters there might not be a solution.
instead of generating 8-bit characters where all 256 values are possible, you could restrict the range to, say, space, A-Z, a-z and 0-9. This means each character carries only ~6 bits of randomness, allowing shorter seeds and slightly longer Strings.
BTW, you might find this post interesting, where I use contrived random seeds to generate specific sequences: http://vanillajava.blogspot.co.uk/2011/10/randomly-no-so-random.html
If you want a process which ensures you can always re-create the original longs from a String or a byte[], I suggest using encryption. You can encrypt a String which has been UTF-8 encoded or a byte[] into another byte[] which can be base64 encoded to be readable as text. (Or you could skip the encryption and use base64 alone)
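For the base64-only variant, here is a minimal sketch using only the standard library (java.nio.ByteBuffer plus java.util.Base64, available since Java 8); the class and method names are just for illustration:

import java.nio.ByteBuffer;
import java.util.Base64;

public class LongCodec {
    // Pack the longs into bytes, then make them printable with base64.
    static String encode(long[] values) {
        ByteBuffer buf = ByteBuffer.allocate(8 * values.length);
        for (long v : values) buf.putLong(v);
        return Base64.getEncoder().encodeToString(buf.array());
    }

    // Recover the original longs exactly from the base64 text.
    static long[] decode(String text) {
        ByteBuffer buf = ByteBuffer.wrap(Base64.getDecoder().decode(text));
        long[] values = new long[buf.remaining() / 8];
        for (int i = 0; i < values.length; i++) values[i] = buf.getLong();
        return values;
    }
}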

Optimizing Java heap usage by String using StringBuffer, StringBuilder, String.intern()

I am monitoring the performance and CPU usage of a large Java application using VisualVM. When I look at its memory profile, I see that char arrays use up the largest share of the heap (about 50%).
In the memory profile I see roughly 9,000 char[] objects at any given time.
The application accepts a large file as input. The file has roughly 80 lines, each consisting of 15-20 delimited config options. The application parses the file and stores the lines in an ArrayList of Strings. It then parses these strings to get the individual config options for each server.
The application also frequently logs each event to the console.
Java's String implementation uses a char[] internally, along with a reference to that array and three int fields.
From different posts on the internet it seems like StringBuffer, StringBuilder and String.intern() are more memory-efficient alternatives.
How do they compare to java.lang.String? Has anybody benchmarked them? If the application uses multithreading (which it does), are they a safe alternative?
What I do is have one or more String pools. I do this to a) avoid creating new Strings if I already have one in the pool and b) reduce the retained memory size, sometimes by a factor of 3-5. You can write a simple string interner yourself, but I suggest you first consider how the data is read in to determine the optimal solution. This matters, as you can easily make matters worse without an efficient solution.
As EJP points out, processing a line at a time is more efficient, as is parsing each line as you read it; i.e. an int or double takes up far less space than the same digits held as a String (unless you have a very high rate of duplication).
Here is an example of a StringInterner which takes a StringBuilder to avoid creating objects needlessly. You first populate a recycled StringBuilder with the text; if a String matching that text is in the interner, that String is returned, otherwise a toString() of the StringBuilder is. The benefit is that you only create objects when you see a new String (or at least one not currently in the array). This can get an 80% to 99% hit rate and reduce memory consumption (and garbage) dramatically when loading many strings of data.
public class StringInterner {
    @NotNull
    private final String[] interner;
    private final int mask;

    public StringInterner(int capacity) {
        int n = nextPower2(capacity, 128);
        interner = new String[n];
        mask = n - 1;
    }

    @NotNull
    public String intern(@NotNull CharSequence cs) {
        long hash = 0;
        for (int i = 0; i < cs.length(); i++)
            hash = 57 * hash + cs.charAt(i);
        int h = hash(hash) & mask;
        String s = interner[h];
        if (isEqual(s, cs))
            return s;
        String s2 = cs.toString();
        return interner[h] = s2;
    }

    static boolean isEqual(@Nullable CharSequence s, @NotNull CharSequence cs) {
        if (s == null) return false;
        if (s.length() != cs.length()) return false;
        for (int i = 0; i < cs.length(); i++)
            if (s.charAt(i) != cs.charAt(i))
                return false;
        return true;
    }

    static int nextPower2(int n, int min) {
        if (n < min) return min;
        if ((n & (n - 1)) == 0) return n;
        int i = min;
        while (i < n) {
            i *= 2;
            if (i <= 0) return 1 << 30;
        }
        return i;
    }

    static int hash(long n) {
        n ^= (n >> 43) ^ (n >> 21);
        n ^= (n >> 15) ^ (n >> 7);
        return (int) n;
    }
}
This class is interesting in that it is not thread safe in the traditional sense, but it will work correctly when used concurrently; in fact it might work more efficiently when multiple threads have different views of the contents of the array.
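A small usage sketch (a hypothetical parsing loop, not from the answer above): recycle one StringBuilder, fill it with each parsed token, and let the interner hand back a shared String on repeats.

StringInterner interner = new StringInterner(1024);
StringBuilder sb = new StringBuilder();            // recycled across tokens

// inside the parse loop, after reading one delimited token:
sb.setLength(0);
sb.append("ssl=true");                             // stand-in for the parsed token
String option = interner.intern(sb);               // creates a String only on a miss
System.out.println(option == interner.intern(sb)); // true: hit, same instance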

Reading and writing huge files in java

My idea is to make a little program that reads a file (which can't be read "naturally", but contains some images), turns its data into hex, looks for the PNG chunks (markers found at the beginning and end of a .png file), and saves the resulting data in different files (after converting it back from hex). I am doing this in Java, using code like this:
// out is where to show the result and file is the source
public static void hexDump(PrintStream out, File file) throws IOException {
    InputStream is = new FileInputStream(file);
    StringBuffer buffer = new StringBuffer();
    while (is.available() > 0) {
        StringBuilder sb1 = new StringBuilder();
        for (int j = 0; j < 16; j++) {
            if (is.available() > 0) {
                int value = (int) is.read();
                // transform the current data into hex
                sb1.append(String.format("%02X ", value));
            }
        }
        buffer.append(sb1);
        // Should I look for the PNG here? I'm not sure
    }
    is.close();
    // Print the result in out (that may be the console or a file)
    out.print(buffer);
}
I'm sure there are other ways to do this that use fewer machine resources when opening huge files. If you have any ideas, please tell me. Thanks!
This is the first time I post, so if there is any error, please help me to correct it.
As Erwin Bolwidt says in the comments, the first thing is: don't convert to hex. If for some reason you must convert to hex, stop appending the content to two buffers, and always use StringBuilder, not StringBuffer. StringBuilder can be as much as 3x faster than StringBuffer.
Also, buffer your file reads with a BufferedInputStream. Reading one byte at a time with FileInputStream.read() is very slow.
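Since the data here is binary, a minimal sketch of the same read loop with buffering (file is the same variable as in the question):

import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.InputStream;

// Sketch: single-byte reads now hit an in-memory buffer instead of the OS.
try (InputStream is = new BufferedInputStream(new FileInputStream(file))) {
    int value;
    while ((value = is.read()) != -1) {            // -1 signals end of stream
        // process value
    }
}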
A very simple way to do this, which is probably quite fast, is to read the entire file into memory (as binary data, not as a hex dump) and then search for the markers.
This has two limitations:
it only handles files up to 2 GiB in length (max size of Java arrays)
it requires large chunks of memory - it is possible to optimize this by reading smaller chunks, but that makes the algorithm more complex
The basic code to do that is like this:
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class Png {
    static final String PNG_MARKER_HEX = "abcdef0123456789"; // TODO: replace with real marker
    static final byte[] PNG_MARKER = hexStringToByteArray(PNG_MARKER_HEX);

    public void splitPngChunks(File file) throws IOException {
        byte[] bytes = Files.readAllBytes(file.toPath());
        int offset = KMPMatch.indexOf(bytes, 0, PNG_MARKER);
        while (offset >= 0) {
            // search for the next marker *after* the current one
            int nextOffset = KMPMatch.indexOf(bytes, offset + PNG_MARKER.length, PNG_MARKER);
            if (nextOffset < 0) {
                writePngChunk(bytes, offset, bytes.length - offset);
            } else {
                writePngChunk(bytes, offset, nextOffset - offset);
            }
            offset = nextOffset;
        }
    }

    public void writePngChunk(byte[] bytes, int offset, int length) {
        // TODO: implement - where do you want to write the chunks?
    }
}
I'm not sure how these PNG chunk markers work exactly; I'm assuming above that a marker starts the section of the data that you're interested in, and that the next marker starts the next section. (For reference, every PNG file begins with the 8-byte signature 89 50 4E 47 0D 0A 1A 0A.)
There are two things missing in standard Java: code to convert a hex string to a byte array and code to search for a byte array inside another byte array.
Both can be found in various apache-commons libraries, but I'll include the answers people posted to earlier questions on Stack Overflow. You can copy these verbatim into the Png class to make the above code work.
Convert a string representation of a hex dump to a byte array using Java?
public static byte[] hexStringToByteArray(String s) {
    int len = s.length();
    byte[] data = new byte[len / 2];
    for (int i = 0; i < len; i += 2) {
        data[i / 2] = (byte) ((Character.digit(s.charAt(i), 16) << 4)
                + Character.digit(s.charAt(i + 1), 16));
    }
    return data;
}
Searching for a sequence of Bytes in a Binary File with Java
/**
 * Knuth-Morris-Pratt Algorithm for Pattern Matching
 */
static class KMPMatch {
    /**
     * Finds the first occurrence of the pattern in the text, starting at offset.
     */
    public static int indexOf(byte[] data, int offset, byte[] pattern) {
        int[] failure = computeFailure(pattern);
        int j = 0;
        if (data.length - offset <= 0)
            return -1;
        for (int i = offset; i < data.length; i++) {
            while (j > 0 && pattern[j] != data[i]) {
                j = failure[j - 1];
            }
            if (pattern[j] == data[i]) {
                j++;
            }
            if (j == pattern.length) {
                return i - pattern.length + 1;
            }
        }
        return -1;
    }

    /**
     * Computes the failure function using a boot-strapping process,
     * where the pattern is matched against itself.
     */
    private static int[] computeFailure(byte[] pattern) {
        int[] failure = new int[pattern.length];
        int j = 0;
        for (int i = 1; i < pattern.length; i++) {
            while (j > 0 && pattern[j] != pattern[i]) {
                j = failure[j - 1];
            }
            if (pattern[j] == pattern[i]) {
                j++;
            }
            failure[i] = j;
        }
        return failure;
    }
}
I modified this last piece of code to make it possible to start the search at an offset other than zero.
Reading the file a byte at a time would be taking substantial time here. You can improve that by orders of magnitude. You should be using a DataInputStream around a BufferedInputStream around the FileInputStream, and reading 16 bytes at a time with readFully.
And then process them, without conversion to and from hex, which is quite unnecessary here, and write them to the output(s) as you go, via a BufferedOutputStream around a FileOutputStream, rather than concatenating the entire file into memory and having to write it all out in one go. Of course that takes time, but only because the work itself takes time, not because you have to do it that way.
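A rough sketch of that stream stack (the method name and the 16-byte block size are just illustrative, following the suggestion above):

import java.io.*;

// Sketch: read 16 bytes at a time, process them, and write as you go.
static void processInBlocks(File inFile, File outFile) throws IOException {
    try (DataInputStream in = new DataInputStream(
                 new BufferedInputStream(new FileInputStream(inFile)));
         BufferedOutputStream out = new BufferedOutputStream(
                 new FileOutputStream(outFile))) {
        byte[] block = new byte[16];
        long remaining = inFile.length();
        while (remaining >= block.length) {
            in.readFully(block);                   // always fills the whole block
            // ... scan the block here ...
            out.write(block);
            remaining -= block.length;
        }
        int tail = (int) remaining;                // trailing partial block
        if (tail > 0) {
            in.readFully(block, 0, tail);
            out.write(block, 0, tail);
        }
    }
}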

Improving a prime sieve algorithm

I'm trying to make a decent Java program that generates the primes from 1 to N (mainly for Project Euler problems).
At the moment, my algorithm is as follows:
Initialise an array of booleans (or a bitarray if N is sufficiently large) so they're all false, and an array of ints to store the primes found.
Set an integer, s equal to the lowest prime, (ie 2)
While s is <= sqrt(N)
Set all multiples of s (starting at s^2) to true in the array/bitarray.
Find the next smallest index in the array/bitarray which is false, use that as the new value of s.
Endwhile.
Go through the array/bitarray, and for every value that is false, put the corresponding index in the primes array.
Now, I've tried skipping over numbers not of the form 6k + 1 or 6k + 5, but that only gives me a ~2x speed-up, whilst I've seen programs run orders of magnitude faster than mine (albeit with very convoluted code), such as the one here.
What can I do to improve?
Edit: Okay, here's my actual code (for N of 1E7):
int l = 10000000, n = 2, sqrt = (int) Math.sqrt(l);
boolean[] nums = new boolean[l + 1];
int[] primes = new int[664579];
while (n <= sqrt) {
    for (int i = 2 * n; i <= l; nums[i] = true, i += n);
    for (n++; nums[n]; n++);
}
for (int i = 2, k = 0; i < nums.length; i++)
    if (!nums[i]) primes[k++] = i;
Runs in about 350ms on my 2.0GHz machine.
While s is <= sqrt(N)
One mistake people often make in such algorithms is not precomputing the square root.
while (s <= sqrt(N)) {
is much, much slower than
int limit = sqrt(N);
while (s <= limit) {
But generally speaking, Eiko is right in his comment. If you want people to offer low-level optimisations, you have to provide code.
Update: OK, now about your code.
You may notice that the number of iterations in your code is just a little bigger than l (put a counter inside the first for loop: it will be only 2-3 times larger). And, obviously, the complexity of your solution can't be less than O(l) (you can't have fewer than l iterations).
What can make a real difference is accessing memory effectively. Note that the guy who wrote that article tries to reduce storage size not just because he's memory-greedy: making compact arrays lets you use the cache better and thus increases speed.
I just replaced the boolean[] with an int[] used as a bit array and achieved an immediate 2x speed gain (and an 8x memory saving). And I didn't even try to do it efficiently.
Update 2:
That's easy. You just replace every assignment a[i] = true with a[i/32] |= 1 << (i%32), and each read operation a[i] with (a[i/32] & (1 << (i%32))) != 0. And boolean[] a with int[] a, obviously.
From the first replacement it should be clear how it works: if flag i is true, then there's a 1 bit in the integer a[i/32], at position i%32 (an int in Java has exactly 32 bits, as you know).
You can go further and replace i/32 with i >> 5, and i%32 with i & 31. You can also precompute 1 << j for each j between 0 and 31 in an array.
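Putting those replacements together, a tiny sketch of the packed flag array (an illustrative class, not from the answer):

// 32 boolean flags per int; i >> 5 picks the word, i & 31 picks the bit.
final class PackedBits {
    private final int[] words;

    PackedBits(int size) {
        words = new int[(size >> 5) + 1];
    }

    void set(int i) {                              // was: a[i] = true
        words[i >> 5] |= 1 << (i & 31);
    }

    boolean get(int i) {                           // was: a[i]
        return (words[i >> 5] & (1 << (i & 31))) != 0;
    }
}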
But sadly, I don't think you could get close to C in Java here. Not to mention that guy uses many other tricky optimizations, and I agree that his code would've been worth a lot more if he had commented it.
Using a BitSet will use less memory. The Sieve algorithm is rather trivial, so you can simply "set" the bit positions on the BitSet, and then iterate over it to determine the primes (a sketch follows below).
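A minimal sketch of that approach (illustrative only): mark composites in the BitSet and walk the clear bits afterwards.

import java.util.BitSet;

// Sketch: after sieving, composite.get(i) is false exactly for primes i >= 2.
static BitSet sieve(int n) {
    BitSet composite = new BitSet(n + 1);
    for (int p = 2; (long) p * p <= n; p = composite.nextClearBit(p + 1)) {
        for (int m = p * p; m <= n && m > 0; m += p)
            composite.set(m);                      // m > 0 guards against int overflow
    }
    return composite;
}

The primes can then be listed with for (int p = 2; p <= n; p = composite.nextClearBit(p + 1)) { ... }.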
Did you also make the array smaller while skipping numbers not of the form 6k+1 and 6k+5?
I only tested with ignoring numbers of the form 2k and that gave me ~4x speed up (440 ms -> 120 ms):
int l = 10000000, n = 1, sqrt = (int) Math.sqrt(l);
int m = l/2;
boolean[] nums = new boolean[m + 1];
int[] primes = new int[664579];
int i, k;
while (n <= sqrt) {
int x = (n<<1)+1;
for (i = n+x; i <= m; nums[i] = true, i+=x);
for (n++; nums[n]; n++);
}
primes[0] = 2;
for (i = 1, k = 1; i < nums.length; i++) {
if (!nums[i])
primes[k++] = (i<<1)+1;
}
The following is from my Project Euler library... It's a slight variation of the Sieve of Eratosthenes... I'm not sure, but I think it's called the Euler Sieve.
1) It uses a BitSet (so 1/8th the memory)
2) It only uses the BitSet for odd numbers... (another half, hence 1/16th)
Note: the inner loop (for multiples) begins at n*n rather than 2*n, and only multiples at increments of 2*n are crossed off... hence the speed-up.
private void beginSieve(int mLimit) {
    primeList = new BitSet(mLimit >> 1);
    primeList.set(0, primeList.size(), true);
    int sqroot = (int) Math.sqrt(mLimit);
    primeList.clear(0);                        // bit 0 represents 1, which is not prime
    for (int num = 3; num <= sqroot; num += 2) {
        if (primeList.get(num >> 1)) {
            int inc = num << 1;
            for (int factor = num * num; factor < mLimit; factor += inc) {
                primeList.clear(factor >> 1);  // factor is always odd here
            }
        }
    }
}
and here's the function to check if a number is prime...
public boolean isPrime(int num) {
    if (num < maxLimit) {
        if ((num & 1) == 0)
            return (num == 2);
        else
            return primeList.get(num >> 1);
    }
    return false;
}
You could do the step of putting the corresponding index in the primes array while you are detecting the primes, saving a pass through the array, but that's about all I can think of right now.
I wrote a simple sieve implementation recently for the fun of it using BitSet (everyone says not to, but it's the best off-the-shelf way to store huge data efficiently). The performance seems pretty good to me, but I'm still working on improving it.
public class HelloWorld {
    private static int LIMIT = 2140000000; // Integer.MAX_VALUE broke things.
    private static BitSet marked;

    public static void main(String[] args) {
        long startTime = System.nanoTime();
        init();
        sieve();
        long estimatedTime = System.nanoTime() - startTime;
        System.out.println((float) estimatedTime / 1000000000); // 23.835363 seconds
        System.out.println(marked.size()); // 1070000000 ~= 127MB
    }

    private static void init() {
        double size = LIMIT * 0.5 - 1;
        marked = new BitSet();
        marked.set(0, (int) size, true);
    }

    private static void sieve() {
        int i = 0;
        int cur = 0;
        int add = 0;
        int pos = 0;
        while (((i << 1) + 1) * ((i << 1) + 1) < LIMIT) {
            pos = i;
            if (marked.get(pos++)) {
                cur = pos;
                add = (cur << 1);
                pos += add * cur + cur - 1;
                while (pos < marked.length() && pos > 0) {
                    marked.clear(pos++);
                    pos += add;
                }
            }
            i++;
        }
    }

    private static void readPrimes() {
        int pos = 0;
        while (pos < marked.length()) {
            if (marked.get(pos++)) {
                System.out.print((pos << 1) + 1);
                System.out.print("-");
            }
        }
    }
}
With smaller LIMITs (say 10,000,000 which took 0.077479s) we get much faster results than the OP.
I bet Java's performance is terrible when dealing with bits...
Algorithmically, the link you point to should be sufficient.
Have you tried googling, e.g. for "java prime numbers"? I did, and dug up this simple improvement:
http://www.anyexample.com/programming/java/java_prime_number_check_%28primality_test%29.xml
Surely, you can find more at google.
Here is my code for the Sieve of Eratosthenes, and it is actually the most efficient version I could manage:
final int MAX = 1000000;
int p[] = new int[MAX];
int prime[] = new int[MAX / 10];

void sieve() {
    int i, j, k = 1;
    p[0] = p[1] = 1;   // 0 and 1 are not prime
    prime[0] = 2;
    for (i = 3; i * i <= MAX; i += 2) {
        if (p[i] != 0)
            continue;
        for (j = i * i; j < MAX; j += 2 * i)
            p[j] = 1;
    }
    for (i = 3; i < MAX; i += 2) {
        if (p[i] == 0)
            prime[k++] = i;
    }
}
