loading (deserializing) _quickly_ 2MB of Data in Android on Application Startup - java

I need to load around 2MB of data quickly on startup of my Android application.
I really need all this data in memory, so something like SQLite etc. is not an alternative.
The data consists of about 3000 int[][] arrays. The array dimension is around [7][7] on average.
I first implemented some prototype on my desktop, and ported it to android. On the desktop, I simply used Java's (de)serialization. Deserialization of that data takes about 90ms on my desktop computer.
However, on Android 2.2.1 the same process takes about 15 seconds(!) on my HTC Magic. It's so slow that if I don't do the deserialization in a separate thread, my app will be killed. All in all, this is unacceptably slow.
What am I doing wrong? Should I
switch to something like protocol buffers? Would that really speed up deserialization by several orders of magnitude? After all, it's not complex objects that I am deserializing, just int[][] arrays.
design my own custom binary file format? I've never done that before, and I have no clue where to start.
do something else?

Why not bypass the built-in deserialization, and use direct binary I/O?
When speed is your primary concern, not necessarily ease of programming, you can't beat it.
For output the pseudo-code would look like this:
    write number of arrays
    for each array
        write n,m array sizes
        for each element of array
            write array element
For input, the pseudo-code would be:
    read number of arrays
    for each array
        read n,m array sizes
        allocate the array
        for each element of array
            read array element
When you read/write numbers in binary, you bypass all the conversion between binary and characters.
The speed should be limited only by the data transfer rate of the file storage media.
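As a rough Java sketch of the pseudo-code above: the class name IntArrayIo and its method names are made up for illustration, and the code assumes every 2D array is rectangular (all rows have the same length m). The toBytes/fromBytes helpers exist only to make a round trip easy to try in memory.

```java
import java.io.*;

class IntArrayIo {
    // Layout: number of arrays, then for each array: n, m, followed by n*m elements.
    static void write(int[][][] data, OutputStream target) throws IOException {
        DataOutputStream out = new DataOutputStream(new BufferedOutputStream(target));
        out.writeInt(data.length);
        for (int[][] a : data) {
            int n = a.length;
            int m = (n == 0) ? 0 : a[0].length; // assumption: rectangular arrays
            out.writeInt(n);
            out.writeInt(m);
            for (int[] row : a)
                for (int v : row) out.writeInt(v);
        }
        out.flush();
    }

    static int[][][] read(InputStream source) throws IOException {
        DataInputStream in = new DataInputStream(new BufferedInputStream(source));
        int count = in.readInt();
        int[][][] data = new int[count][][];
        for (int i = 0; i < count; i++) {
            int n = in.readInt();
            int m = in.readInt();
            int[][] a = new int[n][m];
            for (int r = 0; r < n; r++)
                for (int c = 0; c < m; c++) a[r][c] = in.readInt();
            data[i] = a;
        }
        return data;
    }

    // In-memory helpers for experimenting without touching the file system.
    static byte[] toBytes(int[][][] data) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            write(data, bos);
            return bos.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static int[][][] fromBytes(byte[] bytes) {
        try {
            return read(new ByteArrayInputStream(bytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For a real file you would pass a FileOutputStream or FileInputStream instead. The buffered wrappers matter: without them every readInt() may turn into several tiny reads on the underlying stream.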

After trying out several things, direct binary I/O, as Mike Dunlavey suggested, turned out to be fastest. I used his sketched-out version almost verbatim. For completeness, and in case someone else wants to try it, I'll post my full code here, even though it's very basic and has no sanity checks. This is for reading such a binary stream; writing is analogous.
import java.io.*;

public static int[][][] readBinaryInt(String filename) throws IOException {
    DataInputStream in = new DataInputStream(
            new BufferedInputStream(new FileInputStream(filename)));
    try {
        int dimOfData = in.readInt();
        int[][][] patternijk = new int[dimOfData][][];
        for (int i = 0; i < dimOfData; i++) {
            int dimStrokes = in.readInt();
            int[][] patternjk = new int[dimStrokes][];
            for (int j = 0; j < dimStrokes; j++) {
                int dimPoints = in.readInt();
                int[] patternk = new int[dimPoints];
                for (int k = 0; k < dimPoints; k++) {
                    patternk[k] = in.readInt();
                }
                patternjk[j] = patternk;
            }
            patternijk[i] = patternjk;
        }
        return patternijk;
    } finally {
        in.close(); // always release the file handle
    }
}

I had the same kind of issues on a project some months ago. I think you should split your file in various parts, and only load relevant parts following a choice from the user for example.
Hope it will be helpful!

I don't know your data, but optimizing your loops can affect the deserialization time dramatically.
Look at the example below; these are the timings for the methods it contains:
computeRecursivelyWithLoop(30); // 270 milliseconds
computeIteratively(30); // 1 millisecond
computeRecursivelyFasterUsingBigInteger(30); // about twice as fast as the previous version
computeRecursivelyFasterUsingBigIntegerAllocations(50000); // only 1.3 seconds!
import java.math.BigInteger;

public class Fibo {
    public static void main(String[] args) {
        // try the methods
    }

    public static long computeRecursively(int n) {
        if (n > 1) {
            return computeRecursively(n - 2) + computeRecursively(n - 1);
        }
        return n;
    }

    public static long computeRecursivelyWithLoop(int n) {
        if (n > 1) {
            long result = 1;
            do {
                result += computeRecursivelyWithLoop(n - 2);
            } while (--n > 1); // n must decrease, or the loop never terminates
            return result;
        }
        return n;
    }

    public static long computeIteratively(int n) {
        if (n > 1) {
            long a = 0, b = 1;
            do {
                long tmp = b;
                b += a;
                a = tmp;
            } while (--n > 1);
            return b;
        }
        return n;
    }

    public static BigInteger computeRecursivelyFasterUsingBigInteger(int n) {
        if (n > 1) {
            int m = (n / 2) + (n & 1); // m = ceil(n / 2), so n is either 2m or 2m - 1
            BigInteger fM = computeRecursivelyFasterUsingBigInteger(m);
            BigInteger fM_1 = computeRecursivelyFasterUsingBigInteger(m - 1);
            if ((n & 1) == 1) {
                // F(m)^2 + F(m-1)^2
                return fM.pow(2).add(fM_1.pow(2)); // three BigInteger objects created
            } else {
                // (2*F(m-1) + F(m)) * F(m)
                return fM_1.shiftLeft(1).add(fM).multiply(fM); // three BigInteger objects created
            }
        }
        return (n == 0) ? BigInteger.ZERO : BigInteger.ONE; // no BigInteger object created
    }

    public static long computeRecursivelyFasterUsingBigIntegerAllocations(int n) {
        long allocations = 0;
        if (n > 1) {
            int m = (n / 2) + (n & 1);
            allocations += computeRecursivelyFasterUsingBigIntegerAllocations(m);
            allocations += computeRecursivelyFasterUsingBigIntegerAllocations(m - 1);
            // 3 more BigInteger objects allocated
            allocations += 3;
        }
        return allocations; // approximate number of BigInteger objects allocated
                            // when computeRecursivelyFasterUsingBigInteger(n) is called
    }
}


Java Fibonacci Sequence fast method

I need to compute values of the Fibonacci sequence for an independent project in Java. Here are my methods:
private static long getFibonacci(int n) {
    switch (n) {
        case 0:
            return 0;
        case 1:
            return 1;
        default:
            return getFibonacci(n - 1) + getFibonacci(n - 2);
    }
}

private static long getFibonacciSum(int n) {
    long result = 0;
    while (n >= 0) {
        result += getFibonacci(n);
        n--;
    }
    return result;
}

private static boolean isInFibonacci(long n) {
    long a = 0, b = 1, c = 0;
    while (c < n) {
        c = a + b;
        a = b;
        b = c;
    }
    return c == n;
}
Here is main method:
long key = getFibonacciSum(n);
System.out.println("Sum of all Fibonacci Numbers until Fibonacci[n]: "+key);
System.out.println(getFibonacci(n)+" is Fibonacci[n]");
System.out.println("Is n2 in Fibonacci Sequence ?: "+isInFibonacci(n2));
The code is complete and working. But if n or n2 gets larger than usual (beyond roughly the 50th number in the Fibonacci sequence), the code takes far too long to finish. Are there any suggestions?
There is a way to calculate Fibonacci numbers instantaneously by using Binet's Formula:
function fib(n):
    root5 = squareroot(5)
    gr = (1 + root5) / 2
    igr = 1 - gr
    value = (power(gr, n) - power(igr, n)) / root5
    // round it to the closest integer since floating
    // point arithmetic cannot be trusted to give
    // perfect integer answers.
    return floor(value + 0.5)
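Once you do this, you need to be aware of the programming language you're using and how it behaves. This will probably return a floating-point type, whereas an integer is probably what you want. With double-precision arithmetic the result is also exact only for moderately small n (roughly up to n = 70); beyond that, rounding error produces wrong values.
The complexity of this solution is O(1), assuming the power function runs in constant time.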
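In Java, the formula could be sketched roughly like this. The class name BinetFib is made up for illustration, and the method assumes n is small enough for double precision to hold the exact value, so treat it as unreliable for large n.

```java
class BinetFib {
    // Binet's formula with double arithmetic; only trustworthy for small n.
    static long fib(int n) {
        double root5 = Math.sqrt(5.0);
        double gr = (1 + root5) / 2;  // golden ratio
        double igr = 1 - gr;          // its conjugate
        double value = (Math.pow(gr, n) - Math.pow(igr, n)) / root5;
        return Math.round(value);     // round to the nearest integer
    }
}
```

If exact results are needed for arbitrary n, an integer-based method (iterative or fast doubling with BigInteger) is the safer choice.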
Yes, one improvement you can make is to getFibonacciSum(): instead of calling getFibonacci again and again, which recalculates everything from scratch, you can do the same thing that isInFibonacci does and get the sum in one pass, something like:
private static int getFibonacciSum(int n) {
    int a = 0, b = 1, c = 0, sum = 0;
    while (c < n) {
        c = a + b;
        a = b;
        sum += b;
        b = c;
    }
    sum += c;
    return sum;
}
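For comparison, here is a sketch of a one-pass sum that treats n as an index, as the question's getFibonacciSum(n) does, so it returns F(0) + F(1) + ... + F(n). The class and method names are illustrative only, and long overflows once n gets past roughly 90.

```java
class FibonacciSum {
    // Sum of F(0) + F(1) + ... + F(n), computed in a single pass.
    static long sumUpTo(int n) {
        long a = 0, b = 1; // a = F(i), b = F(i + 1)
        long sum = 0;
        for (int i = 0; i <= n; i++) {
            sum += a;
            long next = a + b;
            a = b;
            b = next;
        }
        return sum;
    }
}
```

This does O(n) additions in total, instead of recomputing every term recursively.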
Well, here goes my solution using a Map and some math formulas (source: https://www.nayuki.io/page/fast-fibonacci-algorithms):
F(2k) = F(k)[2F(k+1)−F(k)]
F(2k+1) = F(k+1)^2+F(k)^2
It is also possible to implement it using lists instead of a map, but that is just reinventing the wheel.
With the iterative solution we don't have to worry about running out of memory, but it takes a long time to compute fib(1000000), for example. With this solution we may run out of memory for extremely large inputs (something like 10000 billion, at a guess), but it is much faster.
import java.math.BigInteger;
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;

public BigInteger fib(BigInteger n) {
    if (n.equals(BigInteger.ZERO))
        return BigInteger.ZERO;
    if (n.equals(BigInteger.ONE) || n.equals(BigInteger.valueOf(2)))
        return BigInteger.ONE;
    BigInteger index = n;
    // we could have 2 Lists instead of a map
    Map<BigInteger, BigInteger> termsToCalculate = new TreeMap<BigInteger, BigInteger>();
    // add every index needed to calculate index n
    populateMapWhitTerms(termsToCalculate, index);
    termsToCalculate.put(n, null); // finally add n to map
    Iterator<Map.Entry<BigInteger, BigInteger>> it = termsToCalculate.entrySet().iterator();
    it.next(); // key number 1, contains fib(1)
    it.next(); // key number 2, contains fib(2)
    // map is ordered, so smaller indices are always computed first
    while (it.hasNext()) {
        Map.Entry<BigInteger, BigInteger> pair = it.next(); // first one is key number 3
        index = pair.getKey();
        BigInteger k = index.divide(BigInteger.valueOf(2));
        BigInteger fK = termsToCalculate.get(k);
        BigInteger fK1 = termsToCalculate.get(k.add(BigInteger.ONE));
        if (index.remainder(BigInteger.valueOf(2)).equals(BigInteger.ZERO)) {
            // index is divisible by 2
            // F(2k) = F(k)[2F(k+1)−F(k)]
            pair.setValue(fK.multiply(fK1.shiftLeft(1).subtract(fK)));
        } else {
            // index is odd
            // F(2k+1) = F(k+1)^2+F(k)^2
            pair.setValue(fK1.multiply(fK1).add(fK.multiply(fK)));
        }
    }
    // fib(n) was calculated in the while loop
    return termsToCalculate.get(n);
}

private void populateMapWhitTerms(Map<BigInteger, BigInteger> termsToCalculate, BigInteger index) {
    if (index.equals(BigInteger.ONE)) { // stop
        termsToCalculate.put(BigInteger.ONE, BigInteger.ONE);
    } else if (index.equals(BigInteger.valueOf(2))) {
        termsToCalculate.put(BigInteger.valueOf(2), BigInteger.ONE);
    } else {
        // index is 2k (FORMULA: F(2k) = F(k)[2F(k+1)−F(k)])
        // or 2k+1 (FORMULA: F(2k+1) = F(k+1)^2+F(k)^2); both need F(k) and F(k+1)
        BigInteger k = index.divide(BigInteger.valueOf(2));
        BigInteger k1 = k.add(BigInteger.ONE);
        // add F(k) key to termsToCalculate (the key is replaced if it is already there, we are working with a map here)
        termsToCalculate.put(k, null);
        populateMapWhitTerms(termsToCalculate, k);
        // add F(k+1) to termsToCalculate
        termsToCalculate.put(k1, null);
        populateMapWhitTerms(termsToCalculate, k1);
    }
}
This method of solution is called dynamic programming.
In this method we remember the previous results,
so the CPU doesn't have to recompute the same value again and again.
class fibonacci
{
    static int fib(int n)
    {
        /* Declare an array to store Fibonacci numbers. */
        int f[] = new int[n+2]; // one extra slot so that f[1] exists even when n = 0
        int i;
        /* 0th and 1st number of the series are 0 and 1*/
        f[0] = 0;
        f[1] = 1;
        for (i = 2; i <= n; i++)
        {
            /* Add the previous 2 numbers in the series
               and store it */
            f[i] = f[i-1] + f[i-2];
        }
        return f[n];
    }

    public static void main (String args[])
    {
        int n = 9;
        System.out.println(fib(n));
    }
}
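public static long getFib(final int index) {
    long a=0,b=0,total=0;
    for(int i=0;i<= index;i++) {
        if(i==0) {
            a=0;
            total = a+b;
        }else if(i==1) {
            b=1;
            total = a+b;
        }
        else if(i%2==0) {
            total = a+b;
            a=total; // even steps overwrite a
        }else {
            total = a+b;
            b=total; // odd steps overwrite b
        }
    }
    return total;
}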
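I have checked all the solutions, and for me the quickest one is to use streams; this code could also easily be modified to collect all Fibonacci numbers.
public static Long fibonaciN(long n){
    return Stream.iterate(new long[]{0, 1}, a -> new long[]{a[1], a[0] + a[1]})
            // the rest of the pipeline is missing from the post; one way to finish it
            // (an assumption) is to skip n pairs and take the first element of the next, F(n)
            .skip(n)
            .findFirst()
            .get()[0];
}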
n = 50, or just below, is about as far as you can go with a straight recursive implementation. You can switch to an iterative or dynamic programming (DP) approach if you want to go much higher than that. I suggest learning about those here: https://www.javacodegeeks.com/2014/02/dynamic-programming-introduction.html. And don't forget to look at the solution in the comment by David there, which is really efficient. The link shows how even n = 500000 can be computed almost instantaneously using the DP method. It also explains the concept of "memoization", which speeds up computation by storing intermediate results that are needed again later.
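To make the memoization idea concrete, here is a minimal sketch. The MemoFib class and its cache field are illustrative names, not taken from the linked article, and long limits it to n <= 92.

```java
import java.util.HashMap;
import java.util.Map;

class MemoFib {
    // Cache of already computed values: index -> F(index).
    private static final Map<Integer, Long> cache = new HashMap<>();

    static long fib(int n) {
        if (n < 2) return n;
        Long cached = cache.get(n);
        if (cached != null) return cached;   // reuse the stored result
        long value = fib(n - 1) + fib(n - 2);
        cache.put(n, value);                 // remember it for later calls
        return value;
    }
}
```

Each index is computed only once, so the exponential blow-up of the plain recursion disappears. For larger n you would switch the value type to BigInteger.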

how to improve this code?

I have written code that expresses a number as a sum of squares, and I am attaching it below.
The problem is that the output should be of minimum length.
I am getting output such as 3^2+1^2+1^2+1^2, which is not of minimum length.
I need to output in this format:
package com.algo;
import java.util.Scanner;
public class GetInputFromUser {
public static void main(String[] args) {
// TODO Auto-generated method stub
int n;
Scanner in = new Scanner(System.in);
System.out.println("Enter an integer");
n = in.nextInt();
System.out.println("The result is:");
public static int algofunction(int n1)
int r1 = 0;
int r2 = 0;
int r3 = 0;
//System.out.println("n1: "+n1);
r1 = (int) Math.sqrt(n1);
r2 = (int) Math.pow(r1, 2);
// System.out.println("r1: "+r1);
//System.out.println("r2: "+r2);
r3 = n1-r2;
//System.out.println("r3: "+r3);
if (r3 == 0)
return 1;
if(r3 == 1)
return 1;
else {
return 1;
Dynamic programming is all about defining the problem in such a way that if you knew the answer to a smaller version of the original, you could use that to answer the main problem more quickly/directly. It's like applied mathematical induction.
In your particular problem, we can define MinLen(n) as the minimum length representation of n. Next, say, since we want to solve MinLen(12), suppose we already knew the answer to MinLen(1), MinLen(2), MinLen(3), ..., MinLen(11). How could we use the answer to those smaller problems to figure out MinLen(12)? This is the other half of dynamic programming - figuring out how to use the smaller problems to solve the bigger one. It doesn't help you if you come up with some smaller problem, but have no way of combining them back together.
For this problem, we can make the simple statement, "For 12, its minimum length representation DEFINITELY has either 1^2, 2^2, or 3^2 in it." And in general, the minimum length representation of n will have some square less than or equal to n as a part of it. There is probably a better statement you can make, which would improve the runtime, but I'll say that it is good enough for now.
This statement means that MinLen(12) = 1^2 + MinLen(11), OR 2^2 + MinLen(8), OR 3^2 + MinLen(3). You check all of them and select the best one, and now you save that as MinLen(12). Now, if you want to solve MinLen(13), you can do that too.
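If you only need the length of the representation, the recurrence above can be written bottom-up. This is just a sketch with made-up names (MinSquares, minCount), filling a table from 1 up to n:

```java
class MinSquares {
    // best[v] = 1 + min over all i with i*i <= v of best[v - i*i]
    static int minCount(int n) {
        int[] best = new int[n + 1]; // best[0] = 0: the empty sum
        for (int v = 1; v <= n; v++) {
            int min = Integer.MAX_VALUE;
            for (int i = 1; i * i <= v; i++) {
                min = Math.min(min, best[v - i * i] + 1);
            }
            best[v] = min;
        }
        return best[n];
    }
}
```

For 12 this tries 1 + best[11], 1 + best[8] and 1 + best[3] and keeps the smallest, exactly as described.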
Advice when solo:
The way I would test this kind of program myself is to plug in 1, 2, 3, 4, 5, etc, and see the first time it goes wrong. Additionally, any assumptions I happen to have thought were a good idea, I question: "Is it really true that the largest square number less than n will be in the representation of MinLen(n)?"
Your code:
r1 = (int) Math.sqrt(n1);
r2 = (int) Math.pow(r1, 2);
embodies that assumption (a greedy assumption), but it is wrong, as you've clearly seen with the answer for MinLen(12).
Instead you want something more like this:
public ArrayList<Integer> minLen(int n)
{
    // base case of recursion
    if (n == 0)
        return new ArrayList<Integer>();
    ArrayList<Integer> best = null;
    int bestInt = -1;
    for (int i = 1; i*i <= n; ++i)
    {
        // Check what happens if we use i^2 as part of our representation
        ArrayList<Integer> guess = minLen(n - i*i);
        // If we haven't selected a 'best' yet (best == null)
        // or if our new guess is better than the current choice (guess.size() < best.size())
        // update our choice of best
        if (best == null || guess.size() < best.size())
        {
            best = guess;
            bestInt = i;
        }
    }
    // the chosen square itself is part of the representation
    best.add(bestInt);
    return best;
}
Then, once you have your list, you can sort it (no guarantees that it came in sorted order), and print it out the way you want.
Lastly, you may notice that for larger values of n (1000 may be too large) that you plug in to the above recursion, it will start going very slow. This is because we are constantly recalculating all the small subproblems - for example, we figure out MinLen(3) when we call MinLen(4), because 4 - 1^2 = 3. But we figure it out twice for MinLen(7) -> 3 = 7 - 2^2, but 3 also is 7 - 1^2 - 1^2 - 1^2 - 1^2. And it gets much worse the larger you go.
The solution to this, which lets you solve up to n = 1,000,000 or more, very quickly, is to use a technique called Memoization. This means that once we figure out MinLen(3), we save it somewhere, let's say a global location to make it easy. Then, whenever we would try to recalculate it, we check the global cache first to see if we already did it. If so, then we just use that, instead of redoing all the work.
import java.util.*;

class SquareRepresentation
{
    private static HashMap<Integer, ArrayList<Integer>> cachedSolutions;

    public static void main(String[] args)
    {
        cachedSolutions = new HashMap<Integer, ArrayList<Integer>>();
        for (int j = 100000; j < 100001; ++j)
        {
            ArrayList<Integer> answer = minLen(j);
            for (int i = 0; i < answer.size(); ++i)
            {
                if (i != 0)
                    System.out.printf("+");
                System.out.printf("%d^2", answer.get(i));
            }
            System.out.println();
        }
    }

    public static ArrayList<Integer> minLen(int n)
    {
        // base case of recursion
        if (n == 0)
            return new ArrayList<Integer>();
        // new base case: problem already solved once before
        if (cachedSolutions.containsKey(n))
        {
            // It is a bit tricky though, because we need to be careful!
            // See how below that we are modifying the 'guess' array we get in?
            // That means we would modify our previous solutions! No good!
            // So here we need to return a copy
            ArrayList<Integer> ans = cachedSolutions.get(n);
            ArrayList<Integer> copy = new ArrayList<Integer>();
            for (int i: ans) copy.add(i);
            return copy;
        }
        ArrayList<Integer> best = null;
        int bestInt = -1;
        // THIS IS WRONG, can you figure out why it doesn't work?:
        // for (int i = 1; i*i <= n; ++i)
        for (int i = (int)Math.sqrt(n); i >= 1; --i)
        {
            // Check what happens if we use i^2 as part of our representation
            ArrayList<Integer> guess = minLen(n - i*i);
            // If we haven't selected a 'best' yet (best == null)
            // or if our new guess is better than the current choice (guess.size() < best.size())
            // update our choice of best
            if (best == null || guess.size() < best.size())
            {
                best = guess;
                bestInt = i;
            }
        }
        // the chosen square itself is part of the representation
        best.add(bestInt);
        // check... not needed unless you coded wrong
        int sum = 0;
        for (int i = 0; i < best.size(); ++i)
            sum += best.get(i) * best.get(i);
        if (sum != n)
            throw new RuntimeException(String.format("n = %d, sum=%d, arr=%s\n", n, sum, best));
        // New step: Save the solution to the global cache
        cachedSolutions.put(n, best);
        // Same deal as before... if you don't return a copy, you end up modifying your previous solutions
        ArrayList<Integer> copy = new ArrayList<Integer>();
        for (int i: best) copy.add(i);
        return copy;
    }
}
It took my program around ~5s to run for n = 100,000. Clearly there is more to be done if we want it to be faster, and to solve for larger n. The main issue now is that in storing the entire list of results of previous answers, we use up a lot of memory. And all of that copying! There is more you can do, like storing only an integer and a pointer to the subproblem, but I'll let you do that.
And by the by, 1000 = 30^2 + 10^2.

Optimizing java heap usage by String using StringBuffer , StringBuilder , String.intern()

I am monitoring the performance and CPU usage of a large Java application using VisualVM. When I look at its memory profile, I see that most of the heap (about 50%) is used up by char arrays.
Following is a screenshot of the memory profile:
In the memory profile, at any given time I see roughly 9000 char[] objects.
The application accepts a large file as input. The file has roughly 80 lines, each consisting of 15-20 delimited config options. The application parses the file and stores these lines in an ArrayList of Strings. It then parses these strings to get the individual config options for each server.
The application also frequently logs each event to the console.
Java's implementation of String uses a char[] internally, along with a reference to the array and three ints.
From different posts on the internet it seems that StringBuffer, StringBuilder and String.intern() are more memory efficient.
How do they compare to java.lang.String? Has anybody benchmarked them? If the application uses multithreading (which it does), are they a safe alternative?
What I do is have one or more String pools. I do this to a) not create new Strings if I have one in the pool and b) reduce the retained memory size, sometimes by a factor of 3-5. You can write a simple string interner yourself, but I suggest you first consider how the data is read in, to determine the optimal solution. This matters, as you can easily make matters worse if you don't have an efficient solution.
As EJP points out, processing a line at a time is more efficient, as is parsing each line as you read it, i.e. an int or double takes up far less space than the same value as a String (unless you have a very high rate of duplication).
Here is an example of a StringInterner which takes a StringBuilder to avoid creating objects needlessly. You first populate a recycled StringBuilder with the text; if a String matching that text is in the interner, that String is returned (otherwise a toString() of the StringBuilder is). The benefit is that you only create objects (and no more than needed) when you see a new String (or at least one not in the array). This can get an 80% to 99% hit rate and reduce memory consumption (and garbage) dramatically when loading many strings of data.
public class StringInterner {
    private final String[] interner;
    private final int mask;

    public StringInterner(int capacity) {
        int n = nextPower2(capacity, 128);
        interner = new String[n];
        mask = n - 1;
    }

    public String intern(@NotNull CharSequence cs) {
        long hash = 0;
        for (int i = 0; i < cs.length(); i++)
            hash = 57 * hash + cs.charAt(i);
        int h = hash(hash) & mask;
        String s = interner[h];
        if (isEqual(s, cs))
            return s;
        String s2 = cs.toString();
        return interner[h] = s2;
    }

    static boolean isEqual(@Nullable CharSequence s, @NotNull CharSequence cs) {
        if (s == null) return false;
        if (s.length() != cs.length()) return false;
        for (int i = 0; i < cs.length(); i++)
            if (s.charAt(i) != cs.charAt(i))
                return false;
        return true;
    }

    static int nextPower2(int n, int min) {
        if (n < min) return min;
        if ((n & (n - 1)) == 0) return n;
        int i = min;
        while (i < n) {
            i *= 2;
            if (i <= 0) return 1 << 30;
        }
        return i;
    }

    static int hash(long n) {
        n ^= (n >> 43) ^ (n >> 21);
        n ^= (n >> 15) ^ (n >> 7);
        return (int) n;
    }
}
This class is interesting in that it is not thread safe in the traditional sense, but it will work correctly when used concurrently; in fact, it might work more efficiently when multiple threads have different views of the contents of the array.
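For comparison, a simpler pool that is thread safe in the traditional sense can be sketched with ConcurrentHashMap. This is a different technique from the array-based interner above: SimpleStringPool is a made-up name, the pool grows without bound, and it allocates a temporary String on every call, so it trades memory and garbage for simplicity.

```java
import java.util.concurrent.ConcurrentHashMap;

class SimpleStringPool {
    private final ConcurrentHashMap<String, String> pool = new ConcurrentHashMap<>();

    // Returns the pooled instance equal to cs, adding it on first sight.
    String intern(CharSequence cs) {
        String s = cs.toString();
        String existing = pool.putIfAbsent(s, s);
        return existing != null ? existing : s;
    }

    int size() {
        return pool.size();
    }
}
```

Parsing the same config value twice then yields the same String instance, so duplicates in the retained data collapse to a single object.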

"Last 100 bytes" Interview Scenario

I got this question in an interview the other day and would like to know some of the best possible answers (I did not answer very well, haha):
Scenario: There is a webpage that monitors the bytes sent over some network. Every time a byte is sent, the recordByte() function is called with that byte; this could happen hundreds of thousands of times per day. There is a button on this page that, when pressed, displays the last 100 bytes passed to recordByte() on screen (it does this by calling the print method below).
The following code is what I was given and asked to fill out:
public class networkTraffic {
    public void recordByte(Byte b){
    }
    public String print() {
    }
}
What is the best way to store the 100 bytes? A list? Curious how best to do this.
Something like this (circular buffer) :
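byte[] buffer = new byte[100];
int index = 0; // position of the next write, which is also the oldest byte

public void recordByte(Byte b) {
    buffer[index] = b;
    index = (index + 1) % 100;
}

public void print() {
    // start at the oldest byte and wrap around to the newest
    for(int i = index; i < index + 100; i++) {
        System.out.print(buffer[i % 100]);
    }
}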
The benefits of using a circular buffer:
You can reserve the space statically. In a real-time network application (VoIP, streaming,..)this is often done because you don't need to store all data of a transmission, but only a window containing the new bytes to be processed.
It's fast: can be implemented with an array with read and write cost of O(1).
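A fuller sketch of the circular buffer might look like this. The class name LastBytes, the hex output format and the count field are my own assumptions (the interview skeleton only fixes the two method names); the count lets print() show fewer than 100 values before the buffer has filled up.

```java
class LastBytes {
    private static final int CAPACITY = 100;
    private final byte[] buffer = new byte[CAPACITY];
    private int next = 0;   // index of the next write
    private int count = 0;  // number of valid bytes, at most CAPACITY

    public synchronized void recordByte(byte b) {
        buffer[next] = b;
        next = (next + 1) % CAPACITY;
        if (count < CAPACITY) count++;
    }

    // Returns the recorded bytes, oldest first, as two-digit hex values.
    public synchronized String print() {
        StringBuilder sb = new StringBuilder();
        int start = (next - count + CAPACITY) % CAPACITY; // oldest byte
        for (int i = 0; i < count; i++) {
            if (i > 0) sb.append(' ');
            sb.append(String.format("%02X", buffer[(start + i) % CAPACITY]));
        }
        return sb.toString();
    }
}
```

Both methods are synchronized here because the scenario has one caller recording bytes while another presses the button; whether that is the right locking granularity depends on the real workload.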
I don't know java, but there must be a queue concept whereby you would enqueue bytes until the number of items in the queue reached 100, at which point you would dequeue one byte and then enqueue another.
public void recordByte(Byte b)
{
    if (queue.ItemCount >= 100)
    {
        queue.dequeue();
    }
    queue.enqueue(b);
}
You could print by peeking at the items:
public String print()
{
    foreach (Byte b in queue)
    {
        print("X", b); // some hexadecimal print function
    }
}
Circular Buffer using array:
    Array A of 100 bytes
    Keep track of the head index i, i.e. where the next byte will be written
    For recordByte(), put the current byte in A[i] and set i = (i+1) % 100
    For print(), return subarray(i, 100) concatenated with subarray(0, i), so the oldest byte comes first
Queue using linked list (or the java Queue):
    For recordByte(), add the new byte to the end
    If the new length is more than 100, remove the first element
    For print(), simply print the list
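The queue variant could be sketched with java.util.ArrayDeque as follows. QueueTraffic is a made-up name, and the space-separated decimal output is only one possible print format.

```java
import java.util.ArrayDeque;

class QueueTraffic {
    private static final int LIMIT = 100;
    private final ArrayDeque<Byte> queue = new ArrayDeque<>(LIMIT);

    public void recordByte(byte b) {
        if (queue.size() >= LIMIT) {
            queue.pollFirst();   // drop the oldest byte
        }
        queue.addLast(b);        // append the newest byte
    }

    public String print() {
        StringBuilder sb = new StringBuilder();
        for (byte b : queue) {   // iterates from oldest to newest
            if (sb.length() > 0) sb.append(' ');
            sb.append(b);
        }
        return sb.toString();
    }
}
```

Compared with the plain array, this boxes every byte into a Byte object, so it is simpler to read but does more allocation per call.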
Here is my code. It might look a bit obscure, but I am pretty sure this is the fastest way to do it (at least it would be in C++, not so sure about Java):
public class networkTraffic {
    public networkTraffic() {
        _ary = new byte[100];
        _idx = _ary.length;
    }

    public void recordByte(Byte b){
        _ary[--_idx] = b;
        if (_idx == 0) {
            _idx = _ary.length;
        }
    }

    private int _idx;
    private byte[] _ary;
}
Some points to note:
No data is allocated/deallocated when calling recordByte().
I did not use %, because it is slower than a direct comparison and using the if (branch prediction might help here too)
--_idx is faster than _idx-- because no temporary variable is involved.
I count backwards to 0, because then I do not have to get _ary.length each time in the call, but only every 100 times when the first entry is reached. Maybe this is not necessary, the compiler could take care of it.
if there were less than 100 calls to recordByte(), the rest is zeroes.
The easiest thing is to shove it in an array. The maximum size the array can accommodate is 100 bytes. Keep adding bytes as they stream off the web. After the first 100 bytes are in the array, when the 101st byte comes, remove the byte at the head (i.e. the 0th). Keep doing this. This is basically a queue: the FIFO concept. After the download is done, you are left with the last 100 bytes.
Not just after the download, but at any given point in time, this array will hold the last 100 bytes.
@Yottagray I don't see where the problem is. There seem to be a number of generic approaches (array, circular array etc.) and a number of language-specific approaches (byteArray etc.). Am I missing something?
Multithreaded solution with non-blocking I/O:
private static final int N = 100;
private volatile byte[] buffer1 = new byte[N];
private volatile byte[] buffer2 = new byte[N];
private volatile int index = -1;
private volatile int tag;

synchronized public void recordByte(byte b) {
    index++;
    if (index == N * 2) {
        //both buffers are full
        buffer1 = buffer2;
        buffer2 = new byte[N];
        index = N;
    }
    if (index < N) {
        buffer1[index] = b;
    } else {
        buffer2[index - N] = b;
    }
}

public void print() {
    byte[] localBuffer1, localBuffer2;
    int localIndex, localTag;
    synchronized (this) {
        localBuffer1 = buffer1;
        localBuffer2 = buffer2;
        localIndex = index;
        localTag = tag++;
    }
    int buffer1Start = localIndex - N >= 0 ? localIndex - N + 1 : 0;
    int buffer1End = localIndex < N ? localIndex : N - 1;
    printSlice(localBuffer1, buffer1Start, buffer1End, localTag);
    if (localIndex >= N) {
        printSlice(localBuffer2, 0, localIndex - N, localTag);
    }
}

private void printSlice(byte[] buffer, int start, int end, int tag) {
    for(int i = start; i <= end; i++) {
        System.out.println(tag + ": "+ buffer[i]);
    }
}
Just for the heck of it. How about using an ArrayList<Byte>? Say why not?
import java.util.ArrayList;

public class networkTraffic {
    static ArrayList<Byte> networkMonitor; // ArrayList<Byte> reference
    static { networkMonitor = new ArrayList<Byte>(100); } // Static Initialization Block

    public void recordByte(Byte b){
        networkMonitor.add(b);
        while(networkMonitor.size() > 100){
            networkMonitor.remove(0); // drop the oldest byte
        }
    }

    public void print() {
        for (int i = 0; i < networkMonitor.size(); i++) {
            System.out.println(networkMonitor.get(i));
        }
        // if(networkMonitor.size() < 100){
        //     for(int i = networkMonitor.size(); i < 100; i++){
        //         System.out.println("Emtpy byte");
        //     }
        // }
    }
}

Improving a prime sieve algorithm

I'm trying to make a decent Java program that generates the primes from 1 to N (mainly for Project Euler problems).
At the moment, my algorithm is as follows:
1) Initialise an array of booleans (or a bitarray if N is sufficiently large) so they're all false, and an array of ints to store the primes found.
2) Set an integer, s, equal to the lowest prime (i.e. 2).
3) While s is <= sqrt(N):
    Set all multiples of s (starting at s^2) to true in the array/bitarray.
    Find the next smallest index in the array/bitarray which is false, and use that as the new value of s.
4) Go through the array/bitarray, and for every value that is false, put the corresponding index in the primes array.
Now, I've tried skipping over numbers not of the form 6k + 1 or 6k + 5, but that only gives me a ~2x speed up, whilst I've seen programs run orders of magnitudes faster than mine (albeit with very convoluted code), such as the one here
What can I do to improve?
Edit: Okay, here's my actual code (for N of 1E7):
int l = 10000000, n = 2, sqrt = (int) Math.sqrt(l);
boolean[] nums = new boolean[l + 1];
int[] primes = new int[664579];
while (n <= sqrt) {
    for (int i = 2 * n; i <= l; nums[i] = true, i += n);
    for (n++; nums[n]; n++);
}
for (int i = 2, k = 0; i < nums.length; i++) if (!nums[i]) primes[k++] = i;
Runs in about 350ms on my 2.0GHz machine.
While s is <= sqrt(N)
One mistake people often make in such algorithms is not precomputing the square root.
while (s <= sqrt(N)) {
is much, much slower than
int limit = sqrt(N);
while (s <= limit) {
But generally speaking, Eiko is right in his comment. If you want people to offer low-level optimisations, you have to provide code.
Update: OK, now about your code.
You may notice that the number of iterations in your code is just a little bigger than 'l' (you can put a counter inside the first 'for' loop; it will be just 2-3 times bigger). And, obviously, the complexity of your solution can't be less than O(l) (you can't have fewer than 'l' iterations).
What can make a real difference is accessing memory efficiently. Note that the guy who wrote that article tries to reduce storage size not just because he's memory-greedy. Making arrays compact lets you use the cache better and thus increases speed.
I just replaced boolean[] with a bit-packed int[] and achieved an immediate 2x speed gain (and 8x less memory). And I didn't even try to do it efficiently.
That's easy. You just replace every assignment a[i] = true with a[i/32] |= 1 << (i%32) and each read operation a[i] with (a[i/32] & (1 << (i%32))) != 0. And boolean[] a with int[] a, obviously.
From the first replacement it should be clear how it works: if f(i) is true, then there's a 1 bit in the integer a[i/32], at position i%32 (an int in Java has exactly 32 bits, as you know).
You can go further and replace i/32 with i >> 5, and i%32 with i&31. You can also precompute all 1 << j for each j between 0 and 31 in an array.
But sadly, I don't think you can get close to C with this in Java. Not to mention that the guy uses many other tricky optimizations, and I agree that his code would've been worth a lot more if he had added comments.
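Put together, the bit-packing replacements described above give something like the following sketch. PackedSieve and countPrimes are illustrative names, and the code assumes limit is well below Integer.MAX_VALUE so that the inner loop counter cannot overflow.

```java
class PackedSieve {
    // Returns the number of primes <= limit, using one bit per number packed into an int[].
    static int countPrimes(int limit) {
        if (limit < 2) return 0;
        int[] composite = new int[(limit >> 5) + 1]; // bit i set means i is composite
        int sqrt = (int) Math.sqrt(limit);           // precomputed once
        for (int s = 2; s <= sqrt; s++) {
            if ((composite[s >> 5] & (1 << (s & 31))) != 0) continue; // s is composite
            for (int m = s * s; m <= limit; m += s) {
                composite[m >> 5] |= 1 << (m & 31);  // same as a[m/32] |= 1 << (m%32)
            }
        }
        int count = 0;
        for (int i = 2; i <= limit; i++) {
            if ((composite[i >> 5] & (1 << (i & 31))) == 0) count++;
        }
        return count;
    }
}
```

For N = 1E7 the flag array shrinks from about 10 MB of booleans to about 1.25 MB of ints, which is what makes it more cache friendly.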
Using the BitSet will use less memory. The Sieve algorithm is rather trivial, so you can simply "set" the bit positions on the BitSet, and then iterate to determine the primes.
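A minimal sketch of that idea with java.util.BitSet follows. The names BitSetSieve and primesUpTo are made up; set bits mark composites, and nextClearBit() is used to walk over the primes.

```java
import java.util.BitSet;

class BitSetSieve {
    // Returns all primes <= limit in ascending order.
    static int[] primesUpTo(int limit) {
        if (limit < 2) return new int[0];
        BitSet composite = new BitSet(limit + 1);
        for (int s = 2; (long) s * s <= limit; s = composite.nextClearBit(s + 1)) {
            for (int m = s * s; m <= limit; m += s) {
                composite.set(m); // cross off every multiple of the prime s
            }
        }
        // The numbers 2..limit are limit - 1 candidates; the set bits are the composites.
        int[] primes = new int[limit - 1 - composite.cardinality()];
        int k = 0;
        for (int i = composite.nextClearBit(2); i <= limit; i = composite.nextClearBit(i + 1)) {
            primes[k++] = i;
        }
        return primes;
    }
}
```

Using cardinality() to size the result avoids hard-coding the prime count, at the cost of one extra pass over the bits.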
Did you also make the array smaller while skipping numbers not of the form 6k+1 and 6k+5?
I only tested with ignoring numbers of the form 2k and that gave me ~4x speed up (440 ms -> 120 ms):
int l = 10000000, n = 1, sqrt = (int) Math.sqrt(l);
int m = l/2;
boolean[] nums = new boolean[m + 1];
int[] primes = new int[664579];
int i, k;
while (n <= sqrt) {
    int x = (n<<1)+1;
    for (i = n+x; i <= m; nums[i] = true, i+=x);
    for (n++; nums[n]; n++);
}
primes[0] = 2;
for (i = 1, k = 1; i < nums.length; i++) {
    if (!nums[i])
        primes[k++] = (i<<1)+1;
}
The following is from my Project Euler library. It's a slight variation of the Sieve of Eratosthenes. I'm not sure, but I think it's called the Euler Sieve.
1) It uses a BitSet (so 1/8th of the memory)
2) It only uses the BitSet for odd numbers (another factor of 1/2, hence 1/16th)
Note: The inner loop (for multiples) begins at "n*n" rather than "2*n", and only multiples in increments of "2*n" are crossed off, hence the speed-up.
private void beginSieve(int mLimit)
{
    primeList = new BitSet(mLimit>>1);
    primeList.set(0, mLimit>>1); // bit k stands for the odd number 2k+1; start with all marked prime
    primeList.clear(0);          // 1 is not a prime
    int sqroot = (int) Math.sqrt(mLimit);
    for(int num = 3; num <= sqroot; num+=2)
    {
        if( primeList.get(num >> 1) )
        {
            int inc = num << 1;
            for(int factor = num * num; factor < mLimit; factor += inc)
            {
                //if( ((factor) & 1) == 1)
                primeList.clear(factor >> 1);
            }
        }
    }
}
and here's the function to check if a number is prime...
public boolean isPrime(int num)
{
    if( num < maxLimit)
    {
        if( (num & 1) == 0)
            return ( num == 2);
        return primeList.get(num>>1);
    }
    return false;
}
You could do the step of "putting the corresponding index in the primes array" while you are detecting them, taking out a run through the array, but that's about all I can think of right now.
I wrote a simple sieve implementation recently for the fun of it using BitSet (everyone says not to, but it's the best off the shelf way to store huge data efficiently). The performance seems to be pretty good to me, but I'm still working on improving it.
public class HelloWorld {
private static int LIMIT = 2140000000;//Integer.MAX_VALUE broke things.
private static BitSet marked;
public static void main(String[] args) {
long startTime = System.nanoTime();
long estimatedTime = System.nanoTime() - startTime;
System.out.println((float)estimatedTime/1000000000); //23.835363 seconds
System.out.println(marked.size()); //1070000000 ~= 127MB
private static void init()
double size = LIMIT * 0.5 - 1;
marked = new BitSet();
marked.set(0,(int)size, true);
private static void sieve()
int i = 0;
int cur = 0;
int add = 0;
int pos = 0;
while(((i<<1)+1)*((i<<1)+1) < LIMIT)
pos = i;
cur = pos;
add = (cur<<1);
pos += add*cur + cur - 1;
while(pos < marked.length() && pos > 0)
pos += add;
private static void readPrimes()
int pos = 0;
while(pos < marked.length())
With smaller LIMITs (say 10,000,000 which took 0.077479s) we get much faster results than the OP.
I bet java's performance is terrible when dealing with bits...
Algorithmically, the link you point out should be sufficient
Have you tried googling, e.g. for "java prime numbers". I did and dug up this simple improvement:
Surely, you can find more at google.
Here is my code for the Sieve of Eratosthenes; this is the most efficient version I could come up with:
final int MAX = 1000000;
int p[]= new int[MAX];
int prime[] = new int[MAX/10];
void sieve()
int i,j,k=1;
