Currently I have this prime generator, which is limited to n < 2^31-1 by the maximum number of elements in a Java array. I'm not entirely sure how I could expand the limit further, given that cap on array length.
Sieve:
public class Main {
    public static void main(String[] args) {
        int N = 2000000000; // must be an int: array lengths are capped at Integer.MAX_VALUE
        // initially assume all integers are prime
        boolean[] isPrime = new boolean[N + 1];
        for (int i = 2; i <= N; i++) {
            isPrime[i] = true;
        }
        // mark non-primes <= N using the Sieve of Eratosthenes
        for (int i = 2; i * i <= N; i++) {
            // if i is prime, then mark multiples of i as non-prime;
            // it suffices to consider multiples i, i+1, ..., N/i
            if (isPrime[i]) {
                for (int j = i; i * j <= N; j++) {
                    isPrime[i * j] = false;
                }
            }
        }
    }
}
How could I modify this to go past n = 2^32-1?
You may use an array of BitSet objects to represent a bit set indexed by a long. Here's a complete example:
import java.util.BitSet;

public class Main {

    private static class LongBitSet {
        // maximum number of bits stored in a single BitSet
        private static final int BITSET_SIZE = 1 << 30;
        BitSet[] bitsets;

        public LongBitSet(long limit) {
            bitsets = new BitSet[(int) (limit / BITSET_SIZE + 1)];
            // set all bits by default
            for (int i = 0; i < bitsets.length; i++) {
                bitsets[i] = new BitSet();
                int max = (int) (i == bitsets.length - 1 ?
                        limit % BITSET_SIZE : BITSET_SIZE);
                bitsets[i].set(0, max);
            }
        }

        // clear the specified bit
        public void clear(long pos) {
            bitsets[(int) (pos / BITSET_SIZE)].clear((int) (pos % BITSET_SIZE));
        }

        // get the value of the specified bit
        public boolean get(long pos) {
            return bitsets[(int) (pos / BITSET_SIZE)].get((int) (pos % BITSET_SIZE));
        }

        // get the number of set bits
        public long cardinality() {
            long cardinality = 0;
            for (BitSet bs : bitsets) {
                cardinality += bs.cardinality();
            }
            return cardinality;
        }
    }

    public static void main(String[] args) {
        long N = 4000000000L;
        // initially assume all integers are prime
        LongBitSet bs = new LongBitSet(N + 1);
        // clear 0 and 1: non-primes
        bs.clear(0);
        bs.clear(1);
        // mark non-primes <= N using the Sieve of Eratosthenes
        for (long i = 2; i * i <= N; i++) {
            if (bs.get(i)) {
                for (long j = i; i * j <= N; j++) {
                    bs.clear(i * j);
                }
            }
        }
        System.out.println(bs.cardinality());
    }
}
For N = 4_000_000_000L this program takes around 512 MB of memory, runs for a couple of minutes, and prints 189961812, which is the correct number of primes below 4 billion according to Wolfram Alpha. If you have enough RAM, you can try setting an even bigger N.
You can segment the sieve: instead of allocating a single, gigantic array you allocate many small arrays. If you want to find the primes up to 10^10 you might use arrays (or better yet, BitSets) of size 10^6 or so. Then you run the sieve 10^4 times. Each time you run a new segment you need to find out where to start each prime in the sieve, but that's not really too hard.
In addition to allowing much smaller memory use this keeps more of the memory in the cache, so it's often significantly faster.
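A sketch of that idea in Java (the segment size, method names, and the counting at the end are illustrative choices, not a tuned implementation):

```java
import java.util.ArrayList;
import java.util.List;

public class SegmentedSieve {
    // Count primes up to limit, sieving one segment at a time.
    static long countPrimes(long limit, int segmentSize) {
        // First sieve the small primes up to sqrt(limit) with a plain sieve.
        int sqrt = (int) Math.sqrt(limit);
        boolean[] composite = new boolean[sqrt + 1];
        List<Integer> smallPrimes = new ArrayList<>();
        for (int i = 2; i <= sqrt; i++) {
            if (!composite[i]) {
                smallPrimes.add(i);
                for (long j = (long) i * i; j <= sqrt; j += i) {
                    composite[(int) j] = true;
                }
            }
        }
        long count = 0;
        boolean[] segment = new boolean[segmentSize];
        for (long low = 2; low <= limit; low += segmentSize) {
            long high = Math.min(low + segmentSize - 1, limit);
            java.util.Arrays.fill(segment, false);
            // For each small prime, find where its multiples start in this segment.
            for (int p : smallPrimes) {
                long start = Math.max((long) p * p, ((low + p - 1) / p) * p);
                for (long j = start; j <= high; j += p) {
                    segment[(int) (j - low)] = true;
                }
            }
            for (long n = low; n <= high; n++) {
                if (!segment[(int) (n - low)]) count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countPrimes(1_000_000, 1 << 16)); // 78498 primes below one million
    }
}
```

Because `segment` is reused, memory use is O(sqrt(limit) + segmentSize) no matter how large the limit is.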
I see several options:

- pack 16 numbers into 1 byte
- remember only odd numbers, one per bit
- use unsigned variables to avoid wasting the sign bit
- use more than one table (but in a 32-bit app you are limited by OS capabilities, usually to 1/2/4 GB of usable memory)
- note that such a big table usually does not fit inside the cache, so it is not very fast anyway

You can also use several approaches at once. I combine periodic sieves with a binary search over the list of primes found so far. If you choose the sizes right you can even improve performance by better exploiting the platform's caching properties; see Prime numbers by Eratosthenes quicker sequential than concurrently?

The idea is to use sieves to test small divisors and only then check for presence in the prime list. It does not require much memory and is pretty fast. To spare memory you can combine 16/32/64-bit variables: use a small bit width while you can, so the prime list is divided into three groups (small/medium/big); if you also want bigints, add them as a fourth list.
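The odd-only bit packing mentioned above can be sketched like this (all names are illustrative; one bit per odd number gives 16 numbers per byte):

```java
public class OddSieve {
    // Sieve that stores one bit per odd number: bit i represents 2*i + 3.
    // Compared to one boolean per number this uses 1/16th of the memory.
    static long countPrimes(int limit) {
        if (limit < 2) return 0;
        int n = (limit - 1) / 2; // number of odd candidates 3, 5, ..., <= limit
        long[] composite = new long[(n + 63) / 64 + 1];
        for (long i = 3; i * i <= limit; i += 2) {
            int bit = (int) ((i - 3) / 2);
            // Java's long shift uses the low 6 bits of the count, so
            // 1L << bit is equivalent to 1L << (bit & 63).
            if ((composite[bit >> 6] & (1L << bit)) == 0) {
                for (long j = i * i; j <= limit; j += 2 * i) {
                    int b = (int) ((j - 3) / 2);
                    composite[b >> 6] |= 1L << b;
                }
            }
        }
        long count = 1; // the prime 2, which the odd-only table does not store
        for (int b = 0; 2L * b + 3 <= limit; b++) {
            if ((composite[b >> 6] & (1L << b)) == 0) count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countPrimes(1_000_000)); // 78498
    }
}
```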
I'm currently looking for the best way to select x unique ints from a range of n ints. It would be like calling Random.nextInt(range) multiple times, except that it should never select the same int twice.
If x > n, then the result would only contain n ints.
I tried to do this myself, and I currently have this, based on the Fisher-Yates shuffle:
private static final Random R = new Random();

public static int[] distinctRandoms(int nb, int max) {
    int[] all = new int[max];
    for (int i = 0; i < all.length; i++) {
        all[i] = i;
    }
    if (max <= nb) {
        return all;
    }
    int index;
    int[] result = new int[nb];
    for (int j = 0, k = all.length - 1; k > 0 && j < nb; k--, j++) {
        index = R.nextInt(k + 1);
        result[j] = all[index]; // save element
        all[index] = all[k];    // overwrite chosen with last element
    }
    return result;
}
It works and performance seems good, but I can't help thinking there must be a more performant way to do this and that I'm reinventing the wheel. I thought about doing things differently when nb > (max / 2) (removing elements rather than selecting them), but since you can't truncate an array in Java, you still end up copying all the elements you need.
This method also costs a lot when nb = max - 1.
Is there any built-in way to randomly select distinct ints efficiently in Java?
Edit 1:
What I mean by performant is time-efficient: I want it to be fast. I'll mostly work with small sets of randoms.
Edit 2:
I tried using shuffle like this, but it's much more expensive in terms of time because of all the extra object creation.
public static Integer[] distinctRandoms2(int nb, int max) {
    ArrayList<Integer> all = new ArrayList<Integer>(max);
    for (int i = 0; i < max; i++) {
        all.add(i);
    }
    if (max <= nb) {
        return all.toArray(new Integer[max]);
    }
    Collections.shuffle(all);
    return all.subList(0, nb).toArray(new Integer[nb]);
}
You can use Floyd's algorithm. It is much more efficient than shuffling if the number of elements to be selected is smaller than their range.
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

private static final Random random = new Random();

/**
 * Converts a set of Integer to an array of int.
 */
private static int[] setToArray(Set<Integer> aSet) {
    int[] result = new int[aSet.size()];
    int index = 0;
    for (int number : aSet) {
        result[index] = number;
        index++;
    }
    return result;
}

/**
 * Generates an array of min(count, maxValue) distinct random ints
 * from the [0, maxValue - 1] range.
 * @param count The number of elements to be generated.
 * @param maxValue The upper bound of the range (exclusive).
 */
public static int[] getDistinctRandomNumbers(int count, int maxValue) {
    Set<Integer> was = new HashSet<>();
    for (int i = Math.max(0, maxValue - count); i < maxValue; i++) {
        // the bound must be i + 1 (i.e. inclusive of i) for the selection
        // to be uniform
        int curr = random.nextInt(i + 1);
        if (was.contains(curr))
            curr = i;
        was.add(curr);
    }
    return setToArray(was);
}
It has O(count) expected time and space complexity, where count is the number of distinct integers to be generated.
You can use the shuffle method from the java.util.Collections class.
Just create a list of Integers from 0 to max-1, then call shuffle on it and take the first nb elements.
Using shuffle makes sense when nb is close to max. So it would be good for the following pairs of parameters:
nb=70, max=100
nb=900, max=1000
nb=9000, max=10000
but not so good for:
nb=10, max=10^8
nb=100, max=10^9
It would be a good idea to combine the shuffle approach with Floyd's algorithm from the other answer, selecting the algorithm based on the ratio nb/max. The border ratio should be chosen carefully.
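Such a combination could be sketched as follows (the cut-off ratio of one half is an arbitrary guess, not a tuned constant):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

public class DistinctRandoms {
    private static final Random RANDOM = new Random();

    static int[] pick(int nb, int max) {
        nb = Math.min(nb, max);
        // Arbitrary cut-off: shuffle when more than half the range is needed.
        if (nb > max / 2) {
            List<Integer> all = new ArrayList<>(max);
            for (int i = 0; i < max; i++) all.add(i);
            Collections.shuffle(all, RANDOM);
            int[] result = new int[nb];
            for (int i = 0; i < nb; i++) result[i] = all.get(i);
            return result;
        }
        // Otherwise Floyd's algorithm: O(nb) expected time and space.
        Set<Integer> chosen = new HashSet<>();
        for (int i = max - nb; i < max; i++) {
            int candidate = RANDOM.nextInt(i + 1); // uniform in [0, i]
            chosen.add(chosen.contains(candidate) ? i : candidate);
        }
        int[] result = new int[nb];
        int idx = 0;
        for (int v : chosen) result[idx++] = v;
        return result;
    }
}
```

In practice the right cut-off depends on boxing costs and cache behaviour, so it should be measured rather than guessed.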
It depends on what you mean by Performant and Random.
If you really need something that costs O(1) or similar, you could use a linear feedback shift register, or LFSR. It generates a random-like sequence of numbers (i.e. statistically random but entirely predictable) using a simple shift-and-XOR operation on the previous number, and is thus probably the fastest mechanism possible.
This approach is most appropriate if you want any n-bit number; limiting the range by discarding numbers outside it may reduce performance.
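A minimal sketch of a 16-bit Galois LFSR (the tap mask 0xB400, corresponding to x^16 + x^14 + x^13 + x^11 + 1, is a standard maximal-length example; the seed and names are arbitrary):

```java
public class Lfsr16 {
    // A maximal-length 16-bit Galois LFSR visits every value in
    // [1, 65535] exactly once before repeating; 0 is never produced.
    private int state;

    Lfsr16(int seed) {
        if ((seed & 0xFFFF) == 0) throw new IllegalArgumentException("seed must be nonzero");
        state = seed & 0xFFFF;
    }

    int next() {
        int lsb = state & 1;      // output bit before the shift
        state >>>= 1;
        if (lsb != 0) state ^= 0xB400; // apply the feedback taps
        return state;
    }

    public static void main(String[] args) {
        Lfsr16 lfsr = new Lfsr16(0xACE1);
        int start = lfsr.next();
        long period = 1;
        while (lfsr.next() != start) period++;
        System.out.println(period); // 65535 for a maximal-length register
    }
}
```

Note the values are distinct within one period but not uniformly shuffled, so this is only suitable when "statistically random-looking" is good enough.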
If by "small sets of randoms" you mean that max is small, the Collections#shuffle approach is probably as good as you can get.
If max can be arbitrarily large but nb is small, then using a HashSet may be your best option, although you will pay some boxing/unboxing cost. If you want to avoid that cost, you can try an IntHashSet or a similar primitive specialisation of HashSet.
I defined a 2D array in Java. From what I have read, the first dimension of a 2D array is an array of references (pointers) - I do not know whether that is right, so please tell me. If I treat it as such, on a 64-bit system, what will be the memory usage of the code below after execution?
short[][] array1 = new short[10][];
short[][] array2 = new short[10][];
for (int i = 0; i < 10; i++)
    array1[i] = new short[1];
for (int i = 0; i < 10; i++)
    array2[i] = array1[i];
Please tell me the total size used by the above code.
For every one-dimensional array there is a 24-byte overhead in addition to the space for the data in the array.
So your first two lines each create an array of 10 references - you are right that these behave like pointers - which take 8 bytes each on a 64-bit system. That means you are allocating 2 * (24 + 10 * 8) = 208 bytes.
In the first for loop you are creating 10 arrays of a single short each, i.e. 24 + 2 = 26 bytes apiece. These are padded to 8-byte boundaries and thus take up 32 bytes each, or 320 bytes in total.
In the second loop you are not allocating any new memory; you are only copying references.
In total you are using 208 + 320 = 528 bytes.
Note that the actual usage depends on the JVM. For example:
Some JVMs compress pointers.
Some JVMs only use a 12 byte header for arrays.
In order to see the difference between outer and inner arrays it might be helpful to rewrite your code like this:
short[][] outerArray = new short[10][]; // array of references to short arrays
short[] innerArray;                     // array of shorts
for (int i = 0; i < 10; i++) {
    innerArray = new short[1];
    outerArray[i] = innerArray;
}
I have a char [], and I want to set the value of every index to the same char value.
There is the obvious way to do it (iteration):
char f = '+';
char[] c = new char[50];
for (int i = 0; i < c.length; i++) {
    c[i] = f;
}
But I was wondering if there's a way that I can utilize System.arraycopy or something equivalent that would bypass the need to iterate. Is there a way to do that?
EDIT :
From Arrays.java
public static void fill(char[] a, int fromIndex, int toIndex, char val) {
    rangeCheck(a.length, fromIndex, toIndex);
    for (int i = fromIndex; i < toIndex; i++)
        a[i] = val;
}
This is exactly the same process, which shows that there might not be a better way to do this.
+1 to everyone who suggested fill anyway - you're all correct and thank you.
Try Arrays.fill(c, f); see the Arrays javadoc.
As another option, and for posterity: I looked into this recently and found a solution that allows a much shorter loop by handing some of the work off to the System class, which (if the JVM you're using is smart enough) can be turned into a memset operation:
/*
 * Initialize a smaller piece of the array and use the System.arraycopy
 * call to fill in the rest of the array in an expanding binary fashion.
 */
public static void bytefill(byte[] array, byte value) {
    int len = array.length;
    if (len > 0) {
        array[0] = value;
    }
    // Value of i will be [1, 2, 4, 8, 16, 32, ..., len]
    for (int i = 1; i < len; i += i) {
        System.arraycopy(array, 0, array, i, ((len - i) < i) ? (len - i) : i);
    }
}
This solution was taken from the IBM research paper "Java server performance: A case study of building efficient, scalable Jvms" by R. Dimpsey, R. Arora, K. Kuiper.
Simplified explanation
As the comment suggests, this sets index 0 of the destination array to your value, then uses the System class to copy one element (the byte at index 0) to index 1, then those two elements (indices 0 and 1) into 2 and 3, then those four elements (0, 1, 2 and 3) into 4, 5, 6 and 7, and so on...
Efficiency (at the point of writing)
In a quick run-through, grabbing System.nanoTime() before and after and calculating a duration, I came up with:
- This method: 332,617 - 390,262 ns (lowest - highest from 10 tests)
- Float[] n = new Float[array.length]; // fill with null: 666,650 ns
- Setting via loop: 3,743,488 - 9,767,744 ns (lowest - highest from 10 tests)
- Arrays.fill: 12,539,336 ns
The JVM and JIT compilation
It should be noted that as the JVM and JIT evolves, this approach may well become obsolete as library and runtime optimisations could reach or even exceed these numbers simply using fill().
At the time of writing, this was the fastest option I had found. It has been mentioned this might not be the case now but I have not checked. This is the beauty and the curse of Java.
Use Arrays.fill
char f = '+';
char[] c = new char[50];
Arrays.fill(c, f);
Java Programmer's FAQ Part B Sect 6 suggests:
public static void bytefill(byte[] array, byte value) {
    int len = array.length;
    if (len > 0)
        array[0] = value;
    for (int i = 1; i < len; i += i)
        System.arraycopy(array, 0, array, i,
                ((len - i) < i) ? (len - i) : i);
}
This essentially makes log2(array.length) calls to System.arraycopy which hopefully utilizes an optimized memcpy implementation.
However, is this technique still required on modern Java JITs such as the Oracle/Android JIT?
System.arraycopy is my answer. Please let me know if there are any better ways. Thanks.
import java.text.SimpleDateFormat;
import java.util.Arrays;
import java.util.Date;

private static long[] r1 = new long[64];
private static long[][] r2 = new long[64][64];

/** Proved:
 * {@link Arrays#fill(Object[], Object)} gives r2 64 references to r1 - not the answer;
 * {@link Arrays#fill(long[], long)} is sometimes slower than two nested loops.<br/>
 */
private static void testFillPerformance() {
    SimpleDateFormat sdf = new SimpleDateFormat("HH:mm:ss");
    System.out.println(sdf.format(new Date()));
    Arrays.fill(r1, 0L);

    long stamp0 = System.nanoTime();
    // Arrays.fill(r2, 0L); -- compiles, but throws ArrayStoreException at runtime
    long stamp1 = System.nanoTime();
    // System.out.println(String.format("Arrays.fill takes %s nano-seconds.", stamp1 - stamp0));

    stamp0 = System.nanoTime();
    for (int i = 0; i < 64; i++) {
        for (int j = 0; j < 64; j++)
            r2[i][j] = 0L;
    }
    stamp1 = System.nanoTime();
    System.out.println(String.format("Arrays' 2-looping takes %s nano-seconds.", stamp1 - stamp0));

    stamp0 = System.nanoTime();
    for (int i = 0; i < 64; i++) {
        System.arraycopy(r1, 0, r2[i], 0, 64);
    }
    stamp1 = System.nanoTime();
    System.out.println(String.format("System.arraycopy looping takes %s nano-seconds.", stamp1 - stamp0));

    stamp0 = System.nanoTime();
    Arrays.fill(r2, r1);
    stamp1 = System.nanoTime();
    System.out.println(String.format("One round Arrays.fill takes %s nano-seconds.", stamp1 - stamp0));

    stamp0 = System.nanoTime();
    for (int i = 0; i < 64; i++)
        Arrays.fill(r2[i], 0L);
    stamp1 = System.nanoTime();
    System.out.println(String.format("Two rounds Arrays.fill takes %s nano-seconds.", stamp1 - stamp0));
}
12:33:18
Arrays' 2-looping takes 133536 nano-seconds.
System.arraycopy looping takes 22070 nano-seconds.
One round Arrays.fill takes 9777 nano-seconds.
Two rounds Arrays.fill takes 93028 nano-seconds.
12:33:38
Arrays' 2-looping takes 133816 nano-seconds.
System.arraycopy looping takes 22070 nano-seconds.
One round Arrays.fill takes 17042 nano-seconds.
Two rounds Arrays.fill takes 95263 nano-seconds.
12:33:51
Arrays' 2-looping takes 199187 nano-seconds.
System.arraycopy looping takes 44140 nano-seconds.
One round Arrays.fill takes 19555 nano-seconds.
Two rounds Arrays.fill takes 449219 nano-seconds.
12:34:16
Arrays' 2-looping takes 199467 nano-seconds.
System.arraycopy looping takes 42464 nano-seconds.
One round Arrays.fill takes 17600 nano-seconds.
Two rounds Arrays.fill takes 170971 nano-seconds.
12:34:26
Arrays' 2-looping takes 198907 nano-seconds.
System.arraycopy looping takes 24584 nano-seconds.
One round Arrays.fill takes 10616 nano-seconds.
Two rounds Arrays.fill takes 94426 nano-seconds.
As of Java 8, there are four variants of the setAll method, which sets all elements of the specified array using a provided generator function to compute each element.
Of those four overloads, three accept an array of primitives and are declared as follows:
setAll(double[] array, IntToDoubleFunction generator)
setAll(int[] array, IntUnaryOperator generator)
setAll(long[] array, IntToLongFunction generator)
Examples of how to use the aforementioned methods:
// given an index, set the element at the specified index with the provided value
double[] doubles = new double[50];
Arrays.setAll(doubles, index -> 30D);

// given an index, set the element at the specified index with the provided value
int[] ints = new int[50];
Arrays.setAll(ints, index -> 60);

// given an index, set the element at the specified index with the provided value
long[] longs = new long[50];
Arrays.setAll(longs, index -> 90L);
The function provided to the setAll method receives the element index and returns a value for that index.
You may be wondering: what about a char array?
This is where the fourth overload of the setAll method comes into play. As there is no overload that accepts an array of char primitives, the only option is to change the declaration of the array to type Character[].
If changing the type of the array to Character[] is not appropriate, then you can fall back to the Arrays.fill method.
Example of using the setAll method with Character[]:
// given an index, set the element at the specified index with the provided value
Character[] characters = new Character[50];
Arrays.setAll(characters, index -> '+');
That said, it's simpler to use the Arrays.fill method rather than the setAll method to set a single specific value.
The setAll method has the advantage that you can either set all the elements of the array to have the same value or generate an array of even numbers, odd numbers or any other formula:
e.g.
int[] evenNumbers = new int[10];
Arrays.setAll(evenNumbers, i -> i * 2);
There are also several overloads of the parallelSetAll method, which runs in parallel, although it's important to note that the function passed to parallelSetAll must be side-effect free.
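For example (the array size and generator below are arbitrary; the generator depends only on the index, so it is safe to run in parallel):

```java
import java.util.Arrays;

public class ParallelSetAllDemo {
    public static void main(String[] args) {
        long[] squares = new long[1_000];
        // Each element is computed from its index alone - no shared state.
        Arrays.parallelSetAll(squares, i -> (long) i * i);
        System.out.println(squares[999]); // 998001
    }
}
```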
Conclusion
If your goal is simply to set a specific value for each element of the array then using the Arrays.fill overloads would be the most appropriate option. However, if you want to be more flexible or generate elements on demand then using the Arrays.setAll or Arrays.parallelSetAll (when appropriate) would be the option to go for.
I have a minor improvement on Ross Drew's answer.
For a small array, a simple loop is faster than the System.arraycopy approach, because of the overhead associated with setting up System.arraycopy. Therefore, it's better to fill the first few bytes of the array using a simple loop, and only move to System.arraycopy when the filled array has a certain size.
The optimal size of the initial loop will be JVM specific and system specific of course.
private static final int SMALL = 16;

public static void arrayFill(byte[] array, byte value) {
    int len = array.length;
    int lenB = len < SMALL ? len : SMALL;
    for (int i = 0; i < lenB; i++) {
        array[i] = value;
    }
    for (int i = SMALL; i < len; i += i) {
        System.arraycopy(array, 0, array, i, len < i + i ? len - i : i);
    }
}
If you have another array of char, char[] b and you want to replace c with b, you can use c=b.clone();.
See Arrays.fill method:
char f = '+';
char [] c = new char [50];
Arrays.fill(c, f);
Arrays.fill might suit your needs
Arrays.fill(myArray, 'c');
Arrays.fill
It is quite possible, though, that this just does the loop in the background and is therefore no more efficient than what you have (other than saving lines of code). If you really care about efficiency, try the following in comparison to the above:
int size = 50;
char[] array = new char[size];
for (int i = 0; i < size; i++) {
    array[i] = 'c';
}
Notice that the above reads the array's length once, rather than accessing array.length on each iteration.
/**
 * Assigns the specified char value to each element of the specified array
 * of chars.
 *
 * @param a the array to be filled
 * @param val the value to be stored in all elements of the array
 */
public static void fill(char[] a, char val) {
    for (int i = 0, len = a.length; i < len; i++)
        a[i] = val;
}
That's the way Arrays.fill does it.
(I suppose you could drop into JNI and use memset.)
You could use arraycopy, but it depends on whether you can predefine the source array: do you need a different fill character each time, or are you filling arrays repeatedly with the same char?
Clearly the length of the fill matters: either you need a source that is bigger than all possible destinations, or you need a loop that repeatedly arraycopies a chunk of data until the destination is full.
char f = '+';
char[] c = new char[50];
for (int i = 0; i < c.length; i++) {
    c[i] = f;
}
char[] d = new char[50];
System.arraycopy(c, 0, d, 0, d.length);
Arrays.fill is the best option for general-purpose use.
If you need to fill large arrays, though, as of JDK 1.8u102 there is a faster way that leverages System.arraycopy.
You can take a look at this alternate Arrays.fill implementation:
According to the JMH benchmarks, you can get an almost 2x performance boost for large arrays (1000+ elements).
In any case, these implementations should be used only where needed; the JDK's Arrays.fill should be the preferred choice.
How can I initialize an array of size 1000 * 1000 * 1000 * 1000 with Integer.MAX_VALUE in every cell?
For example, I want to make this int[][][][] dp = new int[1000][1000][1000][1000]; hold the max value everywhere, as later I need to compute a minimum over it.
I tried
int[] arr = new int[N];
Arrays.fill(arr, Integer.MAX_VALUE);
but it doesn't work with multidimensional arrays. Can anyone help?
You'll have to do this to fill your multi-dimensional array:
for (int i = 0; i < dp.length; i++) {
    for (int j = 0; j < dp[i].length; j++) {
        for (int k = 0; k < dp[i][j].length; k++) {
            Arrays.fill(dp[i][j][k], Integer.MAX_VALUE);
        }
    }
}
You won't, however, be able to initialize new int[1000][1000][1000][1000] unless you have at least about 3.64 terabytes of memory. Not to mention how long that would take if you did have that much memory.
You need something very specialized like Colt to create what is called a sparse matrix. You need to alter your logic slightly: instead of testing against Integer.MAX_VALUE, test whether something exists at a location (it defaults to zero); if nothing does, treat it as Integer.MAX_VALUE and leave it alone.
This assumes you only insert a fraction of the possible data with values < Integer.MAX_VALUE.
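A minimal sketch of that idea, with a plain HashMap standing in for a real sparse-matrix library (the base-1000 key packing and all names are assumptions for illustration):

```java
import java.util.HashMap;
import java.util.Map;

public class SparseDp {
    // Only cells that were explicitly set are stored; every other cell is
    // implicitly Integer.MAX_VALUE. Memory grows with the number of set
    // cells, not with the 1000^4 logical size.
    private final Map<Long, Integer> cells = new HashMap<>();

    private static long key(int a, int b, int c, int d) {
        // Base-1000 packing: valid while every index is in [0, 1000),
        // since 1000^4 - 1 < 2^63.
        return (((long) a * 1000 + b) * 1000 + c) * 1000 + d;
    }

    int get(int a, int b, int c, int d) {
        return cells.getOrDefault(key(a, b, c, d), Integer.MAX_VALUE);
    }

    void set(int a, int b, int c, int d, int value) {
        cells.put(key(a, b, c, d), value);
    }

    // Keep the smaller of the stored value and the candidate,
    // matching the "compare a minimum" use case from the question.
    void relaxMin(int a, int b, int c, int d, int candidate) {
        if (candidate < get(a, b, c, d)) set(a, b, c, d, candidate);
    }
}
```

A dedicated library avoids the boxing overhead of HashMap<Long, Integer>, but the access pattern is the same.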
Note that Arrays.fill cannot take one value per dimension: a call like fill(array, 0, 0, 0) would actually resolve to the fill(int[], fromIndex, toIndex, val) overload. To fill a multidimensional array you have to loop over the outer dimensions and call fill on each innermost array.
Cheers,