Before I start, I just want to say this is for a homework assignment. An issue I have faced recently in this class is testing my code effectively, so that when I submit my solutions I can be confident all input combinations are covered. I have been told that much of testing is subjective and requires actual knowledge of the implementation; one size never fits all. That being said, are there guidelines on how to test effectively?
One example I am currently struggling with is a dynamic programming implementation of the longest common subsequence of two sequences. All of my tests pass, but when I submit to the grader I get stuck at what I assume is an edge case (we are not allowed to see the input or output of a failed test case after a certain test).
import java.util.Arrays;
import java.util.Scanner;
public class LCS2 {
int [][] solutionMatrix;
public int lcs2(int[] a, int[] b) {
final int aARRAY_LENGTH = a.length + 1;
final int bARRAY_LENGTH = b.length + 1;
solutionMatrix = new int[aARRAY_LENGTH][bARRAY_LENGTH];
//set edge indexes equal to i
for (int i = 1; i < aARRAY_LENGTH; i++) {
solutionMatrix[i][0] = i;
}
for (int i = 1; i < bARRAY_LENGTH; i++) {
solutionMatrix[0][i] = i;
}
//fill in matrix to determine if each element is a insert, delete, match or mismatch
for (int i = 1; i < aARRAY_LENGTH; i++) {
for (int j = 1; j < bARRAY_LENGTH; j++) {
int insertion = solutionMatrix[i ][j - 1] + 1;
int deletion = solutionMatrix[i - 1][j ] + 1;
int match = solutionMatrix[i - 1][j - 1];
int mismatch = solutionMatrix[i - 1][j - 1] + 1;
//System.out.println("i: " + i + " j: " + j + " [" +insertion + " " + deletion + " " + match + " " + mismatch + "] ");
if (a[i - 1] == b[j - 1])
solutionMatrix[i][j] = Math.min(insertion, Math.min(deletion,match));
else
solutionMatrix[i][j] = Math.min(insertion, Math.min(deletion,mismatch));
}
}
//print out matrix for visualization
System.out.println(" " + Arrays.toString(b));
for (int i = 0; i < aARRAY_LENGTH; i++) {
if (i - 1 < 0)
System.out.println(" " + Arrays.toString(solutionMatrix[i]));
else
System.out.println("[" + a[i -1] + "] " + Arrays.toString(solutionMatrix[i]));
}
return outputAlignment(a.length, b.length, 0);
}
private int outputAlignment(int i, int j, int ret){
//recursive call.. if indexes are 0 then return
if (i == 0 && j == 0)
return ret;
//find pointer.. is this a insert, deletion, match or mismatch
int backtrack = backtrack(i, j);
//change current index based on result of backtrack
if (backtrack == 3)
//if matched then add one to the longest sequence
ret = outputAlignment(i - 1, j - 1, ret) + 1;
else if (backtrack == 2)
ret = outputAlignment(i - 1, j , ret);
else if (backtrack == 1)
ret = outputAlignment(i , j - 1, ret);
else
ret = outputAlignment(i - 1, j - 1, ret);
return ret;
}
private int backtrack(int i, int j){
//System.out.println("i: " + i + " j: " + j);
int currValue = solutionMatrix[i][j];
// System.out.println(currValue);
if ( currValue == solutionMatrix[i ][j - 1] + 1)
return 1; // insertion
else if (currValue == solutionMatrix[i - 1][j ] + 1)
return 2; //deletion
else if (currValue == solutionMatrix[i - 1][j - 1] )
return 3; //match
else if (currValue == solutionMatrix[i - 1][j - 1] + 1)
return 4; //mismatch
return 5;
}
public static void main(String[] args) {
Scanner scanner = new Scanner(System.in);
int n = scanner.nextInt();
int[] a = new int[n];
for (int i = 0; i < n; i++) {
a[i] = scanner.nextInt();
}
int m = scanner.nextInt();
int[] b = new int[m];
for (int i = 0; i < m; i++) {
b[i] = scanner.nextInt();
}
LCS2 lcs = new LCS2();
System.out.println(lcs.lcs2(a, b));
}
}
Here are some junit tests i have put together with no additional setup needed to run.
@Test
public void testLCS2(){
LCS2 lcs = new LCS2();
int[] a = {2,7,5};
int[] b = {2,5};
assertEquals("Testing Longest Common SubSequence for [2,7,5] --> [2,5]", 2,lcs.lcs2(a,b));
a = new int[] {2,3,9};
b = new int[] {2,9,7,8};
assertEquals("Testing Longest Common SubSequence for [2,3,9] --> [2,9,7,8]", 2,lcs.lcs2(a,b));
a = new int[] {1,2,3,4};
b = new int[] {1,2,3,4};
assertEquals("Testing Longest Common SubSequence for [1,2,3,4] --> [1,2,3,4]", 4,lcs.lcs2(a,b));
a = new int[] {7};
b = new int[] {1,2,3,4};
assertEquals("Testing Longest Common SubSequence for [7] --> [1,2,3,4]", 0,lcs.lcs2(a,b));
a = new int[] {2,7,8,3};
b = new int[] {5,2,8,7};
assertEquals("Testing Longest Common SubSequence for [2,7,8,3] --> [5,2,8,7]", 2,lcs.lcs2(a,b));
a = new int[] {1,1,1,1};
b = new int[] {1,2,3,4};
assertEquals("Testing Longest Common SubSequence for [1,1,1,1] --> [1,2,3,4]", 1,lcs.lcs2(a,b));
a = new int[] {1,1,1,1};
b = new int[] {1,2,3,4,1};
assertEquals("Testing Longest Common SubSequence for [1,1,1,1] --> [1,2,3,4,1]", 2,lcs.lcs2(a,b));
}
Before posting here, you should have done your research to answer your stated question: "are there guidelines on how to effectively test?" It's easy enough to find overviews of mind-numbing length, such as this very good one.
In your case, I suggest outlining "equivalence classes". Since exhaustive testing is impractical in most cases, and impossible when the input is of arbitrary length, you develop representative cases -- one for each class, to stand for all members of that class. For instance, lists with no common members might form one such class.
I can't comment well on your given tests, in part because you haven't explained why you tested these particular inputs, or why you decided that these were sufficient. Equivalence classes require you to analyze the problem to determine (or estimate) what processing differences there might be among various inputs.
Write out your rationale, and perhaps get a friend to brainstorm with you. Have you covered various ways in which the first iterations of your algorithm could go down the wrong path, and have to backtrack?
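As a rough illustration of the equivalence-class idea for this problem, the sketch below lists one representative input per class and checks any implementation against it. The class name `LcsEquivalenceClasses`, the `failures` helper, and the particular choice of classes are assumptions made for this example; it is not a complete partition of the input space.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToIntBiFunction;

class LcsEquivalenceClasses {
    // One representative per class: {description, a, b, expected LCS length}.
    private static final Object[][] CASES = {
        {"both inputs empty", new int[] {}, new int[] {}, 0},
        {"one input empty", new int[] {}, new int[] {1, 2}, 0},
        {"no common elements", new int[] {7}, new int[] {1, 2, 3, 4}, 0},
        {"identical inputs", new int[] {1, 2, 3, 4}, new int[] {1, 2, 3, 4}, 4},
        {"one input is a subsequence of the other", new int[] {2, 5}, new int[] {2, 7, 5}, 2},
        {"same elements in reverse order", new int[] {1, 2, 3}, new int[] {3, 2, 1}, 1},
        {"repeated values", new int[] {1, 1, 1, 1}, new int[] {1, 2, 3, 4, 1}, 2},
        {"matches that cross each other", new int[] {2, 7, 8, 3}, new int[] {5, 2, 8, 7}, 2},
    };

    // Runs every representative and returns one message per failing class.
    static List<String> failures(ToIntBiFunction<int[], int[]> lcs) {
        List<String> failed = new ArrayList<>();
        for (Object[] c : CASES) {
            int expected = (Integer) c[3];
            int actual = lcs.applyAsInt((int[]) c[1], (int[]) c[2]);
            if (actual != expected) {
                failed.add(c[0] + ": expected " + expected + " but got " + actual);
            }
        }
        return failed;
    }
}
```

Something like `LcsEquivalenceClasses.failures(new LCS2()::lcs2)` should return an empty list when every representative passes. Naming the class in each failure message tells you which kind of input broke, which a single combined assertion does not.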
So, in conjunction with all of your responses, I ended up doing the following. Thank you for your help.
wrote a brute-force algorithm to compare against
found the bug in my underlying algorithm
simplified the whole thing to a single method called from main or JUnit
int[][] solutionMatrix;
public int lcs2(int[] a, int[] b) {
    final int aARRAY_LENGTH = a.length + 1;
    final int bARRAY_LENGTH = b.length + 1;
    solutionMatrix = new int[aARRAY_LENGTH][bARRAY_LENGTH];
    //solutionMatrix[i][j] holds the LCS length of the first i elements of a and the first j elements of b
    for (int i=0; i<=a.length; i++) {
        for (int j=0; j<=b.length; j++) {
            if (i == 0 || j == 0)
                solutionMatrix[i][j] = 0;
            else if (a[i-1] == b[j-1])
                solutionMatrix[i][j] = solutionMatrix[i-1][j-1] + 1;
            else
                solutionMatrix[i][j] = Math.max(solutionMatrix[i-1][j], solutionMatrix[i][j-1]);
        }
    }
    return solutionMatrix[a.length][b.length];
}
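The brute-force comparison mentioned in the list above was not posted, so the following is only a guess at what such a check might look like. It compares a candidate implementation against an exponential-time reference on many small random inputs. The names `LcsStressTest`, `bruteForce`, and `firstMismatch`, and the array sizes and alphabet, are all assumptions for this sketch.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.function.ToIntBiFunction;

class LcsStressTest {
    // Reference answer: exponential-time recursion taken directly from the definition.
    static int bruteForce(int[] a, int[] b, int i, int j) {
        if (i == a.length || j == b.length) return 0;
        if (a[i] == b[j]) return 1 + bruteForce(a, b, i + 1, j + 1);
        return Math.max(bruteForce(a, b, i + 1, j), bruteForce(a, b, i, j + 1));
    }

    // Returns a description of the first disagreement, or null if none is found.
    static String firstMismatch(ToIntBiFunction<int[], int[]> candidate, long seed, int rounds) {
        Random random = new Random(seed);
        for (int round = 0; round < rounds; round++) {
            int[] a = randomArray(random);
            int[] b = randomArray(random);
            int expected = bruteForce(a, b, 0, 0);
            int actual = candidate.applyAsInt(a, b);
            if (expected != actual) {
                return Arrays.toString(a) + " vs " + Arrays.toString(b)
                        + ": expected " + expected + " but got " + actual;
            }
        }
        return null;
    }

    // Short arrays over a tiny alphabet, so repeated values and empty inputs occur often.
    private static int[] randomArray(Random random) {
        int[] values = new int[random.nextInt(7)]; // length 0..6
        for (int k = 0; k < values.length; k++) {
            values[k] = random.nextInt(3);
        }
        return values;
    }
}
```

Keeping the inputs tiny matters for two reasons: the reference stays fast, and any mismatch it reports is small enough to trace by hand through the matrix.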
Basically I replaced n with aData[i] in the non-working implementation. Am I missing something fundamental? The second implementation fails on the same test data.
Passing implementation:
static long[] sort(long[] aData) {
for (int i = 1; i < aData.length; i++) {
long n = aData[i];
int j = i - 1;
while (j >= 0 && aData[j] > n) {
aData[j + 1] = aData[j];
j--;
}
aData[j + 1] = n;
}
return aData;
}
Failing implementation:
static long[] sort(long[] aData) {
for (int i = 1; i < aData.length; i++) {
int j = i - 1;
while (j >= 0 && aData[j] > aData[i]) {
aData[j + 1] = aData[j];
j--;
}
aData[j + 1] = aData[i];
}
return aData;
}
In the first iteration of the while loop, j + 1 == i. So when you write aData[j + 1] = aData[j], you change the value of aData[i] within the loop.
In the initial version, n is constant throughout the operation. Also note that using aData[i] instead of n is very unlikely to improve performance (if anything, it will probably be slower).
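To make the overwrite concrete, here is a small side-by-side sketch of the two variants. The class and method names are made up for the illustration; the loop bodies follow the two versions in the question.

```java
import java.util.Arrays;

class InsertionSortAliasing {
    // Failing variant: reads aData[i] again after the shift may have overwritten that slot.
    static long[] sortRereadingSlot(long[] aData) {
        for (int i = 1; i < aData.length; i++) {
            int j = i - 1;
            while (j >= 0 && aData[j] > aData[i]) {
                aData[j + 1] = aData[j]; // on the first pass j + 1 == i, so aData[i] is lost here
                j--;
            }
            aData[j + 1] = aData[i];
        }
        return aData;
    }

    // Passing variant: the value being inserted is saved before any shifting happens.
    static long[] sortWithSavedValue(long[] aData) {
        for (int i = 1; i < aData.length; i++) {
            long n = aData[i];
            int j = i - 1;
            while (j >= 0 && aData[j] > n) {
                aData[j + 1] = aData[j];
                j--;
            }
            aData[j + 1] = n;
        }
        return aData;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(sortRereadingSlot(new long[] {3, 1, 2})));  // [3, 3, 3]
        System.out.println(Arrays.toString(sortWithSavedValue(new long[] {3, 1, 2}))); // [1, 2, 3]
    }
}
```

With the input {3, 1, 2}, the first shift copies 3 over the 1 before the 1 has been saved anywhere, so the failing variant can only ever write 3s back.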
I want to find all the pairs of numbers from an array whose sum is equal to 10, and am trying to improve upon this bit of code here:
for (int j = 0; j < arrayOfIntegers.length - 1; j++)
{
for (int k = j + 1; k < arrayOfIntegers.length; k++)
{
int sum = arrayOfIntegers[j] + arrayOfIntegers[k];
if (sum == 10)
return j + "," + k;
}
}
However, I'm having trouble moving through the array. Here's what I have so far:
int[] arrayOfIntegers = {0, 5, 4, 6, 3, 7, 2, 10};
Arrays.sort(arrayOfIntegers);
System.out.println(Arrays.toString(arrayOfIntegers));
int left = arrayOfIntegers[0];
int right = (arrayOfIntegers[arrayOfIntegers.length - 1]);
while (left < right)
{
int sum = left + right;
if (sum == 10) //check to see if equal to 10
{
System.out.println(left + "," + right);
}
if (sum > 10) // if sum is more than 10, move to lesser number
{
right --;
}
if (sum < 10) // if sum is less than 10, move to greater number
{
left++;
}
} // end of while
Try this code: pass in the target sum and the array, and it finds the pairs of elements that add up to that sum using a single for loop.
private void pairofArrayElementsEqualstoGivenSum(int sum,Integer[] arr){
List<Integer> numList = Arrays.asList(arr);
for (int i = 0; i < arr.length; i++) {
int num = sum - arr[i];
if (numList.contains(num)) {
System.out.println("" + arr[i] + " " + num + " = "+sum);
}
}
}
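Two things are worth knowing about the snippet above: `List.contains` is itself a linear scan, and on the question's array the loop reports each pair twice and also pairs the single 5 with itself. A hash-set variant avoids both, at the cost of extra memory. This is only a sketch; `PairSum` and `pairsWithSum` are names invented here, and it assumes `target - v` does not overflow.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class PairSum {
    // Returns each unordered pair of values summing to target once, as {smaller, larger}.
    static List<int[]> pairsWithSum(int[] values, int target) {
        Set<Integer> seen = new HashSet<>();     // values visited so far
        Set<Integer> reported = new HashSet<>(); // smaller member of each pair already emitted
        List<int[]> pairs = new ArrayList<>();
        for (int v : values) {
            int need = target - v;
            if (seen.contains(need)) {
                int smaller = Math.min(v, need);
                if (reported.add(smaller)) {
                    pairs.add(new int[] {smaller, target - smaller});
                }
            }
            seen.add(v); // added after the lookup, so a value never pairs with itself
        }
        return pairs;
    }
}
```

For the array in the question and a target of 10, this yields the pairs (4, 6), (3, 7), and (0, 10) in that order. A value such as 5 is paired with itself only if it occurs twice in the input.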
You need to capture the values as well as the indexes:
int[] arrayOfIntegers = {0, 5, 4, 6, 3, 7, 2, 10};
Arrays.sort(arrayOfIntegers);
System.out.println(Arrays.toString(arrayOfIntegers));
int left = 0;
int right = arrayOfIntegers.length - 1;
while (left < right)
{
int leftVal = arrayOfIntegers[left];
int rightVal = (arrayOfIntegers[right]);
int sum = leftVal + rightVal;
if (sum == 10) //check to see if equal to 10
{
System.out.println(arrayOfIntegers[left] + "," + arrayOfIntegers[right]);
right --;
left++;
}
if (sum > 10) // if sum is more than 10, move to lesser number
{
right --;
}
if (sum < 10) // if sum is less than 10, move to greater number
{
left++;
}
} // end of while
output:
[0, 2, 3, 4, 5, 6, 7, 10]
0,10
3,7
4,6
This is sample code in JavaScript. Someone may find it useful.
var arr = [0, 5, 4, 6, 3, 7, 2, 10]
var arr1 = arr;
for(var a=0; a<arr.length;a++){
for(var b=0; b<arr.length; b++){
if(arr[a]+arr[b]===10 && a!==b){
console.log(arr[a]+" + "+arr[b])
arr.splice(a,1);
}
}
}
Java - Using single loop
public static void findElements() {
List<Integer> list = List.of(0, 5, 4, 6, 3, 7, 2, 10);
for (int i = 0; i < list.size(); i++) {
int sum = 0;
if (i < list.size() - 1) {
sum = list.get(i) + list.get(i + 1);
if (sum == 10) {
System.out.println("Element: " + list.get(i) + "," + list.get(i + 1));
}
} else {
if (list.get(i) == 10) {
System.out.println("Element: " + list.get(i));
}
}
}
}
I'm trying to make the merge sort to use only (n/2 + 1) extra space and still O(n log n) time. This is my homework.
The original question:
Write the non-recursive version of merge sort. Your program should run
in O(n log n) time and use n/2 + O(1) extra spaces.
The program will split an array into two, like a normal merge sort. The left part will be in another array, which is ceil(n/2) long, so it will fit the requirement.
The right part will stay in the original array, so the sort is half in-place.
Sorry, I don't know how to explain further.
I think this is basically correct, but I keep getting an out-of-bounds error.
I know the code is quite long and messy, but can anyone help me with that?
I spent about 5 hours implementing this. Please help me.
package comp2011.lec6;
import java.util.Arrays;
public class MergeSort {
public static void printArr(int[] arr){
for(int i = 0; i < arr.length; i++){
System.out.printf("%d ", arr[i]);
}
}
public static void mergeSort(int[] arr){
if(arr.length<2) {
return;
}
int n, lBegin, rBegin;
n = 1;
int[] leftArr = new int[arr.length - (arr.length/2)];
while(n<arr.length) {
lBegin = 0;
rBegin = n;
while(rBegin + n <= arr.length) {
mergeArrays(arr, lBegin, lBegin+n, rBegin, rBegin+n, leftArr);
lBegin = rBegin+n;
rBegin = lBegin+n;
}
if(rBegin < arr.length) {
mergeArrays(arr, lBegin, lBegin+n, rBegin, arr.length, leftArr);
}
n = n*2;
}
}
public static void mergeArrays(int[] array, int startL, int stopL, int startR, int stopR, int[] leftArr) {
// int[] right = new int[stopR - startR + 1];
// int[] left = new int[stopL - startL + 1];
// for(int i = 0, k = startR; i < (right.length - 1); ++i, ++k) {
// right[i] = array[k];
// }
System.out.println("==============");
System.out.println("stopL: " + stopL +" startL: " + startL);
for(int i = 0, k = startL; i <= (stopL - startL); ++i, ++k) {
System.out.println(leftArr[i]);
leftArr[i] = array[k];
}
// right[right.length-1] = Integer.MAX_VALUE;
leftArr[stopL - startL] = Integer.MAX_VALUE;
System.out.println("leftArr: " + Arrays.toString(leftArr));
System.out.println("RightArr: " + Arrays.toString(Arrays.copyOfRange(array, startR, stopR)));
System.out.println("before: " + Arrays.toString(array));
// for(int k = startL, m = 0, n = startR; k < stopR; ++k) {
System.out.println("StartL: " + startL + " StartR: " + stopR);
for(int k = startL, m = 0, n = startR; ( (k < stopR) ); ++k) {
System.out.println("k: " + k);
System.out.println("Left: " + leftArr[m]);
System.out.println("Right: " + array[n]);
System.out.println("Array[k] before: " + array[k]);
// if(leftArr[m] == Integer.MAX_VALUE){
// System.out.println("YES");
// }
if( (leftArr[m] <= array[n]) || (n >= stopR) ) {
System.out.println("Left is smaller than right");
array[k] = leftArr[m];
m++;
}
else {
System.out.println("Right is smaller than left");
array[k] = array[n];
System.out.println("right: " + array[k]);
n++;
}
System.out.println("Array[k] after: " + array[k]+"\n");
}
System.out.println("after " + Arrays.toString(array));
}
public static void main(String[] args) {
int[] array = new int[] { 5, 2, 4, 12, 2, 10, 13, 1, 7 };
mergeSort(array);
printArr(array);
}
}
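One possible source of the out-of-bounds error, offered as an observation rather than a confirmed diagnosis: in a bottom-up pass the left run can be longer than half the array (with 9 elements, the last merge has a left run of 8), so it cannot fit in a buffer of ceil(n/2) ints, and the sentinel needs one more slot on top of that. A sketch that sidesteps both problems copies whichever run is shorter into the buffer and merges from the matching end, with no sentinel. The class name `HalfBufferMergeSort` is invented here, and the code ignores integer overflow for arrays near the maximum size.

```java
class HalfBufferMergeSort {
    // Bottom-up merge sort using a scratch buffer of n/2 ints.
    static void sort(int[] a) {
        int n = a.length;
        int[] buf = new int[n / 2];
        for (int width = 1; width < n; width *= 2) {
            for (int lo = 0; lo + width < n; lo += 2 * width) {
                int mid = lo + width;
                int hi = Math.min(lo + 2 * width, n);
                merge(a, lo, mid, hi, buf);
            }
        }
    }

    // Merges a[lo, mid) and a[mid, hi); only the shorter run is copied out.
    private static void merge(int[] a, int lo, int mid, int hi, int[] buf) {
        int leftLen = mid - lo;
        int rightLen = hi - mid;
        if (leftLen <= rightLen) {
            // Left run is not longer: copy it out and merge forward.
            System.arraycopy(a, lo, buf, 0, leftLen);
            int i = 0, j = mid, k = lo;
            while (i < leftLen && j < hi) {
                a[k++] = (buf[i] <= a[j]) ? buf[i++] : a[j++];
            }
            while (i < leftLen) {
                a[k++] = buf[i++];
            }
            // Any remaining right-run elements are already in their final place.
        } else {
            // Right run is shorter: copy it out and merge backward from the end.
            System.arraycopy(a, mid, buf, 0, rightLen);
            int i = mid - 1, j = rightLen - 1, k = hi - 1;
            while (i >= lo && j >= 0) {
                a[k--] = (a[i] > buf[j]) ? a[i--] : buf[j--];
            }
            while (j >= 0) {
                a[k--] = buf[j--];
            }
            // Any remaining left-run elements are already in their final place.
        }
    }
}
```

The shorter of two adjacent runs never exceeds half the array, so the copy always fits in the n/2 buffer. The index comparisons `j < hi` and `j >= 0` replace the `Integer.MAX_VALUE` sentinel, which also removes the need for the extra slot.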
We know that all primes above 3 can be generated using:
6 * k + 1
6 * k - 1
However, not all numbers generated from the above formulas are prime.
For Example:
6 * 6 - 1 = 35 which is clearly divisible by 5.
To eliminate such cases, I used a sieve method, removing the numbers that are multiples of the numbers generated from the above formula.
Using the facts:
A number is prime if it has no prime factors other than itself.
We can generate all the prime candidates using the above formulas.
If we remove all the multiples of those numbers, we are left with only prime numbers.
To generate primes below 1000.
ArrayList<Integer> primes = new ArrayList<>();
primes.add(2);//explicitly add
primes.add(3);//2 and 3
int n = 1000;
for (int i = 1; i <= (n / 6) ; i++) {
//get all the numbers which can be generated by the formula
int prod6k = 6 * i;
primes.add(prod6k - 1);
primes.add(prod6k + 1);
}
for (int i = 0; i < primes.size(); i++) {
int k = primes.get(i);
//remove all the factors of the numbers generated by the formula
for(int j = k * k; j <= n; j += k)//changed to k * k from 2 * k, Thanks to DTing
{
int index = primes.indexOf(j);
if(index != -1)
primes.remove(index);
}
}
System.out.println(primes);
This method does generate the prime numbers correctly. It runs much faster because we need not check all the numbers that a plain sieve checks.
My question is: am I missing any edge case? This seems a lot better, but I have never seen anyone use it. Am I doing something wrong?
Can this approach be much more optimized?
Taking a boolean[] instead of an ArrayList is much faster.
int n = 100000000;
boolean[] primes = new boolean[n + 1];
for (int i = 0; i <= n; i++)
primes[i] = false;
primes[2] = primes[3] = true;
for (int i = 1; i <= n / 6; i++) {
int prod6k = 6 * i;
primes[prod6k + 1] = true;
primes[prod6k - 1] = true;
}
for (int i = 0; i <= n; i++) {
if (primes[i]) {
int k = i;
for (int j = k * k; j <= n && j > 0; j += k) {
primes[j] = false;
}
}
}
for (int i = 0; i <= n; i++)
if (primes[i])
System.out.print(i + " ");
5 is the first number generated by your criteria. Let's take a look at the numbers generated up to 25:
5, 7, 11, 13, 17, 19, 23, 25
Now, let's look at these same numbers, when we use the Sieve of Eratosthenes algorithm:
5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25
After removing the multiples of 2:
5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25
After removing the multiples of 3:
5, 7, 11, 13, 17, 19, 23, 25
This is the same as the first set! Notice they both include 25, which is not prime. If we think about it, this is an obvious result. Consider any group of 6 consecutive numbers:
6k - 3, 6k - 2, 6k - 1, 6k, 6k + 1, 6k + 2
If we factor a little, we get:
3*(2k - 1), 2*(3k - 1), 6k - 1, 6*(k), 6k + 1, 2*(3k + 1)
In any group of 6 consecutive numbers, three of them will be divisible by two, and two of them will be divisible by three. These are exactly the numbers we have removed so far! Therefore:
Your algorithm to only use 6k - 1 and 6k + 1 is exactly the same as the first two rounds of the Sieve of Eratosthenes.
It's a pretty nice speed improvement over the Sieve, too, because we don't have to add all those extra elements just to remove them. This explains why your algorithm works and why it doesn't miss any cases; because it's exactly the same as the Sieve.
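The equivalence is easy to check mechanically. The sketch below (class and method names are invented for the illustration) builds the candidate list both ways so the two can be compared for any bound.

```java
import java.util.ArrayList;
import java.util.List;

class WheelEquivalence {
    // Candidates 6k - 1 and 6k + 1 that do not exceed n, in increasing order.
    static List<Integer> fromFormula(int n) {
        List<Integer> out = new ArrayList<>();
        for (int m = 6; m - 1 <= n; m += 6) {
            out.add(m - 1);
            if (m + 1 <= n) {
                out.add(m + 1);
            }
        }
        return out;
    }

    // Numbers from 5 to n that survive striking out the multiples of 2 and of 3.
    static List<Integer> afterSievingTwoAndThree(int n) {
        List<Integer> out = new ArrayList<>();
        for (int v = 5; v <= n; v++) {
            if (v % 2 != 0 && v % 3 != 0) {
                out.add(v);
            }
        }
        return out;
    }
}
```

For n = 25 both methods return 5, 7, 11, 13, 17, 19, 23, 25, including the composite 25, which still has to be removed by sieving with 5.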
Anyway, I agree that once you've generated primes, your boolean way is by far the fastest. I have set up a benchmark using your ArrayList way, your boolean[] way, and my own way using LinkedList and iterator.remove() (because removals are fast in a LinkedList). Here's the code for my test harness. Note that I run the test 12 times to ensure that the JVM is warmed up, and I print the size of the list and change the size of n to attempt to prevent too much branch prediction optimization. You can also get faster in all three methods by using += 6 in the initial seed, instead of prod6k:
import java.util.*;
public class PrimeGenerator {
public static List<Integer> generatePrimesArrayList(int n) {
List<Integer> primes = new ArrayList<>(getApproximateSize(n));
primes.add(2);// explicitly add
primes.add(3);// 2 and 3
for (int i = 6; i <= n; i+=6) {
// get all the numbers which can be generated by the formula
primes.add(i - 1);
primes.add(i + 1);
}
for (int i = 0; i < primes.size(); i++) {
int k = primes.get(i);
// remove all the factors of the numbers generated by the formula
for (int j = k * k; j <= n; j += k)// changed to k * k from 2 * k, Thanks
// to DTing
{
int index = primes.indexOf(j);
if (index != -1)
primes.remove(index);
}
}
return primes;
}
public static List<Integer> generatePrimesBoolean(int n) {
boolean[] primes = new boolean[n + 5];
for (int i = 0; i <= n; i++)
primes[i] = false;
primes[2] = primes[3] = true;
for (int i = 6; i <= n; i+=6) {
primes[i + 1] = true;
primes[i - 1] = true;
}
for (int i = 0; i <= n; i++) {
if (primes[i]) {
int k = i;
for (int j = k * k; j <= n && j > 0; j += k) {
primes[j] = false;
}
}
}
int approximateSize = getApproximateSize(n);
List<Integer> primesList = new ArrayList<>(approximateSize);
for (int i = 0; i <= n; i++)
if (primes[i])
primesList.add(i);
return primesList;
}
private static int getApproximateSize(int n) {
// Prime Number Theorem. Round up
int approximateSize = (int) Math.ceil(((double) n) / (Math.log(n)));
return approximateSize;
}
public static List<Integer> generatePrimesLinkedList(int n) {
List<Integer> primes = new LinkedList<>();
primes.add(2);// explicitly add
primes.add(3);// 2 and 3
for (int i = 6; i <= n; i+=6) {
// get all the numbers which can be generated by the formula
primes.add(i - 1);
primes.add(i + 1);
}
for (int i = 0; i < primes.size(); i++) {
int k = primes.get(i);
for (Iterator<Integer> iterator = primes.iterator(); iterator.hasNext();) {
int primeCandidate = iterator.next();
if (primeCandidate == k)
continue; // Always skip yourself
if (primeCandidate == (primeCandidate / k) * k)
iterator.remove();
}
}
return primes;
}
public static void main(String... args) {
int initial = 4000;
for (int i = 0; i < 12; i++) {
int n = initial * i;
long start = System.currentTimeMillis();
List<Integer> result = generatePrimesArrayList(n);
long seconds = System.currentTimeMillis() - start;
System.out.println(result.size() + "\tArrayList Seconds: " + seconds);
start = System.currentTimeMillis();
result = generatePrimesBoolean(n);
seconds = System.currentTimeMillis() - start;
System.out.println(result.size() + "\tBoolean Seconds: " + seconds);
start = System.currentTimeMillis();
result = generatePrimesLinkedList(n);
seconds = System.currentTimeMillis() - start;
System.out.println(result.size() + "\tLinkedList Seconds: " + seconds);
}
}
}
And the results of the last few trials:
3432 ArrayList Seconds: 430
3432 Boolean Seconds: 0
3432 LinkedList Seconds: 90
3825 ArrayList Seconds: 538
3824 Boolean Seconds: 0
3824 LinkedList Seconds: 81
4203 ArrayList Seconds: 681
4203 Boolean Seconds: 0
4203 LinkedList Seconds: 100
4579 ArrayList Seconds: 840
4579 Boolean Seconds: 0
4579 LinkedList Seconds: 111
You don't need to add all possible candidates to the array. You can create a Set to store all non primes.
Also you can start checking at k * k, rather than 2 * k
public void primesTo1000() {
Set<Integer> notPrimes = new HashSet<>();
ArrayList<Integer> primes = new ArrayList<>();
primes.add(2);//explicitly add
primes.add(3);//2 and 3
for (int i = 1; i < (1000 / 6); i++) {
handlePossiblePrime(6 * i - 1, primes, notPrimes);
handlePossiblePrime(6 * i + 1, primes, notPrimes);
}
System.out.println(primes);
}
public void handlePossiblePrime(
int k, List<Integer> primes, Set<Integer> notPrimes) {
if (!notPrimes.contains(k)) {
primes.add(k);
for (int j = k * k; j <= 1000; j += k) {
notPrimes.add(j);
}
}
}
untested code, check corners
Here is a bit packing version of the sieve as suggested in the answer referenced by @Will Ness. Rather than return the nth prime, this version returns a list of primes to n:
public List<Integer> primesTo(int n) {
List<Integer> primes = new ArrayList<>();
if (n > 1) {
int limit = (n - 3) >> 1;
int[] sieve = new int[(limit >> 5) + 1];
for (int i = 0; i <= (int) (Math.sqrt(n) - 3) >> 1; i++)
if ((sieve[i >> 5] & (1 << (i & 31))) == 0) {
int p = i + i + 3;
for (int j = (p * p - 3) >> 1; j <= limit; j += p)
sieve[j >> 5] |= 1 << (j & 31);
}
primes.add(2);
for (int i = 0; i <= limit; i++)
if ((sieve[i >> 5] & (1 << (i & 31))) == 0)
primes.add(i + i + 3);
}
return primes;
}
There seems to be a bug in your updated code that uses a boolean array (it is not returning all the primes).
public static List<Integer> booleanSieve(int n) {
boolean[] primes = new boolean[n + 5];
for (int i = 0; i <= n; i++)
primes[i] = false;
primes[2] = primes[3] = true;
for (int i = 1; i <= n / 6; i++) {
int prod6k = 6 * i;
primes[prod6k + 1] = true;
primes[prod6k - 1] = true;
}
for (int i = 0; i <= n; i++) {
if (primes[i]) {
int k = i;
for (int j = k * k; j <= n && j > 0; j += k) {
primes[j] = false;
}
}
}
List<Integer> primesList = new ArrayList<>();
for (int i = 0; i <= n; i++)
if (primes[i])
primesList.add(i);
return primesList;
}
public static List<Integer> bitPacking(int n) {
List<Integer> primes = new ArrayList<>();
if (n > 1) {
int limit = (n - 3) >> 1;
int[] sieve = new int[(limit >> 5) + 1];
for (int i = 0; i <= (int) (Math.sqrt(n) - 3) >> 1; i++)
if ((sieve[i >> 5] & (1 << (i & 31))) == 0) {
int p = i + i + 3;
for (int j = (p * p - 3) >> 1; j <= limit; j += p)
sieve[j >> 5] |= 1 << (j & 31);
}
primes.add(2);
for (int i = 0; i <= limit; i++)
if ((sieve[i >> 5] & (1 << (i & 31))) == 0)
primes.add(i + i + 3);
}
return primes;
}
public static void main(String... args) {
Executor executor = Executors.newSingleThreadExecutor();
executor.execute(() -> {
for (int i = 0; i < 10; i++) {
int n = (int) Math.pow(10, i);
Stopwatch timer = Stopwatch.createUnstarted();
timer.start();
List<Integer> result = booleanSieve(n);
timer.stop();
System.out.println(result.size() + "\tBoolean: " + timer);
}
for (int i = 0; i < 10; i++) {
int n = (int) Math.pow(10, i);
Stopwatch timer = Stopwatch.createUnstarted();
timer.start();
List<Integer> result = bitPacking(n);
timer.stop();
System.out.println(result.size() + "\tBitPacking: " + timer);
}
});
}
0 Boolean: 38.51 μs
4 Boolean: 45.77 μs
25 Boolean: 31.56 μs
168 Boolean: 227.1 μs
1229 Boolean: 1.395 ms
9592 Boolean: 4.289 ms
78491 Boolean: 25.96 ms
664116 Boolean: 133.5 ms
5717622 Boolean: 3.216 s
46707218 Boolean: 32.18 s
0 BitPacking: 117.0 μs
4 BitPacking: 11.25 μs
25 BitPacking: 11.53 μs
168 BitPacking: 70.03 μs
1229 BitPacking: 471.8 μs
9592 BitPacking: 3.701 ms
78498 BitPacking: 9.651 ms
664579 BitPacking: 43.43 ms
5761455 BitPacking: 1.483 s
50847534 BitPacking: 17.71 s
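A plausible explanation for the undercount, stated here as an assumption since the thread does not spell it out: in `booleanSieve` the outer loop runs all the way to n, so for i above 46340 the product `k * k` overflows `int`. The `j > 0` guard only catches products that wrap to a negative number. A product that wraps to a small positive value passes the guard and clears entries that may be prime. The sketch below bounds the outer loop with a `long` comparison so the product cannot overflow. The class name is invented, and the code assumes n is non-negative and well below `Integer.MAX_VALUE`.

```java
class SafeBooleanSieve {
    // Returns a table where isPrime[v] is true exactly for the primes v <= n.
    static boolean[] sieve(int n) {
        boolean[] isPrime = new boolean[n + 1];
        if (n >= 2) isPrime[2] = true;
        if (n >= 3) isPrime[3] = true;
        // Seed the candidates 6k - 1 and 6k + 1.
        for (int m = 6; m - 1 <= n; m += 6) {
            isPrime[m - 1] = true;
            if (m + 1 <= n) {
                isPrime[m + 1] = true;
            }
        }
        // Stop at sqrt(n); the long comparison keeps i * i from overflowing.
        for (int i = 5; (long) i * i <= n; i++) {
            if (!isPrime[i]) {
                continue;
            }
            for (int j = i * i; j <= n; j += i) {
                isPrime[j] = false;
            }
        }
        return isPrime;
    }

    // Counts the primes up to and including n.
    static int count(int n) {
        int total = 0;
        for (boolean flag : sieve(n)) {
            if (flag) {
                total++;
            }
        }
        return total;
    }
}
```

With this bound, `count(1000000)` gives 78498, matching the bit-packing result in the table above rather than the boolean one.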
There are several things that could be optimized.
For starters, the "contains" and "removeAll" operations on an ArrayList are rather expensive (linear for the former, worst case quadratic for the latter), so you might not want to use an ArrayList for this. A HashSet or TreeSet has better complexity for these operations: nearly constant for the hash set (hashing complexities are weird) and logarithmic for the tree set, I think.
You could look into the Sieve of Eratosthenes if you want a more efficient sieve altogether, but that would be beside the point of your question about the 6k +/- 1 trick. It is slightly, but not noticeably, more memory-expensive than your solution, but way faster.
Can this approach be much more optimized?
The answer is yes.
I'll start by saying that it is a good idea to use the sieve on a subset of numbers within a certain range, and your suggestion does exactly that.
Reading about generating Primes:
...Furthermore, based on the sieve formalisms, some integer sequences
(sequence A240673 in OEIS) are constructed which they also could be used for generating primes in certain intervals.
The meaning of this paragraph is that your approach of starting with a reduced list of integers has indeed been adopted in academia, but their techniques are more efficient (and also, naturally, more complex).
You can generate your trial numbers with a wheel, adding 2 and 4 alternately, that eliminates the multiplication in 6 * k +/- 1.
public void primesTo1000() {
Set<Integer> notPrimes = new HashSet<>();
ArrayList<Integer> primes = new ArrayList<>();
primes.add(2); //explicitly add
primes.add(3); //2 and 3
int step = 2;
int num = 5; // 2 and 3 already handled.
while (num < 1000) {
handlePossiblePrime(num, primes, notPrimes);
num += step; // Step to next number.
step = 6 - step; // Step by 2, 4 alternately.
}
System.out.println(primes);
}
Probably the most suitable standard data structure for the Sieve of Eratosthenes is the BitSet. Here's my solution:
static BitSet genPrimes(int n) {
BitSet primes = new BitSet(n);
primes.set(2); // add 2 explicitly
primes.set(3); // add 3 explicitly
for (int i = 6; i <= n ; i += 6) { // step by 6 instead of multiplication
primes.set(i - 1);
primes.set(i + 1);
}
int max = (int) Math.sqrt(n); // don't need to filter multiples of primes bigger than max
// this for loop enumerates all set bits starting from 5 till the max
// sieving 2 and 3 is meaningless: n*6+1 and n*6-1 are never divisible by 2 or 3
for (int i = primes.nextSetBit(5); i >= 0 && i <= max; i = primes.nextSetBit(i+1)) {
// The actual sieve algorithm like in your code
for(int j = i * i; j <= n; j += i)
primes.clear(j);
}
return primes;
}
Usage:
BitSet primes = genPrimes(1000); // generate primes up to 1000
System.out.println(primes.cardinality()); // print number of primes
// print all primes like {2, 3, 5, ...}
System.out.println(primes);
// print all primes one per line
for(int prime = primes.nextSetBit(0); prime >= 0; prime = primes.nextSetBit(prime+1))
System.out.println(prime);
// print all primes one per line using java 8:
primes.stream().forEach(System.out::println);
The boolean-based version may be faster for small n values, but if you need, for example, a million prime numbers, BitSet will outperform it several times over and actually works correctly. Here's a lame benchmark:
public static void main(String... args) {
long start = System.nanoTime();
BitSet res = genPrimes(10000000);
long diff = System.nanoTime() - start;
System.out.println(res.cardinality() + "\tBitSet Seconds: " + diff / 1e9);
start = System.nanoTime();
List<Integer> result = generatePrimesBoolean(10000000); // from durron597 answer
diff = System.nanoTime() - start;
System.out.println(result.size() + "\tBoolean Seconds: " + diff / 1e9);
}
Output:
664579 BitSet Seconds: 0.065987717
664116 Boolean Seconds: 0.167620323
664579 is the correct number of primes below 10000000.
The method below shows how to find prime numbers using the 6k +/- 1 logic.
It was written in Python 3.6.
def isPrime(n):
    if(n<=1):
        return 0
    elif(n<4): #2, 3 are prime
        return 1
    elif(n%2==0): #2 is already handled, so any other no. div. by 2 can't be prime
        return 0
    elif(n<9): #5, 7 are prime and 6, 8 are excl. in the above step
        return 1
    elif(n%3==0): #9, 15, 21, ... are div. by 3, so not prime
        return 0
    f=5 #div. by 2 and 3 (and so by 4, 6, 8) is already checked, that is why f=5
    r=int(n**.5) #floor(sqrt(n)), i.e. r*r<=n
    while(f<=r):
        if(n%f==0): #checking if n has any factor of the form 6k-1 up to sqrt(n)
            return 0
        if(n%(f+2)==0): #f+2 is of the form 6k+1; f itself is not incremented here
            return 0
        f=f+6
    return 1

def prime_nos():
    counter=2 #we know 2,3 are prime
    print(2)
    print(3)
    i=1
    n=int(input("Enter the upper limit (should be > 3): "))
    while(6*i-1<=n): #candidates are 6*i-1 and 6*i+1
        if(isPrime(6*i-1)):
            counter=counter+1
            print(6*i-1) #prime no
        if(6*i+1<=n and isPrime(6*i+1)):
            counter=counter+1
            print(6*i+1) #prime no
        i+=1

prime_nos() #fn. call
Your prime number formula is mathematically incorrect. For example, take 96: it is divisible by 6 (96/6 = 16), so by this logic 97 and 95 must be prime if they pass the square root test. The square root of 95 is 9.7467..., so it passes and is reported as "prime", but 95 is clearly divisible by 5. Here is a fast algorithm in C#:
int n=100000000;
bool [] falseprimes = new bool[n + 2];
int ed=n/6;
ed = ed * 6;
int md = (int)Math.Sqrt((double)ed);
for (int i = ed; i > md; i-=6)
{
falseprimes[i + 1] = true;
falseprimes[i - 1] = true;
}
md = md / 6;
md = md * 5;
for (int i = md; i > 5; i -= 6)
{
falseprimes[i + 1] = true;
falseprimes[i - 1] = true;
falseprimes[(i + 1)* (i + 1)] = false;
falseprimes[(i-1) * (i-1)] = false;
}
falseprimes[2] = true;
falseprimes[3] = true;
To generate prime numbers using the 6 * k +/- 1 rule, use this algorithm:
int n = 100000000;
int j,jmax=n/6;
boolean[] primes5m6 = new boolean[jmax+1];
boolean[] primes1m6 = new boolean[jmax+1];
for (int i = 0; i <= jmax; i++){
primes5m6[i] = false;
primes1m6[i] = false;
}
for (int i = 1; i <= (int)((Math.sqrt(n)+1)/6)+1; i++){
if (!primes5m6[i]){
for (j = 6*i*i; j <= jmax; j+=6*i-1){
primes5m6[j]=true;
primes1m6[j-2*i]=true;
}
for (; j <= jmax+2*i; j+=6*i-1)
primes1m6[j-2*i]=true;
}
if (!primes1m6[i]){
for (j = 6*i*i; j <= jmax-2*i; j+=6*i+1){
primes5m6[j]=true;
primes1m6[j+2*i]=true;
}
for (; j <= jmax; j+=6*i+1)
primes5m6[j]=true;
}
}
System.out.print(2 + " ");
System.out.print(3 + " ");
for (int i = 1; i <= jmax; i++){
if (!primes5m6[i])
System.out.print((6*i-1) + " ");
if (!primes1m6[i])
System.out.print((6*i+1) + " ");
}