I've been working on a 3-way merge sort algorithm, and the professor wants me to average how much time it takes to merge sort an array of 1000, 10000, 100000, 1000000 and 10000000 randomized integers over 10 trials each. However, I've run into a problem while measuring the run time: for example, for 1 million elements the program reports around 300 ms, while in reality it takes around 40 seconds to finish. Below is the code of my main function. I also tried putting startTime at the very start, and it displayed the same running time.
public static void main(String args[]) {
Random r = new Random();
int[] arr = new int[1000000];
for (int i = 0; i < arr.length; i++)
arr[i] = r.nextInt();
long startTime = System.currentTimeMillis();
MergeSort test = new MergeSort();
test.sort(arr, 0, arr.length);
long endTime = System.currentTimeMillis();
long timeElapsed = endTime - startTime;
print(arr);
System.out.println("Execution time in milliseconds: " + timeElapsed);
}
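One thing worth checking in the code above: the timer brackets only the call to sort, so the time spent in print(arr) on a million elements is never counted, and that may account for much of the gap between the reported 300 ms and the wait you observe. Below is a minimal sketch of averaging over 10 trials. It uses java.util.Arrays.sort as a stand-in because the MergeSort class is not shown; TrialTimer and averageMillis are hypothetical names, and the seed parameter is only there to make runs repeatable.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.function.Consumer;

final class TrialTimer {
    // Returns the mean wall-clock time in milliseconds over `trials` runs.
    // A fresh random array is built for every trial, outside the timed region.
    static double averageMillis(int size, int trials, Consumer<int[]> sorter, long seed) {
        Random r = new Random(seed);
        long totalNanos = 0;
        for (int t = 0; t < trials; t++) {
            int[] arr = new int[size];
            for (int i = 0; i < size; i++) {
                arr[i] = r.nextInt();
            }
            long start = System.nanoTime();
            sorter.accept(arr);              // only the sort itself is timed
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / 1e6 / trials;
    }

    public static void main(String[] args) {
        // Add 10_000_000 here if memory and patience allow.
        int[] sizes = {1_000, 10_000, 100_000, 1_000_000};
        for (int size : sizes) {
            // Stand-in sorter (assumption); swap in something like
            // a -> new MergeSort().sort(a, 0, a.length) for the real measurement.
            double ms = averageMillis(size, 10, Arrays::sort, 1L);
            System.out.printf("n=%d: %.3f ms on average%n", size, ms);
        }
    }
}
```

Nothing is printed inside the timed region, so the reported figure covers the sort alone.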
The average-case time complexity of quicksort is O(n log n), so I expected the running time to increase steadily as n grows.
But in my measurements the time goes up, then down, then up again.
Why does it decrease for a larger value of n?
I am generating the random inputs with java.util.Random.
How can I fix this?
public class Qsort {
static Scanner sc = new Scanner(System.in);
public static void main(String args[]) throws IOException {
File file = new File("Qtest.txt");
PrintWriter output = new PrintWriter(file);
int n,i;
System.out.println("Enter the times of iteration");
n = sc.nextInt();
for(int j=1000;j<=n*1000;j=j+1000)
{
Random r = new Random();
int array[];
array = new int[j];
for(i=0;i<j;i++)
{
array[i] = r.nextInt(2000);
}
long startTime = System.nanoTime();
Qsort(array,0,j-1);
long endTime = System.nanoTime();
output.println(j + " " + (endTime-startTime));
System.out.println("After sorting the time elapsed is " +(endTime-startTime));
}
output.close();
}
The sort function for Quick Sort
public static void Qsort(int A[],int start,int end)
{
if(start>=end) return;
int p = partition(A,start,end);
Qsort(A,start,p-1);
Qsort(A,p+1,end);
}
public static int partition(int A[],int start,int end)
{
int pivot = A[end];
int p = start;
for(int i =start;i<end;i++)
{
if(A[i]<=pivot)
{
int temp = A[i];
A[i] = A[p];
A[p] = temp;
p++;
}
}
int temp = A[p];
A[p] = A[end];
A[end] = temp;
return p;
}
}
The output that I am getting:
Enter the times of iteration
10
For 1000 inputs : After sorting the time elapsed is 779133
For 2000 inputs : After sorting the time elapsed is 8350639
For 3000 inputs : After sorting the time elapsed is 607856
For 4000 inputs : After sorting the time elapsed is 833593
For 5000 inputs : After sorting the time elapsed is 1042426
For 6000 inputs : After sorting the time elapsed is 1283195
For 7000 inputs : After sorting the time elapsed is 1497488
For 8000 inputs : After sorting the time elapsed is 1261182
For 9000 inputs : After sorting the time elapsed is 1207128
For 10000 inputs : After sorting the time elapsed is 1427456
There are two things conflated in your question: measuring algorithmic complexity and microbenchmarking.
Algorithmic time complexity works with the number of operations. What you are measuring is the wall time spent in quick sort - which makes it a microbenchmark test.
Microbenchmarking is hard. On the first run you are most probably seeing JVM warm-up time. The second drop might be due to instruction caching or JIT compilation. I would suggest rerunning the test after first doing a warm-up sort (say, on 100000 elements). Other things can also influence the results, such as CPU utilization on your machine and the randomness of your arrays (sometimes a swap happens, sometimes it doesn't). That is why a microbenchmark usually runs each "experiment" multiple times and draws conclusions statistically, using standard error, percentiles, etc., rather than from single experiments.
Bottom line: a warm-up removes some of the JVM-specific noise, and the number of operations is a more precise measure of time complexity than wall time.
The code below is a version of your code that calculates number of operations and does a warmup round:
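import java.io.File;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.Random;
import java.util.Scanner;
public class QSort {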
private static int numOps = 0;
static Scanner sc = new Scanner(System.in);
public static void main(String args[]) throws IOException {
File file = new File("Qtest.txt");
PrintWriter output = new PrintWriter(file);
int n, i;
System.out.println("warming up...");
Qsort(randomInts(100000), 0, 100000 - 1); // warm-up: array size now matches the sorted range
System.out.println("Enter the times of iteration");
n = sc.nextInt();
for (int j = 1000; j <= n * 1000; j = j + 1000) {
int[] array = randomInts(j);
long startTime = System.nanoTime();
numOps = 0;
Qsort(array, 0, j - 1);
long endTime = System.nanoTime();
output.println(j + " " + (endTime - startTime) + " " + numOps);
System.out.println("After sorting the time elapsed is " + (endTime - startTime) + " numOps: " + numOps);
}
output.close();
}
private static int[] randomInts(int j) {
Random r = new Random();
int[] array = new int[j];
for (int i = 0; i < j; i++) {
array[i] = r.nextInt(2000);
}
return array;
}
public static void Qsort(int A[], int start, int end) {
if (start >= end) return;
int p = partition(A, start, end);
Qsort(A, start, p - 1);
Qsort(A, p + 1, end);
}
public static int partition(int A[], int start, int end) {
int pivot = A[end];
int p = start;
for (int i = start; i < end; i++) {
if (A[i] <= pivot) {
int temp = A[i];
A[i] = A[p];
A[p] = temp;
p++;
numOps++;
}
}
int temp = A[p];
A[p] = A[end];
A[end] = temp;
return p;
}
}
The output:
warming up...
Enter the times of iteration
20
After sorting the time elapsed is 94206 numOps: 5191
After sorting the time elapsed is 150524 numOps: 12718
After sorting the time elapsed is 232478 numOps: 20359
After sorting the time elapsed is 314819 numOps: 31098
After sorting the time elapsed is 475933 numOps: 38483
After sorting the time elapsed is 500866 numOps: 55114
After sorting the time elapsed is 614642 numOps: 57251
After sorting the time elapsed is 693324 numOps: 68683
After sorting the time elapsed is 738800 numOps: 83332
After sorting the time elapsed is 798644 numOps: 83057
After sorting the time elapsed is 899891 numOps: 99975
After sorting the time elapsed is 987163 numOps: 113854
After sorting the time elapsed is 1059323 numOps: 124735
After sorting the time elapsed is 1103815 numOps: 143278
After sorting the time elapsed is 1192974 numOps: 164740
After sorting the time elapsed is 1276277 numOps: 166781
After sorting the time elapsed is 1344138 numOps: 180460
After sorting the time elapsed is 1439943 numOps: 204095
After sorting the time elapsed is 1593336 numOps: 209483
After sorting the time elapsed is 1644561 numOps: 225523
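The advice above about running each experiment several times and reading the results statistically could be sketched as follows. This is only an illustration: RepeatedTrials, timeTrials, mean and standardError are hypothetical names, the quicksort is a compact copy of the Lomuto scheme used above, and the trial counts (50 warm-up, 30 measured) are arbitrary assumptions.

```java
import java.util.Random;

final class RepeatedTrials {
    // Lomuto-partition quicksort, same scheme as in the code above.
    static void qsort(int[] a, int start, int end) {
        if (start >= end) return;
        int pivot = a[end];
        int p = start;
        for (int i = start; i < end; i++) {
            if (a[i] <= pivot) {
                int t = a[i]; a[i] = a[p]; a[p] = t;
                p++;
            }
        }
        int t = a[p]; a[p] = a[end]; a[end] = t;
        qsort(a, start, p - 1);
        qsort(a, p + 1, end);
    }

    static double mean(double[] xs) {
        double sum = 0;
        for (double x : xs) sum += x;
        return sum / xs.length;
    }

    // Standard error of the mean: sample standard deviation / sqrt(n).
    static double standardError(double[] xs) {
        double m = mean(xs);
        double sq = 0;
        for (double x : xs) sq += (x - m) * (x - m);
        return Math.sqrt(sq / (xs.length - 1)) / Math.sqrt(xs.length);
    }

    // Times `trials` sorts of fresh random arrays; returns microseconds per trial.
    static double[] timeTrials(int size, int trials, Random r) {
        double[] micros = new double[trials];
        for (int t = 0; t < trials; t++) {
            int[] a = new int[size];
            for (int i = 0; i < size; i++) a[i] = r.nextInt(2000);
            long start = System.nanoTime();
            qsort(a, 0, size - 1);
            micros[t] = (System.nanoTime() - start) / 1e3;
        }
        return micros;
    }

    public static void main(String[] args) {
        Random r = new Random();
        timeTrials(10_000, 50, r); // warm-up, results discarded
        for (int n = 1000; n <= 10_000; n += 1000) {
            double[] us = timeTrials(n, 30, r);
            System.out.printf("%d %.1f us +/- %.1f%n", n, mean(us), standardError(us));
        }
    }
}
```

If two neighbouring sizes differ by less than their standard errors, the apparent "decrease" is probably noise rather than a property of the algorithm.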
I am trying to determine the running times of the bubble sort algorithm on three different kinds of input:
1) randomly selected numbers
2) already sorted numbers
3) sorted in reverse order numbers
My expectation about their running time was:
Reverse ordered numbers would take longer than other two.
Already sorted numbers would have the fastest running time.
Randomly selected numbers would lie between these two.
I've tested the algorithm with inputs containing more than 100,000 numbers. The results weren't what I expected. Already sorted numbers had the fastest running time, but randomly selected numbers took almost twice as long to sort as reverse-ordered numbers. Why is this happening?
Here is how I test the inputs
int[] random = fillRandom();
int[] sorted = fillSorted();
int[] reverse = fillReverse();
int[] temp;
long time, totalTime = 0;
for (int i = 0; i < 100; i++) {
temp = random.clone();
time = System.currentTimeMillis();
BubbleSort.sort(temp);
time = System.currentTimeMillis() - time;
totalTime += time;
}
System.out.println("random - average time: " + totalTime/100.0 + " ms");
totalTime = 0;
for (int i = 0; i < 100; i++) {
temp = sorted.clone();
time = System.currentTimeMillis();
BubbleSort.sort(temp);
time = System.currentTimeMillis() - time;
totalTime += time;
}
System.out.println("sorted - average time: " + totalTime/100.0 + " ms");
totalTime = 0;
for (int i = 0; i < 100; i++) {
temp = reverse.clone();
time = System.currentTimeMillis();
BubbleSort.sort(temp);
time = System.currentTimeMillis() - time;
totalTime += time;
}
System.out.println("reverse - average time: " + totalTime/100.0 + " ms");
Benchmarking Java code is not easy, because the JVM may apply many optimizations to your code at runtime. It can optimize away a loop if the result of the computation is not used, it can inline code, the JIT can compile hot code to native code, and so on. As a result, benchmark output is very unstable.
There are tools like jmh that simplify benchmarking a lot.
I recommend checking this article; it has an example of a benchmark for a sorting algorithm.
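For readers who want to see the pitfalls named above handled by hand, here is a rough plain-Java sketch (not jmh). It does untimed warm-up runs, clones the input outside the timed region, and folds one element of each result into a field so the work is visibly used. BubbleBench, averageMillis and sink are hypothetical names, and the bubble sort is a plain stand-in because the BubbleSort.sort from the question is not shown; a tool like jmh takes care of these details more rigorously.

```java
import java.util.Random;

final class BubbleBench {
    // Plain bubble sort without early exit (assumed stand-in implementation).
    static void bubbleSort(int[] a) {
        for (int pass = 0; pass < a.length; pass++) {
            for (int i = 1; i < a.length - pass; i++) {
                if (a[i - 1] > a[i]) {
                    int t = a[i - 1]; a[i - 1] = a[i]; a[i] = t;
                }
            }
        }
    }

    // Results are folded into this field so the sorted arrays are actually used.
    static long sink;

    // Average milliseconds per sort over `runs` timed runs, after `warmup` untimed runs.
    static double averageMillis(int[] input, int warmup, int runs) {
        for (int i = 0; i < warmup; i++) {
            int[] copy = input.clone();
            bubbleSort(copy);
            sink += copy[0];
        }
        long totalNanos = 0;
        for (int i = 0; i < runs; i++) {
            int[] copy = input.clone();          // cloning stays outside the timed region
            long start = System.nanoTime();
            bubbleSort(copy);
            totalNanos += System.nanoTime() - start;
            sink += copy[copy.length / 2];
        }
        return totalNanos / 1e6 / runs;
    }

    public static void main(String[] args) {
        int n = 2_000;
        int[] random = new Random(1).ints(n, 0, n).toArray();
        int[] sorted = new int[n];
        int[] reverse = new int[n];
        for (int i = 0; i < n; i++) {
            sorted[i] = i;
            reverse[i] = n - i;
        }
        System.out.printf("random  - average time: %.3f ms%n", averageMillis(random, 5, 10));
        System.out.printf("sorted  - average time: %.3f ms%n", averageMillis(sorted, 5, 10));
        System.out.printf("reverse - average time: %.3f ms%n", averageMillis(reverse, 5, 10));
        System.out.println("(sink=" + sink + ")");
    }
}
```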
I am trying to print a long value held by elapsed. Can someone help me with the format for doing so?
The line below prints 0.0,
but I know the value has more significant digits (maybe something like .0005324):
System.out.println("It took " + (double)elapsed + " milliseconds to complete SELECTION_SORT algorithm.");
System.currentTimeMillis();
long start = System.currentTimeMillis();
int sortedArr[] = selectionSort(arr1);
long elapsed = System.currentTimeMillis() - start;
System.out.println("\n///////////SELECTIONSort//////////////");
System.out.println("\nSelection sort implemented below prints a sorted list:");
print(sortedArr);
System.out.printf("It took %.7f ms....", elapsed);
//System.out.println("It took " + (double)elapsed + " milliseconds to complete SELECTION_SORT algorithm.");
private static int[] selectionSort(int[] arr) {
int minIndex, tmp;
int n = arr.length;
for (int i = 0; i < n - 1; i++) {
minIndex = i;
for (int j = i + 1; j < n; j++)
if (arr[j] < arr[minIndex])
minIndex = j;
if (minIndex != i) {
tmp = arr[i];
arr[i] = arr[minIndex];
arr[minIndex] = tmp;
}
}
return arr;
}
Changing the format won't give you more resolution, which is your real problem here. If you print 1 ms with 7 digits you just get 1.0000000 every time, and that doesn't help you at all.
What you need is a high-resolution timer:
long start = System.nanoTime();
int sortedArr[] = selectionSort(arr1);
long elapsed = System.nanoTime() - start;
System.out.println("\n///////////SELECTIONSort//////////////");
System.out.println("\nSelection sort implemented below prints a sorted list:");
print(sortedArr);
System.out.printf("It took %.3f ms....", elapsed / 1e6);
However, if you do this only once you are fooling yourself, because Java compiles code dynamically and gets faster the more you run it. It can get 100x faster or more, making the first number you see pretty useless.
Normally I suggest you run loops many times and ignore the first 10,000+ iterations. This will change the results so much that you will see that the first figure was completely wrong. I suggest you try this:
for(int iter = 1; iter<=100000; iter *= 10) {
long start = System.nanoTime();
int[] sortedArr = null;
for(int i=0;i<iter;i++)
sortedArr = selectionSort(arr1);
long elapsed = System.nanoTime() - start;
System.out.println("\n///////////SELECTIONSort//////////////");
System.out.println("\nSelection sort implemented below prints a sorted list:");
print(sortedArr);
System.out.printf("It took %.3f ms on average....", elapsed / 1e6 / iter);
}
You will see your results improve 10x, maybe even 100x, just by running the code for longer.
You can use print formatting. For a double or float, to get 7 places after the decimal place, you would do:
System.out.printf("It took %.7f ms....", elapsed);
EDIT:
You are actually using a long, not a double, so you cannot get fractional digits: a long only takes on integer values and has no decimal places.
To get an approximation of the runtime, run the same sort in a loop, say 1000 times, and then divide the measured time by the number of iterations.
For example:
System.out.println("It took " + ((double)elapsed) / NUMBER_OF_ITERATIONS + " ms on average");
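Try this:
String.format("%.7f", (double) longvalue);
With the line above you can format a floating-point number; a long has to be cast to double first, because passing a long directly to %f throws an IllegalFormatConversionException. Here 7 is the number of digits you want after the '.'.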
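Putting the answers above together, a small sketch: measure with System.nanoTime(), convert the long to a double number of milliseconds, and only then format it. ElapsedFormat and formatMillis are hypothetical names, the summing loop is just a placeholder for the sort, and Locale.ROOT is used here so the decimal separator is always a dot.

```java
import java.util.Locale;

final class ElapsedFormat {
    // Converts a nanosecond count to milliseconds and formats it with 7 decimals.
    static String formatMillis(long elapsedNanos) {
        return String.format(Locale.ROOT, "%.7f", elapsedNanos / 1e6);
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        long sum = 0;
        for (int i = 0; i < 1_000; i++) {
            sum += i;                          // placeholder for the sort being timed
        }
        long elapsed = System.nanoTime() - start;
        System.out.println("It took " + formatMillis(elapsed) + " ms (sum=" + sum + ")");
    }
}
```

For instance, formatMillis(5324) yields "0.0053240", since 5324 ns is 0.005324 ms.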
I've written a bubble sort program that sorts 10000 unique values into order.
I've run the program and it gives me an output, but the output doesn't seem to look right to me.
Here is the code:
public class BubbleSort {
public static void main(String[] args) {
int BubArray[] = new int[]{#here are 10000 integers#};
System.out.println("Array Before Bubble Sort");
for(int a = 0; a < BubArray.length; a++){
System.out.print(BubArray[a] + " ");
}
double timeTaken = bubbleSortTimeTaken(BubArray);
bubbleSort(BubArray);
System.out.println("");
System.out.println("Array After Bubble Sort");
for(int a = 0; a < BubArray.length; a++){
System.out.println(" Time taken for Sort : " + timeTaken + " milliseconds.");
System.out.print(BubArray[a] + " ");
}
}
private static void bubbleSort(int[] BubArray) {
int z = BubArray.length;
int temp = 0;
for(int a = 0; a < z; a++){
for(int x=1; x < (z-a); x++){
if(BubArray[x-1] > BubArray[x]){
temp = BubArray[x-1];
BubArray[x-1] = BubArray[x];
BubArray[x] = temp;
}
}
}
}
public static double bubbleSortTimeTaken(int[] BubArray) {
long startTime = System.nanoTime();
bubbleSort(BubArray);
long timeTaken = System.nanoTime() - startTime;
return timeTaken;
}
}
The code runs smoothly with no errors, but this is the output I receive:
Array Before Bubble Sort
#10000 integers randomly#
Array After Bubble Sort
Time taken for Sort : 1.0114869E7 milliseconds.
10 Time taken for Sort : 1.0114869E7 milliseconds.
11 Time taken for Sort : 1.0114869E7 milliseconds.
17 Time taken for Sort : 1.0114869E7 milliseconds.
24 Time taken for Sort : 1.0114869E7 milliseconds.
35 Time taken for Sort : 1.0114869E7 milliseconds.
53 Time taken for Sort : 1.0114869E7 milliseconds.
....
14940 Time taken for Sort : 1.0114869E7 milliseconds.
14952 Time taken for Sort : 1.0114869E7 milliseconds.
14957 Time taken for Sort : 1.0114869E7 milliseconds.
14958 Time taken for Sort : 1.0114869E7 milliseconds.
14994 Time taken for Sort : 1.0114869E7 milliseconds.
14997 Time taken for Sort : 1.0114869E7 milliseconds.
BUILD SUCCESSFUL (total time: 1 second)
The "1.0114869E7 milliseconds" line is repeated throughout the output, and I don't think the output is exactly what I'm trying to produce, even though it looks close. I want to output the time taken for the whole program to run and also the time taken by each sort.
I hope this makes sense.
Any help would be appreciated, thanks.
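I guess you want to print the time only once, so the println should go before the for loop. Note also that System.nanoTime() returns nanoseconds, so divide by 1e6 before labelling the value as milliseconds (1.0114869E7 ns is about 10.1 ms).
System.out.println(" Time taken for Sort : " + timeTaken / 1e6 + " milliseconds.");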
for(int a = 0; a < BubArray.length; a++){
System.out.print(BubArray[a] + " ");
}
You have already sorted the array by the time you display it in the for loop. The time printed is (approximately) the total time taken by bubble sort, which is measured in the following method:
public static double bubbleSortTimeTaken(int[] BubArray) {
long startTime = System.nanoTime();
bubbleSort(BubArray);
long timeTaken = System.nanoTime() - startTime;
return timeTaken;
}
}
So, that's the total time.
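A compact sketch of the corrected flow, under the assumption that the goal is to sort once, time that single sort, and print the time once: BubbleTiming, timeSortNanos and nanosToMillis are hypothetical names, the six-element array is made-up sample data, and the bubble sort mirrors the one in the question.

```java
import java.util.Arrays;

final class BubbleTiming {
    // Same bubble sort as in the question.
    static void bubbleSort(int[] a) {
        for (int pass = 0; pass < a.length; pass++) {
            for (int x = 1; x < a.length - pass; x++) {
                if (a[x - 1] > a[x]) {
                    int temp = a[x - 1];
                    a[x - 1] = a[x];
                    a[x] = temp;
                }
            }
        }
    }

    // Sorts the array in place and returns the elapsed time in nanoseconds.
    static long timeSortNanos(int[] a) {
        long start = System.nanoTime();
        bubbleSort(a);
        return System.nanoTime() - start;
    }

    // System.nanoTime() counts nanoseconds; 1 ms = 1,000,000 ns.
    static double nanosToMillis(long nanos) {
        return nanos / 1_000_000.0;
    }

    public static void main(String[] args) {
        int[] data = {53, 10, 35, 17, 24, 11};   // sample data (assumption)
        long nanos = timeSortNanos(data);         // the array is sorted exactly once
        System.out.println("Time taken for Sort : " + nanosToMillis(nanos) + " milliseconds.");
        System.out.println(Arrays.toString(data));
    }
}
```

The question's main calls bubbleSort a second time on the already sorted array; with a single timed call that extra pass is unnecessary.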
Using Java, I want to generate some random values in one program and then use these values in another program every time the second program is executed.
The purpose is to generate the random values once and then keep them constant for every later run of the program. Is this possible in some way? Thanks
When you exit a program, anything you don't store in a file is lost.
I suspect you don't need to worry about IO as much as you think. You should be able to read millions of values in a few milli-seconds. In fact you should be able to generate millions of random numbers in a fraction of a second.
Random random = new Random(1);
long start = System.nanoTime();
int values = 1000000;
for (int i = 0; i < values; i++)
random.nextInt();
long time = System.nanoTime() - start;
System.out.printf("Took %.3f seconds to generate %,d values%n",
time / 1e9, values);
prints
Took 0.015 seconds to generate 1,000,000 values
Generating and writing
int values = 1000000;
ByteBuffer buffer = ByteBuffer.allocateDirect(4 * values).order(ByteOrder.nativeOrder());
Random random = new Random(1);
long start = System.nanoTime();
for (int i = 0; i < values; i++)
buffer.putInt(random.nextInt());
buffer.flip();
FileOutputStream fos = new FileOutputStream("/tmp/random.ints");
fos.getChannel().write(buffer);
fos.close();
long time = System.nanoTime() - start;
System.out.printf("Took %.3f seconds to generate&write %,d values%n", time / 1e9, values);
prints
Took 0.021 seconds to generate&write 1,000,000 values
Reading the same file.
long start2 = System.nanoTime();
FileInputStream fis = new FileInputStream("/tmp/random.ints");
MappedByteBuffer buffer2 = fis.getChannel().map(FileChannel.MapMode.READ_ONLY, 0, values * 4).order(ByteOrder.nativeOrder());
for (int i = 0; i < values; i++)
buffer2.getInt();
fis.close();
long time2 = System.nanoTime() - start2;
System.out.printf("Took %.3f seconds to read %,d values%n", time2 / 1e9, values);
prints
Took 0.011 seconds to read 1,000,000 values
Reading the same file repeatedly
long sum = 0;
int repeats = 1000;
for (int j = 0; j < repeats; j++) {
buffer2.position(0);
for (int i = 0; i < values; i++)
sum += buffer2.getInt();
}
fis.close();
long time3 = System.nanoTime() - start2;
System.out.printf("Took %.3f seconds to read %,d values%n", time3 / 1e9, repeats * values);
prints
Took 1.833 seconds to read 1,000,000,000 values
There are a couple of ways you may consider:
1. Use the same seed each time you generate the random numbers.
2. Generate the random numbers once and save them in a file. Your second program then reads the file to get the values.
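The first option can be sketched like this. SeededValues and generate are hypothetical names and 42 is an arbitrary seed; the point is only that two java.util.Random instances created with the same seed produce the same sequence of values.

```java
import java.util.Arrays;
import java.util.Random;

final class SeededValues {
    // Produces `count` pseudo-random ints; the same seed gives the same sequence.
    static int[] generate(long seed, int count) {
        Random random = new Random(seed);
        int[] values = new int[count];
        for (int i = 0; i < count; i++) {
            values[i] = random.nextInt();
        }
        return values;
    }

    public static void main(String[] args) {
        // Any program that uses the same seed sees the same numbers.
        int[] firstRun = generate(42L, 5);
        int[] secondRun = generate(42L, 5);
        System.out.println(Arrays.toString(firstRun));
        System.out.println("identical: " + Arrays.equals(firstRun, secondRun));
    }
}
```

With this approach nothing has to be stored; both programs only need to agree on the seed and on the order in which they draw values.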
The fastest way is to run the program once, note down the random numbers generated, and then hardcode the random numbers in an array in your program! Then, the next time onwards, your program can read these same values from the array.
So suppose your program generates random numbers as follows -
0.34, 0.15, 0.28, 0.45, ...
You can then define an array and store these values in it.
randomValues[0] = 0.34;
randomValues[1] = 0.15;
randomValues[2] = 0.28;
randomValues[3] = 0.45;
.
.
.
Then each time, simply use an index to get the random number you want.
index = 0;
randomNumber = randomValues[index];
index++; // so the next time, you can get the next random number in sequence.
Generate them once, then save them into a file.
After that, every time you run your program, you have to load these values again.
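A minimal sketch of the generate-once, load-every-time idea, assuming a simple binary layout (a count followed by the ints). RandomStore, save, load and loadOrCreate are hypothetical names, and the file name random.ints is only an example.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Random;

final class RandomStore {
    // Writes the number of values followed by the values themselves.
    static void save(Path file, int[] values) throws IOException {
        try (DataOutputStream out = new DataOutputStream(
                new BufferedOutputStream(Files.newOutputStream(file)))) {
            out.writeInt(values.length);
            for (int v : values) {
                out.writeInt(v);
            }
        }
    }

    // Reads back an array written by save().
    static int[] load(Path file) throws IOException {
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(Files.newInputStream(file)))) {
            int[] values = new int[in.readInt()];
            for (int i = 0; i < values.length; i++) {
                values[i] = in.readInt();
            }
            return values;
        }
    }

    // Generates the values on the first run only; later runs reuse the file.
    static int[] loadOrCreate(Path file, int count) throws IOException {
        if (Files.notExists(file)) {
            save(file, new Random().ints(count).toArray());
        }
        return load(file);
    }

    public static void main(String[] args) throws IOException {
        Path file = Paths.get("random.ints");   // example file name (assumption)
        int[] values = loadOrCreate(file, 10);
        System.out.println("first value: " + values[0] + " (same on every run)");
    }
}
```

Deleting the file is all it takes to get a fresh set of values on the next run.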