Why is sentinel search slower than linear search? - Java

I decided to reduce the number of comparisons required to find an element in an array. Here we replace the last element of the list with the search key itself and run a while loop to see whether a copy of the key exists in the list, quitting the loop as soon as we find it. See the code snippet for clarification.
import java.util.Random;

public class Search {

    public static void main(String[] args) {
        int n = 10000000;
        int key = 10000;
        int[] arr = generateRandomSize(n);

        long start = System.nanoTime();
        int find = sentinels(arr, key);
        long end = System.nanoTime();
        System.out.println(find);
        System.out.println(end - start);

        arr = generateRandomSize(n);
        start = System.nanoTime();
        find = linear(arr, key);
        end = System.nanoTime();
        System.out.println(find);
        System.out.println(end - start);
    }

    public static int[] generateRandomSize(int n) {
        int[] arr = new int[n];
        Random rand = new Random();
        for (int i = 0; i < n; ++i) {
            arr[i] = rand.nextInt(5000);
        }
        return arr;
    }

    public static int linear(int[] a, int key) {
        for (int i = 0; i < a.length; ++i) {
            if (a[i] == key) {
                return i;
            }
        }
        return -1;
    }

    public static int sentinels(int[] a, int key) {
        int n = a.length;
        int last = a[n - 1];
        a[n - 1] = key;
        int i = 0;
        while (a[i] != key) {
            ++i;
        }
        a[n - 1] = last;
        if ((i < n - 1) || a[n - 1] == key) {
            return i;
        }
        return -1;
    }
}
So using sentinel search we are not doing 10,000,000 comparisons like i < arr.length. But why does linear search always show better performance?

You'd have to look at the bytecode, and even deeper, to see what HotSpot is making of this. But I am quite sure that this statement is not true:
using sentinel search we are not doing 10000000 comparisons like i <
arr.length
Why? Because when you access a[i], i has to be bounds-checked. In the linear case, on the other hand, the optimiser can deduce that it can omit the bounds check, since it "knows" that i >= 0 (because of the loop structure) and also that i < arr.length, because that has already been tested in the loop condition.
So the sentinel approach just adds overhead.
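To make the argument concrete, here is a minimal sketch contrasting the two loop shapes (the method names are made up, and the comments describe the JIT behaviour claimed above, which depends on the JVM version rather than on anything guaranteed by the code itself):

// Shape A: i is bounded by a.length in the loop condition, so the JIT can
// typically prove 0 <= i < a.length and drop the per-access range check.
static int boundedLoop(int[] a, int key) {
    for (int i = 0; i < a.length; ++i) {
        if (a[i] == key) {
            return i;
        }
    }
    return -1;
}

// Shape B: termination relies on the sentinel value stored in a[a.length - 1],
// which is a data property the JIT cannot prove, so each a[i] generally keeps
// its implicit range check, on top of the extra writes that install and
// restore the sentinel.
static int sentinelLoop(int[] a, int key) {
    int n = a.length;
    int last = a[n - 1];
    a[n - 1] = key;
    int i = 0;
    while (a[i] != key) {
        ++i;
    }
    a[n - 1] = last;
    return (i < n - 1 || last == key) ? i : -1;
}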
This makes me think of a smart C++ optimisation ("Template Meta Programming" and "Expression Templates") I did about 20 years ago. It led to faster execution times (at the cost of a much higher compilation time), but after the next compiler version was released, I discovered that the new version could optimise the original source to produce the exact same assembly. In short, I should have spent my time differently and stayed with the more readable (= easier to maintain) version of the code.

Related

Java performance issue with Arrays.sort [duplicate]

This question already has answers here:
Why is processing a sorted array *slower* than an unsorted array? (Java's ArrayList.indexOf)
(3 answers)
Closed 9 months ago.
I've been solving an algorithmic problem and found a solution, as I thought. But unexpectedly I bumped into a weird problem.
Let's assume I have the following code on Java 8/17 (it replicates on both), Intel 11th gen processor:
import java.util.Arrays;
import java.util.concurrent.ThreadLocalRandom;

public class DistanceYandex {

    static class Elem implements Comparable<Elem> {
        int value;
        int index;
        long dist;

        public Elem(int value, int index) {
            this.value = value;
            this.index = index;
        }

        @Override
        public int compareTo(Elem o) {
            return Integer.compare(value, o.value);
        }
    }

    public static void main(String[] args) {
        int n = 300_000;
        int k = 3_000;
        Elem[] elems = new Elem[n];
        for (int i = 0; i < n; i++) {
            elems[i] = new Elem(ThreadLocalRandom.current().nextInt(), i);
        }
        solve(n, k, elems);
    }

    private static void solve(int n, int k, Elem[] elems) {
        Arrays.sort(elems); // interesting line
        long time = System.nanoTime();
        for (int i = 0; i < n; i++) {
            elems[i].dist = findDistForIth(elems, i, k);
        }
        // I omit output, because it's irrelevant
        // Arrays.sort(elems, Comparator.comparingInt(elem -> elem.index));
        // System.out.print(elems[0].dist);
        // for (int i = 1; i < n; i++) {
        //     System.out.print(" " + elems[i].dist);
        // }
        System.out.println((System.nanoTime() - time) / 1_000_000_000.0);
    }

    private static long findDistForIth(Elem[] elems, int i, int k) {
        int midElem = elems[i].value;
        int left = i - 1;
        int right = i + 1;
        long dist = 0;
        for (int j = 0; j < k; j++) {
            if (left < 0) {
                dist += elems[right++].value - midElem;
            } else if (right >= elems.length) {
                dist += midElem - elems[left--].value;
            } else {
                int leftAdd = midElem - elems[left].value;
                int rightAdd = elems[right].value - midElem;
                if (leftAdd < rightAdd) {
                    dist += leftAdd;
                    left--;
                } else {
                    dist += rightAdd;
                    right++;
                }
            }
        }
        return dist;
    }
}
Look at the solve function. Here we have a simple solution that calls findDistForIth n times and measures how long that takes (I don't use JMH, because the testing system for my problem uses simple one-shot time measurements). Before it captures the start time, it sorts the array in natural order using the built-in Arrays.sort function.
As you can see, the measured time doesn't include the time spent sorting the array. Also, findDistForIth's behaviour does not depend on whether the input array is sorted or not (it mostly takes the third else branch). But if I comment out the line with Arrays.sort I get significantly faster execution: instead of roughly 7.3 seconds, it takes roughly 1.6 seconds. More than 4 times faster!
I don't understand what's going on.
I thought maybe it is GC that's messing things up here, so I tried increasing the memory I give to the JVM to 2 GB (-Xmx2048M -Xms2048M). Didn't help.
I tried passing an explicit comparator to Arrays.sort as the second argument (Comparator.comparingInt(e -> e.value)) and removing the Comparable implementation from the Elem class. Didn't help.
I launched the profiler (IntelliJ Profiler) with Arrays.sort included and with it excluded, but it didn't give me much information...
I tried building it directly into a .jar and launching it via the java command line (before, I did it via IntelliJ). It also didn't help.
Does anybody know what's going on?
This problem also replicates in online compiler: https://onlinegdb.com/MPyNIknB8T
Maybe you need to sort your data using the red-black tree sorting implemented in SortedSet; Arrays.sort uses a merge sort algorithm, which works well for small amounts of data.

How to improve efficiency

Write a function:
class Solution {
    public int solution(int[] A);
}
that, given an array A of N integers, returns the smallest positive integer (greater than 0)
that does not occur in A.
For example, given A = [1,3,6,4,1,2], the function should return 5.
Given A = [1,2,3], the function should return 4.
Given A = [-1, -3], the function should return 1.
Write an efficient algorithm for the following assumptions.
N is an integer within the range [1..100,000];
each element of array A is an integer within the range [-1,000,000..1,000,000].
I wrote the following algorithm in Java:
public class TestCodility {

    public static void main(String args[]) {
        int a[] = {1, 3, 6, 4, 1, 2};
        //int a[] = {1, 2, 3};
        //int b[] = {-1, -3};
        int element = 0;
        //counts how far the array "a" was traversed before finding a match
        int countArrayLength = 0;

        loopExtern:
        for (int i = 0; i < 1_000_000; i++) {
            element = i + 1;
            countArrayLength = 0;

            loopIntern:
            for (int j = 0; j < a.length; j++) {
                if (element == a[j]) {
                    break loopIntern;
                }
                countArrayLength++;
            }
            if (countArrayLength == a.length && element > 0) {
                System.out.println("Smallest possible " + element);
                break loopExtern;
            }
        }
    }
}
It does the job but I am pretty sure that it is not efficient. So my question is, how to improve this algorithm so that it becomes efficient?
You should get a grasp of Big O and runtime complexities.
It's a universal construct for better understanding the efficiency of code.
Check out this website; it shows the graph of runtime complexities in terms of Big O, which can aid you in your search for more efficient programming.
http://bigocheatsheet.com/
However, long story short...
The least amount of operations and memory consumed by an arbitrary program is the most efficient way to achieve something you set out to do with your code.
You can make something more efficient by reducing redundancy in your algorithms and getting rid of any operation that does not need to occur to achieve what you are trying to do.
The point is to sort your array and then iterate over it. With a sorted array you can simply skip all negative numbers and then find the minimal possible element that you need.
Here is a more general solution for your task:
import java.util.Arrays;

public class Main {

    public static int solution(int[] A) {
        int result = 1;
        Arrays.sort(A);
        for (int a : A) {
            if (a > 0) {
                if (result == a) {
                    result++;
                } else if (result < a) {
                    return result;
                }
            }
        }
        return result;
    }

    public static void main(String args[]) {
        int a[] = {1, 3, 6, 4, 1, 2};
        int b[] = {1, 2, 3};
        int c[] = {-1, -3};
        System.out.println("a) Smallest possible " + solution(a)); //prints 5
        System.out.println("b) Smallest possible " + solution(b)); //prints 4
        System.out.println("c) Smallest possible " + solution(c)); //prints 1
    }
}
The complexity of that algorithm should be O(n log n).
The main idea is the same as in Denis's answer:
first sort, then process, but using Java 8 features.
There are a few steps that may increase the timings (I am not sure how efficiently Java 8 processes filter, distinct and even takeWhile; in the worst case you have something similar to 3 full loops here, plus one additional loop for transforming the array into a stream). Overall you should get the same run-time complexity.
One advantage could be verbosity, but it also requires some additional knowledge compared with Denis's solution.
import java.util.function.Supplier;
import java.util.stream.IntStream;

public class AMin {

    public static void main(String args[]) {
        int a[] = {-2, -3, 1, 2, 3, -7, 5, 6};
        int[] i = {1};

        // get the next integer starting from 1
        Supplier<Integer> supplier = () -> i[0]++;

        //1. transform the array into a specialized int-stream
        //2. keep only positive numbers : filter
        //3. keep no duplicates : distinct
        //4. sort by natural order (ascending)
        //5. get the maximum stream based on criteria (predicate) : longest consecutive numbers starting from 1
        //6. get the number of elements from the longest "sub-stream" : count
        long count = IntStream.of(a).filter(t -> t > 0).distinct().sorted()
                              .takeWhile(t -> t == supplier.get()).count();
        count = (count == 0) ? 1 : ++count;

        //print 4
        System.out.println(count);
    }
}
There are many solutions with O(n) space complexity and O(n) time complexity. You can convert the array to:
a set: convert the array to a set, then loop over 1...N and check whether each number is contained; if not, return it (a minimal sketch of this variant is shown after this list).
a hashmap: convert the array to a map, then loop over 1...N and check whether each number is contained; if not, return it.
a count array: convert the given array to a count array of the positive values (if arr[i] == 5, countArr[5]++; if arr[i] == 1, countArr[1]++), then loop over 1...N and check whether countArr[i] is greater than 0; if not, return i.
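For concreteness, here is a minimal sketch of the set-based variant described above (the class name is made up and this is not part of the original answer); the hashmap and count-array variants follow the same pattern.

import java.util.HashSet;
import java.util.Set;

class SetBasedSolution {
    public int solution(int[] A) {
        // O(n) extra space: remember every value that occurs in A
        Set<Integer> seen = new HashSet<>();
        for (int x : A) {
            seen.add(x);
        }
        // The answer always lies in 1..N+1, so this loop is O(n) on average
        for (int candidate = 1; candidate <= A.length + 1; candidate++) {
            if (!seen.contains(candidate)) {
                return candidate;
            }
        }
        return 1; // unreachable: some candidate in 1..N+1 is always missing
    }
}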
For a more effective algorithm, as @Ricola mentioned, here is a Java solution with O(n) time complexity and O(1) space complexity:
static void swap(final int arr[], final int i, final int j) {
    final int temp = arr[i];
    arr[i] = arr[j];
    arr[j] = temp;
}

static boolean isIndexInSafeArea(final int arr[], final int i) {
    return arr[i] > 0 && arr[i] - 1 < arr.length && arr[i] != i + 1;
}

static int solution(final int arr[]) {
    for (int i = 0; i < arr.length; i++) {
        while (isIndexInSafeArea(arr, i) && arr[i] != arr[arr[i] - 1]) {
            swap(arr, i, arr[i] - 1);
        }
    }

    for (int i = 0; i < arr.length; i++) {
        if (arr[i] != i + 1) {
            return i + 1;
        }
    }
    return arr.length + 1;
}

Java: how to optimize sum of big array

I'm trying to solve a problem on Codeforces and I get a "Time limit exceeded" verdict. The only time-consuming operation is calculating the sum of a big array, so I've tried to optimize it, but with no result.
What I want: optimize the following function:
//the array could be up to Integer.MAX_VALUE in length
private long canonicalSum(int[] array) {
    int sum = 0;
    for (int i = 0; i < array.length; i++)
        sum += array[i];
    return sum;
}
Question1 [main]: Is it possible to optimize canonicalSum?
What I've tried: to avoid operations with very big numbers, I decided to use auxiliary data. For instance, I convert array1[100] to array2[10], where array2[i] = array1[i] + array1[i+1] + ... + array1[i+9].
private long optimizedSum(int[] array, int step) {
    do {
        array = sumItr(array, step);
    } while (array.length != 1);
    return array[0];
}

private int[] sumItr(int[] array, int step) {
    int length = array.length / step + 1;
    boolean needCompensation = (array.length % step == 0) ? false : true;
    int aux[] = new int[length];
    for (int i = 0, auxSum = 0, auxPointer = 0; i < array.length; i++) {
        auxSum += array[i];
        if ((i + 1) % step == 0) {
            aux[auxPointer++] = auxSum;
            auxSum = 0;
        }
        if (i == array.length - 1 && needCompensation) {
            aux[auxPointer++] = auxSum;
        }
    }
    return aux;
}
Problem: But it appears that canonicalSum is ten times faster than optimizedSum. Here is my test:
@Test
public void sum_comparison() {
    final int ARRAY_SIZE = 100000000;
    final int STEP = 1000;
    int[] array = genRandomArray(ARRAY_SIZE);

    System.out.println("Start canonical Sum");
    long beg1 = System.nanoTime();
    long sum1 = canonicalSum(array);
    long end1 = System.nanoTime();
    long time1 = end1 - beg1;
    System.out.println("canon:" + TimeUnit.MILLISECONDS.convert(time1, TimeUnit.NANOSECONDS) + " milliseconds");

    System.out.println("Start optimizedSum");
    long beg2 = System.nanoTime();
    long sum2 = optimizedSum(array, STEP);
    long end2 = System.nanoTime();
    long time2 = end2 - beg2;
    System.out.println("custom:" + TimeUnit.MILLISECONDS.convert(time2, TimeUnit.NANOSECONDS) + " milliseconds");

    assertEquals(sum1, sum2);
    assertTrue(time2 <= time1);
}
private int[] genRandomArray(int size) {
    int[] array = new int[size];
    Random random = new Random();
    for (int i = 0; i < array.length; i++) {
        array[i] = random.nextInt();
    }
    return array;
}
Question2: Why does optimizedSum work slower than canonicalSum?
As of Java 9, vectorisation of this operation has been implemented but disabled, based on benchmarks measuring the all-in cost of the code plus its compilation. Depending on your processor, this leads to the relatively entertaining result that if you introduce artificial complications into your reduction loop, you can trigger autovectorisation and get a quicker result! So the fastest code, for now, assuming numbers small enough not to overflow, is:
public int sum(int[] data) {
    int value = 0;
    for (int i = 0; i < data.length; ++i) {
        value += 2 * data[i];
    }
    return value / 2;
}
This isn't intended as a recommendation! This is more to illustrate that the speed of your code in Java is dependent on the JIT, its trade-offs, and its bugs/features in any given release. Writing cute code to optimise problems like this is at best vain and will put a shelf life on the code you write. For instance, had you manually unrolled a loop to optimise for an older version of Java, your code would be much slower in Java 8 or 9 because this decision would completely disable autovectorisation. You'd better really need that performance to do it.
Question1 [main]: Is it possible to optimize canonicalSum?
Yes, it is. But I have no idea by what factor.
Some things you can do are:
use the parallel pipelines introduced in Java 8. The processor has instructions for doing a parallel sum of 2 arrays (and more). This can be observed in Octave, where summing two vectors with ".+" (parallel addition) or "+" is way faster than using a loop. (A short sketch using parallel streams appears after the loop-unrolling example below.)
use multithreading. You could use a divide and conquer algorithm. Maybe like this:
divide the array into 2 or more
keep dividing recursively until you get an array with manageable size for a thread.
start computing the sum for the sub arrays (divided arrays) with separate threads.
finally add the sum generated (from all the threads) for all sub arrays together to produce final result
maybe unrolling the loop would help a bit, too. By loop unrolling I mean reducing the number of loop iterations by doing more operations per iteration manually.
An example from http://en.wikipedia.org/wiki/Loop_unwinding :
for (int x = 0; x < 100; x++)
{
    delete(x);
}
becomes
for (int x = 0; x < 100; x += 5)
{
    delete(x);
    delete(x + 1);
    delete(x + 2);
    delete(x + 3);
    delete(x + 4);
}
but as mentioned, this must be done with caution and profiling, since the JIT can probably do this kind of optimization itself.
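As a concrete illustration of the first point above (parallel pipelines), here is a minimal sketch. It is not from the original answer, the class name is made up, and whether it actually beats the plain loop depends on the array size and the number of cores.

import java.util.Arrays;

class ParallelSumSketch {
    // Sums the array with a parallel stream; the terminal sum() reduction is
    // split across the common ForkJoinPool. Widening to long as a side effect
    // also avoids the int overflow in the original canonicalSum.
    static long parallelSum(int[] array) {
        return Arrays.stream(array).asLongStream().parallel().sum();
    }
}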
An implementation of mathematical operations for the multithreaded approach can be seen here.
An example implementation with the Fork/Join framework introduced in Java 7, which basically does what the divide-and-conquer algorithm above describes, would be:
import java.util.concurrent.RecursiveTask;

public class ForkJoinCalculator extends RecursiveTask<Double> {

    public static final long THRESHOLD = 1_000_000;

    private final SequentialCalculator sequentialCalculator;
    private final double[] numbers;
    private final int start;
    private final int end;

    public ForkJoinCalculator(double[] numbers, SequentialCalculator sequentialCalculator) {
        this(numbers, 0, numbers.length, sequentialCalculator);
    }

    private ForkJoinCalculator(double[] numbers, int start, int end, SequentialCalculator sequentialCalculator) {
        this.numbers = numbers;
        this.start = start;
        this.end = end;
        this.sequentialCalculator = sequentialCalculator;
    }

    @Override
    protected Double compute() {
        int length = end - start;
        if (length <= THRESHOLD) {
            return sequentialCalculator.computeSequentially(numbers, start, end);
        }
        ForkJoinCalculator leftTask = new ForkJoinCalculator(numbers, start, start + length / 2, sequentialCalculator);
        leftTask.fork();
        ForkJoinCalculator rightTask = new ForkJoinCalculator(numbers, start + length / 2, end, sequentialCalculator);
        Double rightResult = rightTask.compute();
        Double leftResult = leftTask.join();
        return leftResult + rightResult;
    }
}
Here we develop a RecursiveTask that splits an array of doubles until the length of a subarray doesn't go below a given threshold. At that point the subarray is processed sequentially, applying to it the operation defined by the following interface:
public interface SequentialCalculator {
    double computeSequentially(double[] numbers, int start, int end);
}
And the usage example:
public static double varianceForkJoin(double[] population) {
    final ForkJoinPool forkJoinPool = new ForkJoinPool();

    double total = forkJoinPool.invoke(new ForkJoinCalculator(population, new SequentialCalculator() {
        @Override
        public double computeSequentially(double[] numbers, int start, int end) {
            double total = 0;
            for (int i = start; i < end; i++) {
                total += numbers[i];
            }
            return total;
        }
    }));

    final double average = total / population.length;

    double variance = forkJoinPool.invoke(new ForkJoinCalculator(population, new SequentialCalculator() {
        @Override
        public double computeSequentially(double[] numbers, int start, int end) {
            double variance = 0;
            for (int i = start; i < end; i++) {
                variance += (numbers[i] - average) * (numbers[i] - average);
            }
            return variance;
        }
    }));

    return variance / population.length;
}
If you want to add N numbers then the runtime is O(N), so in this respect your canonicalSum cannot be "optimized".
What you can do to reduce the runtime is make the summation parallel, i.e. break the array into parts, pass them to separate threads, and at the end add up the results returned by each thread.
Update: this implies a multicore system, but there is a Java API to get the number of cores (used in the sketch below).
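A minimal sketch of that idea (not part of the original answer, and the class name is made up): it splits the array across as many worker threads as Runtime.getRuntime().availableProcessors() reports and adds up the partial sums.

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.*;

class ThreadedSumSketch {
    static long parallelSum(int[] array) throws InterruptedException, ExecutionException {
        int cores = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(cores);
        int chunk = (array.length + cores - 1) / cores;
        List<Future<Long>> parts = new ArrayList<>();
        for (int c = 0; c < cores; c++) {
            final int from = c * chunk;
            final int to = Math.min(array.length, from + chunk);
            // each task sums its own slice of the array
            parts.add(pool.submit(() -> {
                long s = 0;
                for (int i = from; i < to; i++) s += array[i];
                return s;
            }));
        }
        long total = 0;
        for (Future<Long> part : parts) total += part.get(); // wait for and accumulate the partial sums
        pool.shutdown();
        return total;
    }
}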

Tukey's ninther for different shufflings of the same data

While implementing improvements to quicksort partitioning, I tried to use Tukey's ninther to find the pivot (borrowing almost everything from Sedgewick's implementation in QuickX.java).
My code below gives different results each time the array of integers is shuffled.
import java.util.Random;

public class TukeysNintherDemo {

    public static int tukeysNinther(Comparable[] a, int lo, int hi) {
        int N = hi - lo + 1;
        int mid = lo + N / 2;
        int delta = N / 8;
        int m1 = median3a(a, lo, lo + delta, lo + 2 * delta);
        int m2 = median3a(a, mid - delta, mid, mid + delta);
        int m3 = median3a(a, hi - 2 * delta, hi - delta, hi);
        int tn = median3a(a, m1, m2, m3);
        return tn;
    }

    // return the index of the median element among a[i], a[j], and a[k]
    private static int median3a(Comparable[] a, int i, int j, int k) {
        return (less(a[i], a[j]) ?
                (less(a[j], a[k]) ? j : less(a[i], a[k]) ? k : i) :
                (less(a[k], a[j]) ? j : less(a[k], a[i]) ? k : i));
    }

    private static boolean less(Comparable x, Comparable y) {
        return x.compareTo(y) < 0;
    }

    public static void shuffle(Object[] a) {
        Random random = new Random(System.currentTimeMillis());
        int N = a.length;
        for (int i = 0; i < N; i++) {
            int r = i + random.nextInt(N - i); // between i and N-1
            Object temp = a[i];
            a[i] = a[r];
            a[r] = temp;
        }
    }

    public static void show(Comparable[] a) {
        int N = a.length;
        if (N > 20) {
            System.out.format("a[0]= %d\n", a[0]);
            System.out.format("a[%d]= %d\n", N - 1, a[N - 1]);
        } else {
            for (int i = 0; i < N; i++) {
                System.out.print(a[i] + ",");
            }
        }
        System.out.println();
    }

    public static void main(String[] args) {
        Integer[] a = new Integer[]{17, 15, 14, 13, 19, 12, 11, 16, 18};
        shuffle(a); // shuffle the data before selecting the ninther
        System.out.print("data= ");
        show(a);
        int tn = tukeysNinther(a, 0, a.length - 1);
        System.out.println("ninther=" + a[tn]);
    }
}
Running this a couple of times gives:
data= 11,14,12,16,18,19,17,15,13,
ninther=15
data= 14,13,17,16,18,19,11,15,12,
ninther=14
data= 16,17,12,19,18,13,14,11,15,
ninther=16
Will Tukey's ninther give different values for different shufflings of the same dataset? When I tried to find the median of medians by hand, I found that the calculations in the code are correct, which means that the same dataset yields different results, unlike the true median of the dataset. Is this the proper behaviour? Can someone with more knowledge of statistics comment?
Tukey's ninther examines 9 items and calculates the median using only those.
For different random shuffles, you may very well get a different Tukey's ninther, because different items may be examined. After all, you always examine the same array slots, but a different shuffle may have put different items in those slots.
The key here is that Tukey's ninther is not the median of the given array. It is an attempted approximation of the median, made with very little effort: we only have to read 9 items and make 12 comparisons to get it. This is much faster than finding the actual median, and has a smaller chance of resulting in an undesirable pivot compared to the "median of three". Note that the chance still exists.
Does this answer your question?
On a side note, does anybody know if quicksort using Tukey's ninther still requires shuffling? I'm assuming yes, but I'm not certain.

Timing quick sort algorithm

Hey, I seem to be having a problem trying to implement some Java quicksort code over an array of 10,000 random numbers. I have a text file containing the numbers, which are placed into an array that is then passed to the sorting algorithm. My aim is to time how long the sort takes, increasing the number of elements sorted each time, using the timing loop I have. But for some reason this code gives me a curved graph instead of a straight linear line. I know the timing loop and array code work fine, so there seems to be a problem with the sorting code, but I can't seem to find anything! Any help is greatly appreciated, thanks!
import java.io.*;
import java.util.*;

public class Quicksort {

    public static void main(String args[]) throws IOException {
        //Import the random integer text file into an integer array
        File fil = new File("randomASC.txt");
        FileReader inputFil = new FileReader(fil);
        int[] myarray = new int[10000];
        Scanner in = new Scanner(inputFil);
        for (int q = 0; q < myarray.length; q++) {
            myarray[q] = in.nextInt();
        }
        in.close();

        for (int n = 100; n < 10000; n += 100) {
            long total = 0;
            for (int r = 0; r < 10; ++r) {
                long start = System.nanoTime();
                quickSort(myarray, 0, n - 1);
                total += System.nanoTime() - start;
            }
            System.out.println(n + "," + (double) total / 10.0);
        }
    }

    public static void quickSort(int[] a, int p, int r) {
        if (p < r) {
            int q = partition(a, p, r);
            quickSort(a, p, q);
            quickSort(a, q + 1, r);
        }
    }

    private static int partition(int[] a, int p, int r) {
        int x = a[p];
        int i = p - 1;
        int j = r + 1;
        while (true) {
            i++;
            while (i < r && a[i] < x)
                i++;
            j--;
            while (j > p && a[j] > x)
                j--;
            if (i < j)
                swap(a, i, j);
            else
                return j;
        }
    }

    private static void swap(int[] a, int i, int j) {
        int temp = a[i];
        a[i] = a[j];
        a[j] = temp;
    }
}
Only the first iteration of the inner loop actually sorts the array that you've read from the file. All the subsequent iterations are applied to the already-sorted array.
But for some reason using this code gives me a curved graph instead of a straight linear line.
If you mean that the run time grows non-linearly in n, that's to be expected since quicksort is not a linear-time algorithm (no comparison sort is).
Your performance graph looks like a nice quadratic function.
You're getting quadratic rather than O(n log(n)) time because of your choice of pivot: since most of the time you're calling your function on an already-sorted array, choosing the first element as the pivot means you're hitting the worst case every single time.
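A minimal sketch of one way to fix the benchmark (my assumption about the intent, not part of the original answer): copy the first n unsorted numbers into a scratch array before every timed run, so each call sorts fresh data. This would drop into the Quicksort class from the question, which already imports java.util.*.

static void timeQuickSort(int[] myarray) {
    for (int n = 100; n < 10000; n += 100) {
        long total = 0;
        for (int r = 0; r < 10; ++r) {
            // fresh unsorted copy of the first n numbers for every run
            int[] scratch = Arrays.copyOf(myarray, n);
            long start = System.nanoTime();
            quickSort(scratch, 0, n - 1);
            total += System.nanoTime() - start;
        }
        System.out.println(n + "," + (double) total / 10.0);
    }
}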
