Complexity of sorting algorithm - java

Here is a sorting algorithm, not a clever one. In this version, it works well when elements are non-negative and occur at most once. I'm confused about its time complexity. Is it O(n)? So is it better than quick sort in terms of that notation? Thanks. Here is the code:
public int[] stupidSort( int[] array ){
// Variables
int max = array[0];
int index = 0;
int[] lastArray = new int[array.length];
// Find max element in input array
for( int i = 0; i < array.length; i++ ){
if ( array[i] > max )
max = array[i];
}
// Create a new array. In this array, element n will represent number of n's in input array
int[] newArray = new int[max + 1];
for ( int j = 0; j < array.length; j++ )
newArray[array[j]]++;
// If element is bigger than 0, it means that number occured in input. So put it in output array
for( int k = 0; k < newArray.length; k++ ){
if( newArray[k] > 0 )
lastArray[index++] = k;
}
return lastArray;
}

What you wrote is the counting sort, and it has O(n) complexity indeed. However, it cannot be compared to QuickSort because QuickSort is an algorithm based on comparisons. These 2 algorithms belong to different categories (yours is a non-comparison, quicksort is a comparison algorithm). Your algorithm (counting sort) makes the assumption that the range of numbers in the array is known and that all numbers are integer, whereas QuickSort works for every number.
You can learn more for sorting algorithms here. In that link you can see the complexity for sorting algorithms divided in the 2 categories: comparison and non-comparison.
EDIT
As Paul Hankin pointed out the complexity isn't always O(n). It is O(n+k) where k is the max of the input array. Quoted below is the time complexity as explained in the wikipedia article for the counting sort:
Because the algorithm uses only simple for loops, without recursion or subroutine calls, it is straightforward to analyze. The initialization of the count array, and the second for loop which performs a prefix sum on the count array, each iterate at most k + 1 times and therefore take O(k) time. The other two for loops, and the initialization of the output array, each take O(n) time. Therefore, the time for the whole algorithm is the sum of the times for these steps, O(n + k).

The given algorithm is very much similar to Count sort. Whereas QuickSort is a comparison model based sorting algorithm. Only in the worst case QuickSort gives O(n^2) time complexity, otherwise it is O(nlogn). Also the QuickSort is used with it's randomized version that is pivot is selected randomly and hence the worst-case is avoided most often this way.
Edit: Correction as pointed by paulhankin in comment complexity=O(n+k):
The code you have put forth is using counting based sort, that is count sort and your code's time complexity is O(n+k). But what you must realize is that this algorithm is dependent on the range of the input and that range could be anything. Furthermore this algorithm is not InPlace but Stable. In many cases the data you want to sort is not only integer rather the data can be anything with a key that is required to be sorted with the help of the key. If Stable algorithm is not used than in such a case sorting can be problematic.
Just in case if someone does not know:
InPlace Algorithm: Is the one in which additional space required is not dependent on the given input.
Stable Algorithm: Is the one in which for example if there were two 5's in the data set before sorting, the 5 that came first before sorting comes first than the second even after the sorting.
EDIT: (Regarding aladinsane7's comment): Yes the formal version of countSort does handle this aspect also. It would be good if you have a look at CountSort. And its time complexity is O(n+k). Where K accounts for the range of data and n is complexity for remaining algorithm.

Related

I believe this algorithm is O(N). Am I correct?

This algorithm reverses an array of N integers. I believe this algorithm is O(N) because for each loop iteration, the four lines of code are executed once thus completing the job in 4N time.
public static void reverseTheNumbers(int[] list) {
for (int i = 0; i < list.length / 2; i++) {
int j = list.length - 1 - i;
int temp = list[i];
list[i] = list[j];
list[j] = temp;
}
}
There isn't such a thing as 4N time. The algorithm is linear because as you increase the size of the input the runtime of the algorithm increases proportionally. In other words if you doubled the size of list you would expect the algorithm to take twice as long.
It doesn't matter how many operations you do inside your loop - as long as they are each constant time (relative to the input) the runtime of the loop is determined simply by the number of iterations.
Put another way, these four statements are - all together - an O(1) operation.
int j = list.length - 1 - i;
int temp = list[i];
list[i] = list[j];
list[j] = temp;
There's nothing significant about the fact that this sequence of steps is expressed in four statements in Java syntax - experimenting with javap suggests these four lines compiles into ~20 bytecode commands, and who knows how many processor instructions that bytecode gets converted into. The good news is Big-O notation works the same regardless of the particular syntax - a sequence of operations is O(1) or constant time if its execution time is the same regardless of the input.
Therefore you're doing an O(1) operation N times; aka O(N).
Yes, you are correct. The number of operations is linearly dependent on the size of the array (N), making it an O(N) algorithm.
Yes, the complexity of the algorithm is O(n).
However, the exact "time" (because there are no constant factors in asymptotic complexity, check comment below) is not 4 times the size of the array, we could say it is 1/2*(c1+c2+c3+c3) times the size of the array, where 1/2 corresponds to each loop iteration and each c corresponds to the time needed for each operation inside theloop.
It would be 4 times the size of the array, if the algorithm was iterating the whole array 4 times.

repeated element in Array

I have an array of N elements and contain 1 to (N-1) integers-a sequence of integers starting from 1 to the max number N-1-, meaning that there is only one number is repeated, and I want to write an algorithm that return this repeated element, I have found a solution but it only could work if the array is sorted, which is may not be the case.
?
int i=0;
while(i<A[i])
{
i++
}
int rep = A[i];
I do not know why RC removed his comment but his idea was good.
With the knowledge of N you easy can calculate that the sum of [1:N-1]. then sum up all elementes in your array and subtract the above sum and you have your number.
This comes at the cost of O(n) and is not beatable.
However this only works with the preconditions you mentioned.
A more generic approach would be to sort the array and then simply walk through it. This would be O(n log(n)) and still better than your O(n²).
I you know the maximum number you may create a lookup table and init it with all zeros, walk through the array and check for one and mark the entries with one. The complexity is also just O(n) but at the expense of memory.
if the value range is unknown a simiar approach can be used but instead of using a lookup table a hashset canbe used.
Linear search will help you with complexity O(n):
final int n = ...;
final int a[] = createInput(n); // Expect each a[i] < n && a[i] >= 0
final int b[] = new int[n];
for (int i = 0; i < n; i++)
b[i]++;
for (int i = 0; i < n; i++)
if (b[i] >= 2)
return a[i];
throw new IllegalArgumentException("No duplicates found");
A possible solution is to sum all elements in the array and then to compute the sym of the integers up to N-1. After that subtract the two values and voila - you found your number. This is the solution proposed by vlad_tepesch and it is good, but has a drawback - you may overflow the integer type. To avoid this you can use 64 bit integer.
However I want to propose a slight modification - compute the xor sum of the integers up to N-1(that is compute 1^2^3^...(N-1)) and compute the xor sum of your array(i.e. a0^a1^...aN-1). After that xor the two values and the result will be the repeated element.

Efficiently determine the parity of a permutation

I have an int[] array of length N containing the values 0, 1, 2, .... (N-1), i.e. it represents a permutation of integer indexes.
What's the most efficient way to determine if the permutation has odd or even parity?
(I'm particularly keen to avoid allocating objects for temporary working space if possible....)
I think you can do this in O(n) time and O(n) space by simply computing the cycle decomposition.
You can compute the cycle decomposition in O(n) by simply starting with the first element and following the path until you return to the start. This gives you the first cycle. Mark each node as visited as you follow the path.
Then repeat for the next unvisited node until all nodes are marked as visited.
The parity of a cycle of length k is (k-1)%2, so you can simply add up the parities of all the cycles you have discovered to find the parity of the overall permutation.
Saving space
One way of marking the nodes as visited would be to add N to each value in the array when it is visited. You would then be able to do a final tidying O(n) pass to turn all the numbers back to the original values.
I selected the answer by Peter de Rivaz as the correct answer as this was the algorithmic approach I ended up using.
However I used a couple of extra optimisations so I thought I would share them:
Examine the size of data first
If it is greater than 64, use a java.util.BitSet to store the visited elements
If it is less than or equal to 64, use a long with bitwise operations to store the visited elements. This makes it O(1) space for many applications that only use small permutations.
Actually return the swap count rather than the parity. This gives you the parity if you need it, but is potentially useful for other purposes, and is no more expensive to compute.
Code below:
public int swapCount() {
if (length()<=64) {
return swapCountSmall();
} else {
return swapCountLong();
}
}
private int swapCountLong() {
int n=length();
int swaps=0;
BitSet seen=new BitSet(n);
for (int i=0; i<n; i++) {
if (seen.get(i)) continue;
seen.set(i);
for(int j=data[i]; !seen.get(j); j=data[j]) {
seen.set(j);
swaps++;
}
}
return swaps;
}
private int swapCountSmall() {
int n=length();
int swaps=0;
long seen=0;
for (int i=0; i<n; i++) {
long mask=(1L<<i);
if ((seen&mask)!=0) continue;
seen|=mask;
for(int j=data[i]; (seen&(1L<<j))==0; j=data[j]) {
seen|=(1L<<j);
swaps++;
}
}
return swaps;
}
You want the parity of the number of inversions. You can do this in O(n * log n) time using merge sort, but either you lose the initial array, or you need extra memory on the order of O(n).
A simple algorithm that uses O(n) extra space and is O(n * log n):
inv = 0
mergesort A into a copy B
for i from 1 to length(A):
binary search for position j of A[i] in B
remove B[j] from B
inv = inv + (j - 1)
That said, I don't think it's possible to do it in sublinear memory. See also:
https://cs.stackexchange.com/questions/3200/counting-inversion-pairs
https://mathoverflow.net/questions/72669/finding-the-parity-of-a-permutation-in-little-space
Consider this approach...
From the permutation, get the inverse permutation, by swapping the rows and
sorting according to the top row order. This is O(nlogn)
Then, simulate performing the inverse permutation and count the swaps, for O(n). This should give the parity of the permutation, according to this
An even permutation can be obtained as the composition of an even
number and only an even number of exchanges (called transpositions) of
two elements, while an odd permutation be obtained by (only) an odd
number of transpositions.
from Wikipedia.
Here's some code I had lying around, which performs an inverse permutation, I just modified it a bit to count swaps, you can just remove all mention of a, p contains the inverse permutation.
size_t
permute_inverse (std::vector<int> &a, std::vector<size_t> &p) {
size_t cnt = 0
for (size_t i = 0; i < a.size(); ++i) {
while (i != p[i]) {
++cnt;
std::swap (a[i], a[p[i]]);
std::swap (p[i], p[p[i]]);
}
}
return cnt;
}

Time Complexity for the given code.

Below code is to find how many times a number is shown in an array.
For Example:
1,2,2,2,3,4,5,5,5,5
number 2 = 3 times
number 5 = 4 times.
What is the time complexity in Java for the below code?
What is the best way to solve this problem in respect to Time complexity?
public static void main(String[]args)
{
int[] data = {1,1,2,3,4,4,4,5,6,7,8,8,8,8};
System.out.println(count(data,8));
}
public static int count(int[] a, int x)
{
int count=0;
int index=0;
while(index<a.length)
{
if(a[index]==x)
{
count++;
}
index++;
}
return count;
}
is it o(n) ,logN or else?
You look at every element once so it is O(n)
if it is o(n), can I can i do this task with LogN complicity?
Do a binary search with a low and high value searching for the value just less than the one you search for and just above. The difference between these two result will tell you how many there are. This is an O(Log N) search.
while(index<a.length)
This loop is run once per value in data. This is O(n). If you want to do this in O(log n) you will need a sorted array and a binary search. You have a sorted array, so you would just need to do a binary search.
The answer is O(n).
You loop through the array looking once at every index.
e.g. array size is 10 --> 10 comparisons, array size is 100 --> 100 comparisons and so on.
You examine each element of the array, and therefore your code has O(n) time complexity.
To do it in O(log n) time, you have to exploit the fact that the array is sorted. This can be done with a variant of binary search. Since this looks like homework, I'll let you think about it for a bit before offering any more hints.
O(n)
This code uses each element of the array.
while(index<a.length)
You can replace it on
for(int index = 0; index < a.length; i++)
O(n).When the code iterates through each and every element of the array,its 0(n).

Fastest strategy to form and sort an array of positive integers

In Java, what is faster: to create, fill in and then sort an array of ints like below
int[] a = int[1000];
for (int i = 0; i < a.length; i++) {
// not sure about the syntax
a[i] = Maths.rand(1, 500) // generate some random positive number less than 500
}
a.sort(); // (which algorithm is best?)
or insert-sort on the fly
int[] a = int[1000];
for (int i = 0; i < a.length; i++) {
// not sure about the syntax
int v = Maths.rand(1, 500) // generate some random positive number less than 500
int p = findPosition(a, v); // where to insert
if (a[p] == 0) a[p] = v;
else {
shift a by 1 to the right
a[p] = v;
}
}
There are many ways that you could do this:
Build the array and sort as you go. This is likely to be very slow, since the time required to move array elements over to make space for the new element will almost certainly dominate the sorting time. You should expect this to take at best Ω(n2) time, where n is the number of elements that you want to put into the array, regardless of the algorithm used. Doing insertion sort on-the-fly will take expected O(n2) time here.
Build the array unsorted, then sort it. This is probably going to be extremely fast if you use a good sorting algorithm, such as quicksort or radix sort. You should expect this to take O(n log n) time (for quicksort) or O(n lg U) time (for radix sort), where n is the number of values and U is the largest value.
Add the numbers incrementally to a priority queue, then dequeue all elements from the priority queue. Depending on how you implement the priority queue, this could be very fast. For example, using a binary heap here would cause this process to take O(n log n) time, and using a van Emde Boas tree would take O(n lg lg U) time, where U is the largest number that you are storing. That said, the constant factors here would likely make this approach slower than just sorting the values.
Hope this helps!

Categories