I have an array of N elements containing the integers 1 to N-1 (a sequence starting at 1 and running up to the maximum value N-1), which means exactly one number is repeated. I want to write an algorithm that returns this repeated element. I have found a solution, but it only works if the array is sorted, which may not be the case.
int i = 0;
while (i < A[i]) {
    i++;
}
int rep = A[i];
I do not know why RC removed his comment, but his idea was good.
With the knowledge of N you can easily calculate the sum of [1:N-1]. Then sum up all elements in your array, subtract the sum above, and you have your number.
This runs in O(n) time, which cannot be beaten.
However, this only works with the preconditions you mentioned.
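As a minimal sketch of the sum approach (method and variable names are mine, not from the original post):

static int findRepeated(int[] a) {
    int n = a.length;                        // the array holds 1..(n-1) plus one repeat
    long expected = (long) (n - 1) * n / 2;  // sum of 1..(n-1); long avoids overflow
    long actual = 0;
    for (int x : a)
        actual += x;
    return (int) (actual - expected);        // the surplus is the repeated value
}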
A more generic approach would be to sort the array and then simply walk through it. This would be O(n log(n)) and still better than your O(n²).
If you know the maximum number you may create a lookup table initialized with all zeros, walk through the array, and check each entry: if it is already marked with a one you have found the duplicate, otherwise mark it with a one. The complexity is also just O(n), but at the expense of memory.
If the value range is unknown, a similar approach can be used, but with a hash set instead of a lookup table.
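If you go the hash-set route, a sketch might look like this (names are mine; HashSet#add returns false when the element is already present):

import java.util.HashSet;
import java.util.Set;

static int findRepeatedHashed(int[] a) {
    Set<Integer> seen = new HashSet<>();
    for (int x : a) {
        if (!seen.add(x))   // add() returns false if x was already in the set
            return x;       // the first value seen twice is the duplicate
    }
    throw new IllegalArgumentException("no repeated element");
}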
Linear search will help you with complexity O(n):
final int n = ...;
final int a[] = createInput(n); // Expect each a[i] < n && a[i] >= 0
final int b[] = new int[n];
for (int i = 0; i < n; i++)
b[a[i]]++; // count occurrences of the value a[i]
for (int i = 0; i < n; i++)
if (b[i] >= 2)
return i; // the value i occurs at least twice
throw new IllegalArgumentException("No duplicates found");
A possible solution is to sum all elements in the array and then to compute the sum of the integers up to N-1. After that subtract the two values and voila - you found your number. This is the solution proposed by vlad_tepesch and it is good, but has a drawback - you may overflow the integer type. To avoid this you can use a 64 bit integer.
However I want to propose a slight modification - compute the xor sum of the integers up to N-1 (that is, compute 1^2^3^...^(N-1)) and compute the xor sum of your array (i.e. a0^a1^...^aN-1). After that xor the two values and the result will be the repeated element.
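A sketch of the xor variant (names are mine; no overflow risk and O(1) extra space):

static int findRepeatedXor(int[] a) {
    int n = a.length;                  // values are 1..(n-1), one of them twice
    int x = 0;
    for (int v : a)
        x ^= v;                        // xor of all array elements
    for (int i = 1; i <= n - 1; i++)
        x ^= i;                        // xor of 1..(n-1); everything cancels except the duplicate
    return x;
}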
Here is a sorting algorithm, not a clever one. In this version, it works well when elements are non-negative and occur at most once. I'm confused about its time complexity: is it O(n)? If so, is it better than quicksort in terms of that notation? Thanks. Here is the code:
public int[] stupidSort( int[] array ){
// Variables
int max = array[0];
int index = 0;
int[] lastArray = new int[array.length];
// Find max element in input array
for( int i = 0; i < array.length; i++ ){
if ( array[i] > max )
max = array[i];
}
// Create a new array. In this array, element n will represent number of n's in input array
int[] newArray = new int[max + 1];
for ( int j = 0; j < array.length; j++ )
newArray[array[j]]++;
// If element is bigger than 0, it means that number occured in input. So put it in output array
for( int k = 0; k < newArray.length; k++ ){
if( newArray[k] > 0 )
lastArray[index++] = k;
}
return lastArray;
}
What you wrote is the counting sort, and it does have O(n) complexity. However, it cannot be compared to QuickSort directly, because QuickSort is an algorithm based on comparisons. The two algorithms belong to different categories (yours is a non-comparison sort, QuickSort is a comparison sort). Your algorithm (counting sort) assumes that the range of numbers in the array is known and that all numbers are integers, whereas QuickSort works for any comparable values.
You can learn more about sorting algorithms here. That link shows the complexity of sorting algorithms divided into the two categories: comparison and non-comparison.
EDIT
As Paul Hankin pointed out the complexity isn't always O(n). It is O(n+k) where k is the max of the input array. Quoted below is the time complexity as explained in the wikipedia article for the counting sort:
Because the algorithm uses only simple for loops, without recursion or subroutine calls, it is straightforward to analyze. The initialization of the count array, and the second for loop which performs a prefix sum on the count array, each iterate at most k + 1 times and therefore take O(k) time. The other two for loops, and the initialization of the output array, each take O(n) time. Therefore, the time for the whole algorithm is the sum of the times for these steps, O(n + k).
The given algorithm is very similar to counting sort, whereas QuickSort is a comparison-based sorting algorithm. QuickSort gives O(n^2) time complexity only in the worst case; otherwise it is O(n log n). Also, QuickSort is usually used in its randomized version, where the pivot is selected randomly, so the worst case is avoided most of the time.
Edit: Correction as pointed by paulhankin in comment complexity=O(n+k):
The code you have put forth uses counting-based sorting, that is, counting sort, and your code's time complexity is O(n+k). But you must realize that this algorithm depends on the range of the input, and that range could be anything. Furthermore, this algorithm is not in-place, but it is stable. In many cases the data you want to sort is not just integers; it can be records carrying a key, to be sorted by that key. If a stable algorithm is not used in such cases, sorting can be problematic.
Just in case someone does not know:
In-place algorithm: one in which the additional space required does not depend on the size of the input.
Stable algorithm: one in which, for example, if there were two 5's in the data set before sorting, the 5 that came first before sorting still comes before the other one after sorting.
EDIT (regarding aladinsane7's comment): Yes, the formal version of counting sort handles this aspect as well. It would be good to have a look at counting sort. Its time complexity is O(n+k), where k is the range of the data and n is the size of the input.
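For reference, a sketch of the stable counting sort that the quoted passage describes, with the prefix-sum step (assuming non-negative int keys; names are mine):

static int[] countingSort(int[] a) {
    int max = 0;
    for (int x : a)
        max = Math.max(max, x);      // k = maximum key
    int[] count = new int[max + 1];
    for (int x : a)
        count[x]++;                  // histogram, O(n)
    for (int i = 1; i <= max; i++)
        count[i] += count[i - 1];    // prefix sums: count[v] = #elements <= v, O(k)
    int[] out = new int[a.length];
    for (int i = a.length - 1; i >= 0; i--)
        out[--count[a[i]]] = a[i];   // walk backwards so equal keys keep their order
    return out;
}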
Good evening. I have an array in Java with n integer numbers. I want to check if there is a subset of size k of the entries that satisfies the condition:
The sum of those k entries is a multiple of m.
How can I do this as efficiently as possible? There are n!/(k!(n-k)!) subsets that I would otherwise need to check.
You can use dynamic programming. The state is (prefix length, sum modulo m, number of elements in the subset). The transitions are obvious: we either add one more number (increasing the number of elements in the subset and computing the new sum modulo m), or we just increase the prefix length (not adding the current number). If you only need a yes/no answer, you can store just the last layer of values and apply bit optimizations to compute the transitions faster. The time complexity is O(n * m * k), or about n * m * k / 64 operations with bit optimizations. The space complexity is O(m * k). It looks feasible for a few thousand elements. By bit optimizations I mean using things like bitset in C++ that can perform an operation on a group of bits at the same time using bitwise operations.
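A sketch of that DP in Java, storing only the last layer and skipping the bit optimizations (names are mine):

// dp[c][s] == true means some subset of the elements seen so far
// has exactly c elements and sum s modulo m.
static boolean hasSubset(int[] a, int k, int m) {
    boolean[][] dp = new boolean[k + 1][m];
    dp[0][0] = true;                          // the empty subset
    for (int x : a) {
        int r = ((x % m) + m) % m;            // value modulo m, safe for negatives
        for (int c = k - 1; c >= 0; c--)      // counts downwards: each element used once
            for (int s = 0; s < m; s++)
                if (dp[c][s])
                    dp[c + 1][(s + r) % m] = true;
    }
    return dp[k][0];                          // k elements summing to 0 mod m
}

This keeps only yes/no reachability, so it answers the question but does not reconstruct the subset; the time is O(n * m * k) as stated above.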
I don't like this solution, but it may work for your needs
public boolean containsSubset(int[] a, int currentIndex, int currentSum, int depth, int divisor, int maxDepth) {
    // you could make a, maxDepth, and divisor static as well
    // If depth equals maxDepth, our subset has k elements; in addition the sum of the
    // elements must be divisible by our divisor, m.
    // If this condition is satisfied, then there exists a subset of size k whose sum is divisible by m.
    if (depth == maxDepth && currentSum % divisor == 0)
        return true;
    // If the depth is greater than or equal to maxDepth, our subset has k or more elements, so
    // adding more elements cannot satisfy the necessary conditions;
    // additionally we know that if it contained k elements and was divisible by m, it would have satisfied the condition above.
    if (depth >= maxDepth)
        return false;
    // boolean to be returned, initialized to false because we have not found any sets yet
    boolean ret = false;
    // iterate through all remaining elements of our array
    for (int i = currentIndex + 1; i < a.length; i++) {
        // this may be an optimization of this line:
        // for (int i = currentIndex + 1; i < a.length - maxDepth + depth; i++) {
        // by recursing, we add a[i] to our set; we then OR together the results for all subsets that
        // could be constructed from the numbers we have so far, so that if any of them satisfies our
        // condition (returns true), the variable ret will be true
        ret |= containsSubset(a, i, currentSum + a[i], depth + 1, divisor, maxDepth);
    } // end for
    // return whether any set of numbers constructed from the numbers so far satisfied the condition
    return ret;
}
Then invoke this method as such
//this invokes our method with no numbers added to our subset so far, so it will try
//all combinations of the other elements to determine whether the condition is satisfied
boolean answer = containsSubset(myArray, -1, 0, 0, m, k);
EDIT:
You could probably optimize this by taking everything modulo (%) m and deleting repeats. For examples with large values of n and/or k, but small values of m, this could be a pretty big optimization.
EDIT 2:
The above optimization I listed isn't helpful. You may need the repeats to get the correct information. My bad.
Happy Coding! Let me know if you have any questions!
If numbers have lower and upper bounds, it might be better to:
Iterate all multiples of m where lower_bound * k < multiple < upper_bound * k.
Check if there is a subset with sum multiple in the array (see Subset Sum problem) using dynamic programming.
Complexity is O(k^2 * (lower_bound + upper_bound)^2). This approach can be optimized further, I believe with careful thinking.
Otherwise you can enumerate all subsets of size k; there are C(n, k) of them, which is exponential in the worst case. Using backtracking (pseudocode-ish):
function find_subsets(array, k, index, current_subset):
if current_subset.size = k:
add current_subset to your solutions list
return
if index = array.size:
return
number := array[index]
add number to current_subset
find_subsets(array, k, index + 1, current_subset)
remove number from current_subset
find_subsets(array, k, index + 1, current_subset)
I have an int[] array of length N containing the values 0, 1, 2, .... (N-1), i.e. it represents a permutation of integer indexes.
What's the most efficient way to determine if the permutation has odd or even parity?
(I'm particularly keen to avoid allocating objects for temporary working space if possible....)
I think you can do this in O(n) time and O(n) space by simply computing the cycle decomposition.
You can compute the cycle decomposition in O(n) by simply starting with the first element and following the path until you return to the start. This gives you the first cycle. Mark each node as visited as you follow the path.
Then repeat for the next unvisited node until all nodes are marked as visited.
The parity of a cycle of length k is (k-1)%2, so you can simply add up the parities of all the cycles you have discovered to find the parity of the overall permutation.
Saving space
One way of marking the nodes as visited would be to add N to each value in the array when it is visited. You would then be able to do a final tidying O(n) pass to turn all the numbers back to the original values.
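To illustrate the idea, a sketch with an explicit visited array (names are mine; the +N marking trick above avoids this allocation):

static boolean isEvenPermutation(int[] p) {
    boolean[] visited = new boolean[p.length];
    int swaps = 0;
    for (int i = 0; i < p.length; i++) {
        if (visited[i]) continue;
        int len = 0;
        for (int j = i; !visited[j]; j = p[j]) {  // follow the cycle containing i
            visited[j] = true;
            len++;
        }
        swaps += len - 1;      // a cycle of length k contributes parity (k-1) % 2
    }
    return swaps % 2 == 0;
}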
I selected the answer by Peter de Rivaz as the correct answer as this was the algorithmic approach I ended up using.
However I used a couple of extra optimisations so I thought I would share them:
Examine the size of data first
If it is greater than 64, use a java.util.BitSet to store the visited elements
If it is less than or equal to 64, use a long with bitwise operations to store the visited elements. This makes it O(1) space for many applications that only use small permutations.
Actually return the swap count rather than the parity. This gives you the parity if you need it, but is potentially useful for other purposes, and is no more expensive to compute.
Code below:
public int swapCount() {
if (length()<=64) {
return swapCountSmall();
} else {
return swapCountLong();
}
}
private int swapCountLong() {
int n=length();
int swaps=0;
BitSet seen=new BitSet(n);
for (int i=0; i<n; i++) {
if (seen.get(i)) continue;
seen.set(i);
for(int j=data[i]; !seen.get(j); j=data[j]) {
seen.set(j);
swaps++;
}
}
return swaps;
}
private int swapCountSmall() {
int n=length();
int swaps=0;
long seen=0;
for (int i=0; i<n; i++) {
long mask=(1L<<i);
if ((seen&mask)!=0) continue;
seen|=mask;
for(int j=data[i]; (seen&(1L<<j))==0; j=data[j]) {
seen|=(1L<<j);
swaps++;
}
}
return swaps;
}
You want the parity of the number of inversions. You can do this in O(n * log n) time using merge sort, but either you lose the initial array, or you need extra memory on the order of O(n).
A simple algorithm that uses O(n) extra space and is O(n * log n):
inv = 0
mergesort A into a copy B
for i from 1 to length(A):
binary search for position j of A[i] in B
remove B[j] from B
inv = inv + (j - 1)
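In Java, the usual way to get the O(n * log n) bound is to count inversions during the merge step of merge sort; a sketch (method names are mine):

static long countInversions(int[] a) {
    return sortAndCount(a.clone(), new int[a.length], 0, a.length - 1);
}

static long sortAndCount(int[] a, int[] tmp, int lo, int hi) {
    if (lo >= hi) return 0;
    int mid = (lo + hi) >>> 1;
    long inv = sortAndCount(a, tmp, lo, mid) + sortAndCount(a, tmp, mid + 1, hi);
    int i = lo, j = mid + 1, k = lo;
    while (i <= mid && j <= hi) {
        if (a[i] <= a[j]) {
            tmp[k++] = a[i++];
        } else {
            inv += mid - i + 1;  // a[i..mid] all exceed a[j]: that many inversions
            tmp[k++] = a[j++];
        }
    }
    while (i <= mid) tmp[k++] = a[i++];
    while (j <= hi) tmp[k++] = a[j++];
    System.arraycopy(tmp, lo, a, lo, hi - lo + 1);
    return inv;
}

The parity of the returned count is the parity of the permutation, and the clone() keeps the input intact at the cost of the O(n) extra memory mentioned.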
That said, I don't think it's possible to do it in sublinear memory. See also:
https://cs.stackexchange.com/questions/3200/counting-inversion-pairs
https://mathoverflow.net/questions/72669/finding-the-parity-of-a-permutation-in-little-space
Consider this approach...
From the permutation, get the inverse permutation by swapping the rows and sorting according to the top-row order. This is O(n log n).
Then, simulate performing the inverse permutation and count the swaps, in O(n). This should give the parity of the permutation, according to this
An even permutation can be obtained as the composition of an even
number and only an even number of exchanges (called transpositions) of
two elements, while an odd permutation can be obtained by (only) an odd
number of transpositions.
from Wikipedia.
Here's some code I had lying around which performs an inverse permutation; I just modified it a bit to count swaps. You can remove all mention of a; p contains the inverse permutation.
size_t
permute_inverse (std::vector<int> &a, std::vector<size_t> &p) {
size_t cnt = 0;
for (size_t i = 0; i < a.size(); ++i) {
while (i != p[i]) {
++cnt;
std::swap (a[i], a[p[i]]);
std::swap (p[i], p[p[i]]);
}
}
return cnt;
}
How to pick exactly k bits from a Java BitSet of length m with n bits turned on, where k≤n≤m?
Example input: m=20, n=11
Example output: k=3
The naive approach
Choose a random number 0 ≤ i ≤ m-1. If bit i is set in the input and not yet set in the output, set it in the output; repeat until k bits are set in the output.
This approach fails when n is much smaller than m. Any other ideas?
You could scan the set from the first bit to the last, and apply reservoir sampling to the bits that are set.
The algorithm has O(m) time complexity, and requires O(k) memory.
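A sketch of that approach in Java (names are mine):

import java.util.BitSet;
import java.util.Random;

static BitSet pickKBits(BitSet b, int k, Random rand) {
    int[] reservoir = new int[k];
    int seen = 0;                            // set bits encountered so far
    for (int i = b.nextSetBit(0); i >= 0; i = b.nextSetBit(i + 1)) {
        if (seen < k) {
            reservoir[seen] = i;             // fill the reservoir first
        } else {
            int r = rand.nextInt(seen + 1);  // keep bit i with probability k/(seen+1)
            if (r < k)
                reservoir[r] = i;
        }
        seen++;
    }
    BitSet out = new BitSet(b.length());
    for (int i = 0; i < k; i++)
        out.set(reservoir[i]);
    return out;
}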
How about finding the positions of all n set bits and placing them in a collection as a first step, and then choosing k positions from that collection at random?
If the constraints allow it you can solve the task by:
Construct a List holding the indexes of all set bits. Call Collections#shuffle on it. Choose the first k indexes from the shuffled list.
EDIT As per the comments, this algorithm can be inefficient if k is really small while n is big. Here is an alternative: generate k random, distinct numbers in the interval [0, n). If a generated number is already present in the set of chosen indices, use the chaining approach: increase the number by 1 until you get a number that is not yet present in the set. Finally, the generated indices are the ones you choose among the set bits.
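A sketch of the chaining variant (names are mine; the wrap-around modulo is my addition so the probe never runs past n):

import java.util.HashSet;
import java.util.Random;
import java.util.Set;

static Set<Integer> pickKPositions(int n, int k, Random rand) {
    Set<Integer> chosen = new HashSet<>();   // indices into the list of set-bit positions
    for (int i = 0; i < k; i++) {
        int pos = rand.nextInt(n);
        while (!chosen.add(pos))             // collision: probe the next position
            pos = (pos + 1) % n;
    }
    return chosen;
}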
If n is much larger than k, you can just pare down the Fisher-Yates shuffle algorithm to stop after you've chosen as many as you need:
private static Random rand = new Random();
public static BitSet chooseBits(BitSet b, int k) {
int n = b.cardinality();
int[] indices = new int[n];
// collect indices:
for (int i = 0, j = 0; i < n; i++) {
j=b.nextSetBit(j);
indices[i] =j++;
}
// create returning set:
BitSet ret = new BitSet(b.size());
// choose k bits:
for (int i = 0; i<k; i++) {
//The first n-i elements are still available.
//We choose one:
int pick = rand.nextInt(n-i);
//We add it to our returning set:
ret.set(indices[pick]);
//Then we replace it with the current (n-i)th element
//so that, when i is incremented, the
//first n-i elements are still available:
indices[pick] = indices[n-i-1];
}
return ret;
}
private void findLDS() {
Integer[] array = Arrays.copyOf(elephants.iq, elephants.iq.length);
Hashtable<Integer, Integer> eq = elephants.elephantiqs;
Integer[] lds = new Integer[array.length];
Integer[] prev= new Integer[array.length];
lds[0] = 0;
prev[0] = 0;
int maxlds = 1, ending=0;
for(int i = 0; i < array.length; ++i) {
lds[i] = 1;
prev[i] = -1;
for (int j = i; j >= 0; --j) {
if(lds[j] + 1 > lds[i] && array[j] > array[i] && eq.get(array[j]) < eq.get(array[i])) {
lds[i] = lds[j]+1;
prev[i] = j;
}
}
if(lds[i] > maxlds) {
ending = i;
maxlds = lds[i];
}
}
System.out.println(maxlds);
for(int i = ending; i >= 0; --i) {
if(prev[i] != -1) {
System.out.println(eq.get(array[prev[i]]));
}
}
}
I have based this algorithm on this SO question. This code is trying to find the longest decreasing subsequence instead of increasing. array[] is sorted in descending order, and I also have a hashtable with the elephants' IQs as keys for their weights.
I'm having a hard time properly understanding DP, and I need some help.
My algorithm seems to work fine besides tracking the chosen sequence in prev[], where it always misses one element. Does anyone know how to do this?
A few ways to approach this one:
1. Sort by weight in decreasing order, then find the longest increasing subsequence.
2. Sort by IQ in decreasing order, then find the longest increasing subsequence of weights.
3. and 4. are just (1) and (2), switching the words "increasing" and "decreasing".
If you don't understand the DP for longest increasing subsequence O(N^2), it's basically this:
Since the list has to be strictly increasing/decreasing anyway, you can just eliminate some elephants beforehand to make the set unique.
Create an array, which I will call llis (standing for "Longest Increasing Subsequence"), of length N, the number of elephants there now are. Create another array called next with the same length. I will assume the sorted list of elephants is called array, as it is in your problem statement.
Assuming that you've already sorted the elephants in decreasing order, you will want to find the longest increasing subsequence of IQs.
Tell yourself that the element in the array llis at index n (a different "n", with n < N) will be the length of the longest increasing subsequence for the sub-array of array from index 0 to n, inclusive. Also say that the element in the next array at index n will be the next index in array along the longest increasing subsequence.
Therefore, finding the length of the longest increasing subsequence in the "sub-array" of 0 to N - 1 inclusive, which is also the whole array, would only require you to find the N - 1 th element in the array llis after the DP calculations, and finding the actual subsequence would simplify to following the indices in the next array.
Now that you know what you're looking for, you can proceed with the algorithm. At index n in the array, how do you know what the longest increasing subsequence is? Well, if you've calculated the length of the longest increasing subsequence and the last value in the subsequences for every k < n, you can try adding the elephant at index n to the longest increasing subsequence ending at k if the IQ of the elephant n is higher than the IQ of the elephant at k. In this case, the length of the longest increasing subsequence ending at elephant n would be llis[k] + 1. (Also, remember to set next[k] to be n, since the next elephant in the increasing subsequence will be the one at n.)
We've found the DP relation llis[n] = max(llis[n], llis[k] + 1), taken over all k that come strictly before n. Just process the values of n in the right order (linearly) and you should get the correct result.
Procedure/warnings:
1) Process n in order from 0 to N - 1.
2) For every n, process k in order from n - 1 to 0, because you want to minimize the k that you choose.
3) After you're done processing, make sure to find the maximum number in the array llis to get your final result.
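To make the increasing version concrete, here is a sketch of the DP just described, returning only the length (reconstruction via the next array is omitted; names are mine):

static int longestIncreasing(int[] a) {
    int N = a.length;
    int[] llis = new int[N];
    int best = 0;
    for (int n = 0; n < N; n++) {
        llis[n] = 1;                         // the element by itself
        for (int k = 0; k < n; k++)
            if (a[k] < a[n] && llis[k] + 1 > llis[n])
                llis[n] = llis[k] + 1;       // extend the subsequence ending at k
        best = Math.max(best, llis[n]);
    }
    return best;
}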
Since this is tagged as homework, I won't explicitly say how to modify this to find the longest decreasing subsequence, but I hope my explanation has helped with your understanding of DP. It should be easy to figure out the decreasing version on your own, if you choose to use it. (Note that this problem can be solved using the increasing version, as described in approaches 1 or 2.)