I saw an interview question as follows:
One number in an array is duplicated. Find it.
A simple solution is as follows:
for (int i = 0; i < n; i++) {
    boolean dup = false;
    for (int j = 0; j < n; j++) {
        if (i != j && a[i] == a[j]) {
            dup = true;
        }
        if (dup == true)
            return a[i];
    }
}
But I want to implement it in O(n log n) and in O(n) time. How can I do it?
Sort the array first (that can be done in O(n log n)); then the comparison only has to be done for adjacent elements. Or just put the array elements into a hash table and stop when you find the first key with an existing entry.
I'm answering "Finding duplicate element in an array?"
You search for i and j from 0 to < n, and later check for j != i. Instead you could form your loops like this:
for (int i = 0; i < n - 1; i++)
{
    for (j = i + 1; j < n; j++)
    {
        if (a[i] == a[j])
        {
            return i;
        }
    }
}
return -1;
Repeatedly setting dup = false is pointless: either dup is still false, or it was true, in which case you already left the code with 'return'.
Writing the previous answers in actual code (Java):
O(n log n) time:
Arrays.sort(arr);
for (int i = 1; i < arr.length; i++)
    if (arr[i] == arr[i - 1])
        return arr[i];
throw new IllegalStateException(); // error: no duplicate
O(n) time:
Set<Integer> set = new HashSet<Integer>();
for (int i = 0; i < arr.length; i++) {
    if (set.contains(arr[i]))
        return arr[i];
    set.add(arr[i]);
}
throw new IllegalStateException(); // error: no duplicate
For reference: java.util.TreeSet is backed by a red-black tree, so a TreeSet-based solution is O(n log n).
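As a sketch of that TreeSet-based approach (the class and method names here are mine, not from the thread):

```java
import java.util.TreeSet;

public class TreeSetDup {
    // Returns the first duplicate found, or throws if none exists.
    static int firstDuplicate(int[] arr) {
        TreeSet<Integer> seen = new TreeSet<>(); // red-black tree: O(log n) add
        for (int v : arr) {
            if (!seen.add(v)) { // add returns false if the value was already present
                return v;
            }
        }
        throw new IllegalArgumentException("no duplicate");
    }

    public static void main(String[] args) {
        System.out.println(firstDuplicate(new int[]{3, 1, 4, 1, 5})); // prints 1
    }
}
```

n adds at O(log n) each gives the O(n log n) total mentioned above.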
I recommend using a hash map (assuming no collisions) to solve it:
private boolean hasDuplicate(int[] arr) {
    Map<Integer, Boolean> map = new HashMap<>();
    // find the duplicate element from an array using map
    for (int i = 0; i < arr.length; i++) {
        if (map.containsKey(arr[i])) {
            return true;
        } else {
            map.put(arr[i], true);
        }
    }
    return false;
}
Time complexity : O(n)
Space complexity : O(n)
Another approach is to sort and compare adjacent elements, but the sorting adds extra overhead.
Using collections, we can go with the below code snippet:
Set<String> set = new HashSet<String>();
for (String arrayElement : arr) {
    if (!set.add(arrayElement)) {
        System.out.println("Duplicate Element is : " + arrayElement);
    }
}
Find an O(n)-complexity solution below:
int ar[] = {0, 1, 2, 3, 0, 2, 3, 1, 0, 2};
Set<Integer> mySet = new HashSet<>();
for (int n : ar) {
    if (!mySet.add(n)) {
        System.out.println(" " + n);
    }
}
And another approach with smaller space overhead (the sort is done in place), at O(n log n) time:
public void duplicateElementSolution(int ar[]) {
    Arrays.sort(ar);
    for (int i = 0; i < (ar.length - 1); i++) {
        if (ar[i] == ar[i + 1]) {
            System.out.println(" " + ar[i]);
        }
    }
}
(The question in its current form is a little confusing - my answer is assuming that the question is about finding two numbers in an array that sum to a given value)
Since the given array is unsorted, I am assuming that we are not allowed to sort the array (i.e. the given order of the array cannot be changed).
The simplest solution IMHO is to iterate over each number x and check if I-x occurs anywhere in the array. This is essentially what your O(n^2) solution does.
This can be brought down to O(n) or O(nlogn) by making the search faster using some sort of fast set data structure. Basically, as we iterate over the array, we query to see if I-x occurs in the set.
Code (in Python):
l = [1, 2, 3, 4, 5, 6, 7, 8, 9]
seen = set()
I = 11
for item in l:
    if I - item in seen:
        print("(%d,%d)" % (item, I - item))
    seen.add(item)
The complexity of the solution depends on the insert/lookup complexity of the set data structure that you use. A hashtable based implementation has a O(1) complexity so it gives you a O(n) algorithm, while a tree based set results in a O(nlogn) algorithm.
Edit:
The equivalent data structure to Python's set would be std::set in C++ and TreeSet/HashSet in Java. The line I-x in seen would translate to seen.contains(I-x) in Java and seen.find(I-x) != seen.end() in C++.
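For reference, a Java equivalent of the Python snippet above might look like this (the class and method names are my own; the pairs are collected into a list so the caller can inspect them):

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class PairSum {
    // Returns each pair "(x,target-x)" in the order found, mirroring the Python version.
    static List<String> findPairs(int[] arr, int target) {
        List<String> found = new ArrayList<>();
        Set<Integer> seen = new HashSet<>(); // O(1) expected contains/add
        for (int item : arr) {
            if (seen.contains(target - item)) {
                found.add("(" + item + "," + (target - item) + ")");
            }
            seen.add(item);
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(findPairs(new int[]{1, 2, 3, 4, 5, 6, 7, 8, 9}, 11));
        // [(6,5), (7,4), (8,3), (9,2)]
    }
}
```

Swapping the HashSet for a TreeSet turns this into the O(n log n) variant.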
Related
I have a quick question about Complexity. I have this code in Java:
pairs is a HashMap that contains an Integer as a key and its frequency in a Collection<Integer> as a value. So:
pairs = new HashMap<Integer, Integer>(); // key: number, value: numberFrequency
Then I want to find the matching Pairs (a,b) that verify a + b == targetSum.
for (int i = 0; i < pairs.getCapacity(); i++) { // Complexity : O(n)
if (pairs.containsKey(targetSum - i) && targetSum - i == i) {
for (int j = 1; j < pairs.get(targetSum - i); j++) {
collection.add(new MatchingPair(targetSum - i, i));
}
}
}
I know that the complexity of the first for loop is O(n), but the second for loop only runs a small number of times (the frequency of the number minus 1). Do we still count it as O(n), making this whole portion of code O(n^2)? If so, does anyone have an alternative to make it O(n)?
It's O(n) if 'pairs.getCapacity()' or 'pairs.get(targetSum - i)' is a constant you know beforehand. Otherwise, two loops, one nested in the other, is generally O(n^2).
You can consider that for the worst case your complexity is O(n^2).
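One way to get O(n) overall, as a sketch: iterate over the map's entries instead of the capacity, and compute the number of matching pairs arithmetically instead of looping over each frequency (this counts pairs rather than materializing each MatchingPair; the names are illustrative, not from the question):

```java
import java.util.HashMap;
import java.util.Map;

public class MatchingPairs {
    // Counts (a, b) pairs with a + b == target, in O(n) over distinct keys.
    static int countPairs(Map<Integer, Integer> pairs, int target) {
        int count = 0;
        for (Map.Entry<Integer, Integer> e : pairs.entrySet()) {
            int a = e.getKey();
            int b = target - a;
            Integer freqB = pairs.get(b); // O(1) expected lookup
            if (freqB == null) continue;
            if (a < b) {
                count += e.getValue() * freqB;                  // distinct values: all combinations
            } else if (a == b) {
                count += e.getValue() * (e.getValue() - 1) / 2; // same value: choose 2
            } // the a > b case is counted when the loop reaches b
        }
        return count;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> pairs = new HashMap<>();
        pairs.put(1, 2); // two 1s
        pairs.put(4, 1); // one 4
        pairs.put(2, 1); // one 2
        System.out.println(countPairs(pairs, 5)); // 2: the two (1,4) combinations
    }
}
```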
I solved a variation of the knapsack problem by backtracking over all of the possible solutions. Basically, 0 means the item is not in the backpack, 1 means the item is in the backpack. Cost is the value of all items in the backpack; we are trying to achieve the lowest value possible while having items of every "class". Each time a combination of all classes is found, I calculate the value of all items and, if it's lower than globalBestValue, I save the value. I do this in verify().
Now I'm trying to optimize my recursive backtracking. My idea was to iterate over my array as it's being generated and return from the generator if the "cost" of my generated numbers is already higher than my current best value; the combination currently being generated then can't be the new best value and can be skipped.
However with my optimization, my backtrack is not generating all the values and it actually skips the "best" value I'm trying to find. Could you tell me where the problem is?
private int globalBestValue = Integer.MAX_VALUE;
private int[] arr;

public KnapSack(int numberOfItems) {
    arr = new int[numberOfItems];
}

private void generate(int fromIndex) {
    int currentCost = 0; // my optimisation starts here
    for (int i = 0; i < arr.length; i++) {
        if (currentCost > globalBestValue) {
            return;
        }
        if (arr[i] == 1) {
            currentCost += allCosts.get(i);
        }
    } // ends here
    if (fromIndex == arr.length) {
        verify();
        return;
    }
    for (int i = 0; i <= 1; i++) {
        arr[fromIndex] = i;
        generate(fromIndex + 1);
    }
}

public void verify() {
    // skipped the code verifying the arr if it's correct, it's long and not relevant
    if (isCorrect == true && currentValue < globalBestValue) {
        globalBestValue = currentValue;
    } else {
        return;
    }
}
Pardon my bluntness, but your efforts at optimizing an inefficient algorithm can only be described as polishing the turd. You will not solve a knapsack problem of any decent size by brute force, and early return isn't enough. I have mentioned one approach to writing an efficient program on CodeReview SE; it requires a considerable effort, but you gotta do what you gotta do.
Having said that, I'd recommend you write the arr to console in order to troubleshoot the sequence. It looks like when you go back to the index i-1, the element at i remains set to 1, and you estimate the upper bound instead of the lower one. The following change might work: replace your code
for (int i = 0; i <= 1; i++) {
    arr[fromIndex] = i;
    generate(fromIndex + 1);
}
with
arr[fromIndex] = 1;
generate(fromIndex + 1);
arr[fromIndex] = 0;
generate(fromIndex + 1);
This turns it into a sort of greedy algorithm: instead of starting with 0000000, you effectively start with 1111111. And obviously, when you store the globalBestValue, you should store the actual data which gives it. But the main advice is: when your algorithm behaves weirdly, tracing is your friend.
The classical Two Sum problem is described in LeetCode.
I know how to solve it with a hash table, which results in O(n) extra space. Now I want to solve it with O(1) space, so I'll first sort the array and then use two pointers to find the two integers, as shown in the (incorrect) code below.
public int[] twoSum(int[] numbers, int target) {
    java.util.Arrays.sort(numbers);
    int start = 0, end = numbers.length - 1;
    while (start < end) {
        if (numbers[start] + numbers[end] < target) {
            start++;
        }
        else if (numbers[start] + numbers[end] > target) {
            end--;
        }
        else {
            int[] result = {start + 1, end + 1};
            return result;
        }
    }
    return null;
}
This code is incorrect: I'm returning the indices after sorting. So how will I keep track of the original indices of the selected integers? Or are there other O(1) space solutions? Thank you.
If you are only worried about space complexity, and not the time complexity, then you don't need to sort. That way, the whole issue of keeping track of original indices goes away.
int[] twoSum(int[] numbers, int target) {
    for (int i = 0; i < numbers.length - 1; i++) {
        for (int j = i + 1; j < numbers.length; j++) {
            if (numbers[i] + numbers[j] == target)
                return new int[]{i + 1, j + 1};
        }
    }
    return null;
}
If you want to return all such pairs, not just the first one, then just continue with the iterations instead of returning immediately (of course, the return type will have to change to a list or 2-d array or ... ).
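The all-pairs variant described above could be sketched like this (the list-of-arrays return type is just one of the options mentioned; the class name is my own):

```java
import java.util.ArrayList;
import java.util.List;

public class TwoSumAllPairs {
    // Collects every index pair (1-based, as in the original) whose values sum to target.
    static List<int[]> allPairs(int[] numbers, int target) {
        List<int[]> result = new ArrayList<>();
        for (int i = 0; i < numbers.length - 1; i++) {
            for (int j = i + 1; j < numbers.length; j++) {
                if (numbers[i] + numbers[j] == target) {
                    result.add(new int[]{i + 1, j + 1}); // keep scanning instead of returning
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<int[]> pairs = allPairs(new int[]{1, 2, 3, 4}, 5);
        System.out.println(pairs.size()); // 2: (1,4) and (2,3) by index
    }
}
```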
There are certain limits what can be achieved and what can't be. There are some parameters that depend on each other. Time & Space complexities are two such parameters when it comes to algorithms.
If you optimize a problem for space, it will increase the time complexity in most cases, except in some special circumstances.
In this problem, if you don't want to increase the space complexity and want to preserve the original indices, the only way is to not sort the array, take every combination of two numbers from the array, and check whether their sum is your target. The code becomes something like the below.
for (int i = 0; i < n; i++)
{
    for (int j = i + 1; j < n; j++)
    {
        if (arr[i] + arr[j] == target)
        {
            int[] result = {i, j};
            return result;
        }
    }
}
As you can see, this is obviously an O(n^2) algorithm. Even in the program you have written, the sorting alone is O(n log n).
So, the bottom line is if you want to reduce space complexity, it increases time complexity.
I have an array A of n integers. I also have an array B of k (k < n) integers. What I need is for any integer from array A that appears in array B to be increased by 3.
If I go with the most obvious way, I get to n*k complexity.
Array A cannot (must not) be sorted.
Is there a more efficient way of achieving this?
Is there a more efficient way of achieving this?
Yes: put the elements of B into a HashSet. Loop over A and, if the element you're on is contained in the set, increase it by 3. This will have O(n + k) complexity.
For instance:
Set<Integer> bSet = new HashSet<>(B.length);
for (int b : B) // O(k)
    bSet.add(b);
for (int i = 0; i < A.length; i++) { // O(n)
    if (bSet.contains(A[i]))
        A[i] += 3;
}
If your integers are in a range that lets you create an array as long as the greatest value (for instance 0 <= A[i], B[i] < 65536), then you can do this:
boolean[] contains = new boolean[65536];
for (int i = 0; i < k; i++) {
    contains[B[i]] = true;
}
for (int i = 0; i < n; i++) {
    if (contains[A[i]]) {
        A[i] += 3;
    }
}
Which is O(n + k)
If array B can be sorted, then the solution is obvious: sort it, and then "contains" can be optimized to log2(k), so your complexity will be N*log2(k).
If you cannot sort array B, then the only option is the straightforward N*K.
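The sorted-B variant above could be sketched as follows. Note that only a copy of B is sorted, so A keeps its original order (the restriction in the question applies to A); the class and method names are my own:

```java
import java.util.Arrays;

public class IncrementIfInB {
    // Sorts a copy of B once, then binary-searches it per element of A: O(k log k + n log k).
    static void addThreeWhereInB(int[] A, int[] B) {
        int[] sortedB = Arrays.copyOf(B, B.length);
        Arrays.sort(sortedB);                      // O(k log k)
        for (int i = 0; i < A.length; i++) {       // n searches at O(log k) each
            if (Arrays.binarySearch(sortedB, A[i]) >= 0) {
                A[i] += 3;
            }
        }
    }

    public static void main(String[] args) {
        int[] A = {5, 3, 1, 2};
        addThreeWhereInB(A, new int[]{3, 2});
        System.out.println(Arrays.toString(A)); // [5, 6, 1, 5]
    }
}
```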
UPDATE
I really forgot about bitmasks: if you know you have only 32-bit integers and have enough memory, you can store a huge bitmask array where "add" and "contains" are always O(1); but of course this is only needed for very special performance optimizations.
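A sketch of that bitmask idea using java.util.BitSet, assuming the values are non-negative and small enough to index (covering the full 32-bit range this way would need a mask of about 512 MB):

```java
import java.util.Arrays;
import java.util.BitSet;

public class BitSetMembership {
    // Assumes all values are non-negative; BitSet gives O(1) set/get per value.
    static void addThreeWhereInB(int[] A, int[] B) {
        BitSet inB = new BitSet();
        for (int b : B)                      // O(k): mark every value of B
            inB.set(b);
        for (int i = 0; i < A.length; i++)   // O(n): constant-time membership test
            if (inB.get(A[i]))
                A[i] += 3;
    }

    public static void main(String[] args) {
        int[] A = {5, 3, 1, 2};
        addThreeWhereInB(A, new int[]{3, 2});
        System.out.println(Arrays.toString(A)); // [5, 6, 1, 5]
    }
}
```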
I have an array int[] a = {5,3,1,2} and I want to write a method that picks out the k smallest numbers and returns an array with those k integers in ascending order. But when I run this code I get the output: [1, 3].
I know the code skips some numbers somehow, but I can't twist my brain to fix it.
Any ideas?
EDIT: Without sorting the original array.
public static int[] nrSmallest(int[] a, int k) {
    if (k < 1 || k > a.length)
        throw new IllegalArgumentException("must be at least 1");
    int[] values = Arrays.copyOf(a, k);
    Arrays.sort(values);
    int counter = 0;
    for (int i = k; i < a.length; i++) {
        if (a[i] < values[counter]) {
            for (int j = k - 1; j > counter; j--) {
                values[j] = values[j - 1];
            }
            values[counter] = a[i];
        }
        if (counter < k) counter++;
    }
    return values;
}
EDIT: Joop Eggen solved this for me. Scroll down to see answer. Thanks!
As already pointed out in the comments, simply return a part of the sorted array.
public static int[] nrSmallest(int[] a, int k) {
    // check parameters..
    // copy all so we don't sort a
    int[] sorted = Arrays.copyOf(a, a.length);
    Arrays.sort(sorted);
    return Arrays.copyOf(sorted, Math.min(k, sorted.length));
}
If you can't modify the original array, this is typically done with some type of priority queue, often a binary heap.
The method that you use in your example is O(n^2), and uses O(k) extra space. Sorting the original array and selecting the top k items is O(n log n). If you copy the array and then sort it, it uses O(n) extra space.
Using a heap is O(n log k), and requires O(k) extra space.
There is an O(n) solution that involves manipulating the original array (or making a copy of the array and manipulating it). See Quickselect.
My own testing shows that Quickselect is faster in the general case, but Heap select is faster when the number of items to be selected (k) is less than 1% of the total items (n). See my blog post, When theory meets practice. That comes in quite handy when selecting, say, the top 100 items from a list of two million.
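A sketch of the heap-based selection mentioned above: keep the k smallest values seen so far in a max-heap of size k, implemented here with a PriorityQueue and a reversed comparator (the class name is my own):

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.PriorityQueue;

public class HeapSelect {
    // O(n log k) time, O(k) extra space; the original array is untouched.
    static int[] kSmallest(int[] a, int k) {
        PriorityQueue<Integer> heap = new PriorityQueue<>(k, Collections.reverseOrder());
        for (int v : a) {
            if (heap.size() < k) {
                heap.offer(v);
            } else if (v < heap.peek()) { // smaller than the current k-th smallest
                heap.poll();
                heap.offer(v);
            }
        }
        int[] result = new int[heap.size()];
        for (int i = result.length - 1; i >= 0; i--)
            result[i] = heap.poll();      // drain largest-first to fill ascending
        return result;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(kSmallest(new int[]{5, 3, 1, 2}, 2))); // [1, 2]
    }
}
```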
(Corrected) To keep your code:
for (int i = k; i < a.length; i++) {
    if (a[i] < values[counter]) { // Found small value
        // Insert sorted
        for (int j = k - 1; j >= 0; j--) {
            if (j == 0 || a[i] > values[j - 1]) { // Insert pos
                // Move greater ones up.
                for (int m = k - 1; m > j; m--) {
                    values[m] = values[m - 1];
                }
                values[j] = a[i]; // Store
                break; // Done
            }
        }
    }
}
The line int[] values = Arrays.copyOf(a, k); is wrong: you are copying only k elements, but you are supposed to copy all elements and then sort the array.
First sort the array, then return the sorted part of the array up to k.
public static int[] nrSmallest(int[] a, int k) {
    if (k < 1 || k > a.length)
        throw new IllegalArgumentException("must be at least 1");
    Arrays.sort(a);
    return Arrays.copyOf(a, k);
}
You could use the "pivoting" idea of quicksort.
The pivot's final index denotes the "rank" of that number in the array, so your end goal is to have a pivot land at index k, which leaves a subarray of elements less than the kth element: in other words, the first k smallest numbers (not exactly sorted).