Finding common element in two arrays with best performing method

Implement a method that checks whether an integer is present in both integer array parameter 1 and integer array parameter 2 and prints the result of the search, with the best performance you can. The method parameters are: (1) the first integer array and (2) the second integer array of the same size as parameter 1 and (3) the integer to search for.
Note - Consider better performance to mean that a better performing method requires fewer general work steps to solve the problem with the same size of arrays. You may want to review the Java SE API page for java.util.Arrays
I was able to implement a solution, but I am not sure it is the best-performing one: I am not using any java.util.Arrays methods, because I don't know which one would give the best result.
public static void findCommonElements(int[] arr1, int[] arr2, int num){
for(int i = 0; i < arr1.length; i++){
for(int j = 0; j < arr2.length; j++){
if(arr1[i] == arr2[j] && arr1[i] == num){
System.out.println(num);
}
}
}
}
UPDATE:
I was able to update the code with the following solution, which removes the for loops entirely and uses binary search for better performance:
int[] arr1 = {7,8,5,1,2,3,6,7};
int[] arr2 = {9,8,6,4,1,2,4,5};
Arrays.sort(arr1);
Arrays.sort(arr2);
int index1 = Arrays.binarySearch(arr1, 5);
int index2 = Arrays.binarySearch(arr2, 5);
System.out.println(index1);
System.out.println(index2);
if(index1 < 0 || index2 < 0){
System.out.println("number not found in both arrays");
}
else{
System.out.println("number found in both arrays");
}

The problem description is a bit hard to follow, but by reference to the example code, I take this to be a fair rewording: "Write the best-performing method you can that takes two int arrays of the same length and a scalar int value i as parameters, and prints whether the value of i appears in both arrays."
Your first solution tests each pair of elements drawn one from the first array and the other from the second to determine whether they are equal to each other and to the target value. This is grossly inefficient for the problem as interpreted.
Your second solution sorts the arrays first, so as to be able to use a binary search to try to find the target element. This is better, but still inefficient. Although the binary searches are quite fast, the sorting required to prepare for them takes a lot more work than is saved by a single binary search.
Since it is sufficient to determine only whether the target value appears in both arrays, you can
scan each array for the target value, once, independently of the other.
skip the second scan if the first one does not find the target value
break early from each scan when the target value is found
The latter two are minor improvements, as they reduce only the minimum and average number of steps. The first, however, is a huge improvement, especially as array size increases: for arrays of length n, it requires a number of steps proportional to n in the worst case, whereas your first example code requires steps proportional to n^2 in both the average and worst cases, and your second requires time proportional to n log n in the average and worst cases.
The implementation is left as the exercise it is intended to be. However, with respect to "I was able to implement the solution but I am not sure if it [is] the best-performing one because I am not using any java.util.Arrays methods as I am not sure which one to use necessarily to get me the best answer": I don't think java.util.Arrays offers any method that particularly helps with this problem, especially given the orientation toward best possible performance.
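For readers who want to check their own attempt, the three points above (one scan per array, skipping the second scan, breaking early) could be sketched roughly as follows. The class and method names are illustrative only and do not come from the exercise.

```java
class CommonElementSearch {
    // Linear scan of one array that stops at the first match.
    static boolean contains(int[] arr, int target) {
        for (int value : arr) {
            if (value == target) {
                return true; // break early once the target is found
            }
        }
        return false;
    }

    // && short-circuits, so arr2 is scanned only if arr1 contains the target.
    static boolean inBoth(int[] arr1, int[] arr2, int target) {
        return contains(arr1, target) && contains(arr2, target);
    }

    public static void main(String[] args) {
        int[] arr1 = {7, 8, 5, 1, 2, 3, 6, 7};
        int[] arr2 = {9, 8, 6, 4, 1, 2, 4, 5};
        System.out.println(inBoth(arr1, arr2, 5)); // true
        System.out.println(inBoth(arr1, arr2, 3)); // false: 3 is only in arr1
    }
}
```

In the worst case this touches each element of both arrays once, so the work grows linearly with the array length.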

You can search the arrays using streams:
public static boolean findCommonElements(int[] arr1, int[] arr2, int num) {
return Arrays.stream(arr1).anyMatch(x -> x == num) &&
Arrays.stream(arr2).anyMatch(x -> x == num);
}
A similar method for arrays of Integer, using Arrays.asList to wrap the arrays and a linear search via indexOf:
public static boolean findCommonElements(Integer[] arr1, Integer[] arr2, int num) {
return Arrays.asList(arr1).indexOf(num) > -1 &&
Arrays.asList(arr2).indexOf(num) > -1;
}

Related

Time complexity for all subsets using backtracking

I am trying to understand the time complexity while using backtracking. The problem is
Given a set of unique integers, return all possible subsets.
Eg. Input [1,2,3] would return [[],[1],[2],[1,2],[3],[1,3],[2,3],[1,2,3]]
I am solving it using backtracking as this:
private List<List<Integer>> result = new ArrayList<>();
public List<List<Integer>> getSubsets(int[] nums) {
for (int length = 1; length <= nums.length; length++) { //O(n)
backtrack(nums, 0, new ArrayList<>(), length);
}
result.add(new ArrayList<>());
return result;
}
private void backtrack(int[] nums, int index, List<Integer> listSoFar, int length) {
if (length == 0) {
result.add(listSoFar);
return;
}
for (int i = index; i < nums.length; i++) { // O(n)
List<Integer> temp = new ArrayList<>();
temp.addAll(listSoFar); // O(2^n)
temp.add(nums[i]);
backtrack(nums, i + 1, temp, length - 1);
}
}
The code works fine, but I am having trouble understanding the time/space complexity.
What I am thinking is that the recursive method is called n times. In each call, it generates a sublist that may contain at most 2^n elements. So time and space will both be O(n x 2^n).
Is that right? If not, can anyone elaborate?
Note that I saw some answers here, like this, but was unable to understand them. When recursion comes into the picture, I find it a bit hard to wrap my head around it.

You're exactly right about space complexity. The total space of the final output is O(n*2^n), and this dominates the total space used by the program. The analysis of the time complexity is slightly off though. Optimally, time complexity would, in this case, be the same as the space complexity, but there are a couple inefficiencies here (one of which is that you're not actually backtracking) such that the time complexity is actually O(n^2*2^n) at best.
It can definitely be useful to analyze a recursive algorithm's time complexity in terms of how many times the recursive method is called times how much work each call does. But be careful about saying backtrack is only called n times: it is called n times at the top level, but this ignores all the subsequent recursive calls. Also, every call at the top level, backtrack(nums, 0, new ArrayList<>(), length);, is responsible for generating all subsets of size length, of which there are n Choose length. That is, no single top-level call will ever produce 2^n subsets; instead, the sum of n Choose length for lengths from 0 to n is 2^n: (n Choose 0) + (n Choose 1) + ... + (n Choose n) = 2^n.
Knowing that across all recursive calls, you generate 2^n subsets, you might then want to ask how much work is done in generating each subset in order to determine the overall complexity. Optimally, this would be O(n), because each subset varies in length from 0 to n, with the average length being n/2, so the overall algorithm might be O(n/2*2^n) = O(n*2^n), but you can't just assume the subsets are generated optimally and that no significant extra work is done.
In your case, you're building subsets through the listSoFar variable until it reaches the appropriate length, at which point it is appended to the result. However, listSoFar gets copied to a temp list in O(n) time for each of its O(n) elements, so the complexity of generating each subset is O(n^2), which brings the overall complexity to O(n^2*2^n). Also, some listSoFar subsets are created which never figure into the final output (you never check that there are enough numbers remaining in nums to fill listSoFar out to the desired length before recursing), so you end up doing unnecessary work in building subsets and making recursive calls which will never reach the base case to get appended to result, which might also worsen the asymptotic complexity. You can address the first of these inefficiencies with back-tracking, and the second with a simple break statement. I wrote these changes into a JavaScript program, leaving most of the logic the same but re-naming/re-organizing a little bit:
function getSubsets(nums) {
let subsets = [];
for (let length = 0; length <= nums.length; length++) {
// refactored "backtrack" function:
genSubsetsByLength(length); // O(length*(n Choose length))
}
return subsets;
function genSubsetsByLength(length, i=0, partialSubset=[]) {
if (length === 0) {
subsets.push(partialSubset.slice()); // O(n): copy partial and push to result
return;
}
while (i < nums.length) {
if (nums.length - i < length) break; // don't build partial results that can't finish
partialSubset.push(nums[i]); // O(1)
genSubsetsByLength(length - 1, ++i, partialSubset);
partialSubset.pop(); // O(1): this is the back-tracking part
}
}
}
for (let subset of getSubsets([1, 2, 3])) console.log(`[`, ...subset, ']');
The key difference is using back-tracking to avoid making copies of the partial subset every time you add a new element to it, such that each is built in O(length) = O(n) time rather than O(n^2) time, because there is now only O(1) work done per element added. Popping off the last element added to the partial result after each recursive call allows you to re-use the same array across recursive calls, thus avoiding the O(n) overhead of making temp copies for each call. This, along with the fact that only subsets which appear in the final output are built, allows you to analyze the total time complexity in terms of the total number of elements across all subsets in the output: O(n*2^n).
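Since the question is in Java, here is a rough Java port of the same idea, offered as a sketch rather than a definitive implementation. The class name SubsetsBacktracking and the helper name build are made up for this example; the logic mirrors the JavaScript above (one shared partial list, an O(1) add and remove per element, and a loop bound that prunes branches that cannot reach the requested length).

```java
import java.util.ArrayList;
import java.util.List;

class SubsetsBacktracking {
    static List<List<Integer>> getSubsets(int[] nums) {
        List<List<Integer>> subsets = new ArrayList<>();
        for (int length = 0; length <= nums.length; length++) {
            build(nums, length, 0, new ArrayList<>(), subsets);
        }
        return subsets;
    }

    private static void build(int[] nums, int remaining, int start,
                              List<Integer> partial, List<List<Integer>> out) {
        if (remaining == 0) {
            out.add(new ArrayList<>(partial)); // O(n) copy, once per finished subset
            return;
        }
        // The loop bound stops when too few elements are left to reach the length.
        for (int i = start; nums.length - i >= remaining; i++) {
            partial.add(nums[i]);                    // O(1)
            build(nums, remaining - 1, i + 1, partial, out);
            partial.remove(partial.size() - 1);      // O(1): undo the choice
        }
    }

    public static void main(String[] args) {
        // prints [[], [1], [2], [3], [1, 2], [1, 3], [2, 3], [1, 2, 3]]
        System.out.println(getSubsets(new int[] {1, 2, 3}));
    }
}
```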

Your code is not efficient.
As in the first solution in the link, you only need to decide whether each number is included or not (like generating combinations).
That means you don't have to iterate in both getSubsets and backtrack.
The backtrack function can walk through the nums array using its index parameter:
private List<List<Integer>> result = new ArrayList<>();
public List<List<Integer>> getSubsets(int[] nums) {
    backtrack(nums, 0, new ArrayList<>());
    return result;
}
// Reaches 2^N leaves, because every number is tried both excluded and included.
private void backtrack(int[] nums, int index, List<Integer> listSoFar) {
    if (index == nums.length) {
        result.add(new ArrayList<>(listSoFar)); // copy, because listSoFar is reused
        return;
    }
    // exclude nums[index] from the subset
    backtrack(nums, index + 1, listSoFar);
    // include nums[index] in the subset
    listSoFar.add(nums[index]);
    backtrack(nums, index + 1, listSoFar);
    listSoFar.remove(listSoFar.size() - 1); // undo the choice before returning
}

Using Selection Sort to Sort a Nearly Sorted Array?

I was playing around with this website (https://www.toptal.com/developers/sorting-algorithms/) and clicked on "Play Nearly Sorted". I noticed that the Selection Sort algorithm was considerably slower than the others. Can you please explain why this is so, and possibly point me to a link with more information on this. Thanks!

You'll notice that selection sort takes the same number of steps regardless of the initial ordering of the data.
Selection sort can be described like this (pseudocode, assumes data is a list of length n and indices run from 1 to n inclusive)
for pass = 1 to n {
//at each pass, find the minimal item in the list (that we haven't already sorted)
indexMin = pass
for item = pass + 1 to n {
if data[item] < data[indexMin] {
indexMin = item
}
}
swap data[pass] and data[indexMin]
}
As its name suggests, it repeatedly (n times, in fact) selects the smallest remaining element and puts it in its proper position. This creates a sorted sublist at the beginning, running from position 1 to position pass after each pass.
As you can see, selection sort has no way of terminating early. It just takes a blind shotgun approach to sorting. It will always run exactly n passes, even on an array that is already completely sorted.
Other algorithms often "know" when they're done sorting, and can terminate early. As Tim's comment says, selection sort is a brute force approach that will never run any faster than n^2.

To fully understand the running time of common sorting algorithms, you need to read through their pseudocode. In the "nearly sorted" case, selection sort is the slowest because, no matter what, its running time is always O(n^2), i.e. quadratic (polynomial) time. That is the slowest growth rate among the algorithms presented on the website you linked. Here is the code for selection sort:
public static void selectionSort(int[] A) {
    for (int i = 0; i < A.length - 1; i++) {
        int min = i;
        for (int j = i + 1; j < A.length; j++) {
            if (A[j] < A[min])
                min = j;
        }
        // swap A[i] and A[min]; this has to happen inside the outer loop
        int tmp = A[i];
        A[i] = A[min];
        A[min] = tmp;
    }
}
It always runs both "for" loops in full, regardless of how sorted the array A already is. Other algorithms are relatively "smarter" (faster) when the initial array A is sorted or nearly sorted. You can also ask the question the other way around: why is insertion sort so fast on a nearly sorted array? Hope this helps!
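To make that contrast concrete, the sketch below counts element comparisons for selection sort and for a textbook insertion sort on a small, nearly sorted array. The class name, the counting approach, and the sample array are assumptions chosen for illustration; they are not taken from the linked website.

```java
class SortComparisonCount {
    // Selection sort; returns the number of element comparisons performed.
    static long selectionSort(int[] a) {
        long comparisons = 0;
        for (int i = 0; i < a.length - 1; i++) {
            int min = i;
            for (int j = i + 1; j < a.length; j++) {
                comparisons++;
                if (a[j] < a[min]) min = j;
            }
            int tmp = a[i]; a[i] = a[min]; a[min] = tmp;
        }
        return comparisons;
    }

    // Insertion sort; the inner loop stops as soon as the element is in place.
    static long insertionSort(int[] a) {
        long comparisons = 0;
        for (int i = 1; i < a.length; i++) {
            int key = a[i];
            int j = i - 1;
            while (j >= 0) {
                comparisons++;
                if (a[j] <= key) break; // already in order: stop early
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = key;
        }
        return comparisons;
    }

    public static void main(String[] args) {
        int[] nearlySorted = {1, 2, 4, 3, 5, 6, 8, 7, 9, 10};
        System.out.println(selectionSort(nearlySorted.clone())); // 45 = 10 * 9 / 2
        System.out.println(insertionSort(nearlySorted.clone())); // 11
    }
}
```

Selection sort performs n(n-1)/2 comparisons whatever the input looks like, while insertion sort needs only about one comparison per element that is already in place.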

Is there any way to shorten a for-each loop in java?

I want to iterate just the half of an array in java. Is there any elegant way to shorten this up, eg with a for-each loop?
int[] array = {0,1,2,3,4,5};
for (int i = 0; i<array.length/2; i++)
{
System.out.println(array[i]);
}

If you converted the array into a list using the asList method of the Arrays class in Java, then you can use the forEach method in the List class in Java to print out each element of the list in one single line,
Arrays.asList(array).forEach(System.out::println);
To print only half the array, I'd suggest copying half the array into a new array using the copyOfRange method,
Integer[] newArray = Arrays.copyOfRange(array, 0, array.length/2);
Arrays.asList(newArray).forEach(System.out::println);
EDIT: Like Marko Topolnik pointed out, we're actually starting out with an array of primitive types instead of object types, so in order to use the asList method we're going to have to convert the array into an array of objects (from int to Integer using Integer[] integerArray = ArrayUtils.toObject(array);). However this just seems tedious/inefficient and OP asked for a shorter way so my suggestion would be to use Marko's method,
Arrays.stream(array).limit(array.length/2).forEach(System.out::println);
EDIT 2: Like Amber Beriwal pointed out, it should be noted that although the one-line solution above looks pretty due to its conciseness, it is still very inefficient/slow compared to the OP's original method. Therefore, I would like to reiterate Amber's comments that the OP and others should just stick with the original for-loop.
for (int i = 0; i < array.length/2; i++)
{
System.out.println(array[i]);
}

How about:
IntStream.range(0, array.length / 2).map(i -> array[i]).forEach(System.out::println);
One line, and no array copies.
Broken down:
IntStream.range(0, array.length / 2) //get the range of numbers 0 - (array length)/2
.map(i -> array[i]) //map from index to value
.forEach(System.out::println); //print result
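A related option, shown here as a sketch: java.util.Arrays has an overload stream(int[] array, int startInclusive, int endExclusive) that streams only a range of the array, so neither limit nor an index-to-value map is needed. The class name HalfArray and the helper firstHalf are made up for this example.

```java
import java.util.Arrays;

class HalfArray {
    // Collects the first half into a new array, only so the result can be inspected.
    static int[] firstHalf(int[] array) {
        return Arrays.stream(array, 0, array.length / 2).toArray();
    }

    public static void main(String[] args) {
        int[] array = {0, 1, 2, 3, 4, 5};
        // Streams indices 0 (inclusive) to array.length / 2 (exclusive): prints 0, 1, 2
        Arrays.stream(array, 0, array.length / 2).forEach(System.out::println);
    }
}
```

This is still a stream, so the performance remarks made about the other stream-based one-liners in this thread presumably apply here as well.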

The answer you have posted is good. I couldn't find a more compact way that keeps the same performance, but performance can be improved. Keep the following practices in mind while coding:
1. The algorithm's memory requirement should be optimal.
2. The algorithm's running time, i.e. performance, should be optimal.
3. The algorithm's complexity should not be excessive. For significant gains in 1 & 2, this can be skipped.
4. Compared with 1 & 2, the number of lines of code has the lowest priority.
Solution 1: This solution will be 4-5 times slower than your approach, plus Stream will take extra space.
Arrays.stream(array).limit(array.length/2).forEach(System.out::println);
Solution 2: This solution is faster than the above code and your code (based on my testing), but Stream will take extra space. Also, it is not compact.
Arrays.stream(array).limit(array.length / 2).forEach(new IntConsumer() {
@Override
public void accept(int value) {
System.out.println(value);
}
});
Solution 3: As suggested by you.
int[] array = new int[] { 0, 1, 2, 3, 4, 5 };
int limit = array.length / 2;
for (int i = 0; i < limit; i++) {
System.out.println(array[i]);
}
Recommendation: Don't reduce the LOC at the cost of performance and memory. It is better to stay with the solution that gives you the best performance.

Find all the ways you can go up an n step staircase if you can take k steps at a time such that k <= n

This is a problem I'm trying to solve on my own to be a bit better at recursion(not homework). I believe I found a solution, but I'm not sure about the time complexity (I'm aware that DP would give me better results).
Find all the ways you can go up an n step staircase if you can take k steps at a time such that k <= n
For example, if my step sizes are [1,2,3] and the size of the stair case is 10, I could take 10 steps of size 1 [1,1,1,1,1,1,1,1,1,1]=10 or I could take 3 steps of size 3 and 1 step of size 1 [3,3,3,1]=10
Here is my solution:
static List<List<Integer>> problem1Ans = new ArrayList<List<Integer>>();
public static void problem1(int numSteps){
int [] steps = {1,2,3};
problem1_rec(new ArrayList<Integer>(), numSteps, steps);
}
public static void problem1_rec(List<Integer> sequence, int numSteps, int [] steps){
if(problem1_sum_seq(sequence) > numSteps){
return;
}
if(problem1_sum_seq(sequence) == numSteps){
problem1Ans.add(new ArrayList<Integer>(sequence));
return;
}
for(int stepSize : steps){
sequence.add(stepSize);
problem1_rec(sequence, numSteps, steps);
sequence.remove(sequence.size()-1);
}
}
public static int problem1_sum_seq(List<Integer> sequence){
int sum = 0;
for(int i : sequence){
sum += i;
}
return sum;
}
public static void main(String [] args){
problem1(10);
System.out.println(problem1Ans.size());
}
My guess is that this runtime is k^n where k is the numbers of step sizes, and n is the number of steps (3 and 10 in this case).
I came to this answer because each step size has a loop that calls k number of step sizes. However, the depth of this is not the same for all step sizes. For instance, the sequence [1,1,1,1,1,1,1,1,1,1] has more recursive calls than [3,3,3,1] so this makes me doubt my answer.
What is the runtime? Is k^n correct?

TL;DR: Your algorithm is O(2^n), which is a tighter bound than O(k^n), but because of some easily corrected inefficiencies the implementation runs in O(k^2 × 2^n).
In effect, your solution enumerates all of the step-sequences with sum n by successively enumerating all of the viable prefixes of those step-sequences. So the number of operations is proportional to the number of step sequences whose sum is less than or equal to n. [See Notes 1 and 2].
Now, let's consider how many possible prefix sequences there are for a given value of n. The precise computation will depend on the steps allowed in the vector of step sizes, but we can easily come up with a maximum, because any step sequence is a subset of the set of integers from 1 to n, and we know that there are precisely 2^n such subsets.
Of course, not all subsets qualify. For example, if the set of step-sizes is [1, 2], then you are enumerating Fibonacci sequences, and there are O(φ^n) such sequences. As k increases, you will get closer and closer to O(2^n). [Note 3]
Because of the inefficiencies in your code, as noted, your algorithm is actually O(k^2 α^n) where α is some number between φ and 2, approaching 2 as k approaches infinity. (φ is 1.618..., or (1+sqrt(5))/2.)
There are a number of improvements that could be made to your implementation, particularly if your intent was to count rather than enumerate the step sizes. But that was not your question, as I understand it.
Notes
1. That's not quite exact, because you actually enumerate a few extra sequences which you then reject; the cost of these rejections is a multiplier by the size of the vector of possible step sizes. However, you could easily eliminate the rejections by terminating the for loop as soon as a rejection is noticed.
2. The cost of an enumeration is O(k) rather than O(1) because you compute the sum of the sequence arguments for each enumeration (often twice). That produces an additional factor of k. You could easily eliminate this cost by passing the current sum into the recursive call (which would also eliminate the multiple evaluations). It is trickier to avoid the O(k) cost of copying the sequence into the output list, but that can be done using a better (structure-sharing) data-structure.
3. The question in your title (as opposed to the problem solved by the code in the body of your question) does actually require enumerating all possible subsets of {1…n}, in which case the number of possible sequences would be exactly 2^n.
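The two fixes mentioned in the first two notes (stop the loop at the first rejection, and pass the running sum down instead of recomputing it) might look roughly like this. It is a sketch under one extra assumption that is not in the question: the steps array must be sorted in ascending order and contain only positive values, otherwise breaking out of the loop would skip valid sequences. The class and method names are made up.

```java
import java.util.ArrayList;
import java.util.List;

class Staircase {
    // Assumption: steps is sorted ascending and contains only positive values.
    static List<List<Integer>> enumerate(int numSteps, int[] steps) {
        List<List<Integer>> result = new ArrayList<>();
        recurse(new ArrayList<>(), 0, numSteps, steps, result);
        return result;
    }

    private static void recurse(List<Integer> sequence, int sum, int numSteps,
                                int[] steps, List<List<Integer>> result) {
        if (sum == numSteps) {
            result.add(new ArrayList<>(sequence)); // copying is still linear in the sequence length
            return;
        }
        for (int stepSize : steps) {
            if (sum + stepSize > numSteps) break; // every larger step would overshoot too
            sequence.add(stepSize);
            recurse(sequence, sum + stepSize, numSteps, steps, result); // running sum, no re-summing
            sequence.remove(sequence.size() - 1);
        }
    }

    public static void main(String[] args) {
        System.out.println(enumerate(10, new int[] {1, 2, 3}).size()); // 274
    }
}
```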

If you want to solve this recursively, you should use a different pattern that allows caching of previously computed values, like the one used when calculating Fibonacci numbers. The Fibonacci function is basically the same as what you are looking for: it adds the previous number and the one before it, by index, and returns the sum as the current number. You can use the same technique in your recursive function, but instead of adding f(k-1) and f(k-2), add up f(k-steps[i]) over all step sizes. Note that this counts the sequences rather than listing them. Something like this sketch:
// requires java.util.ArrayList and java.util.List
static List<Integer> cache = new ArrayList<>();
static List<Integer> storedSteps = null; // if called again with the same steps, the cache stays valid
public static int problem1(int numSteps, List<Integer> steps) {
    if (!steps.equals(storedSteps)) { // List.equals compares contents, not references
        storedSteps = new ArrayList<>(steps);
        cache.clear(); // cached counts are invalid for a different set of steps
        // TODO make cache+storedSteps a single structure
    }
    return problem1_rec(numSteps, steps);
}
private static int problem1_rec(int numSteps, List<Integer> steps) {
    if (numSteps < 0) { return 0; }
    if (numSteps == 0) { return 1; }
    while (cache.size() <= numSteps) { cache.add(null); } // grow so that index numSteps exists
    if (cache.get(numSteps) != null) { return cache.get(numSteps); } // cache hit
    int acc = 0;
    for (int i : steps) { acc += problem1_rec(numSteps - i, steps); }
    cache.set(numSteps, acc); // cache miss: store the computed count
    return acc;
}

Reducing Time Complexity of these two methods?

/*
* Returns true if this and other are rankings of the same
* set of strings; otherwise, returns false. Throws a
* NullPointerException if other is null. Must run in O(n)
* time, where n is the number of elements in this (or other).
*/
public boolean sameNames(Ranking other)
{
ArrayList<String> str1 = new ArrayList<String>();
ArrayList<String> str2 = new ArrayList<String>();
for(int i = 0; i < this.getNumItems(); i++){
str1.add(this.getStringOfRank(i));
}
for(int i = 0; i < other.getNumItems(); i++){
str2.add(other.getStringOfRank(i));
}
Collections.sort(str1);
Collections.sort(str2);
if(str1.size() == str2.size())
return str1.containsAll(str2);
else
return false;
}
OK, so in the code above, using str1.containsAll(str2) destroys my O(n) time complexity, as I believe it is O(n^2) in this case. My question is: how can I compare the contents of two arrays/ArrayLists without O(n^2) work? All I can think of is a nested for loop, which of course is O(n^2).
/*
* Returns the rank of name. Throws an IllegalArgumentException
* if name is not present in the ranking. Must run in O(log n)
* time, where n = this.getNumItems().
*/
public int getRankOfString(String name)
{
Cities[] nameTest = new Cities[city.length];
int min = 0;
int max = city.length;
System.arraycopy(city, 0, nameTest, 0, city.length);
Arrays.sort(nameTest, Cities.BY_NAME);
while(max >= min){
int mid = (min + max)/2;
if(nameTest[mid].getName().equals(name))
return nameTest[mid].getRank();
else if(nameTest[mid].getName().compareTo(name) < 0)
min = mid + 1;
else
max = mid-1;
}
throw new IllegalArgumentException();
}
And this one, this has to be O(log n). So I used a binary search, however it only works on sorted arrays, so I have to call Arrays.sort(), BUT I can't mess with the order of the actual array so I have to copy the array using System.arraycopy(). This is most likely O(n + (n log n) + log n), which is not log n. I don't know what other way I can search for something, it seems like log n is the best, but that is binary search and would force me to sort array first, which just adds time...
P.S. I am not allowed to use Maps or Sets... :(
Any help would be awesome.
Sorry, a ranking object contains an array of city names that can be called and an array of rankings (just ints) for each city that can be called. sameNames() is simply testing two ranking objects possess the same cities, getRankofString() has a set name entered, it then checks to see if that name is in the ranking object, if it is, it returns its corresponding rank. Hope that cleared it up
And yeah, cannot use hash anything. We are basically limited to messing around with arrays and arrayLists and stuff.

Let's count the occurrences of each string. It's a bit similar to the counting sort.
Create a hash table t with hashing function f(), where keys are strings and values are integers (initially 0).
Iterate through the strings of the first ranking; for each string do t[f(string)]++.
Iterate through the strings of the second ranking; for each string do t[f(string)]++.
Iterate through the non-zero values in t; if all are even, return true. Otherwise, return false.
Linear time complexity.

The first method has complexity at least O(n^2), given by 2*O(n*f(n)) + 2*O(n log n) + O(n^2), where f(n) is the cost of one getStringOfRank() call. The O(n log n) comes from the Collections.sort() calls, which also 'destroy your O(n)' complexity, as you put it.
Since both array lists are already sorted and of equal length when you call containsAll, that call is equivalent to an element-by-element equality check (the first element of one list should equal the first element of the other, and so on). You can easily compare the two lists manually in O(n); List.equals performs this same pairwise comparison.
Hence, the overall complexity of the first piece of code can be reduced to O(n log n), if you can keep the complexity of getStringOfRank() under O(log n) (but that function is not shown in your post).
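As a rough illustration of that O(n log n) variant, the sketch below sorts copies of both name lists and then compares them position by position with List.equals. It works on plain lists of strings because the Ranking class is not shown in the question; the class name SameNames and the method signature are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SameNames {
    // Hypothetical stand-in for Ranking.sameNames: takes the names directly.
    static boolean sameNames(List<String> names1, List<String> names2) {
        if (names1.size() != names2.size()) return false;
        List<String> a = new ArrayList<>(names1); // copies, so the callers' order is untouched
        List<String> b = new ArrayList<>(names2);
        Collections.sort(a);                      // O(n log n)
        Collections.sort(b);                      // O(n log n)
        return a.equals(b);                       // O(n): compares element by element
    }
}
```

This removes the O(n^2) containsAll step, but the two sorts keep the whole method at O(n log n), so it still does not meet a strict O(n) requirement.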
The second function (which isn't related to the first piece of code) has complexity O(n log n), as your own computation shows. If you copy and then sort the city array on every call, the binary search is pointless. Don't copy, don't sort, just compare each city in the array, bringing the complexity of this function to O(n). Alternatively, keep a sorted copy of the city array and use binary search on that.
Either way, creating a copy of the array and sorting that copy on each function call is highly inefficient. If you want to call this function inside a loop, like you used getStringOfRank() above, construct the sorted copy before the loop and pass it as an argument:
private int getRankOfString(String name, Cities[] sortedCities) {
    // only binary search code needed here
}
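A sketch of what that binary-search-only body could look like is given below. Because the Cities class is not shown in the question, this version uses two parallel arrays instead (names sorted ascending, plus the rank belonging to each name); those parameters and the class name RankLookup are assumptions for illustration.

```java
class RankLookup {
    // sortedNames must be in ascending String order; ranks[i] belongs to sortedNames[i].
    static int rankOf(String name, String[] sortedNames, int[] ranks) {
        int min = 0;
        int max = sortedNames.length - 1; // last valid index, not the length
        while (min <= max) {
            int mid = (min + max) >>> 1;  // midpoint without int overflow
            int cmp = sortedNames[mid].compareTo(name);
            if (cmp == 0) return ranks[mid];
            if (cmp < 0) min = mid + 1;
            else max = mid - 1;
        }
        throw new IllegalArgumentException(name + " is not in the ranking");
    }
}
```

With the sorted arrays built once (for example when the ranking object is constructed), each lookup does O(log n) comparisons.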
Off-topic:
Based on the second function, you have something like Cities[] city declared somewhere in your code. To follow conventions, it should be more like City[] cities (class name singular, array name plural).

So the first one just needs to compare if the two have the exact same names and nothing else?
How about this
public static boolean compare(List<String> l1, List<String> l2) {
if (l1.size() != l2.size()) return false;
long hash1 = 0;
long hash2 = 0;
for (int i = 0 ; i < l1.size() ; i++) {
hash1 += l1.get(i).hashCode();
hash2 += l2.get(i).hashCode();
}
return hash1 == hash2;
}
In theory, you could get a hash collision I suppose.
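Collisions are in fact easy to produce with a plain sum of hash codes. The sketch below repeats the compare method from above inside a hypothetical class and feeds it two lists with different names: the hash codes of the one-character strings "a", "b", "c" and "d" are 97, 98, 99 and 100, so both lists sum to 197.

```java
import java.util.Arrays;
import java.util.List;

class HashSumCollision {
    // Same logic as the compare method above.
    static boolean compare(List<String> l1, List<String> l2) {
        if (l1.size() != l2.size()) return false;
        long hash1 = 0;
        long hash2 = 0;
        for (int i = 0; i < l1.size(); i++) {
            hash1 += l1.get(i).hashCode();
            hash2 += l2.get(i).hashCode();
        }
        return hash1 == hash2;
    }

    public static void main(String[] args) {
        // Different names, same hash sum (97 + 100 == 98 + 99): prints true
        System.out.println(compare(Arrays.asList("a", "d"), Arrays.asList("b", "c")));
    }
}
```

So the method can report true for rankings that do not contain the same names, and a true result would need a real comparison afterwards to be trusted.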
