I'm referencing https://www.geeksforgeeks.org/find-subarray-with-given-sum-in-array-of-integers/ and, for some reason, I'm feeling a little thick about this. I don't totally grasp why, once you find the stored index for (current_sum - target_sum) in the map, you know that the subarray starting at the index immediately after it and running up to the current index is your solution.
I pretty much get it: if we've reached a point in the iteration where the difference between our current sum and the target has already been seen as an earlier running sum, then removing that difference from the current sum leaves the subarray for the solution. But I can't quite grasp why exactly that is. For example, what if the difference is "2", but the index stored in our map for the running sum "2" is not immediately before the subarray that ends where we are now? Again, I kind of get it, but I would appreciate a clear and precise explanation so I have that "aha" moment and grasp it more solidly.
I'm also wondering what logic might lead me to this solution after solving the problem a different way for positive integers only, namely the efficient solution covered here: https://www.geeksforgeeks.org/find-subarray-with-given-sum/.
Thanks.
public static void subArraySum(int[] arr, int n, int sum) {
//cur_sum keeps track of the cumulative sum up to that point
int cur_sum = 0;
int start = 0;
int end = -1;
HashMap<Integer, Integer> hashMap = new HashMap<>();
for (int i = 0; i < n; i++) {
cur_sum = cur_sum + arr[i];
//check whether cur_sum - sum = 0, if 0 it means
//the sub array is starting from index 0- so stop
if (cur_sum - sum == 0) {
start = 0;
end = i;
break;
}
//if hashMap already has the value, means we already
// have subarray with the sum - so stop
if (hashMap.containsKey(cur_sum - sum)) {
start = hashMap.get(cur_sum - sum) + 1;
end = i;
break;
}
//if value is not present then add to hashmap
hashMap.put(cur_sum, i);
}
// if end is -1 : means we have reached end without the sum
if (end == -1) {
System.out.println("No subarray with given sum exists");
} else {
System.out.println("Sum found between indexes "
+ start + " to " + end);
}
}
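One way to see why the lookup works: write prefix(k) for the sum of arr[0..k]. For any j < i, the elements arr[j+1..i] add up to prefix(i) - prefix(j). So if prefix(i) - sum was stored in the map at some index j, then arr[j+1..i] sums to exactly sum, no matter which earlier index j happens to hold that running sum. The sketch below is only an illustration of that identity; the class name PrefixSumCheck and its method names are made up here, and it returns the indexes instead of printing them.

```java
import java.util.HashMap;
import java.util.Map;

class PrefixSumCheck {
    // Returns {start, end} of a subarray summing to target, or null if none exists.
    static int[] findSubarray(int[] arr, int target) {
        Map<Integer, Integer> indexOfPrefix = new HashMap<>();
        int prefix = 0;
        for (int i = 0; i < arr.length; i++) {
            prefix += arr[i];
            if (prefix == target) {
                return new int[] {0, i};
            }
            Integer j = indexOfPrefix.get(prefix - target);
            if (j != null) {
                // prefix(i) - prefix(j) == target, so arr[j+1..i] sums to target.
                return new int[] {j + 1, i};
            }
            indexOfPrefix.put(prefix, i);
        }
        return null;
    }

    // Adds up arr[start..end] directly, to confirm the identity on a concrete input.
    static int sliceSum(int[] arr, int start, int end) {
        int s = 0;
        for (int k = start; k <= end; k++) {
            s += arr[k];
        }
        return s;
    }

    public static void main(String[] args) {
        int[] arr = {10, 2, -2, -20, 10};
        int[] range = findSubarray(arr, -22);
        System.out.println(range[0] + ".." + range[1]
                + " sums to " + sliceSum(arr, range[0], range[1]));
    }
}
```

For the sample input the running sums are 10, 12, 10, -10, 0. At index 3 the lookup key is -10 - (-22) = 12, which was stored at index 1, so the answer is indexes 2 to 3.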
In the process of learning algorithms, I have written code to compare 2 algorithms performance in terms of running time. The task of these algorithms is to find all the pairs of numbers in an array that add up to a specific number.
First approach - Brute force.
2 for loops to find the pairs of numbers that add up to the given number. Basically time complexity is O(n*n).
Second approach - Efficient
First sort the array, then have start and end as index to the beginning and end of array, and depending on the sum of these elements in the positions, move left or right to find pairs of numbers.
My question is -
I am printing the running time of each algorithm approach. But it seems like the running time of the Brute force approach is faster than the Efficient one. Why is this happening?
See the code here -
import java.util.Random;
public class MainRunner {
final private static int numberRange = 100000;
public static void generateRandomNumbers(int[] array, int[] dupArray) {
System.out.println("Generated Array: ");
Random random = new Random();
for (int i = 0; i < array.length; i++) {
int generatedRandomInt = random.nextInt(array.length) + 1;
array[i] = dupArray[i] = generatedRandomInt;
}
}
public static void main(String[] args) {
int[] array = new int[numberRange];
int[] dupArray = new int[numberRange];
generateRandomNumbers(array, dupArray);
Random random = new Random();
int sumToFind = random.nextInt(numberRange) + 1;
System.out.println("\n\nSum to find: " + sumToFind);
// Starting Sort and Find Pairs
final long startTimeSortAndFindPairs = System.currentTimeMillis();
new SortAndFindPairs().sortAndFindPairsOfNumbers(sumToFind, array);
final long durationSortAndFind = System.currentTimeMillis() - startTimeSortAndFindPairs;
// Starting Find Pairs
final long startTimeFindPairs = System.currentTimeMillis();
new FindPairs().findPairs(sumToFind, dupArray);
final long durationFindPairs = System.currentTimeMillis() - startTimeFindPairs;
System.out.println("Sort and Find Pairs: " + durationSortAndFind);
System.out.println("Find Pairs: " + durationFindPairs);
}
}
SortAndFindPairs.java
import java.util.Arrays;
public class SortAndFindPairs {
public void sortAndFindPairsOfNumbers(int argNumberToFind, int[] array) {
Arrays.sort(array);
System.out.println("\n\nResults of Sort and Find Pairs: \n");
int startIndex = 0;
int endIndex = array.length - 1;
while (startIndex < endIndex) {
int sum = array[startIndex] + array[endIndex];
if (argNumberToFind == sum) {
//System.out.println(array[startIndex] + ", " + array[endIndex]);
startIndex++;
endIndex--;
} else if (argNumberToFind > sum) {
startIndex++;
} else {
endIndex--;
}
}
}
}
And the FindPairs.java
public class FindPairs {
public void findPairs(int argNumberToFind, int[] array) {
System.out.println("\nResults of Find Pairs: \n");
int randomInt1 = 0;
int randomInt2 = 0;
for (int i = 0; i < array.length - 1; i++) {
for (int j = i + 1; j < array.length; j++) {
int sum = array[i] + array[j];
if (argNumberToFind == sum) {
//System.out.println(array[i] + ", " + array[j]);
//randomInt1++;
//randomInt2--;
}
}
}
}}
The running time difference only shows up when the two variables randomInt1 and randomInt2 are updated in FindPairs.java. Otherwise, the running time of FindPairs.java is much less than that of SortAndFindPairs.java. So why does adding just two variable operations increase the time by so much? By convention, simple operations should take negligible time. Am I missing something here?
Results for numberRange = 1000000
Results of Find Pairs:
Sort and Find Pairs: 641
Find Pairs: 57
I think the problem is compiler optimization playing tricks on you. I tried different permutations of your code and noticed that the double for loop in FindPairs does almost nothing, so the compiler (in Java, most likely the JIT) may be stripping some of the code.
I got these numbers with an exact copy of your code:
Sort and Find Pairs: 43
Find Pairs: 13
Consistently (I ran it several times to double-check), Sort and Find was slower every time.
But then I changed the inner for loop to do nothing:
for (int j = i + 1; j < array.length; j++) {
//int sum = array[i] + array[j];
//if (argNumberToFind == sum) {
//System.out.println(array[i] + ", " + array[j]);
//randomInt1++;
//randomInt2--;
//}
And guess what? I got:
Sort and Find Pairs: 20
Find Pairs: 11
I tried several times and the numbers were pretty similar. By removing both loops, the runtime for Find Pairs went to 1. So my guess is that the optimization step of the compiler assumes the code inside the inner loop has no effect and removes it. The code in Sort and Find is a little smarter, so it gets kept.
Now I tried a different thing: I uncommented the increment of randomInt1 but left the sum and the if commented out,
for (int j = i + 1; j < array.length; j++) {
//int sum = array[i] + array[j];
//if (argNumberToFind == sum) {
//System.out.println(array[i] + ", " + array[j]);
randomInt1++;
//randomInt2--;
//}
and then I got:
Sort and Find Pairs: 42
Find Pairs: 5
Wow, suddenly it got faster! (Maybe the compiler replaced the for loop with an arithmetic calculation of randomInt1 from the loop bounds?)
My last attempt. You may have noticed that this is not a fair comparison: Sort and Find has a lot of logic involved, while Find Pairs doesn't. It actually does nothing when it finds a pair. So, to make it apples to apples, we want to be sure Find Pairs actually does something, and let's make sure Sort and Find does the same extra amount of work (like adding the same number to both sides of an equation). So I changed the methods to count the matching pairs instead, like this:
System.out.println("\nResults of Find Pairs: \n");
long randomInt1 = 0;
int randomInt2 = 0;
int count = 0;
for (int i = 0; i < array.length - 1; i++) {
for (int j = i + 1; j < array.length; j++) {
int sum = array[i] + array[j];
if (argNumberToFind == sum) {
count++;
}
}
}
System.out.println("\nPairs found: " + count + "\n");
and
public void sortAndFindPairsOfNumbers(int argNumberToFind, int[] array) {
Arrays.sort(array);
System.out.println("\n\nResults of Sort and Find Pairs: \n");
int startIndex = 0;
int endIndex = array.length - 1;
int count = 0;
while (startIndex < endIndex) {
int sum = array[startIndex] + array[endIndex];
if (argNumberToFind == sum) {
//System.out.println(array[startIndex] + ", " + array[endIndex]);
startIndex++;
endIndex--;
count++;
} else if (argNumberToFind > sum) {
startIndex++;
} else {
endIndex--;
}
}
System.out.println("\nPairs found: " + count + "\n");
}
And then got:
Sort and Find Pairs: 38
Find Pairs: 4405
The time for Find Pairs blew up! And Sort and Find stayed in line with what we were seeing before.
So the most likely answer to your problem is that the compiler is optimizing something, and an almost empty for loop is definitely something it can optimize away, whilst for Sort and Find the more complex logic may cause the optimizer to step back. Your algorithm lessons are fine. Here Java is playing a trick on you.
One more thing you can try is using different languages. I'm pretty sure you will find interesting stuff by doing so!
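A rough way to keep such a comparison fair is to have both methods return their pair count and then print it, so the loops have an observable result and cannot simply be discarded. The sketch below assumes that setup; the class and method names (PairCountBench, countPairsBruteForce, countPairsSorted) are invented for illustration, and the input uses distinct values because the two-index walk moves both indexes on a match and can count fewer pairs than brute force when duplicates are present. For serious measurements a benchmarking harness such as JMH is the usual recommendation; this is only a quick check.

```java
import java.util.Arrays;

class PairCountBench {
    // O(n^2): checks every pair (i, j) with i < j.
    static int countPairsBruteForce(int[] a, int target) {
        int count = 0;
        for (int i = 0; i < a.length - 1; i++) {
            for (int j = i + 1; j < a.length; j++) {
                if (a[i] + a[j] == target) {
                    count++;
                }
            }
        }
        return count;
    }

    // O(n log n): sorts a copy, then walks two indexes toward each other.
    static int countPairsSorted(int[] a, int target) {
        int[] sorted = a.clone();
        Arrays.sort(sorted);
        int lo = 0;
        int hi = sorted.length - 1;
        int count = 0;
        while (lo < hi) {
            int sum = sorted[lo] + sorted[hi];
            if (sum == target) {
                count++;
                lo++;
                hi--;
            } else if (sum < target) {
                lo++;
            } else {
                hi--;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        int[] data = new int[20_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1; // distinct values, so both methods agree
        }
        long t0 = System.nanoTime();
        int fast = countPairsSorted(data, 20_001);
        long t1 = System.nanoTime();
        int slow = countPairsBruteForce(data, 20_001);
        long t2 = System.nanoTime();
        // Printing the counts keeps the results observable.
        System.out.println("sorted: " + fast + " pairs, " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("brute:  " + slow + " pairs, " + (t2 - t1) / 1_000_000 + " ms");
    }
}
```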
As stated by LIuxed, sort operation takes some time. If you invest time in sorting, why do you then not take advantage of the fact that list items are sorted?
If list elements are sorted, you could use a binary search algorithm... start in the middle of the array, and check if you go 1/2 up, or 1/2 down. As a result, you can get faster performance with sorted array for seeking a value. Such an algorithm is already implemented in the Arrays.binarySearch methods.
See https://docs.oracle.com/javase/7/docs/api/java/util/Arrays.html#binarySearch(int[],%20int)
You will notice the difference when you sort just once, but seek many times.
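The "sort once, seek many times" point can be sketched as follows. The class name SortOnceSearchMany and the method countHits are hypothetical; the only library calls used are Arrays.sort and Arrays.binarySearch, the latter returning a non-negative index when the key is found.

```java
import java.util.Arrays;

class SortOnceSearchMany {
    // Counts how many of the queries occur in data, sorting data only once.
    static int countHits(int[] data, int[] queries) {
        int[] sorted = data.clone();
        Arrays.sort(sorted); // paid once: O(n log n)
        int hits = 0;
        for (int q : queries) {
            // Each lookup is O(log n) on the sorted copy.
            if (Arrays.binarySearch(sorted, q) >= 0) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        int[] data = {42, 7, 19, 3, 88};
        int[] queries = {7, 8, 88, 100};
        System.out.println(countHits(data, queries));
    }
}
```

With only a single lookup, the sort dominates and a plain linear scan would be cheaper; the sorted copy pays off as the number of queries grows.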
Calling the Arrays.sort(myArray) method takes time that grows with the array size. For an int[] the JDK uses a dual-pivot quicksort, which typically does O(n log n) work, so it touches the elements many times rather than in a single pass.
That's why this method takes a noticeable amount of time, depending on the array size.
I removed everything from your sortAndFindPairsOfNumbers method, just kept
Arrays.sort(array);
But the time difference is still large.
This means most of the time is taken by the sort method.
So your assumption that the second approach is the efficient one is not correct here. It's all about input size.
If you set numberRange to, let's say, 1000, then SortAndFindPairs will be faster.
Some Background
Last week I did a problem in my textbook that told me to generate 20 random numbers and then put brackets around successive numbers that are equal.
Consider the following which my program outputs
697342(33)(666)(44)69(66)1(88)
What I need to do
The next problem was basically to find the longest run of equal successive numbers and put brackets around only that run. If you have
1122345(6666)
you need to put brackets around the four 6's, since they form the longest run.
I've finished all other problems in the chapter I am studying ( Arrays and ArrayLists), however I can't seem to figure this one out.
Here is the solution that I have made for putting brackets around successive numbers:
class Seq
{
private ArrayList<Integer> nums;
private Random randNum;
public Seq()
{
nums = new ArrayList<Integer>();
randNum = new Random();
}
public void fillArrList()
{
for (int i = 0 ; i < 20 ; i++)
{
int thisRandNum = randNum.nextInt(9)+1;
nums.add(thisRandNum);
}
}
public String toString() {
StringBuilder result = new StringBuilder();
boolean inRun = false;
for (int i = 0; i < nums.size(); i++) {
if (i < nums.size() - 1 && nums.get(i).equals(nums.get(i + 1))) {
if (!inRun) {
result.append("(");
}
result.append(nums.get(i));
inRun = true;
} else {
result.append(nums.get(i));
if (inRun) {
result.append(")");
}
inRun = false;
}
}
return result.toString();
}
}
My Thoughts
Iterate through the whole list. Make a count variable, that keeps track of how many numbers are successive of each other. I.e 22 would have a count of 2. 444 a count of 3
Next make an oldCount, which compares the current count to the oldCount. We only want to keep going if our new count is greater than oldCount
After that we need a way to get the starting index of the largest count variable, as well as the end.
Is my way of thinking correct? I'm having trouble updating the oldCount and count variables while comparing them, since their values constantly change. I'm not looking for the code, but rather some valuable hints.
My count is resetting like this
int startIndex, endIndex = 0;
int count = 0;
int oldCount = 0;
for(int i = 0 ; i < nums.size(); i++)
{
if(nums.get(i) == nums.get(i+1) && count >= oldCount)
{
count++;
}
oldCount = count;
}
Only after walking all elements you will know the longest subsequence.
11222333333444555
11222(333333)444555
Hence only after the loop you can insert both brackets.
So you have to maintain a local optimum: start index plus length or last index of optimum.
And then for every sequence the start index of the current sequence.
As asked:
The optimal state (sequence) and the current state are two things. One cannot in advance say that any current state is the final optimal state.
public String toString() {
// Begin with as "best" solution the empty sequence.
int startBest = 0; // Starting index
int lengthBest = 0; // Length of sequence
// Determine sequences:
int startCurrent = 0; // Starting index of most current/last sequence
for (int i = 0; i < nums.size(); i++) {
// Can we add the current num to the current sequence?
if (i == startCurrent || nums.get(i).equals(nums.get(i - 1))) {
// We can extend the current sequence with this i:
int lengthCurrent = i - startCurrent + 1;
if (lengthCurrent > lengthBest) { // Current length better?
// New optimum:
startBest = startCurrent;
lengthBest = lengthCurrent;
}
} else {
// A different num, start here.
// As we had already a real sequence (i != 0), no need for
// checking for a new optimum with length 1.
startCurrent = i;
}
}
// Now we found the best solution.
// Create the result:
StringBuilder result = new StringBuilder();
for (int i = 0; i < nums.size(); i++) {
result.append(nums.get(i));
}
// Insert the right ')' first as its index changes by 1 after inserting '('.
result.insert(startBest + lengthBest, ")");
result.insert(startBest, "(");
return result.toString();
}
The first problem is how to find the end of a sequence, and set the correct start of the sequence.
The problem with the original algorithm is that it handles just one sequence (one subsequence start).
The way you have suggested could work. And then, if the new count is greater than oldCount, you'll want to store an additional number in another variable: the index where the longest sequence begins.
Then later, you can go and insert the ( at the position of that index.
i.e. if you have 11223456666.
The biggest sequence starts with the first number 6. That is at index 7, so store that 7 in a variable.
I think you need to iterate the entire list even though the current count is lower than the oldCount, what about e.g. 111224444?
Keep 4 variables while iterating the list: highestStartIndex, highestEndIndex, highestCount and currentCount. Iterate the entire list and use currentCount to count equal neighbouring numbers. Update the highest* variables when a completed currentCount is higher than highestCount. Lastly, write the numbers out with parentheses using the *Index variables.
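The four-variable idea above could look roughly like this. It is a sketch rather than a drop-in replacement for the Seq class: LongestRun and bracketLongestRun are invented names, the list is passed in as a parameter, and currentCount is derived from a currentStart index instead of being incremented.

```java
import java.util.Arrays;
import java.util.List;

class LongestRun {
    // Wraps the first longest run of equal neighbours in parentheses.
    static String bracketLongestRun(List<Integer> nums) {
        int highestStartIndex = 0;
        int highestEndIndex = -1;
        int highestCount = 0;
        int currentStart = 0;
        for (int i = 0; i < nums.size(); i++) {
            // A run is complete at the last element or when the next element differs.
            boolean runEnds = i == nums.size() - 1
                    || !nums.get(i).equals(nums.get(i + 1));
            if (runEnds) {
                int currentCount = i - currentStart + 1;
                if (currentCount > highestCount) {
                    highestCount = currentCount;
                    highestStartIndex = currentStart;
                    highestEndIndex = i;
                }
                currentStart = i + 1;
            }
        }
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < nums.size(); i++) {
            if (highestCount > 0 && i == highestStartIndex) {
                result.append('(');
            }
            result.append(nums.get(i));
            if (i == highestEndIndex) {
                result.append(')');
            }
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(bracketLongestRun(Arrays.asList(1, 1, 1, 2, 2, 4, 4, 4, 4)));
    }
}
```

Because the comparison is strict, a later run of the same length does not replace an earlier one.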
Heyo,
I'm trying to implement a function that takes an integer as input.
I also have an array of integers in ascending order.
Now I'm trying to find the closest lower and the closest higher number to my single integer.
I'd like to return them as an array, but I've only found a solution that finds THE one closest number to a given input.
public int getClosestTimeValue(int time) {
int nearest = -1;
int bestDistanceFoundYet = Integer.MAX_VALUE;
int[] array = null;
// We iterate on the array...
for (int i = 0; i < array.length; i++) {
// if we found the desired number, we return it.
if (array[i] == time) {
return array[i];
} else {
int d = Math.abs(time - array[i]);
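if (d < bestDistanceFoundYet) {
// Remember the new best distance so later elements are compared against it.
bestDistanceFoundYet = d;
nearest = array[i];
}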
}
}
return nearest;
}
Has anyone an idea how I can solve this problem in java?
Thank you, Lucas
If you are not required to use an array directly, then you can use a NavigableSet and the ceiling()/floor() methods to get the nearest greater/lesser elements in the set. Example:
NavigableSet<Integer> values = new TreeSet<Integer>();
for (int x : array) { values.add(x); }
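Integer lower = values.floor(time); // greatest element <= time, or null if there is none
Integer higher = values.ceiling(time); // least element >= time, or null if there is none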
If you are required to use an array (homework?) then find a good reference on binary search.
At the moment you are searching for one time only. To find both the closest lower and closest higher time, you should have two variables. Then you can check whether the iterated time is lower or higher than the input and store the values in corresponding variables. Also at the moment you are returning only one value, but in order to return multiple values, you should do it through an array.
I'm not sure whether it answers your question, but here's how I would solve the problem:
int[] array; // Array of times, declared and filled elsewhere.
// Method which returns the array of found times.
public int[] getClosestTime(int time) {
int closestLowerTime = 0;
int closestHigherTime = 100; // Value bigger than the largest value in the array.
int[] times = new int[2]; // Array for keeping the two closest values.
// Iterating the array.
for (int i = 0; i < array.length; i++) {
// Finding the two closest values.
int difference = time - array[i];
if (difference > 0 && array[i] > closestLowerTime) {
closestLowerTime = array[i];
} else if (difference < 0 && array[i] < closestHigherTime) {
closestHigherTime = array[i];
}
}
times[0] = closestLowerTime;
times[1] = closestHigherTime;
return times;
}
This finds both the closest lower and higher value and returns them as an array. At the moment I solved it as the times were between 0 and 100, but in case you don't know the largest time value, you can find it through another loop which iterates through the array and stores the largest value in closestHigherTime. I didn't find a proper way to return the exact value through an array, but is it required?
As the array is sorted (ascending):
1) Check the middle two elements. If both are less than the number, repeat step 1 on the right half;
if both are greater than the number, repeat step 1 on the left half. Otherwise, the two selected elements are your answer.
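A binary-search version can be sketched with the JDK's Arrays.binarySearch instead of hand-written halving; that swap is a choice made here, not part of the answer above. The sketch assumes the array is sorted ascending and has no duplicates, and the names ClosestNeighbours and closestLowerAndHigher are hypothetical. A slot in the result is null when no lower or higher element exists.

```java
import java.util.Arrays;

class ClosestNeighbours {
    // Returns {closest lower, closest higher} for a sorted array without duplicates.
    static Integer[] closestLowerAndHigher(int[] sorted, int time) {
        int pos = Arrays.binarySearch(sorted, time);
        int lowerIdx;
        int higherIdx;
        if (pos >= 0) {
            // Exact hit: the neighbours sit on either side of it.
            lowerIdx = pos - 1;
            higherIdx = pos + 1;
        } else {
            // Miss: binarySearch returns -(insertionPoint) - 1.
            int insertionPoint = -pos - 1;
            lowerIdx = insertionPoint - 1;
            higherIdx = insertionPoint;
        }
        Integer lower = lowerIdx >= 0 ? sorted[lowerIdx] : null;
        Integer higher = higherIdx < sorted.length ? sorted[higherIdx] : null;
        return new Integer[] {lower, higher};
    }

    public static void main(String[] args) {
        int[] times = {10, 20, 30, 40};
        System.out.println(Arrays.toString(closestLowerAndHigher(times, 25)));
    }
}
```

Whether an exact match should count as its own neighbour is a design decision; here it is skipped so that the result is strictly lower and strictly higher.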
I'm looking over an assignment that I finished a few days ago and realized I'm not supposed to use constants. The assignment is the well-known "find the largest sum of a sub-array of integers both positive and negative recursively using a divide and conquer approach" problem. My algorithm works, but a part of it uses a constant in order to figure out the largest sum of sub-arrays that include the middle of the array.
Here's the relevant code:
lfSum = Integer.MIN_VALUE;
sum = 0;
// Sum from left to mid
for (int i = mid; i >= LF; i--) {
sum += array[i];
if (sum > lfSum) {
lfSum = sum;
if (lfSum > lfMax) {
lfMax = lfSum;
}
}
}
rtSum = Integer.MIN_VALUE;
sum = 0;
// Sum from mid to right
for (int j = mid+1; j <= RT; j++) {
sum += array[j];
if (sum > rtSum) {
rtSum = sum;
if (rtSum > rtMax) {
rtMax = rtSum;
}
}
}
// Largest sum spanning whole array
midMax = lfSum + rtSum; // midMax = leftMid + midRight;
What this does is it loops through each half of the entire array and checks to see if the sum is larger than the smallest integer possible in case the entire array is negative. If it is, it sets that side's max sum to sum's value. If that value is larger than what one of the recursive calls returned (lfMax or rtMax), set the respective side's recursive value to it.
Like I said earlier, this works perfectly well, but I'm not supposed to be using "Integer.MIN_VALUE". Is there another way around this? Obviously I could initialize lfSum/rtSum to the numerical value of Integer.MIN_VALUE, but I'd like to know if there are any other options.
I've tried removing rtSum/lfSum and just comparing sum to the recursive values, and initializing lfSum/rtSum to 0, but both did not work correctly. Thanks for taking the time to read this!
You can initialize lfSum as null:
Integer lfSum = null;
And modify the if condition like this:
if (lfSum == null || sum > lfSum) {
lfSum = sum;
if (lfSum > lfMax) {
lfMax = lfSum;
}
}
Similar strategy applies to rtSum.
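Put together for both halves, the null-as-sentinel idea might look like the sketch below. It only covers the part that spans the middle, leaves out the recursive lfMax/rtMax bookkeeping from the question, and assumes lf <= mid < rt so that each loop runs at least once; CrossingSum and maxCrossingSum are illustrative names.

```java
class CrossingSum {
    // Largest sum of a subarray that contains both array[mid] and array[mid + 1].
    // Assumes lf <= mid < rt, so both loops execute at least once.
    static int maxCrossingSum(int[] array, int lf, int mid, int rt) {
        Integer lfSum = null; // null means "no sum seen yet"
        int sum = 0;
        for (int i = mid; i >= lf; i--) {
            sum += array[i];
            if (lfSum == null || sum > lfSum) {
                lfSum = sum;
            }
        }
        Integer rtSum = null;
        sum = 0;
        for (int j = mid + 1; j <= rt; j++) {
            sum += array[j];
            if (rtSum == null || sum > rtSum) {
                rtSum = sum;
            }
        }
        return lfSum + rtSum;
    }

    public static void main(String[] args) {
        int[] array = {-2, 1, -3, 4, -1, 2, 1, -5, 4};
        System.out.println(maxCrossingSum(array, 0, 4, 8));
    }
}
```

An alternative with the same effect and no boxing is to seed each side with its first element (array[mid] and array[mid + 1]) and start the loops one step further out.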
I want to find the majority in array (number that appears most of the time).
I have a sorted array and use these cycles:
for(int k = 1;k < length;k++)
{
if(arr[k-1] == arr[k])
{
count++;
if(count > max)
{
max = count;
maxnum = arr[k-1];
}
} else {
count = 0;
}
}
or
for(int h=0;h<length;h++)
{
for(int l=1;l<length;l++)
{
if(arr[h] == arr[l])
{
count++;
if(count > max)
{
max = count;
maxnum = arr[h];
}
} else count = 0;
}
}
They are similar. When I try them on small arrays, everything seems to be OK. But on a long array with N elements, 0<=N<=500000, each element K with 0<=K<=10^9, they give wrong answers.
Here is solution with mistake http://ideone.com/y2gvnX. I know there are better algos to find majority but i just need to know where is my mistake.
I really can't find it :( Will really appreciate help!
First of all, you should use the first algorithm, as your array is sorted. 2nd algorithm runs through the array twice unnecessarily.
Now your first algorithm is almost correct, but it has two problems: -
The first problem is you are setting count = 0, in else part,
rather it should be set to 1. Because every element comes at least
once.
Secondly, you don't need to set max every time in your if. Just
increment count, till the if-condition is satisfied, and as soon
as condition fails, check for the current count with current
max, and reset the current max accordingly.
This way, your max will not be checked on every iteration, but only when a mismatch is found.
So, you can try out this code: -
// initialize `count = 1`, and `maxnum = Integer.MIN_VALUE`.
int count = 1;
int max = 0;
int maxnum = Integer.MIN_VALUE;
for(int k = 1;k < length;k++)
{
if(arr[k-1] == arr[k]) {
count++; // Keep on increasing count till elements are equal
} else {
// if Condition fails, check for the current count v/s current max
if (max < count) { // Move this from `if` to `else`
max = count;
maxnum = arr[k - 1];
}
count = 1; // Reset count to 1. As every value comes at least once.
}
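}
// The last run never reaches the else branch, so check it once after the loop.
if (length > 0 && max < count) {
max = count;
maxnum = arr[length - 1];
}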
Note : -
The problem with this approach is that if two numbers, say 1 and 3, both occur the maximum number of times, then only one of them ends up in maxnum and the other is ignored (with the strict comparison above, the one that comes first in the array is kept). But they both should be considered.
So, basically, you cannot use a for loop and maintain a count to take care of this problem.
A better way is to create a Map<Integer, Integer>, and store the count of each value in there. And then later on sort that Map on value.
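A counting-map version that keeps ties could be sketched like this. Instead of sorting the map by value, it tracks the maximum count while counting and then collects every key that reaches it; that is a simplification chosen here, and the names MostFrequent and mostFrequent are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MostFrequent {
    // Returns every value that reaches the maximum count, in ascending order.
    static List<Integer> mostFrequent(int[] arr) {
        Map<Integer, Integer> counts = new HashMap<>();
        int max = 0;
        for (int value : arr) {
            // merge adds 1 to the existing count, or stores 1 for a new value.
            int c = counts.merge(value, 1, Integer::sum);
            max = Math.max(max, c);
        }
        List<Integer> result = new ArrayList<>();
        for (Map.Entry<Integer, Integer> e : counts.entrySet()) {
            if (e.getValue() == max) {
                result.add(e.getKey());
            }
        }
        Collections.sort(result); // HashMap iteration order is unspecified
        return result;
    }

    public static void main(String[] args) {
        System.out.println(mostFrequent(new int[] {1, 1, 2, 2, 3}));
    }
}
```

This works on unsorted input as well, at the cost of extra memory for the map.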
Your first algorithm looks correct to me. The second one (which is what your linked code uses) needs some initialization each time through the loop. Also, the inner loop does not need to start at 1 each time; it can start at h + 1:
for(int h=0; h<length; h++)
{
count = 1; // for the element at arr[h]
for(int l=h + 1; l<length; l++)
{
if(arr[h] == arr[l])
{
count++;
}
}
if(count > max)
{
max = count;
maxnum = arr[h];
}
}
The first algorithm is much better for sorted arrays. Even for unsorted arrays, it would be cheaper to sort the array (or a copy of it) and then use the first algorithm rather than use the second.
Note that if there are ties (such as for the array [1, 1, 2, 2, 3] as per @Rohit's comment), this will find the first value (in the sort order) that has the maximum count.
The error I can readily see is that if all elements are distinct, then the max at end is 0.
However it has to be 1.
So when you update count in "else" case, update it to 1 instead of 0, as a new element has been discovered, and its count is 1.
Your first algorithm only makes sense if the array is sorted.
Your second algorithm just sets count to zero in the wrong place. You want to set count to zero before you enter the inner for loop.
for(int h=0;h<length;h++)
{
count = 0;
for(int l=0;l<length;l++)
{
if(arr[h] == arr[l])
{
count++;
if(count > max)
{
max = count;
maxnum = arr[h];
}
}
}
}
Also, you don't need to check count each time in the inner loop.
max = 0;
for(int h=0;h<length;h++)
{
count = 0;
for(int l=0;l<length;l++)
{
if(arr[h] == arr[l])
count++;
}
if(count > max)
{
max = count;
maxnum = arr[h];
}
}