Determining Highest and Lowest Numbers in an Array - java

I'm trying to solve a problem where I need to write Java code to find the two highest and the two lowest numbers in an array, given the conditions below:
- Every element is a real number
- Every element is random
Any ideas on the best approach?

You have to examine every number, so your best algorithm is linear in the length of the array.
The standard approach is to just scan the array, keeping track of the two smallest and the two largest numbers that you've seen so far.
So, given that firstMin, secondMin, firstMax, secondMax are respectively the smallest, second smallest, largest and second largest values that you've seen so far, on the next iteration of the loop:
if (value > firstMax) {
    secondMax = firstMax;
    firstMax = value;
} else if (value > secondMax) {
    secondMax = value;
}

if (value < firstMin) {
    secondMin = firstMin;
    firstMin = value;
} else if (value < secondMin) {
    secondMin = value;
}
At the end of this block, we maintain the invariant that firstMin, secondMin, firstMax, secondMax are respectively the smallest, second smallest, largest and second largest values seen so far; correctness follows by induction over the loop.
This algorithm is linear in the length of the array, examines each value exactly once, and makes at most a small constant number of comparisons per element. It is also O(1) in space, using only four extra memory locations for the top and bottom two values.
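For completeness, here is one way that loop body might be wrapped into a full method. This is my own sketch rather than the original answer's code; the only extra subtlety is seeding the four trackers from the first two elements so the comparisons above start from a valid state, which assumes the array has at least two elements.

public static double[] twoMinTwoMax(double[] values) {
    // Seed the trackers from the first two elements (assumes values.length >= 2).
    double firstMin = Math.min(values[0], values[1]);
    double secondMin = Math.max(values[0], values[1]);
    double firstMax = secondMin;
    double secondMax = firstMin;

    for (int i = 2; i < values.length; i++) {
        double value = values[i];
        if (value > firstMax) {
            secondMax = firstMax;
            firstMax = value;
        } else if (value > secondMax) {
            secondMax = value;
        }
        if (value < firstMin) {
            secondMin = firstMin;
            firstMin = value;
        } else if (value < secondMin) {
            secondMin = value;
        }
    }
    return new double[] { firstMin, secondMin, secondMax, firstMax };
}

Calling it on {3.5, -1.0, 7.2, 0.0} returns {-1.0, 0.0, 3.5, 7.2}.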

Have variables to keep track of the min, second_min, max, and second_max values that you have seen so far.
You can go through the elements of the array one by one, and update your min/max variables accordingly. Here are some cases to consider:
If the current element is smaller than your min, save your min to second_min and update your min.
If the current element is larger than your max, save your max to second_max and update your max.
If the current element is smaller than second_min but larger than min, update second_min only.
If the current element is larger than second_max but smaller than max, update second_max only.

Is super-optimized performance necessary? If not,
Arrays.sort( array );
Look at the first and last two elements.
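As a sketch of that approach, assuming a double[] named array and java.util.Arrays on the classpath (copying first so the caller's order isn't disturbed, which may or may not matter to you):

double[] sorted = array.clone();    // keep the original order intact
Arrays.sort(sorted);
double smallest = sorted[0];
double secondSmallest = sorted[1];
double largest = sorted[sorted.length - 1];
double secondLargest = sorted[sorted.length - 2];

That's O(n log n) instead of O(n), which rarely matters unless the array is huge or this runs in a tight loop.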

I think the best way is to use the array as a heap, or another specialised data structure, if you can. Keeping the structure balanced and ordered will be most efficient for you. This implies keeping some indices of your array empty, since the array index itself carries information about your ADT. http://en.wikipedia.org/wiki/Heap_(data_structure)

Related

Get the Sum of all positive numbers in a Circularly ordered array in O(Log N)

I've been given an exercise in class that requires the following:
An array v formed by N integers is circularly ordered if either the array is ordered, or else v[N-1] ≤ v[0] and ∃k with 0 < k < N such that ∀i ≠ k, v[i] ≤ v[i+1].
Example:
Given a circularly ordered array with at most 10 positive items, calculate the sum of the positive values. For the example I was given, the answer would be 27.
I'm required to implement it in Java using a divide-and-conquer scheme, with a worst-case complexity of O(log N), N being the array size.
So far I've tried pivoting until I find a positive value; then, knowing the other positive values are adjacent, it's possible to sum the at most 10 positive values with O(1) complexity.
I thought of doing a binary search to achieve O(log N) complexity, but that would not follow the divide-and-conquer pattern.
I can easily implement an O(N) solution like this:
public static int addPositives(int[] vector) {
    return addPositives(vector, 0, vector.length - 1);
}

public static int addPositives(int[] vector, int i0, int iN) {
    int k = (i0 + iN) / 2;
    if (iN - i0 > 1) {
        return addPositives(vector, i0, k) + addPositives(vector, k + 1, iN);
    } else {
        int temp = 0;
        for (int i = i0; i <= iN; i++) {
            if (vector[i] > 0) temp += vector[i];
        }
        return temp;
    }
}
However, trying to land the O(log N) bound gets me nowhere. How could I achieve it?
You can improve your divide and conquer implementation to meet the required running time if you prune irrelevant branches of the recursion.
After you divide the current array into two sub-arrays, compare the first and last elements of each sub-array. If both are negative and the first is smaller than the last, you know for sure that all the elements in this sub-array are negative and you don't have to make the recursive call on it (since you know it will contribute 0 to the total sum).
You can also stop the recursion when all the elements in a sub-array are positive (again verifiable by comparing the first and last elements of an ordered sub-array); in that case you sum all of its elements, which is cheap because the whole array contains at most 10 positive values, so there's no point in continuing the recursion.
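Here is a sketch of that pruned recursion. This is my own illustration of the idea, not the original poster's code; the method name is made up, it assumes the input really is circularly ordered with at most 10 positive values (so summing an all-positive slice is O(1)), and the endpoint comparison is kept strict so that ties never cause a wrong prune.

public static int addPositivesPruned(int[] v) {
    if (v.length == 0) return 0;
    return addPositivesPruned(v, 0, v.length - 1);
}

private static int addPositivesPruned(int[] v, int lo, int hi) {
    if (hi - lo <= 1) {                        // tiny slice: check the one or two values directly
        int sum = 0;
        if (v[lo] > 0) sum += v[lo];
        if (hi != lo && v[hi] > 0) sum += v[hi];
        return sum;
    }
    if (v[lo] < v[hi]) {                       // strictly increasing endpoints: the slice is ordered
        if (v[hi] <= 0) return 0;              // every value <= v[hi] <= 0: prune this branch
        if (v[lo] > 0) {                       // every value >= v[lo] > 0: at most 10 of them, sum directly
            int sum = 0;
            for (int i = lo; i <= hi; i++) sum += v[i];
            return sum;
        }
    }
    int mid = lo + (hi - lo) / 2;              // otherwise keep dividing
    return addPositivesPruned(v, lo, mid) + addPositivesPruned(v, mid + 1, hi);
}

In the typical case only the sub-arrays containing the wrap-around point or a sign boundary keep recursing, which is what brings the cost down towards O(log N).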
My advice for the O(log N) bound would be a direct comparison to check the second of the two criteria, the last item being no greater than the first:
return vector[0] >= vector[iN - 1];
If you want something with greater complexity, I forget the algorithm's name, but you could take the midpoint of the array and do two ordered searches from there: one from the middle to the start, and one from the middle to the end.

Finding number of subarrays whose sum equals `k`

We are given an array of integers and a value k. We need to find the total number of sub-arrays whose sum equals k.
I found some interesting code online (on Leetcode) which is as follows:
import java.util.HashMap;
import java.util.Map;

public class Solution {
    public int subarraySum(int[] nums, int k) {
        int sum = 0, result = 0;
        Map<Integer, Integer> preSum = new HashMap<>();
        preSum.put(0, 1);                        // the empty prefix has sum 0
        for (int i = 0; i < nums.length; i++) {
            sum += nums[i];                      // running prefix sum
            if (preSum.containsKey(sum - k)) {
                result += preSum.get(sum - k);   // every earlier prefix with sum - k ends a subarray of sum k here
            }
            preSum.put(sum, preSum.getOrDefault(sum, 0) + 1);
        }
        return result;
    }
}
To understand it, I walked through some specific examples like [1,1,1,1,1] with k=3 and [1,2,3,0,3,2,6] with k=6. While the code works perfectly in both cases, I fail to follow how it actually computes the output.
I have two specific points of confusion:
1) Why does the code keep adding up the values in the array without ever zeroing sum out? For example, in the case of [1,1,1,1,1] with k=3, once sum=3, don't we need to reset sum to zero? Doesn't leaving sum un-reset interfere with finding later subarrays?
2) Shouldn't we simply do result++ when we find a subarray of sum k? Why do we add preSum.get(sum-k) instead?
Let's handle your first point of confusion first:
The reason the code keeps summing the array and doesn't reset sum is because we are saving the sum in preSum (previous sums) as we go. Then, any time we get to a point where sum-k is a previous sum (say at index i), we know that the sum between index i and our current index is exactly k.
For example, with [1,2,3,0,3,2,6] and k=6, suppose our current index is 4 and i=2. The running sum at the current index is 1+2+3+0+3 = 9, and the saved sum of the first i=2 elements is 1+2 = 3. Since 9 - 3 = 6, the sum of the elements from index 2 through index 4 (inclusive) is 6.
Another way to think about this is that discarding the prefix [1,2] from the array (at our current index of 4) leaves a subarray, [3,0,3], whose sum is 6, for the same reason as above.
Using this method of thinking, we can say we want to discard from the front of the array until we are left with a subarray of sum k. We could do this by saying, for each index, "discard just 1, then discard 1+2, then discard 1+2+3, etc" (these numbers are from our example) until we found a subarray of sum k (k=6 in our example).
That gives a perfectly valid solution, but notice we would be doing this at every index of our array, and thus summing the same numbers over and over. A way to save computation would be to save these sums for later use. Even better, we already sum these same numbers to get our current sum, so we can just save that total as we go.
To find a subarray, we can just look through our saved sums, subtracting each from the current sum and testing whether what we are left with is k. It is a bit annoying to have to try every saved sum, so we can rearrange the equation: if sum - x = k, then sum - k = x. This way we can just check whether x = sum - k is a saved sum, and if it is, we know we have found a subarray summing to k. A hash map makes this lookup efficient.
Now for your second point of confusion:
Most of the time you are right, upon finding an appropriate subarray we could just do result++. Almost always, the values in preSum will be 1, so result+=preSum.get(sum-k) will be equivalent to result+=1, or result++.
The only time it isn't is when preSum.put is called on a sum that has been reached before. How can we get back to a sum we already had? The only way is with either negative numbers, which cancel out previous numbers, or with zero, which doesn't affect the sum at all.
Basically, we get back to a previous sum when a subarray's sum is equal to 0. Two examples of such subarrays are [2,-2] or the trivial [0]. With such a subarray, when we find a later, adjoining subarray with sum k, we need to add more than 1 to result as we have found more than one new subarray, one with the zero-sum subarray (sum=k+0) and one without it (sum=k).
This is the reason for that +1 in the preSum.put as well. Every time we reach the same sum again, we have found another zero-sum subarray. With two zero-sum subarrays, finding a new adjoining subarray with sum=k actually gives 3 subarrays: the new subarray (sum=k), the new subarray plus the first zero-sum (sum=k+0), and the original with both zero-sums (sum=k+0+0). This logic holds for higher numbers of zero-sum subarrays as well.
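As a concrete trace of that zero-sum case (my own example, not one from the question): take nums = [3, 0, 0, 3] with k = 3. The running sums are 3, 3, 3, 6, so after the third element preSum maps 0 to 1 and 3 to 3, and result is already 3 (for [3], [3,0] and [3,0,0]). At the last element sum = 6 and sum - k = 3, so result increases by preSum.get(3) = 3, which counts the three subarrays ending there: [3], [0,3] and [0,0,3]. The method returns 6, matching the six subarrays of sum 3.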

Finding mean and median in constant time

This is a common interview question.
You have a stream of numbers coming in (say, more than a million). The numbers are in the range [0, 999].
Implement a class which supports three methods in O(1)
* insert(int i);
* getMean();
* getMedian();
This is my code.
public class FindAverage {
    private int[] store;
    private long size;
    private long total;
    private int highestIndex;
    private int lowestIndex;

    public FindAverage() {
        store = new int[1000];
        size = 0;
        total = 0;
        highestIndex = Integer.MIN_VALUE;
        lowestIndex = Integer.MAX_VALUE;
    }

    public void insert(int item) throws OutOfRangeException {
        if (item < 0 || item > 999) {
            throw new OutOfRangeException();
        }
        store[item]++;
        size++;
        total += item;
        highestIndex = Integer.max(highestIndex, item);
        lowestIndex = Integer.min(lowestIndex, item);
    }

    public float getMean() {
        return (float) total / size;
    }

    public float getMedian() {
    }
}
I can't seem to think of a way to get the median in O(1) time.
Any help appreciated.
You have already done all the heavy lifting, by building the store counters. Together with the size value, it's easy enough.
You simply start iterating the store, summing up the counts until you reach half of size. That is your median value, if size is odd. For even size, you'll grab the two surrounding values and get their average.
Performance is O(1000/2) on average, which means O(1), since it doesn't depend on n, i.e. performance is unchanged even if n reaches into the billions.
Remember, O(1) doesn't mean instant, or even fast. As Wikipedia says it:
An algorithm is said to be constant time (also written as O(1) time) if the value of T(n) is bounded by a value that does not depend on the size of the input.
In your case, that bound is 1000.
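To make that concrete, here is a getMedian sketch that would drop into the asker's class above. It is my own code, not the answerer's; it walks the store counters once and, for even sizes, averages the two middle values as described.

public float getMedian() {
    if (size == 0) {
        throw new IllegalStateException("no values inserted yet");
    }
    long lowerRank = (size - 1) / 2;            // 0-based rank of the lower middle value
    long upperRank = size / 2;                  // equals lowerRank when size is odd
    long seen = 0;
    int lower = -1, upper = -1;
    for (int value = 0; value < store.length; value++) {
        seen += store[value];                   // cumulative count of values <= value
        if (lower < 0 && seen > lowerRank) {
            lower = value;
        }
        if (seen > upperRank) {
            upper = value;
            break;
        }
    }
    return (lower + upper) / 2f;                // average of the two middle values
}

The loop is bounded by the 1000 possible values, so it is still O(1) in the sense discussed above.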
The possible values that you can read are quite limited - just 1000. So you can think of implementing something like a counting sort - each time a number is input you increase the counter for that value.
To implement the median in constant time, you will need two numbers: the median index (i.e. the value of the median) and the number of values you've read that are to the left (or right) of the median. I will stop here, hoping you will be able to figure out how to continue on your own.
EDIT (as pointed out in the comments): you already have the array with the sorted element counts (store) and you know the number of elements to the left of the median (size/2). You only need to glue the logic together. I would like to point out that if you use linear additional memory, you won't need to iterate over the whole array on each insert.
For the general case, where the range of elements is unbounded, no such data structure exists under any comparison-based algorithm, as it would allow O(n) sorting.
Proof: assume such a DS exists, and let it be D.
Let A be the input array for sorting. (Assume A.size() is even for simplicity; that can be relaxed pretty easily by adding a garbage element and discarding it later.)
sort(A):
    ds = new D()
    for each x in A:
        ds.add(x)
    m1 = min(A) - 1
    m2 = max(A) + 1
    for (i = 0; i < A.size(); i++):
        ds.add(m1)
    # at this point, ds.median() is the smallest element in A
    for (i = 0; i < A.size(); i++):
        yield ds.median()
        # each two insertions advance the median by 1
        ds.add(m2)
        ds.add(m2)
Claim 1: This algorithm runs in O(n).
Proof: add() and median() are O(1) operations, so each iteration costs O(1), and the number of iterations is linear, so the total complexity is linear.
Claim 2: The output is sorted(A).
Proof (guidelines): after inserting m1 n times, the median is the smallest element in A. Each pair of insertions after that advances the median by one item, and since it advances in sorted order, the total output is sorted.
Since the above algorithm would sort in O(n), which is not possible under the comparison model, such a DS does not exist.
QED.

Java, Finding Kth largest value from the array [duplicate]

This question already has answers here:
How to find the kth largest element in an unsorted array of length n in O(n)?
(32 answers)
Closed 7 years ago.
I had an interview with Facebook and they asked me this question.
Suppose you have an unordered array with N distinct values
$input = [3,6,2,8,9,4,5]
Implement a function that finds the Kth largest value.
EG: If K = 0, return 9. If K = 1, return 8.
What I did was this method.
private static int getMax(Integer[] input, int k)
{
    List<Integer> list = Arrays.asList(input);
    Set<Integer> set = new TreeSet<Integer>(list);
    list = new ArrayList<Integer>(set);
    int value = (list.size() - 1) - k;
    return list.get(value);
}
I just tested it, and the method works fine for the question as given. However, the interviewer said: to make your life harder, let's assume that your array contains millions of numbers; then building those collections becomes too slow. What do you do in that case?
As a hint, he suggested using a min heap. Based on my knowledge, each child value of a heap should not be more than the root value. So in this case, if we assume that 3 is the root, then 6 is its child and its value is greater than the root's value. I'm probably wrong, but what do you think, and what would an implementation based on a min heap look like?
He has actually given you the whole answer. Not just a hint.
Your understanding describes a max heap, not a min heap; the rest of its workings follow from the difference.
In a min heap, the root has the minimum value (less than its children).
So what you need to do is iterate over the array and populate a min heap with K elements.
Once that's done, the heap automatically has the lowest of them at its root.
Now, for each (next) element you read from the array:
-> check if the value is greater than the root of the min heap.
-> If yes, remove the root from the min heap and add the value to it.
After you traverse the whole array, the root of the min heap will automatically contain the kth largest element,
and all the other elements in the heap (k-1 elements, to be precise) will be larger than it.
Here is an implementation of the min heap approach using PriorityQueue in Java. Complexity: O(n log k).
import java.util.PriorityQueue;

public class LargestK {

    private static Integer largestK(Integer array[], int k) {
        // Min heap holding the k+1 largest values seen so far (k is 0-indexed).
        PriorityQueue<Integer> queue = new PriorityQueue<Integer>(k + 1);
        int i = 0;
        while (i <= k) {
            queue.add(array[i]);
            i++;
        }
        for (; i < array.length; i++) {
            Integer value = queue.peek();
            if (array[i] > value) {
                queue.poll();            // drop the smallest kept value
                queue.add(array[i]);     // keep the new, larger one
            }
        }
        return queue.peek();             // root = (k+1)-th largest = index k
    }

    public static void main(String[] args) {
        Integer array[] = new Integer[] {3, 6, 2, 8, 9, 4, 5};
        System.out.println(largestK(array, 3));
    }
}
Output: 5
The code loops over the array, which is O(n). The size of the PriorityQueue (min heap) is k, so any operation on it is O(log k). In the worst-case scenario, in which the numbers arrive sorted ascending, the complexity is O(n log k), because for each element you need to remove the top of the heap and insert a new element.
Edit: Check this answer for O(n) solution.
You can probably make use of PriorityQueue as well to solve this problem:
public int findKthLargest(int[] nums, int k) {
    int p = 0;
    int numElements = nums.length;
    // create a priority queue where all the elements of nums will be stored
    PriorityQueue<Integer> pq = new PriorityQueue<Integer>();
    // place all the elements of the array into this priority queue
    for (int n : nums) {
        pq.add(n);
    }
    // extract the kth largest element
    while (numElements - k + 1 > 0) {
        p = pq.poll();
        k++;
    }
    return p;
}
From the Java doc:
Implementation note: this implementation provides O(log(n)) time for the enqueuing and dequeuing methods (offer, poll, remove() and add); linear time for the remove(Object) and contains(Object) methods; and constant time for the retrieval methods (peek, element, and size).
The for loop runs n times, so the complexity of the above algorithm is O(n log n).
A heap-based solution is ideal when the number of elements in the array/stream is unknown. But what if the count is known and you still want an optimized solution in (expected) linear time?
We can use Quick Select, discussed here.
Array = [3,6,2,8,9,4,5]
Let's choose the pivot as the first element:
pivot = 3 (at index 0).
Now partition the array so that all elements less than or equal to the pivot are on the left side and the numbers greater than 3 are on the right side, like it's done in Quick Sort (discussed on my blog).
So after the first pass: [2,3,6,8,9,4,5]
The pivot index is now 1 (i.e. it's the second lowest element). Now apply the same process again.
Choose 6 now, the value at the index after the previous pivot: [2,3,4,5,6,8,9]
So now 6 is at the proper place.
Keep checking if you have found the appropriate number (kth largest or kth lowest in each iteration). If it's found you are done else continue.
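A sketch of that quickselect in Java (my own code, not taken from the linked discussion): it uses the first element of each range as the pivot, as in the walkthrough above, partitions in place, and descends into only one side. Expected running time is O(n), worst case O(n^2), and it reorders the input array.

// Returns the kth largest value, with k = 0 meaning the maximum (as in the question).
public static int kthLargest(int[] a, int k) {
    int target = a.length - 1 - k;          // index of that value once the array is sorted ascending
    int lo = 0, hi = a.length - 1;
    while (lo < hi) {
        int p = partition(a, lo, hi);       // pivot lands at its final sorted position p
        if (p == target) break;
        if (p < target) lo = p + 1; else hi = p - 1;
    }
    return a[target];
}

// Lomuto-style partition around a[lo]: smaller-or-equal values to the left, larger to the right.
private static int partition(int[] a, int lo, int hi) {
    int pivot = a[lo];
    int store = lo;
    for (int i = lo + 1; i <= hi; i++) {
        if (a[i] <= pivot) {
            store++;
            int tmp = a[store]; a[store] = a[i]; a[i] = tmp;
        }
    }
    int tmp = a[lo]; a[lo] = a[store]; a[store] = tmp;
    return store;
}

For the example array {3,6,2,8,9,4,5}, kthLargest(array, 1) returns 8.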
One approach for constant values of k is to use a partial insertion sort.
(This assumes distinct values, but can easily be altered to work with duplicates as well)
last_min = -inf
output = []
for i in (0..k)
    min = +inf
    for value in input_array
        if value < min and value > last_min
            min = value
    output[i] = min
    last_min = min
print output[k-1]
(That's pseudocode for selecting the k smallest values; mirror the comparisons to select the k largest instead. Either way, it should be easy enough to implement in Java; see the sketch below.)
The overall complexity is O(n*k), which means it works well only if k is constant or known to be less than log(n).
On the plus side, it is a really simple solution. On the minus side, it is not as efficient as the heap solution.
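A possible Java rendering of that idea (my own sketch, with a made-up method name): it mirrors the pseudocode to walk down from the largest value, so k = 0 returns the maximum as in the question, and it assumes distinct values that are all below Integer.MAX_VALUE.

// Partial selection: O(n*k) time, no extra memory beyond two locals.
public static int kthLargestBySelection(int[] input, int k) {
    int lastMax = Integer.MAX_VALUE;        // value selected in the previous round
    for (int i = 0; i <= k; i++) {
        int max = Integer.MIN_VALUE;
        for (int value : input) {
            if (value > max && value < lastMax) {
                max = value;                // largest value not yet selected
            }
        }
        lastMax = max;
    }
    return lastMax;
}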

Find n highest numbers

There are millions of integers given. How do I find the n largest numbers among them? Note that since the input is huge, I can't store all of it in memory.
Any suggestions?
Thanks
shag
You can iterate through all the numbers (reading them from a medium one by one, for example) and only keep a list of the 10 maximum numbers seen so far.
In pseudo code:
max_numbers = new int[n]   (seed it with the first n numbers read)
while not end of file:
    read number
    if number > min(max_numbers):
        replace the minimum value of max_numbers with number
Just have an array of n elements, and if you find a number that is bigger than the smallest in the array, replace that smallest entry with it.
You could keep an extra variable holding the smallest number in the array, so you only iterate over the array when you know you have to change something.
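A minimal Java sketch combining the two suggestions above (my own code, with made-up names): a fixed array holds the n largest values seen so far, and the index of its smallest entry is cached so that most incoming numbers are rejected with a single comparison.

class TopN {
    private final int[] best;
    private int minIdx = 0;                   // index of the smallest kept value

    TopN(int n) {
        best = new int[n];
        java.util.Arrays.fill(best, Integer.MIN_VALUE);
    }

    void offer(int number) {
        if (number <= best[minIdx]) {
            return;                           // not among the n largest so far
        }
        best[minIdx] = number;                // replace the smallest kept value
        for (int i = 0; i < best.length; i++) {
            if (best[i] < best[minIdx]) {     // re-find the smallest kept value
                minIdx = i;
            }
        }
    }

    int[] result() {
        return best.clone();                  // unsorted; sort it if you need order
    }
}

Feed every number to offer() as you read it; at the end, result() holds the n largest.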
Use an array of length 10; as you run through the numbers, replace its smallest entry whenever you find a bigger one.
private int _highest = Integer.MIN_VALUE;
private int _lowest = Integer.MAX_VALUE;

public void largest(int _current) {
    if (_current >= _highest) {
        _highest = _current;
    } else if (_current <= _lowest) {
        _lowest = _current;
    }
}
What I would do:
Maintain a min-heap of size n (its root is the smallest of the n values kept so far).
EDITED
I recommend forming a priority queue (heap based), taking Michael's suggestion to its logical conclusion. Don't store 10, store n.
PQ a[n];
a.insert(input);
O(log n) FTW
