Operation and time complexity in arbitrary array - java

Suppose you're given an arbitrary array of length n. Which of the following operations can you perform on the array in worst-case O(1) time?
A. Remove the ith element, decreasing the size by 1
B. Insert an element at ith position , increasing the size by 1
C. find the maximum element
D. Swap the elements at location i and j
In this question, I'm not sure about the definition of arbitrary array. It seems that D is correct, but I'm not sure. Could anybody explain it? Many thanks!

I think by arbitrary, it might just mean that the values of the array don't matter.
A) Removing the ith element and decreasing the array's size by one has an O(n) complexity: when removing an ith element, you have to move all the other elements down one index. If you removed the 0th element, you would need to move n - 1 elements down one index.
B) Inserting an element at the ith position and increasing the array's size by one has an O(n) complexity, for a similar reason. If I add an element to the beginning of the array, I need to move all the other elements up one. Not to mention that because arrays have fixed size, I would also need to create a new array and copy the former elements.
C) Finding the maximum element will at least take O(n) times. You need to look through each element to find the max value, right? Well, if the max value is at position n at the end of the array, you need to go through all the values up to that point.
D) Swapping elements is just O(1). It's a constant operation that doesn't require for or while loops or what not.
Something like that. Hope it helps!

Related

Best way to retrieve K largest elements from large unsorted arrays?

I recently had a coding test during an interview. I was told:
There is a large unsorted array of one million ints. User wants to retrieve K largest elements. What algorithm would you implement?
During this, I was strongly hinted that I needed to sort the array.
So, I suggested to use built-in sort() or maybe a custom implementation if performance really mattered. I was then told that using a Collection or array to store the k largest and for-loop it is possible to achieve approximately O(N), in hindsight, I think it's O(N*k) because each iteration needs to compare to the K sized array to find the smallest element to replace, while the need to sort the array would cause the code to be at least O(N log N).
I then reviewed this link on SO that suggests priority queue of K numbers, removing the smallest number every time a larger element is found, which would also give O(N log N). Write a program to find 100 largest numbers out of an array of 1 billion numbers
Is the for-loop method bad? How should I justify pros/cons of using the for-loop or the priorityqueue/sorting methods? I'm thinking that if the array is already sorted, it could help by not needing to iterate through the whole array again, i.e. if some other method of retrieval is called on the sorted array, it should be constant time. Is there some performance factor when running the actual code that I didn't consider when theorizing pseudocode?
Another way of solving this is using Quickselect. This should give you a total average time complexity of O(n). Consider this:
Find the kth largest number x using Quickselect (O(n))
Iterate through the array again (or just through the right-side partition) (O(n)) and save all elements ≥ x
Return your saved elements
(If there are repeated elements, you can avoid them by keeping count of how many duplicates of x you need to add to the result.)
The difference between your problem and the one in the SO question you linked to is that you have only one million elements, so they can definitely be kept in memory to allow normal use of Quickselect.
There is a large unsorted array of one million ints. The user wants to retrieve the K largest elements.
During this, I was strongly hinted that I needed to sort the array.
So, I suggested using a built-in sort() or maybe a custom
implementation
That wasn't really a hint I guess, but rather a sort of trick to deceive you (to test how strong your knowledge is).
If you choose to approach the problem by sorting the whole source array using the built-in Dual-Pivot Quicksort, you can't obtain time complexity better than O(n log n).
Instead, we can maintain a PriorytyQueue which would store the result. And while iterating over the source array for each element we need to check whether the queue has reached the size K, if not the element should be added to the queue, otherwise (is size equals to K) we need to compare the next element against the lowest element in the queue - if the next element is smaller or equal we should ignore it if it is greater the lowest element has to be removed and the new element needs to be added.
The time complexity of this approach would be O(n log k) because adding a new element into the PriorytyQueue of size k costs O(k) and in the worst-case scenario this operation can be performed n times (because we're iterating over the array of size n).
Note that the best case time complexity would be Ω(n), i.e. linear.
So the difference between sorting and using a PriorytyQueue in terms of Big O boils down to the difference between O(n log n) and O(n log k). When k is much smaller than n this approach would give a significant performance gain.
Here's an implementation:
public static int[] getHighestK(int[] arr, int k) {
Queue<Integer> queue = new PriorityQueue<>();
for (int next: arr) {
if (queue.size() == k && queue.peek() < next) queue.remove();
if (queue.size() < k) queue.add(next);
}
return toIntArray(queue);
}
public static int[] toIntArray(Collection<Integer> source) {
return source.stream().mapToInt(Integer::intValue).toArray();
}
main()
public static void main(String[] args) {
System.out.println(Arrays.toString(getHighestK(new int[]{3, -1, 3, 12, 7, 8, -5, 9, 27}, 3)));
}
Output:
[9, 12, 27]
Sorting in O(n)
We can achieve worst case time complexity of O(n) when there are some constraints regarding the contents of the given array. Let's say it contains only numbers in the range [-1000,1000] (sure, you haven't been told that, but it's always good to clarify the problem requirements during the interview).
In this case, we can use Counting sort which has linear time complexity. Or better, just build a histogram (first step of Counting Sort) and look at the highest-valued buckets until you've seen K counts. (i.e. don't actually expand back to a fully sorted array, just expand counts back into the top K sorted elements.) Creating a histogram is only efficient if the array of counts (possible input values) is smaller than the size of the input array.
Another possibility is when the given array is partially sorted, consisting of several sorted chunks. In this case, we can use Timsort which is good at finding sorted runs. It will deal with them in a linear time.
And Timsort is already implemented in Java, it's used to sort objects (not primitives). So we can take advantage of the well-optimized and thoroughly tested implementation instead of writing our own, which is great. But since we are given an array of primitives, using built-in Timsort would have an additional cost - we need to copy the contents of the array into a list (or array) of wrapper type.
This is a classic problem that can be solved with so-called heapselect, a simple variation on heapsort. It also can be solved with quickselect, but like quicksort has poor quadratic worst-case time complexity.
Simply keep a priority queue, implemented as binary heap, of size k of the k smallest values. Walk through the array, and insert values into the heap (worst case O(log k)). When the priority queue is too large, delete the minimum value at the root (worst case O(log k)). After going through the n array elements, you have removed the n-k smallest elements, so the k largest elements remain. It's easy to see the worst-case time complexity is O(n log k), which is faster than O(n log n) at the cost of only O(k) space for the heap.
Here is one idea. I will think for creating array (int) with max size (2147483647) as it is max value of int (2147483647). Then for every number in for-each that I get from the original array just put the same index (as the number) +1 inside the empty array that I created.
So in the end of this for each I will have something like [1,0,2,0,3] (array that I created) which represent numbers [0, 2, 2, 4, 4, 4] (initial array).
So to find the K biggest elements you can make backward for over the created array and count back from K to 0 every time when you have different element then 0. If you have for example 2 you have to count this number 2 times.
The limitation of this approach is that it works only with integers because of the nature of the array...
Also the representation of int in java is -2147483648 to 2147483647 which mean that in the array that need to be created only the positive numbers can be placed.
NOTE: if you know that there is max number of the int then you can lower the created array size with that max number. For example if the max int is 1000 then your array which you need to create is with size 1000 and then this algorithm should perform very fast.
I think you misunderstood what you needed to sort.
You need to keep the K-sized list sorted, you don't need to sort the original N-sized input array. That way the time complexity would be O(N * log(K)) in the worst case (assuming you need to update the K-sized list almost every time).
The requirements said that N was very large, but K is much smaller, so O(N * log(K)) is also smaller than O(N * log(N)).
You only need to update the K-sized list for each record that is larger than the K-th largest element before it. For a randomly distributed list with N much larger than K, that will be negligible, so the time complexity will be closer to O(N).
For the K-sized list, you can take a look at the implementation of Is there a PriorityQueue implementation with fixed capacity and custom comparator? , which uses a PriorityQueue with some additional logic around it.
There is an algorithm to do this in worst-case time complexity O(n*log(k)) with very benign time constants (since there is just one pass through the original array, and the inner part that contributes to the log(k) is only accessed relatively seldomly if the input data is well-behaved).
Initialize a priority queue implemented with a binary heap A of maximum size k (internally using an array for storage). In the worst case, this has O(log(k)) for inserting, deleting and searching/manipulating the minimum element (in fact, retrieving the minimum is O(1)).
Iterate through the original unsorted array, and for each value v:
If A is not yet full then
insert v into A,
else, if v>min(A) then (*)
insert v into A,
remove the lowest value from A.
(*) Note that A can return repeated values if some of the highest k values occur repeatedly in the source set. You can avoid that by a search operation to make sure that v is not yet in A. You'd also want to find a suitable data structure for that (as the priority queue has linear complexity), i.e. a secondary hash table or balanced binary search tree or something like that, both of which are available in java.util.
The java.util.PriorityQueue helpfully guarantees the time complexity of its operations:
this implementation provides O(log(n)) time for the enqueing and dequeing methods (offer, poll, remove() and add); linear time for the remove(Object) and contains(Object) methods; and constant time for the retrieval methods (peek, element, and size).
Note that as laid out above, we only ever remove the lowest (first) element from A, so we enjoy the O(log(k)) for that. If you want to avoid duplicates as mentioned above, then you also need to search for any new value added to it (with O(k)), which opens you up to a worst-case overall scenario of O(n*k) instead of O(n*log(k)) in case of a pre-sorted input array, where every single element v causes the inner loop to fire.

Insertion Time complexity on Java arrays

I have array of 10 elements. Integer[] arr = new Integer10;
What is the time complexity when I add 5th element like arr[5]=999;. I strongly believe that time complexity in arrays for insertion is O(1) , but on some literatures its said that for insertion on arrays (not arraylist) T.C is O(N) because the shifting occurs. How there can be shifting occured? It is not dynamic array.
Here the sources which I believe they are wrong:
Big O Notation Arrays vs. Linked List insertions
O(1) accurately describes inserting at the end of the array. However,
if you're inserting into the middle of an array, you have to shift all
the elements after that element, so the complexity for insertion in
that case is O(n) for arrays. End appending also discounts the case
where you'd have to resize an array if it's full.
https://iq.opengenus.org/time-complexity-of-array/
Inserting and deleting elements take linear time depending on the
implementation. If we want to insert an element at a specific index,
we need to skip all elements from that index to the right by one
position. This takes linear time O(N).
3.https://www.log2base2.com/data-structures/array/insert-element-particular-index-array.html
If we want to insert an element to index 0, then we need to shift all
the elements to right.
For example, if we have 5 elements in the array and need to insert an
element in arr[0], we need to shift all those 5 elements one position
to the right.
There's a misunderstanding:
arr[5]=999; is a direct assignment, not an 'insertion' where shifting would occur, so yes: it's O(1)
Insertion on the other hand consists of the following steps:
if array is full, create a new array
and copy data below index (<) to 0
copy data equal and above index (>=) to index + 1
assign value arr[index]=newValue;
So insertion, because it has to copy/move n-index elements on each insert, is considered O(n), because big-oh notation is designed to showcase worst-case.
Now, with MXX registers and parallel piplines on GPUs and stuff like that, we can speed this up by another big (constant) factor, but the problem itself remains in the O(n) category.
Not to argue with other answers, but I think the OP ask about direct assignment on array arr[5]=999; This operation will assign the value 999 to 5th index in array (if array has length 6 or more). It will not do any shifting or allocation. and for this operation the notion is definitely is O(1)
There are two use cases of inserting data in the array.
Inplace replacement
Shift the data
In first case (Inplace replacement), we generally have code like arr[5] = 99. For this use case, we are not going to shift the data of index 5 to index 6 and so on. Hence, the time complexity of such operation is O(1).
Whereas, in second case (Shift data), for inserting data at index 5, we will first shift the data of index 5 to index 6, and index 6 data to index 7 and so on. In this case, the time complexity of such operation would be O(N)

Find the only unique element in an array of a million elements

I was asked this question in a recent interview.
You are given an array that has a million elements. All the elements are duplicates except one. My task is to find the unique element.
var arr = [3, 4, 3, 2, 2, 6, 7, 2, 3........]
My approach was to go through the entire array in a for loop, and then create a map with index as the number in the array and the value as the frequency of the number occurring in the array. Then loop through our map again and return the index that has value of 1.
I said my approach would take O(n) time complexity. The interviewer told me to optimize it in less than O(n) complexity. I said that we cannot, as we have to go through the entire array with a million elements.
Finally, he didn't seem satisfied and moved onto the next question.
I understand going through million elements in the array is expensive, but how could we find a unique element without doing a linear scan of the entire array?
PS: the array is not sorted.
I'm certain that you can't solve this problem without going through the whole array, at least if you don't have any additional information (like the elements being sorted and restricted to certain values), so the problem has a minimum time complexity of O(n). You can, however, reduce the memory complexity to O(1) with a XOR-based solution, if every element is in the array an even number of times, which seems to be the most common variant of the problem, if that's of any interest to you:
int unique(int[] array)
{
int unpaired = array[0];
for(int i = 1; i < array.length; i++)
unpaired = unpaired ^ array[i];
return unpaired;
}
Basically, every XORed element cancels out with the other one, so your result is the only element that didn't cancel out.
Assuming the array is un-ordered, you can't. Every value is mutually exclusive to the next so nothing can be deduced about a value from any of the other values?
If it's an ordered array of values, then that's another matter and depends entirely on the ordering used.
I agree the easiest way is to have another container and store the frequency of the values.
In fact, since the number of elements in the array was fix, you could do much better than what you have proposed.
By "creating a map with index as the number in the array and the value as the frequency of the number occurring in the array", you create a map with 2^32 positions (assuming the array had 32-bit integers), and then you have to pass though that map to find the first position whose value is one. It means that you are using a large auxiliary space and in the worst case you are doing about 10^6+2^32 operations (one million to create the map and 2^32 to find the element).
Instead of doing so, you could sort the array with some n*log(n) algorithm and then search for the element in the sorted array, because in your case, n = 10^6.
For instance, using the merge sort, you would use a much smaller auxiliary space (just an array of 10^6 integers) and would do about (10^6)*log(10^6)+10^6 operations to sort and then find the element, which is approximately 21*10^6 (many many times smaller than 10^6+2^32).
PS: sorting the array decreases the search from a quadratic to a linear cost, because with a sorted array we just have to access the adjacent positions to check if a current position is unique or not.
Your approach seems fine. It could be that he was looking for an edge-case where the array is of even size, meaning there is either no unmatched elements or there are two or more. He just went about asking it the wrong way.

Why does Insertion Sort not need the swap operation?

So I am currently learning Java and I was asking myself, why the Insertion-Sort method doesn´t have the need to use the swap operation? As Far as I understood, elements get swapped so wouldn´t it be usefull to use the swap operation in this sorting algorithm?
As I said, I am new to this but I try to understand the background of these algorithms , why they are the way they actually are
Would be happy for some insights :)
B.
Wikipedia's article for Insertion sort states
Each iteration, insertion sort removes one element from the input
data, finds the location it belongs within the sorted list, and
inserts it there. It repeats until no input elements remain. [...] If
smaller, it finds the correct position within the sorted list, shifts
all the larger values up to make a space, and inserts into that
correct position.
You can consider this shift as an extreme swap. What actually happens is the value is stored in a placeholder and checked versus the other values. If those values are smaller, they are simply shifted, ie. replace the previous (or next) position in the list/array. The placeholder's value is then put in the position from which the element was shifted.
Insertion Sort does not perform swapping. It performs insertions by shifting elements in a sequential list to make room for the element that is being inserted.
That is why it is an O(N^2) algorithm: for each element out of N, there can be O(N) shifts.
So, you Could do insertion sort by swapping.
But, is that the best way to do it? you should think of what a swap is...
temp = a
a=b
b=temp
there are 3 assignments that take place for a single swap.
eg. [2,3,1]
If the above list is to be sorted, you could 1. swap 3 and 1 then, 2. swap 1 and 2
total 6 assignments
Now,
Instead of swapping, if you just shift 2 and 3 one place to the right ( 1 assignment each) and then put 1 in array[0], you would end up with just 3 assignments instead of the 6 you would do with swapping.

Heapsorting algorithm

so I have a program that performs a heap sort and I have a remove element function. I do this by taking the last element all the way on the right side and then replace the n-th element with it. To maintain the sort, I then bubble the element down to its correct place in the heap. My friend doesn't seem to think that this will work. will it?
Heapsort is using max-heap, each time, you need to switch the root node, which is the max one, with the nth element, then put the max element into result list from right to left.
Then, bubble the element down to its proper position, and decrease the heap's size from n to n-1.
Then, the new root node of new heap is the max element among the remaining n-1 elements. Just recursively switch root element with the n-1th element, and again put the max element to the result list, decrease heap's size by 1 until the heap's size is 0

Categories