If you have a map or list with a bunch of doubles between -1 and 1, you can order them from -1 to 1.
However, that ordering is perfect, e.g. 0.4 is after 0.3 but before 0.5.
Is it possible to simulate putting the numbers generally in the correct place, but not perfectly?
It's hard to explain what I mean, but I have drawn a diagram to help.
The x's are points on the number line.
I don't want the numbers to be perfectly sorted, nor randomly sorted, but in-between: roughly sorted.
Is this possible, and if so, how?
You can use a TreeMap (the constructor with a Comparator argument) or Collections.sort(List, Comparator) with a Comparator based on Double.compare(double, double).
That is correct for perfect sorting. For "roughly sorted" you have to write your own Comparator logic, and please define precisely how you actually want to sort.
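For the perfect-sorting part, a minimal sketch of the Comparator-based approach (the sample values are my own, not from the question):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SortDoubles {
    public static void main(String[] args) {
        // A perfect ascending sort using Double.compare as the Comparator.
        List<Double> values = new ArrayList<>(Arrays.asList(0.5, -0.3, 0.9, -1.0, 0.0));
        values.sort(Double::compare);
        System.out.println(values); // [-1.0, -0.3, 0.0, 0.5, 0.9]
    }
}
```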
You haven't defined 'roughly sorted', but one option is "binning". Split the interval into n bins of width intervalSize/n. Store the bins in an array or list, where each bin is a list of values. Iterate once over the input set and distribute the values into their appropriate bins, which takes linear time. You can make the result closer to fully sorted by increasing the number of bins.
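A minimal sketch of the binning idea for values in [-1, 1] (the method name roughSort and the sample data are my own): values end up ordered by bin, but unordered inside each bin.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RoughBinSort {
    // Distribute values in [-1, 1] into n bins, then concatenate the bins.
    static List<Double> roughSort(List<Double> input, int n) {
        List<List<Double>> bins = new ArrayList<>();
        for (int i = 0; i < n; i++) bins.add(new ArrayList<>());
        for (double v : input) {
            int bin = (int) ((v + 1.0) / 2.0 * n); // map [-1, 1] to [0, n)
            if (bin == n) bin = n - 1;             // v == 1.0 falls into the last bin
            bins.get(bin).add(v);
        }
        List<Double> result = new ArrayList<>();
        for (List<Double> b : bins) result.addAll(b);
        return result;
    }

    public static void main(String[] args) {
        // 0.9 and 0.85 share a bin, so 0.9 stays before 0.85: roughly sorted.
        System.out.println(roughSort(Arrays.asList(0.9, -0.8, 0.1, -0.1, 0.85), 4));
        // [-0.8, -0.1, 0.1, 0.9, 0.85]
    }
}
```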
You can sort them first and then perform a fixed number of random swaps of direct neighbours. For example, with a list of size n, you swap n times at random positions.
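A sketch of the sort-then-shuffle-neighbours idea (the method name roughen and the sample values are my own):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class RoughlySorted {
    // Sort perfectly, then perform n random swaps of direct neighbours
    // so the list ends up only roughly sorted.
    static void roughen(List<Double> list, Random rnd) {
        Collections.sort(list);
        int n = list.size();
        for (int i = 0; i < n; i++) {
            int pos = rnd.nextInt(n - 1);     // pick a random neighbour pair
            Collections.swap(list, pos, pos + 1);
        }
    }

    public static void main(String[] args) {
        List<Double> list = new ArrayList<>(Arrays.asList(0.4, -0.2, 0.9, -0.7, 0.1));
        roughen(list, new Random());
        System.out.println(list); // roughly ascending, e.g. [-0.7, 0.1, -0.2, 0.4, 0.9]
    }
}
```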
Related
I am currently implementing the Mash algorithm for genome comparison. For this I need to create a sketch S(A) for each genome, which is a set of a certain size s containing the lowest hash values corresponding to k-mers in the genome. To compare two genomes, I am computing the Jaccard index, for which I need an additional sketch of the union of the two genomes, i.e. S(A ∪ B) for two genomes A and B. This should contain the s lowest hash values found in the union of A and B.
I am computing the sketches as a TreeSet because in the original algorithm to compute a sketch I need to remove the biggest value from the set whenever I add a new value that is lower and the sketch has already reached the maximum size s. This is very easily accomplished using TreeSet because the largest value will be in the last position of the set.
I have computed a union of two sketches and now want to remove the larger elements to reduce the sketch size to size s. I first implemented this using a while loop and removing the last element until reaching the desired size.
The following is my code using an example TreeSet and size s = 10:
SortedSet<Integer> example = new TreeSet<>();
for (int i = 0; i < 15; i++) {
example.add(i);
}
while (example.size() > 10) example.remove(example.last());
However, in the real application sketch sizes will be much larger and the size of the union can be up to two times the size of a single sketch. Trying to find a more efficient way to reduce the sketch size I found that you could convert the TreeSet to an array. Thus, my second approach would be the following:
Object[] temp = example.toArray();
int value = (int) temp[10];
example = example.headSet(value);
So, here I am getting the value at index s from the array, which I can then use to create a headSet from the TreeSet.
Now, I am wondering if there is a more efficient way to reduce the size of the TreeSet, where I don't need to iterate over the size of the TreeSet over and over again or generate an extra array.
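One alternative sketch, assuming you only need the headSet view and not a physically smaller set: walk the TreeSet's iterator s steps to find the cut-off value, then take a headSet. This visits only the first s elements and avoids copying the whole set into an array (the method name trimToSize is my own):

```java
import java.util.Iterator;
import java.util.SortedSet;
import java.util.TreeSet;

public class SketchTrim {
    // Find the (s+1)-th smallest value by iterating s+1 steps, then take the
    // exclusive headSet, which keeps exactly the s smallest values.
    static SortedSet<Integer> trimToSize(TreeSet<Integer> set, int s) {
        if (set.size() <= s) return set;
        Iterator<Integer> it = set.iterator();           // ascending order
        int cutoff = 0;
        for (int i = 0; i <= s; i++) cutoff = it.next(); // value at index s
        return set.headSet(cutoff);                      // exclusive bound
    }

    public static void main(String[] args) {
        TreeSet<Integer> example = new TreeSet<>();
        for (int i = 0; i < 15; i++) example.add(i);
        System.out.println(trimToSize(example, 10)); // [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    }
}
```

Note that headSet returns a view backed by the original set, so the memory of the larger set is not released; if that matters, removing via pollLast() (available on TreeSet/NavigableSet) in a loop is O(k log n) for k removals and does shrink the set.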
I have a list and I am trying to find the sums of combinations of the list's entries, except the combinations where both values to add are equal to each other (i.e. 2+2 would not be added), and add them to another list.
As an example:
[1,2,3] would yield the list of sums [3,4,5], because 1+2=3, 1+3=4, and 2+3=5.
However, my issue arises from not knowing how many sums will be produced. I am working in Java and am limited to native arrays, so the size of the array has to be set before I can add the sum values to it.
I know I would not be able to find the exact size of the sum list due to the possibility that a sum would not get added if the two elements are the same, but I am trying to ballpark it so I don't have massive arrays.
The closest 'formula' I have found is the following, but it is never precisely the maximum value for any list:
(list length of original numbers * list length of original numbers) / 2
I am trying to keep time complexity in mind, so keeping a running count of how many sums there are, setting an array to that size, and looping through the original list again would not be efficient.
Any suggestions?
Can you add equal sums to the array? I mean, if your array is {1,2,3,4,5}, would you print both results 1+5=6 and 2+4=6?
If your answer is yes, you can take the length of the array, multiply it by one less, and divide by 2. For instance, for our array {1,2,3,4,5} the length is 5, so the length of our result array will be 5*4/2 = 10.
Or you can use lists in Java if you can't define a length for the array beforehand. Keep that in mind.
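A sketch of both points together: the n*(n-1)/2 bound gives the maximum number of pairs, and a list sidesteps the fixed-size problem entirely (the method name pairSums is my own):

```java
import java.util.ArrayList;
import java.util.List;

public class PairSums {
    // At most n*(n-1)/2 pair sums exist, so that bound can pre-size the
    // list; pairs of equal values are skipped as the question requires.
    static List<Integer> pairSums(int[] a) {
        List<Integer> sums = new ArrayList<>(a.length * (a.length - 1) / 2);
        for (int i = 0; i < a.length; i++) {
            for (int j = i + 1; j < a.length; j++) {
                if (a[i] != a[j]) sums.add(a[i] + a[j]); // skip e.g. 2+2
            }
        }
        return sums;
    }

    public static void main(String[] args) {
        System.out.println(pairSums(new int[]{1, 2, 3})); // [3, 4, 5]
    }
}
```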
I am trying to figure out the following problem:
Given a set S of N positive integers, the task is to divide them into K subsets such that the sum of the element values in every one of the K subsets is equal.
I want to do this for a set of no more than 10 integers, with values no bigger than 10, and fewer than 5 subsets.
All integers need to be distributed, and only perfect solutions (meaning all subset sums are equal, no approximations) are accepted.
I want to solve it recursively using backtracking. Most resources I found online used other approaches I did not understand, such as bitmasks, or only handled two subsets rather than K subsets.
My first idea was to
Sort the set in ascending order, check all base cases (e.g. an even distribution is not possible), and calculate the average value every subset has to reach so that all subsets are equal.
Go through each subset, filling each one (starting with the biggest values first) until that average value is reached (meaning it is full).
If the average value for a subset can't be met (the undistributed values are too big, etc.), go back and try another combination for the previous subset.
Keep going back whenever dead ends are encountered.
Stop when all dead ends have been exhausted or a perfect solution has been found.
Unfortunately I am really struggling with this, especially with implementing the backtrack and retrying new combinations.
Any help is appreciated!
The given set S with N elements has 2^N subsets (well explained here: https://www.mathsisfun.com/activity/subsets.html ). A partition is a grouping of the set's elements into non-empty subsets, in such a way that every element is included in one and only one of the subsets. The total number of partitions of an n-element set is the Bell number B(n).
A solution for this problem can be implemented as follows:
1) Create all possible partitions of the set S, called P(S).
2) Loop over P(S) and filter out every partition in which the sums of the element values of the subsets do not all match.
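As an alternative to enumerating all partitions, the bucket-filling backtracking the question describes can be sketched directly: assign each value to one of K buckets, backtracking when a bucket would overflow the target sum (class and method names are my own; sorting descending and skipping symmetric empty buckets are pruning choices, not requirements):

```java
import java.util.Arrays;
import java.util.Comparator;

public class KPartition {
    // For n <= 10 values and k < 5 subsets this brute force is fast enough.
    static boolean canPartition(int[] values, int k) {
        int total = Arrays.stream(values).sum();
        if (k <= 0 || total % k != 0) return false; // base case: even split impossible
        // Biggest values first prunes dead ends earlier.
        Integer[] sorted = Arrays.stream(values).boxed()
                .sorted(Comparator.reverseOrder()).toArray(Integer[]::new);
        return place(sorted, 0, new int[k], total / k);
    }

    static boolean place(Integer[] v, int idx, int[] buckets, int target) {
        if (idx == v.length) return true;           // every value distributed
        for (int b = 0; b < buckets.length; b++) {
            if (buckets[b] + v[idx] <= target) {
                buckets[b] += v[idx];               // try this bucket
                if (place(v, idx + 1, buckets, target)) return true;
                buckets[b] -= v[idx];               // backtrack: undo and retry
            }
            if (buckets[b] == 0) break;             // empty buckets are interchangeable
        }
        return false;                               // dead end for this prefix
    }

    public static void main(String[] args) {
        // sum 20, k = 4, target 5: {5}, {4,1}, {3,2}, {3,2}
        System.out.println(canPartition(new int[]{4, 3, 2, 3, 5, 2, 1}, 4)); // true
    }
}
```

Because every value is placed and no bucket ever exceeds the target, reaching the end means all k buckets hold exactly total/k, so only perfect solutions are accepted.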
So I am currently learning Java and I was asking myself why the insertion sort method doesn't need to use the swap operation. As far as I understood it, elements get swapped, so wouldn't it be useful to use the swap operation in this sorting algorithm?
As I said, I am new to this, but I am trying to understand the background of these algorithms and why they are the way they are.
Would be happy for some insights :)
B.
Wikipedia's article for Insertion sort states
Each iteration, insertion sort removes one element from the input data, finds the location it belongs within the sorted list, and inserts it there. It repeats until no input elements remain. [...] If smaller, it finds the correct position within the sorted list, shifts all the larger values up to make a space, and inserts into that correct position.
You can consider this shift as an extreme swap. What actually happens is that the value is stored in a placeholder and compared against the other values. If those values are larger, they are simply shifted, i.e. each one replaces the previous (or next) position in the list/array. The placeholder's value is then put into the position from which the last element was shifted.
Insertion Sort does not perform swapping. It performs insertions by shifting elements in a sequential list to make room for the element that is being inserted.
That is why it is an O(N^2) algorithm: for each element out of N, there can be O(N) shifts.
So, you could do insertion sort by swapping.
But is that the best way to do it? You should think about what a swap is...
temp = a
a = b
b = temp
There are 3 assignments that take place for a single swap.
E.g. [2,3,1].
If the above list is to be sorted, you could 1. swap 3 and 1, then 2. swap 1 and 2.
That is a total of 6 assignments.
Now, instead of swapping, if you just shift 2 and 3 one place to the right (1 assignment each) and then put 1 into array[0], you end up with just 3 assignments instead of the 6 you would need with swapping.
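The shift-based insertion sort described above, as a runnable sketch:

```java
import java.util.Arrays;

public class InsertionSort {
    // The current value sits in a placeholder (key) while larger values
    // shift one place right, costing one assignment each instead of a
    // three-assignment swap; the key then drops into the gap.
    static void insertionSort(int[] a) {
        for (int i = 1; i < a.length; i++) {
            int key = a[i];                 // the placeholder
            int j = i - 1;
            while (j >= 0 && a[j] > key) {
                a[j + 1] = a[j];            // shift larger value right
                j--;
            }
            a[j + 1] = key;                 // insert into the gap
        }
    }

    public static void main(String[] args) {
        int[] a = {2, 3, 1};
        insertionSort(a);
        System.out.println(Arrays.toString(a)); // [1, 2, 3]
    }
}
```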
How do you return the number of distinct/unique values in an array? For example:
int[] a = {1,2,2,4,5,5};
Set<Integer> s = new HashSet<Integer>();
for (int i : a) s.add(i);
int distinctCount = s.size();
A set stores each unique element (as defined by .equals()) only once, and you can use this to simplify the problem. Create a Set (I'd use a HashSet), iterate over your array adding each integer to the Set, then return the Set's .size().
An efficient method: Sort the array with Arrays.sort. Write a simple loop to count up adjacent equal values.
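A sketch of that sort-and-scan approach (the method name distinctCount is my own): after Arrays.sort, equal values are adjacent, so one pass counts the distinct ones.

```java
import java.util.Arrays;

public class DistinctCount {
    static int distinctCount(int[] a) {
        if (a.length == 0) return 0;
        int[] copy = a.clone();            // keep the caller's array intact
        Arrays.sort(copy);
        int count = 1;                     // the first value is always new
        for (int i = 1; i < copy.length; i++) {
            if (copy[i] != copy[i - 1]) count++; // count each run of equals once
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(distinctCount(new int[]{1, 2, 2, 4, 5, 5})); // 4
    }
}
```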
It really depends on the number of elements in the array. If you're not dealing with a large amount of integers, a HashSet or a binary tree would probably be the best approach. On the other hand, if you have a large array of diverse integers (say, more than a billion) it might make sense to allocate a 2^32 / 8 = 512 MByte array in which each bit represents the existence or non-existence of an integer, and then count the number of set bits at the end.
A binary tree approach would take n * log n time, while the array approach would take n time. A binary tree also requires two pointers per node, so its memory usage would be a lot higher as well. Similar considerations apply to hash tables.
Of course, if your set is small, then just use the inbuilt HashSet.
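A sketch of the bit-array idea using java.util.BitSet, restricted to non-negative ints for simplicity (an assumption on my part; covering the full int range would need an offset or a raw long[] of 2^26 entries):

```java
import java.util.BitSet;

public class BitSetDistinct {
    // Each bit records whether a value has been seen; duplicates are no-ops.
    // cardinality() counts the set bits, i.e. the distinct values.
    static int distinctCount(int[] a, int maxValue) {
        BitSet seen = new BitSet(maxValue + 1);
        for (int v : a) seen.set(v);
        return seen.cardinality();
    }

    public static void main(String[] args) {
        System.out.println(distinctCount(new int[]{1, 2, 2, 4, 5, 5}, 10)); // 4
    }
}
```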