Representing a binary relation in Java

A famous programmer once said, "Why would anybody need a DB? Just give me a hash table!" I have a list of grammar symbols together with their frequencies. Seen one way, it is a map: symbol#->frequency. Seen the other way, it is a [binary] relation. Problem: get the top 5 symbols by frequency.
A more general question: I'm aware of [binary] relation algebra slowly making inroads into CS theory. Is there a Java library that supports relations?

List<Entry<String, Integer>> myList = new ArrayList<>();
for (Entry<String, Integer> e : myMap.entrySet())
    myList.add(e);
Collections.sort(myList, new Comparator<Entry<String, Integer>>() {
    @Override
    public int compare(Entry<String, Integer> a, Entry<String, Integer> b) {
        // compare b to a to get reverse (descending) order
        return b.getValue().compareTo(a.getValue());
    }
});
// guard against maps with fewer than 5 entries
List<Entry<String, Integer>> top5 = myList.subList(0, Math.min(5, myList.size()));
More efficient:
TreeSet<Entry<String, Integer>> myTree = new TreeSet<>(
    new Comparator<Entry<String, Integer>>() {
        @Override
        public int compare(Entry<String, Integer> a, Entry<String, Integer> b) {
            // compare b to a to get reverse order; break ties by key,
            // otherwise the set would drop symbols with equal frequency
            int byValue = b.getValue().compareTo(a.getValue());
            return byValue != 0 ? byValue : a.getKey().compareTo(b.getKey());
        }
    });
for (Entry<String, Integer> e : myMap.entrySet())
    myTree.add(e);
List<Entry<String, Integer>> top5 = new ArrayList<>();
int i = 0;
for (Entry<String, Integer> e : myTree) {
    top5.add(e);
    if (i++ == 4) break;
}

With TreeSet it should be easy:
int i = 0;
for(Symbol s: symbolTree.descendingSet()) {
i++;
if(i > 5) break; // or probably return
whatever(s);
}

Here is a general algorithm, assuming you already have a completed symbol HashTable
Make 2 arrays:
freq[5] // Use this to save the frequency counts for the 5 most frequent seen so far
word[5] // Use this to save the words that correspond to the above array, seen so far
Use an iterator to traverse your HashTable or Map:
Compare the current symbol's frequency against the ones in freq[5] in sequential order.
If the current symbol has a higher frequency than any entry in the array pairing above, shift that entry and all entries below it one position (i.e. the 5th position gets kicked out)
Add the current symbol / frequency pair to the newly vacated position
Otherwise, ignore.
Analysis:
You make at most 5 comparisons (constant time) against the arrays with each symbol seen in the HashTable, so this is O(n)
Each time you have to shift the entries in the array down, it is also constant time. Assuming you do a shift every time, this is still O(n)
Space: O(1) to store the arrays
Runtime: O(n) to iterate through all the symbols
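The algorithm above could be sketched in Java roughly as follows. This is a minimal sketch, not the answerer's code: the class name TopFiveArrays, the method topK and the parameter k are assumptions introduced here for illustration, and ties between equal frequencies are resolved in whatever order the map iterates.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

class TopFiveArrays {
    // Returns up to k symbols with the highest frequencies, most frequent first.
    // (Hypothetical helper; k = 5 matches the text above.)
    static String[] topK(Map<String, Integer> freqBySymbol, int k) {
        int[] freq = new int[k];       // frequencies of the best entries seen so far
        String[] word = new String[k]; // symbols that correspond to freq[]
        int filled = 0;                // number of slots currently in use
        for (Map.Entry<String, Integer> e : freqBySymbol.entrySet()) {
            int f = e.getValue();
            // find the first slot whose frequency is lower than the current one
            int pos = 0;
            while (pos < filled && freq[pos] >= f) pos++;
            if (pos == k) continue; // not among the top k: ignore
            // shift that entry and all entries below it one position down;
            // when the arrays are full, the last entry is kicked out
            for (int i = Math.min(filled, k - 1); i > pos; i--) {
                freq[i] = freq[i - 1];
                word[i] = word[i - 1];
            }
            freq[pos] = f;
            word[pos] = e.getKey();
            if (filled < k) filled++;
        }
        return Arrays.copyOf(word, filled);
    }

    public static void main(String[] args) {
        Map<String, Integer> m = new HashMap<>();
        m.put("a", 1); m.put("b", 7); m.put("c", 3); m.put("d", 9);
        m.put("e", 5); m.put("f", 2); m.put("g", 8);
        System.out.println(Arrays.toString(topK(m, 5))); // prints [d, g, b, e, c]
    }
}
```

Each symbol costs at most k comparisons and k shifts, so for a fixed k the pass stays linear in the number of symbols.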

Related

Effective way of comparing list elements in Java

Is there any effective way of comparing elements in Java and printing the position of the element that occurs only once?
For example: if I have a list ["Hi", "Hi", "No"], I want to print 2 because "No" is at position 2. I have solved this using the following algorithm and it works, BUT the problem is that for a large list it takes too much time to compare the entire list in order to print the first position of the unique word.
ArrayList<String> strings = new ArrayList<>();
for (int i = 0; i < strings.size(); i++) {
    int oc = Collections.frequency(strings, strings.get(i));
    if (oc == 1) {
        System.out.print(i);
        break;
    }
}
One option is to count each element's number of occurrences and then pick the first element whose count is 1, though I am not sure how large your list is.
Using Stream:
List<String> list = Arrays.asList("Hi", "Hi", "No");
// iterate through the list and store each element with its number of occurrences in a Map
Map<String, Long> counts = list.stream().collect(Collectors.groupingBy(Function.identity(), LinkedHashMap::new, Collectors.counting()));
String value = counts.entrySet().stream()
    .filter(e -> e.getValue() == 1) // filter out all the elements that occur more than once
    .map(Map.Entry::getKey) // stream of the elements that occur exactly once
    .findFirst() // first such element, in insertion order
    .get(); // throws NoSuchElementException if no element occurs exactly once
System.out.println(list.indexOf(value));
EDIT:
A simplified version can be
Map<String, Long> counts2 = new LinkedHashMap<String, Long>();
for(String val : list){
long count = counts2.getOrDefault(val, 0L);
counts2.put(val, ++count);
}
for(String key: counts2.keySet()){
if(counts2.get(key)==1){
System.out.println(list.indexOf(key));
break;
}
}
The basic idea is to count each element's occurrences and store the counts in a Map. Once you have the counts of all elements, you can simply look for the first element whose count is 1.
You can use a HashMap. For example, you can put the word as the key and its index as the value. When you find the same word again, you can delete the key, and at the end the map contains the result.
If there's only one word that's present only once, you can probably use a HashMap or HashSet + Deque (set for values, Deque for indices) to do this in linear time. A sort can give you the same in n log(n), so slower than linear but a lot faster than your solution. By sorting, it's easy to find in linear time (after the sort) which element is present only once because all duplicates will be next to each other in the array.
For example, a linear solution in pseudo-code (pseudo-Kotlin!):
counters = HashMap()
for ((i, word) in words.withIndex()) {
    counters.merge(word, Counter(i, 1), (oldVal, newVal) -> Counter(oldVal.firstIndex, oldVal.count + newVal.count));
}
for (counter in counters.values()) {
    if (counter.count == 1) return counter.firstIndex;
}
class Counter(firstIndex, count)
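For comparison, the same linear-time idea might look like this in plain Java. It is only a sketch: the class FirstUnique and the method firstUniqueIndex are names made up here, and instead of a Counter object it makes a second pass over the list, which also handles the case where several words occur only once.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class FirstUnique {
    // Returns the index of the first word that occurs exactly once, or -1 if there is none.
    static int firstUniqueIndex(List<String> words) {
        // first pass: count the occurrences of every word
        Map<String, Integer> counts = new HashMap<>();
        for (String w : words) {
            counts.merge(w, 1, Integer::sum);
        }
        // second pass: the first word whose count is 1 is the answer
        int i = 0;
        for (String w : words) {
            if (counts.get(w) == 1) {
                return i;
            }
            i++;
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(firstUniqueIndex(Arrays.asList("Hi", "Hi", "No"))); // prints 2
    }
}
```

Both passes are linear, so the quadratic cost of calling Collections.frequency for every element is avoided.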
Map<String,Boolean> + loops
Instead of using Map<String,Integer> as suggested in other answers.
You can maintain a HashMap (if you need to maintain the order, use LinkedHashMap instead) of type Map<String,Boolean> where a value would denote whether an element is unique or not.
The simplest way to generate the map is method put() in conjunction with containsKey() check.
But there are also more concise options like replace() + putIfAbsent(). putIfAbsent() creates a new entry only if the key is not present in the map, therefore we can associate such a string with a value of true (considered to be unique). On the other hand, replace() updates only an existing entry (otherwise the map is not affected), and if the entry exists, the key is proved to be a duplicate and has to be associated with a value of false (non-unique).
And since Java 8 we also have the method merge(), which expects three arguments: a key, a value, and a function that is used when the given key already exists to resolve the old value and the new one.
The last step is to generate list of unique strings by iterating over the entry set of the newly created map. We need every key having a value of true (is unique) associated with it.
List<String> strings = // initializing the list
Map<String, Boolean> isUnique = new HashMap<>(); // or LinkedHashMap if you need preserve initial order of strings
for (String next: strings) {
isUnique.replace(next, false);
isUnique.putIfAbsent(next, true);
// isUnique.merge(next, true, (oldV, newV) -> false); // does the same as the two lines above
}
List<String> unique = new ArrayList<>();
for (Map.Entry<String, Boolean> entry: isUnique.entrySet()) {
if (entry.getValue()) unique.add(entry.getKey());
}
Stream-based solution
With streams, it can be done using collector toMap(). The overall logic remains the same.
List<String> unique = strings.stream()
.collect(Collectors.toMap( // creating intermediate map Map<String, Boolean>
Function.identity(), // key
key -> true, // value
(oldV, newV) -> false, // resolving duplicates
LinkedHashMap::new // Map implementation, if order is not important - discard this argument
))
.entrySet().stream()
.filter(Map.Entry::getValue)
.map(Map.Entry::getKey)
.toList(); // for Java 16+ or collect(Collectors.toList()) for earlier versions

Print the Key for the N-th highest Value in a HashMap

I have a HashMap and have to print the N-th highest value in the HashMap.
I have managed to get the highest value.
I have sorted the HashMap first so that if there are two keys with the same value, then I get the key that comes first alphabetically.
But I still don't know how to get the key for nth highest value?
public void printNthHighest(HashMap<String, Integer> map, int n) {
Map<String, Integer> sortedmap = new TreeMap<>(map);
Map.Entry<String, Integer> maxEntry = null;
for (Map.Entry<String, Integer> entry : sortedmap.entrySet()) {
if (maxEntry == null || entry.getValue().compareTo(maxEntry.getValue()) > 0) {
maxEntry = entry;
}
}
System.out.println(maxEntry.getKey());
}
Here is one way. It is presumed that, for the Nth highest value, duplicates must be ignored. Otherwise you would be asking about a position in the map and not the intrinsic value as compared to others. For example, if the values are 8,8,8,7,7,5,5,3,2,1 then the 3rd highest value is 5, whereas 8 would simply be the value in the 3rd position of a descending sorted list.
Initialize found to false and max to Integer.MAX_VALUE.
Sort the list in reverse order based on value. Since the TreeMap is already sorted by keys and the list sort is stable (see Sorting algorithms), the keys remain in sorted order for duplicate values.
Loop through the list and keep checking whether the current value is less than max. The key here is "less than": that is what ignores the duplicates while iterating through the list.
If the current value is less than max, assign it to max and decrement n. Also assign the key.
If n == 0, set found to true and break out of the loop.
If the loop finishes on its own, found will be false and no nth largest exists.
Map<String, Integer> map = new TreeMap<>(Map.of(
"peter" , 40, "mike" , 90, "sam",60, "john",90, "jimmy" , 32, "Alex",60,"joan", 20, "alice", 40));
List<Entry<String,Integer>> save = new ArrayList<>(map.entrySet());
save.sort(Entry.comparingByValue(Comparator.reverseOrder()));
int n = 2; // not declared in the original snippet; n = 2 reproduces the output shown below
int max = Integer.MAX_VALUE;
boolean found = false;
String key = null;
for (Entry<String,Integer> e : save) {
if (e.getValue() < max) {
max = e.getValue();
key = e.getKey();
if (--n == 0) {
found = true;
break;
}
}
}
if (found) {
System.out.println("Value = " + max);
System.out.println("Key = " + key);
} else {
System.out.println("Not found");
}
prints
Value = 60
Key = Alex
This problem doesn't require sorting all the given data. Sorting causes a huge overhead if n is close to 1, in which case a solution can run in linear time O(n). Sorting increases the time complexity to O(n*log n) (if you are not familiar with Big O notation, you might be interested in reading answers to this question). And for any n less than the map size, partial sorting will be a better option.
If I understood you correctly, duplicated values need to be taken into account. For instance, for n=3 and values 12,12,10,8,5 the third-largest value will be 10 (if you don't need to account for duplicates, the following solution can be simplified).
I suggest approaching this problem in the following steps:
Reverse the given map, so that the values of the source map become the keys, and vice versa. In the case of duplicated values, the key (value in the reversed map) that comes first alphabetically will be preserved.
Create a map of frequencies, so that the values of the source map become the keys of the frequency map. Its values represent the number of occurrences of each value.
Flatten the frequency map into a list of values.
Perform a partial sort by using a PriorityQueue as the container for the n highest values. PriorityQueue is based on the so-called min heap data structure. When instantiating a PriorityQueue you either need to provide a Comparator, or the elements of the queue have to have a natural ordering, i.e. implement the interface Comparable (which is the case for Integer). The methods element() and peek() retrieve the smallest element from the priority queue. Since the queue will contain the n largest values from the given map, its smallest element will be the n-th highest value of the map.
The implementation might look like this:
public static void printKeyForNthValue(Map<String, Integer> map, int n) {
if (n <= 0) {
System.out.println("required element can't be found");
return; // nothing to look up for a non-positive n
}
Map<Integer, String> reversedMap = getReversedMap(map);
Map<Integer, Integer> valueToCount = getValueFrequencies(map);
List<Integer> flattenedValues = flattenFrequencyMap(valueToCount);
Queue<Integer> queue = new PriorityQueue<>();
for (int next: flattenedValues) {
queue.add(next);
if (queue.size() > n) {
queue.remove(); // evict the smallest so that only the n largest values remain
}
}
if (queue.size() < n) {
System.out.println("required element wasn't found");
} else {
System.out.println("value:\t" + queue.element());
System.out.println("key:\t" + reversedMap.get(queue.element()));
}
}
private static Map<Integer, String> getReversedMap(Map<String, Integer> map) {
Map<Integer, String> reversedMap = new HashMap<>();
for (Map.Entry<String, Integer> entry: map.entrySet()) { // in case of duplicates the key the comes first alphabetically will be preserved
reversedMap.merge(entry.getValue(), entry.getKey(),
(s1, s2) -> s1.compareTo(s2) < 0 ? s1 : s2);
}
return reversedMap;
}
private static Map<Integer, Integer> getValueFrequencies(Map<String, Integer> map) {
Map<Integer, Integer> result = new HashMap<>();
for (Integer next: map.values()) {
result.merge(next, 1, Integer::sum); // the same as result.put(next, result.getOrDefault(next, 0) + 1);
}
return result;
}
private static List<Integer> flattenFrequencyMap(Map<Integer, Integer> valueToCount) {
List<Integer> result = new ArrayList<>();
for (Map.Entry<Integer, Integer> entry: valueToCount.entrySet()) {
for (int i = 0; i < entry.getValue(); i++) {
result.add(entry.getKey());
}
}
return result;
}
Note: if you are not familiar with the Java 8 method merge(), inside getReversedMap() you can replace it with:
if (!reversedMap.containsKey(entry.getValue()) ||
entry.getKey().compareTo(reversedMap.get(entry.getValue())) < 0) {
reversedMap.put(entry.getValue(), entry.getKey());
}
main() - demo
public static void main(String[] args) {
Map<String, Integer> source =
Map.of("w", 10, "b", 12, "a", 10, "r", 12,
"k", 3, "l", 5, "y", 3, "t", 9);
printKeyForNthValue(source, 3);
}
Output (the third-greatest value from the set 12, 12, 10, 10, 9, 5, 3, 3)
value: 10
key: a
When finding the kth highest value, you should consider using a priority queue (aka a heap) or quick select.
A heap can be constructed in O(n) time; however, if you initialize it empty and insert n elements one by one, it will take O(n log n) time. After that you can pop k elements in order to get the kth highest element.
Quick select is an algorithm designed for finding the nth highest element in O(n) time on average.
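A bounded min-heap is one common variant of the heap approach: it keeps only the k largest values seen so far, so the cost is O(n log k) rather than O(n log n). The sketch below is an illustration under assumptions: the class KthLargest and the method kthLargest are hypothetical names, and duplicates are counted as separate elements.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.PriorityQueue;

class KthLargest {
    // Returns the k-th largest value (duplicates counted), or null if k is out of range.
    static Integer kthLargest(Collection<Integer> values, int k) {
        if (k <= 0 || k > values.size()) {
            return null;
        }
        // min-heap: the smallest of the retained values sits at the head
        PriorityQueue<Integer> heap = new PriorityQueue<>();
        for (int v : values) {
            heap.add(v);
            if (heap.size() > k) {
                heap.poll(); // drop the smallest so that only the k largest remain
            }
        }
        return heap.peek(); // smallest of the k largest = k-th largest
    }

    public static void main(String[] args) {
        System.out.println(kthLargest(Arrays.asList(12, 12, 10, 8, 5), 3)); // prints 10
    }
}
```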

Find map value with highest number of occurrences

I have a Map<Integer,Integer>
1 10
2 10
3 20
5 20
6 11
7 22
How do I find the most frequently repeated value in the map? In this case, that is 10 and 20; the repeat count is 2 in both cases.
Don't reinvent the wheel and use the frequency method of the Collections class:
public static int frequency(Collection<?> c, Object o)
If you need to count the occurrences for all values, use a Map and loop cleverly :)
Or put your values in a Set and loop on each element of the set with the frequency method above. HTH
If you fancy a more functional, Java 8 one-liner solution with lambdas, try:
Map<Integer, Long> occurrences =
map.values().stream().collect(Collectors.groupingBy(w -> w, Collectors.counting()));
Loop over the hashmap, and count the number of repetitions.
for (Integer value : myMap.values()) {
    Integer count = 1;
    if (countMap.containsKey(value)) {
        count = countMap.get(value);
        count++;
    }
    countMap.put(value, count);
}
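Then loop over the result map, and find the max(s):
int maxValue = 0;
for (Map.Entry<Integer, Integer> entry : countMap.entrySet()) {
    if (entry.getValue() > maxValue) {
        // a strictly higher count invalidates the candidates collected so far
        maxValue = entry.getValue();
        maxResultList.clear();
    }
    if (entry.getValue() == maxValue) {
        maxResultList.add(entry.getKey());
    }
}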
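Put together, the two loops above might look like the following self-contained sketch. The class MostRepeated and the method mostRepeatedValues are names assumed here for illustration, and the result is sorted only to make the output predictable.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MostRepeated {
    // Returns the values of the given map that occur most often, in ascending order.
    static List<Integer> mostRepeatedValues(Map<Integer, Integer> map) {
        // step 1: count how often each value occurs
        Map<Integer, Integer> countMap = new HashMap<>();
        for (Integer value : map.values()) {
            countMap.merge(value, 1, Integer::sum);
        }
        // step 2: collect every value whose count equals the highest count
        int maxCount = countMap.isEmpty() ? 0 : Collections.max(countMap.values());
        List<Integer> result = new ArrayList<>();
        for (Map.Entry<Integer, Integer> entry : countMap.entrySet()) {
            if (entry.getValue() == maxCount) {
                result.add(entry.getKey());
            }
        }
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> map = new HashMap<>();
        map.put(1, 10); map.put(2, 10); map.put(3, 20);
        map.put(5, 20); map.put(6, 11); map.put(7, 22);
        System.out.println(mostRepeatedValues(map)); // prints [10, 20]
    }
}
```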
A simple solution is to write your own put method that collects repeated values:
void put(String x, int i) {
    List<Integer> list = map.get(x);
    if (list == null) {
        list = new ArrayList<Integer>();
        map.put(x, list);
    }
    list.add(i);
}
So, in this case, the map holds a list such as [10,10,20,20].
To detect repeated values, compare the size of your values list with the size of your values set.
List<T> listOfValues = new ArrayList<T>(map.values());
Set<T> listOfSetValues = new HashSet<T>(map.values());
Now check the size of both collections; if they are unequal, you have duplicates. The difference between the list size and the set size is the number of repeated entries.
We can use a number of simple methods to do this.
First, we can define a method that counts elements, and returns a map from the value to its occurrence count:
static <T> Map<T, Long> countAll(Collection<T> c) {
    return c.stream().collect(Collectors.groupingByConcurrent(k -> k, Collectors.counting()));
}
Then, to filter out all entries having fewer instances than the one with the most, we can do this:
static <T, C extends Collection<T>> C maxima(Collection<T> c, Comparator<? super T> comp,
        Supplier<C> p) {
    T max = c.stream().max(comp).get(); // assumes c is not empty
    return c.stream().filter(t -> comp.compare(t, max) >= 0).collect(Collectors.toCollection(p));
}
Now we can use them together to get the results we want:
maxima(countAll(yourMap.values()).entrySet(),
        Map.Entry.comparingByValue(), HashSet::new);
Note that this would produce a HashSet of Entry<Integer, Long> in your case.
Try this simple method:
public String getMapKeyWithHighestValue(HashMap<String, Integer> map) {
String keyWithHighestVal = "";
// getting the maximum value in the Hashmap
int maxValueInMap = (Collections.max(map.values()));
//iterate through the map to get the key that corresponds to the maximum value in the Hashmap
for (Map.Entry<String, Integer> entry : map.entrySet()) { // Iterate through hashmap
if (entry.getValue() == maxValueInMap) {
keyWithHighestVal = entry.getKey(); // this is the key which has the max value
}
}
return keyWithHighestVal;
}

Java How to return top 10 items based on value in a HashMap

So I am very new to Java and as such I'm fighting my way through an exercise, converting one of my Python programs to Java.
I have run into an issue where I am trying to replicate the following behavior. In Python, this returns only the keys, sorted by their values, and not the values themselves:
popular_numbers = sorted(number_dict, key = number_dict.get, reverse = True)
In Java, I have done a bit of research and have not yet found an easy enough sample for a n00b such as myself or a comparable method. I have found examples using Guava for sorting, but the sort appears to return a HashMap sorted by key.
In addition to the above, one of the other nice things about Python, that I have not found in Java is the ability to, easily, return a subset of the sorted values. In Python I can simply do the following:
print "Top 10 Numbers: %s" % popular_numbers[:10]
In this example, number_dict is a dictionary of key,value pairs where key represents numbers 1..100 and the value is the number of times the number (key) occurs:
for n in numbers:
    if not n == '':
        number_dict[n] += 1
The end result would be something like:
Top 10 Numbers: ['27', '11', '5', '8', '16', '25', '1', '24', '32',
'20']
To clarify, in Java I have successfully created a HashMap, I have successfully examined numbers and increased the values of the key,value pair. I am now stuck at the sort and return the top 10 numbers (keys) based on value.
Put the map's entrySet() into a List.
Sort this list using Collections.sort and a Comparator which sorts Entrys based on their values.
Use the subList(int, int) method of List to retrieve a new list containing the top 10 elements.
Yes, it will be much more verbose than Python :)
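The three steps above might be sketched as follows. This is a minimal illustration rather than the answerer's own code: the class TopKeys, the method topKeys and the parameter limit are names assumed here, and entries with equal values come out in whatever order the map iterates them.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TopKeys {
    // Returns the keys with the highest values, highest first, at most `limit` of them.
    static <K> List<K> topKeys(Map<K, Integer> counts, int limit) {
        // step 1: put the entry set into a list
        List<Map.Entry<K, Integer>> entries = new ArrayList<>(counts.entrySet());
        // step 2: sort the entries by value, descending
        entries.sort(Map.Entry.comparingByValue(Comparator.reverseOrder()));
        // step 3: take a sub-list, guarding against maps smaller than `limit`
        List<K> keys = new ArrayList<>();
        for (Map.Entry<K, Integer> e : entries.subList(0, Math.min(limit, entries.size()))) {
            keys.add(e.getKey());
        }
        return keys;
    }

    public static void main(String[] args) {
        Map<String, Integer> numberCounts = new HashMap<>();
        numberCounts.put("27", 9);
        numberCounts.put("11", 7);
        numberCounts.put("5", 3);
        System.out.println("Top 2 Numbers: " + topKeys(numberCounts, 2)); // prints Top 2 Numbers: [27, 11]
    }
}
```

Returning only the keys mirrors what the Python sorted(number_dict, key=number_dict.get, reverse=True)[:10] expression produces.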
With Java 8+, to get the first 10 elements of a list of integers:
list.stream().sorted().limit(10).collect(Collectors.toList());
To get the first 10 elements of a map's keys, that are integers:
map.keySet().stream().sorted().limit(10).collect(Collectors.toMap(Function.identity(), map::get));
HashMaps aren't ordered in Java, and so there isn't really a good way to order them short of a brute-force search through all the keys. Try using TreeMap: http://docs.oracle.com/javase/6/docs/api/java/util/TreeMap.html
Assuming your map is defined something like this and that you want to sort based on values:
HashMap<Integer, Integer> map= new HashMap<Integer, Integer>();
//add values
Collection<Integer> values= map.values();
ArrayList<Integer> list= new ArrayList<Integer>(values);
Collections.sort(list);
Now, print the first 10 elements of the list.
for (int i=0; i<10; i++) {
System.out.println(list.get(i));
}
The values in the map are not actually sorted, because the HashMap is not sorted at all (it stores the values in the buckets based on the hashCode of the key). This code is just displaying 10 smallest elements in the map.
EDIT: sort without losing the key-value pairs:
//sorted tree map
TreeMap<Integer, Integer> tree= new TreeMap<>();
//iterate over a map
Iterator<Integer> it= map.keySet().iterator();
while (it.hasNext()) {
Integer key= it.next();
tree.put(map.get(key), key);
}
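Now you have the TreeMap tree that is sorted and has reversed key-value pairs from the original map, so you don't lose the information. Note that keys of the original map that share the same value will overwrite each other in the tree, because each value can appear as a key only once.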
Try the next:
public static void main(String[] args) {
// Map for store the numbers
Map<Integer, Integer> map = new HashMap<Integer, Integer>();
// Populate the map ...
// Sort by the more popular number
Set<Entry<Integer, Integer>> set = map.entrySet();
List<Entry<Integer, Integer>> list = new ArrayList<>(set);
Collections.sort(list, new Comparator<Entry<Integer, Integer>>() {
@Override
public int compare(Entry<Integer, Integer> a,
Entry<Integer, Integer> b) {
return b.getValue() - a.getValue();
}
});
// Output the top 10 numbers
for (int i = 0; i < 10 && i < list.size(); i++) {
System.out.println(list.get(i));
}
}
Guava Multiset is a great fit for your use case, and would nicely replace your HashMap. It is a collection which counts the number of occurrences of each element.
Multisets has a method copyHighestCountFirst, which returns an immutable Multiset ordered by count.
Now some code:
Multiset<Integer> counter = HashMultiset.create();
//add Integers
ImmutableMultiset<Integer> sortedCount = Multisets.copyHighestCountFirst(counter);
//iterate through sortedCount as needed
Use a SortedMap, call values(). The docs indicate the following:
The collection's iterator returns the values in ascending order of the corresponding keys
So as long as your comparator is written correctly you can just iterate over the first n keys
Build a list from the keyset.
Sort the list by values, using the keys to look up each value in the comparator passed to Collections.sort().
Return a sub list of the sorted key set.
if you care about the values, you can use the keys in step 3 and build value set.
HashMap<String, Integer> hashMap = new HashMap<String, Integer>();
List<String> list = new ArrayList<>(hashMap.keySet());
Collections.sort(list, (w1, w2) -> hashMap.get(w2) - hashMap.get(w1)); // sorted in descending order by value
return list.subList(0, Math.min(10, list.size())); // avoid an exception when there are fewer than 10 keys
To preserve the ranking order and return only the top count entries, where count is much smaller than the size of the map:
map.entrySet().stream()
.sorted(Map.Entry.comparingByValue(Comparator.reverseOrder()))
.limit(count)
.collect(toMap(Map.Entry::getKey, Map.Entry::getValue,
(e1, e2) -> e1,
LinkedHashMap::new))

max freq of repetition in an array

What is the fastest way, with the smallest time complexity, to find the most frequently repeated element in an array in Java?
A=[1,2,3,4,1,1]
ans = 1
How can this be done?
A (mostly) linear time solution would be to use a HashMap<Integer, Integer> and build a histogram of all values appearing in A.
HashMap<Integer, Integer> m = new HashMap<Integer, Integer>();
for(int x : A)
{
Integer v = m.get(x);
if (null == v) {v = Integer.valueOf(0);}
m.put(x, ++v);
}
Then go over the entire map and return the entry with the maximum value.
With the entrySet() method this is done in linear time as well.
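A compact variant of the histogram approach is sketched below. It deviates slightly from the description above: instead of scanning the map afterwards, it tracks the best candidate while the histogram is being built. The class MaxFrequency and the method mostFrequent are names assumed for illustration, and when several values share the highest count, the one that reaches that count first is returned.

```java
import java.util.HashMap;
import java.util.Map;

class MaxFrequency {
    // Returns the value that occurs most often in a non-empty array.
    static int mostFrequent(int[] a) {
        if (a.length == 0) {
            throw new IllegalArgumentException("array must not be empty");
        }
        Map<Integer, Integer> histogram = new HashMap<>();
        int best = a[0];
        int bestCount = 0;
        for (int x : a) {
            // merge returns the updated count for x
            int count = histogram.merge(x, 1, Integer::sum);
            if (count > bestCount) {
                bestCount = count;
                best = x;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(mostFrequent(new int[] {1, 2, 3, 4, 1, 1})); // prints 1
    }
}
```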
