Iterating Java Arrays with Streams - java

So here's a simple algorithmic problem,
Given a list of integers, check if there are two numbers in this list that when added together give eight (8).
Here's my solution,
import java.util.List;
public class Main {
static List<Integer> arrayOne = List.of(1,3,6,9);
static List<Integer> arrayTwo = List.of(1,6,2,10);
static boolean validateArray(int result, List<Integer> array){
for (int i = 0; i<array.size() - 1; i++){
for (int j = i + 1; j < array.size(); j ++){
int value1 = array.get(i);
int value2 = array.get(j);
if(value1 + value2 == result){
return true;
}
}
}
return false;
}
public static void main(String[] args) {
System.out.println(validateArray(8, arrayTwo));
}
}
This works fine. What I'm trying to learn is how to rewrite this code in Java 8, that is, what the different options are for writing these loops in Java 8.

If you switch to a slightly different solution that uses a Set<Integer> to look up the complement of each number (result - number), the solution converts easily to the Stream API.
Iterative approach
Set<Integer> set = Set.copyOf(array);
for (Integer integer : array) {
if (set.contains(result - integer) && (result != 2 * integer)) {
System.out.printf("Found %s + %s = %s%n", integer, result - integer, result);
return true;
}
}
System.out.printf("Found no two numbers that add up to %s%n", result);
return false;
Stream API approach (with logging)
Set<Integer> set = Set.copyOf(array);
return array.stream()
.filter(integer -> set.contains(result - integer) && (result != 2 * integer))
.findFirst()
.map(integer -> {
System.out.printf("Found %s + %s = %s%n", integer, result - integer, result);
return true;
})
.orElseGet(() -> {
System.out.printf("Found no two numbers that add up to %s%n", result);
return false;
});
And if you don't need to log the results and you care only about the result, the whole stream can be simplified and shortened.
Stream API approach (result only)
Set<Integer> set = Set.copyOf(array);
return array.stream()
.anyMatch(integer -> set.contains(result - integer) && (result != 2 * integer));
Remark:
The algorithm used in all snippets above is simple. You iterate over each number in the list and check whether its complement (result - integer) is present in the Set<Integer> built from the list (a constant-time O(1) look-up). The check result != 2 * integer prevents a number from being paired with itself when the requested result is exactly twice that number. This solution assumes the input contains no duplicated numbers, so an input such as 4, 4 with a result of 8 is reported as having no pair. If the input is guaranteed to be free of duplicates, it can be passed as a Set<Integer> in the first place and no conversion is needed. Note that Set.copyOf requires Java 10 or later; on Java 8, new HashSet<>(array) does the same job.
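The no-duplicates assumption in the remark can be lifted with a small variation. The following is only a sketch of a different technique (a single pass that remembers the values seen so far), not the code from the answer above; the class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class PairSumSketch {

    // Returns true if two elements at different positions add up to target.
    static boolean hasPairWithSum(int target, List<Integer> numbers) {
        Set<Integer> seen = new HashSet<>();
        for (int number : numbers) {
            // The complement must have appeared earlier in the list,
            // so an element is never paired with itself.
            if (seen.contains(target - number)) {
                return true;
            }
            seen.add(number);
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasPairWithSum(8, Arrays.asList(1, 6, 2, 10))); // true (6 + 2)
        System.out.println(hasPairWithSum(8, Arrays.asList(1, 3, 6, 9)));  // false
        System.out.println(hasPairWithSum(8, Arrays.asList(4, 4)));        // true (two separate 4s)
        System.out.println(hasPairWithSum(8, Arrays.asList(4, 1)));        // false
    }
}
```

Because the set only contains earlier elements, no special check for result == 2 * integer is needed here.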

Regardless of the implementation (streams or loops), performing brute-force iterations over the whole list for each element of the list isn't the best way to solve this problem.
We can index the elements by generating a Map of type Map<Integer,Boolean> (credits to @Holger for this idea), where the keys represent the unique values in the given list and the corresponding boolean values denote whether the value occurs more than once. Then we can iterate over the keys of the map, checking for each key whether its complement, result - key, is also present in the map.
There's one edge case, though, that we need to address:
if result is even and the list contains result / 2, checking that result - key is present in the map is not sufficient, because the key would match itself. In this case the boolean value tells us whether the key occurs at least twice and can therefore form the target sum with its own duplicate.
If you want to use the Stream API, first generate the Map with the collector Collectors.toMap().
Then create a stream over the keys of the Map and apply the anyMatch() operation to obtain the boolean result:
static boolean validateArray(int result, List<Integer> array) {
Map<Integer, Boolean> hasPair = array.stream()
.collect(Collectors.toMap(
Function.identity(), // Key
i -> false, // Value - the element has been encountered for the first time, therefore Value is false
(left, right) -> true // mergeFunction - resolves the value of a duplicated Key to true (it has a pair)
));
return hasPair.keySet().stream()
.anyMatch(key -> key * 2 == result ?
hasPair.get(key) : hasPair.containsKey(result - key)
);
}

Related

Apply map() to only a subset of elements in a Java stream?

Is it possible, with the Java stream API to apply a map() not on the whole stream (a.k.a. not on every element which is streamed), but only on a set of elements which pass a filter? The filter however should not filter out elements. The result should be a stream of the original elements, but on some of them, a map() has been applied.
Pseudocode:
List<Integer>.stream.filterAndMap(x -> if (x>10) {x+2}).toList();
if … else …
Just return the value unchanged if it does not meet your requirement for modification.
List < Integer > integers = List.of( 1 , 7 , 42 );
List < Integer > modified =
integers
.stream()
.map( integer -> {
if ( integer > 10 ) { return integer + 2; }
else { return integer; }
} )
.toList();
modified.toString() = [1, 7, 44]
Ternary operator
Or shorten that code by using a ternary operator as commented by MC Emperor.
List < Integer > integers = List.of( 1 , 7 , 42 );
List < Integer > modified =
integers
.stream()
.map( integer -> ( integer > 10 ) ? ( integer + 2 ) : integer )
.toList();
See how starting with intention-revealing pseudo-code always pays off.
Naming
The pseudo-instruction filterAndMap(x -> if (x>10) {x+2}) violates the single-responsibility principle (SRP): the "And" in its name already hints at two responsibilities.
Resolving this with the Java Streams API would suggest:
list.stream()
.filter(predicate) // filter means discarding elements from result
.map(mappingFunction) // map applies to filtered elements only
.toList();
But then the resulting list is filtered by the predicate or condition, hence not having original size.
Rephrase intention: Conditionally Map
Search for [java] conditionally map. Some answers show if statements or ternary operator to implement the conditional.
Still the primary step is map. Inside implement the conditional. Whether using a lambda or a function-reference, this decides how to map:
if (predicate.test(x)) {
return modify(x);
}
return x; // default: unmodified identity
This is Basil's approach .map( integer -> ( integer > 10 ) ? ( integer + 2 ) : integer ).
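To make the difference between "filter, then map" and "conditionally map" concrete, here is a minimal sketch. The helper mapIf and the class name are hypothetical names introduced for this illustration; they are not part of the Stream API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.function.UnaryOperator;
import java.util.stream.Collectors;

class ConditionalMapSketch {

    // Hypothetical helper: applies modifier only where condition holds,
    // otherwise passes the element through unchanged.
    static <T> UnaryOperator<T> mapIf(Predicate<? super T> condition, UnaryOperator<T> modifier) {
        return x -> condition.test(x) ? modifier.apply(x) : x;
    }

    // Keeps every element; only those greater than 10 are modified.
    static List<Integer> conditionallyMapped(List<Integer> input) {
        UnaryOperator<Integer> addTwoAboveTen = mapIf(x -> x > 10, x -> x + 2);
        return input.stream()
                .map(addTwoAboveTen)
                .collect(Collectors.toList());
    }

    // Discards the elements that fail the predicate, so the result is shorter.
    static List<Integer> filteredThenMapped(List<Integer> input) {
        return input.stream()
                .filter(x -> x > 10)
                .map(x -> x + 2)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> integers = Arrays.asList(1, 7, 42);
        System.out.println(conditionallyMapped(integers)); // [1, 7, 44]
        System.out.println(filteredThenMapped(integers));  // [44]
    }
}
```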
Why streams?
Assume some of the list elements stay the same, where only some are modified on condition. Like your requirement states:
stream of the original elements, but on some of them, a map() has been applied.
Wouldn't it be clearer, then, to use a plain loop with a conditional modification, so that the list elements are modified in place:
List<Integer> intList = new ArrayList<>(List.of(1, 7, 42)); // List.of alone is immutable
// can also be intList.replaceAll with a UnaryOperator
for (int i = 0; i < intList.size(); i++) {
    if (intList.get(i) > 10) {
        intList.set(i, intList.get(i) + 2);
    }
}

Print the Key for the N-th highest Value in a HashMap

I have a HashMap and have to print the N-th highest value in the HashMap.
I have managed to get the highest value.
I have sorted the HashMap first so that if there are two keys with the same value, then I get the key that comes first alphabetically.
But I still don't know how to get the key for nth highest value?
public void(HashMap map, int n) {
Map<String, Integer> sortedmap = new TreeMap<>(map);
Map.Entry<String, Integer> maxEntry = null;
for (Map.Entry<String, Integer> entry : sortedmap.entrySet()) {
if (maxEntry == null || entry.getValue().compareTo(maxEntry.getValue()) > 0) {
maxEntry = entry;
}
}
System.out.println(maxEntry.getKey());
}
Here is one way. It is presumed that "N-th highest" means duplicates must be ignored. Otherwise you would be asking about a position in the sorted list rather than the rank of the value itself. For example, if the values are 8,8,8,7,7,5,5,3,2,1 then the 3rd highest value is 5, whereas 8 would simply be the value in the 3rd position of a descending sorted list.
Initialize found to false and max to Integer.MAX_VALUE.
Copy the entries into a list and sort it in reverse order by value. Since the TreeMap iterates in key order and List.sort is a stable sort (see Sorting algorithms), the keys remain in sorted order among duplicate values.
Loop through the list and check whether the current value is less than max. The key here is "less than": that is what skips the duplicates when iterating through the list.
If the current value is less than max, assign it to max, remember the key, and decrement n.
If n == 0, set found to true and break out of the loop.
If the loop finishes on its own, found remains false and no n-th largest value exists.
Map<String, Integer> map = new TreeMap<>(Map.of(
"peter" , 40, "mike" , 90, "sam",60, "john",90, "jimmy" , 32, "Alex",60,"joan", 20, "alice", 40));
List<Entry<String,Integer>> save = new ArrayList<>(map.entrySet());
save.sort(Entry.comparingByValue(Comparator.reverseOrder()));
int n = 2; // rank to look for; 2 matches the output shown below
int max = Integer.MAX_VALUE;
boolean found = false;
String key = null;
for (Entry<String,Integer> e : save) {
if (e.getValue() < max) {
max = e.getValue();
key = e.getKey();
if (--n == 0) {
found = true;
break;
}
}
}
if (found) {
System.out.println("Value = " + max);
System.out.println("Key = " + key);
} else {
System.out.println("Not found");
}
prints
Value = 60
Key = Alex
This problem doesn't require sorting all the given data. Full sorting causes considerable overhead, especially when n is close to 1, a case that can be solved in linear time O(n). Sorting increases the time complexity to O(n*log n) (if you are not familiar with Big O notation, you might be interested in reading answers to this question). For any n less than the map size, partial sorting is the better option.
If I understood you correctly, duplicated values need to be taken into account. For instance, for n=3 and the values 12,12,10,8,5 the third-largest value will be 10 (if the values contain no duplicates, the following solution can be simplified).
I suggest approaching this problem in the following steps:
Reverse the given map. So that values of the source map will become the keys, and vice versa. In the case of duplicated values, the key (value in the reversed map) that comes first alphabetically will be preserved.
Create a map of frequencies. The values of the source map become the keys of the frequency map, and its values represent the number of occurrences of each value.
Flatten the frequency map into a list of values, repeating each value as often as it occurs.
Perform a partial sorting by utilizing a PriorityQueue as the container for the n highest values. PriorityQueue is based on the so-called min heap data structure. When instantiating a PriorityQueue you either need to provide a Comparator, or the elements of the queue have to have a natural ordering, i.e. implement the interface Comparable (which is the case for Integer). The methods element() and peek() retrieve the smallest element from the priority queue. Since the queue holds the n largest values from the given map, its smallest element is the n-th highest value of the map.
The implementation might look like this:
public static void printKeyForNthValue(Map<String, Integer> map, int n) {
if (n <= 0) {
System.out.println("required element can't be found");
return; // nothing to search for
}
Map<Integer, String> reversedMap = getReversedMap(map);
Map<Integer, Integer> valueToCount = getValueFrequencies(map);
List<Integer> flattenedValues = flattenFrequencyMap(valueToCount);
Queue<Integer> queue = new PriorityQueue<>();
for (int next: flattenedValues) {
if (queue.size() >= n) {
queue.remove();
}
queue.add(next);
}
if (queue.size() < n) {
System.out.println("required element wasn't found");
} else {
System.out.println("value:\t" + queue.element());
System.out.println("key:\t" + reversedMap.get(queue.element()));
}
}
private static Map<Integer, String> getReversedMap(Map<String, Integer> map) {
Map<Integer, String> reversedMap = new HashMap<>();
for (Map.Entry<String, Integer> entry: map.entrySet()) { // in case of duplicates the key that comes first alphabetically will be preserved
reversedMap.merge(entry.getValue(), entry.getKey(),
(s1, s2) -> s1.compareTo(s2) < 0 ? s1 : s2);
}
return reversedMap;
}
private static Map<Integer, Integer> getValueFrequencies(Map<String, Integer> map) {
Map<Integer, Integer> result = new HashMap<>();
for (Integer next: map.values()) {
result.merge(next, 1, Integer::sum); // the same as result.put(next, result.getOrDefault(next, 0) + 1);
}
return result;
}
private static List<Integer> flattenFrequencyMap(Map<Integer, Integer> valueToCount) {
List<Integer> result = new ArrayList<>();
for (Map.Entry<Integer, Integer> entry: valueToCount.entrySet()) {
for (int i = 0; i < entry.getValue(); i++) {
result.add(entry.getKey());
}
}
return result;
}
Note: if you are not familiar with the Java 8 method merge(), you can replace it inside getReversedMap() with:
if (!reversedMap.containsKey(entry.getValue()) ||
entry.getKey().compareTo(reversedMap.get(entry.getValue())) < 0) {
reversedMap.put(entry.getValue(), entry.getKey());
}
main() - demo
public static void main(String[] args) {
Map<String, Integer> source =
Map.of("w", 10, "b", 12, "a", 10, "r", 12,
"k", 3, "l", 5, "y", 3, "t", 9);
printKeyForNthValue(source, 3);
}
Output (the third-greatest value from the set 12, 12, 10, 10, 9, 5, 3, 3)
value: 10
key: a
When finding the k-th highest value, you should consider using a priority queue (a.k.a. a heap) or quickselect.
A heap can be constructed from n elements in O(n) time; however, if you start with an empty heap and insert the n elements one by one, it takes O(n log n) time. After that, you can pop k elements to get the k-th highest element.
Quickselect is an algorithm designed for finding the k-th highest element in O(n) time on average.
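As a rough illustration of the quickselect idea, here is a minimal sketch that works on plain values only and counts duplicates (it does not look up the map key). It uses a randomized Lomuto partition; the class and method names are made up for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

class QuickSelectSketch {

    // Returns the k-th highest value (k = 1 is the maximum), counting duplicates.
    static int kthHighest(List<Integer> values, int k) {
        if (k < 1 || k > values.size()) {
            throw new IllegalArgumentException("k must be between 1 and " + values.size());
        }
        int[] a = values.stream().mapToInt(Integer::intValue).toArray();
        int lo = 0;
        int hi = a.length - 1;
        int target = k - 1; // index of the answer in descending order
        while (true) {
            int p = partitionDescending(a, lo, hi);
            if (p == target) {
                return a[p];
            }
            if (p < target) {
                lo = p + 1; // answer lies to the right of the pivot
            } else {
                hi = p - 1; // answer lies to the left of the pivot
            }
        }
    }

    // Lomuto partition with a random pivot: larger elements end up left of the pivot.
    private static int partitionDescending(int[] a, int lo, int hi) {
        swap(a, ThreadLocalRandom.current().nextInt(lo, hi + 1), hi);
        int pivot = a[hi];
        int store = lo;
        for (int i = lo; i < hi; i++) {
            if (a[i] > pivot) {
                swap(a, i, store++);
            }
        }
        swap(a, store, hi);
        return store;
    }

    private static void swap(int[] a, int i, int j) {
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }

    public static void main(String[] args) {
        System.out.println(kthHighest(Arrays.asList(12, 12, 10, 8, 5), 3)); // 10
    }
}
```

The expected running time is linear, but an unlucky sequence of pivots (or many equal values) can degrade it to quadratic.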

Check if list of integers contains two groups of different repeated numbers

Using Java streams, how can I check whether a list of integers contains two groups of different repeated numbers? A number must not be repeated more than twice.
Example: list of 23243.
Answer: true, because 2233
Example 2: list of 23245.
Answer: none
Example 3: list of 23232.
Answer: none, because 222 repeated three times
One more question: how can I return, instead of the anyMatch result, the biggest of the repeated numbers?
listOfNumbers.stream().anyMatch(e -> Collections.frequency(listOfNumbers, e) == 2)
This will tell you if the list meets your requirements.
Stream the list of digits.
Do a frequency count.
Stream the resulting counts.
Filter out those not equal to a count of 2.
Count how many of those there are.
Return true if the final count is at least 2, false otherwise.
List<Integer> list = List.of(2,2,3,3,3,4,4);
boolean result = list.stream()
.collect(Collectors.groupingBy(a -> a,
Collectors.counting()))
.values().stream().filter(count -> count == 2).limit(2)
.count() >= 2; // fixed per OP's comment
The above evaluates to true since there are two groups of exactly two equal digits, namely the 2's and the 4's.
EDIT
First, I made Holger's suggestion to short circuit the count check.
To address your question about returning multiple values, I broke up the process into parts. The first is the normal frequency count that I did before. The next is gathering the information requested. I used a record to return the information. A class would also work. The max count for some particular number is housed in an AbstractMap.SimpleEntry
List<Integer> list = List.of(2, 3, 3, 3, 4, 4, 3, 2, 3);
Results results = groupCheck(list);
System.out.println(results.check);
System.out.println(results.maxEntry);
Prints (getKey() and getValue() may be used to get the individual values. First is the number, second is the occurrences of that number.)
true
3=5
The method and record declaration
record Results(boolean check,
AbstractMap.SimpleEntry<Integer, Long> maxEntry) {
}
Once the frequency count is computed, simply iterate over the entries and
count the pairs and compute the maxEntry by comparing the existing maximum count to the iterated one and update as required.
public static Results groupCheck(List<Integer> list) {
Map<Integer, Long> map = list.stream().collect(
Collectors.groupingBy(a -> a, Collectors.counting()));
AbstractMap.SimpleEntry<Integer, Long> maxEntry =
new AbstractMap.SimpleEntry<>(0, 0L);
int count = 0;
for (Entry<Integer, Long> e : map.entrySet()) {
if (e.getValue() == 2) {
count++;
}
maxEntry = e.getValue() > maxEntry.getValue() ?
new AbstractMap.SimpleEntry<>(e) : maxEntry;
}
return new Results(count >= 2, maxEntry);
}
One could write a method which builds a TreeMap of the frequencies.
What happens here is that a frequency map is built first (by groupingBy(Function.identity(), Collectors.counting())), and then we must 'swap' the keys and values, because we want to use the frequencies as keys.
public static TreeMap<Long, List<Integer>> frequencies(List<Integer> list) {
return list.stream()
.collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
.entrySet().stream()
.collect(Collectors.toMap(e -> e.getValue(), e -> List.of(e.getKey()), (a, b) -> someMergeListsFunction(a, b), TreeMap::new));
}
And then we can just use our method like this:
// We assume the input list is not empty
TreeMap<Long, List<Integer>> frequencies = frequencies(list);
var higher = frequencies.higherEntry(2L);
if (higher != null) {
System.out.printf("There is a number which occurs more than twice: %s (occurs %s times)\n", higher.getValue().get(0), higher.getKey());
}
else {
List<Integer> occurTwice = frequencies.lastEntry().getValue();
if (occurTwice.size() < 2) {
System.out.println("Only " + occurTwice.get(0) + " occurs twice...");
}
else {
System.out.println(occurTwice);
}
}
A TreeMap is a Map with keys sorted by some comparator, or the natural order if none is given. The TreeMap class contains methods to search for certain keys. For example, the higherEntry method returns the first entry which is higher than the given key. With this method, you can easily check if a key higher than 2 exists, for one of the requirements is that none of the numbers may occur more than twice.
The above code checks whether there is a number occurring more than twice, which is the case when higherEntry(2L) returns a non-null value. Otherwise, lastEntry() is the entry with the highest frequency, and getValue() retrieves the list of numbers occurring that often.
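The someMergeListsFunction placeholder is left undefined in the answer. One way to sidestep it, offered here as an assumption rather than the answer's intent, is to let a second groupingBy collect the numbers per frequency directly into a TreeMap. The class name is made up for this sketch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

class FrequencyTreeMapSketch {

    // Maps each frequency to the list of numbers occurring that often.
    static TreeMap<Long, List<Integer>> frequencies(List<Integer> list) {
        return list.stream()
                // number -> how often it occurs
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
                .entrySet().stream()
                // frequency -> numbers with that frequency, sorted by frequency
                .collect(Collectors.groupingBy(
                        Map.Entry::getValue,
                        TreeMap::new,
                        Collectors.mapping(Map.Entry::getKey, Collectors.toList())));
    }

    public static void main(String[] args) {
        TreeMap<Long, List<Integer>> frequencies = frequencies(Arrays.asList(2, 3, 2, 4, 3));
        System.out.println(frequencies.higherEntry(2L));       // null: nothing occurs more than twice
        System.out.println(frequencies.lastEntry().getValue()); // the numbers occurring twice
    }
}
```

The order of the numbers inside each list follows the iteration order of the intermediate map and should not be relied upon.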

Counting each distinct array occurrence in a list of arrays with duplicates

PROBLEM
I have a list of arrays and I want to count the occurrences of duplicates.
For example, if I have this :
{{1,2,3},
{1,0,3},
{1,2,3},
{5,2,6},
{5,2,6},
{5,2,6}}
I want a map (or any relevant collection) like this :
{ {1,2,3} -> 2,
{1,0,3} -> 1,
{5,2,6} -> 3 }
I can even lose the arrays values, I'm only interested in cardinals (e.g. 2, 1 and 3 here).
MY SOLUTION
I use the following algorithm :
First hash the arrays, and check if each hash is in an HashMap<Integer, ArrayList<int[]>>, let's name it distinctHash, where the key is the hash and the value is an ArrayList, let's name it rowList, containing the different arrays for this hash (to avoid collisions).
If the hash is not in distinctHash, put it with the value 1 in another HashMap<int[], Long> that counts each occurrence, let's call it distinctElements.
Then if the hash is in distinctHash, check if the corresponding array is contained in rowList. If it is, increment the value in distinctElements associated to the identical array found in rowList. (If you use the new array as a key you will create another key since their reference are different).
Here is the code, the boolean returned tells if a new distinct array was found, I apply this function sequentially on all of my arrays :
HashMap<int[], Long> distinctElements;
HashMap<Integer, ArrayList<int[]>> distinctHash;
private boolean addRow(int[] row) {
if (distinctHash.containsKey(hash)) {
for (int[] previousRow: distinctHash.get(hash)) {
if (Arrays.equals(previousRow, row)) {
distinctElements.put(
previousRow, // increment the array that actually matched, not the first one in the bucket
distinctElements.get(previousRow) + 1
);
return false;
}
}
distinctElements.put(row, 1L);
ArrayList<int[]> rowList = distinctHash.get(hash);
rowList.add(row);
distinctHash.put(hash, rowList);
return true;
} else {
distinctElements.put(row, 1L);
ArrayList<int[]> newValue = new ArrayList<>();
newValue.add(row);
distinctHash.put(hash, newValue);
return true;
}
}
QUESTION
The problem is that my algorithm is too slow for my needs (40s for 5,000,000 arrays, and 2h-3h for 20,000,000 arrays). Profiling with NetBeans told me that the hashing takes 70% of runtime (using Google Guava murmur3_128 hash function).
Is there another algorithm that could be faster? As I said I'm not interested in arrays values, only in the number of their occurrences. I am ready to sacrifice precision for speed so a probabilistic algorithm is fine.
Wrap the int[] in a class that implements equals and hashCode, then build Map of the wrapper class to instance count.
class IntArray {
private int[] array;
public IntArray(int[] array) {
this.array = array;
}
@Override
public int hashCode() {
return Arrays.hashCode(this.array);
}
@Override
public boolean equals(Object obj) {
return (obj instanceof IntArray && Arrays.equals(this.array, ((IntArray) obj).array));
}
@Override
public String toString() {
return Arrays.toString(this.array);
}
}
Test
int[][] input = {{1,2,3},
{1,0,3},
{1,2,3},
{5,2,6},
{5,2,6},
{5,2,6}};
Map<IntArray, Long> map = Arrays.stream(input).map(IntArray::new)
.collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
map.entrySet().forEach(System.out::println);
Output
[1, 2, 3]=2
[1, 0, 3]=1
[5, 2, 6]=3
Note: The above solution is faster and uses less memory than solution by Ravindra Ranwala, but it does require the creation of an extra class, so it is debatable which is better.
For smaller arrays, use the simpler solution below by Ravindra Ranwala.
For larger arrays, the above solution is likely better.
Map<List<Integer>, Long> map = Stream.of(input)
.map(a -> Arrays.stream(a).boxed().collect(Collectors.toList()))
.collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
You may do it like so,
Map<List<Integer>, Long> result = Stream.of(source)
.map(a -> Arrays.stream(a).boxed().collect(Collectors.toList()))
.collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
And here's the output,
{[1, 2, 3]=2, [1, 0, 3]=1, [5, 2, 6]=3}
If all duplicates of an array list their elements in the same order and the arrays are short, you can map each array to a single integer value and then reuse the last part of your method, counting occurrences per number. Although this reduces the time spent on hashing, it rests on assumptions that might not hold in your case.
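A minimal sketch of that idea follows, under loudly stated assumptions that may well not hold for the real data: every row has the same length of at most three elements, and every element lies in the range 0 to 2^20 - 1, so a row fits into one long. The class name, the constant BITS and both method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class PackedKeySketch {

    // Assumption: each element needs at most 20 bits.
    static final int BITS = 20;

    // Packs a short row into one long; rows with equal content give equal keys.
    static long pack(int[] row) {
        if (row.length > 3) {
            throw new IllegalArgumentException("row too long to pack into a long");
        }
        long key = 0;
        for (int v : row) {
            if (v < 0 || v >= (1 << BITS)) {
                throw new IllegalArgumentException("value out of range: " + v);
            }
            key = (key << BITS) | v;
        }
        return key;
    }

    // Counts how often each packed row occurs.
    static Map<Long, Long> countRows(int[][] rows) {
        return Arrays.stream(rows)
                .map(PackedKeySketch::pack)
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        int[][] input = {{1, 2, 3}, {1, 0, 3}, {1, 2, 3}, {5, 2, 6}, {5, 2, 6}, {5, 2, 6}};
        // Only the cardinalities are of interest here: 2, 1 and 3 in some order.
        System.out.println(countRows(input).values());
    }
}
```

Rows of different lengths could collide under this packing, which is why the equal-length assumption matters.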

Find map value with highest number of occurrences

I have a Map<Integer,Integer>
1 10
2 10
3 20
5 20
6 11
7 22
How do I find the most frequently repeated value of the map? In this case, that is 10 and 20; both occur twice.
Don't reinvent the wheel and use the frequency method of the Collections class:
public static int frequency(Collection<?> c, Object o)
If you need to count the occurrences for all values, use a Map and loop cleverly :)
Or put your values in a Set and loop on each element of the set with the frequency method above. HTH
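The Set-plus-frequency suggestion could be sketched as follows. The class and method names are illustrative only; note that each Collections.frequency call scans the whole collection, so this is best suited to small inputs.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

class FrequencySketch {

    // Returns the values that occur most often in the given collection.
    static Set<Integer> mostFrequent(Collection<Integer> values) {
        Set<Integer> result = new TreeSet<>();
        int best = 0;
        // Visit each distinct value once and count its occurrences.
        for (Integer candidate : new HashSet<>(values)) {
            int count = Collections.frequency(values, candidate);
            if (count > best) {
                best = count;
                result.clear(); // a new maximum replaces earlier candidates
            }
            if (count == best) {
                result.add(candidate);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // The values of the map from the question.
        System.out.println(mostFrequent(Arrays.asList(10, 10, 20, 20, 11, 22))); // [10, 20]
    }
}
```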
If you fancy a more functional, Java 8 one-liner solution with lambdas, try:
Map<Integer, Long> occurrences =
map.values().stream().collect(Collectors.groupingBy(w -> w, Collectors.counting()));
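The one-liner yields the occurrence counts but not yet the most frequent values. A possible continuation, as a sketch with made-up class and method names, determines the highest count and keeps the values that reach it:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class MostFrequentValuesSketch {

    // Returns the map values with the highest number of occurrences, in ascending order.
    static List<Integer> mostFrequentValues(Map<Integer, Integer> map) {
        Map<Integer, Long> occurrences = map.values().stream()
                .collect(Collectors.groupingBy(w -> w, Collectors.counting()));
        // Highest occurrence count, or 0 for an empty map.
        long max = occurrences.values().stream()
                .mapToLong(Long::longValue)
                .max()
                .orElse(0L);
        return occurrences.entrySet().stream()
                .filter(e -> e.getValue() == max)
                .map(Map.Entry::getKey)
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<Integer, Integer> map = new HashMap<>();
        map.put(1, 10);
        map.put(2, 10);
        map.put(3, 20);
        map.put(5, 20);
        map.put(6, 11);
        map.put(7, 22);
        System.out.println(mostFrequentValues(map)); // [10, 20]
    }
}
```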
Loop over the values of the map and count the number of repetitions:
for(Integer value:myMap.values() ){
Integer count = 1;
if(countMap.containsKey(value)){
count = countMap.get(value);
count++;
}
countMap.put(value, count);
}
then loop over the result map, and find the max(s):
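Integer maxValue = 0;
for (Map.Entry<Integer, Integer> entry : countMap.entrySet()) {
    if (entry.getValue() > maxValue) {
        // new maximum: discard the values collected for the old one
        maxValue = entry.getValue();
        maxResultList.clear();
    }
    if (entry.getValue().equals(maxValue)) {
        maxResultList.add(entry.getKey());
    }
}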
A simple solution is to write your own put method that collects repeated values.
For repeated values
void put(String x, int i){
List<Integer> list = map.get(x);
if(list == null){
list = new ArrayList<Integer>();
map.put(x, list);
}
list.add(i);
}
So, in this case, the key maps to a list of [10,10,20,20].
For getting the occurrences of repeated values
You need to compare the size of your values list with the size of your values set.
List<T> listOfValues = new ArrayList<T>(map.values());
Set<T> setOfValues = new HashSet<T>(map.values());
Now compare the sizes of both collections: if they are unequal, you have duplicates. Subtracting the set size from the list size gives the number of surplus (repeated) entries.
We can use a number of simple methods to do this.
First, we can define a method that counts elements, and returns a map from the value to its occurrence count:
static <T> Map<T, Long> countAll(Collection<T> c) {
    return c.stream().collect(Collectors.groupingByConcurrent(k -> k, Collectors.counting()));
}
Then, to filter out all entries having fewer instances than the one with the most, we can do this:
static <T, C extends Collection<T>> C maxima(Collection<T> c, Comparator<? super T> comp,
        Supplier<C> p) {
    T max = c.stream().max(comp).get(); // assumes c is not empty
    return c.stream().filter(t -> comp.compare(t, max) >= 0).collect(Collectors.toCollection(p));
}
Now we can use them together to get the results we want:
maxima(countAll(yourMap.values()).entrySet(),
        Comparator.comparing(e -> e.getValue()), HashSet::new);
Note that this would produce a HashSet<Entry<Integer,Long>> in your case.
Try this simple method:
public String getMapKeyWithHighestValue(HashMap<String, Integer> map) {
String keyWithHighestVal = "";
// getting the maximum value in the Hashmap
int maxValueInMap = (Collections.max(map.values()));
//iterate through the map to get the key that corresponds to the maximum value in the Hashmap
for (Map.Entry<String, Integer> entry : map.entrySet()) { // Iterate through hashmap
if (entry.getValue() == maxValueInMap) {
keyWithHighestVal = entry.getKey(); // this is the key which has the max value
}
}
return keyWithHighestVal;
}
