Best data structure to "group by" and aggregate values in Java? - java

I created an ArrayList of Array type like below,
ArrayList<Object[]> csvArray = new ArrayList<Object[]>();
As you can see, each element of the ArrayList is an array like {Country, City, Name, Age}.
Now I'm wanting to do a "group by" on Country and City (combined), followed by taking the average Age of the people for each Country+City.
What is the easiest way to achieve this? Or is there a data structure better suited than ArrayList for this kind of "group by" and aggregation?
Your answers are much appreciated.

Java 8 gives you a lot of options.
Example
Stream<Person> people = Stream.of(new Person("Paul", 24), new Person("Mark",30), new Person("Will", 28));
Map<Integer, List<String>> peopleByAge = people
.collect(groupingBy(p -> p.age, mapping((Person p) -> p.name, toList())));
System.out.println(peopleByAge);
If you can use Java 8 and have no specific reason to use a particular data structure, you can go through the tutorial below:
http://java.dzone.com/articles/java-8-group-collections

You could use Java 8 streams for this and Collectors.groupingBy. For example:
final List<Object[]> data = new ArrayList<>();
data.add(new Object[]{"NL", "Rotterdam", "Kees", 38});
data.add(new Object[]{"NL", "Rotterdam", "Peter", 54});
data.add(new Object[]{"NL", "Amsterdam", "Suzanne", 51});
data.add(new Object[]{"NL", "Rotterdam", "Tom", 17});
final Map<String, List<Object[]>> map = data.stream().collect(
Collectors.groupingBy(row -> row[0].toString() + ":" + row[1].toString()));
for (final Map.Entry<String, List<Object[]>> entry : map.entrySet()) {
final double average = entry.getValue().stream()
.mapToInt(row -> (int) row[3]).average().getAsDouble();
System.out.println("Average age for " + entry.getKey() + " is " + average);
}
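The two steps above (group the rows, then average each group) can also be done in one pass by handing groupingBy a downstream collector. This is a minimal sketch, assuming the same Object[] row layout {country, city, name, age} and that the age is always stored as an Integer; the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class GroupAverage {

    // Groups rows by "country:city" and averages the age column in a single pass.
    // Assumes row[0] and row[1] are non-null and row[3] holds an Integer.
    static Map<String, Double> averageAgeByCountryAndCity(List<Object[]> rows) {
        return rows.stream().collect(Collectors.groupingBy(
                (Object[] row) -> row[0] + ":" + row[1],
                Collectors.averagingInt((Object[] row) -> (Integer) row[3])));
    }

    public static void main(String[] args) {
        List<Object[]> data = new ArrayList<>();
        data.add(new Object[]{"NL", "Rotterdam", "Kees", 38});
        data.add(new Object[]{"NL", "Rotterdam", "Peter", 54});
        data.add(new Object[]{"NL", "Amsterdam", "Suzanne", 51});
        data.add(new Object[]{"NL", "Rotterdam", "Tom", 17});

        averageAgeByCountryAndCity(data).forEach(
                (key, avg) -> System.out.println("Average age for " + key + " is " + avg));
    }
}
```

Because the averages are computed while grouping, no intermediate List<Object[]> per group is kept in memory.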

You can check the collections recommended by @duffy356. I can give you a standard solution using only java.util.
I'd use a plain Map<Key, Value>, specifically a HashMap.
For the keys you'll need an extra plain object that combines country and city. The important part is a working equals(Object) method, together with a matching hashCode(). I'd use the Eclipse auto-generator, which gives me the following:
class CountryCityKey {
// package visibility
String country;
String city;
@Override
public int hashCode() {
final int prime = 31;
int result = 1;
result = prime * result + ((country == null) ? 0 : country.hashCode());
result = prime * result + ((city == null) ? 0 : city.hashCode());
return result;
}
@Override
public boolean equals(Object obj) {
if (this == obj)
return true;
if (obj == null)
return false;
if (getClass() != obj.getClass())
return false;
CountryCityKey other = (CountryCityKey) obj;
if (country == null) {
if (other.country != null)
return false;
} else if (!country.equals(other.country))
return false;
if (city == null) {
if (other.city != null)
return false;
} else if (!city.equals(other.city))
return false;
return true;
}
}
Now we can group our objects in a HashMap<CountryCityKey, List<MySuperObject>>.
The code for that could be:
Map<CountryCityKey, List<MySuperObject>> group(List<MySuperObject> list) {
Map<CountryCityKey, List<MySuperObject>> response = new HashMap<>(list.size());
for (MySuperObject o : list) {
CountryCityKey key = o.getKey(); // I consider this done, so simply
List<MySuperObject> l;
if (response.containsKey(key)) {
l = response.get(key);
} else {
l = new ArrayList<MySuperObject>();
}
l.add(o);
response.put(key, l);
}
return response;
}
And you have it :)
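On Java 7 and later the generated equals/hashCode pair can be shortened with java.util.Objects, and on Java 8 Map.computeIfAbsent replaces the containsKey/get/put sequence. The following is only a sketch of that variant: Person is a hypothetical stand-in for MySuperObject, and storing the key inside each object is an assumption made to keep the example short.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

public class GroupByKey {

    // Immutable composite key; equals and hashCode cover both fields.
    static final class CountryCityKey {
        final String country;
        final String city;

        CountryCityKey(String country, String city) {
            this.country = country;
            this.city = city;
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            if (!(obj instanceof CountryCityKey)) return false;
            CountryCityKey other = (CountryCityKey) obj;
            return Objects.equals(country, other.country)
                    && Objects.equals(city, other.city);
        }

        @Override
        public int hashCode() {
            return Objects.hash(country, city);
        }
    }

    // Hypothetical row type standing in for MySuperObject.
    static final class Person {
        final CountryCityKey key;
        final String name;
        final int age;

        Person(String country, String city, String name, int age) {
            this.key = new CountryCityKey(country, city);
            this.name = name;
            this.age = age;
        }
    }

    // Creates the list for a key on first use, then appends to it.
    static Map<CountryCityKey, List<Person>> group(List<Person> people) {
        Map<CountryCityKey, List<Person>> groups = new HashMap<>();
        for (Person p : people) {
            groups.computeIfAbsent(p.key, k -> new ArrayList<>()).add(p);
        }
        return groups;
    }
}
```

The average age for one group can then be computed by iterating over the list stored under its key.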

you could use the brownies-collections library of magicwerk.org (http://www.magicwerk.org/page-collections-overview.html)
they offer keylists, which fit your requirements.(http://www.magicwerk.org/page-collections-examples.html)

I would recommend an additional step. You read your data from CSV into Object[] rows. If you wrap each row in a class, the Java 8 collection and stream APIs will help you easily. (It also works without a class, but the code is more readable and understandable with one.)
Here is an example. It introduces a class Information that holds your data (country, city, name, age). The class has a constructor that initializes these fields from a given Object[] array, which might help you. Note that the field order has to be fixed (which is usual for CSV):
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
public class CSVExample {
public static void main(String[] args) {
ArrayList<Information> csvArray = new ArrayList<>();
csvArray.add(new Information(new Object[] {"France", "Paris", "Pierre", 34}));
csvArray.add(new Information(new Object[] {"France", "Paris", "Madeleine", 26}));
csvArray.add(new Information(new Object[] {"France", "Toulouse", "Sam", 34}));
csvArray.add(new Information(new Object[] {"Italy", "Rom", "Paul", 44}));
// combining country and city with whitespace delimiter to use it as the map key
Map<String, List<Information>> collect = csvArray.stream().collect(Collectors.groupingBy(s -> (s.getCountry() + " " + s.getCity())));
//for each key (country and city) print the key and the average age
collect.forEach((k, v) -> System.out.println(k + " " + v.stream().collect(Collectors.averagingInt(Information::getAge))));
}
}
class Information {
private String country;
private String city;
private String name;
private int age;
public Information(Object[] information) {
this.country = (String) information[0];
this.city = (String) information[1];
this.name = (String) information[2];
this.age = (Integer) information[3];
}
public Information(String country, String city, String name, int age) {
super();
this.country = country;
this.city = city;
this.name = name;
this.age = age;
}
public String getCountry() {
return country;
}
public String getCity() {
return city;
}
public String getName() {
return name;
}
public int getAge() {
return age;
}
@Override
public String toString() {
return "Information [country=" + country + ", city=" + city + ", name=" + name + ", age=" + age + "]";
}
}
The main method prints the average age for each country and city.

In Java 8, grouping the objects of a collection by the values of one or more of their properties is simplified by using a Collector.
First, I suggest you add a new class as follows:
class Info {
private String country;
private String city;
private String name;
private int age;
public Info(String country,String city,String name,int age){
this.country=country;
this.city=city;
this.name=name;
this.age=age;
}
public String toString() {
return "("+country+","+city+","+name+","+age+")";
}
// getters and setters
}
Setting up infos
ArrayList<Info> infos = new ArrayList<>();
infos.add(new Info("USA", "Florida", "John", 26));
infos.add(new Info("USA", "Florida", "James", 18));
infos.add(new Info("USA", "California", "Alan", 30));
Group by Country+City:
Map<String, Map<String, List<Info>>>
groupByCountryAndCity = infos.
stream().
collect(
Collectors.
groupingBy(
Info::getCountry,
Collectors.
groupingBy(
Info::getCity
)
)
);
System.out.println(groupByCountryAndCity.get("USA").get("California"));
Output
[(USA,California,Alan,30)]
The average Age of the people for each Country+City:
Map<String, Map<String, Double>>
averageAgeByCountryAndCity = infos.
stream().
collect(
Collectors.
groupingBy(
Info::getCountry,
Collectors.
groupingBy(
Info::getCity,
Collectors.averagingDouble(Info::getAge)
)
)
);
System.out.println(averageAgeByCountryAndCity.get("USA").get("Florida"));
Output:
22.0

Please use the code below; I pasted it from my sample app. Happy coding!
/* key: category, value: list of cars */
Map<String, List<JmCarDistance>> map = new HashMap<String, List<JmCarDistance>>();
for (JmCarDistance jmCarDistance : carDistanceArrayList) {
String key = jmCarDistance.cartype;
if(map.containsKey(key)){
List<JmCarDistance> list = map.get(key);
list.add(jmCarDistance);
}else{
List<JmCarDistance> list = new ArrayList<JmCarDistance>();
list.add(jmCarDistance);
map.put(key, list);
}
}

Best data structure is a Map<Tuple, List>.
Tuple is the key, i.e. your group by columns.
List is used to store the row data.
Once you have your data in this structure, you can iterate through each key, and perform the aggregation on the subset of data.
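A sketch of this idea, with two assumptions that are not in the answer above: a List<String> serves as the tuple (lists compare by content, so they work as map keys), and instead of keeping the rows the map keeps running IntSummaryStatistics for the age column, which is enough when only count, sum, min, max or average are needed. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.Map;

public class TupleGrouping {

    // Key: [country, city]; value: running statistics over the age column.
    // Assumes rows are laid out as {country, city, name, age} with an Integer age.
    static Map<List<String>, IntSummaryStatistics> aggregate(List<Object[]> rows) {
        Map<List<String>, IntSummaryStatistics> stats = new HashMap<>();
        for (Object[] row : rows) {
            List<String> key = Arrays.asList((String) row[0], (String) row[1]);
            stats.computeIfAbsent(key, k -> new IntSummaryStatistics())
                 .accept((Integer) row[3]);
        }
        return stats;
    }

    public static void main(String[] args) {
        List<Object[]> rows = Arrays.asList(
                new Object[]{"France", "Paris", "Pierre", 34},
                new Object[]{"France", "Paris", "Madeleine", 26},
                new Object[]{"France", "Toulouse", "Sam", 34});

        aggregate(rows).forEach((key, s) ->
                System.out.println(key + " avg=" + s.getAverage() + " count=" + s.getCount()));
    }
}
```

If the rows themselves are needed later, the value type can be changed back to a List<Object[]> and the aggregation done in a second step.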

Related

Unable to get desired output in "Predicate" "and" of Java

I have the following class Person -
Person.java -
public class Person {
private int id;
private String department;
private double salary;
public Person(int id, String department, double salary) {
this.id = id;
this.department = department;
this.salary = salary;
}
public String getDepartment() {
return department;
}
public double getSalary() {
return salary;
}
@Override
public String toString() {
return "Person{" +
"id=" + id +
", department='" + department + '\'' +
", salary=" + salary +
'}';
}
}
It has the fields -
id, department, salary
Now I have first predicate -
Predicate<List<Person>> hasSalaryOf40k = list -> {
boolean myReturn = false;
Iterator<Person> iterator = list.iterator();
while (iterator.hasNext()) {
Person person = iterator.next();
double salary = person.getSalary();
if (salary == 40000) {
myReturn = true;
break;
}
}
return myReturn;
};
Here, I want to filter out those lists having persons with salary as 40K.
Second predicate -
Predicate<List<Person>> isDeveloper = list -> {
boolean myReturn = false;
Iterator<Person> iterator = list.iterator();
while (iterator.hasNext()) {
Person person = iterator.next();
String department = person.getDepartment();
if (department.equals("Developer")) {
myReturn = true;
break;
}
}
return myReturn;
};
Here, I want to filter out those lists having persons with department as 'developer'
Third predicate -
Predicate<List<Person>> hasSalaryOf40kAndIsDeveloper = list ->
hasSalaryOf40k.and(isDeveloper).test(list);
Here, I want to filter out those lists having persons with both salary as 40K and department as "developer"
Now I have the following two lists -
List<Person> list1 = new ArrayList<>(List.of(
new Person(1, "Developer", 35000),
new Person(2, "Accountant", 40000),
new Person(3, "Clerk", 20000),
new Person(4, "Manager", 50000)
));
List<Person> list2 = new ArrayList<>(List.of(
new Person(1, "Developer", 40000),
new Person(2, "Accountant", 35000),
new Person(3, "Clerk", 22000),
new Person(4, "Manager", 55000)
));
The list1 does not match the desired criteria while list2 matches the desired criteria.
Now I call the predicate method test -
System.out.println(hasSalaryOf40kAndIsDeveloper.test(list1));
System.out.println(hasSalaryOf40kAndIsDeveloper.test(list2));
Output -
true
true
Desired output -
false
true
Where am I going wrong and how to correct my code?
Each of your predicates is applied to the whole list, not to each element. For list1 it is true that the list contains a developer, and it is also true that the list contains someone with a salary of 40k, so the combined predicate returns true even though these are two different people. You need to apply the predicates to each Person object rather than to the List<Person>.
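One way to do that is to define the predicates on Person, combine them with and, and then ask whether any single element of the list satisfies the combination. This is a minimal sketch; the Person class is reduced to public final fields and the helper method name is made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class PredicateDemo {

    // Simplified version of the Person class from the question.
    static final class Person {
        final int id;
        final String department;
        final double salary;

        Person(int id, String department, double salary) {
            this.id = id;
            this.department = department;
            this.salary = salary;
        }
    }

    // Element-level predicates: each one tests a single Person.
    static final Predicate<Person> HAS_SALARY_OF_40K = p -> p.salary == 40000;
    static final Predicate<Person> IS_DEVELOPER = p -> "Developer".equals(p.department);

    // True only if one and the same person satisfies both conditions.
    static boolean hasDeveloperEarning40k(List<Person> people) {
        return people.stream().anyMatch(HAS_SALARY_OF_40K.and(IS_DEVELOPER));
    }

    public static void main(String[] args) {
        List<Person> list1 = Arrays.asList(
                new Person(1, "Developer", 35000),
                new Person(2, "Accountant", 40000));
        List<Person> list2 = Arrays.asList(
                new Person(1, "Developer", 40000),
                new Person(2, "Accountant", 35000));

        System.out.println(hasDeveloperEarning40k(list1)); // false
        System.out.println(hasDeveloperEarning40k(list2)); // true
    }
}
```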

How can I duplicate an HashMap with duplicate values

I need to create new HashMaps that contain only the duplicate values of my first HashMap :
Original map: {Player1=Hello, Player2=Hi, Player3=Hi, Player4=Hello, Player5=Hello}
For the outputs, I want to get :
Hello map: {Player1=Hello,Player4=Hello, Player5=Hello}
Hi map: {Player2=Hi, Player3=Hi}
What is best way to do?
If you are using Java 8+, you can use a stream with groupingBy and toMap, like so:
Map<String, Map<String, String>> collect = map.entrySet().stream()
.collect(Collectors.groupingBy(Map.Entry::getValue,
Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue)));
For a simple map, the output can be:
Hi - {Player2=Hi, Player3=Hi}
Hello - {Player5=Hello, Player1=Hello, Player4=Hello}
Ideone demo
class Player {
private String name;
private int id;
public Player(int id) {
this.id = id;
this.name = "Player " + id;
}
@Override
public int hashCode() {
return this.id;
}
@Override
public String toString() {
return this.name;
}
}
Player[] players = new Player[5];
IntStream.range(0, players.length).forEach(i -> players[i] = new Player(i + 1));
// -------------------------------------------------------------------------
HashMap<Player, String> original = new HashMap<>();
original.put(players[0], "Hello");
original.put(players[1], "Hi");
original.put(players[2], "Hi");
original.put(players[3], "Hello");
original.put(players[4], "Hello");
HashMap<String, HashMap<Player, String>> duplicates = new HashMap<>();
original.keySet().stream().forEach(key -> {
String value = original.get(key);
HashMap<Player, String> duplicate = duplicates.get(value);
if (duplicate == null) {
duplicate = new HashMap<>();
duplicates.put(value, duplicate);
}
duplicate.put(key, value);
});
System.out.println("Original: " + original);
duplicates.forEach((key, value) -> {
System.out.println(key + ": " + value);
});
//
Original: {Player 1=Hello, Player 2=Hi, Player 3=Hi, Player 4=Hello, Player 5=Hello}
Hi: {Player 2=Hi, Player 3=Hi}
Hello: {Player 1=Hello, Player 4=Hello, Player 5=Hello}
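The stream-based approach with groupingBy and toMap can be tried as a self-contained program. The sketch below assumes a Map<String, String> like the one in the question; the second method, which drops values that occur only once, is an addition for the case where "only the duplicate values" is meant literally. Class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class SplitByValue {

    // Splits a map into one sub-map per distinct value.
    static Map<String, Map<String, String>> splitByValue(Map<String, String> source) {
        return source.entrySet().stream()
                .collect(Collectors.groupingBy(Map.Entry::getValue,
                        Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue)));
    }

    // Same as splitByValue, but keeps only values shared by at least two keys.
    static Map<String, Map<String, String>> onlyDuplicates(Map<String, String> source) {
        return splitByValue(source).entrySet().stream()
                .filter(e -> e.getValue().size() > 1)
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
    }

    public static void main(String[] args) {
        Map<String, String> original = new LinkedHashMap<>();
        original.put("Player1", "Hello");
        original.put("Player2", "Hi");
        original.put("Player3", "Hi");
        original.put("Player4", "Hello");
        original.put("Player5", "Hello");

        // Iteration order of the resulting maps is not guaranteed.
        onlyDuplicates(original).forEach(
                (value, players) -> System.out.println(value + " map: " + players));
    }
}
```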

Reduce list of object into Immutable map with lambda

I have a list of Person objects. Each person has a unique id, but several persons can share the same name.
class Person {
String id;
String name;
}
I want to convert this array of persons into ImmutableMap<String, ImmutableSet<String>>. The key of the map should be the user's name, the immutable set contains the ids of specific user's name.
I know how to do it using HashMap and HashSet:
for (Person person : personList) {
String id = person.id;
String name = person.name;
if (!hashMap.containsKey(name)) {
hashMap.put(name, new HashSet<String>());
}
hashMap.get(name).add(id);
}
I want to know how to do it using ImmutableMap, ImmutableSet with lambda.
Here is one possible solution.
First, generate some data. Three names and 18 ids. Put them in a list.
Random r = new Random();
int[] ids = r.ints(1000, 1, 1000).distinct().limit(18).toArray();
int id = 0;
List<Person> people = new ArrayList<>();
for (int i = 0; i < 6; i++) {
for (String name : List.of("Bob", "Joe", "Mary")) {
people.add(new Person(name, ids[id++]));
}
}
Now creating the map.
Use groupingBy to map each key to a collection. The key is the
name and the value is the collection.
The collection (a set) holds the ids.
Map<String, Set<Integer>> nameToID =
Collections.unmodifiableMap(people.stream().collect(
Collectors.groupingBy(Person::getName, Collectors.mapping(
Person::getID, Collectors.toUnmodifiableSet()))));
Print them.
nameToID.entrySet().forEach(
e -> System.out.println(e.getKey() + " -> " + e.getValue()));
}
}
Here is the Person class with some additional methods and a constructor.
class Person {
String name;
int id;
public Person(String name, int id) {
this.name = name;
this.id = id;
}
public String getName() {
return name;
}
public int getID() {
return id;
}
public String toString() {
return "(" + name + "," + id + ")";
}
}
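ImmutableMap and ImmutableSet in the question are presumably the Guava types; the sketch below stays within the JDK instead and assumes Java 10 or later. It uses String ids as in the question, and wraps the grouping in collectingAndThen with Map.copyOf so that the result is an unmodifiable copy rather than an unmodifiable view of a mutable map. The class and method names are made up for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

public class NameToIds {

    // Person as described in the question: unique String id, possibly repeated name.
    static final class Person {
        final String id;
        final String name;

        Person(String id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    // Groups ids by name; both the outer map and the inner sets are unmodifiable.
    static Map<String, Set<String>> idsByName(List<Person> people) {
        return people.stream().collect(Collectors.collectingAndThen(
                Collectors.groupingBy((Person p) -> p.name,
                        Collectors.mapping((Person p) -> p.id,
                                Collectors.toUnmodifiableSet())),
                Map::copyOf));
    }

    public static void main(String[] args) {
        List<Person> people = List.of(
                new Person("1", "Bob"),
                new Person("2", "Bob"),
                new Person("3", "Mary"));

        idsByName(people).forEach(
                (name, ids) -> System.out.println(name + " -> " + ids));
    }
}
```

Any attempt to call put on the returned map or add on one of its sets throws UnsupportedOperationException.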

Get all first value of List<Object[]>

how can I retrieve only the pet types using lambda expression? I want a new list containing only "dog", "cat", etc,
I have list:
List<Object[]> list;
in this list I have structure:
list.get(0) returns ("dog", 11)
list.get(1) returns ("cat", 22)
etc.
An easy way is the use of the stream api:
List<Object> firstElements = list.stream().map(o -> o[0]).collect(Collectors.toList());
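If the first element of every row is known to be a String, the result can be typed as List<String> by casting inside map. A small sketch under that assumption (a row with a different type in position 0 would throw a ClassCastException); the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

public class FirstElements {

    // Extracts the first element of each row, assuming it is always a String.
    static List<String> petTypes(List<Object[]> rows) {
        return rows.stream()
                .map(row -> (String) row[0])
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Object[]> list = new ArrayList<>();
        list.add(new Object[]{"dog", 11});
        list.add(new Object[]{"cat", 22});

        System.out.println(petTypes(list)); // [dog, cat]
    }
}
```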
It is as simple as using map and collect.
private void test(String[] args) {
List<Animal> list = new ArrayList<>();
list.add(new Animal("dog",11));
list.add(new Animal("cat",22));
List<String> names = list.stream()
// Animal -> animal.type.
.map(a -> a.getType())
// Collect into a list.
.collect(Collectors.toList());
System.out.println(names);
}
I used Animal as:
class Animal {
final String type;
final int age;
public Animal(String type, int age) {
this.type = type;
this.age = age;
}
public String getType() {
return type;
}
public int getAge() {
return age;
}
@Override
public String toString() {
return "Animal{" +
"type='" + type + '\'' +
", age=" + age +
'}';
}
}

using iterator to go over list of hashmaps

I have a String exp, which I have to compare with the mapsList.
String exp = "nodeId=1&&name=Router||level=1";
List mapList = new ArrayList();
Map map = new HashMap();
map.put("nodeId","1");
map.put("name","Router");
map.put("level", "1");
Map map1 = new HashMap();
map1.put("nodeId","2");
map1.put("name","Router");
map1.put("level","2");
Map map2 = new HashMap();
map2.put("nodeId","3");
map2.put("name","Router");
map2.put("level","3");
mapList.add(map);
mapList.add(map1);
mapList.add(map2);
I take the exp and split into an array.
String delims = "[\\&&\\||\\=]+";
String[] token = exp.split(delims);
Then I divide the array into two smaller sub arrays. One for Keys and the other for values. After which I compare ...
if(map.keySet().contains(a1[0]) && map.keySet().contains(a1[1]) || map.keySet().contains(a1[2])){
if(map.values().contains(a2[0]) && map.values().contains(a2[1]) || map.values().contains(a2[2])){
System.out.println("Match\tMapKeys: "+map.keySet()+" Values: "+map.values());
}else{
System.out.println("No Match\t");
}
}
So my problem is I can do this for each map, but can't figure out how to implement it with iterator.
Can someone push me in the right direction?
Thanks.
You really, really want to define an object to hold your data, instead of using HashMaps.
class Node {
private int id;
private String name;
private int level;
public Node(int id, String name, int level) {
this.id = id;
this.name = name;
this.level = level;
}
public int getId() {
return id;
}
public String getName() {
return name;
}
public int getLevel() {
return level;
}
}
now you populate the list like this
List<Node> nodeList = new ArrayList<Node>();
nodeList.add(new Node(1, "Router", 1));
nodeList.add(new Node(2, "Router", 2));
nodeList.add(new Node(3, "Router", 3));
and you could look for your match like this
String exp = "nodeId=1&&name=Router||level=1";
String delims = "[\\&&\\||\\=]+";
String[] token = exp.split(delims);
int id = Integer.parseInt(token[1]);
String name = token[3];
int level = Integer.parseInt(token[5]);
boolean match = false;
for (Node node : nodeList) {
if (node.getId() == id && node.getName().equals(name)
&& node.getLevel() == level) {
System.out.println("Match found: " + node);
match = true;
}
}
if (!match) {
System.out.println("No match");
}
which gives me the following output
Match found: Node@1391f61c
and the next step is to implement toString.
You should check out http://docs.oracle.com/javase/tutorial/java/concepts/ as it introduces objects and why they are useful.
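Note that the loop above requires all three conditions to hold, while the expression nodeId=1&&name=Router||level=1 mixes && and ||. The sketch below is one possible reading, assuming Java-like precedence (&& binds tighter than ||) and hard-coding the structure of the expression instead of parsing it. It also adds a toString so that matches print readably. All names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class NodeMatch {

    static final class Node {
        final int id;
        final String name;
        final int level;

        Node(int id, String name, int level) {
            this.id = id;
            this.name = name;
            this.level = level;
        }

        @Override
        public String toString() {
            return "Node{id=" + id + ", name='" + name + "', level=" + level + "}";
        }
    }

    // Builds (nodeId == id && name equals name) || level == level.
    static Predicate<Node> expression(int id, String name, int level) {
        Predicate<Node> idMatches = n -> n.id == id;
        Predicate<Node> nameMatches = n -> n.name.equals(name);
        Predicate<Node> levelMatches = n -> n.level == level;
        return idMatches.and(nameMatches).or(levelMatches);
    }

    // Returns every node that satisfies the predicate, in list order.
    static List<Node> matches(List<Node> nodes, Predicate<Node> predicate) {
        List<Node> result = new ArrayList<>();
        for (Node node : nodes) {
            if (predicate.test(node)) {
                result.add(node);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Node> nodeList = Arrays.asList(
                new Node(1, "Router", 1),
                new Node(2, "Router", 2),
                new Node(3, "Router", 3));

        for (Node node : matches(nodeList, expression(1, "Router", 1))) {
            System.out.println("Match found: " + node);
        }
    }
}
```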
