I have an interesting design problem that I will attempt to simplify in a toy problem below:
I wish to design a system for which the output will be student objects based on certain inputs and intermediary processing. The flow will be as follows: I have a list of classrooms as one type of input. To generate the output, the processing steps are:
Filter each classroom by students under the age of X (let's say 10)
Sort the filtered results by any permutation of this hierarchical order: height, weight, arm length
Return the top 8 students.
Another input can simply be a list of students I already have and want included as part of the result. For example: Input 1: List of 3 students, Input 2: List of 2 classrooms for which the processing steps above will run.
What would be the best way to design such a system with inputs being:
input type {student list|classroom list},
filter type {age|height|etc},
sort order{any ordering of height,weight,arm length},
returnNum{how many students to return}
The system should be flexible enough to accommodate more input types and more sort order entries {ie. sort students by shoe size}. What data structure can I use to model each part of this section (ie. what is the best way to represent the sort order criteria?) Is there any design pattern that would fit these needs? Any help with the architecture design would be greatly appreciated!
Well, what you're suggesting can be done easily with Java 8 streams, so I guess one pattern you could follow is that of a pipeline. You could also implement this using internal iterators:
List<Student> found = Stream.of(student1, student2, student3, ..., studentn)
.filter(s -> s.getAge() > 100)
.sorted(Comparator.comparing(Student::getHeight).thenComparing(Student::getWeight))
.limit(10)
.collect(Collectors.toList());
From the requirements, although both Student and Classroom are StudentSources, the filtering and sorting always act on Student (they're never filtered or sorted by classroom, in your sample inputs). Filtering is pretty easy:
// interface with a single method, reduces to a lambda too
interface Filter<T> {
boolean accept(T candidate);
}
Sorting is canonically:
package java.util;
// interface with a single method, reduces to a lambda too
interface Comparator<T> {
int compare(T a, T b);
}
Both of the above are applications of the Visitor design pattern. As with the succinct answer by @Edwin, you line up your visitors in a pipeline (configuration phase) and then visit them (execution phase). Notice that the Visitor pattern has 'reasons' which um, students should read up on in their 'Gang of 4' book.
You don't say much about:
how the inputs are represented to the program (eg. as text which needs to be parsed)
whether the same student might conceivably appear in more than 1 classroom, or the list-of-students ... this has bearing on which Java collection you might choose to pass the Comparator to;
So the task at hand boils down to:
read the definitions of the filters, sorters, limiters, create visitors for these
read the data (classrooms/students), and for each student discovered, do if (passesAllFilters(student)) okstudents.add(student); where okstudents is a java.util.TreeSet primed with a Comparator. Once everything has been read, the first entries of the set, up to the limit, are the result.
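The second step above can be sketched as follows. This is only an illustration under assumptions: the Student fields, the TopStudents class and the select method are hypothetical names, and java.util.function.Predicate stands in for the Filter interface. Because the set is sorted, reading cannot simply stop at the limit (a later student may rank higher), so the sketch trims the worst entry whenever the set grows past the limit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;
import java.util.function.Predicate;

// Hypothetical Student type, used only for illustration.
class Student {
    final String name;
    final int age;
    final double height;

    Student(String name, int age, double height) {
        this.name = name;
        this.age = age;
        this.height = height;
    }
}

class TopStudents {
    // Returns at most `limit` students that pass every filter, best-ranked first.
    static List<Student> select(Iterable<Student> source,
                                List<Predicate<Student>> filters,
                                Comparator<Student> order,
                                int limit) {
        // Assumption: names are unique. The tie-break keeps two different students
        // that compare equal on the sort keys from being collapsed by the TreeSet.
        TreeSet<Student> ok = new TreeSet<>(order.thenComparing(s -> s.name));
        for (Student s : source) {
            boolean passesAll = filters.stream().allMatch(f -> f.test(s));
            if (!passesAll) {
                continue;
            }
            ok.add(s);
            if (ok.size() > limit) {
                ok.pollLast(); // drop the current worst-ranked student
            }
        }
        return new ArrayList<>(ok);
    }

    public static void main(String[] args) {
        List<Student> students = Arrays.asList(
                new Student("Ann", 9, 1.32), new Student("Bob", 12, 1.50),
                new Student("Cid", 8, 1.41), new Student("Dee", 7, 1.18));
        Predicate<Student> underTen = s -> s.age < 10;
        Comparator<Student> tallestFirst =
                Comparator.comparingDouble((Student s) -> s.height).reversed();
        for (Student s : select(students, Arrays.asList(underTen), tallestFirst, 2)) {
            System.out.println(s.name);
        }
    }
}
```

Trimming with pollLast keeps memory bounded by the limit, at the cost of one extra set operation per accepted student.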
You could possibly quibble that the step which takes 'input that defines filter(s)' is a 'factory method', but that's really of no help .. List<Filter<Student>> getFilters(String filterSpec) doesn't really get you anywhere where a factory method is useful. Parsing the filters and coming up with code which references particular properties of Students, applies the expressions, etc, may not be a trivial task. Depending on the types of expressions your filters are required to permit, you might want to look into a compiler-generator like ANTLR 4. You'll likely need to use reflection.
Take a look at https://docs.oracle.com/javase/7/docs/api/java/util/SortedMap.html
You can use multiple SortedMaps (one per sort key), and put every element of your list into each map under the corresponding key (e.g. sortedMapAge.put(18, carl), sortedMapHeight.put(1.75, carl), ...).
Then, with an iterator, you can access your list members ordered by the various keys using the appropriate SortedMap.
And if you want to store all those maps behind a further abstraction layer, you can store them in a HashMap keyed by the key identifier (sortedMapAge under key "age", sortedMapHeight under key "height", ...).
It's quite tricky, but maps offer a good way to organize objects by key.
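A minimal sketch of the one-map-per-attribute idea, under stated assumptions: AttributeIndex and its methods are hypothetical names, students are represented by their names only, and each map value is a list because a plain SortedMap would overwrite a student whose age or height equals that of a student already stored.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

class AttributeIndex {
    // attribute value -> names of the students with that value
    private final TreeMap<Integer, List<String>> byAge = new TreeMap<>();
    private final TreeMap<Double, List<String>> byHeight = new TreeMap<>();

    void add(String name, int age, double height) {
        byAge.computeIfAbsent(age, k -> new ArrayList<>()).add(name);
        byHeight.computeIfAbsent(height, k -> new ArrayList<>()).add(name);
    }

    List<String> namesByAge() {
        return flatten(byAge);
    }

    List<String> namesByHeight() {
        return flatten(byHeight);
    }

    // Walks the map in ascending key order and concatenates the buckets.
    private static List<String> flatten(TreeMap<?, List<String>> index) {
        List<String> out = new ArrayList<>();
        for (List<String> bucket : index.values()) {
            out.addAll(bucket);
        }
        return out;
    }

    public static void main(String[] args) {
        AttributeIndex index = new AttributeIndex();
        index.add("carl", 18, 1.75);
        index.add("ann", 17, 1.80);
        index.add("bob", 18, 1.60);
        System.out.println(index.namesByAge());
        System.out.println(index.namesByHeight());
    }
}
```

Every insert has to touch every map, which is the price paid for having each ordering available without re-sorting.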
Personally, I'd do it like this:
public <E> List<E> query(List<E> dataset, Filter<E> filter, Comparator<E> sortOrder, int maxResults);
That is, I'd use generics to abstract over the input type, the command pattern for the filters and orders, and a plain int for the number of results to return.
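One possible body for that signature, sketched with Java 8 streams. The assumptions: java.util.function.Predicate stands in for the Filter interface, and the Queries class and the inOrder helper are hypothetical names. inOrder shows one way to represent "any ordering of the sort keys": a list of comparators chained in priority order.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class Queries {
    // Chains comparators in priority order: earlier entries win, later ones break ties.
    static <E> Comparator<E> inOrder(List<Comparator<E>> keys) {
        Comparator<E> result = (a, b) -> 0; // starting point: everything is equal
        for (Comparator<E> key : keys) {
            result = result.thenComparing(key);
        }
        return result;
    }

    // Filter, sort, then keep the first maxResults elements.
    static <E> List<E> query(List<E> dataset, Predicate<E> filter,
                             Comparator<E> sortOrder, int maxResults) {
        return dataset.stream()
                .filter(filter)
                .sorted(sortOrder)
                .limit(maxResults)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("pear", "fig", "apple", "kiwi", "plum");
        Comparator<String> byLength = Comparator.comparingInt(String::length);
        Comparator<String> alphabetical = Comparator.naturalOrder();
        Comparator<String> order = inOrder(Arrays.asList(byLength, alphabetical));
        Predicate<String> longerThanThree = w -> w.length() > 3;
        System.out.println(query(words, longerThanThree, order, 2)); // prints [kiwi, pear]
    }
}
```

Swapping the order of the entries in the list passed to inOrder changes the sort hierarchy without touching query itself.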
I have a piece of code that returns the min and max values from some input that it takes. I need to know what the benefits are of using a custom class that has a minimum and a maximum field over using a map that holds these two values.
//this is the class that holds the min and max values
public class MaxAndMinValues {
private double minimum;
private double maximum;
//rest of the class code omitted
}
//this is the map that holds the min and max values
Map<String, Double> minAndMaxValuesMap
The most apparent answer would be object-oriented programming aspects, like the possibility to combine data with functionality and the possibility to derive from that class.
But let's for the moment assume, that is not a major factor, and your example is so simplistic, that I wouldn't use a Map either. What I would use is the Pair class from Apache Commons: https://commons.apache.org/proper/commons-lang/javadocs/api-3.1/org/apache/commons/lang3/tuple/Pair.html
(ImmutablePair):
https://commons.apache.org/proper/commons-lang/javadocs/api-3.1/org/apache/commons/lang3/tuple/ImmutablePair.html
The Pair class is generic, and has two generic types, one for each field. You can basically define a Pair of something, and get type safety, IDE support, autocompletion, and the big benefit of knowing what is inside. Also a Pair features stuff that a Map can not. For example, a Pair is potentially Comparable. See also ImmutablePair, if you want to use it as key in another Map.
public Pair<Double, Double> foo(...) {
// ...
Pair<Double, Double> range = Pair.of(minimum, maximum);
return range;
}
The big advantage of this class is, that the type you return exposes the contained types. So if you need to, you could return different types from a single method execution (without using a map or complicated inner class).
e.g. Pair<String, Double> or Pair<String, List<Double>>...
In a simple situation, where you just need to store the min and max values from user input, your custom class is a better fit than a Map. The reason is that in Java a Map object can be a HashMap, a LinkedHashMap or a TreeMap, and each of them takes some time to bring your data into its structure and again when you get a value back out of it. So in a simple case like the one you describe, just use your custom class. Moreover, you can write methods in your class to process the user input, which a Map cannot do for you.
I would say look at it from the perspective of how a programming language is used. Whatever the language, there will be multiple ways to achieve the result (easy/bad/complicated/performing ...). For an object-oriented language like Java, this question points more to the design side of your solution.
Think of accessibility.
The values in a Map are effectively public: you can modify the contents as you like from any part of the code. If you had a condition that min and max should be in the range [-100, 100] and some part of your code inserts 200 into the map, you have a bug. OK, we can cover it up with a validation, but how many instances of that validation would you write? With an object, there is always the possibility of encapsulation.
Think of re-use.
If you had the same requirement in another place in the code, you would have to rewrite the map logic again (probably with all the validations?). Doesn't look good, right?
Think of extensibility.
If you wanted one more piece of data, like median or average, you would either have to dirty the map with bad keys or create a new map. But an object is always easy to extend.
So it all relates to the design. If you think it's a one-time usage, a map will probably do (not a standard design anyway; a map should contain one kind of data, technically and functionally).
Last but not least, think of code readability and cognitive complexity. It will always be better with objects that have relevant responsibilities than with unclear generic storage.
Hope I made some sense!
The benefit is simple : make your code clearer and more robust.
The MaxAndMinValues name and its class definition (two fields) convey a min and a max value, but above all the class makes sure that it accepts only these two things, and its API is self-explanatory about how to store values in it and get them back.
Map<String, Double> minAndMaxValuesMap also conveys the idea that a min and a max value are stored in it, but it has multiple drawbacks in terms of design:
we don't know how to retrieve the values without looking at how they were added.
On that point, how should the keys be named when we add entries to the map? The String type is too broad for a key: "MIN", "min" and "Minimum" will all be accepted. An enum would solve this issue, but not the others.
we cannot ensure that both values (min and max) were added (whereas a constructor with arguments can do that).
we can add any other value to the map, since it is a Map and not a fixed data structure.
Beyond the idea of clearer code in general, I would add that if MaxAndMinValues were used only as an implementation detail inside a specific method or in a lambda, using a Map or even an array {15F, 20F} would be acceptable. But if the data is passed between methods, you have to make its meaning as clear as possible.
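To make the constructor argument concrete, here is a sketch of the class from the question with the omitted parts filled in. Everything beyond the two fields is an assumption for illustration: the immutability, the validation rules, the getter names and the of factory method are not from the question.

```java
// Sketch of the question's class; the validation rules are assumed, not given.
final class MaxAndMinValues {
    private final double minimum;
    private final double maximum;

    // The constructor is the only way in, so both values are always present and consistent.
    MaxAndMinValues(double minimum, double maximum) {
        if (Double.isNaN(minimum) || Double.isNaN(maximum)) {
            throw new IllegalArgumentException("values must not be NaN");
        }
        if (minimum > maximum) {
            throw new IllegalArgumentException("minimum must not exceed maximum");
        }
        this.minimum = minimum;
        this.maximum = maximum;
    }

    // Hypothetical factory: computes both values from the input in one pass.
    static MaxAndMinValues of(double... values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("at least one value is required");
        }
        double min = values[0];
        double max = values[0];
        for (double v : values) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        return new MaxAndMinValues(min, max);
    }

    double getMinimum() {
        return minimum;
    }

    double getMaximum() {
        return maximum;
    }

    public static void main(String[] args) {
        MaxAndMinValues range = MaxAndMinValues.of(3.0, -1.5, 8.25);
        System.out.println(range.getMinimum() + " .. " + range.getMaximum());
    }
}
```

None of these guarantees can be expressed with a Map<String, Double>: a caller can omit a key, misspell it, or store a minimum larger than the maximum.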
We used a custom class instead of a HashMap so that we could sort the entries by their values.
There is a Team object that contains a list of players, List<Player>. All teams need to be stored in a Teams collection.
Conditions:
1. If a new player needs to be added to a particular team, that team is retrieved from Teams and the player is added to that team's list of players.
2. Each Team object in the Teams collection needs to be unique based on the team name.
3. Team objects in the collection need to be sorted by team name.
Considerations:
In this scenario, when I use List<Team>, I can achieve 1 and 3, but uniqueness cannot be satisfied.
If I use TreeSet<Team>, 2 and 3 can be achieved. But as there is no get method on TreeSet, a particular team cannot be selected.
So I ended up using TreeMap<teamName, Team>. This makes 1, 2 and 3 all possible, but I don't think it's a good way to do it.
Which data structure is ideal for this use case? Preferably from Java collections.
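For reference, the TreeMap option described above could be sketched like this. The Team fields, the Teams wrapper class and its method names are assumptions for illustration, and players are represented as plain strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical Team type: a name plus its players.
class Team {
    final String name;
    final List<String> players = new ArrayList<>();

    Team(String name) {
        this.name = name;
    }
}

class Teams {
    // Keys are team names: unique (condition 2) and kept in sorted order (condition 3).
    private final TreeMap<String, Team> byName = new TreeMap<>();

    // Condition 1: look the team up by name, creating it on first use, then add the player.
    void addPlayer(String teamName, String player) {
        byName.computeIfAbsent(teamName, Team::new).players.add(player);
    }

    Team get(String teamName) {
        return byName.get(teamName);
    }

    List<String> teamNames() {
        return new ArrayList<>(byName.keySet());
    }

    public static void main(String[] args) {
        Teams teams = new Teams();
        teams.addPlayer("Wolves", "Ann");
        teams.addPlayer("Bears", "Bob");
        teams.addPlayer("Wolves", "Cid");
        System.out.println(teams.teamNames());
        System.out.println(teams.get("Wolves").players);
    }
}
```

Hiding the map behind a small class keeps the team name stored in the key and the name stored in the Team from drifting apart.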
You can utilize your TreeSet if you wish. However, if you're going to utilize the Set interface you can use remove(Object o) instead of get. You'll remove the object, make your modifications, then add it back into the set.
I think extending (i.e. creating a subclass from) ArrayList or LinkedList and overriding the set(), add(), addAll(), remove(), and removeRange() methods in such way that they ensure the uniqueness and sortedness conditions (invariants) would be a very clean design. You can also implement a binary search method in your class to quickly find a team with a given name.
ArrayList is a better choice to base your class on if you aren't going to add or remove teams too frequently. ArrayList would give you O(n) insertion and removal, but only O(log n) cost for finding a team by name and for checking uniqueness if you use binary search (where n is the number of elements in the list).
See the generics tutorial for subclassing generics.
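A minimal sketch of the sorted, duplicate-free list idea using Collections.binarySearch. Note the swapped-in technique: it wraps an ArrayList instead of subclassing it, so fewer methods have to be overridden to protect the invariants. SortedNames is a hypothetical name, and team names stand in for Team objects.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SortedNames {
    private final List<String> names = new ArrayList<>();

    // Inserts the name at its sorted position; returns false if it is already present.
    boolean add(String name) {
        int pos = Collections.binarySearch(names, name);
        if (pos >= 0) {
            return false; // duplicate: uniqueness is preserved
        }
        // When the key is absent, binarySearch returns -(insertion point) - 1.
        names.add(-pos - 1, name);
        return true;
    }

    // O(log n) lookup, thanks to the sorted order.
    boolean contains(String name) {
        return Collections.binarySearch(names, name) >= 0;
    }

    // Read-only view, so callers cannot break the ordering.
    List<String> asList() {
        return Collections.unmodifiableList(names);
    }

    public static void main(String[] args) {
        SortedNames teams = new SortedNames();
        teams.add("Wolves");
        teams.add("Bears");
        teams.add("Lions");
        teams.add("Bears"); // ignored, already present
        System.out.println(teams.asList());
    }
}
```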
How about using Guava's Multimap? More precisely, a SetMultimap. Specifically, a SortedSetMultimap. Even more specifically, its TreeMultimap implementation (1).
Explanations:
In a Multimap, a key points not to a single value, but rather to a Collection of values.
This means you can bind to a single Team key a collection of several Player values, so that's Req1 solved.
In a SetMultimap, the keys are unique (as in any Multimap), and the values under each key form a Set.
This gets your Req2 solved.
In a SortedSetMultimap, the values are also sorted.
While you don't specifically care for this, it's nice to have.
In a TreeMultimap, the key set and each key's collection of values are sorted.
This gets your Req3 sorted (See what I did there?)
Usage:
TreeMultimap<Team, Player> ownership = TreeMultimap.create(); // Team and Player must be Comparable
ownership.put(team1, playerA);
ownership.put(team1, playerB);
ownership.put(team2, playerC);
Collection<Player> playersOfTeamA = ownership.get(team1); // contains playerA, playerB
SortedSet<Team> allTeams = ownership.keySet(); // contains team1, team2
Gotchas:
Remember to set equals and hashCode correctly on your Team object to use its name.
Alternatively, you could use the static create(Comparator<? super K> keyComparator, Comparator<? super V> valueComparator) which provides a purpose-built comparison if you do not wish to change the natural ordering of Team. (use Ordering.natural() for the Player comparator to keep its natural ordering - another nice Guava thing). In any case, make sure it is compatible with equals!
Multimaps are not Maps, because putting a new value under a key does not remove the previously held value (that's the whole point), so make sure you understand that. (For instance, it still holds that you cannot put the same key-value pair twice...)
(1): I am unsure whether SortedSetMultimap is sufficient. Its Javadoc states that the values are sorted, but nothing is said about the keys. Does anyone know any better?
(2) I assure you, I'm not affiliated to Guava in any way. I just find it awesome!
Time and again, I find myself in the situation where I want to use a value, and add it to a collection at the same time, e.g.:
List<String> names = new ArrayList<>();
person1.setName(addTo(names, "Peter"));
person2.setName(addTo(names, "Karen"));
(Note: using java.util.Collection.add(E) doesn't work of course, because it returns a boolean.)
Sure, it's easy to write a utility method myself like:
public static <E> E addTo(Collection<? super E> coll, E elem) {
coll.add(elem);
return elem;
}
But is there really not something like this already in JavaSE, Commons Collections, Guava, or maybe some other "standard" library?
The following will work if you use Eclipse Collections:
MutableList<String> names = Lists.mutable.empty();
person1.setName(names.with("Peter").getLast());
person2.setName(names.with("Karen").getLast());
The with method returns the collection being added to so you can easily chain adds if you want to. By using getLast after calling with on a MutableList (which extends java.util.List) you get the element you just added.
Note: I am a committer for Eclipse Collections.
This looks like a very strange pattern to me. A line like person1.setName(addTo(names, "Peter")) seems inverted and is very difficult to properly parse:
An existing person object is assigned a name, that name will first be added to a list of names, and the name is "Peter".
Contrast that with (for example) person1.setName("Peter"); names.add(person1.getName());:
Make "Peter" the name of an existing person object, then add that name to a list of names.
I appreciate that it's two statements instead of one, but that's a very low cost relative to the unusual semantics you're proposing. The latter formatting is easier to understand, easier to refactor, and more idiomatic.
I would be willing to wager that many scenarios that might benefit from your addTo() method have other problems and would be better-served by a different refactoring earlier on.
At its core the issue seems to be that you're trying to represent a complex data type (Person) while simultaneously constructing an unrelated list consisting of a particular facet of those objects. A potentially more straightforward (and still fluent) option would be to construct a list of Person objects and then transform that list to extract the values you need. Consider:
List<Person> people = ImmutableList.of(new Person("Peter"), new Person("Karen"));
List<String> names = people.stream().map(Person::getName).collect(toList());
Notice that we no longer need the isolated person1 and person2 variables, and there's now a more direct relationship between people and names. Depending on what you need names for you might be able to avoid constructing the second list at all, e.g. with List.forEach().
If you're not on Java 8 yet you can still use a functional syntax with Guava's functional utilities. The caveat on that page is a worthwhile read too, even in Java-8-land.
I have a List that contains objects like an address, with fields such as city, street and name.
I always need to produce 3 lists from it: the first ordered by city, the second ordered by street, the third ordered by name. Obviously I could sort the same list three times in sequence, but that takes a lot of time.
Is it possible to create something like an Iterator that, depending on a parameter, returns all members of the collection in the corresponding order? Or is there another solution?
Thanks.
Not with the standard Java collections framework. For what you are asking to work, you have to build different search trees that operate on the same collection.
It is like having indices on different columns in a DB.
You can build 3 Comparators- 1 based on each criteria item, and then call Collections.sort(List, Comparator) to get each sorting.
If you're doing this numerous times on the same list, make 3 copies.
Note that the iterator idea wouldn't really be terribly efficient as it would need some way to trace through the list that would be at least as ugly as doing the sort. That is, the iterator would need to know which item comes next, in some way other than the natural order of the list. It's just as easy to re-sort the list as try to navigate it out of natural order.
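The three-comparators approach could look like the sketch below. The Address fields, the AddressViews class and the comparator constants are assumed names for illustration. Each view is a sorted copy, so the original list keeps its order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical Address type with the three fields from the question.
class Address {
    final String city;
    final String street;
    final String name;

    Address(String city, String street, String name) {
        this.city = city;
        this.street = street;
        this.name = name;
    }
}

class AddressViews {
    static final Comparator<Address> BY_CITY = Comparator.comparing((Address a) -> a.city);
    static final Comparator<Address> BY_STREET = Comparator.comparing((Address a) -> a.street);
    static final Comparator<Address> BY_NAME = Comparator.comparing((Address a) -> a.name);

    // Copies the list, then sorts the copy; the source list is left untouched.
    static List<Address> sortedCopy(List<Address> source, Comparator<Address> order) {
        List<Address> copy = new ArrayList<>(source);
        copy.sort(order);
        return copy;
    }

    public static void main(String[] args) {
        List<Address> addresses = Arrays.asList(
                new Address("Oslo", "Main St", "Zed"),
                new Address("Bern", "Oak Ave", "Amy"),
                new Address("Lima", "Elm Rd", "Kai"));
        System.out.println(sortedCopy(addresses, BY_CITY).get(0).city);
        System.out.println(sortedCopy(addresses, BY_STREET).get(0).street);
        System.out.println(sortedCopy(addresses, BY_NAME).get(0).name);
    }
}
```

The copies share the same Address objects, so only the list structure is duplicated, not the data.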
Often, I have a list of objects. Each object has properties. I want to extract a subset of the list where a specific property has a predefined value.
Example:
I have a list of User objects. A User has a homeTown. I want to extract all users from my list with "Springfield" as their homeTown.
I normally see this accomplished as follows:
List<User> users = getTheUsers();
List<User> returnList = new ArrayList<>();
for (User user : users) {
    if ("springfield".equalsIgnoreCase(user.getHomeTown()))
        returnList.add(user);
}
I am not particularly satisfied with this solution. Yes, it works, but it seems so slow. There must be a non-linear solution.
Suggestions?
Well, this operation is linear in nature unless you do something extreme like index the collection based on properties you expect to examine in this way. Short of that, you're just going to have to look at each object in the collection.
But there may be some things you can do to improve readability. For example, Groovy provides an each() method for collections. It would allow you to do something like this...
def returnList = new ArrayList();
users.each() {
    if ("springfield".equalsIgnoreCase(it.getHomeTown()))
        returnList.add(it);
};
You will need a custom solution for this. Create a custom collection that implements the List interface and add all elements from the original list to it.
Internally, this custom List class needs to maintain a Map per attribute that lets you look up values as needed. To populate these Maps you will have to use introspection to find all fields and their values.
The custom class will have to implement a method such as List findAllBy(String propertyName, String propertyValue); that uses the maps above to look up those values.
This is not an easy, straightforward solution. Furthermore, you will need to consider nested attributes like "user.address.city". Making this custom List immutable will help a lot.
However, even if you are iterating over thousands of objects in a List, plain iteration will still be fast, so you are better off just iterating the List for what you need.
As I have found out, if you are using a list, you have to iterate. Whether its a for-each, lambda, or a FindAll - it is still being iterated. No matter how you dress up a duck, it's still a duck. As far as I know there are HashTables, Dictionaries, and DataTables that do not require iteration to find a value. I am not sure what the Java equivalent implementations are, but maybe this will give you some other ideas.
If you are really interested in performance here, I would also suggest a custom solution. My suggestion would be to create a Tree of Lists in which you can sort the elements.
If you are not interested in the ordering of the elements inside your list (and most people usually are not), you could also use a TreeMap (or HashMap) with the homeTown as key and a List of all matching entries as value. When you add a new element, just look up the list it belongs to in the Map and append to it (for the first element, of course, you need to create the list first). If you want to delete an element, simply do the same.
When you want a list of all users with a given homeTown, you just need to look up that list in the Map and return it (no copying of elements needed). I am not 100% sure about the Map implementations in Java, but the complete method should run in constant time (worst case logarithmic, depending on the Map implementation).
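The map-of-lists idea from this answer, sketched under assumptions: User, UserIndex and findByHomeTown are hypothetical names, and the key is lower-cased to mirror the case-insensitive comparison in the question.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical User type with only the fields needed here.
class User {
    final String name;
    final String homeTown;

    User(String name, String homeTown) {
        this.name = name;
        this.homeTown = homeTown;
    }
}

class UserIndex {
    // home town (lower-cased) -> all users from that town
    private final Map<String, List<User>> byTown = new HashMap<>();

    void add(User user) {
        // Creates the bucket on first use, then appends.
        byTown.computeIfAbsent(key(user.homeTown), k -> new ArrayList<>()).add(user);
    }

    // One hash lookup instead of a scan over every user.
    List<User> findByHomeTown(String town) {
        List<User> bucket = byTown.get(key(town));
        return bucket == null
                ? Collections.<User>emptyList()
                : Collections.unmodifiableList(bucket);
    }

    private static String key(String town) {
        return town.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        UserIndex index = new UserIndex();
        index.add(new User("Homer", "Springfield"));
        index.add(new User("Peter", "Quahog"));
        index.add(new User("Lisa", "springfield"));
        System.out.println(index.findByHomeTown("SPRINGFIELD").size()); // prints 2
    }
}
```

Building the index is still linear, so this only pays off when the same collection is queried many times.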
I ended up using Predicates. Its readability looks similar to Drew's suggestion.
As far as performance is concerned, I found negligible speed improvements for small (< 100 items) lists. For larger lists (5k-10k), I found 20-30% improvements. Medium lists had benefits, but not quite as large as those for bigger lists. I did not test super large lists, but my testing made it seem that the larger the list, the better the results in comparison to the foreach process.
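The answer does not say which predicate library was used. As one possible reading, here is a sketch with java.util.function.Predicate and streams; User, UserFilters, livesIn and select are assumed names. It still visits every element, so it changes readability, not the linear nature of the search.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical User type with only the fields needed here.
class User {
    final String name;
    final String homeTown;

    User(String name, String homeTown) {
        this.name = name;
        this.homeTown = homeTown;
    }
}

class UserFilters {
    // A reusable, named condition instead of an inline if-statement.
    static Predicate<User> livesIn(String town) {
        return user -> town.equalsIgnoreCase(user.homeTown);
    }

    static List<User> select(List<User> users, Predicate<User> condition) {
        return users.stream().filter(condition).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<User> users = Arrays.asList(
                new User("Homer", "Springfield"),
                new User("Peter", "Quahog"),
                new User("Lisa", "springfield"));
        for (User user : select(users, livesIn("springfield"))) {
            System.out.println(user.name);
        }
    }
}
```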