I am currently struggling with the concept of having a Hashtable as value in a key-value pair of another Hashtable.
Hashtable<Key, Hashtable<Key, Value>> table;
In my current project I require a way to group data 2 times, kinda like a node-structure (treeview). Here's a simple example of the kind of data I want to store:
group1
  element1
  element2
group2
  element3
  element4
  element5
The only things that came to my mind are using the above-mentioned Hashtable construct or creating a "node-collection" which would group my data as explained. (Does such a "node-collection" exist in the Java API?)
Is it favorable to use the Hashtable-idea over the node-collection-idea?
Why shouldn't it be proper? It's a pretty normal way to do things, though as things get more complex you might want to start resorting to designing your own classes to use as data structures.
The advantage of generic structures is that they serve a number of needs out of the box; the disadvantage is that they are hard to read and carry no semantics. Compare these two declarations:
Hashtable<Key1,Hashtable<Key2,Value2>>
Hashtable<Key,BoughtItemsMap>
And that's just a two-level grouping structure. You'd better not even imagine what a three-level one would look like:
Hashtable<Key1,Hashtable<Key2,Hashtable<Key3,Value3>>>
Is it favorable to use the Hashtable-idea over the node-collection-idea?
It depends on your needs: a Hashtable (or better, a Map) is used to map keys to values. A Collection, by contrast, does not map anything; it just contains values.
So, if the 2nd level of your structure does not need mapping, a Collection should be enough. Something like this:
class MyCollectionOfElements extends ArrayList<Element>{...}
Map<Key, MyCollectionOfElements> map=new HashMap<Key, MyCollectionOfElements>();
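As a rough sketch of the "map of collections" option, the example data from the question could be held like this. The class name `GroupedElements`, its methods, and the use of `String` for group names and elements are assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical holder for the group -> elements data from the question.
class GroupedElements {
    // LinkedHashMap keeps the groups in insertion order, as a tree view would show them.
    private final Map<String, List<String>> groups = new LinkedHashMap<>();

    void add(String group, String element) {
        List<String> elements = groups.get(group);
        if (elements == null) {
            // First element of this group: create its list.
            elements = new ArrayList<>();
            groups.put(group, elements);
        }
        elements.add(element);
    }

    // Returns a copy so callers cannot modify the internal lists.
    List<String> elementsOf(String group) {
        List<String> elements = groups.get(group);
        return elements == null ? new ArrayList<>() : new ArrayList<>(elements);
    }

    // Builds the sample data shown in the question.
    static GroupedElements sample() {
        GroupedElements data = new GroupedElements();
        data.add("group1", "element1");
        data.add("group1", "element2");
        data.add("group2", "element3");
        data.add("group2", "element4");
        data.add("group2", "element5");
        return data;
    }
}
```

The second level is a plain list here because the elements are only enumerated, never looked up by key.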
I have a piece of code that returns the min and max values from some input it takes. What are the benefits of using a custom class with a minimum and a maximum field over using a map that holds these two values?
//this is the class that holds the min and max values
public class MaxAndMinValues {
private double minimum;
private double maximum;
//rest of the class code omitted
}
//this is the map that holds the min and max values
Map<String, Double> minAndMaxValuesMap;
The most apparent answer would be the object-oriented aspects, like the possibility to combine data with functionality and the possibility to derive from that class.
But let's assume for the moment that this is not a major factor. Your example is so simple that I wouldn't use a Map either. What I would use is the Pair class from Apache Commons: https://commons.apache.org/proper/commons-lang/javadocs/api-3.1/org/apache/commons/lang3/tuple/Pair.html
(ImmutablePair):
https://commons.apache.org/proper/commons-lang/javadocs/api-3.1/org/apache/commons/lang3/tuple/ImmutablePair.html
The Pair class is generic and has two type parameters, one for each field. You can basically define a Pair of something and get type safety, IDE support, autocompletion, and the big benefit of knowing what is inside. A Pair also offers things that a Map cannot; for example, a Pair is potentially Comparable. See also ImmutablePair if you want to use it as a key in another Map.
public Pair<Double, Double> foo(...) {
// ...
Pair<Double, Double> range = Pair.of(minimum, maximum);
return range;
}
The big advantage of this class is that the type you return exposes the contained types. So if you need to, you can return values of different types from a single method call (without using a map or a complicated inner class).
e.g. Pair<String, Double> or Pair<String, List<Double>>...
In a simple situation where you just need to store the min and max value from user input, your custom class is a better fit than a Map. The reason: in Java, a Map can be a HashMap, a LinkedHashMap or a TreeMap, and each of them spends some time putting your data into its structure and again when you get a value back out. So in the simple case you described, just use your custom class. Moreover, you can write methods in your class to process the user input, which a Map cannot do for you.
I would look at this from the perspective of how a programming language is used. In any language there are multiple ways to achieve a result (easy, bad, complicated, performant ...). For an object-oriented language like Java, this question is really about the design of your solution.
Think of accessibility.
The values in a Map are effectively public: you can modify the contents as you like from any part of the code. If you had a condition that min and max must be in the range [-100, 100] and some part of your code inserts 200 into the map, you have a bug. OK, we can cover that with a validation, but how many copies of that validation would you write? With an object, there is always the possibility of encapsulation.
Think of re-use.
If you had the same requirement in another place in the code, you would have to rewrite the map logic again (probably with all the validations). Doesn't look good, right?
Think of extensibility.
If you wanted one more value, like the median or the average, you would either have to dirty the map with bad keys or create a new map. An object is always easy to extend.
So it all comes down to design. If you think it's a one-time use, a map will probably do (though it's not a standard design anyway; a map should contain one kind of data, both technically and functionally).
Last but not least, think of code readability and cognitive complexity. Objects with relevant responsibilities will always be better than unclear generic storage.
Hope I made some sense!
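The accessibility point above could be sketched roughly like this. The class name `BoundedRange` and its methods are hypothetical; the [-100, 100] limit is the one from the example in this answer.

```java
// Hypothetical value class: the range check lives in exactly one place.
final class BoundedRange {
    private static final double LOWER_LIMIT = -100.0;
    private static final double UPPER_LIMIT = 100.0;

    private final double minimum;
    private final double maximum;

    BoundedRange(double minimum, double maximum) {
        // Reject values outside [-100, 100] and ranges where min exceeds max.
        if (minimum < LOWER_LIMIT || maximum > UPPER_LIMIT || minimum > maximum) {
            throw new IllegalArgumentException(
                "expected -100 <= min <= max <= 100, got " + minimum + " and " + maximum);
        }
        this.minimum = minimum;
        this.maximum = maximum;
    }

    double getMinimum() {
        return minimum;
    }

    double getMaximum() {
        return maximum;
    }

    // A later extension (here: the midpoint) does not disturb existing callers.
    double midpoint() {
        return (minimum + maximum) / 2.0;
    }
}
```

Because the fields are final and only set in the constructor, no other part of the code can put a 200 into an existing instance.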
The benefit is simple: it makes your code clearer and more robust.
The MaxAndMinValues name and its class definition (two fields) convey a min and a max value, but above all the class ensures that it accepts only these two things, and its API is self-explanatory about how to store and get values.
Map<String, Double> minAndMaxValuesMap also conveys the idea that a min and a max value are stored in it, but it has multiple drawbacks in terms of design:
we don't know how to retrieve the values without looking at how they were added.
Related to that, how should we name the keys when we add entries to the map? The String type is too broad for the key. For example "MIN", "min" and "Minimum" will all be accepted. An enum would solve this issue, but not the others.
we cannot ensure that both values (min and max) were added (whereas a constructor with arguments can do that).
we can add any other value to the map, since it is a Map and not a fixed data structure.
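To illustrate the enum remark in the list above, here is a small sketch with a hypothetical `Bound` enum as the key type (the class and method names are made up as well). It removes the key-spelling problem, but nothing forces both entries to be present.

```java
import java.util.EnumMap;
import java.util.Map;

class EnumKeyedRange {
    // Hypothetical enum replacing the free-form String keys.
    enum Bound { MIN, MAX }

    // Builds a map in which only MIN was stored, to show what the compiler lets through.
    static Map<Bound, Double> onlyMinStored() {
        Map<Bound, Double> range = new EnumMap<>(Bound.class);
        range.put(Bound.MIN, 15.0);
        // Nothing stops us from forgetting MAX; get(Bound.MAX) then yields null.
        return range;
    }
}
```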
Beyond the idea of clearer code in general, I would add that if MaxAndMinValues were used only as an implementation detail inside a specific method or in a lambda, using a Map or even an array {15F, 20F} would be acceptable. But if the data is passed around between methods, you have to make its meaning as clear as possible.
We used a custom class over a HashMap so that we could sort the entries by their values.
A HashSet is backed by a HashMap. From its JavaDoc:
This class implements the Set interface, backed by a hash table
(actually a HashMap instance)
When taking a look at the source we can also see how they relate to each other:
// Dummy value to associate with an Object in the backing Map
private static final Object PRESENT = new Object();
public boolean add(E e) {
return map.put(e, PRESENT)==null;
}
Therefore a HashSet<E> is backed by a HashMap<E,Object>. For all HashSets in our application we have one reference object PRESENT that we use as the value in the HashMap. While the memory needed to store PRESENT is negligible, we still store a reference to it for each entry in the map.
Would it not be more efficient to use null instead of PRESENT? A further consideration then is should we forgo the HashSet altogether and directly use a HashMap, given the circumstance permits the use of a Map instead of a Set.
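One detail is worth checking before swapping PRESENT for null: the add method shown above relies on put returning null only when the key was absent. A small experiment with a plain HashMap (the class name `NullValueDemo` and the helper `addWithMarker` are made up for this sketch) shows that a null value makes the two cases indistinguishable:

```java
import java.util.HashMap;
import java.util.Map;

class NullValueDemo {
    // Mimics the add method quoted above, but stores the given marker instead of PRESENT.
    static boolean addWithMarker(Map<String, Object> map, String e, Object marker) {
        return map.put(e, marker) == null;
    }

    public static void main(String[] args) {
        Map<String, Object> withNull = new HashMap<>();
        System.out.println(addWithMarker(withNull, "a", null)); // true: first insert
        System.out.println(addWithMarker(withNull, "a", null)); // true: duplicate looks like a new element

        Object present = new Object();
        Map<String, Object> withMarker = new HashMap<>();
        System.out.println(addWithMarker(withMarker, "a", present)); // true: first insert
        System.out.println(addWithMarker(withMarker, "a", present)); // false: duplicate detected
    }
}
```

With null as the stored value, add would need an extra containsKey check to report duplicates correctly.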
My basic problem that triggered these thoughts is the following situation: I have a collection of objects with the following properties:
big collection of objects > 30'000
Insertion order is not relevant
Efficient check if an item is contained
Adding new items to the collection is not relevant
The chosen solution should perform optimally with respect to the above criteria and minimize memory consumption. On this basis the data structures HashSet and HashMap spring to mind. When thinking about alternative approaches, the key question is:
How to check containment efficiently?
The only answer that comes to my mind is using the item's hash to calculate the storage location. I might be missing something here. Are there any other approaches?
I had a look at various questions that shed some light on the issue but did not quite answer my question:
Java : HashSet vs. HashMap
clarifying facts behind Java's implementation of HashSet/HashMap
Java HashSet vs HashMap
I am not looking for suggestions of alternative libraries or frameworks to address this; I want to understand whether there is another way to think about efficient containment checking of an element in a Collection.
In short, yes, you should use HashSet. It might not be the most efficient Set implementation possible, but that hardly ever matters unless you are working with huge amounts of data.
In that case, I would suggest using specialized libraries: EnumMaps if you can use enums, primitive maps like Trove if your data is mostly primitives, other data structures that are optimized for certain data types, or even an in-memory database.
Don't get me wrong, I'm someone who likes performance tuning too, but replacing the built-in data structures should only be done when it's really necessary. For most cases, they work perfectly fine.
What you could do, in case you really want to save the last bit of memory and do not care about inserting, is using a fixed-sized array, sorting that and doing a binary search every time. But I doubt that it's more efficient than a HashSet.
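A minimal sketch of that sorted-array idea, assuming the items can be represented as int values (the question does not say what the objects are, and the class name `SortedArrayLookup` is hypothetical):

```java
import java.util.Arrays;

// Hypothetical read-only lookup structure: sort once, then binary-search.
class SortedArrayLookup {
    private final int[] sorted;

    SortedArrayLookup(int[] values) {
        // Copy so the caller's array is left untouched, then sort once up front.
        this.sorted = values.clone();
        Arrays.sort(this.sorted);
    }

    boolean contains(int value) {
        // binarySearch returns a non-negative index only when the value is found.
        return Arrays.binarySearch(sorted, value) >= 0;
    }
}
```

Each lookup takes logarithmic time, and the only memory used is the array itself.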
Hashtables and HashSets are used in entirely different ways, so maybe the two shouldn't be compared in terms of "which is more efficient". A HashSet is more suitable for the mathematical "set" (e.g. {1,2,3,4}): it contains no duplicates and allows only one null element. A HashMap is more of a key -> value pair system: it allows multiple null values as well as duplicate values, just not duplicate keys. I know this is probably answering "what is the difference between a Hashtable and a HashSet", but my point is that they really can't be compared.
I used a HashMap to store data.
The problem is that I just noticed a HashMap can't hold the same key more than once.
What else should I use to store data that looks like this:
Name1 100.0
Name2 99.8
Name3 121.5
...
The other thing I'm trying to do is show the data of one particular person when I look up that key.
So, is there a way to store more than one value for one key, or should I use another type of storage?
A HashMap can effectively hold duplicate keys if you store the values in another data structure, such as a linked list or a tree, at each key. Then you just have to decide how to handle the collisions.
Edit:
HashMap
["firstKey"] => LinkedList of (3,4,5)
["secondKey"] => null
["thirdKey"] => LinkedList of (3)
To extend Matthew Cox's answer, you could extend the Hashtable class so that it automatically manages the lists for you and gives the appearance of having the same key multiple times.
The Google Guava library contains some collection types that allow more than one element per key. Multimap is the first one that comes to mind.
http://docs.guava-libraries.googlecode.com/git-history/release/javadoc/com/google/common/collect/Multimap.html
Guava in general contains a lot of very convenient utilities; I think it's worth checking out.
If you can't use an external library, you can simply (like Matthew Cox said) combine a Map and a List as Map<K, List<V>>. But that is a bit more inconvenient to work with, since you have to initialise a list for every key.
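With Java 8 or later, the list initialisation can be folded into a single call to `Map.computeIfAbsent`. A minimal sketch, assuming names map to `Double` scores as in the question (the class name `ScoreBook` and its methods are hypothetical):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical wrapper around a Map<K, List<V>> that hides the list handling.
class ScoreBook {
    private final Map<String, List<Double>> scoresByName = new HashMap<>();

    void addScore(String name, double score) {
        // Creates the list on first use of a name, then appends to it.
        scoresByName.computeIfAbsent(name, k -> new ArrayList<>()).add(score);
    }

    // Returns the stored scores, or an empty list for an unknown name.
    List<Double> scoresOf(String name) {
        return scoresByName.getOrDefault(name, new ArrayList<>());
    }
}
```

Calling `addScore("Name1", 100.0)` and then `addScore("Name1", 99.8)` leaves both values under the same key.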
I'd rather go with my own datamodel and store that in a list, or map if you want fast access, e.g.
public class Player {
private String name;
private List<Float> scores;
}
The advantages:
you can easily see, what the structure wants to express
you can easily extend it (e.g. add aliases for the player, or calculate the average score of player 1)
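A slightly fleshed-out variant of the class above might look like this. The constructor, `addScore` and `averageScore` are assumptions added for illustration, as is the choice to report 0 for a player without scores.

```java
import java.util.ArrayList;
import java.util.List;

class Player {
    private final String name;
    private final List<Float> scores = new ArrayList<>();

    Player(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    void addScore(float score) {
        scores.add(score);
    }

    // Average of all recorded scores; 0.0 when there are none (an assumption of this sketch).
    double averageScore() {
        if (scores.isEmpty()) {
            return 0.0;
        }
        double sum = 0.0;
        for (float score : scores) {
            sum += score;
        }
        return sum / scores.size();
    }
}
```

For fast access by name, instances could then be kept in a `Map<String, Player>`.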
I am working on an application with a number of custom data classes. I am taking input for my application from 2 different places and want to cross-reference between the two to help ensure the data is accurate.
I have a Map.Entry<String,HashMap<String, Integer>> object called chromosome, where each value is called a marker.
I also have a custom object called IndividualList individuals which extends HashMap<Integer,Individual> where each Individual has a method Genotype getGenotype() which returns the non-static variable genotype. Genotype extends HashMap<String,String[]>
I want to look at the key of each of my marker objects and check whether it is present as a key in any Individual's genotype. Every Individual has the same keys in its genotype, so I only need to test one Individual.
The problem I am facing is which Individual to test. Because individuals is a HashMap, I cannot simply pick the first element. What I am doing at the moment is taking the values of individuals as a Collection, converting them to an ArrayList<Individual>, and taking the first element (an arbitrary one, since HashMap is unordered). I then take this Individual's genotype and compare marker.getKey() with the keys in the genotype, like so:
for(Map.Entry<String, MarkerPosition> marker : chromosome.getValue().entrySet())
if(!(new ArrayList<Individual>(individuals.values()).get(0)
.getGenotype().containsKey(marker.getKey())))
errors.add("Marker " + marker.getKey() + " is not present in genotype");
But as you can see, this is horrid and ugly and far too complicated, so I was wondering if there is a much simpler way of achieving what I want that I am missing.
Thanks!
Why can you not arbitrarily choose the first element of a HashMap?
individuals.entrySet().iterator().next()
individuals.values().iterator().next()
This will probably be the same entry each time. You should make sure the map is not empty to avoid an exception.
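The emptiness check can be wrapped in a small helper. This is only a sketch; the class name `ArbitraryEntry` and the method `anyValue` are made up, and returning an `Optional` is one possible way to signal an empty map.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.Optional;

class ArbitraryEntry {
    // Returns some value of the map, or an empty Optional if the map has no entries.
    // A null value stored in the map also comes back as an empty Optional.
    static <K, V> Optional<V> anyValue(Map<K, V> map) {
        Iterator<V> it = map.values().iterator();
        return it.hasNext() ? Optional.ofNullable(it.next()) : Optional.empty();
    }
}
```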
...This question is really confusingly phrased and difficult to understand, but I'm not clear on why you don't just use
individuals.values().iterator().next()
instead of new ArrayList<Individual>(individuals.values()).get(0).
(If you can use third-party libraries, your code would probably be significantly clearer overall if you used a Guava Table, which is a general-purpose, significantly "cleaner" replacement for a Map<K1, Map<K2, V>>. Disclosure: I contribute to Guava.)
Often, I have a list of objects. Each object has properties. I want to extract a subset of the list where a specific property has a predefined value.
Example:
I have a list of User objects. A User has a homeTown. I want to extract all users from my list with "Springfield" as their homeTown.
I normally see this accomplished as follows:
List<User> users = getTheUsers();
List<User> returnList = new ArrayList<>();
for (User user : users) {
    if ("springfield".equalsIgnoreCase(user.getHomeTown()))
        returnList.add(user);
}
I am not particularly satisfied with this solution. Yes, it works, but it seems so slow. There must be a non-linear solution.
Suggestions?
Well, this operation is linear in nature unless you do something extreme like index the collection based on properties you expect to examine in this way. Short of that, you're just going to have to look at each object in the collection.
But there may be some things you can do to improve readability. For example, Groovy provides an each() method for collections. It would allow you to do something like this...
def returnList = new ArrayList();
users.each() {
    if ("springfield".equalsIgnoreCase(it.getHomeTown()))
        returnList.add(it);
};
You will need a custom solution for this. Create a custom collection that implements the List interface and add all elements from the original list to it.
Internally, this custom List class needs to maintain a collection of Maps, one per attribute, that lets you look up values as needed. To populate these Maps you will have to use introspection to find all fields and their values.
This custom object will have to implement a method such as List findAllBy(String propertyName, String propertyValue); that uses the above hash maps to look up those values.
This is not an easy, straightforward solution. Furthermore, you will need to consider nested attributes like "user.address.city". Making this custom List immutable will help a lot.
However, even if your List holds thousands of objects, iterating it will still be fast, so you are better off simply iterating the List for what you need.
As I have found out, if you are using a list, you have to iterate. Whether it's a for-each, a lambda, or a FindAll, it is still being iterated. No matter how you dress up a duck, it's still a duck. As far as I know there are HashTables, Dictionaries, and DataTables that do not require iteration to find a value. I am not sure what the equivalent Java implementations are, but maybe this will give you some other ideas.
If you are really interested in performance here, I would also suggest a custom solution. My suggestion would be to create a Tree of Lists in which you can sort the elements.
If you are not interested in the ordering of the elements inside your list (and most people usually are not), you could also use a TreeMap (or HashMap) with the homeTown as key and a List of all matching entries as value. If you add a new element, just look up the corresponding list in the Map and append to it (if it is the first element for that key, you of course need to create the list first). To delete an element, do the same.
When you want a list of all users with a given homeTown, you just look up that list in the Map and return it (no copying of elements needed). I am not 100% sure about the Map implementations in Java, but the whole method should run in constant time (worst case logarithmic, depending on the Map implementation).
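The map-of-lists idea could look roughly like the following. The class `UserIndex`, its nested stand-in `User` class, and the lower-casing of keys (to approximate the equalsIgnoreCase comparison from the question) are all assumptions of this sketch; it also assumes homeTown is never null.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

class UserIndex {
    // Minimal stand-in for the User class from the question.
    static final class User {
        private final String name;
        private final String homeTown;

        User(String name, String homeTown) {
            this.name = name;
            this.homeTown = homeTown;
        }

        String getName() {
            return name;
        }

        String getHomeTown() {
            return homeTown;
        }
    }

    private final Map<String, List<User>> byHomeTown = new HashMap<>();

    void add(User user) {
        // Create the list for a new home town on first use, then append.
        byHomeTown.computeIfAbsent(key(user.getHomeTown()), k -> new ArrayList<>()).add(user);
    }

    // A single map lookup; no scan over all users.
    List<User> findByHomeTown(String homeTown) {
        return byHomeTown.getOrDefault(key(homeTown), Collections.emptyList());
    }

    // Normalises the key so that lookups ignore case.
    private static String key(String homeTown) {
        return homeTown.toLowerCase(Locale.ROOT);
    }
}
```

The price is extra memory for the index and the need to keep it in sync whenever a user is added, removed, or moves.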
I ended up using Predicates. The readability is similar to Drew's suggestion.
As far as performance is concerned, I found negligible speed improvements for small (< 100 items) lists. For larger lists (5k-10k), I found 20-30% improvements. Medium lists had benefits, but not quite as large as bigger lists. I did not test very large lists, but my testing suggested that the larger the list, the better the results compared to the foreach loop.
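The post does not say which Predicate type was used. As one possible reading, here is a sketch with `java.util.function.Predicate` and streams from Java 8; the helper class `PredicateFilter` is hypothetical, and the sketch makes no claim about the timings reported above.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class PredicateFilter {
    // Returns a new list with the items that satisfy the condition, in their original order.
    static <T> List<T> filter(List<T> items, Predicate<? super T> condition) {
        return items.stream().filter(condition).collect(Collectors.toList());
    }
}
```

A call such as `PredicateFilter.filter(users, u -> "springfield".equalsIgnoreCase(u.getHomeTown()))` would then replace the explicit loop, although every element is still visited once.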