I am looking for some kind of map with a fixed size, for example 20 entries. On top of that, I want to keep only the lowest values: say I'm evaluating some function and inserting the results into my map (I need a map because I have to keep key-value pairs), but I want to retain only the 20 lowest results. I was thinking about sorting and then removing the last element, but I need to do this for millions of records, so sorting every time I add a value is not efficient. Maybe there is a better way?
Thanks for help.
There is no built-in data structure for this in Java. You can try looking for one in the Guava library. Otherwise, think about using a LinkedHashMap or a TreeMap. You can wrap it in your own class to take care of the size limit.
If you care about efficiency, be advised that TreeMap is a red-black tree internally, so put() has a time complexity of O(log n).
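One way to sketch the "wrap it in your own class" idea is shown below. Note that it swaps in a PriorityQueue (a max-heap on the value) instead of a TreeMap, so that two keys with equal values do not collide. The class name LowestN and its methods are made up for illustration, values are assumed to be doubles, and each key is assumed to be offered only once.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

/** Hypothetical helper: keeps only the `capacity` entries with the lowest values. */
final class LowestN<K> {
    private final int capacity;
    // Max-heap on value: the head is the largest of the values currently kept.
    private final PriorityQueue<Map.Entry<K, Double>> heap;

    LowestN(int capacity) {
        if (capacity < 1) {
            throw new IllegalArgumentException("capacity must be >= 1");
        }
        this.capacity = capacity;
        Comparator<Map.Entry<K, Double>> byValue = Map.Entry.comparingByValue();
        this.heap = new PriorityQueue<>(capacity, byValue.reversed());
    }

    /** Offers one result; costs O(log capacity) when kept and O(1) when rejected. */
    void offer(K key, double value) {
        if (heap.size() < capacity) {
            heap.add(new SimpleImmutableEntry<>(key, value));
        } else if (value < heap.peek().getValue()) {
            heap.poll(); // evict the largest value currently kept
            heap.add(new SimpleImmutableEntry<>(key, value));
        }
    }

    /** Returns the kept entries sorted by ascending value. */
    List<Map.Entry<K, Double>> sorted() {
        List<Map.Entry<K, Double>> out = new ArrayList<>(heap);
        out.sort(Map.Entry.comparingByValue());
        return out;
    }
}
```

With a capacity of 20, each of the millions of results costs at most a heap operation on 20 elements, and most results are rejected after a single comparison with the current maximum.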
Related
I am trying to implement binary search into my application.
I am creating a method to go through the user's contact list, add the numbers to an array, sort it and then use a binary search to locate numbers etc.
But I was wondering what kind of structure to use: should I just use an ArrayList, sort it, and then implement a binary search?
Or is there a better way to store the data, like sets or maps?
Scenario - I'll be getting the users contacts from their phone. Every number, of course, needs to be stored in an array or list (whichever is better).
Then sort that array.
Now I want to search for a number using a Binary search. Since a user can have a large contact set, I thought this would be a good method
There are three basic options:
Sorted list or array + binary search.
Tree-based structure like TreeMap.
Hash-based structure like HashMap.
The question is why you need binary search. If you simply want to look up contact info by number, then a HashMap would probably be a better choice from time complexity perspective.
Binary search would make sense if you have some order in keys and are interested in something like range queries. But even in this case a tree-based structure like TreeMap would be a better choice. Not so much for the time complexity (that will be pretty much the same) but more from the interface point of view.
I would suggest using a HashMap, since it has expected O(1) look-up versus O(log n) for a binary search in a sorted array.
So if your main concern is look-up (search), go for a hash-based structure.
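A minimal sketch of the hash-based option for the contacts case might look like this. The class ContactIndex, its method names, and the normalization rule (keep only digits and '+') are assumptions made for the example, not anything from the question.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical contact index keyed by phone number. */
final class ContactIndex {
    private final Map<String, String> nameByNumber = new HashMap<>();

    /** Stores a contact under a normalized form of its number. */
    void add(String number, String name) {
        nameByNumber.put(normalize(number), name);
    }

    /** Expected O(1) look-up; returns null when the number is unknown. */
    String lookup(String number) {
        return nameByNumber.get(normalize(number));
    }

    // Assumed normalization: drop everything except digits and '+'.
    static String normalize(String number) {
        return number.replaceAll("[^0-9+]", "");
    }
}
```

No sorting step is needed here; the sorted list or TreeMap only pays off when you need the numbers in order or want range queries.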
I have two lists of phone numbers. 1st list is a subset of 2nd list. I ran two different algorithms below to determine which phone numbers are contained in both of two lists.
Way 1:
Sort the 1st list: Arrays.sort(FirstList);
Loop over the 2nd list to find matching elements: if Arrays.binarySearch(FirstList, 'each of 2nd list') >= 0 then OK
Way 2:
Convert the 1st list into a HashMap whose key/value pairs are ('each of 1st list', Boolean.TRUE)
Loop over the 2nd list to find matching elements: if FirstList.containsKey('each of 2nd list') then OK
As a result, Way 2 ran in 5 seconds, considerably faster than Way 1 at 39 seconds. I can't understand why.
I would appreciate any comments.
Because hashing is O(1) and binary searching is O(log N).
HashMap relies on a very efficient algorithm called 'hashing' which has been in use for many years and is reliable and effective. Essentially the way it works is to split the items in the collection into much smaller groups which can be accessed extremely quickly. Once the group is located a less efficient search mechanism can be used to locate the specific item.
Identifying the group for an item occurs via an algorithm called a 'hashing function'. In Java the hashing method is Object.hashCode() which returns an int representing the group. As long as hashCode is well defined for your class you should expect HashMap to be very efficient which is exactly what you've found.
There's a very good discussion on the various types of Map and which to use at Difference between HashMap, LinkedHashMap and TreeMap
My shorthand rule-of-thumb is to always use HashMap unless you can't define an appropriate hashCode for your keys or the items need to be ordered (either natural or insertion).
Look at the source code for HashMap: it creates and stores a hash for each added (key, value) pair, then the containsKey() method calculates a hash for the given key, and uses a very fast operation to check if it is already in the map. So most retrieval operations are very fast.
Way 1:
Sorting: around O(nlogn)
Search: around O(logn)
Way 2:
Creating HashTable: O(n) for small density (no collisions)
Contains: O(1)
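The two ways can be put side by side roughly as follows. This is only a sketch: the class and method names are invented, and Way 2 uses a HashSet rather than a HashMap with Boolean.TRUE values, which behaves the same for membership tests. Note that Arrays.binarySearch returns a negative number when the element is absent, so the match test is ">= 0".

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical comparison of the two approaches from the question. */
final class CommonNumbers {
    /** Way 1: sort the first list once, then binary-search it for each element of the second. */
    static int countWithBinarySearch(String[] first, String[] second) {
        String[] sorted = first.clone();
        Arrays.sort(sorted); // O(n log n), done once
        int matches = 0;
        for (String s : second) {
            if (Arrays.binarySearch(sorted, s) >= 0) { // O(log n) per look-up
                matches++;
            }
        }
        return matches;
    }

    /** Way 2: hash the first list once, then probe the set for each element of the second. */
    static int countWithHashSet(String[] first, String[] second) {
        Set<String> set = new HashSet<>(Arrays.asList(first)); // expected O(n), done once
        int matches = 0;
        for (String s : second) {
            if (set.contains(s)) { // expected O(1) per look-up
                matches++;
            }
        }
        return matches;
    }
}
```

Both methods return the same count; only the cost per look-up differs, and each string comparison inside the binary search adds to the constant factor.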
I have the following key-value system (HashMap) , where String would be a key like this "2014/12/06".
LinkedHashMap<String, Value>
So I can retrieve an item when I know the key, but what I'm looking for is a way to retrieve a list of the values whose keys match partially. I mean, how could I retrieve all the values for 2014?
I would like to avoid solutions like testing every item in the list, brute force, or similar.
thanks.
Apart from doing the brute-force solution of iterating over all the keys, I can think of two options :
Use a TreeMap, in which the keys are sorted, so you can find the first key that is >= "2014/01/01" (using map.ceilingEntry("2014/01/01")) and go over all the keys from there (for example via map.tailMap("2014/01/01")).
Use a hierarchy of Maps - i.e. Map<String,Map<String,Value>>. The key in the outer Map would be the year. The key in the inner map would be the full date.
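The hierarchy-of-maps option could be sketched like this. The class ByYearIndex and its methods are hypothetical, and the code assumes every key has the form "yyyy/MM/dd" so that the first four characters are the year.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical two-level index: year -> (full date -> value). */
final class ByYearIndex<V> {
    private final Map<String, Map<String, V>> byYear = new HashMap<>();

    void put(String date, V value) {
        String year = date.substring(0, 4); // assumes "yyyy/MM/dd"
        byYear.computeIfAbsent(year, y -> new LinkedHashMap<>()).put(date, value);
    }

    /** Exact look-up by full date; returns null when absent. */
    V get(String date) {
        Map<String, V> inner = byYear.get(date.substring(0, 4));
        return inner == null ? null : inner.get(date);
    }

    /** All values stored for one year, in insertion order. */
    Collection<V> valuesForYear(String year) {
        Map<String, V> inner = byYear.get(year);
        return inner == null ? Collections.<V>emptyList() : inner.values();
    }
}
```

This only helps for the one prefix length you chose up front (the year); matching on year and month would need another level or a sorted map.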
Not possible with LinkedHashMap only. If you can copy the keys to an ordered list you can perform a binary search on that and then do a LinkedHashMap.get(...) with the full key(s).
If you're only ever going to want to retrieve items using the first part of the key, then you want a TreeMap rather than a LinkedHashMap. A LinkedHashMap is ordered by insertion order, which is no use for this, but a TreeMap is sorted according to natural ordering, or to a Comparator that you supply. This means that you can find the first entry that starts with 2014 efficiently (in log time), and then iterate through until you get to the first one that doesn't match.
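As a rough sketch of that idea, the helper below asks the sorted map for the key range that covers a prefix. The names PrefixQuery and withPrefix are invented, and the code assumes natural String ordering and keys that never contain the character U+FFFF.

```java
import java.util.NavigableMap;
import java.util.SortedMap;

/** Hypothetical prefix query on a sorted map with String keys. */
final class PrefixQuery {
    /** Returns a live view of all entries whose key starts with the given prefix. */
    static <V> SortedMap<String, V> withPrefix(NavigableMap<String, V> map, String prefix) {
        // Under natural String ordering, every key that starts with the prefix
        // falls in the half-open range [prefix, prefix + '\uffff').
        return map.subMap(prefix, true, prefix + Character.MAX_VALUE, false);
    }
}
```

Called as PrefixQuery.withPrefix(treeMap, "2014"), this locates the range in O(log n) and then iterates only over the matching entries.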
If you want to be able to match on any part of the key, then you need a totally different solution, way beyond a simple Map. You'd need to look into full text searching and indexing. You could try something like Lucene.
You could redefine the hash function for your keys so that keys with the same year get similarly prefixed hashes. That would be neither efficient (probably a poor distribution of hashes) nor in the spirit of HashMap. Use another map implementation, such as TreeMap, that keeps an order of your choice.
I have a simple collection of String objects, perhaps around 10 elements,
but I use this collection in a production environment where we search for a given string in it millions of times.
What is the best collection or data structure to use so that the search operation can be performed in O(1) time?
We could use a HashMap here, but its search is said to run in constant time, not O(1); I want to make sure that the search is O(1).
Our data structure must return true if the string is present and false otherwise.
Use a HashSet<String> structure. The contains() operation has a complexity of O(1).
Constant time is O(1). HashMap is fine. (Or HashSet, depending on whether you need a Set or a Map.)
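For a small fixed set of strings, a sketch could be as simple as the following. The class name KnownStrings and the three sample strings are placeholders, not values from the question.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical membership check over a small, fixed set of strings. */
final class KnownStrings {
    // Built once and never modified, so it is safe to share between threads.
    private static final Set<String> KNOWN = Collections.unmodifiableSet(
            new HashSet<>(Arrays.asList("alpha", "beta", "gamma")));

    /** Returns true if the string is present, false otherwise; expected O(1). */
    static boolean isKnown(String s) {
        return KNOWN.contains(s);
    }
}
```

With only about 10 elements almost any structure is fast; the hash set mainly keeps the cost flat if the collection grows later.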
If your set is immutable, Guava's ImmutableSet will reduce memory footprint by a factor of ~3 (and probably give you a small constant factor of improved speed).
If you can't use HashSet/HashMap as previously suggested, you could write a Radix Tree implementation.
Say I have a Hashtable<String, Object> with such keys and values:
apple => 1
orange => 2
mossberg => 3
I can use the standard get method to get 1 by "apple", but what I want is getting the same value (or a list of values) by a part of the key, for example "ppl". Of course it may yield several results, in this case I want to be able to process each key-value pair. So basically similar to the LIKE '%ppl%' SQL statement, but I don't want to use a (in-memory) database just because I don't want to add unnecessary complexity. What would you recommend?
Update:
Storing data in a Hashtable isn't a requirement. I'm looking for a general approach to solve this.
The obvious brute-force approach would be to iterate through the keys in the map and match them against the char sequence. That could be fine for a small map, but of course it does not scale.
This could be improved by using a second map to cache search results. Whenever you collect a list of keys matching a given char sequence, you can store these in the second map so that next time the lookup is fast. Of course, if the original map is changed often, it may get complicated to update the cache. As always with caches, it works best if the map is read much more often than changed.
Alternatively, if you know the possible char sequences in advance, you could pre-generate the lists of matching strings and pre-fill your cache map.
Update: Hashtable is not recommended anyway - it is synchronized, thus much slower than it should be. You are better off using HashMap if no concurrency is involved, or ConcurrentHashMap otherwise. The latter outperforms a Hashtable by far.
Apart from that, off the top of my head I can't think of a better collection for this task than maps. Of course, you may experiment with different map implementations to find the one that best suits your specific circumstances and usage patterns. In general, it would thus be
Map<String, Object> fruits;
Map<String, List<String>> matchingKeys;
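A minimal sketch of how those two maps could work together is given below. The class CachedSubstringSearch is made up for illustration, and it uses the bluntest possible invalidation (clear the whole cache on every write), which is only reasonable when reads far outnumber writes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical map wrapper that caches the results of substring searches over its keys. */
final class CachedSubstringSearch<V> {
    private final Map<String, V> data = new HashMap<>();
    private final Map<String, List<String>> matchingKeys = new HashMap<>();

    void put(String key, V value) {
        data.put(key, value);
        matchingKeys.clear(); // simplest invalidation: drop every cached result
    }

    V get(String key) {
        return data.get(key);
    }

    /** Brute-force scan on the first call for a given part, cached afterwards. */
    List<String> keysContaining(String part) {
        List<String> cached = matchingKeys.get(part);
        if (cached == null) {
            List<String> found = new ArrayList<>();
            for (String key : data.keySet()) {
                if (key.contains(part)) {
                    found.add(key);
                }
            }
            Collections.sort(found); // deterministic order for callers
            cached = Collections.unmodifiableList(found);
            matchingKeys.put(part, cached);
        }
        return cached;
    }
}
```

The first search for "ppl" still walks all keys; only repeated searches for the same char sequence become a single map look-up.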
Not without iterating through explicitly. Hashtable is designed to go (exact) key->value in O(1), nothing more, nothing less. If you will be doing query operations with large amounts of data, I recommend you do consider a database. You can use an embedded system like SQLite (see SQLiteJDBC) so no separate process or installation is required. You then have the option of database indexes.
I know of no standard Java collection that can do this type of operation efficiently.
Sounds like you need a trie with references to your data. A trie stores strings and lets you search for strings by prefix. I don't know the Java standard library too well and I have no idea whether it provides an implementation, but one is available here:
http://www.cs.duke.edu/~ola/courses/cps108/fall96/joggle/trie/Trie.java
Unfortunately, a trie only lets you search by prefixes. You can work around this by storing every possible suffix of each of your keys:
For 'apple', you'd store the strings
'apple'
'pple'
'ple'
'le'
'e'
Which would allow you to search for every prefix of every suffix of your keys.
Admittedly, this is the kind of "solution" that would prompt me to continue looking for other options.
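To make the suffix idea concrete, here is a sketch that stores every suffix of every key. It substitutes a sorted map (TreeMap) for a real trie, since prefix search over a sorted map gives the same answers; the class SuffixIndex is hypothetical, and keys are assumed not to contain the character U+FFFF.

```java
import java.util.NavigableMap;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical substring index: maps every suffix of every key back to the key. */
final class SuffixIndex {
    // suffix -> all keys that end with this suffix
    private final NavigableMap<String, Set<String>> bySuffix = new TreeMap<>();

    /** Registers a key under each of its suffixes ('apple', 'pple', 'ple', 'le', 'e'). */
    void add(String key) {
        for (int i = 0; i < key.length(); i++) {
            bySuffix.computeIfAbsent(key.substring(i), s -> new TreeSet<>()).add(key);
        }
    }

    /** A key contains `part` exactly when one of its suffixes starts with `part`. */
    Set<String> keysContaining(String part) {
        Set<String> result = new TreeSet<>();
        for (Set<String> keys
                : bySuffix.subMap(part, true, part + Character.MAX_VALUE, false).values()) {
            result.addAll(keys);
        }
        return result;
    }
}
```

The cost is memory: a key of length m contributes m suffixes, so the index grows quadratically with key length in the worst case.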
First of all, use HashMap, not Hashtable.
Then you can filter the map with a predicate, using the utilities in Google Guava:
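// Needs com.google.common.collect.Maps and com.google.common.base.Predicate (Guava).
public Collection<Object> getValues(final String part) {
    // filterKeys returns a live view; each access still tests the keys one by one.
    Map<String, Object> filtered = Maps.filterKeys(map, new Predicate<String>() {
        @Override
        public boolean apply(String key) {
            return key.contains(part);
        }
    });
    return filtered.values();
}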
Can't be done in a single operation
You may want to try to iterate the keys and use the ones that contain your desired string.
The only solution I can see (I'm not a Java expert) is to iterate over the keys and check each one against a regular expression. If it matches, you put the matched key-value pair in the hashtable that will be returned.
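That brute-force approach might be sketched as follows. The class RegexKeyFilter is invented for the example, and it returns a LinkedHashMap rather than a Hashtable so that the result keeps the iteration order of the source map.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical brute-force filter: copies the entries whose key matches a regex. */
final class RegexKeyFilter {
    static <V> Map<String, V> filter(Map<String, V> source, String regex) {
        Pattern pattern = Pattern.compile(regex); // compile once, reuse for every key
        Map<String, V> result = new LinkedHashMap<>();
        for (Map.Entry<String, V> e : source.entrySet()) {
            // find() matches anywhere in the key, like SQL LIKE '%...%'
            if (pattern.matcher(e.getKey()).find()) {
                result.put(e.getKey(), e.getValue());
            }
        }
        return result;
    }
}
```

This is O(n) per query, which is fine for a small map; for a plain substring such as "ppl", pass it through Pattern.quote first if it may contain regex metacharacters.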
If you can somehow reduce the problem to searching by prefix, you might find a NavigableMap helpful.
You may find this question interesting: Fuzzy string search library in Java
Also take a look at Lucene (answer number two)