I'm looking for a good sorted data structure in Java. After some research I got a few hints about using TreeSet/TreeMap. But these classes lack one thing: random access to an element in the set. For example, I want to access the nth element in the sorted set, but with TreeSet I must iterate over the other n-1 elements before I get there. That would be a waste, since I would have up to several thousand elements in my set.
The use case looks like this:
9:20 AM what is this object? edited by user1
9:30 AM what is this book ? edited by user2
9:40 PM what is this red book? edited by user1
I always want to show the latest edited title by a given user. I know that the latest one is going to be the one with the greatest timestamp. For this I found that ConcurrentSkipListSet/ConcurrentSkipListMap are good. But I would like to know if there are any better ways to implement this functionality.
Assuming you have to keep your data sorted, your best bet is TreeMap. There's no silver bullet in the JDK that keeps itself sorted and also performs O(1) random access. In an ordered collection such as a List you can access an element directly by index, but you lose the benefit of index-based access once the collection is also required to stay sorted.
If you need concurrency, ConcurrentSkipListMap is good. It's suitable for large-scale concurrent access to data. However, in single-threaded use it is generally no match, performance-wise, for our red-black-tree-based pal, TreeMap. Thus if you don't need concurrency, forget ConcurrentSkipListMap and stick with TreeMap.
TreeMap is elegant and satisfies your need. Nonetheless, in practice, using a HashMap and sorting the data whenever you need it sorted might be better. Try both and find out which one is better in your case.
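A minimal sketch of the TreeMap suggestion for the "latest edited title per user" use case. The class name LatestEdits, its method names, and the choice of LocalTime as the timestamp type are assumptions made for illustration; they are not from the question.

```java
import java.time.LocalTime;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: keeps each user's edits sorted by timestamp.
class LatestEdits {
    // user -> (edit time -> title); each inner TreeMap stays sorted by time
    private final Map<String, TreeMap<LocalTime, String>> editsByUser = new HashMap<>();

    void record(String user, LocalTime time, String title) {
        editsByUser.computeIfAbsent(user, u -> new TreeMap<>()).put(time, title);
    }

    // Returns the title with the greatest timestamp for the user, or null if the user has no edits.
    String latestTitle(String user) {
        TreeMap<LocalTime, String> edits = editsByUser.get(user);
        return edits == null ? null : edits.lastEntry().getValue();
    }

    public static void main(String[] args) {
        LatestEdits edits = new LatestEdits();
        edits.record("user1", LocalTime.of(9, 20), "what is this object?");
        edits.record("user2", LocalTime.of(9, 30), "what is this book ?");
        edits.record("user1", LocalTime.of(21, 40), "what is this red book?");
        System.out.println(edits.latestTitle("user1")); // what is this red book?
        System.out.println(edits.latestTitle("user2")); // what is this book ?
    }
}
```

lastEntry() is an O(log n) operation on a TreeMap, so fetching the newest title does not require iterating over the older ones. If several threads record edits at once, the same shape should work with ConcurrentHashMap and ConcurrentSkipListMap swapped in.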
Related
I want to write my own Map in Java. I know how a map works, but I don't really know where to keep the keys and values. Can I keep them in a List, for example? The keys would be stored in one list and the values in another?
It would be best to check out some of the concepts behind HashMap, TreeMap, etc.
Once you understand those concepts, you're far better prepared for writing your own map when it comes to speed.
In other words: unless you know the concepts of all available implementations, it is very unlikely your wheel-re-invention will be a better solution.
Also be sure to test your implementations very thoroughly, as collections are the backbone and heart of any good application.
Two very very simple (but slow) solutions are these:
1) As suggested above, you can use an ArrayList<Pair> and add your custom getItemByKey() (in Java commonly named 'get') method.
2) You can use two arrays, both keeping the same size, and keeping keys and values matched by their respective indices.
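A rough sketch of option 1, assuming a hypothetical class name ListMap and a small private Pair type (neither is a JDK class). Every lookup scans the list, so get and put are O(n), which is why this counts as simple but slow.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical list-backed map: simple, but every operation is a linear scan.
class ListMap<K, V> {
    private static final class Pair<K, V> {
        final K key;
        V value;

        Pair(K key, V value) {
            this.key = key;
            this.value = value;
        }
    }

    private final List<Pair<K, V>> pairs = new ArrayList<>();

    // Returns the value stored for key, or null if the key is absent.
    V get(K key) {
        for (Pair<K, V> p : pairs) {
            if (Objects.equals(p.key, key)) {
                return p.value;
            }
        }
        return null;
    }

    // Stores the value and returns the previous value for key, or null if there was none.
    V put(K key, V value) {
        for (Pair<K, V> p : pairs) {
            if (Objects.equals(p.key, key)) {
                V old = p.value;
                p.value = value;
                return old;
            }
        }
        pairs.add(new Pair<>(key, value));
        return null;
    }

    int size() {
        return pairs.size();
    }

    public static void main(String[] args) {
        ListMap<String, Integer> ages = new ListMap<>();
        ages.put("alice", 30);
        ages.put("alice", 31); // overwrites the earlier value
        System.out.println(ages.get("alice")); // 31
        System.out.println(ages.size());       // 1
    }
}
```

Option 2 would look the same, except that the Pair objects are split into two parallel arrays indexed by the same position.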
As for choosing the data structure, there is almost always nothing better than an array of entries (key/value pairs), because the main goal of a map is to map objects to objects, that is, keys to values.
Arrays give you fast, constant-time O(1) access by index, but there is one small problem: when your map is full, you have to create a new array and copy the old entries over.
Note: HashMap works in the same way.
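To make the "array of entries plus resize" idea concrete, here is a simplified sketch. It is not the actual java.util.HashMap code; the class name ArrayHashMap, the starting capacity of 8, and the 3/4 fill threshold are assumptions chosen for illustration, and null keys are not supported.

```java
// Hypothetical array-backed hash map: hash the key to a slot, chain collisions,
// and move everything into a bigger array when the table fills up.
class ArrayHashMap<K, V> {
    private static final class Entry<K, V> {
        final K key;
        V value;
        Entry<K, V> next; // next entry that landed in the same slot

        Entry(K key, V value, Entry<K, V> next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private Entry<K, V>[] table = newTable(8);
    private int size;

    @SuppressWarnings("unchecked")
    private static <K, V> Entry<K, V>[] newTable(int capacity) {
        return (Entry<K, V>[]) new Entry[capacity];
    }

    private static int indexFor(Object key, int capacity) {
        return (key.hashCode() & 0x7fffffff) % capacity; // mask keeps the index non-negative
    }

    V get(K key) {
        for (Entry<K, V> e = table[indexFor(key, table.length)]; e != null; e = e.next) {
            if (e.key.equals(key)) {
                return e.value;
            }
        }
        return null;
    }

    V put(K key, V value) {
        for (Entry<K, V> e = table[indexFor(key, table.length)]; e != null; e = e.next) {
            if (e.key.equals(key)) {
                V old = e.value;
                e.value = value;
                return old;
            }
        }
        if (size >= table.length * 3 / 4) {
            resize(); // the "create a new array and copy old entries" step
        }
        int i = indexFor(key, table.length);
        table[i] = new Entry<>(key, value, table[i]);
        size++;
        return null;
    }

    private void resize() {
        Entry<K, V>[] old = table;
        table = newTable(old.length * 2);
        for (Entry<K, V> head : old) {
            for (Entry<K, V> e = head; e != null; e = e.next) {
                int i = indexFor(e.key, table.length);
                table[i] = new Entry<>(e.key, e.value, table[i]);
            }
        }
    }

    int size() {
        return size;
    }
}
```

Because the array doubles each time, the copying cost is spread over many insertions, so put stays O(1) on average as long as the keys hash evenly.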
Looking at this question made me curious as to which to use, HashSet or ArrayList. The HashSet seems to have a better lookup and the ArrayList a better insert (for many, many objects). So my question is: since I can't insert using an ArrayList and then search through it using a HashSet, I'm going to have to pick one or the other. Would inserting with an ArrayList and then converting to a HashSet to do the lookups be SLOWER overall than just inserting into a HashSet and then looking up? Or should I just stick with the ArrayList, where the lookup is worse but the inserting makes up for it?
It very much depends on the size of the collection and the way you use it. For example, you can reuse the same HashSet copy for many lookups, and this would save you time. Or you can keep both collections up to date.
Creating a HashSet copy for each element lookup will always be slower.
You can also utilize LinkedHashSet, which has quick insertion and a HashSet's lookup speed at the cost of somewhat higher memory consumption and O(N) access by position (it has no get(int), so you have to iterate).
You must decide for your specific application which tradeoff pays off better. Do you first insert everything, then spend the rest of the time looking up, maybe occasionally adding a few more? Use HashSet. Do you have a lot of duplicates, which you must suppress? Another strong point for HashSet. Do you insert a lot all the time and only do an occasional lookup? Then use ArrayList. And so on, there are many more combinations and in some cases you'll have to benchmark it to see.
It totally depends on your use case. If you implement the hashCode method correctly, the insert operation of HashSet is also an (amortized) O(1) operation. If you don't need to access the elements randomly (by index), and you don't want duplicates, HashSet would be a better choice.
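A small sketch of the "insert into an ArrayList, convert once, then look up" approach from the question. The class name InsertThenLookup and the method buildIndex are made up for this example. The key point is that the copy into a HashSet happens once, not once per lookup.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class InsertThenLookup {
    // Copies the list into a HashSet once (O(n)); each later contains() is expected O(1).
    static Set<String> buildIndex(List<String> items) {
        return new HashSet<>(items);
    }

    public static void main(String[] args) {
        List<String> items = new ArrayList<>();
        for (int i = 0; i < 100_000; i++) {
            items.add("item-" + i); // appending to an ArrayList is amortized O(1)
        }

        Set<String> index = buildIndex(items); // one-time conversion

        System.out.println(index.contains("item-99999")); // hash lookup
        System.out.println(items.contains("item-99999")); // same answer, but an O(n) scan
    }
}
```

Whether this beats inserting straight into a HashSet depends on the workload, so it is worth timing both variants with realistic data before committing to one.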
Input : Let's say I have an object as Person. It has 2 properties namely
ssnNo - Social Security Number
name.
On one hand I have a List of Person objects (with unique ssnNo), and on the other hand I have a Map containing each Person's ssnNo as the key and the Person's name as the value.
Output : I need Person names using its ssnNo.
Questions :
Which approach to follow out of the 2 I have mentioned above i.e. using list or map? (I think the obvious answer would be the map).
If it is the map, is it always recommended to use a map whether the data set is large or small? I mean, are there any performance issues that come with the map?
Map is the way to go. Maps perform very well, and their advantages over lists for lookups get bigger the bigger your data set gets.
Of course, there are some important performance considerations:
Make sure you have a good hashCode (and corresponding equals) implementation, so that your data will be evenly spread across the buckets of the Map.
Make sure you pre-size your Map when you allocate it (if at all possible). The map will automatically resize, but the resize operation essentially requires re-inserting each prior element into the new, bigger Map.
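The two considerations above could look roughly like this. The Ssn key class, the SsnMapDemo wrapper, and the newMapFor helper are hypothetical names used only for illustration; the sizing formula assumes HashMap's default load factor of 0.75.

```java
import java.util.HashMap;
import java.util.Map;

class SsnMapDemo {
    // Hypothetical key type with matching equals and hashCode.
    static final class Ssn {
        private final String digits;

        Ssn(String digits) {
            this.digits = digits;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Ssn && digits.equals(((Ssn) o).digits);
        }

        @Override
        public int hashCode() {
            return digits.hashCode(); // equal keys always produce equal hash codes
        }
    }

    // Picks an initial capacity so that expectedEntries fit without a resize
    // under the default 0.75 load factor.
    static <K, V> Map<K, V> newMapFor(int expectedEntries) {
        int capacity = (int) (expectedEntries / 0.75f) + 1;
        return new HashMap<>(capacity);
    }

    public static void main(String[] args) {
        Map<Ssn, String> names = newMapFor(1_000);
        names.put(new Ssn("123-45-6789"), "Alice");
        // A different Ssn instance with the same digits finds the same entry.
        System.out.println(names.get(new Ssn("123-45-6789"))); // Alice
    }
}
```

If a plain String or Long is used as the key, the first point is already taken care of, since those classes ship with suitable equals and hashCode implementations.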
You're right, you should use a map in this case. There are no performance issues with using a map compared to a list; on the contrary, its lookup performance is significantly better than a list's when the data is large. A hash-based map uses the keys' hash codes to retrieve entries, in a similar way to how arrays use indexes to retrieve values, which gives good performance.
This looks like a situation appropriate for a Map<Long, Person> that maps a social security number to the relevant Person. You might want to consider removing the ssnNo field from Person so as to avoid any redundancies (since you would be storing those values as keys in your map).
In general, Maps and Lists are very different structures, each suited for different circumstances. You would use the former whenever you want to maintain a set of key-value pairs that allows you to easily and quickly (i.e. in constant time) look up values based on the keys (this is what you want to do). You would use the latter when you simply want to store an ordered, linear collection of elements.
I think it makes sense to have a Person object, but it also makes sense to use a Map over a List, since the look up time will be faster. I would probably use a Map with SSNs as keys and Person objects as values:
Map<SSN,Person> ssnToPersonMap;
It's all pointers: the map only stores references, so keeping the whole Person as the value costs nothing extra. It actually makes no sense to have a Map<ssn,PersonName> instead of a Map<ssn,Person>. The latter is the best choice most of the time.
Using a map, especially one implemented with a hash table, will be faster than the list, since it lets you get the name in expected constant time, O(1). With the list you need to do a linear search, or maybe a binary search if the list is sorted, and both are slower.
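For comparison, here is a sketch of both lookups side by side. The Person class shape and the helper names are assumptions, and the SSN is modeled as a String here purely for simplicity (one answer above suggests Long instead).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PersonLookup {
    // Hypothetical Person with the two properties from the question.
    static final class Person {
        final String ssnNo;
        final String name;

        Person(String ssnNo, String name) {
            this.ssnNo = ssnNo;
            this.name = name;
        }
    }

    // List approach: linear search, O(n) per lookup.
    static String nameFromList(List<Person> people, String ssnNo) {
        for (Person p : people) {
            if (p.ssnNo.equals(ssnNo)) {
                return p.name;
            }
        }
        return null;
    }

    // Map approach: build the index once, then each lookup is expected O(1).
    static Map<String, Person> indexBySsn(List<Person> people) {
        Map<String, Person> bySsn = new HashMap<>();
        for (Person p : people) {
            bySsn.put(p.ssnNo, p);
        }
        return bySsn;
    }

    static String nameFromMap(Map<String, Person> bySsn, String ssnNo) {
        Person p = bySsn.get(ssnNo);
        return p == null ? null : p.name;
    }

    public static void main(String[] args) {
        List<Person> people = new ArrayList<>();
        people.add(new Person("111-11-1111", "Alice"));
        people.add(new Person("222-22-2222", "Bob"));

        Map<String, Person> bySsn = indexBySsn(people);
        System.out.println(nameFromList(people, "222-22-2222")); // Bob
        System.out.println(nameFromMap(bySsn, "222-22-2222"));   // Bob
    }
}
```

For a handful of entries the difference is negligible; the map pulls ahead as the data set grows.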
I'm making a Java application that is going to be storing a bunch of random words (which can be added to or deleted from the application at any time). I want fast lookups to see whether a given word is in the dictionary or not. What would be the best Java data structure to use for this? As of now, I was thinking about using a HashMap, with the same word as both the key and the value for that key. Is this common practice? Using the same string for both the key and value in a (key,value) pair seems weird to me, so I wanted to make sure that there wasn't some better idea that I was overlooking.
I was also thinking about alternatively using a TreeMap to keep the words sorted, giving me an O(log n) lookup time, but the HashMap should give an expected O(1) lookup time as I understand it, so I figured that would be better.
So basically I just want to make sure the HashMap idea with the strings doubling as both key and value in each (key,value) pair would be a good decision. Thanks.
I want fast lookups to see whether a given word is in the dictionary or not. What would be the best java data structure to use for this?
This is the textbook use case of a Set. You can use a HashSet. In fact, HashSet<T> is itself backed by a corresponding HashMap<T, Object> that simply marks whether the entry exists or not.
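A minimal sketch of the Set-based dictionary, assuming a hypothetical wrapper class named WordDictionary. It covers the three operations the question asks for: add, delete, and membership test.

```java
import java.util.HashSet;
import java.util.Set;

class WordDictionary {
    private final Set<String> words = new HashSet<>();

    // Returns true if the word was not already present.
    boolean add(String word) {
        return words.add(word);
    }

    // Returns true if the word was present and has been removed.
    boolean remove(String word) {
        return words.remove(word);
    }

    // Expected O(1) membership test.
    boolean contains(String word) {
        return words.contains(word);
    }

    public static void main(String[] args) {
        WordDictionary dictionary = new WordDictionary();
        dictionary.add("apple");
        dictionary.add("banana");
        dictionary.remove("banana");
        System.out.println(dictionary.contains("apple"));  // true
        System.out.println(dictionary.contains("banana")); // false
    }
}
```

This avoids the awkward "same string as key and value" pattern, because the set stores only the keys from the caller's point of view.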
If you're storing it as a collection of words in a dictionary, I'd suggest taking a look at tries. They can require less memory than a Set when many words share prefixes, and they have quick lookup times, with a worst case of O(string length).
Any class that is a Set should serve your purpose. However, do note that a Set will not allow duplicates. For that matter, even a Map won't allow duplicate keys. I would suggest using an ArrayList (assuming synchronization is not needed) if you need to add duplicate entries and treat them as separate.
My only concern would be memory. If you use the HashSet and you have a very large collection of words, you will have to load the entire collection into memory. If that's not a problem (and your collection must be very large for this to be a problem), then the HashSet should be fine. If you do have a very large collection of words, you can try to use a tree and only load into memory the parts that you are interested in.
Also keep in mind how insertion works: a HashSet inserts in expected O(1) time and does not keep its elements sorted, whereas a sorted tree such as TreeSet pays O(log n) per insertion to keep every element in order. Again, nothing major, but if you add a lot of words at a time and also need them sorted, you may consider using a tree.
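Here is one way a trie could be sketched. The class name Trie and the use of a HashMap per node are assumptions made for brevity; a HashMap per node is easy to read but not memory-frugal, so a tuned implementation would likely store children differently. remove only clears the end-of-word flag and does not prune empty nodes.

```java
import java.util.HashMap;
import java.util.Map;

class Trie {
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean isWord; // true if the path from the root to this node spells a stored word
    }

    private final Node root = new Node();

    // Walks or creates one node per character: O(word length).
    void insert(String word) {
        Node node = root;
        for (int i = 0; i < word.length(); i++) {
            node = node.children.computeIfAbsent(word.charAt(i), c -> new Node());
        }
        node.isWord = true;
    }

    // Follows the characters and returns the final node, or null if the path does not exist.
    private Node find(String word) {
        Node node = root;
        for (int i = 0; i < word.length() && node != null; i++) {
            node = node.children.get(word.charAt(i));
        }
        return node;
    }

    boolean contains(String word) {
        Node node = find(word);
        return node != null && node.isWord;
    }

    // Simplified removal: unmarks the word but leaves its nodes in place.
    boolean remove(String word) {
        Node node = find(word);
        if (node == null || !node.isWord) {
            return false;
        }
        node.isWord = false;
        return true;
    }

    public static void main(String[] args) {
        Trie trie = new Trie();
        trie.insert("car");
        trie.insert("cart");
        System.out.println(trie.contains("car"));  // true
        System.out.println(trie.contains("ca"));   // false, only a prefix
        trie.remove("car");
        System.out.println(trie.contains("cart")); // true
    }
}
```

A trie also makes prefix queries straightforward, which a HashSet cannot do without scanning every word.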
This question relates to using most efficient data structure for a part of a uni-project.
I have to store several instruction objects in a data structure. Each instruction has a unique int ID called Stage. Is HashMap the best choice to find the instruction I need fast? I haven't used it before, but from the description it seems that using the int ID as the key would make this run efficiently. If you can, please suggest a more efficient way to do it. Thanks
If you only want to look up entries and not add, delete, move, sort or do anything else, then an array is the fastest data structure for this.
Yes. Some kind of Map seems to be the data structure of choice in your scenario.
Note that a HashMap does not maintain the order of its elements. If order is important to you, I suggest you use LinkedHashMap (or perhaps even some List structure) instead.
I think that's the best way because that way you can access the table in O(1).
It also depends on the type of your IDs; maybe an array is enough (and even more efficient), but generally a hash table is more flexible for these purposes.
If you know the domain of the keys, an ArrayList or a plain array may be even more efficient. But there are reasons not to use plain arrays too much.
If the IDs are simply integers and they run like 0, 1, 2, ..., n, then an array would be the best choice.
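Both options from the answers above, sketched under some assumptions: the Instruction class shape and the helper names are invented for illustration, and the array variant only works if every Stage ID falls in the range 0..maxStage.

```java
import java.util.HashMap;
import java.util.Map;

class InstructionLookup {
    // Hypothetical instruction with its unique int ID.
    static final class Instruction {
        final int stage;
        final String text;

        Instruction(int stage, String text) {
            this.stage = stage;
            this.text = text;
        }
    }

    // Array variant: assumes every stage is in 0..maxStage, then lookup is table[stage].
    static Instruction[] indexByArray(Instruction[] instructions, int maxStage) {
        Instruction[] table = new Instruction[maxStage + 1];
        for (Instruction in : instructions) {
            table[in.stage] = in;
        }
        return table;
    }

    // Map variant: works for arbitrary or sparse IDs, at the cost of boxing and hashing.
    static Map<Integer, Instruction> indexByMap(Instruction[] instructions) {
        Map<Integer, Instruction> byStage = new HashMap<>();
        for (Instruction in : instructions) {
            byStage.put(in.stage, in);
        }
        return byStage;
    }

    public static void main(String[] args) {
        Instruction[] instructions = {
            new Instruction(0, "load"),
            new Instruction(1, "add"),
            new Instruction(2, "store"),
        };

        Instruction[] table = indexByArray(instructions, 2);
        Map<Integer, Instruction> byStage = indexByMap(instructions);

        System.out.println(table[1].text);       // add
        System.out.println(byStage.get(2).text); // store
    }
}
```

If the IDs are sparse or very large (say 7, 4000, 1_000_000), the array would waste memory or overflow its bounds, and the HashMap becomes the more practical choice.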