Indexed addition without IndexOutOfBounds Exception - java

I have n objects, each with an identifying number. I receive them unsorted, but the identifiers cover exactly the range 0 to n-1. I want to access them as fast as possible. I suppose that an ArrayList would be the best option: I'd add the object with identifier i at index i of the ArrayList with:
list.add(identifier, object);
The problem is that when I add the objects I get an IndexOutOfBoundsException, because I'm adding them unsorted and size() is still smaller than the index, even though I know that the earlier positions will eventually be filled.
Another option is to use a HashMap but I suppose that this will decrease performance.
Do you know a collection that has the behavior described above?

Do you know a collection that has the behavior described above?
It sounds like you need a plain old Java array. And if you need it as a collection, then use "Arrays.asList(...)" to create a List wrapper for it.
Now this won't work if you needed to add or remove elements from the array / collection, but it sounds like you don't need to from your problem description.
If you do need to add / remove elements (as distinct from using set to update the element at a given position), then Peter Lawrey's approach is best.
By contrast, a HashMap<Integer, Object> would be an expensive alternative. At a rough estimate, I'd say that "indexing" operations would be at least 10 times slower, and the data structure would take 10 times the space compared to an equivalent array or ArrayList. A hash table based solution is only really a viable alternative (from a performance perspective) if the array is large and sparse.
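A minimal sketch of the plain-array approach, assuming the identifiers cover exactly 0..n-1 as the question states. The class and method names (ById, indexById, asListView) are made up for illustration, and String stands in for the real object type.

```java
import java.util.Arrays;
import java.util.List;

final class ById {
    // Places each item in the slot named by its identifier.
    // Assumes ids is a permutation of 0..n-1; arrival order does not matter.
    static String[] indexById(int[] ids, String[] items) {
        String[] slots = new String[ids.length];
        for (int i = 0; i < ids.length; i++) {
            slots[ids[i]] = items[i];
        }
        return slots;
    }

    // Fixed-size List view backed by the array: get() and set() work,
    // add() and remove() throw UnsupportedOperationException.
    static List<String> asListView(String[] slots) {
        return Arrays.asList(slots);
    }
}
```

Lookup is then just slots[id], with no boxing and no hashing.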

Sometimes you get the indexes out of order. This requires you to add dummy entries which may be filled later.
int indexToAdd = ...
E elementToAdd = ...
while(list.size() <= indexToAdd) list.add(null);
list.set(indexToAdd, elementToAdd);
This will allow you to add entries beyond the current end of the list.
The Javadoc for List.add(int, E) states
IndexOutOfBoundsException - if the index is out of range (index < 0 || index > size())
List.set(int, E) is stricter still: it throws if (index < 0 || index >= size()).
If you attempt to add entries beyond the end
List<Integer> list = new ArrayList<>();
list.add(1, 1);
you get
Exception in thread "main" java.lang.IndexOutOfBoundsException: Index: 1, Size: 0
at java.util.ArrayList.rangeCheckForAdd(ArrayList.java:612)
at java.util.ArrayList.add(ArrayList.java:426)
at Main.main(Main.java:28)

I'm not sure how much more expensive a HashMap<Integer, T> would be. Integer.hashCode() is quite efficient, though the only really expensive operation might be copying the data into a new larger array as the number of items increases. However, if you know your n, you could use a normal array.
As an alternative, you could implement your own Map<Integer, T> that does not use the hash code but the Integer itself. Before you do this, make sure that neither an array is sufficient nor HashMap is efficient enough!
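As a rough sketch of the "use the Integer itself instead of the hash code" idea: the class below is hypothetical (IntKeyTable is not a JDK type) and deliberately does not implement the full Map interface. It only shows the lookup and growth logic, and assumes keys are small non-negative ints.

```java
import java.util.Arrays;

final class IntKeyTable<T> {
    private Object[] slots = new Object[16];

    // Stores value under key; a negative key throws ArrayIndexOutOfBoundsException.
    void put(int key, T value) {
        if (key >= slots.length) {
            // grow to at least key + 1, doubling to keep the amortised cost low
            slots = Arrays.copyOf(slots, Math.max(key + 1, slots.length * 2));
        }
        slots[key] = value;
    }

    // Returns the stored value, or null if nothing was stored under key.
    @SuppressWarnings("unchecked")
    T get(int key) {
        return key < slots.length ? (T) slots[key] : null;
    }
}
```

Memory use is proportional to the largest key, so this only pays off when the keys are dense.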

I think you have at least two good options.
The first is to use a straight Java array initialized to the appropriate length, then add objects with the syntax:
theArray[i] = object;
The second would be to use a HashMap, and add objects with the syntax:
theMap.put(i, object);
I'm not sure what performance issues you're worried about, but adding elements within the range, clearing (out of an array) or removing (out of a HashMap), and finding elements from a given index (or key, for a HashMap) are all O(1) for both structures. I would also suggest taking a look at Wikipedia's list of data structures if neither of these seem good.

Related

Structure like Java's EnumSet that can hold repeated elements

I need some structure in which to store N enum values, some of them repeated, and to be able to extract them easily. So far I've tried to use an EnumSet like this.
cards = EnumSet.of(
BEST_OF_THREE,
BEST_OF_THREE,
SIMPLE_QUESTION,
SIMPLE_QUESTION,
STAR);
But now I see it can only have one of each. Conceptually, which one would be the best structure to use for this problem.
Regards
jose
You can use a Map from the enum type to Integer, where the integer indicates how many of each there are. Google Guava's Multiset does this for you, and handles the edge cases of adding an enum constant when there is no entry for it yet, and of removing one when none would be left.
Another strategy is to use the Enumeration ordinal index. Because this index is unique, you can use this to index into an int array that is sized to the Enumeration size, where the count in each array slot would indicate how many of each enumeration you have. Like this:
// initialize array for counting each enumeration type
// Java initializes every element of a new int[] to zero
int[] cardCount = new int[CardEnum.values().length];
...
// incrementing the count for an enumeration (when we add)
cardCount[BEST_OF_THREE.ordinal()]++;
...
// decrementing the count for an enumeration (when we remove)
cardCount[BEST_OF_THREE.ordinal()]--;
// DEBUG: assert cardCount[BEST_OF_THREE.ordinal()] >= 0
...
// getting the count for an enumeration
int count = cardCount[BEST_OF_THREE.ordinal()];
... Some time later
Having read the clarifying comments underneath the original post that explained what the OP was asking, it is clear that you're best off with a linear structure with an entry per element. I didn't realize that you didn't need detailed information on how many of each you needed. Storing them in a MultiSet or an equivalent counting structure makes it hard to randomly pick, as you need to attribute an index picked at random from [0, size) to a particular container, which takes log time.
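A small sketch of the "linear structure with an entry per element" idea, so that a random pick is a single index lookup. The enum name CardEnum follows the snippet above; the Deck class and its methods are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

final class Deck {
    enum CardEnum { BEST_OF_THREE, SIMPLE_QUESTION, STAR }

    // One list entry per card, duplicates included.
    static List<CardEnum> newDeck() {
        List<CardEnum> cards = new ArrayList<>();
        cards.add(CardEnum.BEST_OF_THREE);
        cards.add(CardEnum.BEST_OF_THREE);
        cards.add(CardEnum.SIMPLE_QUESTION);
        cards.add(CardEnum.SIMPLE_QUESTION);
        cards.add(CardEnum.STAR);
        return cards;
    }

    // Removes and returns a uniformly random card; the list must not be empty.
    static CardEnum draw(List<CardEnum> cards, Random rng) {
        return cards.remove(rng.nextInt(cards.size()));
    }
}
```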
Sets don't allow duplicates, so if you want repeats you'll need either a List or a Map.
If you just need the number of duplicates, an EnumMap with Integer values is probably your best bet.
If the order is important, and you need quick access to the number of each type, you'll probably need to roll your own data structure.
If the order is important (but the count of each is not), then a List is the way to go, which implementation depends on how you will use it.
LinkedList - Best when there will be many inserts/removals from the beginning of the List. Indexing into a LinkedList is very expensive, and should be avoided whenever possible. If a List is built by shifting data onto the front of the list, but any later additions are at the end, conversion to an ArrayList once the initial List is built is a good idea - especially if indexing into the List is anticipated at any point.
ArrayList - When in doubt, this is a good place to start. Inserting or removing items requires shifting, so if this is a common operation look elsewhere.
TreeList - This is a good all-around option, and insertions and removals anywhere in the List are inexpensive. This does require the Apache commons library, and uses a bit more memory than the others.
Benchmarks, and the code used to generate them, can be found in this gist.
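For the case where only the number of duplicates matters, the EnumMap suggestion might look roughly like this. CardCounts and its method names are invented for the example, and the enum constants are taken from the question.

```java
import java.util.EnumMap;
import java.util.Map;

final class CardCounts {
    enum CardEnum { BEST_OF_THREE, SIMPLE_QUESTION, STAR }

    private final Map<CardEnum, Integer> counts = new EnumMap<>(CardEnum.class);

    // Adds one occurrence of card.
    void add(CardEnum card) {
        counts.merge(card, 1, Integer::sum);
    }

    // Removes one occurrence; returns false if none was present.
    boolean remove(CardEnum card) {
        Integer n = counts.get(card);
        if (n == null) {
            return false;
        }
        if (n == 1) {
            counts.remove(card); // drop the entry rather than store a zero
        } else {
            counts.put(card, n - 1);
        }
        return true;
    }

    int count(CardEnum card) {
        return counts.getOrDefault(card, 0);
    }
}
```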

Fastest Variable Set in Java

I need a structure (ArrayList, LinkedList, etc.) that is very fast for this case:
While the structure is not empty, I search it for elements that satisfy a condition, let's say k, remove the elements that satisfy k, and start over with another condition, let's say k+1.
e.g.:
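for (int i = 1; i <= 1000000; i++) {
    structure.add(i);
}
int d = 2;
while (!structure.isEmpty()) {
    // iterate over a copy so that removing from structure is safe
    for (Integer boom : new ArrayList<>(structure)) {
        if (boom % d == 2) {
            structure.remove(boom); // boom is an Integer, so this removes by value
        }
    }
    d++; // move on to the next condition once the pass is complete
}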
If the elements are primitives, then the fastest structure will most probably be a specialized primitive collection (e.g., trove). Following references for boxed primitives is a nearly sure cache miss and this probably dominates the costs.
I wouldn't suggest a LinkedList for the same reason: It's dead slow due to cache misses.
If the order is unimportant, then an ArrayList is perfect. Instead of removing an element, replace it with the last one and remove the last array element. This is an O(1) operation and doesn't suffer from bad spatial locality.
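The replace-with-last trick could be sketched like this (SwapRemove and removeAt are illustrative names, not an existing API):

```java
import java.util.List;

final class SwapRemove {
    // Removes the element at index i in O(1) on an ArrayList by overwriting
    // it with the last element. The order of the remaining elements is NOT preserved.
    static <T> void removeAt(List<T> list, int i) {
        int last = list.size() - 1;
        list.set(i, list.get(last));
        list.remove(last); // removing the tail of an ArrayList shifts nothing
    }
}
```

When calling this inside a forward loop, re-examine index i after a removal, since a different element now sits there.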
If the order is important, you can build your own ArrayList-like structure. Instead of removing an element, you mark it for removal e.g. in a BitSet or in a boolean[]. Finally you perform the removal in one sweep by moving all elements to their right position and adjusting the length. The optimized loop will most probably look similar to CharMatcher.removeFrom loop.
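A sketch of the mark-then-sweep idea, here applied to an ordinary List rather than a hand-rolled structure. The names SweepRemove and sweep are assumptions, and this is not the actual CharMatcher.removeFrom code, just the same general shape.

```java
import java.util.BitSet;
import java.util.List;

final class SweepRemove {
    // Removes every element whose index is set in 'dead', preserving the
    // order of the survivors, in a single pass.
    static <T> void sweep(List<T> list, BitSet dead) {
        int write = 0;
        for (int read = 0; read < list.size(); read++) {
            if (!dead.get(read)) {
                list.set(write++, list.get(read)); // move survivor to its final position
            }
        }
        // drop the now-unused tail in one call
        list.subList(write, list.size()).clear();
    }
}
```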
A simpler solution would be to use an ArrayList and copy all surviving elements to another one. I'd bet it'd beat the LinkedList hands down. As a minor GC-friendly optimization you can work with two lists.
LinkedList should be fastest for this case. Use the iterator explicitly (structure.iterator()) and call the remove method of the iterator instead of calling structure.remove(element)!
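Removal through the iterator might look like the sketch below. The "divisible by d" condition is only a stand-in for whatever condition applies, and the class name is made up.

```java
import java.util.Iterator;
import java.util.List;

final class IteratorRemoval {
    // Removes every element divisible by d, using Iterator.remove() so the
    // list is never searched again for the element being removed.
    static void removeMultiples(List<Integer> structure, int d) {
        for (Iterator<Integer> it = structure.iterator(); it.hasNext(); ) {
            if (it.next() % d == 0) {
                it.remove();
            }
        }
    }
}
```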
I don't know your exact use case, but here's one note.
If you have your predicates P1 .. PN pre-compiled, available, and if you are not modifying the contents of the collection and if your predicates are not dependent on each other, you might want to create a composite predicate, like bundle up N predicates in some logical order, and then in only one iteration over your collection perform the filtering method.
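One way to read the composite-predicate suggestion, assuming Java 8 or later and predicates that really are independent of each other. CompositeFilter, anyOf and removeMatching are hypothetical names.

```java
import java.util.Collection;
import java.util.List;
import java.util.function.Predicate;

final class CompositeFilter {
    // Combines the predicates with logical OR: the result matches an item
    // if at least one of the given predicates matches it.
    static <T> Predicate<T> anyOf(List<Predicate<T>> predicates) {
        Predicate<T> combined = x -> false;
        for (Predicate<T> p : predicates) {
            combined = combined.or(p);
        }
        return combined;
    }

    // Removes, in a single pass, every item matched by any predicate.
    static <T> void removeMatching(Collection<T> items, List<Predicate<T>> predicates) {
        items.removeIf(anyOf(predicates));
    }
}
```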
As for the data structure, I'd think of it like this:
If my filtering predicates will be totally arbitrary, then a list should be OK to use.
In some more specific cases with very limited and strict value sets, you might consider a tree-like or a graph-like structure, where you could have some master nodes which would denote that property "property1" has value "value1". In case you wanted to drop all items where "property1" value is "value1" you could tell that master node to remove all his children (and that they should detach themselves from any other parent master nodes they might have).
Sorted List data structure
If you construct the lists yourself, you can consider using a sorted data structure. It will give you the best search performance (O(log n) complexity, so it is very fast).
Linked List data structure
LinkedList gives you constant-time element removal once you are positioned at the element, but random access is not constant-time (it is slow).
You will have to benchmark if a LinkedList or a sorted list would be faster for your scenario.
If your elements are ints, I suppose a bit set would be the fastest data structure for this task. Iteration would be slightly slower than through an array list (meaning a primitive-specialized one, not the standard java.util.ArrayList), but removals cost next to nothing, whereas removals from any array list are quite expensive.
Note, you can gain much by working directly with a long[] as the bit set and performing the bitwise operations by hand, because java.util.BitSet is not very performance-focused. But, of course, start with BitSet.
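A starting point with java.util.BitSet could look like this sketch, where bit i being set means the int i is still in the collection. The "multiple of d" condition and the class name are assumptions for illustration.

```java
import java.util.BitSet;

final class BitSetFilter {
    // Builds a BitSet containing the ints 1..n.
    static BitSet range(int n) {
        BitSet bits = new BitSet(n + 1);
        bits.set(1, n + 1); // sets bits 1 (inclusive) to n + 1 (exclusive)
        return bits;
    }

    // Removes every present multiple of d; clearing a bit is a cheap O(1) operation.
    static void removeMultiples(BitSet bits, int d) {
        for (int i = bits.nextSetBit(0); i >= 0; i = bits.nextSetBit(i + 1)) {
            if (i % d == 0) {
                bits.clear(i);
            }
        }
    }
}
```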

Adding elements into ArrayList at position larger than the current size

Currently I'm using an ArrayList to store a list of elements, whereby I will need to insert new elements at specific positions. There is a need for me to enter elements at a position larger than the current size. For e.g:
ArrayList<String> arr = new ArrayList<String>();
arr.add(3,"hi");
Now I already know there will be an OutOfBoundsException. Is there another way or another object where I can do this while still keeping the order? This is because I have methods that finds elements based on their index. For e.g.:
ArrayList<String> arr = new ArrayList<String>();
arr.add("hi");
arr.add(0,"hello");
I would expect to find "hi" at index 1 instead of index 0 now.
So in summary, short of manually inserting null into the elements in-between, is there any way to satisfy these two requirements:
Insert elements into position larger than current size
Push existing elements to the right when I insert elements in the middle of the list
I've looked at "Java ArrayList add item outside current size", as well as HashMap, but HashMap doesn't satisfy my second criterion. Any help would be greatly appreciated.
P.S. Performance is not really an issue right now.
UPDATE: There have been some questions on why I have these particular requirements, it is because I'm working on operational transformation, where I'm inserting a set of operations into, say, my list (a math formula). Each operation contains a string. As I insert/delete strings into my list, I will dynamically update the unapplied operations (if necessary) through the tracking of each operation that has already been applied. My current solution now is to use a subclass of ArrayList and override some of the methods. I would certainly like to know if there is a more elegant way of doing so though.
Your requirements are contradictory:
... I will need to insert new elements at specific positions.
There is a need for me to enter elements at a position larger than the current size.
These imply that positions are stable; i.e. that an element at a given position remains at that position.
I would expect to find "hi" at index 1 instead of index 0 now.
This states that positions are not stable under some circumstances.
You really need to make up your mind which alternative you need.
If you must have stable positions, use a TreeMap or HashMap. (A TreeMap allows you to iterate the keys in order, but at the cost of more expensive insertion and lookup ... for a large collection.) If necessary, use a "position" key type that allows you to "always" generate a new key that goes between any existing pair of keys.
If you don't have to have stable positions, use an ArrayList, and deal with the case where you have to insert beyond the end position using append.
I fail to see how it is sensible for positions to be stable if you insert beyond the end, and allow instability if you insert in the middle. (Besides, the latter is going to make the former unstable eventually ...)
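The two alternatives could be sketched as follows. The class and method names are invented for the example, and String stands in for the element type.

```java
import java.util.List;
import java.util.TreeMap;

final class PositionChoices {
    // Alternative 1: stable positions. Keys are positions, gaps are allowed,
    // and iteration follows key order.
    static TreeMap<Integer, String> stable() {
        TreeMap<Integer, String> byPosition = new TreeMap<>();
        byPosition.put(3, "hi");    // positions 0..2 need not exist
        byPosition.put(0, "hello"); // "hi" stays at position 3
        return byPosition;
    }

    // Alternative 2: unstable positions. Insert at the index if it exists,
    // otherwise append at the end.
    static <T> void insertOrAppend(List<T> list, int index, T element) {
        if (index >= list.size()) {
            list.add(element);
        } else {
            list.add(index, element); // shifts later elements to the right
        }
    }
}
```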
You can even use a TreeMap to maintain the order of the keys.
First and foremost, I would say use a Map instead of a List; I guess your problem can be solved in a better way with a Map. But if you really want to do this with an ArrayList:
ArrayList<String> a = new ArrayList<String>(); //Create empty list
a.addAll(Arrays.asList(new String[100])); // pad with n nulls; n is 100 here, but choose a value that suits your requirement
a.add(7,"hello");
a.add(2,"hi");
a.add(1,"hi2");
Use Vector class to solve this issue.
Vector vector = new Vector();
vector.setSize(100);
vector.set(98, "a");
When setSize(100) is called, all 100 elements are initialized to null.
For those who are still dealing with this, you may do it like this.
Object[] array= new Object[10];
array[0]="1";
array[3]= "3";
array[2]="2";
array[7]="7";
List<Object> list= Arrays.asList(array);
But the thing is you need to identify the total size first, this should be just a comment but I do not have much reputation to do that.

Question regarding Java's LinkedList class

I have a question regarding the LinkedList class in Java.
I have a scenario wherein I need to add or set an index based on whether the index exists in the linked list or not. Pseudo-code for what I want to achieve:
if index a exists within the linkedlist ll
ll.set(a,"arbit")
else
ll.add(a,"arbit")
I did go through the Javadocs for the LinkedList class but did not come across anything relevant.
Any ideas ?
Thanks
p1ng
What about using a Map for this:
Map<Integer, String> map = new HashMap<Integer, String>();
// ...
int a = 5;
map.put(a, "arbit");
Even if a already exists, put will just replace the old String.
Searching in a linked list is not very efficient (O(n)). Have you considered using a different data structure, e.g. a HashMap, which would give you O(1) access time?
If you need sequential access as well as keyed access you might want to try a LinkedHashMap, available since Java 1.4:
http://download.oracle.com/javase/1.4.2/docs/api/java/util/LinkedHashMap.html
Map<Integer, String> is definitely a good (the best?) way to go here.
Here's an option for keeping with LinkedList if that's for some bizarre reason a requirement. It has horrible runtime performance and disallows null, since null now becomes an indicator that an index isn't occupied.
String toInsert = "arbit";
int a = 5;
//grow the list to allow index a
while ( a >= ll.size() ) {
ll.add(null);
}
//set index a to the new value
ll.set(a, toInsert);
If you're going to take this gross road, you might be better off with an ArrayList.
Why is it so bad? Say you had only one element at index 100,000. This implementation would require 100,000 entries in the list pointing to null. This results in horrible runtime performance and memory usage.
LinkedList cannot have holes inside, so you can't have list [1,2,3,4] and then ll.add(10,10), so I think there's something wrong with your example. Use either Map or search for some other sparse array
It looks like you're trying to use a as a key, and you don't state whether you have items at every index i < a. If you run your code when a is beyond the end of the list (a > ll.size()), add(a, ...) will throw an IndexOutOfBoundsException.
And if you add an item at index a, the previous item at a will now be at a+1.
In this case it would be best to remove the item at a first (if it exists), then add "arbit" at a. Of course, the caveat above about a lying beyond the end of the list still applies here.
If the order of the results is important, a different approach could use a HashMap<Integer,String> to create your dataset, then extract the keys using keySet(), sort them in their natural order (they're numeric after all), and finally extract the values from the map while iterating over the sorted keys. Nasty, but it does what you want... Or create your own OrderedMap class that does the same...
Could you expand on why you need to use a LinkedList? Is ordering of the results important?
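The HashMap-plus-sorted-keys approach described above might be sketched like this (SortedValues is an illustrative name):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

final class SortedValues {
    // Returns the map's values ordered by ascending key.
    static List<String> valuesInKeyOrder(Map<Integer, String> map) {
        List<Integer> keys = new ArrayList<>(map.keySet());
        Collections.sort(keys); // natural (numeric) order
        List<String> out = new ArrayList<>(keys.size());
        for (Integer key : keys) {
            out.add(map.get(key));
        }
        return out;
    }
}
```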

How to get the last 25 elements of a SortedSet?

In Java I have a SortedSet that may have 100,000 elements. I would like to efficiently and elegantly get the last 25 elements. I'm a bit puzzled.
To get the first 25 I'd iterate and stop after 25 elements. But I don't know how to iterate in reverse order. Any ideas?
SortedSet<Integer> summaries = getSortedSet();
// what goes here :-(
You need a NavigableSet. Else you’ll have to do it inefficiently, iterating through the whole SortedSet and collecting elements into a Queue that you keep trimmed at 25 elements.
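For reference, the inefficient fallback with a queue trimmed to n elements could look like this sketch. It still walks the whole set, and the class and method names are assumptions.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.SortedSet;

final class LastN {
    // Returns the last n elements in ascending order, visiting every element once.
    static <T> List<T> lastN(SortedSet<T> set, int n) {
        Deque<T> window = new ArrayDeque<>(n + 1);
        for (T element : set) {
            window.addLast(element);
            if (window.size() > n) {
                window.removeFirst(); // keep only the most recent n
            }
        }
        return new ArrayList<>(window);
    }
}
```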
SortedSet<T> was designed assuming a very simple iteration model, forward only, thus finding the top n entries is easy but finding the last would require an expensive read through the iterator maintaining a window of the last n entries.
NavigableSet<T>, added in 1.6, solves this (and TreeSet, the only SortedSet implementation from 1.4, implements it, so it is likely to be a drop-in replacement for you).
NavigableSet<T> set = new TreeSet<T>();
// add elements
set.descendingIterator(); // iterate over the last n entries as needed
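Fleshed out a little, and assuming the set really is a NavigableSet such as TreeSet, collecting the last n entries might look like this (LastNNavigable is a made-up helper name):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NavigableSet;

final class LastNNavigable {
    // Returns the last n elements in ascending order, touching only n elements.
    static <T> List<T> lastN(NavigableSet<T> set, int n) {
        List<T> out = new ArrayList<>();
        Iterator<T> it = set.descendingIterator();
        while (it.hasNext() && out.size() < n) {
            out.add(it.next());
        }
        Collections.reverse(out); // collected in descending order, so flip it
        return out;
    }
}
```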
Reverse your sort and take the first 25 items. You can then reverse those which will be efficient as its only 25 items.
Bruce
A different data structure would be more appropriate for this operation.
This is neither elegant nor very efficient, but assuming the SortedSet is in ascending order you could get the last() item and remove it, storing it in another list, and repeat 25 times. You would then have to put these elements back again!
Throw the Set into a List and use subList(). I'm not sure how performant it is to create the List, so you'd have to run some tests. It'd certainly make the coding easy though.
List<Integer> f = new ArrayList<>(summaries);
List<Integer> lastTwentyFive = f.subList(Math.max(0, f.size() - 25), f.size());
https://github.com/geniot/indexed-tree-map
You may want to take a look at IndexedTreeMap in indexed-tree-map
Use exact(size-25) to get to the element at index without iteration.
I'm assuming that this is unlikely to be of any real-life use in your project, but it's worth noting that you might simply be able to have the list sorted in the opposite direction instead :)
