I'm trying to implement pagination within a Java/Spring Boot app on a Collection that is returned from a function. I want to get pages that are ordered by each element's "startTime". So if the user asks for page 2 and each page has 10 items, I would want to give the user the items ranked roughly 10-20 by most recent start time. I've tried two approaches so far:
a) Converting the returned collection into an array and then using IntStream on it to get elements from one index to another.
final exampleClass[] example = exampleCollection.toArray(new exampleClass[0]);
Collection<exampleClass> examplePage = IntStream.range(start, end)
        .mapToObj(i -> example[i])           // e.g. map each index back to its element
        .collect(Collectors.toList());
b) Converting the returned collection into an ArrayList and then using Pageable/PageRequest to create a new page from that ArrayList.
The problem is that these seem very inefficient, since I first have to convert the Collection to an ArrayList or array and then operate on it. I would like to know whether there are more efficient ways to turn collections into structures that I can iterate over by index so that I can implement pagination, or whether there are Spring functions for creating pages directly from a Collection; so far I haven't been able to find any.
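For reference, a rough sketch of approach (b), under the assumption that Spring Data's PageImpl/PageRequest are on the classpath; ExampleClass and getStartTime() are placeholders for the real element type, not code from the actual app:

import java.time.Instant;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;
import org.springframework.data.domain.Page;
import org.springframework.data.domain.PageImpl;
import org.springframework.data.domain.PageRequest;
import org.springframework.data.domain.Pageable;

public class InMemoryPager {

    // Placeholder element type standing in for the real class.
    public static class ExampleClass {
        private final Instant startTime;
        public ExampleClass(Instant startTime) { this.startTime = startTime; }
        public Instant getStartTime() { return startTime; }
    }

    public static Page<ExampleClass> page(Collection<ExampleClass> source, int pageNumber, int pageSize) {
        // Copy once, then sort by startTime descending (most recent first).
        List<ExampleClass> sorted = new ArrayList<>(source);
        sorted.sort(Comparator.comparing(ExampleClass::getStartTime).reversed());

        Pageable pageable = PageRequest.of(pageNumber, pageSize);
        int from = Math.min((int) pageable.getOffset(), sorted.size());
        int to = Math.min(from + pageSize, sorted.size());
        // subList is a view, so the page contents are not copied a second time.
        return new PageImpl<>(sorted.subList(from, to), pageable, sorted.size());
    }
}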
Also, is there any difference in runtime between
List<exampleClass> x = new ArrayList<>(exampleCollection);
and
List<exampleClass> x = (List<exampleClass>)exampleCollection;
I would like to know if there are more efficient ways to turn collections into structures that I can iterate on using indices
The only efficient way is to check with instanceof whether your collection is indeed a List. If it is, you can cast it and simply use e.g. subList(start, stop) to produce your paginated result.
Please note that accessing an element by its index might not be efficient either. In a LinkedList, accessing an element is an O(N) operation, so accessing M elements by index produces an O(M*N) operation, whereas using subList() is an O(M+N) operation.
There is a specialization of the List interface that marks lists that are fast to access by index: RandomAccess. You may or may not want to check for it to decide on the best strategy.
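A small sketch of that idea, under the assumption that the page boundaries have already been computed; the unchecked cast is deliberate here:

import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

public class PaginationHelper {

    @SuppressWarnings("unchecked")
    public static <T> List<T> page(Collection<T> source, int start, int end) {
        // Cast when the collection already is a List, copy only when it is not.
        List<T> asList = (source instanceof List)
                ? (List<T>) source
                : new ArrayList<>(source);
        int from = Math.min(start, asList.size());
        int to = Math.min(end, asList.size());
        return asList.subList(from, to); // a view, not a copy
    }
}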
Also, is there any difference in runtime between
List<exampleClass> x = new ArrayList<>(exampleCollection);
and
List<exampleClass> x = (List<exampleClass>)exampleCollection;
There absolutely is.
The second is a cast, and has virtually no cost. Just beware that x and exampleCollection are one and the same object (modifying one is the same as modifying the other). Obviously, a cast may fail with a ClassCastException if exampleCollection is not actually a list.
The first performs a copy, which has a cost in both CPU (traversal of exampleCollection) and memory (allocating an array of the collection's size). Both are pretty low for small collections, but your mileage may vary.
In this copy case, modifying one collection does nothing to the other; you get an independent copy.
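A minimal demo of the difference (assuming exampleCollection really is backed by an ArrayList):

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

public class CopyVsCastDemo {
    public static void main(String[] args) {
        Collection<String> exampleCollection = new ArrayList<>(Arrays.asList("a", "b"));

        // Copy: an independent list, costing O(n) time and memory.
        List<String> copy = new ArrayList<>(exampleCollection);
        copy.add("c");
        System.out.println(exampleCollection); // [a, b] -- unchanged

        // Cast: the very same object, essentially free, but it fails if the collection is not a List.
        List<String> alias = (List<String>) exampleCollection;
        alias.add("c");
        System.out.println(exampleCollection); // [a, b, c] -- changed through the alias
    }
}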
A collection does not have to have consistent iteration order: if you call iterator() twice you may get two different sequences. Converting the collection into an array or a list is the best solution.
As for the second question: This line of code:
List<exampleClass> x = new ArrayList<>(exampleCollection);
creates a new ArrayList, which is a shallow copy of the original collection. That is, it contains pointers to the same objects as the original collection, but the list itself is new and you could for example sort the list, or add or remove items, without affecting the original collection. Compared to:
List<exampleClass> x = (List<exampleClass>)exampleCollection;
Assuming exampleCollection is actually a List, this gives you a reference to that same list, seen through a new static type. If you make changes such as sorting, adding or removing items, you will see these modifications in exampleCollection. On the other hand, if exampleCollection is not a List, you will get a run-time error (ClassCastException).
I am learning Java now and I am learning about different kinds of collections; so far I have learned about LinkedList, ArrayList and Array[].
Now I've been introduced to the hash-based collections HashSet and HashMap, and I didn't quite understand why they are useful: the list of operations they support is quite limited, they are stored in what looks like a random order, and I need to override the equals and hashCode methods to make them work right with my class.
Now, what I don't understand is the benefit over the hassle of using these types instead of an ArrayList of a custom class.
I mean, what a Map does is connect 2 objects as 1, but wouldn't it just be better to create a class that contains these 2 objects as fields, with getters and setters to modify and use them?
If the benefit is that these hash collections can only contain 1 object of the same name, wouldn't it just be easier to make the ArrayList check that the item is not already there before adding it?
So far I have learned to choose between LinkedList, ArrayList and Array[] by the rule of "if it's really simple, use Array[]; if it's a bit more complex, use ArrayList (for example to hold a collection of a certain class); and if the list is dynamic, with a lot of objects inside that need to change order when one is removed or added in the middle, or you go back and forth within the list, then use LinkedList."
But I couldn't understand when to prefer HashMap or HashSet, and I would be really glad if you could explain it to me.
Let me help you out here...
Hashed collections are the most efficient for adding, searching and removing data, since they hash the key (in a HashMap) or the element (in a HashSet) to find where it belongs in a single step.
The concept of hashing is really simple. It is the process of representing an object as a number that can work as its id.
For example, if you have a string in Java like String name = "Jeremy";, and you print its hashcode: System.out.println(name.hashCode());, you will see a big number there (-2079637766) that was created from that string object's values (for a string, its characters); that way, the number can be used as an id for that object.
So hashed collections like the ones mentioned above use this number as an array index to find their elements in no time. But obviously the number is too big to be used as an index into a possibly small array, so they need to reduce it so that it fits within the range of the array's size. (HashMap and HashSet use arrays to store their elements.)
The operation they use to reduce that number is called hashing, and it is something like this: Math.abs(-2079637766 % arrayLength);.
It's not like that exactly, it's a bit more complex, but this is to simplify.
Let's say that arrayLength = 16;
The % operator will reduce that big number to a number smaller than 16, so that it can fit in the array.
That is why a hashed collection will not allow duplicates: if you try to add the same object or an equivalent one (like 2 strings with the same characters), it will produce the same hashcode and will overwrite whatever value is at the resulting index.
In your question, you mentioned that if you are worried about duplicate items in an ArrayList, we can just check whether the item is there before inserting it, so that we don't need a HashSet. But that is not a good idea, because if you call list.contains(elem); on an ArrayList, it needs to compare the elements one by one to see if the item is there. If you have 1 million elements in the ArrayList and you check for an element that is not there, the ArrayList iterates over all 1 million elements, which is not good. With a HashSet, it only hashes the object, goes directly to where it is supposed to be in the array and checks, doing it in just 1 step instead of 1 million. So you see how efficient a HashSet is compared to an ArrayList.
The same happens with a HashMap of size 1 million: it only takes a single step to check whether a key is there, not 1 million.
The same thing happens when you need to add, find or remove an element: the hashed collections do all of that in a single step (constant time, independent of the size of the map), whereas that varies for other structures.
That's why it is really efficient and widely used.
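A rough illustration of that difference (not a proper benchmark, and the exact numbers will vary per machine):

import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ContainsDemo {
    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>();
        Set<Integer> set = new HashSet<>();
        for (int i = 0; i < 1_000_000; i++) {
            list.add(i);
            set.add(i);
        }

        // The list has to scan element by element: O(n).
        long t0 = System.nanoTime();
        list.contains(-1);
        long listTime = System.nanoTime() - t0;

        // The set hashes the value and jumps straight to its bucket: O(1) on average.
        long t1 = System.nanoTime();
        set.contains(-1);
        long setTime = System.nanoTime() - t1;

        System.out.println("list: " + listTime + " ns, set: " + setTime + " ns");
    }
}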
Main Difference between an ArrayList and a LinkedList:
If you want to find the element at index 500 in an ArrayList of size 1000, you call list.get(500); and it does that in a single step, because an ArrayList is backed by an array, so with that 500 it goes directly to where the element is in the array.
But a LinkedList is not backed by an array; it is made of nodes pointing to each other. So it needs to walk linearly, counting from 0 one by one until it reaches 500, which is not efficient compared to the single step of the ArrayList.
But when you add or remove elements in an ArrayList, sometimes the array needs to be recreated so that more elements fit in it, increasing the overhead.
That doesn't happen with a LinkedList: no array has to be recreated, only the nodes have to be re-linked, which is done in a single step.
So an ArrayList is good when you won't be deleting or adding a lot of elements but will be reading a lot from it.
If you are going to add and remove a lot of elements, a linked list is better since it has less work to do for those operations.
Why do you need to implement the equals() and hashCode() methods for user-defined classes when you want to use those objects in HashMaps, and implement the Comparable interface when you want to use them with TreeMaps?
Based on what I mentioned earlier for HashMaps, it is possible for 2 different objects to produce the same hash. If that happens, Java will not overwrite or remove the previous one; it keeps them both at the same index. That is why you need to implement hashCode(): to make sure your objects don't have a really simple hash code that is easily duplicated.
And the reason it is recommended to override the equals() method is that if there is a collision (2 or more objects sharing the same hash in a HashMap), how do you tell them apart? By asking the equals() method of those objects whether they are the same. So if you ask the map whether it contains a certain key and it finds 3 elements at that index, it asks the equals() method of each of those elements whether it equals the key that was passed, and if so, it returns that one. If you don't override equals() properly and specify which fields you want to check for equality (like name, age, etc.), then unwanted overwrites will happen inside the HashMap and you will not like it.
If you create your own class, say Person, with properties like name, age, lastName and email, you can use those properties in the equals() method, and if 2 different objects are passed but have the same values in the properties you selected for equality, you return true to indicate they are the same, or false otherwise. It is like the String class: if you do s1.equals(s2); with s1 = new String("John"); and s2 = new String("John");, even though they are different objects in the Java heap, the implementation of String.equals uses the characters to determine whether the objects are equal, and it returns true in this example.
To use a TreeMap with a user-defined class, you need to implement the Comparable interface, since the TreeMap compares and sorts the objects based on some property; you need to specify by which property your objects will be sorted. Will they be sorted by age? By name? By id? Or by any other property you like. When you implement the Comparable interface and override the compareTo(UserDefinedClass o) method, you do your logic and return a positive number if the current object is greater than the o passed in, 0 if they are the same, and a negative number if the current object is smaller. That way, the TreeMap knows how to sort them, based on the number returned.
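A condensed sketch putting the three methods together; the Person fields and the choice to sort by age are only examples:

import java.util.Objects;

public class Person implements Comparable<Person> {
    private final String name;
    private final int age;

    public Person(String name, int age) {
        this.name = name;
        this.age = age;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Person)) return false;
        Person other = (Person) o;
        // Two Person objects with the same name and age count as the same key.
        return age == other.age && Objects.equals(name, other.name);
    }

    @Override
    public int hashCode() {
        // Must be consistent with equals(): equal objects must produce equal hash codes.
        return Objects.hash(name, age);
    }

    @Override
    public int compareTo(Person o) {
        // Ordering used by TreeMap/TreeSet: here, youngest first.
        return Integer.compare(this.age, o.age);
    }
}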
First, HashSet. With a HashSet you can easily check whether it contains a given element. Say you have the set of people in your class and you want to ask whether a certain guy is in it. You could make an ArrayList of strings, but then, to ask whether a guy is in your class, you have to iterate through the whole list until you find him, which might be too slow for longer lists. If you use a HashSet instead, the operation is much faster: you calculate the hash of the searched string and go directly to that hash bucket, so you don't need to walk through so many elements to answer your question. Well, you could also work around this to make the ArrayList faster for this purpose, but a HashSet comes prepared for it.
And now HashMap. Imagine you also want to store a score for each person. Then you can use a HashMap: you enter the name and you get the score in a short time, without iterating through the whole data structure.
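Roughly like this (names and scores made up, of course):

import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class ClassRoster {
    public static void main(String[] args) {
        // HashSet: "is this person in the class?" in roughly constant time.
        Set<String> classmates = new HashSet<>();
        classmates.add("Alice");
        classmates.add("Bob");
        System.out.println(classmates.contains("Bob"));  // true

        // HashMap: "what is this person's score?" also in roughly constant time.
        Map<String, Integer> scores = new HashMap<>();
        scores.put("Alice", 92);
        scores.put("Bob", 78);
        System.out.println(scores.get("Alice"));         // 92
    }
}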
Does it make sense?
Concerning your question:
"But I couldn't understand when to prefer HashMap or HashSet, and I
would be really glad if you could explain it to me"
HashMap implements the Map interface, to be used for mapping a key (K) to a value (V) in constant time, where order doesn't matter, so you can put and retrieve data efficiently if you know the key.
HashSet implements the Set interface, but internally uses a HashMap. Its role is to be used as a set, meaning you're not supposed to retrieve an element; you just check whether it is in the set or not (mostly).
In a HashMap you can have identical values (under different keys), while you can't have identical elements in a Set (because that's a property of a set).
Concerning this question:
If the benefit is that these hash collections can only contain 1 object of the same name, wouldn't it just be easier to make the ArrayList check that the item is not already there before adding it?
When dealing with collections, you may base your choice of a particular one on the data representation, but also on the way you want to access and store the data: how do you access it? Do you need to sort it? Because each implementation may have a different time complexity (https://en.wikipedia.org/wiki/Time_complexity), this becomes important.
Using the doc,
For ArrayList:
The add operation runs in amortized constant time, that is, adding n elements requires O(n) time. All of the other operations run in linear time (roughly speaking).
For HashMap:
This implementation provides constant-time performance for the basic operations (get and put), assuming the hash function disperses the elements properly among the buckets. Iteration over collection views requires time proportional to the "capacity" of the HashMap instance (the number of buckets) plus its size (the number of key-value mappings). Thus, it's very important not to set the initial capacity too high (or the load factor too low) if iteration performance is important.
So it's about the time complexity.
You may even choose a more atypical collection for certain problems :).
This has little to do with Java specifically, and the choice depends mostly on performance requirements, but there's a fundamental difference that must be highlighted. Conceptually, Lists are collections that keep insertion order and may contain duplicates, while Sets are more like bags of items with no specific order and no duplicates. Of course, different implementations may find a way around this (like a TreeSet, which keeps its elements sorted).
First, let's check the difference between ArrayList and LinkedList. A linked list is a chain of nodes, where each node contains a value and links to the next and previous nodes. This makes inserting an element into a linked list a matter of appending a node at the end, which is a quick operation since the memory does not have to be contiguous, as long as each node keeps a reference to the next one. On the other hand, accessing a specific element requires traversing the list until you find it.
An array list, as the name implies, wraps an array. Accessing elements of an array by index is direct access, but inserting an element may imply resizing the array to make room for it, since the memory it occupies is contiguous, making writes a bit heavier in this case.
A HashMap works like a dictionary, where for each key there's a value. The behavior of an insertion mostly depends on how the hashCode and equals functions of the object used as the key are implemented. If the hashCode of two keys is the same, there's a hash collision, so equals is used to work out whether it's the same key or not. If equals says it is, then it's the same key, so the value is replaced; if not, the new entry is added to the collection. Accessing and writing values depends mostly on calculating the hash of the key followed by direct access to the value, making both operations really quick, O(1) on average.
A set is pretty much like a hash map without the "values" part; thus it follows the same rules regarding the implementation of the hashCode and equals operations for the added value.
It might be handy to study a bit about the Big-O notation and complexity of algorithms. If you are starting with Java, I'd strongly recommend the book Effective Java, by Joshua Bloch.
Hope it helps you dig further.
Let's say we have an array list of objects ObjArray.
What is the most efficient way for that object to locate itself within the list, and remove itself from the list?
The way I tend to use is this:
Every object in the list has an ID that corresponds to its place in the list.
When object.remove() is called, the object simply calls ObjArray.remove(ID).
ObjArray is then traversed from index ID upwards, calling ObjArray.get(i).ID--. This sets all objects above the removed object to the right ID.
The other method is of course simply scanning ObjArray until an object match is found.
So, is there a better way of doing this? ArrayList is not necessary, if a HashMap or LinkedList can be used to do things better, that's just as good.
More information as requested.
Objects contain information about where they need to be drawn on screen and what image is to be drawn. The paint function of the main JPanel is called by a timer. The paint function loops through the list ObjArray and calls each object's draw function (Obj.draw(Graphics g)).
Objects may be added or removed by clicking.
When an object is removed, it needs to remove itself from the ObjArray list. I have stated the two methods that I can think of in the first part.
I would like to know if anyone knows of a more efficient way of doing this.
In short:
What's the most efficient way for an item to find/know its position in a list
Efficient in terms of code:
list.remove(this);
The object must be given a reference to the list of course.
Efficient in terms of performance would require a small redesign, probably involving a Map, but is beyond the scope of this question.
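For what it's worth, a tiny sketch of the code-efficient version; the class and field names are made up:

import java.util.ArrayList;
import java.util.List;

public class Sprite {
    private final List<Sprite> owner; // the list this object lives in

    public Sprite(List<Sprite> owner) {
        this.owner = owner;
        owner.add(this);
    }

    public void remove() {
        // A linear scan under the hood, but no manual ID bookkeeping is needed.
        owner.remove(this);
    }

    public static void main(String[] args) {
        List<Sprite> objArray = new ArrayList<>();
        Sprite s = new Sprite(objArray);
        s.remove();
        System.out.println(objArray.size()); // 0
    }
}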
Use the List's indexOf to get the ID. Drop your idea of an ID for each object.
I have a document with 15,000 items. Each item contains 6 variables (strings and integers). I have to copy all of these into some sort of two-dimensional array; what is the best way to do it?
Here are my ideas so far:
Make a GIANT 2D array or array list the same way you make any other array.
Pros: Simple. Cons: Messy (would create a class just for this), huge amount of code, if I make a mistake it will be impossible to find where it is, and all variables would have to be strings, even the ints, which will make my job harder down the road.
Make a new class with a super that takes in all the variables I need.
Create each item as a new instance of this class.
Add all of the instances to a 2D array or array list.
Pros: Simple, less messy, easier to find a mistake, not all the variables need to be strings, which makes it much easier later when I don't have to convert strings to ints, and a little less typing for me. Cons: Slower? Will instances make my array compile more slowly? And will they make the overall array slow when I'm searching for items in it?
These ideas don't seem all that great :( and before I start the three-week, five-hour-a-day process of adding these items, I would like to find the best way so I won't have to do it again... Suggestions on my current ideas, or any new ideas?
Data example:
0: 100, west, sports, 10.89, MA, united
*not actual data
Your second option seems good. You can create a class containing all the variables and create an array of that class.
You may use the following:
1. Read the document using a BufferedReader, so that memory issues will not occur.
2. Create a class containing your items.
3. Create a List of the type you need and store the elements in it (see the sketch below).
Let me know in case you face further problems.
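A rough sketch of those three steps; the Item field names and the assumption that each line holds the six comma-separated values are mine, not from your document:

import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class ItemLoader {

    static class Item {
        final int id;
        final String direction;
        final String category;
        final double price;
        final String state;
        final String group;

        Item(int id, String direction, String category, double price, String state, String group) {
            this.id = id;
            this.direction = direction;
            this.category = category;
            this.price = price;
            this.state = state;
            this.group = group;
        }
    }

    static List<Item> load(String path) throws IOException {
        List<Item> items = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(Paths.get(path))) {
            String line;
            while ((line = reader.readLine()) != null) {
                String[] parts = line.split(",\\s*");
                items.add(new Item(Integer.parseInt(parts[0]), parts[1], parts[2],
                        Double.parseDouble(parts[3]), parts[4], parts[5]));
            }
        }
        return items;
    }
}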
If you already have the document with the 15,000 × 6 items, in my experience you would be better served writing a program that uses a regex to parse it and outputs the contents of the Java array in the format you want. With such a parsing program in place, it will also be very easy for you to change the format of the 15,000 lines if you want to generate them differently.
As to the final format, I would use an ArrayList of your bean. By your text thus far, you don't necessarily need a super that takes in the variables, unless you need to have differentiated subtypes.
You'll probably run out of space for static data in a single class. So what I do is break a big class like that up into a file with a bunch of nested inner classes that each hold a 64K (or smaller) part of the data as static final arrays, and then I merge them together in the main class in that file.
I have this in a class of many names to fix:
class FixName {
    static String[][] testStrings;
    static int add(String[][] aTestStrings, int lastIndex) {
        for (int i = 0; i < aTestStrings.length; ++i) {
            testStrings[++lastIndex] = aTestStrings[i];
        }
        return lastIndex;
    }
    static {
        testStrings = new String[
                FixName1.testStrings.length
                + FixName2.testStrings.length
                + FixName3.testStrings.length
                + FixName4.testStrings.length
                /**/ ][];
        int lastIndex = -1;
        lastIndex = add(FixName1.testStrings, lastIndex);
        lastIndex = add(FixName2.testStrings, lastIndex);
        lastIndex = add(FixName3.testStrings, lastIndex);
        lastIndex = add(FixName4.testStrings, lastIndex);
        /**/
    }
}
class FixName1 {
    static String[][] testStrings = {
        {"key1", "name1", "other1"},
        {"key2", "name2", "other2"},
        //...
        {"keyN", "nameN", "otherN"}
    };
}
Create a wrapper class (Item) if you have not already (your question does not state it clearly).
If the number of elements is fixed, i.e. the 15,000 in your question, use an array; otherwise use a linked list (write your own or use the one from the Collections framework).
If there are other operations that you need to support on this collection of items, such as further inserts or searches in particular, use a balanced binary search tree.
With my understanding of the question, I would say a linked list is the better option.
If the items have a unique property (name, id, row number or any other unique identifier), I recommend using a HashMap with a wrapper around the item. If you are going to do any kind of lookup on your data (find the item with id x and do operation y), this is the fastest option and is also very clean: it just requires a wrapper, and you can use a data structure that is already implemented.
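A small sketch of that id-keyed lookup; the Item wrapper and its fields are hypothetical:

import java.util.HashMap;
import java.util.Map;

public class ItemIndex {

    static class Item {
        final String id;
        final String name;

        Item(String id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    public static void main(String[] args) {
        Map<String, Item> byId = new HashMap<>();
        Item item = new Item("x42", "example");
        byId.put(item.id, item);

        // "Find the item with id x and do operation y" is a single hash lookup.
        Item found = byId.get("x42");
        System.out.println(found.name); // example
    }
}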
If you are not doing any lookups and need to process the items en masse in no specific order, I would recommend an ArrayList; it is very optimized, as it is the most commonly used collection in Java. You would still need the wrapper to keep things clean, and a list is far cleaner than an array at almost no extra cost.
There is little point in making your own collection, as your needs are not extremely specific; just use one that is already implemented and never worry about your code breaking. If it does, it is Oracle's fault ;)
I've got a problem with ArrayList. I need it to store a result. Because I want to start with element n, I tried to give the ArrayList a capacity with ensureCapacity(n+1) so I could use set(n, x), but I get an IndexOutOfBoundsException.
I tried calling add(x) n times before the use of set, and this works.
So I'd like to know why it doesn't work my way, and how to solve this, because calling add(x) n times isn't good style ;-)
When you change the capacity of an ArrayList it doesn't create any elements, it just reserves memory where there could be elements. You can check the size before and after adjusting the capacity and you will see that it does not change.
The purpose of changing the capacity is if you know in advance how many elements you will have, then you can avoid unnecessary repeated resizing as you add new elements, and you can avoid memory wastage from excess unused capacity.
If you don't like using your own loop and the list's add method directly, there is another way. Create your ArrayList with the number of elements you want directly, like this:
final int MAX_ELEMENTS = 1000;
List<Integer> myList = new ArrayList<Integer>(
Collections.<Integer>nCopies(MAX_ELEMENTS, null));
Or, if you already have a list that you want to expand the size by n elements:
myList.addAll(Collections.<Integer>nCopies(n, null));
(Note, I assumed here that the list would be holding Integer objects, but you can change this to your custom type. If you are working with raw/pre-Java 5 types then just drop the generic declarations.)
As for your actual question: capacity != contents. An ArrayList internally has both a physical array and a count of what is actually in it. Increasing the capacity, changes the internal array so it can hold that many elements, however, the count does not change. You need to add elements to increase that count.
On the other hand, if you are just trying to set specific elements and know the maximum number that you want to use, why not use an array directly? If you then need to pass this array to an API that takes Lists, use Arrays.asList. Other code could still change the contents of your backing array, but it would not be able to increase its size or capacity.
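A small sketch of that array-plus-Arrays.asList approach, assuming the element type is Double and the maximum size is known up front:

import java.util.Arrays;
import java.util.List;

public class ArrayBackedListDemo {
    public static void main(String[] args) {
        Double[] values = new Double[10];  // assumed maximum number of elements

        // A fixed-size List view backed by the array: set()/get() work, add()/remove() throw.
        List<Double> view = Arrays.asList(values);
        view.set(7, 3.14);                 // writes through to values[7]
        System.out.println(values[7]);     // 3.14
        System.out.println(view.size());   // always 10, however many slots are actually set
    }
}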
As others have answered, ensureCapacity() is just related to performance and is not something the common user frequently needs.
From Bruce Eckel's Thinking in Java book:
In a private message, Joshua Bloch wrote: "... I believe that we erred by allowing implementation details (such as hash table size and load factor) into our APIs. The client should perhaps tell us the maximum expected size of a collection, and we should take it from there. Clients can easily do more harm than good by choosing values for these parameters. As an extreme example, consider Vector's capacityIncrement. No one should ever set this, and we shouldn't have provided it. If you set it to any non-zero value, the asymptotic cost of a sequence of appends goes from linear to quadratic. In other words, it destroys your performance. Over time, we're beginning to wise up about this sort of thing. If you look at IdentityHashMap, you'll see that it has no low-level tuning parameters"
You are getting this exception because ensureCapacity() only makes sure that there is enough memory allocated for adding objects to the ArrayList; I believe this is for the case where you want to add multiple objects at once without having to reallocate the backing array.
To do what you want, you would have to initialize the ArrayList with null elements first...
int n = 10; // capacity required
ArrayList<Object> foo = new ArrayList<>();
for (int i = 0; i <= n; i++) {
    foo.add(null);
}
Then you have objects in the list that you can reference via index, and you won't receive the exception.
Perhaps you should rethink the choice of using List<Double>. It might be that a Map<Integer,Double> would be more appropriate if elements are to be added in an odd order.
Whether this is appropriate depends on knowledge about your usage that I don't have at the moment though.
Is the data structure eventually going to be completely filled, or is the data sparse?
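If the data is sparse, a minimal sketch of that Map-based alternative (element and key types assumed):

import java.util.HashMap;
import java.util.Map;

public class SparseValues {
    public static void main(String[] args) {
        // Sparse "index -> value" storage: no need to pad with nulls up to index n.
        Map<Integer, Double> values = new HashMap<>();
        values.put(1000, 3.14); // works even though indices 0..999 were never set

        System.out.println(values.get(1000));            // 3.14
        System.out.println(values.getOrDefault(5, 0.0)); // 0.0 for an index that was never set
    }
}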
What other people said about ensureCapacity() applies here...
You could write a class like DynamicArrayList extends ArrayList, then just override add(n, x) to apply the for-loop add(null) logic described above.
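A minimal sketch of that suggestion; the class name and the pad-with-null behaviour come from the idea above, not from any standard library:

import java.util.ArrayList;

public class DynamicArrayList<E> extends ArrayList<E> {

    @Override
    public void add(int index, E element) {
        // Pad with nulls so that index is always a valid position.
        while (size() < index) {
            add(null);
        }
        super.add(index, element);
    }

    public static void main(String[] args) {
        DynamicArrayList<String> list = new DynamicArrayList<>();
        list.add(5, "x");         // works: indices 0..4 are filled with null first
        System.out.println(list); // [null, null, null, null, null, x]
    }
}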
ensureCapacity() has another purpose. It should be used in cases where you only get to know the required size of the list after it has been constructed. If you know the size before it is constructed, just pass it as an argument to the constructor.
In the former case, use ensureCapacity() to avoid multiple copies of the backing array on each addition. However, using that method leaves the structure in a seemingly inconsistent state:
the size of the backing array is increased
the size field on the ArrayList isn't.
This, however, is normal, since capacity != size.
Use the add(..) method, which is the only one that increases the size field:
ArrayList<Object> list = new ArrayList<>();
list.ensureCapacity(5); // this can also be done by constructing new ArrayList<>(5)
for (int i = 0; i < 4; i++) {
    list.add(null); // add() is what actually increases the size
}
list.add(yourObject); // ends up at index 4