subList view based on value of elements

I know that the Collection framework allows for the creation of "views", that is lightweight "wrappers" for a Collection object.
What I am especially interested in is, given a List, to return a view for only a subset of elements matching some conditions.
Basically, what I want to emulate is the functionality of the subList() method, only not based on start and end indexes, but on some parameters of the elements.
The first approach I thought about was simply to create another List, go through the first List and check each element...
While this wouldn't actually copy any MyObject, only their references, I would still create a new List object, with its overhead. Isn't that right?
Is there any lightweight method of doing what I need?
N.B. My original List is a really big collection...
Thank you all

You can do this easily in Java using the Guava collections: Collections2 has a filter method (http://docs.guava-libraries.googlecode.com/git-history/v11.0.1/javadoc/index.html).
You can also do this in Groovy using the findAll method, for example:
myList.findAll { it.contains("aValue") }
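Groovy's findAll creates a new collection under the hood, so it is just doing the work of iterating over the elements and checking them for you. Guava's Collections2.filter instead returns a live view of the original collection and applies the condition while iterating, so nothing is copied, although operations such as size() then have to walk the whole source. Either way, the overhead of creating a new list is minimal (it's just instantiating one new object).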
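If you would rather not pull in a library, a filtered view can be sketched in plain Java. The FilteredView class below is a hypothetical helper, not part of the JDK or of Guava. It assumes Java 8 for Predicate, and it is read-only. It holds a reference to the source and tests elements only while iterating, so later changes to the source show through.

```java
import java.util.AbstractCollection;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

// Hypothetical helper: a read-only, lazily filtered view over a collection.
class FilteredView<T> extends AbstractCollection<T> {
    private final Collection<T> source;
    private final Predicate<? super T> condition;

    FilteredView(Collection<T> source, Predicate<? super T> condition) {
        this.source = source;
        this.condition = condition;
    }

    @Override
    public Iterator<T> iterator() {
        final Iterator<T> it = source.iterator();
        return new Iterator<T>() {
            private T next;
            private boolean hasNext;

            { advance(); } // position on the first matching element

            // Skips source elements until one satisfies the condition.
            private void advance() {
                hasNext = false;
                while (it.hasNext()) {
                    T candidate = it.next();
                    if (condition.test(candidate)) {
                        next = candidate;
                        hasNext = true;
                        return;
                    }
                }
            }

            @Override
            public boolean hasNext() {
                return hasNext;
            }

            @Override
            public T next() {
                if (!hasNext) throw new NoSuchElementException();
                T result = next;
                advance();
                return result;
            }
        };
    }

    // Nothing is cached, so every call walks the source: size() is O(n).
    @Override
    public int size() {
        int count = 0;
        for (T ignored : this) count++;
        return count;
    }
}

class FilteredViewDemo {
    public static void main(String[] args) {
        List<String> names = new ArrayList<>(Arrays.asList("alpha", "beta", "avocado"));
        Collection<String> startsWithA = new FilteredView<>(names, s -> s.startsWith("a"));
        System.out.println(startsWithA);
        names.add("apple"); // the view sees the new element on the next iteration
        System.out.println(startsWithA.size());
    }
}
```

The trade-off is the usual one for views: creation is cheap and nothing is copied, but every traversal re-tests every element of the source, and there is no indexed access.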

I would anyways create a new List object, with its overhead
I don't understand what your concern is here. Looking at the source of the ArrayList class, even the subList(int fromIndex, int toIndex) method creates a new instance of an inner class (which implements List). That is essentially what you will be doing in your method, i.e. creating a new List instance and copying the references of your matching elements into it. In terms of object-creation overhead, that custom method will be more or less the same as subList; the extra cost is the single pass over the list to test each element.

Related

What is the difference in time & space complexity between declaring an ArrayList object and a List object?

I know that an instance of ArrayList can be declared in the two following ways:
ArrayList<String> list = new ArrayList<String>();
and
List<String> list = new ArrayList<String>();
I know that using the latter declaration provides the flexibility of changing the implementation from one List implementation to another (e.g. from ArrayList to LinkedList).
But, what is the difference in the time and space complexity in the two? Someone told me the former declaration will ultimately make the heap memory run out. Why does this happen?
Edit: When performing basic operations like add, remove and contains, does the performance differ between the two declarations?

The space complexity of your data structure and the time complexity of different operations on it are all implementation specific. For both of the options you listed above, you're using the same data structure implementation i.e. ArrayList<String>. What type you declare them as on the left side of the equal sign doesn't affect complexity. As you said, being more general with type on the left of the equal sign just allows for swapping out of implementations.

What determines the behaviour of an object is its actual class/implementation. In your example, the list is still an ArrayList, so the behaviour won't change.
Using a List<> declaration instead of an ArrayList<> means that you will only use the methods made visible by the List interface and that if later you need another type of list, it will be easy to change it (you just change the call to new). This is why we often prefer it.
Example: you first use an ArrayList but then find out that you often need to delete elements in the middle of the list. You would thus consider switching to a LinkedList. If you used the List interface everywhere (in getter/setter, etc.), then the only change in your code will be:
List<String> list = new LinkedList<>();
but if you used ArrayList, then you will need to refactor your getter/setter signatures, the potential public methods, etc.
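As a small illustration of the points above (the class and method names are made up for the example), the sketch below shows that both declarations produce objects of the same runtime class, and that a method written against List accepts a LinkedList without any change.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class DeclaredTypeDemo {
    // Depends only on the List interface, so any implementation can be passed in.
    static int totalLength(List<String> words) {
        int total = 0;
        for (String w : words) total += w.length();
        return total;
    }

    public static void main(String[] args) {
        ArrayList<String> a = new ArrayList<String>();
        List<String> b = new ArrayList<String>();
        // The declared type differs, the runtime class does not.
        System.out.println(a.getClass() == b.getClass());

        // Switching implementation: only the constructor call changes.
        List<String> c = new LinkedList<String>();
        c.add("foo");
        c.add("quux");
        System.out.println(totalLength(c));
    }
}
```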

Why don't the unmodifiable methods from Collections class, create collections with new elements?

Suppose there is this code:
List<String> modifiableList = new ArrayList<String>(Arrays.asList("1","2","3"));
List<String> unmodifiableList = Collections.unmodifiableList(modifiableList);
System.out.println(unmodifiableList);
modifiableList.remove("1");
modifiableList.remove("3");
System.out.println(unmodifiableList);
it prints
[1, 2, 3]
[2]
If I change the second line to
List<String> unmodifiableList = Collections.unmodifiableList(
new ArrayList<String>(modifiableList));
it works as expected.
The question is: why doesn't the UnmodifiableList inner class of Collections (and all the other unmodifiable classes there) create a fresh copy of the original list, as the constructor of ArrayList does, for example, in order to really make it unmodifiable?
Edit: I understand what the code does; my question is why it was implemented this way. Why doesn't the constructor of UnmodifiableList (an inner class of Collections) behave like the constructor of ArrayList and create a fresh copy of the underlying array? Why does a modifiable collection (ArrayList) copy the whole content while an unmodifiable collection doesn't?

Well the purpose of the methods is to create an unmodifiable view on an existing collection. That's the documented behaviour, and in many cases it's exactly what you want - it's much more efficient than copying all the data, for example... or you want to hand callers collections which will reflect any changes you want to make, but without allowing them to make changes.
If you want an immutable copy of the data (or at least the references...) then just create a copy and then create an immutable view over the top of it, just as you are doing.
So basically, you can easily create either a view or a copy-and-view, based on Collections.unmodifiable* themselves performing the view operation. So we have two orthogonal operations:
Create a copy (e.g. via the constructor)
Create a view (via Collections.unmodifiable*)
Those operations can be composed very easily. If Collections.unmodifiable* actually performed a "copy only" then we'd require other operations in order to just make a view. If you accept that both options are useful in different situations, making them composable gives lots of flexibility.
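To make the composition concrete, here is a minimal sketch; the two helper names are invented for the example. One helper only wraps, the other copies and then wraps, and both use nothing beyond java.util.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class ViewVersusCopyDemo {
    // View only: later changes to the source show through.
    static <T> List<T> readOnlyView(List<T> source) {
        return Collections.unmodifiableList(source);
    }

    // Copy, then view: a snapshot that ignores later changes to the source.
    static <T> List<T> readOnlySnapshot(List<T> source) {
        return Collections.unmodifiableList(new ArrayList<T>(source));
    }

    public static void main(String[] args) {
        List<String> source = new ArrayList<String>(Arrays.asList("1", "2", "3"));
        List<String> view = readOnlyView(source);
        List<String> snapshot = readOnlySnapshot(source);

        source.remove("1");
        System.out.println(view);     // reflects the removal
        System.out.println(snapshot); // still holds all three elements

        try {
            view.add("4");
        } catch (UnsupportedOperationException e) {
            System.out.println("the view rejects modification");
        }
    }
}
```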

The reason is simple efficiency. Copying all of the elements of a collection could be very time-consuming, particularly if the collection being wrapped has some sort of magic going on like JPA lazy-loading, and requires extra memory. Wrapping the underlying collection as-is is a trivial operation that imposes no additional overhead. In the case where the developer really does want a separate copy (unmodifiable or not), it's very easy to create it explicitly. (I tend to use Guava Immutable* for this.)

Please note that unmodifiableList returns an "unmodifiable view" of the provided list. So the list itself stays as it is (it can still be modified); only its "unmodifiable view" is unmodifiable. You can think of it like SQL tables and views: you can run DML scripts on the tables, and the changes will be reflected in the related views. As for ArrayList, it is backed by... an array, so copying the elements from the provided source list (which doesn't actually have to be backed by an array) is a feature of its implementation. Does that answer your question?

Enforcing order in Java Iterator

I am providing a library for a different team. One of the methods I provide receives as argument an Iterator. I would like to somehow require a certain order of iteration. Is there any way to do this in code by extending Iterator?

Not directly. An iterator is designed to give you one item at a time, so that nobody has to store and pass around a whole list of values in memory, which can be infeasible at times.
Unless you know more about how the values are generated and which bounds apply to their ordering, the only way is to get all the elements from the iterator, store them in some list/vector/database, sort them, and return another iterator over the sorted list.
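The drain-and-sort approach can be sketched as follows. The class and method names are made up for the example, an in-memory list stands in for "some list/vector/database", and List.sort assumes Java 8.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;

class SortedIteration {
    // Drains the input into a buffer, sorts it, and iterates over the result.
    // Needs O(n) extra memory, so it is unsuitable for unbounded iterators.
    static <T> Iterator<T> sorted(Iterator<T> input, Comparator<? super T> order) {
        List<T> buffer = new ArrayList<T>();
        while (input.hasNext()) {
            buffer.add(input.next());
        }
        buffer.sort(order);
        return buffer.iterator();
    }

    public static void main(String[] args) {
        Iterator<Integer> unsorted = Arrays.asList(3, 1, 2).iterator();
        Iterator<Integer> it = sorted(unsorted, Comparator.<Integer>naturalOrder());
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
```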

You're being passed, as an argument, an instance of some concrete class which implements Iterator. You can't extend its class because its class is decided upon instantiation, which is done by the code that calls your method.
If you want to fail fast when the required order is not respected, try Guava's Ordering.isOrdered() method.
NB This will only work if your argument is an Iterable, rather than Iterator. You need it to be Iterable (an interface which allows you to retrieve the Iterator) so that you can iterate twice: once to check order, once to process.
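If you only have an Iterator and cannot iterate twice, a different technique from the two-pass check above is to wrap the iterator and verify the order as elements pass through. The OrderCheckingIterator below is a hypothetical wrapper written for this answer, not a Guava or JDK class. Note the trade-off: a violation is detected only when the offending element is reached, so some elements may already have been processed by then.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;

// Hypothetical wrapper: passes elements through and fails on the first one
// that sorts before its predecessor.
class OrderCheckingIterator<T> implements Iterator<T> {
    private final Iterator<T> delegate;
    private final Comparator<? super T> order;
    private T previous;
    private boolean hasPrevious;

    OrderCheckingIterator(Iterator<T> delegate, Comparator<? super T> order) {
        this.delegate = delegate;
        this.order = order;
    }

    @Override
    public boolean hasNext() {
        return delegate.hasNext();
    }

    @Override
    public T next() {
        T current = delegate.next();
        if (hasPrevious && order.compare(previous, current) > 0) {
            throw new IllegalStateException(
                    "Out of order: " + current + " came after " + previous);
        }
        previous = current;
        hasPrevious = true;
        return current;
    }
}

class OrderCheckingDemo {
    public static void main(String[] args) {
        Iterator<Integer> checked = new OrderCheckingIterator<Integer>(
                Arrays.asList(1, 3, 2).iterator(), Comparator.<Integer>naturalOrder());
        try {
            while (checked.hasNext()) {
                System.out.println(checked.next());
            }
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```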

java-how to manage multiple lists of data- in a single variable- with easy access to each list

I have a scenario where I have to work with multiple lists of data in a java app...
Now each list can have any number of elements in it... Also, the number of such lists is also not known initially...
Which approach will suit my scenario best? I can think of an ArrayList of List, a List of List, a List of ArrayList, etc. (i.e. combinations of ArrayList + List / ArrayList + ArrayList / List + List). What I would like to know is:
(1) Which of the above (or your own solution) will be easiest to manage, i.e. to store and fetch data?
(2) Which of the above will use the least amount of memory?

I would declare my variable as:
List<List<DataType>> lists = new ArrayList<List<DataType>>();
There is a slight time penalty in accessing list methods through a variable of an interface type, but this, I think, is more than balanced by the flexibility you have of changing the type as you see fit. (For instance, if you decided to make lists immutable, you could do that through one of the methods in java.util.Collections, but not if you had declared it to be an ArrayList<List<DataType>>.)
Note that lists will have to hold instances of some concrete class that implements List<DataType>, since (as others have noted) List is an interface, not a class.
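A short sketch of how such a variable might be used, with String standing in for DataType and an invented helper method. The number of inner lists and their sizes are both free to grow at run time.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class ListOfListsDemo {
    // Appends a new, empty inner list and returns its index in the outer list.
    static int addList(List<List<String>> lists) {
        lists.add(new ArrayList<String>());
        return lists.size() - 1;
    }

    public static void main(String[] args) {
        List<List<String>> lists = new ArrayList<List<String>>();
        int first = addList(lists);
        int second = addList(lists);

        lists.get(first).add("a");
        lists.get(second).add("b");
        lists.get(second).add("c");
        System.out.println(lists.get(second).get(1));

        // Because the variable is typed as List, it can also hold a read-only
        // wrapper. Only the outer list is protected; inner lists stay mutable.
        lists = Collections.unmodifiableList(lists);
        System.out.println(lists.size());
    }
}
```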

List is an interface. ArrayList is one implementation of List.
When you construct a List you must choose a specific concrete type (e.g. ArrayList). When you use the list it is better to program against the interface if possible. This prevents tight coupling between your code and the specific List implementation, allowing you to more easily change to another List implementation later if you wish.

If you know a way to identify which list you will be dealing with, use a map of lists.
Map<String, List<?>> lists = new HashMap<String, List<?>>();
This way you avoid having to loop through the outer elements to reach the list you want: a hash map lookup by key is faster on average than iterating until you find it.
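A minimal sketch of the map-of-lists idea, assuming String keys and String elements and Java 8 for computeIfAbsent. The addTo helper is invented for the example; it creates the inner list the first time a key is seen.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ListsByKeyDemo {
    // Adds a value to the list stored under key, creating the list on first use.
    static void addTo(Map<String, List<String>> listsByKey, String key, String value) {
        listsByKey.computeIfAbsent(key, k -> new ArrayList<String>()).add(value);
    }

    public static void main(String[] args) {
        Map<String, List<String>> listsByKey = new HashMap<String, List<String>>();
        addTo(listsByKey, "fruit", "apple");
        addTo(listsByKey, "fruit", "pear");
        addTo(listsByKey, "veg", "leek");

        // Direct lookup by key, no loop over the other lists.
        System.out.println(listsByKey.get("fruit"));
    }
}
```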

How to initialize a dynamic array in java?

If I have a class that needs to return an array of strings of variable dimension (and that dimension could only be determined upon running some method of the class), how do I declare the dynamic array in my class' constructor?
If the question wasn't clear enough,
in php we could simply declare an array of strings as $my_string_array = array();
and add elements to it by $my_string_array[] = "New value";
What is the above code equivalent then in java?

You will want to look into the java.util package, specifically the ArrayList class. It has methods such as .add(), .remove(), .indexOf(), .contains(), .toArray(), and more.
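For the case in the question, a class that collects strings and returns them as an array, a sketch might look like this. The class name is made up; the ArrayList grows as needed and is converted to a String[] only when the caller asks for one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class DynamicStrings {
    // Created empty in the field initializer; no size needs to be known up front.
    private final List<String> values = new ArrayList<String>();

    void add(String value) {
        values.add(value);
    }

    // Returns a fresh array sized to the current number of elements.
    String[] toArray() {
        return values.toArray(new String[0]);
    }

    public static void main(String[] args) {
        DynamicStrings strings = new DynamicStrings();
        strings.add("New value");
        strings.add("Another value");
        System.out.println(Arrays.toString(strings.toArray()));
    }
}
```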

Plain java arrays (ie String[] strings) cannot be resized dynamically; when you're out of room but you still want to add elements to your array, you need to create a bigger one and copy the existing array into its first n positions.
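To show what "create a bigger one and copy" means in practice, here is a hand-rolled sketch. The class name, the starting capacity of 2, and the doubling policy are arbitrary choices for the example, not a description of how ArrayList sizes its array.

```java
import java.util.Arrays;

class GrowableStringArray {
    private String[] items = new String[2]; // arbitrary starting capacity
    private int size = 0;

    void add(String item) {
        if (size == items.length) {
            // Out of room: allocate a larger array and copy the existing
            // elements into its first positions.
            items = Arrays.copyOf(items, items.length * 2);
        }
        items[size++] = item;
    }

    String get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        return items[index];
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        GrowableStringArray array = new GrowableStringArray();
        for (String s : new String[] {"a", "b", "c", "d", "e"}) {
            array.add(s);
        }
        System.out.println(array.size() + " elements, last is " + array.get(4));
    }
}
```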
Fortunately, there are java.util.List implementations that do this work for you. Both java.util.ArrayList and java.util.Vector are implemented using arrays.
But then, do you really care if the strings happen to be stored internally in an array, or do you just need a collection that will let you keep adding items without worrying about running out of room? If the latter, then you can pick any of the several general purpose List implementations out there. Most of the time the choices are:
ArrayList - basic array based implementation, not synchronized
Vector - synchronized, array based implementation
LinkedList - Doubly linked list implementation, faster for inserting items in the middle of a list
Do you expect your list to have duplicate items? If duplicate items should never exist for your use case, then you should prefer a java.util.Set. Sets are guaranteed to not contain duplicate items. A good general-purpose set implementation is java.util.HashSet.
Answer to follow-up question
To access strings using an index similar to $my_string_array["property"], you need to put them in a Map<String, String>, also in the java.util package. A good general-purpose map implementation is HashMap.
Once you've created your map,
Use map.put("key", "string") to add strings
Use map.get("key") to access a string by its key.
Note that java.util.Map cannot contain duplicate keys. If you call put consecutively with the same key, only the value set in the latest call will remain, the earlier ones will be lost. But I'd guess this is also the behavior for PHP associative arrays, so it shouldn't be a surprise.
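A brief sketch of put and get, including the overwrite behaviour described above. The class name, keys and values are made up for the example.

```java
import java.util.HashMap;
import java.util.Map;

class StringMapDemo {
    static Map<String, String> build() {
        Map<String, String> map = new HashMap<String, String>();
        map.put("property", "first");
        map.put("other", "x");
        // Same key again: the earlier value is replaced, not kept alongside.
        map.put("property", "second");
        return map;
    }

    public static void main(String[] args) {
        Map<String, String> map = build();
        System.out.println(map.get("property"));
        System.out.println(map.size());
        System.out.println(map.get("missing")); // absent keys yield null
    }
}
```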

Create a List instead.
List<String> l = new LinkedList<String>();
l.add("foo");
l.add("bar");

Java has no dynamic arrays; the length of an array is fixed.
A similar structure is ArrayList, which is implemented on top of a real array.
Hence the name ArrayList :)
