Java Modcount (ArrayList)


In Eclipse, I see that ArrayList objects have a modCount field. What is its purpose? (number of modifications?)

It allows the internals of the list to know if there has been a structural modification made that might cause the current operation to give incorrect results.
If you've ever gotten ConcurrentModificationException due to modifying a list (say, removing an item) while iterating it, its internal modCount was what tipped off the iterator.
The AbstractList docs give a good detailed description.
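The behaviour described above can be reproduced in a single thread. The following is a minimal sketch, not code from the JDK; the class and method names (ModCountDemo, removeThroughListFails, removeThroughIterator) are made up for illustration, and it assumes Java 9 or later for List.of.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

class ModCountDemo {

    // Removes through the list itself while a for-each loop is running.
    // Returns true if the iterator noticed the change and failed fast.
    static boolean removeThroughListFails() {
        List<String> names = new ArrayList<>(List.of("a", "b", "c", "d"));
        try {
            for (String name : names) {
                if (name.equals("a")) {
                    names.remove(name); // structural change: modCount is incremented
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true; // the following next() saw a modCount it did not expect
        }
    }

    // Removes through the iterator, which keeps its expected count in sync.
    static List<String> removeThroughIterator() {
        List<String> names = new ArrayList<>(List.of("a", "b", "c", "d"));
        for (Iterator<String> it = names.iterator(); it.hasNext(); ) {
            if (it.next().equals("a")) {
                it.remove();
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(removeThroughListFails());
        System.out.println(removeThroughIterator());
    }
}
```

Removing through the iterator is the usual way to avoid the exception, because the iterator updates its own expected count after the removal.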

Yes. If you ever intend to extend AbstractList, you have to write your code so that it adheres to modCount's javadoc, cited below:
/**
* The number of times this list has been <i>structurally modified</i>.
* Structural modifications are those that change the size of the
* list, or otherwise perturb it in such a fashion that iterations in
* progress may yield incorrect results.
*
* <p>This field is used by the iterator and list iterator implementation
* returned by the {@code iterator} and {@code listIterator} methods.
* If the value of this field changes unexpectedly, the iterator (or list
* iterator) will throw a {@code ConcurrentModificationException} in
* response to the {@code next}, {@code remove}, {@code previous},
* {@code set} or {@code add} operations. This provides
* <i>fail-fast</i> behavior, rather than non-deterministic behavior in
* the face of concurrent modification during iteration.
*
* <p><b>Use of this field by subclasses is optional.</b> If a subclass
* wishes to provide fail-fast iterators (and list iterators), then it
* merely has to increment this field in its {@code add(int, E)} and
* {@code remove(int)} methods (and any other methods that it overrides
* that result in structural modifications to the list). A single call to
* {@code add(int, E)} or {@code remove(int)} must add no more than
* one to this field, or the iterators (and list iterators) will throw
* bogus {@code ConcurrentModificationExceptions}. If an implementation
* does not wish to provide fail-fast iterators, this field may be
* ignored.
*/
Looking at the actual JDK source code and reading the javadocs (either online or in the code) helps a lot in understanding what's going on. Good luck.
I would add that you can attach the JDK source code to Eclipse so that every F3 or Ctrl+click on any Java SE class or method points to the actual source. If you download the JDK, you should have src.zip in the JDK installation folder. In Eclipse's top menu, go to Window » Preferences » Java » Installed JREs. Select the current JRE and click Edit. Select the rt.jar file, click Source Attachment, click External File, navigate to the JDK folder, select src.zip and add it. Now the source code of the Java SE API is available in Eclipse. The JDK source code gives a lot of insight. Happy coding :)
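To make the javadoc's advice to subclasses concrete, here is a minimal sketch of an AbstractList subclass that increments modCount once per structural change. CountingList and its backing field are hypothetical names chosen for this example; only AbstractList, its modCount field and its inherited iterator come from the JDK.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

// Hypothetical list that stores its elements in a backing ArrayList.
class CountingList<E> extends AbstractList<E> {
    private final List<E> backing = new ArrayList<>();

    @Override
    public E get(int index) {
        return backing.get(index);
    }

    @Override
    public int size() {
        return backing.size();
    }

    @Override
    public void add(int index, E element) {
        backing.add(index, element);
        modCount++; // exactly one increment per structural change
    }

    @Override
    public E remove(int index) {
        E removed = backing.remove(index);
        modCount++;
        return removed;
    }

    // Returns true if an iterator created before an add() fails fast afterwards.
    static boolean iteratorFailsFast() {
        CountingList<String> list = new CountingList<>();
        list.add("x");
        list.add("y");
        Iterator<String> it = list.iterator(); // inherited from AbstractList
        it.next();
        list.add("z"); // modCount changes behind the iterator's back
        try {
            it.next();
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(iteratorFailsFast());
    }
}
```

If the two modCount++ lines are left out, the inherited iterator has nothing to compare against and no longer fails fast, which is the "optional" case the javadoc describes.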

protected transient int modCount = 0; is the field declared in public abstract class AbstractList to record the total number of structural modifications made to the collection. Both add and remove increment this counter, so it only ever goes up with each modification and is therefore not useful for computing the size.
It is used to decide when to throw ConcurrentModificationException.
ConcurrentModificationException is thrown when the collection is structurally modified while it is being iterated, whether by another thread or by the same thread, other than through the iterator's own methods.
This works as follows: when the iterator object is created, modCount is copied into expectedModCount, and on each iterator step expectedModCount is compared with modCount; if they differ, ConcurrentModificationException is thrown.
private class Itr implements Iterator<E> {
    ...
    ...
    /**
     * The modCount value that the iterator believes that the backing
     * List should have. If this expectation is violated, the iterator
     * has detected concurrent modification.
     */
    int expectedModCount = modCount;

    public E next() {
        checkForComodification();
        ...
        ...
    }

    final void checkForComodification() {
        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();
    }
    ...
    ...
}
The size() API is not suitable here: if two operations (an add and a remove) happen before next() is called, size() still returns the same value, so the modification cannot be detected from size() during iteration. Hence the need for a counter that only increments on modification, which is modCount.
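The point about size() can be checked with a short sketch. The class name SizeVersusModCount and its method are invented for this example; it assumes Java 9 or later for List.of.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

class SizeVersusModCount {

    // Returns true if the iterator fails even though size() is unchanged.
    static boolean sameSizeStillDetected() {
        List<Integer> numbers = new ArrayList<>(List.of(1, 2, 3));
        Iterator<Integer> it = numbers.iterator();
        it.next();

        int sizeBefore = numbers.size();
        numbers.add(4);    // structural change, size becomes 4
        numbers.remove(0); // remove(int index): structural change, size is 3 again
        boolean sizeUnchanged = numbers.size() == sizeBefore;

        try {
            it.next();
            return false;
        } catch (ConcurrentModificationException e) {
            return sizeUnchanged; // detected through modCount, not through size()
        }
    }

    public static void main(String[] args) {
        System.out.println(sameSizeStillDetected());
    }
}
```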

It's the number of times the structure (size) of the collection changes.

From the Java API for the mod count field:
The number of times this list has been structurally modified. Structural modifications are those that change the size of the list, or otherwise perturb it in such a fashion that iterations in progress may yield incorrect results.

From the 1.4 javadoc on AbstractList:
protected transient int modCount
The number of times this list has been structurally modified. Structural modifications are those that change the size of the list, or otherwise perturb it in such a fashion that iterations in progress may yield incorrect results.
This field is used by the iterator and list iterator implementation returned by the iterator and listIterator methods. If the value of this field changes unexpectedly, the iterator (or list iterator) will throw a ConcurrentModificationException in response to the next, remove, previous, set or add operations. This provides fail-fast behavior, rather than non-deterministic behavior in the face of concurrent modification during iteration.
Use of this field by subclasses is optional.

Related

What's the difference between peekOption and headOption in Vavr Collection

Here the doc for Vavr List peekOption
https://www.javadoc.io/doc/io.vavr/vavr/0.10.1/io/vavr/collection/List.html#peekOption--
Here the doc of Vavr Traversable headOption
https://www.javadoc.io/doc/io.vavr/vavr/0.10.1/io/vavr/collection/Traversable.html#headOption--
The implementation seems exactly the same, so for this kind of usage I can use either, but which is best?
MyObject myObject = myJavaCollection.stream()
    .filter(SomePredicate::isTrue)
    .collect(io.vavr.collection.List.collector()) // collect to a Vavr list to have Vavr methods available
    .peek(unused -> LOGGER.info("some log"))
    .map(MyObject::new)
    .peekOption() // or .headOption()
    .getOrNull();
So I was wondering what the difference between those methods is.

From the sourcecode of Vavr's List (see https://github.com/vavr-io/vavr/blob/master/src/main/java/io/vavr/collection/List.java) we have:
/**
 * Returns the head element without modifying the List.
 *
 * @return {@code None} if this List is empty, otherwise a {@code Some} containing the head element
 * @deprecated use headOption() instead
 */
@Deprecated
public final Option<T> peekOption() {
    return headOption();
}
So they do exactly the same, like you say, and since peekOption() is deprecated, headOption() seems to be the one to use.
As for the reason to use one over the other:
It looks like the Vavr List interface defines some stack related methods (like push, pop, peek, etc) to make it easier to use lists as stacks in a convenient way, if you should want that. (For example, you would use peekOption() if you consider the list to be a stack and headOption() otherwise)
These stack-methods are however all deprecated - probably because there are always non-stack methods that can be used instead of them. So they probably backed away from the idea that "a list is also a stack" - maybe because they thought it mixes concepts a bit and makes the interface too big (just a guess). So that must be the reason headOption() is preferred - all the stack-methods are deprecated.
(In the standard Java library, stack operations such as push, pop and peek come from a separate interface, Deque, which LinkedList implements alongside List, so a stack does not have to be a list.)

According to their documentation (List and Traversable)
List's peekOption
default Option<T> peekOption()
Returns the head element without modifying the List.
Returns:
None if this List is empty, otherwise a Some containing the head element
Traversable's headOption
default Option<T> headOption()
Returns the first element of a non-empty Traversable as Option.
Returns:
Some(element) or None if this is empty.
They act exactly the same way. They either return the head element or Option.none(). Only their variants head and peek throw an exception if no elements are present. List simply happens to have two methods that behave the same way only because it extends the Traversable interface.

Why does Stream#toList's default implementation seem overcomplicated / suboptimal?

Looking at the implementation of Stream#toList, I noticed how overcomplicated and suboptimal it seems.
As mentioned in the javadoc just above it, this default implementation is not used by most Stream implementations; still, in my opinion it could have been written differently.
The sources
/**
 * Accumulates the elements of this stream into a {@code List}. The elements in
 * the list will be in this stream's encounter order, if one exists. The returned List
 * is unmodifiable; calls to any mutator method will always cause
 * {@code UnsupportedOperationException} to be thrown. There are no
 * guarantees on the implementation type or serializability of the returned List.
 *
 * <p>The returned instance may be value-based.
 * Callers should make no assumptions about the identity of the returned instances.
 * Identity-sensitive operations on these instances (reference equality ({@code ==}),
 * identity hash code, and synchronization) are unreliable and should be avoided.
 *
 * <p>This is a terminal operation.
 *
 * @apiNote If more control over the returned object is required, use
 * {@link Collectors#toCollection(Supplier)}.
 *
 * @implSpec The implementation in this interface returns a List produced as if by the following:
 * <pre>{@code
 * Collections.unmodifiableList(new ArrayList<>(Arrays.asList(this.toArray())))
 * }</pre>
 *
 * @implNote Most instances of Stream will override this method and provide an implementation
 * that is highly optimized compared to the implementation in this interface.
 *
 * @return a List containing the stream elements
 *
 * @since 16
 */
@SuppressWarnings("unchecked")
default List<T> toList() {
    return (List<T>) Collections.unmodifiableList(new ArrayList<>(Arrays.asList(this.toArray())));
}
My idea of what would be better
return (List<T>) Collections.unmodifiableList(Arrays.asList(this.toArray()));
Or even
return Arrays.asList(this.toArray());
IntelliJ's proposal
return (List<T>) List.of(this.toArray());
Is there any good reason for the implementation in the JDK sources?

The toArray method might be implemented to return an array that is then mutated afterwards, which would effectively make the returned list not immutable. That's why an explicit copy by creating a new ArrayList is done.
It's essentially a defensive copy.
This was also discussed during the review of this API, where Stuart Marks writes:
As written it's true that the default implementation does perform apparently redundant copies, but we can't be assured that toArray() actually returns a freshly created array. Thus, we wrap it using Arrays.asList and then copy it using the ArrayList constructor. This is unfortunate but necessary to avoid situations where someone could hold a reference to the internal array of a List, allowing modification of a List that's supposed to be unmodifiable.
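The effect of the defensive copy can be shown without writing a Stream implementation. The sketch below is an illustration only: DefensiveCopyDemo, wrapOnly and copyThenWrap are hypothetical names, and the leaked array stands in for an array that a toArray() implementation might keep a reference to.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class DefensiveCopyDemo {

    // Wraps the array without copying: the list is only a view of the array.
    static List<Object> wrapOnly(Object[] array) {
        return Collections.unmodifiableList(Arrays.asList(array));
    }

    // Copies the elements first, in the same way as the default toList().
    static List<Object> copyThenWrap(Object[] array) {
        return Collections.unmodifiableList(new ArrayList<>(Arrays.asList(array)));
    }

    public static void main(String[] args) {
        Object[] leaked = {"a", "b"};
        List<Object> view = wrapOnly(leaked);
        List<Object> copy = copyThenWrap(leaked);

        leaked[0] = "changed"; // someone still holds the array and writes to it

        System.out.println(view.get(0)); // the "unmodifiable" view shows the write
        System.out.println(copy.get(0)); // the copy is unaffected
    }
}
```

List.of(array) would also copy the elements, but it rejects null elements, so it is not a drop-in replacement wherever a stream may contain nulls.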

Why do I need the Iterator interface and why should I use it?

I am new to Java, so my question may seem silly to some of you.
As I understand from a tutorial, if I want to use foreach on my custom object, the object must implement the Iterable interface.
My question is: why do I need the Iterator interface, and why should I use it?

As you mentioned, Iterable is used in foreach loops.
Not everything can be used in a foreach loop, right? What do you think this will do?
for (int a : 10)
The designers of Java wanted to make the compiler able to spot this nonsense and report it to you as a compiler error. So they thought, "what kind of stuff can be used in a foreach loop?" "Well", they thought, "objects must be able to return an iterator". And this interface is born:
public interface Iterable<T> {
    /**
     * Returns an iterator over elements of type {@code T}.
     *
     * @return an Iterator.
     */
    Iterator<T> iterator();
}
The compiler can just check whether the object in the foreach loop implements Iterable or not. If it does not, spit out an error. You can think of this as a kind of "marker" to the compiler that says "Yes I can be iterated over!"
"What is an iterator then?", they thought again, "Well, an iterator should be able to return the next element and to return whether it has a next element. Some iterators should also be able to remove elements". So this interface is born:
public interface Iterator<E> {
    /**
     * Returns {@code true} if the iteration has more elements.
     * (In other words, returns {@code true} if {@link #next} would
     * return an element rather than throwing an exception.)
     *
     * @return {@code true} if the iteration has more elements
     */
    boolean hasNext();

    /**
     * Returns the next element in the iteration.
     *
     * @return the next element in the iteration
     * @throws NoSuchElementException if the iteration has no more elements
     */
    E next();

    /**
     * Removes from the underlying collection the last element returned
     * by this iterator (optional operation). This method can be called
     * only once per call to {@link #next}. The behavior of an iterator
     * is unspecified if the underlying collection is modified while the
     * iteration is in progress in any way other than by calling this
     * method.
     *
     * @implSpec
     * The default implementation throws an instance of
     * {@link UnsupportedOperationException} and performs no other action.
     *
     * @throws UnsupportedOperationException if the {@code remove}
     *         operation is not supported by this iterator
     *
     * @throws IllegalStateException if the {@code next} method has not
     *         yet been called, or the {@code remove} method has already
     *         been called after the last call to the {@code next}
     *         method
     */
    default void remove() {
        throw new UnsupportedOperationException("remove");
    }
}
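Putting the two interfaces together, a custom class becomes usable in a foreach loop once it implements Iterable and hands out an Iterator. The following is a minimal sketch; IntRange is a hypothetical class written for this example and is not part of the JDK.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical range of ints [start, end) that can be used in a for-each loop.
class IntRange implements Iterable<Integer> {
    private final int start;
    private final int end;

    IntRange(int start, int end) {
        this.start = start;
        this.end = end;
    }

    @Override
    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            private int current = start;

            @Override
            public boolean hasNext() {
                return current < end;
            }

            @Override
            public Integer next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return current++;
            }
            // remove() is inherited and throws UnsupportedOperationException
        };
    }

    // The compiler accepts this loop only because IntRange implements Iterable.
    static int sum(IntRange range) {
        int total = 0;
        for (int value : range) {
            total += value;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(new IntRange(1, 4)));
    }
}
```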

Iterator is a design pattern: it lets you go through a collection of objects in a uniform way, and it hides from the user both how the elements are stored and how the iteration works. As you can see in the javadoc, many classes implement the Iterable interface, not only collections. For example, it lets you iterate through two List implementations with comparable performance: ArrayList can return the element at any index in constant time, but LinkedList has to walk through all the preceding elements to reach a given index, which is much slower. When you obtain an Iterator from either implementation, however, you get similar performance in both cases, because each list implements its iteration algorithm in the way that suits it. ResultSet is also an iterator of sorts, although it does not implement the interface from java.util: it lets you iterate over all the results of a database query in the same way and hides the underlying structures responsible for storing the elements and talking to the database. If you need some optimization, you could write a new ResultSet implementation that queries the database on each call to next, or whatever you want, because it likewise decouples client code from how the elements are stored and from the iteration algorithm.
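The decoupling described above can be sketched as a method that only sees Iterable and Iterator and therefore works unchanged for both list implementations. IteratorDecoupling and sum are names invented for this illustration; it assumes Java 9 or later for List.of.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

class IteratorDecoupling {

    // Knows nothing about how the elements are stored.
    static int sum(Iterable<Integer> values) {
        int total = 0;
        Iterator<Integer> it = values.iterator();
        while (it.hasNext()) {
            total += it.next();
        }
        return total;
    }

    public static void main(String[] args) {
        // Array-backed and node-backed storage, same client code.
        System.out.println(sum(new ArrayList<>(List.of(1, 2, 3))));
        System.out.println(sum(new LinkedList<>(List.of(1, 2, 3))));
    }
}
```

A loop written with get(i) would compile for both lists as well, but on a LinkedList each get(i) has to walk the nodes from one end, which is the cost the iterator avoids.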

ModCount in map and list

While debugging collections in Eclipse, I noticed a field called modCount; for example, it shows up when inspecting a list in the debugger. What does modCount represent? Please advise.

See the javadoc
The number of times this list has been structurally modified. Structural modifications are those that change the size of the list, or otherwise perturb it in such a fashion that iterations in progress may yield incorrect results.
This field is used by the iterator and list iterator implementation returned by the iterator and listIterator methods. If the value of this field changes unexpectedly, the iterator (or list iterator) will throw a ConcurrentModificationException in response to the next, remove, previous, set or add operations. This provides fail-fast behavior, rather than non-deterministic behavior in the face of concurrent modification during iteration.
Use of this field by subclasses is optional. If a subclass wishes to provide fail-fast iterators (and list iterators), then it merely has to increment this field in its add(int, E) and remove(int) methods (and any other methods that it overrides that result in structural modifications to the list). A single call to add(int, E) or remove(int) must add no more than one to this field, or the iterators (and list iterators) will throw bogus ConcurrentModificationExceptions. If an implementation does not wish to provide fail-fast iterators, this field may be ignored.

It's a counter used to detect modifications to the collection when iterating the collection: iterators are fail fast, and throw an exception if the collection has been modified during the iteration. modCount is used to track the modifications.
FYI, the sources of the standard classes are part of the JDK, and you may read them to understand how the standard classes work.
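Since the question also mentions maps, here is a small sketch of the same fail-fast behaviour on a HashMap. MapModCountDemo and its methods are hypothetical names for this example; it assumes Java 9 or later for Map.of.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

class MapModCountDemo {

    // Adds new keys while iterating over the key set.
    // Returns true if the iteration failed fast.
    static boolean putDuringIterationFails() {
        Map<String, Integer> ages = new HashMap<>();
        ages.put("ann", 30);
        ages.put("bob", 25);
        try {
            for (String name : ages.keySet()) {
                ages.put(name + "2", 0); // new key: a structural modification
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // removeIf on the key set removes through the iterator, so it is safe.
    static Map<String, Integer> removeSafely() {
        Map<String, Integer> ages = new HashMap<>(Map.of("ann", 30, "bob", 25));
        ages.keySet().removeIf(name -> name.startsWith("a"));
        return ages;
    }

    public static void main(String[] args) {
        System.out.println(putDuringIterationFails());
        System.out.println(removeSafely());
    }
}
```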

what's the usage of the code in the implementation of AbstractCollection's toArray Method

public Object[] toArray() {
    // Estimate size of array; be prepared to see more or fewer elements
    Object[] r = new Object[size()];
    Iterator<E> it = iterator();
    for (int i = 0; i < r.length; i++) {
        if (! it.hasNext()) // fewer elements than expected
            return Arrays.copyOf(r, i);
        r[i] = it.next();
    }
    return it.hasNext() ? finishToArray(r, it) : r;
}
Here's the code of the AbstractCollection.toArray implementation.
if (! it.hasNext()) // fewer elements than expected
    return Arrays.copyOf(r, i);
I don't understand the purpose of the code above. I suspect it guards against the size changing while the method is running. So I have two questions:
Is my suspicion right or wrong? If it's wrong, what is this code for?
If it's right, in what situation can the size change while the method is running?

Well, the method's javadoc says it all:
/**
 * {@inheritDoc}
 *
 * <p>This implementation returns an array containing all the elements
 * returned by this collection's iterator, in the same order, stored in
 * consecutive elements of the array, starting with index {@code 0}.
 * The length of the returned array is equal to the number of elements
 * returned by the iterator, even if the size of this collection changes
 * during iteration, as might happen if the collection permits
 * concurrent modification during iteration. The {@code size} method is
 * called only as an optimization hint; the correct result is returned
 * even if the iterator returns a different number of elements.
 *
 * <p>This method is equivalent to:
 *
 * <pre> {@code
 * List<E> list = new ArrayList<E>(size());
 * for (E e : this)
 *     list.add(e);
 * return list.toArray();
 * }</pre>
 */
I find two interesting things to mention here:
Yes, you're right: as the javadoc says, this method is prepared to return correctly even if the Collection has been modified in the meantime. That's why the initial size is just a hint. Relying on the iterator rather than on size() to decide when to stop also means that a changed size does not by itself break the method.
It's very easy to imagine a multi-threaded situation where one thread adds or removes elements from a Collection while a different thread calls the toArray method on it. In such a situation, if the Collection is not thread safe (that is, not obtained via the Collections.synchronizedCollection(...) method and not guarded by manually written synchronized access code), you'll get into a situation where it is modified and toArray-ed at the same time.

I just want to mention that, according to the javadoc, the size() method returns at most Integer.MAX_VALUE. So if your collection has more elements than that, you can't get its true size.

You are right: the array is initialized with size(), so if any element is removed while the array is being populated, you benefit from this check.
Collections are not thread safe by default, so another thread could call remove() while the iteration is in progress :-)

While it's generally the case (e.g. for the java.util.* collection classes) that a collection is not expected to change while it is being iterated (otherwise a ConcurrentModificationException is thrown), this is not guaranteed for all collections. It's therefore possible for another thread to add or remove elements while one thread is calling toArray(), thus changing the size of the collection and therefore of the resulting array. Alternatively, some implementations might only return an approximate size.
Therefore, to answer the question:
These two lines check whether the end of the collection was reached before the expected size (the result of the size() call, which defines r.length) was reached. If so, a copy of the array r with the appropriate size is made. Remember that it's not possible to resize an array.
As said, there are different possibilities, since the contract for Collection is pretty loose: multi-threading, approximate results from size(), and others.
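The "approximate size" case can be tried out without any threads. The sketch below uses a hypothetical collection, ApproximateSizeCollection, whose size() deliberately reports a wrong number; it inherits toArray() from AbstractCollection unchanged. It assumes Java 9 or later for List.of.

```java
import java.util.AbstractCollection;
import java.util.Iterator;
import java.util.List;

// Hypothetical collection whose size() does not match what its iterator yields.
class ApproximateSizeCollection extends AbstractCollection<String> {
    private final List<String> elements = List.of("a", "b", "c");
    private final int reportedSize;

    ApproximateSizeCollection(int reportedSize) {
        this.reportedSize = reportedSize;
    }

    @Override
    public Iterator<String> iterator() {
        return elements.iterator();
    }

    @Override
    public int size() {
        return reportedSize; // deliberately unreliable estimate
    }

    // Length of the array produced by the inherited toArray().
    static int arrayLength(int reportedSize) {
        return new ApproximateSizeCollection(reportedSize).toArray().length;
    }

    public static void main(String[] args) {
        System.out.println(arrayLength(5)); // size() too large: array is trimmed
        System.out.println(arrayLength(1)); // size() too small: array is grown
    }
}
```

In both cases the array has three elements, because its final length follows the iterator and size() only provides the initial estimate.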

Andrei's is the main answer. corsair raises an excellent point about Integer.MAX_VALUE.
For completeness, I will add that the toArray method is supposed to work on any Collection, including:
collections with a buggy size method;
dynamic collections, whose contents could change depending on other threads (concurrency), the time, or random numbers. An example in pseudocode:
Collection<Food> thingsACatholicCanEat; // if it is Friday, should not include meat
