Looking at this question: How to dynamically do filtering in Java 8?
The issue is how to truncate a stream after a filter has been executed. I can't use limit because I don't know how long the list is after the filter. So, could we count the elements after the filter?
So, I thought I could create a class that counts and pass the stream through a map. The code is in this answer.
I created a class that counts but leaves the elements unaltered. I use a Function here to avoid the lambdas I used in the other answer:
class DoNothingButCount<T> implements Function<T, T> {
    AtomicInteger i;

    public DoNothingButCount() {
        i = new AtomicInteger(0);
    }

    public T apply(T p) {
        i.incrementAndGet();
        return p;
    }
}
So my Stream was finally:
persons.stream()
    .filter(u -> u.size > 12)
    .filter(u -> u.weight > 12)
    .map(counter)
    .sorted((p1, p2) -> p1.age - p2.age)
    .collect(Collectors.toList())
    .stream()
    .limit((int) (counter.i.intValue() * 0.5))
    .sorted((p1, p2) -> p2.length - p1.length)
    .limit((int) (counter.i.intValue() * 0.5 * 0.2))
    .forEach((p) -> System.out.println(p));
But my question is about another part of my example:
collect(Collectors.toList()).stream()
If I remove that line, the counter is still ZERO when limit is evaluated. I am somehow cheating the "effectively final" requirement by using a mutable object.
I may be wrong, but I understand that the stream is built first, so if we use mutable objects to pass parameters to any of the steps in the stream, their values are read when the stream is created.
My question is: if my assumption is right, why is this needed? A non-parallel stream could pass sequentially through all the steps (filter, map, ...), so this limitation would not be needed.
Short answer
My question is: if my assumption is right, why is this needed? A
non-parallel stream could pass sequentially through all the steps
(filter, map, ...), so this limitation would not be needed.
As you already know, for parallel streams, this sounds pretty obvious: this limitation is needed because otherwise the result would be non deterministic.
Regarding non-parallel streams, it is not possible because of their current design: each item is only visited once. If streams did work as you suggest, they would do each step on the whole collection before going to the next step, which would probably have an impact on performance, I think. I suspect that's why the language designers made that decision.
Why it technically does not work without collect
You already know that, but here is the explanation for other readers.
From the docs:
Streams are lazy; computation on the source data is only performed
when the terminal operation is initiated, and source elements are
consumed only as needed.
Every intermediate operation of Stream, such as filter() or limit(), only adds a stage to the pipeline description; it does not process any elements.
When you call a terminal operation, such as forEach(), collect() or count(), that's when the computation happens, processing items through the pipeline previously built.
This is why limit()'s argument is evaluated before a single item has gone through the first step of the stream. That's why you need to end the stream with a terminal operation and then start a new one, whose limit() argument is known by that point.
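A minimal sketch of that two-stage approach. The Person class, its fields and the method name youngestHalf are hypothetical and only loosely modelled on the question's code. The first pipeline ends with collect(), so the number of surviving elements is known before the second pipeline's limit() is configured:

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class TwoStageLimit {
    // Hypothetical element type, for illustration only.
    static class Person {
        final int size, weight, age;
        Person(int size, int weight, int age) {
            this.size = size;
            this.weight = weight;
            this.age = age;
        }
    }

    static List<Person> youngestHalf(List<Person> persons) {
        // Stage 1: collect() is a terminal operation, so the filters really run here.
        List<Person> filtered = persons.stream()
            .filter(p -> p.size > 12)
            .filter(p -> p.weight > 12)
            .sorted(Comparator.comparingInt((Person p) -> p.age))
            .collect(Collectors.toList());

        // Stage 2: the size is known now, so it can safely be passed to limit().
        return filtered.stream()
            .limit(filtered.size() / 2)
            .collect(Collectors.toList());
    }
}
```

No counter object is needed here, because the intermediate list already carries the count.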
More detailed answer about why not allow it for parallel streams
Let your stream pipeline be step X > step Y > step Z.
We want parallel treatment of our items. Therefore, if we allow step Y's behavior to depend on the items that already went through X, then Y is non deterministic. This is because at the moment an item arrives at step Y, the set of items that have already gone through X won't be the same across multiple executions (because of the threading).
More detailed answer about why not allow it for non-parallel streams
A stream, by definition, is used to process the items in a flow. You could think of a non-parallel stream as follows: one single item goes through all the steps, then the next one goes through all the steps, etc. In fact, the doc says it all:
The elements of a stream are only visited once during the life of a
stream. Like an Iterator, a new stream must be generated to revisit
the same elements of the source.
If streams didn't work like this, they wouldn't be any better than just doing each step on the whole collection before going to the next step. That would actually allow mutable parameters in non-parallel streams, but it would probably have a performance impact (because we would iterate multiple times over the collection). Anyway, their current behavior does not allow what you want.
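The per-element traversal can be made visible with a small trace. This is only a sketch: the class name and the logging lambdas are made up for illustration, the side effects are there purely to observe the order on a sequential stream, and the interleaving shown is what the reference implementation does for a pipeline without stateful stages such as sorted().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

class ElementWiseTrace {
    static List<String> trace() {
        List<String> log = new ArrayList<>();
        Stream.of(1, 2, 3)
            .filter(n -> { log.add("filter " + n); return n % 2 == 1; })
            .map(n -> { log.add("map " + n); return n * 10; })
            .forEach(n -> log.add("forEach " + n));
        return log;
    }

    public static void main(String[] args) {
        // Each element travels through all stages before the next one starts:
        // filter 1, map 1, forEach 10, filter 2, filter 3, map 3, forEach 30
        trace().forEach(System.out::println);
    }
}
```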
Related
Does filter chaining change the outcome if I use parallelStream() instead of stream()?
I tried with a few thousand records, and the output appeared consistent over a few iterations. But since this involves threads (and I could not find enough relevant material that talks about this combination), I want to make doubly sure that a parallel stream does not impact the output of filter chaining in any way. Example code:
List<Element> list = myList.parallelStream()
.filter(element -> element.getId() > 10)
.filter(element -> element.getName().contains("something"))
.collect(Collectors.toList());
Short answer: No.
The filter operation, as documented, expects a non-interfering and stateless predicate to apply to each element to determine if it should be included as part of the new stream.
A few aspects that you should consider here:
With the exception of concurrent collections (which collection type do you use for myList in the existing code?):
For most data sources, preventing interference means ensuring that the
data source is not modified at all during the execution of the stream
pipeline.
The state of the data source is not mutated (neither myList nor its elements are modified within your filter operations):
Note also that attempting to access mutable state from behavioral
parameters presents you with a bad choice with respect to safety and
performance;
Moreover, think about what in your filter operations could be affected by multiple threads. Given the current code, nothing functionally: as long as both operations are executed, you get a consistent result regardless of the thread(s) executing them.
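One way to convince yourself is to run the same chain both ways and compare. The sketch below uses plain integers and made-up predicates in place of the question's Element type; since both predicates are stateless and Collectors.toList() keeps encounter order for an ordered source, the two lists should be equal.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class FilterChainCheck {
    static List<Integer> run(List<Integer> source, boolean parallel) {
        Stream<Integer> stream = parallel ? source.parallelStream() : source.stream();
        return stream
            .filter(n -> n > 10)      // stateless predicate
            .filter(n -> n % 3 == 0)  // stateless predicate
            .collect(Collectors.toList());
    }
}
```

Comparing run(list, true) with run(list, false) using equals() checks both the contents and the order.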
Say I have this list of fruits:-
List<String> f = Arrays.asList("Banana", "Apple", "Grape", "Orange", "Kiwi");
I need to prepend a serial number to each fruit and print it. The order of fruit or serial number does not matter. So this is a valid output:-
4. Kiwi
3. Orange
1. Grape
2. Apple
5. Banana
Solution #1
AtomicInteger number = new AtomicInteger(0);
String result = f.parallelStream()
.map(i -> String.format("%d. %s", number.incrementAndGet(), i))
.collect(Collectors.joining("\n"));
Solution #2
String result = IntStream.rangeClosed(1, f.size())
.parallel()
.mapToObj(i -> String.format("%d. %s", i, f.get(i - 1)))
.collect(Collectors.joining("\n"));
Question
Why is solution #1 a bad practice? I have seen in a lot of places that AtomicInteger-based solutions are bad (like in this answer), especially in parallel stream processing (that's the reason I used parallel streams above, to try to run into issues).
I looked at these questions/answers:-
In which cases Stream operations should be stateful?
Is use of AtomicInteger for indexing in Stream a legit way?
Java 8: Preferred way to count iterations of a lambda?
They just mention (unless I missed something) "unexpected results can occur". Like what? Can it happen in this example? If not, can you provide me an example where it can happen?
As for "no guarantees are made as to the order in which the mapper function is applied", well, that's the nature of parallel processing, so I accept it, and also, the order doesn't matter in this particular example.
AtomicInteger is thread safe, so it shouldn't be a problem in parallel processing.
Can someone provide examples in which cases there will be issues while using such a state-based solution?
Well, look at the answer from Stuart Marks here - he is using a stateful predicate.
There are a couple of potential problems, but if you don't care about them or really understand them - you should be fine.
The first is order, which shows up under the current implementation for parallel processing, but if you don't care about order, like in your example, you are OK.
The second one is speed: incrementing an AtomicInteger is several times slower than incrementing a simple int, if you care about this.
The third one is more subtle. Sometimes there is no guarantee that map will be executed at all, for example since Java 9:
someStream.map(i -> /* do something with i and numbers */)
.count();
The point here is that since you are counting, there is no need to do the mapping, so it's skipped. In general, the elements that hit some intermediate operation are not guaranteed to get to the terminal one. Imagine a map.filter.map situation: the first map might "see" more elements than the second one, because some elements might be filtered out. So it's not recommended to rely on this, unless you can reason exactly about what is going on.
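A small sketch of the count() case. The class and method names are made up; the behavior depends on the JDK version and on the source reporting its size, so the mapper call count is printed rather than assumed.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class SkippedMap {
    // Returns {result of count(), number of times the mapper was invoked}.
    static long[] countAndCalls() {
        List<String> fruits = Arrays.asList("Banana", "Apple", "Grape");
        AtomicInteger calls = new AtomicInteger();
        long n = fruits.stream()
            .map(f -> { calls.incrementAndGet(); return f.toUpperCase(); })
            .count();
        return new long[] { n, calls.get() };
    }

    public static void main(String[] args) {
        long[] r = countAndCalls();
        System.out.println("count = " + r[0]);
        // Typically 0 on Java 9+ (size taken from the source), 3 on Java 8.
        System.out.println("mapper calls = " + r[1]);
    }
}
```

Any logic that depends on the mapper's side effect would silently change when the JDK is upgraded, which is the kind of surprise the warning is about.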
In your example, IMO, you are more than safe to do what you do; but if you change your code slightly, additional reasoning is required to prove its correctness. I would go with solution 2, just because it's a lot easier for me to understand and it does not have the potential problems listed above.
Note also that attempting to access mutable state from behavioral parameters presents you with a bad choice with respect to safety and performance; if you do not synchronize access to that state, you have a data race and therefore your code is broken, but if you do synchronize access to that state, you risk having contention undermine the parallelism you are seeking to benefit from. The best approach is to avoid stateful behavioral parameters to stream operations entirely; there is usually a way to restructure the stream pipeline to avoid statefulness.
Package java.util.stream, Stateless behaviors
From the perspective of thread-safety and correctness, there is nothing wrong with solution 1. Performance (as an advantage of parallel processing) might suffer, though.
Why is solution #1 a bad practice?
I wouldn't say it's a bad practice or something unacceptable. It's simply not recommended for the sake of performance.
They just mention (unless I missed something) "unexpected results can occur". Like what?
"Unexpected results" is a very broad term, and usually refers to improper synchronisation, i.e. "What the hell just happened?"-like behaviour.
Can it happen in this example?
It's not the case. You are likely not going to run into issues.
If not, can you provide me an example where it can happen?
Change the AtomicInteger to an int*, replace number.incrementAndGet() with ++number, and you will have one.
*a boxed int (e.g. wrapper-based, array-based) so you can work with it within a lambda
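A sketch of that experiment, using a one-element array as the wrapper. The class and method names are hypothetical. The unsynchronized version has a data race, so increments can be lost and its result may differ from run to run; the AtomicInteger version always reaches n.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;

class RacyCounter {
    static int racyCount(int n) {
        int[] counter = {0}; // array wrapper so the lambda can mutate it
        // counter[0]++ is a read-modify-write without synchronization: a data race.
        IntStream.range(0, n).parallel().forEach(i -> counter[0]++);
        return counter[0];
    }

    static int atomicCount(int n) {
        AtomicInteger counter = new AtomicInteger();
        IntStream.range(0, n).parallel().forEach(i -> counter.incrementAndGet());
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println("racy:   " + racyCount(1_000_000));   // often less than 1000000
        System.out.println("atomic: " + atomicCount(1_000_000)); // always 1000000
    }
}
```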
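Case 2 - According to the API notes of the IntStream class, rangeClosed returns a sequential ordered IntStream from startInclusive (inclusive) to endInclusive (inclusive) by an incremental step of 1, much like a for loop. Each index is therefore an element of the stream itself rather than shared mutable state, so even in parallel every fruit is paired with its own index and the joined result comes out in the correct order.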
/**
 * @param startInclusive the (inclusive) initial value
 * @param endInclusive the inclusive upper bound
 * @return a sequential {@code IntStream} for the range of {@code int}
 *         elements
 */
public static IntStream rangeClosed(int startInclusive, int endInclusive) {
Case 1 - The list is processed in parallel, so the numbers are not assigned in list order. Because the mapping operation runs in parallel, the results for the same input can vary from run to run due to thread scheduling differences. There is no guarantee that different operations on the "same" element within the same stream pipeline are executed in the same thread, nor any guarantee as to the order in which the mapper function is applied to particular elements within the stream.
Source: Javadoc
I'm reading about java streams API and I encountered the following here:
The operation forEachOrdered processes elements in the order specified by the stream, regardless of whether the stream is executed in serial or parallel. However, when a stream is executed in parallel, the map operation processes elements of the stream specified by the Java runtime and compiler. Consequently, the order in which the lambda expression e -> { parallelStorage.add(e); return e; } adds elements to the List parallelStorage can vary every time the code is run. For deterministic and predictable results, ensure that lambda expression parameters in stream operations are not stateful.
I tested the following code and in fact, it works as mentioned:
public class MapOrdering {
    public static void main(String[] args) {
        List<String> serialStorage = new ArrayList<>();
        System.out.println("Serial stream:");
        int j = 0;
        List<String> listOfIntegers = new ArrayList<>();
        for (int i = 0; i < 10; i++) listOfIntegers.add(String.valueOf(i));
        listOfIntegers.stream().parallel().map(e -> {
            serialStorage.add(e.concat(String.valueOf(j)));
            return e;
        }).forEachOrdered(k -> System.out.println(k));
        /*
        // Don't do this! It uses a stateful lambda expression.
        .map(e -> { serialStorage.add(e); return e; })*/
        for (String s : serialStorage) System.out.println(s);
    }
}
output
Serial stream:
0
1
2
3
4
5
6
7
8
9
null
null
80
90
50
40
30
00
questions:
The output changes every time I run this. How do I make sure that the stateful map operation is executed in order?
map is an intermediate operation and it only starts processing elements
once the terminal operation commences. Since the terminal operation is
ordered, why is the map operation unordered, and why does it tend to change
results every time when working with a stateful operation?
You would be lucky to see serialStorage contain all the elements that you think it will; after all, you are adding multiple elements from multiple threads to a non-thread-safe collection, ArrayList. You could easily see nulls or a List that does not have all the elements. But even when you add to a List that is thread-safe, there is absolutely no order that you can rely on in that List.
This is explicitly mentioned in the documentation under side-effects, and intermediate operations should be side-effect-free.
Basically there are two orderings: processing order (intermediate operations) and encounter order. The latter is preserved (if the stream has one to begin with and intermediate operations don't break it - for example unordered, sorted).
Processing order is not specified, meaning all intermediate operations will process elements in whatever order they feel like. Encounter order (the one you see from a terminal operation) will preserve the initial order.
But even terminal operations don't have to preserve the initial order, for example forEach vs forEachOrdered, or when you collect to a Set; of course, read the documentation, as it usually states this aspect clearly.
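The difference between the two orderings can be observed with a sketch like the following. The class is hypothetical, and the side effect in map is exactly the discouraged pattern, used here only to record the processing order into a synchronized list.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class OrderDemo {
    // Order in which map() happened to process the elements (unspecified).
    final List<Integer> processingOrder = Collections.synchronizedList(new ArrayList<>());
    // Result of the terminal operation (encounter order).
    final List<Integer> result;

    OrderDemo(int n) {
        result = IntStream.range(0, n).boxed().parallel()
            .map(i -> { processingOrder.add(i); return i; }) // side effect, for observation only
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        OrderDemo demo = new OrderDemo(20);
        System.out.println("processing order: " + demo.processingOrder); // varies between runs
        System.out.println("result:           " + demo.result);          // always 0..19 in order
    }
}
```

Both lists hold the same elements; only the collected result is guaranteed to follow the source order.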
I would like to answer your 2 questions, while adding to this other answer...
The output changes every time I run this. How do I write code to process a stateful map operation in an ordered way?
Stateful map operations are discouraged and you shouldn't use them, even for sequential streams. If you want that behaviour, you'd better use an imperative approach.
map is an intermediate operation and it only starts processing elements once the terminal operation commences. Since the terminal operation is ordered, why is the map operation unordered, and why does it tend to change results every time when working with a stateful operation?
Only forEachOrdered respects encounter order of elements; intermediate operations (such as map) are not compelled to do so. For a parallel stream, this means that intermediate operations are allowed to be executed in any order by the pipeline, thus taking advantage of parallelism.
However, bear in mind that providing a stateful argument to an intermediate operation, (i.e. a stateful mapper function to the map operation) when the stream is parallel, would require you to manually synchronize the state kept by the stateful argument (i.e. you would need to use a synchronized view of the list, or implement some locking mechanism, etc), but this would in turn affect performance negatively, since (as stated in the docs) you'd risk having contention undermine the parallelism you are seeking to benefit from.
Edit: for a terminal operation like forEachOrdered, parallelism would usually bring little benefit, since many times it needs to do some internal processing to comply with the requirement of respecting encounter order, i.e. buffer the elements.
Getting a Spliterator from a Stream pipeline may return an instance of a StreamSpliterators.WrappingSpliterator. For example, getting the following Spliterator:
Spliterator<String> source = new Random()
.ints(11, 0, 7) // size, origin, bound
.filter(nr -> nr % 2 != 0)
.mapToObj(Integer::toString)
.spliterator();
Given the above Spliterator<String> source, when we traverse the elements individually through the tryAdvance(Consumer<? super P_OUT> consumer) method of Spliterator, which in this case is an instance of StreamSpliterators.WrappingSpliterator, it will first accumulate items into an internal buffer before consuming those items, as we can see in StreamSpliterators.java#298. Put simply, doAdvance() first inserts items into the buffer, and then tryAdvance gets the next item and passes it to consumer.accept(…).
public boolean tryAdvance(Consumer<? super P_OUT> consumer) {
boolean hasNext = doAdvance();
if (hasNext)
consumer.accept(buffer.get(nextToConsume));
return hasNext;
}
However, I cannot figure out why this buffer is needed.
In this case, why is the consumer parameter of tryAdvance not simply used as the terminal Sink of the pipeline?
Keep in mind that this is the Spliterator returned by the public method Stream.spliterator(), so no assumptions about the caller can be made (as long as it is within the contract).
The tryAdvance method may get called once for each of the stream’s elements and once more to detect the end of the stream, well, actually, it might get called an arbitrary number of times even after hitting the end. And there is no guaranty that the caller will always pass the same consumer.
To pass a consumer directly to the source spliterator without buffering, you will have to compose a consumer that will perform all pipeline stages, i.e. call a mapping function and use its result or test a predicate and not call the downstream consumer if negative and so on. The consumer passed to the source spliterator would also be responsible to notify the WrappingSpliterator somehow about a value being rejected by the filter as the source spliterator’s tryAdvance method still returns true in that case and the operation would have to be repeated then.
As Eugene correctly mentioned, this is the one-fits-all implementation that doesn’t consider how many or what kind of pipeline stages are there. The costs of composing such a consumer could be heavy and might have to be reapplied for each tryAdvance call, read for every stream element, e.g. when different consumers are passed to tryAdvance or when equality checks do not work. Keep in mind that consumers are often implemented as lambda expressions and the identity or equality of the instances produced by lambda expressions is unspecified.
So the tryAdvance implementation avoids these costs by composing only one consumer instance on the first invocation that will always store the element into the same buffer, also allocated on the first invocation, if not rejected by a filter. Note that under normal circumstances, the buffer will only hold one element. Afaik, flatMap is the only operation that may push more elements to the buffer. But note that the existence of this non-lazy behavior of flatMap is also the reason why this buffering strategy is required, at least when flatMap is involved, to ensure that the Spliterator implementation handed out by a public method will fulfill the contract of passing at most one element to the consumer during one invocation of tryAdvance.
In contrast, when you call forEachRemaining, these problems do not exist. There is only one Consumer instance during the entire operation and the non-laziness of flatMap doesn’t matter either, as all elements will get consumed anyway. Therefore, a non-buffering transfer will be attempted, as long as no previous tryAdvance call was made that could have caused buffering of some elements:
public void forEachRemaining(Consumer<? super P_OUT> consumer) {
if (buffer == null && !finished) {
Objects.requireNonNull(consumer);
init();
ph.wrapAndCopyInto((Sink<P_OUT>) consumer::accept, spliterator);
finished = true;
}
else {
do { } while (tryAdvance(consumer));
}
}
As you can see, as long as the buffer has not been initialized, i.e. no previous tryAdvance call was made, consumer::accept is bound as Sink and a complete direct transfer made.
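The contract described above can be checked from the outside with a small sketch. The class name and the flatMap pipeline are made up for illustration: every source element expands into three values, yet a single tryAdvance call hands exactly one value to the consumer, and a later forEachRemaining delivers the rest in encounter order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Spliterator;
import java.util.stream.Stream;

class OneAtATime {
    static Spliterator<Integer> source() {
        return Stream.of(1, 2)
            .flatMap(i -> Stream.of(i, i * 10, i * 100)) // three results per source element
            .spliterator();
    }

    public static void main(String[] args) {
        Spliterator<Integer> sp = source();

        List<Integer> first = new ArrayList<>();
        sp.tryAdvance(first::add);       // exactly one element reaches the consumer
        System.out.println(first);       // [1]

        List<Integer> rest = new ArrayList<>();
        sp.forEachRemaining(rest::add);  // a different consumer receives the remainder
        System.out.println(rest);        // [10, 100, 2, 20, 200]
    }
}
```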
I mostly agree with @Holger's great answer, but I would put the emphasis differently. I think it is hard for you to understand the need for a buffer because you have a very simplistic mental model of what the Stream API allows. If one thinks of a Stream as a sequence of map and filter operations, there is no need for an additional buffer, because those operations have 2 important "good" properties:
Work on one element at a time
Produce 0 or 1 element as a result
However, those properties do not hold in the general case. As @Holger (and I in my original answer) mentioned, there is already flatMap in Java 8, which breaks rule #2, and Java 9 finally added takeWhile, which transforms the whole Stream -> Stream rather than working on a per-element basis (and which, like limit, is a short-circuiting intermediate operation).
Another point where I don't quite agree with @Holger is that I think the most fundamental reason is a bit different from the ones he gives in his second paragraph (i.e. a) that tryAdvance may be called many times after the end of the Stream and b) that "there is no guaranty that the caller will always pass the same consumer"). I think the most important reason is that a Spliterator, being functionally identical to a Stream, has to support short-circuiting and laziness (i.e. the ability to not process the whole Stream, or else it can't support unbounded streams). In other words, even if the Spliterator API (quite strangely) required that you use the same Consumer object for all calls of all methods of a given Spliterator, you would still need tryAdvance, and that tryAdvance implementation would still have to use some buffer. You just can't stop processing data if all you've got is forEachRemaining(Consumer<? super T>), so you can't implement anything similar to findFirst or takeWhile using it. Actually, this is one of the reasons why the JDK implementation internally uses the Sink interface rather than Consumer (and what "wrap" in wrapAndCopyInto stands for): Sink has an additional boolean cancellationRequested() method.
So to sum up: a buffer is required because we want Spliterator:
To use simple Consumer that provides no means to report back end of processing/cancellation
To provide means to stop processing of the data by a request of the (logical) consumer.
Note that those two are actually slightly contradictory requirements.
Example and some code
Here I'd like to provide an example of code that I believe is impossible to implement without an additional buffer, given the current API contract (interfaces). This example is based on your example.
There is the simple Collatz sequence of integers, which is conjectured to always eventually hit 1. AFAIK this conjecture has not been proved yet, but it has been verified for many integers (at least for the whole 32-bit int range).
So assume that the problem we are trying to solve is the following: from a stream of Collatz sequences for random start numbers in the range from 1 to 1,000,000, find the first number that contains "123" in its decimal representation.
Here is a solution that uses just Stream (not a Spliterator):
static String findGoodNumber() {
return new Random()
.ints(1, 1_000_000) // unbound!
.flatMap(nr -> collatzSequence(nr))
.mapToObj(Integer::toString)
.filter(s -> s.contains("123"))
.findFirst().get();
}
where collatzSequence is a function that returns a Stream containing the Collatz sequence up to the first 1 (and, for nitpickers, let it also stop when the current value is bigger than Integer.MAX_VALUE / 3 so we don't hit overflow).
Every such Stream returned by collatzSequence is bounded. Also, the standard Random will eventually generate every number in the provided range. This means that we are guaranteed that there will eventually be some "good" number in the stream (for example just 123), and findFirst is short-circuiting, so the whole operation will actually terminate. However, no reasonable Stream API implementation can predict this.
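The answer does not show collatzSequence itself. One possible sketch, under the stated assumptions (start at least 1, stop at the first 1 or once the value exceeds Integer.MAX_VALUE / 3), could look like this; the class name and the eager IntStream.Builder approach are my own choices, not taken from the original.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

class CollatzSketch {
    // Returns the finite Collatz sequence starting at 'start', including the first 1.
    static IntStream collatzSequence(int start) {
        if (start < 1) {
            throw new IllegalArgumentException("start must be >= 1");
        }
        IntStream.Builder builder = IntStream.builder();
        int current = start;
        builder.add(current);
        // 3 * current + 1 cannot overflow while current <= Integer.MAX_VALUE / 3.
        while (current != 1 && current <= Integer.MAX_VALUE / 3) {
            current = (current % 2 == 0) ? current / 2 : 3 * current + 1;
            builder.add(current);
        }
        return builder.build();
    }

    public static void main(String[] args) {
        // prints [6, 3, 10, 5, 16, 8, 4, 2, 1]
        System.out.println(Arrays.toString(collatzSequence(6).toArray()));
    }
}
```

Returning an IntStream matches the flatMap call on the IntStream produced by Random.ints in the snippets around it.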
Now let's assume that for some strange reason you want to perform the same thing using intermediate Spliterator. Even though you have only one piece of logic and no need for different Consumers, you can't use forEachRemaining. So you'll have to do something like this:
static Spliterator<String> createCollatzRandomSpliterator() {
return new Random()
.ints(1, 1_000_000) // unbound!
.flatMap(nr -> collatzSequence(nr))
.mapToObj(Integer::toString)
.spliterator();
}
static String findGoodNumberWithSpliterator() {
Spliterator<String> source = createCollatzRandomSpliterator();
String[] res = new String[1]; // work around for "final" closure restriction
while (source.tryAdvance(s -> {
if (s.contains("123")) {
res[0] = s;
}
})) {
if (res[0] != null)
return res[0];
}
throw new IllegalStateException("Impossible");
}
It is also important that for some starting numbers the Collatz sequence will contain several matching numbers. For example, both 41123 and 123370 (= 41123*3+1) contain "123". This means that we really don't want our Consumer to be called after the first matching hit. But since Consumer doesn't expose any means to report the end of processing, WrappingSpliterator can't just pass our Consumer to the inner Spliterator. The only solution is to accumulate all results of the inner flatMap (with all the post-processing) into some buffer and then iterate over that buffer one element at a time.
Spliterators are designed to handle sequential processing of each item in encounter order, and parallel processing of items in some order. Each method of the Spliterator must be able to support both early binding and late binding. The buffering is intended to gather data into suitable, processable chunks, that follow the requirements for ordering, parallelization and mutability.
In other words, tryAdvance() is not the only method in the class, and other methods have to work with each other to deliver the external contract. To do that, in the face of sub-classes that may override some or all of the methods, requires that each method obey its internal contract.
This is something that I've read from Holger in quite a few posts and I'll just sum it up here; if there's an exact duplicate (I'll try to find one), I will close and delete my answer in favor of that one.
First, there is the question of why WrappingSpliterator is needed in the first place: for stateful operations like sorted, distinct, etc. - but I think you already understood that. I assume for flatMap also, since it is eager.
Now, when you call spliterator, if and only if there are no stateful operations, there is obviously no real reason to wrap it into a WrappingSpliterator, but at the moment this distinction is not made. This could be changed in a future release, where they could detect whether there are stateful operations before you call spliterator; but they don't do that now and simply treat every operation as stateful, thus wrapping it into a WrappingSpliterator.
I wonder whether there is a nicer (or just an other) approach to get the count of all items that enter the terminal operation of a stream instead of the following:
Stream<T> stream = ... // given as parameter
AtomicLong count = new AtomicLong();
stream.filter(...).map(...)
.peek(t -> count.incrementAndGet())
where count.get() gives me the actual count of the processed items at that stage.
I deliberately skipped the terminal operation as that might change between .forEach, .reduce or .collect.
I do know .count already, but it seems to work well only if I exchange a .forEach with a .map and use the .count as terminal operation instead. But it seems to me as if .map is then misused.
What I don't really like with the above solution: if a filter is added after it, it just counts the elements at that specific stage, but not the ones that are going into the terminal operation.
The other approach that comes to my mind is to collect the filtered and mapped values into a list and operate on that and just call list.size() to get the count. However this will not work, if the collection of the stream would lead to an error, whereas with the above solution I could have a count for all processed items so far, if an appropriate try/catch is in place. That however isn't a hard requirement.
It seems you already have the cleanest solution via peek before the terminal operation, IMO. The only reason I could think of why this is needed is for debugging purposes - and if that is the case, then peek was designed for that. Wrapping the Stream for that and providing separate implementations is way too much - not to mention the huge amount of time and later support for everything that gets added to Streams.
As for what happens if another filter is added: well, provide a code comment (lots of us do that) and a few test cases that would otherwise fail, for example.
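A sketch of the peek-based counter placed directly in front of the terminal operation. The class, method name and the concrete filter/map stages are illustrative only; the code comment is the kind of guard suggested above.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;

class CountBeforeTerminal {
    // Runs the pipeline and returns how many elements reached the terminal action.
    static long countConsumed(List<String> input, Consumer<String> action) {
        AtomicLong count = new AtomicLong();
        input.stream()
            .filter(s -> !s.isEmpty())
            .map(String::trim)
            // Keep this peek as the LAST stage before the terminal operation,
            // otherwise it no longer counts what the terminal operation sees.
            .peek(s -> count.incrementAndGet())
            .forEach(action);
        return count.get();
    }
}
```

For example, countConsumed(Arrays.asList("a", "", " b "), System.out::println) prints two lines and returns 2.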
Just my 0.02$
The best option I can think of is an identity mapping that counts its own invocations as a side effect:
stream.map(object -> {counter.incrementAndGet(); return object;});
Since this lambda can be reused and you can replace any lambda with an object you can create a counter object like this:
class StreamCounter<T> implements Function<T, T> {
    int counter = 0;
    public T apply(T object) { counter++; return object; }
    public int get() { return counter; }
}
So using:
StreamCounter<String> myCounter = new ...;
stream.map(myCounter)...
int count = myCounter.get();
Since the map invocation is again just another point of reuse, the counting map step can be provided by a class that extends Stream and wraps the ordinary stream.
This way you can create something like:
AtomicLong myValue = new AtomicLong();
...
convert(stream).measure(myValue).map(...).measure(mySecondValue).filter(...).measure(myThirdValue).toList(...);
This way you can have your own Stream wrapper that transparently wraps every stream in its own version (with negligible performance or memory overhead) and measures the cardinality at any such measure point.
This is often done when analyzing the complexity of algorithms when creating map/reduce solutions. If you extend your stream implementation so that it takes only the name of the measure point instead of an AtomicLong instance, it can hold an unlimited number of measure points while providing a flexible way to print a report.
Such an implementation can remember the concrete sequence of stream methods along with the position of each measure point and brings outputs like:
list -> (32k)map -> (32k)filter -> (5k)map -> avg().
Such a stream implementation is written once, can be used for testing but also for reporting.
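A much smaller sketch of the same idea, without a full Stream wrapper: a hypothetical MeasurePoints helper that attaches named counters via peek and prints them in registration order. The class, its method names and the report format are assumptions for illustration, not an existing API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.LongAdder;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class MeasurePoints {
    // Insertion-ordered, so the report lists measure points in pipeline order.
    private final Map<String, LongAdder> counts = new LinkedHashMap<>();

    // Registers a named measure point and returns the stream with a counting stage appended.
    <T> Stream<T> measure(String name, Stream<T> stream) {
        LongAdder adder = counts.computeIfAbsent(name, k -> new LongAdder());
        return stream.peek(e -> adder.increment());
    }

    long count(String name) {
        LongAdder adder = counts.get(name);
        return adder == null ? 0 : adder.sum();
    }

    String report() {
        return counts.entrySet().stream()
            .map(e -> e.getKey() + "=" + e.getValue().sum())
            .collect(Collectors.joining(", "));
    }
}
```

The measure points are registered while the pipeline is being built on a single thread; the LongAdder instances then tolerate concurrent increments if the stream is parallel.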
Built into an everyday implementation, this makes it possible to gather statistics for certain processing steps and allows dynamic optimization by using a different permutation of operations. An example would be a query optimizer.
So in your case, the best approach would be to reuse a StreamCounter first and, depending on the frequency of use, the number of counters and your affinity for the DRY principle, eventually implement a more sophisticated solution later on.
PS: StreamCounter uses an int value and is not thread-safe so in a parallel stream setup one would replace the int with an AtomicInteger instance.
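A sketch of that thread-safe variant. The name AtomicStreamCounter is made up; apart from the AtomicInteger it mirrors the StreamCounter above.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

class AtomicStreamCounter<T> implements Function<T, T> {
    private final AtomicInteger counter = new AtomicInteger();

    @Override
    public T apply(T object) {
        counter.incrementAndGet(); // atomic, so safe to call from parallel stream workers
        return object;
    }

    public int get() {
        return counter.get();
    }
}
```

Usage stays the same as before: pass the instance to map and read get() after the terminal operation has completed.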