Multiple threads in for loop - java

I have a method I need to call for each element in a list, then return this list to the caller in another class. I want to create a Thread for each element but am struggling to get my head around how to do this.
public List<MyList> threaded(List<Another> another) {
List<MyList> myList= new ArrayList<>();
Visibility visi = new Visibility();
Thread[] threads = new Thread[another.size()];
for (int i = 0; i < another.size(); i++) {
visi = test(another.get(i));
myList.add(visi);
}
return myList;
}
So I've defined an array of threads that matches the number of elements in the another list. Where I'm lost is how to use each of those threads in the loop and then return myList after all threads have finished.

This looks like a perfect use case for a parallel stream (Collection.parallelStream()):
public List<MyList> threaded(List<Another> another) {
return another.parallelStream()
.map(a -> test(a))
.collect(Collectors.toList());
}
This calls test on each Another and collects the results into a List, using as many CPUs as you have available (up to the number of objects you have).
Yes, you could create a Thread for each one, but that is likely to be less efficient and much more complicated.
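For comparison, here is a minimal sketch of the manual approach using a thread pool rather than one raw Thread per element. The class name PooledCalls and the generic work function are hypothetical stand-ins for the question's test method; the pool size is an assumption, not a recommendation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

final class PooledCalls {
    // Applies `work` to every input on a fixed-size pool and returns the results in input order.
    static <A, R> List<R> runAll(List<A> inputs, Function<A, R> work) {
        int threads = Math.max(1, Math.min(inputs.size(), Runtime.getRuntime().availableProcessors()));
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<R>> futures = new ArrayList<>();
            for (A input : inputs) {
                Callable<R> task = () -> work.apply(input);
                futures.add(pool.submit(task));
            }
            List<R> results = new ArrayList<>();
            for (Future<R> future : futures) {
                results.add(future.get()); // blocks until that task has finished
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runAll(Arrays.asList(1, 2, 3), n -> n * 10)); // [10, 20, 30]
    }
}
```

Collecting the Future objects in input order is what keeps the result list aligned with the input list, regardless of which task finishes first.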

Related

How to iterate two lists simultaneously using Java 8

I have the two lists below:
List<Map<String, String>> mapList
List<MyObject> myObjectList
Both lists have same size.
Currently I am iterating them using for loop as below.
List<CustomObject> customObjectList1 = new ArrayList<>();
List<CustomObject> customObjectList2 = new ArrayList<>();
int i = 0;
for (MyObject myObject : myObjectList) {
    if ("NEW".equalsIgnoreCase(myObject.getType())) {
        customObjectList1.add(constructCustomObject(myObject, mapList.get(i)));
    }
    if ("DELETE".equalsIgnoreCase(myObject.getType())) {
        customObjectList2.add(constructCustomObject(myObject, mapList.get(i)));
    }
    i++;
}
if(!customObjectList1.isEmpty()){
jpaRepo.saveAll(customObjectList1);
}
if(!customObjectList2.isEmpty()){
jpaRepo.deleteAll(customObjectList2);
}
Is there any better or more efficient way to iterate two lists simultaneously using Java 8?
It seems your issue centers on knowing the index of the object you are iterating over.
If you want to do it the stream way, you can try something like this.
Disclaimer: I do not think it will have a big impact on performance, since it neither frees objects for the GC nor reduces the number of iterations.
IntStream.range(0, myObjectList.size())
    .forEach(idx -> {
        MyObject myObject = myObjectList.get(idx);
        if ("NEW".equalsIgnoreCase(myObject.getType())) {
            customObjectList1.add(constructCustomObject(myObject, mapList.get(idx)));
        }
        if ("DELETE".equalsIgnoreCase(myObject.getType())) {
            customObjectList2.add(constructCustomObject(myObject, mapList.get(idx)));
        }
    });
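As a self-contained illustration of the same index-based idea, the sketch below pairs two equally sized lists by index and collects only the matching entries instead of mutating outer lists. The class PairedLists and the String/Integer element types are hypothetical simplifications of MyObject and the map list.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

final class PairedLists {
    // Combines types.get(i) with values.get(i) for every index whose type matches `wanted`.
    static List<String> select(List<String> types, List<Integer> values, String wanted) {
        return IntStream.range(0, Math.min(types.size(), values.size()))
                .filter(i -> wanted.equalsIgnoreCase(types.get(i)))
                .mapToObj(i -> types.get(i) + ":" + values.get(i))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> types = Arrays.asList("NEW", "DELETE", "new");
        List<Integer> values = Arrays.asList(1, 2, 3);
        System.out.println(select(types, values, "NEW"));    // [NEW:1, new:3]
        System.out.println(select(types, values, "DELETE")); // [DELETE:2]
    }
}
```

This walks the index range once per wanted type, so it is not faster than the single loop; it mainly avoids side effects inside the lambda.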

Remove an object from an ArrayList without (implicitly) looping through it

I am looping through a list A to find X. Then, if X has been found, it is stored into list B. After this, I want to delete X from list A. As speed is an important issue for my application, I want to delete X from A without looping through A. This should be possible as I already know the location of X in A (I found its position in the first line). How can I do this?
for(int i = 0; i<n; i++) {
Object X = methodToGetObjectXFromA();
B.add(X);
A.remove(X); // But this part is time consuming, as I unnecessarily loop through A
}
Thanks!
Instead of returning the object from the method, you can return its index and then remove by index:
int idx = methodToGetObjectIndexFromA();
Object X = A.remove(idx); // removes by index, so A is not searched again
B.add(X);
However, note that remove may still be slow, because an ArrayList has to shift all elements after the removed index.
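A minimal sketch of the index-based approach, assuming the lookup can be expressed as indexOf (the class and method names here are hypothetical). The list is searched once, and remove(int) then hands back the removed element directly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class MoveByIndex {
    // Moves the first element equal to `target` from `from` to `to`; returns false if absent.
    static <T> boolean moveFirst(List<T> from, List<T> to, T target) {
        int idx = from.indexOf(target); // the only search over `from`
        if (idx < 0) {
            return false;
        }
        to.add(from.remove(idx)); // remove(int) returns the removed element, no second search
        return true;
    }

    public static void main(String[] args) {
        List<String> a = new ArrayList<>(Arrays.asList("x", "y", "z"));
        List<String> b = new ArrayList<>();
        moveFirst(a, b, "y");
        System.out.println(a); // [x, z]
        System.out.println(b); // [y]
    }
}
```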
You can use an iterator, and if performance is an issue, it is better to use a LinkedList for the list you want to remove from:
public static void main(String[] args) {
List<Integer> aList = new LinkedList<>();
List<Integer> bList = new ArrayList<>();
aList.add(1);
aList.add(2);
aList.add(3);
int value;
Iterator<Integer> iter = aList.iterator();
while (iter.hasNext()) {
value = iter.next().intValue();
if (value == 3) {
bList.add(value);
iter.remove();
}
}
System.out.println(aList.toString()); //[1, 2]
System.out.println(bList.toString()); //[3]
}
If you stored all the objects to remove in a second collection, you may use ArrayList#removeAll(Collection)
Removes from this list all of its elements that are contained in the
specified collection.
Parameters:
c collection containing elements to be removed from this list
In this case, just call
A.removeAll(B);
after exiting your loop.
Addition: this calls ArrayList#batchRemove, which still loops over the list to remove the objects, but you do not have to write that loop yourself.
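A short sketch of the bulk removal, with one optional tweak that is not part of the answer above: copying the elements to remove into a HashSet first. In the OpenJDK implementation, removeAll asks the argument collection whether it contains each element, so a hash-based argument keeps those checks cheap. BulkRemove is a hypothetical name, and the elements are assumed to have consistent equals and hashCode.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;

final class BulkRemove {
    // Removes from `source` every element that also occurs in `toRemove`.
    static <T> void removeAllFast(List<T> source, Collection<T> toRemove) {
        source.removeAll(new HashSet<>(toRemove)); // single pass over source
    }

    public static void main(String[] args) {
        List<Integer> a = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5));
        List<Integer> b = Arrays.asList(2, 4);
        removeAllFast(a, b);
        System.out.println(a); // [1, 3, 5]
    }
}
```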

java intstream parallel loop omitting data

I have this piece of code:
ArrayList<ArrayList<Double>> results = new ArrayList<ArrayList<Double>>();
IntStream.range(0, 100).parallel().forEach(x ->{
for (int y = 0; y <100;y++){
for (int z = 0; z <100;z++){
for (int q = 0; q <100;q++){
results.add(someMethodThatReturnsArrayListDouble());
}
}
}
});
System.out.println(results.size());
After running this code, I always get a different results.size(), always a few short. Any idea why that is and how to fix it?
ArrayList is not thread-safe. If you try to add items to it from different threads (which is what a parallelized stream does), it is likely to break.
From the docs:
Note that this implementation is not synchronized. If multiple threads access an ArrayList instance concurrently, and at least one of the threads modifies the list structurally, it must be synchronized externally. (A structural modification is any operation that adds or deletes one or more elements, or explicitly resizes the backing array; merely setting the value of an element is not a structural modification.) This is typically accomplished by synchronizing on some object that naturally encapsulates the list. If no such object exists, the list should be "wrapped" using the Collections.synchronizedList method.
The easiest fix, in this case, would be to remove the call to parallel().
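If the parallel loop has to stay, the wrapper mentioned in the quoted documentation can be sketched as follows. SafeParallelAdd is a hypothetical name and the Integer payload is a placeholder for the real results; every add contends for the same lock, so this is about correctness rather than speed.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.IntStream;

final class SafeParallelAdd {
    // Adds `count` values from a parallel stream to a synchronized list and returns its size.
    static int fill(int count) {
        List<Integer> results = Collections.synchronizedList(new ArrayList<>());
        IntStream.range(0, count).parallel().forEach(i -> results.add(i));
        return results.size();
    }

    public static void main(String[] args) {
        System.out.println(fill(100_000)); // 100000
    }
}
```

The order of elements in the list is still unspecified, because forEach on a parallel stream makes no ordering guarantee.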
Your result list is not synchronized. There are multiple ways to solve your problem; the best would be to let the Java Stream API handle combining the lists.
List<List<Double>> results = IntStream.range(0, 100).parallel().boxed().flatMap(x -> {
    // Each invocation fills its own local list, so nothing is shared between threads.
    List<List<Double>> partial = new ArrayList<>();
    for (int y = 0; y < 100; y++) {
        for (int z = 0; z < 100; z++) {
            for (int q = 0; q < 100; q++) {
                partial.add(someMethodThatReturnsArrayListDouble());
            }
        }
    }
    return partial.stream();
}).collect(Collectors.toList());
This collects the lists locally inside the lambda and returns them as a stream, so the partial results are combined at the end by Collectors.toList(), which is safe to use with a parallel stream.
Use Vector; it's a thread-safe implementation of List.

Multithreaded: Identifying duplicate objects

I'm trying to implement a duplicate objects finding method over a List object. Traversing through the List and finding the duplicate objects using multiple threads is the target. So far I used ExecutorService as follows.
ExecutorService executor = Executors.newFixedThreadPool(5);
for (int i = 0; i < jobs; i++) {
Runnable worker = new TaskToDo(jobs);
executor.execute(worker);
}
executor.shutdown();
while (!executor.isTerminated()) {
}
System.out.println("Finished all threads");
In the TaskToDo class I iterate through the list. When a duplicate is detected, one of the two is removed from the List. These are the problems I faced:
With multiple threads in the executor, the result is not as intended: some duplicate values still exist in the list, whereas a single thread works perfectly. I also tried
List<String> list = Collections.synchronizedList(new LinkedList<String>()) but the same problem exists.
What is the best data structure I can use for removing duplicates with better performance?
Google suggested using concurrent structures, but it is difficult to figure out the correct approach.
Appreciate your help. Thanks in advance... :)
The following is the code for iterating through the specified list object. Here the actual content of the files is compared.
for (int i = currentTemp; i < list.size() - 1; i++) {
    if (isEqual(list.get(currentTemp), list.get(i + 1))) {
        synchronized (list) {
            list.remove(i + 1);
            i--;
        }
    }
}
With your current logic, you would have to synchronize at coarser granularity, otherwise you risk removing the wrong element.
for (int i = currentTemp; i < list.size() - 1; i++) {
synchronized (list) {
if (i + 1 < list.size() && isEqual(list.get(currentTemp), list.get(i+1))) {
list.remove(i + 1);
i--;
}
}
}
You see, the isEqual() check must be inside the synchronized block to ensure atomicity of the equivalence check with the element removal. Assuming most of your concurrent processing benefit would come from asynchronous comparison of list elements using isEqual(), this change nullifies any benefit you sought.
Also, checking list.size() outside the synchronized block isn't good enough, because list elements can be removed by other threads. And unless you have a way of adjusting your list index down when elements are removed by other threads, your code will unknowingly skip checking some elements in the list. The other threads are shifting elements out from under the current thread's for loop.
This task would be much better implemented using an additional set to keep track of indexes that should be removed:
private volatile Set<Integer> indexesToRemove =
Collections.synchronizedSet(new TreeSet<Integer>(
new Comparator<Integer>() {
@Override public int compare(Integer i1, Integer i2) {
return i2.compareTo(i1); // sort descending for later element removal
}
}
));
The above should be declared at the same shared level as your list. Then the code for iterating through the list should look like this, with no synchronization required:
int size = list.size();
for (int i = currentTemp; i < size - 1; i++) {
if (!indexesToRemove.contains(i + 1)) {
if (isEqual(list.get(currentTemp), list.get(i+1))) {
indexesToRemove.add(i + 1);
}
}
}
And finally, after you have join()ed the worker threads back to a single thread, do this to de-duplicate your list:
for (Integer i: indexesToRemove) {
list.remove(i.intValue());
}
Because we used a descending-sorted TreeSet for indexesToRemove, we can simply iterate its indexes and remove each from the list.
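The reason descending order matters is that removing an element shifts everything after it one position down, so removing the highest index first leaves the lower indexes valid. A small single-threaded sketch of that final step (RemoveByIndexes is a hypothetical helper, not part of the code above):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

final class RemoveByIndexes {
    // Removes the elements at the given positions, highest index first.
    static <T> void removeAt(List<T> list, Collection<Integer> indexes) {
        TreeSet<Integer> descending = new TreeSet<>(Comparator.reverseOrder());
        descending.addAll(indexes);
        for (int index : descending) {
            list.remove(index); // int argument selects remove(int index), not remove(Object)
        }
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b", "c", "d", "e"));
        removeAt(list, Arrays.asList(1, 3));
        System.out.println(list); // [a, c, e]
    }
}
```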
If your algorithm acts on sufficient data that you might really benefit from multiple threads, you encounter another issue that will tend to mitigate any performance benefits. Each thread has to scan the entire list to see if the element it is working on is a duplicate, which will cause the CPU cache to keep missing as various threads compete to access different parts of the list.
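This kind of cache contention is related to what is known as False Sharing (strictly, threads invalidating each other's cache lines by writing to nearby data).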
Even if False Sharing does not get you, you are de-duping the list in O(N^2) because for each element of the list, you re-iterate the entire list.
Instead, consider using a Set to initially collect the data. If you cannot do that, test the performance of adding the list elements to a Set. That should be a very efficient way to approach this problem.
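A minimal sketch of the Set-based approach, assuming the elements implement equals and hashCode consistently. LinkedHashSet is chosen here only to keep the first occurrence of each element in its original position; the class name Dedup is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

final class Dedup {
    // Returns the distinct elements of `items`, keeping the first occurrence of each.
    static <T> List<T> distinctInOrder(List<T> items) {
        return new ArrayList<>(new LinkedHashSet<>(items));
    }

    public static void main(String[] args) {
        System.out.println(distinctInOrder(Arrays.asList("a", "b", "a", "c", "b"))); // [a, b, c]
    }
}
```

Each element is hashed once, so the whole pass is roughly linear in the list size rather than quadratic.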
If you're trying to dedup a large number of files, you really ought to be using a hash-based structure. Concurrently modifying lists is dangerous, not least because indexes into the list will constantly be changing out from under you, and that's bad.
If you can use Java 8, my approach would look something like this. Let's assume you have a List<String> fileList.
Collection<String> deduplicatedFiles = fileList.parallelStream()
.map(FileSystems.getDefault()::getPath) // convert strings to Paths
.collect(Collectors.toConcurrentMap(
path -> {
try {
return ByteBuffer.wrap(Files.readAllBytes(path));
// read out the file contents and wrap in a ByteBuffer
// which is a suitable key for a hash map
} catch (IOException e) {
throw new RuntimeException(e);
}
},
path -> path.toString(), // in the values, convert back to string
(first, second) -> first) // resolve duplicates by choosing arbitrarily
.values();
That's the entire thing: it concurrently reads all the files, hashes them (though with an unspecified hash algorithm that may not be great), deduplicates them, and spits out a list of files with distinct contents.
If you're using Java 7, then what I'd do would be something like this.
CompletionService<Void> service = new ExecutorCompletionService<>(
    Executors.newFixedThreadPool(4));
final ConcurrentMap<ByteBuffer, String> unique = new ConcurrentHashMap<>();
for (final String file : fileList) {
    service.submit(new Runnable() {
        @Override public void run() {
            try {
                ByteBuffer buffer = ByteBuffer.wrap(Files.readAllBytes(
                    FileSystems.getDefault().getPath(file)));
                unique.putIfAbsent(buffer, file);
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
        }
    }, null);
}
for (int i = 0; i < fileList.size(); i++) {
    service.take(); // blocks until one more task has completed; may throw InterruptedException
}
Collection<String> result = unique.values();

How to compare two Arraylist values in java?

I have two ArrayLists, RunningProcessList and AllProcessList, which contain the following values:
RunningProcessList:
Receiver.jar
AllProcessList:
Receiver.jar
Sender.jar
Timeout.jar
TimeourServer.jar
The AllProcessList ArrayList contains all Java processes; the RunningProcessList ArrayList contains the currently running processes. I want to compare these two lists and display the processes that are not running. For example, comparing the two lists above should display the following processes as not running.
Result:
Sender.jar
Timeout.jar
TimeourServer.jar
I used the following code but it's not working.
Object Result = null;
for (int i = 0; i <AllProcessList.size(); i++) {
for (int j = 0; j < RunningProcessList.size(); j++) {
if( AllProcessList.get(i) != ( RunningProcessList.get(j))) {
System.out.println(RunningProcessList.get(j));
Result =RunningProcessList.get(j);
}
if(AllProcessList.get(i) != ( RunningProcessList.get(j))) {
list3.add(Result);
}
}
}
Take a look at the documentation for List, especially the removeAll() method.
List result = new ArrayList(AllProcessList);
result.removeAll(RunningProcessList);
You could then iterate over that list and call System.out.println if you wanted, as you've done above... but is that what you want to do?
Assuming your lists are not too long, you can just collect all elements of AllProcessList that are not in RunningProcessList:
for (Object process : AllProcessList) {
if (!RunningProcessList.contains(process)) {
list3.add(process);
}
}
It's important that RunningProcessList contains the same instances as AllProcessList (or that the objects implement a functional equals method).
It would be better if your lists contained instances of Process (or some other dedicated class).
List<Process> AllProcessList = new ArrayList<Process>();
List<Process> RunningProcessList = new ArrayList<Process>();
List<Process> list3 = new ArrayList<Process>();
...
for (Process process : AllProcessList) {
if (!RunningProcessList.contains(process)) {
list3.add(process);
}
}
English is not my first (neither second) language, any correction is welcome
Hi lakshmi,
I upvoted noelmarkham's answer as I think it's the best code wise and suits Your needs. So I'm not going to add another code snippet to this already long list, I just wanted to point You towards two things:
If Your processes are unique (their name/id whatever), You might consider to use (Hash)Sets in order to store them for better performance of Your desired operations. This should only be a concern when Your lists are large.
What about using ActiveProcesses and InactiveProcesses instead of Your current two lists? If a process changes its state, You just have to remove it from one list and insert it into the other. This would lead to an overall cleaner design, and You could access the not-running processes immediately.
Greetings
Depending on the type of the elements in AllProcessList and RunningProcessList (which should be named allProcessList and runningProcessList to follow the Java naming conventions), the following will not work:
if ( AllProcessList.get(i) != ( RunningProcessList.get(j))) {
you should replace it with
if (!(AllProcessList.get(i).equals(RunningProcessList.get(j)))) {
!= compares physical equality: are the two things the exact same "new"ed object?
.equals(Object) compares logical equality: are the two things the "same"?
To do that you will need to override the equals and hashCode methods. Here is an article on that.
If the class is a built in Java library one then odds are equals and hashCode are done.
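For a custom class, overriding both methods could look like the sketch below. ProcessEntry and its jarName field are hypothetical; the assumption is that two entries count as the same process when their jar names match.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

final class ProcessEntry {
    private final String jarName;

    ProcessEntry(String jarName) {
        this.jarName = Objects.requireNonNull(jarName);
    }

    // Logical equality: same jar name means same process entry.
    @Override public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof ProcessEntry)) {
            return false;
        }
        return jarName.equals(((ProcessEntry) other).jarName);
    }

    // Must agree with equals so hash-based collections behave correctly.
    @Override public int hashCode() {
        return jarName.hashCode();
    }

    @Override public String toString() {
        return jarName;
    }

    public static void main(String[] args) {
        List<ProcessEntry> running = Arrays.asList(new ProcessEntry("Receiver.jar"));
        System.out.println(running.contains(new ProcessEntry("Receiver.jar"))); // true
        System.out.println(running.contains(new ProcessEntry("Sender.jar")));   // false
    }
}
```

With these overrides in place, contains and removeAll compare entries by jar name instead of by object identity.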
For sorted lists, the following is O(n). If a sort is needed, this method becomes O(n log n).
public <T> void compareLists(final List<T> allProcesses, final List<T> runningProcesses) {
// Assume lists are sorted; if not, call Collections.sort() on each list (making this O(n log n))
final Iterator<T> allIter = allProcesses.iterator();
final Iterator<T> runningIter = runningProcesses.iterator();
T allEntry;
T runningEntry;
while (allIter.hasNext() && runningIter.hasNext()) {
allEntry = allIter.next();
runningEntry = runningIter.next();
while (!allEntry.equals(runningEntry) && allIter.hasNext()) {
System.out.println(allEntry);
allEntry = allIter.next();
}
// Now we know allEntry == runningEntry, so we can go through to the next iteration
}
// No more running processes, so just print the remaining entries in the all processes list
while (allIter.hasNext()) {
System.out.println(allIter.next());
}
}
