Improve Performance with Multiple Threads - java

I'm writing a Java program to solve this problem:
I have a balanced tree (namely, a TreeSet in Java) containing values. I have "Task" objects that will do either of the two things: try to find a value in the tree, or add a value to the tree. I will have a list of these "Task" objects (I used a LinkedList in Java) and I create threads to read and remove the tasks from this list one by one and perform their required action (i.e., find or add a value in the tree). I have created a synchronized "remove" method for my task list (which simply calls the underlying LinkedList's "remove" method). I have also defined the "add" method of the tree to be synchronized... (I don't know if it's necessary for it to be synchronized or not, but I assume it is).
How can I improve the performance of this program when using multiple threads? Right now, if I use a single thread, the time is better than when I use multiple threads.
This is the run method of my TaskRunner class. My threads are objects of this class, which implements Runnable; tasks is the list containing the tasks, and tree is my TreeSet, passed to this object in the constructor:
Task task;
int action; // '0' for search, '1' for add
int value; // Value to be used for searching or adding
while (!tasks.isEmpty()) {
try { task = tasks.remove(); }
catch (NoSuchElementException ex) { break; }
action = task.getAction();
value = task.getValue();
if (action == 0) {
boolean found = tree.contains(value); // a declaration must sit inside a block to compile
} else {
tree.add(value);
}
}
Also, my tree inherits from TreeSet<Integer> in Java and I have defined its add method as synchronized:
public synchronized boolean add(Integer e) {
return super.add(e);
}
And my task list inherits from LinkedList<Task> and its remove method:
public synchronized Task remove() {
return super.remove();
}

If your Task class implements the Runnable interface, you can submit the tasks to a thread pool (an ExecutorService) instead of managing the threads yourself.
Here is an example:
public class TreeSetTaskExample {
public static class Task implements Runnable {
String value;
boolean add;
Set<String> synchronizedTreeSet;
public Task(String value, boolean add, Set<String> synchronizedTreeSet) {
this.value = value;
this.add = add;
this.synchronizedTreeSet = synchronizedTreeSet;
}
@Override
public void run() {
String threadName = Thread.currentThread().toString();
if (add) {
System.out.println(threadName + "# add: " + value);
synchronizedTreeSet.add(value);
} else {
boolean contains = synchronizedTreeSet.contains(value);
System.out.println(threadName + "# treeSet.contains: " + value + " = " + contains + " removed...");
if (contains) {
synchronizedTreeSet.remove(value);
}
}
}
}
public static void main(String[] args) throws InterruptedException {
//
// synchronizedSet
//
Set<String> treeSet = Collections.synchronizedSet(new TreeSet<String>());
//
// ThreadPool with ? Threads
//
int processors = Runtime.getRuntime().availableProcessors();
ExecutorService threadPool = Executors.newFixedThreadPool(processors);
for (int i = 0; i < 100; i++) {
String someValue = "" + (i % 5);
boolean addOrCheck = Math.random() > 0.5;
threadPool.execute(new Task(someValue, addOrCheck, treeSet));
}
//
// don't forget to kill the threadpool
//
threadPool.shutdown();
}
}
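One option the answer above does not cover, offered as an assumption rather than a measured fix: a single lock around the TreeSet makes every add wait for every other add, so extra threads mostly queue up. java.util.concurrent.ConcurrentSkipListSet is a sorted set that allows concurrent add and contains without one global lock. The sketch below also waits for the pool to finish, which shutdown() alone does not do. The class and method names (ConcurrentTreeTasks, runTasks) are made up for illustration.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentSkipListSet;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class ConcurrentTreeTasks {

    // Hypothetical helper: adds the values 0..count-1 to a sorted concurrent set
    // from a fixed pool and returns the set once every task has finished.
    static Set<Integer> runTasks(int count) {
        Set<Integer> tree = new ConcurrentSkipListSet<>();
        int processors = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(processors);
        for (int i = 0; i < count; i++) {
            final int value = i;
            pool.execute(() -> tree.add(value));
        }
        pool.shutdown(); // stop accepting new tasks; already submitted ones still run
        try {
            // shutdown() does not block, so wait explicitly for the tasks to complete
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                pool.shutdownNow();
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return tree;
    }

    public static void main(String[] args) {
        System.out.println("values stored: " + runTasks(1000).size());
    }
}
```

Whether this beats the single-threaded version depends on how much work each task does; for tasks as small as one set lookup, thread hand-off can still cost more than the work itself, so it is worth timing both.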

Related

Concurrent Modification Exception in Callable class

I'm trying to split a list of objects into smaller sublists and process them separately on different threads. I have the following code:
List<Instance> instances = xmlInstance.readInstancesFromXml();
List<Future<List<Instance>>> futureList = new ArrayList<>();
int nThreads = 4;
ExecutorService executor = Executors.newFixedThreadPool(nThreads);
final List<List<Instance>> instancesPerThread = split(instances, nThreads);
for (List<Instance> instancesThread : instancesPerThread) {
if (instancesThread.isEmpty()) {
break;
}
Callable<List<Instance>> callable = new MyCallable(instancesThread);
Future<List<Instance>> submit = executor.submit(callable);
futureList.add(submit);
}
instances.clear();
for (Future<List<Instance>> future : futureList) {
try {
final List<Instance> instancesFromFuture = future.get();
instances.addAll(instancesFromFuture);
} catch (InterruptedException | ExecutionException e) {
e.printStackTrace();
}
}
executor.shutdown();
try {
executor.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
} catch (InterruptedException ie) {
ie.printStackTrace();
}
And the MyCallable class :
public class MyCallable implements Callable<List<Instance>> {
private List<Instance> instances;
public MyCallable (List<Instance> instances) {
this.instances = Collections.synchronizedList(instances);
}
@Override
public List<Instance> call() throws Exception {
for (Instance instance : instances) {
//process each object and changing some fields;
}
return instances;
}
}
The split method (it splits a given list into the given number of sublists, keeping the sublists almost the same size):
public static List<List<Instance>> split(List<Instance> list, int nrOfThreads) {
List<List<Instance>> parts = new ArrayList<>();
final int nrOfItems = list.size();
int minItemsPerThread = nrOfItems / nrOfThreads;
int maxItemsPerThread = minItemsPerThread + 1;
int threadsWithMaxItems = nrOfItems - nrOfThreads * minItemsPerThread;
int start = 0;
for (int i = 0; i < nrOfThreads; i++) {
int itemsCount = (i < threadsWithMaxItems ? maxItemsPerThread : minItemsPerThread);
int end = start + itemsCount;
parts.add(list.subList(start, end));
start = end;
}
return parts;
}
When I execute it, I get a java.util.ConcurrentModificationException on the line for (Instance instance : instances) {. Can somebody explain why this happens?
public MyCallable (List<Instance> instances) {
this.instances = Collections.synchronizedList(instances);
}
Using synchronizedList like this doesn't help you in the way you think it might.
It's only useful to wrap a list in a synchronizedList at the time you create it (e.g. Collections.synchronizedList(new ArrayList<>())). Otherwise, the underlying list is directly accessible, and thus accessible in an unsynchronized way.
Additionally, synchronizedList only synchronizes for the duration of individual method calls, not for the whole time while you are iterating over it.
The easiest fix here is to take a copy of the list in the constructor:
this.instances = new ArrayList<>(instances);
Then, nobody else has access to that list, so they can't change it while you are iterating it.
This is different to taking a copy of the list in the call method, because the copy is done in a single-threaded part of the code: no other thread can be modifying it while you are taking that copy, so you won't get the ConcurrentModificationException (you can get a CME in single-threaded code, but not using this copy constructor). Doing the copy in the call method means the list is iterated, in exactly the same way as with the for loop you already have.
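To make the difference between a view and a copy concrete, here is a small single-threaded sketch. It assumes the trigger in the question is the instances.clear() call in the main thread, which structurally modifies the list that every subList view is backed by. The class and method names are hypothetical, and the behaviour shown is that of java.util.ArrayList.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

final class SubListCopyDemo {

    // Iterates a subList view after the parent list was cleared and reports
    // whether that raised a ConcurrentModificationException.
    static boolean viewFailsAfterClear() {
        List<Integer> parent = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        List<Integer> view = parent.subList(0, 2); // a view backed by parent, not a copy
        parent.clear();                            // structural change to the backing list
        try {
            for (Integer ignored : view) {
                // never reached: creating the iterator already detects the change
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Copies the sublist first, as the answer suggests, then clears the parent.
    static List<Integer> copySurvivesClear() {
        List<Integer> parent = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        List<Integer> copy = new ArrayList<>(parent.subList(0, 2)); // independent copy
        parent.clear();
        return copy;
    }

    public static void main(String[] args) {
        System.out.println("view fails after clear: " + viewFailsAfterClear());
        System.out.println("copy after clear: " + copySurvivesClear());
    }
}
```

In the multi-threaded program the same thing happens, only the clear() runs on the main thread while the workers are still iterating their views.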

Looping over a huge list, check string equal to true

I have a requirement as below:
List<User> userList = listOfUsers(); // More than 50,000 users
I need to check the status of the users in this list: if any one of the users is active, break out of the loop.
What is an efficient way to handle this in Java?
Java 8 solution with method reference:
userList.stream().filter(User::isActive).findFirst()
It returns an Optional<User>, so you can map over it or call isPresent() on it.
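If only a yes/no answer is needed, Stream.anyMatch is a close relative of the filter/findFirst pipeline: it also stops at the first match but returns a boolean directly. A minimal sketch, where the nested User class is a hypothetical stand-in for the question's type:

```java
import java.util.Arrays;
import java.util.List;

final class AnyActiveDemo {

    // Hypothetical stand-in for the question's User type.
    static final class User {
        private final boolean active;

        User(boolean active) {
            this.active = active;
        }

        boolean isActive() {
            return active;
        }
    }

    // Short-circuits: stops pulling elements as soon as one user is active.
    static boolean anyActive(List<User> users) {
        return users.stream().anyMatch(User::isActive);
    }

    public static void main(String[] args) {
        List<User> users = Arrays.asList(new User(false), new User(true), new User(false));
        System.out.println("any active: " + anyActive(users));
    }
}
```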
One way to speed up the search without Java 8 is to search from both ends of the ArrayList at the same time (from the beginning to the middle, and from the end to the middle) using two threads. I created this example and tested it against 1 million users to check whether any of them is active. Note that I made only one user active and put it in the middle, to get the longest time the search may take.
import java.util.ArrayList;
public class User {
// some fields to test
String name;
boolean active;
// volatile guarantees that a write to the variable by one thread
// is visible to later reads by all other threads,
// so the threads can use these flags to signal each other
static volatile boolean finishFirst = false; // to announce first thread finish
static volatile boolean finishSecond = false; // to announce second thread finish
static volatile boolean found = false; // to announce that an active user was found
/**
* Simple Constructor
* @param name
* @param active
*/
public User(String name, boolean active){
this.name = name;
this.active = active;
}
public static void main(String[] args) {
// create an ArrayList of type User
ArrayList<User> list = new ArrayList<User>();
// populate it with 1 MILLION user!!
int i=0;
for(;i<1000000; i++){
// make only the one in the very middle active to prolong the search to max
if(i==500000){
list.add(new User(String.valueOf(i),true));
}
else{
list.add(new User(String.valueOf(i),false));
}
}
System.out.println("End of Adding " + i + " User" );
// to measure how long it will take
long startTime, endTime;
startTime = System.currentTimeMillis();
System.out.println("Found Any Active: "+ isAnyActive(list)); // invoke the method
endTime = System.currentTimeMillis();
System.out.println(endTime-startTime + " MilliScond");
}
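public static boolean isAnyActive(final ArrayList<User> list){ // final so the anonymous classes can capture it before Java 8
// reset the shared flags so the method can be called more than once
found = false;
finishFirst = false;
finishSecond = false;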
// create two threads, each search the half of the array
// so that shall save time to half
Thread t1 = new Thread(new Runnable(){
@Override
public void run() {
// the second thread covers every index from size/2 upward,
// so this half only needs the indices below size/2
for(int i=0; i<list.size()/2; i++){
if(list.get(i).active) {
found = true;
finishFirst = true;
break;
}
}
finishFirst = true; // in case did not find any
}
});
// second thread the same, but read from the end to the middle
Thread t2 = new Thread(new Runnable(){
public void run() {
for(int i=list.size()-1; i>=list.size()/2; i--){
if(list.get(i).active) {
found = true;
finishSecond = true;
break;
}
}
finishSecond = true;
}
});
// start both thread
t2.start();
t1.start();
// while one of them has not finished yet
while(!finishFirst || !finishSecond){
// but in case not finished looping but found an active user
// break the loop
if(found){break;}
}
return found; // return the result
}
}
Test
End of Adding 1000000 User
Found Any Active: true
31 MilliScond
The most efficient way is to do the filtering in SQL, if the users come from a database: select just the active users.
Once you have the whole list in Java, there is no magic: you will need to iterate.
public User getActiveUserFromList(List<User> userList) {
for (User user : userList) {
if (user.isActive()) {
return user;
}
}
return null; // reached only when no user is active
}
If the list is ordered in some way, you can exploit that. Let's assume it is ordered by active status:
public boolean isAnyActive(List<User> userList) {
if (userList.isEmpty()) {
return false;
}
if (userList.get(0).isActive()) { // try first
return true;
}
if (userList.get(userList.size() - 1).isActive()) { // if it's ordered and there is an active user, the last one will surely be active, since the first wasn't
return true;
}
return false;
}
I would certainly consider using a Java 8 stream with a lambda. I have written an example class:
package com.chocksaway;
import java.util.ArrayList;
import java.util.List;
/**
* Author milesd on 05/06/2017.
*/
class Name {
private String name;
private Boolean status;
public Name(String name, Boolean status) {
this.name = name;
this.status = status;
}
public String getName() {
return name;
}
public Boolean getStatus() {
return status;
}
}
public class FindFirstInStream {
public static void main(String[] args) {
List<Name> userList = new ArrayList<>();
userList.add(new Name("James", false));
userList.add(new Name("Eric", true));
userList.add(new Name("David", false));
Name firstActiveName = userList.stream()
.filter(e -> e.getStatus().equals(true))
.findFirst()
.get();
System.out.println(firstActiveName.getName());
}
}
I've created a Name class, with name, and status.
I populate a userList with James, Eric, and David.
I use Java 8 stream to filter, and return the first "active" name (Eric).
This is stored in "firstActiveName".
You could use an ArrayDeque and poll from both ends in each iteration, which halves the number of loop iterations (though not the number of elements checked). In your case:
ArrayDeque<User> sample = new ArrayDeque<>(userList);
while (!sample.isEmpty()) {
if (sample.pollFirst().status.equalsIgnoreCase("A")) {
break;
}
User last = sample.pollLast(); // null if the deque has just become empty
if (last != null && last.status.equalsIgnoreCase("A")) {
break;
}
}
I am adding this answer because many of the Java 8 stream solutions here do not use parallel streams. You are matching against a large collection, so if you opt for Java 8 you can use parallelStream:
Optional<User> result = userList.parallelStream().filter(User::isActive).findAny();
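Using a parallelStream splits the stream into multiple sub-streams, which can be faster for very large collections, although for a predicate as cheap as this one the cost of splitting and coordinating threads may outweigh the gain, so measure before relying on it. It uses the common ForkJoinPool internally to process these sub-streams. The only other difference here is that I use findAny() instead of findFirst() in this solution.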
This is what Javadoc has to say about findAny():
The behavior of this operation is explicitly nondeterministic; it is
free to select any element in the stream. This is to allow for maximal
performance in parallel operations; the cost is that multiple
invocations on the same source may not return the same result. (If a
stable result is desired, use findFirst() instead.)
Here is a nice tutorial on Parallelism from Oracle.

Java: Fail in synchronizing threads

I have the following code:
for (int iThreadCounter = 1; iThreadCounter <= CONNECTIONS_NUM; iThreadCounter++){
WorkThread wt = new WorkThread(iThreadCounter);
new Thread(wt).start();
m_arrWorkThreadsToCreate.add(wt);
}
Those threads call the following code:
int res = m_spLegJoin.call(m_workTread, m_workTread.getConfId());
And this is the call method inside LegJoinSp class:
public class LegJoinSp extends ConnEventSp {
private static final int _LEG_JOIN_ACTION_CODE = 22;
private static int m_nLegId = Integer.valueOf(IniUtils.getIniValue("General", "LEG_ID_START"));
private final Lock m_lock = new ReentrantLock();
public int call(WorkThread a_workThread, String a_sConfId) {
synchronized (this) {
//m_lock.lock();
m_nLegId++;
boolean bPass = false;
Log4jWrapper.writeLog(LogLevelEnum.DEBUG, "LegJoinSp - call", "a_workThread = " + a_workThread.getThreadId() + " a_sConfId = " + a_sConfId);
if (super.call(a_workThread, a_sConfId, _LEG_JOIN_ACTION_CODE, "" + m_nLegId) == 0) {
bPass = true;
} else {
bPass = false;
}
//m_lock.unlock();
if (bPass) {
Log4jWrapper.writeLog(LogLevelEnum.DEBUG, "LegJoinSp - call", "a_workThread = " + a_workThread.getThreadId() + " a_sConfId = " + a_sConfId + " returned leg id " + m_nLegId);
return m_nLegId;
} else {
return -1;
}
}
}
public Lock getLock() {
return m_lock;
}
}
I've got 2 threads calling this call() method.
m_nLegId is initialized to 100.
As you can see, I have tried to lock the method with both
synchronized(this)
and
m_lock.lock() and m_lock.unlock()
The problem is that when I first reach the code inside if (bPass), it writes 102 to my log as the m_nLegId value. However, I expect it to be 101 because of the m_nLegId++; statement.
It seems that the second thread manages to get inside the code before the synchronized block ends for the first thread's execution.
How can I fix that?
Thank you
I think your bug comes from the fact that m_nLegId is a static field, but you synchronize on the current instance instead of on the class, so threads that use different LegJoinSp instances are not prevented from modifying the field concurrently.
I mean
synchronized (this) {
Should rather be
synchronized (LegJoinSp.class) {
NB: In case you only need a counter, consider using an AtomicInteger for your field instead of an int.
The thing is that you are creating a new object for every thread, but the lock you applied only works among threads that share the same object (because you lock on this).
If you want the lock at class level, you can create a static object and synchronize on it, which should achieve what you wanted (if I understood your problem correctly from the comments).
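The AtomicInteger suggestion can be sketched as follows. This is a minimal illustration and not the asker's class: LegIdGenerator and nextLegId are hypothetical names, and the start value of 100 is taken from the question. The point is that incrementAndGet returns the value this particular call produced, so keeping it in a local variable means the log line and the return value cannot be changed by another thread in between.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class LegIdGenerator {

    // One counter shared by every instance and every thread.
    // 100 mirrors the start value mentioned in the question.
    private static final AtomicInteger NEXT_LEG_ID = new AtomicInteger(100);

    // Atomically increments the counter and returns the new value.
    static int nextLegId() {
        return NEXT_LEG_ID.incrementAndGet();
    }

    public static void main(String[] args) {
        // Keep the id in a local: re-reading a shared field later could show
        // a value that another thread has already incremented.
        int legId = nextLegId();
        System.out.println("returned leg id " + legId);
    }
}
```

No synchronized block is needed for the counter itself; whether the surrounding super.call(...) still needs locking depends on what that method does, which the question does not show.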

Performing a long calculation that returns after a timeout

I want to perform a search using iterative deepening, meaning every time I do it, I go deeper and it takes longer. There is a time limit (2 seconds) to get the best result possible. From what I've researched, the best way to do this is using an ExecutorService, a Future and interrupting it when the time runs out. This is what I have at the moment:
In my main function:
ExecutorService service = Executors.newSingleThreadExecutor();
ab = new AB();
Future<Integer> f = service.submit(ab);
Integer x = 0;
try {
x = f.get(1990, TimeUnit.MILLISECONDS);
}
catch(TimeoutException e) {
System.out.println("cancelling future");
f.cancel(true);
}
catch(Exception e) {
throw new RuntimeException(e);
}
finally {
service.shutdown();
}
System.out.println(x);
And the Callable:
public class AB implements Callable<Integer> {
public AB() {}
public Integer call() throws Exception {
Integer x = 0;
int i = 0;
while (!Thread.interrupted()) {
x = doLongComputation(i);
i++;
}
return x;
}
}
I have two problems:
1. doLongComputation() isn't being interrupted; the program only checks whether Thread.interrupted() is true after it completes the work. Do I need to put checks in doLongComputation() to see if the thread has been interrupted?
2. Even if I get rid of the doLongComputation(), the main method isn't receiving the value of x. How can I ensure that my program waits for the Callable to "clean up" and return the best x so far?
To answer part 1: Yes, you need to have your long task check the interrupted flag. Interruption requires the cooperation of the task being interrupted.
Also, you should use Thread.currentThread().isInterrupted() unless you specifically want to clear the interrupt flag. Code that throws (or rethrows) InterruptedException uses Thread#interrupted as a convenient way to both check the flag and clear it; when you're writing a Runnable or Callable, this is usually not what you want.
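Here is a rough sketch of what "cooperation" looks like in practice. The body of doLongComputation is invented (the question does not show it), so treat the loop as a placeholder for the real search; the class name and the stopsAfterCancel helper are likewise hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

final class InterruptibleWork {

    // Placeholder for the real computation: it polls the interrupt flag inside
    // its own loop so it can stop early, and does not clear the flag.
    // Returns the number of steps it completed.
    static long doLongComputation(long maxSteps) {
        long steps = 0;
        while (steps < maxSteps && !Thread.currentThread().isInterrupted()) {
            steps++;
        }
        return steps;
    }

    // Starts the computation, cancels it after delayMillis, and reports whether
    // the worker stopped within two seconds of the cancel.
    static boolean stopsAfterCancel(long delayMillis) {
        ExecutorService service = Executors.newSingleThreadExecutor();
        try {
            Future<Long> future = service.submit(() -> doLongComputation(Long.MAX_VALUE));
            Thread.sleep(delayMillis);
            future.cancel(true); // interrupts the worker thread if the task is running
            service.shutdown();
            return service.awaitTermination(2, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            service.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + stopsAfterCancel(100));
    }
}
```

Without the isInterrupted() check inside the loop, cancel(true) would set the flag but the task would keep running until it finished on its own.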
Now to answer part 2: Cancellation isn't what you want here.
Using cancellation to stop the computation and return an intermediate result doesn't work: once you cancel the future, you can't retrieve the return value from the get method. What you could do is make each refinement of the computation its own task, so that you submit one task, get the result, then submit the next using the result as a starting point, saving the latest result as you go.
Here's an example I came up with to demonstrate this, calculating successive approximations of a square root using Newton's method. Each iteration is a separate task which gets submitted (using the previous task's approximation) when the previous task completes:
import java.util.concurrent.*;
import java.math.*;
public class IterativeCalculation {
static class SqrtResult {
public final BigDecimal value;
public final Future<SqrtResult> next;
public SqrtResult(BigDecimal value, Future<SqrtResult> next) {
this.value = value;
this.next = next;
}
}
static class SqrtIteration implements Callable<SqrtResult> {
private final BigDecimal x;
private final BigDecimal guess;
private final ExecutorService xs;
public SqrtIteration(BigDecimal x, BigDecimal guess, ExecutorService xs) {
this.x = x;
this.guess = guess;
this.xs = xs;
}
public SqrtResult call() {
BigDecimal nextGuess = guess.subtract(guess.pow(2).subtract(x).divide(new BigDecimal(2).multiply(guess), RoundingMode.HALF_EVEN));
return new SqrtResult(nextGuess, xs.submit(new SqrtIteration(x, nextGuess, xs)));
}
}
public static void main(String[] args) throws Exception {
long timeLimit = 10000L;
ExecutorService xs = Executors.newSingleThreadExecutor();
try {
long startTime = System.currentTimeMillis();
Future<SqrtResult> f = xs.submit(new SqrtIteration(new BigDecimal("612.00"), new BigDecimal("10.00"), xs));
for (int i = 0; System.currentTimeMillis() - startTime < timeLimit; i++) {
f = f.get().next;
System.out.println("iteration=" + i + ", value=" + f.get().value);
}
f.cancel(true);
} finally {
xs.shutdown();
}
}
}

Why does a concurrent hash map work properly when accessed by two threads, one using the clear() and the other using the putIfAbsent() methods?

I am implementing an application using concurrent hash maps. It is required that one thread adds data into the CHM, while there is another thread that copies the values currently in the CHM and erases it using the clear() method. When I run it, after the clear() method is executed, the CHM always remains empty, though the other thread continues adding data to CHM.
Could someone tell me why this happens and help me find a solution?
This is the method that adds data to the CHM. This method is called from within a thread.
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ConcurrentHashMap;
public static ConcurrentMap<String, String> updateJobList = new ConcurrentHashMap<String, String>(8, 0.9f, 6);
public void setUpdateQuery(String ticker, String query)
throws RemoteException {
dataBaseText = "streamming";
int n = 0;
try {
updateJobList.putIfAbsent(ticker, query);
}
catch(Exception e)
{e.printStackTrace();}
........................
}
Another thread calls the track_allocation method every minute.
public void track_allocation()
{
class Track_Thread implements Runnable {
String[] track;
Track_Thread(String[] s)
{
track = s;
}
public void run()
{
}
public void run(String[] s)
{
MonitoringForm.txtInforamtion.append(Thread.currentThread()+"has started runnning");
String query = "";
track = getMaxBenefit(track);
track = quickSort(track, 0, track.length-1);
for(int x=0;x<track.length;x++)
{
query = track[x].split(",")[0];
try
{
DatabaseConnection.insertQuery(query);
}
catch(Exception e)
{
e.printStackTrace();
}
}
}
}
joblist = updateJobList.values();
MonitoringForm.txtInforamtion.append("\nSize of the joblist is:"+joblist.size());
int n = joblist.size()/6;
String[][] jobs = new String[6][n+6];
MonitoringForm.txtInforamtion.append("number of threads:"+n);
int i = 0;
if(n>0)
{
MonitoringForm.txtInforamtion.append("\nSize of the joblist is:"+joblist.size());
synchronized(this)
{
updateJobList.clear();
}
Thread[] threads = new Thread[6];
Iterator it = joblist.iterator();
int k = 0;
for(int j=0;j<6; j++)
{
for(k = 0; k<n; k++)
{
jobs[j][k] = it.next().toString();
MonitoringForm.txtInforamtion.append("\n\ninserted into queue:\n"+jobs[j][k]+"\n");
}
if(it.hasNext() && j == 5)
{
while(it.hasNext())
{
jobs[j][++k] = it.next().toString();
}
}
threads[j] = new Thread(new Track_Thread(jobs[j]));
threads[j].start();
}
}
}
I can see a glaring mistake. This is the implementation of your Track_Thread class's run method:
public void run()
{
}
So, when you do this:
threads[j] = new Thread(new Track_Thread(jobs[j]));
threads[j].start();
..... the thread starts, and then immediately ends, having done absolutely nothing. Your run(String[]) method is never called!
In addition, your approach of iterating the map and then clearing it while other threads are simultaneously adding is likely to lead to entries occasionally being removed from the map without the iteration actually seeing them.
While I have your attention, you have a lot of style errors in your code:
The indentation is a mess.
You have named your class incorrectly: it is NOT a thread, and the identifier Track_Thread violates the Java naming conventions.
Your use of white-space in statements is inconsistent.
These things make your code hard to read ... and to be frank, they put me off trying to really understand it.
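On the point about iterating the map and then clearing it: values() on a ConcurrentHashMap is a live view, so once clear() has run, a collection obtained from values() is empty as well, and anything added between the iteration and the clear() is lost. One possible way around both problems, sketched under the assumption that each job only needs to be taken out exactly once, is to remove entries one by one with the two-argument remove(key, value) and collect what was actually removed. JobListDrain and drain are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class JobListDrain {

    // Removes exactly the entries that the iteration observed and returns their
    // values. Entries added by other threads during the loop stay in the map
    // for the next call instead of being wiped by clear().
    static List<String> drain(ConcurrentMap<String, String> jobs) {
        List<String> drained = new ArrayList<>();
        for (Map.Entry<String, String> entry : jobs.entrySet()) {
            String key = entry.getKey();
            String value = entry.getValue();
            // remove(key, value) succeeds only if key is still mapped to value
            if (jobs.remove(key, value)) {
                drained.add(value);
            }
        }
        return drained;
    }

    public static void main(String[] args) {
        ConcurrentMap<String, String> jobs = new ConcurrentHashMap<>();
        jobs.putIfAbsent("ABC", "update ABC");
        jobs.putIfAbsent("XYZ", "update XYZ");
        System.out.println("drained " + drain(jobs).size() + " jobs, left " + jobs.size());
    }
}
```

The returned list is an independent snapshot, so it can be split among worker threads without touching the shared map again.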
