I am working on a multithreaded application that reads data from a number of sources, does some calculations, and writes the results to several outputs. I have several reader threads, several calculation threads, and several writer threads. The number of threads of each type is given in the configuration.
I would like to have these threads named accordingly: "reader-1", "reader-2", "writer-1", etc.
So, I wanted to use org.apache.commons.lang3.concurrent.BasicThreadFactory for this purpose.
I did write the following code:
BasicThreadFactory threadFactory = new BasicThreadFactory.Builder()
.namingPattern("%s-%d")
.daemon(false)
.priority(Thread.MAX_PRIORITY)
.build();
ExecutorService executors = Executors.newFixedThreadPool(config.getPoolSize(), threadFactory);
However, I cannot find anywhere how to specify the name and number of the worker thread upon submission.
I searched hundreds of links and did not see a single example of how to do it.
I am creating my Callable and submitting it:
Callable reader = new BatchFileReader(config);
for(int i = 1; i <= maxReaders; i++) {
executors.submit(reader);
}
The submit method takes no parameters other than the Runnable/Callable instance. I cannot figure out where I can pass the string "reader" and its sequence number to the thread factory.
If anyone can give me a hint, I will appreciate it greatly.
If you really want to set the name of the Thread that executes a submitted Callable/Runnable, just give the name to the Callable/Runnable and let it set the name of the Thread it is running on by calling Thread.currentThread().setName(...), i.e.:
public class BatchFileReader implements Runnable {
private final String name;
public BatchFileReader(final String name /*, other arguments go here...*/) {
this.name = Objects.requireNonNull(name);
}
@Override
public void run() {
Thread.currentThread().setName(name);
//Do your work here...
}
}
I am expecting this to work independently of any Executor implementation, assuming that each Thread can only run one given task at a time, which as far as I know is the case indeed. If a single Thread may run more than one Callable/Runnable at a time in (pseudo-)parallel then this approach won't work.
I also looked at subclassing each provided Executor implementation, but that would be a bit more pain. At least on ThreadPoolExecutor you can achieve this by subclassing it and overriding ThreadPoolExecutor#beforeExecute and both AbstractExecutorService#newTaskFor methods in case you are using any submit method to submit Callables/Runnables. If you are just using execute for a Runnable then you will only need to override ThreadPoolExecutor#beforeExecute. But then again you would have to keep a reference to the Runnable's name inside it (so that you can pass it over to the Thread in ThreadPoolExecutor#beforeExecute).
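The beforeExecute route described above could look roughly like the following. This is only a sketch for the execute-only case: NamedRunnable, NamingThreadPoolExecutor and NamingDemo are hypothetical names made up for the illustration, and tasks handed in through submit would arrive wrapped in a FutureTask and would not be recognised without also overriding newTaskFor.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical interface: a task that knows the name its thread should carry.
interface NamedRunnable extends Runnable {
    String getName();
}

// Hypothetical pool that renames the worker thread before each named task runs.
class NamingThreadPoolExecutor extends ThreadPoolExecutor {
    NamingThreadPoolExecutor(int poolSize) {
        super(poolSize, poolSize, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());
    }

    @Override
    protected void beforeExecute(Thread t, Runnable r) {
        // With execute(), r is the task that was passed in, so its name is reachable here.
        if (r instanceof NamedRunnable) {
            t.setName(((NamedRunnable) r).getName());
        }
        super.beforeExecute(t, r);
    }
}

class NamingDemo {
    // Runs tasks named prefix-1..prefix-count and returns the thread names they observed.
    static List<String> runNamedTasks(String prefix, int count) {
        List<String> seen = new CopyOnWriteArrayList<>();
        NamingThreadPoolExecutor pool = new NamingThreadPoolExecutor(2);
        for (int i = 1; i <= count; i++) {
            final String name = prefix + "-" + i;
            pool.execute(new NamedRunnable() {
                @Override public String getName() { return name; }
                @Override public void run() { seen.add(Thread.currentThread().getName()); }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen;
    }
}
```

Note that the name belongs to the task, not to the pool thread: the same worker thread is renamed each time it picks up a new task.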
Another thing I wonder is whether you really need to rely on the Threads' names at all. If you only interact with the Callables/Runnables inside the call/run methods that you provide, why not simply give each of them its name and leave it at that? I don't see why the internally managed Threads need a name when you are only interacting with their Callables/Runnables (assuming that is the case). As far as I understand, you could just provide an accessor method for the name (i.e. a getName) in your BatchFileReader implementations.
I'm trying to learn threads in Java. I've followed two different tutorials, but I feel like I'm not really getting the concept. As I understand it, when you create threads, you use the Thread class and then embed your own object within the thread. I can do that, but I can't figure out how to access the instance variables within the "embedded" object.
Suppose as a learning exercise, I wanted to create three threads which would go off and do work individually of one another. I could define this object to "drive" the threads:
public class StoogeObject implements Runnable {
String name;
int data;
public StoogeObject(String name, int data){
this.name=name;
this.data=data;
}
@Override
public void run() {
System.out.println("Thread "+this.name+" has this data: "+this.data);
// Do useful work here
System.out.println("Thread "+this.name+" is exiting...");
}
public String getName(){
return this.name;
}
}
Then, in a driver program, I would launch my threads:
public class driver {
public static void main(String[] args){
Thread stooge1 = new Thread(new StoogeObject("Larry", 123));
Thread stooge2 = new Thread(new StoogeObject("Curly", 456));
Thread stooge3 = new Thread(new StoogeObject("Moe", 789));
stooge1.start();
stooge2.start();
stooge3.start();
if(stooge1.isAlive())
System.out.println("From main(): "+stooge1.getName());
}
}
Output is:
From main(): Thread-0
Thread Larry has this data: 123
Thread Curly has this data: 456
Thread Moe has this data: 789
Thread Larry is exiting...
Thread Curly is exiting...
Thread Moe is exiting...
I was surprised when the stooge1.getName() line in main() produced "Thread-0", not "Larry". I was expecting the getName() method I wrote in stoogeObject.java to override and return the instance variable String name in this instance. Instead, I'm getting the name of the Thread, not the StoogeObject.
So... The stooge1 thread has a StoogeObject within it, but I don't know how to access its instance variables. More significantly, this example makes me wonder if I'm missing the point of threads. If I want my Larry, Curly, & Moe objects to go off and do productive work AND keep their own instance variables, is using threads the wrong way to go here? Should I start over, making these objects into processes?
I can't figure out how to access the instance variables within the "embedded" object.
You access them in exactly the same way that you would access the instance variables of any other object.
The "embedded" object, FYI, is called the thread's target or the thread's delegate.
There is nothing special about the target of a Thread. It's just an object.
I was surprised when the stooge1.getName() line in main() produced "Thread-0", not "Larry". I was expecting ... Instead, I'm getting the name of the Thread, not the StoogeObject.
That's because the Thread object and the StoogeObject are different objects.
this example makes me wonder if I'm missing the point of threads.
There are two different ways that threads can be used in a program. The first (which is what people think of more often than not) is that threads are how a Java program can make use of more than one CPU if your platform has more than one CPU. (Virtually all modern servers and workstations have more than one these days, and it's getting to where a lot of cell phones and tablets have more than one as well.) If your platform has, say eight CPUs, then up to eight of your threads may be able to run simultaneously if that many of them are "ready to run."
The second way to use threads in a program is to wait for things. For example, if your program is a server that has to wait for input from each of N clients and respond to it, you can structure it as N threads that each just listen to and respond to one client. That often makes the code easier to understand. (Just as it's easier to juggle one ball than it is to juggle N balls.)
is using threads the wrong way to go here? Should I start over, making these objects into processes?
Threads can be much more tightly coupled than processes because the threads of a single program all share the same virtual address space (i.e., in a Java program, they all share the same heap). Communication between threads usually is one or two orders of magnitude faster than communication between different processes on the same machine.
If you need fine-grained communication between them, then they definitely should be threads. A good rule of thumb is that an application should never spawn a new process unless there is a really good reason why it should not be just another thread.
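To make the shared-heap point concrete, here is a minimal sketch in which several threads update one object that lives on the common heap. SharedHeapDemo and countTogether are made-up names for the illustration; AtomicInteger is used so that the concurrent increments are safe.

```java
import java.util.concurrent.atomic.AtomicInteger;

class SharedHeapDemo {
    // Starts `threads` threads that each add 1 to the same counter `perThread` times.
    static int countTogether(int threads, int perThread) {
        AtomicInteger counter = new AtomicInteger(); // one object, visible to every thread
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.incrementAndGet();
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join(); // wait until this worker has finished
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return counter.get();
    }
}
```

Separate processes would need a pipe, socket, or file to achieve the same thing, which is where the extra communication cost comes from.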
If you want to access the runnable object that you pass to the thread, you need to keep a reference to it.
Here is an example:
StoogeObject obj = new StoogeObject("Larry", 123);
Thread stooge1 = new Thread(obj);
stooge1.start();
System.out.println(obj.getName());
This will print Larry.
Keep in mind that if the name variable of the StoogeObject instance is changed while the thread is running, you'll have to wait for that thread to finish (or finish changing the variable) in order to get the correct value.
You can do that by using join().
StoogeObject obj = new StoogeObject("Larry", 123);
Thread stooge1 = new Thread(obj);
stooge1.start();
stooge1.join();
System.out.println(obj.getName());
Here the System.out.println(obj.getName()) statement is executed only after the thread is done.
I was surprised when the stooge1.getName() line in main() produced "Thread-0", not "Larry". I was expecting the getName() method I wrote in stoogeObject.java to override and return the instance variable String name in this instance. Instead, I'm getting the name of the Thread, not the StoogeObject.
How is this surprising? You never set the thread's name, then you call stooge1.getName(), and stooge1 is the Thread, and you're getting precisely what you asked for: "the name of the Thread".
The only thing the Thread knows about the Runnable that you pass it is that it has a run() method. It doesn't know or care about anything else you've added to your Runnable implementation.
If you want to set the thread's name, either use the Thread constructor that takes a name:
Thread stooge1 = new Thread(new StoogeObject(...), "Thread's Name");
Or set its name later:
stooge1.setName("Thread's Name");
So... The stooge1 thread has a StoogeObject within it, but I don't know how to access its instance variables.
It's up to you to store and manage your StoogeObjects, and Titus' answer covers this nicely. I just wanted to add a bit on top of that to answer your thread name related question.
As a side note: Once you wrap your head around the fundamentals, check out the official high-level concurrency tutorial, particularly the section on "executors". The Java API provides a few really convenient high-level constructs for concurrency that you might find useful in certain situations.
I have gotten my code into a state where I am creating a couple of threads and then inside those threads I use a library framework which spawns some additional threads over the life span of my application.
I have no control over how many threads are spawned inside the library framework, but I know they exist because I can see them in the Eclipse debugger. I have kept the threads I use outside the library framework to a minimum, because I really don't want a multithreaded application, but sometimes you have to.
Now I am at the point where I need to do things with sockets and I/O, both of which are inherently hard to deal with in a multithreaded environment. While I am going to make my program thread-safe, I'd rather not get into that situation in the first place, or at least minimize the occurrences. The classes in which I am attempting to reduce multithreading aren't time-sensitive, and I'd like them to complete "when they get the time". As it happens, the lazy work is all in the same class definition, but for various reasons that class is instantiated a hell of a lot.
I was wondering whether it is possible to make all instances of a single class use only one thread, even when they are instantiated from multiple threads, and if so, how?
I imagine the only way to achieve this would be to create a separate thread specifically for handling and processing instances of that single class type.
Or do I just have to think of a new way to structure my code?
EDIT: included an example of my application's architecture:
public class Example {
public ArrayList<ThreadTypeA> threads = new ArrayList<ThreadTypeA>();
public static void main(String[] args) {
threads.add(new ThreadTypeA());
// left out how dataObj gets to ThreadTypeB for brevity
dataObj data = new dataObj(events);
}
}
public ThreadTypeA {
public ArrayList<ThreadTypeB> newThreads = new ArrayList<ThreadTypeB>();
public Thread thread = new Thread(this, "");
}
public ThreadTypeB {
// left out how dataObj gets to ThreadTypeB for brevity
public libObj libObj = new Library(dataObj);
}
public Library {
public Thread thread = new Thread(this, "");
@Override
public void editMe(dataObj) {
dataObj.callBack();
}
}
public dataObj(events) {
public void callMe() {
for (Event event: events) {
event.callMe();
}
}
}
There are a number of different events that can be called, ranging from writing to files and making SQL queries to sending emails and using proprietary ethernet-serial comms. I want all events to run on the same thread, sequentially.
Rather than having Threads, consider having Callables or Runnables. These are objects which represent the work that is to be done. Your code can pass these to a thread pool for execution, and you'll get a Future back. If you care about the answer, you call get on the Future and your code will wait for the execution to complete. If it's fire-and-forget, then you can be assured it's queued and will get done in good time.
Generally it makes more sense to divorce your execution code from the threads that run it to allow patterns like this.
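For the specific wish that all events run sequentially on one thread, a single-threaded executor is one way to sketch it. EventSerializer and threadsUsed are hypothetical names for this illustration; the real events (file writes, SQL queries, and so on) would replace the trivial Callable used here.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper: funnels every event onto one dedicated worker thread.
class EventSerializer {
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    // May be called from any thread; the event itself runs on the worker thread,
    // one event at a time, in submission order.
    <T> Future<T> submit(Callable<T> event) {
        return worker.submit(event);
    }

    void shutdown() {
        worker.shutdown();
    }

    // Submits `events` trivial events and returns the names of the threads they ran on.
    static Set<String> threadsUsed(int events) {
        EventSerializer serializer = new EventSerializer();
        List<Future<String>> futures = new ArrayList<>();
        for (int i = 0; i < events; i++) {
            futures.add(serializer.submit(() -> Thread.currentThread().getName()));
        }
        Set<String> names = new HashSet<>();
        try {
            for (Future<String> f : futures) {
                names.add(f.get()); // blocks until that event has finished
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            serializer.shutdown();
        }
        return names;
    }
}
```

Callers that do not care about the outcome can simply ignore the returned Future; the event is still queued and executed.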
To restrict thread resources use a limited thread pool:
ExecutorService executor = Executors.newFixedThreadPool(4);
for (int i = 0; i < 100; ++i) {
executor.execute(new Runnable() { ... });
}
executor.shutdown();
Reusing the threads of such a pool is also generally cheaper than creating a new thread for every task.
It is perhaps a faint hope, but the library may do something similar, and may even let you configure its thread pool size.
I'm still learning about threads and don't know much yet.
As I understand it, I need to implement the Runnable interface and create a separate instance of the same class for each thread to execute. Is that correct?
If so, do I need to create another class to hold the variables that will be accessed/shared by all threads?
EDIT: I need to maintain some variables to coordinate the threads' work; otherwise they will all do the same work. This will be one variable shared by all threads.
EDIT 2: this question is related to this one: How I make result of SQL querys with LIMIT different in each query? . I will need to keep track of how many threads have already queried the database in order to set the OFFSET parameter.
Each thread needs an instance of a Runnable to do its work, yes. In some cases the threads could share the same instance, but only if there is no state held within the instance that needs to differ between threads. Generally you will want different instances in each thread.
Threads should share as little state as possible to avoid problems, but if you do want to share state, in general you are right that you will need an instance or instances somewhere to hold that state.
Note that this shared state could also be held in class variables rather than instance variables.
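As a rough sketch of one shared object holding the coordination state, the class below hands each thread a different OFFSET. OffsetAllocator, QueryWorker and OffsetDemo are invented names, the page size is an assumption, and the actual database query is left as a comment.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical shared state: hands each caller the next unused OFFSET.
class OffsetAllocator {
    private final int pageSize;
    private final AtomicInteger nextPage = new AtomicInteger();

    OffsetAllocator(int pageSize) {
        this.pageSize = pageSize;
    }

    // Atomic, so two threads can never receive the same offset.
    int nextOffset() {
        return nextPage.getAndIncrement() * pageSize;
    }
}

// Each thread gets its own worker instance, but all workers share one allocator.
class QueryWorker implements Runnable {
    private final OffsetAllocator offsets;
    private final Set<Integer> claimed;

    QueryWorker(OffsetAllocator offsets, Set<Integer> claimed) {
        this.offsets = offsets;
        this.claimed = claimed;
    }

    @Override
    public void run() {
        int offset = offsets.nextOffset();
        // A real worker would run its "... LIMIT pageSize OFFSET offset" query here.
        claimed.add(offset);
    }
}

class OffsetDemo {
    // Runs `threads` workers and returns the offsets they claimed.
    static Set<Integer> claimOffsets(int threads, int pageSize) {
        OffsetAllocator offsets = new OffsetAllocator(pageSize);
        Set<Integer> claimed = ConcurrentHashMap.newKeySet();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(new QueryWorker(offsets, claimed));
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return claimed;
    }
}
```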
There are many ways to solve this...this is really a question about Design Patterns.
Each thread could be given, via its constructor, an object or objects that describe its unique work.
Or you could provide the thread with a reference to a work queue from which they could query the next available task.
Or you could put a method in the class that implements Runnable that could be called by a master thread...
Many ways to skin this cat...I'm sure there are existing libraries for thread work distribution, configuration, etc.
Let's put everything in its place.
The statement new Thread(r) creates a thread, but that thread is not running yet. If you say:
Thread t = new Thread(r);
t.start();
you make the thread run, i.e. execute the run() method of your Runnable.
Another (equivalent) way to create and run a thread is to inherit from the Thread class and override the default implementation of its run() method.
Now, if you have some specific logic and you wish to run the same logic simultaneously in different threads, you have to create different threads and call their start() methods.
If you prefer to implement the Runnable interface and your logic does not require any parameters, you can even create a single instance of your Runnable implementation and run it in several threads.
public class MyLogic implements Runnable {
public void run() {
// do something.
}
}
//// ................
Runnable r = new MyLogic();
Thread t1 = new Thread(r);
Thread t2 = new Thread(r);
t1.start();
t2.start();
Now this logic is running simultaneously in 2 separate threads, even though we created only one instance of MyLogic.
If, however, your logic requires parameters, you should create separate instances.
public class MyLogic implements Runnable {
private int p;
public MyLogic(int p) {
this.p = p;
}
public void run() {
// this logic uses value of p.
}
}
//// ................
Thread t1 = new Thread(new MyLogic(111));
Thread t2 = new Thread(new MyLogic(222));
t1.start();
t2.start();
These 2 threads run the same logic with different arguments (111 and 222).
BTW, this example shows how to pass values to a thread. To get information back from it, you can use a similar approach: define a member variable result, have the run() method set it, and provide an appropriate getter. Anyone who is interested can then read the result once the thread has finished.
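A minimal sketch of that result-and-getter idea follows. SquareTask and ResultDemo are hypothetical names; the caller joins the thread before reading the result so that the value written by run() is guaranteed to be visible.

```java
// Hypothetical task: receives its input through the constructor and exposes its result through a getter.
class SquareTask implements Runnable {
    private final int input;
    private int result;

    SquareTask(int input) {
        this.input = input;
    }

    @Override
    public void run() {
        result = input * input; // the "work" done in the thread
    }

    int getResult() {
        return result;
    }
}

class ResultDemo {
    // Computes n * n in a separate thread and returns the value read back from the task.
    static int squareInThread(int n) {
        SquareTask task = new SquareTask(n);
        Thread t = new Thread(task);
        t.start();
        try {
            t.join(); // wait for run() to finish before reading the result
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return task.getResult();
    }
}
```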
Obviously, what I described above are just the basics. I did not say anything about synchronization, thread pools, executors, etc., but I hope this helps you get started. Then find a Java threading tutorial and work through it. In a couple of days you will be a world-class specialist in Java threads. :)
Happy threading.
We have 1000 threads that hit a web service and time how long the call takes. We want each thread to return its own timing result to the main application, so that various statistics can be recorded.
Please note that various tools were considered for this, but for various reasons we need to write our own.
What would be the best way for each thread to return the timing? We have considered two options so far:
1. Once a thread has its timing result, it calls a singleton that provides a synchronised method to write to the file. This ensures that each thread will write to the file in turn (although in an undetermined order, which is fine), and since the call is made after the thread has taken its timing, being blocked while waiting to write is not really an issue. When all threads have completed, the main application can then read the file to generate the statistics.
2. Using the Executor, Callable and Future interfaces
Which would be the best way, or are there any other better ways ?
Thanks very much in advance
Paul
Use the latter method.
Your workers implement Callable. You then submit them to a threadpool, and get a Future instance for each.
Then just call get() on the Futures to get the results of the calculations.
import java.util.*;
import java.util.concurrent.*;
public class WebServiceTester {
    public static class Tester implements Callable<Long> {
        @Override
        public Long call() {
            long start = System.currentTimeMillis();
            // Do your test here
            long end = System.currentTimeMillis();
            return end - start; // elapsed time in milliseconds
        }
    }
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(1000);
        List<Future<Long>> futures = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            futures.add(pool.submit(new Tester()));
        }
        // A List keeps equal timings, which a Set would collapse into one entry
        List<Long> results = new ArrayList<>();
        for (Future<Long> future : futures) {
            results.add(future.get()); // blocks until that task has finished
        }
        pool.shutdown(); // let the JVM exit once all tasks are done
        // Manipulate results however you wish....
    }
}
Another possible solution I can think of would be to use a CountDownLatch (from the java concurrency packages), each thread decrementing it (flagging they are finished), then once all complete (and the CountDownLatch reaches 0) your main thread can happily go through them all, asking them what their time was.
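A sketch of the CountDownLatch variant might look like this. LatchDemo and timeAll are invented names, and the web service call is only a comment, so the measured times are near zero here.

```java
import java.util.concurrent.CountDownLatch;

class LatchDemo {
    // Starts `workers` threads, waits for all of them, and returns their timings in nanoseconds.
    static long[] timeAll(int workers) {
        long[] elapsed = new long[workers];           // each thread writes only its own slot
        CountDownLatch done = new CountDownLatch(workers);
        for (int i = 0; i < workers; i++) {
            final int slot = i;
            new Thread(() -> {
                long start = System.nanoTime();
                // the web service call would go here
                elapsed[slot] = System.nanoTime() - start;
                done.countDown();                     // flag that this worker is finished
            }).start();
        }
        try {
            done.await();                             // returns once the count reaches 0
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return elapsed;
    }
}
```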
The executor framework can be used here. The timing can be done inside the Callable object, and the Future tells you whether the task has finished processing.
You could pass an ArrayBlockingQueue to the threads to report their results to. You could then have a file writing thread that takes from the queue to write to the file.
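The queue-based approach could be sketched as below. QueueReportingDemo and collectTimings are hypothetical names; for brevity the calling thread plays the consumer and gathers the values into a list, where a dedicated writer thread would write each value to the file instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class QueueReportingDemo {
    // Starts `workers` threads that each report one timing through a shared queue.
    static List<Long> collectTimings(int workers) {
        BlockingQueue<Long> queue = new ArrayBlockingQueue<>(workers);
        for (int i = 0; i < workers; i++) {
            new Thread(() -> {
                long start = System.nanoTime();
                // the web service call would go here
                long elapsed = System.nanoTime() - start;
                try {
                    queue.put(elapsed); // hand the result to the consumer
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }).start();
        }
        List<Long> collected = new ArrayList<>();
        try {
            for (int i = 0; i < workers; i++) {
                collected.add(queue.take()); // blocks until the next result arrives
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return collected;
    }
}
```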
I understand the concept behind threading and have written threads in other languages, but I am having trouble understanding how to adapt them to my needs in Java.
Basically, at present I have a vector of objects, which are read in from a file sequentially.
The file then has a list of events, which need to happen concurrently, so waiting for one event to finish, which takes 20-30 seconds, is not an option.
Only a couple of methods in the object deal with these events. However, from looking at tutorials, objects must extend Thread or implement Runnable; yet if the object is in a thread, making a method call to that object seems to happen sequentially anyway.
Any extra information would be appreciated, as I am clearly missing something; I am just not quite sure what!
So, to summarise: how can I execute a single method using a thread?
To start a thread you call start() on an instance of Thread or a subclass thereof. The start() method returns immediately. At the same time, the other thread (the one incarnated by the Thread instance) takes off, and proceeds with executing the run() method of the Thread instance.
Managing threads is not as easy as it seems. For a smoother API, try using an Executor (see the classes in java.util.concurrent).
The best thing to do in Java is create another class that takes in the data you need to process and performs whatever you need it to perform:
import java.util.ArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
class Worker implements Runnable {
    private final Object myData;
    Worker(Object data) {
        myData = data;
    }
    @Override
    public void run() {
        // process the data
        System.out.println(myData.toString());
        // or, if the data is an instance of your class, call its method:
        if (myData instanceof YourClass) {
            ((YourClass) myData).methodB();
        }
    }
}
class YourClass {
    private final ExecutorService executor = Executors.newCachedThreadPool();
    private final ArrayList<Object> list;
    YourClass() {
        list = new ArrayList<Object>();
        list.add(new Object());
        ...
        ...
        list.add(new Object());
    }
    void methodA() {
        for (Object item : list) {
            // Hand each item to the pool; a pool thread will call run() on the Worker
            executor.execute(new Worker(item));
        }
    }
    void methodB() { /* do something else here */ }
}
Note that instead of getting the data, you can pass the actual class that you need the method to be invoked on:
executor.execute(new Worker(new MyClass()));
In the run method of the Worker class you invoke whatever you need to invoke on MyClass... the executor creates a new thread and calls run on your Worker. Each Worker will run in a separate thread and it will be parallel.
Thomas has already given the technical details. I am going to try and focus on the logic.
Here is what I can suggest from my understanding of your problem.
Let's say you have a collection of objects of type X (or maybe even a mix of different types). You need to call the methods foo and/or bar on these objects based on some specified event. So you probably have a second collection that stores those events.
So we have two List objects (one for the X objects and other for the events).
Now, we have a function execute that will take X, and the event, and call foo or bar. This execute method can be wrapped in a thread, and executed simultaneously. Each of these threads can take one object from the list, increment the counter, and execute foo/bar. Once done, check the counter, and take the next one from the list. You can have 5 or more of these threads working on the list.
So, as we see, the objects coming from file do not have to be the Thread objects.
You have to be very careful that the List and counter are synchronized. Much better data structures are possible. I am sticking to a crude one for ease of understanding.
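One way to sketch the shared list and counter is shown below. SharedListDemo and totalLength are made-up names, and taking the length of a string stands in for calling foo/bar; an AtomicInteger serves as the synchronized counter so that no two threads take the same index.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class SharedListDemo {
    // Lets `threads` threads work through `items` together and returns the summed string lengths.
    static int totalLength(List<String> items, int threads) {
        AtomicInteger next = new AtomicInteger();  // index of the next unclaimed item
        AtomicInteger total = new AtomicInteger(); // stand-in for the real work's outcome
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                while (true) {
                    int i = next.getAndIncrement(); // claim one index atomically
                    if (i >= items.size()) {
                        return;                     // nothing left to do
                    }
                    total.addAndGet(items.get(i).length()); // foo/bar would be called here
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return total.get();
    }
}
```

The list is only read by the workers, so it needs no extra locking as long as nobody modifies it while they run.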
Hope this helps.
The key to threads is to remember that each task that must run concurrently needs its own thread. Tasks executing in the same thread will execute sequentially. Dividing the concurrent tasks among separate threads will allow you to do the concurrent processing you require.