I understand the concept behind threading and have written threads in other languages, but I am having trouble understanding how to adapt them to my needs in java.
Basically, at present I have a vector of objects, which are read in from a file sequentially.
The file then has a list of events, which need to happen concurrently so waiting for one event to finish which takes 20-30 seconds is not an option.
There are only a couple of methods in the object that deal with these events. However, from looking at tutorials, it seems objects must extend Thread or implement Runnable, and if the object is in a thread, making a method call to that object seems to happen sequentially anyway.
Any extra information would be appreciated, as I am clearly missing something; I am just not quite sure what!
So, to summarise: how can I execute a single method using a thread?
To start a thread you call start() on an instance of Thread or a subclass thereof. The start() method returns immediately. At the same time, the other thread (the one incarnated by the Thread instance) takes off, and proceeds with executing the run() method of the Thread instance.
Managing threads is not as easy as it seems. For a smoother API, try using an Executor (see the classes in java.util.concurrent).
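As a minimal sketch of the Executor route (EventTarget, handleEvent and SingleMethodDemo are hypothetical names standing in for your own object and its event method), a lambda lets you run a single method call on a pool thread without making the object itself a Runnable:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for one of the objects read from the file.
class EventTarget {
    final AtomicInteger handled = new AtomicInteger();

    // Stand-in for the slow event method (20-30 seconds in the question).
    void handleEvent(String event) {
        handled.incrementAndGet();
    }
}

class SingleMethodDemo {
    // Runs handleEvent once per event on pool threads, waits for all of them,
    // and returns how many events were handled.
    static int runAll(EventTarget target, String... events) {
        ExecutorService executor = Executors.newCachedThreadPool();
        for (String event : events) {
            // execute() returns immediately; the call itself runs on a pool thread
            executor.execute(() -> target.handleEvent(event));
        }
        executor.shutdown(); // accept no new tasks; submitted ones still run
        try {
            if (!executor.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return target.handled.get();
    }
}
```

The object does not extend Thread or implement Runnable here; only the small lambda is handed to the executor.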
The best thing to do in Java is create another class that takes in the data you need to process and performs whatever you need it to perform:
class Worker implements Runnable {
    private final Object myData;

    Worker(Object data) {
        myData = data;
    }

    @Override
    public void run() {
        // process the data
        System.out.println(myData.toString());
        // or, if you want to use your class, then:
        YourClass yc = (YourClass) myData;
        yc.methodB();
    }
}
class YourClass
{
private final ExecutorService executor = Executors.newCachedThreadPool();
private ArrayList<Object> list;
YourClass()
{
list = new ArrayList<Object>();
list.add(new Object());
...
...
list.add(new Object());
}
void methodA()
{
for(Object item : list )
{
// Create a new thread with the worker class taking the data
executor.execute(new Worker(item));
}
}
void methodB(){/*do something else here*/}
}
Note that instead of getting the data, you can pass the actual class that you need the method to be invoked on:
executor.execute(new Worker(new MyClass()));
In the run method of the Worker class you invoke whatever you need to invoke on MyClass. The executor hands each Worker to a pool thread (a cached thread pool creates a new thread when no idle one is available) and calls run on it, so the Workers can run concurrently instead of one after another.
Thomas has already given the technical details. I am going to try and focus on the logic.
Here is what I can suggest from my understanding of your problem.
Let's say you have a collection of objects of type X (or maybe even a mix of different types). You need to call methods foo and/or bar in these objects based on some event specified. So now, you maybe have a second collection that stores those events.
So we have two List objects (one for the X objects and other for the events).
Now, we have a function execute that will take X, and the event, and call foo or bar. This execute method can be wrapped in a thread, and executed simultaneously. Each of these threads can take one object from the list, increment the counter, and execute foo/bar. Once done, check the counter, and take the next one from the list. You can have 5 or more of these threads working on the list.
So, as we see, the objects coming from file do not have to be the Thread objects.
You have to be very careful that the List and counter are synchronized. Much better data structures are possible. I am sticking to a crude one for ease of understanding.
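A rough sketch of that crude scheme, assuming the list is not modified while the workers run (SharedListDemo and execute are hypothetical names, and String stands in for your X objects). An AtomicInteger plays the role of the synchronized counter, so each index is handed to exactly one thread:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class SharedListDemo {
    // Hypothetical stand-in for "call foo or bar on this object for its event".
    static void execute(String item) {
        // real work would go here
    }

    // Processes every item exactly once using workerCount threads
    // and returns the number of items processed.
    static int processAll(List<String> items, int workerCount) {
        AtomicInteger next = new AtomicInteger();      // shared index into the list
        AtomicInteger processed = new AtomicInteger(); // shared count of finished items
        Runnable worker = () -> {
            int i;
            // getAndIncrement gives each index to exactly one thread
            while ((i = next.getAndIncrement()) < items.size()) {
                execute(items.get(i));
                processed.incrementAndGet();
            }
        };
        Thread[] threads = new Thread[workerCount];
        for (int t = 0; t < workerCount; t++) {
            threads[t] = new Thread(worker);
            threads[t].start();
        }
        try {
            for (Thread thread : threads) {
                thread.join(); // wait for every worker to finish
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return processed.get();
    }
}
```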
Hope this helps.
The key to threads is to remember that each task that must run concurrently must be in its own thread. Tasks executing in the same thread will execute sequentially. Dividing the concurrent tasks among separate threads will allow you to do your required concurrent processing.
My questions are:
Does a Java program, by default, cause creation of only 1 thread?
If yes, and if we create a multi threaded program, when do multiple threads access the same code of a Java object?
For example I have a Java program with 2 methods - add() and sub(). In what scenario will 2 or more threads run the 'add()' method?
Isn't code always thread safe, as multiple threads will access different sections of code?
If not, please show an example program where thread safety is a concern.
Don't think of "sections of code", think of where the data lives and how many threads are accessing that actual data.
Local variables live on the stack of the thread they are being used in and are thread safe since they are different data "containers" per thread.
Any data that lives on the heap, like instance or static fields, is not inherently thread-safe, because if more than one thread accesses that data then they might have contention.
We could get more complicated and talk about where the data really is but this basic explanation should give you a good idea of what's going on.
The below code gives an example of an instance that is shared by two threads, in this case both threads are accessing the same array list, which is pointing to the same array data containers in the heap. Run it a couple times and you'll eventually see a failure. If you comment out one of the threads it will work correctly every time, counting down from 99.
import java.util.ArrayList;
import java.util.List;
public class Main {
public static void main(String[] args) {
MyRunnable r = new MyRunnable();
new Thread(r).start();
new Thread(r).start();
}
public static class MyRunnable implements Runnable {
// imagine this list living out in the heap and both threads messing with it
// this is really just a reference, but the actual data is in the heap
private List<Integer> list = new ArrayList<>();
{ for (int i = 0; i < 100; i++) list.add(i); }
@Override public void run() {
while (list.size() > 0) System.out.println(list.remove(list.size() - 1));
}
}
}
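One possible way to make the example above safe, offered only as a sketch (SafeCountdown and its removed list are hypothetical additions, not part of the original program): do the size check and the removal inside one synchronized block, so no other thread can empty the list between the two steps.

```java
import java.util.ArrayList;
import java.util.List;

class SafeCountdown implements Runnable {
    private final List<Integer> list = new ArrayList<>();
    // Records the removal order; also only touched while holding the lock on "list".
    private final List<Integer> removed = new ArrayList<>();

    SafeCountdown(int n) {
        for (int i = 0; i < n; i++) {
            list.add(i);
        }
    }

    @Override
    public void run() {
        while (true) {
            synchronized (list) { // check and remove as one atomic step
                if (list.isEmpty()) {
                    return;
                }
                removed.add(list.remove(list.size() - 1));
            }
        }
    }

    // Runs two threads over one shared instance and returns the removal order.
    static List<Integer> runWithTwoThreads(int n) {
        SafeCountdown r = new SafeCountdown(n);
        Thread a = new Thread(r);
        Thread b = new Thread(r);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return r.removed;
    }
}
```

Whichever thread wins the lock always removes the current last element, so the recorded order counts down from n - 1 to 0 no matter how the two threads interleave.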
1) Does a Java program, by default, cause creation of only 1 thread?
Really depends on what your code is doing. A simple program that only calls System.out.println() runs your code on a single thread. But as soon as you, for example, open a Swing GUI window, at least one other thread will be around (the "event dispatch thread" that reacts to user input and takes care of UI updates).
2) If yes, and if we create a multi threaded program, when do multiple threads access the same code of a Java object?
Misconception on your end. Objects do not have code. Basically, a thread will run a specific method; either its own run() method, or some other method made available to it. And then the thread just executes that method, and any other method call that is triggered from that initial method.
And of course, while running that code, that thread might create other objects, or manipulate the state of already existing objects. When each thread only touches a different set of objects, no problems arise. But as soon as more than one thread deals with the same object state, proper precaution is required (to avoid nondeterministic behavior).
Your question suggests that you might not fully understand what "thread" means.
When we learned to program, they taught us that a computer program is a sequence of instructions, and they taught us that the computer executes those instructions one-by-one, starting from some well-defined entry point (e.g., the main() routine).
OK, but when we talk about multi-threaded programs, it no longer is sufficient to say that "the computer" executes our code. Now we say that threads execute our code. Each thread has its own idea of where it is in your program, and if two or more threads happen to be executing in the same function at the same time, then each of them has its own private copy of the function's arguments and local variables.
So, you asked:
Does a Java program, by default, cause creation of only 1 thread?
A Java program always starts with one thread executing your code, and usually several other threads executing JVM code. You don't normally need to be aware of the JVM threads. The one thread that executes your code starts its work at the beginning of your main() routine.
Programmers often call that initial thread the "main thread." Probably they call it that because it calls main(), but be careful! The name can be misleading: The JVM doesn't treat the "main thread" any differently from any other thread in a multi-threaded Java program.
if we create a multi threaded program, when do multiple threads access the same code of a Java object?
Threads only do what your program tells them to do. If you write code for two different threads to call the same function, then that's what they will do. But, let's break that question down a bit...
...First of all, how do we create a multi-threaded program?
A program becomes multi-threaded when your code tells it to become multi-threaded. In one simple case, it looks like this:
class MyRunnable implements Runnable {
public void run() {
DoSomeUsefulThing();
DoSomeOtherThing();
}
}
MyRunnable r = new MyRunnable();
Thread t = new Thread(r);
t.start();
...
Java creates a new thread when some other thread in your program calls t.start(). (NOTE! The Thread instance, t, is not the thread. It is only a handle that your program can use to start the thread and inquire about its thread's state and control it.)
When the new thread starts executing program instructions, it will start by calling r.run(). As you can see, the body of r.run() will cause the new thread to DoSomeUsefulThing() and then DoSomeOtherThing() before r.run() returns.
When r.run() returns, the thread is finished (a.k.a., "terminated", a.k.a., "dead").
So,
when do multiple threads access the same code of a Java object?
When your code makes them do it. Let's add a line to the example above:
...
Thread t = new Thread(r);
t.start();
DoSomeUsefulThing();
...
Note that the main thread did not stop after starting the new thread. It goes on to execute whatever came after the t.start() call. In this case, the next thing it does is to call DoSomeUsefulThing(). But that's the same as what the program told the new thread to do! If DoSomeUsefulThing() takes any significant time to complete, then both threads will be doing it at the same time... because that's what the program told them to do.
please show an example program where thread safety is a concern
I just did.
Think about what DoSomeUsefulThing() might be doing. If it's doing something useful, then it almost certainly is doing something to some data somewhere. But, I didn't tell it what data to operate on, so chances are, both threads are doing something to the same data at the same time.
That has a lot of potential to not turn out well.
One way to fix that is to tell the function what data to work on.
class MyDataClass { ... }
class MyRunnable implements Runnable {
private MyDataClass data;
public MyRunnable(MyDataClass data) {
this.data = data;
}
public void run() {
DoSomeUsefulThingWITH(data);
DoSomeOtherThingWITH(data);
}
}
MyDataClass dat_a = new MyDataClass(...);
MyDataClass dat_b = new MyDataClass(...);
MyRunnable r = new MyRunnable(dat_a);
Thread t = new Thread(r);
t.start();
DoSomeUsefulThingWITH(dat_b);
There! Now the two threads are doing the same thing, but they are doing it to different data.
But what if you want them to operate on the same data?
That's a topic for a different question. Google for "mutual exclusion" to get started.
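As a starting point only, here is a small sketch of mutual exclusion using Java's synchronized keyword (SharedCounter and countWithTwoThreads are hypothetical names). Two threads do the same thing to the same data, but only one of them can be inside increment() at a time:

```java
class SharedCounter {
    private int value; // shared data, guarded by this object's lock

    // synchronized: only one thread at a time may run this on a given instance
    synchronized void increment() {
        value++;
    }

    synchronized int get() {
        return value;
    }

    // A new thread and the calling thread both increment the same counter.
    static int countWithTwoThreads(int perThread) {
        SharedCounter counter = new SharedCounter();
        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) {
                counter.increment();
            }
        };
        Thread t = new Thread(work);
        t.start();
        work.run(); // the calling thread does the same work at the same time
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }
}
```

Without synchronized on increment(), the final count could come out lower than expected because value++ is a read, an add and a write that two threads can interleave.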
Depends on the implementation. Only one thread (the "main thread") will invoke the public static void main(String[]) method, but that doesn't mean other threads weren't started for other tasks.
A thread will access the "same code" if you program it to do so. I'm not sure what your idea of "section of code" is or where the idea that two threads will never access the same "section" at the same time comes from, but it's quite trivial to create thread-unsafe code.
import java.util.ArrayList;
import java.util.List;
public class Main {
public static void main(String[] args) throws InterruptedException {
List<Object> list = new ArrayList<>();
Runnable action = () -> {
while (true) {
list.add(new Object());
}
};
Thread thread1 = new Thread(action, "thread-1");
thread1.setDaemon(true); // don't keep JVM alive
Thread thread2 = new Thread(action, "thread-2");
thread2.setDaemon(true); // don't keep JVM alive
thread1.start();
thread2.start();
Thread.sleep(1_000L);
}
}
An ArrayList is not thread-safe. The above code has two threads constantly trying to add a new Object to the same ArrayList for approximately one second. It's not guaranteed, but if you run that code you might see an ArrayIndexOutOfBoundsException or something similar. Regardless of any exceptions being thrown, the state of the ArrayList is in danger of being corrupted. This is because state is updated by multiple threads with no synchronization.
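For comparison, a hedged sketch of one way to make this kind of code safe: wrap the list with Collections.synchronizedList so that each add() is performed under a lock. SynchronizedListDemo and addFromTwoThreads are hypothetical names, and the loop is bounded here so the result can be checked:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SynchronizedListDemo {
    // Two threads each add perThread objects to one shared list; returns the final size.
    static int addFromTwoThreads(int perThread) {
        // Every call on the wrapper synchronizes on the wrapper itself.
        List<Object> list = Collections.synchronizedList(new ArrayList<>());
        Runnable action = () -> {
            for (int i = 0; i < perThread; i++) {
                list.add(new Object());
            }
        };
        Thread thread1 = new Thread(action, "thread-1");
        Thread thread2 = new Thread(action, "thread-2");
        thread1.start();
        thread2.start();
        try {
            thread1.join();
            thread2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return list.size();
    }
}
```

Note that the wrapper only makes individual calls atomic; compound actions such as "check the size, then remove" still need an explicit synchronized block around both steps.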
If I have a class that extends Thread with static methods on it (this is very simplified):
public class MyThread extends Thread {
private static long SLEEP_INT = 30000;
private static Map<Integer, String> myData;
//every 30 seconds, update map of data
public void run() {
while(isActive) {
try {
populateDataFromDB();
Thread.sleep( SLEEP_INT );
}
catch( Exception e ) {
//do nothing
}
}
}
//static method to update map of data
public static void populateDataFromDB() {
//do stuff here, setting values in myData
}
}
and then somewhere else in my application I have:
MyThread.populateDataFromDB();
If I know that there is only one instance of the MyThread class in my application, is it still necessary to write synchronized code inside of populateDataFromDB in order to ensure thread safety?
You do need synchronization, because you will have more than one thread that is accessing the data held in MyThread.myData. You have shown us one thread, which is going to periodically read from your database and fill in your Map. You would only do this if you have something that is going to utilize this data.
You do not want the threads that use the Map to ever see a half-filled map, or a map that contains inconsistent state. To be safe, you would want to use synchronization to keep threads from reading myData while the MyThread thread is updating it every 30 seconds.
In other words, just because you only have one instance of a given class doesn't necessarily mean that you do not need synchronization. You need synchronization because you have multiple threads (each perhaps running different code) that are accessing the same data. You probably can allow all the readers of the data to access the data at the same time, but ensure exclusive access during the operation that writes to the data structure.
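One way to get "many readers, one writer" is a ReentrantReadWriteLock. The sketch below is only an illustration: DataCache, populate and lookup are hypothetical names, and the fresh map stands in for whatever the database query returns.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class DataCache {
    private static final ReadWriteLock LOCK = new ReentrantReadWriteLock();
    private static final Map<Integer, String> myData = new HashMap<>();

    // Writer: replaces the contents while holding the exclusive write lock,
    // so no reader can ever observe a half-filled map.
    static void populate(Map<Integer, String> fresh) {
        LOCK.writeLock().lock();
        try {
            myData.clear();
            myData.putAll(fresh);
        } finally {
            LOCK.writeLock().unlock();
        }
    }

    // Reader: any number of threads may hold the read lock at the same time.
    static String lookup(int key) {
        LOCK.readLock().lock();
        try {
            return myData.get(key);
        } finally {
            LOCK.readLock().unlock();
        }
    }
}
```

If the map is small and replaced wholesale every 30 seconds, simply marking populate and lookup as synchronized would also be correct; the read/write lock only pays off when reads are frequent.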
No, you never need synchronization when you have only one thread. (The main thread is the first thread in the application.)
BUT when you call thread.start() from your main thread, you have 2 threads in your system. If for some reason your threads (the new thread and the main thread) are trying to write to memory that both threads have access to, then you want to serialize the threads' access to that shared memory. How to serialize the access is where synchronization helps.
So in your example, if populateDataFromDB modifies that shared data, and you call it from the new thread (inside run) and also from the main thread (I assume so, as you said "then somewhere else in my application I have:"), then you definitely need synchronization.
I have a class which basically does the same series of steps twice. Sounds like a perfect example of where to multithread your program. My question, though, is whether I can do this with only two threads. Here is the general gist of things:
Tester implements Runnable{
Thread obj1Thread, obj2Thread;
MyObj obj1, obj2;
String obj1Results, obj2Results;
void runTests(){
obj1Thread = new Thread(this, "ob1 thread");
obj2Thread = new Thread(this, "ob2 thread");
obj1.start();//builds up obj1
obj2.start();//builds up obj2
if(obj1 and obj2 are finished building){
System.out.println(obj1);
System.out.println(obj2);
}
obj1Thread.startSecondPhase()//runs a separate function that tests obj1 vs ob2. Essentially test(ob1, ob2)
obj2Thread.startSecondPhase()//runs a separate function that tests obj2 vs ob1. Essentially test(ob2, ob1)
if(obj1 and obj2 are finished testing){
System.out.println(obj1Results);
System.out.println(obj2Results);
}
}
}
I have gotten the first part - building up the objects - working. My questions are now -
How can I get the main thread to wait for the two threads to finish their first part? Perhaps the main would do a wait on both objects and then after the threads notifyAll they do a wait on the main thread? But then how do the threads get a hold of the main thread? Perhaps with this?
How can I have this 'second phase' of the run function without making a new class with a new thread and a new specific run function? I don't want to have to make a new class and everything for every little task.
To clarify the sequence of events I want specifically is -
Main thread initializes and starts two threads
Both threads simultaneously build their respective objects
When both threads finish building they pause. Then main thread prints the objects out in order.
After main thread is done, the two threads continue their code to a testing phase simultaneously
When the threads are done the main thread prints the results out. Could probably use a join() here
Edit: Also, how can I tell the specific threads which objects I want them to work on? Right now I'm doing it in a kinda hacky way (I'm working off the thread name).
I would use higher-level abstractions: use an executor and ExecutorService.invokeAll(Collection<? extends Callable<T>> tasks), which returns a list of Futures.
Your main thread can
dispatch two tasks,
obtain two futures,
print the results, then
dispatch two more tasks
The executor service and futures will handle all the concurrency under the hood.
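The steps above could be sketched roughly as follows. This is an illustration under assumptions: TwoPhaseDemo, build and test are hypothetical stand-ins for your own building and testing code, and String stands in for MyObj and its results. invokeAll blocks until every task in the batch has finished, which gives the "pause between phases" for free.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class TwoPhaseDemo {
    // Hypothetical stand-ins for the real build and test steps.
    static String build(String name) { return name + "-built"; }
    static String test(String x, String y) { return x + " vs " + y; }

    static List<String> run() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // Phase 1: build both objects in parallel.
            List<Callable<String>> buildTasks = new ArrayList<>();
            buildTasks.add(() -> build("obj1"));
            buildTasks.add(() -> build("obj2"));
            List<String> built = new ArrayList<>();
            for (Future<String> f : pool.invokeAll(buildTasks)) {
                built.add(f.get()); // futures come back in task order
            }
            // The main thread can print the built objects here, in order.
            String a = built.get(0);
            String b = built.get(1);

            // Phase 2: test obj1 vs obj2 and obj2 vs obj1 in parallel.
            List<Callable<String>> testTasks = new ArrayList<>();
            testTasks.add(() -> test(a, b));
            testTasks.add(() -> test(b, a));
            List<String> results = new ArrayList<>();
            for (Future<String> f : pool.invokeAll(testTasks)) {
                results.add(f.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Each task captures the object it should work on, so nothing depends on thread names.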
EDIT:
I see your comment:
A special Runnable class just to implement basically one line of code?
Perhaps I'm being too idealistic but that feels wrong to me.
You typically use anonymous inner classes in such a case:
Future<?> future = executorService.submit(new Runnable() {
public void run() {
System.out.println("Asynchronous task");
}
});
Nothing wrong with that. With lambdas (Java 8 and later) it becomes even shorter:
Future future = executorService.submit(() -> {System.out.println("Asynchronous task");});
Here's my problem:
I have a whole bunch of identical objects. These objects interface with a server. It takes 4 or 5 seconds to get data from the server, as the server is slow.
Now I need all the objects to get data, so I call MyObject.getData() for each object. I could do it in series, but 20 objects, each taking 5 seconds, is too slow. I thought I should use threads and have each object on its own thread.
Here's my question:
If I make the objects extend Thread, will a call to MyObject.getData() run in that object's thread, or in the thread the method was called from? I know I can use Thread.run() to get the object going, but that's not what I want. I want to get methods running at my will.
So how do I do this?
Thanks so much.
The textbook way to do this could be something like this:
class GetDataObj implements Callable<Data> {
public Data call(){
//get data
return data;
}
}
then
ExecutorService exec = Executors.newCachedThreadPool();
Set<Callable<Data>> objects = //get objects;
List<Future<Data>> futures = exec.invokeAll(objects);
for(Future<Data> future : futures){
Data data = future.get();
//do stuff with data
}
exec.shutdown();
Note that when you iterate through futures, the get() method will block until the result is available for that DataObj. If you want to wait until all data are available, this is fine.
If you call object.myMethod(), the method will run in the caller thread.
You have to start() the thread to make it run, not call the run() method.
One thing you can do is rewrite your object so that the myMethod() method launches a new thread. Then you can use your objects exactly as you do now. But if myMethod() returns something, that changes, because you have to wait for the thread to terminate before you can use the returned value.
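A sketch of that idea, assuming Java 8 or later: the object keeps its ordinary blocking method and gains an asynchronous variant that returns a CompletableFuture, so the caller decides when to wait for the value. SlowObject, getDataAsync and fetchAll are hypothetical names, and the string returned by getData stands in for the slow server call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

class SlowObject {
    private final String name;

    SlowObject(String name) {
        this.name = name;
    }

    // Blocking version: runs in whichever thread calls it.
    String getData() {
        return "data from " + name; // stand-in for the 4-5 second server call
    }

    // Asynchronous version: returns at once; getData() runs on another thread.
    CompletableFuture<String> getDataAsync() {
        return CompletableFuture.supplyAsync(this::getData);
    }

    // Starts every request first, then waits for the results in order.
    static List<String> fetchAll(List<SlowObject> objects) {
        List<CompletableFuture<String>> pending = new ArrayList<>();
        for (SlowObject o : objects) {
            pending.add(o.getDataAsync());
        }
        List<String> results = new ArrayList<>();
        for (CompletableFuture<String> f : pending) {
            results.add(f.join()); // blocks until this particular result is ready
        }
        return results;
    }
}
```

supplyAsync with one argument uses the common ForkJoinPool; for many slow, blocking server calls you would probably want the two-argument overload with your own executor.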
I think it would be best to use a thread pool to get data from the objects. Therefore you would need each object to implement Runnable. See thread pools.
If you pass a reference to a queue to the objects, then once they have got the data they can place it in the queue. The main thread can then just take the data off the queue when it is ready.
See the producer-consumer pattern.
An example of this is:
BlockingQueue<Data> queue = new LinkedBlockingQueue<Data>(); // BlockingQueue is an interface, so use an implementation
ExecutorService pool = Executors.newFixedThreadPool(5);
//implements Runnable, getting data from this
//places Data object in queue instead of returning it
DataObject obj = new DataObject(queue);
pool.execute(obj); //invokes the run method of the DataObject
Data data = queue.take();
You would need to have a for-loop for more than one object.
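With the for-loops added, a self-contained sketch might look like this (QueueDemo and fetchAll are hypothetical names, and a String stands in for the Data object that the slow server call would produce):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

class QueueDemo {
    // Starts one task per object and collects the results as they arrive.
    static List<String> fetchAll(int objectCount) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        ExecutorService pool = Executors.newFixedThreadPool(5);
        for (int i = 0; i < objectCount; i++) {
            final int id = i;
            // Producer: stand-in for the slow call; puts its result on the queue.
            pool.execute(() -> queue.add("data-" + id));
        }
        pool.shutdown(); // submitted tasks still run to completion

        // Consumer: take() blocks until the next result is available.
        List<String> results = new ArrayList<>();
        try {
            for (int i = 0; i < objectCount; i++) {
                results.add(queue.take());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return results;
    }
}
```

Results come off the queue in completion order, which is not necessarily the order in which the tasks were submitted.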
Hope this helps.
I'm still learning threads and don't know much yet.
I see that I need to implement the Runnable interface and create several instances of the same class, so that each thread executes one of them. Is that correct?
If it is correct, do I need to create another class to contain the variables that will be accessed/shared by all threads?
EDIT: I need to maintain some variables to coordinate the threads' work, otherwise they will all execute the same work. This will be one variable shared by all threads.
EDIT 2: This question is related to this one: "How I make result of SQL querys with LIMIT different in each query?". I will need to keep track of how many threads have already queried the database in order to set the OFFSET parameter.
Each thread needs an instance of a Runnable to do its work, yes. In some cases the threads could share the same instance, but only if there is no state held within the instance that needs to differ between threads. Generally you will want different instances in each thread.
Threads should share as little state as possible to avoid problems, but if you do want to share state, in general you are right that you will need an instance or instances somewhere to hold that state.
Note that this shared state could also be held in class variables rather than instance variables.
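For the OFFSET case in the question, a small sketch of such shared state: every worker gets a reference to the same AtomicInteger and claims its next offset with getAndAdd, so no two threads receive the same value. OffsetWorker, runWorkers and the claimed set are hypothetical names, and the actual SQL query is left as a comment.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class OffsetWorker implements Runnable {
    private final AtomicInteger nextOffset; // shared by all workers
    private final Set<Integer> claimed;     // shared record of offsets handed out
    private final int pageSize;
    private final int totalRows;

    OffsetWorker(AtomicInteger nextOffset, Set<Integer> claimed, int pageSize, int totalRows) {
        this.nextOffset = nextOffset;
        this.claimed = claimed;
        this.pageSize = pageSize;
        this.totalRows = totalRows;
    }

    @Override
    public void run() {
        int offset;
        // getAndAdd atomically reserves the next page for this thread only
        while ((offset = nextOffset.getAndAdd(pageSize)) < totalRows) {
            // hypothetical: run "SELECT ... LIMIT pageSize OFFSET offset" here
            claimed.add(offset);
        }
    }

    // Runs the given number of threads and returns the offsets that were claimed.
    static Set<Integer> runWorkers(int threadCount, int pageSize, int totalRows) {
        AtomicInteger nextOffset = new AtomicInteger();
        Set<Integer> claimed = ConcurrentHashMap.newKeySet();
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            // separate Runnable instances, same shared state
            threads[i] = new Thread(new OffsetWorker(nextOffset, claimed, pageSize, totalRows));
            threads[i].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return claimed;
    }
}
```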
There are many ways to solve this...this is really a question about Design Patterns.
Each thread could be provided, via its constructor, with an object or objects that describe its unique work.
Or you could provide the thread with a reference to a work queue from which they could query the next available task.
Or you could put a method in the class that implements Runnable that could be called by a master thread...
Many ways to skin this cat...I'm sure there are existing libraries for thread work distribution, configuration, etc.
Let's put everything in its place.
The statement new Thread(r) creates a thread, but this thread does not run yet. If you write:
Thread t = new Thread(r);
t.start();
you make the thread run, i.e. execute the run() method of your runnable.
Another (equivalent) way to create and run a thread is to inherit from class Thread and override the default implementation of its run() method.
Now, if you have specific logic and you wish to run the same logic simultaneously in different threads, you have to create different threads and call their start() methods.
If you prefer to implement the Runnable interface and your logic does not require any parameters, you can even create only one instance of your runnable implementation and run it in different threads.
public class MyLogic implements Runnable {
public void run() {
// do something.
}
}
//// ................
Runnable r = new MyLogic();
Thread t1 = new Thread(r);
Thread t2 = new Thread(r);
t1.start();
t2.start();
Now this logic is running simultaneously in 2 separate threads, although we created only one instance of MyLogic.
If, however, your logic requires parameters, you should create separate instances.
public class MyLogic implements Runnable {
private int p;
public MyLogic(int p) {
this.p = p;
}
public void run() {
// this logic uses value of p.
}
}
//// ................
Thread t1 = new Thread(new MyLogic(111));
Thread t2 = new Thread(new MyLogic(222));
t1.start();
t2.start();
These 2 threads run the same logic with different arguments (111 and 222).
BTW this example shows how to pass values to a thread. To get information back from it you can use a similar approach: define a member variable result, which is set by the run() method, and provide an appropriate getter. Now you can pass the result from the thread to anyone who is interested in it.
Obviously, what is described above is just the basics. I did not say anything about synchronization, thread pools, executors, etc., but I hope this will help you get started. Then find a Java threads tutorial and go through it. In a couple of days you will be a world-class specialist in Java threads. :)
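A minimal sketch of getting a result back this way (SquareTask and compute are hypothetical names). The important detail is to call join() before reading the result, so that the value written by run() is guaranteed to be visible:

```java
class SquareTask implements Runnable {
    private final int input;
    private int result; // written by run(), read by the caller after join()

    SquareTask(int input) {
        this.input = input;
    }

    @Override
    public void run() {
        result = input * input;
    }

    int getResult() {
        return result;
    }

    // Runs the task on its own thread and returns its result once the thread is done.
    static int compute(int input) {
        SquareTask task = new SquareTask(input);
        Thread t = new Thread(task);
        t.start();
        try {
            t.join(); // wait for run() to finish before reading the result
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return task.getResult();
    }
}
```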
Happy threading.