Java: Am I Missing the Point of Threads? (Objects within Threads)


I'm trying to learn threads in Java. I've followed two different tutorials, but I feel like I'm not really getting the concept. As I understand it, when you create threads, you use the Thread class and then embed your own object within the thread. I can do that, but I can't figure out how to access the instance variables within the "embedded" object.
Suppose as a learning exercise, I wanted to create three threads which would go off and do work individually of one another. I could define this object to "drive" the threads:
public class StoogeObject implements Runnable {
String name;
int data;
public StoogeObject(String name, int data){
this.name=name;
this.data=data;
}
@Override
public void run() {
System.out.println("Thread "+this.name+" has this data: "+this.data);
// Do useful work here
System.out.println("Thread "+this.name+" is exiting...");
}
public String getName(){
return this.name;
}
}
Then, in a driver program, I would launch my threads:
public class driver {
public static void main(String[] args){
Thread stooge1 = new Thread(new StoogeObject("Larry", 123));
Thread stooge2 = new Thread(new StoogeObject("Curly", 456));
Thread stooge3 = new Thread(new StoogeObject("Moe", 789));
stooge1.start();
stooge2.start();
stooge3.start();
if(stooge1.isAlive())
System.out.println("From main(): "+stooge1.getName());
}
}
Output is:
From main(): Thread-0
Thread Larry has this data: 123
Thread Curly has this data: 456
Thread Moe has this data: 789
Thread Larry is exiting...
Thread Curly is exiting...
Thread Moe is exiting...
I was surprised when the stooge1.getName() line in main() produced "Thread-0", not "Larry". I was expecting the getName() method I wrote in stoogeObject.java to override and return the instance variable String name in this instance. Instead, I'm getting the name of the Thread, not the StoogeObject.
So... The stooge1 thread has a StoogeObject within it, but I don't know how to access its instance variables. More significantly, this example makes me wonder if I'm missing the point of threads. If I want my Larry, Curly, & Moe objects to go off and do productive work AND keep their own instance variables, is using threads the wrong way to go here? Should I start over, making these objects into processes?

I can't figure out how to access the instance variables within the "embedded" object.
You access them in exactly the same way that you would access the instance variables of any other object.
The "embedded" object, FYI, is called the thread's target or the thread's delegate.
There is nothing special about the target of a Thread. It's just an object.
I was surprised when the stooge1.getName() line in main() produced "Thread-0", not "Larry". I was expecting ... Instead, I'm getting the name of the Thread, not the StoogeObject.
That's because the Thread object and the StoogeObject are different objects.
this example makes me wonder if I'm missing the point of threads.
There are two different ways that threads can be used in a program. The first (which is what people think of more often than not) is that threads are how a Java program can make use of more than one CPU if your platform has more than one CPU. (Virtually all modern servers and workstations have more than one these days, and it's getting to where a lot of cell phones and tablets have more than one as well.) If your platform has, say eight CPUs, then up to eight of your threads may be able to run simultaneously if that many of them are "ready to run."
The second way to use threads in a program is to wait for things. For example, if your program is a server that has to wait for input from each of N clients and respond to it, you can structure it as N threads that each just listen to and respond to one client. That often makes the code easier to understand. (Just as it's easier to juggle one ball than it is to juggle N balls.)
is using threads the wrong way to go here? Should I start over, making these objects into processes?
Threads can be much more tightly coupled than processes because the threads of a single program all share the same virtual address space (i.e., in a Java program, they all share the same heap). Communication between threads usually is one or two orders of magnitude faster than communication between different processes on the same machine.
If you need fine-grained communication between them, then they definitely should be threads. A good rule of thumb is that an application should never spawn a new process unless there is a really good reason why it should not be just another thread.

If you want to access the runnable object that you pass to the thread, you need to keep a reference to it.
Here is an example:
StoogeObject obj = new StoogeObject("Larry", 123);
Thread stooge1 = new Thread(obj);
stooge1.start();
System.out.println(obj.getName());
This will print Larry.
Keep in mind that if the name variable of the StoogeObject instance is changed while the thread is running, you'll have to wait for that thread to finish (or to finish changing the variable) in order to get the correct value.
You can do that by using join().
StoogeObject obj = new StoogeObject("Larry", 123);
Thread stooge1 = new Thread(obj);
stooge1.start();
stooge1.join(); // waits for the thread to finish; may throw InterruptedException
System.out.println(obj.getName());
Here the System.out.println(obj.getName()) statement is executed only after the thread is done.

I was surprised when the stooge1.getName() line in main() produced "Thread-0", not "Larry". I was expecting the getName() method I wrote in stoogeObject.java to override and return the instance variable String name in this instance. Instead, I'm getting the name of the Thread, not the StoogeObject.
How is this surprising? You never set the thread's name, then you call stooge1.getName(), and stooge1 is the Thread, and you're getting precisely what you asked for: "the name of the Thread".
The only thing the Thread knows about the Runnable that you pass it is that it has a run() method, it doesn't know or care about any other things you've added to your Runnable implementation.
If you want to set the thread's name, either use the Thread constructor that takes a name:
Thread stooge1 = new Thread(new StoogeObject(...), "Thread's Name");
Or set its name later:
stooge1.setName("Thread's Name");
So... The stooge1 thread has a StoogeObject within it, but I don't know how to access its instance variables.
It's up to you to store and manage your StoogeObjects, and Titus' answer covers this nicely. I just wanted to add a bit on top of that to answer your thread name related question.
As a side note: Once you wrap your head around the fundamentals, check out the official high-level concurrency tutorial, particularly the section on "executors". The Java API provides a few really convenient high-level constructs for concurrency that you might find useful in certain situations.
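As a rough sketch of what the executor approach could look like for the stooges: the names StoogeExecutorDemo and StoogeTask, the pool size of 3, and the "processed" result string are all hypothetical and chosen only for illustration. Each task keeps its own fields, and the caller gets the result back through a Future instead of reaching into a Thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example class, not part of the question's code.
class StoogeExecutorDemo {

    // A task that keeps its own fields and returns a result when it finishes.
    static class StoogeTask implements Callable<String> {
        private final String name;
        private final int data;

        StoogeTask(String name, int data) {
            this.name = name;
            this.data = data;
        }

        @Override
        public String call() {
            // Do useful work here; this sketch only builds a string.
            return name + " processed " + data;
        }
    }

    // Submits three tasks to a small pool and collects the results in submission order.
    static List<String> runAll() {
        ExecutorService pool = Executors.newFixedThreadPool(3);
        try {
            List<Future<String>> futures = new ArrayList<>();
            futures.add(pool.submit(new StoogeTask("Larry", 123)));
            futures.add(pool.submit(new StoogeTask("Curly", 456)));
            futures.add(pool.submit(new StoogeTask("Moe", 789)));

            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get()); // blocks until that task is done
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown(); // let the pool's threads exit once the tasks are done
        }
    }

    public static void main(String[] args) {
        runAll().forEach(System.out::println);
    }
}
```

The pool owns the threads here; the program only ever deals with its own task objects and their results.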

Related

Access vs execute thread's method [duplicate]

Does this line:
ClientThread.ClientSocket.getInputStream().read()
take ClientThread's method and execute it on the main UI thread, or does it tell ClientThread to execute it?
It results in a NetworkOnMainThreadException, so I believe it's the first option, sadly. If so, how can I execute that method in the ClientThread instead?

There is no way to tell another thread to execute something. So, no, the above simply runs it in your thread. Note that there is the concept 'thread: a thing that runs on an OS core' and 'java.lang.Thread: an object that represents that concept'. They are not quite identical. I'll use j.l.Thread to indicate the Thread object and just thread for the OS concept. j.l.Thread is just an object; it's not particularly magical. Interacting with a j.l.Thread instance is just interacting with an object. It doesn't convey magical 'anything you do with this object somehow runs in a separate thread' powers.
There are only 2 ways to use j.l.Thread to actually run stuff in separate threads:
When creating a j.l.Thread, you pass along the 'run()' code (either passing a Runnable, or extending Thread and supplying a run() method). When some thread runs the code thatThreadObject.start() (note: not run!) - then that starts an actual thread, and that newly created thread will, after initializing, begin by executing the code in that run() method of the j.l.Thread object.
By programming it. In other words, perhaps some thread looks like this:
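private class Runner extends Thread {
    private final ArrayBlockingQueue<Runnable> queue = new ArrayBlockingQueue<>(1000);
    public void offer(Runnable r) {
        queue.add(r);
    }
    @Override public void run() {
        try {
            while (true) {
                Runnable r = queue.take(); // blocks until a task is available
                r.run();
            }
        } catch (InterruptedException e) {
            // interrupted while waiting: leave the loop and let the thread end
        }
    }
}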
Then, imagine some code is running in a thread (not the thread represented by the j.l.Thread instance made from the above code), and calls thatThreadObj.offer(() -> System.out.println("Hello!"));...
then eventually the thread represented by that j.l.Thread will run it. Because you programmed it to do this.
If you want something like that, note that the java.util.concurrent package has implementations of this concept (ExecutorService and friends). Don't write it yourself.
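For illustration, a minimal sketch of the same hand-off using a single-thread executor from java.util.concurrent. The class and method names (HandOffDemo, runOnWorker) are made up for this example; the task simply reports which thread ran it, so you can see that it was not the caller's thread.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical example class.
class HandOffDemo {

    // Hands a task to a single worker thread and returns the name of the thread that ran it.
    static String runOnWorker() {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            Callable<String> task = () -> Thread.currentThread().getName();
            return worker.submit(task).get(); // get() waits for the worker to finish the task
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            worker.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("caller: " + Thread.currentThread().getName());
        System.out.println("worker: " + runOnWorker());
    }
}
```

The executor plays the role of the Runner class above: it owns a queue and a thread, and submit() is the equivalent of offer().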

How to use name pattern in BasicThreadFactory?

I am working on a multithreaded application that reads data from a number of sources, does some calculations and writes results to several outputs. I do have several reader threads, several calculation threads and several writers. Number of each types of threads are given in configuration.
I would like to have these threads named accordingly: "reader-1", "reader-2", "writer-1", etc.
So, I wanted to use org.apache.commons.lang3.concurrent.BasicThreadFactory for this purpose.
I did write the following code:
BasicThreadFactory threadFactory = new BasicThreadFactory.Builder()
.namingPattern("%s-%d")
.daemon(false)
.priority(Thread.MAX_PRIORITY)
.build();
ExecutorService executors = Executors.newFixedThreadPool(config.getPoolSize(), threadFactory);
However, I cannot find anywhere how can I specify name and number of the working thread upon submission.
I searched hundreds of links and did not see a single example of how to do it.
I am creating my Callable
Callable reader = new BatchFileReader(config);
for(int i = 1; i <= maxReaders; i++) {
executors.submit(reader);
}
The submit method does not have any parameters other than the Runnable/Callable instance. I cannot figure out where I can pass the string "reader" and its sequential number to the thread factory.
If anyone can give me a hint, I will appreciate it greatly.

If you really want to set the name of the Thread which executes a submitted Callable/Runnable just give the name to the Callable/Runnable and let it set the name of the Thread it is running on, by just calling Thread.currentThread().setName(...), ie:
public class BatchFileReader implements Runnable {
private final String name;
public BatchFileReader(final String name /*, other arguments go here...*/) {
this.name = Objects.requireNonNull(name);
}
@Override
public void run() {
Thread.currentThread().setName(name);
//Do your work here...
}
}
I am expecting this to work independently of any Executor implementation, assuming that each Thread can only run one given task at a time, which as far as I know is the case indeed. If a single Thread may run more than one Callable/Runnable at a time in (pseudo-)parallel then this approach won't work.
I also looked at subclassing each provided Executor implementation, but that would be a bit more pain. At least on ThreadPoolExecutor you can achieve this by subclassing it and overriding ThreadPoolExecutor#beforeExecute and both AbstractExecutorService#newTaskFor methods in case you are using any submit method to submit Callables/Runnables. If you are just using execute for a Runnable then you will only need to override ThreadPoolExecutor#beforeExecute. But then again you would have to keep a reference to the Runnable's name inside it (so that you can pass it over to the Thread in ThreadPoolExecutor#beforeExecute).
Another thing to consider is whether you really have to rely on Threads for their name at all. If you are only interacting with the Callables/Runnables inside the call/run method that you provide, then why not simply give them their name and leave it at that? I don't see why you need to name the internally handled Threads at all, as long as you are only interacting with their Callables/Runnables (assuming you are doing so). As far as I understand, you could just provide an accessor method for the name (i.e. a getName) in your BatchFileReader implementations.
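To make the first suggestion concrete, here is a small sketch under some assumptions: NamedTaskDemo, NamedReader and runReaders are hypothetical names, the task is a Callable so it can report the thread name it observed, and restoring the previous name in a finally block is my own addition so a pooled thread does not keep a stale name after the task ends.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example class.
class NamedTaskDemo {

    // A task that renames whichever pool thread happens to run it.
    static class NamedReader implements Callable<String> {
        private final String name;

        NamedReader(String name) {
            this.name = Objects.requireNonNull(name);
        }

        @Override
        public String call() {
            Thread current = Thread.currentThread();
            String previous = current.getName();
            current.setName(name);
            try {
                // Do the real work here; the sketch only reports the thread name it sees.
                return current.getName();
            } finally {
                current.setName(previous); // assumption: put the pool's own name back
            }
        }
    }

    // Submits reader-1 .. reader-N and returns the thread names the tasks observed.
    static List<String> runReaders(int maxReaders) {
        ExecutorService pool = Executors.newFixedThreadPool(maxReaders);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 1; i <= maxReaders; i++) {
                futures.add(pool.submit(new NamedReader("reader-" + i)));
            }
            List<String> names = new ArrayList<>();
            for (Future<String> future : futures) {
                names.add(future.get());
            }
            return names;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runReaders(3));
    }
}
```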

When do multiple threads access the same code?

My questions are:
Does a Java program, by default, cause creation of only 1 thread?
If yes, and if we create a multi threaded program, when do multiple threads access the same code of a Java object?
For example I have a Java program with 2 methods - add() and sub(). In what scenario will 2 or more threads run the 'add()' method?
Isn't code always thread safe, as multiple threads will access different sections of code?
If not, please show an example program where thread safety is a concern.

Don't think of "sections of code", think of where the data lives and how many threads are accessing that actual data.
Local variables live on the stack of the thread they are being used in and are thread safe since they are different data "containers" per thread.
Any data that lives on the heap, like instance or static fields, is not inherently thread-safe, because if more than one thread accesses that data then they might have contention.
We could get more complicated and talk about where the data really is but this basic explanation should give you a good idea of what's going on.
The below code gives an example of an instance that is shared by two threads, in this case both threads are accessing the same array list, which is pointing to the same array data containers in the heap. Run it a couple times and you'll eventually see a failure. If you comment out one of the threads it will work correctly every time, counting down from 99.
import java.util.ArrayList;
import java.util.List;
public class Main {
public static void main(String[] args) {
MyRunnable r = new MyRunnable();
new Thread(r).start();
new Thread(r).start();
}
public static class MyRunnable implements Runnable {
// imagine this list living out in the heap and both threads messing with it
// this is really just a reference, but the actual data is in the heap
private List<Integer> list = new ArrayList<>();
{ for (int i = 0; i < 100; i++) list.add(i); }
@Override public void run() {
while (list.size() > 0) System.out.println(list.remove(list.size() - 1));
}
}
}

1) Does a Java program, by default, cause creation of only 1 thread?
It really depends on what your code is doing. A program that only makes a simple System.out.println() call will probably run your code in just one thread. But as soon as you, for example, open a Swing GUI window, at least one other thread will be around (the "event dispatch thread" that reacts to user input and takes care of UI updates).
2) If yes, and if we create a multi threaded program, when do multiple threads access the same code of a Java object?
Misconception on your end. Objects do not have code. Basically, a thread will run a specific method; either its own run() method, or some other method made available to it. And then the thread just executes that method, and any other method call that is triggered from that initial method.
And of course, while running that code, that thread might create other objects, or manipulate the state of already existing objects. When each thread only touches a different set of objects, then no problems arise. But as soon as more than one thread deals with the same object state, proper precaution is required (to avoid nondeterministic behavior).

Your question suggests that you might not fully understand what "thread" means.
When we learned to program, they taught us that a computer program is a sequence of instructions, and they taught us that the computer executes those instructions one-by-one, starting from some well-defined entry point (e.g., the main() routine).
OK, but when we talk about multi-threaded programs, it no longer is sufficient to say that "the computer" executes our code. Now we say that threads execute our code. Each thread has its own idea of where it is in your program, and if two or more threads happen to be executing in the same function at the same time, then each of them has its own private copy of the function's arguments and local variables.
So, you asked:
Does a Java program, by default, cause creation of only 1 thread?
A Java program always starts with one thread executing your code, and usually several other threads executing JVM code. You don't normally need to be aware of the JVM threads. The one thread that executes your code starts its work at the beginning of your main() routine.
Programmers often call that initial thread the "main thread." Probably they call it that because it calls main(), but be careful! The name can be misleading: The JVM doesn't treat the "main thread" any differently from any other thread in a multi-threaded Java program.
if we create a multi threaded program, when do multiple threads access the same code of a Java object?
Threads only do what your program tells them to do. If you write code for two different threads to call the same function, then that's what they will do. But, let's break that question down a bit...
...First of all, how do we create a multi-threaded program?
A program becomes multi-threaded when your code tells it to become multi-threaded. In one simple case, it looks like this:
class MyRunnable implements Runnable {
public void run() {
DoSomeUsefulThing();
DoSomeOtherThing();
}
}
MyRunnable r = new MyRunnable();
Thread t = new Thread(r);
t.start();
...
Java creates a new thread when some other thread in your program calls t.start(). (NOTE! The Thread instance, t, is not the thread. It is only a handle that your program can use to start the thread and inquire about its thread's state and control it.)
When the new thread starts executing program instructions, it will start by calling r.run(). As you can see, the body of r.run() will cause the new thread to DoSomeUsefulThing() and then DoSomeOtherThing() before r.run() returns.
When r.run() returns, the thread is finished (a.k.a., "terminated", a.k.a., "dead").
So,
when do multiple threads access the same code of a Java object?
When your code makes them do it. Let's add a line to the example above:
...
Thread t = new Thread(r);
t.start();
DoSomeUsefulThing();
...
Note that the main thread did not stop after starting the new thread. It goes on to execute whatever came after the t.start() call. In this case, the next thing it does is to call DoSomeUsefulThing(). But that's the same as what the program told the new thread to do! If DoSomeUsefulThing() takes any significant time to complete, then both threads will be doing it at the same time... because that's what the program told them to do.
please show an example program where thread safety is a concern
I just did.
Think about what DoSomeUsefulThing() might be doing. If it's doing something useful, then it almost certainly is doing something to some data somewhere. But, I didn't tell it what data to operate on, so chances are, both threads are doing something to the same data at the same time.
That has a lot of potential to not turn out well.
One way to fix that is to tell the function what data to work on.
class MyDataClass { ... }
class MyRunnable implements Runnable {
private MyDataClass data;
public MyRunnable(MyDataClass data) {
this.data = data;
}
public void run() {
DoSomeUsefulThingWITH(data);
DoSomeOtherThingWITH(data);
}
}
MyDataClass dat_a = new MyDataClass(...);
MyDataClass dat_b = new MyDataClass(...);
MyRunnable r = new MyRunnable(dat_a);
Thread t = new Thread(r);
t.start();
DoSomeUsefulThingWITH(dat_b);
There! Now the two threads are doing the same thing, but they are doing it to different data.
But what if you want them to operate on the same data?
That's a topic for a different question. Google for "mutual exclusion" to get started.
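As a starting point, here is a minimal sketch of mutual exclusion using Java's synchronized keyword. The Counter class and the countWithTwoThreads helper are hypothetical, invented for this illustration; the point is only that two threads can update the same object safely when every access goes through synchronized methods.

```java
// Hypothetical example class.
class SharedCounterDemo {

    // Shared state: only one thread at a time can be inside a synchronized method of an instance.
    static class Counter {
        private int value;

        synchronized void increment() {
            value++;
        }

        synchronized int get() {
            return value;
        }
    }

    // Two threads increment the same Counter; returns the final count.
    static int countWithTwoThreads(int perThread) {
        Counter counter = new Counter();
        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) {
                counter.increment();
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        try {
            a.join(); // wait for both threads before reading the result
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(countWithTwoThreads(100_000)); // prints 200000
    }
}
```

Without the synchronized keyword on increment(), the two threads could overwrite each other's updates and the total would often come out lower.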

Depends on the implementation. Only one thread (the "main thread") will invoke the public static void main(String[]) method, but that doesn't mean other threads weren't started for other tasks.
A thread will access the "same code" if you program it to do so. I'm not sure what your idea of "section of code" is or where the idea that two threads will never access the same "section" at the same time comes from, but it's quite trivial to create thread-unsafe code.
import java.util.ArrayList;
import java.util.List;
public class Main {
public static void main(String[] args) throws InterruptedException {
List<Object> list = new ArrayList<>();
Runnable action = () -> {
while (true) {
list.add(new Object());
}
};
Thread thread1 = new Thread(action, "thread-1");
thread1.setDaemon(true); // don't keep JVM alive
Thread thread2 = new Thread(action, "thread-2");
thread2.setDaemon(true); // don't keep JVM alive
thread1.start();
thread2.start();
Thread.sleep(1_000L);
}
}
An ArrayList is not thread-safe. The above code has two threads constantly trying to add a new Object to the same ArrayList for approximately one second. It's not guaranteed, but if you run that code you might see an ArrayIndexOutOfBoundsException or something similar. Regardless of any exceptions being thrown, the state of the ArrayList is in danger of being corrupted. This is because state is updated by multiple threads with no synchronization.
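One possible way to make this kind of code safe, sketched under assumptions: wrap the list with Collections.synchronizedList so that each add() is synchronized. The class and method names (SafeListDemo, fillFromTwoThreads) and the bounded loop are hypothetical changes so that the program terminates and the result can be checked.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical example class.
class SafeListDemo {

    // Two threads each add perThread objects to one synchronized list; returns the final size.
    static int fillFromTwoThreads(int perThread) {
        List<Object> list = Collections.synchronizedList(new ArrayList<>());
        Runnable action = () -> {
            for (int i = 0; i < perThread; i++) {
                list.add(new Object()); // each add() holds the wrapper's lock
            }
        };
        Thread thread1 = new Thread(action, "thread-1");
        Thread thread2 = new Thread(action, "thread-2");
        thread1.start();
        thread2.start();
        try {
            thread1.join();
            thread2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return list.size();
    }

    public static void main(String[] args) {
        System.out.println(fillFromTwoThreads(50_000)); // prints 100000
    }
}
```

Note that the wrapper only makes individual calls atomic; compound actions such as "check the size, then remove" still need their own synchronization.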

Are there performance implications to creating a Thread and never starting it?

I'm working on an existing Java codebase which has an object that extends Thread, and also contains a number of other properties and methods that pertain to the operation of the thread itself. In former versions of the codebase, the thread was always an actual, heavyweight Thread that was started with Thread.start, waited for with Thread.join, and the like.
I'm currently refactoring the codebase, and in the present version, the object's Thread functionality is not always needed (but the object itself is, due to the other functionality contained in the object; in many cases, it's usable even when the thread itself is not running). So there are situations in which the application creates these objects (which extend Thread) and never calls .start() on them, purely using them for their other properties and methods.
In the future, the application may need to create many more of these objects than previously, to the point where I potentially need to worry about performance. Obviously, creating and starting a large number of actual threads would be a performance nightmare. Does the same thing apply to Thread objects that are never started? That is, are any operating system resources, or large Java resources, required purely to create a Thread? Or are the resources used only when the Thread is actually .started, making unstarted Thread objects safe to use in quantity? It would be possible to refactor the code to split the non-threading-related functionality into a separate function, but I don't want to do a large refactoring if it's entirely pointless to do so.
I've attempted to determine the answer to this with a few web searches, but it's hard to aim the query because search engines can't normally distinguish a Thread object from an actual Java thread.

You could implement Runnable instead of extending Thread.
public class MyRunnableClass implements Runnable {
// Your stuff...
@Override
public void run() {
// Thread-related stuff...
}
}
Whenever you need your object to run in a thread, simply use:
Thread t = new Thread(new MyRunnableClass());
t.start();

As the others have pointed out: performance isn't a problem here.
I would focus much more on the "good design" approach. It simply doesn't make (much, any?) sense to extend Thread when you do not intend to ever invoke start(). And you see: you write code to communicate your intentions.
Extending Thread without using it as thread, that only communicates confusion. Every new future reader of your code will wonder "why is that"?
Therefore, focus on getting to a straightforward design. And I would go one step further: don't just switch to Runnable and continue to use threads directly. Instead, learn about ExecutorServices, how to submit tasks, Futures, and all that.
"Bare iron" Threads (and Runnables) are like 20 year old concepts. Java has better things to offer by now. So, if you are really serious about improving your code base: look into these new abstraction concepts to figure where they would make sense to be used.
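A sketch of how that separation might look, assuming a hypothetical Job class that stands in for the question's object: it is a plain object that can be used without any thread at all, and it is only handed to an executor in the cases where background execution is actually wanted. None of these names come from the question's codebase.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in for the question's object: it no longer extends Thread.
class Job implements Runnable {
    private final String label;
    private volatile boolean done; // set by run(), readable from any thread

    Job(String label) {
        this.label = label;
    }

    String getLabel() {
        return label; // usable whether or not the job ever runs
    }

    boolean isDone() {
        return done;
    }

    @Override
    public void run() {
        // Thread-related work goes here.
        done = true;
    }
}

// Hypothetical example class.
class JobDemo {

    // Runs the job on a background thread, waits for it, and reports whether it completed.
    static boolean runInBackground(Job job) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            pool.submit(job).get(); // get() returns once run() has finished
            return job.isDone();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Job idle = new Job("never started");
        System.out.println(idle.getLabel() + " done? " + idle.isDone());

        Job active = new Job("started");
        System.out.println(active.getLabel() + " done? " + runInBackground(active));
    }
}
```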

You can create about 1.5 million of these objects per GB of memory.
import java.util.LinkedList;
import java.util.List;
class A {
public static void main(String[] args) {
int count = 0;
try {
List<Thread> threads = new LinkedList<>();
while (true) {
threads.add(new Thread());
if (++count % 10000 == 0)
System.out.println(count);
}
} catch (Error e) {
System.out.println("Got " + e + " after " + count + " threads");
}
}
}
Using -Xms1g -Xmx1g for Oracle Java 8, the process grinds to a halt at around the following counts (heap size - number of Thread objects created):
1 GB - 1780000
2 GB - 3560000
6 GB - 10690000
The object uses a bit more than you might expect from reading the source code, but it's still about 600 bytes each.
NOTE: Throwable also uses more memory than you might expect from reading the Java source. It can be 500 - 2000 bytes more depending on the size of the stack at the time it was created.

In Java, you must have a class with shared variables that threads will access?

I'm still learning threads and don't know much yet.
I see that I need to implement the Runnable interface and create several instances of the same class, one for each thread to execute. Is that correct?
If it is correct, do I need to create another class to contain the variables that will be accessed/shared by all threads?
EDIT: I need to maintain some variables to coordinate the threads' work; otherwise they will all do the same work. This will be one variable shared by all threads.
EDIT 2: This question is related to this one: How I make result of SQL querys with LIMIT different in each query? I will need to keep track of how many threads have already queried the database in order to set the OFFSET parameter.

Each thread needs an instance of a Runnable to do its work, yes. In some cases the threads could share the same instance, but only if there is no state held within the instance that needs to differ between threads. Generally you will want different instances in each thread.
Threads should share as little state as possible to avoid problems, but if you do want to share state, in general you are right that you will need an instance or instances somewhere to hold that state.
Note that this shared state could also be held in class variables rather than instance variables.

There are many ways to solve this...this is really a question about Design Patterns.
Each thread could be provided, via its constructor, with an object or objects that describe its unique work.
Or you could provide the thread with a reference to a work queue from which they could query the next available task.
Or you could put a method in the class that implements Runnable that could be called by a master thread...
Many ways to skin this cat...I'm sure there are existing libraries for thread work distribution, configuration, etc.
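As one illustration of the "shared source of work" idea applied to the OFFSET case from the question, here is a sketch that uses a single shared AtomicInteger so that each thread claims a different page. All names (OffsetDemo, claimOffsets, nextPage, pageSize) are hypothetical, and no database is involved; each thread just records the offset it would have used.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example class.
class OffsetDemo {

    // Starts threadCount threads; each atomically claims the next page and records its OFFSET.
    static Set<Integer> claimOffsets(int threadCount, int pageSize) {
        AtomicInteger nextPage = new AtomicInteger(0);        // the one variable shared by all threads
        Set<Integer> claimed = ConcurrentHashMap.newKeySet(); // thread-safe set of claimed offsets
        List<Thread> workers = new ArrayList<>();

        for (int i = 0; i < threadCount; i++) {
            Thread worker = new Thread(() -> {
                // getAndIncrement() never hands the same page number to two threads.
                int offset = nextPage.getAndIncrement() * pageSize;
                claimed.add(offset); // a real worker would run its LIMIT/OFFSET query here
            });
            workers.add(worker);
            worker.start();
        }

        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return new TreeSet<>(claimed); // sorted copy for readable output
    }

    public static void main(String[] args) {
        System.out.println(claimOffsets(4, 100)); // prints [0, 100, 200, 300]
    }
}
```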

Let's put everything in its place.
The statement new Thread(r) creates a thread, but this thread is not running yet. If you say:
Thread t = new Thread(r);
t.start();
you make the thread run, i.e. execute the run() method of your runnable.
Another (equivalent) way to create and run a thread is to inherit from class Thread and override the default implementation of its run() method.
Now, if you have specific logic and you wish to run the same logic simultaneously in different threads, you have to create different threads and call their start() methods.
If you prefer to implement the Runnable interface and your logic does not require any parameters, you can even create only one instance of your Runnable implementation and run it in different threads.
public class MyLogic implements Runnable {
public void run() {
// do something.
}
}
//// ................
Runnable r = new MyLogic();
Thread t1 = new Thread(r);
Thread t2 = new Thread(r);
t1.start();
t2.start();
Now this logic is running simultaneously in 2 separate threads, even though we created only one instance of MyLogic.
If, however, your logic requires parameters, you should create separate instances.
public class MyLogic implements Runnable {
private int p;
public MyLogic(int p) {
this.p = p;
}
public void run() {
// this logic uses value of p.
}
}
//// ................
Thread t1 = new Thread(new MyLogic(111));
Thread t2 = new Thread(new MyLogic(222));
t1.start();
t2.start();
These 2 threads run the same logic with different arguments (111 and 222).
BTW, this example shows how to pass values to a thread. To get information back from it, use a similar approach: define a member variable result that run() sets, and provide an appropriate getter. You can then pass the result from the thread to anyone who is interested in it.
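A minimal sketch of that result-passing idea, with made-up names (SquareTask, ResultDemo, compute) and squaring as a placeholder for real work. The caller waits with join() before calling the getter, so the value written by run() is guaranteed to be visible.

```java
// Hypothetical task: takes a parameter in, exposes a result out.
class SquareTask implements Runnable {
    private final int p;
    private int result; // written by run(), read by the caller after join()

    SquareTask(int p) {
        this.p = p;
    }

    @Override
    public void run() {
        result = p * p; // placeholder for logic that uses the value of p
    }

    int getResult() {
        return result;
    }
}

// Hypothetical example class.
class ResultDemo {

    // Runs the task in its own thread, waits for it, then reads the result through the getter.
    static int compute(int p) {
        SquareTask task = new SquareTask(p);
        Thread t = new Thread(task);
        t.start();
        try {
            t.join(); // after join() returns, everything run() wrote is visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return task.getResult();
    }

    public static void main(String[] args) {
        System.out.println(compute(111)); // prints 12321
    }
}
```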
Obviously, what is described above is just the basics. I did not say anything about synchronization, thread pools, executors, etc. But I hope this will help you get started. Then find a Java thread tutorial and go through it. In a couple of days you will be a world-class specialist in Java threads. :)
Happy threading.
