Without having the source code for a Java API, is there anyway to know if the API methods create multiple threads ? Are there any conventions to follow if you are writing Java APIs and they create multiple threads. This may be very fundamental question but it happened to spawn out of a discussion in which the crux question was - " How do you know which Java APIs create threads and which don't " ?
One way of determining which libraries create new threads is by disallowing Thread creation and ThreadGroup modification in the SecurityManager. See the java.lang.SecurityManager.checkAccess(Thread) method. By implementing your own SecurityManager, you are able to react on the creation of Threads.
To answer the other question: many libraries create new threads, even if you don't expect it. For example APIs for HTTP communication create Timers for Keep-Alives or session timeouts. Java 2D is creating a signalling thread. Java itself has multiple threads, e.g. the Finalizer thread; the AWT/Swing event dispatcher thread etc.
There's no way to tell. Actually, I don't think you normally would care that much unless you're in some kind of constrained environment. What's I've found is more relevant is to determine if a method is written with an expectation of being run on a particular thread (the AWT Event dispatch thread, in the case I've seen). There's not a way to do that either, unless the code is using some kind of naming convention, or it's documented.
In my experience, if you are looking at core java, not J2EE, the only time I can think that threads are created in core Java is with Swing.
I haven't seen any example of other threads being created by the core Java APIs, except for the Thread class, of course. :)
But, if you are using other libraries then it may be that they are creating threads, but, if you don't want to profile, you may want to use AspectJ to log whenever a new thread is created, and the stack track of what called it, so you can see what is creating the threads.
UPDATE:
Swing uses 4 threads, according to this post, but he also explains how you can go about killing off the threads, if needed.
http://www.herongyang.com/Swing/jframe_2.html
If you want to see active threads, just fire up the jvisualvm application (located in your $JDK/bin directory) to connect to any local java process. You'll be able to see a multitude of information about the process, including thread names, status, and history. Get more information here.
Related
I am working on a platfor that hosts small Java applications, all of which currently uses a single thread, living inside a Docker engine, consuming data from a Kafka server and logging to a central DB.
Now, I need to put another Java application to this platform. This app at hand uses multithreading relatively heavily, I already tested it inside a Docker container and it works perfectly there, so I'm ready to deploy it on the platform where it would be scaled manually, that is, some human would define the number of containers that would be started, each of them containing an instance of this app.
My Architect has an objection, saying that "In a distributed environment we never use multithreading". So now, I have to refactor my application eliminating any thread related logic from it, making it single threaded. I requested a more detailed reasoning from him, but he yells "If you are not aware of this principle, you have no place near Java".
Is it really a mistake to use a multithreaded Java application in a distributed system - a simple cluster with ten or twenty physical machines, each hosting a number of virtual machines, which then runs Docker containers, with Java applications inside them.
Honestly, I don't see the problem of multithreading inside a container.
Is it really a mistake or somehow "forbidden"?
Thanks.
When you write for example a web application that will run in a Java EE application server, then normally you should not start up your own threads in your web application. The application server will manage threads, and will allocate threads to process incoming requests on the server.
However, there is no hard rule or reason why it is never a good idea to use multi-threading in a distributed environment.
There are advantages to making applications single-threaded: the code will be simpler and you won't have to deal with difficult concurrency issues.
But "in a distributed environment we never use multithreading" is not necessarily always true and "if you are not aware of this principle, you have no place near Java" sounds arrogant and condescending.
I guess he only tells you this as using a single thread eliminates multi threading and data ordering issues.
There is nothing wrong with multithreading though.
Distributed systems usually have tasks that are heavily I/O bound.
If I/O calls are blocking in your system
The only way to achieve concurrency within the process is spawning new threads to do other useful work. (Multi-threading).
The caveat with this approach is that, if they are too many threads
in flight, the operating system will spend too much time context
switching between threads, which is wasteful work.
If I/O calls are Non-Blocking in your system
Then you can avoid the Multi-threading approach and use a single thread to service all your requests. (read about event-loops or Java's Netty Framework or NodeJS)
The upside for single thread approach
The OS does not any wasteful thread context switches.
You will NOT run into any concurrency problems like dead locks or race conditions.
The downside is that
It is often harder to code/think in a non-blocking fashion
You typically end up using more memory in the form of blocking queues.
What? We use RxJava and Spring Reactor pretty heavily in our application and it works pretty fine. You can't work with threads across two JVMs anyway. So just make sure that your logic is working as you expect on a single JVM.
I know, using processbuilder we can create a process in java, but how do I create a thread inside a process? Also if i want to create multiple threads inside a process what is the best way of doing that?
Thanks in advance.
I'm looking for thread creation inside of new process.
Once launched, the launching application is not in control of the threads within the new process. Additional threads in the new process will be started as and when that processes code decides to. Only if you are the author of the code for the other process, would you be able to change how and when it spawns new threads.
The difference between processes and threads with respect to Java is that threads run within the same JVM instance, while processes run in different JVM instances.
For example, launching two instances of the same Java application, results in two processes, each running in their own JVM. Despite being the same application, they run independently of each other unless the application includes means of communicating between itself.
Thread creation in a different process would be the responsibility of that's process' Java code. If you are looking to create threads in one JVM under the direction of code in another JVM, you will have to implement an inter-process control mechanism (e.g. socket, control file, RMI, JMX, etc).
Without knowing your reason for spawning threads in a different process, I can only assume that you want some type of isolation. If it is data isolation you seek, consider revising your application's architecture to provide it intrinsically and follow one of the suggestions in Peter Lawrey's comment. A good starting point for ExecutorService is Java 8 Concurrency Tools: Threads and Executors.
I have an application which is scheduler running different threads.
The application may load new Runnable classes and run them.
Currently the application is in production, that is it's running on remote server.
My team consists of 3 people developing Runnable classes.
When the class is ready, it's uploaded to server and loaded to scheduler.
I would like to give my team the ability to debug specific threads.
That is: person A may debug threads of Runnable A, B-B, and so on.
Giving them the full access to the remote JVM is not a solution, because
the developers are not allowed to see the system core, and each others solutions.
So my question is: how to allow multiple remote debugging with thread specific connections?
Preferable IDE: Eclipse
EDIT:
It's possible to connect remotely to specific thread with jdb
http://docs.oracle.com/javase/7/docs/technotes/tools/windows/jdb.html
Here is an example: http://www.itec.uni-klu.ac.at/~harald/CSE/Content/debugging.html
1) Find your thread with jdb threads
2) Put breakpoint and enter the wanted thread
Still the security issue stays.
One solution was to compile protected code without debug symbols, but it will only protect the core, allow seeing each other's threads.
So, next step - digging Security Manager. Maybe there's privilege layer suitable for my situation.
I'm not sure I've got a good answer to your question, but let's see how it pans out.
As I understand it you want to allow different developers to debug their class alone, and their class runs as a thread as part of a single Java process.
On the face of it that sort of runs counter to the nature of debugging in that normally you have access to everything in the process. I don't imagine that Java is any different to any other language in this respect (I'm no Java programmer).
So how about running the classes in separate Java processes. That way I presume the standard Eclipse tools would allow each developer to remote attach and debug their class.
However I presume that these classes need to interact with each other in some way, otherwise you wouldn't be asking your question in the first place. And running each class in a separate process (JVM) sounds like a bad thing as far as interaction is concerned.
So how about a different form of interaction where tbe process boundary between each class doesn't really matter that much? You could look at using JCSP which, as far as I can tell, doesn't really care if two threads are in the same process or not.
It's a completely different interaction model, based solely on synchronous message passing. You get some nice fringe benefits - scalability is suddenly no longer a massive problem, and it allows you to dodge many pitfalls normally associated with multithreaded programs (deadlock, etc). However if you've already written a large amount of code, adopting JCSP is probably a significant rewrite.
Is that anywhere near the mark? Good luck.
One of the first things I've learned about Java EE development is that I shouldn't spawn my own threads inside a Java EE container. But when I come to think about it, I don't know the reason.
Can you clearly explain why it is discouraged?
I am sure most enterprise applications need some kind of asynchronous jobs like mail daemons, idle sessions, cleanup jobs etc.
So, if indeed one shouldn't spawn threads, what is the correct way to do it when needed?
It is discouraged because all resources within the environment are meant to be managed, and potentially monitored, by the server. Also, much of the context in which a thread is being used is typically attached to the thread of execution itself. If you simply start your own thread (which I believe some servers will not even allow), it cannot access other resources. What this means, is that you cannot get an InitialContext and do JNDI lookups to access other system resources such as JMS Connection Factories and Datasources.
There are ways to do this "correctly", but it is dependent on the platform being used.
The commonj WorkManager is common for WebSphere and WebLogic as well as others
More info here
And here
Also somewhat duplicates this one from this morning
UPDATE: Please note that this question and answer relate to the state of Java EE in 2009, things have improved since then!
For EJBs, it's not only discouraged, it's expressly forbidden by the specification:
An enterprise bean must not use thread
synchronization primitives to
synchronize execution of multiple
instances.
and
The enterprise bean must not attempt
to manage threads. The enterprise
bean must not attempt to start, stop,
suspend, or resume a thread, or to
change a thread’s priority or name.
The enterprise bean must not attempt
to manage thread groups.
The reason is that EJBs are meant to operate in a distributed environment. An EJB might be moved from one machine in a cluster to another. Threads (and sockets and other restricted facilities) are a significant barrier to this portability.
The reason that you shouldn't spawn your own threads is that these won't be managed by the container. The container takes care of a lot of things that a novice developer can find hard to imagine. For example things like thread pooling, clustering, crash recoveries are performed by the container. When you start a thread you may lose some of those. Also the container lets you restart your application without affecting the JVM it runs on. How this would be possible if there are threads out of the container's control?
This the reason that from J2EE 1.4 timer services were introduced. See this article for details.
Concurrency Utilities for Java EE
There is now a standard, and correct way to create threads with the core Java EE API:
JSR 236: Concurrency Utilities for Java™ EE
By using Concurrency Utils, you ensure that your new thread is created, and managed by the container, guaranteeing that all EE services are available.
Examples here
There is no real reason not to do so. I used Quarz with Spring in a webapp without problems. Also the concurrency framework java.util.concurrent may be used. If you implement your own thread handling, set the theads to deamon or use a own deamon thread group for them so the container may unload your webapp any time.
But be careful, the bean scopes session and request do not work in threads spawned! Also other code beased on ThreadLocal does not work out of the box, you need to transfer the values to the spawned threads by yourself.
You can always tell the container to start stuff as part of your deployment descriptors. These can then do whatever maintainance tasks you need to do.
Follow the rules. You will be glad some day you did :)
Threads are prohibited in Java EE containers according to the blueprints. Please refer to the blueprints for more information.
I've never read that it's discouraged, except from the fact that it's not easy to do correctly.
It is fairly low-level programming, and like other low-level techniques you ought to have a good reason. Most concurrency problems can be resolved far more effectively using built-in constructs like thread pools.
One reason I have found if you spawn some threads in you EJB and then you try to have the container unload or update your EJB you are going to run into problems. There is almost always another way to do something where you don't need a Thread so just say NO.
I am currently building a java-servlet-based web application that should offer its service to quite a lot of users (don't ask me how much "a lot" is :-) - I don't know yet).
However, while the application is being used, there might occur some long-taking processing on the serverside.
In order to avoid bad UI responsiveness, I decided to move these processing operations into their own threads.
This means that once a user is logged in, it can happen that 1-10 threads run in the background (per user!).
I once heard that using multiple threads in a web application is a "bad idea".
Is this true and if yes: Why?
Update: I forgot to mention that my application heavily relies on ajax calls. Every user action causes a new ajax call. So, when the main servlet thread is busy, the ajax call takes very long to process. That's why I want to use multiple threads.
It is a bad idea to manually create the threads yourself. This has been discussed a lot here in SO. See this question for example.
Another question discusses alternative solutions.
The "bad idea" isn't multiple threads. Java EE was originally written so multi-threading was in the hands of the app server, so users were discouraged from starting their own threads.
I think what you really want is asynchronous processing for long-running tasks so users won't have to wait for them to finish before proceeding.
You could do that with JMS and stay within the lines in the Java EE coloring book. I think that it's safer to do on your own, now that there are new classes and constructs in the java.util.concurrent package.
It's still not an easy thing to do. Multi-threaded code isn't trivial. But I think it's easier than it used to be in Java.
Part of the problem might be that you're asking that servlet to do too much. Servlets should listen for HTTP request and orchestrate getting a response from other classes, not do all the processing themselves. Perhaps your servlet is telling you that it's time to refactor a bit. This will help your testing, since you'll be able to unit test those asynch classes without having a servlet/JSP engine running.
AJAX calls to services via HTTP need not block. If the service can return a token, a la FedEx, that tells the app when and how to get the response, there's no reason why the service can't process asynchronously. It's an implementation detail for the services that you should hide from clients.
1.
Brilliant Idea.
It's not common, but it's nothing wrong.
If you think asynchronous tasks are needed for better user experiences. Just use it.
2.
You need to be careful with it.
2.1.
Creating and destroying threads add a lot of overhead to your server.
You'd better use a executor, like java.util.concurrent.ThreadPoolExecutor.
2.2.
Don't just use Executors.newFixedThreadPool(). It is for beginners and hides dangerous details.
You need to know the edge behavior of ThreadPoolExecutor, and configure it properly.
How much threads are enough for your task? You need to calculate it out.
What would happen if there is no free theads in your pool? Different configurations can make it wait, cache, or abandon new tasks. What should you expect?
What would happen if a task runs for too long(such as an infinite loop)? There is no real timeout-and-exit mechanism in java. How do you prevent these.
If the application requires it, then I say go ahead and do the background threads, but, since you don't know how many users you will have, you are taking a great risk that you will overwhelm your server. You might consider some alternatives, if they will work in your situation. Can you run the background tasks completely offline, e.g. in a batch job? Can you limit the number of threads that each logged in user will need? How will you get the results of the background threads back to the user?
This is a bad idea for three main reasons:
Excessive number of running threads can kill system resources and cause some strange things such as starvation and priority inversion. Often this can be solved with a thread pool.
User session duration is unpredictable. The user can fire an action and go for a coffee, or he/she might complain for the delay an redo the action. This can cause creation of multiple background jobs, so requires complex control, and when we talk about threads, we never know for sure if we didn't left race conditions or unantecipated scenarios.
Most likely servlets will have some interaction with the threads. Now suppose your application needs to be scaled, so you use a clustered container (after all, you have "a lot" of users). The container can passivate a session and restore it in another node. But your threads will remain in the initial node, so the link between session and threads will be broken. This ends in unexpected exceptions and error 500 - server failure.
I think the best solution is to design your application so that it won't create so many background threads.
But if you insist or really need it, try using Java EE message driven beans (MDBs) and make your servlet invoke it using JMS, like #duffymo said.
The challenge is how to make communication between MDBs and user sessions. Perhaps your servlet can create a JMS queue or topic and send it to MDBs for them to reply, but I don't know if the servlet side of JMS connection can be passivated and restored.
Another forms of communication would be JNDI or an external database or file, but this requires polling, which might be unresponsive or CPU-excessive.