Suppose I want to make a method non-blocking, so that the app continues as it is but still reliably gets the return value:
Key key = datastore.put(complexInstance);
String name = key.getName();
doSomethingWithTheName(name);
Or, more simply, this is for a Java environment that can't run a thread for more than 30 seconds.
Where in the put method:
public Key put(Object instance){
Key result = null;
// In here process could take up time, say 30 seconds or more, IDK :-/
return result;
}
What is the strategy to achieve this?
You could use an implementation of an ExecutorService in combination with a Future object (http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/Future.html). You would simply start a new thread (or use an existing one) and could fetch the result later.
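As a minimal sketch of that idea, assuming a hypothetical slowPut method standing in for datastore.put and a throwaway single-thread executor (neither is from the question), the call can be submitted as a Callable and the value fetched from the Future only when it is actually needed:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class FutureSketch {
    // Hypothetical stand-in for the slow datastore.put(...) from the question.
    static String slowPut(String instance) throws InterruptedException {
        Thread.sleep(200); // simulate a slow operation
        return "key-for-" + instance;
    }

    static String putThenUseName(final String instance) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            // submit() returns immediately; the work runs on the executor's thread.
            Future<String> pendingName = executor.submit(new Callable<String>() {
                @Override
                public String call() throws Exception {
                    return slowPut(instance);
                }
            });
            // ... the caller is free to do other work here ...
            return pendingName.get(); // blocks only at the point the value is needed
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(putThenUseName("complexInstance"));
    }
}
```

In a real application the executor would normally be long-lived and shared rather than created per call.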
Java 8 made the process a lot simpler:
//field in a manager class
ScheduledExecutorService es = Executors.newScheduledThreadPool(10);
//Schedule a task
es.schedule(() -> { /* contents of a runnable */ }, 0, TimeUnit.SECONDS);
Otherwise, you can still just use an anonymous runnable with the Scheduler:
es.schedule(new Runnable() {
public void run() {
/* do what you need */
}
}, 0, TimeUnit.SECONDS);
However, as you specified, you will still need to do something about the returned value. There isn't really much that you can do, aside from either storing it in some kind of state manager or executing the relevant methods within your runnable.
Your class needs to take a thread pool, probably via the interface ExecutorService, that the methods will run on. You could make it a private static variable, but more likely it's better for it to be passed in or at least configured by the client code, which will set the size, etc.
Note that if IO is the asynchronous part, it's better to use something built on Java's nio framework than to use lots of threads.
You will need to return a Future of some sort. Through Java 7 at least (I'm not sure about 8), Java's future library is very weak and omits some obviously needed functionality. Look at either Functional Java or Google's library. But you will notice that many libraries (Apache's MINA, Amazon Web Service's Java SDK, etc.) implement their own promise libraries to get over these weaknesses. (I did the same in my company's code base. Whoops.)
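For what it's worth, Java 8 did add CompletableFuture, which supports callbacks and chaining. The following is only a sketch under assumptions: AsyncDatastore and putAsync are hypothetical names, and the "put" is faked with string concatenation. It shows a method that takes its pool from client code and returns a future instead of blocking:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AsyncDatastore {
    private final ExecutorService pool; // supplied and sized by client code

    public AsyncDatastore(ExecutorService pool) {
        this.pool = pool;
    }

    // Hypothetical non-blocking variant of put(...): returns at once with a future.
    public CompletableFuture<String> putAsync(Object instance) {
        return CompletableFuture.supplyAsync(() -> "key-" + instance, pool);
    }

    static String demo() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            AsyncDatastore datastore = new AsyncDatastore(pool);
            // thenApply registers a callback that runs once the key is available.
            return datastore.putAsync("complexInstance")
                    .thenApply(name -> "did something with " + name)
                    .join(); // join() is here only so the demo can return the result
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```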
I'm working on an existing Java codebase which has an object that extends Thread, and also contains a number of other properties and methods that pertain to the operation of the thread itself. In former versions of the codebase, the thread was always an actual, heavyweight Thread that was started with Thread.start, waited for with Thread.join, and the like.
I'm currently refactoring the codebase, and in the present version, the object's Thread functionality is not always needed (but the object itself is, due to the other functionality contained in the object; in many cases, it's usable even when the thread itself is not running). So there are situations in which the application creates these objects (which extend Thread) and never calls .start() on them, purely using them for their other properties and methods.
In the future, the application may need to create many more of these objects than previously, to the point where I potentially need to worry about performance. Obviously, creating and starting a large number of actual threads would be a performance nightmare. Does the same thing apply to Thread objects that are never started? That is, are any operating system resources, or large Java resources, required purely to create a Thread? Or are the resources used only when the Thread is actually .started, making unstarted Thread objects safe to use in quantity? It would be possible to refactor the code to split the non-threading-related functionality into a separate function, but I don't want to do a large refactoring if it's entirely pointless to do so.
I've attempted to determine the answer to this with a few web searches, but it's hard to aim the query because search engines can't normally distinguish a Thread object from an actual Java thread.
You could implement Runnable instead of extending Thread.
public class MyRunnableClass implements Runnable {
// Your stuff...
@Override
public void run() {
// Thread-related stuff...
}
}
Whenever you need your object to run as a thread, simply use:
Thread t = new Thread(new MyRunnableClass());
t.start();
As the others have pointed out: performance isn't a problem here.
I would focus much more on the "good design" approach. It simply doesn't make (much, any?) sense to extend Thread when you do not intend to ever invoke start(). And you see: you write code to communicate your intentions.
Extending Thread without using it as thread, that only communicates confusion. Every new future reader of your code will wonder "why is that"?
Therefore, focus on getting to a straightforward design. And I would go one step further: don't just turn to Runnable and continue to use threads. Instead: learn about ExecutorServices, and how to submit tasks, and Futures, and all that.
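A rough sketch of what that could look like, assuming a hypothetical Worker class that carries the state and helper methods formerly living on the Thread subclass: the object is a plain Callable that is usable without any thread, and it only touches a thread when it is submitted to an executor.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class WorkerSketch {
    // Hypothetical replacement for the class that used to extend Thread.
    static class Worker implements Callable<Integer> {
        private final int input;

        Worker(int input) {
            this.input = input;
        }

        // Non-threading functionality: usable whether or not the task ever runs.
        int input() {
            return input;
        }

        @Override
        public Integer call() {
            return input * 2; // the work that used to live in Thread.run()
        }
    }

    static int runOnPool(int input) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Worker worker = new Worker(input); // cheap: no thread is created here
            Future<Integer> result = pool.submit(worker);
            return result.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(new Worker(21).input()); // used without any thread
        System.out.println(runOnPool(21));
    }
}
```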
"Bare iron" Threads (and Runnables) are like 20 year old concepts. Java has better things to offer by now. So, if you are really serious about improving your code base: look into these new abstraction concepts to figure where they would make sense to be used.
You can create about 1.5 million of these objects per GB of memory.
import java.util.LinkedList;
import java.util.List;
class A {
public static void main(String[] args) {
int count = 0;
try {
List<Thread> threads = new LinkedList<>();
while (true) {
threads.add(new Thread());
if (++count % 10000 == 0)
System.out.println(count);
}
} catch (Error e) {
System.out.println("Got " + e + " after " + count + " threads");
}
}
}
Using -Xms1g -Xmx1g for Oracle Java 8, the process grinds to a halt at around the following thread counts, by heap size:
1 GB - 1780000
2 GB - 3560000
6 GB - 10690000
The object uses a bit more than you might expect from reading the source code, but it's still about 600 bytes each.
NOTE: Throwable also uses more memory than you might expect from reading the Java source. It can be 500 - 2000 bytes more, depending on the size of the stack at the time it was created.
I've got a question about CompletableFuture and its possible usage for lazy computations.
It seems like a great substitute for RunnableFuture for this task, since it is possible to easily create task chains and to have total control of each chain link. Still, I found that it is very hard to control exactly when the computation takes place.
If I just create a CompletableFuture with the supplyAsync method or something like that, it is OK. It waits patiently for me to call the get or join method to compute. But if I try to make an actual chain with thenCompose, handle or any other method, the evaluation starts immediately, which is very frustrating.
Of course, I can always place some blocker task at the start of the chain and release the block when I am ready to begin the calculation, but that seems like a rather ugly solution. Does anybody know how to control when a CompletableFuture actually runs?
CompletableFuture is a push-design, i.e. results are pushed down to dependent tasks as soon as they become available. This also means side-chains that are not in themselves consumed still get executed, which can have side-effects.
What you want is a pull-design where ancestors would only be pulled in as their data is consumed.
This would be a fundamentally different design because side-effects of non-consumed trees would never happen.
Of course with enough contortions CF could be made to do what you want, but you should look into the fork-join framework instead which allows you to only run the computations you depend on instead of pushing down results.
There's a conceptual difference between RunnableFuture and CompletableFuture that you're missing here.
RunnableFuture implementations take a task as input and hold onto it. It runs the task when you call the run method.
A CompletableFuture does not hold onto a task. It only knows about the result of a task. It has three states: complete, incomplete, and completed exceptionally (failed).
CompletableFuture.supplyAsync is a factory method that gives you an incomplete CompletableFuture. It also schedules a task which, when it completes, will pass its result to the CompletableFuture's complete method. In other words, the future that supplyAsync hands you doesn't know anything about the task, and can't control when the task runs.
To use a CompletableFuture in the way you describe, you would need to create a subclass:
public class RunnableCompletableFuture<T> extends CompletableFuture<T> implements RunnableFuture<T> {
private final Callable<T> task;
public RunnableCompletableFuture(Callable<T> task) {
this.task = task;
}
@Override
public void run() {
try {
complete(task.call());
} catch (Exception e) {
completeExceptionally(e);
}
}
}
A simple way of dealing with your problem is wrapping your CompletableFuture in something with a lazy nature. You could use a Supplier or even Java 8 Stream.
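A minimal sketch of the Supplier variant, with made-up arithmetic stages and a flag that exists only to make the laziness observable (both are assumptions for illustration): building the Supplier runs nothing, and the chain is only created and started when get() is called.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

public class LazyChainSketch {
    // Returns a recipe for the chain; nothing is scheduled until get() is called.
    static Supplier<CompletableFuture<Integer>> lazyChain(AtomicBoolean started) {
        return () -> CompletableFuture
                .supplyAsync(() -> {
                    started.set(true); // records that the first stage actually ran
                    return 20;
                })
                .thenApply(x -> x + 1)
                .thenApply(x -> x * 2);
    }

    public static void main(String[] args) {
        AtomicBoolean started = new AtomicBoolean(false);
        Supplier<CompletableFuture<Integer>> chain = lazyChain(started);
        System.out.println("started before get(): " + started.get());
        System.out.println("result: " + chain.get().join());
        System.out.println("started after get(): " + started.get());
    }
}
```

Note that this Supplier builds a fresh chain on every get(); caching the first result would need an extra memoizing wrapper.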
This is late, but how about using the constructor for the first CompletableFuture in the chain?
CompletableFuture<Object> cf = new CompletableFuture<>();
// compose the chain
cf.thenCompose(sometask_here);
// later starts the chain with
cf.complete(anInputObject);
A lot of times while writing applications, I wish to profile and measure the time taken for all methods in a stacktrace. What I mean is say:
Method A --> Method B --> Method C ...
Method A internally calls B, which might in turn call another method. I wish to know the time taken to execute inside each method. This way, in a web application, I can know precisely what percentage of time is being consumed by which part of the code.
To explain further: most of the time in a Spring application, I write an aspect which collects information for every method call of a class, which finally gives me a summary. But I hate doing this; it's repetitive and verbose, and I need to keep changing the regex to accommodate different classes. Instead I would like this:
@Monitor
public void generateReport(int id){
...
}
Adding an annotation to a method would trigger the instrumentation API to collect all statistics on the time taken by this method and by any method it subsequently calls. When this method exits, it stops collecting information. I think this should be relatively easy to implement.
The question is: are there any reasonable alternatives that let me do that for general Java code? Or any quick way of collecting this information? Or even a Spring plugin for Spring applications?
PS: Exactly like XRebel, which generates beautiful summaries of the time taken by the security, DAO, service etc. parts of the code. But it costs a bomb. If you can afford it, you should definitely buy it.
You want to write a Java agent. Such an agent allows you to redefine a class when it is loaded. This way, you can implement an aspect without polluting your source code. I have written a library, Byte Buddy, which makes this fairly easy.
For your monitor example, you could use Byte Buddy as follows:
new AgentBuilder.Default()
    .rebase(declaresMethod(isAnnotatedWith(Monitor.class)))
    .transform((builder, type) ->
        builder
            .method(isAnnotatedWith(Monitor.class))
            .intercept(MethodDelegation.to(MonitorInterceptor.class))
    );
class MonitorInterceptor {
    // static, because the delegation targets the class rather than an instance
    @RuntimeType
    public static Object intercept(@Origin String method,
                                   @SuperCall Callable<?> zuper)
            throws Exception {
        long start = System.currentTimeMillis();
        try {
            // invoke the original (instrumented) method
            return zuper.call();
        } finally {
            System.out.println(method + " took " + (System.currentTimeMillis() - start));
        }
    }
}
The agent built above can then be installed on an instance of the Instrumentation interface, which is provided to any Java agent.
As an advantage over using Spring, the above agent will work for any Java instance, not only for Spring beans.
I don't know if there's already a library doing this, nor can I give you ready-to-use code. But I can describe how you can implement it on your own.
First of all, I assume it's no problem to include AspectJ in your project. Then create an annotation, e.g. @Monitor, which acts as a marker for the time measurement of whatever you like.
Then create a simple data structure holding the information you want to track.
An example of this could be the following:
public class OperationMonitoring {
boolean active=false;
List<MethodExecution> methodExecutions = new ArrayList<>();
}
public class MethodExecution {
MethodExecution invoker;
List<MethodExecution> invocations = new ArrayList<>();
long startTime;
long endTime;
}
Then create an Around advice for all methods. On execution, check whether the called method is annotated with your Monitoring annotation. If so, start monitoring each method execution in this thread. A simple example could look like this:
import java.lang.reflect.Method;
import org.aspectj.lang.JoinPoint;
import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.Around;
import org.aspectj.lang.annotation.Aspect;
import org.aspectj.lang.reflect.MethodSignature;

@Aspect
public class MonitoringAspect {
    private final ThreadLocal<OperationMonitoring> operationMonitorings = new ThreadLocal<>();

    @Around("execution(* *.*(..))")
    public Object monitoring(ProceedingJoinPoint pjp) throws Throwable {
        Method method = extractMethod(pjp);
        if (method == null) {
            return pjp.proceed();
        }
        // this method may not be annotated, but tracking may already be active
        OperationMonitoring monitoring = operationMonitorings.get();
        boolean startedHere = false;
        if (method.isAnnotationPresent(Monitoring.class)) {
            if (monitoring == null) {
                // Create new OperationMonitoring object and set it
                monitoring = new OperationMonitoring();
                operationMonitorings.set(monitoring);
            }
            if (!monitoring.active) {
                monitoring.active = true;
                startedHere = true;
            }
        }
        try {
            if (monitoring != null && monitoring.active) {
                // do monitoring stuff around the invocation of the called method
            }
            // invoke the called method (with or without monitoring)
            return pjp.proceed();
        } finally {
            // Stop the monitoring if this annotated method started it
            if (startedHere) {
                monitoring.active = false;
            }
        }
    }

    private Method extractMethod(JoinPoint joinPoint) {
        if (joinPoint.getKind().equals(JoinPoint.METHOD_EXECUTION) && joinPoint.getSignature() instanceof MethodSignature) {
            return ((MethodSignature) joinPoint.getSignature()).getMethod();
        }
        return null;
    }
}
The code above is just a how-to. I would also restructure it, but I've written it in a text field, so please be aware of architectural flaws. As mentioned in the comment at the end, this solution does not support multiple annotated methods along the way, but that would be easy to add.
A limitation of this approach is that it fails when you start additional threads during a tracked path. Adding support for starting new threads in a monitored thread is not that easy. That's also the reason why IoC frameworks have their own features for handling threads, so that they can track this.
I hope you understand the general concept; if not, feel free to ask further questions.
This is the exact reason why I built the open source tool stagemonitor, which uses Byte Buddy to insert profiling code. If you want to monitor a web application, you don't have to alter or annotate your code. If you have a standalone application, there is a @MonitorRequests annotation you can use.
You say you want to know the percentage of time taken within each routine on the stack.
I assume you mean inclusive time.
I also assume you mean wall-clock time, on the theory that if one of those lower-level callees happens to do some I/O, locking, etc., you don't want to be blind to that.
So a stack-sampling profiler that samples on wall-clock time will be getting the right kind of information.
The percentage time that A takes is the percentage of samples containing A, same for B, etc.
To get the percentage of A's time used by B, it is the percentage of samples containing A that happen to have B at the next level below.
The information is all in the stack samples, but it may be hard to get the profiler to extract just the information you want.
You also say you want precise percentage.
That means you also need a large number of stack samples.
For example, if you want to shrink the uncertainty of your measurements by a factor of 10, you need 100 times as many samples.
In my experience finding performance problems, I am willing to tolerate an uncertainty of 10% or more, because my goal is to find big wastage, not to know with precision how bad it is.
So I take samples manually, and look at them manually.
In fact, if you look at the statistics, you only have to see something wasteful on as few as two samples to know it's bad, and the fewer samples you take before seeing it twice, the worse it is.
(Example: If the problem wastes 30% of time, it takes on average 2/30% = 6.67 samples to see it twice. If it wastes 90% of time, it only takes 2.2 samples, on average.)
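The arithmetic above can be checked with a short sketch. It assumes samples are independent and that a problem occupying a fraction p of the time appears in each sample with probability p; the class and method names are made up for illustration. Under that assumption the expected number of samples needed to see the problem k times is k/p, and the standard error of an estimated fraction shrinks with the square root of the sample count, which is where the "100 times as many samples" comes from.

```java
public class SampleMath {
    // Expected number of samples until a problem that is on the stack a
    // fraction p of the time has been seen `hits` times.
    static double expectedSamples(int hits, double p) {
        return hits / p;
    }

    // Standard error of a fraction p estimated from n independent samples.
    static double standardError(double p, int n) {
        return Math.sqrt(p * (1 - p) / n);
    }

    public static void main(String[] args) {
        System.out.println(expectedSamples(2, 0.30)); // about 6.67
        System.out.println(expectedSamples(2, 0.90)); // about 2.22
        // 100 times as many samples gives a 10 times smaller standard error
        System.out.println(standardError(0.30, 100) / standardError(0.30, 10_000));
    }
}
```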
I am not familiar with threads in Java, and I am dealing with this problem: I have a singleton object that contains some objects (let's say sessions), and each object has a limited lifetime. After some time an object is considered expired, so it needs to be removed from the singleton's pool (a List). To do this I decided to have a thread that wakes up every 5 minutes (or 10 minutes, or whatever) and cleans up all expired sessions in the singleton class. How can I implement such functionality while avoiding any possible deadlock or time-consuming blocking? Thank you in advance.
I wouldn't implement it like that. Instead, I would delete the timed-out sessions when a session is requested from the pool (not necessarily at each get, though). This is, BTW, what is done by Guava's CacheBuilder, which you could use, since it's simple, tested, and provides useful features.
If you really want to go this way, then you should probably use a ConcurrentMap or another concurrent collection (the JDK has no ConcurrentList), and a single-threaded ScheduledExecutorService, which would wake up every X minutes, iterate through the collection and remove expired sessions.
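To illustrate the purge-on-access idea from this answer, here is a small sketch. SessionPool and its methods are hypothetical names, sessions are reduced to an id plus an expiry timestamp, and the current time is passed in explicitly so the behaviour is easy to follow; a real pool would read the clock itself and probably purge less often than on every lookup.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class SessionPool {
    // session id -> expiry time in milliseconds
    private final ConcurrentMap<String, Long> expiryBySession = new ConcurrentHashMap<>();
    private final long ttlMillis;

    public SessionPool(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    public void put(String id, long nowMillis) {
        expiryBySession.put(id, nowMillis + ttlMillis);
    }

    // Purges expired sessions, then reports whether the requested one survives.
    public boolean isAlive(String id, long nowMillis) {
        expiryBySession.values().removeIf(expiry -> expiry <= nowMillis);
        return expiryBySession.containsKey(id);
    }

    public int size() {
        return expiryBySession.size();
    }

    public static void main(String[] args) {
        SessionPool pool = new SessionPool(1000);
        pool.put("a", 0);
        pool.put("b", 500);
        System.out.println(pool.isAlive("a", 999));  // still within its lifetime
        System.out.println(pool.isAlive("a", 1000)); // expired and purged
        System.out.println(pool.size());             // only "b" remains
    }
}
```

No dedicated cleanup thread is involved, so there is nothing to deadlock with; the cost is that expired entries linger until the next access.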
Runnable cleaner = new Runnable() {
public void run() { /* remove expired objects here */
//Using get method check whether object is expired
}
};
Executors.newScheduledThreadPool(1)
.scheduleWithFixedDelay(cleaner, 0, 30, TimeUnit.SECONDS);
Is it an option for you to use a pre-existing in-memory caching-solution instead of writing your own?
If yes you could check out Google Guava, which offers a Caching-Solution among many other things.
See: Caching with Guava
See: http://guava-libraries.googlecode.com/svn/trunk/javadoc/com/google/common/cache/CacheBuilder.html
I agree with @quaylar (+1): use an existing caching technology, if you can.
If you can't, however, one solution is to use a java.util.Timer. Initialise it with the time until the first session object expires and put it to sleep. Then, when it wakes, have it remove your session object and reset it with the time until the next expiry. Let Java handle the timing aspects.
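Well, you can do something like this. I am assuming the class for your singleton object is called Singleton, so you have something like this (it's not perfect code):
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class Singleton {
    final List<Object> singletonList = Collections.synchronizedList(new ArrayList<Object>());

    void removeItem() {
        // remove the expired items from singletonList here
    }
}
public class RemoveExpiredItemsThread implements Runnable {
    private final Singleton singletonReference;
    private final long sleepTime = 5 * 60 * 1000; // 5 minutes in ms
    private volatile boolean done = false;        // set to true to stop the loop

    // the constructor
    public RemoveExpiredItemsThread(Singleton singletonReference) {
        this.singletonReference = singletonReference;
    }

    // then the run method, which is something like this
    public void run() {
        while (!done) {
            try {
                Thread.sleep(sleepTime);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop
                return;
            }
            singletonReference.removeItem();
        }
    }
}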
I'm trying to write a construct which allows me to run computations in a given time window. Something like:
def expensiveComputation(): Double = //... some intensive math
val result: Option[Double] = timeLimited( 45 ) { expensiveComputation() }
Here timeLimited will run expensiveComputation with a timeout of 45 minutes. If it reaches the timeout it returns None; otherwise it wraps the result in Some.
I am looking for a solution which:
Is pretty cheap in performance and memory;
Will run the time-limited task in the current thread.
Any suggestion ?
EDIT
I understand my original problem has no solution. Say I can create a thread for the calculation (but I prefer not using a threadpool/executor/dispatcher). What's the fastest, safest and cleanest way to do it ?
Runs the given code block or throws an exception on timeout:
@throws(classOf[java.util.concurrent.TimeoutException])
def timedRun[F](timeout: Long)(f: => F): F = {
import java.util.concurrent.{Callable, FutureTask, TimeUnit}
val task = new FutureTask(new Callable[F]() {
def call() = f
})
new Thread(task).start()
task.get(timeout, TimeUnit.MILLISECONDS)
}
Only an idea: I am not so familiar with Akka futures. But perhaps it's possible to pin the future's executing thread to the current thread and use Akka futures with timeouts?
To the best of my knowledge, either you yield (the computation calls to some scheduler) or you use a thread, which gets manipulated from the "outside".
If you want to run the task in the current thread and if there should be no other threads involved, you would have to check whether the time limit is over inside of expensiveComputation. For example, if expensiveComputation is a loop, you could check for the time after each iteration.
If you are OK with the code of expensiveComputation checking Thread.interrupted() frequently, it is pretty easy. But I suppose you are not.
I don't think there is any solution that will work for arbitrary expensiveComputation code.
The question is what constraints you are prepared to accept on expensiveComputation.
There is also the deprecated and quite unsafe Thread.stop(Throwable). If your code does not modify any object other than those it created itself, it might work.
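For the cooperative case, where the computation polls the interrupt flag, a sketch in Java could look like the following. The names timeLimited and countUntilInterrupted are made up for illustration, Optional stands in for Scala's Option, and a plain dedicated thread is used rather than a pool. On timeout the task is cancelled with interruption, so a computation that checks the flag stops promptly instead of running on in the background.

```java
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class InterruptibleTimeout {
    // Runs the task on its own thread; returns empty if it misses the time limit.
    // Assumes the task never returns null (Optional.of would reject it).
    static <T> Optional<T> timeLimited(long timeoutMillis, Callable<T> task) {
        FutureTask<T> future = new FutureTask<>(task);
        Thread worker = new Thread(future);
        worker.setDaemon(true); // do not keep the JVM alive for a runaway task
        worker.start();
        try {
            return Optional.of(future.get(timeoutMillis, TimeUnit.MILLISECONDS));
        } catch (TimeoutException e) {
            future.cancel(true); // interrupts the worker thread
            return Optional.empty();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    // Hypothetical computation that cooperates by polling the interrupt flag.
    static long countUntilInterrupted() throws InterruptedException {
        long n = 0;
        while (!Thread.currentThread().isInterrupted()) {
            n++;
        }
        throw new InterruptedException("stopped after " + n + " iterations");
    }

    public static void main(String[] args) {
        System.out.println(InterruptibleTimeout.<Integer>timeLimited(1000, () -> 42));
        System.out.println(InterruptibleTimeout.<Long>timeLimited(50,
                InterruptibleTimeout::countUntilInterrupted));
    }
}
```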
I saw a pattern like this work well for time-limited tasks (Java code):
try {
setTimeout(45*60*1000); // 45 min in ms
while (not done) {
checkTimeout();
// do some stuff
// if the stuff can take long, again:
checkTimeout();
// do some more stuff
}
return Some(result);
}
catch (TimeoutException ex) {
return None;
}
The checkTimeout() function is cheap to call; you add it to code so that it is called reasonably often, but not too often. All it does is check current time against timer value set by setTimeout() plus the timeout value. If current time exceeds that value, checkTimeout() raises a TimeoutException.
I hope this logic can be reproduced in Scala, too.
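As a runnable version of the checkTimeout() pattern from this answer, here is a sketch in Java. The Deadline class, the summing loop and the check interval of 1024 iterations are all assumptions chosen for illustration, and Optional plays the role of Some/None. Everything runs in the current thread.

```java
import java.util.Optional;
import java.util.concurrent.TimeoutException;

public class DeadlineSketch {
    // Remembers when the time limit ends; check() is cheap enough to call often.
    static final class Deadline {
        private final long deadlineNanos;

        Deadline(long timeoutMillis) {
            this.deadlineNanos = System.nanoTime() + timeoutMillis * 1_000_000L;
        }

        void check() throws TimeoutException {
            if (System.nanoTime() - deadlineNanos > 0) {
                throw new TimeoutException("time limit exceeded");
            }
        }
    }

    // Sums 1..n in the calling thread, giving up once the time limit has passed.
    static Optional<Long> timeLimitedSum(long n, long timeoutMillis) {
        Deadline deadline = new Deadline(timeoutMillis);
        try {
            long sum = 0;
            for (long i = 1; i <= n; i++) {
                if ((i & 1023) == 0) {
                    deadline.check(); // reasonably often, but not on every iteration
                }
                sum += i;
            }
            return Optional.of(sum);
        } catch (TimeoutException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(timeLimitedSum(100, 60_000));        // finishes in time
        System.out.println(timeLimitedSum(Long.MAX_VALUE, 10)); // hits the limit
    }
}
```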
For a generic solution (without having to go litter each of your expensiveComputations with checkTimeout() code) perhaps use Javassist.
http://www.csg.is.titech.ac.jp/~chiba/javassist/
You can then insert various checkTimeout() methods dynamically.
Here is the intro text on their website:
Javassist (Java Programming Assistant) makes Java bytecode manipulation simple. It is a class library for editing bytecodes in Java; it enables Java programs to define a new class at runtime and to modify a class file when the JVM loads it. Unlike other similar bytecode editors, Javassist provides two levels of API: source level and bytecode level. If the users use the source-level API, they can edit a class file without knowledge of the specifications of the Java bytecode. The whole API is designed with only the vocabulary of the Java language. You can even specify inserted bytecode in the form of source text; Javassist compiles it on the fly. On the other hand, the bytecode-level API allows the users to directly edit a class file as other editors.
Aspect Oriented Programming: Javassist can be a good tool for adding new methods into a class and for inserting before/after/around advice at the both caller and callee sides.
Reflection: One of applications of Javassist is runtime reflection; Javassist enables Java programs to use a metaobject that controls method calls on base-level objects. No specialized compiler or virtual machine are needed.
In the currentThread?? Phhhew...
Check after each step in computation
Well if your "expensive computation" can be broken up into multiple steps or has iterative logic you could capture the time when you start and then check periodically between your steps. This is by no means a generic solution but will work.
For a more generic solution you might make use of aspects or annotation processing, that automatically litters your code with these checks. If the "check" tells you that your time is up return None.
I'll quickly sketch a solution in Java below, using annotations and an annotation processor...
public abstract class Answer {}
public class Some extends Answer {
    final double answer;
    public Some(double answer) { this.answer = answer; }
}
public class None extends Answer {}

// This is the method before annotation processing
@TimeLimit(45)
public Answer calculateQuestionToAnswerOf42() {
    double fairydust = Math.PI * 1.618;
    double moonshadowdrops = Math.pow(222.21, 5);
    double thedevil = 222 * 3;
    return new Some(fairydust + moonshadowdrops + thedevil);
}

// After annotation processing
public Answer calculateQuestionToAnswerOf42() {
    Date start = new Date(); // added via annotation processing
    double fairydust = Math.PI * 1.618;
    if (checkTimeout(start, 45)) return new None(); // added via annotation processing
    double moonshadowdrops = Math.pow(222.21, 5);
    if (checkTimeout(start, 45)) return new None(); // added via annotation processing
    double thedevil = 222 * 3;
    if (checkTimeout(start, 45)) return new None(); // added via annotation processing
    return new Some(fairydust + moonshadowdrops + thedevil);
}
If you seriously need this, you could create a compiler plugin that inserts check blocks in loops and conditions. These check blocks can then check Thread.currentThread().isInterrupted() and throw an exception to escape.
You could possibly use an annotation, e.g. @interruptible, to mark the methods to enhance.