I'm trying to write a construct which allows me to run computations in a given time window. Something like:
def expensiveComputation(): Double = //... some intensive math
val result: Option[Double] = timeLimited( 45 ) { expensiveComputation() }
Here, timeLimited runs expensiveComputation with a timeout of 45 minutes. If the timeout is reached it returns None; otherwise it wraps the result in Some.
I am looking for a solution which:
Is pretty cheap in performance and memory;
Will run the time-limited task in the current thread.
Any suggestions?
EDIT
I understand my original problem has no solution. Say I can create a thread for the calculation (but I would prefer not to use a thread pool/executor/dispatcher). What's the fastest, safest and cleanest way to do it?
Runs the given code block or throws an exception on timeout:
@throws(classOf[java.util.concurrent.TimeoutException])
def timedRun[F](timeout: Long)(f: => F): F = {
import java.util.concurrent.{Callable, FutureTask, TimeUnit}
val task = new FutureTask(new Callable[F]() {
def call() = f
})
new Thread(task).start()
task.get(timeout, TimeUnit.MILLISECONDS)
}
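One thing the snippet above leaves open is what happens to the worker thread after a timeout: it keeps running. A minimal Java sketch of the same idea that returns an empty Optional on timeout and also interrupts the worker is shown below. The names TimeLimited and timeLimited are made up for illustration, and the interrupt only stops a computation that cooperates (for example by sleeping, blocking, or checking its interrupt flag).

```java
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of any library.
class TimeLimited {
    /** Runs the task on a fresh thread; returns Optional.empty() if it does not finish in time. */
    static <T> Optional<T> timeLimited(long timeout, TimeUnit unit, Callable<T> task)
            throws InterruptedException, ExecutionException {
        FutureTask<T> future = new FutureTask<>(task);
        Thread worker = new Thread(future, "time-limited-task");
        worker.setDaemon(true); // an abandoned computation should not keep the JVM alive
        worker.start();
        try {
            return Optional.ofNullable(future.get(timeout, unit));
        } catch (TimeoutException e) {
            future.cancel(true); // interrupts the worker; the task must cooperate to actually stop
            return Optional.empty();
        }
    }

    public static void main(String[] args) throws Exception {
        Optional<Double> fast = timeLimited(1, TimeUnit.SECONDS, () -> Math.sqrt(2.0));
        Optional<Double> slow = timeLimited(50, TimeUnit.MILLISECONDS, () -> {
            Thread.sleep(5_000); // stands in for the expensive computation
            return 1.0;
        });
        System.out.println(fast.isPresent() + " " + slow.isPresent());
    }
}
```

Exceptions thrown by the task itself surface as an ExecutionException from future.get, so the caller can tell a failure apart from a timeout.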
Only an idea: I am not that familiar with Akka futures, but perhaps it's possible to pin the future's executing thread to the current thread and use Akka futures with timeouts?
To the best of my knowledge, either you yield (the computation calls to some scheduler) or you use a thread, which gets manipulated from the "outside".
If you want to run the task in the current thread and if there should be no other threads involved, you would have to check whether the time limit is over inside of expensiveComputation. For example, if expensiveComputation is a loop, you could check for the time after each iteration.
If you are OK with the code of expensiveComputation checking Thread.interrupted() frequently, it is pretty easy. But I suppose you are not.
I don't think there is any solution that will work for arbitrary expensiveComputation code.
The question is what are you prepared to have as constraint on expensiveComputation.
You also have the deprecated and quite unsafe Thread.stop(Throwable). If your code does not modify any objects other than those it created itself, it might work.
I saw a pattern like this work well for time-limited tasks (Java code):
try {
setTimeout(45*60*1000); // 45 min in ms
while (not done) {
checkTimeout();
// do some stuff
// if the stuff can take long, again:
checkTimeout();
// do some more stuff
}
return Some(result);
}
catch (TimeoutException ex) {
return None;
}
The checkTimeout() function is cheap to call; you add it to code so that it is called reasonably often, but not too often. All it does is check current time against timer value set by setTimeout() plus the timeout value. If current time exceeds that value, checkTimeout() raises a TimeoutException.
I hope this logic can be reproduced in Scala, too.
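As a rough illustration of the pattern above, here is a minimal Java sketch that runs entirely in the calling thread. The class name Deadline, the method sumWithLimit and the choice to check every 1024 iterations are assumptions made for the example; the series being summed merely stands in for the expensive computation.

```java
import java.util.Optional;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper playing the role of setTimeout()/checkTimeout().
class Deadline {
    private final long deadlineNanos;

    private Deadline(long deadlineNanos) {
        this.deadlineNanos = deadlineNanos;
    }

    /** Equivalent of setTimeout(): remembers the point in time at which the limit expires. */
    static Deadline after(long amount, TimeUnit unit) {
        return new Deadline(System.nanoTime() + unit.toNanos(amount));
    }

    /** Equivalent of checkTimeout(): cheap comparison against the monotonic clock. */
    void check() throws TimeoutException {
        if (System.nanoTime() - deadlineNanos >= 0) {
            throw new TimeoutException("time limit exceeded");
        }
    }

    /** Stand-in computation: sums 1/k^2 and checks the deadline every 1024 iterations. */
    static Optional<Double> sumWithLimit(long terms, long amount, TimeUnit unit) {
        Deadline deadline = Deadline.after(amount, unit);
        double sum = 0.0;
        try {
            for (long k = 1; k <= terms; k++) {
                if ((k & 1023) == 0) {
                    deadline.check();
                }
                sum += 1.0 / ((double) k * k);
            }
            return Optional.of(sum);
        } catch (TimeoutException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(sumWithLimit(1_000, 45, TimeUnit.MINUTES).isPresent());
        System.out.println(sumWithLimit(Long.MAX_VALUE, 20, TimeUnit.MILLISECONDS).isPresent());
    }
}
```

Checking only every so many iterations keeps the overhead low, at the price of overshooting the limit by at most the time those iterations take.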
For a generic solution (without having to go litter each of your expensiveComputations with checkTimeout() code) perhaps use Javassist.
http://www.csg.is.titech.ac.jp/~chiba/javassist/
You can then insert various checkTimeout() methods dynamically.
Here is the intro text on their website:
Javassist (Java Programming Assistant) makes Java bytecode manipulation simple. It is a class library for editing bytecodes in Java; it enables Java programs to define a new class at runtime and to modify a class file when the JVM loads it. Unlike other similar bytecode editors, Javassist provides two levels of API: source level and bytecode level. If the users use the source-level API, they can edit a class file without knowledge of the specifications of the Java bytecode. The whole API is designed with only the vocabulary of the Java language. You can even specify inserted bytecode in the form of source text; Javassist compiles it on the fly. On the other hand, the bytecode-level API allows the users to directly edit a class file as other editors.
Aspect Oriented Programming: Javassist can be a good tool for adding new methods into a class and for inserting before/after/around advice at the both caller and callee sides.
Reflection: One of applications of Javassist is runtime reflection; Javassist enables Java programs to use a metaobject that controls method calls on base-level objects. No specialized compiler or virtual machine are needed.
In the currentThread?? Phhhew...
Check after each step in computation
Well if your "expensive computation" can be broken up into multiple steps or has iterative logic you could capture the time when you start and then check periodically between your steps. This is by no means a generic solution but will work.
For a more generic solution you might make use of aspects or annotation processing that automatically litters your code with these checks. If the "check" tells you that your time is up, return None.
I'll sketch a solution in Java below using annotations and an annotation processor...
public abstract class Answer {}
public class Some extends Answer {
    final double answer;
    public Some(double answer) { this.answer = answer; }
}
public class None extends Answer {}
// This is the method before annotation processing
@TimeLimit(45)
public Answer calculateQuestionToAnswerOf42() {
    double fairydust = Math.PI * 1.618;
    double moonshadowdrops = Math.pow(222.21, 5);
    double thedevil = 222 * 3;
    return new Some(fairydust + moonshadowdrops + thedevil);
}
// After annotation processing
public Answer calculateQuestionToAnswerOf42() {
    Date start = new Date(); // added via annotation processing
    double fairydust = Math.PI * 1.618;
    if (checkTimeout(start, 45)) return new None(); // added via annotation processing
    double moonshadowdrops = Math.pow(222.21, 5);
    if (checkTimeout(start, 45)) return new None(); // added via annotation processing
    double thedevil = 222 * 3;
    if (checkTimeout(start, 45)) return new None(); // added via annotation processing
    return new Some(fairydust + moonshadowdrops + thedevil);
}
If you're very seriously in need of this you could create a compiler plugin that inserts check blocks in loops and conditions. These check blocks can then check Thread.isInterrupted() and throw an Exception to escape.
You could possibly use an annotation, e.g. @interruptible, to mark the methods to enhance.
Related
I have been working on a very large enterprise system for a financial institution for quite some time. I have only noticed a few usages of asynchronous methods (frankly speaking, maybe 2 or 3). Let's say I have 3 methods: doSomething1(), doSomething2(), doSomething3();
// X = {1,2,3}
SomeResult doSomethingX() {
// execution of this method takes 5-15 secs
}
xxx foo() {
SomeResult result1 = doSomething1();
SomeResult result2 = doSomething2();
SomeResult result3 = doSomething3();
// some code
}
So the execution of foo takes about 3x(5-15)sec = ~30sec
There are a lot of methods similar to foo in our system and I am wondering why there aren't any async methods. Wouldn't just adding @Async to the doSomething methods make it much faster? Or is it just a case of 'we don't use threads explicitly in enterprise systems'?
It is always worth remembering that code written before you joined a project may have been written by someone who had more experience, or who had to solve a unique issue you have not seen, and after trying smarter ways had to do something that seems strange to you. Maybe there is some state you're missing that would not be in place if it was done asynchronously.
But of course, it could just be the case that either:
a) the developers didn't know about it/use it
or
b) it wasn't available at the time for whatever reason.
Enterprises certainly aren't allergic to asynchronous code, multi-threading, or anything else you may think of.
If you are using Spring, you can add the @Async annotation to doSomething(), but that's not all you have to do:
You have to return an AsyncResult from the method and you have to use Future to manage your return values. The following "code" is taken more or less whole cloth from the Spring example: https://spring.io/guides/gs/async-method/:
Future res1 = doSomething("one");
Future res2 = doSomething("two");
Future res3 = doSomething("three");
// Wait until they are all done
while (!(res1.isDone() && res2.isDone() && res3.isDone())) {
Thread.sleep(10); //10-millisecond pause between each check
}
System.out.println(res1.get());
That's already a fair amount of orchestration (perhaps there are better ways), but it gives you an idea of the amount of labor that will go into handling concurrency at a low level. With complexity comes risk.
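For comparison, here is a minimal sketch of the same orchestration in plain Java 8 using CompletableFuture, without the polling loop. It is not taken from the Spring guide; the class ParallelFoo and the stand-in doSomething (which sleeps 200 ms instead of 5-15 seconds) are assumptions for illustration, and it only helps if the three calls are independent of each other and safe to run concurrently.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical example class.
class ParallelFoo {
    // Stand-in for doSomething1/2/3; sleeps briefly instead of taking 5-15 seconds.
    static String doSomething(String name) {
        try {
            Thread.sleep(200);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "result-" + name;
    }

    static String foo(ExecutorService pool) {
        // All three calls are started before any result is awaited.
        CompletableFuture<String> r1 = CompletableFuture.supplyAsync(() -> doSomething("one"), pool);
        CompletableFuture<String> r2 = CompletableFuture.supplyAsync(() -> doSomething("two"), pool);
        CompletableFuture<String> r3 = CompletableFuture.supplyAsync(() -> doSomething("three"), pool);
        // join() blocks until each result is ready, so the total wait is
        // roughly the slowest call rather than the sum of all three.
        return r1.join() + "," + r2.join() + "," + r3.join();
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(3);
        try {
            long start = System.nanoTime();
            System.out.println(foo(pool));
            System.out.println("elapsed ms: " + (System.nanoTime() - start) / 1_000_000);
        } finally {
            pool.shutdown();
        }
    }
}
```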
It seems to me that most folks have come to the conclusion that it's better to let the container handle such scaling issues rather than to handle them by hand. You're supposed to let the container scale your EJBs and your queue workers. There are plenty of Java implementations that let you scale in this way.
Nonetheless, if you made something that took 60 seconds take 5 using a low level method like the above, go for it. You'll be a hero.
A lot of times while writing applications, I wish to profile and measure the time taken for all methods in a stacktrace. What I mean is say:
Method A --> Method B --> Method C ...
A method internally calls B and it might call another. I wish to know the time taken to execute inside each method. This way in a web application, I can precisely know the percentage of time being consumed by what part of the code.
To explain further: most of the time, in a Spring application, I write an aspect which collects information for every method call of a class and finally gives me a summary. But I hate doing this: it's repetitive and verbose, and I need to keep changing the regex to accommodate different classes. Instead I would like this:
@Monitor
public void generateReport(int id){
...
}
Adding an annotation to a method would trigger the instrumentation API to collect timing statistics for this method and any method it calls. When the method exits, it stops collecting information. I think this should be relatively easy to implement.
The question is: are there any reasonable alternatives that let me do this for general Java code? Or any quick way of collecting this information? Or even a Spring plugin for Spring applications?
PS: Exactly like XRebel, which generates beautiful summaries of the time taken by the security, DAO, service etc. parts of the code. But it costs a bomb. If you can afford it, you should definitely buy it.
You want to write a Java agent. Such an agent allows you to redefine a class when it is loaded. This way, you can implement an aspect without polluting your source code. I have written a library, Byte Buddy, which makes this fairly easy.
For your monitor example, you could use Byte Buddy as follows:
new AgentBuilder.Default()
    .rebase(declaresMethod(isAnnotatedWith(Monitor.class)))
    .transform((builder, type) ->
        builder
            .method(isAnnotatedWith(Monitor.class))
            .intercept(MethodDelegation.to(MonitorInterceptor.class))
    );

public class MonitorInterceptor {
    // MethodDelegation.to(Class) binds to static methods of that class
    @RuntimeType
    public static Object intercept(@Origin String method,
                                   @SuperCall Callable<?> zuper)
            throws Exception {
        long start = System.currentTimeMillis();
        try {
            return zuper.call(); // invoke the original method
        } finally {
            System.out.println(method + " took " + (System.currentTimeMillis() - start));
        }
    }
}
The agent built above can then be installed on an instance of the Instrumentation interface, which is provided to any Java agent.
As an advantage over using Spring, the above agent will work for any Java instance, not only for Spring beans.
I don't know if there's already a library doing this, nor can I give you ready-to-use code. But I can describe how you could implement it on your own.
First of all, I assume it's no problem to include AspectJ in your project. Then create an annotation, e.g. @Monitor, which acts as a marker for the time measurement of whatever you like.
Then create a simple data structure holding the information you want to track.
An example for this could be the following:
public class OperationMonitoring {
    boolean active = false;
    List<MethodExecution> methodExecutions = new ArrayList<>();
}
public class MethodExecution {
    MethodExecution invoker;
    List<MethodExecution> invocations = new ArrayList<>();
    long startTime;
    long endTime;
}
Then create an around advice for all methods. On execution, check whether the called method is annotated with your monitoring annotation. If so, start monitoring each method execution in this thread. A simple example could look like:
@Aspect
public class MonitoringAspect {
    private final ThreadLocal<OperationMonitoring> operationMonitorings = new ThreadLocal<>();

    @Around("execution(* *.*(..))")
    public Object monitoring(ProceedingJoinPoint pjp) throws Throwable {
        Method method = extractMethod(pjp);
        if (method == null) {
            return pjp.proceed();
        }
        OperationMonitoring monitoring = null;
        if (method.isAnnotationPresent(Monitor.class)) {
            monitoring = operationMonitorings.get();
            if (monitoring != null) {
                monitoring.active = true;
            } else {
                // Create new OperationMonitoring object and set it
            }
        }
        if (monitoring == null) {
            // this method is not annotated, but is the tracking already active?
            monitoring = operationMonitorings.get();
        }
        Object result;
        if (monitoring != null && monitoring.active) {
            // do monitoring stuff around the invocation of the called method
            result = pjp.proceed();
        } else {
            // invoke the called method without monitoring
            result = pjp.proceed();
        }
        // Stop the monitoring by setting monitoring.active=false if this method
        // was annotated with @Monitor (and it started the monitoring).
        return result;
    }

    private Method extractMethod(JoinPoint joinPoint) {
        if (joinPoint.getKind().equals(JoinPoint.METHOD_EXECUTION)
                && joinPoint.getSignature() instanceof MethodSignature) {
            return ((MethodSignature) joinPoint.getSignature()).getMethod();
        }
        return null;
    }
}
The code above is just a how-to. I would also restructure it, but I've written it in a text field, so please be aware of architectural flaws. As mentioned in the comment at the end, this solution does not support multiple annotated methods along the way, but that would be easy to add.
A limitation of this approach is that it fails when you start additional threads during a tracked path. Adding support for starting new threads in a monitored thread is not that easy. That's also the reason why IoC frameworks have their own features for handling threads, so that they are able to track this.
I hope you understand the general concept; if not, feel free to ask further questions.
This is the exact reason why I built the open source tool stagemonitor, which uses Byte Buddy to insert profiling code. If you want to monitor a web application you don't have to alter or annotate your code. If you have a standalone application, there is a @MonitorRequests annotation you can use.
You say you want to know the percentage of time taken within each routine on the stack.
I assume you mean inclusive time.
I also assume you mean wall-clock time, on the theory that if one of those lower-level callees happens to do some I/O, locking, etc., you don't want to be blind to that.
So a stack-sampling profiler that samples on wall-clock time will be getting the right kind of information.
The percentage time that A takes is the percentage of samples containing A, same for B, etc.
To get the percentage of A's time used by B, it is the percentage of samples containing A that happen to have B at the next level below.
The information is all in the stack samples, but it may be hard to get the profiler to extract just the information you want.
You also say you want precise percentage.
That means you also need a large number of stack samples.
For example, if you want to shrink the uncertainty of your measurements by a factor of 10, you need 100 times as many samples.
In my experience finding performance problems, I am willing to tolerate an uncertainty of 10% or more, because my goal is to find big wastage, not to know with precision how bad it is.
So I take samples manually, and look at them manually.
In fact, if you look at the statistics, you only have to see something wasteful on as few as two samples to know it's bad, and the fewer samples you take before seeing it twice, the worse it is.
(Example: If the problem wastes 30% of time, it takes on average 2/30% = 6.67 samples to see it twice. If it wastes 90% of time, it only takes 2.2 samples, on average.)
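The arithmetic in this answer can be reproduced with a few lines of Java. This is only a sketch under the assumption that stack samples are independent, so that the number of samples needed to see a problem k times has mean k/p, and the standard error of an estimated fraction shrinks with the square root of the sample count. The class and method names are made up for the example.

```java
import java.util.Locale;

// Hypothetical helper for the sampling arithmetic.
class SamplingMath {
    /** Expected number of samples until a problem that is on the stack a fraction p of the time is seen `hits` times. */
    static double expectedSamples(int hits, double p) {
        return hits / p;
    }

    /** Standard error of a fraction p estimated from n independent samples. */
    static double standardError(double p, int n) {
        return Math.sqrt(p * (1 - p) / n);
    }

    public static void main(String[] args) {
        // Seeing a 30% problem twice takes about 6.67 samples on average.
        System.out.println(String.format(Locale.ROOT, "%.2f", expectedSamples(2, 0.30)));
        // Seeing a 90% problem twice takes about 2.22 samples on average.
        System.out.println(String.format(Locale.ROOT, "%.2f", expectedSamples(2, 0.90)));
        // 100 times as many samples shrink the uncertainty by a factor of 10.
        System.out.println(String.format(Locale.ROOT, "%.1f",
                standardError(0.3, 100) / standardError(0.3, 10_000)));
    }
}
```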
While writing code, I sometimes run into a situation where I need to choose between creating a separate method (the advantage being that I can use my own syntax later) and calling the existing, more complex method directly each time (which also means fewer lines of code).
Here are the examples using different programming languages (Objective-C and Java) to explain the question.
Objective-C example:
-(double) maxValueFinder: (NSMutableArray *)data {
    double maxValue = [[data valueForKeyPath:@"@max.intValue"] doubleValue];
    return maxValue;
}
then later:
...
double max = [self maxValueFinder:data];
...
or just every time try to call:
...
double max = [[data valueForKeyPath:@"@max.intValue"] doubleValue];
...
Java example:
public static double maxFinder (ArrayList<Double> data) {
double maxValue = Collections.max(data);
return maxValue;
}
then later:
...
double max = maxFinder(data);
...
or just every time try to call:
...
double max = Collections.max(data);
...
or more complex case to make the point of my question sharper:
//using jsoup
public static Element getElement(Document content){
Element link = content.getElementsByTag("a").first();
return link;
}
or every time:
...
Element link = content.getElementsByTag("a").first();
...
Which approach costs fewer resources (performance, memory), or are they the same?
It absolutely doesn't matter. At least in your Java case you're uselessly recreating existing functionality, which is ridiculous.
You should first see if the functionality is contained in the standard library, then see if existing well known libraries have it, and only after that should you consider writing implementations yourself (especially for more complex functionality).
Performance has nothing to do with your question, except in the sense that the more time you spend on recreating existing functionality, the less time you have left for actual new code (therefore lowering your programming performance).
As for creating wrapper methods, that can be useful in some cases, especially if the actual method calls are often chained and you find yourself having more and more of those in the code. But there's a delicate difference between code clarity and writing excessive code.
public void parseHtml() {
parseFirstPart();
parseSecondPart();
parseThirdPart();
}
If we assume that each parse method only contains 1 or maybe 2 method calls then adding these additional methods is most likely useless, since the same thing can be achieved by proper commenting. If the parse methods contain a lot of calls, it makes sense to extract methods out of them. There's no rule about it; it's a skill you learn while you program (and of course it depends a lot on what you view as beautiful code).
It's absolutely useless to recreate existing functionality, because this function is already implemented in the library.
If you are talking about performance, then in both cases you are executing the same line:
double maxValue = Collections.max(data);
Performance does not matter in either case because you are running the same code.
Suppose I want to make a method non-blocking, and make the app continue as it is and still surely get the return value:
Key key = datastore.put(complexInstance);
String name = key.getName();
doSomethingWithTheName(name);
Or simply, for some Java environment that can't run a thread for more than 30 seconds.
Where in the put method:
public Key put(Object instance){
Key result = null;
// In here process could take up time, say 30 seconds or more, IDK :-/
return result;
}
What is the strategy to achieve this?
You could use an implementation of an ExecutorService in combination with a Future object (http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/Future.html). You would simply start a new thread (or use an existing one) and could fetch the result later.
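A minimal sketch of that suggestion follows. The method slowPut is a hypothetical stand-in for datastore.put (it just sleeps and returns a key name), and AsyncPut and putThenUse are names invented for the example.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example class.
class AsyncPut {
    // Stand-in for datastore.put(...): slow, returns the name of the key.
    static String slowPut(String instance) throws InterruptedException {
        Thread.sleep(200);
        return "key-for-" + instance;
    }

    static String putThenUse(ExecutorService pool, String instance) throws Exception {
        Callable<String> task = () -> slowPut(instance);
        Future<String> pending = pool.submit(task); // returns immediately
        // ... the caller is free to do other work here ...
        return pending.get(); // blocks only at the point where the name is really needed
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            System.out.println(putThenUse(pool, "complexInstance")); // prints: key-for-complexInstance
        } finally {
            pool.shutdown();
        }
    }
}
```

Future.get also has an overload taking a timeout, which may help in an environment with a hard limit on request duration.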
Java 8 made the process a lot simpler:
//field in a manager class
ScheduledExecutorService es = Executors.newScheduledThreadPool(10);
//Schedule a task
es.schedule(() -> { /* contents of a runnable */ }, 0, TimeUnit.SECONDS);
Otherwise, you can still just use an anonymous runnable with the Scheduler:
es.schedule(new Runnable() {
public void run() {
/* do what you need */
}
}, 0, TimeUnit.SECONDS);
However, as you specified, you will still need to do something with the returned value. There isn't really much you can do, aside from storing it in some kind of state manager or calling the relevant methods from within your runnable.
Your class needs to take a thread pool, probably via the interface ExecutorService, that the methods will run on. You could make it a private static variable, but more likely it's better for it to be passed in or at least configured by the client code, which will set the size, etc.
Note that if IO is the asynchronous part, it's better to use something built on Java's nio framework than to use lots of threads.
You will need to return a Future of some sort. Through Java 7 at least (I'm not sure about 8), Java's future library is very weak and omits some obviously needed functionality. Look at either Functional Java or Google's library. But you will notice that many libraries (Apache's MINA, Amazon Web Service's Java SDK, etc.) implement their own promise libraries to get over these weaknesses. (I did the same in my company's code base. Whoops.)
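For what it's worth, Java 8 added CompletableFuture to java.util.concurrent, which supports chaining callbacks instead of blocking on get(). Below is a minimal sketch; slowPut is again a hypothetical stand-in for datastore.put, and NonBlockingPut and putAsync are invented names.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical example class.
class NonBlockingPut {
    // Stand-in for datastore.put(...): slow, returns the name of the key.
    static String slowPut(String instance) {
        try {
            Thread.sleep(200);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "key-for-" + instance;
    }

    /** Starts the put on the common ForkJoinPool and returns at once. */
    static CompletableFuture<String> putAsync(String instance) {
        return CompletableFuture.supplyAsync(() -> slowPut(instance));
    }

    public static void main(String[] args) {
        CompletableFuture<Void> done = putAsync("complexInstance")
                .thenApply(String::toUpperCase)                       // transform the name when it arrives
                .thenAccept(name -> System.out.println("got " + name)); // plays the role of doSomethingWithTheName
        System.out.println("put started, caller not blocked");
        done.join(); // only so this demo does not exit before the callback has run
    }
}
```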
I would like to use wait(int) as the signature of a method in a fluent API (used for http://www.jooq.org). The goal is to be able to construct SQL queries like this example:
SELECT * FROM T_AUTHOR
WHERE ROWNUM <= 1
FOR UPDATE OF FIRST_NAME, LAST_NAME
WAIT 5
The full FOR UPDATE clause syntax specification (at least for Oracle) can be seen here:
FOR UPDATE [ OF [ [ schema. ] { table | view } . ] column
[, [ [ schema. ] { table | view } . ] column]...]
[ { NOWAIT | WAIT integer | SKIP LOCKED } ]
http://download.oracle.com/docs/cd/B28359_01/server.111/b28286/img_text/for_update_clause.htm
With jOOQ, I really want to stay close to the SQL syntax. So I'd like to be able to model the above SQL clause with the jOOQ fluent API like this:
Result<Record> result = create.select()
.from(T_AUTHOR)
.limit(1)
.forUpdate()
.of(FIRST_NAME, LAST_NAME)
.wait(5) // Here's the issue
.fetch();
The fetch method is used to render the API's underlying object as SQL and run the SQL statement against an Oracle (or any other) database. The above can be legally specified in an interface:
/**
* A type that models a "step" in the creation of a query using the fluent API
*/
public interface SelectForUpdateWaitStep extends SelectFinalStep {
// [...]
/**
* Add a "FOR UPDATE .. WAIT n" clause to the query
*/
SelectFinalStep wait(int seconds);
// [...]
}
I have some doubts about this, though, because there is a risk of collision with another method:
public class Object {
// [...]
public final native void wait(long timeout) throws InterruptedException;
// [...]
}
Thanks to method-overloading (int vs. long arguments), I can actually do this. But I'm afraid it might confuse my users and lead to mistakes. So this would be wrong:
.forUpdate()
.of(FIRST_NAME, LAST_NAME)
.wait((long) 5) // This doesn't make sense
.fetch(); // This doesn't compile
So my questions are:
Can I somehow prevent calling/accessing Object.wait(long) altogether? I don't think so because it's declared final, but maybe someone knows a compiler trick, or something else?
Do you have a better idea for my API design apart from just renaming the method to something silly like doWait(int) or WAIT(int)?
You might try using a waitFor method instead, which specifies both a time and a "condition" to wait for. The implementation detail would be hidden, but one possible implementation would be to try your action immediately and loop until the specified condition has been met, with an appropriate pause between attempts.
Here's a sample interface for a Condition I use myself (as you can see, it doesn't need to be complex):
public interface Condition {
public boolean met();
}
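One possible shape of such a waitFor helper is sketched below. This is an assumption about how the polling could look, not the answerer's actual implementation; the parameter names and the fixed pause between attempts are made up for the example.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical polling helper built around the Condition interface above.
class WaitFor {
    interface Condition {
        boolean met();
    }

    /** Polls the condition until it is met or the timeout elapses; returns whether it was met. */
    static boolean waitFor(Condition condition, long timeout, TimeUnit unit, long pauseMillis)
            throws InterruptedException {
        long deadline = System.nanoTime() + unit.toNanos(timeout);
        while (!condition.met()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // gave up: time limit reached
            }
            Thread.sleep(pauseMillis); // pause between attempts
        }
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.currentTimeMillis();
        boolean eventually = waitFor(() -> System.currentTimeMillis() - start >= 100,
                2, TimeUnit.SECONDS, 10);
        boolean never = waitFor(() -> false, 50, TimeUnit.MILLISECONDS, 10);
        System.out.println(eventually + " " + never);
    }
}
```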
void wait(long) is a part of the contract offered by Object and therefore it should not be changed. Imagine that someone stores your object and attempts to use it for wait/notify threading logic. So completely changing its logic is just playing against the rules. You will have to come up with a different name.
On the other hand, it seems that having forUpdate take a parameter indicating the wait time would fit the bill. You could just have another version of forUpdate in addition to the existing one.
What this requires is a way to disable an Object method. And main reason seems to be because it has a nice name that would fit the purposes of a proprietary API.
First of all, this contradicts the entire idea of inheritance: once you inherit from a class, all subclasses must expose the same non-private fields and methods. You can always override a method, except when (1) it is marked as final or (2) the override would have an incompatible (non-covariant) return type, both of which are true with the void wait(long) method.
Furthermore, since every object is an Object in Java, everything must have a method void wait(long) and there should be no way to hide/delete/disable/forward/override it. Assuming it were possible to hide the void wait(long) method, how would you go about invoking it, should you wish to invoke it?
However, assuming you would never need to invoke void wait(long) for your particular classes, there is always the approach of source/byte-code weaving that AspectJ uses in order to make changes to the .class Java bytecode based on certain invocation rules. You could trap every call to wait(long) and declare an error/warning. See more here: http://www.eclipse.org/aspectj/doc/released/adk15notebook/annotations-decp.html
However, native method pointcuts are not possible even with AspectJ with byte-code weaving. Most likely, this is not possible even with source-code weaving -- but it might be worth a try.
Hacking around with core Java for the sake of a DSL is simply not a good idea.
Why not make your DSL more expressive?
What does wait(int n) mean anyway? Wait for N milliseconds, seconds, minutes?
A better signature would be:
wait(long duration, java.util.concurrent.TimeUnit unit) { ... }
which reads better, for example:
wait(30, TimeUnit.MILLISECONDS)
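To illustrate that such a signature is legal Java, here is a small self-contained sketch. ForUpdateSketch is a made-up class and has nothing to do with jOOQ's real API; it only shows that wait(long, TimeUnit) is a new overload next to Object.wait(long) and Object.wait(long, int) rather than an override, and that the unit makes the rendered WAIT value unambiguous.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical fluent step; not jOOQ's actual API.
class ForUpdateSketch {
    private final StringBuilder sql = new StringBuilder("SELECT * FROM T_AUTHOR FOR UPDATE");

    // Overloads (does not override) the final Object.wait methods.
    public ForUpdateSketch wait(long duration, TimeUnit unit) {
        sql.append(" WAIT ").append(unit.toSeconds(duration)); // Oracle's WAIT takes seconds
        return this;
    }

    public String render() {
        return sql.toString();
    }

    public static void main(String[] args) {
        // prints: SELECT * FROM T_AUTHOR FOR UPDATE WAIT 5
        System.out.println(new ForUpdateSketch().wait(5, TimeUnit.SECONDS).render());
    }
}
```

The inherited Object.wait(long) is still callable on such an object, so the potential for confusion raised in the question is reduced but not removed.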