It is known that AWS Lambda may reuse previously created handler objects, and it really does so (see the FAQ):
Q: Will AWS Lambda reuse function instances?
To improve performance, AWS Lambda may choose to retain an instance of
your function and reuse it to serve a subsequent request, rather than
creating a new copy. Your code should not assume that this will always
happen.
The question is regarding Java concurrency. If I have a class for a handler, say:
public class MyHandler {
    private Foo foo;

    public void handler(Map<String, String> request, Context context) {
        ...
    }
}
so, will it be thread-safe to access and work with the instance variable foo here or not?
In other words: may AWS Lambda use the same object concurrently for different calls?
EDIT My function is invoked from an event-based source; specifically, it is invoked by an API Gateway method.
EDIT-2 This kind of question arises when you want to implement some kind of connection pool to external resources, so I want to keep the connection to the external resource as an instance variable. It actually works as desired, but I'm afraid of concurrency problems.
EDIT-3 More specifically, I'm wondering: can instances of AWS Lambda handlers share a common heap (memory) or not? I have to specify this additional detail in order to prevent answers that merely list well-known facts about thread-safe objects in Java.
May AWS Lambda use the same object concurrently for different calls?
Can instances of AWS Lambda handlers share a common heap (memory) or not?
A strong, definite NO. Instances of AWS Lambda handlers cannot even share files (in /tmp).
An AWS Lambda container may not be reused for two or more concurrently existing invocations of a Lambda function, since that would break the isolation requirement:
Q: How does AWS Lambda isolate my code?
Each AWS Lambda function runs in its own isolated environment, with its own resources and file system view.
The section "How Does AWS Lambda Run My Code? The Container Model" in the official description of how lambda functions work states:
After a Lambda function is executed, AWS Lambda maintains the
container for some time in anticipation of another Lambda function
invocation. In effect, the service freezes the container after a
Lambda function completes, and thaws the container for reuse, if AWS
Lambda chooses to reuse the container when the Lambda function is
invoked again. This container reuse approach has the following
implications:
Any declarations in your Lambda function code remains initialized,
providing additional optimization when the function is invoked again.
For example, if your Lambda function establishes a database
connection, instead of reestablishing the connection, the original
connection is used in subsequent invocations. You can add logic in
your code to check if a connection already exists before creating one.
Each container provides some disk space in the /tmp directory. The
directory content remains when the container is frozen, providing
transient cache that can be used for multiple invocations. You can add
extra code to check if the cache has the data that you stored.
Background processes or callbacks initiated by your Lambda function
that did not complete when the function ended resume if AWS Lambda
chooses to reuse the container. You should make sure any background
processes or callbacks (in case of Node.js) in your code are complete
before the code exits.
As you can see, there is absolutely no warning about race conditions between multiple concurrent invocations of a Lambda function when trying to take advantage of container reuse. The only note is "don't rely on it!".
Taking advantage of execution context reuse is definitely a recommended practice when working with AWS Lambda (see AWS Lambda Best Practices). But this does not apply to concurrent executions, since for a concurrent execution a new container, and thus a new context, is created. In short, for concurrent executions, if one handler changes a value, the other won't see the new value.
As I see it, there are no concurrency issues related to Lambda itself. Only a single invocation "owns" the container; a second invocation will get another container (or will possibly have to wait until the first one becomes free).
BUT I didn't find any guarantee that Java memory visibility issues cannot happen. In that case, changes made by the first invocation could remain invisible to the second one, or the changes of the first invocation could be written to RAM only after the changes made by the second invocation.
In most cases, visibility issues are handled in the same way as concurrency issues. Therefore I would suggest developing the Lambda function to be thread-safe (or synchronized), at least as long as AWS doesn't give us a guarantee that they do something on their side to flush CPU state to memory after every invocation.
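To illustrate, here is a minimal sketch of such a defensively thread-safe handler, assuming a hypothetical createConnection() helper for the external resource mentioned in EDIT-2. The connection is cached in an instance field and reused on warm invocations; the null check follows the pattern quoted from the AWS docs above, and the synchronized block is purely defensive, since AWS documents no memory-visibility guarantees:

import java.sql.Connection;
import java.util.Map;
import com.amazonaws.services.lambda.runtime.Context;

public class MyHandler {
    private Connection connection; // survives as long as the container stays warm

    public void handler(Map<String, String> request, Context context) {
        synchronized (this) { // defensive: guards against potential visibility issues
            if (connection == null) {
                connection = createConnection(); // hypothetical helper for the external resource
            }
        }
        // ... work with the connection ...
    }

    private Connection createConnection() {
        // hypothetical: open the connection to the external resource here
        throw new UnsupportedOperationException("not implemented in this sketch");
    }
}

Because at most one invocation runs in a container at a time, the lock is uncontended and costs practically nothing.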
Related
I have a Java app on AWS Lambda. When I run the test for the Lambda, it analyzes a file from S3 and creates output. When I run it for the first time after deploying a new image for the Lambda, it always finishes with a successful status. Also, if I wait some time and run this process again, it finishes with a successful status.
But if I run two or more Lambdas one after another, the next Lambda throws exceptions when trying to read the file from S3 (I am always using another copy of the same file).
In the Java application, I am using the Spring framework.
It looks like the Lambda was holding state in some objects, and the next Lambda run has been using these objects.
What can I do with this?
Lambda keeps the process of your Lambda function's invocation around for a certain unspecified amount of time (on the order of a few minutes).
So, after you deploy a new version (or a first version) of your Lambda function, or after you have waited for said unspecified amount of time, and you then invoke your Lambda function, the Lambda runtime will instantiate a new small container with your Lambda function as a process living inside of it.
This container and your process are then used to handle the Lambda invocation, after which the process goes dormant (no CPU time is issued to it). However, the process itself and all of its memory are still kept around.
When, within said unspecified time interval, your Lambda function is invoked again, then the Lambda runtime will know that it still has a created instance of your Lambda function's container lying around and will then give the container CPU time to execute your Lambda handler. Once it returns, the container falls to sleep again.
We say that a Lambda function is "cold" when no container for it is currently created by the Lambda runtime. Likewise, we say that a Lambda function is "warm" whenever the interval since the last invocation was within said unspecified time such that the Lambda runtime is still keeping an instantiated container around for your Lambda function.
So, whatever you are doing in your process (JVM process, Spring application), notably storing things in memory, will still be around when your Lambda function is called while it is "warm".
When your Lambda function errors after having been called two or more times in quick succession, then that means that you are storing/caching some state in-process in memory between invocations which in your case results in an error.
To avoid this, you must, in principle, design Lambda functions such that they are stateless (nothing is shared/kept in memory between invocations). However, in order to improve runtime performance and take advantage of the fact that your Lambda function will be "warm" for a certain amount of time, you can cache certain things, such as database connections or AWS SDK client instantiations. Basically, everything you would do once after your application starts up.
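As a minimal sketch of that advice, assuming the AWS SDK for Java v2 and a hypothetical handler: the expensive client is created once per process and reused while the container is warm, while all per-invocation state stays local to the handler method:

import java.util.Map;
import com.amazonaws.services.lambda.runtime.Context;
import software.amazon.awssdk.services.s3.S3Client;

public class S3Handler {
    // Created once per process; reused across warm invocations.
    private static final S3Client S3 = S3Client.create();

    public String handler(Map<String, String> request, Context context) {
        String key = request.get("key"); // per-invocation state: kept local, never cached
        // ... read the object from S3, process it, write the output ...
        return "processed " + key;
    }
}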
With AWS container reuse, I want to understand whether the reuse happens while a call is still pending within that container, or only after the call has finished. I need to declare some resources outside of the handler function, and I could either use synchronization or document the assumption, but before I do that I want to understand what the AWS promise/contract is.
The Lambda does not need synchronization: everything you declare outside the handler will be reused by the next invocation that hits the same execution environment. But there will never be two invocations in the same container at the same time. Only after the first invocation finishes may a second one hit the same execution environment.
The key term to look for is the already mentioned "execution environment" / "execution context". You may find https://docs.aws.amazon.com/lambda/latest/dg/lambda-runtimes.html and https://docs.aws.amazon.com/lambda/latest/dg/running-lambda-code.html helpful.
After a Lambda function is executed, AWS Lambda maintains the execution context for some time in anticipation of another Lambda function invocation. In effect, the service freezes the execution context after a Lambda function completes, and thaws the context for reuse, if AWS Lambda chooses to reuse the context when the Lambda function is invoked again
The "After" does not make it 100% absolute, but clear enough and I can confirm that this is the case.
I am trying to build a process that invokes an AWS Lambda, which then uses AWS SNS to send messages that trigger more Lambdas. Each triggered Lambda writes an output file to S3. The process is as depicted below -
My question is this - how can I know that all Lambdas are done with writing files? I want to execute another process that collects all these files and merges them. I can think of two obvious ways -
Constantly monitor S3 until there are as many output files as SNS messages. Once the total count is reached, invoke the final merging Lambda.
Use a DB as a sync source, write counts for that particular job/session, and keep monitoring it until the count reaches the SNS message count.
Both solutions require constant polling, which I would like to avoid. I want to do this in an event-driven manner. I was hoping Amazon SQS would come to my rescue with some sort of "empty queue Lambda trigger", but SQS only supports triggering Lambdas on new messages. Is there any known way to achieve this in an event-driven manner in AWS? Your suggestions/comments/answers are much appreciated.
I would propose a couple of options here:
Step Functions:
This is a managed service for state machines. It's great for co-ordinating workflows.
Atomic Counting:
If you know the number of things in advance, you could initialize an Atomic Counter in DynamoDB and then atomically decrement it as work completes. Use DynamoDB Streams to trigger Lambda invocation when the counter is mutated, and trigger your next phase (or end of work) when the counter hits zero. Note that whenever an application creates, updates, or deletes items in the table, DynamoDB Streams writes a stream record, so every mutation of the counter would trigger your Lambda.
Note that DynamoDB Streams guarantees the following:
Each stream record appears exactly once in the stream.
For each item that is modified in a DynamoDB table, the stream records appear in the same sequence as the actual modifications to the item.
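Here is a minimal sketch of the atomic decrement, assuming the AWS SDK for Java v2 and a hypothetical JobCounters table keyed by jobId; the stream-triggered Lambda would start the next phase when the returned value reaches zero:

import java.util.Map;
import software.amazon.awssdk.services.dynamodb.DynamoDbClient;
import software.amazon.awssdk.services.dynamodb.model.AttributeValue;
import software.amazon.awssdk.services.dynamodb.model.ReturnValue;
import software.amazon.awssdk.services.dynamodb.model.UpdateItemRequest;
import software.amazon.awssdk.services.dynamodb.model.UpdateItemResponse;

public class CounterDecrement {
    private static final DynamoDbClient DDB = DynamoDbClient.create();

    // Atomically decrements the remaining-work counter and returns the new value.
    public static long decrement(String jobId) {
        UpdateItemResponse resp = DDB.updateItem(UpdateItemRequest.builder()
                .tableName("JobCounters") // hypothetical table name
                .key(Map.of("jobId", AttributeValue.builder().s(jobId).build()))
                .updateExpression("ADD remaining :dec") // atomic in-place update
                .expressionAttributeValues(Map.of(":dec", AttributeValue.builder().n("-1").build()))
                .returnValues(ReturnValue.UPDATED_NEW)
                .build());
        return Long.parseLong(resp.attributes().get("remaining").n());
    }
}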
AWS Step Functions (a managed state machine service) would be the obvious choice. AWS has some examples as starting points. I remember one being a looping state that you could probably apply to this use case.
Another idea off the top of my head...
Create an "Orchestration Lambda" that has the list of your files...
Orchestration Lambda invokes a "File Writer Lambda" in a loop, passing the file info. The invokeAsync(InvokeRequest request) returns a Future object. Orchestration Lambda can check the future object state for completion.
Orchestration Lambda can make a similar call to the "File Writer Lambda" but instead use the more flexible method: invokeAsync(InvokeRequest request, AsyncHandler asyncHandler). You can make an inner class that implements this AsyncHandler and monitor the completion there in the Orchestration Lambda. It is a little cleaner than all the loops.
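A minimal sketch of that second variant, assuming the AWS SDK for Java v1 and a hypothetical FileWriterLambda function; a CountDownLatch stands in for the completion monitoring:

import java.util.List;
import java.util.concurrent.CountDownLatch;
import com.amazonaws.handlers.AsyncHandler;
import com.amazonaws.services.lambda.AWSLambdaAsync;
import com.amazonaws.services.lambda.AWSLambdaAsyncClientBuilder;
import com.amazonaws.services.lambda.model.InvokeRequest;
import com.amazonaws.services.lambda.model.InvokeResult;

public class OrchestrationLambda {
    private static final AWSLambdaAsync LAMBDA = AWSLambdaAsyncClientBuilder.defaultClient();

    public void orchestrate(List<String> files) throws InterruptedException {
        CountDownLatch pending = new CountDownLatch(files.size());
        for (String file : files) {
            InvokeRequest request = new InvokeRequest()
                    .withFunctionName("FileWriterLambda") // hypothetical function name
                    .withPayload("{\"file\":\"" + file + "\"}");
            LAMBDA.invokeAsync(request, new AsyncHandler<InvokeRequest, InvokeResult>() {
                @Override
                public void onSuccess(InvokeRequest req, InvokeResult result) {
                    pending.countDown(); // one file writer finished
                }

                @Override
                public void onError(Exception e) {
                    pending.countDown(); // counted as done; real code would retry or record the failure
                }
            });
        }
        pending.await(); // all file writers completed; trigger the merge step here
    }
}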
There are probably many ways to solve this problem, but here are two ideas.
Personally, I prefer the idea with "Step Functions".
But if you want to simplify your architecture, you could create a triggered Lambda function. Choose 'S3 trigger' on the left side of the Lambda function designer and configure it below.
Check out more - Using AWS Lambda with Amazon S3
But in this case you have to create a more sophisticated Lambda function which checks that all appropriate files have been uploaded to S3 and only then starts your merge (a sketch follows below).
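A minimal sketch of such a function, assuming the AWS SDK for Java v2 and hypothetical bucket, prefix, and expected-count values. Note that concurrent triggers can race on the count, so real code would need an idempotent merge or a conditional marker:

import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.events.S3Event;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.ListObjectsV2Request;
import software.amazon.awssdk.services.s3.model.ListObjectsV2Response;

public class MergeTrigger {
    private static final S3Client S3 = S3Client.create();
    private static final int EXPECTED_FILES = 10; // hypothetical: known fan-out size

    public void handler(S3Event event, Context context) {
        // Count the outputs written so far for this job.
        ListObjectsV2Response listing = S3.listObjectsV2(ListObjectsV2Request.builder()
                .bucket("output-bucket") // hypothetical bucket
                .prefix("job-123/")      // hypothetical job prefix
                .build());
        if (listing.keyCount() >= EXPECTED_FILES) {
            // all outputs present: start the merge here
        }
    }
}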
The stated problem seems a suitable candidate for the Saga pattern.
Basically, a Saga describes any long-running, distributed process.
As mentioned earlier, the AWS platform allows using Step Functions to implement a Saga, as described here.
This may be a stupid question. However, I would like to know: if I have something like this - rdd.mapPartitions(func) - should the logic in func be thread-safe?
Thanks
The short answer is no, it does not have to be thread-safe.
The reason for this is that Spark divides the data between partitions. It then creates a task for each partition, and the function you write runs on that specific partition as a single-threaded operation (i.e. no other thread accesses the same data).
That said, you have to make sure you do not create thread-unsafety manually by accessing resources which are not the RDD data. For example, if you create a static object and access it, this might cause issues, as multiple tasks might run in the same executor (JVM) and access it as well. That said, you shouldn't be doing something like that to begin with unless you know exactly what you are doing...
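A minimal sketch of both points, assuming the Spark 2.x Java API: per-partition local state needs no synchronization, while a static object is shared by every task running in the same executor JVM and therefore must be thread-safe (or, better, avoided):

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class PartitionSafety {
    // One instance per executor JVM, shared by all tasks there; must be thread-safe.
    private static final AtomicLong GLOBAL_COUNT = new AtomicLong();

    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("partition-safety").setMaster("local[4]");
        JavaSparkContext sc = new JavaSparkContext(conf);
        JavaRDD<Integer> rdd = sc.parallelize(Arrays.asList(1, 2, 3, 4, 5, 6), 3);
        long processed = rdd.mapPartitions(it -> {
            List<Integer> out = new ArrayList<>(); // task-local: single-threaded, safe
            while (it.hasNext()) {
                out.add(it.next() * 2);
                GLOBAL_COUNT.incrementAndGet(); // executor-wide: needs atomicity
            }
            return out.iterator();
        }).count();
        System.out.println("processed " + processed + " elements");
        sc.stop();
    }
}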
Any function passed to mapPartitions (or any other action or transformation) has to be thread-safe. Spark on the JVM (this is not necessarily true for guest languages) uses executor threads and doesn't guarantee any isolation between individual tasks.
This is particularly important when you use resources which are not initialized in the function but passed with the closure, for example objects initialized in the main function but referenced in the function.
It goes without saying that you should not modify any of the arguments unless it is explicitly allowed.
When you do "rdd.mapPartitions(func)", the func may actually execute in a different jvm!!! Thread does not have significance across JVM.
If you are running in local mode, and using global state or thread unsafe functions, the job might work as expected but the behaviours is not defined or supported.
We're seeing OptimisticLockingExceptions in a Camunda process with the following Scenario:
The process consists of one UserTask followed by one Gateway and one ServiceTask. The UserTask executes
runtimeService.setVariable(execId, "object", out);
taskService.complete(taskId);
The following ServiceTask uses "object" as an input variable (it does not modify it) and, upon completion, throws said OptimisticLockingException. My problem seems to originate from the fact that taskService.complete() immediately executes the ServiceTask, prior to flushing the variables set in the UserTask.
I've had another, related issue, which occurred when, in one UserTask, I executed runtimeService.setVariable(Map<String, Boolean>) and tried to access the members of the Map as transition guards in a gateway following that UserTask.
I've found the following article: http://forums.activiti.org/content/urgenterror-updated-another-transaction-concurrently which seems somehow related to my issue. However, I'm not clear on whether this is (un)wanted behaviour and how I can access a DelegateExecution object from a UserTask.
After a long and cumbersome search we think we have nailed down two issues with Camunda which (taken together) lead to the exception from the original question.
1. Camunda uses equals on serialized objects (represented by byte arrays) to determine whether process variables have to be written back to the database. This happens even when variables are only read and not set. As equals on arrays is defined by reference identity, a serialized object is never considered "equal" if it has been serialized more than once. We have found that a single runtimeService.setVariable() leads to four DB updates at the time of completeTask() (one for setVariable itself, the other three for various Camunda-internal validation actions). We think this is a bug and will file a bug report with Camunda.
2. Obviously there are two ways to set variables: one is to use runtimeService.setVariable(), the other is to use delegateTask/delegateExecution.setVariable(). There is some flaw when using both ways at the same time. While we could not reduce our setup to a simple unit test, we have identified several components which have to be involved for the exception to occur:
2.1 We are using a TaskListener to set up some context variables at the start of tasks; this task listener used runtimeService.setVariable() instead of delegateTask.setVariable(). After we changed that, the exception vanished.
2.2 We used (and still use) runtimeService.setVariable() during task execution. After we switched to completeTask(variables) and omitted the runtimeService.setVariable() calls, the exception vanished as well. However, this isn't a permanent solution, as we have to store process variables during task execution (see the sketch below).
2.3 The exception occurred only when process variables were read or written via the delegate<X>.getVariable() way (either by our code or implicitly in Camunda's implementation of JUEL parsing with gateways and service tasks, or in completeTask(HashMap)).
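For reference, a minimal sketch of the workaround from 2.2, with names following the original question: the variables are handed to the engine together with task completion, so they are flushed in the same command context instead of via a separate runtimeService.setVariable() update:

import java.util.HashMap;
import java.util.Map;
import org.camunda.bpm.engine.TaskService;

public class CompleteWithVariables {
    public void complete(TaskService taskService, String taskId, Object out) {
        Map<String, Object> variables = new HashMap<>();
        variables.put("object", out);
        // Set the variables and complete the task in one engine command.
        taskService.complete(taskId, variables);
    }
}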
Thanks a lot for all your input.
You could consider using an asynchronous continuation on the service task. This will make sure that the service task is executed inside a new transaction / command context.
Consider reading the Camunda documentation on transactions and asynchronous continuations.
The DelegateExecution object is meant for providing service task (JavaDelegate) implementations access to process instance variables. It is not meant to be used from a User Task.
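For illustration, a minimal sketch (hypothetical delegate) of the intended use: a JavaDelegate reads and writes process variables through the DelegateExecution passed into execute(), within the engine's own command context:

import org.camunda.bpm.engine.delegate.DelegateExecution;
import org.camunda.bpm.engine.delegate.JavaDelegate;

public class MyServiceTaskDelegate implements JavaDelegate {
    @Override
    public void execute(DelegateExecution execution) {
        Object in = execution.getVariable("object"); // set earlier in the process
        // ... do the actual service task work ...
        execution.setVariable("result", in); // written back within the same command context
    }
}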