I have read some answers on similar topics but was not satisfied with any of them.
We deploy code to AWS Lambda as a jar file whose name carries a version, like name-of-my-app-14.jar, where 14 is the Jenkins build number.
The problem is that I have no way to tell which version of the jar is currently deployed in AWS, and that would be a nice thing to have.
This is the cloudformation fragment I have to create the lambda:
MyLambdaFunction:
  Type: AWS::Serverless::Function
  Properties:
    CodeUri: name-of-my-app.jar
    FunctionName: "my-function-name"
    Handler: "com.package.something.myapp.HandlerClass::handleRequest"
    MemorySize: 256
    Role: "arn:aws:iam::1234567890:role/some-role"
    Runtime: "java8"
    Timeout: 60
    Environment:
      Variables:
        SOME_VARIABLE: "value"
To deploy, we download the jar with the version we want from our artifact repository, save it under the name specified in the template above, and run:
aws cloudformation package --template-file myapp-stack.yaml --output-template-file tmp.yaml --s3-bucket my.bucket
aws cloudformation deploy --region my-region --template-file tmp.yaml --stack-name prod-myappstackname --capabilities CAPABILITY_IAM --parameter-overrides Environment=prod --no-fail-on-empty-changeset
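What I would like, roughly, is something like the sketch below: pass the Jenkins build number into the function as an environment variable (BUILD_VERSION is a hypothetical addition next to SOME_VARIABLE in the template) and have the handler log it, so the deployed version shows up in CloudWatch. Package and class names here are placeholders, not our real handler.

package com.example.myapp; // placeholder package

import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;

public class HandlerClass implements RequestHandler<Object, String> {

    @Override
    public String handleRequest(Object input, Context context) {
        // BUILD_VERSION is a hypothetical variable the deploy pipeline would set,
        // e.g. through the Environment/Variables block or --parameter-overrides.
        String version = System.getenv("BUILD_VERSION");
        context.getLogger().log("Deployed build: " + version);
        return "Deployed build: " + version;
    }
}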
I have a .jar file containing useful functions for my application located in an AWS S3 bucket, and I want to use it as a dependency in Spark without having to first download it locally. Is it possible to directly reference the .jar file with spark-submit (or pyspark) --jars option?
So far, I have tried the following:
spark-shell --packages com.amazonaws:aws-java-sdk:1.12.336,org.apache.hadoop:hadoop-aws:3.3.4 --jars s3a://bucket/path/to/jar/file.jar
The AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY variables are correctly set, since when running the same command without the --jars option, other files in the same bucket are successfully read. However, if the option is added, I get the following error:
Exception in thread "main" java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2688)
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:3431)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3466)
at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:174)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3574)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3521)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:540)
at org.apache.spark.util.DependencyUtils$.resolveGlobPath(DependencyUtils.scala:317)
at org.apache.spark.util.DependencyUtils$.$anonfun$resolveGlobPaths$2(DependencyUtils.scala:273)
at org.apache.spark.util.DependencyUtils$.$anonfun$resolveGlobPaths$2$adapted(DependencyUtils.scala:271)
at scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:293)
at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:38)
at scala.collection.TraversableLike.flatMap(TraversableLike.scala:293)
at scala.collection.TraversableLike.flatMap$(TraversableLike.scala:290)
at scala.collection.AbstractTraversable.flatMap(Traversable.scala:108)
at org.apache.spark.util.DependencyUtils$.resolveGlobPaths(DependencyUtils.scala:271)
at org.apache.spark.deploy.SparkSubmit.$anonfun$prepareSubmitEnvironment$4(SparkSubmit.scala:364)
at scala.Option.map(Option.scala:230)
at org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:364)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:901)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2592)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2686)
... 27 more
I'm using Spark 3.3.1 pre-built for Apache Hadoop 3.3 and later.
This is likely because, in client mode, Spark distributes the JARs specified in --jars (via Netty) during startup. To download a remote JAR from a third-party file system such as S3, it already needs the right dependency (hadoop-aws) on the classpath, before it has finished preparing the final classpath.
Since the JARs have not been distributed yet, the classpath is not ready: when Spark tries to download the JAR from S3 it fails with ClassNotFoundException (hadoop-aws is not yet on the classpath), whereas the same access from application code succeeds because by then the classpath has been resolved.
In other words, downloading the dependency depends on a library that will only be loaded later.
To run Apache Spark applications with a JAR dependency from Amazon S3, you can use the --jars command-line option to specify the S3 URL of the JAR file when submitting the Spark application.
For example, if your JAR file is stored in the my-bucket S3 bucket at the jars/my-jar.jar path, you can submit the Spark application as follows:
spark-submit --jars s3a://my-bucket/jars/my-jar.jar \
--class com.example.MySparkApp \
s3a://my-bucket/my-spark-app.jar
This will download the JAR file from S3 and add it to the classpath of the Spark application.
Note that you will need to include the s3a:// prefix in the S3 URL to use the s3a filesystem connector, which is the recommended connector for reading from and writing to S3. You may also need to configure the fs.s3a.access.key and fs.s3a.secret.key properties with your AWS access key and secret key in order to authenticate the connection to S3.
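If you would rather set those properties in code than on the command line, a minimal sketch in Java could look like the following. The bucket and file path are placeholders, and it assumes hadoop-aws and the AWS SDK are on the classpath as in the --packages example above; spark.hadoop.* settings are forwarded to the Hadoop configuration.

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class S3AExample {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("s3a-example")
                // These map to fs.s3a.access.key / fs.s3a.secret.key.
                .config("spark.hadoop.fs.s3a.access.key", System.getenv("AWS_ACCESS_KEY_ID"))
                .config("spark.hadoop.fs.s3a.secret.key", System.getenv("AWS_SECRET_ACCESS_KEY"))
                .getOrCreate();

        // Placeholder path -- replace with your own bucket/object.
        Dataset<Row> df = spark.read().text("s3a://my-bucket/some/input.txt");
        df.show();

        spark.stop();
    }
}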
I've got a Spring Boot application that I'd like to automatically generate traces for using the OpenTelemetry Java agent, and subsequently upload those traces to Google Cloud Trace.
I've added the following code to the entry point of my application for sending traces:
OpenTelemetrySdk.builder()
    .setTracerProvider(
        SdkTracerProvider.builder()
            .addSpanProcessor(
                SimpleSpanProcessor.create(TraceExporter.createWithDefaultConfiguration())
            )
            .build()
    )
    .buildAndRegisterGlobal();
...and I'm running my application with the following system properties:
-javaagent:path/to/opentelemetry-javaagent-all.jar \
-jar myapp.jar
...but I don't know how to connect the two.
Is there some agent configuration I can apply? Something like:
-Dotel.traces.exporter=google_cloud_trace
I ended up resolving this as follows:
Clone the GoogleCloudPlatform/opentelemetry-operations-java repo:
git clone git@github.com:GoogleCloudPlatform/opentelemetry-operations-java.git
Build the exporter-auto project
./gradlew clean :exporter-auto:shadowJar
Copy the jar produced in exporter-auto/build/libs to my target project
Run the application with the following arguments:
-javaagent:path/to/opentelemetry-javaagent-all.jar
-Dotel.javaagent.experimental.extensions=[artifact-from-step-3].jar
-Dotel.traces.exporter=google_cloud_trace
-Dotel.metrics.exporter=none
-jar myapp.jar
Note: This setup does not require any explicit code changes in the target code base.
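Optionally, to sanity-check that spans are being exported (this is not required, just a sketch assuming the opentelemetry-api dependency is available), you can create a manual span through the global instance that the agent registers:

import io.opentelemetry.api.GlobalOpenTelemetry;
import io.opentelemetry.api.trace.Span;
import io.opentelemetry.api.trace.Tracer;
import io.opentelemetry.context.Scope;

public class TraceSmokeTest {
    public static void main(String[] args) {
        // The agent registers the global OpenTelemetry instance, so no SDK setup is needed here.
        Tracer tracer = GlobalOpenTelemetry.getTracer("manual-smoke-test");
        Span span = tracer.spanBuilder("smoke-test-span").startSpan();
        try (Scope scope = span.makeCurrent()) {
            // ... work done here should show up under this span in Cloud Trace ...
        } finally {
            span.end();
        }
    }
}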
I am trying to run a Spark/Java application on Kubernetes (via minikube) using the spark-operator. I am a bit confused about what I should place in the Dockerfile so that the image can be built and executed via the spark-operator.
Sample spark-operator.yaml:
apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: my-spark-app
  namespace: default
spec:
  type: Java
  mode: cluster
  image: docker/repo/my-spark-app-image
  mainApplicationFile: local:///opt/app/my-spark-app.jar
As mentioned above, the spark-operator YAML only requires the jar and the image location. So, is the Dockerfile below all I need? Is there any sample Dockerfile I can refer to?
Dockerfile:
FROM openjdk11:alpine-jre
COPY target/*.jar /opt/app/csp_auxdb_refresh.jar
COPY src/main/resources/* opt/app
In the Dockerfile you have provided, neither Spark nor any other dependencies are installed.
To quickly get started, use gcr.io/spark-operator/spark:v3.1.1 as the base for your image, i.e. change the FROM statement to FROM gcr.io/spark-operator/spark:v3.1.1 and build again.
There is a great getting-started guide for the spark-operator in their GitHub repo.
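For reference, the jar pointed to by mainApplicationFile only needs an ordinary Spark entry point; a minimal sketch like the one below (class and app names are placeholders) is enough, since the operator base image already provides the Spark runtime.

import org.apache.spark.sql.SparkSession;

public class MySparkApp {
    public static void main(String[] args) {
        // The base image supplies the Spark distribution; the application jar
        // only needs to provide this entry point.
        SparkSession spark = SparkSession.builder()
                .appName("my-spark-app")
                .getOrCreate();

        // Trivial sanity job: count a small in-memory range.
        long count = spark.range(100).count();
        System.out.println("Row count: " + count);

        spark.stop();
    }
}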
I finally reached the point where my Elastic Beanstalk instance/environment (Java Corretto 11 platform) launches. Now it fails to start the provided .jar file.
In the eb-engine.log file, I cannot find any error beyond this:
2021/05/27 11:36:25.889735 [INFO] Executing instruction: StageJavaApplication
2021/05/27 11:36:25.889871 [ERROR] An error occurred during execution of command [app-deploy] - [StageJavaApplication]. Stop running the command. Error: staging java app failed due to invalid zip file
The jar file is a Spring Boot application built with mvn -B package.
Locally the whole thing starts, but then crashes because of missing environment variables (expected behaviour).
But it seems AWS is not even starting the application.
Any suggestions on this?
Spring Boot apps run nicely on Elastic Beanstalk. However, you do need to set some variables. For example, have you set the server.port variable to 5000?
And as you stated, to successfully use a service client, you can set environment variables for your credentials. Here is an end-to-end walkthrough that shows how to put a Spring Boot app that invokes several AWS services onto Elastic Beanstalk:
Creating your first AWS Java web application
PS - your log file mentions a ZIP file. Be sure to create the JAR properly as discussed in the above example.
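On the credentials point: if you set AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY as Elastic Beanstalk environment properties, the SDK's default credential provider chain picks them up without any code changes. A minimal sketch with the AWS SDK for Java v2 (the region is only illustrative):

import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.ListBucketsResponse;

public class CredsCheck {
    public static void main(String[] args) {
        // No explicit credentials: the default provider chain reads the
        // AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY environment variables
        // (or an instance profile, if one is attached).
        try (S3Client s3 = S3Client.builder()
                .region(Region.EU_CENTRAL_1) // illustrative region
                .build()) {
            ListBucketsResponse buckets = s3.listBuckets();
            buckets.buckets().forEach(b -> System.out.println(b.name()));
        }
    }
}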
Just in case someone arrives here looking for an answer about this error:
Error: staging java app failed due to invalid zip file
I was renaming my service jar in Gradle, using:
tasks.withType<org.springframework.boot.gradle.tasks.bundling.BootJar> {
    archiveFileName.set("service.jar")
    launchScript()
}
And Elastic Beanstalk was not happy about the renaming.
When I let it keep the default name, there were no zip issues and everything worked like a charm.
I want to test my Lambda functions locally with the Serverless Application Model (SAM).
In the AWS docs they write:
SAM Local leverages the docker-lambda Docker images to run your code in a sandbox that simulates the Lambda execution environment.
I pulled the docker image on my computer. I could successfully run a simple Hello World Lambda Function.
Command to run Lambda function locally:
$ docker run -v "$PWD/target/classes":/var/task lambci/lambda:java8 com.amazonaws.lambda.demo.LambdaFunctionHandler
results:
"Hello from Lambda!"
Code of Lambda function automatically generated with Eclipse Toolkit:
package com.amazonaws.lambda.demo;

import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;

public class LambdaFunctionHandler implements RequestHandler<Object, String> {

    @Override
    public String handleRequest(Object input, Context context) {
        context.getLogger().log("Input: " + input);

        // TODO: implement your handler
        return "Hello from Lambda!";
    }
}
This is my progress so far. What I could not do is use SAM Local, which uses the docker-lambda image. (Maybe I should not have to download it manually?)
I installed SAM Local on my Windows machine:
npm install -g aws-sam-local
and created a template.yaml SAM config file:
AWSTemplateFormatVersion: 2010-09-09
Transform: AWS::Serverless-2016-10-31
Resources:
  ExampleJavaFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: com.amazonaws.lambda.demo.LambdaFunctionHandler
      CodeUri: ./target/demo-1.0.0-shaded.jar
      Runtime: java8
I chose the name for CodeUri after building my shaded jar file with:
mvn compile shade:shade
After this, I should be able to run my Lambda function with:
$ echo '{ "some": "input" }' | sam local invoke
Now I get this error:
2017/12/05 14:56:36 Successfully parsed template.yaml
2017/12/05 14:56:36 Running AWS SAM projects locally requires Docker. Have you got it installed?
2017/12/05 14:56:36 error during connect: Get http://%2F%2F.%2Fpipe%2Fdocker_engine/_ping: open //./pipe/docker_engine: The system cannot find the file specified. In the default daemon configuration on Windows, the docker client must be run elevated to connect. This error may also indicate that the docker daemon is not running.
What is my mistake in using SAM Local with Java? Could it be that it's not working because my computer does not have Hyper-V and I am using Docker Toolbox?
Here you can see the advanced SAM docs for compiled languages like Java.
It was a bug in SAM Local, fixed in a newer update.
If you still have a problem on Windows, then try this:
COMPOSE_CONVERT_WINDOWS_PATHS=1
This should help if your path separators are wrong (/ vs \).