AWS SAM Local with Java and Docker Toolbox

I want to test my Lambda functions locally with the Serverless Application Model (SAM).
In the AWS docs they write:
SAM Local leverages the docker-lambda Docker images to run your code in a sandbox that simulates the Lambda execution environment.
I pulled the docker image on my computer. I could successfully run a simple Hello World Lambda Function.
Command to run Lambda function locally:
$ docker run -v "$PWD/target/classes":/var/task lambci/lambda:java8 com.amazonaws.lambda.demo.LambdaFunctionHandler
results:
"Hello from Lambda!"
Code of Lambda function automatically generated with Eclipse Toolkit:
package com.amazonaws.lambda.demo;

import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;

public class LambdaFunctionHandler implements RequestHandler<Object, String> {

    @Override
    public String handleRequest(Object input, Context context) {
        context.getLogger().log("Input: " + input);
        // TODO: implement your handler
        return "Hello from Lambda!";
    }
}
This is my progress so far. What I couldn't do is use SAM Local, which uses the docker-lambda image. (Maybe I shouldn't have to download it manually?)
I installed SAM Local on my Windows machine:
npm install -g aws-sam-local
and created a template.yaml SAM config file:
AWSTemplateFormatVersion: 2010-09-09
Transform: AWS::Serverless-2016-10-31
Resources:
  ExampleJavaFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: com.amazonaws.lambda.demo.LambdaFunctionHandler
      CodeUri: ./target/demo-1.0.0-shaded.jar
      Runtime: java8
I chose the CodeUri value after building my shaded jar file with:
mvn compile shade:shade
After this, I should be able to run my Lambda function with:
$ echo '{ "some": "input" }' | sam local invoke
Now I get this error:
2017/12/05 14:56:36 Successfully parsed template.yaml
2017/12/05 14:56:36 Running AWS SAM projects locally requires Docker. Have you got it installed?
2017/12/05 14:56:36 error during connect: Get http://%2F%2F.%2Fpipe%2Fdocker_engine/_ping: open //./pipe/docker_engine: The system cannot find the file specified. In the default daemon configuration on Windows, the docker client must be run elevated to connect. This error may also indicate that the docker daemon is not running.
What is my mistake in using SAM Local with Java? Could it be that it's not working because my computer doesn't have Hyper-V and I'm using Docker Toolbox?
Here you can see the advanced SAM docs for compiled languages like Java.

It was a bug in SAM Local, fixed with a new update.
If you still have a problem on Windows, then try setting:
COMPOSE_CONVERT_WINDOWS_PATHS=1
This should help if your path separators are wrong (/ vs \).

Related

How can I export traces generated by the OpenTelemetry Java agent to Google Cloud Trace?

I've got a Spring Boot application that I'd like to automatically generate traces for using the OpenTelemetry Java agent, and subsequently upload those traces to Google Cloud Trace.
I've added the following code to the entry point of my application for sending traces:
OpenTelemetrySdk.builder()
    .setTracerProvider(
        SdkTracerProvider.builder()
            .addSpanProcessor(
                SimpleSpanProcessor.create(TraceExporter.createWithDefaultConfiguration())
            )
            .build()
    )
    .buildAndRegisterGlobal();
...and I'm running my application with the following command-line arguments:
-javaagent:path/to/opentelemetry-javaagent-all.jar \
-jar myapp.jar
...but I don't know how to connect the two.
Is there some agent configuration I can apply? Something like:
-Dotel.traces.exporter=google_cloud_trace
I ended up resolving this as follows:
Clone the GoogleCloudPlatform/opentelemetry-operations-java repo:
git clone git@github.com:GoogleCloudPlatform/opentelemetry-operations-java.git
Build the exporter-auto project
./gradlew clean :exporter-auto:shadowJar
Copy the jar produced in exporter-auto/build/libs to my target project
Run the application with the following arguments:
-javaagent:path/to/opentelemetry-javaagent-all.jar
-Dotel.javaagent.experimental.extensions=[artifact-from-step-3].jar
-Dotel.traces.exporter=google_cloud_trace
-Dotel.metrics.exporter=none
-jar myapp.jar
Note: This setup does not require any explicit code changes in the target code base.

Run Java Jar within Python Google Cloud Function

I'm trying to port over some old Ruby code I used to run on Heroku to a Python-based Google Cloud Function.
This code runs Apple's Reporter tool which is "a command-line tool that you can use to download your Sales and Trends reports and Payments and Financial Reports". Docs can be found here.
The Ruby code worked well for years until yesterday, running on Heroku with a Ruby + Java buildpack. A small snippet of it, where the options are arguments received:
options = [
  vendor_id,
  file_type,
  sub_file_type,
  'Daily',
  trimmed_date,
  version
]

Dir.chdir("#{Rails.root}/tmp/") do
  stdout, stderr, status = Open3.capture3("java -jar #{Rails.root}/public/jars/Reporter.jar p=Reporter.properties m=Robot.XML Sales.getReport #{options.join(', ')}")
  return {:status => status, :error => stderr.to_s, :stdout => stdout.to_s }
end
The error I'm seeing on Heroku, after no code or stack updates, is: "Network is available but cannot connect to application. Check your proxy and firewall settings and try again."
Most of our other similar processes have been moved to Google Cloud Functions, so after getting nowhere with the above error I thought I'd move this also.
So a similar snippet this time in Python:
from subprocess import Popen, PIPE

def execute_reporter_jar(vendor_id, trimmed_date, file_type, api_version):
    process = Popen(["java -jar Reporter.jar p=Reporter.properties Sales.getReportVersion Sales, Detailed"], stdin=PIPE, stdout=PIPE, shell=True)
    out, err = process.communicate()
    print("returncode = %s", process.returncode)
    print("stdout = %s", out)
    print("stderr = %s", err)
This works well locally, but when I deploy to Google Cloud it seemingly runs successfully in a few ms; however, nothing actually happens, and when I dig deeper it seems the subprocess is returning a 127 (command not found) error. So it seems the Cloud Function can't access Java.
After a good 24 hours, I've hit a wall with this. Can anyone help? I have zero Java knowledge, and I know Cloud Functions have a Java runtime, but I would prefer to stick with Python.
The ultimate aim is for Apple's reporter to run and save the requested file to Google Cloud Storage.
Thanks in advance!
The execution environment for Cloud Functions with the Python runtime (both 3.7 and 3.8) is currently based on Ubuntu 18.04 (check the information in this link).
The runtime only includes a limited list of system packages, so running a subprocess that depends on other binaries is usually not recommended.
If it's paramount for you to stick with Python, you could try to deploy your function using the Buildpack CLI and extend the builder image to install Java on the Python runtime, or, if your application can be dockerized, consider building an image yourself with Java included and deploying your application on Cloud Run.
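If you do stay on Python, a quick sanity check is to resolve the java binary before shelling out, so the 127 (command not found) failure surfaces as a clear error instead of a silent no-op. A minimal sketch, assuming Reporter.jar and Reporter.properties sit next to the function code (the argument list below is illustrative, not the original call):

import shutil
import subprocess

def execute_reporter_jar():
    # A shell return code of 127 means "command not found", so check for a JVM
    # up front and fail with a clear message instead.
    java_bin = shutil.which("java")
    if java_bin is None:
        raise RuntimeError("No 'java' binary is available in this runtime; "
                           "a custom image with a JRE (e.g. on Cloud Run) is needed.")

    # Pass the arguments as a list instead of using shell=True, so nothing is
    # silently swallowed by the shell.
    result = subprocess.run(
        [java_bin, "-jar", "Reporter.jar", "p=Reporter.properties",
         "Sales.getReportVersion", "Sales,", "Detailed"],
        capture_output=True, text=True)
    print("returncode =", result.returncode)
    print("stdout =", result.stdout)
    print("stderr =", result.stderr)
    return result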

TypeError: 'JavaPackage' object is not callable (spark._jvm)

I'm setting up GeoSpark Python and, after installing all the prerequisites, I'm running the very basic code examples to test it.
from pyspark.sql import SparkSession
from geo_pyspark.register import GeoSparkRegistrator

spark = SparkSession.builder.\
    getOrCreate()

GeoSparkRegistrator.registerAll(spark)

df = spark.sql("""SELECT st_GeomFromWKT('POINT(6.0 52.0)') as geom""")
df.show()
I tried running it with python3 basic.py and with spark-submit basic.py; both give me this error:
Traceback (most recent call last):
  File "/home/jessica/Downloads/geo_pyspark/basic.py", line 8, in <module>
    GeoSparkRegistrator.registerAll(spark)
  File "/home/jessica/Downloads/geo_pyspark/geo_pyspark/register/geo_registrator.py", line 22, in registerAll
    cls.register(spark)
  File "/home/jessica/Downloads/geo_pyspark/geo_pyspark/register/geo_registrator.py", line 27, in register
    spark._jvm. \
TypeError: 'JavaPackage' object is not callable
I'm using Java 8, Python 3, and Apache Spark 2.4; my JAVA_HOME is set correctly, and I'm running Linux Mint 19. My SPARK_HOME is also set:
$ printenv SPARK_HOME
/home/jessica/spark/
How can I fix this?
The jars for GeoSpark are not correctly registered with your Spark session. There are a few ways around this, ranging from a tad inconvenient to pretty seamless. For example, if, when you call spark-submit, you specify:
--jars jar1.jar,jar2.jar,jar3.jar
then the problem will go away; you can also provide a similar option to pyspark if that's your poison.
If, like me, you don't really want to be doing this every time you boot (and setting this as a .conf() in Jupyter will get tiresome), then instead you can go into $SPARK_HOME/conf/spark-defaults.conf and set:
spark.jars jar1.jar,jar2.jar,jar3.jar
which will then be loaded when you create a Spark instance. If you've not used the conf file before, it'll be there as spark-defaults.conf.template.
Of course, when I say jar1.jar..., what I really mean is something along the lines of:
/jars/geo_wrapper_2.11-0.3.0.jar,/jars/geospark-1.2.0.jar,/jars/geospark-sql_2.3-1.2.0.jar,/jars/geospark-viz_2.3-1.2.0.jar
but that's up to you to get the right ones from the geo_pyspark package.
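For reference, the .conf() route mentioned above (the one that gets tiresome in Jupyter) looks roughly like this in PySpark; this is a sketch using the same placeholder jar paths, not something from the original answer:

from pyspark.sql import SparkSession

# Programmatic equivalent of the spark.jars line in spark-defaults.conf,
# applied before the session (and its JVM) is created.
spark = (
    SparkSession.builder
    .appName("geospark-example")
    .config("spark.jars",
            "/jars/geo_wrapper_2.11-0.3.0.jar,"
            "/jars/geospark-1.2.0.jar,"
            "/jars/geospark-sql_2.3-1.2.0.jar,"
            "/jars/geospark-viz_2.3-1.2.0.jar")
    .getOrCreate()
)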
If you are using EMR:
You need to set your cluster config JSON to
[
  {
    "classification": "spark-defaults",
    "properties": {
      "spark.jars": "/jars/geo_wrapper_2.11-0.3.0.jar,/jars/geospark-1.2.0.jar,/jars/geospark-sql_2.3-1.2.0.jar,/jars/geospark-viz_2.3-1.2.0.jar"
    },
    "configurations": []
  }
]
and also upload your jars as part of your bootstrap. You can do this from Maven, but I just threw them on an S3 bucket:
#!/bin/bash
sudo mkdir /jars
sudo aws s3 cp s3://geospark-test-ds/bootstrap/geo_wrapper_2.11-0.3.0.jar /jars/
sudo aws s3 cp s3://geospark-test-ds/bootstrap/geospark-1.2.0.jar /jars/
sudo aws s3 cp s3://geospark-test-ds/bootstrap/geospark-sql_2.3-1.2.0.jar /jars/
sudo aws s3 cp s3://geospark-test-ds/bootstrap/geospark-viz_2.3-1.2.0.jar /jars/
If you are using an EMR Notebook
You need a magic cell at the top of your notebook:
%%configure -f
{
    "jars": [
        "s3://geospark-test-ds/bootstrap/geo_wrapper_2.11-0.3.0.jar",
        "s3://geospark-test-ds/bootstrap/geospark-1.2.0.jar",
        "s3://geospark-test-ds/bootstrap/geospark-sql_2.3-1.2.0.jar",
        "s3://geospark-test-ds/bootstrap/geospark-viz_2.3-1.2.0.jar"
    ]
}
I was seeing a similar kind of issue with the SparkMeasure jars on a Windows 10 machine:
self.stagemetrics = self.sc._jvm.ch.cern.sparkmeasure.StageMetrics(self.sparksession._jsparkSession)
TypeError: 'JavaPackage' object is not callable
So what I did was:
Went to SPARK_HOME and, via the PySpark shell, installed the required jar:
bin/pyspark --packages ch.cern.sparkmeasure:spark-measure_2.12:0.16
Grabbed that jar (ch.cern.sparkmeasure_spark-measure_2.12-0.16.jar) and copied it into the jars folder of SPARK_HOME.
Reran the script, and it now worked without the above error.
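For what it's worth, here is a sketch of how the same dependency could be pulled in without the manual copy, by letting Spark resolve the package itself via spark.jars.packages (the app name is illustrative):

from pyspark.sql import SparkSession

# Programmatic equivalent of `bin/pyspark --packages ch.cern.sparkmeasure:spark-measure_2.12:0.16`;
# Spark downloads the artifact and its dependencies at session start-up.
spark = (
    SparkSession.builder
    .appName("sparkmeasure-example")
    .config("spark.jars.packages", "ch.cern.sparkmeasure:spark-measure_2.12:0.16")
    .getOrCreate()
)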

Versioning AWS Lambda Function

I have read some answers about similar topics but I was not satisfied with any of them.
We are deploying some code into AWS Lambda as a jar file whose name contains the version of the code, like name-of-my-app-14.jar, where 14 is the Jenkins build number.
The problem I have is that I don't have a way to tell which version of the jar is currently deployed in AWS, and it would be nice to have one.
This is the cloudformation fragment I have to create the lambda:
MyLambdaFunction:
  Type: AWS::Serverless::Function
  Properties:
    CodeUri: name-of-my-app.jar
    FunctionName: "my-function-name"
    Handler: "com.package.something.myapp.HandlerClass::handleRequest"
    MemorySize: 256
    Role: "arn:aws:iam::1234567890:role/some-role"
    Runtime: "java8"
    Timeout: 60
    Environment:
      Variables:
        SOME_VARIABLE: "value"
To deploy, we download the jar with the version we want from our artifact repository, save it under the name specified in the template above, and run:
aws cloudformation package --template-file myapp-stack.yaml --output-template-file tmp.yaml --s3-bucket my.bucket
aws cloudformation deploy --region my-region --template-file tmp.yaml --stack-name prod-myappstackname --capabilities CAPABILITY_IAM --parameter-overrides Environment=prod --no-fail-on-empty-changeset

Spark with Java - Error: Cannot load main class from JAR

I am trying a simple movie recommendation machine learning program in Spark.
Spark version: 2.1.1
Java version: Java 8
Scala version: Scala code runner version 2.11.7
Env: Windows 7
Running these commands to start the master and a worker:
//start master
spark-class org.apache.spark.deploy.master.Master
//start worker
spark-class org.apache.spark.deploy.worker.Worker spark://valid ip:7077
I am trying a very simple movie recommendation code from here: http://blogs.quovantis.com/recommendation-engine-using-apache-spark/
I have updated the code to:
SparkConf conf = new SparkConf().setAppName("Collaborative Filtering Example").setMaster("spark://valid ip:7077");
conf.setJars(new String[] {"C:\\Spark2.1.1\\spark-2.1.1-bin-hadoop2.7\\jars\\spark-mllib_2.11-2.1.1.jar"});
I cannot run this through IntelliJ.
Running mvn clean install and copying the jar to the folder does not work.
The command I used to run it:
bin\spark-submit --verbose –-jars jars\spark-mllib_2.11-2.1.1.jar –-class “com.abc.enterprise.RecommendationEngine” –-master spark://valid ip:7077 C:\Spark2.1.1\spark-2.1.1-bin-hadoop2.7\spark-mllib-example\spark-poc-1.0-SNAPSHOT.jar C:\Spark2.1.1\spark-2.1.1-bin-hadoop2.7\spark-mllib-example\ratings.csv C:\Spark2.1.1\spark-2.1.1-bin-hadoop2.7\spark-mllib-example\movies.csv 10
The error I see is:
C:\Spark2.1.1\spark-2.1.1-bin-hadoop2.7>bin\spark-submit --verbose --class "com.sandc.enterprise.RecommendationEngine" --master spark://10.64.98.101:7077 C:\Spark2.1.1\spark-2.1.1-
bin-hadoop2.7\spark-mllib-example\spark-poc-1.0-SNAPSHOT.jar C:\Spark2.1.1\spark-2.1.1-bin-hadoop2.7\spark-mllib-example\ratings.csv C:\Spark2.1.1\spark-2.1.1-bin-hadoop2.7\spark-m
llib-example\movies.csv 10
Using properties file: C:\Spark2.1.1\spark-2.1.1-bin-hadoop2.7\bin\..\conf\spark-defaults.conf
Adding default property: spark.serializer=org.apache.spark.serializer.KryoSerializer
Adding default property: spark.executor.extraJavaOptions=-XX:+PrintGCDetails -Dkey=value -Dnumbers="one two three"
Adding default property: spark.eventLog.enabled=true
Adding default property: spark.driver.memory=5g
Adding default property: spark.master=spark://valid ip:7077
Error: Cannot load main class from JAR file:/C:/Spark2.1.1/spark-2.1.1-bin-hadoop2.7/û-class
Run with --help for usage help or --verbose for debug output
If I give the --jar command, it gives the error:
Error: Cannot load main class from JAR file:/C:/Spark2.1.1/spark-2.1.1-bin-hadoop2.7/û-jars
Any ideas how I can submit this job to Spark?
Is your jar built correctly?
Also, you don't need to add double quotes around the --class option value.
