Create AWS lambda to copy csv file from S3 to RDS MySQL - java

I am trying to load at least 4 CSV files from my S3 bucket into my RDS MySQL database. Every time the files are put in the bucket they will have a different name, because the date is appended to the filename. I would like them to be loaded into the database automatically when they are put in the S3 bucket. So far all I have is the load function to connect to the database, and at this point I'm just trying to load one file. What would I do to have the file loaded automatically once it's put in the S3 bucket? Thanks for the help!
LambdaFunctionHandler file:
public class LambdaFunctionHandler implements RequestHandler<Service, ResponseClass> {

    public void loadService() {
        try {
            Connection conn = DriverManager.getConnection("jdbc:mysql://connection/db", "user", "password");
            log.info("Connected to database.");
            // load data SQL
            String query = "LOAD DATA FROM S3 '" + S3_BUCKET_NAME + "' INTO TABLE " + sTablename
                    + " FIELDS TERMINATED BY ',' ENCLOSED BY '\"' "
                    + "LINES TERMINATED BY '\r\n' "
                    + "IGNORE " + ignoreLines + " LINES";
            // create the Statement from the open connection instead of leaving it null
            Statement stmt = conn.createStatement();
            stmt.executeUpdate(query);
            System.out.println("loaded table.");
            conn.close();
        } catch (SQLException e) {
            e.printStackTrace();
        }
    }

    @Override
    public ResponseClass handleRequest(Service arg0, Context arg1) {
        String path = "";
        return null;
    }
}

If you know the full key that the file you are uploading to S3 will have, then the standard AmazonS3 client object has this method: boolean doesObjectExist(String bucketName, String objectName). By the "rules" of S3, uploading a file is atomic: this call will not return true for the specified key until the file has been completely uploaded.
So you can trigger the upload of your file, test for completeness with the doesObjectExist call, and once it returns true, run your Lambda's load logic.
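As a minimal sketch with the v1 SDK (bucket and key names are placeholders), the check is just:

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;

public class UploadCheck {
    public static void main(String[] args) {
        AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();
        // true only once the object has been completely uploaded
        if (s3.doesObjectExist("my-bucket", "myfile-2019-01-01.csv")) {
            // safe to run the LOAD DATA step now
        }
    }
}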
Alternatively (if you want to keep feeding the AWS beast), S3 can publish bucket event notifications, and one of the supported targets is a Lambda function, so your function is invoked automatically whenever an object is created in the bucket. The feature is called S3 Event Notifications.
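From the question's side, a hedged sketch of what such an event-triggered handler could look like, building the LOAD DATA statement from the key in the incoming event. The class name, table name, connection string and credentials are placeholders, and it assumes the database supports the LOAD DATA FROM S3 statement and has a role allowing it to read the bucket:

import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import com.amazonaws.services.lambda.runtime.events.S3Event;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

public class CsvLoadHandler implements RequestHandler<S3Event, String> {

    @Override
    public String handleRequest(S3Event event, Context context) {
        // The event carries the bucket and key of the object that was just created
        String bucket = event.getRecords().get(0).getS3().getBucket().getName();
        String key = event.getRecords().get(0).getS3().getObject().getKey();

        String query = "LOAD DATA FROM S3 's3://" + bucket + "/" + key + "'"
                + " INTO TABLE myTable"
                + " FIELDS TERMINATED BY ',' ENCLOSED BY '\"'"
                + " LINES TERMINATED BY '\r\n' IGNORE 1 LINES";

        try (Connection conn = DriverManager.getConnection("jdbc:mysql://my-db-endpoint/mydb", "user", "password");
             Statement stmt = conn.createStatement()) {
            stmt.executeUpdate(query);
            context.getLogger().log("Loaded " + key + " into myTable");
            return "OK";
        } catch (SQLException e) {
            throw new RuntimeException(e);
        }
    }
}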

Related

How to copy object to different amazon account using aws sdk v2?

I am working on a Java project in which I am using the AWS SDK v2 for the Amazon S3 services.
I am performing a copy operation; it works within the same account but not with a different account.
Code:
public void copyObjects(S3Object[] s3DestObjects, String sDestBucket, String sSourceBucket, String sSourceObject) {
    try {
        AwsBasicCredentials awsCreds = AwsBasicCredentials.create(
                ACCESS_KEY,
                SECRET_KEY);
        S3ClientBuilder s3ClientBuilder =
                S3Client.builder().credentialsProvider(StaticCredentialsProvider.create(awsCreds));
        s3ClientBuilder.region(Region.US_EAST_2);
        S3Client s3Client = s3ClientBuilder.build();

        String encodedUrl = null;
        try {
            encodedUrl = URLEncoder.encode(sSourceBucket + "/" + sSourceObject, StandardCharsets.UTF_8.toString());
        } catch (UnsupportedEncodingException e) {
            System.out.println("URL could not be encoded: " + e.getMessage());
        }

        for (S3Object s3DestObject : s3DestObjects) {
            //CopyObjectRequest copyObjectRequest = CopyObjectRequest.builder().destinationBucket(dstBucket).destinationKey(dstS3Object.key).copySource(encodedUrl).build();
            CopyObjectRequest copyObjectRequest = CopyObjectRequest.builder()
                    .copySource(encodedUrl)
                    .destinationBucket(sDestBucket)
                    .destinationKey(s3DestObject.key)
                    .metadata(s3DestObject.getMetadata()).metadataDirective(MetadataDirective.REPLACE)
                    .build();
            CopyObjectResponse copyObjectResponse = s3Client.copyObject(copyObjectRequest);
        }
    } catch (S3Exception e) {
        throw e;
    }
}
The code above works when the destination bucket is in the same account, but with a bucket in a different account it fails with this error:
Access Denied (Service: S3, Status Code: 403, Request ID: 4VCND27Z6P3CEJ8H, Extended Request ID: 2T88jx4+R+LjO74pBHOhJj8uOUx6M4Hx3UYYkWm4Sbf6cb9NVM8f5DvFcanv0rbXhZUfEkqpSuI=)
Please suggest how I can copy objects to a bucket in a different account.
It appears your situation is:
You have Amazon S3 buckets in different AWS Accounts
You wish to copy objects between the buckets
There are two ways to do this:
1. 'Push' the objects
If your code is running in Account A and you wish to copy from a bucket in Account A to a bucket in Account B, then you will need:
Permission on the IAM Entity (eg IAM User or IAM Role) that is being used by your program to write to the bucket in Account B, AND
A bucket policy on the bucket in Account B that permits the IAM Entity used by your program to write to the bucket
When copying the object, you must set ACL=bucket-owner-full-control to 'hand-over' ownership of the object to the destination AWS Account (see the sketch at the end of this answer)
OR
2. 'Pull' the objects
If your code is running in Account B and you wish to copy from a bucket in Account A to a bucket in Account B, then you will need:
Permission on the IAM Entity (eg IAM User or IAM Role) that is being used by your program to read from the bucket in Account A, AND
A bucket policy on the bucket in Account A that permits the IAM Entity used by your program to read from the bucket
Note that in both cases, your program needs permission from the AWS Account it is running in AND a bucket policy on the bucket in the other AWS Account.
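For the 'push' case, a sketch of how the ACL would be added to the CopyObjectRequest already built in the question (assuming the IAM permissions and bucket policy described above are in place, and reusing the question's s3Client, encodedUrl and loop variables):

// Same request as in the question, plus the canned ACL that hands ownership
// of the copied object to the destination bucket's owner (Account B).
CopyObjectRequest copyObjectRequest = CopyObjectRequest.builder()
        .copySource(encodedUrl)                       // URL-encoded "source-bucket/source-key"
        .destinationBucket(sDestBucket)
        .destinationKey(s3DestObject.key)
        .acl(ObjectCannedACL.BUCKET_OWNER_FULL_CONTROL)
        .build();
s3Client.copyObject(copyObjectRequest);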

execution failed while invoking the lambda function

I am trying to invoke a Lambda function that is triggered by an S3Event. I have created a bucket and added two images to it.
Below is the code I have written in Java:
public String handleRequest(S3Event event, Context context) {
    context.getLogger().log("Received event: " + event);
    // Get the object from the event and show its content type
    String bucket = event.getRecords().get(0).getS3().getBucket().getName();
    String key = event.getRecords().get(0).getS3().getObject().getKey();
    try {
        S3Object response = s3.getObject(new GetObjectRequest(bucket, key));
        String contentType = response.getObjectMetadata().getContentType();
        context.getLogger().log("CONTENT TYPE: " + contentType);
        return contentType;
    } catch (Exception e) {
        e.printStackTrace();
        context.getLogger().log(String.format(
                "Error getting object %s from bucket %s. Make sure they exist and"
                + " your bucket is in the same region as this function.", key, bucket));
        throw e;
    }
}
and below is the error I am getting
com.amazonaws.services.lambda.runtime.events.S3Event not present
The code looks fine. Confirm that you have this class imported:
com.amazonaws.services.lambda.runtime.events.S3Event
and that your class implements the RequestHandler interface, as in the skeleton below.
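For reference, a minimal skeleton under those assumptions (the class name is a placeholder). The "S3Event not present" error also commonly means the aws-lambda-java-events library is not bundled in the deployment package, so it needs to be packaged along with aws-lambda-java-core:

import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import com.amazonaws.services.lambda.runtime.events.S3Event;

// Skeleton only: the getObject/content-type logic from the question would go inside handleRequest.
public class S3EventHandler implements RequestHandler<S3Event, String> {

    @Override
    public String handleRequest(S3Event event, Context context) {
        return event.getRecords().get(0).getS3().getObject().getKey();
    }
}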
If the issue still persists, follow this tutorial:
AWS Lambda with S3 for real-time data processing
Hope this helps!

Clearing memcache AppEngine not working

I use the memcache session handling in AppEngine. Sometimes when doing releases I change objects in a way that renders the memcache contents obsolete. When I do, and I'm testing, I want to be able to clear my session.
I've added a servlet to clear memcache that uses:
try {
CacheFactory cacheFactory = CacheManager.getInstance()
.getCacheFactory();
cacheFactory.createCache(Collections.emptyMap()).clear();
outputMessage(response, "CLEARED cache");
} catch (CacheException e1) {
LOG.log(Level.SEVERE, "cache issue", e1);
outputMessage(response, "cache issue!!!!!!!!");
}
and I dump out the session contents using:
Enumeration<String> e = request.getSession().getAttributeNames();
outputMessage(response, "DUMPING SESSION..");
while (e.hasMoreElements()) {
    String name = e.nextElement();
    outputMessage(response, "Name:" + name + " value: "
            + request.getSession().getAttribute(name).toString());
}
Dumping the session before and after a clear doesn't look any different.
Am I using this right?
Cheers
To get around this problem, I usually prefix the key of the object stored in memcache with a version id. For example, instead of:
memcache.add('key', 'value')
I do:
version = '1'
memcache.add(VERSION + '#key', 'value')
Later, if I want to invalidate all the data in memcache, I just change the version number (and the entries already stored in memcache will be automatically deleted when they expire).
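The answer's example is pseudo-code; on App Engine's Java API the same idea might look roughly like this (the VERSION constant and key separator are illustrative):

import com.google.appengine.api.memcache.MemcacheService;
import com.google.appengine.api.memcache.MemcacheServiceFactory;

public class VersionedCache {
    // Bump this constant in releases that change the format of cached objects,
    // which effectively invalidates everything stored under the old version.
    private static final String VERSION = "1";

    private final MemcacheService memcache = MemcacheServiceFactory.getMemcacheService();

    public void put(String key, Object value) {
        memcache.put(VERSION + "#" + key, value);
    }

    public Object get(String key) {
        return memcache.get(VERSION + "#" + key);
    }
}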

ormlite with persistent h2 db - new tables not get persisted

When I am creating a new H2 database via ORMLite, the database file gets created, but after I close my application all the data that was stored in the database is lost:
JdbcConnectionSource connection =
new JdbcConnectionSource("jdbc:h2:file:" + path.getAbsolutePath() + ".h2.db");
TableUtils.createTable(connection, SomeClass.class);
Dao<SomeClass, Integer> dao = DaoManager.createDao(connection, SomeClass.class);
SomeClass sc = new SomeClass(id, ...);
dao.create(sc);
SomeClass retrieved = dao.queryForId(id);
System.out.println("" + retrieved);
This code will produce good results. It will print the object that I stored.
But when I start the application again, this time without creating the table and without storing a new object, I get an exception telling me that the required table does not exist:
JdbcConnectionSource connection =
new JdbcConnectionSource("jdbc:h2:file:" + path.getAbsolutePath() + ".h2.db");
Dao<SomeClass, Integer> dao = DaoManager.createDao(connection, SomeClass.class);
SomeClass retrieved = dao.queryForId(id); // will produce an exception..
System.out.println("" + retrieved);
The following worked fine for me when I ran it once and then a second time with the createTable turned off. The 2nd insert gave me a primary key violation of course, but that was expected. It created the file with (as @Thomas mentioned) a ".h2.db.h2.db" suffix.
Some questions:
After you run your application the first time, can you see the path file being created?
Is it on permanent storage and not in some temporary location cleared by the OS?
Any chance some other part of your application is clearing it before the database code begins?
Hope this helps.
@Test
public void testStuff() throws Exception {
    File path = new File("/tmp/x");
    JdbcConnectionSource connection = new JdbcConnectionSource("jdbc:h2:file:"
            + path.getAbsolutePath() + ".h2.db");
    // TableUtils.createTable(connection, SomeClass.class);
    Dao<SomeClass, Integer> dao = DaoManager.createDao(connection, SomeClass.class);
    int id = 131233;
    SomeClass sc = new SomeClass(id, "fopewjfew");
    dao.create(sc);
    SomeClass retrieved = dao.queryForId(id);
    System.out.println("" + retrieved);
    connection.close();
}
I can see Russia from my house:
> ls -l /tmp/
...
-rw-r--r-- 1 graywatson wheel 14336 Aug 31 08:47 x.h2.db.h2.db
Did you close the database? It is closed automatically but it's better to close it manually (so recovery is faster).
In many cases the database URL is the problem. Are you sure the same path is used in both cases? Otherwise you end up with two databases. By the way, ".h2.db" is added automatically, you don't need to add it manually.
To better analyze the problem, you could append ;TRACE_LEVEL_FILE=2 to the database URL, and then check in the *.trace.db file what SQL statements were executed against the database.
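Putting both suggestions together, a sketch of the connection code from the test above, without manually appending ".h2.db" (H2 adds the suffix itself) and with tracing enabled:

// Hedged sketch: same path as in the question, minus the manual ".h2.db" suffix.
// TRACE_LEVEL_FILE=2 writes the executed SQL to a *.trace.db file next to the database.
JdbcConnectionSource connection = new JdbcConnectionSource(
        "jdbc:h2:file:" + path.getAbsolutePath() + ";TRACE_LEVEL_FILE=2");
try {
    Dao<SomeClass, Integer> dao = DaoManager.createDao(connection, SomeClass.class);
    System.out.println("" + dao.queryForId(131233));
} finally {
    connection.close(); // close explicitly so H2 shuts down cleanly and recovery is faster
}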

Play Framework Image BLOB File for Test Object Yaml

How do you set up a Test Blob Image using the yaml structure?
Also, what is the database structure for a BLOB file? (MySQL)
I experienced the same kind of problem a while ago on a project. However, as I could not find a way to solve this with the fixtures (the database stores the blob object as a string, as Pere explains in another answer), I created a workaround to at least solve this problem in a test-case scenario. I created the following file, /app/job/Bootstrap.java:
import play.test.*;
import play.jobs.*;
import play.db.DB;
import models.*;
import java.util.List;

@OnApplicationStart
public class Bootstrap extends Job {

    public void doJob() {
        // Load default data if the database is empty
        if (Item.count() == 0) {
            Fixtures.loadModels("my_fixtures.yml");
            List<Item> allItems = Item.findAll();
            for (Item a : allItems) {
                DB.execute("UPDATE `Item` SET image='item_" + a.name.toLowerCase() + ".png|image/png' WHERE id=" + a.getId());
            }
        }
    }
}
The first thing I do is fill the database with initial data if there are no 'Item' entries stored in the database yet.
The second thing is iterating over all the 'Item' entries which play! just stored in the database, read from the "my_fixtures.yml" file. For each item, the image string field gets updated as shown in the example above.
I know this is not exactly the answer to the question in the OP, but it gives some idea of how to work around this issue.
EDIT: In the example given above I assume that the pictures are uploaded manually to your attachment folder as configured in your application.conf, and that each image is named "item_<item_name_in_lowercase>" with a ".png" extension.
Well, play is quite weird on that point.
The blob is not saved into the database but in an upload folder defined in your application.conf. It is the path to the file that is saved in the database.
I cannot check it right now, but I seem to recall they are saved as textual representations (VARCHAR, TEXT).
The blob is saved in the file system, by default under "data/attachments" if I recall correctly, but you can change that in the configuration (application.conf)
In the database, it's stored as a String (varchar in most DB) with two components: the name and the mime type. It looks like:
12345asbcdefghi12345abcdfed|image/jpeg
The first part is the name of the file. When you upload a file, Play generates a unique UUID as the name to avoid collisions. Yes, this means you are losing the original name. (note: now I'm having doubts on the name part, I would swear it is lost, but I may be wrong!)
The second part (after the |) is the mime type. Play uses a magic-mime library to automatically detect it.
You can see the code here.
Here is a modified version of Unji's answer that loads the images from a folder in conf; please note that I have removed all the import statements:
/**
 * A job executed when the application starts.
 */
@OnApplicationStart
public class Bootstrap extends Job {

    /**
     * Loads the initial data if there are no
     * WebAdministrators at the database.
     * <p>
     * It loads images on the post with the following criteria:
     * <ol>
     * <li>file location: /conf/initialMedia/</li>
     * <li>file name: {post.title.toCamelCase()}-{i}.jpg</li>
     * </ol>
     * Where i must start in 0.
     * </p>
     */
    @Override
    public void doJob() {
        // Check if the database is empty
        if (WebAdministrator.count() == 0) {
            Logger.info("Loading Initial Data.");
            Fixtures.loadModels("initial-data.yml");
            List<Post> posts = Post.findAll();
            for (Post post : posts) {
                Logger.info("Looking for files for post: [" + post.title + "]");
                for (int i = 0; true; i++) {
                    VirtualFile vf = VirtualFile.fromRelativePath("/conf/initialMedia/"
                            + JavaExtensions.camelCase(post.title) + "-" + i + ".jpg");
                    File imageFile = vf.getRealFile();
                    if (imageFile.exists()) {
                        try {
                            Blob blobImage = new Blob();
                            blobImage.set(new FileInputStream(imageFile), MimeTypes.getContentType(imageFile.getName()));
                            MediaItem mediaItem = new Image(blobImage);
                            mediaItem.save();
                            post.mediaItems.add(mediaItem);
                            post.save();
                            Logger.info("File: [%s] Loaded", imageFile.getAbsolutePath());
                        } catch (FileNotFoundException e) {
                            // this should never happen: imageFile.exists() was just checked
                        }
                    } else {
                        Logger.info("Media Loaded for post [%s]: %d files.", post.title, i);
                        break;
                    }
                }
            }
        }
    }
}
