Uploading a large file with RestTemplate causing high CPU usage - Java

CPU usage is high when using RestTemplate to upload a file to an S3 bucket.
I used RestTemplate to upload a file of around 100 MB, and it causes CPU spikes on the deployment server.
We have a configuration that restarts the pod once it reaches 70% usage; it is a simple pod with low CPU and memory allocation.
Any approach is fine as long as CPU usage stays low and memory and the JVM heap are not impacted.
I tried RestTemplate with and without SimpleClientHttpRequestFactory, and also tried changing the chunk size during the file upload (POST) call.
Code snippet:
SimpleClientHttpRequestFactory requestFactory = new SimpleClientHttpRequestFactory();
requestFactory.setBufferRequestBody(false);
//requestFactory.setChunkSize(); // tested with different chunk sizes
restTemplate.setRequestFactory(requestFactory);

// setting headers and body
HttpEntity<MultiValueMap<String, Object>> requestEntity = getRequestHttpEntity(file, uploadFileRequest);
String serverUrl = uploadFileRequest.getPostUrl();
try {
    ResponseEntity<Object> response = restTemplate.postForEntity(serverUrl, requestEntity, Object.class);
    log.info("Response " + response.getStatusCode(), null);
    if (HttpStatus.NO_CONTENT == response.getStatusCode()) {
        log.info("Successfully uploaded file ", null);
        return DocumentUploadAck.builder().documentUploadStatus(true).build();
    }
} catch (Exception exception) {
    log.error("Exception when uploading file " + exception.getMessage(), null, exception);
    // throwing custom business exception
}
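For reference, a hedged sketch of what a streaming-friendly getRequestHttpEntity could look like; the helper name, file, and uploadFileRequest come from the snippet above, everything else is an assumption rather than the original implementation. Wrapping the file in a FileSystemResource lets Spring stream it from disk while writing the POST body instead of first reading it into a byte[], which together with setBufferRequestBody(false) keeps the 100 MB payload from being buffered:

import java.io.File;
import org.springframework.core.io.FileSystemResource;
import org.springframework.http.HttpEntity;
import org.springframework.http.HttpHeaders;
import org.springframework.http.MediaType;
import org.springframework.util.LinkedMultiValueMap;
import org.springframework.util.MultiValueMap;

// Hypothetical shape of the helper referenced in the snippet above.
private HttpEntity<MultiValueMap<String, Object>> getRequestHttpEntity(File file, UploadFileRequest uploadFileRequest) {
    MultiValueMap<String, Object> body = new LinkedMultiValueMap<>();
    body.add("file", new FileSystemResource(file)); // streamed from disk, not read into memory
    // any additional form fields carried by uploadFileRequest would be added to "body" here

    HttpHeaders headers = new HttpHeaders();
    headers.setContentType(MediaType.MULTIPART_FORM_DATA);
    return new HttpEntity<>(body, headers);
}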

Related

How to keep track of bytes uploaded through Azure in Java

I have a use case where, when a file is uploaded to Azure as a multipart upload in Java, I want real-time updates on how many bytes have been transferred so far. Below is the code snippet I have been using. I couldn't find any implementation that Azure provides for this. Is there any workaround available?
// multipart upload logic
ParallelTransferOptions options = new ParallelTransferOptions()
        .setBlockSizeLong(blockSize)                      // actual values elided in the original post
        .setMaxConcurrency(maxConcurrency)
        .setMaxSingleUploadSizeLong(maxSingleUploadSize);
BlobHttpHeaders headers = getBlobHeaders(localPath, key, folder, null, metadata);
RWLog.MAIN.info("Multipart upload started for " + path);
blob.uploadFromFile(localPath, options, headers, metadata, tier, null, null);
RWLog.MAIN.info("Multipart upload completed for " + path);

Add download progress RestTemplate in Java Spring

I am downloading some files using Spring RestTemplate, and I have a requirement to log the progress of the download in the backend itself.
So, can the download progress be logged in some way?
Here is my implementation:
File file = restTemplate.execute(FILE_URL, HttpMethod.GET, null, clientHttpResponse -> {
    File ret = File.createTempFile("download", "tmp");
    try (FileOutputStream out = new FileOutputStream(ret)) {
        StreamUtils.copy(clientHttpResponse.getBody(), out);
    }
    return ret;
});
PS: I was wondering whether there is a way to intercept how much of the response body has been transferred.
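One way to get at that (not from the post itself): copy the body manually instead of using StreamUtils.copy, so the running byte count can be logged as chunks arrive. A minimal sketch, assuming FILE_URL from the snippet above and an SLF4J-style log instance:

import java.io.File;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import org.springframework.http.HttpMethod;

File file = restTemplate.execute(FILE_URL, HttpMethod.GET, null, clientHttpResponse -> {
    File ret = File.createTempFile("download", "tmp");
    byte[] buffer = new byte[8192];
    long totalRead = 0;
    try (InputStream in = clientHttpResponse.getBody();
         OutputStream out = new FileOutputStream(ret)) {
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            totalRead += read;
            // log every chunk, or only every N bytes to keep the log quieter
            log.info("Downloaded {} bytes so far", totalRead);
        }
    }
    return ret;
});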

Two Spring Boot apps setup with two ports throws 500 error when one uses RestTemplate to Post

OK, I have STS 4 and I'm building Spring Boot applications. I want one to handle the GUI using Vaadin, and it is working fine. I have another that picks up an ActiveMQ topic and creates a POST to the first one. If I use Postman to do the POST, it works fine against the REST interface on the first app. But when I run the second one and a topic is detected and it tries to post to the first app's REST interface, I get...
org.springframework.web.client.HttpServerErrorException$InternalServerError: 500 : [<!doctype html><html lang="en"><head><title>HTTP Status 500 – Internal Server Error</title><style type="text/css">body {font-family:Tahoma,Arial,sans-serif;} h1, h2, h3, b {color:white;background-colo... (6083 bytes)]
Further testing revealed that it is complaining because it is listening on the same port (8080) as the first one.
In my application.properties for the first app I have set
server.port=${PORT:8080}
vaadin.compatibilityMode = false
logging.level.org.atmosphere = warn
spring.activemq.broker-url=tcp://localhost:61616
spring.activemq.user=admin
spring.activemq.password=admin
spring.jms.pub-sub-domain=true
While in the second app I have set
server.port=${PORT:8081}
spring.activemq.broker-url=tcp://localhost:61616
spring.activemq.user=admin
spring.activemq.password=admin
spring.jms.pub-sub-domain=true
active-mq.topic=personQBE
The offending code that is trying to POST is...
public void postResponses(Person person) {
    System.out.println("postResponses(Person person)");
    RestTemplate restTemplate = new RestTemplate();

    HttpHeaders headers = new HttpHeaders();
    headers.setContentType(MediaType.APPLICATION_FORM_URLENCODED);
    headers.add("Accept", MediaType.APPLICATION_JSON.toString()); // optional, in case the server sends back JSON data

    MultiValueMap<String, String> map = new LinkedMultiValueMap<>();
    map.add("firstName", person.getFirstName());
    map.add("lastName", person.getLastName());
    map.add("alias", person.getAlias());
    map.add("dataSource", "BI");

    HttpEntity<MultiValueMap<String, String>> entity = new HttpEntity<>(map, headers);
    ResponseEntity<String> response = null;
    try {
        response = restTemplate.exchange("http://localhost:8080/person/add",
                HttpMethod.POST,
                entity,
                String.class);
    } catch (RestClientException e) {
        // TODO Log.error later
        e.printStackTrace();
    }
    System.out.println("response:" + response);
}
I have read the blog post Microservices with Spring, but I'm not sure whether I need to go that route to solve my problem. Down the road, the second app will be a template for an agent that fetches data from an external API whenever that topic is posted by the first app, so port contention will not be a problem once the second app is running by itself somewhere.
Add the value below to run the application on a random port number.
server.port=0
Hopefully, you have Eureka and Zuul configured, so the randomly assigned port can still be discovered and routed to.
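As a side note (not part of the answer above): with ports moving around, the hardcoded http://localhost:8080/person/add in the second app can be externalized to a property. A minimal sketch, using a hypothetical property name person.api.base-url.

In application.properties of the second app:
person.api.base-url=http://localhost:8080

And in the class that does the POST (org.springframework.beans.factory.annotation.Value):
@Value("${person.api.base-url}")
private String personApiBaseUrl;

// then inside postResponses(...):
response = restTemplate.exchange(personApiBaseUrl + "/person/add",
        HttpMethod.POST, entity, String.class);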

How to download and process large data reactively?

I need to initiate download of some content over HTTP and then read the data as a reactive stream.
So, even though the downloaded data is big, I can almost immediately read the first few bytes of the response body (no need to wait for the whole response body). Then I do some computations and, a few seconds later, read another portion of the data. There has to be some limit on the cached data, because main memory can't hold the whole content (it's tens of GB).
I've been trying to use HttpClient's sendAsync method with BodyHandlers.ofInputStream(), but it always blocks and waits for all the data to arrive.
HttpClient client = HttpClient.newHttpClient();
HttpRequest request = HttpRequest.newBuilder()
        .uri(URI.create("https://..."))
        .build();

HttpResponse<InputStream> response = client
        .sendAsync(request, HttpResponse.BodyHandlers.ofInputStream())
        .get(); // this finishes as soon as the header is received

try {
    InputStream stream = response.body();
    byte[] test = stream.readNBytes(20); // trying to read just a few bytes,
                                         // but it waits for the whole body
} catch (IOException ex) {}
What do I need to change so the response body is downloaded gradually?
This is a bug. It has been fixed in Java 11.0.2:
https://bugs.openjdk.java.net/browse/JDK-8212926
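On a fixed JDK the ofInputStream code above reads incrementally. As an alternative sketch (not from the answer), BodyHandlers.ofPublisher() exposes the body as a Flow.Publisher of ByteBuffer chunks, so backpressure keeps memory bounded while the download is processed reactively:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.ByteBuffer;
import java.util.List;
import java.util.concurrent.Flow;

HttpClient client = HttpClient.newHttpClient();
HttpRequest request = HttpRequest.newBuilder()
        .uri(URI.create("https://..."))
        .build();

HttpResponse<Flow.Publisher<List<ByteBuffer>>> response = client
        .sendAsync(request, HttpResponse.BodyHandlers.ofPublisher())
        .join(); // completes once the response headers have arrived

response.body().subscribe(new Flow.Subscriber<List<ByteBuffer>>() {
    private Flow.Subscription subscription;

    @Override
    public void onSubscribe(Flow.Subscription subscription) {
        this.subscription = subscription;
        subscription.request(1); // ask for one chunk at a time
    }

    @Override
    public void onNext(List<ByteBuffer> buffers) {
        // process this chunk, then request the next one (backpressure keeps memory bounded)
        subscription.request(1);
    }

    @Override
    public void onError(Throwable throwable) {
        throwable.printStackTrace();
    }

    @Override
    public void onComplete() {
        // download finished
    }
});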

429 Too many requests when generating presigned urls for s3 objects using aws-sdk

I have an app which is a Digital Asset Management system. It displays thumbnails. I have these thumbnails set up to be served with AWS S3 presigned URLs: https://docs.aws.amazon.com/AmazonS3/latest/dev/ShareObjectPreSignedURLJavaSDK.html. This code works until I change how many items get processed in a request. The application has selections for 25, 50, 100, and 200. If I select 100 or 200, the process fails with "Error: com.amazonaws.AmazonServiceException: Too Many Requests (Service: null; Status Code: 429; Error Code: null; Request ID: null)"
Right now the process is as follows:
Perform a search > run each object key through a method that returns a presigned URL for that object.
We run this application through Elastic Container Service, which allows us to pull in credentials via ContainerCredentialsProvider.
Relevant code for review:
String s3SignedUrl(String objectKeyUrl) {
    // Environment variables for the S3 client.
    String clientRegion = System.getenv("REGION");
    String bucketName = System.getenv("S3_BUCKET");

    try {
        // S3 credentials get pulled in from AWS via ContainerCredentialsProvider.
        AmazonS3 s3Client = AmazonS3ClientBuilder.standard()
                .withRegion(clientRegion)
                .withCredentials(new ContainerCredentialsProvider())
                .build();

        // Set the pre-signed URL to expire after one hour.
        java.util.Date expiration = new java.util.Date();
        long expTimeMillis = expiration.getTime();
        expTimeMillis += 1000 * 60 * 60;
        expiration.setTime(expTimeMillis);

        // Generate the presigned URL.
        GeneratePresignedUrlRequest generatePresignedUrlRequest =
                new GeneratePresignedUrlRequest(bucketName, objectKeyUrl)
                        .withMethod(HttpMethod.GET)
                        .withExpiration(expiration);
        return s3Client.generatePresignedUrl(generatePresignedUrlRequest).toString();
    } catch (AmazonServiceException e) {
        throw new AssetException(FAILED_TO_GET_METADATA, "The call was transmitted successfully, but Amazon " +
                "S3 couldn't process it, so it returned an error response. Error: " + e);
    } catch (SdkClientException e) {
        throw new AssetException(FAILED_TO_GET_METADATA, "Amazon S3 couldn't be contacted for a response, or " +
                "the client couldn't parse the response from Amazon S3. Error: " + e);
    }
}
And this is the part where we process the items:
// Overwrite the url; it's nested deeply in maps of maps.
for (Object anAssetList : assetList) {
    String assetId = ((Map) anAssetList).get("asset_id").toString();
    if (renditionAssetRecordMap.containsKey(assetId)) {
        String s3ObjectKey = renditionAssetRecordMap.get(assetId).getThumbObjectLocation();
        ((Map) ((Map) ((Map) anAssetList)
                .getOrDefault("rendition_content", new HashMap<>()))
                .getOrDefault("thumbnail_content", new HashMap<>()))
                .put("url", s3SignedUrl(s3ObjectKey));
    }
}
Any guidance would be appreciated. I would love a solution that is simple and, ideally, configurable on the AWS side. Otherwise, I am looking at adding a process that generates the URLs in batches.
The problem is unrelated to generating pre-signed URLs. These are done with no interaction with the service, so there is no possible way it could be rate-limited. A pre-signed URL uses an HMAC-SHA algorithm to prove to the service that an entity in possession of the credentials has authorized a specific request. The one-way (non-reversible) nature of HMAC-SHA allows these URLs to be generated entirely on the machine where the code is running, with no service interaction.
However, it seems very likely that repeatedly fetching the credentials is the actual cause of the exception -- and you appear to be doing that unnecessarily over and over.
This is an expensive operation:
AmazonS3 s3Client = AmazonS3ClientBuilder.standard()
.withRegion(clientRegion)
.withCredentials(new ContainerCredentialsProvider())
.build();
Each time you call this again, the credentials have to be fetched again. That's actually the limit you're hitting.
Build your s3client only once, and refactor s3SignedUrl() to expect that object to be passed in, so you can reuse it.
You should see a notable performance improvement, in addition to resolving the 429 error.
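A minimal sketch of that refactor, keeping the one-hour expiry and environment-variable lookup from the original snippet; the exception handling is omitted for brevity:

// Built once (for example at application startup or as a singleton bean) and reused:
AmazonS3 s3Client = AmazonS3ClientBuilder.standard()
        .withRegion(System.getenv("REGION"))
        .withCredentials(new ContainerCredentialsProvider())
        .build();

// s3SignedUrl now takes the shared client instead of building its own on every call:
String s3SignedUrl(AmazonS3 s3Client, String objectKeyUrl) {
    String bucketName = System.getenv("S3_BUCKET");

    // Pre-signed URL expires after one hour, as in the original snippet.
    java.util.Date expiration = new java.util.Date(System.currentTimeMillis() + 1000 * 60 * 60);

    GeneratePresignedUrlRequest generatePresignedUrlRequest =
            new GeneratePresignedUrlRequest(bucketName, objectKeyUrl)
                    .withMethod(HttpMethod.GET)
                    .withExpiration(expiration);
    return s3Client.generatePresignedUrl(generatePresignedUrlRequest).toString();
}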
