I need to create up to 130 calendars on a Google account using Java and the Google Calendar API, but I keep getting
"403 : Rate Limit Exceeded".
What I've tried:
- Looping with service.insert(calendar).execute();
  Result: I receive error 403 after 25 inserts (which, oddly enough, seems to be the old limit; it should be 60 according to https://support.google.com/a/answer/2905486?hl=en).
- Looping with a delay between each request (up to 60 seconds).
  Result: no change; still 403 after 25 inserts. (In its documentation on exponential backoff, Google talks about mere seconds, so a full minute should be enough even without increasing the delay exponentially.)
- Request batching (following this Google example code).
  Result: after about 10 callbacks, the response falls into the onFailure method with, you guessed it, a 403 status code.
I think I am well within my (maxed-out) API quotas most of the time; I've seen "quotaExceeded" only once over several tests.
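For reference, a minimal sketch of what an exponential-backoff retry around the insert would look like (the attempt cap and base delay are arbitrary values of mine):
private void insertWithBackoff(Calendar calendar) throws IOException, InterruptedException {
    final int MAX_ATTEMPTS = 5; // arbitrary cap
    for (int attempt = 0; ; attempt++) {
        try {
            service.calendars().insert(calendar).execute();
            return; // inserted successfully
        } catch (GoogleJsonResponseException e) {
            if (e.getStatusCode() != 403 || attempt + 1 >= MAX_ATTEMPTS) {
                throw e; // not a rate-limit error, or out of retries
            }
            // Exponential backoff with jitter: 1 s, 2 s, 4 s, ... plus 0-1000 ms
            long sleepMs = (1000L << attempt) + (long) (Math.random() * 1000);
            Thread.sleep(sleepMs);
        }
    }
}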
Batch request sample:
batch = this.service.batch();
JsonBatchCallback<Calendar> callback = new JsonBatchCallback<Calendar>() {
    @Override
    public void onFailure(GoogleJsonError e, HttpHeaders responseHeaders) throws HttpResponseException {
        log.debug(e);
        throw new HttpResponseException(Integer.valueOf(e.get("code").toString()), e.getMessage());
    }

    @Override
    public void onSuccess(Calendar cal, HttpHeaders responseHeaders) {
        log.debug("Calendar created for " + cal.getSummary());
    }
};
for (String user : usernameList) {
    cal = new Calendar().setSummary(user);
    this.service.calendars().insert(cal).queue(batch, callback);
}
batch.execute();
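Since every call queued in a batch still counts individually against the per-user rate limit, the same loop can also be throttled into smaller batches (a sketch; the chunk size and pause are arbitrary guesses, and the enclosing method is assumed to declare throws IOException, InterruptedException):
final int CHUNK_SIZE = 10; // arbitrary
for (int i = 0; i < usernameList.size(); i += CHUNK_SIZE) {
    BatchRequest chunk = this.service.batch();
    for (String user : usernameList.subList(i, Math.min(i + CHUNK_SIZE, usernameList.size()))) {
        this.service.calendars().insert(new Calendar().setSummary(user)).queue(chunk, callback);
    }
    chunk.execute();
    Thread.sleep(2000); // crude pacing between batches
}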
I've searched for hours and can't find an answer (lots of unanswered questions on SO, among other things).
My current code
/**
* Use when the server has determined how many Alerts are situated inside
* a single Monitored Zone of a user.
*
* Documentation:
* https://firebase.google.com/docs/cloud-messaging/admin/send-messages
*
* @param registrationToken tokens of the devices the notification is sent to
* @param mzName name of the Monitored Zone (can be its ID)
* @param nbrAlertes number of alerts detected inside the MZ
*/
public static void sendMzLiveAlertNotif(ArrayList<String> registrationToken,
String mzName, int nbrAlertes) {
if(registrationToken.size() == 0 || nbrAlertes == 0 || mzName.isEmpty())
return;
registrationToken.forEach(token -> {
// See documentation on defining a message payload.
Message message = Message.builder()
.putData("title", NotifType.NEW_LIVE_ALERT.getTitle())
.putData("body", "\"" + mzName + "\" contient " + nbrAlertes + " alertes.")
.putData("tag", mzName)
.putData("nbrAlerts", nbrAlertes+"")
.setToken(token)
.build();
try {
// Send a message to the device.
String response = FirebaseMessaging.getInstance().send(message);
// Response is a message ID string.
System.out.println("Successfully sent message: " + response);
} catch (Exception e) {
e.printStackTrace();
}
});
}
This worked perfectly, until I wanted to add a lifespan to my notifications. I am still trying to figure out the proper way to do that in Java.
My question
I'm wondering how I am supposed to impose a 30-hour lifespan on my message (and also why it works without my using the setAndroidConfig method that Google seems to use).
My server is coded in Java, and the notifications are pushed to an Android application.
My initial thought was to go for:
Message message = Message.builder()
.putData("title", NotifType.NEW_LIVE_ALERT.getTitle())
.putData("body", "\"" + mzName + "\" contient " + nbrAlertes + " alertes.")
.putData("tag", mzName)
.putData("nbrAlerts", nbrAlertes+"")
.setToken(token)
.setAndroidConfig(AndroidConfig.builder()
.setTtl(3600 * 30) // 30 hours ?
.build())
.build();
... but after seeing how Google uses the AndroidConfig for the whole thing, I'm wondering if I should too.
Google Documentation
The only examples I can find are from Google themselves. Here is an example:
@Test
public void testAndroidMessageWithoutNotification() throws IOException {
Message message = Message.builder()
.setAndroidConfig(AndroidConfig.builder()
.setCollapseKey("test-key")
.setPriority(Priority.HIGH)
.setTtl(10)
.setRestrictedPackageName("test-pkg-name")
.putData("k1", "v1")
.putAllData(ImmutableMap.of("k2", "v2", "k3", "v3"))
.build())
.setTopic("test-topic")
.build();
Map<String, Object> data = ImmutableMap.<String, Object>of(
"collapse_key", "test-key",
"priority", "high",
"ttl", "0.010000000s", // 10 ms = 10,000,000 ns
"restricted_package_name", "test-pkg-name",
"data", ImmutableMap.of("k1", "v1", "k2", "v2", "k3", "v3")
);
assertJsonEquals(ImmutableMap.of("topic", "test-topic", "android", data), message);
}
Do you see the .setTtl(10) and the "ttl", "0.010000000s", // 10 ms = 10,000,000 ns parts? They confuse me. Their documentation says (emphasis is mine):
time_to_live: Optional, number: This parameter specifies how long (in seconds) the message should be kept in FCM storage if the device is offline. The maximum time to live supported is 4 weeks, and the default value is 4 weeks. For more information, see Setting the lifespan of a message.
The link they tell us to read says:
On Android and Web/JavaScript, you can specify the maximum lifespan of
a message. The value must be a duration from 0 to 2,419,200 seconds
(28 days), and it corresponds to the maximum period of time for which
FCM stores and attempts to deliver the message. Requests that don't
contain this field default to the maximum period of four weeks.
They clearly want the value in seconds. Yet their tests show usage of milliseconds?! It is frustrating to find so many examples in JavaScript, and almost nothing in Java, within their documentation!
One can also find this contradictory documentation:
public AndroidConfig.Builder setTtl (long ttl)
Sets the time-to-live duration of the message in milliseconds.
A good case of confusing documentation.
You should go with milliseconds, so for 30 hours you would get something like this:
Message message = Message.builder()
.putData("title", NotifType.NEW_USER_ALERT.getTitle())
.putData("body", "\"" + mzName + "\" contient " + nbrAlertes + " alertes.")
.putData("tag", mzName)
.putData("nbrAlerts", nbrAlertes+"")
.setToken(token)
.setAndroidConfig(AndroidConfig.builder()
.setTtl(30*3600*1000) // 30 hours
.build())
.build();
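To sidestep the unit confusion entirely, you can let java.util.concurrent.TimeUnit do the conversion:
.setTtl(TimeUnit.HOURS.toMillis(30)) // 30 hours, with the unit made explicit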
I am using verbosegc to capture some data and trying to analyze the memory usage of my application.
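For reference, these are the kinds of JVM options used to capture such logs (a sketch; the IBM JDK is an assumption based on the com/ibm heap-dump classes mentioned below, and the log file name is arbitrary):
-verbose:gc -Xverbosegclog:verbosegc.log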
I have a module that pulls data from a database or a third party, puts it into a list object, and returns it to the front end for display.
When I choose a date range, it pulls the data from the database.
When I choose today's date, my application sends a request to an MQ server, and the MQ server responds with an XML message. I then use the Apache Camel library to handle it.
Here is my verbosegc screenshot when pulling data from the database:
As you can see, every time I trigger the search function the memory usage increases and then drops back. This is normal, and also what I expected.
And this is the verbosegc screenshot when pulling data from the third party.
As you can see, after the memory increases it stays flat for a period, and only then drops back.
I suspect that org.apache.camel.Exchange or org.apache.camel.Message, or other Camel objects, are holding on to the memory for longer.
Here is some of my code that handles the XML message from the third party:
/**
* Camel Exchange producer template
*/
protected ProducerTemplate< Exchange > template;
@SuppressWarnings("unchecked")
private < T > T doSend(final Object request, final String headerName,
final Object headerObject,
final SendEaiMessageTemplateCallBack callback)
throws BaseRuntimeException {
log.debug( "doSend START >> {} ", request );
if ( this.requestObjectValidator != null
&& requestObjectValidator
.requiredValidation( requestObjectValidator ) ) {
requestObjectValidator.validateRequest( request );
}
final Exchange exchange = template.request( to, new Processor( ) {
public void process(final Exchange exchange) throws Exception {
exchange.getIn( ).setBody( request );
if ( headerName != null && headerObject != null ) {
exchange.getIn( ).setHeader( headerName, headerObject );
}
}
} );
log.debug( "doSend >> END >> exchange is failed? {}",
exchange.isFailed( ) );
Message outBoundMessage = null;
if ( callback != null ) {
// provide the callBack method to access exchange
callback.exchangeCallBack( exchange );
}
if ( exchange.isFailed( ) ) {
failedHandler.handleExchangeFailed( exchange, request );
} else {
outBoundMessage = exchange.getOut( false );
}
// handler outbound message
if ( this.outboundMessageHandler != null ) {
this.outboundMessageHandler.handleMessage( outBoundMessage );
}
if ( outBoundMessage != null ) {
if ( outBoundMessage.getBody( ) != null ) {
log.debug( "OutBoundMessage body {}", outBoundMessage.getBody( ) );
}
return (T) outBoundMessage.getBody( );
} else {
return null;
}
}
Because of this, my application was hitting an OutOfMemoryError. I am not sure whether it is because of the Apache Camel library or not; kindly advise.
Other than that, when I open the heap dump file, 52% of the heap is attributed to com/ibm/xml/xlxp2/scan/util/SimpleDataBufferFactory$DataBufferLink.
The rest is attributed to "Java heap is used by this char[] alone", which is a subcategory under DataBufferLink as well.
Everything I find when searching for this says the XML message is too large.
I have no idea in which direction to continue troubleshooting; could you kindly advise?
FYI, I am using camel-core-1.5.0.jar
I'm wondering if anyone has experienced the same problem.
We have a Vert.x application whose purpose, in the end, is to insert 600 million rows into a Cassandra cluster. We are testing the speed of Vert.x in combination with Cassandra by doing tests with smaller amounts.
If we run the fat jar (built with the Shade plugin) without the -cluster option, we are able to insert 10 million records in about a minute. When we add the -cluster option (eventually we will run the Vert.x application in a cluster), it takes about 5 minutes to insert 10 million records.
Does anyone know why?
We know that the Hazelcast config will create some overhead, but we never thought it would be 5 times slower. This implies we would need 5 EC2 instances in a cluster to get the same result as 1 EC2 instance without the cluster option.
As mentioned, everything runs on EC2 instances:
2 Cassandra servers on t2.small
1 Vert.x server on t2.2xlarge
You are actually running into corner cases of the Vert.x Hazelcast cluster manager.
First of all, you are using a worker verticle to send your messages (30000001). Under the hood Hazelcast is blocking, so when you send a message from a worker, version 3.3.3 does not take that into account. We recently added a fix, https://github.com/vert-x3/issues/issues/75 (not present in 3.4.0.Beta1 but present in 3.4.0-SNAPSHOTS), that will improve this case.
Second, when you send all your messages at the same time, you run into another corner case that prevents the Hazelcast cluster manager from using a cache of the cluster topology. This topology cache is usually updated after the first message has been sent, and sending all the messages in one shot prevents the cache from being used (short explanation: HazelcastAsyncMultiMap#getInProgressCount will be > 0 and prevents the cache from being used), hence paying the penalty of an expensive lookup (hence the cache).
If I use Bertjan's reproducer with 3.4.0-SNAPSHOT plus Hazelcast and the following change: send one message to the destination, wait for the reply, and upon the reply send all the remaining messages, then I get a lot of improvement:
Without clustering: 5852 ms
With clustering with HZ 3.3.3: 16745 ms
With clustering with HZ 3.4.0-SNAPSHOT + initial message: 8609 ms
I also believe you should not use a worker verticle to send that many messages; instead, send them from an event-loop verticle in batches, as sketched below. Perhaps you should explain your use case and we can think about the best way to solve it.
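A sketch of the batching idea (Vert.x 3.x; the batch size is an arbitrary value of mine, and it assumes the consumer replies to the first message, e.g. by uncommenting message.reply("OK")):
import io.vertx.core.AbstractVerticle;
import io.vertx.core.json.Json;
import java.time.LocalDateTime;

public class BatchingProviderVerticle extends AbstractVerticle {
    private static final int TOTAL = 30_000_000;
    private static final int BATCH_SIZE = 10_000; // arbitrary
    private int sent = 1;

    @Override
    public void start() {
        // Warm-up message: wait for the reply so the cluster topology cache
        // settles before flooding the event bus.
        vertx.eventBus().send("clustertest1",
                Json.encode(new TestCluster1(0, "abc", LocalDateTime.now())),
                reply -> sendNextBatch());
    }

    private void sendNextBatch() {
        int end = Math.min(sent + BATCH_SIZE, TOTAL + 1);
        for (int i = sent; i < end; i++) {
            vertx.eventBus().send("clustertest1",
                    Json.encode(new TestCluster1(i, "abc", LocalDateTime.now())));
        }
        sent = end;
        if (sent <= TOTAL) {
            // Yield back to the event loop between batches instead of blocking a worker.
            vertx.runOnContext(v -> sendNextBatch());
        }
    }
}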
When you enable clustering (of any kind) in an application, you make it more resilient to failures, but you also add a performance penalty.
For example your current flow (without clustering) is something like:
client ->
vert.x app ->
in memory same process eventbus (negletible) ->
handler -> cassandra
<- vert.x app
<- client
Once you enable clustering:
client ->
vert.x app ->
serialize request ->
network request cluster member ->
deserialize request ->
handler -> cassandra
<- serialize response
<- network reply
<- deserialize response
<- vert.x app
<- client
As you can see, there are many encode/decode operations required, plus several network calls, and all of this gets added to your total request time.
To achieve the best performance you need to take advantage of locality: the closer you are to your data store, the faster you usually are.
Just to add the code of the project; I guess that will help.
Sender verticle:
public class ProviderVerticle extends AbstractVerticle {
@Override
public void start() throws Exception {
IntStream.range(1, 30000001).parallel().forEach(i -> {
vertx.eventBus().send("clustertest1", Json.encode(new TestCluster1(i, "abc", LocalDateTime.now())));
});
}
@Override
public void stop() throws Exception {
super.stop();
}
}
And the inserter verticle
public class ReceiverVerticle extends AbstractVerticle {
private int messagesReceived = 1;
private Session cassandraSession;
@Override
public void start() throws Exception {
PoolingOptions poolingOptions = new PoolingOptions()
.setCoreConnectionsPerHost(HostDistance.LOCAL, 2)
.setMaxConnectionsPerHost(HostDistance.LOCAL, 3)
.setCoreConnectionsPerHost(HostDistance.REMOTE, 1)
.setMaxConnectionsPerHost(HostDistance.REMOTE, 3)
.setMaxRequestsPerConnection(HostDistance.LOCAL, 20)
.setMaxQueueSize(32768)
.setMaxRequestsPerConnection(HostDistance.REMOTE, 20);
Cluster cluster = Cluster.builder()
.withPoolingOptions(poolingOptions)
.addContactPoints(ClusterSetup.SEEDS)
.build();
System.out.println("Connecting session");
cassandraSession = cluster.connect("kiespees");
System.out.println("Session connected:\n\tcluster [" + cassandraSession.getCluster().getClusterName() + "]");
System.out.println("Connected hosts: ");
cassandraSession.getState().getConnectedHosts().forEach(host -> System.out.println(host.getAddress()));
PreparedStatement prepared = cassandraSession.prepare(
"insert into clustertest1 (id, value, created) " +
"values (:id, :value, :created)");
PreparedStatement preparedTimer = cassandraSession.prepare(
"insert into timer (name, created_on, amount) " +
"values (:name, :createdOn, :amount)");
BoundStatement timerStart = preparedTimer.bind()
.setString("name", "clusterteststart")
.setInt("amount", 0)
.setTimestamp("createdOn", new Timestamp(new Date().getTime()));
cassandraSession.executeAsync(timerStart);
EventBus bus = vertx.eventBus();
System.out.println("Bus info: " + bus.toString());
MessageConsumer<String> cons = bus.consumer("clustertest1");
System.out.println("Consumer info: " + cons.address());
System.out.println("Waiting for messages");
cons.handler(message -> {
TestCluster1 tc = Json.decodeValue(message.body(), TestCluster1.class);
if (messagesReceived % 100000 == 0)
System.out.println("Message received: " + messagesReceived);
BoundStatement boundRecord = prepared.bind()
.setInt("id", tc.getId())
.setString("value", tc.getValue())
.setTimestamp("created", new Timestamp(new Date().getTime()));
cassandraSession.executeAsync(boundRecord);
if (messagesReceived % 100000 == 0) {
BoundStatement timerStop = preparedTimer.bind()
.setString("name", "clusterteststop")
.setInt("amount", messagesReceived)
.setTimestamp("createdOn", new Timestamp(new Date().getTime()));
cassandraSession.executeAsync(timerStop);
}
messagesReceived++;
//message.reply("OK");
});
}
@Override
public void stop() throws Exception {
super.stop();
cassandraSession.close();
}
}
We have a problem. Our customers are complaining that they are getting duplicate emails in their inboxes. Some days, up to 5 or 6 instances of the exact same email at the exact same time. We don't understand why. The code has been rewritten at least once, but the problem persists.
I'll try to explain this... but it's a bit complicated :O(
Every night (early morning) we want to send our users a daily report containing usage stats. So we have a cron job:
<cron>
<url>/redacted/report/url</url>
<description>Send out daily reports to active subscribers</description>
<schedule>every 2 hours</schedule>
</cron>
The cron job hits the servlet get method:
protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws ServletException, IOException {
AccountFilter filter = AccountFilter.forWebSafeName(req.getParameter("filter"));
createTasks(filter, null);
}
Which calls the createTasks method with a null cursor:
private void createTasks(AccountFilter accountFilter, String cursor) {
try {
PagedResults<Account> pagedAccounts = accountRepository.getAccounts(accountFilter.getFilter(), 50, cursor);
createTaskBatch(pagedAccounts);
// If there are still more results in cursor, then send cursor back to this servlet's doPost method so we don't hit the request time limit
if (pagedAccounts.getCursor() != null) {
getQueue(QUEUE_NAME).add(withUrl(WORKER_URL).param(CURSOR_KEY, pagedAccounts.getCursor()).param(FILTER_KEY, accountFilter.getWebSafeName()));
}
} catch(Exception ex) {
logger.log(Level.WARNING, "Problem creating daily report task batch for filter " + accountFilter.getWebSafeName(), ex);
}
}
which grabs 50 accounts and iterates over them, creating new queued jobs for the emails that should be sent at this time. There is code to explicitly check the last-report-sent timestamp and to update that timestamp BEFORE creating the new queued task. This should err on the side of not sending the report rather than sending duplicates:
private void createTaskBatch(PagedResults<Account> pagedAccounts) {
// GAE datastore query might return duplicate results?!
List<Account> list = pagedAccounts.getResults();
Set<Account> noDuplicates = new HashSet<>(list);
int dups = list.size() - noDuplicates.size();
if ( dups > 0 ){
logger.warning ("Accounts paged results contained " + dups + " duplicates!");
}
for (Account account : noDuplicates) {
try {
if (lastReportSentOver12HoursAgo(account)) {
List<Parent> parents = parentRepository.getVerifiedParentsForAccount(account.getId());
if (eitherParentSubscribed(parents)) {
List<AccountUser> users = accountUserRepository.listUsers(account.getId());
List<Device> devices = getUserDevices(account, users);
if (!devices.isEmpty()) {
DateTimeZone tz = getMostCommonTimezone(devices);
if ( null == tz ){
logger.warning("No timezone found for account: " + account.getId() );
}
else{
// Send early in the morning as the report contains the previous day's stats
if (now(tz).getHourOfDay() < 7) {
// mark sent now because queue might not be processed for a while
// and the next cursor set might contain some of the same accounts
accountRepository.markReportSent(account.getId(), now());
getQueue(QUEUE_NAME).add(withUrl(DailyReportServlet.WORKER_URL).param(DailyReportServlet.ACCOUNT_ID, account.getId()).param(DailyReportServlet.COMMON_TIMEZONE, tz.getID()));
}
}
}
}
}
} catch(Exception ex) {
logger.log(Level.WARNING, "Problem creating daily report task for " + account.getId(), ex);
}
}
}
The servlet POST method takes care of handling the follow up pages of results via the cursor method:
public void doPost(HttpServletRequest req, HttpServletResponse resp) throws IOException {
AccountFilter accountFilter = AccountFilter.forWebSafeName(req.getParameter(FILTER_KEY));
logger.log(Level.INFO, "doPost hit from task queue with filter " + accountFilter.getWebSafeName());
String cursor = req.getParameter(CURSOR_KEY);
createTasks(accountFilter, cursor);
}
There is another servlet that handles each report task and it just creates the email contents and calls send on the com.sendgrid.SendGrid class.
Eventual consistency in the Datastore seems a likely candidate, but that should resolve within a few seconds, and I don't see how it would account for both the number of customers complaining and the number of duplicates that some customers see.
Help! Any ideas? Are we being dumb somewhere?
UPDATED
For clarity... the email-send task ends up in this method, which does catch exceptions and report them back to us. We don't see an exception for the duplicate cases:
private void sendReport(Account account, DateTimeZone tz) throws IOException, EntityNotFoundException {
try {
boolean sent = false;
Map<String, Object> root = buildEmailData(account, tz);
for (Parent parent : parentRepository.getVerifiedParentsForAccount(account.getId())) {
if (parent.getEmailPreferences().isSubscribedReports()) {
emailBuilder.send(account, parent, root, "report", EmailSender.NOTIFICATION);
sent = true;
}
}
if ( sent ){
accountRepository.markReportSent(account.getId(), now());
}
} catch (Exception ex) {
String message = "Problem building report email for account " + account.getId();
logger.log(Level.WARNING, message, ex);
new TeamNotificationEvent( message + " : exception: " + ex.getMessage()).fire();
throw new IOException(message, ex);
}
}
UPDATE 2 AFTER ADDING EXTRA DEBUG LOGGING
I see two POSTs arriving at the same time to the same task queue, with the same cursor:
09:35:08.397 2015-04-30 200 0 B 3.78s /ws/notification/daily-report-task-creator
0.1.0.2 - - [30/Apr/2015:01:35:08 -0700] "POST /ws/notification/daily-report-task-creator HTTP/1.1" 200 0 "http://screentimelabs.appspot.com/ws/notification/daily-report-task-creator" "AppEngine-Google; (+http://code.google.com/appengine)" "screentimelabs.appspot.com" ms=3782 cpu_ms=662 queue_name=dailyReports task_name=8168414365365326983 instance=00c61b117c33a909790f0d1882657e04f40b2c7e app_engine_release=1.9.20
09:35:04.618 com.screentime.service.taskqueue.reports.DailyReportTaskCreatorServlet createTasks: createTasks called for filter: ACTIVE with cursor: E-ABAIICO2oQc35zY3JlZW50aW1lbGFic3InCxIHQWNjb3VudCIaamFybW8ua2Fya2thaW5lbkBnbWFpbC5jb20MiAIAFA
09:35:08.432 2015-04-30 200 0 B 8.84s /ws/notification/daily-report-task-creator
0.1.0.2 - - [30/Apr/2015:01:35:08 -0700] "POST /ws/notification/daily-report-task-creator HTTP/1.1" 200 0 "http://screentimelabs.appspot.com/ws/notification/daily-report-task-creator" "AppEngine-Google; (+http://code.google.com/appengine)" "screentimelabs.appspot.com" ms=8837 cpu_ms=1348 queue_name=dailyReports task_name=50170612326424582061 instance=00c61b117c2bffe8de313e96fea8aeb813f4b20f app_engine_release=1.9.20 trace_id=7e5c0348382e66cf4e2c6ba400529fb7
09:34:59.608 com.screentime.service.taskqueue.reports.DailyReportTaskCreatorServlet createTasks: createTasks called for filter: ACTIVE with cursor: E-ABAIICO2oQc35zY3JlZW50aW1lbGFic3InCxIHQWNjb3VudCIaamFybW8ua2Fya2thaW5lbkBnbWFpbC5jb20MiAIAFA
Searching for 1 particular account id I see these requests:
09:35:08.397 2015-04-30 200 0 B 3.78s /ws/notification/daily-report-task-creator
09:35:08.432 2015-04-30 200 0 B 8.84s /ws/notification/daily-report-task-creator
09:35:08.443 2015-04-30 200 0 B 6.73s /ws/notification/daily-report-task-creator
09:35:10.541 2015-04-30 200 0 B 4.03s /ws/notification/daily-report-task-creator
09:35:10.690 2015-04-30 200 0 B 11.09s /ws/notification/daily-report-task-creator
09:35:13.678 2015-04-30 200 0 B 862ms /ws/notification/daily-report-worker
09:35:13.829 2015-04-30 500 0 B 1.21s /ws/notification/daily-report-worker
09:35:14.677 2015-04-30 200 0 B 1.56s /ws/notification/daily-report-worker
09:35:14.961 2015-04-30 200 0 B 346ms /ws/notification/daily-report-worker
Some have repeated cursor values.
I will make a guess, because I don't see the task-queue worker code: it's likely that you are not handling errors correctly in the task queue. If a task finishes with an error, GAE will re-queue it; thus, if some emails were already sent, the task will still run again. You need a way to remember what you have already processed in the task queue, so that a retry won't reprocess those.
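A sketch of what that could look like in the worker (all names here are hypothetical; the point is an atomic "already handled?" check that survives task retries):
private void handleReportTask(String accountId, String reportDate) throws IOException {
    // Key that uniquely identifies this send, e.g. "account123:2015-04-30".
    String markerKey = accountId + ":" + reportDate;
    // createIfAbsent must be atomic (e.g. a transactional datastore put that fails
    // if the entity already exists) for this to be retry-safe.
    boolean firstAttempt = sentMarkerRepository.createIfAbsent(markerKey);
    if (!firstAttempt) {
        logger.info("Report " + markerKey + " already handled; skipping retried task.");
        return; // a GAE retry of the task lands here instead of re-sending the email
    }
    sendReportEmail(accountId, reportDate);
}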
Update 4 - rephrasing question for clarity
I am using Pull Queues to feed back-end workers tasks that send push notifications. I can see the front-end instance queue the task in the logs. However, the task is only occasionally handled by the back-end. I see no indication of why the task disappears prior to being handled and deleted from the queue.
This may be related: I am seeing an unusually high number of TransientFailureExceptions when attempting to lease tasks from the queue - despite sleeping between attempts.
Everything works properly on my development server (and an earlier version had worked in production) but production is no longer working properly. At first I thought it was a certificate issue. However, notifications are sometimes sent when the backend first starts.
There is no indication that an error is happening except for the TransientFailureException when I call leaseTasks on the queue. Also, it seems to take a very long time for my logs to show up.
I can provide more information and code snippets as needed.
Thanks for the help.
Update 1:
The application uses 10 pull queues. It would normally use 2, but queue tagging is still considered experimental. The queues are declared in the standard fashion:
<queue>
<name>gcm-henchdist</name>
<mode>pull</mode>
</queue>
The lease tasks function is:
public boolean processBatchOfTasks()
{
List< TaskHandle > tasks = attemptLeaseTasks();
if( null == tasks || tasks.isEmpty() )
{
return false;
}
processLeasedTasks( tasks );
return true;
}
private List< TaskHandle > attemptLeaseTasks()
{
for( int attemptNnum = 1; !LifecycleManager.getInstance().isShuttingDown(); ++attemptNnum )
{
try
{
return m_taskQueue.leaseTasks( m_numLeaseTimeUnits, m_leaseTimeUnit, m_maxTasksPerLease );
} catch( TransientFailureException exc )
{
LOG.warn( "TransientFailureException when leasing tasks from queue '{}'", m_taskQueue.getQueueName(), exc );
ApiProxy.flushLogs();
} catch( ApiDeadlineExceededException exc )
{
LOG.warn( "ApiDeadlineExceededException when when leasing tasks from queue '{}'",
m_taskQueue.getQueueName(), exc );
ApiProxy.flushLogs();
}
if( !backOff( attemptNnum ) )
{
LOG.warn( "Failed to lease tasks." );
break;
}
}
return Collections.emptyList();
}
where the lease variables are 30, TimeUnit.MINUTES, and 100, respectively.
the processBatchOfTasks function is polled via:
private void startPollingForClient( EClientType clientType )
{
InterimApnsCertificateConfig config = InterimApnsCertificateConfigMgr.getConfig( clientType );
Queue notificationQueue = QueueFactory.getQueue( config.getQueueId().getName() );
ApplePushNotificationWorker worker = new ApplePushNotificationWorker(
notificationQueue,
m_messageConverter.getObjectMapper(),
config.getCertificateBytes(),
config.getPassword(),
config.isProduction() );
LOG.info( "Started worker for {} polling queue {}", clientType, notificationQueue.getQueueName() );
while ( !LifecycleManager.getInstance().isShuttingDown() )
{
boolean tasksProcessed = worker.processBatchOfTasks();
ApiProxy.flushLogs();
if ( !tasksProcessed )
{
// Wait before trying to lease tasks again.
try
{
//LOG.info( "Going to sleep" );
Thread.sleep( MILLISECONDS_TO_WAIT_WHEN_NO_TASKS_LEASED );
//LOG.info( "Waking up" );
} catch ( InterruptedException exc )
{
LOG.info( "Polling loop interrupted. Terminating loop.", exc );
return;
}
}
}
LOG.info( "Instance is shutting down" );
}
and the thread is created via:
Thread thread = ThreadManager.createBackgroundThread( new Runnable()
{
@Override
public void run()
{
startPollingForClient( clientType );
}
} );
thread.start();
GCM notifications are handled in a similar fashion.
Update 2
The following is the backoff function. I have verified in the logs (with both GAE's timestamps and my own) that the sleep is incrementing properly:
private boolean backOff( int attemptNo )
{
// Exponential back off between 2 seconds and 64 seconds with jitter
// 0..1000 ms.
attemptNo = Math.min( 6, attemptNo );
int backOffTimeInSeconds = 1 << attemptNo;
long backOffTimeInMilliseconds = backOffTimeInSeconds * 1000 + (int)( Math.random() * 1000 );
LOG.info( "Backing off for {} milliseconds from queue '{}'", backOffTimeInMilliseconds, m_taskQueue.getQueueName() );
ApiProxy.flushLogs();
try
{
Thread.sleep( backOffTimeInMilliseconds );
} catch( InterruptedException e )
{
return false;
}
LOG.info( "Waking up from {} milliseconds sleep for queue '{}'", backOffTimeInMilliseconds, m_taskQueue.getQueueName() );
ApiProxy.flushLogs();
return true;
}
Update 3
The tasks are added to the queue within a transaction on a front-end instance:
if( null != queueType )
{
String deviceName;
int numDevices = deviceList.size();
for ( int iDevice = 0; iDevice < numDevices; ++iDevice )
{
deviceName = deviceList.get( iDevice ).getName();
LOG.info( "Queueing Your-Turn notification for user: {} device: {} queue: {}", user.getId(), deviceName, queueType.getName() );
Queue queue = QueueFactory.getQueue( queueType.getName() );
queue.addAsync( TaskOptions.Builder.withMethod( TaskOptions.Method.PULL )
.param( "alertLocKey", "NOTIF_YOUR_TURN" ).param( "device", deviceName ) );
}
}
I know that the transaction succeeds because the database updates correctly.
In the logs I see the "Queueing Your-Turn notification..." entry, but nothing appears in the back-end logs.
In the administration panel, I see Task Queue API Calls increment by 1, as well as Task Queue Stored Task Count. However, the queue that was written to shows zero in both the Tasks In Queue and Leased In Last Minute fields.
The TransientFailureException JavaDoc says that "The requested operation may succeed if attempted again" (because the failure is transient). Therefore, when this exception is thrown, your code should loop back and repeat the leaseTasks call. Furthermore, AppEngine does not have to redo the request itself, because it notified you via the exception that you should do so.
It's a pity you repeat the method name leaseTasks as one of your own, because now it's not clear which one I'm referring to when I mention leaseTasks. Still, wrap the inner call to m_taskQueue.leaseTasks in a while loop and an additional try block that catches only the TransientFailureException, and use a flag to end the while loop only if that exception is not thrown.
Is that enough explanation, or do you need a complete source code listing?
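A minimal sketch of that loop, reusing the question's backOff between attempts (only the flag and the narrowed catch are new):
List<TaskHandle> tasks = Collections.emptyList();
boolean leased = false;
for (int attemptNo = 1; !leased; ++attemptNo) {
    try {
        tasks = m_taskQueue.leaseTasks(30, TimeUnit.MINUTES, 100);
        leased = true; // the flag ends the loop once no exception was thrown
    } catch (TransientFailureException exc) {
        // Transient by definition: back off, then attempt the lease again.
        if (!backOff(attemptNo)) {
            break; // interrupted while sleeping; give up
        }
    }
}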
It appears that the culprit may have been that I was calling addAsync when enqueuing the task instead of just calling add.
I replaced the call and things seem to be consistently working now. I would like to know why this makes a difference and will update the answer when I find the reason.
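For reference, a sketch of the working enqueue from Update 3, with add in place of addAsync (everything around it unchanged):
Queue queue = QueueFactory.getQueue(queueType.getName());
queue.add(TaskOptions.Builder.withMethod(TaskOptions.Method.PULL)
        .param("alertLocKey", "NOTIF_YOUR_TURN")
        .param("device", deviceName));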