How can I monitor EWS SOAP messages relating to subscription creation

We have a Spring Java app that uses EWS to connect to our on-prem Exchange 2016 server and 'stream' pull emails. Every 30 minutes a new 30-minute subscription is created (on a new thread). We assume the old connection simply expires.
When one instance is running in our environment, it works perfectly fine, but when two instances run, after some time one instance will eventually start throwing this error:
You have exceeded the available concurrent connections for your account. Try again once your other requests have completed.
It looks like we are hitting a throttling limit. I found that the Exchange server's config is:
EWSMaxConcurrency=27, MaxStreamingConcurrency=10,
HangingConnectionLimit=10
Our code previously didn't explicitly close connections or unsubscribe (it ran fine without doing so when there was only one instance). We tried adding both, but the issue persists, and we noticed that the close method of StreamingSubscriptionConnection throws an error. The team that handles the Exchange server can find errors referencing the exceeded connection count above, but nothing relating to the close connection error:
...[m.e.w.d.n.StreamingSubscriptionConnection.close(349)]: java.lang.Exception: microsoft.exchange.webservices.data.notification.StreamingSubscriptionConnection
Currently we don't have much ability to make changes on the exchange server side. I'm not familiar with SOAP messages but I was planning to look into how to monitor them to see what inbound and outbound messages there are for some insights
For the service I set service.setTraceEnabled(true) and service.setTraceFlags(EnumSet.allOf(TraceFlags.class))
However, I only see trace messages in the console when an email arrives. I don't see any messages during startup when a subscription/connection is created
Can anyone help provide any advice on how I can monitor these subscription related messages?
I tried using SOAPUI but I'm having difficulty applying our server's WSDL. I considered using the Tunnelij plugin for intellij but I'm not too familiar with how to set it up either
My suspicion is that there is some intermittent latency issue on Exchange server side, perhaps response messages are not coming back in a timely manner, and this may be screwing up. I presume if I monitor these SOAP messages then I should see more than 10 requests to subscribe before that error appears

The EWS logs on the CAS (Client Access Server) should have details about the throttling issue. Are you using impersonation in your application? If you are not using impersonation, the concurrent connections are charged against the account you are using; with impersonation, they are charged against the account you are impersonating. The difference is that a single user can have no more than 10 streaming subscriptions (unless you modify the web.config), whereas with impersonation you can scale your application to thousands of users. See https://github.com/MicrosoftDocs/office-developer-exchange-docs/blob/main/docs/exchange-web-services/how-to-maintain-affinity-between-group-of-subscriptions-and-mailbox-server.md
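One client-side way to stay under a per-account connection cap is to guard connection creation with a counting semaphore and free a slot only when the old connection is closed. The following is a minimal sketch in plain Java, not EWS API code: StreamingSlots is a hypothetical helper, and the Runnable arguments stand in for whatever calls actually open and close the streaming connection. Because the cap applies to the account, two instances sharing one account would each need a share of the limit.

```java
import java.util.concurrent.Semaphore;

/** Hypothetical helper: caps how many streaming connections this process holds at once. */
final class StreamingSlots {
    private final Semaphore permits;

    StreamingSlots(int maxConnections) {
        this.permits = new Semaphore(maxConnections);
    }

    /** Runs the open action only if a slot is free; returns false when the cap is reached. */
    boolean tryOpen(Runnable openConnection) {
        if (!permits.tryAcquire()) {
            return false; // at the cap: do not send another subscribe request
        }
        try {
            openConnection.run();
            return true;
        } catch (RuntimeException e) {
            permits.release(); // opening failed, so the slot was never used
            throw e;
        }
    }

    /**
     * Runs the close action and frees the slot even if close throws.
     * Assumption: a failed close still ends the connection on the server;
     * only the server-side logs can confirm that.
     */
    void close(Runnable closeConnection) {
        try {
            closeConnection.run();
        } finally {
            permits.release();
        }
    }

    /** Number of connections that could still be opened right now. */
    int freeSlots() {
        return permits.availablePermits();
    }
}
```

With this in place, the 30-minute renewal would call close for the old connection before tryOpen for the new one, and a false return from tryOpen is a signal to log and retry later rather than send another request.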

Related

Getting MQ error (reason 2594) when trying to connect to MQ manager after first few messages

I am upgrading a standalone Java app that uses IBM MQ to send messages to a local WebSphere 8.5 server. The existing app uses a bunch of different JARs for the MQ code (mq, mqbind, mqjms, connector-api, jms).
For the new one I saw that there is now an all-encompassing "allclient" MQ JAR (https://mvnrepository.com/artifact/com.ibm.mq/com.ibm.mq.allclient/9.2.0.0) so I decided to use that.
It appears to work just fine for the first few messages, but after sending 4-5 messages, all subsequent messages will then fail with a code 2594 (https://www.ibm.com/support/knowledgecenter/en/SSFKSJ_9.1.0/com.ibm.mq.tro.doc/q120510_.htm):
Caused by: com.ibm.mq.jmqi.JmqiException: CC=2;RC=2594;AMQ9204: Connection to host 'localhost(5558)' rejected. [1=com.ibm.mq.jmqi.JmqiException[CC=2;RC=2594;AMQ9503: Channel negotiation failed. [3=WAS.JMS.SVRCONN ]],3=localhost(5558),5=RemoteConnection.initSess]
at com.ibm.mq.jmqi.remote.api.RemoteFAP$Connector.jmqiConnect(RemoteFAP.java:13588)
at com.ibm.mq.jmqi.remote.api.RemoteFAP$Connector.access$100(RemoteFAP.java:13125)
at com.ibm.mq.jmqi.remote.api.RemoteFAP.jmqiConnect(RemoteFAP.java:1430)
at com.ibm.mq.jmqi.remote.api.RemoteFAP.jmqiConnect(RemoteFAP.java:1389)
at com.ibm.mq.ese.jmqi.InterceptedJmqiImpl.jmqiConnect(InterceptedJmqiImpl.java:377)
at com.ibm.mq.ese.jmqi.ESEJMQI.jmqiConnect(ESEJMQI.java:562)
at com.ibm.mq.MQSESSION.MQCONNX_j(MQSESSION.java:916)
at com.ibm.mq.MQManagedConnectionJ11.<init>(MQManagedConnectionJ11.java:240)
On the server side, I get the following in the console:
CWSIC3712E: A WebSphere MQ client, previously connected from host 127.0.0.1:58963 on transport chain InboundBasicMQLink, has been disconnected because of exception java.io.IOException: Async IO operation failed (1), reason: RC: 55 The specified network resource or device is no longer available.
After this error occurs, any subsequent attempts to send a message will fail with the same error. I have to restart the app, at which point the same thing repeats: the first 4-5 messages are sent before the failures begin. If I switch back to using the old JARs without changing the code, I'm able to send an unlimited number of messages without any issues.
The reason code is confusing to me ("An MQCONN or MQCONNX call was issued from a client connected application, but it failed to agree a password protection algorithm with the queue manager.") because if it's truly a password issue, why do the first couple messages send without issue? It does not seem to be an issue with closing/disconnecting the queue/manager because I'll wait a few seconds between each send and can breakpoint/println and see that each time they are being closed before the next send.
Any ideas?

I found a partial workaround to this problem:
In the Java code, for our MQMessage object, we declare a "replyToQueueName". If I remove that setter, the problem seems to go away (we can send as many messages as we want without error).
I'm not certain why in this particular case that seems to work. The failure that occurs happens on the MQQueueManager declaration which is much higher up in the code than the replyToQueueName setter. This combined with this bug only occurring after 4-5 messages are sent seems to maybe indicate that something is not being "closed" properly but as far as I know there is no way to "close" a message and we are already closing/disconnecting the manager and the queue.

Spring boot + tomcat 8.5 + mongoDB, AsyncRequestTimeoutException

I have created a spring boot web application and deployed war of the same to tomcat container.
The application connects to mongoDB using Async connections. I am using mongodb-driver-async library for that.
At startup everything works fine. But as soon as the load increases, it shows the following exception in DB connections:
org.springframework.web.context.request.async.AsyncRequestTimeoutException: null
at org.springframework.web.context.request.async.TimeoutDeferredResultProcessingInterceptor.handleTimeout(TimeoutDeferredResultProcessingInterceptor.java:42)
at org.springframework.web.context.request.async.DeferredResultInterceptorChain.triggerAfterTimeout(DeferredResultInterceptorChain.java:75)
at org.springframework.web.context.request.async.WebAsyncManager$5.run(WebAsyncManager.java:392)
at org.springframework.web.context.request.async.StandardServletAsyncWebRequest.onTimeout(StandardServletAsyncWebRequest.java:143)
at org.apache.catalina.core.AsyncListenerWrapper.fireOnTimeout(AsyncListenerWrapper.java:44)
at org.apache.catalina.core.AsyncContextImpl.timeout(AsyncContextImpl.java:131)
at org.apache.catalina.connector.CoyoteAdapter.asyncDispatch(CoyoteAdapter.java:157)
I am using following versions of software:
Spring boot -> 1.5.4.RELEASE
Tomcat (installed as standalone binary) -> apache-tomcat-8.5.37
Mongo DB version: v3.4.10
mongodb-driver-async: 3.4.2
As soon as I restart the tomcat service, everything starts working fine.
Please help, what could be the root cause of this issue.
P.S.: I am using DeferredResult and CompletableFuture to create Async REST API.
I have also tried setting spring.mvc.async.request-timeout in the application and configuring asyncTimeout in Tomcat, but I still get the same error.

It's probably obvious that Spring is timing out your requests and throwing AsyncRequestTimeoutException, which returns a 503 back to your client.
Now the question is, why is this happening? There are two possibilities.
These are legitimate timeouts. You mentioned that you only see the exceptions when the load on your server increases. So possibly your server just can't handle that load and its performance has degraded to the point where some requests can't complete before Spring times them out.
The timeouts are caused by your server failing to send a response to an asynchronous request due to a programming error, leaving the request open until Spring eventually times it out. It's easy for this to happen if your server doesn't handle exceptions well. If your server is synchronous, it's okay to be a little sloppy with exception handling because unhandled exceptions will propagate up to the server framework, which will send a response back to the client. But if you fail to handle an exception in some asynchronous code, that exception will be caught elsewhere (probably in some thread pool management code), and there's no way for that code to know that there's an asynchronous request waiting on the result of the operation that threw the exception.
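The second possibility can be sketched with plain CompletableFuture code. PendingResponse below is a hypothetical stand-in for a pending HTTP response such as Spring's DeferredResult, and dbCall stands in for an async database call; the point is only that the failure branch must also answer the request, otherwise it stays open until the timeout fires.

```java
import java.util.concurrent.CompletableFuture;

/** Hypothetical stand-in for a pending HTTP response (for example Spring's DeferredResult). */
final class PendingResponse {
    private final CompletableFuture<String> body = new CompletableFuture<>();

    void setResult(String value)      { body.complete(value); }
    void setErrorResult(String error) { body.complete("ERROR: " + error); }
    boolean isAnswered()              { return body.isDone(); }
    String body()                     { return body.getNow(null); }
}

final class AsyncHandlers {
    /** Answers the pending response on both the success path and the failure path. */
    static PendingResponse handle(CompletableFuture<String> dbCall) {
        PendingResponse response = new PendingResponse();
        dbCall.whenComplete((value, error) -> {
            if (error != null) {
                // Without this branch the request would hang until the framework times it out.
                response.setErrorResult(String.valueOf(error.getMessage()));
            } else {
                response.setResult(value);
            }
        });
        return response;
    }
}
```

A handler that only registers a success callback (for example thenAccept alone) is the sloppy case described above: the exception is swallowed by the future and no response is ever sent.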
It's hard to figure out what might be happening without knowing more about your application. But there are some things you could investigate.
First, try looking for resource exhaustion.
Is the garbage collector running all the time?
Are all CPUs pegged at 100%?
Is the OS swapping heavily?
If the database server is on a separate machine, is that machine showing signs of resource exhaustion?
How many connections are open to the database? If there is a connection pool, is it maxed out?
How many threads are running? If there are thread pools in the server, are they maxed out?
If something's at its limit then possibly it is the bottleneck that is causing your requests to time out.
Try setting spring.mvc.async.request-timeout to -1 and see what happens. Do you now get responses for every request, only slowly, or do some requests seem to hang forever? If it's the latter, that strongly suggests that there's a bug in your server that's causing it to lose track of requests and fail to send responses. (If setting spring.mvc.async.request-timeout appears to have no effect, then the next thing you should investigate is whether the mechanism you're using for setting the configuration actually works.)
A strategy that I've found useful in these cases is to generate a unique ID for each request and write the ID along with some contextual information every time the server either makes an asynchronous call or receives a response from an asynchronous call, and at various checkpoints within asynchronous handlers. If requests go missing, you can use the log information to figure out the request IDs and what the server was last doing with that request.
A similar strategy is to save each request ID into a map in which the value is an object that tracks when the request was started and what your server last did with that request. (In this case your server is updating this map at each checkpoint rather than, or in addition to, writing to the log.) You can set up a filter to generate the request IDs and maintain the map. If your filter sees the server send a 5xx response, you can log the last action for that request from the map.
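The map-based variant could look roughly like this. RequestTracker and its method names are made up for illustration; a servlet filter (not shown) would call start when a request arrives, the handlers would call checkpoint, and the filter would call finish when the response goes out, logging the returned checkpoint on a 5xx.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical tracker: remembers the last checkpoint reached by each in-flight request. */
final class RequestTracker {
    /** Immutable snapshot of when a request started and what the server last did with it. */
    static final class Checkpoint {
        final long startedAtMillis;
        final String lastAction;

        Checkpoint(long startedAtMillis, String lastAction) {
            this.startedAtMillis = startedAtMillis;
            this.lastAction = lastAction;
        }
    }

    private final Map<String, Checkpoint> inFlight = new ConcurrentHashMap<>();

    /** Registers a new request and returns its generated ID. */
    String start(long nowMillis) {
        String id = UUID.randomUUID().toString();
        inFlight.put(id, new Checkpoint(nowMillis, "received"));
        return id;
    }

    /** Records the latest action, keeping the original start time. */
    void checkpoint(String id, String action) {
        inFlight.computeIfPresent(id, (key, old) -> new Checkpoint(old.startedAtMillis, action));
    }

    /** Removes the request once a response is sent; returns its last checkpoint, or null if unknown. */
    Checkpoint finish(String id) {
        return inFlight.remove(id);
    }

    /** Last known action of a request that is still open, or null if it is not tracked. */
    String lastAction(String id) {
        Checkpoint c = inFlight.get(id);
        return c == null ? null : c.lastAction;
    }
}
```

Anything still in the map long after its start time is a request the server has lost track of, and its lastAction says where to look.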
Hope this helps!

Asynchronous tasks are placed in a queue (pool) that is processed in parallel, depending on the number of threads allocated. Not all asynchronous tasks are executed at the same time; some of them are queued. In such a system, getting AsyncRequestTimeoutException is normal behaviour.
If you are filling up the queues with asynchronous tasks that cannot execute under pressure, increasing the timeout will only delay the problem. You should focus on the underlying problem instead:
Reduce the execution time of the asynchronous tasks (through various optimizations). This will relieve the pooling of async tasks. It obviously requires coding.
Increase the number of CPUs allocated so that the parallel tasks can run more efficiently.
Increase the number of threads servicing the executor of the driver.
The Mongo async driver uses AsynchronousSocketChannel, or Netty if Netty is found on the classpath. To increase the number of worker threads servicing the async communication, pass your own event loop group to the stream factory:
// nThreads is the number of worker threads that should service async I/O
io.netty.channel.EventLoopGroup eventLoopGroup = new io.netty.channel.nio.NioEventLoopGroup(nThreads);
MongoClientSettings settings = MongoClientSettings.builder()
.streamFactoryFactory(new NettyStreamFactoryFactory(eventLoopGroup,
io.netty.buffer.ByteBufAllocator.DEFAULT))
.build();
Here MongoClientSettings comes from com.mongodb.async.client and NettyStreamFactoryFactory from com.mongodb.connection.netty. The number of threads servicing your async communication is set on the NioEventLoopGroup.
Read more about Netty configuration here https://mongodb.github.io/mongo-java-driver/3.2/driver-async/reference/connecting/connection-settings/

What cometd configurations to use to reduce 402 error occurrences?

We have implemented a Java servlet running on JBoss container that uses CometD long-polling. This has been implemented in a few organizations without any issue, but in a recent implementation there are functional issues which appear to be related to the network setup of this organization.
Specifically, around 5% of the time, the connect requests are getting back 402 errors:
{"id":"39","error":"402::Unknown client","successful":false,"advice":{"interval":0,"reconnect":"handshake"},"channel":"/meta/connect"}
Getting this organization to address network performance is a significant challenge, so we are looking at a way to tune the implementation to reduce these issues.
Which cometd configuration parameters can be updated to improve this?
maxInterval, timeout, multiSessionInterval, etc.?
Thank you!

The "402 unknown client" error occurs because the server does not see /meta/connect heartbeat messages from the client and expires the corresponding session on the server. This is typically due to network issues.
Once the client's network is restored, the client sends a /meta/connect heartbeat message, but the server no longer has the corresponding session, hence the 402.
The parameter that controls the server-side expiration of sessions is maxInterval, documented here: https://docs.cometd.org/current/reference/#_java_server.
By default it is 10 seconds. If you increase it, the server retains sessions in memory for a longer time, so you need to take that into account.
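To make the effect of maxInterval concrete, here is a toy model in plain Java. It is not CometD's implementation: SessionExpiryModel is a made-up class, timestamps are passed in explicitly, and the long-poll timeout is ignored. It only shows why a heartbeat that arrives after a gap longer than maxInterval is answered with 402, and why a larger value tolerates the same gap.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of server-side session expiry; not CometD's actual implementation. */
final class SessionExpiryModel {
    private final long maxIntervalMillis;
    private final Map<String, Long> lastConnectMillis = new HashMap<>();

    SessionExpiryModel(long maxIntervalMillis) {
        this.maxIntervalMillis = maxIntervalMillis;
    }

    /** Creates (or recreates) the session for a client. */
    void handshake(String clientId, long nowMillis) {
        lastConnectMillis.put(clientId, nowMillis);
    }

    /** Drops sessions whose last heartbeat is older than maxInterval. */
    void sweep(long nowMillis) {
        lastConnectMillis.values().removeIf(last -> nowMillis - last > maxIntervalMillis);
    }

    /** Returns "ok" for a live session, or the 402 error for an expired or unknown one. */
    String connect(String clientId, long nowMillis) {
        sweep(nowMillis);
        if (!lastConnectMillis.containsKey(clientId)) {
            return "402::Unknown client";
        }
        lastConnectMillis.put(clientId, nowMillis); // heartbeat seen, restart the clock
        return "ok";
    }
}
```

In this model a 15-second network stall produces a 402 when maxInterval is 10 seconds but not when it is 30 seconds, at the cost of keeping dead sessions in memory for up to 30 seconds.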

Some Spring WebSocket Sessions never disconnect

I have a websocket solution for duplex communication between mobile apps and a java backend system. I am using Spring WebSockets, with STOMP. I have implemented a ping-pong solution to keep websockets open longer than 30 seconds because I need longer sessions than that. Sometimes I get these errors in the logs, which seem to come from checkSession() in Spring's SubProtocolWebSocketHandler.
server.log: 07:38:41,090 ERROR [org.springframework.web.socket.messaging.SubProtocolWebSocketHandler] (ajp-http-executor-threads - 14526905) No messages received after 60205 ms. Closing StandardWebSocketSession[id=214a10, uri=/base/api/websocket].
They are not very frequent, but they happen every day, and the time of 60 seconds seems appropriate since it's hardcoded into the Spring class mentioned above. But then, after running the application for a while, I start getting large numbers of these really long-lived 'timeouts':
server.log: 00:09:25,961 ERROR [org.springframework.web.socket.messaging.SubProtocolWebSocketHandler] (ajp-http-executor-threads - 14199679) No messages received after 208049286 ms. Closing StandardWebSocketSession[id=11a9d9, uri=/base/api/websocket].
And at about this time the application starts experiencing problems.
I've been trying to search for this behavior but haven't found it anywhere on the web. Has anyone seen this problem before, know a solution, or can explain it to me?

We found some things:
We have added our own ping/pong functionality on STOMP level that runs every 30 seconds.
The mobile client had a bug that caused them to keep replying to the pings even when going into screensaving mode. This meant that the websocket was never closed or timed out.
On each pong message that the server received the Spring check found that no 'real' messages had been received for a very long time and triggered the log to be written. It then tries to close the websocket with this code:
session.close(CloseStatus.SESSION_NOT_RELIABLE);
but I suspect this doesn't close the session correctly. And even if it did, the mobile clients would try to reconnect. So when 30 more seconds have passed another pong message is sent to the server causing yet another one of these logs to be written. And so on forever...
The solution was to write some server-side code to close old websockets based on this project and also to fix the bug in the mobile clients that made them respond to ping/pong even when being in screensaver mode.
Oh, one thing that might be good for other people to know is that clients should never be trusted and we saw that they could sometimes send multiple request for websockets within one millisecond, so make sure to handle these 'duplicate requests' some way!
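One possible way to handle both the old sockets and the duplicate requests is to keep at most one session per client and close whatever a new session replaces. This is only a sketch of that idea, not the code we used: OneSessionPerClient and its Session interface are hypothetical stand-ins (a real version would wrap Spring's WebSocketSession), and it assumes each client can be identified by some ID.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical registry that keeps at most one open session per client. */
final class OneSessionPerClient {
    /** Minimal stand-in for a WebSocket session (hypothetical interface). */
    interface Session {
        void close();
    }

    private final Map<String, Session> sessions = new ConcurrentHashMap<>();

    /** Registers the new session and closes whichever session it replaces. */
    void register(String clientId, Session fresh) {
        Session previous = sessions.put(clientId, fresh);
        if (previous != null && previous != fresh) {
            previous.close();
        }
    }

    /** Forgets the session, but only if it is still the current one for that client. */
    void unregister(String clientId, Session closed) {
        sessions.remove(clientId, closed);
    }

    /** Number of clients that currently have a registered session. */
    int openCount() {
        return sessions.size();
    }
}
```

The two-argument remove matters: when the replaced session's close callback fires later, it must not evict the newer session that took its place.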

I am also facing the same problem.
netstat output on Linux shows the TCP connections and their status as below:
1 LISTEN
13 ESTABLISHED
67 CLOSE_WAIT
67 TCP connections are waiting to be closed but these are never getting closed.

"Shutdown tomcat server" etcpp. message to all currently logged in users

I would like to inform all logged-in users that the server is about to shut down. This would be especially useful in an Ajax-based application (RIA).
What are the possible solutions? What are the best practice solutions?
There are two possible end scenarios:
Send a text $x to the server ergo to all users. ("The server will not be available for some minutes.")
Send a key $y to the server, which will be used to generate a (custom) text for all users. ("SERVER_SHUTDOWN")
Environment: Tomcat (6/7), Spring 3+
Messaging to users: with polling or pseudo-pushing via an async servlet.
Ideas
1. Context.destroy(): Implementing a custom ContextListener's destroy
I don't think it is a good solution to block within a "destroy()" -- blocking, because we should wait about 5-10 seconds to make sure that all logged in users receive a message.
2. JMX Beans
This would mean that any server service operation (start, stop) has to invoke a special program which sends the message.
3. Any other messaging queues like AMQP or ActiveMQ
Like 2.

Unless the server shuts down regularly and the shutdown has a significant impact on users (e.g. they will lose any unsubmitted work; think of being halfway through editing a big post on a page), notifying users of a server shutdown won't really be of much benefit.
There are a couple of things you could do.
First, if the server is going to be shut down for planned maintenance, you could include a message on web pages like:
Server will be unavailable Monday 22nd Aug 9pm - 6am for planned maintenance. Contact knalli#example.com for more information.
Second, before shutting down the server, redirect requests to a static holding page (just change your web server config). This holding page should have information on why the server is down and when it will be available again.
With both options, it's also important to plan server downtime. It's normal to have maintenance windows outside of normal working hours. Alternatively, if you have more than one server you can cluster them. This allows you to take individual servers out of the cluster to perform maintenance without having any server downtime at all.
