NewRelic Ignore cometd LongPolling in Jetty - java

I have a Java web app running on Jetty to which clients connect using CometD to receive data; a request returns after 25s if the server has no data and the client then reconnects, i.e., long-polling.
I monitor the performance of the server using NewRelic but those long-polling connections skew the performance diagrams.
Is there a way to tell New Relic to ignore the time the server spends waiting and only show the time the server was actually busy? I understand that this is probably impossible on the New Relic side, but I thought there might be some best practices for dealing with long-polling connections in New Relic.
Any help is appreciated!

You won't be able to exclude or ignore just the time the server spends waiting and show only the time the server has been busy, but what you can do is ignore the transaction completely if you do not need to see those metrics. This documentation describes using New Relic's API for ignoring transactions: https://docs.newrelic.com/docs/java/java-agent-api
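For example, a rough sketch of dropping a transaction via the agent API from a servlet filter (the filter class and the idea of doing it in a filter are my own illustration, assuming the New Relic API jar is on the classpath; the filter's URL mapping is up to you):

    import java.io.IOException;
    import javax.servlet.Filter;
    import javax.servlet.FilterChain;
    import javax.servlet.FilterConfig;
    import javax.servlet.ServletException;
    import javax.servlet.ServletRequest;
    import javax.servlet.ServletResponse;
    import com.newrelic.api.agent.NewRelic;

    // Drops the current New Relic transaction for any request this filter is
    // mapped to, so those requests stop contributing to response-time charts.
    public class IgnoreLongPollFilter implements Filter {

        public void init(FilterConfig filterConfig) {
        }

        public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
                throws IOException, ServletException {
            NewRelic.ignoreTransaction(); // discard this transaction entirely
            chain.doFilter(request, response);
        }

        public void destroy() {
        }
    }

Map the filter (in web.xml or programmatically) to whatever URL pattern your long polls use.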

CometD sends long polls to a URL that is the base CometD Servlet URL with "/connect" appended; see the appendMessageTypeToURL parameter in the documentation.
For example, if you have mapped the CometD Servlet to /cometd/*, then long polls are sent to /cometd/connect.
I don't know NewRelic, but perhaps you can filter out the requests that end in */connect and gather your statistics on the other requests, which will then not be skewed by the long-poll timeout.
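Combining this with the first answer: map a filter like the sketch above to the CometD servlet's URL pattern and only call NewRelic.ignoreTransaction() when something like request.getRequestURI().endsWith("/connect") is true, so the long polls are dropped while the rest of your traffic keeps reporting normally.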

Related

What cometd configurations to use to reduce 402 error occurrences?

We have implemented a Java servlet running on JBoss container that uses CometD long-polling. This has been implemented in a few organizations without any issue, but in a recent implementation there are functional issues which appear to be related to the network setup of this organization.
Specifically, around 5% of the time, the connect requests are getting back 402 errors:
{"id":"39","error":"402::Unknown client","successful":false,"advice":{"interval":0,"reconnect":"handshake"},"channel":"/meta/connect"}
Getting this organization to address network performance is a significant challenge, so we are looking at a way to tune the implementation to reduce these issues.
Which cometd configuration parameters can be updated to improve this?
maxInterval, timeout, multiSessionInterval, etc.?
Thank you!
The "402 unknown client" error is due to the fact that the server does not see /meta/connect heartbeat messages from the client and expires the correspondent session on the server. This is typically due to network issues.
Once the client network is restored, the client sends a /meta/connect heartbeat message but the server doesn't have the correspondent session, hence the 402.
The parameter that controls the server side expiration of sessions is maxInterval, documented here: https://docs.cometd.org/current/reference/#_java_server.
By default it is 10 seconds. If you increase it, it means you are retaining sessions in server memory for a longer time, so you need to take that into account.
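The parameter is set as an init-param on the CometD servlet. A minimal sketch with embedded Jetty (the 30-second value and the servlet mapping are only an illustration; the same init-params go into web.xml on a container like JBoss):

    import org.cometd.server.CometDServlet;
    import org.eclipse.jetty.servlet.ServletContextHandler;
    import org.eclipse.jetty.servlet.ServletHolder;

    public class CometdConfig {

        public static void configure(ServletContextHandler context) {
            ServletHolder cometd = new ServletHolder(new CometDServlet());
            // How long the server keeps a session alive while waiting for the
            // next /meta/connect heartbeat (milliseconds, default 10000).
            cometd.setInitParameter("maxInterval", "30000");
            // How long the server holds an empty /meta/connect before replying.
            cometd.setInitParameter("timeout", "25000");
            context.addServlet(cometd, "/cometd/*");
        }
    }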

Elasticsearch unclosed client. Live threads after Tomcat shutdown. Memory usage impact?

I am using Elasticsearch 1.5.1 and Tomcat 7. The web application creates a TCP client instance as a singleton during server startup through the Spring Framework.
Just noticed that I failed to close the client during server shutdown.
Through analysis with various tools like VisualVM, JConsole, and MAT in Eclipse, it is evident that threads created by the Elasticsearch client are still live even after server (Tomcat) shutdown.
Note: after introducing client.close() via Context Listener destroy methods, the threads are killed gracefully.
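A minimal sketch of such a listener, for reference (the bean name and the Spring lookup are just one way to get at the singleton client):

    import javax.servlet.ServletContextEvent;
    import javax.servlet.ServletContextListener;
    import org.elasticsearch.client.Client;
    import org.springframework.web.context.support.WebApplicationContextUtils;

    // Closes the singleton Elasticsearch client when the web app is undeployed,
    // so its worker threads do not outlive the Tomcat shutdown.
    public class ElasticsearchShutdownListener implements ServletContextListener {

        public void contextInitialized(ServletContextEvent sce) {
        }

        public void contextDestroyed(ServletContextEvent sce) {
            Client client = WebApplicationContextUtils
                    .getRequiredWebApplicationContext(sce.getServletContext())
                    .getBean("elasticsearchClient", Client.class);
            client.close();
        }
    }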
But my queries here are:
How do I check the memory occupied by these live threads?
What is the memory-leak impact of these threads?
We have had a few OutOfMemoryError: PermGen space errors in PROD. This might be a reason, but I would still like to measure and provide stats for it.
Any suggestions/help please.
Typically clients run in a different process than the services they communicate with. For example, I can open a web page in a web browser, then shut down the web server, and the client will remain open.
This has to do with the underlying design choices of TCP/IP. Glossing over the details, in most cases a client only detects that its server is gone during the next request to the server. Generally speaking, it does not continually poll the server to see if it is alive, nor does the server generally send a "please disconnect" message when shutting down.
The reason that clients don't generally poll servers is because it allows the server to handle more clients. With a polling approach, the server is limited by the number of clients running, but without a polling approach, it is limited by the number of clients actively communicating. This allows it to support more clients because many of the running clients aren't actively communicating.
The reason that servers typically don't send an "I'm shutting down" message is because many times the server goes down uncontrollably (power outage, operating system crash, fire, short circuit, etc.). This means that a protocol which requires such a message will leave clients in a corrupt state if the server goes down in an uncontrolled manner.
So losing a connection is really a function of a failed request to the server. The client will still typically be running until it makes the next attempt to do something.
Likewise, opening a connection to a server often does nothing most of the time. To validate that you really have a working connection to a server, you must ask it for some data and get a reply. Most protocols do this automatically to simplify the logic; but if you ever write your own service, remember that if you don't ask for data from the server, you might not have a working connection even if the API says you do. The API can report a good "connection" when you simply have everything configured successfully on your own machine. To really know that it works 100% with the other machine, you need to ask for data (and get it).
Finally, servers sometimes lose their clients, but because they don't waste bandwidth chattering with clients just to see if they are there, servers often put a "timeout" on the client connection. Basically, if the server doesn't hear from the client in 10 minutes (or the configured value), it closes the cached connection information for the client (recreating the connection information as necessary if the client comes back).
From your description it is not clear which of the scenarios you might be seeing, but hopefully this general knowledge will help you understand why after closing one side of a connection, the other side of a connection might still think it is open for a while.
There are ways to configure the network connection to report closures more immediately, but I would avoid using them unless you are willing to spend a lot of your network bandwidth on keep-alive messages and to have your servers respond less quickly than they otherwise could.

Forward http request to other server that will respond to the original requester using java servlets

I have a problem where I have several servers sending HttpRequests (using round robin to decide which server to send to) to several servers that process the requests and return the response.
I would like to have a broker in the middle that examines the request and decides which server to forward it to, but the responses can be very big, so I would like the response to be sent only to the original requester and not be passed back through the broker. It is kind of like a proxy, but the way I understand a proxy is that all data is sent back through it. Is this possible?
I'm working with legacy code and would rather not change the way the requests and responses are processed but only put something in the middle that can do some smarter routing of the requests.
All this is currently done using HttpServletRequest/Response and Servlets running on embedded Jetty web servers.
Thank you!
What you're after is the broker component using the client's IP address when connecting to the target server. That is called IP spoofing.
Are you sure that you want to implement this yourself? The intricacies of a network-level implementation of such a solution are quite daunting. Consider using software that has this option built in, such as HAProxy. See these blog posts.

Threaded apache cxf clients and performance on high frequency requests

I have a relatively simple java service that fetches information from various SOAP webservices and does so using apache cxf 2.5.2 under the hood. The service launches 20 worker threads to churn through 1000-8000 requests every hour and each request could make 2-5 webservice calls depending on the nature of the request.
Setup
I am using connection pooling on the webservice connections
Connection Timeout is set to 2 seconds in order to realistically tackle the volume of requests efficiently (see the sketch after this list for how this is typically set on the CXF HTTP conduit).
All connections are going out through a http proxy.
20 Worker Threads
Grunty 16 cpu box
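For reference, this kind of per-client timeout is normally applied on the CXF HTTP conduit; a minimal sketch (the port proxy and the receive-timeout value are placeholders):

    import org.apache.cxf.endpoint.Client;
    import org.apache.cxf.frontend.ClientProxy;
    import org.apache.cxf.transport.http.HTTPConduit;
    import org.apache.cxf.transports.http.configuration.HTTPClientPolicy;

    public final class CxfTimeouts {

        // Applies the 2s connect timeout (and an illustrative receive timeout)
        // to a JAX-WS proxy created by CXF.
        public static void applyTimeouts(Object port) {
            Client client = ClientProxy.getClient(port);
            HTTPConduit conduit = (HTTPConduit) client.getConduit();

            HTTPClientPolicy policy = new HTTPClientPolicy();
            policy.setConnectionTimeout(2000);
            policy.setReceiveTimeout(10000);
            conduit.setClient(policy);
        }
    }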
The problem is that I am starting to see quite a large number of 'connect time out' errors in the logs, and the application also seems to be affecting the machine's network performance: curl from the command line takes >5 seconds just to establish a connection to the same web services. However, when I stop the service application, curl performance improves drastically to <5 ms.
How have other people tackled this situation using CXF? Did it work, or did they switch to a different library? If you were to start from scratch, how would you design for 'small payload, high frequency' transactions?
We once had a problem similar to yours, where requests took a very long time to complete. It is not a CXF issue; any web service stack will struggle with very frequent requests.
To solve this issue we implemented a JMS EJB message-driven bean. The flow was as follows: when users sent their requests to the web service, all requests were put into a JMS queue, so the response to the users came back very quickly and the request was left to be processed in the background. Later the users were able to see the state of their operations: whether they were still waiting to be processed, currently processing, completed successfully, or had failed to complete for some reason.
If I had to design a high-frequency transaction application, I would definitely use JMS for it.
Hope this helps.
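A rough sketch of the background side of that flow as a message-driven bean (the queue name, payload handling, and status tracking are only illustrative):

    import javax.ejb.ActivationConfigProperty;
    import javax.ejb.MessageDriven;
    import javax.jms.JMSException;
    import javax.jms.Message;
    import javax.jms.MessageListener;
    import javax.jms.TextMessage;

    // Consumes queued web service requests in the background, so the front-end
    // call can return to the user immediately after enqueueing the work.
    @MessageDriven(activationConfig = {
        @ActivationConfigProperty(propertyName = "destinationType", propertyValue = "javax.jms.Queue"),
        @ActivationConfigProperty(propertyName = "destination", propertyValue = "queue/requests")
    })
    public class RequestProcessorBean implements MessageListener {

        public void onMessage(Message message) {
            try {
                String payload = ((TextMessage) message).getText();
                // Call the downstream SOAP services here and record the outcome
                // (queued / processing / completed / failed) so users can check it later.
                process(payload);
            } catch (JMSException e) {
                throw new RuntimeException(e);
            }
        }

        private void process(String payload) {
            // ... actual work ...
        }
    }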

"Shutdown tomcat server" etcpp. message to all currently logged in users

I would like to inform all logged-in users that the server will shut down. This would be especially nice in an Ajax-style application (RIA).
What are the possible solutions? What are the best practice solutions?
There are two possible end scenarios:
Send a text $x to the server, and thus to all users. ("The server will not be available for some minutes.")
Send a key $y to the server, which will be used to generate a (custom) text for all users. ("SERVER_SHUTDOWN")
Environment: Tomcat (6/7), Spring 3+
Messaging to users: with polling or pseudo-pushing via an async servlet.
Ideas
1. Context.destroy(): Implementing a custom ContextListener's destroy
I don't think blocking within destroy() is a good solution, because we would have to wait about 5-10 seconds to make sure that all logged-in users receive the message.
2. JMX Beans
This would mean that any server service operation (start, stop) would have to invoke a special program which sends the message.
3. Any other messaging queues like AMQP or ActiveMQ
Like 2.
Unless the server shuts down regularly and the shutdown has a significant impact on users (e.g. they will lose any unsubmitted work - think halfway through editing a big post on a page), notifying users of the server shutdown won't really be of much benefit.
There are a couple of things you could do.
First, if the server is going to be shut down for planned maintenance, then you could include a message on web pages like:
Server will be unavailable Monday 22nd Aug 9pm - 6am for planned maintenance. Contact knalli#example.com for more information.
Second, before shutting down the server, redirect requests to a static holding page (just change your web server config). This holding page should have information on why the server is down and when it will be available again.
With both options, it's also important to plan server downtime. It's normal to have maintenance windows outside of normal working hours. Alternatively, if you have more than one server, you can cluster them. This allows you to take individual servers out of the cluster to perform maintenance without having any server downtime at all.
