About SpringBoot Tomcat acceptCount pool - java

I'm dealing with the Tomcat configuration on springboot.
Let's supposse i have the following configuration:
server:
tomcat:
min-spare-threads: ${min-tomcat-threads:20}
max-threads: ${max-tomcat-threads:20}
accept-count: ${accept-concurrent-queue:1}
max-connections: ${max-tomcat-connections:100}
I have a simple RestController with this code:
public String request(#Valid #RequestBody Info info) {
log.info("Thread sleeping");
Thread.sleep(8000);
return "OK";
}
Then i make the following test:
I send 200 HTTP request per second.
I check the log and as I expected I see 100 simultaneous executions and after 8 seconds I see the last one (queued).
Other executions are rejected.
The main problem that i have with this is that if i have a timeout control on client call (for example, 5 seconds), the queued operation will be processed on server anyways even if it was rejected on client.
I want to avoid this situation, so I tried:
server:
tomcat:
min-spare-threads: ${min-tomcat-threads:20}
max-threads: ${max-tomcat-threads:20}
accept-count: ${accept-concurrent-queue:0}
max-connections: ${max-tomcat-connections:100}
But this "0" is totally ignored (i think in this case it means "infinite").
So, my question is:
¿Is it possible to configure Tomcat to don't queue operations if the max-connections limit is reached?
Or maybe
¿Is it possible to configure Tomcat to reject any operation queued?
Thank you very much in advance.
Best regards.

The value of the acceptCount parameter is passed directly to the operating system: e.g. for UNIX-es it is passed to listen. Since an incoming connection is always put in the OS queue before the JVM accepts it, values lower than 1 make no sense. Tomcat explicitly ignores such values and keeps its default 100.
However, the real queue in Tomcat are the connections that where accepted from the OS queue, but are not being processed due to a lack of processing threads (maxThreads). You might have at most maxConnections - maxThreads + 1 such connections. In your case it's 81 connections waiting to be processed.

Related

Unable to execute HTTP request: Acquire operation took longer than the configured maximum time

I am running a batch job in AWS which consumes messages from a SQS queue and writes them to a Kafka topic using akka. I've created a Sqs Async Client with the following parameters:
private static SqsAsyncClient getSqsAsyncClient(final Config configuration, final String awsRegion) {
var asyncHttpClientBuilder = NettyNioAsyncHttpClient.builder()
.maxConcurrency(100)
.maxPendingConnectionAcquires(10_000)
.connectionMaxIdleTime(Duration.ofSeconds(60))
.connectionTimeout(Duration.ofSeconds(30))
.connectionAcquisitionTimeout(Duration.ofSeconds(30))
.readTimeout(Duration.ofSeconds(30));
return SqsAsyncClient.builder()
.region(Region.of(awsRegion))
.httpClientBuilder(asyncHttpClientBuilder)
.endpointOverride(URI.create("https://sqs.us-east-1.amazonaws.com/000000000000")).build();
}
private static SqsSourceSettings getSqsSourceSettings(final Config configuration) {
final SqsSourceSettings sqsSourceSettings = SqsSourceSettings.create().withCloseOnEmptyReceive(false);
if (configuration.hasPath(ConfigPaths.SqsSource.MAX_BATCH_SIZE)) {
sqsSourceSettings.withMaxBatchSize(10);
}
if (configuration.hasPath(ConfigPaths.SqsSource.MAX_BUFFER_SIZE)) {
sqsSourceSettings.withMaxBufferSize(1000);
}
if (configuration.hasPath(ConfigPaths.SqsSource.WAIT_TIME_SECS)) {
sqsSourceSettings.withWaitTime(Duration.of(20, SECONDS));
}
return sqsSourceSettings;
}
But, whilst running my batch job I get the following AWS SDK exception:
software.amazon.awssdk.core.exception.SdkClientException: Unable to execute HTTP request: Acquire operation took longer than the configured maximum time. This indicates that a request cannot get a connection from the pool within the specified maximum time. This can be due to high request rate.
The exception still seems to occur even after I try tweaking the parameters mentioned here:
Consider taking any of the following actions to mitigate the issue: increase max connections, increase acquire timeout, or slowing the request rate. Increasing the max connections can increase client throughput (unless the network interface is already fully utilized), but can eventually start to hit operation system limitations on the number of file descriptors used by the process. If you already are fully utilizing your network interface or cannot further increase your connection count, increasing the acquire timeout gives extra time for requests to acquire a connection before timing out. If the connections doesn't free up, the subsequent requests will still timeout. If the above mechanisms are not able to fix the issue, try smoothing out your requests so that large traffic bursts cannot overload the client, being more efficient with the number of times you need to call AWS, or by increasing the number of hosts sending requests
Has anyone run into this issue before?
I encountered the same issue, and I ended up firing 100 async batch requests then wait for those 100 to get cleared before firing another 100 and so on.

Azure eventhub Kafka org.apache.kafka.common.errors.TimeoutException for some of the records

Have a ArrayList containing 80 to 100 records trying to stream and send each individual record(POJO ,not entire list) to Kafka topic (event hub) . Scheduled a cron job like every hour to send these records(POJO) to event hub.
Able to see messages being sent to eventhub ,but after 3 to 4 successful run getting following exception (which includes several messages being sent and several failing with below exception)
Expiring 14 record(s) for eventhubname: 30125 ms has passed since batch creation plus linger time
Following is the config for Producer used,
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
props.put(ProducerConfig.ACKS_CONFIG, "1");
props.put(ProducerConfig.RETRIES_CONFIG, "3");
Message Retention period - 7
Partition - 6
using spring Kafka(2.2.3) to send the events
method marked as #Async where kafka send is written
#Async
protected void send() {
kafkatemplate.send(record);
}
Expected - No exception to be thrown from kafka
Actual - org.apache.kafka.common.errors.TimeoutException is been thrown
Prakash - we have seen a number of issues where spiky producer patterns see batch timeout.
The problem here is that the producer has two TCP connections that can go idle for > 4 mins - at that point, Azure load balancers close out the idle connections. The Kafka client is unaware that the connections have been closed so it attempts to send a batch on a dead connection, which times out, at which point retry kicks in.
Set connections.max.idle.ms to < 4mins – this allows Kafka client’s network client layer to gracefully handle connection close for the producer’s message-sending TCP connection
Set metadata.max.age.ms to < 4mins – this is effectively a keep-alive for the producer metadata TCP connection
Feel free to reach out to the EH product team on Github, we are fairly good about responding to issues - https://github.com/Azure/azure-event-hubs-for-kafka
This exception indicates you are queueing records at a faster rate than they can be sent. Once a record is added a batch, there is a time limit for sending that batch to ensure it has been sent within a specified duration. This is controlled by the Producer configuration parameter, request.timeout.ms. If the batch has been queued longer than the timeout limit, the exception will be thrown. Records in that batch will be removed from the send queue.
Please check the below for similar issue, this might help better.
Kafka producer TimeoutException: Expiring 1 record(s)
you can also check this link
when-does-the-apache-kafka-client-throw-a-batch-expired-exception/34794261#34794261 for reason more details about batch expired exception.
Also implement proper retry policy.
Note this does not account any network issues scanner side. With network issues you will not be able to send to either hub.
Hope it helps.

Connection timeouts with HikariCP

I have a Spring Boot (v2.0.8) application which makes use of a HikariCP (v2.7.9) Pool (connecting to MariaDB) configured with:
minimumIdle: 1
maximumPoolSize: 10
leakDetectionThreshold: 30000
The issue is that our production component, once every few weeks, is repeatedly throwing SQLTransientConnectionException " Connection is not available, request timed out after 30000ms...". The issue is that it never recovers from this and consistently throws the exception. A restart of the componnent is therefore required.
From looking at the HikariPool source code, it would seem that this is happening because every time it is calling connectionBag.borrow(timeout, MILLISECONDS) the poolEntry is null and hence throws the timeout Exception. For it to be null, the connection pool must have no free entries i.e. all PoolEntry in the sharedList are marked IN_USE.
I am not sure why the component would not recover from this since eventually I would expect a PoolEntry to be marked NOT_IN_USE and this would break the repeated Exceptions.
Possible scenarios I can think of:
All entries are IN_USE and the DB goes down temporarily. I would expect Exceptions to be thrown for the in-flight queries. Perhaps at this point the PoolEntry status is never reset and therefore is stuck at IN_USE. In this case I would have thought if an Exception is thrown the status is changed so that the connection can cleared from the pool. Can anyone confirm if this is the case?
A flood of REST requests are made to the component which in turn require DB queries to be executed. This fills the connection pool and therefore subsequent requests timeout waiting for previous requests to complete. This makes sense however I would expect the component to recover once the requests complete, which it is not.
Does anyone have an idea of what might be the issue here? I have tried configuring the various timeouts that are in the Hikari documentation but have had no luck diagnosing / resolving this issue. Any help would be appreciated.
Thanks!
Scenario 2 is most likely what is happening. I ran into the same issue when using it with cloud dataflow and receiving a large amount of connection requests. The only solution I found was to play with the config to find a combination that worked for my use case.
I'll leave you my code that works for 50-100 requests per second and wish you luck.
private static DataSource pool;
final HikariConfig config = new HikariConfig();
config.setMinimumIdle(5);
config.setMaximumPoolSize(50);
config.setConnectionTimeout(10000);
config.setIdleTimeout(600000);
config.setMaxLifetime(1800000);
config.setJdbcUrl(JDBC_URL);
config.setUsername(JDBC_USER);
config.setPassword(JDBC_PASS);
pool = new HikariDataSource(config);

Difference between web service connection timeout and request timeout

WebClientTestService service = new WebClientTestService() ;
int connectionTimeOutInMs = 5000;
Map<String,Object> context=((BindingProvider)service).getRequestContext();
context.put("com.sun.xml.internal.ws.connect.timeout", connectionTimeOutInMs);
context.put("com.sun.xml.internal.ws.request.timeout", connectionTimeOutInMs);
context.put("com.sun.xml.ws.request.timeout", connectionTimeOutInMs);
context.put("com.sun.xml.ws.connect.timeout", connectionTimeOutInMs);
Please share the differences mainly in connect timeout and request timeout.
I need to know the recommended values for these parameter values.
What are the criteria for setting timeout value ?
Please share the differences mainly in connect timeout and request timeout.
I need to know the recommended values for these parameter values.
Connect timeout (10s-30s): How long to wait to make an initial connection e.g. if service is currently unavailable.
Socket timeout (10s-20s): How long to wait if the service stops responding after data is sent.
Request timeout (30s-300s): How long to wait for the entire request to complete.
What are the criteria for setting timeout value ?
It depends a web user will get impatient if nothing has happened after 1-2 minutes, however a back end request could be allowed to run longer.
Also consider server resources are not released until request completes (or times out) - so if you have too many requests and long timeouts your server could run out of resources and be unable to service further requests.
request timeout should be set to a value greater then the expected time for the request to complete, perhaps with some room to allow occasionally slower performance under heavy loads.
connect/socket timeouts are often set lower as normally indicate a server problem where waiting another 10-15s usually won't resolve.

Read timeout on a web service, but the operation still completes?

I have a Java web service client running on Linux (using Axis 1.4) that invokes a series of web services operations performed against a Windows server. There are times that some transactional operations fail with this Exception:
java.net.SocketTimeoutException: Read timed out
However, the operation on the server is completed (even having no useful response on the client). Is this a bug of either the web service server/client? Or is expected to happen on a TCP socket?
This is the expected behavior, rather than a bug. The operation behind the web service doesn't know anything about your read timing out so continues processing the operation.
You could increase the timeout of the connection - if you are manually manipulating the socket itself, the socket.connect() method can take a timeout (in milliseconds). A zero should avoid your side timing out - see the API docs.
If the operation is going to take a long time in each case, you may want to look at making this asynchronous - a first request submits the operations, then a second request to get back the results, possibly with some polling to see when the results are ready.
If you think the operation should be completing in this time, have you access to the server to see why it is taking so long?
I had similar issue. We have JAX-WS soap webservice running on Jboss EAP6 (or JBOSS 7). The default http socket timeout is set to 60 seconds unless otherwise overridden in server or by the client. To fix this issue I changed our java client to something like this. I had to use 3 different combinations of propeties here
This combination seems to work as standalone java client or webservice client running as part of other application on other web server.
//Set timeout on the client
String edxWsUrl ="http://www.example.com/service?wsdl";
URL WsURL = new URL(edxWsUrl);
EdxWebServiceImplService edxService = new EdxWebServiceImplService(WsURL);
EdxWebServiceImpl edxServicePort = edxService.getEdxWebServiceImplPort();
//Set timeout on the client
BindingProvider edxWebserviceBindingProvider = (BindingProvider)edxServicePort;
BindingProvider edxWebserviceBindingProvider = (BindingProvider)edxServicePort;
edxWebserviceBindingProvider.getRequestContext().put("com.sun.xml.internal.ws.request.timeout", connectionTimeoutInMilliSeconds);
edxWebserviceBindingProvider.getRequestContext().put("com.sun.xml.internal.ws.connect.timeout", connectionTimeoutInMilliSeconds);
edxWebserviceBindingProvider.getRequestContext().put("com.sun.xml.ws.request.timeout", connectionTimeoutInMilliSeconds);
edxWebserviceBindingProvider.getRequestContext().put("com.sun.xml.ws.connect.timeout", connectionTimeoutInMilliSeconds);
edxWebserviceBindingProvider.getRequestContext().put("javax.xml.ws.client.receiveTimeout", connectionTimeoutInMilliSeconds);
edxWebserviceBindingProvider.getRequestContext().put("javax.xml.ws.client.connectionTimeout", connectionTimeoutInMilliSeconds);

Categories