How to set Sesame 2.8.0 RepositoryConnection timeout - java

I am trying to implement something like a circuit breaker for my Sesame connections to the back-end database. When the database is absent I want to know this after 2 seconds, rather than relying on the client's default timeouts. I could work around this with my own FutureTasks, wrapping the repository initialization and the connection acquisition in them. However, in the logs I can see that the Sesame client uses o.a.h.i.c.PoolingClientConnectionManager, which presumably comes with its own ExecutorService and default timeouts, and that would make my FutureTask solution pretty messy. Is there an easier way to set timeouts for the Sesame client?
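For reference, the FutureTask-style workaround I have in mind would look roughly like this, i.e. wrapping the repository initialization and connection acquisition in a task with its own deadline (a minimal sketch; the helper name, the cached thread pool and the 2-second budget are just illustrative):
import java.util.concurrent.*;

import org.openrdf.repository.Repository;
import org.openrdf.repository.RepositoryConnection;
import org.openrdf.repository.http.HTTPRepository;

public class TimedConnectionFactory {

    private static final ExecutorService EXECUTOR = Executors.newCachedThreadPool();

    // Initialize the repository and obtain a connection within the given budget,
    // or give up and cancel the task if it takes longer.
    public static RepositoryConnection connectWithTimeout(final String serverUrl, final String repositoryId,
            long timeout, TimeUnit unit) throws Exception {
        Future<RepositoryConnection> future = EXECUTOR.submit(new Callable<RepositoryConnection>() {
            @Override
            public RepositoryConnection call() throws Exception {
                Repository repo = new HTTPRepository(serverUrl, repositoryId);
                repo.initialize();
                return repo.getConnection();
            }
        });
        try {
            return future.get(timeout, unit); // e.g. connectWithTimeout(url, id, 2, TimeUnit.SECONDS)
        } catch (TimeoutException e) {
            future.cancel(true); // best effort; the underlying HTTP call may keep running in the pool
            throw e;
        }
    }
}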

You can set the query and update timeout, specifically, on the query/update object itself:
RepositoryConnection conn = ....;
...
TupleQuery query = conn.prepareTupleQuery(QueryLanguage.SPARQL, "SELECT ...");
query.setMaxExecutionTime(2); // maximum execution time, in seconds
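If your Sesame version also exposes setMaxExecutionTime on updates (I believe it is defined on the shared Operation interface in 2.8), the same pattern should work there; a short sketch with a placeholder update string:
Update update = conn.prepareUpdate(QueryLanguage.SPARQL, "DELETE WHERE { ?s ?p ?o }");
update.setMaxExecutionTime(2); // seconds, same as for queries
update.execute();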
However, if you want to set a general timeout for all API calls over HTTP, the only way to currently do that is by obtaining a reference to the HttpClient object and reconfiguring it:
HTTPRepository repo = ....;
AbstractHttpClient httpClient = (AbstractHttpClient) ((SesameClientImpl) repo.getSesameClient()).getHttpClient();
HttpParams params = httpClient.getParams();
params.setIntParameter(CoreConnectionPNames.SO_TIMEOUT, 2000);
httpClient.setParams(params);
As you can see, this is rather brittle (lots of explicit casts), and uses an approach that is deprecated in Apache HttpClient 4.4. So I don't exactly recommend this as a stable solution, but it should provide a workaround in the short term.
In the longer term, the Sesame dev team is working on more convenient access to the configuration of the HttpClient.

Related

java vertx jdbc sqlite: how to set PRAGMA synchronous=NORMAL

The Vert.x docs outline this as the normal way to connect to a database (https://vertx.io/docs/vertx-jdbc-client/java/):
String databaseFile = "sqlite.db";
JDBCPool pool = JDBCPool.pool(
    this.context.getVertx(),
    new JDBCConnectOptions()
        .setJdbcUrl("jdbc:sqlite:".concat(databaseFile)),
    new PoolOptions()
        .setMaxSize(1)
        .setConnectionTimeout(CONNECTION_TIMEOUT)
);
This application I am writing has interprocess communication, so I want to use WAL mode and synchronous=NORMAL to avoid heavy disk usage. The WAL pragma (PRAGMA journal_mode=WAL) is persisted in the database file itself, so I don't need to worry about it on application startup. However, the synchronous pragma is set per connection, so I need to set it when the application starts. Currently that looks like this:
// await this future
pool
    .preparedQuery("PRAGMA synchronous=NORMAL")
    .execute()
I can confirm that later on the synchronous pragma is set on the database connection.
pool
    .preparedQuery("PRAGMA synchronous")
    .execute()
    .onSuccess(rows -> {
        for (Row row : rows) {
            System.out.println("pragma synchronous is " + row.getInteger("synchronous"));
        }
    })
and since I enforce a single connection in the pool, this should be fine. However, I can't help but feel that there is a better way of doing this.
As a side note, I chose a single connection because SQLite is synchronous in nature: there is only ever one write happening at a time to the database. Creating write contention within a single application sounds detrimental rather than helpful, and I have designed my application to have as few concurrent writes within a single process as possible, though inter-process concurrency is real.
So these aren't definitive answers, but I have tried a few other options and want to outline them here.
For instance, vertx can instantiate a SQLClient without a pool:
JsonObject config = new JsonObject()
    .put("url", "jdbc:sqlite:" + databaseFile)
    .put("driver_class", "org.sqlite.jdbcDriver")
    .put("max_pool_size", 1);
Vertx vertx = Vertx.vertx();
SQLClient client = JDBCClient.create(vertx, config);
though this still uses a connection pool, so I have to make the same adjustment (a single connection in the pool) for the pragma to stick.
There is also a SQLiteConfig class in the SQLite JDBC library, but I have no idea how to hook it into the Vert.x JDBC wrappers:
org.sqlite.SQLiteConfig config = new org.sqlite.SQLiteConfig();
config.setSynchronous(org.sqlite.SQLiteConfig.SynchronousMode.NORMAL);
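For reference, outside of the Vert.x wrappers SQLiteConfig is normally applied by turning it into connection properties; a minimal plain-JDBC sketch (not wired into Vert.x, just to show what the class does):
org.sqlite.SQLiteConfig config = new org.sqlite.SQLiteConfig();
config.setSynchronous(org.sqlite.SQLiteConfig.SynchronousMode.NORMAL);
// the configured pragmas become connection properties and are applied to each new connection
java.sql.Connection conn = java.sql.DriverManager.getConnection(
    "jdbc:sqlite:" + databaseFile, config.toProperties());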
Is a pool required with Vert.x? I did try running the SQLite JDBC driver directly, without a Vert.x wrapper, but that ran into all kinds of SQLITE_BUSY exceptions.
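One more option I have not fully verified: if the Xerial sqlite-jdbc driver is in use and the driver version supports pragma query parameters appended to the JDBC URL, the pragma can be baked into the URL so that every connection the pool opens picks it up automatically. A sketch, reusing the setup from above (the ?journal_mode=WAL&synchronous=NORMAL suffix is the assumption here):
JDBCPool pool = JDBCPool.pool(
    this.context.getVertx(),
    new JDBCConnectOptions()
        // assumption: the driver parses ?key=value pairs after the file name into per-connection pragmas
        .setJdbcUrl("jdbc:sqlite:" + databaseFile + "?journal_mode=WAL&synchronous=NORMAL"),
    new PoolOptions()
        .setMaxSize(1)
        .setConnectionTimeout(CONNECTION_TIMEOUT)
);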

How do I make a Java function that retries a URL connection every half second if the connection takes too long?

So I have a problem with a Java program. Its basic functionality involves connecting to a web API for data. The function that does that is something like this:
public static Object getData(String sURL) throws IOException {
    URL url = new URL(sURL);
    URLConnection request = url.openConnection();
    request.connect();
    return request.getContent();
}
The code works fine as it is, but recently, after my house changed ISPs, I have found that sometimes the connections take an unreasonably long amount of time, something like 10 seconds or more in about 10% of attempts, while the other 90% takes only around 200ms. I have found it to be faster to ask my program to call the function again in a different thread than to wait for some of these connections to finally connect.
Therefore, I want to change the function so that if after 500ms, the connection did not establish, it would disconnect and a new connection would be attempted. How could I do this?
Somewhere online I read that HttpURLConnection might help, but I am not sure how.
URLConnection allows you to specify the connect and read timeout prior to calling connect():
https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/net/URLConnection.html#setConnectTimeout(int)
Sets a specified timeout value, in milliseconds, to be used when opening a communications link to the resource referenced by this URLConnection. If the timeout expires before the connection can be established, a java.net.SocketTimeoutException is raised. A timeout of zero is interpreted as an infinite timeout.
With 500ms timeout:
try {
    URLConnection request = url.openConnection();
    request.setConnectTimeout(500); // 500 ms
    request.connect();
    // on successful connection
} catch (SocketTimeoutException ex) {
    // on request timeout
}
This you can pack into a loop, but I recommend limiting the number of attempts made.
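For example, packed into a bounded loop it might look like the sketch below (getDataWithRetry, the cap of 5 attempts and the 500 ms budget are arbitrary choices for illustration, not anything the API requires):
public static Object getDataWithRetry(String sURL) throws IOException {
    URL url = new URL(sURL);
    int maxAttempts = 5; // arbitrary cap so we don't loop forever
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
        try {
            URLConnection request = url.openConnection();
            request.setConnectTimeout(500); // 500 ms connect timeout per attempt
            request.connect();
            return request.getContent();    // connected: return the data
        } catch (SocketTimeoutException ex) {
            // this attempt timed out; loop around and try again
        }
    }
    throw new IOException("Could not connect to " + sURL + " after " + maxAttempts + " attempts");
}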
Java's URLConnection doesn't have retry capabilities (as of Java 8), so the best way to achieve this is to use a standalone third-party library such as Apache HttpClient.
It is by far the most capable standalone third-party HTTP client as of 2020, and it is still maintained.
By default, Apache HttpClient uses the default implementation of org.apache.http.client.HttpRequestRetryHandler, which retries 3 times, but you can use a custom implementation instead.
The configuration might look like this (fully-qualified names are used for the example's sake):
org.apache.http.client.HttpClient httpClient = org.apache.http.impl.client.HttpClients.custom()
    .setRetryHandler(new YourCustomImplOfTheRetryHandlerClass())
    // other config
    .build();
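If you don't need custom retry logic, the stock DefaultHttpRequestRetryHandler can also be constructed with a different retry count; a small sketch in the same style:
org.apache.http.client.HttpClient httpClient = org.apache.http.impl.client.HttpClients.custom()
    // retry each request up to 5 times; 'true' also retries requests that have already been sent
    .setRetryHandler(new org.apache.http.impl.client.DefaultHttpRequestRetryHandler(5, true))
    .build();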
There is no way I can reproduce that problem using my ISP.
I suggest you dig deeper into the problem and find a better solution. Sending another request just doesn't seem good enough to me. Maybe try a different way to get the data and see if that works for you. Can't say for sure as I can't reproduce the problem.

What does setDefaultMaxPerRoute and setMaxTotal mean in HttpClient?

I am using Apache HttpClient in one of my projects. I am also using PoolingHttpClientConnectionManager along with my HttpClient.
I am confused about what these properties mean. I tried going through the documentation in the code, but I don't see any documentation around these variables, so I was not able to understand them:
setMaxTotal
setDefaultMaxPerRoute
setConnectTimeout
setSocketTimeout
setConnectionRequestTimeout
setStaleConnectionCheckEnabled
Below is how I am using them in my code:
RequestConfig requestConfig = RequestConfig.custom()
    .setConnectTimeout(5 * 1000)
    .setSocketTimeout(5 * 1000)
    .setStaleConnectionCheckEnabled(false)
    .build();
PoolingHttpClientConnectionManager poolingHttpClientConnectionManager = new PoolingHttpClientConnectionManager();
poolingHttpClientConnectionManager.setMaxTotal(200);
poolingHttpClientConnectionManager.setDefaultMaxPerRoute(20);
CloseableHttpClient httpClient = HttpClientBuilder.create()
    .setConnectionManager(poolingHttpClientConnectionManager)
    .setDefaultRequestConfig(requestConfig)
    .build();
Can anyone explain these properties to me so that I can understand and decide what values I should put in there? Also, are there any other properties I should use, apart from those shown above, to get better performance?
I am using http-client 4.3.1
Some parameters are explained at http://hc.apache.org/httpclient-3.x/preference-api.html
Others must be gleaned from the source.
setMaxTotal
The maximum number of connections allowed across all routes.
setDefaultMaxPerRoute
The maximum number of connections allowed for a route that has not been specified otherwise by a call to setMaxPerRoute. Use setMaxPerRoute when you know the route ahead of time and setDefaultMaxPerRoute when you do not.
setConnectTimeout
How long to wait for a connection to be established with the remote server before throwing a timeout exception.
setSocketTimeout
How long to wait for the server to respond to various calls before throwing a timeout exception. See http://docs.oracle.com/javase/1.5.0/docs/api/java/net/SocketOptions.html#SO_TIMEOUT for details.
setConnectionRequestTimeout
How long to wait when trying to checkout a connection from the connection pool before throwing an exception (the connection pool won't return immediately if, for example, all the connections are checked out).
setStaleConnectionCheckEnabled
Can be disabled for a slight performance improvement at the cost of potential IOExceptions. See http://hc.apache.org/httpclient-3.x/performance.html#Stale_connection_check
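To illustrate the difference between setDefaultMaxPerRoute and a route-specific override, the pool from the question could be tuned roughly like this (the host name and the numbers are made up for the example):
PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();
cm.setMaxTotal(200);           // at most 200 connections in the pool, across all routes
cm.setDefaultMaxPerRoute(20);  // at most 20 connections per route, unless overridden below
// raise the limit for one specific, heavily used host:
cm.setMaxPerRoute(new HttpRoute(new HttpHost("api.example.com", 443)), 50);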

Is this the right way to use AIMDBackoffManager to instantiate HttpClient?

Background:
I am using HttpClient (SolrJ) to connect to a Solr service. The question is not directly related to Solr though.
I bumped into the following issue when doing load testing:
Caused by: java.lang.IllegalStateException: Invalid use of BasicClientConnManager: connection still allocated.
SO answer suggesting the use of a pooled connection manager:
Invalid use of BasicClientConnManager: connection still allocated
Question:
I am using the PoolingClientConnectionManager as in the following code. Instead of manually throttling the connection pool size, I would like it to be managed by the AIMDBackoffManager. However, I see that the AIMDBackoffManager needs the connection pool as its constructor parameter.
public static final PoolingClientConnectionManager poolingConnectionManager = new PoolingClientConnectionManager();

public static DefaultHttpClient getHttpClient() {
    DefaultHttpClient httpClient = new DefaultHttpClient(poolingConnectionManager);
    httpClient.setBackoffManager(new AIMDBackoffManager(poolingConnectionManager));
    ...
    ...
}
I googled a fair bit but was unable to find any examples of BackoffManager usage. So this is what I did, but I am not excited about passing the connection manager twice to the DefaultHttpClient. Or should I not be worried, considering that the first time I am passing it to the HttpClient and the second time to the BackoffManager?
I am using httpclient-4.2.3
I ventured into these deep waters as well. I have been investigating how to use ServiceUnavailableRetryStrategy, which seems to fail because of the BackoffManager in my case. My impression is that this is not finished functionality, as I can't find any usage of it online and there is not much in the HttpClient source code either.
The AIMDBackoffManager constructor takes a ConnPoolControl (which the connection manager implements). Looking at this interface, you'll see it only returns route-specific statistics of the pool, which is what the BackoffManager uses to perform its tasks.
So you should not be worried about passing the connection manager twice while building the client, just be aware that AIMDBackoffManager acquires a lock on the connection manager in its backOff and probe implementations, which you can see in the source.
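Note also that the BackoffManager only kicks in when a ConnectionBackoffStrategy signals a backoff; as far as I can tell from the source, the stock DefaultBackoffStrategy does so on 503 responses and on connection failures. A sketch of the wiring, building on the code from the question:
public static final PoolingClientConnectionManager poolingConnectionManager = new PoolingClientConnectionManager();

public static DefaultHttpClient getHttpClient() {
    DefaultHttpClient httpClient = new DefaultHttpClient(poolingConnectionManager);
    // adjusts the pool's per-route limits based on backoff/probe signals
    httpClient.setBackoffManager(new AIMDBackoffManager(poolingConnectionManager));
    // decides when a backoff should be signalled (e.g. 503 responses, connect failures)
    httpClient.setConnectionBackoffStrategy(new DefaultBackoffStrategy());
    return httpClient;
}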

does python's urllib2 do connection pooling?

Really, what I'm wondering is: is Python's urllib2 more like Java's HttpURLConnection, or more like Apache's HttpClient? And ultimately I'm wondering whether urllib2 scales when used in an HTTP server, or whether there is some alternative library that is used when performance is an issue (as is the case in the Java world).
To expand on my question a bit:
Java's HttpURLConnection internally holds one connection open per host, and does pipelining. So if you do the following concurrently across threads, it won't perform well:
HttpURLConnection cxn = (HttpURLConnection) new URL("http://www.google.com").openConnection();
InputStream is = cxn.getInputStream();
By comparison, apache's HttpClient can be initialized with a connection pool, like this:
// this instance can be a singleton and shared across threads safely:
HttpClient client = new HttpClient();
MultiThreadedHttpConnectionManager cm = new MultiThreadedHttpConnectionManager();
HttpConnectionManagerParams p = new HttpConnectionManagerParams();
p.setMaxConnectionsPerHost(HostConfiguration.ANY_HOST_CONFIGURATION, 20);
p.setMaxTotalConnections(100);
p.setConnectionTimeout(100);
p.setSoTimeout(250);
cm.setParams(p);
client.setHttpConnectionManager(cm);
The important part in the example above being that the number of total connections and the per-host connections are configurable.
In a comment urllib3 was mentioned, but I can't tell from reading the docs if it allows a per-host max to be set.
As of Python 2.7.14rc1, No.
For urllib, urlopen() eventually calls httplib.HTTP, which creates a new instance of HTTPConnection. HTTPConnection is tied to a socket and has methods for opening and closing it.
For urllib2, HTTPHandler does something similar and creates a new instance of HTTPConnection.
