Java gRPC server inbound vs outbound threads - java

So I have this grpc Java server:
@Bean(initMethod = "start", destroyMethod = "shutdown")
public Server bodyShopGrpcServer(@Autowired BodyShopServiceInt bodyShopServiceInt) {
    return ServerBuilder.forPort(bodyShopGrpcServerPort)
        .executor(Executors.newFixedThreadPool(12))
        .addService(new BodyShopServiceGrpcGw(bodyShopServiceInt))
        .build();
}
..and this client:
long overallStart = System.nanoTime();
int iterations = 10000;
List<Long> results = new CopyOnWriteArrayList<>();
ExecutorService executorService = Executors.newFixedThreadPool(bodyShopGrpcThreadPoolSize);
ManagedChannel channel =
InProcessChannelBuilder.forName("bodyShopGrpcInProcessServer")
.executor(executorService)
.build();
BodyShopServiceGrpc.BodyShopServiceStub bodyShopServiceStub =
BodyShopServiceGrpc.newStub(channel);
for (int i = 0; i < iterations; i++) {
    long start = System.nanoTime();
    StreamObserver<MakeBodyResponse> responseObserver =
        new StreamObserver<>() {
            @Override
            public void onNext(MakeBodyResponse makeBodyResponse) {
                long stop = System.nanoTime();
                results.add(stop - start);
            }
            @Override
            public void onError(Throwable throwable) {
                Status status = Status.fromThrowable(throwable);
                logger.error("Error status: {}", status);
            }
            @Override
            public void onCompleted() {}
        };
    bodyShopServiceStub.makeBody(
        MakeBodyRequest.newBuilder()
            .setBody(CarBody.values()[random.nextInt(CarBody.values().length)].toString())
            .build(),
        responseObserver);
}
channel
.shutdown()
.awaitTermination(
10, TimeUnit.SECONDS);
long sum = results.stream().reduce(0L, Math::addExact);
BigDecimal avg =
BigDecimal.valueOf(sum).divide(BigDecimal.valueOf(iterations), RoundingMode.HALF_DOWN);
long overallStop = System.nanoTime();
This gives me the average round-trip latency and the overall time for a batch of 10000 calls.
What bothers me is that the average latency is ~30-50% of the overall batch time.
I assume this is because all of the server threads are assigned to serving client requests, so there's no thread left in the pool to serve callbacks.
Is there a way to tune this? As far as I can tell, it's not possible to set different thread pools for requests and callbacks.
I know there's a streaming API in gRPC; is that the preferred (or only) way to reduce round-trip latency?

Thx @Eric Anderson, it did not occur to me.
I plotted the results and you're absolutely right:
(figure: latency plot)
My assumption that the callback was waiting for an available thread was wrong. All the requests enter the system at the same time, so they all start being measured at the same time. In effect I was comparing sync and async values: this way of measuring works for sync calls, but for async calls it's clearly wrong.
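To make the effect concrete, here is a minimal simulation sketch. It is not the gRPC code above: the class name, the fixed per-call service time and the idea of modelling the server as a set of identical workers are all assumptions made for illustration. Every call is "submitted" at t = 0, so each measured latency includes the time the call spent queued behind earlier calls.

```java
import java.util.PriorityQueue;

class BatchLatencyDemo {
    /**
     * Simulates `requests` calls that are all submitted at t = 0 and served by
     * `workers` identical workers, each call taking `serviceNanos`.
     * Returns {average latency, total batch time} in nanoseconds.
     */
    static long[] simulate(int requests, int workers, long serviceNanos) {
        // Each entry is the time at which one worker becomes free again.
        PriorityQueue<Long> freeAt = new PriorityQueue<>();
        for (int w = 0; w < workers; w++) {
            freeAt.add(0L);
        }
        long latencySum = 0;
        long batchEnd = 0;
        for (int i = 0; i < requests; i++) {
            long start = freeAt.poll();          // the call waits until a worker is free
            long finish = start + serviceNanos;  // then it is actually served
            freeAt.add(finish);
            latencySum += finish;                // measured from t = 0, so queueing time is included
            batchEnd = Math.max(batchEnd, finish);
        }
        return new long[] {latencySum / requests, batchEnd};
    }

    public static void main(String[] args) {
        // 10000 calls, 12 workers, 1 ms per call (hypothetical numbers)
        long[] r = simulate(10_000, 12, 1_000_000L);
        System.out.printf("avg latency = %d ns, batch = %d ns, ratio = %.2f%n",
                r[0], r[1], (double) r[0] / r[1]);
    }
}
```

With equal-cost calls the average "latency" measured this way comes out close to half of the batch time, which is in the same range as the 30-50% observed above, even though no callback ever waits for a thread.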

Related

Apache HttpClient 4.5 or Java HttpsURLConnection - Many instances execute GET, POST with Threads in parallel, time increase

I want to run many GET and POST requests. In this example I use GET with the same URL for every task, but later each task will use a different URL.
What I find is that the time increases with the number of tasks and threads used:
num = 1 -> Done in 1846
num = 10 -> Done in 2114
num = 100 -> Done in 7204
num = 200 -> Done in 13720
If I have just 1 task I use 1 thread; with 10 tasks I use 10 threads, and so on.
I don't understand the time increase. If 1 task executed with 1 thread takes approx. 1 second, then I would expect 10 tasks executed with 10 threads to take about the same 1 second, because my 4-core CPU can execute many threads concurrently.
Is it possible that, because I have only 1 network device, the requests don't get sent in parallel but somehow in sequence?
// Amount of task and threads
int num = 10;
// Create num instances of the task
List<MyCallable> tasks = new ArrayList<>();
for (int i = 0; i < num; i++) {
    tasks.add(new MyCallable());
}
ExecutorService executor = Executors.newFixedThreadPool(num);
List<Future<Void>> invokeAll = null;
long started = System.currentTimeMillis();
try {
    invokeAll = executor.invokeAll(tasks);
} catch (InterruptedException ex) {
    Thread.currentThread().interrupt(); // restore the interrupt flag instead of swallowing it
}
long ended = System.currentTimeMillis();
System.out.println("Done in " + (ended - started));
executor.shutdown();
private class MyCallable implements Callable<Void> {
    public MyCallable() {}
    @Override
    public Void call() throws Exception {
        int statusCode = sendGet();
        return null;
    }
    private int sendGet() throws Exception {
        // try-with-resources closes both the response and the client, even on failure
        try (CloseableHttpClient closeableHttpClient = HttpClients.createDefault();
                CloseableHttpResponse closeableHttpResponse =
                        closeableHttpClient.execute(new HttpGet("https://bing.com"))) { // https://www.google.com
            return closeableHttpResponse.getStatusLine().getStatusCode();
        }
    }
}

Repeating task with Handler takes more time than the interval

I get my data from the server and have to update it every x seconds.
I do this using the Handler's postDelayed function.
private long mInterval = 10000;
Runnable mStatusChecker = new Runnable() {
    @Override
    public void run() {
        try {
            takeServerResponse(); // duration varies
        } catch (Exception e) {
            itsRunning = false;
        } finally {
            if (mHandler != null) {
                mHandler.postDelayed(mStatusChecker, mInterval);
            }
        }
    }
};
Sometimes it may take more than X seconds to get new data.
What can I do in this situation?
If the interval needs to be increased, how do I determine when to do so?
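You can measure how long the job takes and pass postDelayed a delay reduced by that duration.
For example:
long startTime = System.currentTimeMillis();
// your job
long duration = System.currentTimeMillis() - startTime;
// Keep mInterval itself unchanged; shorten only the next delay, and never go below zero
long delay = Math.max(0, mInterval - duration);
mHandler.postDelayed(mStatusChecker, delay);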
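The delay calculation can be isolated in a small helper so that it is easy to check on its own. This is only a sketch: the class and method names are hypothetical, and it assumes that when the job overruns the interval the next run should start immediately rather than be skipped.

```java
class FixedRateDelay {
    /** Delay to pass to postDelayed so that runs start roughly every intervalMillis. */
    static long nextDelay(long intervalMillis, long durationMillis) {
        // If the job took longer than the interval, run again right away
        // instead of passing a negative delay.
        return Math.max(0L, intervalMillis - durationMillis);
    }

    public static void main(String[] args) {
        System.out.println(nextDelay(10_000, 3_000));  // job took 3 s  -> 7000
        System.out.println(nextDelay(10_000, 12_000)); // job took 12 s -> 0
    }
}
```

Inside the Runnable this would be used as mHandler.postDelayed(mStatusChecker, nextDelay(mInterval, duration)), leaving mInterval itself untouched between runs.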
Your handler calls the server every 10 seconds, but how long the response takes depends on your internet speed; that is why a run can take longer than the interval.

How to set a timeout threshold to wait/sleep in java?

My task is simple: download a file from a URL using Selenium. I have got as far as clicking the download link. Now I want to wait until the file is downloaded. I use the following and it works:
do {
Thread.sleep(60000);
}
while ((downloadeBuild.length()/1024) < 138900);
Now the challenge is how long to wait. Can I set some threshold? One thing I can think of is a counter in the do-while loop that stops after 10 iterations or so. Is there any other way in Java? I have nothing else to do until the file is downloaded.
How about this?
I don't think a timeout is a robust choice here, since the download time is unpredictable.
You can instead use CompletableFuture: supplyAsync to do the downloading, thenApply to do the processing/converting, and join to retrieve the result, as follows:
import java.util.Random;
import java.util.concurrent.CompletableFuture;
public class SimpleCompletableFuture {
    public static void main(String... args) {
        testDownload();
    }
    private static void testDownload() {
        CompletableFuture<String> future = CompletableFuture.supplyAsync(() -> downloadMock())
                .thenApply(SimpleCompletableFuture::processDownloaded);
        System.out.println(future.join());
    }
    private static String downloadMock() {
        try {
            // Mock the download time: 1000-1999 ms (a bare nextInt() could be negative)
            Thread.sleep(new Random().nextInt(1000) + 1000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "Downloaded";
    }
    private static String processDownloaded(String fileMock) {
        System.out.println("Processing " + fileMock);
        System.out.println("Done!");
        return "Processed";
    }
}
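If an upper bound on the wait is still wanted, the same future can be read with a timeout instead of join. The sketch below is an illustration only: the class name, the fallback value and the choice to return a fallback rather than rethrow are assumptions, not part of the answer above.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class BoundedJoin {
    /** Waits for the future at most timeoutMillis; returns fallback if it is not done in time. */
    static String getOrFallback(CompletableFuture<String> future, long timeoutMillis, String fallback) {
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Marks the future as cancelled; this does not interrupt a task that is already running.
            future.cancel(true);
            return fallback;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return fallback;
        } catch (ExecutionException e) {
            throw new IllegalStateException("download failed", e.getCause());
        }
    }

    public static void main(String[] args) {
        CompletableFuture<String> done = CompletableFuture.completedFuture("Downloaded");
        CompletableFuture<String> stuck = new CompletableFuture<>(); // never completes
        System.out.println(getOrFallback(done, 100, "Timed out"));   // Downloaded
        System.out.println(getOrFallback(stuck, 100, "Timed out"));  // Timed out
    }
}
```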
You can use Guava's Stopwatch:
Stopwatch stopwatch = Stopwatch.createStarted();
while ((downloadeBuild.length() / 1024) < 138900 && stopwatch.elapsed(TimeUnit.SECONDS) < 60) {
    Thread.sleep(1000); // poll once per second instead of spinning
}
If what you want is a timeout, you could try the code below:
long timeout = 10 * 60 * 1000;
long start = System.currentTimeMillis();
while (System.currentTimeMillis() - timeout <= start) {
    // Not timed out yet, wait
}
// Timed out, continue
This pattern is quite common in Java libraries.
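The two ideas above (check a condition, give up after a deadline) can be combined into one helper. This is a minimal sketch under stated assumptions: the class and method names are hypothetical, and the condition passed in main is a stand-in for the real check, such as the file size test from the question.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

class PollWithDeadline {
    /**
     * Polls condition every pollMillis until it is true or timeoutMillis has elapsed.
     * Returns true if the condition became true, false on timeout or interruption.
     */
    static boolean waitUntil(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out
            }
            try {
                Thread.sleep(pollMillis); // sleep between checks instead of spinning
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Hypothetical condition standing in for "the file has reached the expected size".
        boolean ok = waitUntil(() -> System.currentTimeMillis() - start >= 200, 2_000, 50);
        System.out.println(ok);
    }
}
```

For the download case the call would look like waitUntil(() -> downloadeBuild.length() / 1024 >= 138900, timeoutMillis, pollMillis), with the timeout chosen by the caller.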

Apache Curator - Zookeeper connection loss exception, possible memory leak

I have been working on a process that continuously monitors a distributed atomic long counter. It checks the counter every minute using the getCounter method of the class below. In fact, I have multiple threads running, each of which monitors a different counter (a DistributedAtomicLong) stored in ZooKeeper nodes. Each thread specifies the path of its counter via the parameters of getCounter.
public class TagserterZookeeperManager {
public enum ZkClient {
COUNTER("10.11.18.25:2181"); // Integration URL
private CuratorFramework client;
private ZkClient(String servers) {
Properties props = TagserterConfigs.ZOOKEEPER.getProperties();
String zkFromConfig = props.getProperty("servers", "");
if (zkFromConfig != null && !zkFromConfig.isEmpty()) {
servers = zkFromConfig.trim();
}
ExponentialBackoffRetry exponentialBackoffRetry = new ExponentialBackoffRetry(1000, 3);
client = CuratorFrameworkFactory.newClient(servers, exponentialBackoffRetry);
client.start();
}
public CuratorFramework getClient() {
return client;
}
}
public static String buildPath(String ... node) {
StringBuilder sb = new StringBuilder();
for (int i = 0; i < node.length; i++) {
if (node[i] != null && !node[i].isEmpty()) {
sb.append("/");
sb.append(node[i]);
}
}
return sb.toString();
}
public static DistributedAtomicLong getCounter(String taskType, int hid, String jobId, String countType) {
String path = buildPath(taskType, hid+"", jobId, countType);
Builder builder = PromotedToLock.builder().lockPath(path + "/lock").retryPolicy(new ExponentialBackoffRetry(10, 10));
DistributedAtomicLong count = new DistributedAtomicLong(ZkClient.COUNTER.getClient(), path, new RetryNTimes(5, 20), builder.build());
return count;
}
}
From within the threads, this is how I am calling this method:
DistributedAtomicLong counterTotal = TagserterZookeeperManager
.getCounter("testTopic", hid, jobId, "test");
Now it seems like after the threads have run for a few hours, at one stage I start getting the following org.apache.zookeeper.KeeperException$ConnectionLossException exception inside the getCounter method where it tries to read the count:
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /contentTaskProd
at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1045)
at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1073)
at org.apache.curator.utils.ZKPaths.mkdirs(ZKPaths.java:215)
at org.apache.curator.utils.EnsurePath$InitialHelper$1.call(EnsurePath.java:148)
at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107)
at org.apache.curator.utils.EnsurePath$InitialHelper.ensure(EnsurePath.java:141)
at org.apache.curator.utils.EnsurePath.ensure(EnsurePath.java:99)
at org.apache.curator.framework.recipes.atomic.DistributedAtomicValue.getCurrentValue(DistributedAtomicValue.java:254)
at org.apache.curator.framework.recipes.atomic.DistributedAtomicValue.get(DistributedAtomicValue.java:91)
at org.apache.curator.framework.recipes.atomic.DistributedAtomicLong.get(DistributedAtomicLong.java:72)
...
I keep getting this exception for a while from then on, and I get the feeling it is causing internal memory leaks that eventually cause an OutOfMemoryError, at which point the whole process bails out. Does anybody have any idea what the reason for this could be? Why would ZooKeeper suddenly start throwing the connection loss exception? After the process bails out, I can manually connect to ZooKeeper through another small console program that I have written (also using Curator), and all looks good there.
To monitor a node in ZooKeeper using Curator you can use NodeCache. This won't solve your connection problems, but instead of polling the node once a minute you get a push event when it changes.
In my experience, NodeCache handles disconnection and reconnection quite well.

OpenLDAP 2.3/2.4 concurrency issue

We are experiencing concurrency issues with authentication and other read requests against OpenLDAP (versions tested: 2.3.43 and 2.4.39).
When making 100 concurrent bind requests the test code takes around 150 milliseconds. Increasing this to 1000 concurrent requests sees the time taken increase to 9303 milliseconds.
So for a 10x increase in concurrent requests we are seeing a 62x increase in time taken.
Is this expected behaviour? Or is there something missing in our OpenLDAP server configuration or Linux host configuration?
NOTE: We have run this test code against a Windows-based Apache DS server 2.0.0 (same tree structure, etc.) for comparison, and against that server the performance results were what we would normally expect (i.e. 100x takes ~80ms, 1000x takes ~400ms, 10,000x takes ~2700ms).
Settings in slapd.conf:
cachesize 100000
idlcachesize 300000
database bdb
suffix "dc=company,dc=com"
rootdn "uid=admin,ou=system"
rootpw secret
directory /var/lib/ldap
index objectClass eq,pres
index ou,cn,mail,surname,givenname eq,pres,sub
index uidNumber,gidNumber,loginShell eq,pres
index uid,memberUid eq,pres,sub
index nisMapName,nisMapEntry eq,pres,sub
sizelimit 100000
loglevel 256
Test code:
import java.util.ArrayList;
import javax.naming.NamingException;
import javax.naming.directory.DirContext;
import org.springframework.ldap.core.LdapTemplate;
import org.springframework.ldap.core.support.LdapContextSource;
public class DirectoryServiceMain {
public static void main(String[] args) {
int concurrentThreadCount = 100;
LdapContextSource ctx = new LdapContextSource();
ctx.setUrls(new String [] { "ldap://ldap1.dev.company.com:389/", "ldap://ldap1.dev.company.com:389/" });
ctx.setBase("dc=company,dc=com");
ctx.setUserDn("uid=admin,ou=system");
ctx.setPassword("secret");
ctx.setPooled(true);
ctx.setCacheEnvironmentProperties(false);
LdapTemplate template = new LdapTemplate();
template.setContextSource(ctx);
long startTime = System.currentTimeMillis();
ArrayList<Thread> threads = new ArrayList<>();
for(int i = 0; i < concurrentThreadCount; i++) {
Thread t = new Thread(
() -> {
DirContext context = template.getContextSource().getContext("uid=username,dc=users,uid=office,dc=suborganisations,uid=ABC,dc=organisations,dc=company,dc=com",
"password");
try {
context.close();
} catch(NamingException e) {}
});
t.start();
threads.add(t);
}
// Wait for every bind thread to finish; join() blocks without adding
// per-thread polling sleeps to the measured time
for (Thread t : threads) {
    try {
        t.join();
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
    }
}
long endTime = System.currentTimeMillis();
System.out.println("Total time: " + (endTime - startTime));
}
}
ulimit -n
131072
* UPDATE *
If a slight delay (e.g. Thread.sleep(1)) is added after each t.start(), then processing time of n concurrent threads drops considerably.
A longer answer: if you are using BDB as the database backend, you will likely see scaling problems above a certain number of concurrent requests. BDB has its own DB_CONFIG file that you can tune for better performance. You could also consider changing to MDB, which was written specifically for OpenLDAP and scales better with minimal configuration.
You should also consider limiting the number of concurrent connections by setting the JNDI LDAP connection pool sizes on the LdapContextSource:
Map<String, Object> map = new HashMap<>();
map.put("com.sun.jndi.ldap.connect.pool.initsize", 2);
map.put("com.sun.jndi.ldap.connect.pool.maxsize", 2);
map.put("com.sun.jndi.ldap.connect.pool.prefsize", 2);
ctx.setBaseEnvironmentProperties(map);
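One caveat worth checking: as far as I know, the JDK's JNDI LDAP provider documents the pool size settings as JVM-wide system properties rather than per-context environment entries, so if the map above has no visible effect, setting them as system properties is an alternative. The sketch below assumes that behaviour; the class and method names are hypothetical, and the call has to run before the first pooled LDAP connection is created.

```java
class LdapPoolSettings {
    /** Applies the JNDI LDAP pool limits as JVM system properties (string values). */
    static void apply(int initSize, int prefSize, int maxSize) {
        System.setProperty("com.sun.jndi.ldap.connect.pool.initsize", Integer.toString(initSize));
        System.setProperty("com.sun.jndi.ldap.connect.pool.prefsize", Integer.toString(prefSize));
        System.setProperty("com.sun.jndi.ldap.connect.pool.maxsize", Integer.toString(maxSize));
    }

    public static void main(String[] args) {
        apply(2, 2, 2);
        System.out.println(System.getProperty("com.sun.jndi.ldap.connect.pool.maxsize"));
    }
}
```

The same values can be passed on the command line instead, for example -Dcom.sun.jndi.ldap.connect.pool.maxsize=2.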
