Performance issues in Play 2.2

Performance issues in Play 2.2 - java

I put Thread.sleep(100) in my Play 2.2 Controller to mimic a back-end operation.
#BodyParser.Of(BodyParser.Json.class)
public Promise<Result> d() {
Promise<JsonNode> promiseOfJson = Promise.promise(new Function0<JsonNode>() {
public JsonNode apply() throws Exception {
ABean aBean = new ABean();
aBean.setStatus("success");
aBean.setMessage("Hello World...");
Thread.sleep(100);
return Json.toJson(aBean);
}
});
return promiseOfJson.map(new Function<JsonNode, Result>() {
public Result apply(JsonNode jsonNode) {
return ok(jsonNode);
}
});
}
I have my application.conf configured as
#Play Internal Thread Pool
internal-threadpool-size=4
#Scala Iteratee thread pool
iteratee-threadpool-size=4
# Akka Configuration for making parallelism
# ~~~~~
# Play's Default Thread Pool
play {
akka {
akka.loggers = ["akka.event.Logging$DefaultLogger", "akka.event.slf4j.Slf4jLogger"]
loglevel = ERROR
actor {
deployment {
/actions {
router = round-robin
nr-of-instances = 100
}
/promises {
router = round-robin
nr-of-instances = 100
}
}
retrieveBodyParserTimeout = 5 seconds
actions-dispatcher = {
fork-join-executor {
parallelism-factor = 100
parallelism-max = 100
}
}
promises-dispatcher = {
fork-join-executor {
parallelism-factor = 100
parallelism-max = 100
}
}
default-dispatcher = {
fork-join-executor {
# No. of minimum threads = 8 ; at the most 64 ; or other wise 3 times the no. of processors available on the system
# parallelism-factor = 1.0
# parallelism-max = 64
# parallelism-min = 8
parallelism-factor = 3.0
parallelism-max = 64
parallelism-min = 8
}
}
}
}
}
I ran ab command as follows:
ab -n 900 -c 1 http://localhost:9000/a/b/c/d
The result shows only 9 requests being handled per second.
Can application.conf be tweaked for better performance ? If yes , how ?

That is caused by the sleep(100) call. Even if all the rest of your code was infinitely fast, you would still only get 10 requests/second with only 1 test thread.

Related

Optimal orchestration of list of network calls and processing tasks in Java

I have the following workflow:
There are n records that need to be retrieved over the network and subsequently n expensive computations that need to be done on each. Put in code, this would look like:
List<Integer> ids = {1,2,....n};
ids.forEach(id -> {
Record r = RetrieveRecord(id); // Blocking IO
ProcessRecord(r); // CPU Intensive
})
I would like to convert the blocking part into async so that the time is minimized with a single thread- essentially, by ensuring that record i+1 is being retrieved when record i is being processed. So that the execution would look like:
Retrieve(1).start()
Retrieve(1).onEnd(() -> { start Retrieve(2), Process(1) })
Retrieve(2).onEnd(() -> { start Retrieve(3), Process(2) })
....
Now I can come up with the naive way to implement this with a List<> and CompletableFuture, but this would require me to handle the first record differently.
Is there a more elegant way of solving this with something like reactive streams?
A solution that would maybe let me easily configure how many records Process() can trail behind Retreive()?

So you have N tasks and want to run them in parallel but no more than K tasks simultaneously. Most natural way is to have a task generator and a permission counter with K permissions initially. Task generator creates K tasks and waits for more permissions. Each permission is owned by some task and is returned when the task ends. Standard permission counter in Java is class java.util.concurrent.Semaphore:
List<Integer> ids = {1,2,....n};
Semaphore sem = new Semaphore(K);
ids.forEach(id -> {
sem.aquire();
CompletableFuture<Data> fut = Retrieve(id);
fut.thenRun(sem::release);
fut.thenAcceptAsync(this::ProcessRecord, someExecutor);
})
Since the task generator occupies only one thread, there is little sense to make it asynchronous. If, however, you don't want to use a dedicated thread for task generator and want to implement asynchronous solution, then the main question is what class can play the role of asynchronous permission counter. You have 3 options:
use implicit asynchronous permission counter which is a part of reactive streams, found in RxJava, project Reactor etc.
use explicit asynchronous semaphore org.df4j.core.boundconnector.permitstream.Semafor included in my asynchronous library df4j
make it yourself

Solution using df4j, with explicit asynchronous semaphore:
import org.df4j.core.boundconnector.permitstream.Semafor;
import org.df4j.core.tasknode.Action;
import org.df4j.core.tasknode.messagestream.Actor;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ForkJoinPool;
public class AsyncSemaDemo extends Actor {
List<Integer> ids = Arrays.asList(1, 2, 3, 4, 5);
Semafor sema = new Semafor(this, 2);
Iterator<Integer> iter = ids.iterator();
int tick = 100; // millis
CountDownLatch done = new CountDownLatch(ids.size());
long start = System.currentTimeMillis();
private void printClock(String s) {
long ticks = (System.currentTimeMillis() - start)/tick;
System.out.println(Long.toString(ticks) + " " + s);
}
CompletableFuture<Integer> Retrieve(Integer e) {
return CompletableFuture.supplyAsync(() -> {
printClock("Req " + e + " started");
try {
Thread.sleep(tick); // Network
} catch (InterruptedException ex) {
}
printClock(" Req " + e + " done");
return e;
}, executor);
}
void ProcessRecord(Integer s) {
printClock(" Proc " + s + " started");
try {
Thread.sleep(tick*2); // Compute
} catch (InterruptedException ex) {
}
printClock(" Proc " + s + " done");
}
#Action
public void act() {
if (iter.hasNext()) {
CompletableFuture<Integer> fut = Retrieve(iter.next());
fut.thenRun(sema::release);
fut.thenAcceptAsync(this::ProcessRecord, executor)
.thenRun(done::countDown);
} else {
super.stop();
}
}
public static void main(String[] args) throws InterruptedException {
AsyncSemaDemo asyncSemaDemo = new AsyncSemaDemo();
asyncSemaDemo.start(ForkJoinPool.commonPool());
asyncSemaDemo.done.await();
}
}
its log should be:
0 Req 1 started
0 Req 2 started
1 Req 1 done
1 Proc 1 started
1 Req 3 started
1 Req 2 done
1 Proc 2 started
1 Req 4 started
2 Req 3 done
2 Proc 3 started
2 Req 5 started
2 Req 4 done
2 Proc 4 started
3 Proc 1 done
3 Req 5 done
3 Proc 5 started
3 Proc 2 done
4 Proc 3 done
4 Proc 4 done
5 Proc 5 done
Note how this solution is close to my previous answer with standard java.util.concurrent.Semaphore.

Here's what I finally came up with that seems to get the job done:
Flowable.just(1,2,3,4,5,6) // Completes in 1 + 6 * 3 = 19 secs
.concatMapEager(v->
Flowable.just(v)
.subscribeOn(Schedulers.io())
.map( e->{
System.out.println(getElapsed("Req " + e + " started");
Thread.sleep(1000); // Network: 1 sec
System.out.println(getElapsed("Req " + e + " done");
return e;
}, requestsOnWire, 1) // requestsOnWire = K = 2
.blockingSubscribe(new DisposableSubscriber<Integer>() {
#Override
protected void onStart() {
request(1);
}
#Override
public void onNext(Integer s) {
request(1);
System.out.println("Proc " + s + " started");
try {
Thread.sleep(3000); // Compute: 3 secs
System.out.println("Proc " + s + " done");
} catch (InterruptedException e) {
e.printStackTrace();
}
}
#Override
public void onError(Throwable t) {
}
#Override
public void onComplete() {
}
});
Below is the execution order. Note that at any given point of time, there are 1 record being processed, at most 2 requests on wire and at most 2 unprocessed records in memory (Process trails behind by K=2) records:
0 secs: Req 1 started
: Req 2 started
1 secs: Req 2 done
: Req 1 done
: Proc 1 started
: Req 3 started
: Req 4 started
2 secs: Req 3 done
: Req 4 done
4 secs: Proc 1 done
: Proc 2 started
: Req 5 started
5 secs: Req 5 done
7 secs: Proc 2 done
: Proc 3 started
: Req 6 started
8 secs: Req 6 done
10 secs: Proc 3 done
: Proc 4 started
13 secs: Proc 4 done
: Proc 5 started
16 secs: Proc 5 done
: Proc 6 started
19 secs: Proc 6 done
Hope there are no anti-patterns/pitfalls here.

When kafka consumer poll return null records?

As show as below, my code is a high level consumer fetch a topic with 32 partitions in kafka server, I'm confused that why sometimes I get a empty return from consumer.poll().
I have tried to increase poll timeout, and then when I increase timeout to 1000, then each poll has return data while I set timeout to 10 or 0, then I see a lot of empty return.
Can anyone tell me how to set a correct timeout ?
def main(args: Array[String]): Unit = {
val props = new Properties()
props.put("bootstrap.servers", "kafka-01:9098")
props.put("group.id", "kch1")
props.put("enable.auto.commit", "true")
props.put("auto.commit.interval.ms", "1000")
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
//props.put("max.poll.records", "1000")
val consumers = new Array[KafkaConsumer[String, String]](16)
for(i <- 0 to 15) {
consumers(i) = new KafkaConsumer[String, String](props)
consumers(i).subscribe(util.Arrays.asList("veh321"))
}
var cnt = 0
var cacheIterator: Iterator[ConsumerRecord[String, String]] = null
for(i <- 0 to 15) {
new Thread(new Runnable {
override def run(): Unit = {
var finish = false
while(!finish) {
val start = System.currentTimeMillis()
cacheIterator = consumers(i).poll(100).iterator()
val end = System.currentTimeMillis() - start
if (end > 10 ) {
println(s"${Thread.currentThread().getId} + Duration is ${end}， ${cacheIterator.hasNext} ${cacheIterator.size}")
}
}
}
}).start()
}

Java consumer employs Linux's epoll as the underlying implementation scheme by invoking java.nio.channels.Selector.select(timeout). It is very likely to return nothing if you only give it 100 ms to try how many SelectionKeys are ready during that short time interval.
Besides, during the same 100 ms, consumer will do some other jobs including polling coordinator status, so the real time interval for record polling is obviously less than 100ms which makes harder to retrieve some really useful things.

Scala server with Spray Akka overthreading

I'm using spray.io and akka.io on my freebsd 1CPU/2GB server and I'm facing
threading problems. I've started to notice it when I got OutOfMemory exception
because of "can't create native thread".
I check Thread.activeCount() usually and I see it grows enormously.
Currently I use these settings:
myapp-namespace-akka {
akka {
loggers = ["akka.event.Logging$DefaultLogger"]
loglevel = "DEBUG"
stdout-loglevel = "DEBUG"
actor {
deployment {
default {
dispatcher = "nio-dispatcher"
router = "round-robin"
nr-of-instances = 1
}
}
debug {
receive = on
autoreceive = on
lifecycle = on
}
nio-dispatcher {
type = "Dispatcher"
executor = "fork-join-executor"
fork-join-executor {
parallelism-min = 8
parallelism-factor = 1.0
parallelism-max = 16
task-peeking-mode = "FIFO"
}
shutdown-timeout = 4s
throughput = 4
throughput-deadline-time = 0ms
attempt-teamwork = off
mailbox-requirement = ""
}
aside-dispatcher {
type = "Dispatcher"
executor = "fork-join-executor"
fork-join-executor {
parallelism-min = 8
parallelism-factor = 1.0
parallelism-max = 32
task-peeking-mode = "FIFO"
}
shutdown-timeout = 4s
throughput = 4
throughput-deadline-time = 0ms
attempt-teamwork = on
mailbox-requirement = ""
}
}
}
}
I want nio-dispatcher to be my default non blocking (lets say single thread)
dispatcher. And I execute all my futures (db, network queries) on aside-dispatcher.
I get my contexts through my application as follows:
trait Contexts {
def system: ActorSystem
def nio: ExecutionContext
def aside: ExecutionContext
}
object Contexts {
val Scope = "myapp-namespace-akka"
}
class ContextsImpl(settings: Config) extends Contexts {
val System = "myapp-namespace-akka"
val NioDispatcher = "akka.actor.nio-dispatcher"
val AsideDispatcher = "akka.actor.aside-dispatcher"
val Settings = settings.getConfig(Contexts.Scope)
override val system: ActorSystem = ActorSystem(System, Settings)
override val nio: ExecutionContext = system.dispatchers.lookup(NioDispatcher)
override val aside: ExecutionContext = system.dispatchers.lookup(AsideDispatcher)
}
// Spray trait mixed to service actors
trait ImplicitAsideContext {
this: EnvActor =>
implicit val aside = env.contexts.aside
}
I think that I did mess up with configs or implementations. Help me out here.
Usually I see thousands of threads for now on my app until it crashes (I set
freebsd pre process limit to 5000).

If your app indeed starts so many threads this can be usually tracked back to blocking behaviours inside ForkJoinPools (bad bad thing to do!), I explained the issue in detail in the answer here: blocking blocks, so you may want to read up on it there and verify what threads are being created in your app and why – the ForkJoinPool does not have a static upper limit.

Why can't I kill thread?

Dear All I have this type of code:
public class Testimplements Runnable {
public static void main(String[] args) {
InTheLoop l= new InTheLoop();
Thread th = new Thread(l);
th.start();
th.interrupt();
}
#Override
public void run() {
int count = 0;
for (Integer i = 0; i <=10000000000000000000; i++) {
}
}
}
I know there is ways to kill thread. For instance:
// Example 1
if (Thread.interrupted()){
return;
}
// Example 2
if(flag){ // volatile
return;
}
but can't I kill the thread without if statement?

You can use the stop() method if you really have to, but be aware that it is inherently unsafe and deprecated.
See http://docs.oracle.com/javase/7/docs/technotes/guides/concurrency/threadPrimitiveDeprecation.html for details.

As far as I know you have to implement the interruption policy for your thread yourself, how it handles the interrupt call and when it stops etc.
See: http://docs.oracle.com/javase/tutorial/essential/concurrency/interrupt.html

Instead of thread better use ExecutorService for more control on threads.
Read more in oracle documentation and here is tutorial

This subject is very important and I could not find any clear answer for That so I write a sample code with OOP form to explain that
import threading
import time
import tkinter as tk
class A(tk.Tk):
def __init__(self ,*args, **kwargs):
tk.Tk.__init__(self, *args, **kwargs)
self.count = 0
self._running = True
self.Do = 0
A.Run(self)
def Thread_Terminate(self): # a method for killing the Thread
self._running =False
self.T.join()
self.Btn2.config (state='normal')
self.Btn1.config (state='disabled')
def Thread_Restart(self): # a thred for Start and Restart the thread
self.Btn1.config (state='normal')
self._running = True
self.T = threading.Thread(target=self.Core) # define Thread
if (self.T.is_alive() != True): # Cheak the Thread is that alive
self.T.start()
self.Btn2.config (state='disabled')
def Window (self):
self.title(" Graph ")
self.geometry("300x300+100+100")
self.resizable(width=False, height=False)
self.Lbl1 = Label(self,text = '0' ,font=('Times', -50, 'bold'), fg='Blue')
self.Lbl1.pack()
self.Lbl1.place(x=130, y=30 )
self.Btn1 = Button(self,text= 'Thread OFF',width=25,command = self.Thread_Terminate)
self.Btn1.pack()
self.Btn1.place(x=50,y=100)
self.Btn1 ['state'] = 'disable'
self.Btn2 = Button(self, text='Thread ON', width=25, command=self.Thread_Restart)
self.Btn2.pack()
self.Btn2.place(x=50, y=140)
self.Ent1 = Entry(self, width = 30, fg='Blue')
self.Ent1.pack()
self.Ent1.place(x=50, y=180)
def Core(self): # this method is the thread Method
self.count = 0
i = self.Ent1.get()
if (i==''):
i='10'
while (self._running and self.count<int(i)):
self.count +=1
self.Lbl1.config(text=str(self.count))
print(str (self.count)+'Thread Is ON')
time.sleep(0.5)
def Run(self):
A.Window(self)
self.mainloop()
if __name__ == "__main__":
Obj1 = A()

OpenLDAP 2.3/2.4 concurrency issue

Experiencing concurrency issues when authenticating and other read requests with Open LDAP (version tested 2.3.43 and 2.4.39)
When making 100 concurrent bind requests the test code takes around 150 milliseconds. Increasing this to 1000 concurrent requests sees the time taken increase to 9303 milliseconds.
So from x10 concurrent requests we are seeing a x62 increase in time taken.
Is this expected behaviour? Or is there something missing in our OpenLDAP server configuration/linux host configuration?
NOTE: We have run this test code against a Windows based Apache DS server 2.0.0 (same tree structure, etc) for comparison and against that server, the performance results where what we would normally expect (i.e. 100x takes ~80ms, 1000x takes ~400ms, 10,000x takes ~2700ms)
Settings in slapd.conf:
cachesize 100000
idlcachesize 300000
database bdb
suffix "dc=company,dc=com"
rootdn "uid=admin,ou=system"
rootpw secret
directory /var/lib/ldap
index objectClass eq,pres
index ou,cn,mail,surname,givenname eq,pres,sub
index uidNumber,gidNumber,loginShell eq,pres
index uid,memberUid eq,pres,sub
index nisMapName,nisMapEntry eq,pres,sub
sizelimit 100000
loglevel 256
Test code:
import java.util.ArrayList;
import javax.naming.NamingException;
import javax.naming.directory.DirContext;
import org.springframework.ldap.core.LdapTemplate;
import org.springframework.ldap.core.support.LdapContextSource;
public class DirectoryServiceMain {
public static void main(String[] args) {
int concurrentThreadCount = 100;
LdapContextSource ctx = new LdapContextSource();
ctx.setUrls(new String [] { "ldap://ldap1.dev.company.com:389/", "ldap://ldap1.dev.company.com:389/" });
ctx.setBase("dc=company,dc=com");
ctx.setUserDn("uid=admin,ou=system");
ctx.setPassword("secret");
ctx.setPooled(true);
ctx.setCacheEnvironmentProperties(false);
LdapTemplate template = new LdapTemplate();
template.setContextSource(ctx);
long startTime = System.currentTimeMillis();
ArrayList<Thread> threads = new ArrayList<>();
for(int i = 0; i < concurrentThreadCount; i++) {
Thread t = new Thread(
() -> {
DirContext context = template.getContextSource().getContext("uid=username,dc=users,uid=office,dc=suborganisations,uid=ABC,dc=organisations,dc=company,dc=com",
"password");
try {
context.close();
} catch(NamingException e) {}
});
t.start();
threads.add(t);
}
boolean alive = true;
while(alive) {
alive = false;
for(Thread t : threads) {
if(t.isAlive()) {
alive = true;
try {Thread.sleep(10);} catch(InterruptedException e) {}
}
}
}
long endTime = System.currentTimeMillis();
System.out.println("Total time: " + (endTime - startTime));
}
}
ulimit -n
131072
* UPDATE *
If a slight delay (e.g. Thread.sleep(1)) is added after each t.start(), then processing time of n concurrent threads drops considerably.

A longer answer is if you are using BDB as the database then you will likely see linear scaling problems above a certain number of concurrent requests. BDB has its own db_config file that you can configure to provide better performance characteristics. You could also consider change to MDB which was specifically written for open ldap and has better linear scaling with minimal configuration.
You should also consider limiting the number of concurrent connection made by setting the jndi ldap connection pool sizes against the LDAPContextSource:
Map<String, Object> map = new HashMap<>();
map.put("com.sun.jndi.ldap.connect.pool.initsize", 2);
map.put("com.sun.jndi.ldap.connect.pool.maxsize", 2);
map.put("com.sun.jndi.ldap.connect.pool.prefsize", 2);
ctx.setBaseEnvironmentProperties(map);

We Keep Coding

Java is a programming language and computing platform first released by Sun Microsystems in 1995.

Performance issues in Play 2.2 - java

That is caused by the sleep(100) call. Even if all the rest of your code was infinitely fast, you would still only get 10 requests/second with only 1 test thread.

Related

Optimal orchestration of list of network calls and processing tasks in Java

When kafka consumer poll return null records?

Scala server with Spray Akka overthreading

Why can't I kill thread?

OpenLDAP 2.3/2.4 concurrency issue

Categories

Resources