Scala server with Spray Akka overthreading

I'm using spray.io and akka.io on my FreeBSD 1 CPU / 2 GB server and I'm facing
threading problems. I started to notice them when I got an OutOfMemory exception
because of "can't create native thread".
I regularly check Thread.activeCount() and I see it growing enormously.
Currently I use these settings:
myapp-namespace-akka {
  akka {
    loggers = ["akka.event.Logging$DefaultLogger"]
    loglevel = "DEBUG"
    stdout-loglevel = "DEBUG"
    actor {
      deployment {
        default {
          dispatcher = "nio-dispatcher"
          router = "round-robin"
          nr-of-instances = 1
        }
      }
      debug {
        receive = on
        autoreceive = on
        lifecycle = on
      }
      nio-dispatcher {
        type = "Dispatcher"
        executor = "fork-join-executor"
        fork-join-executor {
          parallelism-min = 8
          parallelism-factor = 1.0
          parallelism-max = 16
          task-peeking-mode = "FIFO"
        }
        shutdown-timeout = 4s
        throughput = 4
        throughput-deadline-time = 0ms
        attempt-teamwork = off
        mailbox-requirement = ""
      }
      aside-dispatcher {
        type = "Dispatcher"
        executor = "fork-join-executor"
        fork-join-executor {
          parallelism-min = 8
          parallelism-factor = 1.0
          parallelism-max = 32
          task-peeking-mode = "FIFO"
        }
        shutdown-timeout = 4s
        throughput = 4
        throughput-deadline-time = 0ms
        attempt-teamwork = on
        mailbox-requirement = ""
      }
    }
  }
}
I want nio-dispatcher to be my default non-blocking (let's say single-threaded)
dispatcher, and I execute all my futures (DB and network queries) on aside-dispatcher.
I get my contexts throughout my application as follows:
trait Contexts {
  def system: ActorSystem
  def nio: ExecutionContext
  def aside: ExecutionContext
}

object Contexts {
  val Scope = "myapp-namespace-akka"
}

class ContextsImpl(settings: Config) extends Contexts {
  val System = "myapp-namespace-akka"
  val NioDispatcher = "akka.actor.nio-dispatcher"
  val AsideDispatcher = "akka.actor.aside-dispatcher"
  val Settings = settings.getConfig(Contexts.Scope)
  override val system: ActorSystem = ActorSystem(System, Settings)
  override val nio: ExecutionContext = system.dispatchers.lookup(NioDispatcher)
  override val aside: ExecutionContext = system.dispatchers.lookup(AsideDispatcher)
}

// Spray trait mixed into service actors
trait ImplicitAsideContext {
  this: EnvActor =>
  implicit val aside = env.contexts.aside
}
I think I messed up the configs or the implementation. Help me out here.
Right now I usually see thousands of threads in my app until it crashes (I set the
FreeBSD per-process thread limit to 5000).

If your app indeed starts so many threads, this can usually be tracked back to blocking behaviour inside ForkJoinPools (a bad, bad thing to do!). I explained the issue in detail in the answer here: blocking blocks. You may want to read up on it there, and verify what threads are being created in your app and why – the ForkJoinPool does not have a static upper limit.
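As a sketch of the usual remedy (assuming the dispatcher layout from the question): back the dispatcher that runs blocking futures with a thread-pool-executor, whose maximum pool size is a hard cap, instead of a fork-join-executor, which can keep spawning compensation threads when its workers block. For example, aside-dispatcher could become:

aside-dispatcher {
  type = "Dispatcher"
  executor = "thread-pool-executor"
  thread-pool-executor {
    core-pool-size-min = 8
    core-pool-size-factor = 1.0
    core-pool-size-max = 32   # hard upper bound, unlike a blocked fork-join pool
  }
  throughput = 4
}

and the blocking work stays pinned to it (contexts here is an instance of the question's Contexts trait; the value 42 is just a stand-in):

import scala.concurrent.{ExecutionContext, Future}

// Pin blocking work to the capped pool via the Contexts wiring from the question.
implicit val ec: ExecutionContext = contexts.aside
val rows: Future[Int] = Future {
  // blocking JDBC/network call goes here
  42
}

With this layout, stalled tasks queue up at the bounded pool instead of growing the thread count toward the per-process limit.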

Related

Pod size returned zero post kubernetes job creation

We are creating a Kubernetes Job using the Java Kubernetes client API (v5.12.2) as below.
I am stuck in two places. Could someone please help?
podList.getItems().size() in the code snippet below sometimes returns zero, even though I can see the pod got created along with other existing jobs.
How do I specify a particular label for the job's pod?
KubernetesClient kubernetesClient = new DefaultKubernetesClient();
String namespace = System.getenv(POD_NAMESPACE);
String jobName = TextUtils.concatenateToString("flatten" + Constants.HYPHEN + flattenId);
Job jobRequest = createJob(flattenId, authValue);
var jobResult = kubernetesClient.batch().v1().jobs().inNamespace(namespace)
        .create(jobRequest);
PodList podList = kubernetesClient.pods().inNamespace(namespace)
        .withLabel("job-name", jobName).list();
// Wait for pod to complete
var pods = podList.getItems().size();
var terminalPodStatus = List.of("succeeded", "failed");
_LOGGER.info("pods created size:" + pods);
if (pods > 0) {
    // sometimes returns zero
    var k8sPod = podList.getItems().get(0);
    var podName = k8sPod.getMetadata().getName();
    kubernetesClient.pods().inNamespace(namespace).withName(podName)
            .waitUntilCondition(pod -> {
                var podPhase = pod.getStatus().getPhase();
                // some logic
                return terminalPodStatus.contains(podPhase.toLowerCase());
            }, JOB_TIMEOUT, TimeUnit.MINUTES);
    kubernetesClient.close();
}
private Job createJob(String flattenId, String authValue) {
    return new JobBuilder()
            .withApiVersion(API_VERSION)
            .withNewMetadata().withName(jobName)
            .withLabels(labels)
            .endMetadata()
            .withNewSpec()
            .withTtlSecondsAfterFinished(300)
            .withBackoffLimit(0)
            .withNewTemplate()
            .withNewMetadata().withAnnotations(LINKERD_INJECT_ANNOTATIONS)
            .endMetadata()
            .withNewSpec()
            .withServiceAccount(Constants.TEST_SERVICEACCOUNT)
            .addNewContainer()
            .addAllToEnv(envVars)
            .withImage(System.getenv(BUILD_JOB_IMAGE))
            .withName("test")
            .withCommand("/bin/bash", "-c", "java -jar test.jar")
            .endContainer()
            .withRestartPolicy(RESTART_POLICY_NEVER)
            .endSpec()
            .endTemplate()
            .endSpec()
            .build();
}
Pods are not instantly created as a consequence of creating a Job: the Job controller needs to become active and create the Pods accordingly. Depending on the load on your control plane and the number of Job instances, you may need to wait more or less time.
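If the code has to proceed right after job creation, a short retry loop bridges that gap. A minimal Scala sketch against the same fabric8 client (awaitJobPod, the attempt count, and the delay are illustrative, not part of the client API); it keys on the job-name label that the Job controller itself puts on the pods it creates:

import io.fabric8.kubernetes.api.model.Pod
import io.fabric8.kubernetes.client.KubernetesClient
import scala.jdk.CollectionConverters._

// Poll until the Job controller has actually created the pod.
def awaitJobPod(client: KubernetesClient, namespace: String, jobName: String,
                attempts: Int = 30, delayMillis: Long = 1000L): Option[Pod] =
  (1 to attempts).iterator.map { _ =>
    val pods = client.pods().inNamespace(namespace)
      .withLabel("job-name", jobName) // added automatically by the Job controller
      .list().getItems.asScala.toList
    if (pods.isEmpty) Thread.sleep(delayMillis) // back off before the next attempt
    pods.headOption
  }.collectFirst { case Some(pod) => pod }

As for the labeling question: labels set with withLabels on the Job's own metadata do not propagate to its pod. To label the pod itself, set the labels on the pod template, i.e. inside withNewTemplate().withNewMetadata().withLabels(...).endMetadata().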

Substitute for thread executor pool in Scala

My application requires multiple threads fetching data from various HDFS nodes. For that I am using a thread executor pool and forking threads.
Forking at:
val pathSuffixList = fileStatuses.getOrElse("FileStatus", List[Any]()).asInstanceOf[List[Map[String, Any]]]
pathSuffixList.foreach(block => {
  ConsumptionExecutor.execute(new Consumption(webHdfsUri, block))
})
My class Consumption:
import java.net.URL
import org.apache.avro.file.DataFileStream
import org.apache.avro.generic.GenericDatumReader

class Consumption(webHdfsUri: String, block: Map[String, Any]) extends Runnable {
  override def run(): Unit = {
    val uriSplit = webHdfsUri.split("\\?")
    val fileOpenUri = uriSplit(0) + "/" + block.getOrElse("pathSuffix", "").toString + "?op=OPEN"
    val inputStream = new URL(fileOpenUri).openStream()
    val datumReader = new GenericDatumReader[Void]()
    val dataStreamReader = new DataFileStream(inputStream, datumReader)
    // val schema = dataStreamReader.getSchema()
    val dataIterator = dataStreamReader.iterator()
    while (dataIterator.hasNext) {
      println(" data : " + dataStreamReader.next())
    }
  }
}
ConsumptionExecutor:
import java.util.concurrent.atomic.AtomicLong
import java.util.concurrent.{ExecutorService, Executors, ThreadFactory, ThreadPoolExecutor}

object ConsumptionExecutor {
  val counter: AtomicLong = new AtomicLong()
  val executionContext: ExecutorService = Executors.newCachedThreadPool(new ThreadFactory {
    def newThread(r: Runnable): Thread = {
      val thread: Thread = new Thread(r)
      thread.setName("ConsumptionExecutor-" + counter.incrementAndGet())
      thread
    }
  })
  executionContext.asInstanceOf[ThreadPoolExecutor].setMaximumPoolSize(200)

  def execute(trigger: Runnable): Unit = {
    executionContext.execute(trigger)
  }
}
However, I want to use Akka Streams / Akka actors, where I don't need to give a fixed thread pool size and Akka takes care of everything.
I am pretty new to Akka and the concepts of streaming and actors. Can someone give me some leads in the form of sample code to fit my use case?
Thanks in advance!
An idea would be to create a (subclass) instance of ActorPublisher for each HDFS node that you are reading from, and then merge them in as multiple Sources in a FlowGraph.
Something like this pseudo-code, where the details of the ActorPublisher sources are left out:
val g = PartialFlowGraph { implicit b =>
  import FlowGraphImplicits._
  val in1 = actorSource1
  val in2 = actorSource2
  // etc.
  val out = UndefinedSink[T]
  val merge = Merge[T]
  in1 ~> merge ~> out
  in2 ~> merge
  // etc.
}
This can be improved for a collection of actor sources by just iterating over them and adding an edge to the merge for each one, but this gives the idea.
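For comparison, on later Akka Streams releases (2.4+) the experimental PartialFlowGraph/UndefinedSink API above was replaced, and the same fan-in of per-node sources can be sketched with Source.combine. Everything below (readNode, the node URIs, the system name) is illustrative, not from the question:

import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{Merge, Sink, Source}

implicit val system = ActorSystem("hdfs-consumer")
implicit val materializer = ActorMaterializer()

// Placeholder: one Source per HDFS node; a real reader could wrap the
// WebHDFS InputStream with StreamConverters.fromInputStream.
def readNode(uri: String): Source[String, _] = Source.single(uri)

val sources = List("http://node1:50070", "http://node2:50070").map(readNode)

// Merge all node sources into one backpressured stream: no manual pool sizing.
val merged: Source[String, _] = sources match {
  case s1 :: s2 :: rest => Source.combine(s1, s2, rest: _*)(Merge(_))
  case s1 :: Nil        => s1
  case Nil              => Source.empty
}
merged.runWith(Sink.foreach(println))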

akka.io dispatcher configuration Exception

In an sbt 0.13 project, I put the following in the application.conf file located under src/main/resources.
application.conf:
blocking-dispatcher {
  type = PinnedDispatcher
  executor = "thread-pool-executor"
  thread-pool-executor {
    core-pool-size-min = 2
    core-pool-size-factor = 2.0
    core-pool-size-max = 10
  }
  throughput = 100
  mailbox-capacity = -1
  mailbox-type = ""
}
Now when I create an actor, I get an exception:
object Main extends App {
  implicit val system = ActorSystem()
  val fileReaderActor = system.actorOf(Props(new FileReaderActor(fileName)).withDispatcher("blocking-dispatcher"), "fileReaderActor")
}
I'm getting:
Exception in thread "main" akka.ConfigurationException: Dispatcher [blocking-dispatcher] not configured for path akka://default/user/fileReaderActor
  at akka.actor.LocalActorRefProvider.actorOf(ActorRefProvider.scala:714)
  at akka.actor.dungeon.Children$class.makeChild(Children.scala:191)
  at akka.actor.dungeon.Children$class.attachChild(Children.scala:42)
  at akka.actor.ActorCell.attachChild(ActorCell.scala:338)
  at akka.actor.ActorSystemImpl.actorOf(ActorSystem.scala:518)
  at Main$delayedInit$body.apply(Main.scala:14)
  at scala.Function0$class.apply$mcV$sp(Function0.scala:40)
  at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
  at scala.App$$anonfun$main$1.apply(App.scala:71)
  at scala.App$$anonfun$main$1.apply(App.scala:71)
  at scala.collection.immutable.List.foreach(List.scala:318)
  at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:32)
  at scala.App$class.main(App.scala:71)
  at Main$.main(Main.scala:8)
  at Main.main(Main.scala)
What am I missing?
First, make sure that your config gets loaded:
System.out.println(system.settings());
// this is a shortcut for system.settings().config().root().render()
Read more about it here: http://doc.akka.io/docs/akka/2.2.3/general/configuration.html#Logging_of_Configuration
Second, the following configuration doesn't really make sense:
blocking-dispatcher {
  type = PinnedDispatcher
  executor = "thread-pool-executor"
  thread-pool-executor {
    core-pool-size-min = 2       <----- since you're using a PinnedDispatcher, it only uses 1 thread
    core-pool-size-factor = 2.0  <----- same here
    core-pool-size-max = 10      <----- same here
  } <--- PinnedDispatcher will automatically make it 1 thread: https://github.com/akka/akka/blob/master/akka-actor/src/main/scala/akka/dispatch/PinnedDispatcher.scala#L27
  throughput = 100
  mailbox-capacity = -1
  mailbox-type = ""
}
You have to look up the dispatcher as an execution context like so:
implicit val executionContext = system.dispatchers.lookup("blocking-dispatcher")
and then pass this executionContext to the ActorSystem.
Quote from documentation:
If an ActorSystem is created with an ExecutionContext passed in, this ExecutionContext will be used as the default executor for all dispatchers in this ActorSystem. If no ExecutionContext is given, it will fallback to the executor specified in akka.actor.default-dispatcher.default-executor.fallback[1]
[1] http://doc.akka.io/docs/akka/snapshot/scala/dispatchers.html
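Putting both points together, a minimal sketch (assuming blocking-dispatcher sits at the top level of application.conf as shown, with FileReaderActor and fileName as in the question; the system name "main" is illustrative): load the config explicitly, fail fast if the path is missing, then create the system:

import akka.actor.{ActorSystem, Props}
import com.typesafe.config.ConfigFactory

object Main extends App {
  val config = ConfigFactory.load() // reads application.conf from the classpath
  // Throws ConfigException.Missing immediately if the block was not picked up:
  println(config.getConfig("blocking-dispatcher").root().render())

  implicit val system = ActorSystem("main", config)
  val fileReaderActor = system.actorOf(
    Props(new FileReaderActor(fileName)).withDispatcher("blocking-dispatcher"),
    "fileReaderActor")
}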

Performance issues in Play 2.2

I put Thread.sleep(100) in my Play 2.2 Controller to mimic a back-end operation.
@BodyParser.Of(BodyParser.Json.class)
public Promise<Result> d() {
    Promise<JsonNode> promiseOfJson = Promise.promise(new Function0<JsonNode>() {
        public JsonNode apply() throws Exception {
            ABean aBean = new ABean();
            aBean.setStatus("success");
            aBean.setMessage("Hello World...");
            Thread.sleep(100);
            return Json.toJson(aBean);
        }
    });
    return promiseOfJson.map(new Function<JsonNode, Result>() {
        public Result apply(JsonNode jsonNode) {
            return ok(jsonNode);
        }
    });
}
I have my application.conf configured as follows:
# Play internal thread pool
internal-threadpool-size = 4
# Scala Iteratee thread pool
iteratee-threadpool-size = 4
# Akka configuration for parallelism
# ~~~~~
# Play's default thread pool
play {
  akka {
    akka.loggers = ["akka.event.Logging$DefaultLogger", "akka.event.slf4j.Slf4jLogger"]
    loglevel = ERROR
    actor {
      deployment {
        /actions {
          router = round-robin
          nr-of-instances = 100
        }
        /promises {
          router = round-robin
          nr-of-instances = 100
        }
      }
      retrieveBodyParserTimeout = 5 seconds
      actions-dispatcher = {
        fork-join-executor {
          parallelism-factor = 100
          parallelism-max = 100
        }
      }
      promises-dispatcher = {
        fork-join-executor {
          parallelism-factor = 100
          parallelism-max = 100
        }
      }
      default-dispatcher = {
        fork-join-executor {
          # Minimum of 8 threads, at most 64, otherwise 3x the number of available processors
          # parallelism-factor = 1.0
          # parallelism-max = 64
          # parallelism-min = 8
          parallelism-factor = 3.0
          parallelism-max = 64
          parallelism-min = 8
        }
      }
    }
  }
}
I ran the ab command as follows:
ab -n 900 -c 1 http://localhost:9000/a/b/c/d
The result shows only 9 requests being handled per second.
Can application.conf be tweaked for better performance? If yes, how?
That is caused by the sleep(100) call. Even if all the rest of your code were infinitely fast, you would still only get 10 requests/second with only 1 test thread.
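With -c 1 the 900 requests run strictly one after another, so the ceiling is 1000 ms / 100 ms = 10 requests per second, which matches the measured 9. To make the thread-pool settings matter at all, the benchmark needs overlapping requests, e.g. a higher concurrency level:
ab -n 900 -c 100 http://localhost:9000/a/b/c/d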

Multiple akka systems in a single pc

How can we run multiple Akka nodes on a single PC? Currently, I have the following in my application.conf file. For each system I added a different port number, but I can't start more than one instance. The error says: Address already in use, failed to bind.
application.conf file
remotelookup {
  include "common"
  akka {
    remote.server.port = 2500
    cluster.nodename = "n1"
  }
}
Update: by multiple Akka nodes I mean several different standalone server applications, which will communicate with a remote master node using Akka.
The approach we are using is:
Create different settings in your application.conf for each of the systems:
systemOne {
  akka {
    remote {
      enabled-transports = ["akka.remote.netty.tcp"]
      netty.tcp {
        hostname = ${public-hostname}
        port = 2552
      }
    }
  }
}

systemTwo {
  akka {
    remote {
      enabled-transports = ["akka.remote.netty.tcp"]
      netty.tcp {
        hostname = ${public-hostname}
        port = 2553
      }
    }
  }
}
application.conf is the default config file, so in your settings module add configs for your systems:
object Configs {
  private val root = ConfigFactory.load()
  val one = root.getConfig("systemOne")
  val two = root.getConfig("systemTwo")
}
and then create the systems with these configs:
val one = ActorSystem("SystemName", Configs.one)
val two = ActorSystem("AnotherSystemName", Configs.two)
Don't forget that system names must differ
If you don't want to hardcode the info into your application.conf, you can do this:
def remoteConfig(hostname: String, port: Int, commonConfig: Config): Config = {
  val configStr = s"""
    |akka.remote.netty.hostname = $hostname
    |akka.remote.netty.port = $port
    """.stripMargin
  ConfigFactory.parseString(configStr).withFallback(commonConfig)
}
Then use it like:
val appConfig = ConfigFactory.load
val sys1 = ActorSystem("sys1", remoteConfig(args(0), args(1).toInt, appConfig))
val sys2 = ActorSystem("sys2", remoteConfig(args(0), args(2).toInt, appConfig))
If you use 0 for the port, Akka will assign a random port to that ActorSystem.
The problem was in the port definition. It should be:
remotelookup {
  include "common"
  akka {
    remote.netty.port = 2500
    cluster.nodename = "n1"
  }
}
Otherwise, Akka will take the default port.
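For completeness, a minimal sketch of wiring this corrected block into a node, following the same ConfigFactory pattern as the first answer (the system name LookupNode is illustrative):

import akka.actor.ActorSystem
import com.typesafe.config.ConfigFactory

// Each standalone node loads its own section, so each binds its own remote port.
val nodeConfig = ConfigFactory.load().getConfig("remotelookup")
val system = ActorSystem("LookupNode", nodeConfig)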
