I have an actor that depends on stashing, and stashing requires an UnboundedDequeBasedMailbox. Is it possible to use stashing together with a bounded mailbox?
As Arnaud pointed out, my setup is:
Actor with UnrestrictedStash with RequiresMessageQueue[BoundedDequeBasedMessageQueueSemantics]
and the configuration:
akka {
  loggers = ["akka.event.slf4j.Slf4jLogger"]
  loglevel = INFO
  daemonic = on
  actor {
    mailbox {
      bounded-deque-based {
        mailbox-type = "akka.dispatch.BoundedDequeBasedMailbox"
        mailbox-capacity = 2000
        mailbox-push-timeout-time = 5s
      }
      requirements {
        "akka.dispatch.BoundedDequeBasedMessageQueueSemantics" = akka.actor.mailbox.bounded-deque-based
      }
    }
  }
}
And the error is:
[ERROR] 2013/12/06 14:03:58.268 [r-4] a.a.OneForOneStrategy - DequeBasedMailbox required, got: akka.dispatch.UnboundedMailbox$MessageQueue
An (unbounded) deque-based mailbox can be configured as follows:
my-custom-mailbox {
mailbox-type = "akka.dispatch.UnboundedDequeBasedMailbox"
}
akka.actor.ActorInitializationException: exception during creation
at akka.actor.ActorInitializationException$.apply(Actor.scala:218) ~[akka-actor_2.10-2.2.0.jar:2.2.0]
at akka.actor.ActorCell.create(ActorCell.scala:578) ~[akka-actor_2.10-2.2.0.jar:2.2.0]
at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:425) ~[akka-actor_2.10-2.2.0.jar:2.2.0]
at akka.actor.ActorCell.systemInvoke(ActorCell.scala:447) ~[akka-actor_2.10-2.2.0.jar:2.2.0]
at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:262) ~[akka-actor_2.10-2.2.0.jar:2.2.0]
at akka.dispatch.Mailbox.run(Mailbox.scala:218) ~[akka-actor_2.10-2.2.0.jar:2.2.0]
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386) [akka-actor_2.10-2.2.0.jar:2.2.0]
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) [scala-library-2.10.2.jar:na]
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) [scala-library-2.10.2.jar:na]
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) [scala-library-2.10.2.jar:na]
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) [scala-library-2.10.2.jar:na]
Caused by: akka.actor.ActorInitializationException: DequeBasedMailbox required, got: akka.dispatch.UnboundedMailbox$MessageQueue
An (unbounded) deque-based mailbox can be configured as follows:
my-custom-mailbox {
mailbox-type = "akka.dispatch.UnboundedDequeBasedMailbox"
}
at akka.actor.ActorInitializationException$.apply(Actor.scala:218) ~[akka-actor_2.10-2.2.0.jar:2.2.0]
at akka.actor.UnrestrictedStash$class.$init$(Stash.scala:82) ~[akka-actor_2.10-2.2.0.jar:2.2.0]
at com.fg.mail.smtp.index.Indexer.<init>(Indexer.scala:38) ~[classes/:na]
at com.fg.mail.smtp.Supervisor$$anonfun$preStart$1.apply(Supervisor.scala:20) ~[classes/:na]
at com.fg.mail.smtp.Supervisor$$anonfun$preStart$1.apply(Supervisor.scala:20) ~[classes/:na]
at akka.actor.CreatorFunctionConsumer.produce(Props.scala:369) ~[akka-actor_2.10-2.2.0.jar:2.2.0]
at akka.actor.Props.newActor(Props.scala:323) ~[akka-actor_2.10-2.2.0.jar:2.2.0]
at akka.actor.ActorCell.newActor(ActorCell.scala:534) ~[akka-actor_2.10-2.2.0.jar:2.2.0]
at akka.actor.ActorCell.create(ActorCell.scala:560) ~[akka-actor_2.10-2.2.0.jar:2.2.0]
... 9 common frames omitted
If I use Props.withMailbox("bounded-deque-based"), then I get:
Caused by: akka.ConfigurationException: Mailbox Type [bounded-deque-based] not configured
ANSWER (thanks to Arnaud): the problem was the configuration; the bounded-deque-based config block should be at the same level as the akka config block. The documentation is quite misleading in this case...
akka {
  loggers = ["akka.event.slf4j.Slf4jLogger"]
  loglevel = INFO
  daemonic = on
  actor {
    mailbox {
      requirements {
        "akka.dispatch.BoundedDequeBasedMessageQueueSemantics" = akka.actor.mailbox.bounded-deque-based
      }
    }
  }
}
bounded-deque-based {
  mailbox-type = "akka.dispatch.BoundedDequeBasedMailbox"
  mailbox-capacity = 2000
  mailbox-push-timeout-time = 5s
}
According to the official Akka stash documentation, you can use a Stash trait that does not enforce any mailbox type: see UnrestrictedStash.
If you use UnrestrictedStash, you can manually configure the proper mailbox, as long as it extends the akka.dispatch.DequeBasedMessageQueueSemantics marker trait.
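For reference, the actor side of that setup looks roughly like this (a minimal sketch; the class name and messages are illustrative, not taken from the question):

import akka.actor.{Actor, UnrestrictedStash}
import akka.dispatch.{BoundedDequeBasedMessageQueueSemantics, RequiresMessageQueue}

// Sketch: stash everything until an "open" message arrives, then replay it.
class MyStashingActor extends Actor
    with UnrestrictedStash
    with RequiresMessageQueue[BoundedDequeBasedMessageQueueSemantics] {

  def receive: Receive = {
    case "open" =>
      unstashAll()          // replay the deferred messages
      context.become(open)
    case _ =>
      stash()               // defer until we are "open"
  }

  def open: Receive = {
    case _ => // normal processing once the actor is "open"
  }
}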
You can then manually configure your bounded deque-based mailbox, following the mailboxes documentation, with something like:
bounded-mailbox {
  mailbox-type = "akka.dispatch.BoundedDequeBasedMailbox"
  mailbox-capacity = 1000
  mailbox-push-timeout-time = 10s
}
akka.actor.mailbox.requirements {
  "akka.dispatch.BoundedDequeBasedMessageQueueSemantics" = bounded-mailbox
}
I did not try it myself, but it should work.
Edit: what version of Akka are you using? It looks like the Stash trait definition changed in version 2.2.0.
I'm trying to add a -Pojo suffix to my generated jOOQ pojos.
The strategy implementation is straightforward enough:
package my.app.jooq.strategy

import org.jooq.codegen.DefaultGeneratorStrategy
import org.jooq.codegen.GeneratorStrategy
import org.jooq.codegen.GeneratorStrategy.Mode.POJO
import org.jooq.meta.Definition

class MyGeneratorStrategy : DefaultGeneratorStrategy() {
    override fun getJavaClassName(definition: Definition, mode: GeneratorStrategy.Mode): String {
        return when (mode) {
            POJO -> super.getJavaClassName(definition, mode) + "Pojo"
            else -> super.getJavaClassName(definition, mode)
        }
    }
}
but codegen just refuses to pick it up (ClassNotFoundException).
According to https://groups.google.com/g/jooq-user/c/LM5ioRHNhJw:
you would have to create a separate project/module only for that strategy class, in order to create a dependency graph like so:
Code generation module... depends on
Strategy module... depends on
jOOQ libraries
so I did... jOOQ doesn't care.
PM org.jooq.tools.JooqLogger error
SEVERE: Error in file: /home/user/code/my-app/backend/db/build/tmp/generateMyAppJooq/config.xml. Error : my.app.jooq.strategy.MyGeneratorStrategy
java.lang.ClassNotFoundException: my.app.jooq.strategy.MyGeneratorStrategy
at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:581)
at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)
at org.jooq.codegen.GenerationTool.loadClass0(GenerationTool.java:1075)
at org.jooq.codegen.GenerationTool.loadClass(GenerationTool.java:1005)
at org.jooq.codegen.GenerationTool.run0(GenerationTool.java:405)
at org.jooq.codegen.GenerationTool.run(GenerationTool.java:233)
at org.jooq.codegen.GenerationTool.generate(GenerationTool.java:228)
at org.jooq.codegen.GenerationTool.main(GenerationTool.java:200)
here's how the codegen module is set up:
plugins {
    id 'nu.studer.jooq' version "${plugin_jooq}"
    id 'io.spring.dependency-management'
}

dependencyManagement {
    imports {
        mavenBom SpringBootPlugin.BOM_COORDINATES
        mavenBom "org.jetbrains.kotlin:kotlin-bom:${fw_kotlin}"
    }
}

dependencies {
    api "org.postgresql:postgresql"
    implementation project(":backend-jooq-config")
    api "org.springframework.boot:spring-boot-starter-jooq"
    implementation "org.jooq:jooq-meta"
    implementation "org.jooq:jooq-codegen"
    jooqGenerator "org.postgresql:postgresql"
}

jooq {
    version = dependencyManagement.importedProperties['jooq.version']
    edition = nu.studer.gradle.jooq.JooqEdition.OSS
    configurations {
        myApp {
            generateSchemaSourceOnCompilation = true
            generationTool {
                logging = org.jooq.meta.jaxb.Logging.WARN
                jdbc {
                    driver = 'org.postgresql.Driver'
                    url = 'jdbc:postgresql://localhost:5432/myapp'
                    user = 'user'
                    password = 'password'
                    properties {
                        property {
                            key = 'PAGE_SIZE'
                            value = 2048
                        }
                    }
                }
                generator {
                    name = 'org.jooq.codegen.DefaultGenerator'
                    strategy {
                        name = 'my.app.jooq.strategy.MyGeneratorStrategy'
                    }
                    database {
                        name = 'org.jooq.meta.postgres.PostgresDatabase'
                        inputSchema = 'public'
                        includes = '.*'
                        excludes = ''
                    }
                    generate {
                        daos = true
                        springAnnotations = true
                    }
                    target {
                        directory = 'src/main/java'
                        packageName = 'my.app.db'
                    }
                }
            }
        }
    }
}
I don't see what I'm doing wrong.
And yes, the strategy is being compiled:
find .. -name "*GeneratorStrategy*"
../backend/jooq-config/build/classes/kotlin/main/my/app/jooq/strategy/MyGeneratorStrategy.class
../backend/jooq-config/build/classes/kotlin/main/my/app/jooq/strategy/MyGeneratorStrategy$WhenMappings.class
../backend/jooq-config/src/main/kotlin/my/app/jooq/strategy/MyGeneratorStrategy.kt
How do I resolve this?
Fixing your configuration
There's an example here of how to set up this third-party code generation plugin to use a custom strategy:
https://github.com/etiennestuder/gradle-jooq-plugin/tree/master/example/configure_custom_generator_strategy
As you can see, the dependency is specified as:
jooqGenerator project(':backend-jooq-config')
not:
implementation project(':backend-jooq-config')
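Applied to the build script in the question, the dependencies block would then look roughly like this (a sketch; only the way the strategy module is wired in changes):

dependencies {
    api "org.postgresql:postgresql"
    api "org.springframework.boot:spring-boot-starter-jooq"
    implementation "org.jooq:jooq-meta"
    implementation "org.jooq:jooq-codegen"
    // the code generator runs with its own classpath, so the strategy module
    // must be declared on the jooqGenerator configuration to be visible to it
    jooqGenerator project(":backend-jooq-config")
    jooqGenerator "org.postgresql:postgresql"
}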
Using a declarative configuration instead
For such simple cases, jOOQ also offers an out-of-the-box configuration called the "matcher strategy", which matches names and replaces them with something else, e.g.:
generator {
    strategy {
        matchers {
            tables {
                table {
                    pojoClass {
                        transform = 'PASCAL'
                        expression = '\$0_POJO'
                    }
                }
            }
        }
    }
}
I have a Kafka consumer running in its own thread. The consumer polls records from a topic, extracts the relevant data, and stores the result in a database. Is it possible to achieve exactly-once semantics with this setup? I.e. is it possible to ensure that each record is stored in the database only once?
The consumer config is:
kafka:
  bootstrapServers: ${KAFKA_BOOTSTRAP_SERVERS}
  kafkaSecurityProtocol: ${KAFKA_DK_SECURITY_PROTOCOL}
  schemaRegistryUrl: ${SCHEMA_REGISTRY_URL}
  autoOffsetReset: earliest
  enableAutoCommit: false
  sessionTimeoutMs: 60000
  heartbeatIntervalMs: 6000
  defaultApiTimeoutMs: 120000
  inputTopic: ${KAFKA_INPUT_TOPIC}
  keyDeserializerClass: org.apache.kafka.common.serialization.StringDeserializer
  valueDeserializerClass: org.apache.kafka.common.serialization.StringDeserializer
My consumer thread looks like the following:
import datasource
import dbContext
import extractRelevantData

class ConsumerThread(
    name: String,
    private val consumer: Consumer<String, String>,
    private val sleepTime: Long,
) : Thread(name) {

    private val saveTimer = Metrics.gauge("time.used.on.saving", AtomicLong(0))!!
    private val receiveCounter = Metrics.counter("received")

    override fun run() {
        while (true) {
            try {
                consumer.poll(Duration.ofSeconds(30)).let { rs ->
                    rs.forEach { r ->
                        val data = extractRelevantData(r.value())
                        dbContext.startConnection(dataSource).use {
                            val time = measureTimeMillis {
                                Dao(dbContext).saveData(data)
                            }
                            saveTimer.set(time)
                        }
                    }
                    log.info("Received ${rs.count()} {}", this.name)
                    receiveCounter.increment(rs.count().toDouble())
                    consumer.commitSync()
                }
            } catch (e: Exception) {
                log.error("Unhandled exception when fetching {} from kafka", this.name, e)
                sleep(sleepTime)
            }
        }
    }
}
and my Dao looks like:
class TpAcknowledgementDao(private val dbContext: DbContext) {
    private val table: DbContextTable = dbContext.table("table")
    private val reasonTable: DbContextTable = dbContext.table("reason")

    fun saveData(data: DataType): String {
        dbContext.ensureTransaction().use {
            // Do changes to database (i.e. save data to database and create a saveStatus object)
            it.setComplete()
            return saveStatus.id.toString()
        }
    }
}
I thought my current setup ensured exactly-once semantics: if an exception is thrown, the consumer does not commit, and the database transaction ensures the changes are rolled back. On restart, the records will be consumed once more and the database transaction will be re-attempted.
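For context, committing only after a successful write, as described above, is usually called at-least-once rather than exactly-once: a crash between the database write and commitSync() means records that were already stored get replayed. One common way to get effective exactly-once on top of that is to make the write itself idempotent. A rough sketch of how the loop above could do this, where saveDataIdempotent and the (topic, partition, offset) key are hypothetical additions and not part of the code shown:

rs.forEach { r ->
    val data = extractRelevantData(r.value())
    dbContext.startConnection(dataSource).use {
        // store the payload together with the record's coordinates under a unique
        // constraint, so a replayed record becomes a no-op instead of a duplicate row
        Dao(dbContext).saveDataIdempotent(data, r.topic(), r.partition(), r.offset())
    }
}
consumer.commitSync() // still commit only after the whole batch has been persisted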
However, when I get the following exception:
java.sql.SQLRecoverableException: I/O-error: The Network Adapter could not establish the connection
at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:862)
at oracle.jdbc.driver.PhysicalConnection.connect(PhysicalConnection.java:793)
at oracle.jdbc.driver.T4CDriverExtension.getConnection(T4CDriverExtension.java:57)
at oracle.jdbc.driver.OracleDriver.connect(OracleDriver.java:747)
at oracle.jdbc.pool.OracleDataSource.getPhysicalConnection(OracleDataSource.java:413)
at oracle.jdbc.pool.OracleDataSource.getConnection(OracleDataSource.java:298)
at oracle.jdbc.pool.OracleDataSource.getConnection(OracleDataSource.java:213)
at oracle.jdbc.pool.OracleDataSource.getConnection(OracleDataSource.java:191)
at org.fluentjdbc.DbContext$TopLevelDbContextConnection.getConnection(DbContext.java:274)
at org.fluentjdbc.DbContext.getThreadConnection(DbContext.java:151)
at org.fluentjdbc.DbContext.ensureTransaction(DbContext.java:184)
Caused by: java.net.ConnectException: Connection refused
at java.base/sun.nio.ch.Net.connect0(Native Method)
at java.base/sun.nio.ch.Net.connect(Unknown Source)
at java.base/sun.nio.ch.Net.connect(Unknown Source)
at java.base/sun.nio.ch.SocketChannelImpl.connect(Unknown Source)
at java.base/java.nio.channels.SocketChannel.open(Unknown Source)
at oracle.net.nt.TimeoutSocketChannel.connect(TimeoutSocketChannel.java:99)
at oracle.net.nt.TimeoutSocketChannel.<init>(TimeoutSocketChannel.java:77)
at oracle.net.nt.TcpNTAdapter.connect(TcpNTAdapter.java:192)
... 19 common frames omitted
one or more records will not be stored in the database. Any idea how I can ensure that all records are stored in the database exactly once?
I've configured an SQS queue and an additional dead letter queue using Terraform.
resource "aws_sqs_queue" "sqs_deadletter" {
name = "worker-dead-letter"
}
resource "aws_sqs_queue" "sqs" {
name = "worker"
/* TODO: If I enable this all messages goes to the dead letter queue
redrive_policy = jsonencode({
deadLetterTargetArn = aws_sqs_queue.sqs_deadletter.arn
maxReceiveCount = 4
})
*/
}
resource "aws_lambda_event_source_mapping" "sqs" {
event_source_arn = aws_sqs_queue.sqs.arn
function_name = aws_lambda_function.worker.arn
enabled = true
batch_size = var.batch_size
}
I use the below handler to process my messages.
@Introspected
class LegacyToModernRequestHandler : MicronautRequestHandler<SQSEvent, Unit>() {

    private val logger = KotlinLogging.logger {}

    override fun execute(input: SQSEvent) {
        input.records.forEach {
            handle(it)
        }
    }

    private fun handle(message: SQSMessage) {
        val key = message.body
        logger.info { "LegacyToModernRequestHandler($key)" }
    }
}
But all my messages go to the DLQ. How can I indicate successful handling so that this doesn't happen?
If you are not using AUTO_ACKNOWLEDGEMENT mode, you will have to explicitly acknowledge the message so that it counts as processed successfully. Otherwise it will go to the DLQ. Can you show how you have configured your SQS queue?
I have the following code:
typealias MessagePredicate = (Message) -> Boolean

object EmailHelper {
    private val session: Session by lazy {
        val props = System.getProperties()
        props["mail.imaps.usesocketchannels"] = "true"
        props["mail.imap.usesocketchannels"] = "true"
        Session.getInstance(props, null)
    }
    private val store = session.getStore("gimap") as GmailStore
    private val idleManager = IdleManager(session, Executors.newSingleThreadExecutor())
    private val folder: GmailFolder by lazy { store.getFolder("INBOX") as GmailFolder }

    init {
        store.connect("imap.gmail.com", "***#gmail.com", "***")
        folder.open(Folder.READ_ONLY)
        idleManager.watch(folder)
    }

    fun watchForMessage(condition: MessagePredicate): CompletableFuture<Message> {
        val promise = CompletableFuture<Message>()
        folder.addMessageCountListener(object : MessageCountAdapter() {
            override fun messagesAdded(e: MessageCountEvent) {
                super.messagesAdded(e)
                e.messages.firstOrNull(condition)?.let {
                    folder.removeMessageCountListener(this)
                    promise.complete(it)
                }
            }
        })
        return promise
    }
}
However, when I run this code I get the following exception:
Exception in thread "main" java.lang.ExceptionInInitializerError
at com.muliyul.MainKt.main(Main.kt:28)
Caused by: javax.mail.MessagingException: Folder is not using SocketChannels
at com.sun.mail.imap.IdleManager.watch(IdleManager.java:205)
at com.muliyul.EmailHelper.<clinit>(EmailHelper.kt:40)
... 1 more
I am setting the property "mail.imaps.usesocketchannels" beforehand, and I've also read this question, yet I can't wrap my head around what's wrong with my code.
Can someone point me in the right direction?
Side note: the email provider is Gmail (obviously).
An hour after I posted this question (and after three hours of research) I finally found the answer.
You have to set the property mail.gimap.usesocketchannels to "true" (and not mail.imap.usesocketchannels or mail.imaps.usesocketchannels).
This is because gimap is a different protocol from imap, so it reads its own set of properties.
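Applied to the snippet above, only the property key needs to change; a sketch of the adjusted lazy block:

private val session: Session by lazy {
    val props = System.getProperties()
    // gimap reads its own property namespace, so this is the key that matters here
    props["mail.gimap.usesocketchannels"] = "true"
    Session.getInstance(props, null)
}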
There goes 3 hours down the drain.
I am trying to write a DataFrame to Ignite using JDBC.
Spark version: 2.1
Ignite version: 2.3
JDK: 1.8
Scala: 2.11.8
This is my code snippet:
def WriteToIgnite(hiveDF: DataFrame, targetTable: String): Unit = {
  val conn = DataSource.conn
  var psmt: PreparedStatement = null
  try {
    OperationIgniteUtil.deleteIgniteData(conn, targetTable)
    hiveDF.foreachPartition({
      partitionOfRecords => {
        partitionOfRecords.foreach(
          row => for (i <- 0 until row.length) {
            psmt = OperationIgniteUtil.getInsertStatement(conn, targetTable, hiveDF.schema)
            psmt.setObject(i + 1, row.get(i))
            psmt.execute()
          }
        )
      }
    })
  } catch {
    case e: Exception => e.printStackTrace()
  } finally {
    conn.close
  }
}
When I run it on Spark, it prints this error message:
org.apache.spark.SparkException: Task not serializable
at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:298)
at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:288)
at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:108)
at org.apache.spark.SparkContext.clean(SparkContext.scala:2094)
at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1.apply(RDD.scala:924)
at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1.apply(RDD.scala:923)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:362)
at org.apache.spark.rdd.RDD.foreachPartition(RDD.scala:923)
at org.apache.spark.sql.Dataset$$anonfun$foreachPartition$1.apply$mcV$sp(Dataset.scala:2305)
at org.apache.spark.sql.Dataset$$anonfun$foreachPartition$1.apply(Dataset.scala:2305)
at org.apache.spark.sql.Dataset$$anonfun$foreachPartition$1.apply(Dataset.scala:2305)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:57)
at org.apache.spark.sql.Dataset.withNewExecutionId(Dataset.scala:2765)
at org.apache.spark.sql.Dataset.foreachPartition(Dataset.scala:2304)
at com.pingan.pilot.ignite.common.OperationIgniteUtil$.WriteToIgnite(OperationIgniteUtil.scala:72)
at com.pingan.pilot.ignite.etl.HdfsToIgnite$.main(HdfsToIgnite.scala:36)
at com.pingan.pilot.ignite.etl.HdfsToIgnite.main(HdfsToIgnite.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:738)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) Caused by: java.io.NotSerializableException:
org.apache.ignite.internal.jdbc2.JdbcConnection Serialization stack:
- object not serializable (class: org.apache.ignite.internal.jdbc2.JdbcConnection, value:
org.apache.ignite.internal.jdbc2.JdbcConnection#7ebc2975)
- field (class: com.pingan.pilot.ignite.common.OperationIgniteUtil$$anonfun$WriteToIgnite$1,
name: conn$1, type: interface java.sql.Connection)
- object (class com.pingan.pilot.ignite.common.OperationIgniteUtil$$anonfun$WriteToIgnite$1,
)
at org.apache.spark.serializer.SerializationDebugger$.improveException(SerializationDebugger.scala:40)
at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:46)
at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:100)
at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:295)
... 27 more
Does anyone know how to fix it?
Thanks!
The problem here is that you cannot serialize the connection to Ignite, DataSource.conn. The closure you provide to foreachPartition contains the connection as part of its scope, which is why Spark cannot serialize it.
Fortunately, Ignite provides a custom RDD implementation which allows you to save values to it. You will need to create an IgniteContext first, then retrieve Ignite's shared RDD, which provides distributed access to Ignite, and use it to save the Rows of your DataFrame:
val igniteContext = new IgniteContext(sparkContext, () => new IgniteConfiguration())
...
// Retrieve Ignite's shared RDD
val igniteRDD = igniteContext.fromCache("partitioned")
igniteRDD.saveValues(hiveDF.rdd)
More information is available in the Apache Ignite documentation.
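For completeness, a common alternative (a sketch, not from the original answers) is to keep the plain JDBC approach but create the connection inside foreachPartition, so nothing non-serializable is captured by the closure. The thin-driver URL below is a placeholder, and the statement handling is tightened so one statement is executed per row rather than per column:

val schema = hiveDF.schema // capture only serializable values outside the closure

hiveDF.foreachPartition { partitionOfRecords =>
  // the connection is opened on the executor, so it never needs to be serialized
  val conn = java.sql.DriverManager.getConnection("jdbc:ignite:thin://ignite-host:10800")
  try {
    partitionOfRecords.foreach { row =>
      val psmt = OperationIgniteUtil.getInsertStatement(conn, targetTable, schema)
      for (i <- 0 until row.length) psmt.setObject(i + 1, row.get(i))
      psmt.execute()
      psmt.close()
    }
  } finally {
    conn.close()
  }
}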
You have to extend the Serializable interface.
object Test extends Serializable {
  def WriteToIgnite(hiveDF: DataFrame, targetTable: String): Unit = {
    ???
  }
}
I hope this resolves your problem.