Cannot write/save data to Ignite directly from a Spark RDD - java

I am trying to write a DataFrame to Ignite using JDBC.
The Spark version is: 2.1
Ignite version: 2.3
JDK: 1.8
Scala: 2.11.8
This is my code snippet:
def WriteToIgnite(hiveDF: DataFrame, targetTable: String): Unit = {
  val conn = DataSource.conn
  var psmt: PreparedStatement = null
  try {
    OperationIgniteUtil.deleteIgniteData(conn, targetTable)
    hiveDF.foreachPartition({
      partitionOfRecords => {
        partitionOfRecords.foreach(
          row => for (i <- 0 until row.length) {
            psmt = OperationIgniteUtil.getInsertStatement(conn, targetTable, hiveDF.schema)
            psmt.setObject(i + 1, row.get(i))
            psmt.execute()
          }
        )
      }
    })
  } catch {
    case e: Exception => e.printStackTrace()
  } finally {
    conn.close
  }
}
Then I run it on Spark and it prints this error message:
org.apache.spark.SparkException: Task not serializable
at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:298)
at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:288)
at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:108)
at org.apache.spark.SparkContext.clean(SparkContext.scala:2094)
at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1.apply(RDD.scala:924)
at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1.apply(RDD.scala:923)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:362)
at org.apache.spark.rdd.RDD.foreachPartition(RDD.scala:923)
at org.apache.spark.sql.Dataset$$anonfun$foreachPartition$1.apply$mcV$sp(Dataset.scala:2305)
at org.apache.spark.sql.Dataset$$anonfun$foreachPartition$1.apply(Dataset.scala:2305)
at org.apache.spark.sql.Dataset$$anonfun$foreachPartition$1.apply(Dataset.scala:2305)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:57)
at org.apache.spark.sql.Dataset.withNewExecutionId(Dataset.scala:2765)
at org.apache.spark.sql.Dataset.foreachPartition(Dataset.scala:2304)
at com.pingan.pilot.ignite.common.OperationIgniteUtil$.WriteToIgnite(OperationIgniteUtil.scala:72)
at com.pingan.pilot.ignite.etl.HdfsToIgnite$.main(HdfsToIgnite.scala:36)
at com.pingan.pilot.ignite.etl.HdfsToIgnite.main(HdfsToIgnite.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:738)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.io.NotSerializableException: org.apache.ignite.internal.jdbc2.JdbcConnection
Serialization stack:
- object not serializable (class: org.apache.ignite.internal.jdbc2.JdbcConnection, value: org.apache.ignite.internal.jdbc2.JdbcConnection@7ebc2975)
- field (class: com.pingan.pilot.ignite.common.OperationIgniteUtil$$anonfun$WriteToIgnite$1, name: conn$1, type: interface java.sql.Connection)
- object (class com.pingan.pilot.ignite.common.OperationIgniteUtil$$anonfun$WriteToIgnite$1, )
at org.apache.spark.serializer.SerializationDebugger$.improveException(SerializationDebugger.scala:40)
at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:46)
at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:100)
at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:295)
... 27 more
Does anyone know how to fix it?
Thanks!

The problem here is that you cannot serialize the connection to Ignite, DataSource.conn. The closure you pass to foreachPartition contains the connection as part of its scope, which is why Spark cannot serialize it.
Fortunately, Ignite provides a custom RDD implementation which allows you to save values to it. You will need to create an IgniteContext first, then retrieve Ignite's shared RDD, which provides distributed access to Ignite, to save the Rows of your RDD:
val igniteContext = new IgniteContext(sparkContext, () => new IgniteConfiguration())
...
// Retrieve Ignite's shared RDD
val igniteRDD = igniteContext.fromCache("partitioned")
igniteRDD.saveValues(hiveDF.rdd)
More information is available in the Apache Ignite documentation.
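If you want to keep the plain JDBC approach instead, a common workaround is to open the connection inside foreachPartition, so it is created on each executor rather than captured by the closure. A rough sketch, assuming the Ignite thin JDBC driver and a reachable Ignite host (the URL below is only a placeholder):
val schema = hiveDF.schema // capture only serializable values outside the closure
hiveDF.foreachPartition { partitionOfRecords =>
  // Created on the executor, so nothing non-serializable travels from the driver.
  Class.forName("org.apache.ignite.IgniteJdbcThinDriver")
  val conn = java.sql.DriverManager.getConnection("jdbc:ignite:thin://your-ignite-host/")
  try {
    val psmt = OperationIgniteUtil.getInsertStatement(conn, targetTable, schema)
    partitionOfRecords.foreach { row =>
      // set every column, then execute once per row
      for (i <- 0 until row.length) psmt.setObject(i + 1, row.get(i))
      psmt.execute()
    }
  } finally {
    conn.close()
  }
}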

You have to extend the Serializable trait:
object Test extends Serializable {
  def WriteToIgnite(hiveDF: DataFrame, targetTable: String): Unit = {
    ???
  }
}
I hope this resolves your problem.

Related

How to ensure exactly-once semantics when consuming from Kafka and storing records to database?

I have a consumer running in its own thread. This consumer polls records from a topic, extracts the relevant data and stores the result in a database. Is it possible to achieve exactly-once semantics with this setup? I.e., is it possible to ensure that a record is stored only once in the database?
The consumer config is:
kafka:
  bootstrapServers: ${KAFKA_BOOTSTRAP_SERVERS}
  kafkaSecurityProtocol: ${KAFKA_DK_SECURITY_PROTOCOL}
  schemaRegistryUrl: ${SCHEMA_REGISTRY_URL}
  autoOffsetReset: earliest
  enableAutoCommit: false
  sessionTimeoutMs: 60000
  heartbeatIntervalMs: 6000
  defaultApiTimeoutMs: 120000
  inputTopic: ${KAFKA_INPUT_TOPIC}
  keyDeserializerClass: org.apache.kafka.common.serialization.StringDeserializer
  valueDeserializerClass: org.apache.kafka.common.serialization.StringDeserializer
My consumer thread looks like the following:
import datasource
import dbContext
import extractRelevantData
class ConsumerThread(
    name: String,
    private val consumer: Consumer<String, String>,
    private val sleepTime: Long,
) : Thread(name) {
    private val saveTimer = Metrics.gauge("time.used.on.saving", AtomicLong(0))!!
    private val receiveCounter = Metrics.counter("received")

    override fun run() {
        while (true) {
            try {
                consumer.poll(Duration.ofSeconds(30)).let { rs ->
                    rs.forEach { r ->
                        val data = extractRelevantData(r.value())
                        dbContext.startConnection(dataSource).use {
                            val time = measureTimeMillis {
                                Dao(dbContext).saveData(data)
                            }
                            saveTimer.set(time)
                        }
                    }
                    log.info("Received ${rs.count()} {}", this.name)
                    receiveCounter.increment(rs.count().toDouble())
                    consumer.commitSync()
                }
            } catch (e: Exception) {
                log.error("Unhandled exception when fetching {} from kafka", this.name, e)
                sleep(sleepTime)
            }
        }
    }
}
and my Dao looks like:
class TpAcknowledgementDao(private val dbContext: DbContext) {
    private val table: DbContextTable = dbContext.table("table")
    private val reasonTable: DbContextTable = dbContext.table("reason")

    fun saveData(data: DataType): String {
        dbContext.ensureTransaction().use {
            // Do changes to database (i.e. save data to database and create a saveStatus object)
            it.setComplete()
            return saveStatus.id.toString()
        }
    }
}
I thought my current setup ensured exactly-once semantics: if an exception is thrown, the consumer does not commit, and the database transaction ensures the changes are rolled back. At restart, the records will be consumed one more time and the database transaction will be re-attempted.
However, when I get the following exception:
java.sql.SQLRecoverableException: I/O-error: The Network Adapter could not establish the connection
at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:862)
at oracle.jdbc.driver.PhysicalConnection.connect(PhysicalConnection.java:793)
at oracle.jdbc.driver.T4CDriverExtension.getConnection(T4CDriverExtension.java:57)
at oracle.jdbc.driver.OracleDriver.connect(OracleDriver.java:747)
at oracle.jdbc.pool.OracleDataSource.getPhysicalConnection(OracleDataSource.java:413)
at oracle.jdbc.pool.OracleDataSource.getConnection(OracleDataSource.java:298)
at oracle.jdbc.pool.OracleDataSource.getConnection(OracleDataSource.java:213)
at oracle.jdbc.pool.OracleDataSource.getConnection(OracleDataSource.java:191)
at org.fluentjdbc.DbContext$TopLevelDbContextConnection.getConnection(DbContext.java:274)
at org.fluentjdbc.DbContext.getThreadConnection(DbContext.java:151)
at org.fluentjdbc.DbContext.ensureTransaction(DbContext.java:184)
Caused by: java.net.ConnectException: Connection refused
at java.base/sun.nio.ch.Net.connect0(Native Method)
at java.base/sun.nio.ch.Net.connect(Unknown Source)
at java.base/sun.nio.ch.Net.connect(Unknown Source)
at java.base/sun.nio.ch.SocketChannelImpl.connect(Unknown Source)
at java.base/java.nio.channels.SocketChannel.open(Unknown Source)
at oracle.net.nt.TimeoutSocketChannel.connect(TimeoutSocketChannel.java:99)
at oracle.net.nt.TimeoutSocketChannel.<init>(TimeoutSocketChannel.java:77)
at oracle.net.nt.TcpNTAdapter.connect(TcpNTAdapter.java:192)
... 19 common frames omitted
One or more records will not be stored in the database. Any idea how I can ensure that all records are stored in the database exactly once?
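For reference, the pattern usually recommended for exactly-once delivery into a database (the Kafka consumer documentation describes it as storing offsets outside Kafka) is to write the consumed offsets in the same database transaction as the data, and to seek back to those offsets on startup, so the database itself is the source of truth. A rough sketch in Scala against the plain kafka-clients API; the DAO trait and its methods are hypothetical:
import java.time.Duration
import scala.collection.JavaConverters._
import org.apache.kafka.clients.consumer.KafkaConsumer
import org.apache.kafka.common.TopicPartition

// Hypothetical DAO: stores the record and its (topic, partition, offset)
// in ONE database transaction, and can report the last stored offset.
trait OffsetAwareDao {
  def saveDataWithOffset(data: String, tp: TopicPartition, offset: Long): Unit
  def lastStoredOffset(tp: TopicPartition): Option[Long]
}

def consume(consumer: KafkaConsumer[String, String], dao: OffsetAwareDao): Unit = {
  // Resume from what the database says was processed; with subscribe() this
  // belongs in ConsumerRebalanceListener.onPartitionsAssigned.
  consumer.assignment().asScala.foreach { tp =>
    dao.lastStoredOffset(tp).foreach(offset => consumer.seek(tp, offset + 1))
  }
  while (true) {
    for (r <- consumer.poll(Duration.ofSeconds(30)).asScala) {
      val tp = new TopicPartition(r.topic(), r.partition())
      dao.saveDataWithOffset(r.value(), tp, r.offset()) // atomic with the data
    }
    consumer.commitSync() // best effort only; duplicates are filtered via the stored offsets
  }
}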

How to fix org.apache.spark.SparkException: Job aborted due to stage failure Task & com.datastax.spark.connector.rdd.partitioner.CassandraPartition

In my project I am using the spark-cassandra-connector to read from a Cassandra table and process it further into a JavaRDD, but I am facing an issue while processing the Cassandra rows into a JavaRDD.
org.apache.spark.SparkException: Job aborted due to stage failure: Task 2 in stage 0.0 failed 4 times, most recent failure: Lost task 2.3 in stage 0.0 (TID 52, 172.20.0.4, executor 1):
java.lang.ClassNotFoundException: com.datastax.spark.connector.rdd.partitioner.CassandraPartition
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:67)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1868)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1751)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2042)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2287)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2211)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:431)
at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:370)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
I have configured Spark to use a Spark cluster. When I use local as the master the code works fine, but as soon as I replace it with the cluster master I face this issue.
Here is my spark configuration:
SparkConf sparkConf = new SparkConf().setAppName("Data Transformation")
.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer").setMaster("spark://masterip:7077");
sparkConf.set("spark.cassandra.connection.host", cassandraContactPoints);
sparkConf.set("spark.cassandra.connection.port", cassandraPort);
sparkConf.set("spark.cassandra.connection.timeout_ms", "5000");
sparkConf.set("spark.cassandra.read.timeout_ms", "200000");
sparkConf.set("spark.driver.allowMultipleContexts", "true");
/*
* sparkConf.set("spark.cassandra.auth.username", "centralrw");
* sparkConf.set("spark.cassandra.auth.password", "t8b9HRWy");
*/
logger.info("creating spark context object");
sparkContext = new JavaSparkContext(sparkConf);
logger.info("returning sparkcontext object");
return sparkContext;
Spark version - 2.4.0
spark-cassandra-connector - 2.4.0
ReceiverConfig:
public List<Map<String, GenericTriggerEntity>> readDataFromGenericTriggerEntityUsingSpark(
        JavaSparkContext sparkContext) {
    List<Map<String, GenericTriggerEntity>> genericTriggerEntityList = new ArrayList<Map<String, GenericTriggerEntity>>();
    try {
        logger.info("Keyspace & table name to read data from cassandra");
        String tableName = "generictriggerentity";
        String keySpace = "centraldatalake";
        logger.info("establishing conection");
        CassandraJavaRDD<CassandraRow> cassandraRDD = CassandraJavaUtil.javaFunctions(sparkContext)
                .cassandraTable(keySpace, tableName);
        int num = cassandraRDD.getNumPartitions();
        System.out.println("num- " + num);
        logger.info("Converting extracted rows to JavaRDD");
        JavaRDD<Map<String, GenericTriggerEntity>> rdd = cassandraRDD
                .map(new Function<CassandraRow, Map<String, GenericTriggerEntity>>() {
                    private static final long serialVersionUID = -165799649937652815L;
                    @Override
                    public Map<String, GenericTriggerEntity> call(CassandraRow row) throws Exception {
                        Map<String, GenericTriggerEntity> genericTriggerEntityMap = new HashMap<String, GenericTriggerEntity>();
                        GenericTriggerEntity genericTriggerEntity = new GenericTriggerEntity();
                        if (row.getString("end") != null)
                            genericTriggerEntity.setEnd(row.getString("end"));
                        if (row.getString("key") != null)
                            genericTriggerEntity.setKey(row.getString("key"));
                        if (row.getString("keyspacename") != null)
                            genericTriggerEntity.setKeyspacename(row.getString("keyspacename"));
                        if (row.getString("partitiondeleted") != null)
                            genericTriggerEntity.setPartitiondeleted(row.getString("partitiondeleted"));
                        if (row.getString("rowdeleted") != null)
                            genericTriggerEntity.setRowdeleted(row.getString("rowdeleted"));
                        if (row.getString("rows") != null)
                            genericTriggerEntity.setRows(row.getString("rows"));
                        if (row.getString("start") != null)
                            genericTriggerEntity.setStart(row.getString("start"));
                        if (row.getString("tablename") != null) {
                            genericTriggerEntity.setTablename(row.getString("tablename"));
                            dataTableName = row.getString("tablename");
                        }
                        if (row.getString("triggerdate") != null)
                            genericTriggerEntity.setTriggerdate(row.getString("triggerdate"));
                        if (row.getString("triggertime") != null)
                            genericTriggerEntity.setTriggertime(row.getString("triggertime"));
                        if (row.getString("uuid") != null)
                            genericTriggerEntity.setUuid(row.getUUID("uuid"));
                        genericTriggerEntityMap.put(dataTableName, genericTriggerEntity);
                        return genericTriggerEntityMap;
                    }
                });
        List<Partition> partition = rdd.partitions();
        System.out.println("partion - " + partition.size());
        logger.info("Collecting data into rdd");
        genericTriggerEntityList = rdd.collect();
    } catch (Exception e) {
        e.printStackTrace();
    }
    logger.info("returning generic trigger entity list");
    return genericTriggerEntityList;
}
When I do rdd.collect() it gives this exception:
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 21, 10.22.3.55, executor 0): java.lang.ClassNotFoundException: in.dmart.central.data.transform.base.config.ReceiverConfig$1
I found a solution of creating a fat jar and including it, but I do not want to do that, because every time I make a change I will have to repeat the whole process, and that is not practical.
Please suggest a solution I can configure in the code or on the Spark cluster.
Thanks in advance.
If you don't create a fat jar, then you need to submit the job with the correct package specified, like this:
spark-submit --packages datastax:spark-cassandra-connector:2.4.1-s_2.11 \
...rest of your options/arguments...
This will distribute the corresponding Spark Cassandra Connector packages to all Spark nodes.

How do I load sequence files (from HDFS) in parallel and process each of them in parallel with Spark?

I need to load HDFS files in parallel and process (read and filter based on some criteria) each file in parallel. The following code loads the files serially. I am running the Spark application with three workers (4 cores each). I even tried setting the partition parameter in the parallelize method, but there was no performance improvement. I'm sure my cluster has enough resources to run the jobs in parallel. What changes should I make to make it parallel?
sparkConf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer");
sparkConf.set("spark.closure.serializer", "org.apache.spark.serializer.JavaSerializer");
JavaSparkContext sparkContext = new JavaSparkContext(sparkConf);
JavaRDD<String> files = sparkContext.parallelize(fileList);
Iterator<String> localIterator = files.toLocalIterator();
while (localIterator.hasNext())
{
String hdfsPath = localIterator.next();
long startTime = DateUtil.getCurrentTimeMillis();
JavaPairRDD<IntWritable, BytesWritable> hdfsContent = sparkContext.sequenceFile(hdfsPath, IntWritable.class, BytesWritable.class);
try
{
JavaRDD<Message> logs = hdfsContent.map(new Function<Tuple2<IntWritable, BytesWritable>, Message>()
{
public Message call(Tuple2<IntWritable, BytesWritable> tuple2) throws Exception
{
BytesWritable value = tuple2._2();
BytesWritable tmp = new BytesWritable();
tmp.setCapacity(value.getLength());
tmp.set(value);
return (Message) getProtos(logtype, tmp.getBytes());
}
});
final JavaRDD<Message> filteredLogs = logs.filter(new Function<Message, Boolean>()
{
public Boolean call(Message msg) throws Exception
{
FieldDescriptor fd = msg.getDescriptorForType().findFieldByName("method");
String value = (String) msg.getField(fd);
if (value.equals("POST"))
{
return true;
}
return false;
}
});
long timetaken = DateUtil.getCurrentTimeMillis() - startTime;
LOGGER.log(Level.INFO, "HDFS: {0} Total Log Count : {1} Filtered Log Count : {2} TimeTaken : {3}", new Object[] { hdfsPath, logs.count(), filteredLogs.count(), timetaken });
}
catch (Exception e)
{
LOGGER.log(Level.INFO, "Exception : ", e);
}
}
Instead of iterating over the files RDD locally, I also tried Spark functions like map and foreach, but they throw the following Spark exception. No external variables are referenced inside the closure, and my class (OldLogAnalyzer) already implements the Serializable interface. KryoSerializer and JavaSerializer are also configured in SparkConf. I'm puzzled about what is not serializable in my code.
Exception in thread "main" org.apache.spark.SparkException: Task not serializable
at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:166)
at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:158)
at org.apache.spark.SparkContext.clean(SparkContext.scala:1622)
at org.apache.spark.rdd.RDD.map(RDD.scala:286)
at org.apache.spark.api.java.JavaRDDLike$class.map(JavaRDDLike.scala:81)
at org.apache.spark.api.java.JavaRDD.map(JavaRDD.scala:32)
at com.test.logs.spark.OldLogAnalyzer.main(OldLogAnalyzer.java:423)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:569)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:166)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:189)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:110)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.io.NotSerializableException: org.apache.spark.api.java.JavaSparkContext
Serialization stack:
- object not serializable (class: org.apache.spark.api.java.JavaSparkContext, value: org.apache.spark.api.java.JavaSparkContext@68f277a2)
- field (class: com.test.logs.spark.OldLogAnalyzer$10, name: val$sparkContext, type: class org.apache.spark.api.java.JavaSparkContext)
- object (class com.test.logs.spark.OldLogAnalyzer$10, com.test.logs.spark.OldLogAnalyzer$10@2f80b005)
- field (class: org.apache.spark.api.java.JavaPairRDD$$anonfun$toScalaFunction$1, name: fun$1, type: interface org.apache.spark.api.java.function.Function)
- object (class org.apache.spark.api.java.JavaPairRDD$$anonfun$toScalaFunction$1, <function1>)
at org.apache.spark.serializer.SerializationDebugger$.improveException(SerializationDebugger.scala:38)
at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:47)
at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:80)
at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:164)
... 15 more
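The serialization stack above actually names the culprit: the JavaSparkContext itself is being captured, because sparkContext.sequenceFile(...) is called inside the function passed to map/foreach on the files RDD, and a SparkContext can only live on the driver. The usual driver-side alternative is to build one RDD over all the files and let Spark parallelize the reading itself. A rough sketch, written in Scala for brevity (sequenceFile should accept a comma-separated list of paths; a union of per-file RDDs works as well):
import org.apache.hadoop.io.{BytesWritable, IntWritable}

// Build ONE RDD over all files on the driver; Spark then reads and filters
// the partitions of every file in parallel across the executors.
val allPaths = fileList.mkString(",") // fileList: the same list of HDFS paths
val hdfsContent = sparkContext.sequenceFile(allPaths, classOf[IntWritable], classOf[BytesWritable])

val logs = hdfsContent.map { case (_, value) =>
  // copy the reused Writable, exactly as in the original code
  val tmp = new BytesWritable()
  tmp.setCapacity(value.getLength)
  tmp.set(value)
  getProtos(logtype, tmp.getBytes).asInstanceOf[Message] // getProtos/logtype as in the question
}

val filteredLogs = logs.filter { msg =>
  val fd = msg.getDescriptorForType.findFieldByName("method")
  "POST" == msg.getField(fd)
}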

Getting class information from ElementHandle<TypeElement>

I wanted to create a simple template wizard for NetBeans that would take an existing Java class from the current user project and create a new Java class from it. To do so, I need to access field and annotation data from the selected class (Java file).
Now, I used org.netbeans.api.java.source.ui.TypeElementFinder for finding and selecting the wanted class, but as a result I get an ElementHandle and I don't know what to do with it. How do I get the class info from this?
I managed to get a TypeMirror (com.sun.tools.javac.code.Type$ClassType) using this code snippet:
Project project = ...; // from Templates.getProject(WizardDescriptor);
ClasspathInfo ci = ClasspathInfoFactory.infoFor(project, true, true, true);
final ElementHandle<TypeElement> element = TypeElementFinder.find(ci);
FileObject fo = SourceUtils.getFile(element, ci);
JavaSource.forFileObject(fo).runUserActionTask(new Task<CompilationController>() {
    @Override
    public void run(CompilationController p) throws Exception {
        p.toPhase(JavaSource.Phase.RESOLVED);
        TypeElement typeElement = element.resolve(p);
        TypeMirror typeMirror = typeElement.asType();
    }
}, true);
But what to do from here? Am I going about this the wrong way altogether?
EDIT:
In response to francesco foresti's post:
I tried loads of different Reflection/ClassLoader approaches. I got to a point where an org.netbeans.api.java.classpath.ClassPath instance was created and would contain the wanted class file, but when I tried loading said class with a ClassLoader created from that ClassPath, I would get a ClassNotFoundException. Here's my code:
Project project = ...; // from Templates.getProject(WizardDescriptor);
ClasspathInfo ci = ClasspathInfoFactory.infoFor(project, true, true, true);
final ElementHandle<TypeElement> element = TypeElementFinder.find(ci);
FileObject fo = SourceUtils.getFile(element, ci);
ClassPath cp = ci.getClassPath(ClasspathInfo.PathKind.SOURCE);
System.out.println("NAME: " + element.getQualifiedName());
System.out.println("CONTAINS: " + cp.contains(fo));
try {
    Class clazz = Class.forName(element.getQualifiedName(), true, cp.getClassLoader(true));
    System.out.println(clazz.getName());
} catch (ClassNotFoundException ex) {
    Exceptions.printStackTrace(ex);
}
This yields:
NAME: hr.test.Test
CONTAINS: true
SEVERE [org.openide.util.Exceptions]
java.lang.ClassNotFoundException: hr.test.Test
at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
at org.openide.execution.NbClassLoader.findClass(NbClassLoader.java:210)
at org.netbeans.api.java.classpath.ClassLoaderSupport.findClass(ClassLoaderSupport.java:113)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:344)
Why don't you use plain old reflection? E.g.:
Class<?> theClazz = Class.forName("com.example.myClass");
Annotation[] annotations = theClazz.getAnnotations();
Field[] fields = theClazz.getFields();
Method[] methods = theClazz.getMethods();
// from here on, you write a file that happens to be in the classpath,
// and whose extension is '.java'

Setting Bounded mailbox for an actor that uses stashing

I have an actor that depends on stashing. Stashing requires an UnboundedDequeBasedMailbox. Is it possible to use stashing together with a bounded mailbox?
As Arnaud pointed out, my setup is:
Actor with UnrestrictedStash with RequiresMessageQueue[BoundedDequeBasedMessageQueueSemantics]
and configuration :
akka {
  loggers = ["akka.event.slf4j.Slf4jLogger"]
  loglevel = INFO
  daemonic = on
  actor {
    mailbox {
      bounded-deque-based {
        mailbox-type = "akka.dispatch.BoundedDequeBasedMailbox"
        mailbox-capacity = 2000
        mailbox-push-timeout-time = 5s
      }
      requirements {
        "akka.dispatch.BoundedDequeBasedMessageQueueSemantics" = akka.actor.mailbox.bounded-deque-based
      }
    }
  }
}
And the error is :
[ERROR] 2013/12/06 14:03:58.268 [r-4] a.a.OneForOneStrategy - DequeBasedMailbox required, got: akka.dispatch.UnboundedMailbox$MessageQueue
An (unbounded) deque-based mailbox can be configured as follows:
my-custom-mailbox {
mailbox-type = "akka.dispatch.UnboundedDequeBasedMailbox"
}
akka.actor.ActorInitializationException: exception during creation
at akka.actor.ActorInitializationException$.apply(Actor.scala:218) ~[akka-actor_2.10-2.2.0.jar:2.2.0]
at akka.actor.ActorCell.create(ActorCell.scala:578) ~[akka-actor_2.10-2.2.0.jar:2.2.0]
at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:425) ~[akka-actor_2.10-2.2.0.jar:2.2.0]
at akka.actor.ActorCell.systemInvoke(ActorCell.scala:447) ~[akka-actor_2.10-2.2.0.jar:2.2.0]
at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:262) ~[akka-actor_2.10-2.2.0.jar:2.2.0]
at akka.dispatch.Mailbox.run(Mailbox.scala:218) ~[akka-actor_2.10-2.2.0.jar:2.2.0]
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386) [akka-actor_2.10-2.2.0.jar:2.2.0]
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) [scala-library-2.10.2.jar:na]
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) [scala-library-2.10.2.jar:na]
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) [scala-library-2.10.2.jar:na]
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) [scala-library-2.10.2.jar:na]
Caused by: akka.actor.ActorInitializationException: DequeBasedMailbox required, got: akka.dispatch.UnboundedMailbox$MessageQueue
An (unbounded) deque-based mailbox can be configured as follows:
my-custom-mailbox {
mailbox-type = "akka.dispatch.UnboundedDequeBasedMailbox"
}
at akka.actor.ActorInitializationException$.apply(Actor.scala:218) ~[akka-actor_2.10-2.2.0.jar:2.2.0]
at akka.actor.UnrestrictedStash$class.$init$(Stash.scala:82) ~[akka-actor_2.10-2.2.0.jar:2.2.0]
at com.fg.mail.smtp.index.Indexer.<init>(Indexer.scala:38) ~[classes/:na]
at com.fg.mail.smtp.Supervisor$$anonfun$preStart$1.apply(Supervisor.scala:20) ~[classes/:na]
at com.fg.mail.smtp.Supervisor$$anonfun$preStart$1.apply(Supervisor.scala:20) ~[classes/:na]
at akka.actor.CreatorFunctionConsumer.produce(Props.scala:369) ~[akka-actor_2.10-2.2.0.jar:2.2.0]
at akka.actor.Props.newActor(Props.scala:323) ~[akka-actor_2.10-2.2.0.jar:2.2.0]
at akka.actor.ActorCell.newActor(ActorCell.scala:534) ~[akka-actor_2.10-2.2.0.jar:2.2.0]
at akka.actor.ActorCell.create(ActorCell.scala:560) ~[akka-actor_2.10-2.2.0.jar:2.2.0]
... 9 common frames omitted
If I do Props.withMailbox("bounded-deque-based") then I get
Caused by: akka.ConfigurationException: Mailbox Type [bounded-deque-based] not configured
ANSWER (thanks to Arnaud): The problem was the configuration; the bounded-deque-based config block should be at the same level as the akka config block. The documentation is totally misleading in this case...
akka {
  loggers = ["akka.event.slf4j.Slf4jLogger"]
  loglevel = INFO
  daemonic = on
  actor {
    mailbox {
      requirements {
        "akka.dispatch.BoundedDequeBasedMessageQueueSemantics" = akka.actor.mailbox.bounded-deque-based
      }
    }
  }
}
bounded-deque-based {
  mailbox-type = "akka.dispatch.BoundedDequeBasedMailbox"
  mailbox-capacity = 2000
  mailbox-push-timeout-time = 5s
}
According to the official Akka stash docs, you can use a Stash trait that does not enforce any mailbox type: see UnrestrictedStash.
If you use UnrestrictedStash you can manually configure the proper mailbox, as long as it extends the akka.dispatch.DequeBasedMessageQueueSemantics marker trait.
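For example, the actor side of this setup can look like the following (a minimal sketch; the message names are made up, and the class name simply matches the Indexer in your stack trace):
import akka.actor.{Actor, UnrestrictedStash}
import akka.dispatch.{BoundedDequeBasedMessageQueueSemantics, RequiresMessageQueue}

// UnrestrictedStash does not force an unbounded mailbox; the RequiresMessageQueue
// marker only asks for deque semantics, which a bounded deque-based mailbox satisfies.
class Indexer extends Actor
  with UnrestrictedStash
  with RequiresMessageQueue[BoundedDequeBasedMessageQueueSemantics] {

  def receive = waiting

  def waiting: Receive = {
    case "initialized" =>
      unstashAll()
      context.become(ready)
    case _ => stash()
  }

  def ready: Receive = {
    case _ => () // handle messages normally
  }
}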
You can manually configure your bounded mailbox following the mailboxes doc with something like:
bounded-mailbox {
  mailbox-type = "akka.dispatch.BoundedDequeBasedMailbox"
  mailbox-capacity = 1000
  mailbox-push-timeout-time = 10s
}
akka.actor.mailbox.requirements {
  "akka.dispatch.BoundedDequeBasedMessageQueueSemantics" = bounded-mailbox
}
I did not try it myself but it should work.
Edit: what version of Akka are you using? It looks like the stash trait definition changed with version 2.2.0.
