Vert.x performance drop when starting with -cluster option - java

I'm wondering if any one experienced the same problem.
We have a Vert.x application and in the end it's purpose is to insert 600 million rows into a Cassandra cluster. We are testing the speed of Vert.x in combination with Cassandra by doing tests in smaller amounts.
If we run the fat jar (build with Shade plugin) without the -cluster option, we are able to insert 10 million records in about a minute. When we add the -cluster option (eventually we will run the Vert.x application in cluster) it takes about 5 minutes for 10 million records to insert.
Does anyone know why?
We know that the Hazelcast config will create some overhead, but never thought it would be 5 times slower. This implies we will need 5 EC2 instances in cluster to get the same result when using 1 EC2 without the cluster option.
As mentioned, everything runs on EC2 instances:
2 Cassandra servers on t2.small
1 Vert.x server on t2.2xlarge

You are actually running into corner cases of the Vert.x Hazelcast Cluster manager.
First of all you are using a worker Verticle to send your messages (30000001). Under the hood Hazelcast is blocking and thus when you send a message from a worker the version 3.3.3 does not take that in account. Recently we added this fix https://github.com/vert-x3/issues/issues/75 (not present in 3.4.0.Beta1 but present in 3.4.0-SNAPSHOTS) that will improve this case.
Second when you send all your messages at the same time, it runs into another corner case that prevents the Hazelcast cluster manager to use a cache of the cluster topology. This topology cache is usually updated after the first message has been sent and sending all the messages in one shot prevents the usage of the ache (short explanation HazelcastAsyncMultiMap#getInProgressCount will be > 0 and prevents the cache to be used), hence paying the penalty of an expensive lookup (hence the cache).
If I use Bertjan's reproducer with 3.4.0-SNAPSHOT + Hazelcast and the following change: send message to destination, wait for reply. Upon reply send all messages then I get a lot of improvements.
Without clustering : 5852 ms
With clustering with HZ 3.3.3 :16745 ms
With clustering with HZ 3.4.0-SNAPSHOT + initial message : 8609 ms
I believe also you should not use a worker verticle to send that many messages and instead send them using an event loop verticle via batches. Perhaps you should explain your use case and we can think about the best way to solve it.

When you're you enable clustering (of any kind) to an application you are making your application more resilient to failures but you're also adding a performance penalty.
For example your current flow (without clustering) is something like:
client ->
vert.x app ->
in memory same process eventbus (negletible) ->
handler -> cassandra
<- vert.x app
<- client
Once you enable clustering:
client ->
vert.x app ->
serialize request ->
network request cluster member ->
deserialize request ->
handler -> cassandra
<- serialize response
<- network reply
<- deserialize response
<- vert.x app
<- client
As you can see there are many encode decode operations required plus several network calls and this all gets added to your total request time.
In order to achive best performance you need to take advantage of locality the closer you are of your data store usually the fastest.

Just to add the code of the project. I guess that would help.
Sender verticle:
public class ProviderVerticle extends AbstractVerticle {
#Override
public void start() throws Exception {
IntStream.range(1, 30000001).parallel().forEach(i -> {
vertx.eventBus().send("clustertest1", Json.encode(new TestCluster1(i, "abc", LocalDateTime.now())));
});
}
#Override
public void stop() throws Exception {
super.stop();
}
}
And the inserter verticle
public class ReceiverVerticle extends AbstractVerticle {
private int messagesReceived = 1;
private Session cassandraSession;
#Override
public void start() throws Exception {
PoolingOptions poolingOptions = new PoolingOptions()
.setCoreConnectionsPerHost(HostDistance.LOCAL, 2)
.setMaxConnectionsPerHost(HostDistance.LOCAL, 3)
.setCoreConnectionsPerHost(HostDistance.REMOTE, 1)
.setMaxConnectionsPerHost(HostDistance.REMOTE, 3)
.setMaxRequestsPerConnection(HostDistance.LOCAL, 20)
.setMaxQueueSize(32768)
.setMaxRequestsPerConnection(HostDistance.REMOTE, 20);
Cluster cluster = Cluster.builder()
.withPoolingOptions(poolingOptions)
.addContactPoints(ClusterSetup.SEEDS)
.build();
System.out.println("Connecting session");
cassandraSession = cluster.connect("kiespees");
System.out.println("Session connected:\n\tcluster [" + cassandraSession.getCluster().getClusterName() + "]");
System.out.println("Connected hosts: ");
cassandraSession.getState().getConnectedHosts().forEach(host -> System.out.println(host.getAddress()));
PreparedStatement prepared = cassandraSession.prepare(
"insert into clustertest1 (id, value, created) " +
"values (:id, :value, :created)");
PreparedStatement preparedTimer = cassandraSession.prepare(
"insert into timer (name, created_on, amount) " +
"values (:name, :createdOn, :amount)");
BoundStatement timerStart = preparedTimer.bind()
.setString("name", "clusterteststart")
.setInt("amount", 0)
.setTimestamp("createdOn", new Timestamp(new Date().getTime()));
cassandraSession.executeAsync(timerStart);
EventBus bus = vertx.eventBus();
System.out.println("Bus info: " + bus.toString());
MessageConsumer<String> cons = bus.consumer("clustertest1");
System.out.println("Consumer info: " + cons.address());
System.out.println("Waiting for messages");
cons.handler(message -> {
TestCluster1 tc = Json.decodeValue(message.body(), TestCluster1.class);
if (messagesReceived % 100000 == 0)
System.out.println("Message received: " + messagesReceived);
BoundStatement boundRecord = prepared.bind()
.setInt("id", tc.getId())
.setString("value", tc.getValue())
.setTimestamp("created", new Timestamp(new Date().getTime()));
cassandraSession.executeAsync(boundRecord);
if (messagesReceived % 100000 == 0) {
BoundStatement timerStop = preparedTimer.bind()
.setString("name", "clusterteststop")
.setInt("amount", messagesReceived)
.setTimestamp("createdOn", new Timestamp(new Date().getTime()));
cassandraSession.executeAsync(timerStop);
}
messagesReceived++;
//message.reply("OK");
});
}
#Override
public void stop() throws Exception {
super.stop();
cassandraSession.close();
}
}

Related

Flink Kafka - how to make App run in Parallel?

I am creating a app in Flink to
Read Messages from a topic
Do some simple process on it
Write Result to a different topic
My code does work, however it does not run in parallel
How do I do that?
It seems my code runs only on one thread/block?
On the Flink Web Dashboard:
App goes to running status
But, there is only one block shown in the overview subtasks
And Bytes Received / Sent, Records Received / Sent is always zero ( no Update )
Here is my code, please assist me in learning how to split my app to be able to run in parallel, and am I writing the app correctly?
public class SimpleApp {
public static void main(String[] args) throws Exception {
// create execution environment INPUT
StreamExecutionEnvironment env_in =
StreamExecutionEnvironment.getExecutionEnvironment();
// event time characteristic
env_in.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
// production Ready (Does NOT Work if greater than 1)
env_in.setParallelism(Integer.parseInt(args[0].toString()));
// configure kafka consumer
Properties properties = new Properties();
properties.setProperty("zookeeper.connect", "localhost:2181");
properties.setProperty("bootstrap.servers", "localhost:9092");
properties.setProperty("auto.offset.reset", "earliest");
// create a kafka consumer
final DataStream<String> consumer = env_in
.addSource(new FlinkKafkaConsumer09<>("test", new
SimpleStringSchema(), properties));
// filter data
SingleOutputStreamOperator<String> result = consumer.filter(new
FilterFunction<String>(){
#Override
public boolean filter(String s) throws Exception {
return s.substring(0, 2).contentEquals("PS");
}
});
// Process Data
// Transform String Records to JSON Objects
SingleOutputStreamOperator<JSONObject> data = result.map(new
MapFunction<String, JSONObject>()
{
#Override
public JSONObject map(String value) throws Exception
{
JSONObject jsnobj = new JSONObject();
if(value.substring(0, 2).contentEquals("PS"))
{
// 1. Raw Data
jsnobj.put("Raw_Data", value.substring(0, value.length()-6));
// 2. Comment
int first_index_comment = value.indexOf("$");
int last_index_comment = value.lastIndexOf("$") + 1;
// - set comment
String comment =
value.substring(first_index_comment, last_index_comment);
comment = comment.substring(0, comment.length()-6);
jsnobj.put("Comment", comment);
}
else {
jsnobj.put("INVALID", value);
}
return jsnobj;
}
});
// Write JSON to Kafka Topic
data.addSink(new FlinkKafkaProducer09<JSONObject>("localhost:9092",
"FilteredData",
new SimpleJsonSchema()));
env_in.execute();
}
}
My code does work, but it seems to run only on a single thread
( One block shown ) in web interface ( No passing of data, hence the bytes sent / received are not updated ).
How do I make it run in parallel ?
To run your job in parallel you can do 2 things:
Increase the parallelism of your job at the env level - i.e. do something like
StreamExecutionEnvironment env_in =
StreamExecutionEnvironment.getExecutionEnvironment().setParallelism(4);
But this would only increase parallelism at flink end after it reads the data, so if the source is producing data faster it might not be fully utilized.
To fully parallelize your job, setup multiple partitions for your kafka topic, ideally the amount of parallelism you would want with your flink job. So, you might want to do something like below when you are creating your kafka topic:
bin/kafka-topics.sh --create --zookeeper localhost:2181
--replication-factor 3 --partitions 4 --topic test

Vert.x multi-thread web-socket

I have simple vert.x app:
public class Main {
public static void main(String[] args) {
Vertx vertx = Vertx.vertx(new VertxOptions().setWorkerPoolSize(40).setInternalBlockingPoolSize(40));
Router router = Router.router(vertx);
long main_pid = Thread.currentThread().getId();
Handler<ServerWebSocket> wsHandler = serverWebSocket -> {
if(!serverWebSocket.path().equalsIgnoreCase("/ws")){
serverWebSocket.reject();
} else {
long socket_pid = Thread.currentThread().getId();
serverWebSocket.handler(buffer -> {
String str = buffer.getString(0, buffer.length());
long handler_pid = Thread.currentThread().getId();
log.info("Got ws msg: " + str);
String res = String.format("(req:%s)main:%d sock:%d handlr:%d", str, main_pid, socket_pid, handler_pid);
try {
Thread.sleep(500);
} catch (InterruptedException e) {
e.printStackTrace();
}
serverWebSocket.writeFinalTextFrame(res);
});
}
};
vertx
.createHttpServer()
.websocketHandler(wsHandler)
.listen(8080);
}
}
When I connect this server with multiple clients I see that it works in one thread. But I want to handle each client connection parallelly. How I should change this code to do it?
This:
new VertxOptions().setWorkerPoolSize(40).setInternalBlockingPoolSize(40)
looks like you're trying to create your own HTTP connection pool, which is likely not what you really want.
The idea of Vert.x and other non-blocking event-loop based frameworks, is that we don't attempt the 1 thread -> 1 connection affinity, rather, when a request, currently being served by the event loop thread is waiting for IO - EG the response from a DB - that event-loop thread is freed to service another connection. This then allows a single event loop thread to service multiple connections in a concurrent-like fashion.
If you want to fully utilise all core on your machine, and you're only going to be running a single verticle, then set the number of instances to the number of cores when your deploy your verticle.
IE
Vertx.vertx().deployVerticle("MyVerticle", new DeploymentOptions().setInstances(Runtime.getRuntime().availableProcessors()));
Vert.x is a reactive framework, which means that it uses a single thread model to handle all your application load. This model is known to scale better than the threaded model.
The key point to know is that all code you put in a handler must never block (like your Thread.sleep) since it will block the main thread. If you have blocking code (say for example a JDBC call) you should wrap your blocking code in a executingBlocking handler, e.g.:
serverWebSocket.handler(buffer -> {
String str = buffer.getString(0, buffer.length());
long handler_pid = Thread.currentThread().getId();
log.info("Got ws msg: " + str);
String res = String.format("(req:%s)main:%d sock:%d handlr:%d", str, main_pid, socket_pid, handler_pid);
vertx.executeBlocking(future -> {
try {
Thread.sleep(500);
} catch (InterruptedException e) {
e.printStackTrace();
}
serverWebSocket.writeFinalTextFrame(res);
future.complete();
});
});
Now all the blocking code will be run on a thread from the thread pool that you can configure as already shown in other replies.
If you would like to avoid writing all these execute blocking handlers and you know that you need to do several blocking calls then you should consider using a worker verticle, since these will scale at the event bus level.
A final note for multi threading is that if you use multiple threads your server will not be as efficient as a single thread, for example it won't be able to handle 10 million websockets since 10 million threads event on a modern machine (we're in 2016) will bring your OS scheduler to its knees.

Thread VS RabbitMQ Worker resource consumption

I am using JAVA ExecutorService threads to send amazon emails, this helps me to make concurrent connection with AmazonSES via API and sends mails at lightning speed. So amazon accepts some number of connection in a sec, so for me its 50 requests in a second. So I execute 50 threads in a second and send around 1million emails daily.
This is working pretty good, but now the number of mails is going to be increased. And I don't want to invest more into RAM and processors.
One of my friend suggested me to use RabbitMQ Workers instead of threads, so instead of 50 threads, I ll be having 50 workers which will do that job.
So before changing some code to test the resource management, I just want to know will there be any huge difference in consumption? So currently when I execute my threads, JAVA consumes 20-30% of memory. So if I used workers will it be low or high?
Or is their any alternative option to this?
Here is my thread email sending function:
#Override
public void run() {
Destination destination = new Destination().withToAddresses(new String[] { this.TO });
Content subject = new Content().withData(SUBJECT);
Content textBody = new Content().withData(BODY);
Body body = new Body().withHtml(textBody);
Message message = new Message().withSubject(subject).withBody(body);
SendEmailRequest request = new SendEmailRequest().withSource(FROM).withDestination(destination).withMessage(message);
Connection connection = new Connection();
java.util.Date dt = new java.util.Date();
java.text.SimpleDateFormat sdf = new java.text.SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
String insert = "";
try {
System.out.println("Attempting to send an email to " + this.TO);
ctr++;
client.sendEmail(request);
insert = "INSERT INTO email_histories (campaign_id,contact_id,group_id,is_opened,mail_sent_at,mail_opened_at,ip_address,created_at,updated_at,is_sent) VALUES (" + this.campaign_id + ", " + this.contact_id + ", " + this.group_id + ", false, '" + sdf.format(dt) + "', null, null, '" + sdf.format(dt) + "', '" + sdf.format(dt) + "', true);";
connection.insert(insert);
System.out.println("Email sent! to " + this.TO);
} catch (Exception ex) {
System.out.println("The email was not sent.");
System.out.println("Error message: " + ex.getMessage());
}
}
I have no experience with RabbitMQ, so I'll have to leave that for others to answer.
Or is their any alternative option to this?
Instead of using one thread per mail, move that code inside a runable. Add a shared Semaphore with the number of permits = the number of mails you want to send per second. Take one permit per mail, refill permits every second from another thread (i.e. a separate SchedledExecutorService or a Timer). Then adjust the Executor thread pool size to whatever your server can handle.
From a RabbitMQ perspective there is a small amount of memory and network resource consumed, although pretty constant for each connection. If you use a pool of worker threads to read off of the RabbitMQ queue or queues it is possible that it will save you some resources because you are not garbage collecting the individual threads. As far as alternatives are concerned I would use a Thread Pool in any case. Although perhaps too heavyweight for your use, Spring Framework has a very good thread pool that I have used in the past.

Communication of separate processes through Esper events

I am trying to let multiple java processes exchange events using Esper. One process should send events, the other prepares a query and reacts according to the reported events.
When both operations are done within the same java process, everything works fine. But when I use two different processes, they just don't see each other.
I am wondering what is the key for this communication. I used the same name for the provider. This is all I could do so far.
The Producer:
String aType = espertest.dummy.A.class.getName();
Configuration cepConfig = new Configuration();
cepConfig.addEventType("A",aType);
EPServiceProvider epService = EPServiceProviderManager.getProvider("DummyProvider", cepConfig);
Object o = new A();
epService.getEPRuntime().sendEvent(o);
The Consumer:
String aType = A.class.getName();
String expression = "select count(*) from "+aType + "";
System.out.println("Our Query: " + expression);
Configuration cepConfig = new Configuration();
cepConfig.addEventType("A",aType);
EPServiceProvider epService = EPServiceProviderManager.getProvider("DummyProvider", cepConfig);
EPStatement statement = epService.getEPAdministrator().createEPL(expression);
DummyListener listener = new DummyListener();
statement.addListener(listener);
System.out.println("Anything");
try{
A a = new A();
epService.getEPRuntime().sendEvent(a);
Thread.sleep(60000);
}catch(Exception E)
{
System.out.println("Exception ");
}
The consumer tries to count the events of type A. It also sends an instance of A as a test, and this works fine. The listener is called as expected.
The code above is just an excerpt.
You need to configure middleware (Message Queue, Distributed Cache, Networked FileSystem, Socket Connection, etc....) to get the events from the producer JVM to the consumer JVM. If you can deploy the producer and consumer to a container that supports Apache Camel (e.g. ServiceMix) then it should be trivial to stand up a prototype that uses ActiveMQ to transport your objects into Esper as Camel has support for both products.
JVM 1
From Data Source
To CEP Engine 1
To Message Queue
JVM 2 (also could host MQ Broker)
From Message Queue
To CEP Engine 2
To Destination
Update:
If the producer and consumer can be threads in the same JVM, then the issue may be in the consumer. I cannot see where the consumer does anything with the event from the producer. Try something like this instead (esper reference is provided to the producer/consumer and consumer is reworked with an update method to handle results of the select statement).
Test Driver:
public Driver() {
String aType = espertest.dummy.A.class.getName();
Configuration cepConfig = new Configuration();
cepConfig.addEventType("A",aType);
EPServiceProvider epService = EPServiceProviderManager.getProvider("DummyProvider", cepConfig);
Consumer c = new Consumer(epService);
Producer p = new Producer(epService);
}
Producer:
public Producer(EPServiceProvider epsp) {
Object o = new A();
epsp.getEPRuntime().sendEvent(o);
}
Consumer:
public Consumer(EPServiceProvider epsp) {
EPStatement statement = epsp.getEPAdministrator().createEPL(input);
statement.setSubscriber(this);
}
public void update(A event) {
System.out.println("Consumer received event!");
}

Cassandra updates not working consistently

I run the following code on my local (mac) machine and on a remote unix server.:
public void deleteValue(final String id, final String value) {
log.info("Removing value " + value);
final Collection<String> valuesBeforeRemoval = getValues(id);
final MutationBatch m = keyspace.prepareMutationBatch();
m.withRow(VALUES_CF, id).deleteColumn(value);
try {
m.execute();
} catch (final ConnectionException e) {
log.error("Unable to delete location " + value, e);
}
final Collection<String> valuesAfterRemoval = getValues(id);
if (valuesAfterRemoval.size()!=(valuesBeforeRemoval.size()-1)) {
log.error("value " + value + " was supposed to be removed from list " + valuesBeforeRemoval + " but it wasn't: " + valuesAfterRemoval);
}
...
}
protected Collection<String> getValues(final String id) {
try {
final OperationResult<ColumnList<String>> operationResult = keyspace
.prepareQuery(VALUES_CF).getKey(id).execute();
final ColumnList<String> result = operationResult.getResult();
if (result.isEmpty()) {
log.info("No value found for id: " + id);
return new ArrayList<String>();
}
return result.getColumnNames();
} catch (final ConnectionException e) {
log.error("Unable to retrieve session " + id, e);
}
return new ArrayList<String>();
}
Locally, that line is never executed, which makes sense:
log.error("value " + value + " was supposed to be removed from list " + valuesBeforeRemoval + " but it wasn't: " + valuesAfterRemoval);
but that line is executed on my dev server:
[ERROR] [main] [n.o.w.s.d.SessionDaoCassandraImpl] [2013-03-08 13:12:24,801]
[] - value 3 was supposed to be removed from list [3, 2, 1, 0, 7, 6, 5, 4, 9, 8] but it wasn't: [3, 2, 1, 0, 7, 6, 5, 4, 9, 8]
I am using com.netflix.astyanax
Both my local machine and the remote dev server connect to the very
same cassandra instance.
Both my local machine and the remote dev server run the very same test
creating a new row family, and adding 10 records before one is deleted.
When the error occurs on dev, log.error("Unable to delete
location " + value, e); was not executed (i.e. running the deletion
command didn't produce any exception).
I am 100% positive that no other code is affecting the content of the
database while I am running the test on dev so this isn't some
strange concurrency issue.
What could possibly explain that the deleteColumn(value) request runs without producing any error but still does not remove the column from the database?
ADDITIONAL INFO
Here is how I created the keyspace:
create keyspace sessiondata
with placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy'
and strategy_options = {replication_factor:1};
Here is how I created the column family values, referenced as VALUES_CF in the code above:
create column family values
with comparator = UTF8Type
;
Here is how the keyspace referenced in the java code above is defined:
final AstyanaxContext.Builder contextBuilder = getBuilder();
final AstyanaxContext<Keyspace> keyspaceContext = contextBuilder
.forKeyspace(keyspaceName).buildKeyspace(
ThriftFamilyFactory.getInstance());
keyspaceContext.start();
keyspace = keyspaceContext.getEntity();
where getBuilder is:
private Builder getBuilder() {
final AstyanaxConfigurationImpl conf = new AstyanaxConfigurationImpl()
.setDiscoveryType(NodeDiscoveryType.NONE)
.setRetryPolicy(new RunOnce());
final ConnectionPoolConfigurationImpl poolConf = new ConnectionPoolConfigurationImpl("MyPool")
.setPort(port)
.setMaxConnsPerHost(1)
.setSeeds(value);
return new AstyanaxContext.Builder()
.forCluster(cluster)
.withAstyanaxConfiguration(conf)
.withConnectionPoolConfiguration(poolConf)
.withConnectionPoolMonitor(new CountingConnectionPoolMonitor());
}
SECOND UPDATE
First, the issues are not solely related to deletes. I observe similar problems when updating records in the database, reading them, and not being able to read the updates I just wrote
Second, I created a test that does 100 times the following operations:
write a row into cassandra
update that row in cassandra
read back that row from cassandra and check whether the row was indeed updated, and checking again regularly after delays if it wasn't
What I observe from that test is that:
again, when I run that code locally, all 100 iterations pass right away (no retry ever needed)
when I run that code on the remote server, some of the iterations pass, some fail. When they fail, no matter how large the delay (I wait up to 10 seconds), the test always fail.
At this point, I am really not sure how any cassandra setup could explain this behavior since I connect to the very same server for my tests and since the delays I insert are much larger than any additional latency I may need to run the test when connecting from my local machine.
The only relevant difference seems to be which machine the code is running on.
THIRD UPDATE
If in the test mentioned in the previous update, I insert a delay between the 2 writes, the code starts passing if the delay is >= 1,000 ms. A delay of, say, 100 ms doesn't help. I also modified the builder to set the default read and write consistencies to the most demanding: ALL, and that had no impact on the results of the test (still failing about half of the time unless delay between writes >1s):
final AstyanaxConfigurationImpl conf = new AstyanaxConfigurationImpl()
.setDiscoveryType(NodeDiscoveryType.NONE)
.setRetryPolicy(new RunOnce()).setDefaultReadConsistencyLevel(ConsistencyLevel.CL_ALL).setDefaultWriteConsistencyLevel(ConsistencyLevel.CL_ALL);
To debug, try printing the full row instead of just the column names. When I say the full row I mean the column name, column value and the time stamp. A long shot is clocks are wrong on one of your test machines and this is throwing out your tests on the other.
Another thing to double check is that ip is indeed what you think it is, in both your application and cassandra. When you retrieve it print it between something, like println("-" + ip "-"). Before and after your try block for the execute in deleteSecureLocation do a get for only that column, not the entire row. I'm not too sure how to do that in astynax, on the cli it would be get[id][ip].
Something to keep in mind is that a delete won't fail even if there's nothing to delete. To cassandra it's a write, the only thing that will make it a delete is if on read it's the latest timestamped entry against that row/column name.

Categories