In my Java EE application I use JMS to store some messages. I want to display these messages in a JSF paginated table. How can I get the messages from the queue in batches? For the moment I'm using something like this, but it's not very nice because I need to loop through many messages.
Can this be achieved? I'm using JBoss with HornetQ.
browser = session.createBrowser(queue);
List<Message> messagesToReturn = new ArrayList<>();
final Enumeration<ObjectMessage> messages = browser.getEnumeration();
int messagesSoFar = 0;
int count = 0;
while (messages.hasMoreElements()) {
    ObjectMessage message = messages.nextElement();
    if (count >= offset) {
        messagesToReturn.add(new CGSQueueMessage(message));
        messagesSoFar += 1;
    }
    if (messagesSoFar == maxSelect) {
        break;
    }
    count += 1;
}
return messagesToReturn;
There are no methods in the JMS API to get messages from the queue in batches for a paginated use-case like yours.
You could read all the messages from the queue browser into your own data structure and paginate using that.
If there are too many messages to fit into an in-memory data structure at once, you could read as many as you can reasonably hold in memory (presumably more than the user sees on any given page) and treat that as your own kind of application-level page, which you can then use for serving up the user-level pages. That would reduce the number of times you need to loop through the queue browser.
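Here is a minimal sketch of that caching idea. The loadChunk and page helpers are illustrative names, not part of the JMS API; the chunk would live in whatever backing bean drives your JSF table:

// Read at most chunkSize browsed messages into an in-memory "application-level page".
List<ObjectMessage> loadChunk(Session session, Queue queue, int chunkSize) throws JMSException {
    List<ObjectMessage> chunk = new ArrayList<>(chunkSize);
    QueueBrowser browser = session.createBrowser(queue);
    try {
        Enumeration<ObjectMessage> messages = browser.getEnumeration(); // unchecked, as in your snippet
        while (messages.hasMoreElements() && chunk.size() < chunkSize) {
            chunk.add(messages.nextElement());
        }
    } finally {
        browser.close();
    }
    return chunk;
}

// Serve one user-level page out of the cached chunk.
List<ObjectMessage> page(List<ObjectMessage> chunk, int offset, int pageSize) {
    int from = Math.min(offset, chunk.size());
    int to = Math.min(offset + pageSize, chunk.size());
    return chunk.subList(from, to);
}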
Aside from that you could dump all the messages from the queue browser into a temporary, random-access file and pull the results from that.
All that said, I don't think your use-case ultimately fits a messaging API like JMS. In my view you'd be better served by something like a database, which can support this kind of pagination easily.
Related
I have a requirement where I read a bunch of rows (thousands) from a SQL DB using Spring Batch and call a REST Service to enrich content before writing them on a Kafka topic.
When using the Spring reactive WebClient, how do I limit the number of active non-blocking service calls? Should I somehow introduce a Flux into the loop after I read data using Spring Batch?
(I understand the usage of delayElements and that it serves a different purpose, for the case where a single GET call brings back a lot of data and you want the server to slow down. My use case is a bit different: I have many WebClient calls to make and would like to limit the number of concurrent calls to avoid out-of-memory issues while still gaining the advantages of non-blocking invocations.)
Very interesting question. I pondered it and came up with a couple of ideas on how this could be done. I will share my thoughts, and hopefully some of them help you with your investigation.
Unfortunately, I'm not familiar with Spring Batch. However, this sounds like a problem of rate limiting, or the classical producer-consumer problem.
So, we have a producer that produces so many messages that our consumer cannot keep up, and the buffering in the middle becomes unbearable.
The problem I see is that your Spring Batch process, as you describe it, is not working as a stream or pipeline, but your reactive Web client is.
So, if we were able to read the data as a stream, then as records start getting into the pipeline they would get processed by the reactive web client and, using back-pressure, we could control the flow of the stream from the producer/database side.
The Producer Side
So, the first thing I would change is how records get extracted from the database. We need to control how many records get read from the database at a time, either by paging our data retrieval or by controlling the fetch size, and then, with back pressure, control how many of those are sent downstream through the reactive pipeline.
So, consider the following (rudimentary) database data retrieval, wrapped in a Flux.
Flux<String> getData(DataSource ds) {
    return Flux.create(sink -> {
        try {
            Connection con = ds.getConnection();
            con.setAutoCommit(false);
            PreparedStatement stm = con.prepareStatement(
                    "SELECT order_number FROM orders WHERE order_date >= '2018-08-12'",
                    ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
            stm.setFetchSize(1000);
            ResultSet rs = stm.executeQuery();
            sink.onRequest(batchSize -> {
                try {
                    for (int i = 0; i < batchSize; i++) {
                        if (!rs.next()) {
                            //no more data, close resources!
                            rs.close();
                            stm.close();
                            con.close();
                            sink.complete();
                            break;
                        }
                        sink.next(rs.getString(1));
                    }
                } catch (SQLException e) {
                    //TODO: close resources here
                    sink.error(e);
                }
            });
        } catch (SQLException e) {
            //TODO: close resources here
            sink.error(e);
        }
    });
}
In the example above:
I control the number of records read per batch (1000) by setting the fetch size.
The sink will send the number of records requested by the subscriber (i.e. batchSize) and then wait for it to request more, using back pressure.
When there are no more records in the result set, we complete the sink and close the resources.
If an error occurs at any point, we send back the error and close the resources.
Alternatively, I could have used paging to read the data, which would probably simplify the handling of resources by reissuing a query at every request cycle.
You may also want to do something if the subscription is cancelled or disposed (sink.onCancel, sink.onDispose), since closing the connection and other resources is fundamental here; a small sketch of that follows below.
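For example, a minimal (untested) sketch of such a cleanup hook, registered inside the same Flux.create lambda right after the resources are opened, reusing the con, stm and rs variables from the snippet above:

// Release the JDBC resources when the subscription is cancelled or disposed,
// not only when the result set is exhausted.
sink.onDispose(() -> {
    try {
        rs.close();
        stm.close();
        con.close();
    } catch (SQLException e) {
        // nothing useful left to signal downstream; log the failure instead
    }
});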
The Consumer Side
On the consumer side you register a subscriber that requests messages 1000 at a time and only requests more once it has processed that batch.
getData(source).subscribe(new BaseSubscriber<String>() {

    private int messages = 0;

    @Override
    protected void hookOnSubscribe(Subscription subscription) {
        subscription.request(1000);
    }

    @Override
    protected void hookOnNext(String value) {
        //make http request
        System.out.println(value);
        messages++;
        if (messages % 1000 == 0) {
            //when we're done with a batch
            //then we're ready to request more
            upstream().request(1000);
        }
    }
});
In the example above, when subscription starts it requests the first batch of 1000 messages. In the onNext we process that first batch, making http requests using the Web client.
Once the batch is complete, then we request another batch of 1000 from the publisher, and so on and so on.
And there you have it! Using back pressure you control how many open HTTP requests you have at a time.
My example is very rudimentary and it will require some extra work to make it production ready, but I believe this hopefully offers some ideas that can be adapted to your Spring Batch scenario.
I want to develop a solution that can dynamically route messages to different queues (more than 10000 queues). This is what I have so far:
An exchange with type set to topic, so that I can route messages to different queues based on routing keys.
10000 queues that have a routing key of the form #.%numberOfQueue%.#. The %numberOfQueue% is a simple numeric value for that queue (but it might be changed to something more meaningful).
A producer publishing messages with a routing key like 5.10.15.105.10000, which means the message should be routed to the queues with keys 5, 10, 15, 105 and 10000, since it conforms to the patterns of those queues.
This is how it looks with the Java client API:
String exchangeName = "rabbit.test.exchange";
String exchangeType = "topic";
boolean exchangeDurable = true;
boolean queueDurable = true;
boolean queueExclusive = false;
boolean queueAutoDelete = false;
Map<String, Object> queueArguments = null;
for (int i = 0; i < numberOfQueues; i++) {
    String queueNameIterated = "rabbit.test" + i + ".queue";
    channel.exchangeDeclare(exchangeName, exchangeType, exchangeDurable);
    channel.queueDeclare(queueNameIterated, queueDurable, queueExclusive, queueAutoDelete, queueArguments);
    String routingKey = "#." + i + ".#";
    channel.queueBind(queueNameIterated, exchangeName, routingKey);
}
This is how the routing key is generated for a message aimed at all queues from 0 to 9998:
private String generateRoutingKey() {
    StringBuilder keyBuilder = new StringBuilder();
    for (int i = 0; i < numberOfQueues - 2; i++) {
        keyBuilder.append(i);
        keyBuilder.append('.');
    }
    String result = keyBuilder.append(numberOfQueues - 2).toString();
    LOGGER.info("generated key: {}", result);
    return result;
}
Seems good. The problem is that I can't use such a long routingKey with the channel.basicPublish() method:
Exception in thread "main" java.lang.IllegalArgumentException: Short string too long; utf-8 encoded length = 48884, max = 255.
at com.rabbitmq.client.impl.ValueWriter.writeShortstr(ValueWriter.java:50)
at com.rabbitmq.client.impl.MethodArgumentWriter.writeShortstr(MethodArgumentWriter.java:74)
at com.rabbitmq.client.impl.AMQImpl$Basic$Publish.writeArgumentsTo(AMQImpl.java:2319)
at com.rabbitmq.client.impl.Method.toFrame(Method.java:85)
at com.rabbitmq.client.impl.AMQCommand.transmit(AMQCommand.java:104)
at com.rabbitmq.client.impl.AMQChannel.quiescingTransmit(AMQChannel.java:396)
at com.rabbitmq.client.impl.AMQChannel.transmit(AMQChannel.java:372)
at com.rabbitmq.client.impl.ChannelN.basicPublish(ChannelN.java:690)
at com.rabbitmq.client.impl.ChannelN.basicPublish(ChannelN.java:672)
at com.rabbitmq.client.impl.ChannelN.basicPublish(ChannelN.java:662)
at com.rabbitmq.client.impl.recovery.AutorecoveringChannel.basicPublish(AutorecoveringChannel.java:192)
I have requirements:
Dynamically choose, from the producer, which queues to publish the messages to. It might be just one queue, all queues, or 1000 of them.
I have more than 10000 different queues, and it might be necessary to publish the same message to all of them.
So the questions are:
Can I use such a long key? If so, how?
Maybe I can achieve the same goal by different configuration of exchange or queues?
Maybe there is some hash function that can effectively encode the destinations and collapse them into 255 characters? If so, it should still allow different publishing patterns (for example, how would I send to only the queues numbered 555 and 8989?).
Maybe there is some different key strategy that could be used here?
How else can I achieve my requirements?
I started using RabbitMQ just a short time ago; I hope I can help you nonetheless. There can be as many words in the routing key as you like, up to the limit of 255 bytes (as also described in, e.g., RabbitMQ Tutorial 5 - Topics). Thus, the topic exchange does not seem to be appropriate for your use case.
Perhaps you can use a headers exchange in this case? According to the concept description:
A headers exchange is designed for routing on multiple attributes that are more easily expressed as message headers than a routing key. Headers exchanges ignore the routing key attribute. Instead, the attributes used for routing are taken from the headers attribute. A message is considered matching if the value of the header equals the value specified upon binding.
See here and here for examples. As I said, I just started with RabbitMQ, so I don't know for sure whether this could be an option for you. If I have time later I'll try to construct a simple example for you.
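In the meantime, here is a rough, untested sketch of the idea with the RabbitMQ Java client. The exchange, queue and header names are made up for illustration; with x-match set to any, a message is routed to every queue for which at least one bound header matches:

// Bind each queue on its own numeric header instead of a routing-key pattern.
Map<String, Object> bindArgs = new HashMap<>();
bindArgs.put("x-match", "any");   // match if at least one bound header matches
bindArgs.put("q-5", true);        // this queue is interested in destination "5"

channel.exchangeDeclare("rabbit.test.headers.exchange", "headers", true);
channel.queueDeclare("rabbit.test5.queue", true, false, false, null);
channel.queueBind("rabbit.test5.queue", "rabbit.test.headers.exchange", "", bindArgs);

// When publishing, set one header per destination queue instead of a long routing key.
Map<String, Object> headers = new HashMap<>();
headers.put("q-5", true);
headers.put("q-10", true);
AMQP.BasicProperties props = new AMQP.BasicProperties.Builder().headers(headers).build();
channel.basicPublish("rabbit.test.headers.exchange", "", props, "payload".getBytes());

Note that the headers still travel with every message, so a very large destination set may run into size limits of its own; whether this approach works for thousands of destinations per message is something you would have to measure.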
I am trying to build a simple application that reads data from AWS Kinesis. I have managed to read data from a single shard, but I want to get data from 4 different shards.
The problem is that I have a while loop which iterates as long as the shard is active, and this prevents me from reading data from the other shards. So far I couldn't find an alternative algorithm, nor was I able to implement a KCL-based solution.
Many thanks in advance
public static void DoSomething() {
    AmazonKinesisClient client = new AmazonKinesisClient();
    //noinspection deprecation
    client.setEndpoint(endpoint, serviceName, regionId);

    /** get shards from the stream using the describe stream method */
    DescribeStreamRequest describeStreamRequest = new DescribeStreamRequest();
    describeStreamRequest.setStreamName(streamName);
    List<Shard> shards = new ArrayList<>();
    String exclusiveStartShardId = null;
    do {
        describeStreamRequest.setExclusiveStartShardId(exclusiveStartShardId);
        DescribeStreamResult describeStreamResult = client.describeStream(describeStreamRequest);
        shards.addAll(describeStreamResult.getStreamDescription().getShards());
        if (describeStreamResult.getStreamDescription().getHasMoreShards() && shards.size() > 0) {
            exclusiveStartShardId = shards.get(shards.size() - 1).getShardId();
        } else {
            exclusiveStartShardId = null;
        }
    } while (exclusiveStartShardId != null);
    /** shards obtained */

    String shardIterator;
    GetShardIteratorRequest getShardIteratorRequest = new GetShardIteratorRequest();
    getShardIteratorRequest.setStreamName(streamName);
    getShardIteratorRequest.setShardId(shards.get(0).getShardId());
    getShardIteratorRequest.setShardIteratorType("LATEST");
    GetShardIteratorResult getShardIteratorResult = client.getShardIterator(getShardIteratorRequest);
    shardIterator = getShardIteratorResult.getShardIterator();

    GetRecordsRequest getRecordsRequest = new GetRecordsRequest();
    while (shardIterator != null) {
        getRecordsRequest.setShardIterator(shardIterator);
        getRecordsRequest.setLimit(250);
        GetRecordsResult getRecordsResult = client.getRecords(getRecordsRequest);
        List<Record> records = getRecordsResult.getRecords();
        shardIterator = getRecordsResult.getNextShardIterator();
        if (records.size() != 0) {
            for (Record r : records) {
                System.out.println(r.getPartitionKey());
            }
        }
        try {
            Thread.sleep(1000);
        } catch (InterruptedException e) {
        }
    }
}
It is recommended that you do not read from multiple shards in a single process/worker. First, as you can see, it adds to the complexity of your code, but more importantly, you will have problems scaling up.
The "secret" of scalability is to have small and independent workers or other such units. You can see such a design in Hadoop, DynamoDB or Kinesis in AWS. It allows you to build small systems (micro-services) that can easily scale up and down as needed. You can easily add more units of work/data as your service becomes more successful or its usage fluctuates.
As you can see in these AWS services, you sometimes get this scalability automatically, as in DynamoDB, and sometimes you need to add shards to your Kinesis streams. But for your application, you need to control your scalability somehow.
In the case of Kinesis, you can scale up and down using AWS Lambda or the Kinesis Client Library (KCL). Both of them listen to the status of your streams (number of shards and events) and use it to add or remove workers and deliver the events to them for processing. In both of these solutions you should build a worker that works against a single shard.
If you need to align events from multiple shards, you can do that using some state service such as Redis or DynamoDB.
For a simpler and neater solution where you only have to worry about providing your own message processing code, I would recommend using the KCL Library.
Quoting from the documentation
The KCL acts as an intermediary between your record processing logic and Kinesis Data Streams. The KCL performs the following tasks:
Connects to the data stream
Enumerates the shards within the data stream
Uses leases to coordinate shard associations with its workers
Instantiates a record processor for every shard it manages
Pulls data records from the data stream
Pushes the records to the corresponding record processor
Checkpoints processed records
Balances shard-worker associations (leases) when the worker instance count changes or when the data stream is resharded (shards are split or merged)
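For illustration, here is a rough, untested sketch of what such per-shard processing code could look like against the KCL 1.x IRecordProcessor interface. The class name is made up, and package locations can differ between KCL versions, so treat this as a starting point rather than working code:

import java.util.List;

import com.amazonaws.services.kinesis.clientlibrary.interfaces.IRecordProcessor;
import com.amazonaws.services.kinesis.clientlibrary.interfaces.IRecordProcessorCheckpointer;
import com.amazonaws.services.kinesis.clientlibrary.lib.worker.ShutdownReason;
import com.amazonaws.services.kinesis.model.Record;

public class PrintingRecordProcessor implements IRecordProcessor {

    private String shardId;

    @Override
    public void initialize(String shardId) {
        // The KCL creates one processor instance per shard it owns.
        this.shardId = shardId;
    }

    @Override
    public void processRecords(List<Record> records, IRecordProcessorCheckpointer checkpointer) {
        for (Record r : records) {
            System.out.println(shardId + ": " + r.getPartitionKey());
        }
        try {
            // Record progress so another worker can resume from here after rebalancing.
            checkpointer.checkpoint();
        } catch (Exception e) {
            // Log and decide whether to retry the checkpoint.
        }
    }

    @Override
    public void shutdown(IRecordProcessorCheckpointer checkpointer, ShutdownReason reason) {
        if (reason == ShutdownReason.TERMINATE) {
            try {
                checkpointer.checkpoint();
            } catch (Exception e) {
                // Log and move on; the shard is ending anyway.
            }
        }
    }
}

The KCL worker then instantiates one of these per shard via a record processor factory, so your own code never has to loop over shards.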
Is there a best practice or guidance for sending persistent messages with asyncSend set to true?
We don't have transaction manager configured
We have ~40k-50k messages which are sent using jmsTemplate configured with
org.apache.activemq.pool.PooledConnectionFactory
We have a for loop which iterates over the message list and sends each one using
jmsTemplate.convertAndSend(destination, msg)
We see a lot of message loss on a frequent basis. When we turn off asyncSend we get the reliability back, but the producer performance drops by 95%.
A bit of speculation, as the question is not very detailed, but anyway.
Depending on configuration, ActiveMQ might have memory limits on queues (which may also differ between persistent and non-persistent messages). So when memory runs out, your asyncSend calls will ignore the warnings and continue to deliver messages into a "black hole" until memory is freed up by the consumer.
There is no silver bullet to allow max performance and max reliability. Unfortunately.
However, I would try setting a producerWindowSize on the connection factory so that only a specified amount of data can be sent before a broker ack is received. The exact value is something you need to try out; it depends on your scenario as well as the broker's config/resources.
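For example, a minimal sketch of wiring that up in code; the broker URL is a placeholder and the 1 MB window is just an arbitrary starting point for tuning:

import org.apache.activemq.ActiveMQConnectionFactory;
import org.apache.activemq.pool.PooledConnectionFactory;
import org.springframework.jms.core.JmsTemplate;

// Cap how much data the async producer may send before the broker acknowledges it.
ActiveMQConnectionFactory amqFactory = new ActiveMQConnectionFactory("tcp://localhost:61616"); // placeholder URL
amqFactory.setUseAsyncSend(true);
amqFactory.setProducerWindowSize(1024 * 1024); // bytes; tune against the broker's memory limits

PooledConnectionFactory pooledFactory = new PooledConnectionFactory();
pooledFactory.setConnectionFactory(amqFactory);

JmsTemplate jmsTemplate = new JmsTemplate(pooledFactory);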
I solved this using a ProducerCallback
List<String> messageTexts = prepareListOfMessageTexts();

ProducerCallback producerCallback = (session, producer) -> {
    Topic destination = session.createTopic(myTopicName);
    for (String messageText : messageTexts) {
        producer.send(destination, session.createTextMessage(messageText));
    }
    return null;
};

jmsTemplate.execute(producerCallback);
Is there any way to return the number of messages that are unacknowledged?
I am using this code to get the number of messages in the queue:
DeclareOk declareOk = amqpAdmin.getRabbitTemplate().execute(
        new ChannelCallback<DeclareOk>() {
            public DeclareOk doInRabbit(Channel channel) throws Exception {
                return channel.queueDeclarePassive(name);
            }
        });
return declareOk.getMessageCount();
but I would like to know as well the number of unacknowledged messages.
I have seen that the RabbitMQ admin tool includes that information (for each queue it shows the number of Ready, Unacked and Total messages), and I guess there must be a way to retrieve that from Java/Spring.
Thanks
UPDATE
OK, it seems there is no way to accomplish that programmatically, since listing of configuration/queues is not part of AMQP.
There is the possibility to enable the management plugin and query its REST web services about the queues (among other things). More info here:
http://www.rabbitmq.com/management.html
As you say in your update, if you enable the management plugin, you can query the REST API:
E.g.:
http://username:password@queue-server:15672/api/queues/%2f/queue_name.queue
This returns JSON with (among other things):
messages_unacknowledged
messages_ready
It's good stuff if you have a safe route to the server.
The actual URL for version 3.8.9:
http://username:password@queue-server:15672/api/queues/%2F/queue-name
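Since the original question asked how to get at this from Java/Spring, here is a rough sketch of calling that endpoint with plain JDK classes and reading the response; the host, credentials and queue name are placeholders, and you would parse the JSON with whatever library you already use:

import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Scanner;

public class QueueStats {
    public static void main(String[] args) throws Exception {
        // Placeholders: adjust host, vhost (%2F = "/"), queue name and credentials.
        URL url = new URL("http://queue-server:15672/api/queues/%2F/queue-name");
        String auth = Base64.getEncoder()
                .encodeToString("username:password".getBytes(StandardCharsets.UTF_8));

        HttpURLConnection con = (HttpURLConnection) url.openConnection();
        con.setRequestProperty("Authorization", "Basic " + auth);

        try (InputStream in = con.getInputStream();
             Scanner scanner = new Scanner(in, StandardCharsets.UTF_8.name()).useDelimiter("\\A")) {
            // The response contains "messages_ready" and "messages_unacknowledged";
            // parse it with your JSON library of choice.
            String json = scanner.hasNext() ? scanner.next() : "";
            System.out.println(json);
        }
    }
}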