DDS Reader not dropping messages - java

I am learning about DDS using RTI (still very new to this topic). I am creating a Publisher that writes to a Subscriber, and the Subscriber outputs the message. One thing I would like to simulate is dropped packets. As an example, let's say the Publisher writes to the Subscriber 4 times a second but the Subscriber can only read once per second (the most recent message).
As of now, I am able to create a Publisher & Subscriber without any packets being dropped.
I read through some documentation and found HistoryQosPolicyKind.KEEP_LAST_HISTORY_QOS.
Correct me if I am wrong, but I was under the impression that this would essentially keep only the most recent message received from the Publisher. Instead, the Subscriber is receiving all the messages, just delayed by 1 second.
I don't want to cache the messages but drop them. How can I simulate the "dropped" packets?
BTW: I don't want to change anything in the .xml file. I want to do it programmatically.
Here are some snippets of my code.
//Publisher.java
//writer = (MsgDataWriter) publisher.create_datawriter(topic, Publisher.DATAWRITER_QOS_DEFAULT, null /* listener */, StatusKind.STATUS_MASK_NONE);
writer = (MsgDataWriter) publisher.create_datawriter(topic, write, null,
        StatusKind.STATUS_MASK_ALL); // 'write' is the DataWriterQos configured earlier (not shown)
if (writer == null) {
    System.err.println("create_datawriter error\n");
    return;
}

// --- Write --- //
String[] messages = {"1", "2", "test", "3"};

/* Create data sample for writing */
Msg instance = new Msg();
InstanceHandle_t instance_handle = InstanceHandle_t.HANDLE_NIL;
/* For a data type that has a key, if the same instance is going to be
   written multiple times, initialize the key here
   and register the keyed instance prior to writing */
//instance_handle = writer.register_instance(instance);

final long sendPeriodMillis = (long) (0.25 * 1000); // 4 writes per second
for (int count = 0; (sampleCount == 0) || (count < sampleCount); ++count) {
    if (count == messages.length) {
        return;
    }
    System.out.println("Writing Msg, count " + count);
    /* Modify the instance to be written here */
    instance.message = messages[count];
    instance.sender = "some user";
    /* Write data */
    writer.write(instance, instance_handle);
    try {
        Thread.sleep(sendPeriodMillis);
    } catch (InterruptedException ix) {
        System.err.println("INTERRUPTED");
        break;
    }
}
//writer.unregister_instance(instance, instance_handle);
} finally {
    // --- Shutdown --- //
    if (participant != null) {
        participant.delete_contained_entities();
        DomainParticipantFactory.TheParticipantFactory.delete_participant(participant);
    }
//Subscriber.java
// Customize time & QoS for receiving info
DataReaderQos readerQ = new DataReaderQos();
subscriber.get_default_datareader_qos(readerQ);
Duration_t minTime = new Duration_t(1, 0); // accept at most one sample per second per instance
readerQ.time_based_filter.minimum_separation.sec = minTime.sec;
readerQ.time_based_filter.minimum_separation.nanosec = minTime.nanosec;
readerQ.history.kind = HistoryQosPolicyKind.KEEP_LAST_HISTORY_QOS; // history.depth defaults to 1
readerQ.reliability.kind = ReliabilityQosPolicyKind.BEST_EFFORT_RELIABILITY_QOS;
reader = (MsgDataReader) subscriber.create_datareader(topic, readerQ, listener, StatusKind.STATUS_MASK_ALL);
if (reader == null) {
    System.err.println("create_datareader error\n");
    return;
}

// --- Wait for data --- //
final long receivePeriodSec = 1;
for (int count = 0; (sampleCount == 0) || (count < sampleCount); ++count) {
    //System.out.println("Msg subscriber sleeping for " + receivePeriodSec + " sec...");
    try {
        Thread.sleep(receivePeriodSec * 1000); // in milliseconds
    } catch (InterruptedException ix) {
        System.err.println("INTERRUPTED");
        break;
    }
}
} finally {
    // --- Shutdown --- //

On the subscriber side, it is useful to distinguish three different types of interaction between your application and the DDS Domain: polling, Listeners and WaitSets.
Polling means that the application decides when it reads available data. This is often a time-driven mechanism.
Listeners are essentially callback functions, invoked by an infrastructure thread as soon as data becomes available, to read that data.
WaitSets implement a mechanism similar to the socket select mechanism: an application thread waits (blocks) for data to become available and, after unblocking, reads the new data.
Your application uses the Listener mechanism. You did not post the implementation of the callback function, but from the overall picture it is likely that your listener reads the data immediately when the callback is invoked. That leaves no time for the data to be "pushed out" of the history, or "dropped" as you called it. This reading happens on a different thread than your main thread, which is sleeping most of the time. You can find a Knowledge Base article about it here.
The only thing that is not clear is the impact of the time_based_filter QoS setting. You did not mention it in your question, but it does show up in your code. I would expect it to filter out some of your samples, but that is a different mechanism from samples being pushed out of the history, and its behavior may be implemented differently by different DDS implementations. Which product do you use?
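If what you want is for newer samples to overwrite older, unread ones, one way to get there is to drop the listener and poll instead: with KEEP_LAST and a depth of 1, each incoming sample replaces the previous one in the reader's cache, so a reader that wakes up once per second only ever sees the most recent sample and everything in between is effectively dropped. Below is a minimal sketch against the RTI Connext Java API; the Msg/MsgSeq types come from the question, the rest is illustrative:
// Keep only the newest sample; older unread samples are overwritten ("dropped").
readerQ.history.kind = HistoryQosPolicyKind.KEEP_LAST_HISTORY_QOS;
readerQ.history.depth = 1;
readerQ.reliability.kind = ReliabilityQosPolicyKind.BEST_EFFORT_RELIABILITY_QOS;
// No listener, no status mask: nothing reads the samples eagerly.
MsgDataReader reader = (MsgDataReader) subscriber.create_datareader(
        topic, readerQ, null /* listener */, StatusKind.STATUS_MASK_NONE);

MsgSeq dataSeq = new MsgSeq();
SampleInfoSeq infoSeq = new SampleInfoSeq();
while (true) {
    try {
        Thread.sleep(1000); // wake up once per second
    } catch (InterruptedException ix) {
        break;
    }
    try {
        reader.take(dataSeq, infoSeq,
                ResourceLimitsQosPolicy.LENGTH_UNLIMITED,
                SampleStateKind.ANY_SAMPLE_STATE,
                ViewStateKind.ANY_VIEW_STATE,
                InstanceStateKind.ANY_INSTANCE_STATE);
        for (int i = 0; i < dataSeq.size(); ++i) {
            SampleInfo info = (SampleInfo) infoSeq.get(i);
            if (info.valid_data) {
                System.out.println("Latest message: " + ((Msg) dataSeq.get(i)).message);
            }
        }
        reader.return_loan(dataSeq, infoSeq);
    } catch (RETCODE_NO_DATA noData) {
        // nothing arrived during the last second
    }
}
With this approach the time_based_filter can be dropped entirely; the history depth alone produces the one-sample-per-read behavior.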

Related

Grpc Server keeps processing the data even when client gets disconnected

I have a server which streams data for a given request. Below is the method which does that:
@Override
public void getChangeFeed(ChangeFeedRequest request, StreamObserver<ChangeFeedResponse> responseObserver) {
    long queryDate = request.getFromDate();
    long offset = request.getPageNo();
    ChangeFeedResponse changeFeedResponse = processData(responseObserver, queryDate, offset);
    while (true) {
        if (changeFeedResponse != null && !changeFeedResponse.getFinalize()) {
            responseObserver.onNext(changeFeedResponse);
            changeFeedResponse = processData(responseObserver, changeFeedResponse.getToDate(), changeFeedResponse.getPageNo());
        } else {
            break;
        }
    }
    responseObserver.onNext(changeFeedResponse);
    responseObserver.onCompleted();
}
When the client gets disconnected, the server still keeps on processing. This might be an issue when multiple clients are fetching data. I need to know how to tell the server to stop processing.
There are two fairly equivalent ways. One is to use the Context, which is cancelled when the RPC is completed or cancelled:
while (!Context.current().isCancelled()) { // THIS LINE CHANGED
    if (changeFeedResponse != null && !changeFeedResponse.getFinalize()) {
        responseObserver.onNext(changeFeedResponse);
        changeFeedResponse = processData(responseObserver, changeFeedResponse.getToDate(), changeFeedResponse.getPageNo());
    } else {
        break;
    }
}
The other would be to use the ServerCallStreamObserver:
// THE NEXT TWO LINES CHANGED
ServerCallStreamObserver<ChangeFeedResponse> scso =
        (ServerCallStreamObserver<ChangeFeedResponse>) responseObserver;
while (!scso.isCancelled()) {
    if (changeFeedResponse != null && !changeFeedResponse.getFinalize()) {
        responseObserver.onNext(changeFeedResponse);
        changeFeedResponse = processData(responseObserver, changeFeedResponse.getToDate(), changeFeedResponse.getPageNo());
    } else {
        break;
    }
}
Both approaches can also provide a notification when cancellation occurs, but polling is easiest in your case.
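For reference, the notification variant with the ServerCallStreamObserver might look like the sketch below; the handler has to be registered up front, before the loop starts producing responses, and the cancelled flag is an illustrative name:
// Register a cancellation callback instead of polling isCancelled() each iteration.
ServerCallStreamObserver<ChangeFeedResponse> scso =
        (ServerCallStreamObserver<ChangeFeedResponse>) responseObserver;
AtomicBoolean cancelled = new AtomicBoolean(false); // java.util.concurrent.atomic
scso.setOnCancelHandler(() -> cancelled.set(true));
while (!cancelled.get()) {
    // ... produce and send the next changeFeedResponse as before ...
}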

DropwizardMetricServices doesn't submit the gauge metric to JMX a second time (after removing it the first time)

The DropwizardMetricServices#submit() method I'm using doesn't submit the gauge metric a second time.
My use-case is to remove the gauge metric from JMX after reading it, and my application can then send the same metric again (with a different value).
The first time, the gauge metric is submitted successfully (then my application removes it once it reads the metric). But the same metric is not submitted the second time.
So I'm a bit confused about what would make DropwizardMetricServices#submit() not work the second time.
Below is the code:
Submit metric:
private void submitNonSparseMetric(final String metricName, final long value) {
    validateMetricName(metricName);
    metricService.submit(metricName, value); // metricService is the DropwizardMetricServices
    log(metricName, value);
    LOGGER.debug("Submitted the metric {} to JMX", metricName);
}
Code that reads and removes the metric:
protected void collectMetrics() {
    // Create the connection
    Long currTime = System.currentTimeMillis() / 1000; // Graphite expects epoch seconds
    Socket connection = createConnection();
    if (connection == null) {
        return;
    }
    // Get the output stream
    DataOutputStream outputStream = getDataOutputStream(connection);
    if (outputStream == null) {
        closeConnection();
        return;
    }
    // Get metrics from JMX
    Map<String, Gauge> g = metricRegistry.getGauges(); // metricRegistry is com.codahale.metrics.MetricRegistry
    for (Entry<String, Gauge> e : g.entrySet()) {
        String key = e.getKey();
        if (p2cMetric(key)) {
            String metricName = convertToMetricStandard(key);
            String metricValue = String.valueOf(e.getValue().getValue());
            String metricToSend = String.format("%s %s %s\n", metricName, metricValue, currTime);
            try {
                writeToStream(outputStream, metricToSend);
                // Remove the metric from JMX after successfully sending it to Graphite
                removeMetricFromJMX(key);
            } catch (IOException e1) {
                LOGGER.error("Unable to send metric to Graphite - {}", e1.getMessage());
            }
        }
    }
    closeOutputStream();
    closeConnection();
}
I think I found the issue.
As per the DropwizardMetricServices docs - https://docs.spring.io/spring-boot/docs/current/api/org/springframework/boot/actuate/metrics/dropwizard/DropwizardMetricServices.html#submit-java.lang.String-double- , the submit() method "Set[s] the specified gauge value".
So I think it's recommended to use DropwizardMetricServices#submit() only to set the value of an existing gauge metric in JMX, and not for adding a new metric to JMX.
Once I replaced DropwizardMetricServices#submit() with MetricRegistry#register() (com.codahale.metrics.MetricRegistry) to submit all my metrics, it worked as expected and my metrics are re-added to JMX (after they were removed by my application).
But I'm just wondering what makes DropwizardMetricServices#submit() only add new metrics to JMX and not a metric that has already been removed (from JMX). Does DropwizardMetricServices cache (in memory) all the metrics submitted to JMX, so that submit() does not resubmit the metric?
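That caching guess matches how the class appears to work in the Spring Boot 1.x sources: DropwizardMetricServices keeps its own in-memory map of the gauges it has created and only registers a gauge with the underlying MetricRegistry the first time it sees a name. Roughly (a paraphrased sketch, not the exact source; field and class names are approximate):
// Paraphrased sketch of DropwizardMetricServices#submit for a gauge.
private final ConcurrentMap<String, SimpleGauge> gauges = new ConcurrentHashMap<>();

public void submit(String name, double value) {
    SimpleGauge gauge = this.gauges.get(name);
    if (gauge == null) {
        SimpleGauge newGauge = new SimpleGauge(value);
        gauge = this.gauges.putIfAbsent(name, newGauge);
        if (gauge == null) {
            this.registry.register(name, newGauge); // registered only on first submit
            return;
        }
    }
    gauge.setValue(value); // later submits only update the cached gauge
}
If that is the code path, removing the metric from the MetricRegistry does not clear that internal map, so the second submit() finds the cached gauge, updates its value, and never re-registers it with the registry (and therefore never re-publishes it to JMX).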

SSH Server Identification never received - Handshake Deadlock [SSHJ]

We're having some trouble trying to implement a pool of SFTP connections for our application.
We're currently using SSHJ (Schmizz) as the transport library, and facing an issue we simply cannot reproduce in our development environment (but the error keeps showing up randomly in production, sometimes after three days, sometimes after just 10 minutes).
The problem is that when trying to send a file via SFTP, the thread gets locked in the init method of Schmizz's TransportImpl class:
@Override
public void init(String remoteHost, int remotePort, InputStream in, OutputStream out)
        throws TransportException {
    connInfo = new ConnInfo(remoteHost, remotePort, in, out);
    try {
        if (config.isWaitForServerIdentBeforeSendingClientIdent()) {
            receiveServerIdent();
            sendClientIdent();
        } else {
            sendClientIdent();
            receiveServerIdent();
        }
        log.info("Server identity string: {}", serverID);
    } catch (IOException e) {
        throw new TransportException(e);
    }
    reader.start();
}
isWaitForServerIdentBeforeSendingClientIdent is FALSE for us, so first the client (us) sends its identification, as appears in the logs:
"Client identity String: blabla"
Then it's the turn of receiveServerIdent:
private void receiveServerIdent() throws IOException {
    final Buffer.PlainBuffer buf = new Buffer.PlainBuffer();
    while ((serverID = readIdentification(buf)).isEmpty()) {
        int b = connInfo.in.read();
        if (b == -1)
            throw new TransportException("Server closed connection during identification exchange");
        buf.putByte((byte) b);
    }
}
The thread never gets control back, as the server never replies with its identity. The code seems to be stuck in this while loop. No timeouts or SSH exceptions are thrown; my client just keeps waiting forever and the thread stays blocked.
This is the readIdentification method's implementation:
private String readIdentification(Buffer.PlainBuffer buffer)
        throws IOException {
    String ident = new IdentificationStringParser(buffer, loggerFactory).parseIdentificationString();
    if (ident.isEmpty()) {
        return ident;
    }
    if (!ident.startsWith("SSH-2.0-") && !ident.startsWith("SSH-1.99-"))
        throw new TransportException(DisconnectReason.PROTOCOL_VERSION_NOT_SUPPORTED,
                "Server does not support SSHv2, identified as: " + ident);
    return ident;
}
It seems like ConnInfo's InputStream never gets data to read, as if the server closed the connection (even though, as said earlier, no exception is thrown).
I've tried to simulate this error by saturating the negotiation, closing sockets while connecting, and using conntrack to kill established connections while the handshake is being made, but with no luck at all, so any help would be highly appreciated. :)
I bet the following code creates the problem:
String ident = new IdentificationStringParser(buffer, loggerFactory).parseIdentificationString();
if (ident.isEmpty()) {
    return ident;
}
If IdentificationStringParser.parseIdentificationString() returns an empty string, it is returned to the caller. The caller keeps looping in while ((serverID = readIdentification(buf)).isEmpty()) since the string is always empty. The only way to break the loop is if the call to int b = connInfo.in.read(); returns -1... but if the server keeps sending (or resending) data, this condition is never met.
If this is the case, I would add some kind of artificial way to detect it, like:
private String readIdentification(Buffer.PlainBuffer buffer, AtomicInteger numberOfAttempts)
        throws IOException {
    String ident = new IdentificationStringParser(buffer, loggerFactory).parseIdentificationString();
    numberOfAttempts.incrementAndGet();
    if (ident.isEmpty()) {
        if (numberOfAttempts.intValue() >= 1000) { // arbitrary cap on read attempts
            throw new TransportException("Too many attempts to read the server ident");
        }
        return ident;
    }
    if (!ident.startsWith("SSH-2.0-") && !ident.startsWith("SSH-1.99-"))
        throw new TransportException(DisconnectReason.PROTOCOL_VERSION_NOT_SUPPORTED,
                "Server does not support SSHv2, identified as: " + ident);
    return ident;
}
This way you would at least confirm that this is the case, and can dig further into why .parseIdentificationString() returns an empty string.
Faced a similar issue where we would see:
INFO [net.schmizz.sshj.transport.TransportImpl : pool-6-thread-2] - Client identity string: blablabla
INFO [net.schmizz.sshj.transport.TransportImpl : pool-6-thread-2] - Server identity string: blablabla
But on some occasions there was no server response at all.
Our service would typically wake up and transfer several files simultaneously, one file per connection/thread.
The issue was in the sshd server config: we increased MaxStartups from its default value of 10 (we noticed the problems started shortly after batch sizes increased to above 10).
Default in /etc/ssh/sshd_config:
MaxStartups 10:30:100
Changed to:
MaxStartups 30:30:100
MaxStartups
Specifies the maximum number of concurrent unauthenticated connections to the SSH daemon. Additional connections will be dropped until authentication succeeds or the LoginGraceTime expires for a connection. The default is 10:30:100. Alternatively, random early drop can be enabled by specifying the three colon separated values start:rate:full (e.g. "10:30:60"). sshd will refuse connection attempts with a probability of rate/100 (30%) if there are currently start (10) unauthenticated connections. The probability increases linearly and all connection attempts are refused if the number of unauthenticated connections reaches full (60).
If you cannot control the server, you might have to find a way to limit your concurrent connection attempts in your client code instead.
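For example, a counting semaphore around connection setup keeps the number of simultaneously unauthenticated connections below the server's start threshold. A minimal sketch using SSHJ; the permit count, host details and PromiscuousVerifier are illustrative, not a recommendation:
import java.util.concurrent.Semaphore;
import net.schmizz.sshj.SSHClient;
import net.schmizz.sshj.sftp.SFTPClient;
import net.schmizz.sshj.transport.verification.PromiscuousVerifier;

public class ThrottledSftp {
    // Keep concurrent handshakes below sshd's MaxStartups "start" value (illustrative limit).
    private static final Semaphore CONNECT_PERMITS = new Semaphore(8);

    public void transfer(String host, String user, String password,
                         String localFile, String remotePath) throws Exception {
        CONNECT_PERMITS.acquire(); // queue instead of racing the server's early-drop logic
        SSHClient ssh = new SSHClient();
        try {
            ssh.addHostKeyVerifier(new PromiscuousVerifier()); // don't do this in production
            ssh.connect(host);
            ssh.authPassword(user, password);
        } finally {
            CONNECT_PERMITS.release(); // past authentication, no longer counted by MaxStartups
        }
        try (SFTPClient sftp = ssh.newSFTPClient()) {
            sftp.put(localFile, remotePath);
        } finally {
            ssh.disconnect();
        }
    }
}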

ServiceActivator 'randomly' unsubscribing from PublishSubscribeChannel

I have a method annotated with @ServiceActivator("CH1"), where "CH1" is defined as:
@Bean(name = "CH1")
MessageChannel ch1() {
    return new PublishSubscribeChannel();
}
and other PollableChannels publish to this channel via
@BridgeTo(value = "CH1", poller = @Poller("myPoller"))
Things seem to work fine most of the time; however, seemingly at random the message handler unsubscribes from "CH1" and I see in the logs:
[DEBUG] (pool-2-thread-1) org.springframework.integration.dispatcher.BroadcastingDispatcher: No subscribers, default behavior is ignore
Now I know I can change minSubscribers, but I don't get why things seem to randomly unsubscribe. After this error it will go back to handling some messages fine. Does a message handler unsubscribe while handling messages, or when the executor being used is full? I see no errors associated with this in the log, nor any unsubscribe or update to the subscriber count of "CH1" in the logs.
That does not make sense. Please share some test case to reproduce it, from the Framework perspective.
The source code on the matter looks like:
if (dispatched == 0 && this.minSubscribers == 0 && logger.isDebugEnabled()) {
    if (sequenceSize > 0) {
        logger.debug("No subscribers received message, default behavior is ignore");
    }
    else {
        logger.debug("No subscribers, default behavior is ignore");
    }
}
where we can end up with sequenceSize == 0 only in the case of:
Collection<MessageHandler> handlers = this.getHandlers();
if (this.requireSubscribers && handlers.size() == 0) {
    throw new MessageDispatchingException(message, "Dispatcher has no subscribers");
}
int sequenceSize = handlers.size();
The only clue is that your subscriber somehow unsubscribes...
I see that you have DEBUG enabled for your CH1, so would you mind sharing DEBUG logs for the entire org.springframework.integration category from when you see that error?
EDIT
Also note that whenever a subscriber is added/removed (e.g. when a consuming endpoint is started/stopped), you will see this log message...
if (logger.isInfoEnabled()) {
    logger.info("Channel '" + this.getFullChannelName() + "' has " + counter + " subscriber(s).");
}
(when logging at least at INFO level).
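If you want the channel to fail loudly instead of silently ignoring messages while you track this down, one option is to require at least one subscriber on the channel. A sketch, worth verifying against your Spring Integration version:
@Bean(name = "CH1")
MessageChannel ch1() {
    PublishSubscribeChannel channel = new PublishSubscribeChannel();
    // With a minimum subscriber count, a send to an unsubscribed channel
    // fails with an exception instead of being silently ignored.
    channel.setMinSubscribers(1);
    return channel;
}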

How to catch "NotesException: Notes error: Remote system no longer responding" and retry?

I have this Java agent that processes a huge number of documents, and it could run overnight. The problem is that I need the agent to retry if the network suddenly gets disconnected briefly. The retries should be capped at a maximum number.
int numberOfRetries = 0;
try {
    while (nextdoc != null) {
        // process documents
        numberOfRetries = 0;
    }
} catch (NotesException e) {
    numberOfRetries++;
    if (numberOfRetries > 4) {
        // max number of retries reached; did not finish successfully
    } else {
        // go back and reprocess the current document
    }
}
Also, of course, I do not want to retry the whole process. Basically I need to continue with the document it was processing and then move on to the next iteration of the loop.
You should put a retry loop around each piece of code that gets a document. Since the Notes classes generally require a getFirst/getNext paradigm, that means you need two separate retry loops. E.g.,
numberOfRetries = 0;
maxRetries = 4;

// get first document, with retries
needToRetry = true;
while (needToRetry)
{
    try
    {
        nextDoc = myView.getFirstDocument();
        needToRetry = false;
    }
    catch (NotesException e)
    {
        numberOfRetries++;
        if (numberOfRetries < maxRetries) {
            // you might want to sleep here to wait for the network to recover;
            // you could use numberOfRetries as a factor to sleep longer on
            // each failure
        } else {
            // write "Max retries have been exceeded getting first document" to log
            nextDoc = null;       // we won't go into the processing loop
            needToRetry = false;
        }
    }
}

// process all documents
while (nextDoc != null)
{
    // process nextDoc
    // insert your code here

    // now get next document, with retries
    needToRetry = true;
    while (needToRetry)
    {
        try
        {
            nextDoc = myView.getNextDocument(nextDoc);
            needToRetry = false;
        }
        catch (NotesException e)
        {
            numberOfRetries++;
            if (numberOfRetries < maxRetries) {
                // again, consider sleeping here before the next attempt
            } else {
                // write "Max retries have been exceeded getting next document" to log
                nextDoc = null;   // we'll exit the processing loop without finishing all docs
                needToRetry = false;
            }
        }
    }
}
Note that I'm treating maxRetries as the max total retries across all documents in the data set, not the max for each document.
Also note that it's probably cleaner to break this up a little. E.g.
numberOfRetries = 0;
maxRetries = 4;
nextDoc = getFirstDocWithRetries(view); // this contains the while loop and try-catch
while (nextDoc != null)
{
    processOneDoc(nextDoc);
    nextDoc = getNextDocWithRetries(view, nextDoc); // and so does this
}
I would not recommend what you are doing at all.
A NotesException can fire for a number of reasons, and there is no guarantee you will be returning to a safe state.
Also, the fact that the agent needs to run for such a long time means you need to raise the server's "Maximum execution timeout" to allow it to complete, and setting that to a very high value makes the server more prone to performance/deadlock issues.
A better solution would be to batch the workload and have the agent run for a set time on each batch, updating its progress as it goes so that when the agent comes back it knows which batch to work on next.
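A minimal sketch of that batching idea, checkpointing progress in a profile document so the next run can pick up where the last one stopped; the "AgentState"/"LastProcessedUNID" names and the batch size are illustrative:
// Process at most BATCH_SIZE documents per run, then record where we stopped.
final int BATCH_SIZE = 500; // tune to finish well inside the execution timeout
Document profile = db.getProfileDocument("AgentState", null);
String lastUnid = profile.getItemValueString("LastProcessedUNID");

Document doc = lastUnid.isEmpty()
        ? view.getFirstDocument()
        : view.getNextDocument(db.getDocumentByUNID(lastUnid));

int processed = 0;
while (doc != null && processed < BATCH_SIZE) {
    // ... process doc (with the retry handling discussed above) ...
    lastUnid = doc.getUniversalID();
    processed++;
    doc = view.getNextDocument(doc);
}

profile.replaceItemValue("LastProcessedUNID", lastUnid);
profile.save(true, false);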
