I am trying to connect to mongo query routers in a test environment (I setup just one query router for test -> pointing to a config server (instead of three) which in turn points to a two node shard with no replicas). I can insert/fetch documents using the mongo shell (and have verified that the documents are going to the sharded nodes). However, when I try to test the connection to the mongo database, I get the output copied below (code being used is also copied underneath). I am using mongo database v3.2.0 and java driver v3.2.2 (I am trying to use the async api).
[info] 14:34:44.562 227 [main] MongoAuthentication INFO - testing 1
[info] 14:34:44.595 260 [main] cluster INFO - Cluster created with settings {hosts=[192.168.0.1:27018], mode=MULTIPLE, requiredClusterType=SHARDED, serverSelectionTimeout='30000 ms', maxWaitQueueSize=30}
[info] 14:34:44.595 260 [main] cluster INFO - Adding discovered server 192.168.0.1:27018 to client view of cluster
[info] 14:34:44.652 317 [main] cluster DEBUG - Updating cluster description to {type=SHARDED, servers=[{address=192.168.0.1:27018, type=UNKNOWN, state=CONNECTING}]
[info] Outputting database names:
[info] 14:34:44.660 325 [main] cluster INFO - No server chosen by ReadPreferenceServerSelector{readPreference=primary} from cluster description ClusterDescription{type=SHARDED, connectionMode=MULTIPLE, all=[ServerDescription{address=192.168.0.1:27018, type=UNKNOWN, state=CONNECTING}]}. Waiting for 30000 ms before timing out
[info] Counting the number of documents
[info] 14:34:44.667 332 [main] cluster INFO - No server chosen by ReadPreferenceServerSelector{readPreference=primaryPreferred} from cluster description ClusterDescription{type=SHARDED, connectionMode=MULTIPLE, all=[ServerDescription{address=192.168.0.1:27018, type=UNKNOWN, state=CONNECTING}]}. Waiting for 30000 ms before timing out
[info] - Count result: 0
[info] 14:34:45.669 1334 [cluster-ClusterId{value='577414c420055e5bc086c255', description='null'}-192.168.0.1:27018] connection DEBUG - Closing connection connectionId{localValue:1}
part of the code being used
final MongoClient mongoClient = MongoClientAccessor.INSTANCE.getMongoClientInstance();
final CountDownLatch listDbsLatch = new CountDownLatch(1);
System.out.println("Outputting database names:");
mongoClient.listDatabaseNames().forEach(new Block<String>() {
#Override
public void apply(final String name) {
System.out.println(" - " + name);
}
}, new SingleResultCallback<Void>() {
#Override
public void onResult(final Void result, final Throwable t) {
listDbsLatch.countDown();
}
});
The enum being used is responsible for reading config options and passing a MongoClient reference to its caller. The enum itself calls other classes which I can copy as well if needed. I have the following option configured for ReadPreference:
mongo.client.readPreference=PRIMARYPREFERRED
Any thoughts on what I might be doing wrong or might have misinterpreted? The goal is to connect to the shard via the mongos (query router) so that I can insert/fetch documents in the Mongo shard.
The mongo shard setup (query router, config and shard with replica sets) was not correctly configured. Ensure that the config server(s) replica set is launched first, mongos (query router) is brought up and points to these config servers, the mongo shards are brought up and then the shards are registered via the query router (mongos) as well as the collection is enabled for sharding. Obviously, make sure that the driver is connecting to the mongos (query router) process.
Related
In case that Kafka server is (temporarily) down, my Spring Boot application ReactiveKafkaConsumerTemplate keeps trying to connect unsuccessfully, thus causing unnecessary traffic and messing the log files:
2021-11-10 14:45:30.265 WARN 24984 --- [onsumer-group-1] org.apache.kafka.clients.NetworkClient : [Consumer clientId=consumer-group-1, groupId=consumer-group] Connection to node -1 (localhost/127.0.0.1:29092) could not be established. Broker may not be available.
2021-11-10 14:45:32.792 WARN 24984 --- [onsumer-group-1] org.apache.kafka.clients.NetworkClient : [Consumer clientId=consumer-group-1, groupId=consumer-group] Bootstrap broker localhost:29092 (id: -1 rack: null) disconnected
2021-11-10 14:45:34.845 WARN 24984 --- [onsumer-group-1] org.apache.kafka.clients.NetworkClient : [Consumer clientId=consumer-group-1, groupId=consumer-group] Connection to node -1 (localhost/127.0.0.1:29092) could not be established. Broker may not be available.
2021-11-10 14:45:34.845 WARN 24984 --- [onsumer-group-1] org.apache.kafka.clients.NetworkClient : [Consumer clientId=consumer-group-1, groupId=consumer-group] Bootstrap broker localhost:29092 (id: -1 rack: null) disconnected
Is it possible to use something like a circuit breaker (an inspiration here or here), so the Spring Boot Kafka client in case of a failure (or even better a few consecutive failures) slows down the pace of its connection attempts, and returns to the normal pace only after the server is up again?
Is there already a ready-made config parameter, or any other solution?
I am aware of the parameter reconnect.backoff.ms, this is how I create the ReactiveKafkaConsumerTemplate bean:
#Bean
public ReactiveKafkaConsumerTemplate<String, MyEvent> kafkaConsumer(KafkaProperties properties) {
final Map<String, Object> map = new HashMap<>(properties.buildConsumerProperties());
map.put(ConsumerConfig.GROUP_ID_CONFIG, "MyGroup");
map.put(ConsumerConfig.RECONNECT_BACKOFF_MS_CONFIG, 10_000L);
final JsonDeserializer<DisplayCurrencyEvent> jsonDeserializer = new JsonDeserializer<>();
jsonDeserializer.addTrustedPackages("com.example.myapplication");
return new ReactiveKafkaConsumerTemplate<>(
ReceiverOptions
.<String, MyEvent>create(map)
.withKeyDeserializer(new ErrorHandlingDeserializer<>(new StringDeserializer()))
.withValueDeserializer(new ErrorHandlingDeserializer<>(jsonDeserializer))
.subscription(List.of("MyTopic")));
}
And still the consumer is trying to connect every 3 seconds.
See https://kafka.apache.org/documentation/#consumerconfigs_retry.backoff.ms
The base amount of time to wait before attempting to reconnect to a given host. This avoids repeatedly connecting to a host in a tight loop. This backoff applies to all connection attempts by the client to a broker.
and https://kafka.apache.org/documentation/#consumerconfigs_reconnect.backoff.max.ms
The maximum amount of time in milliseconds to wait when reconnecting to a broker that has repeatedly failed to connect. If provided, the backoff per host will increase exponentially for each consecutive connection failure, up to this maximum. After calculating the backoff increase, 20% random jitter is added to avoid connection storms.
and
I'm running my application and Neo4j 4 in Docker compose environment. After starting my application I'm getting some weird logs that connection pool is closing connection to DB (Closing connection pool towards graphdb(172.21.0.4):7687) and after this neo4jClient is unable to query DB (logs below). What is the reason of this behaviour?
BTW. I created Spring Health Check (driver.verifyConnectivity()), but it always return OK (no error is thrown).
Any ideas?
#Configuration
class Neo4jConfiguration {
private val logger = LoggerFactory.getLogger(Neo4jConfiguration::class.java)
#Bean
fun neo4jDriver(
#Value("\${spring.data.neo4j.host}") host: String?,
#Value("\${spring.data.neo4j.port}") port: Int?): Driver {
val connectionUrl = "neo4j://$host:$port"
logger.info("Connecting to Neo4j on `$connectionUrl`")
return GraphDatabase.driver(connectionUrl/*, AuthTokens.basic("neo4j", "secret")*/)
}
#Bean
fun neo4jClient(): ReactiveNeo4jClient = ReactiveNeo4jClient.create(neo4jDriver(null, null))
#Bean
fun neo4jTransactionManager() = ReactiveNeo4jTransactionManager(neo4jDriver(null, null))
}
Docker compose:
version: '3.7'
services:
graphdb:
image: neo4j:4.0.0
ports:
- 7474:7474
- 7687:7687
environment:
NEO4J_AUTH: none
NEO4J_dbms_connectors_default__listen__address: 0.0.0.0
volumes:
- ./docker/neo4j/data:/data
networks:
- things
networks:
things:
name: things
Full logs:
2020-02-24 20:57:32.922 INFO 1 --- [ restartedMain] c.t.r.repo.neo4j.Neo4jConfiguration : Connecting to Neo4j on `neo4j://graphdb:7687`
2020-02-24 20:57:33.321 INFO 1 --- [ restartedMain] Driver : Routing driver instance 656417291 created for server address graphdb:7687
2020-02-24 20:57:43.329 INFO 1 --- [o4jDriverIO-2-3] LoadBalancer : Routing table for database 'system' is stale. Ttl 1582577863326, currentTime 1582577863328, routers AddressSet=[], writers AddressSet=[], readers AddressSet=[], database 'system'
2020-02-24 20:57:43.437 INFO 1 --- [o4jDriverIO-2-2] ConnectionPool : Closing connection pool towards graphdb(172.21.0.4):7687, it has no active connections and is not in the routing table registry.
2020-02-24 20:57:43.440 INFO 1 --- [o4jDriverIO-2-2] LoadBalancer : Updated routing table for database 'system'. Ttl 1582578163422, currentTime 1582577863439, routers AddressSet=[0.0.0.0:7687], writers AddressSet=[0.0.0.0:7687], readers AddressSet=[0.0.0.0:7687], database 'system'
2020-02-24 20:58:02.694 INFO 1 --- [ault-executor-1] LoadBalancer : Routing table for database '<default database>' is stale. Ttl 1582577882693, currentTime 1582577882694, routers AddressSet=[], writers AddressSet=[], readers AddressSet=[], database '<default database>'
2020-02-24 20:58:02.777 INFO 1 --- [o4jDriverIO-2-2] ConnectionPool : Closing connection pool towards graphdb(172.21.0.4):7687, it has no active connections and is not in the routing table registry.
2020-02-24 20:58:02.777 INFO 1 --- [o4jDriverIO-2-2] LoadBalancer : Updated routing table for database '<default database>'. Ttl 1582578182776, currentTime 1582577882777, routers AddressSet=[0.0.0.0:7687], writers AddressSet=[0.0.0.0:7687], readers AddressSet=[0.0.0.0:7687], database '<default database>'
2020-02-24 20:58:02.803 WARN 1 --- [o4jDriverIO-2-2] LoadBalancer : Failed to obtain a connection towards address 0.0.0.0:7687
org.neo4j.driver.exceptions.SessionExpiredException: Server at 0.0.0.0:7687 is no longer available
at org.neo4j.driver.internal.cluster.loadbalancing.LoadBalancer.lambda$acquire$9(LoadBalancer.java:204) ~[neo4j-java-driver-4.0.0.jar:4.0.0-d03d93ede8ad65657eeb90ed890757203ecfaa7a]
at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(Unknown Source) ~[na:na]
at java.base/java.util.concurrent.CompletableFuture.uniWhenCompleteStage(Unknown Source) ~[na:na]
[... removed ...]
Caused by: org.neo4j.driver.exceptions.ServiceUnavailableException: Unable to connect to 0.0.0.0:7687, ensure the database is running and that there is a working network connection to it.
at org.neo4j.driver.internal.async.connection.ChannelConnectedListener.databaseUnavailableError(ChannelConnectedListener.java:76) ~[neo4j-java-driver-4.0.0.jar:4.0.0-d03d93ede8ad65657eeb90ed890757203ecfaa7a]
at org.neo4j.driver.internal.async.connection.ChannelConnectedListener.operationComplete(ChannelConnectedListener.java:70) ~[neo4j-java-driver-4.0.0.jar:4.0.0-d03d93ede8ad65657eeb90ed890757203ecfaa7a]
at org.neo4j.driver.internal.async.connection.ChannelConnectedListener.operationComplete(ChannelConnectedListener.java:37) ~[neo4j-java-driver-4.0.0.jar:4.0.0-d03d93ede8ad65657eeb90ed890757203ecfaa7a]
[... removed ...]
... 7 common frames omitted
Caused by: org.neo4j.driver.internal.shaded.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: /0.0.0.0:7687
Caused by: java.net.ConnectException: Connection refused
at java.base/sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[na:na]
at java.base/sun.nio.ch.SocketChannelImpl.finishConnect(Unknown Source) ~[na:na]
[... removed ...]
I have just come across the same problem, after a little experimenting turns out that the ip should be "bolt://$host:$port" and not "neo4j://$host:$port"
Seems like some of the spring.io tutorials are out of date.
No matter how much resource I put to the system it cannot go below 11 min.
I am trying to parallelize a simple Spark program processes HBase data in parallel.
// Get Hbase RDD
JavaPairRDD<ImmutableBytesWritable, Result> hBaseRDD =
jsc.newAPIHadoopRDD(
conf,
TableInputFormat.class,
ImmutableBytesWritable.class,
Result.class
);
long count = hBaseRDD.count();
The problem is my program is as slow as the largest bar.
Found that ZK is taking long time before shutting.
18/05/19 17:26:55 INFO zookeeper.ClientCnxn: Session establishment complete on server <IP>:2181, sessionid = 0x163662b64eb046d, negotiated timeout = 40000
18/05/19 17:38:00 INFO zookeeper.ZooKeeper: Session: 0x163662b64eb046d closed
I am experiencing Embedded InfiniSpan cache issue where nodes timeout on re-joining the cluster.
Caused by: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 7 from vvshost
at org.infinispan.remoting.transport.impl.SingleTargetRequest.onTimeout(SingleTargetRequest.java:64)
at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:86)
at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:21)
The only way I can get the node to re-join is to switch off the cache and delete all local cache persistence files.
Here is the configuration which I am using:
Transport:
TransportConfigurationBuilder - defaultClusteredBuild
JMX Statistics - Enabled
Duplicate domains - Allowed
Cache Manager:
Manager Class - EmbeddedCacheManager
Memory - Memory Size: 0
Persistence: Single File Store
async: disabled
Clustering Cache Mode - CacheMode.DIST_SYNC
It seems right to me, but the value of remote-timeout is "15000" milliseconds by default. Increase the timeout until you stop getting the error.
Hope it helps
Components: apache ignite + apache cassandra. Use defaut datastax driver.
After doing several operation(about 3-5 billions entities put to cache) we get a situation when datastax driver always reconnects to cassandra from ignite.
2017-02-16 13:29:21.287 INFO 160487 --- [ sys-#441%null%] m.r.t.d.c.m.p.c.c.p.RetryPolicyImpl : init cluster
2017-02-16 13:29:21.288 INFO 160487 --- [ sys-#441%null%] com.datastax.driver.core.Cluster : New Cassandra host <our host> added
2017-02-16 13:29:21.307 INFO 160487 --- [ sys-#441%null%] m.r.t.d.c.m.p.c.c.p.RetryPolicyImpl : close cluster
2017-02-16 13:29:23.516 INFO 160487 --- [ sys-#441%null%] com.datastax.driver.core.ClockFactory : Using native clock to generate timestamps.
2017-02-16 13:29:23.537 INFO 160487 --- [ sys-#441%null%] c.d.d.c.p.DCAwareRoundRobinPolicy : Using data-center name 'datacenter1' for DCAw
And this process is endless and it can be interrupted by server restarted.
Infrastructure :
1 server ignite - ~Xmx30g and 8 cores.
25 clients ignites ~Xmx1g and 8 cores.
1 node cassandra.
Batch size(entities which will be put to cache and then cassandra) is about 1-2M.
Config datasource =>
public DataSource dataSource() {
DataSource dataSource = new DataSource();
dataSource.setUser(login);
dataSource.setPassword(pass);
dataSource.setPort(port);
dataSource.setContactPoints(host);
dataSource.setRetryPolicy(retryPolicy);
dataSource.setFetchSize(10_000);
dataSource.setReconnectionPolicy(new ConstantReconnectionPolicy(1000));
dataSource.setLoadBalancingPolicy(DCAwareRoundRobinPolicy.builder().withUsedHostsPerRemoteDc(0).build());
dataSource.setSocketOptions(new SocketOptions().setReadTimeoutMillis(100_000).setConnectTimeoutMillis(100_00));
return dataSource;
}
config cache =>
CacheConfiguration<KeyIgnite, Long> cfg = new CacheConfiguration<>();
cfg
.setName(area)
.setRebalanceMode(CacheRebalanceMode.SYNC)
.setStartSize(1_000_000)
.setAtomicityMode(CacheAtomicityMode.ATOMIC)
.setIndexedTypes(KeyIgnite.class, Long.class)
.setCacheMode(CacheMode.PARTITIONED)
.setBackups(0);
if (!clientMode) {
CassandraCacheStoreFactory<KeyIgnite, Long> csFactory = new CassandraCacheStoreFactory<>();
csFactory.setDataSource(ds);
csFactory.setPersistenceSettings(kv);
// CassandraCacheStoreFactoryDwh<KeyIgnite, Long> csFactory = new CassandraCacheStoreFactoryDwh<>(ds, kv,params);
cfg
.setCacheStoreFactory(csFactory)
.setReadThrough(true)
.setWriteThrough(true);
}
cfg.setExpiryPolicyFactory(TouchedExpiryPolicy.factoryOf(new Duration(TimeUnit.DAYS, 5)));
return cfg;
ignite config =>:
TcpDiscoveryMulticastIpFinder finder = new TcpDiscoveryMulticastIpFinder();
finder.setAddresses(adresses);
return Ignition.start(
new IgniteConfiguration()
.setClientMode(clientMode)
.setDiscoverySpi(new TcpDiscoverySpi().setIpFinder(finder).setNetworkTimeout(10000))
.setFailureDetectionTimeout(50000)
.setPeerClassLoadingEnabled(false)
.setLoadBalancingSpi(new RoundRobinLoadBalancingSpi())
);
When we have done several iteration of putting items to cache we have gotten this case.
after including debug level i have gotten this record:
2017-02-17 17:48:41.570 DEBUG 24816 --- [ sys-#184%null%] com.datastax.driver.core.RequestHandler : [1071994376-1] Error querying ds-inmemorydb-02t/10.216.28.34:9042 : com.datastax.driver.core.exceptions.DriverException: Timeout while trying to acquire available connection (you may want to increase the driver number of per-host connections)
The timeout increase on the driver helped