I need some help. Our service uses Lettuce 5.1.6, and a total of 22 Docker nodes are deployed.
Whenever the service is deployed, several Docker nodes start reporting ERROR: READONLY You can't write against a read only slave.
After restarting the problematic Docker nodes, the error no longer appears.
redis server configuration:
8 masters, 8 slaves
stop-writes-on-bgsave-error no
slave-serve-stale-data yes
slave-read-only yes
cluster-enabled yes
cluster-config-file "/data/server/redis-cluster/{port}/conf/node.conf"
lettuce configuration:
ClientResources res = DefaultClientResources.builder()
        .commandLatencyPublisherOptions(
                DefaultEventPublisherOptions.builder()
                        .eventEmitInterval(Duration.ofSeconds(5))
                        .build())
        .build();

redisClusterClient = RedisClusterClient.create(res, REDIS_CLUSTER_URI);
redisClusterClient.setOptions(
        ClusterClientOptions.builder()
                .maxRedirects(99)
                .socketOptions(SocketOptions.builder().keepAlive(true).build())
                .topologyRefreshOptions(
                        ClusterTopologyRefreshOptions.builder()
                                .enableAllAdaptiveRefreshTriggers()
                                .build())
                .build());

RedisAdvancedClusterCommands<String, String> command = redisClusterClient.connect().sync();
command.setex("some key", 18000, "some value");
The Exception that appears:
io.lettuce.core.RedisCommandExecutionException: READONLY You can't write against a read only slave.
at io.lettuce.core.ExceptionFactory.createExecutionException(ExceptionFactory.java:135)
at io.lettuce.core.LettuceFutures.awaitOrCancel(LettuceFutures.java:122)
at io.lettuce.core.cluster.ClusterFutureSyncInvocationHandler.handleInvocation(ClusterFutureSyncInvocationHandler.java:123)
at io.lettuce.core.internal.AbstractInvocationHandler.invoke(AbstractInvocationHandler.java:80)
at com.sun.proxy.$Proxy135.setex(Unknown Source)
at com.xueqiu.infra.redis4.RedisClusterImpl.lambda$setex$164(RedisClusterImpl.java:1489)
at com.xueqiu.infra.redis4.RedisClusterImpl$$Lambda$1422/1017847781.apply(Unknown Source)
at com.xueqiu.infra.redis4.RedisClusterImpl.execute(RedisClusterImpl.java:526)
at com.xueqiu.infra.redis4.RedisClusterImpl.executeTotal(RedisClusterImpl.java:491)
at com.xueqiu.infra.redis4.RedisClusterImpl.setex(RedisClusterImpl.java:1489)
With distributed middleware, the client side typically keeps partition, sharding and similar relationships locally and manages them itself.
For Redis Cluster, Lettuce manages the slot-to-node mapping like this:
It uses a slotCache array and caches the node responsible for each slot locally, one array entry per slot.
When a key needs to be read from or written to the server, the client computes the slot with CRC16 and then looks up the owning node in this cache.
On the server side, Redis Cluster records the slot-to-node mapping in each node's local node.conf.
As ping/pong messages are exchanged via the gossip protocol, this metadata is broadcast between nodes until they converge on an eventually consistent view.
However, if the slot mapping on the server side is wrong, the client will consume that wrong data.
This is exactly where the problem appeared: some server nodes mapped slots to slave nodes, so the slots cached by the client pointed at slave nodes, read and write requests were sent to those slaves, and the error occurred.
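As a quick illustration of the client-side slot calculation described above, here is a minimal sketch using Lettuce's SlotHash helper (the key is just the example key from the code above):

import io.lettuce.core.cluster.SlotHash;

public class SlotDemo {
    public static void main(String[] args) {
        // CRC16(key) mod 16384: the same calculation the client performs
        // before looking the slot up in its local slotCache.
        int slot = SlotHash.getSlot("some key");
        System.out.println("slot = " + slot);
    }
}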
lettuce source code investigation
1. Lettuce initialization: Partitions.java
/**
 * Update the partition cache. Updates are necessary after the partition details have changed.
 */
public void updateCache() {

    synchronized (partitions) {

        if (partitions.isEmpty()) {
            this.slotCache = EMPTY;
            this.nodeReadView = Collections.emptyList();
            return;
        }

        RedisClusterNode[] slotCache = new RedisClusterNode[SlotHash.SLOT_COUNT];
        List<RedisClusterNode> readView = new ArrayList<>(partitions.size());

        for (RedisClusterNode partition : partitions) {

            readView.add(partition);
            for (Integer integer : partition.getSlots()) {
                slotCache[integer.intValue()] = partition;
            }
        }

        this.slotCache = slotCache;
        this.nodeReadView = Collections.unmodifiableCollection(readView);
    }
}
2. Lettuce command sending: PooledClusterConnectionProvider.java
private CompletableFuture<StatefulRedisConnection<K, V>> getWriteConnection(int slot) {

    CompletableFuture<StatefulRedisConnection<K, V>> writer;

    // avoid races when reconfiguring partitions.
    synchronized (stateLock) {
        writer = writers[slot];
    }

    if (writer == null) {
        RedisClusterNode partition = partitions.getPartitionBySlot(slot);
        if (partition == null) {
            clusterEventListener.onUncoveredSlot(slot);
            return Futures.failed(new PartitionSelectorException("Cannot determine a partition for slot " + slot + ".",
                    partitions.clone()));
        }

        // Use always host and port for slot-oriented operations. We don't want to get reconnected on a different
        // host because the nodeId can be handled by a different host.
        RedisURI uri = partition.getUri();
        ConnectionKey key = new ConnectionKey(Intent.WRITE, uri.getHost(), uri.getPort());

        ConnectionFuture<StatefulRedisConnection<K, V>> future = getConnectionAsync(key);

        return future.thenApply(connection -> {

            synchronized (stateLock) {
                if (writers[slot] == null) {
                    writers[slot] = CompletableFuture.completedFuture(connection);
                }
            }

            return connection;
        }).toCompletableFuture();
    }

    return writer;
}
How Lettuce sends commands:
It loads the topology when the client starts and stores the slot-to-node mapping locally in the slotCache array.
When sending, it computes CRC16 of the key, looks up the corresponding node in the slotCache array by slot, and then obtains the connection for that node.
Note that essentially all middleware with this kind of cluster mode follows the same client logic: obtain the server's network topology, then compute the mapping on the client side.
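To see what the client has actually cached, the Partitions view on the cluster client can be inspected; a minimal diagnostic sketch, assuming the redisClusterClient from the configuration above:

import io.lettuce.core.cluster.SlotHash;
import io.lettuce.core.cluster.models.partitions.RedisClusterNode;

// Look up which node the client-side slot cache currently maps a key to.
// If the returned node carries the SLAVE flag, writes for this slot will fail with READONLY.
int slot = SlotHash.getSlot("some key");
RedisClusterNode node = redisClusterClient.getPartitions().getPartitionBySlot(slot);
System.out.println("slot " + slot + " -> " + node.getUri() + " flags=" + node.getFlags());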
redis cluster information troubleshooting
./bin/redis-cli -h 10.10.28.2 -p 25661 cluster info
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:8
cluster_my_epoch:6
cluster_stats_messages_ping_sent:615483
cluster_stats_messages_pong_sent:610194
cluster_stats_messages_meet_sent:3
cluster_stats_messages_fail_sent:8
cluster_stats_messages_auth-req_sent:5
cluster_stats_messages_auth-ack_sent:2
cluster_stats_messages_update_sent:4
cluster_stats_messages_sent:1225699
cluster_stats_messages_ping_received:610188
cluster_stats_messages_pong_received:603593
cluster_stats_messages_meet_received:2
cluster_stats_messages_fail_received:4
cluster_stats_messages_auth-req_received:2
cluster_stats_messages_auth-ack_received:2
cluster_stats_messages_received:1213791
./bin/redis-cli -h 10.10.28.2 -p 25661 cluster nodes
5e9d0c185a2ba2fc9564495730c874bea76c15fa 10.10.28.3:25662#35662 slave 2281f330d771ee682221bc6c239afd68e6f20571 0 1595921769000 15 connected
79cb673db12199c32737b959cd82ec9963106558 10.10.25.2:25651#35651 master - 0 1595921770000 18 connected 4096-6143
2281f330d771ee682221bc6c239afd68e6f20571 10.10.28.2:25661#35661 myself,master - 0 1595921759000 15 connected 10240-12287
6a9ea568d6b49360afbb650c712bd7920403ba19 10.10.28.3:25686#35686 master - 0 1595921769000 14 connected 12288-14335
5a12dd423370e6f4085e593f9cd0b3a4ddfa9757 10.10.27.2:25656#35656 master - 0 1595921771000 13 connected 14336-16383
f5148dba1127bd9bada8ecc39341a0b72ef25d8e 10.10.25.3:25652#35652 slave 79cb673db12199c32737b959cd82ec9963106558 0 1595921769000 18 connected
f6788b4829e601642ed4139548153830c430b932 10.10.26.3:25666#35666 master - 0 1595921769870 16 connected 8192-10239
f54cfebc12c69725f471d16133e7ca3a8567dc18 10.10.28.15:25687#35687 slave 6a9ea568d6b49360afbb650c712bd7920403ba19 0 1595921763000 14 connected
f09ad21effff245cae23c024a8a886f883634f5c 10.10.28.15:25667#35667 slave f6788b4829e601642ed4139548153830c430b932 0 1595921770870 16 connected
ff5f5a56a7866f32e84ec89482aabd9ca1f05e20 10.10.25.3:25681#35681 master - 0 1595921773876 0 connected 0-2047
19c57214e4293b2e37d881534dcd55318fa96a70 10.10.50.16:25677#35677 slave 5f677e012808b09c67316f6ac5bdf0ec005cd598 0 1595921768000 17 connected
d8b4f99e0f9961f2e866b92e7351760faa3e0f2b 10.10.30.9:25671#35671 master - 0 1595921773000 6 connected 2048-4095
068e3bc73c27782c49782d30b66aa8b1140666ce 10.10.27.3:25682#35682 slave ff5f5a56a7866f32e84ec89482aabd9ca1f05e20 0 1595921771872 12 connected
e8b0311aeec4e3d285028abc377f0c277f9a5c74 10.10.49.9:25672#35672 slave d8b4f99e0f9961f2e866b92e7351760faa3e0f2b 0 1595921770000 6 connected
f03bc2ca91b3012f4612ecbc8c611c9f4a0e1305 10.10.27.3:25657#35657 slave 5a12dd423370e6f4085e593f9cd0b3a4ddfa9757 0 1595921762000 13 connected
5f677e012808b09c67316f6ac5bdf0ec005cd598 10.10.50.7:25676#35676 master - 0 1595921772873 17 connected 6144-8191
./bin/redis-cli -h 10.10.28.3 -p 25662 cluster nodes
f5148dba1127bd9bada8ecc39341a0b72ef25d8e 10.10.25.3:25652#35652 slave 79cb673db12199c32737b959cd82ec9963106558 0 1595921741000 18 connected
f6788b4829e601642ed4139548153830c430b932 10.10.26.3:25666#35666 master - 0 1595921744000 16 connected 8192-10239
f03bc2ca91b3012f4612ecbc8c611c9f4a0e1305 10.10.27.3:25657#35657 slave 5a12dd423370e6f4085e593f9cd0b3a4ddfa9757 0 1595921740000 13 connected
5f677e012808b09c67316f6ac5bdf0ec005cd598 10.10.50.7:25676#35676 master - 0 1595921743127 17 connected 6144-8191
79cb673db12199c32737b959cd82ec9963106558 10.10.25.2:25651#35651 master - 0 1595921743000 18 connected 4096-6143
2281f330d771ee682221bc6c239afd68e6f20571 10.10.28.2:25661#35661 master - 0 1595921744129 15 connected 10240-12287
f09ad21effff245cae23c024a8a886f883634f5c 10.10.28.15:25667#35667 slave f6788b4829e601642ed4139548153830c430b932 0 1595921740000 16 connected
f54cfebc12c69725f471d16133e7ca3a8567dc18 10.10.28.15:25687#35687 slave 6a9ea568d6b49360afbb650c712bd7920403ba19 0 1595921745130 14 connected
5e9d0c185a2ba2fc9564495730c874bea76c15fa 10.10.28.3:25662#35662 myself,slave 2281f330d771ee682221bc6c239afd68e6f20571 0 1595921733000 5 connected 0-1820
068e3bc73c27782c49782d30b66aa8b1140666ce 10.10.27.3:25682#35682 slave ff5f5a56a7866f32e84ec89482aabd9ca1f05e20 0 1595921744000 12 connected
d8b4f99e0f9961f2e866b92e7351760faa3e0f2b 10.10.30.9:25671#35671 master - 0 1595921739000 6 connected 2048-4095
5a12dd423370e6f4085e593f9cd0b3a4ddfa9757 10.10.27.2:25656#35656 master - 0 1595921742000 13 connected 14336-16383
ff5f5a56a7866f32e84ec89482aabd9ca1f05e20 10.10.25.3:25681#35681 master - 0 1595921746131 0 connected 1821-2047
6a9ea568d6b49360afbb650c712bd7920403ba19 10.10.28.3:25686#35686 master - 0 1595921747133 14 connected 12288-14335
19c57214e4293b2e37d881534dcd55318fa96a70 10.10.50.16:25677#35677 slave 5f677e012808b09c67316f6ac5bdf0ec005cd598 0 1595921742126 17 connected
e8b0311aeec4e3d285028abc377f0c277f9a5c74 10.10.49.9:25672#35672 slave d8b4f99e0f9961f2e866b92e7351760faa3e0f2b 0 1595921745000 6 connected
./bin/redis-cli -h 10.10.49.9 -p 25672 cluster nodes
d8b4f99e0f9961f2e866b92e7351760faa3e0f2b 10.10.30.9:25671#35671 master - 0 1595921829000 6 connected 2048-4095
79cb673db12199c32737b959cd82ec9963106558 10.10.25.2:25651#35651 master - 0 1595921830000 18 connected 4096-6143
ff5f5a56a7866f32e84ec89482aabd9ca1f05e20 10.10.25.3:25681#35681 master - 0 1595921830719 0 connected 0-1820
f54cfebc12c69725f471d16133e7ca3a8567dc18 10.10.28.15:25687#35687 slave 6a9ea568d6b49360afbb650c712bd7920403ba19 0 1595921827000 14 connected
5f677e012808b09c67316f6ac5bdf0ec005cd598 10.10.50.7:25676#35676 master - 0 1595921827000 17 connected 6144-8191
2281f330d771ee682221bc6c239afd68e6f20571 10.10.28.2:25661#35661 master - 0 1595921822000 15 connected 10240-12287
5e9d0c185a2ba2fc9564495730c874bea76c15fa 10.10.28.3:25662#35662 slave 2281f330d771ee682221bc6c239afd68e6f20571 0 1595921828714 15 connected
068e3bc73c27782c49782d30b66aa8b1140666ce 10.10.27.3:25682#35682 slave ff5f5a56a7866f32e84ec89482aabd9ca1f05e20 0 1595921832721 12 connected
6a9ea568d6b49360afbb650c712bd7920403ba19 10.10.28.3:25686#35686 master - 0 1595921825000 14 connected 12288-14335
f5148dba1127bd9bada8ecc39341a0b72ef25d8e 10.10.25.3:25652#35652 slave 79cb673db12199c32737b959cd82ec9963106558 0 1595921830000 18 connected
19c57214e4293b2e37d881534dcd55318fa96a70 10.10.50.16:25677#35677 slave 5f677e012808b09c67316f6ac5bdf0ec005cd598 0 1595921829716 17 connected
e8b0311aeec4e3d285028abc377f0c277f9a5c74 10.10.49.9:25672#35672 myself,slave d8b4f99e0f9961f2e866b92e7351760faa3e0f2b 0 1595921832000 4 connected 1821-2047
f09ad21effff245cae23c024a8a886f883634f5c 10.10.28.15:25667#35667 slave f6788b4829e601642ed4139548153830c430b932 0 1595921826711 16 connected
f03bc2ca91b3012f4612ecbc8c611c9f4a0e1305 10.10.27.3:25657#35657 slave 5a12dd423370e6f4085e593f9cd0b3a4ddfa9757 0 1595921829000 13 connected
f6788b4829e601642ed4139548153830c430b932 10.10.26.3:25666#35666 master - 0 1595921831720 16 connected 8192-10239
5a12dd423370e6f4085e593f9cd0b3a4ddfa9757 10.10.27.2:25656#35656 master - 0 1595921827714 13 connected 14336-16383
./bin/redis-trib.rb check 10.10.30.9:25671
>>> Performing Cluster Check (using node 10.10.30.9:25671)
M: d8b4f99e0f9961f2e866b92e7351760faa3e0f2b 10.10.30.9:25671
slots:2048-4095 (2048 slots) master
1 additional replica(s)
S: e8b0311aeec4e3d285028abc377f0c277f9a5c74 10.10.49.9:25672
slots: (0 slots) slave
········
········
S: f03bc2ca91b3012f4612ecbc8c611c9f4a0e1305 10.10.27.3:25657
slots: (0 slots) slave
replicates 5a12dd423370e6f4085e593f9cd0b3a4ddfa9757
M: 5a12dd423370e6f4085e593f9cd0b3a4ddfa9757 10.10.27.2:25656
slots:14336-16383 (2048 slots) master
1 additional replica(s)
[ERR] Nodes don't agree about configuration!
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
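Another quick server-side check worth noting here: CLUSTER SLOTS prints the slot ranges each node believes are assigned and to whom, so comparing its output across nodes makes a divergent view easy to spot, for example:
./bin/redis-cli -h 10.10.28.2 -p 25661 cluster slots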
Be suspicious of everything and stay vigilant; diligence makes up for weaknesses.
At first I assumed the cluster was healthy: judging by the symptoms online, most nodes were normal, only a few had problems, and a restart fixed them.
Initially the information was gathered through the healthy nodes, and no problems were found. Even though the logs showed early on that the topology information of a few nodes was inconsistent, the problem was hard to see without the client's cached slot mapping.
Comparing the outputs shows that some slots are mapped to slave nodes.
After running the check, it turns out that the cluster itself has a problem, which is explained in the related open source issue below.
I saw the same issue as you and tried to investigate it.
I figured out that it is caused by Lettuce.
When we run a Redis command, Lettuce analyzes it to decide which Redis endpoint the command should be sent to.
If it is a READ command, it is sent to a slave node (by setting ReadFrom.Any_Rep). Please note that the other ReadFrom options may change the behavior.
If it is a WRITE command, it is sent to a master node.
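For reference, this routing policy is configured per connection via ReadFrom; a minimal sketch, assuming the redisClusterClient from the configuration above, that forces reads to master nodes:

import io.lettuce.core.ReadFrom;
import io.lettuce.core.cluster.api.StatefulRedisClusterConnection;

StatefulRedisClusterConnection<String, String> connection = redisClusterClient.connect();
// Route read commands to masters only; MASTER is also the default policy,
// so replicas are only used when a replica-oriented ReadFrom is configured explicitly.
connection.setReadFrom(ReadFrom.MASTER);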
To determine which commands are READ commands, Lettuce uses the ReadOnlyCommands class to list all read commands.
In my case, I used the EVAL command to write a key-value pair to Redis, but Lettuce determined it was a READ command and sent it to a slave node => the exception happens.
So please check the ReadOnlyCommands class and make sure your write commands are not listed there. This is a mistake by the Lettuce team and they have already fixed this issue in newer versions.
In your version, the ReadOnlyCommands class for cluster settings is:
class ReadOnlyCommands {

    private static final Set<CommandType> READ_ONLY_COMMANDS = EnumSet.noneOf(CommandType.class);

    static {
        for (CommandName commandNames : CommandName.values()) {
            READ_ONLY_COMMANDS.add(CommandType.valueOf(commandNames.name()));
        }
    }

    /**
     * @param protocolKeyword must not be {@literal null}.
     * @return {@literal true} if {@link ProtocolKeyword} is a read-only command.
     */
    public static boolean isReadOnlyCommand(ProtocolKeyword protocolKeyword) {
        return READ_ONLY_COMMANDS.contains(protocolKeyword);
    }

    /**
     * @return an unmodifiable {@link Set} of {@link CommandType read-only} commands.
     */
    public static Set<CommandType> getReadOnlyCommands() {
        return Collections.unmodifiableSet(READ_ONLY_COMMANDS);
    }

    enum CommandName {
        ASKING, BITCOUNT, BITPOS, CLIENT, COMMAND, DUMP, ECHO, EVAL, EVALSHA, EXISTS, //
        GEODIST, GEOPOS, GEORADIUS, GEORADIUSBYMEMBER, GEOHASH, GET, GETBIT, //
        GETRANGE, HEXISTS, HGET, HGETALL, HKEYS, HLEN, HMGET, HSCAN, HSTRLEN, //
        HVALS, INFO, KEYS, LINDEX, LLEN, LRANGE, MGET, PFCOUNT, PTTL, //
        RANDOMKEY, READWRITE, SCAN, SCARD, SCRIPT, //
        SDIFF, SINTER, SISMEMBER, SMEMBERS, SRANDMEMBER, SSCAN, STRLEN, //
        SUNION, TIME, TTL, TYPE, ZCARD, ZCOUNT, ZLEXCOUNT, ZRANGE, //
        ZRANGEBYLEX, ZRANGEBYSCORE, ZRANK, ZREVRANGE, ZREVRANGEBYLEX, ZREVRANGEBYSCORE, ZREVRANK, ZSCAN, ZSCORE, //
        // Pub/Sub commands are no key-space commands so they are safe to execute on slave nodes
        PUBLISH, PUBSUB, PSUBSCRIBE, PUNSUBSCRIBE, SUBSCRIBE, UNSUBSCRIBE
    }
}
So you can check it easily (note that EVAL and EVALSHA are in the list above).
Solution -> Upgrading Lettuce is the best way to fix this. Or you can try to override this setting.
Related
I have enabled checkpointing in Flink 1.12.1 programmatically as below:
int duration = 10;
if (!environment.getCheckpointConfig().isCheckpointingEnabled()) {
    environment.enableCheckpointing(duration * 6 * 1000, CheckpointingMode.EXACTLY_ONCE);
    environment.getCheckpointConfig().setMinPauseBetweenCheckpoints(duration * 3 * 1000);
}
Flink Version: 1.12.1
configuration:
state.backend: rocksdb
state.checkpoints.dir: file:///flink/
blob.server.port: 6124
jobmanager.rpc.port: 6123
parallelism.default: 2
queryable-state.proxy.ports: 6125
taskmanager.numberOfTaskSlots: 2
taskmanager.rpc.port: 6122
jobmanager.memory.process.size: 1600m
taskmanager.memory.process.size: 1728m
jobmanager.web.address: 0.0.0.0
rest.address: 0.0.0.0
rest.bind-address: 0.0.0.0
rest.port: 8081
taskmanager.data.port: 6121
classloader.resolve-order: parent-first
execution.checkpointing.unaligned: false
execution.checkpointing.max-concurrent-checkpoints: 2
execution.checkpointing.interval: 60000
But it is failing with the following error:
Caused by: org.apache.flink.runtime.resourcemanager.exceptions.UnfulfillableSlotRequestException: Could not fulfill slot request 44ec308e34aa86629d2034a017b8ef91. Requested resource profile (ResourceProfile{UNKNOWN}) is unfulfillable.
If I remove/disable checkpointing, everything works normally. I need checkpointing because, if my pod gets restarted, the data being handled by the earlier run gets reset.
Can somebody advise how this can be addressed?
I was trying to delete messages from my Kafka topic using the Java AdminClient API's deleteRecords method. Following are the steps that I have tried:
1. I pushed 20000 records to my TEST-DELETE topic
2. Started a console consumer and consumed all the messages
3. Invoked my Java program to delete all those 20k messages
4. Started another console consumer with a different group id. This consumer is not receiving any of the deleted messages
When I checked the file system, I could still see all those 20k records occupying disk space. My intention is to delete those records from the file system forever, too.
My Topic configuration is given below along with server.properties settings
Topic:TEST-DELETE PartitionCount:4 ReplicationFactor:1 Configs:cleanup.policy=delete
Topic: TEST-DELETE Partition: 0 Leader: 0 Replicas: 0 Isr: 0
Topic: TEST-DELETE Partition: 1 Leader: 0 Replicas: 0 Isr: 0
Topic: TEST-DELETE Partition: 2 Leader: 0 Replicas: 0 Isr: 0
Topic: TEST-DELETE Partition: 3 Leader: 0 Replicas: 0 Isr: 0
log.retention.hours=24
log.retention.check.interval.ms=60000
log.cleaner.delete.retention.ms=60000
file.delete.delay.ms=60000
delete.retention.ms=60000
offsets.retention.minutes=5
offsets.retention.check.interval.ms=60000
log.cleaner.enable=true
log.cleanup.policy=compact,delete
My delete code is given below
public void deleteRecords(Map<String, Map<Integer, Long>> allTopicPartions) {
    Map<TopicPartition, RecordsToDelete> recordsToDelete = new HashMap<>();
    allTopicPartions.entrySet().forEach(topicDetails -> {
        String topicName = topicDetails.getKey();
        Map<Integer, Long> value = topicDetails.getValue();
        value.entrySet().forEach(partitionDetails -> {
            if (partitionDetails.getValue() != 0) {
                recordsToDelete.put(new TopicPartition(topicName, partitionDetails.getKey()),
                        RecordsToDelete.beforeOffset(partitionDetails.getValue()));
            }
        });
    });

    DeleteRecordsResult deleteRecords = this.client.deleteRecords(recordsToDelete);
    Map<TopicPartition, KafkaFuture<DeletedRecords>> lowWatermarks = deleteRecords.lowWatermarks();

    lowWatermarks.entrySet().forEach(entry -> {
        try {
            logger.info(entry.getKey().topic() + " " + entry.getKey().partition() + " "
                    + entry.getValue().get().lowWatermark());
        } catch (Exception ex) {
        }
    });
}
The output of my java program is given below
2019-06-25 16:21:15 INFO MyKafkaAdminClient:247 - TEST-DELETE 1 5000
2019-06-25 16:21:15 INFO MyKafkaAdminClient:247 - TEST-DELETE 0 5000
2019-06-25 16:21:15 INFO MyKafkaAdminClient:247 - TEST-DELETE 3 5000
2019-06-25 16:21:15 INFO MyKafkaAdminClient:247 - TEST-DELETE 2 5000
My intention is to delete the consumed records from the file system, as I am working with limited storage for my Kafka broker.
I would like to get some help with the doubts below:
I was under the impression that deleteRecords would remove the messages from the file system too, but it looks like I got that wrong!
How long will those deleted records be present in the log directory?
Is there any specific configuration that I need to use in order to remove the records from the file system once the deleteRecords API is invoked?
Appreciate your help
Thanks
The recommended approach to handle this is to set retention.ms and related configuration values for the topics you're interested in. That way, you can define how long Kafka will store your data until it deletes it, making sure all your downstream consumers have had the chance to pull down the data before it's deleted from the Kafka cluster.
If, however, you still want to force Kafka to delete based on bytes, there are the log.retention.bytes and retention.bytes configuration values. The first one is a cluster-wide setting; the second one is the topic-specific setting, which by default takes whatever the first one is set to, but you can still override it per topic. The retention.bytes number is enforced per partition, so you should multiply it by the total number of topic partitions.
Be aware, however, that if you have a runaway producer that suddenly starts generating a lot of data, and you have a hard byte limit set, you might wipe out entire days' worth of data in the cluster and be left with only the last few minutes of data, possibly before even valid consumers can pull down the data from the cluster. This is why it's much better to give your Kafka topics time-based retention, not byte-based.
You can find the configuration properties and their explanation in the official Kafka docs: https://kafka.apache.org/documentation/
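As a hedged illustration of the topic-level retention override described above, reusing the AdminClient (client) from the question's code. The topic name and value are examples, incrementalAlterConfigs requires Kafka clients 2.3 or newer, and exception handling is omitted:

import java.util.Collection;
import java.util.Collections;
import java.util.Map;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

// Set a one-minute retention.ms on the TEST-DELETE topic so old segments become eligible
// for time-based deletion (actual removal still waits for the retention check interval
// and file.delete.delay.ms).
ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "TEST-DELETE");
AlterConfigOp setRetention = new AlterConfigOp(new ConfigEntry("retention.ms", "60000"), AlterConfigOp.OpType.SET);
Map<ConfigResource, Collection<AlterConfigOp>> updates = Collections.singletonMap(topic, Collections.singletonList(setRetention));
client.incrementalAlterConfigs(updates).all().get();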
I have a setup with 1 master node and 1 slave node.
My issue is with running MapReduce processing: the slave node doesn't seem to be working. Can anyone provide help on how to check, change, and ensure the slave is working?
The config file info can be found at the URL below too:
https://drive.google.com/file/d/1ULEe6k2zYnfQDQUQIbz_xR29WgT1DJhB/view
Here are my observations:
1) When I check CPU utilization, the slave doesn't seem to be working: it sits at 0% CPU while running the MapReduce job, while the master is at 44% CPU. Refer to the attachment.
2) When I run the dfs report, it shows 2 live nodes, but the cluster web UI shows only 1. Refer to the attachment and below.
3) The total MapReduce processing time is the same with or without the slave.
-------------------------------------------------
Live datanodes (2):
Name: 192.168.249.128:9866 (node-master)
Hostname: localhost
Decommission Status : Normal
Configured Capacity: 20587741184 (19.17 GB)
DFS Used: 174785723 (166.69 MB)
Non DFS Used: 60308293 (57.51 MB)
DFS Remaining: 20352647168 (18.95 GB)
DFS Used%: 0.85%
DFS Remaining%: 98.86%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Tue Oct 23 11:17:39 PDT 2018
Last Block Report: Tue Oct 23 11:07:32 PDT 2018
Num of Blocks: 93
Name: 192.168.249.129:9866 (node1)
Hostname: localhost
Decommission Status : Normal
Configured Capacity: 20587741184 (19.17 GB)
DFS Used: 85743 (83.73 KB)
Non DFS Used: 33775889 (32.21 MB)
DFS Remaining: 20553879552 (19.14 GB)
DFS Used%: 0.00%
DFS Remaining%: 99.84%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Tue Oct 23 11:17:38 PDT 2018
Last Block Report: Tue Oct 23 11:03:59 PDT 2018
Num of Blocks: 4
You're showing DataNodes with the dfs report, not the NodeManagers that actually process the data. In the YARN UI, you will want to take note of the "Active Nodes" counter, which in your case is 1. That would make sense if the master is a NameNode and ResourceManager while the slave is a DataNode and NodeManager.
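A quick way to confirm how many NodeManagers have actually registered (the same number as the "Active Nodes" counter), assuming the YARN CLI is on the path:
yarn node -list -all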
Other than that, if you have a non-splittable file, for example a ZIP, or your file is smaller than the block size (by default 128 MB), then only one mapper will process it. Plus, it's not guaranteed that mappers (or reducers) will be distributed evenly over all available resources.
Outside of a learning environment, though, 40 GB of storage and 8 GB of RAM would be better spent on multi-threading rather than distributed computing (or on a proper database, i.e. parsing the files and loading them into a queryable store). Or use Spark or Pig, which don't require Hadoop but are much easier to work with than MapReduce.
I have a cluster with 3 nodes in my development environment, with a keyspace at replication factor 2. Originally I had only one node in this cluster, but then I added 2 more nodes, one by one. The Cassandra version is 3.7.
All these nodes are "clones", so I just modified the cassandra.yaml with the corresponding IP on every node.
I've done a repair and cleanup on every node, and in my application I use consistency level ONE.
This is the nodetool status output:
Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 10.132.0.4 50.54 GiB 256 70.2% 50dc5baf-b8b3-4e19-8173-cf828afd36af rack1
UN 10.132.0.3 50.31 GiB 256 65.3% 2a45b7a5-41ce-4533-ba63-60fd3c5cc530 rack1
UN 10.132.0.9 33.88 GiB 256 64.5% e601fb16-6608-4e72-a820-dd4661977946 rack1
In cassandra.yaml I have only 10.132.0.3 as the seed node.
So at this point everything works fine and as expected: if I take down one node, everything keeps running "fine", unless that node is 10.132.0.9. If I take down this "bad" node, everything crashes with the following error:
org.apache.cassandra.exceptions.UnavailableException: Cannot achieve consistency level QUORUM
When I stop the bad node, the good ones show this error in their system.log files (I only copied the error, not the entire stack trace):
ERROR [SharedPool-Worker-1] 2018-02-27 10:59:16,449 QueryMessage.java:128 - Unexpected error during query
com.google.common.util.concurrent.UncheckedExecutionException: com.google.common.util.concurrent.UncheckedExecutionException: java.lang.RuntimeException: org.apache.cassandra.exceptions.UnavailableException: Cannot achieve consistency level QUORUM
I don't understand what's wrong with this node and I can't find a solution...
Edit
My connection code:
cluster_builder = Cluster.builder()
.addContactPoints(serverIP.getCassandraList(sysvar))
.withAuthProvider(new PlainTextAuthProvider(serverIP.getCassandraUser(sysvar), serverIP.getCassandraPwd(sysvar))).withPoolingOptions(poolingOptions)
.withQueryOptions(new QueryOptions().setConsistencyLevel(ConsistencyLevel.ONE));
cluster = cluster_builder.build();
session = cluster.connect(keyspace);
My query:
statement = QueryBuilder.insertInto(keyspace, "measurement_minute").values(this.normal_names, (List<Object>) values);
And the execution:
ResultSetFuture future = session.executeAsync(statement.setConsistencyLevel(ConsistencyLevel.ONE));
I want to mention that I restarted, repaired and cleaned up all the nodes.
You are requesting QUORUM with a replication factor of 2. This won't really work well, because with RF=2 it is effectively the same as requesting ALL. For a quorum, a majority of the replicas need to respond to your query.
You can calculate the node count for a quorum with (RF/2)+1 (using integer arithmetic), so RF=2 gives (2/2)+1=2 - you need both of your replicas and can't have one down. The reason for some queries to work is that those don't use 10.132.0.9.
You can go with a replication factor of RF=3 or use CL.ONE for example.
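For instance, raising the replication factor could be done through the same Java driver session used in the question; a minimal sketch (the keyspace name and replication strategy are placeholders, and a full repair is needed afterwards so the new replicas receive the data):

// Raise RF to 3 for the keyspace (name is a placeholder), then run "nodetool repair" on each node.
session.execute("ALTER KEYSPACE my_keyspace "
        + "WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3}");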
I create a Source of consumer records using Reactive Kafka as follows:
val settings = ConsumerSettings(system, keyDeserializer, valueDeserializer)
.withBootstrapServers(bootstrapServers)
.withGroupId(groupName)
// what offset to begin with if there's no offset for this group
.withProperty(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest")
// do we want to automatically commit offsets?
.withProperty(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "true")
// auto-commit offsets every 1 second, in the background
.withProperty(ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG, "1000")
// reconnect every 1 second, when disconnected
.withProperty(ConsumerConfig.RECONNECT_BACKOFF_MS_CONFIG, "1000")
// every session lasts 30 seconds
.withProperty(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, "30000")
// send heartbeat every 10 seconds i.e. 1/3 * session.timeout.ms
.withProperty(ConsumerConfig.HEARTBEAT_INTERVAL_MS_CONFIG, "10000")
// how many records to fetch in each poll( )
.withProperty(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "100")
Consumer.atMostOnceSource(settings, Subscriptions.topics(topic)).map(_.value)
I have 1 instance of Kafka running on my local machine. I push values into the topic via the console producer and see them printed out. Then I kill Kafka, and restart it to see if the source reconnects.
These are how my logs proceed:
* Connection with /192.168.0.1 disconnected
java.net.ConnectException: Connection refused
* Give up sending metadata request since no node is available
* Consumer interrupted with WakeupException after timeout. Message: null. Current value of akka.kafka.consumer.wakeup-timeout is 3000 milliseconds
* Resuming partition test-events-0
* Error while fetching metadata with correlation id 139 : {test-events=INVALID_REPLICATION_FACTOR}
* Sending metadata request (type=MetadataRequest, topics=test-events) to node 0
* Sending GroupCoordinator request for group mytestgroup to broker 192.168.0.1:9092 (id: 0 rack: null)
* Consumer interrupted with WakeupException after timeout. Message: null. Current value of akka.kafka.consumer.wakeup-timeout is 3000 milliseconds
* Received GroupCoordinator response ClientResponse(receivedTimeMs=1491797713078, latencyMs=70, disconnected=false, requestHeader={api_key=10,api_version=0,correlation_id=166,client_id=consumer-1}, responseBody={error_code=15,coordinator={node_id=-1,host=,port=-1}}) for group mytestgroup
* Error while fetching metadata with correlation id 169 : {test-events=INVALID_REPLICATION_FACTOR}
* Received GroupCoordinator response ClientResponse(receivedTimeMs=1491797716169, latencyMs=72, disconnected=false, requestHeader={api_key=10,api_version=0,correlation_id=196,client_id=consumer-1}, responseBody={error_code=15,coordinator={node_id=-1,host=,port=-1}}) for group mytestgroup
09:45:16.169 [testsystem-akka.kafka.default-dispatcher-16] DEBUG o.a.k.c.c.i.AbstractCoordinator - Group coordinator lookup for group mytestgroup failed: The group coordinator is not available.
09:45:16.169 [testsystem-akka.kafka.default-dispatcher-16] DEBUG o.a.k.c.c.i.AbstractCoordinator - Coordinator discovery failed for group mytestgroup, refreshing metadata
* Initiating API versions fetch from node 2147483647
* Offset commit for group mytestgroup failed: This is not the correct coordinator for this group.
* Marking the coordinator 192.168.43.25:9092 (id: 2147483647 rack: null) dead for group mytestgroup
* The Kafka consumer has closed.
How do I make sure that this Source reconnects and continues processing the logs?
I think you need to have at least 2 brokers. If one fails, the other one can do the job while you restart the failed one.