SCDF: Error handling when pod failed to start

SCDF: Error handling when pod failed to start - java

I'm working on a service where it will call Spring Cloud Dataflow (SCDF) to spin off a new k8s Pod for Spring Batch job.
Map<String, String> properties = Map.of("testApp.cpu", cpu, "testApp.memory", memory);
LOGGER.info("Create task '{}' with definition '{}'", taskName, taskDefinition);
taskOperations.create(taskName, taskDefinition);
LOGGER.info("Launching task '{}' with properties {} and arguments '{}'", taskName, properties, args);
return taskOperations.launch(taskName, properties, args);
Everything works fine. The problem is, whenever we pull a non-existing image (eg: due to some connection issue), the pod failed to start AND we end up with pending tasks (with NO batch jobs created whatever)
For example, we will have tasks in the table task_execution (SCDF table) with empty end time
But no related jobs in batch_job_execution table.
It seems fine at first since no pod is created, we don't consume any resource. But as the number of "pending jobs" reached 20, we have the famous error:
Cannot launch task testApp. The maximum concurrent task executions is at its limit [20]
I'm trying to find a way to detect that the pod spin-off has failed (and hence we should mark the task as error), but to no avail.
Is there a way to detect if the task launch has failed when that task launch a new k8s pod?
UPDATE
Not sure if it is relevant, we are using SCDF 1.7.3.RELEASE
Describe the failed pod:
Name: podname-lp2nyowgmm
Namespace: my-namespace
Priority: 1000
Priority Class Name: test-cluster-default
Node: some-ip.compute.internal/XX.XXX.XXX.XX
Start Time: Thu, 14 Jan 2021 18:47:52 +0700
Labels: role=spring-app
spring-app-id=podname-lp2nyowgmm
spring-deployment-id=podname-lp2nyowgmm
task-name=podname
Annotations: iam.amazonaws.com/role: arn:aws:iam::XXXXXXXXXXXX:role/svc-XXXX-XXX-XX-XXXX-X-XXX-XXX-XXXXXXXXXXXXXXXXXXXX
kubernetes.io/psp: eks.privileged
Status: Pending
IP: XX.XXX.XXX.XXX
IPs:
IP: XX.XXX.XXX.XXX
Containers:
podname-lp2nyowgmm:
Container ID:
Image: image_host:XXX/mysystem/myapp:notExist
Image ID:
Port: <none>
Host Port: <none>
Args:
--spring.datasource.username=postgres
--spring.cloud.task.name=podname
--spring.datasource.url=jdbc:postgresql://...
--spring.datasource.driverClassName=org.postgresql.Driver
--spring.datasource.password=XXXX
--fileId=XXXXXXXXXXX
--spring.application.name=app-name
--fileName=file_name.csv
...
--spring.cloud.task.executionid=3
State: Waiting
Reason: ErrImagePull
Ready: False
Restart Count: 0
Limits:
cpu: 2
memory: 8Gi
Requests:
cpu: 2
memory: 8Gi
Environment:
ELASTIC_SEARCH_PORT: 80
ELASTIC_SEARCH_PROTOCOL: http
SPRING_RABBITMQ_PORT: ${RABBITMQ_SERVICE_PORT}
ELASTIC_SEARCH_URL: elasticsearch
SPRING_PROFILES_ACTIVE: kubernetes
CLIENT_SECRET: ${CLIENT_SECRET}
SPRING_RABBITMQ_HOST: ${RABBITMQ_SERVICE_HOST}
RELEASE_ENV_NAME: QA_TEST
SPRING_CLOUD_APPLICATION_GUID: ${HOSTNAME}
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-xxxxx(ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
default-token-xxxxx:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-xxxxx
Optional: false
QoS Class: Guaranteed
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 3m22s default-scheduler Successfully assigned my-namespace/podname-lp2nyowgmm to some-ip.compute.internal
Normal Pulling 103s (x4 over 3m21s) kubelet Pulling image "image_host:XXX/mysystem/myapp:notExist"
Warning Failed 102s (x4 over 3m19s) kubelet Failed to pull image "image_host:XXX/mysystem/myapp:notExist": rpc error: code = Unknown desc = Error response from daemon: manifest for image_host:XXX/mysystem/myapp:notExist not found: manifest unknown: manifest unknown
Warning Failed 102s (x4 over 3m19s) kubelet Error: ErrImagePull
Normal BackOff 88s (x6 over 3m19s) kubelet Back-off pulling image "image_host:XXX/mysystem/myapp:notExist"
Warning Failed 73s (x7 over 3m19s) kubelet Error: ImagePullBackOff

1.7.3 is a very old release. We just released 2.7. The original logic used the task execution tables instead of the pod status. If the version you are using is subject to that, then it would explain what you are seeing. I strongly recommend an upgrade.

Thanks for the question. Looking at the source code, we don't include Pendingpods when calculating the current number of executing tasks. It may be something else is going on. 1) Could you run kubectl describe pod on a pod when it's in this state and post the result? (status details). 2) Is the deployer configured to create a job for each task? (false by default).

Related

How can I get notified when money has been sent to a particular Bitcoin address on a local regtest network?

I want to programmatically detect whenever someone sends Bitcoin to some address. This happens on a local testnet which I start using this docker-compose.yml file.
Once the local testnet runs, I create a new address using
docker exec -it minimal-crypto-exchange_node_1 bitcoin-cli getnewaddress
Let's say it returns 2N23tWAFEtBtTgxNjBNmnwzsiPdLcNek181.
I put this address into the following Java code:
import org.bitcoinj.core.Address;
import org.bitcoinj.core.Coin;
import org.bitcoinj.core.NetworkParameters;
import org.bitcoinj.core.Transaction;
import org.bitcoinj.wallet.Wallet;
import org.bitcoinj.wallet.listeners.WalletCoinsReceivedEventListener;
public class WalletObserver {
public void init() {
final NetworkParameters netParams = NetworkParameters.fromID(NetworkParameters.ID_REGTEST);
try {
final Wallet wallet = Wallet.createBasic(netParams);
wallet.addWatchedAddress(Address.fromString(netParams, "2N23tWAFEtBtTgxNjBNmnwzsiPdLcNek181"));
wallet.addCoinsReceivedEventListener(new WalletCoinsReceivedEventListener() {
#Override
public void onCoinsReceived(final Wallet wallet, final Transaction transaction, final Coin prevBalance, final Coin newBalance) {
System.out.println("Heyo!");
}
});
}
catch (Exception exception) {
exception.printStackTrace();
}
}
}
Then I start the Java application with this class.
Then I send some test Bitcoin to the address in question:
% docker exec -it minimal-crypto-exchange_node_1 bitcoin-cli sendtoaddress 2N23tWAFEtBtTgxNjBNmnwzsiPdLcNek181 0.5
068c377bab961356ad9a3919229a764aa929711c68aefd5dbd4c7c348eef3406
If I go to http://localhost:3002/tx/068c377bab961356ad9a3919229a764aa929711c68aefd5dbd4c7c348eef3406, I see that the transaction details.
However, the breakpoint in the listener (onCoinsReceived method) never activates.
How do I need to modify my code and/or the commands I use to send test BTC so that whenever money is received by that account, onCoinsReceived method is called? Is there a place where I can tell Wallet or NetworkParameters that I want to connect to localhost?
I am using version 0.15.10 of bitcoinj-core.
Update 1:
I modified docker-compose.yml and added following port mappings:
ports:
- "51001:50001"
- "51002:50002"
- "19001:19001"
- "19000:19000"
- "28332:28332"
Then I rewrote the init method so that I can connect to localhost and specify the port:
public class WalletObserver {
public void init() {
final LocalTestNetParams netParams = new LocalTestNetParams();
netParams.setPort(50001);
try {
final WalletAppKit kit = new WalletAppKit(netParams, new File("."), "_minimalCryptoExchangeBtcWallet");
kit.setAutoSave(true);
kit.connectToLocalHost();
kit.startAsync();
kit.awaitRunning(); // I never get past this point
kit.peerGroup().addPeerDiscovery(new DnsDiscovery(netParams));
kit.wallet().addWatchedAddress(Address.fromString(netParams, "2N23tWAFEtBtTgxNjBNmnwzsiPdLcNek181"));
kit.wallet().addCoinsReceivedEventListener(new WalletCoinsReceivedEventListener() {
#Override
public void onCoinsReceived(final Wallet wallet, final Transaction transaction, final Coin prevBalance, final Coin newBalance) {
System.out.println("Heyo!");
}
});
}
catch (Exception exception) {
exception.printStackTrace();
}
}
LocalTestNetParams allows to specify the port:
package com.dpisarenko.minimalcryptoexchange.logic.btc;
import org.bitcoinj.params.RegTestParams;
public class LocalTestNetParams extends RegTestParams {
public void setPort(final int newPort) {
this.port = newPort;
}
}
I tried all of the aforementioned ports in netParams.setPort(50001);.
In all cases I get the following messages after kit.awaitRunning();:
22:16:34.245 [PeerGroup Thread] INFO org.bitcoinj.core.PeerGroup - Attempting connection to [10.10.1.218]:50001 (0 connected, 1 pending, 1 max)
22:16:34.265 [NioClientManager] WARN org.bitcoinj.net.NioClientManager - Failed to connect with exception: java.net.ConnectException: Connection refused
java.net.ConnectException: Connection refused
at java.base/sun.nio.ch.Net.pollConnect(Native Method)
at java.base/sun.nio.ch.Net.pollConnectNow(Net.java:579)
at java.base/sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:820)
at org.bitcoinj.net.NioClientManager.handleKey(NioClientManager.java:64)
at org.bitcoinj.net.NioClientManager.run(NioClientManager.java:122)
at com.google.common.util.concurrent.AbstractExecutionThreadService$1$2.run(AbstractExecutionThreadService.java:66)
at com.google.common.util.concurrent.Callables$4.run(Callables.java:119)
at org.bitcoinj.utils.ContextPropagatingThreadFactory$1.run(ContextPropagatingThreadFactory.java:51)
at java.base/java.lang.Thread.run(Thread.java:830)
22:16:34.267 [NioClientManager] INFO org.bitcoinj.core.PeerGroup - [10.10.1.218]:50001: Peer died (0 connected, 0 pending, 1 max)
22:16:34.267 [PeerGroup Thread] INFO org.bitcoinj.core.PeerGroup - Peer discovery took 21.84 μs and returned 0 items from 0 discoverers
22:16:34.269 [PeerGroup Thread] INFO org.bitcoinj.core.PeerGroup - Waiting 1502 ms before next connect attempt to [10.10.1.218]:50001
22:16:35.776 [PeerGroup Thread] INFO org.bitcoinj.core.PeerGroup - Attempting connection to [10.10.1.218]:50001 (0 connected, 1 pending, 1 max)
22:16:35.778 [NioClientManager] WARN org.bitcoinj.net.NioClientManager - Failed to connect with exception: java.net.ConnectException: Connection refused
java.net.ConnectException: Connection refused
at java.base/sun.nio.ch.Net.pollConnect(Native Method)
at java.base/sun.nio.ch.Net.pollConnectNow(Net.java:579)
at java.base/sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:820)
at org.bitcoinj.net.NioClientManager.handleKey(NioClientManager.java:64)
at org.bitcoinj.net.NioClientManager.run(NioClientManager.java:122)
at com.google.common.util.concurrent.AbstractExecutionThreadService$1$2.run(AbstractExecutionThreadService.java:66)
at com.google.common.util.concurrent.Callables$4.run(Callables.java:119)
at org.bitcoinj.utils.ContextPropagatingThreadFactory$1.run(ContextPropagatingThreadFactory.java:51)
at java.base/java.lang.Thread.run(Thread.java:830)
22:16:35.778 [NioClientManager] INFO org.bitcoinj.core.PeerGroup - [10.10.1.218]:50001: Peer died (0 connected, 0 pending, 1 max)
22:16:35.779 [PeerGroup Thread] INFO org.bitcoinj.core.PeerGroup - Peer discovery took 8.752 μs and returned 0 items from 0 discoverers
10.10.1.218 seems to be generated by InetAddress.getLocalHost() in org.bitcoinj.kits.WalletAppKit#connectToLocalHost:
public WalletAppKit connectToLocalHost() {
try {
InetAddress localHost = InetAddress.getLocalHost();
return this.setPeerNodes(new PeerAddress(this.params, localHost, this.params.getPort()));
} catch (UnknownHostException var2) {
throw new RuntimeException(var2);
}
}
Update 1:
I tried to use network_mode: "host".
If I add it to node as in
node:
image: ulamlabs/bitcoind-custom-regtest:latest
network_mode: "host"
I get the following error when I run docker-compose up -d:
minimal-crypto-exchange % docker-compose up -d
Creating network "minimal-crypto-exchange_default" with the default driver
Creating minimal-crypto-exchange_postgres_1 ... done
Creating minimal-crypto-exchange_geth_1 ...
Creating minimal-crypto-exchange_node_1 ... done
Creating minimal-crypto-exchange_electrumx_1 ...
Creating minimal-crypto-exchange_electrumx_1 ... error
ERROR: for minimal-crypto-exchange_electrumx_1 Cannot start service electrumx: driver fail
Creating minimal-crypto-exchange_geth_1 ... done
f68d0f25a0512399877bc55434513def810649e4fcf31a5a88ca3292d34): Error starting userland proxy: listen tcp4 0.0.0.0:28332: bind: address already in use
Creating minimal-crypto-exchange_blockscout_1 ... done
ERROR: for electrumx Cannot start service electrumx: driver failed programming external connectivity on endpoint minimal-crypto-exchange_electrumx_1 (8eaa4f68d0f25a0512399877bc55434513def810649e4fcf31a5a88ca3292d34): Error starting userland proxy: listen tcp4 0.0.0.0:28332: bind: address already in use
ERROR: Encountered errors while bringing up the project.
If I add it to electrumx part as in
electrumx:
image: lukechilds/electrumx:latest
network_mode: "host"
I get another error:
minimal-crypto-exchange % docker-compose up -d
minimal-crypto-exchange_postgres_1 is up-to-date
minimal-crypto-exchange_geth_1 is up-to-date
Recreating minimal-crypto-exchange_node_1 ...
Recreating minimal-crypto-exchange_node_1 ... done
Recreating minimal-crypto-exchange_electrumx_1 ...
ERROR: for minimal-crypto-exchange_electrumx_1 "host" network_mode is incompatible with port_bindings
ERROR: for electrumx "host" network_mode is incompatible with port_bindings
Traceback (most recent call last):
File "docker-compose", line 3, in <module>
File "compose/cli/main.py", line 81, in main
File "compose/cli/main.py", line 203, in perform_command
File "compose/metrics/decorator.py", line 18, in wrapper
File "compose/cli/main.py", line 1186, in up
File "compose/cli/main.py", line 1166, in up
File "compose/project.py", line 697, in up
File "compose/parallel.py", line 108, in parallel_execute
File "compose/parallel.py", line 206, in producer
File "compose/project.py", line 679, in do
File "compose/service.py", line 579, in execute_convergence_plan
File "compose/service.py", line 499, in _execute_convergence_recreate
File "compose/parallel.py", line 108, in parallel_execute
File "compose/parallel.py", line 206, in producer
File "compose/service.py", line 494, in recreate
File "compose/service.py", line 612, in recreate_container
File "compose/service.py", line 330, in create_container
File "compose/service.py", line 939, in _get_container_create_options
File "compose/service.py", line 1014, in _get_container_host_config
File "docker/api/container.py", line 598, in create_host_config
File "docker/types/containers.py", line 338, in __init__
docker.errors.InvalidArgument: "host" network_mode is incompatible with port_bindings
[44262] Failed to execute script docker-compose
Update 2:
If I comment out port bindings as in
electrumx:
image: lukechilds/electrumx:latest
network_mode: host
links:
- node
# Port settings see https://github.com/ulamlabs/bitcoind-custom-regtest
# ports:
# - "51001:50001"
# - "51002:50002"
# - "19001:19001"
# - "19000:19000"
# - "28332:28332"
and run docker-compose up -d I get
% docker-compose up -d
Creating network "minimal-crypto-exchange_default" with the default driver
Creating minimal-crypto-exchange_geth_1 ...
Creating minimal-crypto-exchange_postgres_1 ... done
Creating minimal-crypto-exchange_node_1 ... done
Creating minimal-crypto-exchange_electrumx_1 ... error
Creating minimal-crypto-exchange_geth_1 ... done
ERROR: for minimal-crypto-exchange_electrumx_1 Cannot create container for service electrumx: conflicting options: host type networking can't be used with links. This would result in undefined behavior
Creating minimal-crypto-exchange_blockscout_1 ... done
ERROR: for electrumx Cannot create container for service electrumx: conflicting options: host type networking can't be used with links. This would result in undefined behavior
ERROR: Encountered errors while bringing up the project.
Update 3: I assume that the root of the error is that in my Java code I try to connect to the ElectrumX server instead of the actual Bitcoin node (node in docker-compose.yml).
Update 4:
I changed docker-compose.yml as follows:
node:
image: ulamlabs/bitcoind-custom-regtest:latest
# For ports used by node see
# https://github.com/ulamlabs/bitcoind-custom-regtest/blob/master/bitcoin.conf
ports:
- "19001:19001"
- "19000:19000"
- "28332:28332"
electrumx:
image: lukechilds/electrumx:latest
links:
- node
# Port settings see https://github.com/ulamlabs/bitcoind-custom-regtest
ports:
- "51001:50001"
- "51002:50002"
# - "19001:19001"
# - "19000:19000"
# - "28332:28332"
Now I am getting different errors (full log available here):
11:33:51.865 [NioClientManager] INFO org.bitcoinj.core.PeerGroup - [192.168.10.208]:19000: Peer died (0 connected, 0 pending, 1 max)
11:33:51.865 [NioClientManager] INFO org.bitcoinj.core.PeerGroup - Not yet setting download peer because there is no clear candidate.
11:33:51.865 [NioClientManager] DEBUG org.bitcoinj.core.BitcoinSerializer - Received 168 byte 'alert' message: 60010000000000000000000000ffffff7f00000000ffffff7ffeffff7f01ffffff7f00000000ffffff7f00ffffff7f002f555247454e543a20416c657274206b657920636f6d70726f6d697365642c2075706772616465207265717569726564004630440220653febd6410f470f6bae11cad19c48413becb1ac2c17f908fd0fd53bdc3abd5202206d0e9c96fe88d4a0f01ed9dedae2b6f9e00da94cad0fecaae66ecf689bf71b50
11:33:51.866 [PeerGroup Thread] INFO org.bitcoinj.core.PeerGroup - Waiting 999 ms before next connect attempt to [127.0.0.1]:19000
11:33:51.866 [NioClientManager] DEBUG org.bitcoinj.core.Peer - Received alert from peer Peer{[192.168.10.208]:19000, version=70015, subVer=/Satoshi:0.19.1(bitcore)/, services=1033 (NETWORK, WITNESS, NETWORK_LIMITED), time=2021-11-06 11:33:52, height=5}: URGENT: Alert key compromised, upgrade required
11:33:51.867 [NioClientManager] WARN org.bitcoinj.net.ConnectionHandler - Error handling SelectionKey: java.nio.channels.CancelledKeyException
java.nio.channels.CancelledKeyException: null
at java.base/sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:71)
at java.base/sun.nio.ch.SelectionKeyImpl.readyOps(SelectionKeyImpl.java:130)
at java.base/java.nio.channels.SelectionKey.isWritable(SelectionKey.java:377)
at org.bitcoinj.net.ConnectionHandler.handleKey(ConnectionHandler.java:244)
at org.bitcoinj.net.NioClientManager.handleKey(NioClientManager.java:86)
at org.bitcoinj.net.NioClientManager.run(NioClientManager.java:122)
at com.google.common.util.concurrent.AbstractExecutionThreadService$1$2.run(AbstractExecutionThreadService.java:66)
at com.google.common.util.concurrent.Callables$4.run(Callables.java:119)
at org.bitcoinj.utils.ContextPropagatingThreadFactory$1.run(ContextPropagatingThreadFactory.java:51)
at java.base/java.lang.Thread.run(Thread.java:830)
Update 5:
Someone suggested (in a now removed comment) that in the output of the application there is this Peer does not support bloom filtering message:
11:32:43.482 [NioClientManager] INFO org.bitcoinj.core.Peer - Peer{[127.0.0.1]:19000, version=70015, subVer=/Satoshi:0.19.1(bitcore)/, services=1033 (NETWORK, WITNESS, NETWORK_LIMITED), time=2021-11-06 11:32:43, height=4}: Peer does not support bloom filtering.
So I tried to fork the original image and change the bitcoin.conf file to enable Bloom filtering:
peerbloomfilters=1
When I run docker build -t mentiflectax/bitcoind-custom-regtest:latest . I get the following error message (part of remaining output can be found here):
#13 922.4 g++: fatal error: Killed signal terminated program cc1plus
#13 922.4 compilation terminated.
#13 922.4 make[2]: *** [Makefile:8044: libbitcoin_server_a-init.o] Error 1
#13 922.4 make[2]: *** Waiting for unfinished jobs....
#13 965.8 make[2]: Leaving directory '/bitcoin-0.19.1/src'
#13 965.8 make[1]: *** [Makefile:13765: all-recursive] Error 1
#13 965.9 make[1]: Leaving directory '/bitcoin-0.19.1/src'
#13 965.9 make: *** [Makefile:776: all-recursive] Error 1
------
executor failed running [/bin/sh -c tar -xzf *.tar.gz && cd bitcoin-${BITCOIN_VERSION} && sed -i 's/consensus.nSubsidyHalvingInterval = 150/consensus.nSubsidyHalvingInterval = 210000/g' src/chainparams.cpp && ./autogen.sh && ./configure LDFLAGS=-L`ls -d /opt/db`/lib/ CPPFLAGS=-I`ls -d /opt/db`/include/ --prefix=/opt/bitcoin --disable-man --disable-tests --disable-bench --disable-ccache --with-gui=no --enable-util-cli --with-daemon && make -j4 && make install && strip /opt/bitcoin/bin/bitcoin-cli && strip /opt/bitcoin/bin/bitcoind]: exit code: 2
Update 6: The correct port seems to be 19000.
If I use port 19001, I get following errors after kit.awaitRunning():
INFO org.bitcoinj.core.PeerSocketHandler - [127.0.0.1]:19001: Timed out
Full log output is available here.

I haven't tested your full setup with electrumx and the ethereum stuff present in your docker-compose file, but regarding your problem, the following steps worked properly, and I think it will do as well in your complete setup.
I ran with docker a bitcoin node based in the ulamlabs/bitcoind-custom-regtest:latest image you provided:
docker run -p 18444:19000 -d ulamlabs/bitcoind-custom-regtest:latest
As you can see, I exposed the image internal port 19000 as the default port for RegTestParams, 18444. From the point of view of our client, with this setup, basically it will look like as if we were running the bitcoin daemon in the host. Using your LocalTestNetParams class and providing the port 19000 as you indicated should do the trick as well.
Then, according to the feedback you provided in the question, I manually edited the daemon configuration of the bitcoin node in /root/.bitcoin/bitcoin.conf using bash and vi:
docker exec -it 0aa2e863cd9927 bash
And included the following configuration:
peerbloomfilters=1
After restart the container, I got a new address:
docker exec -it 0aa2e863cd9927 bitcoin-cli -regtest getnewaddress
Let's assume that the new address is the one you provided in the question:
2N23tWAFEtBtTgxNjBNmnwzsiPdLcNek181
Then, as suggested in the Bitcoin documentation, in order to avoid an insufficient funds error, I generated 101 blocks to this address:
docker exec -it 0aa2e863cd9927 bitcoin-cli -regtest generatetoaddress 101 2N23tWAFEtBtTgxNjBNmnwzsiPdLcNek181
I used generatetoaddress and not generate because since Bitcoin 0.19.0 the option is no longer valid.
Next, I prepared a simple Java program, based in the information you provided and this example from the Bitcoinj library documentation:
import java.io.File;
import org.bitcoinj.core.Address;
import org.bitcoinj.core.NetworkParameters;
import org.bitcoinj.kits.WalletAppKit;
import org.bitcoinj.params.RegTestParams;
public class Kit {
public static void main(String[] args) {
Kit kit = new Kit();
kit.run();
}
private synchronized void run(){
NetworkParameters params = RegTestParams.get();
WalletAppKit kit = new WalletAppKit(params, new File("."), "walletappkit-example");
kit.connectToLocalHost();
kit.startAsync();
kit.awaitRunning();
kit.wallet().addWatchedAddress(Address.fromString(params, "2N23tWAFEtBtTgxNjBNmnwzsiPdLcNek181"));
kit.wallet().addCoinsReceivedEventListener((wallet, tx, prevBalance, newBalance) -> {
System.out.println("-----> coins resceived: " + tx.getTxId());
});
while (true) {
try {
this.wait(2000);
} catch (InterruptedException e) {
e.printStackTrace();
}
}
}
}
I used a simple while loop to keep the problem running; of course, if will be probably unnecessary in an actual setup as it seems you are using Spring Boot.
Then, if you send some bitcoins to this address:
docker exec -it 0aa2e863cd9927 bitcoin-cli -regtest sendtoaddress 2N23tWAFEtBtTgxNjBNmnwzsiPdLcNek181 0.00001
0f972642713c72ae0fe03fe51818b9ea4d483720b69b90e795f35eb80a587c26
The listener should be invoked:
2021-11-09 23:51:20.537 INFO [NioClientManager][Wallet] Received a pending transaction 0f972642713c72ae0fe03fe51818b9ea4d483720b69b90e795f35eb80a587c26 that spends 0.00 BTC from our own wallet, and sends us 0.00001 BTC
2021-11-09 23:51:20.537 INFO [NioClientManager][Wallet] commitTx of 0f972642713c72ae0fe03fe51818b9ea4d483720b69b90e795f35eb80a587c26
...
2021-11-09 23:51:20.537 INFO [NioClientManager][Wallet] ->pending: 0f972642713c72ae0fe03fe51818b9ea4d483720b69b90e795f35eb80a587c26
2021-11-09 23:51:20.537 INFO [NioClientManager][Wallet] Estimated balance is now: 0.00001 BTC
-----> coins resceived: 0f972642713c72ae0fe03fe51818b9ea4d483720b69b90e795f35eb80a587c26
2021-11-09 23:51:20.538 INFO [NioClientManager][WalletFiles] Saving wallet; last seen block is height 165, date 2021-11-09T22:50:48Z, hash 23451521947bc5ff098c088ae0fc445becca8837d39ee8f6dd88f2c47ad5ac23
2021-11-09 23:51:20.543 INFO [NioClientManager][WalletFiles] Save completed in 4.736 ms
There is still a problem you mentioned that I haven't had the opportunity to test, and it is creating a new Docker image in which the peerbloomfilters configuration would be configured properly without modifying the actual container state. I think the compilation problem you indicated could be related to this issue, basically, that the container didn't have enough resources to perform the process. If you are using macOS and Docker for Mac, try tweaking the amount of memory available to your containers, it may be of help. A change in the base alpine image used can motivate the problem also. I will try digging into the issue as well.

After enabling checkpoint - org.apache.flink.runtime.resourcemanager.exceptions.UnfulfillableSlotRequestException

I have enabled checkpoint in flink 1.12.1 programmatically as below:
int duration = 10 ;
if (!environment.getCheckpointConfig().isCheckpointingEnabled()) {
environment.enableCheckpointing(duration * 6 * 1000, CheckpointingMode.EXACTLY_ONCE);
environment.getCheckpointConfig().setMinPauseBetweenCheckpoints(duration * 3 * 1000);
}
Flink Version: 1.12.1
configuration:
state.backend: rocksdb
state.checkpoints.dir: file:///flink/
blob.server.port: 6124
jobmanager.rpc.port: 6123
parallelism.default: 2
queryable-state.proxy.ports: 6125
taskmanager.numberOfTaskSlots: 2
taskmanager.rpc.port: 6122
jobmanager.memory.process.size: 1600m
taskmanager.memory.process.size: 1728m
jobmanager.web.address: 0.0.0.0
rest.address: 0.0.0.0
rest.bind-address: 0.0.0.0
rest.port: 8081
taskmanager.data.port: 6121
classloader.resolve-order: parent-first
execution.checkpointing.unaligned: false
execution.checkpointing.max-concurrent-checkpoints: 2
execution.checkpointing.interval: 60000
But it is failing with following error:
Caused by: org.apache.flink.runtime.resourcemanager.exceptions.UnfulfillableSlotRequestException: Could not fulfill slot request 44ec308e34aa86629d2034a017b8ef91. Requested resource profile (ResourceProfile{UNKNOWN}) is unfulfillable.
If I remove/disable checkpoint, everything works normally. I have checkpoint requirement because, if my pod, gets restart, data which is being handled by earlier run gets reset.
Can somebody direct, how this can be addressed?

Elastic BeanStalk loading configuration from another region failed

I have uploaded a saved configuration file in a BeanStalk application in a region to another BeanStalk application in another region.
While loading that config I got an error
Stack named 'awseb-e-sme7w3eym3-stack' aborted operation. Current
state: 'CREATE_FAILED' Reason: The following resource(s) failed to
create: [AWSEBLoadBalancer]
Creating load balancer failed Reason: Property Listeners cannot be
empty Any idea about this issue ?
See the config file
AWSConfigurationTemplateVersion: 1.1.0.0
EnvironmentConfigurationMetadata:
DateCreated: '1580272974000'
DateModified: '1580273310143'
Description: xxxxxxxxxxxxxxxxxxxxx
EnvironmentTier:
Name: WebServer
Type: Standard
OptionSettings:
AWSEBAutoScalingGroup.aws:autoscaling:updatepolicy:rollingupdate:
MaxBatchSize: '1'
MinInstancesInService: '1'
RollingUpdateEnabled: true
RollingUpdateType: Health
AWSEBAutoScalingLaunchConfiguration.aws:autoscaling:launchconfiguration:
EC2KeyName: xxxxxxxxxxxxxxxxxxx
AWSEBCloudwatchAlarmHigh.aws:autoscaling:trigger:
UpperThreshold: '60'
AWSEBCloudwatchAlarmLow.aws:autoscaling:trigger:
BreachDuration: '2'
LowerThreshold: '25'
MeasureName: CPUUtilization
Period: '1'
Statistic: Maximum
Unit: Percent
AWSEBLoadBalancerSecurityGroup.aws:ec2:vpc:
VPCId: vpc-xxxxxxxxxxxxxxxx
AWSEBV2LoadBalancerListener.aws:elbv2:listener:default:
ListenerEnabled: false
AWSEBV2LoadBalancerListener443.aws:elbv2:listener:443:
SSLCertificateArns: arn:aws:acm:us-east-2:xxxxxxxxxxx:certificate/xxxxxxx-xxxxx-xxxx-xxxx-xxxxxxxxxxxx
AWSEBV2LoadBalancerTargetGroup.aws:elasticbeanstalk:environment:process:default:
HealthCheckPath: /rest/account/ping
MatcherHTTPCode: '200'
Port: '80'
Protocol: HTTP
aws:autoscaling:launchconfiguration:
IamInstanceProfile: aws-elasticbeanstalk-ec2-role
SecurityGroups:
- sg-xxxxxxxxxxxxx
aws:ec2:instances:
InstanceTypes: t2.small
aws:ec2:vpc:
ELBSubnets: subnet-xxxxxxxxxxxxxxxxxx,subnet-xxxxxxxxxxxxxxxxxx,subnet-xxxxxxxxxxxxxxx
Subnets: subnet-xxxxxxxxxxxxxxxxx,subnet-xxxxxxxxxxxxxxxxx,subnet-xxxxxxxxxxxxxxxxx
aws:elasticbeanstalk:application:environment:
JDBC_CONNECTION_STRING: jdbc:mysql://xxxxxxxxxxxxxxxxxxxxxxxxxxxx?user=xxxxxxxx&password=xxxxxxxxxxx&rewriteBatchedStatements=true&characterEncoding=UTF-8
aws.accessKeyId: xxxxxxxxxxxxxxxxxx
aws.secretKey: xxxxxxxxxxxxxxxxxxxx
com.aws.secretManger.secret.name: xxxxxxxxxxxxxxx
com.aws.secretManger.secret.region: us-east-2
com.decsond.loggly.token: xxxxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxx#xxxxx
com.decsond.metakey: xxxxxxxxxxxxxxxxx/XXX==
com.decsond.mode: debug
com.decsond.server.db.environment: aws
com.decsond.server.dpBinaryColumn: xxxxxxxxxxxx
com.decsond.server.environment: xxxxxxxxxx
com.decsond.server.type: pms
aws:elasticbeanstalk:container:tomcat:jvmoptions:
JVM Options: -XX:+CMSClassUnloadingEnabled -Dmvel.disable.jit=true -Ddrools.permgenThreshold=0
Xms: 512m
Xmx: 1024m
aws:elasticbeanstalk:environment:
LoadBalancerType: application
ServiceRole: arn:aws:iam::xxxxxxxxxxxxxx:role/aws-elasticbeanstalk-service-role
aws:elasticbeanstalk:healthreporting:system:
SystemType: enhanced
aws:elasticbeanstalk:managedactions:
ManagedActionsEnabled: true
PreferredStartTime: SAT:03:01
aws:elasticbeanstalk:managedactions:platformupdate:
InstanceRefreshEnabled: true
UpdateLevel: minor
aws:elasticbeanstalk:xray:
XRayEnabled: true
aws:elbv2:listener:443:
DefaultProcess: default
ListenerEnabled: true
Protocol: HTTPS
Rules: ''
SSLPolicy: ELBSecurityPolicy-2016-08
Platform:
PlatformArn: arn:aws:elasticbeanstalk:us-east-2::platform/Tomcat 8.5 with Java 8
running on 64bit Amazon Linux/3.3.1
Any idea about the issue ?

The most likely reason is that you are referencing objects in the region from where the config was saved from.
Is this the first EB application / environment in the new region?
If it is, it's worth first creating a test application and environment, using the features you want ... that will give EB a chance to create all the region specific behind-the-scenes magic it relies on.

IBM MQ - Java api - PCFMessageAgent - connection fails

I changed the MQIVP sample in MQ with a server connection channel of my own local.server.con and it is working fine. But I tried connecting to the same channel with PCFMessageAgent and the connection is failing with errors in MQ log. What is the relation between my channel and SYSTEM.DEFAULT.MODEL.QUEUE which gives the error.
C:\Program Files\IBM\WebSphere MQ\Tools\wmqjava\samples>java -Djava.library.path="C:\Program Files\IBM\WebSphere MQ\java\lib" MQIVPMod
Websphere MQ for Java Installation Verification Program
5724-B4 (C) Copyright IBM Corp. 2002, 2014. All Rights Reserved.
================================================================
Please enter the IP address of the MQ server :10.40.1.16
Please enter the port to connect to : (1414)1415
Please enter the server connection channel name :local.server.con
Please enter the user name (or RETURN for none) :test
Please enter the password for the user :test123
Please enter the queue manager name :local
Success: Connected to queue manager.
Success: Opened SYSTEM.DEFAULT.LOCAL.QUEUE
Success: Put a message to SYSTEM.DEFAULT.LOCAL.QUEUE
Success: Got a message from SYSTEM.DEFAULT.LOCAL.QUEUE
Success: Closed SYSTEM.DEFAULT.LOCAL.QUEUE
Success: Disconnected from queue manager
Tests complete -
SUCCESS: This MQ Transport is functioning correctly.
Press Enter to continue ...
My PCFMessageAgent code and error:
new PCFMessageAgent(host, Integer.parseInt(port), channelName); // connect
com.ibm.mq.MQException: MQJE001: Completion Code '2', Reason '2035'.
at com.ibm.mq.MQDestination.open(MQDestination.java:323)
at com.ibm.mq.MQQueue.<init>(MQQueue.java:236)
at com.ibm.mq.MQQueueManager.accessQueue(MQQueueManager.java:2674)
at com.ibm.mq.pcf.PCFAgent.open(PCFAgent.java:448)
at com.ibm.mq.pcf.PCFAgent.open(PCFAgent.java:394)
at com.ibm.mq.pcf.PCFAgent.connect(PCFAgent.java:287)
at com.ibm.mq.pcf.PCFAgent.<init>(PCFAgent.java:190)
at com.ibm.mq.pcf.PCFMessageAgent.<init>(PCFMessageAgent.java:157)
at test.wmq.PCFTest.main(PCFTest.java:49)
And the MQ log :
5/2/2017 14:01:31 - Process(6048.60) User(MUSR_MQADMIN) Program(amqzlaa0.exe)
Host(BLR_SWG_N09505) Installation(Installation1)
VRMF(8.0.0.4) QMgr(local)
AMQ8077: Entity 'test#blr_swg_n09505' has insufficient authority to access
object 'SYSTEM.DEFAULT.MODEL.QUEUE'.
EXPLANATION:
The specified entity is not authorized to access the required object. The
following requested permissions are unauthorized: get
ACTION:
Ensure that the correct level of authority has been set for this entity against
the required object, or ensure that the entity is a member of a privileged
group.
----- amqzfubn.c : 518 --------------------------------------------------------
5/2/2017 14:01:32 - Process(8004.41) User(MUSR_MQADMIN) Program(amqrmppa.exe)
Host(BLR_SWG_N09505) Installation(Installation1)
VRMF(8.0.0.4) QMgr(local)
AMQ9208: Error on receive from host BLR_SWG_N09505 (10.40.1.16).
EXPLANATION:
An error occurred receiving data from BLR_SWG_N09505 (10.40.1.16) over TCP/IP.
This may be due to a communications failure.
ACTION:
The return code from the TCP/IP recv() call was 10054 (X'2746'). Record these
values and tell the systems administrator.
----- amqccita.c : 4076 -------------------------------------------------------
5/2/2017 14:01:32 - Process(8004.41) User(MUSR_MQADMIN) Program(amqrmppa.exe)
Host(BLR_SWG_N09505) Installation(Installation1)
VRMF(8.0.0.4) QMgr(local)
AMQ9999: Channel 'local.server.con' to host '10.40.1.16' ended abnormally.
EXPLANATION:
The channel program running under process ID 8004(7988) for channel
'local.server.con' ended abnormally. The host name is '10.40.1.16'; in some
cases the host name cannot be determined and so is shown as '????'.
ACTION:
Look at previous error messages for the channel program in the error logs to
determine the cause of the failure. Note that this message can be excluded
completely or suppressed by tuning the "ExcludeMessage" or "SuppressMessage"
attributes under the "QMErrorLog" stanza in qm.ini. Further information can be
found in the System Administration Guide.
----- amqrmrsa.c : 930 --------------------------------------------------------

You need to go and read up on MQ permissions (i.e. authorizations). It is best to do permissions on the group rather than principle (UserId).
setmqaut -m {QM_NAME} -n SYSTEM.ADMIN.COMMAND.QUEUE -t queue -g {GROUP} +put +inq +dsp
setmqaut -m {QM_NAME} -n SYSTEM.DEFAULT.MODEL.QUEUE -t queue -g {GROUP} +get +inq +dsp

There is no relation between channels and model queues.
But I think, that the PCFMessageAgent is trying to create a dynamic queue to use as a ReplyToQ to receive responses, and it seems it tries to create the dynamic queue by opening the SYSTEM.DEFAULT.MODEL.QUEUE.

JMeter test on Netty-based impl produces error for every second request

I've implemented an HTTP service based on the HTTP server example as provided by the netty.io project.
When I execute a GET request to the service URL from command-line (wget) or from a browser, I receive a result as expected.
When I perform a load test using ApacheBench ab -n 100000 -c 8 http://localhost:9000/path/to/service, experience no errors (neither on service nor on ab side) and see fair numbers for request processing duration.
Afterwards, I set up a test plan in JMeter having a thread group with 1 thread and a loop count of 2. I inserted an HTTP request sampler where I simply added the server name localhost, the port number 9000 and the path /path/to/service. Then I also added a View Results Tree and a Summary Report listener.
Finally, I executed the test plan and received one valid response and one error showing the following content:
Thread Name: Thread Group 1-1
Sample Start: 2015-06-04 09:23:12 CEST
Load time: 0
Connect Time: 0
Latency: 0
Size in bytes: 2068
Headers size in bytes: 0
Body size in bytes: 2068
Sample Count: 1
Error Count: 1
Response code: Non HTTP response code: org.apache.http.NoHttpResponseException
Response message: Non HTTP response message: The target server failed to respond
Response headers:
HTTPSampleResult fields:
ContentType:
DataEncoding: null
The associated exception found in response data tab showed the following content
org.apache.http.NoHttpResponseException: The target server failed to respond
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:95)
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:61)
at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:254)
at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:289)
at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:252)
at org.apache.http.impl.conn.ManagedClientConnectionImpl.receiveResponseHeader(ManagedClientConnectionImpl.java:191)
at org.apache.jmeter.protocol.http.sampler.MeasuringConnectionManager$MeasuredConnection.receiveResponseHeader(MeasuringConnectionManager.java:201)
at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:300)
at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:127)
at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:715)
at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:520)
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
at org.apache.jmeter.protocol.http.sampler.HTTPHC4Impl.executeRequest(HTTPHC4Impl.java:517)
at org.apache.jmeter.protocol.http.sampler.HTTPHC4Impl.sample(HTTPHC4Impl.java:331)
at org.apache.jmeter.protocol.http.sampler.HTTPSamplerProxy.sample(HTTPSamplerProxy.java:74)
at org.apache.jmeter.protocol.http.sampler.HTTPSamplerBase.sample(HTTPSamplerBase.java:1146)
at org.apache.jmeter.protocol.http.sampler.HTTPSamplerBase.sample(HTTPSamplerBase.java:1135)
at org.apache.jmeter.threads.JMeterThread.process_sampler(JMeterThread.java:434)
at org.apache.jmeter.threads.JMeterThread.run(JMeterThread.java:261)
at java.lang.Thread.run(Thread.java:745)
As I have a similar service already running which receives and processes web tracking data which shows no errors, it might be a problem within my test plan or JMeter .. but I am not sure :-(
Did anyone experience similar behavior? Thanks in advance ;-)

Issue can be related to Keep-Alive management.
Read those:
https://bz.apache.org/bugzilla/show_bug.cgi?id=57921
https://wiki.apache.org/jmeter/JMeterSocketClosed
So your solution is one of those:
If you're sure it's a keep alive issue:
Try jmeter nightly build http://jmeter.apache.org/nightly.html:
Download the _bin and _lib files
Unpack the archives into the same directory structure
The other archives are not needed to run JMeter.
And adapt the value of httpclient4.idletimeout
A workaround is to increase retry or add connection stale check as per :
https://wiki.apache.org/jmeter/JMeterSocketClosed

We Keep Coding

Java is a programming language and computing platform first released by Sun Microsystems in 1995.

SCDF: Error handling when pod failed to start - java

1.7.3 is a very old release. We just released 2.7. The original logic used the task execution tables instead of the pod status. If the version you are using is subject to that, then it would explain what you are seeing. I strongly recommend an upgrade.

Related

How can I get notified when money has been sent to a particular Bitcoin address on a local regtest network?

After enabling checkpoint - org.apache.flink.runtime.resourcemanager.exceptions.UnfulfillableSlotRequestException

Elastic BeanStalk loading configuration from another region failed

IBM MQ - Java api - PCFMessageAgent - connection fails

JMeter test on Netty-based impl produces error for every second request

Categories

Resources