ES Query Exception in Storm Crawler - java

I am using the following packages:
Apache ZooKeeper 3.4.14
Apache Storm 1.2.3
Apache Maven 3.6.2
Elasticsearch 7.2.0 (hosted locally)
Java 1.8.0_252
AWS EC2 medium instance with 4 GB RAM
I used the following command to increase the virtual memory map count (earlier it was showing an error about the JVM not having enough memory):
sysctl -w vm.max_map_count=262144
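This is the kernel's limit on memory-map areas; Elasticsearch requires at least 262144. To make the setting survive a reboot, it can also be added to /etc/sysctl.conf (standard sysctl usage, not something from the original setup):
vm.max_map_count=262144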
I have created the Maven package with:
mvn archetype:generate -DarchetypeGroupId=com.digitalpebble.stormcrawler -DarchetypeArtifactId=storm-crawler-elasticsearch-archetype -DarchetypeVersion=LATEST
Command used for submitting the topology:
storm jar target/newscrawler-1.0-SNAPSHOT.jar org.apache.storm.flux.Flux --local es-crawler.flux --sleep 30000
When I run this command, it shows that my topology is submitted successfully. In Elasticsearch, the status index shows FETCH_ERROR along with the URL from seeds.txt, and the content index shows no hits.
In the worker.log file there were many exceptions of the following type:
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:1.8.0_252]
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:714) ~[?:1.8.0_252]
at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvent(DefaultConnectingIOReactor.java:174) [stormjar.jar:?]
at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvents(DefaultConnectingIOReactor.java:148) [stormjar.jar:?]
at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor.execute(AbstractMultiworkerIOReactor.java:351) [stormjar.jar:?]
at org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager.execute(PoolingNHttpClientConnectionManager.java:221) [stormjar.jar:?]
at org.apache.http.impl.nio.client.CloseableHttpAsyncClientBase$1.run(CloseableHttpAsyncClientBase.java:64) [stormjar.jar:?]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_252]
2020-06-12 10:31:14.635 c.d.s.e.p.AggregationSpout Thread-46-spout-executor[17 17] [INFO] [spout #7] Populating buffer with nextFetchDate <= 2020-06-12T10:30:50Z
2020-06-12 10:31:14.636 c.d.s.e.p.AggregationSpout Thread-32-spout-executor[19 19] [INFO] [spout #9] Populating buffer with nextFetchDate <= 2020-06-12T10:30:50Z
2020-06-12 10:31:14.636 c.d.s.e.p.AggregationSpout pool-13-thread-1 [ERROR] [spout #7] Exception with ES query
There are also the following Elasticsearch-related logs in worker.log:
'Suppressed: org.elasticsearch.client.ResponseException: method [POST], host [http://localhost:9200], URI [/status/_search?typed_keys=true&ignore_unavailable=false&expand_wildcards=open&allow_no_indices=true&preference=_shards%3A1&ignore_throttled=true&search_type=query_then_fetch&batched_reduce_size=512&ccs_minimize_roundtrips=true], status line [HTTP/1.1 503 Service Unavailable]
{"error":{"root_cause":[{"type":"cluster_block_exception","reason":"blocked by: [SERVICE_UNAVAILABLE/1/state not recovered / initialized];"}],"type":"cluster_block_exception","reason":"blocked by: [SERVICE_UNAVAILABLE/1/state not recovered / initialized];"},"status":503}
Suppressed: org.elasticsearch.client.ResponseException: method [POST], host [http://localhost:9200], URI [/status/_search?typed_keys=true&ignore_unavailable=false&expand_wildcards=open&allow_no_indices=true&preference=_shards%3A8&ignore_throttled=true&search_type=query_then_fetch&batched_reduce_size=512&ccs_minimize_roundtrips=true], status line [HTTP/1.1 503 Service Unavailable]
{"error":{"root_cause":[],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"query","grouped":true,"failed_shards":[]},"status":503}
I have checked the health of the shards; they are in green status.
Earlier I was using Java 11, with which I was not able to submit the topology, so I shifted to Java 8. Now the topology is submitted successfully, but no data is injected into Elasticsearch.
I want to know if there is a version incompatibility between Java and Elasticsearch, or with any other package.

Use an absolute path for the seed file and run the topology in remote mode; local mode should be used mostly for debugging.
The sleep parameter is (I think) in milliseconds, so the command above means the topology will run for 30 seconds only, which doesn't give it much time to do anything.
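For example, a remote submission would look like this (a sketch reusing the Flux options above; it assumes a Storm cluster is running and reachable from the client):
storm jar target/newscrawler-1.0-SNAPSHOT.jar org.apache.storm.flux.Flux --remote es-crawler.flux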

Related

Testcontainers + db2 issue

I am having an issue with db2 using Testcontainers: I keep receiving a connection refused error.
When running db2 with docker run, I am able to connect with DbVis.
When using the fabric8 maven plugin to start the db2 container, again I am able to connect with DbVis.
But when I put a breakpoint in the JUnit 5 test and attempt to access db2, I receive the connection refused error.
My db2 testcontainers configuration:
@Testcontainers
public class ArchiveTest {
    @Container
    private static final Db2Container DB2 = new Db2Container("ibmcom/db2:11.5.7.0").withPrivilegedMode(true)
            .acceptLicense().withUsername("db2inst1").withPassword("password").withDatabaseName("BPMF")
            .withEnv("ARCHIVE_LOGS", "false").withEnv("PERSISTENT_HOME", "false");
The db2 logs from docker:
(*) Previous setup has not been detected. Creating the users...
(*) Creating users ...
(*) Creating instance ...
DB2 installation is being initialized.
Total estimated time for all tasks to be performed: 309 second(s)
Total number of tasks to be performed: 4
Estimated time 1 second(s)
Description: Setting default global profile registry variables
Task #1 start
Task #1 end
Estimated time 5 second(s)
Description: Initializing instance list
Task #2 start
Task #2 end
Estimated time 300 second(s)
Description: Configuring DB2 instances
Task #3 start
Task #3 end
Estimated time 3 second(s)
Description: Updating global profile registry
Task #4 start
Task #4 end
The execution completed successfully.
For more information see the DB2 installation log at "/tmp/db2icrt.log.72".
DBI1446I The db2icrt command is running.
DBI1070I Program db2icrt completed successfully.
(*) Fixing /etc/services file for DB2 ...
chown: cannot access '/database/config/db2inst1/sqllib/adm/fencedid': No such file or directory
03/16/2022 10:26:18 0 0 SQL1032N No start database manager command was issued.
SQL1032N No start database manager command was issued. SQLSTATE=57019
(*) Cataloging existing databases
(*) Applying Db2 license ...
ls: cannot access /database/data/db2inst1/NODE0000: No such file or directory
LIC1426I This product is now licensed for use as outlined in your License Agreement. USE OF THE PRODUCT CONSTITUTES ACCEPTANCE OF THE TERMS OF THE IBM LICENSE AGREEMENT, LOCATED IN THE FOLLOWING DIRECTORY: "/opt/ibm/db2/V11.5/license/en_US.iso88591"
LIC1402I License added successfully.
(*) Updating DBM CFG parameters ...
(*) Saving the checksum of the current nodelock file ...
DB20000I The UPDATE DATABASE MANAGER CONFIGURATION command completed
successfully.
DB20000I The UPDATE DATABASE MANAGER CONFIGURATION command completed
successfully.
DB20000I The UPDATE DATABASE MANAGER CONFIGURATION command completed
successfully.
(*) Remounting /database with suid...
No Cgroup memory limit detected, instance memory will follow automatic tuning
(*) Nothing appears in the Db2 directory. will skip update/upgrade.
(*) Code level is the same. No update/upgrade needed.
DB2 State : Operable
Starting DB2...
DB2 has not been started
SQL1063N DB2START processing was successful.
03/16/2022 10:26:29 0 0 SQL1063N DB2START processing was successful.
(*) Creating database BPMF ...
(*) User chose to create BPMF database
DB20000I The CREATE DATABASE command completed successfully.
DB20000I The ACTIVATE DATABASE command completed successfully.
(*) Instance and database will not be auto configured. AUTOCONFIG has been set to false.
(*) Log archiving will not be configured as ARCHIVE_LOGS has been set to false.
(*) Skipping TEXT_SEARCH setup for database BPMF because TEXT_SEARCH is not configured for the instance ...
ssh-keygen: generating new host keys: RSA1 RSA DSA ECDSA ED25519
(*) All databases are now active.
(*) Setup has completed.
2022-03-16 12:27:28 | INFO | [main] d.5.7.0]:503 - Container ibmcom/db2:11.5.7.0 started in PT1M27.212S
The error from java is:
Caused by: com.ibm.db2.jcc.am.DisconnectNonTransientConnectionException: [jcc][t4][2043][11550][4.25.13] Exception java.net.ConnectException: Error opening socket to server /127.0.0.1 on port 50,000 with message: Connection refused: connect. ERRORCODE=-4499, SQLSTATE=08001
at com.ibm.db2.jcc.am.b6.a(b6.java:338)
at com.ibm.db2.jcc.am.b6.a(b6.java:435)
at com.ibm.db2.jcc.t4.a0.a(a0.java:445)
at com.ibm.db2.jcc.t4.a0.<init>(a0.java:96)
at com.ibm.db2.jcc.t4.a.b(a.java:366)
at com.ibm.db2.jcc.t4.b.newAgent_(b.java:2148)
at com.ibm.db2.jcc.am.Connection.initConnection(Connection.java:839)
at com.ibm.db2.jcc.am.Connection.<init>(Connection.java:784)
at com.ibm.db2.jcc.t4.b.<init>(b.java:350)
at com.ibm.db2.jcc.DB2SimpleDataSource.getConnection(DB2SimpleDataSource.java:233)
at com.ibm.db2.jcc.DB2SimpleDataSource.getConnection(DB2SimpleDataSource.java:200)
at com.ibm.db2.jcc.DB2Driver.connect(DB2Driver.java:471)
at com.ibm.db2.jcc.DB2Driver.connect(DB2Driver.java:113)
at java.sql.DriverManager.getConnection(DriverManager.java:664)
at java.sql.DriverManager.getConnection(DriverManager.java:247)
at org.osjava.datasource.SJDataSource.getConnection(SJDataSource.java:115)
at org.osjava.datasource.SJDataSource.getConnection(SJDataSource.java:106)
at org.osjava.datasource.SJDataSource.getConnection(SJDataSource.java:88)
at org.flywaydb.core.internal.jdbc.JdbcUtils.openConnection(JdbcUtils.java:48)
... 105 common frames omitted
Caused by: java.net.ConnectException: Connection refused: connect
at java.net.DualStackPlainSocketImpl.connect0(Native Method)
at java.net.DualStackPlainSocketImpl.socketConnect(DualStackPlainSocketImpl.java:79)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:172)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:607)
at com.ibm.db2.jcc.t4.x.run(x.java:49)
at java.security.AccessController.doPrivileged(Native Method)
at com.ibm.db2.jcc.t4.a0.a(a0.java:431)
... 121 common frames omitted
Caused by: org.springframework.beans.BeanInstantiationException:
Failed to instantiate [org.flywaydb.core.Flyway]: Factory method 'migration' threw exception; nested exception is org.flywaydb.core.internal.exception.FlywaySqlException: Unable to obtain connection from database: [jcc][t4][2043][11550][4.25.13] Exception java.net.ConnectException: Error opening socket to server /127.0.0.1 on port 50,000 with message: Connection refused: connect. ERRORCODE=-4499, SQLSTATE=08001
I have confirmed my JDBC parameters are correct...so I am at a bit of a loss as to where it is going wrong.
EDIT 1: db2 is running:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
e7046334e6c8 ibmcom/db2:11.5.0.0a "/var/db2_setup/lib/…" About a minute ago Up About a minute 22/tcp, 55000/tcp, 60006-60007/tcp, 0.0.0.0:53444->50000/tcp wizardly_cartwright
ccfe6845bfb1 testcontainers/ryuk:0.3.3 "/app" About a minute ago Up About a minute 0.0.0.0:53439->8080/tcp testcontainers-ryuk-99222438-9340-47ca-b6d2-0a13bfe50f9d
EDIT 2: docker-java command parameters:
AbstrDockerCmd:34 - Cmd: org.testcontainers.shaded.com.github.dockerjava.core.command.CreateContainerCmdImpl#7df60067[name=<null>,hostName=<null>,domainName=<null>,user=<null>,attachStdin=<null>,attachStdout=<null>,attachStderr=<null>,portSpecs=<null>,tty=<null>,stdinOpen=<null>,stdInOnce=<null>,env={DB2INSTANCE=db2inst1,AUTOCONFIG=false,ARCHIVE_LOGS=false,DB2INST1_PASSWORD=password,PERSISTENT_HOME=false,DBNAME=BPMF,LICENSE=accept},cmd={},healthcheck=<null>,argsEscaped=<null>,entrypoint=<null>,image=ibmcom/db2:11.5.0.0a,volumes=Volumes(volumes=[]),workingDir=<null>,macAddress=<null>,onBuild=<null>,networkDisabled=<null>,exposedPorts=ExposedPorts(exposedPorts=[50000/tcp]),stopSignal=<null>,stopTimeout=<null>,hostConfig=HostConfig(binds=[], blkioWeight=null, blkioWeightDevice=null, blkioDeviceReadBps=null, blkioDeviceWriteBps=null, blkioDeviceReadIOps=null, blkioDeviceWriteIOps=null, memorySwappiness=null, nanoCPUs=null, capAdd=null, capDrop=null, containerIDFile=null, cpuPeriod=null, cpuRealtimePeriod=null, cpuRealtimeRuntime=null, cpuShares=null, cpuQuota=null, cpusetCpus=null, cpusetMems=null, devices=null, deviceCgroupRules=null, deviceRequests=null, diskQuota=null, dns=null, dnsOptions=null, dnsSearch=null, extraHosts=[], groupAdd=null, ipcMode=null, cgroup=null, links=[], logConfig=LogConfig(type=null, config=null), lxcConf=null, memory=null, memorySwap=null, memoryReservation=null, kernelMemory=null, networkMode=null, oomKillDisable=null, init=null, autoRemove=null, oomScoreAdj=null, portBindings={50000/tcp=[Lcom.github.dockerjava.api.model.Ports$Binding;#393881f0}, privileged=true, publishAllPorts=null, readonlyRootfs=null, restartPolicy=null, ulimits=null, cpuCount=null, cpuPercent=null, ioMaximumIOps=null, ioMaximumBandwidth=null, volumesFrom=[], mounts=null, pidMode=null, isolation=null, securityOpts=null, storageOpt=null, cgroupParent=null, volumeDriver=null, shmSize=null, pidsLimit=null, runtime=null, tmpFs=null, utSMode=null, usernsMode=null, sysctls=null, consoleSize=null),labels={org.testcontainers=true, org.testcontainers.sessionId=090442b0-8cc4-4f6e-b07e-1afdfed5ec15},shell=<null>,networkingConfig=<null>,ipv4Address=<null>,ipv6Address=<null>,aliases=<null>,authConfig=<null>,platform=<null>]
As I am using simple-jndi and have the JDBC parameters in property files, the port in the JDBC URL is not 50000. I am busy looking at how to set it to a specific port, as the default is specified in the DB2Container class.
It's mentioned in the docs:
Note that this exposed port number is from the perspective of the container.
From the host's perspective Testcontainers actually exposes this on a random free port. This is by design, to avoid port collisions that may arise with locally running software or in between parallel test runs.
Please make sure to use DB2.getJdbcUrl() or similar to access the container after starting it. Testcontainers publishes the exposed ports of the container to a random free host port, and this dynamic port needs to be injected into your system under test at runtime. Depending on your framework and libraries, there are different ways to achieve this: configuration properties in Spring or, worst case, templating config files.
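For example, a minimal sketch of picking up the dynamic connection details at runtime (Db2Container inherits these accessors from JdbcDatabaseContainer; the test body is purely illustrative):
import java.sql.Connection;
import java.sql.DriverManager;
import org.junit.jupiter.api.Test;
import org.testcontainers.containers.Db2Container;
import org.testcontainers.junit.jupiter.Container;
import org.testcontainers.junit.jupiter.Testcontainers;

@Testcontainers
class MappedPortTest {
    @Container
    private static final Db2Container DB2 = new Db2Container("ibmcom/db2:11.5.7.0")
            .acceptLicense().withUsername("db2inst1").withPassword("password").withDatabaseName("BPMF");

    @Test
    void connectsViaDynamicJdbcUrl() throws Exception {
        // getJdbcUrl() contains the random host port that Testcontainers mapped to the container's 50000
        String url = DB2.getJdbcUrl(); // e.g. jdbc:db2://localhost:53444/BPMF
        try (Connection conn = DriverManager.getConnection(url, DB2.getUsername(), DB2.getPassword())) {
            // inject this URL into Flyway / simple-jndi here instead of a hard-coded port
        }
    }
}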

Create Apache Pulsar Sink on Kubernetes cluster with Java API

I am trying to create a ClickHouse sink connector with the Java API on a remote Pulsar cluster running on Kubernetes, but I am experiencing some difficulties with it.
My cluster runs Pulsar 2.8.1.
pulsarAdmin.sinks().createSinkWithUrl(
    mySinkConfig,
    "https://archive.apache.org/dist/pulsar/pulsar-2.8.1/connectors/pulsar-io-jdbc-clickhouse-2.8.1.nar");
The API call returns successfully and seems to create the sink: I can get its status and its configuration. However, the status reports a failure, and when checking the pods on Kubernetes I see the pod corresponding to the new sink crashing:
NAME READY STATUS RESTARTS AGE
pf-my-tenant-test-sink6-ab222ce8-0 0/1 CrashLoopBackOff 15 57m
with the following in the k8s logs:
null
Reason: java.io.IOException: No such file or directory
Here is the command used by Pulsar when creating the pod:
sh -c
/pulsar/bin/pulsar-admin
--auth-plugin org.apache.pulsar.client.impl.auth.AuthenticationToken
--auth-params file:///etc/auth/token
--admin-url https://XXXX:8443/ functions download
--tenant my-tenant
--namespace test
--name sink6
--destination-file /pulsar/download/pulsar_functions/functions17103366778764930042.tmp && SHARD_ID=${POD_NAME##*-} && echo shardId=${SHARD_ID} && exec java -cp /pulsar/instances/java-instance.jar:/pulsar/instances/deps/*
-Dpulsar.functions.extra.dependencies.dir=/pulsar/instances/deps -Dpulsar.functions.instance.classpath=/pulsar/conf:::/pulsar/lib/*:
-Dlog4j.configurationFile=kubernetes_instance_log4j2.xml -Dpulsar.function.log.dir=logs/functions/my-tenant/test/sink6
-Dpulsar.function.log.file=sink6-$SHARD_ID -Xmx1073741824 org.apache.pulsar.functions.instance.JavaInstanceMain
--jar /pulsar/download/pulsar_functions/functions17103366778764930042.tmp --instance_id $SHARD_ID --function_id 16d9bcab-abcd-2f4b-b536-d3fb5d1232ab
--function_version 8807b42e-b1fc-4495-862e-21fe27085eb7
--function_details '{"tenant":"my-tenant","namespace":"test","name":"sink6","className":"org.apache.pulsar.functions.api.utils.IdentityFunction","autoAck":true,"parallelism":1,"source":{"typeClassName":"org.apache.pulsar.client.api.schema.GenericRecord","inputSpecs":{"my-topic":{}},"cleanupSubscription":true},"sink":{"className":"org.apache.pulsar.io.jdbc.ClickHouseJdbcAutoSchemaSink","configs":"{\"userName\":\"XXXXX\",\"password\":\"XXXX\",\"jdbcUrl\":\"jdbc:clickhouse://XXXXXX\",\"tableName\":\"XXXXXXX\"}","typeClassName":"org.apache.pulsar.client.api.schema.GenericRecord"},"resources":{"cpu":1.0,"ram":"1234","disk":"1234"},"componentType":"SINK"}'
--pulsar_serviceurl pulsar+ssl://XXXX:6651/
--client_auth_plugin org.apache.pulsar.client.impl.auth.AuthenticationToken
--client_auth_params file:///etc/auth/token
--use_tls false
--tls_allow_insecure false
--hostname_verification_enabled false
--max_buffered_tuples 1024
--port 9093
--metrics_port 39809
--pending_async_requests 1000
--expected_healthcheck_interval -1
--secrets_provider org.apache.pulsar.functions.secretsprovider.ClearTextSecretsProvider
--cluster_name pulsar-ofgebi
--nar_extraction_directory /tmp
Does anyone have any idea why creating a sink with the Java API could result in such an error?

Error while running HelloActivity Sample Temporal Java program

I'm getting the following error when I run the temporal HelloActivity Java sample:
06:43:55.969 [main] INFO io.temporal.internal.worker.Poller - start(): Poller{options=PollerOptions{maximumPollRateIntervalMilliseconds=1000, maximumPollRatePerSecond=0.0, pollBackoffCoefficient=2.0, pollBackoffInitialInterval=PT0.1S, pollBackoffMaximumInterval=PT1M, pollThreadCount=5, pollThreadNamePrefix='Host Local Workflow Poller'}, identity=23af0cb3-09aa-4cbc-bca2-118cfa79dc96}
06:43:57.291 [Activity Poller taskQueue="HelloActivity", namespace="default": 4] ERROR io.temporal.internal.worker.Poller - Failure in thread Activity Poller taskQueue="HelloActivity", namespace="default": 4
io.grpc.StatusRuntimeException: UNIMPLEMENTED: unknown service temporal.api.workflowservice.v1.WorkflowService
at io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:244)
at io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:225)
at io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:142)
at io.temporal.api.workflowservice.v1.WorkflowServiceGrpc$WorkflowServiceBlockingStub.pollActivityTaskQueue(WorkflowServiceGrpc.java:2682)
at io.temporal.internal.worker.ActivityPollTask.poll(ActivityPollTask.java:95)
at io.temporal.internal.worker.ActivityPollTask.poll(ActivityPollTask.java:38)
at io.temporal.internal.worker.Poller$PollExecutionTask.run(Poller.java:273)
at io.temporal.internal.worker.Poller$PollLoopTask.run(Poller.java:242)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
As a prerequisite, docker-compose up was executed, and I have all 3 services (temporalio/web, temporalio/auto-setup, and cassandra) running.
Make sure that you are running the version of the service that the Java SDK used by the samples requires.
It looks like the Samples README wasn't updated with the latest version. As of now (7/21/20), v0.27.0 is the latest version. So tear down the currently running version of the service:
docker-compose down
then install the latest one:
curl -L https://github.com/temporalio/temporal/releases/download/v0.27.0/docker.tar.gz | tar -xz --strip-components 1 docker/docker-compose.yml
docker-compose up

Ignite GridGain generated project open-file limit issue

I'm trying to cache a large dataset of some tables. My server is CentOS-based with 8 GB RAM and 500 GB disk space.
I configured my local storage policy to persist, and after getting an open-file limit issue I tried to raise the limit to 2,000,000 following these steps:
vi /etc/sysctl.conf
fs.file-max = 2000000
:wq
sysctl -p
But even with this fix, and after setting chmod -x on the work directory, I am still getting the following error:
SEVERE: Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=[SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=CRITICAL_ERROR, err=class o.a.i.i.processors.cache.persistence.StorageException: Failed to initialize partition file: /home/grid-gain-server/gridgain-community-8.7.7/work/db/node00-3273af50-1e97-47fa-a237-29e7dfc2d987/cache-COrderCache/part-56.bin]]
class org.apache.ignite.internal.processors.cache.persistence.StorageException: Failed to initialize partition file: /home/grid-gain-server/gridgain-community-8.7.7/work/db/node00-3273af50-1e97-47fa-a237-29e7dfc2d987/cache-COrderCache/part-56.bin
at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStore.init(FilePageStore.java:448)
at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStore.read(FilePageStore.java:337)
at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStoreManager.read(FilePageStoreManager.java:478)
at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStoreManager.read(FilePageStoreManager.java:462)
at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:853)
at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:694)
at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.getOrAllocatePartitionMetas(GridCacheOffheapManager.java:1679)
at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.init0(GridCacheOffheapManager.java:1507)
at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.invoke(GridCacheOffheapManager.java:2137)
at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.invoke(IgniteCacheOffheapManagerImpl.java:429)
at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.storeValue(GridCacheMapEntry.java:4261)
at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.initialValue(GridCacheMapEntry.java:3407)
at org.apache.ignite.internal.processors.cache.GridCacheEntryEx.initialValue(GridCacheEntryEx.java:771)
at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheAdapter.loadEntry(GridDhtCacheAdapter.java:683)
at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheAdapter.access$600(GridDhtCacheAdapter.java:103)
at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheAdapter$5.apply(GridDhtCacheAdapter.java:633)
at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheAdapter$5.apply(GridDhtCacheAdapter.java:629)
at org.apache.ignite.internal.processors.cache.store.GridCacheStoreManagerAdapter$3.apply(GridCacheStoreManagerAdapter.java:535)
at org.apache.ignite.cache.store.jdbc.CacheAbstractJdbcStore$1.call(CacheAbstractJdbcStore.java:469)
at org.apache.ignite.cache.store.jdbc.CacheAbstractJdbcStore$1.call(CacheAbstractJdbcStore.java:433)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.nio.file.FileSystemException: /home/grid-gain-server/gridgain-community-8.7.7/work/db/node00-3273af50-1e97-47fa-a237-29e7dfc2d987/cache-COrderCache/part-56.bin: Too many open files
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:91)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
at sun.nio.fs.UnixFileSystemProvider.newAsynchronousFileChannel(UnixFileSystemProvider.java:196)
at java.nio.channels.AsynchronousFileChannel.open(AsynchronousFileChannel.java:248)
at java.nio.channels.AsynchronousFileChannel.open(AsynchronousFileChannel.java:301)
at org.apache.ignite.internal.processors.cache.persistence.file.AsyncFileIO.<init>(AsyncFileIO.java:56)
at org.apache.ignite.internal.processors.cache.persistence.file.AsyncFileIOFactory.create(AsyncFileIOFactory.java:43)
at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStore.init(FilePageStore.java:420)
... 23 more
Nov 24, 2019 4:54:51 PM java.util.logging.LogManager$RootLogger log
SEVERE: JVM will be halted immediately due to the failure: [failureCtx=FailureContext [type=CRITICAL_ERROR, err=class o.a.i.i.processors.cache.persistence.StorageException: Failed to initialize partition file: /home/grid-gain-server/gridgain-community-8.7.7/work/db/node00-3273af50-1e97-47fa-a237-29e7dfc2d987/cache-COrderCache/part-56.bin]]
What can I do to fix it?
Adding the following configuration was enough for me to avoid this exception:
vi /etc/security/limits.conf
root soft nofile 10240
root hard nofile 20480
Then, in /etc/sysctl.conf, I appended the max-watches config:
fs.inotify.max_user_watches=524288
Note that root is my user account name.
The values are experimental; I'm not sure if this is safe, but I haven't had any notable issues in my VM.
I didn't drop the previous configuration. A reboot was needed.
Credit to @Stephen Darlington.
Just to explain what's going on here: fs.file-max sets an overall limit for the operating system, while the entries in limits.conf set limits for each user. The only other thing I would add is that if you're running Ignite as a user other than root (recommended), you'd change that user's limits.
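For example, the two limits can be checked independently (standard Linux commands, not part of the original answer):
cat /proc/sys/fs/file-max    # system-wide limit, set via sysctl
ulimit -n                    # per-process soft limit for the current user, governed by limits.conf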

Not able to connect to Spark cluster via sparklyr package when my custom package method is invoked via OpenCPU

I have created an R package that makes use of the sparklyr capabilities within a dummy hello function. My package does something very simple: it connects to a Spark cluster, prints the Spark version, and disconnects. The package cleans and builds successfully and executes successfully from R and RStudio.
# Connect to Spark cluster
spark_conn <- sparklyr::spark_connect(master = "spark://elenipc.home:7077", spark_home = '/home/eleni/spark-2.2.0-bin-hadoop2.7/')
# Print the version of Spark
sv<- sparklyr::spark_version(spark_conn)
print(sv)
# Disconnect from Spark
sparklyr::spark_disconnect(spark_conn)
It is very important for me to be able to execute the hello function from the OpenCPU REST API. (I have used the OpenCPU API to execute many other custom-created packages.)
When invoking the OpenCPU API like:
curl http://localhost/ocpu/user/rstudio/library/myFirstBigDataPackage/R/hello/print -X POST
I get the following response:
Failed while connecting to sparklyr to port (8880) for sessionid (89615): Gateway in port (8880) did not respond.
Path: /home/eleni/spark-2.2.0-bin-hadoop2.7/bin/spark-submit
Parameters: --class, sparklyr.Shell, '/home/rstudio/R/x86_64-pc-linux-gnu-library/3.4/sparklyr/java/sparklyr-2.2-2.11.jar', 8880, 89615
Log: /tmp/ocpu-temp/file26b165c92166_spark.log
---- Output Log ----
Error occurred during initialization of VM
Could not allocate metaspace: 1073741824 bytes
---- Error Log ----
In call:
force(code)
Of course, allocating more memory to both Java and the Spark executor does not resolve the issue. Permission issues are also ruled out, as I already configured the /etc/apparmor.d/opencpu.d/custom file to permit OpenCPU rwx privileges on Spark. It seems to be a connectivity issue that I don't know how to approach. During method invocation via the OpenCPU API, the Spark logs do not even print anything.
For your info, my environment configuration is as follows:
java version "1.8.0_65"
R version 3.4.1
RStudio version 1.0.153
spark-2.2.0-bin-hadoop2.7
opencpu 1.5 (compatible with my Ubuntu 14.04.3 LTS)
Thank you very much for your support and time!
