I have two Docker images: 1) CouchDB and 2) a web application. The web application is not able to talk to CouchDB, which is running on the same machine.
When I access CouchDB directly it works:
http://127.0.0.1:5984/_utils/#database/
http://0.0.0.0:5984/_utils/#database/
What am I missing? Any pointers?
hashgraph_1 | Caused by: org.apache.http.conn.HttpHostConnectException: Connect to localhost:5984 [localhost/127.0.0.1] failed: Connection refused (Connection refused)
hashgraph_1 | at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:159)
hashgraph_1 | at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:359)
hashgraph_1 | at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:381)
hashgraph_1 | at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:237)
hashgraph_1 | at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
hashgraph_1 | at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
hashgraph_1 | at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:111)
hashgraph_1 | at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
hashgraph_1 | at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:72)
hashgraph_1 | at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
My docker-compose file:
version: "3"
services:
hashgraph:
build: "./"
depends_on:
- couchdb
deploy:
replicas: 1
restart_policy:
condition: always
ports:
- "51200-51299:51200-51299"
couchdb:
image: couchdb:2.1
ports:
- "5984:5984"
deploy:
replicas: 1
restart_policy:
condition: always
Output of docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
cc7e37cd6260 hashgraphexperiments_hashgraph "java -jar swirlds.j…" About a minute ago Up About a minute 50200-50299/tcp, 0.0.0.0:51200-51299->51200-51299/tcp hashgraphexperiments_hashgraph_1
9f4767b36aea couchdb:2.1 "tini -- /docker-ent…" 2 hours ago Up About a minute 4369/tcp, 9100/tcp, 0.0.0.0:5984->5984/tcp hashgraphexperiments_couchdb_1
depends_on: it just waits for the other container to be started; it does not change how the services address each other.
Whenever you want to call CouchDB from the hashgraph container's code, you need to use couchdb:5984 instead of localhost:5984. Inside a container, localhost refers to that container itself, while Compose makes every service reachable by its service name.
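For example, if the hashgraph code builds its CouchDB URL with Apache HttpClient (which the stack trace suggests it uses), a minimal sketch of the change, assuming the URL was previously hardcoded to localhost:

import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;

public class CouchDbPing {
    public static void main(String[] args) throws Exception {
        // Inside the hashgraph container, "localhost" is the container itself.
        // Use the Compose service name instead; the Compose network resolves it.
        String couchDbUrl = "http://couchdb:5984/";
        try (CloseableHttpClient client = HttpClients.createDefault();
             CloseableHttpResponse response = client.execute(new HttpGet(couchDbUrl))) {
            System.out.println(response.getStatusLine());
        }
    }
}

Ideally the host would come from configuration (for example an environment variable set in the compose file) rather than being hardcoded, so the same image runs both inside and outside Compose.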
See Networking in Compose.
You can also explicitly use a links entry instead of depends_on.
The documentation describes links as follows:
links: Link to containers in another service. Either specify both the service name and a link alias (SERVICE:ALIAS), or just the service name.
Links also express dependency between services in the same way as depends_on, so they determine the order of service startup.
version: "3"
services:
hashgraph:
build: .
links:
- couchdb:couchdb
deploy:
replicas: 1
restart_policy:
condition: always
ports:
- "51200-51299:51200-51299"
couchdb:
image: couchdb:2.1
ports:
- 5984:5984
deploy:
replicas: 1
restart_policy:
condition: always
Related
I am trying to deploy my application in Docker (on Windows 10), in Compose with a Postgres container. When I execute docker-compose up, I see the following log:
Starting postgres ... done
Recreating application ... done
Attaching to postgres, application
postgres |
postgres | PostgreSQL Database directory appears to contain a database; Skipping initialization
postgres |
postgres | 2021-08-20 14:51:49.721 UTC [1] LOG: listening on IPv4 address "0.0.0.0", port 5432
postgres | 2021-08-20 14:51:49.721 UTC [1] LOG: listening on IPv6 address "::", port 5432
postgres | 2021-08-20 14:51:49.741 UTC [1] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
postgres | 2021-08-20 14:51:49.858 UTC [21] LOG: database system was interrupted; last known up at 2021-08-20 14:50:34 UTC
postgres | 2021-08-20 14:51:51.363 UTC [21] LOG: database system was not properly shut down; automatic recovery in progress
postgres | 2021-08-20 14:51:51.377 UTC [21] LOG: redo starts at 0/1661A88
postgres | 2021-08-20 14:51:51.377 UTC [21] LOG: invalid record length at 0/1661AC0: wanted 24, got 0
postgres | 2021-08-20 14:51:51.377 UTC [21] LOG: redo done at 0/1661A88
postgres | 2021-08-20 14:51:51.471 UTC [1] LOG: database system is ready to accept connections
Then the container of my application tries to start, and after the Spring Boot banner I get an error:
application | 2021-08-20 14:52:23.440 ERROR 1 --- [ main] com.zaxxer.hikari.pool.HikariPool : HikariPool-1 - Exception during pool initialization.
application |
application | org.postgresql.util.PSQLException: Connection to localhost:5432 refused. Check that the hostname and port are correct and that the postmaster is accepting TCP/IP connections.
application | at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:303) ~[postgresql-42.2.20.jar!/:42.2.20]
Here is my docker-compose.yml
version: "3"
services:
db:
image: postgres:11.13-alpine
container_name: postgres
ports:
- 5432:5432
volumes:
- ./pgdata:/var/lib/postgresql/data
environment:
- POSTGRES_DB=my_db
- POSTGRES_USER=postgres
- POSTGRES_PASSWORD=root
- PGDATA=/var/lib/postgresql/data/mnt
restart: always
app:
build: .
container_name: application
ports:
- 8085:8085
environment:
- POSTGRES_HOST=db
restart: always
links:
- db
My Dockerfile:
FROM openjdk:11
ADD target/my-app.jar my-app.jar
EXPOSE 8085
ENTRYPOINT ["java" , "-jar", "my-app.jar"]
application.properties:
spring.datasource.url=jdbc:postgresql://${POSTGRES_HOST}:5432/my_db
spring.datasource.username=postgres
spring.datasource.password=root
spring.jpa.database-platform=org.hibernate.dialect.PostgreSQLDialect
spring.jpa.hibernate.ddl-auto=none
spring.liquibase.change-log=classpath:liquibase/changelog.xml
logging.level.org.springframework.jdbc.core = TRACE
What is the problem? Why is my application looking for Postgres on localhost instead of applying the environment variable? Inside the Docker container the host for Postgres should be different, shouldn't it? I have even tried to hardcode the Postgres host in application.properties as jdbc:postgresql://db:5432/my_db, but it continues to use localhost. How can I fix it?
Try it this way, using map syntax for the environment block:
environment:
POSTGRES_HOST: db
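Also check how the placeholder is resolved on the Spring side. With Spring's standard ${VAR:default} placeholder syntax (the default shown here is illustrative, not from the question), a URL like the following silently falls back to localhost whenever POSTGRES_HOST is not visible to the process:

spring.datasource.url=jdbc:postgresql://${POSTGRES_HOST:localhost}:5432/my_db

Note also that application.properties is baked into my-app.jar at build time (the Dockerfile ADDs target/my-app.jar), so after editing it you have to rebuild the jar and the image (for example with docker-compose up --build) before the change takes effect.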
I have configured the following Kafka properties for my Spring Boot based library, which is bundled inside the lib directory of an EAR deployed to WildFly. I am able to start the Spring components successfully by loading the property file from the classpath (WEB-INF/classes):
spring.autoconfigure.exclude=org.springframework.boot.autoconfigure.gson.GsonAutoConfiguration,org.springframework.boot.autoconfigure.jms.artemis.ArtemisAutoConfiguration,\
org.springframework.boot.autoconfigure.data.web.SpringDataWebAutoConfiguration
spring.kafka.admin.client-id=iris-admin-local
spring.kafka.producer.client-id=iris-producer-local
spring.kafka.producer.retries=3
spring.kafka.producer.properties.max.block.ms=2000
spring.kafka.producer.bootstrap-servers=127.0.0.1:19092
spring.kafka.producer.key-serializer=org.apache.kafka.common.serialization.StringSerializer
spring.kafka.producer.value-serializer=org.springframework.kafka.support.serializer.JsonSerializer
foo.app.kafka.executor.core-pool-size=10
foo.app.kafka.executor.max-pool-size=500
foo.app.kafka.executor.queue-capacity=1000
I run Kafka and ZooKeeper via Docker Compose, and the containers are mapped to host ports 12181 and 19092 respectively. The publish fails with the error:
19:37:42,914 ERROR [org.springframework.kafka.support.LoggingProducerListener] (swiftalker-3) Exception thrown when sending a message with key='543507' and payload='com.foo.app.kanban.defect.entity.KanbanDefect#84b13' to topic alm_swift-alm:: org.apache.kafka.common.errors.TimeoutException: Topic alm_swift-alm not present in metadata after 2000 ms.
19:37:43,124 WARN [org.apache.kafka.clients.NetworkClient] (kafka-producer-network-thread | iris-producer-local-1) [Producer clientId=iris-producer-local-1] Error connecting to node 6be446692a1f:9092 (id: 1001 rack: null): java.net.UnknownHostException: 6be446692a1f
at java.net.InetAddress.getAllByName0(InetAddress.java:1281)
at java.net.InetAddress.getAllByName(InetAddress.java:1193)
at java.net.InetAddress.getAllByName(InetAddress.java:1127)
at org.apache.kafka.clients.ClientUtils.resolve(ClientUtils.java:110)
at org.apache.kafka.clients.ClusterConnectionStates$NodeConnectionState.currentAddress(ClusterConnectionStates.java:403)
at org.apache.kafka.clients.ClusterConnectionStates$NodeConnectionState.access$200(ClusterConnectionStates.java:363)
at org.apache.kafka.clients.ClusterConnectionStates.currentAddress(ClusterConnectionStates.java:151)
at org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:955)
at org.apache.kafka.clients.NetworkClient.access$600(NetworkClient.java:73)
at org.apache.kafka.clients.NetworkClient$DefaultMetadataUpdater.maybeUpdate(NetworkClient.java:1128)
at org.apache.kafka.clients.NetworkClient$DefaultMetadataUpdater.maybeUpdate(NetworkClient.java:1016)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:547)
at org.apache.kafka.clients.producer.internals.Sender.runOnce(Sender.java:324)
at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:239)
at java.lang.Thread.run(Thread.java:748)
This is after having provided the spring.kafka.producer.bootstrap-servers=127.0.0.1:19092 property. What's interesting, though, is the docker ps output:
CONTAINER ID NAMES PORTS CREATED STATUS
2133c81ed51d mongo 0.0.0.0:23556->27017/tcp, 0.0.0.0:23557->27018/tcp, 0.0.0.0:23558->27019/tcp 29 minutes ago Up 29 minutes
f18b86d8739e kafka-ui 0.0.0.0:18080->8080/tcp 29 minutes ago Up 29 minutes
6be446692a1f kafka 0.0.0.0:19092->9092/tcp 29 minutes ago Up 29 minutes
873304e1e6a0 zookeeper 2181/tcp, 2888/tcp, 3888/tcp, 8080/tcp 29 minutes ago Up 29 minutes
The WildFly server error logs show that the app is actually connecting to the Docker container via its container ID, i.e.
6be446692a1f kafka 0.0.0.0:19092->9092/tcp
from the docker ps -a output and
Error connecting to node 6be446692a1f:9092 (id: 1001 rack: null): java.net.UnknownHostException: 6be446692a1f
I'm confused as to how the Spring Boot code, despite the config property pointing at the server over localhost and mapped port 19092, manages to find the Docker container by its ID and default port and then tries to connect to it. How do I fix this?
Update: here is the Docker Compose file.
version: '3'
networks:
app-tier:
driver: bridge
services:
zookeeper:
image: 'docker.io/bitnami/zookeeper:3-debian-10'
container_name: 'zookeeper'
networks:
- app-tier
volumes:
- 'zookeeper_data:/bitnami'
environment:
- ALLOW_ANONYMOUS_LOGIN=yes
kafka:
image: 'docker.io/bitnami/kafka:2-debian-10'
container_name: 'kafka'
ports:
- 19092:9092
networks:
- app-tier
volumes:
- 'kafka_data:/bitnami'
- /var/run/docker.sock:/var/run/docker.sock
environment:
- KAFKA_CFG_ZOOKEEPER_CONNECT=zookeeper:2181
- ALLOW_PLAINTEXT_LISTENER=yes
depends_on:
- zookeeper
database:
image: 'mongo'
container_name: 'mongo'
environment:
- MONGO_INITDB_DATABASE='swiftalk_db'
networks:
- app-tier
ports:
- 23556-23558:27017-27019
depends_on:
- kafka
kafka-ui:
container_name: kafka-ui
image: provectuslabs/kafka-ui:latest
ports:
- 18080:8080
networks:
- app-tier
volumes:
- 'mongo_data:/data/db'
depends_on:
- kafka
environment:
- KAFKA_CLUSTERS_0_NAME=local
- KAFKA_CLUSTERS_0_BOOTSTRAPSERVERS=kafka:9092
- KAFKA_CLUSTERS_0_ZOOKEEPER=zookeeper:2181
volumes:
zookeeper_data:
driver: local
kafka_data:
driver: local
mongo_data:
driver: local
In essence you need to configure your advertised listeners correctly. This is the value that the broker provides to the client, telling the client where to find the broker when it makes subsequent connections.
Details: https://www.confluent.io/blog/kafka-client-cannot-connect-to-broker-on-aws-on-docker-etc/
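Applied to the Compose file above, here is a sketch of a dual-listener setup for the bitnami image (the listener names INTERNAL/EXTERNAL are illustrative; the KAFKA_CFG_* variables are how that image passes broker settings through):

  kafka:
    image: 'docker.io/bitnami/kafka:2-debian-10'
    ports:
      - 19092:19092    # publish the EXTERNAL listener port unchanged
    environment:
      - KAFKA_CFG_ZOOKEEPER_CONNECT=zookeeper:2181
      - ALLOW_PLAINTEXT_LISTENER=yes
      # one listener for clients inside the Compose network, one for the host
      - KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP=INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT
      - KAFKA_CFG_LISTENERS=INTERNAL://:9092,EXTERNAL://:19092
      - KAFKA_CFG_ADVERTISED_LISTENERS=INTERNAL://kafka:9092,EXTERNAL://localhost:19092
      - KAFKA_CFG_INTER_BROKER_LISTENER_NAME=INTERNAL

With this, a client bootstrapping against 127.0.0.1:19092 is told to keep using localhost:19092 for subsequent connections, instead of the broker's container ID (6be446692a1f:9092, which only resolves inside Docker's network).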
I am getting the following error:
licensingservice_1 | 2020-12-26 23:17:30.499 INFO 1 --- [ main] org.hibernate.dialect.Dialect : HHH000400: Using dialect: org.hibernate.dialect.MySQL57Dialect
licensingservice_1 | Hibernate: drop table if exists licenses
licensingservice_1 | 2020-12-26 23:17:31.006 INFO 1 --- [ main] com.zaxxer.hikari.HikariDataSource : HikariPool-1 - Starting...
licensingservice_1 | 2020-12-26 23:17:32.010 ERROR 1 --- [ main] com.zaxxer.hikari.pool.HikariPool : HikariPool-1 - Exception during pool initialization.
licensingservice_1 |
licensingservice_1 | com.mysql.cj.jdbc.exceptions.CommunicationsException: Communications link failure
licensingservice_1 |
licensingservice_1 | The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
licensingservice_1 | at com.mysql.cj.jdbc.exceptions.SQLError.createCommunicationsException(SQLError.java:174) ~[mysql-connector-java-8.0.22.jar:8.0.22]
licensingservice_1 | at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:64) ~[mysql-connector-java-8.0.22.jar:8.0.22]
My docker-compose.yml:
version : '3'
services:
licensingservice:
image: licensing/licensing-service-ms:0.0.1-SNAPSHOT
ports:
- "8080:8080"
networks:
- my-network
volumes:
- .:/vol/development
depends_on:
- mysqldbserver
mysqldbserver:
image: mysql:5.7
ports:
- "3307:3306"
networks:
- my-network
environment:
MYSQL_DATABASE: license
MYSQL_ROOT_PASSWORD: Spartans#123
container_name: mysqldb
networks:
my-network:
driver: bridge
And my application.properties:
spring.jpa.hibernate.ddl-auto=create-drop
spring.datasource.url=jdbc:mysql://mysqldb:3307/license
spring.datasource.username=root
spring.datasource.password=Spartans#123
spring.jpa.properties.hibernate.dialect=org.hibernate.dialect.MySQL57Dialect
spring.datasource.driver-class-name=com.mysql.cj.jdbc.Driver
spring.jpa.show-sql=true
Try connecting to port 3306 instead. You're exposing port 3306 on the database container to the host machine on port 3307, but that doesn't change anything for communication between services inside the same network.
This is explained in the Docker-Compose docs.
By default Compose sets up a single network for your app. Each container for a service joins the default network and is both reachable by other containers on that network, and discoverable by them at a hostname identical to the container name.
Additionally, you can choose to expose these ports to the outside world by defining a mapping between the host port and container port. However, this has no effect on communication between services inside the same network:
It is important to note the distinction between HOST_PORT and CONTAINER_PORT. [...] Networked service-to-service communication uses the CONTAINER_PORT. When HOST_PORT is defined, the service is accessible outside the swarm as well.
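Applied to the application.properties above, only the port changes; mysqldb is already the right host, since that is the container_name on the shared network:

spring.datasource.url=jdbc:mysql://mysqldb:3306/license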
Alright, so I have a Spring Boot Java application, a MySQL DB and nginx.
I start them with docker-compose up; that is all I need to execute.
The error that I have in my application workaround_app_1 is the following:
/\\ / ___'_ __ _ _(_)_ __ __ _ \ \ \ \
( ( )\___ | '_ | '_| | '_ \/ _` | \ \ \ \
\\/ ___)| |_)| | | | | || (_| | ) ) ) )
' |____| .__|_| |_|_| |_\__, | / / / /
=========|_|==============|___/=/_/_/_/
:: Spring Boot :: (v2.1.4.RELEASE)
14:46:59.083 INFO [c.b.w.WorkaroundApplication] Starting WorkaroundApplication on 9b42d0d4614b with PID 48 (/app/target/classes started by root in /app)
14:46:59.089 INFO [c.b.w.WorkaroundApplication] The following profiles are active: devdock
14:47:11.985 INFO [o.a.c.h.Http11NioProtocol] Initializing ProtocolHandler ["https-jsse-nio-8080"]
14:47:12.015 INFO [o.a.c.c.StandardService] Starting service [Tomcat]
14:47:12.016 INFO [o.a.c.c.StandardEngine] Starting Servlet engine: [Apache Tomcat/9.0.17]
14:47:13.993 INFO [o.a.c.c.C.[.[.[/workaround]] Initializing Spring embedded WebApplicationContext
14:47:13.996 INFO [o.s.w.c.ContextLoader] Root WebApplicationContext: initialization completed in 14635 ms
14:47:17.906 INFO [c.z.h.HikariDataSource] HikariPool-1 - Starting...
14:47:19.294 ERROR [c.z.h.p.HikariPool] HikariPool-1 - Exception during pool initialization.
com.mysql.cj.jdbc.exceptions.CommunicationsException: Communications link failure
The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
at com.mysql.cj.jdbc.exceptions.SQLError.createCommunicationsException(SQLError.java:174)
at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:64)
at com.mysql.cj.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:835)
at com.mysql.cj.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:455)
at com.mysql.cj.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:240)
at com.mysql.cj.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:199)
at com.zaxxer.hikari.util.DriverDataSource.getConnection(DriverDataSource.java:136)
at com.zaxxer.hikari.pool.PoolBase.newConnection(PoolBase.java:369)
at com.zaxxer.hikari.pool.PoolBase.newPoolEntry(PoolBase.java:198)
at com.zaxxer.hikari.pool.HikariPool.createPoolEntry(HikariPool.java:467)
at com.zaxxer.hikari.pool.HikariPool.checkFailFast(HikariPool.java:541)
at com.zaxxer.hikari.pool.HikariPool.<init>(HikariPool.java:115)
at com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:112)
at org.springframework.jdbc.datasource.DataSourceUtils.fetchConnection(DataSourceUtils.java:157)
at org.springframework.jdbc.datasource.DataSourceUtils.doGetConnection(DataSourceUtils.java:115)
at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:78)
at org.springframework.jdbc.support.JdbcUtils.extractDatabaseMetaData(JdbcUtils.java:319)
at org.springframework.jdbc.support.JdbcUtils.extractDatabaseMetaData(JdbcUtils.java:356)
at org.springframework.boot.autoconfigure.orm.jpa.DatabaseLookup.getDatabase(DatabaseLookup.java:73)
at org.springframework.boot.autoconfigure.orm.jpa.JpaProperties.determineDatabase(JpaProperties.java:142)
at org.springframework.boot.autoconfigure.orm.jpa.JpaBaseConfiguration.jpaVendorAdapter(JpaBaseConfiguration.java:113)
at org.springframework.boot.autoconfigure.orm.jpa.HibernateJpaConfiguration$$EnhancerBySpringCGLIB$$aadd42f9.CGLIB$jpaVendorAdapter$4(<generated>)
at org.springframework.boot.autoconfigure.orm.jpa.HibernateJpaConfiguration$$EnhancerBySpringCGLIB$$aadd42f9$$FastClassBySpringCGLIB$$962bc1e0.invoke(<generated>)
at org.springframework.cglib.proxy.MethodProxy.invokeSuper(MethodProxy.java:244)
at org.springframework.context.annotation.ConfigurationClassEnhancer$BeanMethodInterceptor.intercept(Configuratio
That goes on and on until, at the end, you see:
The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
at com.mysql.cj.exceptions.ExceptionFactory.createException(ExceptionFactory.java:61)
at com.mysql.cj.exceptions.ExceptionFactory.createException(ExceptionFactory.java:105)
at com.mysql.cj.exceptions.ExceptionFactory.createException(ExceptionFactory.java:151)
at com.mysql.cj.exceptions.ExceptionFactory.createCommunicationsException(ExceptionFactory.java:167)
at com.mysql.cj.protocol.a.NativeSocketConnection.connect(NativeSocketConnection.java:91)
at com.mysql.cj.NativeSession.connect(NativeSession.java:152)
at com.mysql.cj.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:955)
at com.mysql.cj.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:825)
... 188 common frames omitted
Caused by: java.net.ConnectException: Connection refused (Connection refused)
at java.base/java.net.PlainSocketImpl.socketConnect(Native Method)
at java.base/java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:399)
at java.base/java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:242)
at java.base/java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:224)
at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:403)
at java.base/java.net.Socket.connect(Socket.java:591)
at com.mysql.cj.protocol.StandardSocketFactory.connect(StandardSocketFactory.java:155)
at com.mysql.cj.protocol.a.NativeSocketConnection.connect(NativeSocketConnection.java:65)
... 191 common frames omitted
This is my docker-compose:
version: '3'
services:
nginx:
container_name: some-nginx
image: nginx:1.13
restart: always
ports:
- 8080:8080
volumes:
- ./nginx/conf.d:/etc/nginx/conf.d
depends_on:
- app
mysql:
container_name: workaround-mysql
image: mysql/mysql-server:5.7
environment:
MYSQL_DATABASE: workaround
MYSQL_USER: springuser
MYSQL_PASSWORD: admin
MYSQL_ROOT_PASSWORD: admin
MYSQL_ROOT_HOST: '%'
ports:
- "3308:3306"
restart: always
app:
restart: always
build: ./
working_dir: /app
volumes:
- ./:/app
- ~/.m2:/root/.m2
expose:
- "8080"
command: mvn clean spring-boot:run
depends_on:
- mysql
And here is my application.properties:
###################################
#---------DATABASE
###################################
#
# URL for the mysql db
spring.datasource.url=jdbc:mysql://workaround-mysql:3308/workaround?serverTimezone=UTC&max_allowed_packet=15728640
# User name in mysql
spring.datasource.username=springuser
# Password for mysql
spring.datasource.password=admin
My Dockerfile only contains one line: FROM openjdk:12-jdk
With all of that in mind, what is happening? Why can I not connect to my database? When I run it outside of Docker, everything works fine on localhost, but I can't get it working with this setup. Could it be that my workaround_app_1 started sooner than MySQL and now can't function? But in my docker-compose I specified that it depends on MySQL being started, right? I'm new to Docker + nginx.
Notes:
I have tried different ports for MySQL, but that doesn't seem to be the issue. I also don't think it's a problem with resources or anything related to hardware constraints. Are my configurations proper? By the way, I use Docker for Windows, a 64-bit machine and JDK 12. I have tried some demo applications and they were working fine.
The ports mapping in your docker-compose.yml is only relevant for the host, so you'll be able to connect to your DB through localhost:3308. But inside your other docker-compose containers (that is, the compose default network), you'd have to use workaround-mysql:3306.
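Applied to the application.properties above, that would be:

spring.datasource.url=jdbc:mysql://workaround-mysql:3306/workaround?serverTimezone=UTC&max_allowed_packet=15728640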
OK, so the issue was that MySQL got stuck on:
[Entrypoint] Starting MySQL 5.7.26-1.1.11
Nothing could connect to it.
The application that needed it was starting up, could not connect, and so was throwing errors.
Here is another question of mine regarding this issue, and you can see it's solved:
Docker MySQL - can't connect from Spring Boot app to MySQL database
You can connect from the host to the DB in a Docker container, but that does not mean another container can reach it the same way.
Check the MySQL settings and allow connections from the app container's IP (or from the Docker network).
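To separate a networking problem from a privileges problem, a throwaway probe run inside the app container can help. A minimal sketch (the class name is made up; the credentials and host are taken from the Compose file above, and mysql-connector-java is assumed to be on the classpath):

import java.sql.Connection;
import java.sql.DriverManager;

public class DbProbe {
    public static void main(String[] args) throws Exception {
        // Service-to-service traffic uses the container port (3306),
        // not the 3308 host mapping.
        try (Connection c = DriverManager.getConnection(
                "jdbc:mysql://workaround-mysql:3306/workaround?serverTimezone=UTC",
                "springuser", "admin")) {
            System.out.println("Connected: " + c.getMetaData().getDatabaseProductVersion());
        }
    }
}

A "Connection refused" here points at networking or MySQL not being up yet; an "Access denied" points at the user/host grants.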
I've set up PredictionIO v0.13 on my Linux machine in Docker (running in swarm mode). This setup includes:
one container for pio v0.13
one container for elasticsearch v5.6.4
one container for mysql v8.0.16
one container for spark-master v2.3.2
one container for spark-worker v2.3.2
The template I am using is ecomm-recommender-java, modified for my data. I don't know if I made an error in the template or in the Docker setup, but something is really wrong:
pio build succeeds
pio train fails with:
Exception in thread "main" java.io.IOException: Connection reset by peer
Because of this, I put a lot of logging into my template at various points, and this is what I found:
The train fails after the model is computed. I am using a custom Model class for holding the logistic-regression model and the various user and product indices.
The model is a PersistentModel. In the save method I put logging after every step. Those steps are logged, and I can find the saved results in the mounted Docker volume, so it seems the save also succeeds, but after that I get the following exception:
[INFO] [Model] saving user index
[INFO] [Model] saving product index
[INFO] [Model] save done
[INFO] [AbstractConnector] Stopped Spark#20229b7d{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
Exception in thread "main" java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at sun.nio.ch.IOUtil.read(IOUtil.java:197)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
at org.apache.predictionio.shaded.org.apache.http.impl.nio.reactor.SessionInputBufferImpl.fill(SessionInputBufferImpl.java:204)
at org.apache.predictionio.shaded.org.apache.http.impl.nio.codecs.AbstractMessageParser.fillBuffer(AbstractMessageParser.java:136)
at org.apache.predictionio.shaded.org.apache.http.impl.nio.DefaultNHttpClientConnection.consumeInput(DefaultNHttpClientConnection.java:241)
at org.apache.predictionio.shaded.org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:81)
at org.apache.predictionio.shaded.org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:39)
at org.apache.predictionio.shaded.org.apache.http.impl.nio.reactor.AbstractIODispatch.inputReady(AbstractIODispatch.java:114)
at org.apache.predictionio.shaded.org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor.java:162)
at org.apache.predictionio.shaded.org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:337)
at org.apache.predictionio.shaded.org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:315)
at org.apache.predictionio.shaded.org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:276)
at org.apache.predictionio.shaded.org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
at org.apache.predictionio.shaded.org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:588)
at java.lang.Thread.run(Thread.java:748)
I couldn't find anything more relevant in any of the logs, but it's possible that I overlooked something.
I tried to play with the train parameters like so:
pio-docker train -- --master local[3] --driver-memory 4g --executor-memory 10g --verbose --num-executors 3
playing with the Spark modes (i.e. --master local[1-3], or not providing it at all so the Spark instances in the Docker containers are used)
played with the --driver-memory (from 4g to 10g)
played with the --executor-memory (also from 4g to 10g)
played with the --num-executors number (from 1 to 3)
since most of the Google search results suggested these.
My main problem here is that I don't know where this exception is coming from or how to track it down.
Here is the save method, which could be relevant:
public boolean save(String id, AlgorithmParams algorithmParams, SparkContext sparkContext) {
try {
logger.info("saving logistic regression model");
logisticRegressionModel.save("/templates/" + id + "/lrm");
logger.info("creating java spark context");
JavaSparkContext jsc = JavaSparkContext.fromSparkContext(sparkContext);
logger.info("saving user index");
userIdIndex.saveAsObjectFile("/templates/" + id + "/indices/user");
logger.info("saving product index");
productIdIndex.saveAsObjectFile("/templates/" + id + "/indices/product");
logger.info("save done");
} catch (IOException e) {
e.printStackTrace();
}
return true;
}
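One thing worth noting about this method as written (not necessarily the cause of the failure above): it returns true even when an IOException is caught, so a failed save would look successful to the caller. A hedged sketch of a stricter variant, keeping the same writes:

public boolean save(String id, AlgorithmParams algorithmParams, SparkContext sparkContext) {
    try {
        // same three writes as above; the unused JavaSparkContext is dropped
        logisticRegressionModel.save("/templates/" + id + "/lrm");
        userIdIndex.saveAsObjectFile("/templates/" + id + "/indices/user");
        productIdIndex.saveAsObjectFile("/templates/" + id + "/indices/product");
        return true;
    } catch (IOException e) {
        logger.error("saving model " + id + " failed", e);
        return false; // surface the failure instead of reporting success
    }
}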
The hardcoded /templates/ is the Docker-mounted volume for pio and also for Spark.
Expected result is: train completes without error.
I am happy to share more details if necessary; please ask, as I am not sure what would be helpful here.
EDIT1: Including docker-compose.yml
version: '3'
networks:
mynet:
driver: overlay
services:
elasticsearch:
image: docker.elastic.co/elasticsearch/elasticsearch:5.6.4
environment:
- xpack.graph.enabled=false
- xpack.ml.enabled=false
- xpack.monitoring.enabled=false
- xpack.security.enabled=false
- xpack.watcher.enabled=false
- cluster.name=predictionio
- bootstrap.memory_lock=false
- "ES_JAVA_OPTS=-Xms1g -Xmx1g"
volumes:
- pio-elasticsearch-data:/usr/share/elasticsearch/data
deploy:
replicas: 1
networks:
- mynet
mysql:
image: mysql:8
command: mysqld --character-set-server=utf8mb4 --collation-server=utf8mb4_unicode_ci
environment:
MYSQL_ROOT_PASSWORD: somepass
MYSQL_USER: someuser
MYSQL_PASSWORD: someotherpass
MYSQL_DATABASE: pio
volumes:
- pio-mysql-data:/var/lib/mysql
deploy:
replicas: 1
networks:
- mynet
spark-master:
image: bde2020/spark-master:2.3.2-hadoop2.7
ports:
- "8080:8080"
- "7077:7077"
volumes:
- ./templates:/templates
environment:
- INIT_DAEMON_STEP=setup_spark
deploy:
replicas: 1
networks:
- mynet
spark-worker:
image: bde2020/spark-worker:2.3.2-hadoop2.7
depends_on:
- spark-master
ports:
- "8081:8081"
volumes:
- ./templates:/templates
environment:
- "SPARK_MASTER=spark://spark-master:7077"
deploy:
replicas: 1
networks:
- mynet
pio:
image: tamassoltesz/pio0.13-spark.230:1
ports:
- 7070:7070
- 8000:8000
volumes:
- ./templates:/templates
dns: 8.8.8.8
depends_on:
- mysql
- elasticsearch
- spark-master
environment:
PIO_STORAGE_SOURCES_MYSQL_TYPE: jdbc
PIO_STORAGE_SOURCES_MYSQL_URL: "jdbc:mysql://mysql/pio"
PIO_STORAGE_SOURCES_MYSQL_USERNAME: someuser
PIO_STORAGE_SOURCES_MYSQL_PASSWORD: someuser
PIO_STORAGE_REPOSITORIES_EVENTDATA_NAME: pio_event
PIO_STORAGE_REPOSITORIES_EVENTDATA_SOURCE: MYSQL
PIO_STORAGE_REPOSITORIES_MODELDATA_NAME: pio_model
PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE: MYSQL
PIO_STORAGE_SOURCES_ELASTICSEARCH_TYPE: elasticsearch
PIO_STORAGE_SOURCES_ELASTICSEARCH_HOSTS: predictionio_elasticsearch
PIO_STORAGE_SOURCES_ELASTICSEARCH_PORTS: 9200
PIO_STORAGE_SOURCES_ELASTICSEARCH_SCHEMES: http
PIO_STORAGE_REPOSITORIES_METADATA_NAME: pio_meta
PIO_STORAGE_REPOSITORIES_METADATA_SOURCE: ELASTICSEARCH
MASTER: spark://spark-master:7077 #spark master
deploy:
replicas: 1
networks:
- mynet
volumes:
pio-elasticsearch-data:
pio-mysql-data:
I found out what the issue is: somehow the connection to Elasticsearch is lost during the long-running train. This is a Docker issue, not a PredictionIO issue. For now, I "solved" this by not using Elasticsearch at all.
Another thing I was not aware of: it matters where you put --verbose in the command. Providing it the way I did originally (pio train -- --driver-memory 4g --verbose) has little or no effect on the verbosity of the logging. The right way is pio train --verbose -- --driver-memory 4g, i.e. before the --. This way I got much more log output, from which the origin of the issue became clear.