I have a Spring image that is failing to bind the port correctly. I can see from the logs that the readiness and liveness states are OK:
{"timestamp":"2023-02-08T21:28:40.448Z","level":"DEBUG","thread":"main","logger":"org.springframework.boot.availability.ApplicationAvailabilityBean","message":"Application availability state LivenessState changed to CORRECT","context":"default","nanotime":142674280192537}
{"timestamp":"2023-02-08T21:28:40.451Z","level":"DEBUG","thread":"main","logger":"org.springframework.boot.devtools.restart.Restarter","message":"Creating new Restarter for thread Thread[main,5,main]","context":"default","nanotime":142674282973006}
{"timestamp":"2023-02-08T21:28:40.454Z","level":"DEBUG","thread":"main","logger":"org.springframework.boot.availability.ApplicationAvailabilityBean","message":"Application availability state ReadinessState changed to ACCEPTING_TRAFFIC","context":"default","nanotime":142674286139388}
{"timestamp":"2023-02-08T21:28:40.496Z","level":"DEBUG","thread":"main","logger":"org.springframework.boot.web.servlet.filter.OrderedRequestContextFilter","message":"Filter 'requestContextFilter' configured for use","context":"default","nanotime":142674328227052}
but Kubernetes can't reach those endpoints:
Port: 8080/TCP
Host Port: 0/TCP
State: Running
Started: Tue, 07 Feb 2023 01:23:27 -0600
Ready: False
Restart Count: 0
Liveness: http-get http://:8080/actuator/health/liveness delay=60s timeout=1s period=30s #success=1 #failure=3
Readiness: http-get http://:8080/actuator/health/readiness delay=60s timeout=1s period=10s #success=1 #failure=3
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2m30s default-scheduler Successfully assigned podname to ip-10-8-x-x.ap-northeast-1.compute.internal
Normal Pulled 2m29s kubelet Container image "registry.gitlab.com/integration-image" already present on machine
Normal Created 2m29s kubelet Created container ui-backend
Normal Started 2m29s kubelet Started container ui-backend
Warning Unhealthy 30s (x2 over 60s) kubelet Liveness probe failed: HTTP probe failed with statuscode: 404
Warning Unhealthy 10s (x8 over 80s) kubelet Readiness probe failed: HTTP probe failed with statuscode: 404
If I curl from inside the pod I get the same result: curl localhost:8080/actuator/health/liveness
<!doctype html>
<html lang="en">
<head>
<title>HTTP Status 404 – Not Found</title>
<style type="text/css">body {font-family:Tahoma,Arial,sans-serif;} h1, h2, h3, b {color:white;background-color:#525D76;} h1 {font-size:22px;} h2 {font-size:16px;} h3 {font-size:14px;} p {font-size:12px;} a {color:black;} .line {height:1px;background-color:#525D76;border:none;}</style>
</head>
<body>
<h1>HTTP Status 404 – Not Found</h1>
</body>
</html>
The weird thing is that if I run the same image with kubectl run, the application starts and I get {"status":"UP"} from the liveness and readiness endpoints.
This is my deployment configuration
resource "kubernetes_deployment" "ui_backend" {
metadata {
name = local.name
namespace = local.namespace
labels = {
"app.kubernetes.io/name" = local.name
"app.kubernetes.io/managed-by" = local.managed_by
}
}
spec {
replicas = 1
selector {
match_labels = {
"app.kubernetes.io/name" = local.name
"app.kubernetes.io/managed-by" = local.managed_by
}
}
template {
metadata {
labels = {
"app.kubernetes.io/name" = local.name
"app.kubernetes.io/managed-by" = local.managed_by
}
}
spec {
service_account_name = local.name
image_pull_secrets {
name = "gitlab-auth"
}
container {
image = var.docker_image
name = local.name
env {
name = "JAVA_OPTS"
value = "-Dspring.profiles.active=${var.environment} "
}
port {
container_port = local.application_port
}
liveness_probe {
http_get {
path = "/actuator/health/liveness"
port = local.application_port
}
initial_delay_seconds = 60
period_seconds = 30
}
readiness_probe {
http_get {
path = "/actuator/health/readiness"
port = local.application_port
}
initial_delay_seconds = 60
period_seconds = 10
}
}
}
}
}
}
What could be happening? This behaviour started a week ago; before that it was working correctly, and we didn't change anything in the deployment configuration. The image is built with gradle bootBuildImage and therefore with buildpacks.
Any information will be appreciated.
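Since the 404 page shown above is Tomcat's default error page rather than a Spring JSON error, one possibility worth ruling out is that a profile-specific configuration (activated via spring.profiles.active in the deployment, but not under kubectl run) moves or disables the health endpoints. The property names below are standard Spring Boot 2.x actuator settings; the values are hypothetical examples of what to look for, not a suggested configuration:

```properties
# application-<env>.properties -- any one of these would make
# GET :8080/actuator/health/liveness return 404
management.server.port=8081                      # endpoints move to another port
management.endpoints.web.base-path=/manage       # path becomes /manage/health/...
management.endpoints.web.exposure.exclude=health
management.endpoint.health.probes.enabled=false  # liveness/readiness groups absent
```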
I have an iot edge module in java developed with VS code. Up to now everything was working (meaning: I can send messages to the iot hub without problems). Now I want to add the option to receive messages from the iot hub. As this is not directly implemented, I am subscribing for direct method calls. This is seemingly not working correctly in my case.
But as soon as I remove the subscription to the method, everything works fine. So the act of subscribing seems to break something.
When I add the code to subscribe for direct method calls, the module can still send messages to the IoT hub, but the connection gets lost and re-established at 5-second intervals. After some time (about 5 minutes) I see a message from the IotHubEventCallback (ERROR):
230162 [MQTT Rec: myEdgeDevice/JavaModule] WARN HbIIoTGateway - Connection Status: Retrying
230995 [MQTT Rec: myEdgeDevice/JavaModule] WARN HbIIoTGateway - Connection Status: Connected
240196 [MQTT Rec: myEdgeDevice/JavaModule] WARN HbIIoTGateway - Connection Status: Retrying
240994 [MQTT Rec: myEdgeDevice/JavaModule] WARN HbIIoTGateway - Connection Status: Connected
250228 [MQTT Rec: myEdgeDevice/JavaModule] WARN HbIIoTGateway - Connection Status: Retrying
250229 [azure-iot-sdk-IotHubSendTask] INFO HbIIoTGateway - Direct method # IoT Hub responded to device method acknowledgement with status: ERROR
251005 [MQTT Rec: myEdgeDevice/JavaModule] WARN HbIIoTGateway - Connection Status: Connected
After that, the connection remains stable.
But I cannot invoke direct method calls (e.g. from the Azure portal); they time out:
Failed to invoke device method: {"message":"GatewayTimeout:{\r\n \"Message\": \"{\\"errorCode\\":504101,\\"trackingId\\":\\"3558e6feadd54b5c9f248bdbf20bd5e0-G:19-TimeStamp:04/26/2019 05:20:00\\",\\"message\\":\\"Timed out waiting for the response from device.\\",\\"info\\":{\\"timeout\\":\\"00:00:10\\"},\\"timestampUtc\\":\\"2019-04-26T05:20:00.6977425Z\\"}\",\r\n \"ExceptionMessage\": \"\"\r\n}"}
The status of the iotedge service seems unproblematic:
Apr 26 05:21:09 peers_docker_dev iotedged[52836]: 2019-04-26T05:21:09Z [INFO] - Checking edge runtime status
Apr 26 05:21:09 peers_docker_dev iotedged[52836]: 2019-04-26T05:21:09Z [INFO] - Edge runtime is running.
Apr 26 05:21:10 peers_docker_dev iotedged[52836]: 2019-04-26T05:21:10Z [INFO] - [mgmt] - - - [2019-04-26 05:21:10.206275755 UTC] "GET /modules?api-version=2018-06-28 HTTP/1.1" 200 OK 1483 "-" "-" pid(52
This is the code in question:
m_client = new ModuleClient(sConnString, IotHubClientProtocol.MQTT);
// Set the callback for messages
m_client.setMessageCallback(INPUT_NAME, new MessageCallback()
{
public IotHubMessageResult execute(Message msg, Object context)
{
App.m_logger.info("Received message from hub: " + new String(msg.getBytes(), Message.DEFAULT_IOTHUB_MESSAGE_CHARSET));
return IotHubMessageResult.COMPLETE;
}
}, m_client);
// Register the callback for connection state change
m_client.registerConnectionStatusChangeCallback(new IotHubConnectionStatusChangeCallback ()
{
public void execute(IotHubConnectionStatus status, IotHubConnectionStatusChangeReason statusChangeReason, Throwable throwable, Object callbackContext)
{
String statusStr = "Connection Status: %s";
switch (status)
{
case CONNECTED:
App.m_logger.warn(String.format(statusStr, "Connected"));
break;
case DISCONNECTED:
App.m_logger.error(String.format(statusStr, "Disconnected"));
if (throwable != null)
{
throwable.printStackTrace();
}
break;
case DISCONNECTED_RETRYING:
App.m_logger.warn(String.format(statusStr, "Retrying"));
break;
default:
break;
}
}
}, null);
// Open client
m_client.open();
// Register to receive direct method calls.
m_client.subscribeToMethod(new DeviceMethodCallback() {
@Override
public DeviceMethodData call(String methodName, Object methodData, Object context) {
App.m_logger.info("Method called:" + methodName);
return new DeviceMethodData(METHOD_SUCCESS, "Executed direct method " + methodName);
}
}, null, new IotHubEventCallback()
{
public void execute(IotHubStatusCode status, Object context)
{
App.m_logger.info("Direct method # IoT Hub responded to device method acknowledgement with status: " + status.name());
}
}, null);
The actual output is shown above. I would like to have a stable connection and receive method calls.
Any ideas what I might be doing wrong?
Thanks for your help!
Just to close this problem: I could make it work by adding a new device, and inside it a new module, in the Azure portal. With those connection strings it works.
I don't know what I did wrong with the first device/module. If you have any ideas, please let me know.
Thanks and best regards,
I'm writing a Java class to access Solr (with SolrJ) in a Kerberized Cloudera virtual machine with a static IP address (I'm using VMware) from a Windows machine. The problem is that Kerberos returns the following error: Server not found in Kerberos database (7) - UNKNOWN_SERVER.
This is the complete error:
KRBError:
cTime is Sun Mar 06 03:49:00 CET 1994 762922140000
sTime is Thu Dec 29 16:11:14 CET 2016 1483024274000
suSec is 413432
error code is 7
error Message is Server not found in Kerberos database
cname is cloudera@CLOUDERA
sname is HTTP/192.168.59.200@CLOUDERA
msgType is 30
The problem is that Kerberos uses the IP address of the virtual machine (on which Kerberos is installed) instead of the FQDN (quickstart.cloudera). In fact, only the HTTP/quickstart.cloudera@CLOUDERA principal exists in Kerberos.
I also tried renaming the service principal from HTTP/quickstart.cloudera@CLOUDERA to HTTP/192.168.59.200@CLOUDERA and it worked, but it broke all of Cloudera's internal services that use the original HTTP principal.
In the windows hosts file I put: 192.168.59.200 quickstart.cloudera
This is my krb5.conf:
[libdefaults]
default_realm = CLOUDERA
rdns = true
dns_lookup_kdc = true
dns_lookup_realm = true
dns_canonicalize_hostname = true
ignore_acceptor_hostname = true
ticket_lifetime = 86400
renew_lifetime = 604800
forwardable = true
default_tgs_enctypes = rc4-hmac
default_tkt_enctypes = rc4-hmac
permitted_enctypes = rc4-hmac
udp_preference_limit = 1
kdc_timeout = 3000
[realms]
CLOUDERA = {
kdc = quickstart.cloudera
admin_server = quickstart.cloudera
default_domain = quickstart.cloudera
}
[domain_realm]
.cloudera = CLOUDERA
quickstart.cloudera = CLOUDERA
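Given that the sname in the error is built from the IP rather than the FQDN, one thing worth trying in the [libdefaults] section above is disabling reverse-DNS canonicalization, so the hostname as resolved via the hosts file is used verbatim when constructing the HTTP/ service principal. This is a sketch of only the relevant MIT krb5 settings, not a complete krb5.conf:

```ini
[libdefaults]
    default_realm = CLOUDERA
    # Use the name as given instead of the PTR-record result when
    # building service principals like HTTP/<host>@CLOUDERA
    rdns = false
    dns_canonicalize_hostname = false
```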
This is my jaas.conf:
com.sun.security.jgss.initiate {
com.sun.security.auth.module.Krb5LoginModule required
useKeyTab=true
keyTab="C:/Binaries/Kerberos/cloudera.keytab"
doNotPrompt=true
useTicketCache=false
storeKey=true
debug=true
principal="cloudera@CLOUDERA";
};
And this is my java test code:
@Test
public void testSecureSolr() {
try {
System.setProperty("sun.security.krb5.debug", "true");
System.setProperty("java.security.krb5.conf","C:\\Binaries\\Kerberos\\krb5.conf");
System.setProperty("java.security.auth.login.config","C:\\Binaries\\Kerberos\\jaas.conf");
LOG.info("-------------------------------------------------");
LOG.info("------------------- TESTS SOLR ------------------");
LOG.info("-------------------------------------------------");
HttpClientUtil.setConfigurer(new Krb5HttpClientConfigurer());
SolrServer solrServer = new HttpSolrServer(CLUSTER_URI_SOLR);
SolrPingResponse pingResponse = solrServer.ping();
LOG.info("Solr Ping Status: "+ pingResponse.getStatus());
LOG.info("Solr Ping Time: "+ pingResponse.getQTime());
} catch (SolrServerException | IOException e) {
e.printStackTrace();
}
}
Any suggestion? Thanks.
While updating (upserting) a document we have been getting a timeout exception.
Couchbase Server version: 3.0.3 Enterprise Edition
Couchbase Java client: 2.1.2
Configuration:
connectTimeout: 10000
viewTimeout: 75000
queryTimeout: 75000
Setting:
// Create couchbase cluster client
CouchbaseEnvironment couchEnv = DefaultCouchbaseEnvironment.builder()
.connectTimeout(configuration.getCouchbase().getConnectTimeout()) //10000ms = 10s, default is 5s
.viewTimeout(configuration.getCouchbase().getViewTimeout())
.queryTimeout(configuration.getCouchbase().getQueryTimeout())
.autoreleaseAfter(5000)
.build();
cluster = CouchbaseCluster.create(couchEnv,configuration.getCouchbase().getHosts());
bucket = cluster.openBucket(configuration.getCouchbase().getBucket(), configuration.getCouchbase().getPassword());
Code:
public UserDocument updateUserDocument(UserDocument userDocument)
throws Exception {
userDocument.setLastUpdatedTime(Calendar.getInstance().getTime());
JsonObject userDocObject = JsonObject.fromJson(gson
.toJson(userDocument));
JsonDocument userDocumentJson = JsonDocument.create(
String.valueOf(userDocument.getUserId()), userDocObject);
//Getting Timeout Exception here
JsonDocument responseDoc = bucket.upsert(userDocumentJson);
// update device mappings in redis
if (userDocument.getUserDevices() != null && userDocument.getUserDevices().size() > 0) {
for (UserDevice userDevice : userDocument.getUserDevices())
{
redisClientService.putDeviceMappingInCache( userDevice.getDeviceId(), userDocument.getPartnerId(), userDocument);
}
}
return gson.fromJson(responseDoc.content() != null ? responseDoc.content().toString() : null, UserDocument.class);
}
Error:
ERROR [2015-07-28 12:16:59,120] com.personagraph.dropwizard.resource.UserManagementResource: Internal Error in gettting user details
! java.util.concurrent.TimeoutException: null
! Causing: java.lang.RuntimeException: java.util.concurrent.TimeoutException
! at com.couchbase.client.java.util.Blocking.blockForSingle(Blocking.java:93) ~[pgweb-usermanagement-0.0.1-SNAPSHOT.jar:0.0.1-SNAPSHOT]
! at com.couchbase.client.java.view.DefaultViewRow.document(DefaultViewRow.java:44) ~[pgweb-usermanagement-0.0.1-SNAPSHOT.jar:0.0.1-SNAPSHOT]
! at com.couchbase.client.java.view.DefaultViewRow.document(DefaultViewRow.java:39) ~[pgweb-usermanagement-0.0.1-SNAPSHOT.jar:0.0.1-SNAPSHOT]
Can anyone please let me know if I need any setting or code changes?
I am running a local Yarn Cluster with 8 vCores and 8Gb total memory.
The workflow is as such:
YarnClient submits an app request that starts the AppMaster in a container.
The AppMaster starts, creates amRMClient and nmClient, registers itself with the RM, and then creates 4 container requests for worker threads via amRMClient.addContainerRequest.
Even though there are enough resources available, containers are never allocated (the callback's onContainersAllocated is never called). I tried inspecting the NodeManager's and ResourceManager's logs and I don't see any line related to the container requests. I followed the Apache docs closely and can't understand what I'm doing wrong.
For reference here is the AppMaster code:
@Override
public void run() {
Map<String, String> envs = System.getenv();
String containerIdString = envs.get(ApplicationConstants.Environment.CONTAINER_ID.toString());
if (containerIdString == null) {
// container id should always be set in the env by the framework
throw new IllegalArgumentException("ContainerId not set in the environment");
}
ContainerId containerId = ConverterUtils.toContainerId(containerIdString);
ApplicationAttemptId appAttemptID = containerId.getApplicationAttemptId();
LOG.info("Starting AppMaster Client...");
YarnAMRMCallbackHandler amHandler = new YarnAMRMCallbackHandler(allocatedYarnContainers);
// TODO: get heartbeat interval from config instead of the hard-coded 1000 ms
amClient = AMRMClientAsync.createAMRMClientAsync(1000, this);
amClient.init(config);
amClient.start();
LOG.info("Starting AppMaster Client OK");
//YarnNMCallbackHandler nmHandler = new YarnNMCallbackHandler();
containerManager = NMClient.createNMClient();
containerManager.init(config);
containerManager.start();
// Get port, url information. TODO: get tracking url
String appMasterHostname = NetUtils.getHostname();
String appMasterTrackingUrl = "/progress";
// Register self with ResourceManager. This will start heart-beating to the RM
RegisterApplicationMasterResponse response = null;
LOG.info("Register AppMaster on: " + appMasterHostname + "...");
try {
response = amClient.registerApplicationMaster(appMasterHostname, 0, appMasterTrackingUrl);
} catch (YarnException | IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
return;
}
LOG.info("Register AppMaster OK");
// Dump out information about cluster capability as seen by the resource manager
int maxMem = response.getMaximumResourceCapability().getMemory();
LOG.info("Max mem capabililty of resources in this cluster " + maxMem);
int maxVCores = response.getMaximumResourceCapability().getVirtualCores();
LOG.info("Max vcores capabililty of resources in this cluster " + maxVCores);
containerMemory = Integer.parseInt(config.get(YarnConfig.YARN_CONTAINER_MEMORY_MB));
containerCores = Integer.parseInt(config.get(YarnConfig.YARN_CONTAINER_CPU_CORES));
// A resource ask cannot exceed the max.
if (containerMemory > maxMem) {
LOG.info("Container memory specified above max threshold of cluster."
+ " Using max value." + ", specified=" + containerMemory + ", max="
+ maxMem);
containerMemory = maxMem;
}
if (containerCores > maxVCores) {
LOG.info("Container virtual cores specified above max threshold of cluster."
+ " Using max value." + ", specified=" + containerCores + ", max=" + maxVCores);
containerCores = maxVCores;
}
List<Container> previousAMRunningContainers = response.getContainersFromPreviousAttempts();
LOG.info("Received " + previousAMRunningContainers.size()
+ " previous AM's running containers on AM registration.");
for (int i = 0; i < 4; ++i) {
ContainerRequest containerAsk = setupContainerAskForRM();
amClient.addContainerRequest(containerAsk); // NOTHING HAPPENS HERE...
LOG.info("Available resources: " + amClient.getAvailableResources().toString());
}
while(completedYarnContainers != 4) {
try {
Thread.sleep(1000);
} catch (InterruptedException e) {
e.printStackTrace();
}
}
LOG.info("Done with allocation!");
}
@Override
public void onContainersAllocated(List<Container> containers) {
LOG.info("Got response from RM for container ask, allocatedCnt=" + containers.size());
for (Container container : containers) {
LOG.info("Allocated yarn container with id: {}" + container.getId());
allocatedYarnContainers.push(container);
// TODO: Launch the container in a thread
}
}
@Override
public void onError(Throwable error) {
LOG.error(error.getMessage());
}
@Override
public float getProgress() {
return (float) completedYarnContainers / allocatedYarnContainers.size();
}
Here is output from jps:
14594 NameNode
15269 DataNode
17975 Jps
14666 ResourceManager
14702 NodeManager
And here is AppMaster log for initialization and 4 container requests:
23:47:09 YarnAppMaster - Starting AppMaster Client OK
23:47:09 YarnAppMaster - Register AppMaster on: andrei-mbp.local/192.168.1.4...
23:47:09 YarnAppMaster - Register AppMaster OK
23:47:09 YarnAppMaster - Max mem capabililty of resources in this cluster 2048
23:47:09 YarnAppMaster - Max vcores capabililty of resources in this cluster 2
23:47:09 YarnAppMaster - Received 0 previous AM's running containers on AM registration.
23:47:11 YarnAppMaster - Requested container ask: Capability[<memory:512, vCores:1>]Priority[0]
23:47:11 YarnAppMaster - Available resources: <memory:7680, vCores:0>
23:47:11 YarnAppMaster - Requested container ask: Capability[<memory:512, vCores:1>]Priority[0]
23:47:11 YarnAppMaster - Available resources: <memory:7680, vCores:0>
23:47:11 YarnAppMaster - Requested container ask: Capability[<memory:512, vCores:1>]Priority[0]
23:47:11 YarnAppMaster - Available resources: <memory:7680, vCores:0>
23:47:11 YarnAppMaster - Requested container ask: Capability[<memory:512, vCores:1>]Priority[0]
23:47:11 YarnAppMaster - Available resources: <memory:7680, vCores:0>
23:47:11 YarnAppMaster - Progress indicator should not be negative
Thanks in advance.
I suspect the problem comes exactly from the negative progress:
23:47:11 YarnAppMaster - Progress indicator should not be negative
Note that, since you are using the AMRMAsyncClient, requests are not made immediately when you call addContainerRequest. There is actually a heartbeat function which runs periodically, and it is in this function that allocate is called and the pending requests are made. The progress value used by this function initially starts at 0 but is updated with the value returned by your handler once a response to the acquire is obtained.
The first acquire is supposedly done right after the register, so the getProgress function is called then and updates the existing progress. As it is, your progress will be updated to NaN because, at this time, allocatedYarnContainers is empty and completedYarnContainers is 0, so your returned progress is the result of 0/0, which is undefined. When the next allocate checks your progress value it fails, because every comparison against NaN returns false, and so no further allocate call actually communicates with the ResourceManager: it quits right at that first step with an exception.
Try changing your progress function to the following:
@Override
public float getProgress() {
    return (float) allocatedYarnContainers.size() / 4.0f;
}
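The NaN arithmetic described above is easy to reproduce in isolation. This minimal sketch (plain Java, no YARN dependencies) shows why a 0/0 progress value fails every range check:

```java
public class ProgressNaNDemo {
    public static void main(String[] args) {
        int completed = 0;  // completedYarnContainers at startup
        int allocated = 0;  // allocatedYarnContainers.size() at startup
        float progress = (float) completed / allocated;  // 0.0f / 0.0f

        System.out.println(Float.isNaN(progress));  // true: 0/0 in float arithmetic is NaN
        // Every comparison against NaN is false, so a sanity check like
        // "progress must be within [0, 1]" rejects it as out of range:
        System.out.println(progress >= 0.0f);       // false
        System.out.println(progress <= 1.0f);       // false
    }
}
```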
(Note: copied to Stack Overflow for posterity from here.)
Thanks to Alexandre Fonseca for pointing out that getProgress() returns NaN (division by zero) when it is called before the first allocation, which makes the ResourceManager communication quit immediately with an exception.
Read more about it here.
I am working on a Java project in Eclipse. I have a staging server and a live server. Those two also have their own mongodbs, which run on a different server on two different ports (29017 and 27017).
Via a JUnit test I want to copy data from the live mongo to the devel mongo.
The weirdest thing: sometimes it works, and sometimes I get a socket error. I wonder why mongo sometimes completely refuses to write inserts and on other days works flawlessly. Here is an excerpt of the mongo log file (the one where the data gets inserted) and the JUnit test script:
mongo log:
Thu Mar 14 21:01:04 [initandlisten] connection accepted from xx.xxx.xxx.183:60848 #1 (1 connection now open)
Thu Mar 14 21:01:04 [conn1] run command admin.$cmd { isMaster: 1 }
Thu Mar 14 21:01:04 [conn1] command admin.$cmd command: { isMaster: 1 } ntoreturn:1 keyUpdates:0 reslen:90 0ms
Thu Mar 14 21:01:04 [conn1] opening db: repgain
Thu Mar 14 21:01:04 [conn1] query repgain.editorconfigs query: { $and: [ { customer: "nokia" }, { category: "restaurant" } ] } ntoreturn:0 keyUpdates:0 locks(micros) W:5302 r:176 nreturned:0 reslen:20 0ms
Thu Mar 14 21:01:04 [conn1] Socket recv() errno:104 Connection reset by peer xx.xxx.xxx.183:60848
Thu Mar 14 21:01:04 [conn1] SocketException: remote: xx.xxx.xxx.183:60848 error: 9001 socket exception [1] server [xx.xxx.xxx.183:60848]
Thu Mar 14 21:01:04 [conn1] end connection xx.xxx.xxx.183:60848 (0 connections now open)
junit test script:
public class CopyEditorConfig {
protected final Log logger = LogFactory.getLog(getClass());
private static final String CUSTOMER = "customerx";
private static final String CATEGORY = "categoryx";
@Test
public void test() {
try {
ObjectMapper om = new ObjectMapper();
// script copies the config from m2 to m1.
Mongo m1 = new Mongo("xxx.xxx.com", 29017); // devel
Mongo m2 = new Mongo("yyy.yyy.com", 27017); // live
Assert.assertNotNull(m1);
Assert.assertNotNull(m2);
logger.info("try to connect to db \"dbname\"");
DB db2 = m2.getDB("dbname");
logger.info("get collection \"config\"");
DBCollection c2 = db2.getCollection("config");
JacksonDBCollection<EditorTabConfig, ObjectId> ec2 = JacksonDBCollection.wrap(c2, EditorTabConfig.class, ObjectId.class);
logger.info("find entry with customer {" + CUSTOMER + "} and category {" + CATEGORY + "}");
EditorTabConfig config2 = ec2.findOne(DBQuery.and(DBQuery.is("customer", CUSTOMER), DBQuery.is("category", CATEGORY)));
// config
if (config2 == null) {
logger.info("no customer found to copy.");
} else {
logger.info("Found config with id: {" + config2.objectId + "}");
config2.objectId = null;
logger.info("copy config");
boolean found = false;
DB db1 = m1.getDB("dbname");
DBCollection c1 = db1.getCollection("config");
JacksonDBCollection<EditorTabConfig, ObjectId> ec1 = JacksonDBCollection.wrap(c1, EditorTabConfig.class, ObjectId.class);
EditorTabConfig config1 = ec1.findOne(DBQuery.and(DBQuery.is("customer", CUSTOMER), DBQuery.is("category", CATEGORY)));
if (config1 != null) {
found = true;
}
if (found == false) {
WriteResult<EditorTabConfig, ObjectId> result = ec1.insert(config2);
ObjectId id = result.getSavedId();
logger.info("INSERT config with id: " + id);
} else {
logger.info("UPDATE config with id: " + config1.objectId);
ec1.updateById(config1.objectId, config2);
}
StringWriter sw = new StringWriter();
om.writeValue(sw, config2);
logger.info(sw);
}
} catch (Exception e) {
logger.error("exception occured: ", e);
}
}
}
Running this script looks like a success when I read the log in Eclipse. I get an id for both c1 and c2, and the data is also there. The log even states that it didn't find the config on devel and inserts it. The same holds if I put it there manually; it gets "updated" then. But the mongo log stays the same.
The socket exception occurs, and the data is never written to the db.
I am out of good ideas for debugging this. If you could, I'd be glad to get some tips on how to go on from here. Also, if any information is missing, please tell me; I'd be glad to share it.
Regards,
Alex
It seems you have a connection issue with the mongo server. The following may help you diagnose it:
Try to get more information from log files:
$less /var/log/mongo/mongod.log
or the customized log file defined in mongod.conf
Try to use mongostat to monitor the server state:
$ mongostat -u ADMIN_USER -p ADMIN_PASS
Try to use the mongo CLI to check the server's running status:
$ mongo admin -u ADMIN_USER -p ADMIN_PASS
$ db.serverStatus()
More useful commands are listed at: http://docs.mongodb.org/manual/reference/method/
Sometimes the problem comes down to Linux system configuration. Tuning Linux for more connections and higher limits may help.
To check current Linux limits, run:
$ ulimit -a
The suggestions below may be helpful:
Each connection is seen by Linux as an open file. The default maximum number of open files is 1024. To increase this limit:
modify /etc/security/limits.conf:
root soft nofile 500000
root hard nofile 512000
root soft nproc 500000
root hard nproc 512000
modify /etc/sysctl.conf
fs.file-max=360000
net.ipv4.ip_local_port_range=1024 65000
Comment out the line in your mongod.conf that binds the IP to 127.0.0.1.
Usually, it is set to 127.0.0.1 by default.
On Linux, this config file is usually located at /etc/mongod.conf. Once you comment that line out, mongod will accept connections from all interfaces. This fixed it for me, as I was getting these socket exceptions as well.
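As a sketch (the exact layout depends on whether your mongod.conf uses the older INI style or the newer YAML style), the relevant line looks like this:

```conf
# /etc/mongod.conf (INI style, as used by MongoDB 2.4-era servers)
# Commenting out bind_ip makes mongod listen on all interfaces;
# make sure authentication / firewalling is in place before doing so.
# bind_ip = 127.0.0.1
port = 27017
```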