I have a 3-node Apache Cassandra cluster, where we are doing data-loading operations via Java prepared statements. While running the job we see the following error:
INSERT INTO "abc" () VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)
is not prepared on /xx.xx.xx.xx:9042, preparing before retrying executing.
Seeing this message a few times is fine, but seeing it a lot may be source of performance problems.
This query is used in Java code that calls the Talend jars, and the data-loading job is taking a long time to complete.
The error message above appears for all 3 Cassandra nodes in the cluster. Below is the environment setup:
Apache Cassandra version - 3.8.0
Talend Version - 6.4
Apache Cassandra driver - cassandra-driver-core-3.0.0-shaded.jar
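For reference, with the 3.x Java driver the statement is normally prepared once per Session and then bound per row; below is a minimal sketch of that pattern (the keyspace, table columns, and contact point are assumptions, since the original query is redacted):
import com.datastax.driver.core.BoundStatement;
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;

public class CassandraLoader {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder()
                .addContactPoint("xx.xx.xx.xx") // any of the three nodes
                .build();
        Session session = cluster.connect("my_keyspace"); // keyspace name assumed

        // Prepare once, not once per row; the driver caches the statement id and
        // logs the warning above when a node no longer knows that id and the
        // statement has to be re-prepared before the retry.
        PreparedStatement insert = session.prepare(
                "INSERT INTO abc (id, col1, col2) VALUES (?, ?, ?)"); // columns assumed

        for (int i = 0; i < 100; i++) {
            BoundStatement row = insert.bind(i, "v1", "v2");
            session.execute(row);
        }

        cluster.close();
    }
}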
We are using Solr 6.6 in production and have a massive query load. We have one master and two slaves, which replicate from the master at a polling interval of 5 minutes. A data import handler runs every hour to index data from a MySQL DB into the Solr master, and we use SolrJ to query the Solr slaves from a Java application. Twice now in production, Solr has stopped returning results for queries and started working fine again only after we ran a data import.
I am not able to find the root cause for this.
I am currently using Spring's Mongo persistence layer to query MongoDB. The collection I query contains about 4 GB of data. When I run the find code from my IDE, it retrieves the data. However, when I run the same code on my server, it freezes for about 15 to 20 minutes and eventually throws the error below. My concern is that it runs without a hitch from my IDE on my 4 GB RAM Windows PC yet fails on my 14 GB RAM server. I have looked through the Mongo log, and there is nothing there that points to the problem. I also assumed the problem might be environmental, since it works from my local IDE, but the libraries on my local PC are the same as the ones on my server. Has anyone had this kind of issue, or can anyone point me to what I'm doing wrong? Also, weirdly, the find operation works when I revert to the Mongo Java driver's find methods.
I'm using mongo-java-driver - 2.12.1
spring-data-mongodb - 1.7.0.RELEASE
See the sample find operation code and error message below.
List<HTObject> empObjects = mongoOperations.find(new Query(Criteria.where("date").gte(dateS).lte(dateE)), HTObject.class);
The exception I get is:
09:42:01.436 [main] DEBUG o.s.data.mongodb.core.MongoDbUtils - Getting Mongo Database name=[Hansard]
Exception in thread "main" org.springframework.dao.DataAccessResourceFailureException: Cursor 185020098546 not found on server 172.30.128.155:27017; nested exception is com.mongodb.MongoException$CursorNotFound: Cursor 185020098546 not found on server 172.30.128.155:27017
at org.springframework.data.mongodb.core.MongoExceptionTranslator.translateExceptionIfPossible(MongoExceptionTranslator.java:73)
at org.springframework.data.mongodb.core.MongoTemplate.potentiallyConvertRuntimeException(MongoTemplate.java:2002)
at org.springframework.data.mongodb.core.MongoTemplate.executeFindMultiInternal(MongoTemplate.java:1885)
at org.springframework.data.mongodb.core.MongoTemplate.doFind(MongoTemplate.java:1696)
at org.springframework.data.mongodb.core.MongoTemplate.doFind(MongoTemplate.java:1679)
at org.springframework.data.mongodb.core.MongoTemplate.find(MongoTemplate.java:598)
at org.springframework.data.mongodb.core.MongoTemplate.find(MongoTemplate.java:589)
at com.sa.dbObject.TestDb.main(TestDb.java:74)
Caused by: com.mongodb.MongoException$CursorNotFound: Cursor 185020098546 not found on server 172.30.128.155:27017
at com.mongodb.QueryResultIterator.throwOnQueryFailure(QueryResultIterator.java:218)
at com.mongodb.QueryResultIterator.init(QueryResultIterator.java:198)
at com.mongodb.QueryResultIterator.initFromQueryResponse(QueryResultIterator.java:176)
at com.mongodb.QueryResultIterator.getMore(QueryResultIterator.java:141)
at com.mongodb.QueryResultIterator.hasNext(QueryResultIterator.java:127)
at com.mongodb.DBCursor._hasNext(DBCursor.java:551)
at com.mongodb.DBCursor.hasNext(DBCursor.java:571)
at org.springframework.data.mongodb.core.MongoTemplate.executeFindMultiInternal(MongoTemplate.java:1871)
... 5 more
In short
The MongoDB result cursor is no longer available on the server.
Explanation
This can happen when using sharding and a connection to a mongos fails over, or if you run into timeouts (see http://docs.mongodb.org/manual/core/cursors/#closure-of-inactive-cursors).
You're performing a query that loads all objects into one list (mongoOperations.find). Depending on the result size, this may take a long time. Using an Iterator can help keep memory usage down, but even iterating over huge result sets hits limits at some point.
If you have to query very large amounts of data, you should partition the results, either with paging (which gets slower the more records you skip) or by querying splits of your range (you already have a date range, so this could work), as in the sketch below.
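As an illustration of the range-split option, here is a minimal sketch against the API used in the question; the one-day window and the wrapper method are assumptions to tune for your data volume:
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

import org.springframework.data.mongodb.core.MongoOperations;
import org.springframework.data.mongodb.core.query.Criteria;
import org.springframework.data.mongodb.core.query.Query;

public class RangeSplitFinder {

    private static final long DAY_MS = 24L * 60 * 60 * 1000;

    // Queries [dateS, dateE) one day at a time, so no single server-side cursor
    // has to stay open for the whole scan. HTObject comes from the question.
    public static List<HTObject> findInWindows(MongoOperations mongoOperations,
                                               Date dateS, Date dateE) {
        List<HTObject> results = new ArrayList<HTObject>();
        for (long start = dateS.getTime(); start < dateE.getTime(); start += DAY_MS) {
            Date from = new Date(start);
            Date to = new Date(Math.min(start + DAY_MS, dateE.getTime()));
            Query q = new Query(Criteria.where("date").gte(from).lt(to));
            results.addAll(mongoOperations.find(q, HTObject.class));
        }
        return results;
    }
}
Each window finishes (and its cursor closes) before the next one opens, so a slow scan can no longer outlive the server's idle-cursor timeout.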
I have a two-machine cluster. Nutch is configured on one machine; HBase and Hadoop are configured on the second. Hadoop is in fully distributed mode and HBase in pseudo-distributed mode. I have crawled about 280 GB of data, but now when I start crawling, it prints the following message and no longer crawls into the previous table:
INFO mapreduce.GoraRecordReader - gora.buffer.read.limit = 10000
INFO crawl.FetchScheduleFactory - Using FetchSchedule impl: org.apache.nutch.crawl.DefaultFetchSchedule
and the following error:
ERROR store.HBaseStore
- [Ljava.lang.StackTraceElement;@7ae0c96b
Documents are fetched, but they are not saved in HBase.
However, if I crawl data into a new table, it works well and crawls properly without any error. I don't think this is a connection problem, since it works for a new table; I suspect it is because of some property or similar.
Can anyone guide me, as I am not an expert in Apache Nutch?
Not quite my field, but this looks like thread exhaustion on the underlying machines.
I was also facing a similar problem. The actual problem was with the RegionServer (an HBase daemon): with the default settings it shuts down when there is too much data in HBase, so try restarting it (e.g. with bin/hbase-daemon.sh restart regionserver). For more information, see the RegionServer's log files.
I am experiencing a strange performance issue when accessing data in SQL Server from a Spring-based application. In my current setup, the Spring Java application runs on a separate machine, accessing data from a remote SQL Server DB. I am using NamedParameterJdbcTemplate in Spring, which I believe uses a PreparedStatement to execute the query. For some reason, some of the queries take a long time to complete (approx. 2 minutes). The Java app runs on a 64-bit machine with a 64-bit version of Java v1.6, and the database is MS SQL Server 2008 R2.
The strange thing is that if I run the same Java app from my laptop, running 32-bit Windows XP and the same version of Java v1.6, the query takes less than a second against the exact same remote DB server (in fact, I am connected through VPN).
This suggests the issue is not with the Spring framework but may be with the SQL Server JDBC driver. I am using Microsoft JDBC Driver 4.0 for SQL Server (sqljdbc.jar).
I am completely clueless as to what could possibly be wrong and am not sure where to start debugging.
I understand there isn't much information in my question, so please let me know if you need any specific details.
Thanks for any help/suggestions.
I think this may be due to the combination of your Java version and JDBC driver failing to handshake the connection with the server. See Driver.getConnection hangs using SQLServer driver and Java 1.6.0_29 and http://blogs.msdn.com/b/jdbcteam/archive/2012/01/19/patch-available-for-sql-server-and-java-6-update-30.aspx
If so, switching to 1.6.0 update 30 or higher and applying KB 2653857 ought to fix it.
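As a quick sanity check (a trivial sketch, not tied to the driver), you can print the runtime version and bitness on both machines to confirm which JVM the app actually runs under before and after patching; sun.arch.data.model is a Sun/Oracle-specific property:
public class JvmInfo {
    public static void main(String[] args) {
        // 1.6.0_29 is the release affected by the handshake bug linked above.
        System.out.println("java.version = " + System.getProperty("java.version"));
        System.out.println("bitness      = " + System.getProperty("sun.arch.data.model"));
    }
}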
I have hit a cryptic exception when running DynamoDB inserts in the cloud. Any help or clues as to how to debug such an error?
Background
The code I am running:
Successfully inserts data into DynamoDB when run from my local machines, but
fails abruptly due to authentication when running in the cloud in a MapReduce job on EMR.
It uses a URL endpoint for authentication.
I simply create credentials like so:
client = new AmazonDynamoDBClient(new BasicAWSCredentials(
        "XXXX",
        "XXXXXXXXXXX"));
client.setEndpoint("https://dynamodb.eu-west-1.amazonaws.com");
The exception I'm getting is below:
Exception in thread "main" java.lang.NoSuchFieldError: requestHandlers
at com.amazonaws.services.securitytoken.AWSSecurityTokenServiceClient.init(AWSSecurityTokenServiceClient.java:214)
at com.amazonaws.services.securitytoken.AWSSecurityTokenServiceClient.<init>(AWSSecurityTokenServiceClient.java:160)
at com.amazonaws.auth.STSSessionCredentialsProvider.<init>(STSSessionCredentialsProvider.java:73)
at com.amazonaws.auth.SessionCredentialsProviderFactory.getSessionCredentialsProvider(SessionCredentialsProviderFactory.java:96)
at com.amazonaws.services.dynamodb.AmazonDynamoDBClient.setEndpoint(AmazonDynamoDBClient.java:857)
at com.amazonaws.services.dynamodb.AmazonDynamoDBClient.init(AmazonDynamoDBClient.java:262)
at com.amazonaws.services.dynamodb.AmazonDynamoDBClient.<init>(AmazonDynamoDBClient.java:181)
at com.amazonaws.services.dynamodb.AmazonDynamoDBClient.<init>(AmazonDynamoDBClient.java:142)
The "real" answer here, is that, dynamodb clients which don't match up with the latest or current versions can exhibit odd reflection / class loading error when we attempt to use them in a modern environment.
AWS jars exist on the class path of older EMR AMI instances can conflict with proper (latest) AWS jars used by hadoop job which invokes a non-EMR service (i.e. such as dynamodb, in our case).
On my older AMI instance, I simply issued:
mv $HOME/lib/aws-java-sdk-1.1.1.jar $HOME/lib/aws-java-sdk-1.1.1.jar.old
This resolved the issue on my single-node cluster.
The root cause of this error was that I was using an older Ruby elastic-mapreduce client, which led to the creation of older AMI versions in my EMR cluster, which had obsolete aws-sdk jars on the classpath.
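If you need to confirm which jar a conflicting class is actually loaded from before moving anything, one simple debugging sketch (the class name is taken from the stack trace above) is to ask its protection domain:
public class WhichJar {
    public static void main(String[] args) {
        // Locate the jar the failing class was loaded from; if this prints an
        // old aws-java-sdk jar, the classpath conflict described above is confirmed.
        Class<?> c = com.amazonaws.services.securitytoken.AWSSecurityTokenServiceClient.class;
        System.out.println(c.getProtectionDomain().getCodeSource().getLocation());
    }
}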