NoHostAvailableException in Cassandra though host is online - java

I am using a DataStax Cassandra client version 2.1.1 and I connect to 10 different clusters. I use one session per cluster and we are doing inserts to different clusters in our server.
We have prepared insert statements per host, and when we need to insert into a particular cluster, we get that session's connection and execute the insert.
When we ran a load test, we noted two things:
1) Inserting into a single cluster (X) for a long time (bursts of calls, etc.) - no issues found.
2) Sending burst calls to two clusters (X, Y) - then most of the records inserted into the second cluster (Y) fail.
Any reason for this?
Thanks,
Gopi

I found the reason the driver was misbehaving. The actual problem was with the data model used. My data model had a map (collection) datatype, and during high load there were timeouts. When I changed the datatype from map to text, and added COMPACT STORAGE when I created the tables, things worked fine.
Yes, it is weird, but it worked. An explanation of why this works would really help.
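For reference, the change was along these lines; the actual schema isn't shown in the question, so the table and column names here are invented:

```sql
-- Hypothetical reconstruction of the fix: the map column is replaced by a
-- serialized text column, and the table is created WITH COMPACT STORAGE.
CREATE TABLE events_v2 (
    id uuid PRIMARY KEY,
    payload text          -- was: payload map<text, text>
) WITH COMPACT STORAGE;
```

Note that, as far as I know, COMPACT STORAGE tables cannot contain collection columns at all, which is consistent with having to drop the map first.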
Thanks,
Gopi

Related

ORA-02020 – Too many database links in use

I have a web-based application which uses WebLogic connection pooling while accessing an Oracle DB. There is a procedure in which I must use a database link to fetch and update a remote table. But after a few calls to the service that triggers the procedure call, I get the ORA-02020 – Too many database links in use error.
I think it is caused by the WebLogic connection pool: it does not close the session, so the db link is not closed, and the open-link count reaches the maximum after a few tries.
I've found the workaround at http://dbtricks.com/?p=198, but it has not worked for me. I must use a db link, so what should I do? Is there any solution for my case?
Thanks.
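If the diagnosis above is right (the pooled session never closes, so its links accumulate), one commonly suggested mitigation is to end the transaction and close the link explicitly at the end of each procedure call; the link name here is illustrative:

```sql
-- Run at the end of the procedure/call, after COMMIT or ROLLBACK
-- (an open transaction on the link would make the close fail):
ALTER SESSION CLOSE DATABASE LINK remote_db;
```

Raising the OPEN_LINKS initialization parameter is the other lever, but with a pool that never closes sessions, explicit closing addresses the accumulation itself.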

Microsoft SQL Server - Query with huge results, causes ASYNC_NETWORK_IO wait issue

I have a Java application requesting about 2.4 million records from a Microsoft SQL Server (Microsoft SQL Server 2008 R2 (SP3)).
The application runs fine on all hosts except one. On this host, the application is able to retrieve data on some occasions, but on others it hangs.
Monitoring the MS Sql server indicates that the SPID associated with the query is in an ASYNC_NETWORK_IO wait state.
There are a few links online that talk about it.
https://blogs.msdn.microsoft.com/joesack/2009/01/08/troubleshooting-async_network_io-networkio/
https://social.msdn.microsoft.com/Forums/sqlserver/en-US/6db233d5-8892-4f8a-88c7-b72d0fc59ca9/very-high-asyncnetworkio?forum=sqldatabaseengine
https://social.msdn.microsoft.com/Forums/sqlserver/en-US/1df2cab8-33ca-4870-9daf-ed333a64630c/network-packet-size-and-delay-by-sql-server-sending-data-to-client?forum=sqldatabaseengine
Based on the above, an ASYNC_NETWORK_IO wait generally means one of two things:
1. Application is slow to process the results
2. Network between application and DB has some issues.
For #1 above, we analyzed tcpdumps and found that, in the cases where the query goes into the ASYNC_NETWORK_IO state, the application server's TCP connection advertises a window size that oscillates between 0 and a small number, and eventually stays stuck at 0. Based on further analysis, firewall-related issues between the DB and the application have also been mostly ruled out.
So I am staring at #2, unable to understand what could possibly go wrong. All the more baffling because the same code has been running under similar data loads for more than a year now. And it also runs fine on other hosts.
The JDBC driver being used is sqljdbc4-4.0.jar.
By default it has an adaptive buffering feature, which works under the hood to reduce the application's resource usage.
We use the default fetch size of 128 (which I believe is not a good one).
So I am going to experiment with overriding the default adaptive buffering behavior, though the MS docs suggest that adaptive buffering is good for large result sets.
I will change the connection setting to use selectMethod=cursor.
And also change the fetchSize to 1024.
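For reference, that plan corresponds to a connection URL along these lines (server and database names are placeholders; selectMethod is a documented sqljdbc4 connection property):

```
jdbc:sqlserver://dbhost:1433;databaseName=AppDb;selectMethod=cursor
```

The fetch size itself is set per statement in Java via Statement.setFetchSize(1024).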
Now, if it does not work:
What aspects of the problem are worth investigating?
Assuming it is still an issue with the client, what other connection or network settings should be inspected or changed to make progress?
And if it does work consistently, what is the impact of the selectMethod=cursor connection setting:
On the application side?
On the database side?
Update: I tested the application after adding selectMethod=cursor to the connection. However, it results in the same issue as above.
Based on discussions with other administrators on the team, at this point the issue may be in the JDBC driver, or in the OS (when it tries to handle the data from the network).
After a good amount of discussion with the system admin, network admin, and database admin, it was agreed that somewhere in the OS-to-application stack, the data from the network wasn't being consumed. In the meantime, we tested a workaround where we broke the query down to return smaller result sets: 5 queries, each returning about 500k records.
When we ran these queries sequentially, we still ran into the same issue.
However, when we ran the queries in parallel, they always succeeded.
Given that this workaround always worked, we haven't pursued the root cause of the problem any further.
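The parallel workaround can be sketched with a plain ExecutorService. The fetchRange stub below stands in for one of the five JDBC range queries; the method names and the fixed 500k row count are illustrative, not from the original code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelFetch {
    // Stub standing in for one JDBC range query returning ~500k rows.
    static int fetchRange(int rangeId) {
        return 500_000; // pretend row count for this range
    }

    // Submit all range queries at once and sum their row counts.
    static long fetchAllInParallel(int ranges) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(ranges);
        List<Future<Integer>> futures = new ArrayList<>();
        for (int i = 0; i < ranges; i++) {
            final int id = i;
            futures.add(pool.submit(() -> fetchRange(id)));
        }
        long total = 0;
        for (Future<Integer> f : futures) {
            total += f.get(); // blocks until that range query finishes
        }
        pool.shutdown();
        return total;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(fetchAllInParallel(5)); // 5 ranges of 500k rows
    }
}
```

One plausible reason this helped: each parallel connection keeps its own, smaller result stream, so no single TCP window has to absorb the full 2.4 million rows.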
On another note, the hardware and software running the application were also outdated (it was running Red Hat 5), so it could well have something to do with that.

Connecting to multiple databases using same JDBC drivers performance

I have a requirement to write a Java web application that connects to Oracle 11g databases (currently it connects to 10-12 different Oracle databases) and reads some data from them (all SELECT queries).
After iterating this ArrayList, I connect to each database, fire a SELECT query (the same query for all databases), get the records, put them in one global collection list, close the connection, and continue the same process over the loop.
Currently I am using "Executors" to connect to the multiple databases:
ExecutorService executor = Executors.newFixedThreadPool(20);
One thing surprised me: the first database connection is logged immediately, but subsequent connections are logged only after 60 seconds, so I don't understand why the other connections take 60 seconds or more.
Logically, each connection should take about the same time as the first one.
Please suggest performance improvements for this application.
Opening a database connection is an expensive operation; if possible you should connect to each database once and reuse the connection for all queries made to that database (also known as connection pooling). It's not clear if this is what you're already doing.
Another thing that seems strange is that you have made the openConnection method synchronized. I see no need for doing that, and depending on how your code is structured it may mean that only one thread can be opening a connection at a given time. Since connecting takes a long time you'd want each thread to make its connection in parallel.
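The reuse idea above can be sketched as a per-URL cache; ConnectionCache is an invented name, and the string factory in main stands in for DriverManager.getConnection(url):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch: hold one connection object per database URL so each
// database is connected to once, instead of once per query.
public class ConnectionCache<C> {
    private final Map<String, C> cache = new ConcurrentHashMap<>();
    private final Function<String, C> factory;

    public ConnectionCache(Function<String, C> factory) {
        this.factory = factory;
    }

    // computeIfAbsent opens at most one connection per URL, and does not
    // serialize unrelated opens the way a synchronized openConnection would.
    public C get(String url) {
        return cache.computeIfAbsent(url, factory);
    }

    public static void main(String[] args) {
        AtomicInteger opens = new AtomicInteger();
        // Counting stand-in for a real JDBC connection factory:
        ConnectionCache<String> cache =
            new ConnectionCache<>(url -> "conn#" + opens.incrementAndGet());
        cache.get("jdbc:oracle:thin:@db1");
        cache.get("jdbc:oracle:thin:@db1"); // reused, no new connection
        cache.get("jdbc:oracle:thin:@db2");
        System.out.println(opens.get() + " connections opened");
    }
}
```

In production you would use a real pool (e.g. one pool per database) rather than a bare cache, so connections are also validated and recycled.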

Neo4j can't get all nodes after reconnecting to the database

I used EmbeddedGraphDatabase() to create a neo4j database and created some nodes and relationships. Then I shut down the database. But after I reconnected to the database using the following method:
GraphDatabaseService graphDb = new EmbeddedGraphDatabase(DB_PATH); // DB_PATH is the path of the original db
I tried to get all nodes using GlobalGraphOperations.at(graphDb).getAllNodes();
but I can't get all the nodes - that is, I can't get the nodes which were created when I first connected to the database.
Dev Environment:
Version of neo4j is 1.9M01 and the IDE is Eclipse while jdk is 1.6-win32
Anyone knows the reason?
Many thanks!!
I had the same issue, where I was not able to retrieve my nodes based on their index. I was missing the tx.success() call when I created my database. When I recreated the DB and included tx.success() in the finally{} clause, everything started working like magic! Thanks a lot cporte!
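The commit semantics this relies on can be modeled with a small stub. Tx here is an invented stand-in for org.neo4j.graphdb.Transaction in the 1.9 API, where finish() commits the work only if success() was called beforehand; without it, the writes are silently rolled back, which matches the missing-nodes symptom:

```java
public class TxDemo {
    // Stub modeling the neo4j 1.9 Transaction contract (illustration only).
    static class Tx {
        private boolean marked = false;
        private boolean committed = false;
        void success() { marked = true; }       // mark tx as OK to commit
        void finish()  { committed = marked; }  // commit if marked, else roll back
        boolean committed() { return committed; }
    }

    static boolean writeNodes(boolean callSuccess) {
        Tx tx = new Tx();
        try {
            // ... create nodes and relationships here ...
            if (callSuccess) {
                tx.success();
            }
        } finally {
            tx.finish(); // always called, but only commits after success()
        }
        return tx.committed();
    }

    public static void main(String[] args) {
        System.out.println(writeNodes(true));  // data persisted
        System.out.println(writeNodes(false)); // rolled back, nodes lost
    }
}
```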
Salini

Understanding the real reason behind a Hive failure

I'm using a JDBC driver to run "describe TABLE_NAME" on hive. It gives me the following error:
NativeException: java.sql.SQLException: Query returned non-zero code: 9, cause: FAILED:
Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
return code 1 doesn't tell me very much. How do I figure out what the underlying reason is?
It's most likely because your Hive metastore is not set up properly. Hive uses an RDBMS metastore to store metadata about its tables. This includes things like table names, schemas, partitioning/bucketing/sorting columns, table-level statistics, etc.
By default, Hive uses an embedded Derby metastore, which can only be accessed by one process at a time. If you are using that, it's possible that you have multiple Hive sessions open, which is causing this problem.
In any case, I would recommend setting up a standalone metastore for Hive. Embedded Derby was chosen because it is handy for running tests and works as an out-of-the-box metastore. However, in my opinion, it is not fit for production workloads. You can find instructions on how to configure MySQL as the Hive metastore here.
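A minimal hive-site.xml sketch for that setup, assuming MySQL (host, database name, and credentials are placeholders):

```xml
<!-- Point the metastore at MySQL instead of embedded Derby -->
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://metastore-host:3306/hive_metastore?createDatabaseIfNotExist=true</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hive</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>hivepass</value>
</property>
```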
Possibly you have another session open, since Derby allows only one session at a time.
You can check with:
ps -wwwfu <your id>
and kill the process that is holding the Hive connection.
It may be because the table with the name you've specified does not exist in the database.
Try creating the table and run the command again; it should work. :)
