I have a simple question about distributed locks in Redisson. How safe are they during a Redis failure? Here is an example (with just one Redis cluster):
Thread A tries to acquire a lock (a simple lock() operation with a 60-second timeout, for example), but there is a Redis cluster failure during that process. Will Redisson throw an exception, or will it wait the given time for availability? And if so, will it allow the thread to do some operations?
Unfortunately I have no environment in which to simulate that case, so I would appreciate any answer.
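For reference, the kind of code I mean (a minimal sketch; the cluster address and lock name are placeholders):

    import java.util.concurrent.TimeUnit;

    import org.redisson.Redisson;
    import org.redisson.api.RLock;
    import org.redisson.api.RedissonClient;
    import org.redisson.config.Config;

    public class LockDuringFailure {
        public static void main(String[] args) {
            // Cluster address is a placeholder.
            Config config = new Config();
            config.useClusterServers().addNodeAddress("redis://127.0.0.1:6379");
            RedissonClient redisson = Redisson.create(config);

            RLock lock = redisson.getLock("myLock");
            // Acquire with a 60-second lease; the question is what happens
            // here if the cluster fails mid-call.
            lock.lock(60, TimeUnit.SECONDS);
            try {
                // ... critical section ...
            } finally {
                lock.unlock();
            }
            redisson.shutdown();
        }
    }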
I am using Apache Ignite with a Java application and observing that, with increasing concurrency, response times also increase. I noticed that there is only one connection established between the Java application and the Ignite server. How can I confirm whether that is the bottleneck? Thread dumps reveal that some threads are waiting in Socket.read(). Is this related to the number of connections?
As of Ignite 2.7.6, Thin Client establishes only one connection to the server node. Yes, it can become a bottleneck when used from multiple threads.
I can recommend either having one IgniteClient instance per thread, or using some kind of a connection pool.
Also, Ignite 2.8 introduces Partition Awareness (release is planned for today), where a thin client connection is established to every specified server node, and key-based requests are dispatched to primary nodes. This may help in your case as well.
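If it helps, here is a sketch of the partition-awareness setup (assuming the 2.8 ClientConfiguration API; the addresses and cache name are placeholders):

    import org.apache.ignite.Ignition;
    import org.apache.ignite.client.ClientCache;
    import org.apache.ignite.client.IgniteClient;
    import org.apache.ignite.configuration.ClientConfiguration;

    public class ThinClientSketch {
        public static void main(String[] args) throws Exception {
            ClientConfiguration cfg = new ClientConfiguration()
                .setAddresses("server1:10800", "server2:10800")
                // 2.8+: connect to every listed node and route key-based
                // requests straight to the primary node.
                .setPartitionAwarenessEnabled(true);

            try (IgniteClient client = Ignition.startClient(cfg)) {
                ClientCache<Integer, String> cache = client.cache("myCache");
                System.out.println(cache.get(1));
            }
        }
    }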
Did you try the tools that come with the Java JDK (JVisualVM), or better yet YourKit, to identify where you're losing time?
I'm using Apache Ignite with the following setup:
2 servers form a cluster with several Ignite caches configured in REPLICATED mode. There are also 10 Java processes that connect to the Apache Ignite cluster as clients and get data from those caches.
While profiling the client JVMs with VisualVM, I see that the clients spend almost half of their time blocked at the following point:
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:178)
at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:141)
at org.apache.ignite.internal.processors.cache.GridCacheAdapter.get0(GridCacheAdapter.java:4723)
at org.apache.ignite.internal.processors.cache.GridCacheAdapter.get(GridCacheAdapter.java:4697)
at org.apache.ignite.internal.processors.cache.GridCacheAdapter.get(GridCacheAdapter.java:1415)
at org.apache.ignite.internal.processors.cache.IgniteCacheProxyImpl.get(IgniteCacheProxyImpl.java:928)
at org.apache.ignite.internal.processors.cache.GatewayProtectedCacheProxy.get(GatewayProtectedCacheProxy.java:640)
I understand that a lock might be needed to correctly process get()/put() for a given key in a given cache. But in my application I first load all the needed reference data into the Ignite cache, and after that the client JVMs only get data from the cache. Is such behavior (spending a lot of time in WAITING during cache.get()) expected? Is there any way to call cache.get() without a lock, since in my case there will be no updates to the cache after the initial load?
In general it's expected, because at the very least you need to wait for the network to deliver cache values to your client node. REPLICATED cache mode means that every key is present on every server node, but it still takes some time to pull it to a client node.
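If the per-get() round trip is what hurts, one thing worth evaluating is a near cache on the client node, so repeated reads of the same keys are served locally after the first fetch. A sketch, with placeholder names (check that your consistency requirements allow it, and note the servers must already be running):

    import org.apache.ignite.Ignite;
    import org.apache.ignite.IgniteCache;
    import org.apache.ignite.Ignition;
    import org.apache.ignite.configuration.NearCacheConfiguration;

    public class NearCacheSketch {
        public static void main(String[] args) {
            Ignition.setClientMode(true);
            try (Ignite ignite = Ignition.start()) {
                // Keep recently read entries on the client so repeated get()
                // calls for the same keys skip the network round trip.
                IgniteCache<Integer, String> cache = ignite.getOrCreateNearCache(
                    "myReplicatedCache", new NearCacheConfiguration<Integer, String>());
                String first = cache.get(42);   // remote fetch
                String again = cache.get(42);   // served from the near cache
                System.out.println(first + " " + again);
            }
        }
    }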
I am developing an HTTP application server using Netty 4 and JDBC (+ BoneCP for connection pooling).
So far, I am doing all the work (work involving database connections, HttpAsyncClient, and so on) in one handler. I close all I/O after each job is finished.
As far as I know, Netty performs well as long as nothing is blocking the worker thread.
However, I read that JDBC connections create blocking I/O.
Is there a good practice to use JDBC with Netty to improve scalability and performance?
As you may know, Netty provides EventExecutorGroup to run handlers on a separate thread pool. Blocking calls (e.g. JDBC calls) should be done on those threads instead of the event loop thread, so that the main event loop won't be blocked and stays responsive.
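For example, a sketch of that wiring (the handler, pipeline contents, and pool size are placeholders for your setup):

    import io.netty.channel.ChannelHandlerContext;
    import io.netty.channel.ChannelInitializer;
    import io.netty.channel.SimpleChannelInboundHandler;
    import io.netty.channel.socket.SocketChannel;
    import io.netty.handler.codec.http.FullHttpRequest;
    import io.netty.handler.codec.http.HttpObjectAggregator;
    import io.netty.handler.codec.http.HttpServerCodec;
    import io.netty.util.concurrent.DefaultEventExecutorGroup;
    import io.netty.util.concurrent.EventExecutorGroup;

    public class ServerInitializer extends ChannelInitializer<SocketChannel> {

        // A separate thread pool for blocking work; size it to your JDBC pool.
        private static final EventExecutorGroup BLOCKING_GROUP =
                new DefaultEventExecutorGroup(16);

        @Override
        protected void initChannel(SocketChannel ch) {
            ch.pipeline().addLast(new HttpServerCodec());
            ch.pipeline().addLast(new HttpObjectAggregator(65536));
            // This handler runs on BLOCKING_GROUP, not on the event loop.
            ch.pipeline().addLast(BLOCKING_GROUP, "jdbcHandler", new JdbcHandler());
        }

        // Placeholder handler that would do the JDBC work.
        private static final class JdbcHandler
                extends SimpleChannelInboundHandler<FullHttpRequest> {
            @Override
            protected void channelRead0(ChannelHandlerContext ctx, FullHttpRequest req) {
                // Safe to make blocking JDBC calls here.
            }
        }
    }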
Make sure you have enough database connections; obviously your workers will block waiting for a connection if your pool is exhausted. The worker will wait for a new connection (if the pool is allowed to grow), or for a connection to be returned otherwise. Beyond that, use general best practices: tune your reads with setFetchSize() and your writes by using batching. Minimize your round trips, and fetch only the data you need. Do you have specific code (or a query) that is slow?
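For the two tunings mentioned, a sketch (the table, columns, and sizes are made up):

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    public class JdbcTuning {
        // Reads: fetch rows in chunks instead of one round trip per row.
        static void readTuned(Connection conn) throws SQLException {
            try (PreparedStatement ps =
                     conn.prepareStatement("SELECT id, payload FROM events")) {
                ps.setFetchSize(500);
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        // process rs.getLong(1), rs.getString(2)
                    }
                }
            }
        }

        // Writes: batch many rows into a single round trip.
        static void writeTuned(Connection conn) throws SQLException {
            try (PreparedStatement ps = conn.prepareStatement(
                     "INSERT INTO events (id, payload) VALUES (?, ?)")) {
                for (int i = 0; i < 1000; i++) {
                    ps.setLong(1, i);
                    ps.setString(2, "row-" + i);
                    ps.addBatch();
                }
                ps.executeBatch();
            }
        }
    }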
Fundamentally, this question is about: can the same DB connection be used across multiple processes (as different map-reduce jobs really are different, independent processes)?
I know this is a somewhat trivial question, but it would be great if somebody could answer this as well: what happens if the maximum number of connections to the DB (which is preconfigured on the server hosting the DB) has been exhausted and a new process tries to get a new connection? Does it wait for some time, and if so, is there a way to set a timeout for this wait period? I am talking about a PostgreSQL DB in this particular case, and the language used for talking to the DB is Java.
To give you the context of the problem: I have multiple map-reduce jobs (about 40 reducers) running in parallel, each wanting to update a PostgreSQL DB. How do I efficiently manage these DB reads/writes from these processes? Note: the DB is hosted on a separate machine, independent of where the map-reduce jobs run.
Connection pooling is one option, but it can be very inefficient at times, especially with several reads/writes per second.
Can the same DB connection be used across multiple processes
No, not in any sane or reliable way. You could use a broker process, but then you'd be one step away from inventing a connection pool anyway.
What happens if the maximum number of connections to the DB (which is preconfigured on the server hosting the DB) has been exhausted and a new process tries to get a new connection?
The connection attempt fails with SQLSTATE 53300 too_many_connections. If it waited, the server could exhaust other limits and begin to have issues servicing existing clients.
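If you want to handle that case explicitly in Java, a sketch (the URL and retry policy are illustrative, not a recommendation):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.SQLException;

    public class ConnectWithRetry {
        public static Connection connect(String url)
                throws SQLException, InterruptedException {
            for (int attempt = 1; ; attempt++) {
                try {
                    return DriverManager.getConnection(url);
                } catch (SQLException e) {
                    // 53300 = too_many_connections; anything else is rethrown.
                    if (!"53300".equals(e.getSQLState()) || attempt == 5) {
                        throw e;
                    }
                    Thread.sleep(1000L * attempt);  // simple linear backoff
                }
            }
        }
    }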
For a problem like this you'd usually use tools like C3P0 or DBCP that do in-JVM pooling, but this won't work when you have multiple JVMs.
What you need to do is to use an external connection pool like PgBouncer or PgPool-II to maintain a set of lightweight connections from your workers. The pooler then has a smaller number of real server connections and shares those between the lightweight connections from clients.
Connection pooling is typically more efficient than not pooling, because it allows you to optimise the number of active PostgreSQL worker processes to the hardware and workload, providing admission control for work.
An alternative is to have a writer process with one or more threads (one connection per thread) that takes finished work from the reduce workers and writes to the DB, so the reduce workers can get on to their next unit of work. You'd need to have a way to tell the reduce workers to wait if the writer got too far behind. There are several Java queueing system implementations that would be suitable for this, or you could use JMS.
See IPC Suggestion for lots of small data
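A minimal in-JVM sketch of that writer pattern; the bounded queue's put() is what makes workers wait when the writer falls behind. In a real map-reduce setup the workers are separate processes, so you'd replace the queue with JMS or another networked queue; the class and sizes here are illustrative.

    import java.sql.Connection;
    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    public class DbWriter implements Runnable {
        // Bounded capacity: producers block in put() when the writer falls behind.
        private final BlockingQueue<String> queue = new ArrayBlockingQueue<>(10_000);
        private final Connection conn;  // one connection per writer thread

        public DbWriter(Connection conn) {
            this.conn = conn;
        }

        /** Called by reduce workers; blocks when the queue is full. */
        public void submit(String row) throws InterruptedException {
            queue.put(row);
        }

        @Override
        public void run() {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    String row = queue.take();
                    // ... write the row with a PreparedStatement, batching as needed ...
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }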
It's also worth optimizing how you write to PostgreSQL as much as possible, using:
Prepared statements
A commit_delay
synchronous_commit = 'off' if you can afford to lose a few transactions if the server crashes
Batching work into bigger transactions
COPY or multi-valued INSERTs to insert blocks of data (a COPY sketch follows after the links below)
Decent hardware with a useful disk subsystem, not some Amazon EC2 instance with awful I/O or a RAID 5 box with 5400rpm disks
A proper RAID controller with battery backed write-back cache to reduce the cost of fsync(). Most important if you can't do big batches of work or use a commit delay; has less impact if your fsync rate is lower because of batching and group commit.
See:
http://www.postgresql.org/docs/current/interactive/populate.html
http://www.depesz.com/index.php/2007/07/05/how-to-insert-data-to-database-as-fast-as-possible/
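For the COPY item, a sketch using the PostgreSQL JDBC driver's copy API (the connection details and table are placeholders):

    import java.io.StringReader;
    import java.sql.Connection;
    import java.sql.DriverManager;

    import org.postgresql.PGConnection;
    import org.postgresql.copy.CopyManager;

    public class CopyExample {
        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection(
                     "jdbc:postgresql://dbhost/mydb", "user", "pass")) {
                CopyManager copy = conn.unwrap(PGConnection.class).getCopyAPI();
                // Tab-separated rows, one line per row.
                String rows = "1\tfirst row\n2\tsecond row\n";
                long inserted = copy.copyIn(
                    "COPY events (id, payload) FROM STDIN", new StringReader(rows));
                System.out.println("Inserted " + inserted + " rows");
            }
        }
    }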
We have a web application that spawns some 3-5 parallel threads every five seconds to connect to a JMS/JNDI connection pool. We wait for the first batch of parallel threads to complete before creating the next batch. During this process we generate a lot of network traffic, and the connection threads just hang. Eventually we have to ask the operations team to kill the connection threads manually to free up connections.
The questions I wanted to ask you are:
Obviously we are doing something wrong, as we are holding up connection resources.
When we wait for parallel threads to respond before sending the second batch of requests, does this design not resonate well with industry best practices?
Finally, what are the options and recommendations you have for this scenario, i.e. multiple threads connecting to a JMS/JNDI connection pool?
Thanks for your inputs.
You need to adjust your connection pool parameters. It sounds like you're using up only 3-5 connections for your service, which seems very reasonable to me. A JMS service should be able to handle thousands of connections. Either your pool's default limit is too low, or your JMS server is configured with too few allowed connections.
Are you sure that's what the other users are blocking on? It seems strange to me.
I'm almost sure that you would be all right with a single connection factory. Just make sure you clean up/close sessions properly. We use Spring's SingleConnectionFactory.
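For example, a sketch of that setup (ActiveMQ and the broker URL are placeholders for whatever provider or JNDI lookup you use):

    import javax.jms.ConnectionFactory;

    import org.apache.activemq.ActiveMQConnectionFactory;
    import org.springframework.jms.connection.SingleConnectionFactory;
    import org.springframework.jms.core.JmsTemplate;

    public class JmsSetup {
        public static JmsTemplate jmsTemplate() {
            ConnectionFactory target = new ActiveMQConnectionFactory("tcp://broker:61616");
            // All sessions share one underlying connection; sessions are still
            // created and closed per operation by JmsTemplate.
            SingleConnectionFactory connectionFactory = new SingleConnectionFactory(target);
            return new JmsTemplate(connectionFactory);
        }
    }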