I built a simple application that monitors a single table in an Oracle database. I tried to test the performance impact of an enabled subscription and was unpleasantly surprised to see roughly 2x degradation when inserting about 10,000 records, each in its own transaction.
without subscription: 10k inserts ~ 30 sec
with subscription (ROWID granularity): 10k inserts ~ 60 sec
If I set:
OracleConnection.DCN_NOTIFY_ROWIDS, "false"
OracleConnection.DCN_QUERY_CHANGE_NOTIFICATION, "false"
then the degradation vanishes entirely, but I need the details of the updates.
I removed all extra processing on the client side, so this is purely subscription overhead.
I am wondering: is it this expensive by nature, or can I tune it somehow?
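For context, the subscription in question is registered roughly like this. This is a minimal sketch only: the connection handling, table name, and query are illustrative, not the actual application code.

import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Properties;

import oracle.jdbc.OracleConnection;
import oracle.jdbc.OracleStatement;
import oracle.jdbc.dcn.DatabaseChangeEvent;
import oracle.jdbc.dcn.DatabaseChangeListener;
import oracle.jdbc.dcn.DatabaseChangeRegistration;

public class DcnSubscriber {

    // Registers a change notification on MY_TABLE with ROWID granularity.
    public static DatabaseChangeRegistration subscribe(OracleConnection conn) throws SQLException {
        Properties props = new Properties();
        // Per-row detail: the setting whose overhead is being measured above.
        props.setProperty(OracleConnection.DCN_NOTIFY_ROWIDS, "true");
        props.setProperty(OracleConnection.DCN_QUERY_CHANGE_NOTIFICATION, "false");

        DatabaseChangeRegistration dcr = conn.registerDatabaseChangeNotification(props);
        dcr.addListener(new DatabaseChangeListener() {
            @Override
            public void onDatabaseChangeNotification(DatabaseChangeEvent event) {
                // Inspect event.getTableChangeDescription() for the ROWID-level details.
            }
        });

        // Attach a query to the registration so the table is actually watched.
        try (Statement stmt = conn.createStatement()) {
            ((OracleStatement) stmt).setDatabaseChangeRegistration(dcr);
            try (ResultSet rs = stmt.executeQuery("SELECT id FROM MY_TABLE")) {
                while (rs.next()) { /* drain */ }
            }
        }
        return dcr;
    }
}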
Database change notification adds overhead at commit time, and this can't be tuned away. Note that the feature is designed for read-mostly tables that are worth caching on the client/mid-tier. One trick might be to unregister your app during batch inserts.
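If the batch window is known in advance, that trick could look roughly like the sketch below, reusing the hypothetical subscribe() helper from the question's sketch; the bulk-insert routine is a stand-in.

public class BatchWindow {

    // Sketch: drop the subscription for the bulk load, then re-subscribe afterwards.
    public static void bulkLoadWithoutNotifications(oracle.jdbc.OracleConnection conn,
            oracle.jdbc.dcn.DatabaseChangeRegistration dcr) throws java.sql.SQLException {
        conn.unregisterDatabaseChangeNotification(dcr); // commits now skip the DCN bookkeeping
        runBulkInserts(conn);                           // hypothetical batch-insert routine
        DcnSubscriber.subscribe(conn);                  // re-register (see the sketch above)
    }

    private static void runBulkInserts(oracle.jdbc.OracleConnection conn) {
        // 10k single-row transactions, omitted in this sketch.
    }
}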
I am currently working on an application that uses Hazelcast in client-server mode rather than embedded.
I have a flow where a query is executed against a distributed map.
After all the optimizations I could think of (different combinations of in-memory format, query cache, indexes, etc.), the best I could achieve was around ~10 milliseconds of latency, which I know sounds fast for a single operation.
The issue is that the current application bases some flows on microsecond-level latency.
So my question is: is that kind of optimization possible with Hazelcast's query engine, or should I focus on changing the business code instead?
I am using Hazelcast 4.2 with a map of around 14,000 items and a total memory footprint of around 10 MB, so not that big.
The testing is done on a local workstation.
So after all the debugging, it seems the query latency is capped in the milliseconds range; there doesn't seem to be a way to reach microseconds in version 4.2. When using a continuous query cache, there appears to be some unnecessary serialization carried out, which in certain cases can take 30-40 percent of the total latency, but even without that the total latency still stays in the milliseconds range.
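For reference, the kind of setup being measured looks roughly like the sketch below; the map name, the Employee class, and the queried field are illustrative assumptions, not the real domain model.

import com.hazelcast.client.HazelcastClient;
import com.hazelcast.config.IndexConfig;
import com.hazelcast.config.IndexType;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.map.IMap;
import com.hazelcast.map.QueryCache;
import com.hazelcast.query.Predicate;
import com.hazelcast.query.Predicates;

import java.io.Serializable;
import java.util.Collection;

public class QueryLatencyProbe {

    // Hypothetical domain class standing in for the real map values.
    public static class Employee implements Serializable {
        public String name;
        public String department;
    }

    public static void main(String[] args) {
        HazelcastInstance client = HazelcastClient.newHazelcastClient();
        IMap<Long, Employee> map = client.getMap("employees"); // ~14k entries in the real setup

        // Index the queried field so the predicate does not scan the whole map.
        map.addIndex(new IndexConfig(IndexType.HASH, "department"));
        Predicate<Long, Employee> byDept = Predicates.equal("department", "ENG");

        long t0 = System.nanoTime();
        Collection<Employee> hits = map.values(byDept);
        System.out.printf("values(): %d hits in %.3f ms%n", hits.size(), (System.nanoTime() - t0) / 1e6);

        // Continuous query cache keeps matching entries client-side, but in 4.2 it still
        // pays (de)serialization costs that showed up as 30-40% of the measured latency.
        QueryCache<Long, Employee> cache = map.getQueryCache("eng-cache", byDept, true);
        long t1 = System.nanoTime();
        Collection<Employee> cached = cache.values(byDept);
        System.out.printf("query cache: %d hits in %.3f ms%n", cached.size(), (System.nanoTime() - t1) / 1e6);

        client.shutdown();
    }
}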
I am running some comparison tests (Ignite vs Cassandra) to check how to improve the performance of the 'get' operation. The data is fairly straightforward: a simple Employee object (10-odd fields), stored as a BinaryObject in the cache as
IgniteCache<String, BinaryObject> empCache;
The cache is configured with:
Write Sync Mode - FULL_SYNC, Atomicity - TRANSACTIONAL, Backup - 1 & Persistence - Enabled
Cluster config:
3 servers + 1 client node.
The client has multiple threads (configurable) making concurrent get calls.
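For reference, a cache configured as described would look roughly like this in code. This is a sketch under assumptions; the real cluster is started with its own configuration that is not shown here.

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.binary.BinaryObject;
import org.apache.ignite.cache.CacheAtomicityMode;
import org.apache.ignite.cache.CacheWriteSynchronizationMode;
import org.apache.ignite.configuration.CacheConfiguration;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class EmpCacheSetup {
    public static void main(String[] args) {
        IgniteConfiguration cfg = new IgniteConfiguration();

        // Native persistence on the default data region (disabled for one of the test runs).
        DataStorageConfiguration storage = new DataStorageConfiguration();
        storage.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);
        cfg.setDataStorageConfiguration(storage);

        CacheConfiguration<String, BinaryObject> empCfg = new CacheConfiguration<>("emp");
        empCfg.setWriteSynchronizationMode(CacheWriteSynchronizationMode.FULL_SYNC);
        empCfg.setAtomicityMode(CacheAtomicityMode.TRANSACTIONAL);
        empCfg.setBackups(1);
        cfg.setCacheConfiguration(empCfg);

        try (Ignite ignite = Ignition.start(cfg)) {
            ignite.cluster().active(true); // activation is required before cache ops when persistence is on

            IgniteCache<String, BinaryObject> empCache =
                    ignite.getOrCreateCache(empCfg).withKeepBinary();
            // The client threads then issue concurrent empCache.get(key) calls.
        }
    }
}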
For about 500k requests, I am getting a throughput of about 1500/sec, even though all of the data is off-heap and the cache hit percentage is 100%. Interestingly, with Cassandra I am getting similar performance, with key cache and a limited row cache.
I left the defaults for most of the data configuration. For this test I turned persistence off; ideally, for gets it shouldn't really matter. The performance is the same.
Data Regions Configured:
[19:35:58] ^-- default [initSize=256.0 MiB, maxSize=14.1 GiB, persistence=false]
Topology snapshot [ver=4, locNode=038f99b3, servers=3, clients=1, state=ACTIVE, CPUs=40, offheap=42.0GB, heap=63.0GB]
Frankly, I was expecting Ignite gets to be pretty fast given that all the data is in cache, at least judging by this test: https://www.gridgain.com/resources/blog/apacher-ignitetm-and-apacher-cassandratm-benchmarks-power-in-memory-computing
I am planning to run one more test tomorrow with persistence disabled and a near cache (on-heap) configured, to see if that helps.
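A client-side near cache, which is what that follow-up test would exercise, could be attached roughly like this; the eviction size is an arbitrary assumption and the cache name matches the sketch above.

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.binary.BinaryObject;
import org.apache.ignite.cache.eviction.lru.LruEvictionPolicyFactory;
import org.apache.ignite.configuration.NearCacheConfiguration;

public class NearCacheSetup {
    // Attach an on-heap near cache for the "emp" cache on the client node.
    public static IgniteCache<String, BinaryObject> attach(Ignite client) {
        NearCacheConfiguration<String, BinaryObject> nearCfg = new NearCacheConfiguration<>();
        // Keep the hottest entries on the client heap; evict with LRU beyond ~20k entries.
        nearCfg.setNearEvictionPolicyFactory(new LruEvictionPolicyFactory<String, BinaryObject>(20_000));
        return client.<String, BinaryObject>getOrCreateNearCache("emp", nearCfg).withKeepBinary();
    }
}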
Let me know if you guys see any obvious configurations that should be set.
Two tables, A (25k rows) and B (2.2M rows). I use a Hibernate session to load all of that data (each row represents one object), then update just one row in A and one row in B in a single transaction. I found Hibernate's behavior strange: the commit takes about 1.5 seconds to return, yet the database's log shows the SQL update commands take only a few milliseconds. Hibernate spends most of the time before it even flushes the SQL to the database.
So I used JProfiler to find out what it is doing.
There are no clues about how the time is consumed. Since the database executes the update commands very fast, Hibernate cannot be blocked by the database; and if it were doing computation, JProfiler should record it as CPU time.
What is Hibernate doing here? Why is the commit so slow?
If you have loaded over 2 million objects that are sitting in Hibernate's first level cache you should not be surprised that things are a bit slow. The time is most likely spent going through all those objects looking for changes. If you know that you don't need an object you can evict it from the cache. That will reduce memory consumption and speed up the eventual commit. Just take care not to evict objects that are actually needed or you will create nasty bugs!
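As a sketch of that, assuming A and B are the mapped entities from the question (the ids and setters below are illustrative), loading the bulk of the data read-only and/or evicting it keeps it out of dirty checking at commit time:

import java.util.List;

import org.hibernate.Session;
import org.hibernate.Transaction;

public class BulkReadThenSmallUpdate {

    public static void run(Session session) {
        Transaction tx = session.beginTransaction();

        // Read-only entities are skipped by dirty checking at flush/commit time.
        List<B> allB = session.createQuery("from B", B.class)
                .setReadOnly(true)
                .list();

        // ... read-only processing of allB ...

        // Alternatively, detach objects that are no longer needed at all:
        for (B b : allB) {
            session.evict(b);
        }

        // Only the rows that actually change stay managed and writable.
        A a = session.get(A.class, 1L);
        a.setName("updated");
        B b = session.get(B.class, 42L);
        b.setValue(123L);

        tx.commit(); // the flush now only inspects the two modified entities
    }
}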
I have a long-running task in my App Engine application with a lot of datastore entities to process. It worked well with a small amount of data, but since yesterday I'm suddenly getting more than a million datastore entries to process per day. After running for a while (around 2 minutes), the task fails with a 202 exit code (HTTP error 500). I really cannot deal with this issue; it is pretty much undocumented. The only information I was able to find is that it probably means my app is running out of memory.
The task is simple. Each entry in the datastore contains a non-unique string identifier and a long number. The task sums the numbers and stores the identifiers into a set.
My budget is really low since my app is entirely free and has no ads, so I would like to keep the app's cost from soaring and find a cheap, simple solution to this issue.
Edit:
I read the Objectify documentation thoroughly tonight and found that the session cache (which ensures entity reference consistency) can consume a lot of memory and should be cleared regularly when performing a lot of requests (which is my case). Unfortunately, this didn't help.
It's possible to stay within the free quota but it will require a little extra work.
In your case, you should split this operation into smaller batches (e.g. process 1,000 entities per batch) and queue those smaller tasks to run sequentially during off hours. That should save you from the memory issue and allow you to scale beyond your current entity count.
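A sketch of that chaining, using a datastore cursor and the task queue API. The entity class, handler URL, and batch size are illustrative assumptions, and where the partial sums get persisted between tasks is omitted.

import com.google.appengine.api.datastore.Cursor;
import com.google.appengine.api.datastore.QueryResultIterator;
import com.google.appengine.api.taskqueue.Queue;
import com.google.appengine.api.taskqueue.QueueFactory;
import com.google.appengine.api.taskqueue.TaskOptions;
import com.googlecode.objectify.annotation.Entity;
import com.googlecode.objectify.annotation.Id;
import com.googlecode.objectify.cmd.Query;

import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import java.util.HashSet;
import java.util.Set;

import static com.googlecode.objectify.ObjectifyService.ofy;

public class SumTaskServlet extends HttpServlet {

    private static final int BATCH = 1000;

    @Entity
    public static class Entry {          // hypothetical entity: string identifier + long number
        @Id Long id;
        String identifier;
        long number;
    }

    @Override
    protected void doPost(HttpServletRequest req, HttpServletResponse resp) {
        Query<Entry> query = ofy().load().type(Entry.class).limit(BATCH);
        String cursorStr = req.getParameter("cursor");
        if (cursorStr != null) {
            query = query.startAt(Cursor.fromWebSafeString(cursorStr));
        }

        long sum = 0;
        Set<String> identifiers = new HashSet<>();
        QueryResultIterator<Entry> it = query.iterator();
        int processed = 0;
        while (it.hasNext()) {
            Entry e = it.next();
            sum += e.number;
            identifiers.add(e.identifier);
            processed++;
        }
        // ... persist the partial sum / identifiers somewhere durable (omitted) ...
        ofy().clear(); // drop the Objectify session cache for this batch

        if (processed == BATCH) { // probably more entities left: chain the next batch
            Queue queue = QueueFactory.getDefaultQueue();
            queue.add(TaskOptions.Builder.withUrl("/tasks/sum")
                    .param("cursor", it.getCursor().toWebSafeString()));
        }
    }
}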
I have been running a test for a large data migration to Dynamo that we intend to do in our prod account this summer. I ran a test to batch write about 3.2 billion documents to our Dynamo table, which has hash and range keys and two partial indexes. Each document is small, less than 1k. While we succeeded in getting the items written in about 3 days, we were disappointed with the Dynamo performance we experienced and are looking for suggestions on how we might improve things.
In order to do this migration, we are using 2 EC2 instances (c4.8xlarge). Each runs up to 10 processes of our migration program; we've split the work among the processes by some internal parameters and know that some processes will run longer than others. Each process queries our RDS database for 100,000 records. We then split these into partitions of 25 each and use a threadpool of 10 threads to call the DynamoDB Java SDK's batchSave() method. Each call to batchSave() sends only 25 documents of less than 1k each, so we expect each to make a single HTTP call out to AWS. This means that at any given time, we can have as many as 100 threads on a server each calling batchSave with 25 records. Our RDS instance handled the load of queries just fine during this time, and our 2 EC2 instances did as well; on the EC2 side, we did not max out CPU, memory, or network in/out. Our writes are not grouped by hash key, as we know that can slow down Dynamo writes; in a group of 100,000 records, they are generally split across 88,000 different hash keys. I created the Dynamo table initially with 30,000 write throughput units, but scaled up to 40,000 at one point during the test, so our understanding is that there are at least 40 partitions on the Dynamo side to handle this.
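The write path is roughly the following. This is a sketch only: the mapper entity, table name, and retry helper are illustrative stand-ins for our real code.

import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBHashKey;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBMapper;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBRangeKey;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBTable;
import com.google.common.collect.Lists;

import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class BatchWriter {

    @DynamoDBTable(tableName = "employees") // table name is illustrative
    public static class EmployeeDoc {
        private String hashKey;
        private String rangeKey;
        @DynamoDBHashKey public String getHashKey() { return hashKey; }
        public void setHashKey(String h) { this.hashKey = h; }
        @DynamoDBRangeKey public String getRangeKey() { return rangeKey; }
        public void setRangeKey(String r) { this.rangeKey = r; }
        // ... roughly 8 more small attributes ...
    }

    // Writes one 100k-record chunk pulled from RDS: 25-item batches over a 10-thread pool.
    public static void writeChunk(List<EmployeeDoc> records) {
        AmazonDynamoDB ddb = AmazonDynamoDBClientBuilder.defaultClient();
        DynamoDBMapper mapper = new DynamoDBMapper(ddb);

        ExecutorService pool = Executors.newFixedThreadPool(10);
        for (List<EmployeeDoc> batch : Lists.partition(records, 25)) {
            pool.submit(() -> {
                // One batchSave call == one BatchWriteItem request with 25 items (< 1KB each).
                List<DynamoDBMapper.FailedBatch> failed = mapper.batchSave(batch);
                if (!failed.isEmpty()) {
                    // Unprocessed write requests must be resubmitted (with backoff) until
                    // nothing is left, which stalls the whole batch on its slowest item.
                    retryFailedBatches(mapper, failed);
                }
            });
        }
        pool.shutdown();
    }

    private static void retryFailedBatches(DynamoDBMapper mapper,
            List<DynamoDBMapper.FailedBatch> failed) {
        // Retry/backoff logic omitted in this sketch.
    }
}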
We saw very variable response times in our calls to batchSave() throughout this period. For one 20-minute span while I was running 100 threads per EC2 instance, the average time was 0.636 seconds but the median was only 0.374, so we have a lot of calls taking more than a second. I'd expect much more consistency in the time it takes to make these calls from an EC2 instance to Dynamo. Our Dynamo table seems to have plenty of throughput configured, the EC2 instances are below 10% CPU, and network in and out look healthy but are nowhere near maxed out. The CloudWatch graphs in the console (which are fairly terrible...) didn't show any throttling of write requests.
After I took these sample times, some of our processes finished their work, so we were running fewer threads on our EC2 instances. When that happened, we saw dramatically improved response times in our calls to Dynamo; e.g. when we were running 40 threads instead of 100 on an EC2 instance, each making calls to batchSave, response times improved more than 5x. However, we did NOT see improved write throughput even with the better response times. It seems that no matter what we configured our write throughput to be, we never really saw actual throughput exceed 15,000.
We'd like some advice on how best to achieve better performance on a Dynamo migration like this. Our production migration this summer will be time-sensitive, of course, and by then, we'll be looking to migrate about 4 billion records. Does anyone have any advice on how we can achieve an overall higher throughput rate? If we're willing to pay for 30,000 units of write throughput for our main index during the migration, how can we actually achieve performance close to that?
One component of BatchWrite latency is the Put request that takes the longest in the Batch. Considering that you have to loop over the List of DynamoDBMapper.FailedBatch until it is empty, you might not be making progress fast enough. Consider running multiple parallel DynamoDBMapper.save() calls instead of batchSave so that you can make progress independently for each item you write.
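A sketch of that alternative, reusing the hypothetical EmployeeDoc mapper class from the earlier sketch, where every item progresses independently:

import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBMapper;

import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ParallelSaver {

    public static void saveAll(DynamoDBMapper mapper, List<BatchWriter.EmployeeDoc> docs)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(100);
        for (BatchWriter.EmployeeDoc doc : docs) {
            // Each PutItem stands alone; one slow write no longer delays its 24 batch neighbours.
            pool.submit(() -> mapper.save(doc));
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
    }
}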
Again, CloudWatch metrics are 1-minute metrics, so you may have peaks of consumption and throttling that are masked by the 1-minute window. This is compounded by the fact that the SDK, by default, retries throttled calls 10 times before exposing the ProvisionedThroughputExceededException to the client, making it difficult to pinpoint when and where the actual throttling is happening. To improve your understanding, try reducing the number of SDK retries, requesting ConsumedCapacity=TOTAL, self-throttling your writes using a Guava RateLimiter as described in the rate-limited scan blog post, and logging throttled primary keys to see if any patterns emerge.
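A sketch of the self-throttling and reduced-retry setup; the per-process rate, retry count, and data source are illustrative, the idea being that each writer process takes a share of the table's provisioned WCU.

import com.amazonaws.ClientConfiguration;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBMapper;
import com.google.common.util.concurrent.RateLimiter;

import java.util.Collections;
import java.util.List;

public class ThrottledWriter {

    public static void main(String[] args) {
        // Fail fast instead of hiding throttling behind the default 10 silent retries.
        ClientConfiguration clientCfg = new ClientConfiguration().withMaxErrorRetry(0);
        AmazonDynamoDB ddb = AmazonDynamoDBClientBuilder.standard()
                .withClientConfiguration(clientCfg)
                .build();
        DynamoDBMapper mapper = new DynamoDBMapper(ddb);

        // Self-throttle this process to its share of the provisioned WCU,
        // e.g. 30,000 WCU split across 20 writer processes => 1,500 items/sec each.
        RateLimiter limiter = RateLimiter.create(1500);
        for (BatchWriter.EmployeeDoc doc : loadNextChunk()) {
            limiter.acquire(1); // ~1 WCU per item, since each document is under 1KB
            mapper.save(doc);   // log the primary keys of any throttled writes here
        }
    }

    // Hypothetical source for the next chunk of records pulled from RDS.
    private static List<BatchWriter.EmployeeDoc> loadNextChunk() {
        return Collections.emptyList();
    }
}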
Finally, the number of partitions of a table is not only driven by the amount of read and write capacity units you provision on your table. It is also driven by the amount of data you store in your table. Generally, a partition stores up to 10GB of data and then will split. So, if you just write to your table without deleting old entries, the number of partitions in your table will grow without bound. This causes IOPS starvation - even if you provision 40000 WCU/s, if you already have 80 partitions due to the amount of data, the 40k WCU will be distributed among 80 partitions for an average of 500 WCU per partition. To control the amount of stale data in your table, you can have a rate-limited cleanup process that scans and removes old entries, or use rolling time-series tables (slides 84-95) and delete/migrate entire tables of data as they become less relevant. Rolling time-series tables are less expensive than rate-limited cleanup, as you do not consume WCU with a DeleteTable operation, while you consume at least 1 WCU for each DeleteItem call.