Stale Lucene index when using multiple machines - java

I've got a Java/Hibernate/MySQL application up and running, and it works very nicely.
Recently I've been using Lucene (Hibernate Search) to speed up the searching and avoid round trips to the database by using projection. That works great too, except that the index gets stale when the application gets used on multiple machines. Lucene does a good job of updating the local index when changes are made locally, but it can't see changes from other machines.
Currently, I am:
reindexing in full once a week
updating a "last modified" time on all records, and updating the local index at start time based on anything modified since last indexing
But this doesn't work for deletions. If something gets deleted on one machine, it still turns up in searches on other machines.
Is there a 'standard' way to deal with this? I can think of a few options, none of which excite me:
reindex in full every night (still stale during the day, though)
maintain a table of deleted records so that I can use it to update locally (rough sketch after this list)
perform a round trip to the db at startup time to find all entries in the index but not in the db
add some sort of trigger to the db to record something somewhere when something gets deleted (this would work for updates as well as deletions)
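To make the deleted-records idea concrete, this is roughly what I have in mind. It is only a sketch, assuming Hibernate Search 3.1+: DeletedRecord, Product and the field names are made up, and every machine would insert a row into the deleted-records table whenever it deletes an entity.

    // Sketch: at startup, apply deletions recorded by other machines since the
    // last local indexing run. DeletedRecord and Product are hypothetical entities.
    import java.util.Date;
    import java.util.List;

    import org.hibernate.Session;
    import org.hibernate.search.FullTextSession;
    import org.hibernate.search.Search;

    public class IndexCatchUp {

        @SuppressWarnings("unchecked")
        public void purgeDeletionsSince(Session session, Date lastIndexed) {
            FullTextSession fts = Search.getFullTextSession(session);

            List<DeletedRecord> deletions = session
                    .createQuery("from DeletedRecord d where d.deletedAt > :since")
                    .setParameter("since", lastIndexed)
                    .list();

            for (DeletedRecord d : deletions) {
                // Remove the stale document from the local Lucene index.
                fts.purge(Product.class, d.getEntityId());
            }
            fts.flushToIndexes();
        }
    }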
Hard to believe this is a new problem, but I couldn't find any convincing answers.
Any help much appreciated.

Related

Legacy purge job hangs due to multi cascade

We initially missed migrating a legacy scheduled "purge" job (Java based) to the cloud. Now that we have done so, the job always hangs because of its original design: cascade deletes (or even plain deletes) across 15 or so tables for each user identity.
The job runs fine for a few users, but because of the initial miss we have ended up with thousands of users that need purging (with associated records in multiple tables). The first run therefore takes hours and eventually hangs.
A few approaches have been tried (creating indexes, using a chunk size of 50, etc.), but none of them have worked so far.
Because the job works well for a few users (which is the likely scenario going forward), we are considering some kind of script or mechanism that deletes users in small batches (of, say, 5), iteratively, to be executed by a DBA. Once that is complete (all applicable users are purged), we would re-enable the legacy purge job with its original design, which should cope with deleting a few users going forward.
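To make the batching idea concrete, something along these lines is what we have in mind (a sketch only: table and column names are placeholders, and deleteOneUser stands in for the job's existing per-user delete logic across the ~15 tables).

    // Sketch: purge users in small batches, committing after each batch so each
    // transaction stays small. Table/column names below are placeholders.
    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.util.ArrayList;
    import java.util.List;

    public class BatchedPurge {

        private static final int BATCH_SIZE = 5;

        public void purgeAll(Connection conn) throws Exception {
            conn.setAutoCommit(false);
            List<Long> batch;
            while (!(batch = nextBatch(conn)).isEmpty()) {
                for (Long userId : batch) {
                    deleteOneUser(conn, userId); // existing per-user cascade/delete logic
                }
                conn.commit();                   // keep locks and undo per transaction small
            }
        }

        private List<Long> nextBatch(Connection conn) throws Exception {
            // Row-limiting syntax varies by database (LIMIT, FETCH FIRST, ROWNUM).
            String sql = "SELECT id FROM user_identity WHERE purge_pending = 1 "
                    + "FETCH FIRST " + BATCH_SIZE + " ROWS ONLY";
            List<Long> ids = new ArrayList<>();
            try (PreparedStatement ps = conn.prepareStatement(sql);
                 ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    ids.add(rs.getLong(1));
                }
            }
            return ids;
        }

        private void deleteOneUser(Connection conn, long userId) throws Exception {
            // placeholder: delete the user's rows from the child tables, then the user row
        }
    }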
Appreciate any suggestions/thoughts.

Java application fails to process a special file that comes in every Thursday evening, while Cassandra throws an error

All the other files are processed fine, but this one seems to be special.
The workaround is to restart both the Cassandra database and the Java application, then re-upload the file into the S3 bucket for processing. After that, the same file is processed correctly.
Right now we're restarting the Java application and the Cassandra database every Friday morning. We suspect an accumulation of something as a possible root cause, because the file is processed perfectly fine after a complete restart.
This is a screenshot of the error in Cassandra:
We're using Cassandra as a backend for Akka Persistence.
So a failure to ingest the file only happens when the cluster has been up for some time; ingestion does not fail if it's done soon after cluster start.
First, that's not an ERROR, it's an INFO. Secondly, it's telling you that you're writing into cache faster than the cache can be recycled. If you're not seeing any negative effects (data loss, stale replicas, etc), I wouldn't sweat this. Hence, the INFO and not ERROR.
If you are and you have some spare non-heap RAM on the nodes, you could try increasing file_cache_size_in_mb. It defaults to 512MB, so you could try doubling that and see if it helps.
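If you do try that, it's a one-line change in cassandra.yaml on each node (the 1024 below is just the "try doubling it" experiment, not a recommendation):

    # cassandra.yaml: chunk cache size in MB (off-heap); the default is 512
    file_cache_size_in_mb: 1024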
we're restarting the Java application and Cassandra database every Friday morning
Also, there's nothing to really gain by restarting Cassandra on a regular basis. Unless you're running it on a Windows machine (seriously hope you are not), you're really not helping anything by doing this. My team supports high write throughput nodes that run for months, and are only restarted for security patching.

Derby DB - File size leak

Every now and then one of our remotely deployed Java apps with a local Derby DB locks up waiting for a transaction to complete on a table. Every 24 hours the app syncs this Derby DB with our juggernaut Oracle server: after delivering its contents, it clears all of its tables and then completely repopulates them.
I opened the hood up and found that the .dat file for one of the tables has blown out to 100 MB. I'm not sure exactly how long it takes to get to this point, because the users generally aren't around when the thing syncs. The gradual slowdown wasn't reported until it appeared to have stopped completely. I'm also currently not sure how this happens logically, since although autocommit is off, the code seems pretty tight.
After some googling, I think what may have happened is that an old transaction was somehow forgotten and neither committed nor rolled back. Future syncs which clear the table and repopulate it then essentially grow the file by its original size every sync, since Derby is tracking its entire history back to that old point.
Is there some way to confirm this and/or am I thinking completely down the wrong track? Is it possible to list old or long running transactions? If any are found, how can I clear them and will clearing them reclaim the disk space?
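A hedged sketch of one way to check, using Derby's built-in diagnostic table for transactions and the compress procedure to reclaim space afterwards (the schema and table names below are placeholders):

    // Rough diagnostic sketch for Derby: list in-flight transactions, then
    // reclaim space once they are resolved. Schema/table names are placeholders.
    import java.sql.CallableStatement;
    import java.sql.Connection;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class DerbyDiagnostics {

        public void listTransactions(Connection conn) throws Exception {
            try (Statement st = conn.createStatement();
                 ResultSet rs = st.executeQuery(
                         "SELECT XID, STATUS, USERNAME, SQL_TEXT FROM SYSCS_DIAG.TRANSACTION_TABLE")) {
                while (rs.next()) {
                    System.out.printf("%s %s %s %s%n",
                            rs.getString("XID"), rs.getString("STATUS"),
                            rs.getString("USERNAME"), rs.getString("SQL_TEXT"));
                }
            }
        }

        public void reclaimSpace(Connection conn) throws Exception {
            // Rewrites the table's .dat file and returns unused pages to the OS.
            try (CallableStatement cs = conn.prepareCall(
                    "CALL SYSCS_UTIL.SYSCS_COMPRESS_TABLE(?, ?, ?)")) {
                cs.setString(1, "APP");        // schema (placeholder)
                cs.setString(2, "SYNC_TABLE"); // table (placeholder)
                cs.setShort(3, (short) 1);     // 1 = sequential (slower, uses less temp space)
                cs.execute();
            }
        }
    }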

Searching is not responsive during indexing with Lucene

When I re-index my application's DB data and a search is executed at the same time, the thread that runs the search sleeps until the re-indexing is done. I assume the indexing methods are thread-safe in order to prevent the data from changing while indexing. Is there any built-in way in Lucene to keep it responsive for searches (where the data is not being changed)? Or should I start thinking about something of my own? I'm running my application on a Tomcat server.
Thanks, Tomer
I assume that you are actually rebuilding the index (reindexing everything from scratch, as opposed to reindexing individual documents). While the index is being rebuilt, you cannot run queries against it, because it's not in a consistent state.
The simplest solution that is often used is to rebuild the index in the background (while still performing the queries against the old one) and then replace it with the fresh one.
If the problem you are facing is connected with frequent server crashes, it might be worthwhile to look at a more systematic approach like the one implemented, for example, in Zoie -- it records subsequent indexing requests, so it can recover from the last correct snapshot of the index.
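A minimal sketch of the rebuild-and-swap approach with the plain Lucene API (assuming Lucene 5+ style signatures; the index path and addAllDocumentsFromDatabase are placeholders for your own code):

    // Sketch: rebuild into a fresh directory while queries keep hitting the old
    // searcher, then swap atomically once the new index is complete.
    import java.nio.file.Paths;
    import java.util.concurrent.atomic.AtomicReference;

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.index.DirectoryReader;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;

    public class BackgroundRebuild {

        // search requests always read the current searcher from here
        private final AtomicReference<IndexSearcher> searcherRef = new AtomicReference<>();

        public void rebuild() throws Exception {
            Directory freshDir = FSDirectory.open(Paths.get("index-rebuild")); // placeholder path
            try (IndexWriter writer = new IndexWriter(freshDir,
                    new IndexWriterConfig(new StandardAnalyzer()))) {
                addAllDocumentsFromDatabase(writer); // placeholder: your reindexing loop
            }
            // Searches only see the new index once it is complete and consistent.
            IndexSearcher previous = searcherRef.getAndSet(
                    new IndexSearcher(DirectoryReader.open(freshDir)));
            // close previous.getIndexReader() once in-flight searches against it are done
        }

        private void addAllDocumentsFromDatabase(IndexWriter writer) {
            // read rows from the DB and call writer.addDocument(...) for each
        }
    }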

MySQL performance

I have this LAMP application with about 900k rows in MySQL and I am having some performance issues.
Background - Apart from the LAMP stack, there's also a Java process (multi-threaded) that runs in its own JVM. Together with LAMP and Java, they form the complete solution. The Java process is responsible for inserts/updates and a few selects as well. These inserts/updates are usually done in bulk/batch, anywhere between 5 and 150 rows at a time. The PHP front-end code only does SELECTs.
Issue - the PHP SELECT queries become very slow when the Java process is running. When the Java process is stopped, SELECTs perform fine; the performance difference is huge. When the Java process is running, any action performed on the PHP front-end results in 80% or more CPU usage for the mysqld process.
Any help would be appreciated.
MySQL is running with default parameters & settings.
Software stack -
Apache - 2.2.x
MySQL -5.1.37-1ubuntu5
PHP - 5.2.10
Java - 1.6.0_15
OS - Ubuntu 9.10 (karmic)
What engine are you using for MySQL? The thing to note here is if you're using MyISAM, then you're going to have locking issues due to the table locking that engine uses.
From: MySQL Table Locking
Table locking is also disadvantageous under the following scenario:
* A session issues a SELECT that takes a long time to run.
* Another session then issues an UPDATE on the same table. This session waits until the SELECT is finished.
* Another session issues another SELECT statement on the same table. Because UPDATE has higher priority than SELECT, this SELECT waits for the UPDATE to finish, after waiting for the first SELECT to finish.
I won't repeat them here, but the page has some tips on increasing concurrency on a table within MySQL. Obviously, one option would be to change to an engine like InnoDB, which has a more sophisticated row-level locking mechanism that can make a huge difference in performance for highly concurrent tables. For more info on InnoDB, go here.
Prior to changing the engine though it would probably be worth looking at the other tips like making sure your table is indexed properly, etc. as this will increase select and update performance regardless of the storage engine.
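As a quick way to see what you're dealing with, something like this sketch reports each table's engine over JDBC (the connection details and the table name in the commented-out ALTER are placeholders; converting a large table rewrites it, so test it off-peak):

    // Sketch: report the storage engine of each table, which tells you whether
    // MyISAM table locks could be in play.
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class EngineCheck {

        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:mysql://localhost/mydb", "user", "password"); // placeholders
                 Statement st = conn.createStatement()) {

                try (ResultSet rs = st.executeQuery("SHOW TABLE STATUS")) {
                    while (rs.next()) {
                        System.out.println(rs.getString("Name") + " -> " + rs.getString("Engine"));
                    }
                }

                // Converting rewrites the whole table; do it in a maintenance window.
                // st.executeUpdate("ALTER TABLE orders ENGINE=InnoDB");  // table name is a placeholder
            }
        }
    }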
Edit based on user comment:
I would say it's one possible solution based on the symptoms you've described, but it may not be the one that gets you where you want to be. It's impossible to say without more information.
You could be doing full table scans due to the lack of indexes. This could be causing I/O contention on your disk, which just further exacerbates the table locks used by MyISAM. If this is the case, then the root cause is the improper indexing, and rectifying that would be your best course of action before changing storage engines.
Also, make sure your tables are normalized. This can have profound implications for performance, especially on updates. Normalized tables can allow you to update a single row instead of hundreds or thousands in an un-normalized table, because values aren't duplicated. It can also save huge amounts of I/O on selects, as the DB can cache data blocks more efficiently. Without knowing the structure of the tables you're working with or the indexes you have present, it's difficult to provide a more detailed response.
Edit after user attempted using InnoDB:
You mentioned that your Java process is multi-threaded. Have you tried running the process with a single thread? I'm wondering whether you might be sending the same rows out to multiple threads for update, and/or whether the way you're updating across threads is causing locking issues.
Outside of that, I would check the following:
Have you checked your explain plans to verify you have reasonable costs and that the query is actually using the indexes you have?
Are your tables normalized? More specifically, are you updating 100 rows when you could update a single record if the tables were normalized?
Is it possible that you're running out of physical memory when the Java process is running and the machine is busy swapping stuff in and out?
Are you flooding your disk (a single disk?) with more IOPs than it can reasonably handle?
We'd need to know a lot more about the system to say whether that's normal or how to solve the problem.
with about 900k rows in MySQL
I would say that makes it very small - so if it's performing badly then you're going seriously wrong somewhere.
Enable the query log to see exactly what queries are running, prioritize based on the product of frequency and duration. Have a look at the explain plans, create some indexes. Think about splitting the database across multiple disks.
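For the logging step, here is a sketch of turning on the slow query log at runtime over JDBC (the connection details are placeholders, the 1-second threshold is just an example, and SET GLOBAL requires the SUPER privilege):

    // Sketch: enable the slow query log at runtime to see which statements hurt,
    // then EXPLAIN the ones that show up most often or run longest.
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class EnableSlowLog {

        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:mysql://localhost/mydb", "root", "password"); // placeholders
                 Statement st = conn.createStatement()) {
                st.execute("SET GLOBAL slow_query_log = 'ON'");
                st.execute("SET GLOBAL long_query_time = 1"); // seconds; example threshold
            }
        }
    }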
HTH
C.
