I need to implement some kind of inter-process mutex in Java. I'm considering using the FileLock API, as recommended in this thread; I'd basically use a dummy file and lock it in each process.
Is this the best approach? Or is something like this built into the standard API? (I can't find it.)
For more details see below:
I have written an application that reads some input files and updates some database tables according to what it finds in them (it's more complex than that, but the business logic is irrelevant here).
I need to ensure mutual exclusion between multiple database updates. I tried to implement this with LOCK TABLE, but that statement is unsupported by the engine I'm using, so I want to implement the locking in the application code instead.
I went for the FileLock API approach and implemented a simple mutex based on:
FileChannel.lock
FileLock.release
All the processes use the same dummy file for acquiring locks.
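The dummy-file approach described above can be sketched roughly like this (the class name and usage are my own; this is a minimal sketch, not a hardened implementation):

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// A simple inter-process mutex built on FileChannel.lock / FileLock.release.
// Every process opens the SAME dummy file and takes an exclusive OS-level lock on it.
public class FileMutex implements AutoCloseable {
    private final FileChannel channel;
    private FileLock lock;

    public FileMutex(Path lockFile) throws IOException {
        // CREATE so the first process to run creates the dummy file
        channel = FileChannel.open(lockFile,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE);
    }

    public void acquire() throws IOException {
        // blocks until the OS grants this process the exclusive lock
        lock = channel.lock();
    }

    public void release() throws IOException {
        if (lock != null) {
            lock.release();
            lock = null;
        }
    }

    @Override
    public void close() throws IOException {
        release();
        channel.close();
    }
}
```

Note that FileLock is held on behalf of the whole JVM, not a single thread, so within one process you still need a regular in-memory lock on top of it.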
Why bother with files if you have a database at hand? Try using database locking, e.g. locking reads (https://dev.mysql.com/doc/refman/5.0/en/innodb-locking-reads.html).
By the way, which database engine do you use? It would help if you listed it.
Related
I want to read from and write to a raw device (which is just a file in Linux) asynchronously, and I have been using java.nio.channels.AsynchronousFileChannel.
But it's only "fake asynchronous", because AsynchronousFileChannel uses a thread pool to execute the read/write tasks; it actually calls the synchronous read/write interface offered by the OS.
What I really want is a truly asynchronous implementation, such as io_submit on Linux.
But I can't find one in the JDK or in other libraries such as Guava or Apache Commons.
So my question is:
In Java, is there an existing implementation of an asynchronous file accessor based on the native io_submit interface?
If not, why does no one else seem to need it?
In Java, is there an existing implementation of an asynchronous file accessor based on the native io_submit interface?
Not in the default Java libraries at the time of writing (2019). I doubt there's much enthusiasm to implement an io_submit() Java wrapper in the default libraries because:
libaio/KAIO is quirky. Linux's KAIO is fraught with constraints: it is only really asynchronous when doing direct I/O, and even then there are elaborate rules, some beyond the caller's control, that silently turn submission synchronous when broken.
There's no guarantee that the libaio library itself would be present, so you would have to bundle it with Java or otherwise reimplement it.
If not, why does no one else seem to need it?
People who need it badly enough have written their own wrappers (e.g. see https://github.com/zrlio/jaio). However, supporting KAIO would be a Linux-only thing and thus not portable (which goes against a key Java ethos).
I want to develop a program that reads data from the database and writes it into a file.
For a better performance, I want to use multithreading.
The solution I plan to implement is based on these assumptions:
it is not necessary to use multiple threads to read from the database, because concurrency would then have to be managed by the DBMS (and similarly for writing into the file), given that each element read from the database is deleted in the same transaction.
Use the producer-consumer model: one thread reads the data (the main program) and another thread writes the data to the file.
For the implementation I will use the executor framework: a thread pool (size = 1) to represent the consumer thread.
Are these assumptions a good basis for a solution?
Does this problem require a multithreaded solution?
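The plan described above can be sketched like this (the Iterable of rows stands in for the real database cursor, and the poison-pill sentinel is my own choice for signaling end of input):

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Producer-consumer export: the main thread reads rows (producer),
// a single-threaded executor writes them to the file (consumer).
public class DbToFile {
    private static final String POISON = "__EOF__";  // sentinel to stop the consumer

    public static void export(Iterable<String> rows, Path out) throws Exception {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(1000);
        ExecutorService consumer = Executors.newSingleThreadExecutor();

        consumer.submit(() -> {
            try (BufferedWriter w = Files.newBufferedWriter(out)) {
                String line;
                while (!(line = queue.take()).equals(POISON)) {
                    w.write(line);
                    w.newLine();
                }
            } catch (IOException | InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        for (String row : rows) {   // producer: the main thread reading the DB
            queue.put(row);
        }
        queue.put(POISON);          // tell the consumer there is no more data

        consumer.shutdown();
        consumer.awaitTermination(1, TimeUnit.MINUTES);
    }
}
```

The bounded queue gives you backpressure for free: if the file writer falls behind, the database reader blocks on put() instead of filling up memory.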
it is not necessary to put multiple threads to read from the database because there is a concurrency problem to be managed by the DBMS
Ok. So you want one thread that is reading from the database.
Are these assumptions a good basis for a solution? Does this problem require a multithreaded solution?
Your solution will work, but as mentioned by others, there are questions about the performance improvement (if any). Threaded programs gain speed because they can make use of the multiple processors (or cores) on your machine. In your case, if the threads are blocked by the database or by the file system, the performance improvement may be minimal, if any. If you were doing a lot of processing of the data, then having multiple threads handle the task would work well.
This is more of a comment:
For your first assumption: You should post the db part on https://dba.stackexchange.com/ .
A simple search returned :
https://dba.stackexchange.com/questions/2918/about-single-threaded-versus-multithreaded-databases-performance - so you need to check whether your read action is complex enough and whether multithreading even serves your needs for the db connection.
Also, your program seems to do a sequential read and write. I don't think you even need multithreading unless you want multiple writes to the same file at the same time.
You should have a look at Spring Batch, http://projects.spring.io/spring-batch/, which is related to the JSR 352 spec.
This framework comes with pretty good patterns to manage ETL related operations, including multi-threaded processing, data partitioning, etc.
Here is my requirement:
A date is inserted into a db table with each record. Two weeks before that particular date, a separate record should be entered into a different table.
My initial solution was to set up a SQL scheduled job, but my client insisted on handling it through Java.
What is the best approach for this?
What are the pros and cons of using a SQL scheduled job versus Java scheduling for this task?
Ask yourself the question: to what domain does this piece of work belong? If it's required for data integrity, then it's obviously the DBMS' problem and would probably best be handled there. If it's part of the business domain rather than the data, or might require information or processing that's not available or natural to the DBMS, it's probably best made external.
I'd say, use the best tool for the job. Having stuff handled by the database using whatever features it offers is often nice. For example, a log table that keeps "snapshots" of status updates of records in another table is something I typically like to have a trigger for, taking that responsibility out of my app's hands.
But that's something that's available in practically any DBMS. There's the possibility that other databases won't offer the job scheduling capacities you require. If it's conceivable that some day you'll be switching to a different DBMS, you'll then be forced to do it in Java anyway. That's the advantage of the Java approach: you've got the functionality independently of the database. If you're using pure JDBC with standard SQL queries, you've got a fully portable solution.
Both approaches seem valid. Consider which involves the least work and worry. If it's done in Java, you'll need to make sure the process is running or scheduled; that's an external dependency. If it's in the database, you can be sure the job is done as long as the DB is up.
Well, first off, if you want to do it in Java, you can use Timer for a simple, basic repetitive job, or Quartz for more advanced scheduling.
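Nowadays ScheduledExecutorService is the usual standard-library replacement for Timer. A minimal sketch, where the actual check-and-insert logic is a placeholder Runnable for your real JDBC code:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Runs a recurring task that would look for records whose date is two
// weeks away and insert the corresponding rows in the other table.
public class ReminderScheduler {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    // `checkAndInsert` is a placeholder for the real database logic
    public void start(Runnable checkAndInsert) {
        // run once immediately, then once a day; the task itself decides
        // which records are due
        scheduler.scheduleAtFixedRate(checkAndInsert, 0, 1, TimeUnit.DAYS);
    }

    public void stop() {
        scheduler.shutdown();
    }
}
```

Quartz would buy you cron-style expressions and persistence of missed runs, which this plain-JDK version does not have.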
Personally, I also think it would be better to have the same entity (the application) deal with all related database actions. In other words, if your Java app is reading from and writing to the db, it should be consistent and also handle the scheduled reads and writes. As a plus, this way you can synchronize your actions more easily: if you want to make sure that a scheduled job is running, has started, or has finished, that's a lot easier to do when everything is in Java than when a separate process (like the SQL scheduler) is involved.
I'm using JCaptcha in a project and needed a behavior that was not directly available, so I looked into the source code to see if I could extend it to get what I want, and found that the store implementation I use (MapCaptchaStore) uses a HashMap as the store... with no synchronization.
I know JCaptcha does not work in a clustered environment (not my case), but what about multiple clients at the same time? Is the store implementation synchronized externally, or should I roll my own and make sure it is properly synchronized?
Judging by reading the source for MapCaptchaStore, this class is NOT thread-safe. I'm not 100% willing to stand behind this answer, though, because synchronisation may be happening at a higher level (e.g. all accesses to a single MapCaptchaStore instance may be synchronised on another object).
You could use another implementation of CaptchaStore, for example EhcacheCaptchaStore.
The basic HashMap implementation of the captcha store is not synchronized, which could lead to some weird behaviour.
Other stores are thread-safe; for a simple implementation, use FastHashMapCaptchaStore.
I'm assuming it is safe because it has been designed to be integrated with web applications, which will always have multiple clients. It's also a CAPTCHA framework, so they must have tested it with both human and computer clients.
However, I would still recommend testing whether it behaves correctly in a multithreaded environment.
In the doPost method of my servlet I need to access and update a file (a shared resource).
How do I cater for some 100 users using this at the same time?
Create a separate singleton class for file access.
Use a ReadWriteLock from the java.util.concurrent package to protect access to the file.
Cache the content, so that you won't have to read the file when all you need is to return its content.
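Putting the three suggestions together, a minimal sketch might look like this (class and method names are mine, and Files.readString/writeString assume Java 11+):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Singleton guarding one shared file: many concurrent readers are allowed,
// writers are exclusive; reads are served from an in-memory cache.
public final class SharedFile {
    private static final SharedFile INSTANCE = new SharedFile();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private Path path;      // set once at startup via init()
    private String cached;  // cached content; avoids re-reading on every GET

    private SharedFile() {}

    public static SharedFile getInstance() { return INSTANCE; }

    public void init(Path path) throws IOException {
        lock.writeLock().lock();
        try {
            this.path = path;
            this.cached = Files.exists(path) ? Files.readString(path) : "";
        } finally {
            lock.writeLock().unlock();
        }
    }

    // read lock: many request threads can read the cache at once
    public String read() {
        lock.readLock().lock();
        try {
            return cached;
        } finally {
            lock.readLock().unlock();
        }
    }

    // write lock: blocks readers and other writers while the file changes
    public void write(String content) throws IOException {
        lock.writeLock().lock();
        try {
            Files.writeString(path, content);
            cached = content;
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

With 100 mostly-reading users this scales well, because readers never block each other; only the occasional write takes the exclusive lock.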
Are you sure a file is how you want to handle this? Protecting access to data by multiple concurrent users is most of what a modern database does.
With a high level of concurrency (for writes), synchronizing will cost you a lot of throughput.
Databases are better suited to handle this, if using one is possible in your project.
I would take advantage of the java.util.concurrent packages added in Java 1.5, specifically a BlockingQueue to queue up the requests and a second pool of threads to handle them.
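That suggestion could be sketched as follows (the class name and the single-worker choice are mine; a single worker keeps writes ordered, but you could use a larger pool if updates are independent):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Servlet threads enqueue update requests and return immediately;
// one background worker drains the queue and applies updates to the file.
public class FileUpdateQueue {
    private final BlockingQueue<String> updates = new LinkedBlockingQueue<>();
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final Path file;

    public FileUpdateQueue(Path file) {
        this.file = file;
        worker.submit(this::drain);
    }

    // called from doPost by many request threads; never blocks on the file
    public void submit(String line) {
        updates.add(line);
    }

    private void drain() {
        try {
            while (true) {
                String line = updates.take();   // wait for the next update
                Files.writeString(file, line + System.lineSeparator(),
                        StandardOpenOption.CREATE, StandardOpenOption.APPEND);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // shutdownNow() interrupts take()
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public void shutdown() throws InterruptedException {
        worker.shutdownNow();
        worker.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

The request threads never contend on the file itself, only on the queue, which is exactly what BlockingQueue is built for.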