Is it OK to have a REST web service (Spring and Jersey) that uses a DAO with a ConcurrentHashMap to store the data, or should I avoid that and use some kind of in-memory DB?
It's a sample application, so I don't mind losing the data every time the application stops.
ConcurrentHashMap is fine if you pretty much just need to create, read, update and delete entities. I'm actually using ConcurrentHashMap in an application that runs in Jetty and emulates a system that our application integrates with.
But, as Sotirios Delimanolis and omickron mentioned, things would get hairy if you need to rely on the atomicity of transactions involving multiple database operations.
To safeguard myself from that situation, I defined interfaces for my DAOs and wrote a ConcurrentHashMap-backed implementation. If the time comes when that is no longer sufficient, I can swap that implementation out for one based on HSQLDB or SQLite.
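A minimal sketch of that arrangement (the ClientDao interface and Client entity are hypothetical names):

```java
import java.util.Collection;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// DAO contract; a DB-backed implementation can replace the map later.
public interface ClientDao {
    Client save(Client client);
    Client findById(long id);
    Collection<Client> findAll();
    void delete(long id);
}

class Client {
    private Long id;
    private String name;
    Long getId() { return id; }
    void setId(Long id) { this.id = id; }
    String getName() { return name; }
    void setName(String name) { this.name = name; }
}

class InMemoryClientDao implements ClientDao {
    private final Map<Long, Client> store = new ConcurrentHashMap<>();
    private final AtomicLong idSequence = new AtomicLong();

    @Override
    public Client save(Client client) {
        if (client.getId() == null) {
            client.setId(idSequence.incrementAndGet()); // assign a new id
        }
        store.put(client.getId(), client);
        return client;
    }

    @Override
    public Client findById(long id) { return store.get(id); }

    @Override
    public Collection<Client> findAll() { return store.values(); }

    @Override
    public void delete(long id) { store.remove(id); }
}
```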
You can use ConcurrentHashMap, but you will have some difficulties when:
trying to perform two or more actions as a single "transaction": you have to synchronize such compound actions with other threads yourself, because ConcurrentHashMap only guarantees atomicity for individual operations;
trying to search not by the map key but by some other field of the stored value object.
ConcurrentHashMap was designed for other purposes.
So I'd advise using an in-memory DB instead.
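To illustrate the first point, a hypothetical sketch: each individual map call is atomic, but the pair of calls is not, so the compound "transaction" needs its own synchronization:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class AccountStore {
    private final Map<String, Integer> balances = new ConcurrentHashMap<>();
    private final Object transferLock = new Object();

    // BROKEN: each get/put is atomic on its own, but another thread can
    // interleave between them and observe an inconsistent total.
    void transferUnsafe(String from, String to, int amount) {
        balances.put(from, balances.get(from) - amount); // assumes both accounts exist
        balances.put(to, balances.get(to) + amount);
    }

    // One way to fix it: serialize the whole compound action externally.
    void transferSafe(String from, String to, int amount) {
        synchronized (transferLock) {
            balances.put(from, balances.get(from) - amount);
            balances.put(to, balances.get(to) + amount);
        }
    }
}
```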
Related
I want to write a Java EE web application where different users work with a database. A user can start editing a record, and then either save changes or cancel editing. While the user is editing, the record should be locked for other users. It should be locked on the database level, because there are also other non-Java users editing the same database, locking the records they work on.
I understand some basic Java + databases, but I am not good at multi-user things like locking. Looking for examples on the internet, it seems to me like every "hello world" example for a Java EE technology introduces at least one other technology. To access objects in the database, I use JPA. To lock records, I probably need transactions, which brings in JTA. To work with JTA, I need JNDI. To work with all those objects, I probably also need EJB and injection... and at this moment I wonder whether this is really the simplest way to solve the problem, or whether I missed something important. I do not know whether all those technologies are necessary (if yes, I will use them; I just would like to be sure before I learn them all). I just see that the examples I found on the web introduce them very generously.
I would like a simple example of Java EE code which:
uses JPA;
connects to a database described in the "persistence.xml" file;
has a MyObject class with properties id and name, stored in the MYOBJECT table;
has a method (e.g. called from a JSP page) that locks the object with id = 42 at the database level (so that non-Java users with access to the same database also cannot modify it), or displays an error if the record is already locked by another user (either another Java user, or a non-Java user);
has another method (e.g. called from another JSP) that either updates the name to a specified value and releases the lock, or just releases the lock if an empty string is provided.
For each new technology you introduce in the solution, I would like to hear a very short explanation of why you used it. Also whether that technology requires me to install new libraries, create or modify configuration files, write additional code, etc. (The JSP files which call the methods are not necessary; I am interested in the database-related parts.)
(Another detail: a difference between EntityTransaction and UserTransaction is described here. If I understand it correctly, JTA is needed only if I use multiple databases. Is it also necessary if I use only one Oracle database with different schemas? If yes, then please write the example code using JTA.)
1) If you want to lock a record in a database, you need something called a pessimistic lock. Remember this keyword and use it for further googling. Simply said, a pessimistic lock really locks the record in the database. That means that if your Java application takes a pessimistic lock, the record is really locked: even if some other non-Java program accesses the same database, it cannot modify the record.
On the other hand, the so-called optimistic lock is mostly a pretend-lock. It is, approximately, a "we most likely don't need to lock this record anyway, so we will not really lock it, and if something bad happens, then we will try to fix the problem afterwards" approach. Which actually makes sense and increases performance, but only in situations where the assumptions behind this approach are true; where the conflicts are really rare, and where you really can fix the problem afterwards. Unless you understand it well (which you don't seem to), just don't use it.
2) JPA is a unified approach for using a database with transactions and stuff, and it also maps objects to tables for you. This is probably what you want.
JTA is the same stuff, plus a unified approach to using transactions across many databases, so it is more powerful than plain JPA; but that means it has additional functionality that you don't really need. On the other hand, for these superpowers you pay a cost, like losing the ability to start and end transactions on a whim. The server will manage the transactions for you, as the server sees fit. If you completely understand how exactly that works, then you know whether this fits your needs; but if you don't, you'd rather avoid it. Your development environment may offer you JTA as a default option, but that is only because it thinks you are going to write Skynet. By not using JTA you also avoid JNDI, EJB, and many other Skynet-related technologies.
3) After hearing this, it is time for you to do your homework, because now you have an idea of what to look for. Read the javax.persistence API documentation.
You can use annotated Java classes to represent your database tables, or you can use old-fashioned SQL queries, or both, as you wish. Either of them can lock and release records. A lock must live inside a transaction, so if you want to keep the lock, you have to keep the transaction open.
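A minimal sketch of the pessimistic-locking part, assuming a resource-local EntityManager (no JTA) and the MyObject entity from the question:

```java
import java.util.Collections;
import java.util.Map;
import javax.persistence.Entity;
import javax.persistence.EntityManager;
import javax.persistence.EntityTransaction;
import javax.persistence.Id;
import javax.persistence.LockModeType;
import javax.persistence.LockTimeoutException;
import javax.persistence.PessimisticLockException;
import javax.persistence.Table;

@Entity
@Table(name = "MYOBJECT")
class MyObject {
    @Id
    private Long id;
    private String name;

    void setName(String name) { this.name = name; }
}

public class MyObjectEditor {

    // Locks the row with the given id and updates its name in one transaction;
    // returns false if another user (Java or not) already holds the row lock.
    public boolean updateName(EntityManager em, long id, String newName) {
        EntityTransaction tx = em.getTransaction();
        tx.begin();
        try {
            // Fail fast instead of blocking on a held lock
            // (a standard JPA 2.0 hint; provider/database support varies).
            Map<String, Object> hints =
                    Collections.singletonMap("javax.persistence.lock.timeout", 0);
            // On most databases this issues SELECT ... FOR UPDATE, so the row
            // is locked at the database level until commit or rollback.
            MyObject obj = em.find(MyObject.class, id,
                                   LockModeType.PESSIMISTIC_WRITE, hints);
            obj.setName(newName);
            tx.commit(); // commit releases the database lock
            return true;
        } catch (PessimisticLockException | LockTimeoutException e) {
            tx.rollback();
            return false; // record is locked by someone else
        }
    }
}
```

Note that keeping the row locked between two separate HTTP requests would mean keeping this transaction, and its connection, open across them; that is the practical consequence of "if you want to keep the lock, you have to keep the transaction."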
We will not solve this for you; you are asking for everything. You need to code it yourself, but here is a link for JPA locking.
Hint: use @Version
Read here for information on locking for JPA
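For what it's worth, the hint refers to JPA's @Version annotation, which enables optimistic locking; a minimal sketch (note that this is not the database-level lock the question asks for, since non-JPA clients won't honour it):

```java
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Version;

@Entity
public class MyObject {

    @Id
    private Long id;

    private String name;

    // The JPA provider increments this column on every update and adds
    // "WHERE version = ?" to the UPDATE statement; if another transaction
    // changed the row in the meantime, an OptimisticLockException is thrown.
    @Version
    private Long version;
}
```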
I am developing a web application with Java EE 6. In order to minimize calls to the database, would it be a good idea to have these classes:
A data access class (DAO) exposing only basic methods: getAllClients, getAllProducts, getAllOrders, plus delete and update methods — i.e. CRUD methods.
A service class that calls those CRUD methods but adds filter methods on top of them, e.g. findClientByName, findProductByType, findProductByYear, findOrderFullyPaid/NotPaid, etc.
Thank you
In my experience (albeit limited), DAO classes tend to contain all the database operations the application is allowed to perform. So in your case, the DAO would have methods such as getAllClients() and getClientByName(String name), etc.
Getting all the users from your DAO and iterating over them until you find the one you need results in a needless waste of computation time and memory.
If you want to reduce the amount of times that your database is hit you could, maybe, implement some caching mechanism. An ORM framework such as Hibernate should be able to provide what you need as shown here.
EDIT:
As per your comment question: no, your service will not be made redundant. What one usually does is use a Service layer to expose the DAO functionality. This, basically, keeps the DAO from being visible to the front end of your application. It usually also allows for extra methods, for instance public String getUserFormatted(String userName). This would make use of the getUserByName function offered by the DAO but provide some extra functionality on top.
The Service layer will also make itself useful should there be a change in specification and you now also need a web service to interface with your application. Having a service layer in between will allow the web service to query the DAO through the Service layer.
So basically, the DAO layer will still worry about the database stuff (CRUD Operations) while the service will adapt the data returned by the DAO without exposing the DAO.
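A hypothetical sketch of that layering (all names invented for illustration):

```java
// DAO layer: only database access.
interface UserDao {
    User getUserByName(String userName);
}

class User {
    private final String firstName;
    private final String lastName;

    User(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }
    String getFirstName() { return firstName; }
    String getLastName() { return lastName; }
}

// Service layer: exposes DAO functionality plus extra behaviour,
// so the front end never touches the DAO directly.
class UserService {
    private final UserDao userDao;

    UserService(UserDao userDao) {
        this.userDao = userDao;
    }

    // Adapts the DAO's data without exposing the DAO itself.
    public String getUserFormatted(String userName) {
        User user = userDao.getUserByName(userName);
        if (user == null) {
            return "unknown user";
        }
        return user.getLastName().toUpperCase() + ", " + user.getFirstName();
    }
}
```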
It's hard to say without more information, but I think it's probably a good idea to leverage your database more than with just CRUD operations. Databases are good at searching, provided you configure them correctly, so IMHO it's a good idea to let your database handle the searching in your find methods for you. This means that your find methods would probably go in your DAOs...
It's good to think about, and be aware of, the implications of DB access on performance, but don't go overboard. Also, your approach implies that since your services are going to do the filtering, you are going to load a large amount of DB data into your application, which is a bad idea. The bottom line is that you should use your RDBMS as it is intended to be used, and worry about performance due to over-access once you can show it's a problem. I doubt you will run into that scenario.
I would say that you're better off having your DAO be more fine-grained than you've specified.
I'd suggest putting findClientByName, findProductByType, findProductByYear and findOrderFullyPaid/NotPaid on your DAO as well in some way, because your database will most likely be better at filtering and sorting data than your in-memory code.
Imagine you have 10 years of data and you call findProductsByYear on your service class, which then calls getAllProducts and throws away 9 years of data in memory. You're far better off having your database return only the year you are interested in.
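In JPA terms the difference might look like this (a sketch; the Product entity and its productionYear field are assumptions):

```java
import java.util.List;
import java.util.stream.Collectors;
import javax.persistence.Entity;
import javax.persistence.EntityManager;
import javax.persistence.Id;

@Entity
class Product {
    @Id
    Long id;
    int productionYear;
}

class ProductDao {
    private final EntityManager em;

    ProductDao(EntityManager em) { this.em = em; }

    // Wasteful: loads every row, then discards most of them in memory.
    List<Product> findProductsByYearInMemory(int year) {
        return em.createQuery("SELECT p FROM Product p", Product.class)
                 .getResultList()
                 .stream()
                 .filter(p -> p.productionYear == year)
                 .collect(Collectors.toList());
    }

    // Better: the database filters (and can use an index on the year column).
    List<Product> findProductsByYear(int year) {
        return em.createQuery(
                    "SELECT p FROM Product p WHERE p.productionYear = :year",
                    Product.class)
                 .setParameter("year", year)
                 .getResultList();
    }
}
```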
Yes, this is the right way to do it.
The service will own the transactions. You should write these as POJOs; that way you can expose them as SOAP or REST web services, EJBs, or anything else you want later on.
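As a sketch of what that can look like, assuming Spring's declarative transactions (the DAO and entity names are invented; in plain Java EE, an EJB or UserTransaction would play the same role):

```java
import org.springframework.transaction.annotation.Transactional;

// Hypothetical DAO contract and entity.
interface OrderDao {
    Order findById(long id);
    void update(Order order);
}

class Order {
    private boolean paid;
    void setPaid(boolean paid) { this.paid = paid; }
}

// Plain POJO service owning the transaction boundary; because it has no
// container-specific base class, it can later be exposed as a REST resource,
// SOAP endpoint, or EJB without changes.
public class OrderService {

    private final OrderDao orderDao;

    public OrderService(OrderDao orderDao) {
        this.orderDao = orderDao;
    }

    @Transactional // one transaction spans every DAO call in this method
    public void markOrderPaid(long orderId) {
        Order order = orderDao.findById(orderId);
        order.setPaid(true);
        orderDao.update(order);
    }
}
```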
I understand from a few other posts, and from my understanding of JAX-WS, that web services are not thread-safe. My web service is going to be called by hundreds of clients, and we need to be able to process around 200 transactions/second.
My web service is going to interact with a database to perform its work. If I introduce the synchronized keyword around the code that accesses the database, I essentially ensure that only one thread accesses the database at a time; I wonder whether I would still be able to achieve the required throughput in that case. Thanks in advance for your help.
I have been told to move the database access work into another class and instantiate that class at the method level; that way I won't need the synchronized keyword and will still achieve thread safety. Is that correct?
If you need transactions and thread safety why aren't you just using EJBs as your JAX-WS endpoints?
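For reference, a minimal sketch of that suggestion: a stateless session bean exposed as a JAX-WS endpoint. The container pools bean instances, so requests never share bean state, and database work runs in a container-managed transaction by default:

```java
import javax.ejb.Stateless;
import javax.jws.WebMethod;
import javax.jws.WebService;

@Stateless  // container pools instances; no shared mutable state to guard
@WebService // exposed as a JAX-WS endpoint by the container
public class PaymentService {

    @WebMethod
    public String processPayment(String accountId, double amount) {
        // Database access here runs inside a container-managed transaction
        // (the EJB default is TransactionAttributeType.REQUIRED).
        return "OK"; // placeholder body for the sketch
    }
}
```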
We need more info on the application.
In general, for performance in the case you describe (high throughput plus database access), I recommend the following:
Carefully plan your database: index where possible/sensible, use views, etc.
Use a database with a good locking mechanism (per-row locking). That way, when two requests access different rows, you will not suffer from whole-table locking.
Keep your transactions as short as possible. If using EJBs, make sure the transaction attribute for "read data" methods is not Required or RequiresNew (these might open a transaction unnecessarily).
If you do use synchronization, choose the proper lock carefully. Don't be tempted to automatically use synchronized just because it's the easiest to code; consider a ReadWriteLock where possible (see the sketch after this list).
Consider caching where possible, but plan it carefully so your flows still work on "relevant" data.
Start with these directions; I think you will see that you can reach your performance target this way.
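A minimal, hypothetical sketch of that locking point: readers proceed in parallel, and only writers take exclusive access, instead of serializing every request with synchronized:

```java
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class CachedRate {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private double rate;

    // Many reader threads may hold the read lock at the same time.
    double getRate() {
        lock.readLock().lock();
        try {
            return rate;
        } finally {
            lock.readLock().unlock();
        }
    }

    // Writers get exclusive access, but only for the duration of the write.
    void setRate(double newRate) {
        lock.writeLock().lock();
        try {
            rate = newRate;
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```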
Perhaps this question is not very clear, but I didn't find better words for the heading, which briefly describes the problem I'd like to deal with.
I want to restrict access from a Java desktop application to PostgreSQL.
The background:
Suppose you have two apps running, and the first application has to do some complex calculations on the basis of data in the DB. To nail down the immutability of the data in the DB, I'd like to lock the DB against insert, update and delete operations. On the client side, I think it's impossible to handle this behaviour satisfactorily. So I thought about using a small Java app on the server side that works like a proxy. Its task is to pass CRUD (Create, Read, Update, Delete) operations through until it receives a lock command. After a lock, it rejects all CUD operations until it gets an unlock command from the locking client or a timeout is reached.
Questions:
What do you think about this approach?
Is it possible to lock a database using such an approach?
Would you prefer Java SE or Java EE as server-side java app?
Thanks in advance.
Why not use transactions in your operations? The database has features to maintain data integrity itself, rather than resorting to a brute-force operation such as a total database lock.
This locking mechanism you describe sounds like it would be a pain for the users. Are the users initiating the lock, or is the software itself? If it's the users, you can expect some problems when Bob hits lock and then goes to lunch for 2 hours, forgetting to unlock the database first...
Indeed... there are a few proper ways to deal with this problem.
Just lock the tables in your code. PostgreSQL has commands for locking entire tables that you can run from your client application.
Pick a transaction isolation level that doesn't have the problem of reading data that was committed after your txn started (BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ).
Of these, by far the most efficient is to use repeatable read as your isolation level. Postgres supports this quite efficiently, and it will give you a consistent view of the data without such heavy locking of the db.
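From Java, both options are a JDBC call away; a minimal sketch, assuming a PostgreSQL connection and a hypothetical responsibility_area table:

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

class CalculationRunner {

    void runCalculation(Connection conn) throws SQLException {
        conn.setAutoCommit(false);
        // Option 2: every query in this transaction sees one consistent snapshot.
        conn.setTransactionIsolation(Connection.TRANSACTION_REPEATABLE_READ);
        try (Statement st = conn.createStatement()) {
            // Option 1 (alternative): explicitly block writers for the duration;
            // SHARE mode still allows other readers.
            // st.execute("LOCK TABLE responsibility_area IN SHARE MODE");

            // ... run the long calculation queries here ...
            conn.commit(); // ends the snapshot and releases any table locks
        } catch (SQLException e) {
            conn.rollback();
            throw e;
        }
    }
}
```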
Yeah, I thought about transactions, but in this case I can't use them. I'm sorry I didn't mention it explicitly. So assume the following simple case:
A calculation closes one area of responsibility. After the calculation, a new one is opened and new inserts are dedicated to it. But during the calculation process, no insert, update or delete is allowed on the data of the (currently calculated) area of responsibility. Moreover, a delete is strictly prohibited because the data has to be archived.
So IMO the use of transactions doesn't fit this requirement. Or did I miss something?
PS (off topic) @jsight: I recently read that internally Postgres maps "repeatable read" to "serializable", so using "repeatable read" gets you more restriction than you would perhaps expect.
It seems to me that introducing an ORM tool is supposed to make your architecture cleaner, but for efficiency I've found myself bypassing it and iterating over a JDBC ResultSet on occasion. This leads to an uncoordinated tangle of artifacts instead of a cleaner architecture.
Is this because I'm applying the tool in an invalid context, or is it deeper than that?
When can/should you go whole hog with the ORM approach?
Any insight would be greatly appreciated.
A little background:
In my environment I have about 50 client computers and 1 reasonably powerful SQL Server.
I have a desktop application in which all 50 clients are accessing the data at all times.
The project's Data Model has gone through a number of reorganizations for various reasons including clarity, efficiency, etc.
My Data Model's history
JDBC calls directly
DAO + POJOs without relations between the POJOs (basically wrapping the JDBC calls).
Added relations between POJOs, implementing lazy loading, but just hiding the inter-DAO calls.
Jumped onto the Hibernate bandwagon after seeing how "simple" it made data access (it made inter POJO relations trivial) and because it could decrease the number of round trips to the database when working with many related entities.
Since it was a desktop application, keeping Sessions open long-term was a nightmare, so it ended up causing a whole lot of issues.
Stepped back to a partial DAO/Hibernate approach that allows me to make direct JDBC calls behind the DAO curtain while at the same time using Hibernate.
Hibernate makes more sense when your application works on object graphs that are persisted in the RDBMS. If your application logic instead works on a 2-D matrix of data, fetching it via direct JDBC works better. Although Hibernate is written on top of JDBC, it has capabilities that might be non-trivial to implement in JDBC yourself. For example:
Say the user views a row in the UI and changes some of the values, and you want to fire an update query for only the columns that actually changed.
To avoid deadlocks you need to maintain a global order for the SQL statements in a transaction. Getting this right in JDBC might not be easy.
Easily setting up optimistic locking. When you use plain JDBC, you need to remember to include the version check in every update query.
Batch updates, lazy materialization of collections, etc. might also be non-trivial to implement in JDBC.
(I say "might be non-trivial" because of course it can be done, and you might be a super hacker. :)
Hibernate also lets you fire your own SQL queries, in case you need to.
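For example, dropping down to hand-written SQL while still getting mapped entities back; a sketch using the Hibernate 3/4-era Session API with a hypothetical Account entity (in plain JPA, em.createNativeQuery(sql, Account.class) plays the same role):

```java
import java.math.BigDecimal;
import java.util.List;
import javax.persistence.Entity;
import javax.persistence.Id;
import org.hibernate.Session;

@Entity
class Account {
    @Id
    Long id;
    BigDecimal balance;
}

class AccountDao {

    // Hand-written SQL, but the rows still come back as mapped entities.
    @SuppressWarnings("unchecked")
    List<Account> findOverdrawn(Session session) {
        return session.createSQLQuery("SELECT * FROM account WHERE balance < 0")
                      .addEntity(Account.class)
                      .list();
    }
}
```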
Hope this helps you to decide.
PS: Keeping the Session open on a remote desktop client and running into trouble is really not Hibernate's fault; you would run into the same issue if you kept the Connection to the DB open that long.