So, I'm working on using Jooq to create a caching layer over Postgres. I've been using the MockConnection/MockDataProvider objects to intercept every query, and this is working, but I'm having a few issues.
First, how do I determine between reads and writes? That is, how do I tell whether a query is an insert/update/etc or a select, given only the MockExecuteContext that's passed into the execute method in MockDataProvider?
And I'm a bit confused on how I can do invalidations. The basic scheme I'm implementing right now is that whenever a "write" query is made to a table, I invalidate all cached queries that involve that table. This goes back to my first question, on telling different types of queries from each other, but also brings up another issue: how would I identify the tables used in a query given only the sql string and the bindings (both are attributes of MockExecuteContext)?
Also, is this a correct approach at caching? My first thought was to override the fetch() method, but that method is final, and I'd rather not change something already embedded in Jooq itself. This is the only other way I could think of to intercept all requests made so I could create a separate, persistent caching layer.
I have seen this (https://groups.google.com/forum/#!topic/jooq-user/xSjrvnmcDHw) question, but I'm still not clear on how Lukas recommended to identify tables from the object. I can try to implement a Postgres NOTIFY, but I wanted something native in Jooq first. I've seen this issue (https://github.com/jOOQ/jOOQ/issues/2665) pop up a lot too, but I'm not sure how it applies.
Keep in mind that I'm new to Jooq, so it's quite possible that I'm missing something obvious.
Thanks!
Related
My question is this: Is there ever a role for JPA merge in a stateless web application?
There is a lot of discussion on SO about the merge operation in JPA. There is also a great article on the subject which contrasts JPA merge via a more manual Do-It-Yourself process (where you find the entity via the entity manager and make your changes).
My application has a rich domain model (ala domain-driven design) that uses the #Version annotation in order to make use of optimistic locking. We have also created DTOs to send over the wire as part of our RESTful web services. The creation of this DTO layer also allows us to send to the client everything it needs and nothing it doesn't.
So far, I understand this is a fairly typical architecture. My question is about the service methods that need to UPDATE (i.e. HTTP PUT) existing objects. In this case we have these two approaches 1) JPA Merge, and 2) DIY.
What I don't understand is how JPA merge can even be considered an option for handling updates. Here's my thinking and I am wondering if there is something I don't understand:
1) In order to properly create a detached JPA entity from a wire DTO, the version number must be set correctly...else an OptimisticLockException is thrown. But the JPA spec says:
An entity may access the state of its version field or property or
export a method for use by the application to access the version, but
must not modify the version value[30]. Only the persistence provider
is permitted to set or update the value of the version attribute in
the object.
2) Merge doesn't handle bi-directional relationships ... the back-pointing fields always end up as null.
3) If any fields or data is missing from the DTO (due to a partial update), then the JPA merge will delete those relationships or null-out those fields. Hibernate can handle partial updates, but not JPA merge. DIY can handle partial updates.
4) The first thing the merge method will do is query the database for the entity ID, so there is no performance benefit over DIY to be had.
5) In a DYI update, we load the entity and make the changes according to the DTO -- there is no call to merge or to persist for that matter because the JPA context implements the unit-of-work pattern out of the box.
Do I have this straight?
Edit:
6) Merge behavior with regards to lazy loaded relationships can differ amongst providers.
Using Merge does require you to either send and receive a complete representation of the entity, or maintain server side state. For trivial CRUD-y type operations, it is easy and convenient. I have used it plenty in stateless web apps where there is no meaningful security hazard to letting the client see the entire entity.
However, if you've already reduced operations to only passing the immediately relevant information, then you need to also manually write the corresponding services.
Just remember that when doing your 'DIY' update you still need to pass a Version number around on the DTO and manually compare it to the one that comes out of the database. Otherwise you don't get the Optimistic Locking that spans 'user think-time' that you would have if you were using the simpler approach with merge.
You can't change the version on an entity created by the provider, but when you have made your own instance of the entity class with the new keyword it is fine and expected to set the version on it.
It will make the persistent representation match the in-memory representation you provide, this can include making things null. Remember when an object is merged that object is supposed to be discarded and replaced with the one returned by merge. You are not supposed to merge an object and then continue using it. Its state is not defined by the spec.
True.
Most likely, as long as your DIY solution is also using the entity ID and not an arbitrary query. (There are other benefits to using the 'find' method over a query.)
True.
I would add:
7) Merge translates to insert or to update depending on the existence of the record on DB, hence it does not deal correctly with update-vs-delete optimistic concurrency. That is, if another user concurrently deletes the record and you update it, it must (1) throw a concurrency exception... but it does not, it just inserts the record as new one.
(1) At least, in most cases, in my opinion, it should. I can imagine some cases where I would want this use case to trigger a new insert, but they are far from usual. At least, I would like the developer to think twice about it, not just accept that "merge() == updateWithConcurrencyControl()", because it is not.
The project I'm working on has a REST API written in JRuby/Java, with an endpoint that hits a MySQL database to retrieve a number of records.
We need to allow the client to filter those records using one or more columns, including boolean checks and range values.
The easiest way we can do this is to add a string parameter to the API, then add it into the SQL statement.
Collectively, the development team agree that this is a bad idea but the alternative is to provide an almost identical syntax for filtering, which is translated into SQL. The allure of the SQL injection parameter is strong.
So my question is, are there any circumstances under which this is a safe thing to do?
In particular, might we consider using the WHERE clause safely if it's been fully parsed beforehand, and identified as such. Or at the very least, checking for certain trigger words such as DROP, SELECT etc.
Also if anyone knows of a good library that could act as a go-between (translating or parsing an external expression into a WHERE clause) that would be great.
The OData and GData protocols already implement this functionality in a safe and standard way. You can find server and client implementations for both, for Ruby, PHP, MySQL etc. Check here for the OData libraries
Leaving aside the SQL injection issue, you'll expose your inner implementation (both the database chosen - MySQL and your table structure) directly in the form of your API.
e.g. if you change to some NoSQL-type implementation at the backend, your public-facing API will break immediately. Similarly if you restructure your database. I wouldn't do this even in an environment in which I wasn't worried about the probability/severity of injection attacks.
Besides the security implications, allowing an arbitrary WHERE clause is a bad idea because it takes the "I" out of "API" -- it's not an interface. The API is supposed free the user of the need to know details of the implementation. Like table and column names.
If clients are interacting with your data by constructing their own WHERE clauses, then you can never change the database. There might be code out there with those statements programmed in. If a bug or new feature required you to alter the DB in a way that would break existing client interactions you'd be stuck. The API should provide the filtering capability and translate requests into calls to backend in a way that lets you change the backend without breaking the API.
There are numerous ORM's for this purpose, especially in ruby (activerecord, sequel)
The most basic thing you need to do is escape the string input, which will pretty much prevent sequel injection if you are doing it properly.
It helps to not directly insert parameters directly into the sequel statement if you dont have to either, instead, check their validity and then map them to logical ones (this isn't always possible). For example, if there is an html dropdown list, and when you submit the form it passes some parameter 'firstitem', map 'firstitem' to an id or otherwise that you will then use to search on, versus using the user supplied version (assuming this mapping doesn't involve the db).
We're going to write a new web interface for a big system based on Oracle database. All business rules are already coded in PL/SQL stored procedures and we'd like to reuse as much code as possible. We'll write some new stored procedures that will combine the existing business rules and return the final result dataset.
We want to do this on the database level to avoid java-db round trips. The interface layer will be written in Java (we'd like to use GWT), so we need a way of passing data from Oracle stored procedures to Java service side. The data can be e.g. a set of properties of a specific item or a list of items fulfilling certain criteria.
Would anyone recommend a preferable way of doing this?
We're considering one of the 2 following scenarios:
passing objects and lists of objects (DB object types defined on the
schema level)
passing a sys_refcursor
We verified that both approaches are "doable", the question is more about design decision, best practice, possible maintenance problems, flexibility, etc.
I'd appreciate any hints.
I would recommend sticking with a refcursor with well defined keys (agreed on both sides by java devs and pl/sql developers). This is much easier to extend in the future, you can easily convert the refcursor to hashmap and then a hashmap to a POJO using a apache bean utils if needed. I'm working on a big telecom project with many approaches to this issue and refcursor seems to be the best at the end of the day.
In the past I have achieved exactly the same with classic JDBC CallableStatement without any perfomance or maintenance issues. With ORM solutions like Hibernate making persistence much more flexible, you can wrap your solution around Hibernate as achieve in this post. Also see this example if you are not already familiar with the way store procedure and CallableStatement works.
It's been a while since I've done something like that, but the way I remember is that you need to define a view that calls your stored procedure, and you can then easily read the result sets from within java, with the OR-mapper of your choice.
So, this seems close to your scenario 1, which never caused any problems in my experience.
The one thing one needs to be careful is transaction handling: If your stored procedures write data, and you call several of them within a Java EE transaction, you might get into a situation of data inconsistency.
I just started working on upgrading a small component in a distributed java application. The main application is a rather complicated applet/servlet combo running on JBoss and it extensively uses Hibernate for its DataAccess. The component i am working on however is very a very straightforward data importing service.
Basically the workflow is
Listen for a network event
Parse the data packet, extract a set of identifiers
Map the identifier set to a primary key in our database
Parse the rest of the packet and insert items in a related table using the foreign key found in step 3
Repeat
in the previous version of this component it used a hibernate based DAL, that is no longer usable for a variety of reasons (in particular it is EOL), so I am in charge of replacing the Data Access layer for this component.
So on the one hand I think i should use Hibernate because that's what the rest of the application does, but on the other i think i should just use regular java.sql.* classes because my requirements are really straightforward and aren't expected to change any time soon.
So my question is (and i understand it is subjective) at what point do you think that the added complexity of using an ORM tool (in terms of configuration, dependencies...) is worth it?
UPDATE
due to the way the DataAccesLayer for the main application was written (weird dependencies) i cannot easily use it, i would have to implement it myself.
If we look into why Spring-Hibernate combination is used?
Because for simple Jdbc operation we have to do lot of operation like getting a connection.
Making a statement and handling resultset.For all these steps there are lot of exception handling.
But with spring hibernate you have to use just this:
public PostProfiles findPostProfilesById(long id) {
List list=getHibernateTemplate().find("from PostProfiles where id=?",id);
return (PostProfiles) list.get(0);
}
And everything is taken care by framework.I hope it will solve you dilemma
I think the answer really depends on your skill set. It would probably take similar amount of time to craft a simple solution involving a handful of tables in either way (Hibernate or raw JDBC) if you are comfortable with both techniques.
As I am pretty comfortable with Hibernate, I'd just choose it as I prefer to working in a higher level and not worrying about things that Hibernate handles for me. Yes, it has its own glitches, but especially for simple data models it does the job, and does it well.
The only few reasons why would I choose plain JDBC would be:
uber-complicated maximum-optimized SQL that is performance critical;
Hibernate being stupid and not being capable to express what I want;
And especially if you say you are already managing other entities with Hibernate, why not keep your code in the same style everywhere?
I think you are better off using JDBC api. From what you describe, the two operations (select foreign key from table, insert into table_2) can easily be executed with a simple Stored Procedure call.
The advantage of using this technique is that you can manage transactions/exceptions within your stored procedure call.
I am getting this exception in a controller of a web application based on spring framework using hibernate. I have tried many ways to counter this but could not resolve it.
In the controller's method, handleRequestInternal, there are calls made to the database mainly for 'read', unless its a submit action.
I have been using, Spring's Session but moved to getHibernateTemplate() and the problem still remains.
basically, this the second call to the database throws this exception. That is:
1) getEquipmentsByNumber(number) { firstly an equipment is fetched from the DB based on the 'number', which has a list of properties and each property has a list of values. I loop through those values (primitive objects Strings) to read in to variables)
2) getMaterialById(id) {fetches materials based on id}
I do understand that the second call, most probably, is making the session to "flush", but I am only 'reading' objects, then why does the second call throws the stale object state exception on the Equipment property if there is nothing changed?
I cannot clear the cache after the call since it causes LazyExceptions on objects that I pass to the view.
I have read this:
https://forums.hibernate.org/viewtopic.php?f=1&t=996355&start=0
but could not solve the problem based on the suggestions provided.
How can I solve this issue? Any ideas and thoughts are appreciated.
UPDATE:
What I just tested is that in the function getEquipmentsByNumber() after reading the variables from list of properties, I do this: getHibernateTemplate().flush(); and now the exception is on this line rather then the call to fetch material (that is getMaterialById(id)).
UPDATE:
Before explicitly calling flush, I am removing the object from session cache so that no stale object remains in the cache.
getHibernateTemplate().evict(equipment);
getHibernateTemplate().flush();
OK, so now the problem has moved to the next fetch from DB after I did this. I suppose I have to label the methods as synchronized and evict the Objects as soon as I am finished reading their contents! it doesn't sound very good.
UPDATE:
Made the handleRequestInternal method "synchronized". The error disappeared. Ofcourse, not the best solution, but what to do!
Tried in handleRequestInternal to close the current session and open a new one. But it would cause other parts of the app not to work properly. Tried to use ThreadLocal that did not work either.
You're mis-using Hibernate in some way that causes it to think you're updating or deleting objects from the database.
That's why calling flush() is throwing an exception.
One possibility: you're incorrectly "sharing" Session or Entities, via member field(s) of your servlet or controller. This is the main reason 'synchronized' would change your error symptoms.. Short solution: don't ever do this. Sessions and Entities shouldn't & don't work this way -- each Request should get processed independently.
Another possibility: unsaved-value defaults to 0 for "int" PK fields. You may be able to type these as "Integer" instead, if you really want to use 0 as a valid PK value.
Third suggestion: use Hibernate Session explicitly, learn to write simple correct code that works, then load the Java source for Hibernate/ Spring libraries so you can read & understand what these libraries are actually doing for you.
I also have been struggling with this exception, but when it continued to recur even when I put a lock on the object (and in a test environment, where I knew I was the only process touching the object), I decided to give the parenthetical in the stack trace its due consideration.
org.hibernate.StaleObjectStateException: Row was updated or deleted by
another transaction (or unsaved-value mapping was incorrect):
[com.rc.model.mexp.MerchantAccount#59132]
In our case it turned out that the mapping was wrong; we had type="text" in the mapping for one field that was a mediumtext type in the database, and it seems that Hibernate really hates that, at least under certain circumstances. We removed the type specification altogether from the mapping for this field, and the problem was resolved.
Now the weird thing is that in our production environment, with the supposedly problematic mapping in place, we do NOT get this exception. Does anybody have any idea why this might be? We are using the same version of MySQL - "5.0.22-log" (I don't know what the "-log" means) - in dev and production envs.
Here are 3 possibilities (as I do not know exactly, which kind of hibernate session handling you are using). Add one after another and test:
Use bi-directional mapping with inverse=true between parent object and child object, so the change in parent or child will get propagate to the other end of relation properly.
Add support for Optimistic Locking using TimeStamp or Version column
Use join query to fetch the whole object graph [ parent+children] together to avoid the second call altogether.
Lastly, if and only if nothing works:
Load the parent again by Id (you have that already) and populate modified data then update.
Life will be good! :)
This problem was something that I had experienced and was quite frustrating, although there has to be something a little odd going on in your DAO/Hibernate calls, because if you're doing a lookup by ID there is no reason to get a stale state, since that is just a simple lookup for an object.
First, make sure all your methods are annotated with #Transaction(required=true) // you'll have to look up the exact syntax
However, this exception is usually thrown when you try to make changes to an object that has been detached from the session it was retrieved from. The solution to this is often not simple and would require more code posted so we can see exactly what is going on; my general suggestion would be to create a #Service that performs these kinds of things within a single transaction