I want to implement double submission prevention in an existing java web application (struts actually). Architecture wise we are talking about 2 to N possible application servers (tomcat) and one single database server (mysql). The individual servers do not know each other and are not able to exchange messages. In front of the application servers there is a single load balancer which has the ability to do sticky sessions.
So basically there are two kinds of double submission prevention: client-side and server-side. If possible I want to go server-side because all client-side techniques seem to fail if people disable cookies and/or JavaScript in their browsers.
This leaves me with the idea of doing some kind of mutex-like synchronisation via database locks. I think it may be possible to calculate a checksum of the user entered data and persisting it to a dedicated database table. On each submit the application would have to check for presence of an equal checksum which would indicate that the given submission is a duplicate. Of course the checksums in this table have to be cleared periodically. The problem is the whole process of checking whether there is a duplicate checksum already in the database and inserting the checksum if there is none is pretty much a critical section. Therefore the checksum table has to be locked beforehand and unlocked again after the section.
My deadlock and bottleneck alarm bells start to ring when I think about table locks. So my question is: Are there saner ways to prevent double submissions in stateless web applications?
Please note that the struts TokenInterceptor can not be applied here because it fails miserably when cookies are disabled (it relies on the HTTP session which simply isn't present without session cookies).
A simpler DB based solution would be something like this. This can be made generic across multiple forms as well.
Have a database table that can be used to store tokens.
When a new form is displayed, insert a new row into the token table and add the token as a hidden field in the form.
When you get a form submit, do a SELECT ... FOR UPDATE on the row corresponding to the token you received as part of the form.
If the row still exists, then this is the first submit. Process the submit and delete the row.
If the row doesn't exist, then the form has already been processed and you can return an error.
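The token steps above can be sketched in Java roughly as follows. This is only an illustration: the class name `FormTokenStore` and the table name `form_token` in the comments are made up, and an in-memory set stands in for the token table so the example runs without a database. It also uses a small variation on the answer: instead of SELECT ... FOR UPDATE followed by a DELETE, it removes the token in one atomic step and treats "nothing was removed" as a duplicate, which corresponds to a single DELETE whose affected-row count is checked.

```java
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for the token table.
// In the real setup each method would be one SQL statement against the shared database.
class FormTokenStore {
    private final Set<String> tokens = ConcurrentHashMap.newKeySet();

    // Called when a form is rendered.
    // Assumed SQL equivalent: INSERT INTO form_token (token) VALUES (?)
    String issue() {
        String token = UUID.randomUUID().toString();
        tokens.add(token);
        return token; // goes into a hidden field of the form
    }

    // Called on submit. Only the first caller for a given token gets true.
    // Assumed SQL equivalent: DELETE FROM form_token WHERE token = ?
    // followed by a check that exactly one row was affected.
    boolean consume(String token) {
        return token != null && tokens.remove(token);
    }
}
```

Because the token is removed before the submit is processed, a failed submit would need a fresh token; whether that is acceptable depends on the application.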
The classic technique to prevent double submissions is to assign two IDs (both as "hidden" field in HTML Form tag) - one "session-ID" which stays the same from login to logout...
The second ID changes with every submission... server-side you only need to keep track of the "current valid ID" (session-specific)... if you get a "re-submission" (by click-happy-user or a "refresh-button" or a "back-button" or...) then that wouldn't match the current ID... this way you know: this submission should be discarded and a new ID is generated and sent back with the answer.
Some implementations use an ID that is incremented on every submission, which makes checking and keeping track a bit easier, but that could be vulnerable to "guessing" (security concern)...
I like to generate cryptographically strong IDs for this kind of protection...
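As a rough illustration of the "current valid ID" idea with cryptographically strong IDs, here is a minimal Java sketch. The class name `SubmissionGuard` and its methods are hypothetical; one instance is assumed to exist per session (in memory with sticky sessions, or persisted next to the session ID otherwise).

```java
import java.security.SecureRandom;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical per-session holder for the "current valid ID".
class SubmissionGuard {
    private static final SecureRandom RANDOM = new SecureRandom();
    private final AtomicReference<String> current = new AtomicReference<>(newId());

    // 128 random bits rendered as 32 hex characters.
    static String newId() {
        byte[] raw = new byte[16];
        RANDOM.nextBytes(raw);
        StringBuilder sb = new StringBuilder(32);
        for (byte b : raw) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // The ID to embed as a hidden field in the next form.
    String currentId() {
        return current.get();
    }

    // Swaps in a fresh ID on every call and accepts the submission only
    // if the submitted ID was the one that was valid until now.
    boolean accept(String submitted) {
        String expected = current.getAndSet(newId());
        return expected.equals(submitted);
    }
}
```

The atomic swap means that two concurrent submissions carrying the same ID cannot both be accepted, and a new ID is generated whether or not the submission was valid, as described above.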
IF you have a load-balanced environment with sticky session then you only need to keep track of the ID on the server itself (in-memory)... but you can certainly store the ID in the DB... since you store it together with the session ID the lock would be on "row level" (not table level) which should be ok.
The way you described goes one step further by examining the content... BUT I see the content part more on the "application logic" level than on the "re-submission prevention level" since it depends on the app logic whether it wants to accept the same data again...
If you work with sticky sessions, you would be fine with some token management. There exists a DoubleClickFilter which you can add to your web.xml.
Since you have sticky sessions there is no need for a Cross-Tomcat-Solution.
I have a scenario where any update/change to the data made by a CMS user through the application/CMS needs the approval of an admin/authorizer user. There may be multiple changes in one update to a single document/record. This approval will not be done in real time and may take a few hours or maybe days. The authorizer may also reject the change. So in this case, what would be the best way to keep this data alive without committing it to the database until approval or rejection? Should I create temporary or duplicate tables to keep this data temporarily in the DB? But this will result in a large number of temporary tables (one for each table). Or is there any other option at the developer/application/Java end? I am using Oracle with Java.
You need to better understand the problem.
You do not require one datastore,
you require two datastores.
Datastore one (possible table one) will contain unapproved changes.
This is the "proposed" state.
You will write and commit all data into this datastore as soon as the user requests the change.
Datastore two (possible table two) will contain the approved changes;
this is the "real" state.
Once a change that is in datastore one has been reviewed and approved,
you must apply the change here.
A possible other solution is to use a Kafka topic:
Use a Kafka topic to store the unapproved changes.
Feed the topic to reviewers.
When approved, note the decision (in the same topic) and write the change to the database.
Note:
datastore 1 and datastore 2 can be the same table,
just have a column to indicate "approved change",
"declined change",
and "pending change".
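The two-datastore idea, including the status column from the note, could look roughly like this in Java. It is a sketch under stated assumptions: the names `ChangeApproval`, `Change` and `Status` are invented for illustration, and plain collections stand in for the two tables.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of the "proposed" and "real" datastores.
class ChangeApproval {
    enum Status { PENDING, APPROVED, DECLINED }

    // One proposed change to one record.
    static final class Change {
        final String recordId;
        final String newValue;
        Status status = Status.PENDING;

        Change(String recordId, String newValue) {
            this.recordId = recordId;
            this.newValue = newValue;
        }
    }

    private final List<Change> proposed = new ArrayList<>(); // datastore one
    private final Map<String, String> live = new HashMap<>(); // datastore two

    // Stored (and, in a database, committed) as soon as the user requests the change.
    Change propose(String recordId, String newValue) {
        Change change = new Change(recordId, newValue);
        proposed.add(change);
        return change;
    }

    // Applies the change to the "real" state once the authorizer approves it.
    void approve(Change change) {
        requirePending(change);
        live.put(change.recordId, change.newValue);
        change.status = Status.APPROVED;
    }

    // A declined change is kept for the record but never reaches the "real" state.
    void decline(Change change) {
        requirePending(change);
        change.status = Status.DECLINED;
    }

    String liveValue(String recordId) {
        return live.get(recordId);
    }

    private static void requirePending(Change change) {
        if (change.status != Status.PENDING) {
            throw new IllegalStateException("change already " + change.status);
        }
    }
}
```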
You can always have a draft and a final copy of the data. The draft copy stores the work in progress and is committed like any other data; an operation such as save/confirm in the app can then copy it into the final version.
This requires one more record to distinguish the draft from the final version, and you should use the draft data to show on the UI.
I have 2 applications:
desktop (java)
web (symfony)
I have some data in the desktop app that must be consistent with the data in the web app.
So basically I send a POST request from the desktop app to the web app to update online data.
But the problem is that an internet connection may not always be available when I send my request, and at the same time I can't prevent the user from updating the desktop data.
So far, this is what I have in mind to make sure to synchronize data when the internet is available.
Am I heading in the right direction or not?
If not, I hope you guys put me in the right path to achieve my goal in a professional way.
Any link about this kind of topics will be appreciated.
In this case the useful pattern is to assume that sending data is asynchronous by default. The data, once collected, is stored in some intermediate structure and waits for a suitable moment to be sent. I think a queue could be useful because it can be backed by a database and prevent data loss in case the sending server fails. A separate thread (e.g. a job) checks for data in the queue and, if any exists, reads it and tries to send it. If sending succeeds, the data is removed from the queue. If a failure occurs, the data stays in the queue and another attempt to send it will be made next time.
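A minimal Java sketch of this queue-and-retry idea is shown below. The class `Outbox` and the `sender` callback are hypothetical; the queue here is in memory only, whereas a real implementation would back it with a database table so that nothing is lost on a crash.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Predicate;

// Hypothetical outgoing queue: payloads stay queued until a send succeeds.
class Outbox {
    private final Deque<String> pending = new ArrayDeque<>();

    synchronized void enqueue(String payload) {
        pending.addLast(payload);
    }

    synchronized int size() {
        return pending.size();
    }

    // Intended to be called periodically by a background job.
    // Sends payloads in order and stops at the first failure,
    // leaving the failed payload and everything after it in the queue.
    // Returns the number of payloads that were sent.
    synchronized int flush(Predicate<String> sender) {
        int sent = 0;
        while (!pending.isEmpty()) {
            if (!sender.test(pending.peekFirst())) {
                break; // e.g. no connection; try again on the next run
            }
            pending.removeFirst();
            sent++;
        }
        return sent;
    }
}
```

The `sender` would wrap the actual POST request and return true only on a successful response. Since a payload is removed only after a successful send, the receiving side may see a payload twice if the response is lost, so it should tolerate duplicates.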
This is a typical scenario where you want to send a message to a non-transactional external system from within a transaction, and you need to guarantee that the data will be transferred to the external system as soon as possible without losing it.
Two solutions come to mind; maybe the second fits your architecture better.
Use case 1)
You can use a message queue with a redelivery limit setting and the dead letter pattern. In that case you need to have an application server.
Here you can read details about the Dead letter pattern.
This document explains how the redelivery limit works on WebLogic Server.
Use case 2)
You can create an interface table in the database of the desktop application. Then insert your original data into the database and insert a new record into the interface table as well (all in the same transaction). The data that you want to POST needs to be inserted into the interface table too. The status flag of the new record in the interface table can be "ARRIVED". Then create an independent timer in your desktop app which periodically searches for records in the interface table with status "ARRIVED". This timer-controlled process will try to POST the data to the web service. If the HTTP response is 200, update the status of the record to "SENT".
Both can work like a charm.
You can solve it in many ways. Here are two:
1. You can use the circuit breaker pattern. You can get a link about it from here.
2. You can use JMS to manage this.
I just want to double check to make sure that a user is allowed to be on a page.
Previously I have been pulling the userName out of the session and seeing if that value is null. This is fine, but I was just wondering if isNew() would have the same results or if there might be some reason that a user could have a forged or previous session.
From reading about the method I feel like it would work just fine to use this way, but want to make sure I am not missing something.
I am doing this validation in the back end with Java. The front end will be doing its own validation too, using JS I imagine; it's more just an extra check I guess.
Thanks!
In reading a few books such as PHP Hacks, Web Site Cookbook, and MySQL, I found their common strategy is that every single page that can be accessed by a client should check for a valid session ID, a server event sequence ID, the user's site status and the user's role.
The site status (in SQL server) is usually: 'is_owner', 'is_creator', 'is_admin', 'is_member', 'is_new', 'is_suspended', 'is_banned'.
The entries for 'Role' are similar but pertain to a member's role.
If a member comes out of suspension they may get back their status and role.
Roles can be: 'is_owner', 'is_creator', 'is_supervisor', 'is_manager', 'is_boss', 'is_worker', etc.
Each page checks these 4 categories against the database to make sure the packet was not hijacked, that there was no sudden rise in rank, and that a banned user has not become a member again. This does not include other fields such as permission to view or join a chat or a group, but if that page offers that option it needs to confirm it with the database first.
The steps to accessing a page may involve several pages of query before the client is granted access. Checking one item to grant access to all is asking for trouble.
I have implemented a grid which displays document metadata, and the user is able to edit a document on right click. I want to implement a locking mechanism for this. What would be the best way to put a lock on the document when one user has opened the editor? These documents reside in the database.
Just add a column that specifies who currently has the file checked out. When a person tries to check out a file, if that column is set, they will not be able to check it out, and will be notified of who has it checked out. Unless you have thousands of requests per second for a single document, this method will work fine.
In addition to adding a column to say who has the file checked out and preventing access using that, you can add a timestamp for when the lock was requested.
This way, if someone requests it and the lock is, for example, 30 mins old with no changes made, they can take the lock. (If the original user didn't quit gracefully or something).
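The checked-out column plus timestamp could be sketched in Java like this. The class `DocumentLock` is hypothetical and keeps the two "columns" as fields of one object; in the database the same check would have to happen in a single atomic update. The 30 minutes are the example value from above, and for simplicity the lock age is measured from the last time the owner acquired the lock rather than from the last change.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical model of the two lock columns of one document row.
class DocumentLock {
    private static final Duration MAX_AGE = Duration.ofMinutes(30);

    private String owner;     // who has the document checked out, or null
    private Instant lockedAt; // when the lock was last requested

    // Returns true if `user` holds the lock after this call.
    // `now` is passed in so the behaviour can be tested deterministically.
    synchronized boolean tryAcquire(String user, Instant now) {
        boolean free = owner == null;
        boolean mine = user.equals(owner);
        boolean stale = !free
                && Duration.between(lockedAt, now).compareTo(MAX_AGE) >= 0;
        if (free || mine || stale) {
            owner = user;
            lockedAt = now;
            return true;
        }
        return false;
    }

    // Only the current owner can release the lock.
    synchronized void release(String user) {
        if (user.equals(owner)) {
            owner = null;
            lockedAt = null;
        }
    }

    // Lets the caller tell a rejected user who holds the lock.
    synchronized String owner() {
        return owner;
    }
}
```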
If the documents are in a database, the database itself should have support for preventing inconsistent access.
http://docs.oracle.com/javase/6/docs/api/java/sql/Connection.html#setTransactionIsolation%28int%29
If the editor does not keep database transactions/connections open for the duration of file editing, however, and the java application runs client-side rather than server-side (as you could simply create a lock in the editor for concurrency on the server side), then things get a bit trickier and I haven't yet had enough database experience to say how you would resolve that, as using a field in the database to indicate editing status would have concurrency problems with that type of setup (unless the database itself supports locking on records, but that would depend on the DB engine in use).
Oh, one possibility would be to use file modification times (have a timestamp field in the database and update it each time a file is modified) and keep a no-dirty-reads-allowed transaction in use while checking the timestamp and determining if the file was modified by another user after the user attempting to save last accessed it; if so, it won't save the file to the database and will instead alert the user that the server-side file was changed and ask if they want to view the changes (similar to how version control systems work). By disallowing dirty reads for all such transactions, that should prevent other users from changing the file's record while the first transaction is open (to mark a record as "dirty", you could perhaps use a dummy field that would be updated at the start of each transaction with some random value). (Note: aglassman's answer would work similarly to this.)
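The "check whether someone else saved in the meantime" idea can be sketched in Java as follows. Note the substitution: this sketch uses a version counter instead of a modification timestamp, because a counter cannot collide when two saves happen within the same clock tick. The class `VersionedDocument` and the SQL in the comment are assumptions for illustration, not taken from the answer.

```java
// Hypothetical model of one document row with a version column.
class VersionedDocument {
    private String content;
    private long version; // plays the role of the timestamp field

    VersionedDocument(String content) {
        this.content = content;
    }

    synchronized String content() {
        return content;
    }

    // The editor remembers this value when it loads the document.
    synchronized long version() {
        return version;
    }

    // Saves only if nobody else has saved since the caller loaded the document.
    // Assumed SQL equivalent:
    //   UPDATE document SET content = ?, version = version + 1
    //   WHERE id = ? AND version = ?
    // with "0 rows affected" meaning someone else saved first.
    synchronized boolean save(String newContent, long expectedVersion) {
        if (version != expectedVersion) {
            return false; // caller should alert the user and offer to show the changes
        }
        content = newContent;
        version++;
        return true;
    }
}
```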
I am currently working on a web project where users can create image galleries and upload pictures. Optionally they can mark pictures as private so that nobody else can look at them.
Now I am not sure how to properly implement the protection mechanism. Of course I have some ideas, but none of them seems to be optimal.
Here is one of my ideas:
Create a table for user images:
image_key (PK) | user_id | public_image (boolean)
The picture will be saved on the hard disk using the image_key and can be accessed via HTTP by a URL looking like this:
http://www.myCompany.com/images/image_key
A servlet will be mapped to the URL path images; the key will be extracted, a stream to the file on the hard disk will be opened and the picture will be streamed.
Additionally there will be a reverse proxy in front of the servlet container to provide some caching.
The issue with this solution is that my servlet would have to go to the database and check if the image with the given key is public or not.
My question:
Can this be done without hitting the database? (some fancy ideas)
Can someone provide a better solution to store and keep track of the pictures?
What would a solution look like where, besides public and private pictures, some pictures are also shared with friends only?
Please note that this question is not about storing pictures in a database or somewhere else but concerns access rights management of binary resources in a web application environment.
If the DB table is properly indexed and you're using a connection pool, then hitting the DB is cheap. I would just keep it as is. At most, you could keep a copy of the table in a Map in the application scope, but this may eat too much server memory. If you're using an ORM framework like JPA/Hibernate, you could also just turn the second level cache on to delegate the caching to the ORM. It will generally do its job very well.
As to client-side caching, you'd want a short expiry time (1 hour?) and a doHead() in the servlet which does basically the same as doGet() but without writing anything to the response body. You would also want to check for the If-None-Match and If-Modified-Since headers in the servlet if the client supplied them. You can find here a basic example of a servlet which does that.
My question: Can this be done without hitting the database? (some fancy ideas)
Yup, you can do it without hitting the database. We've got something similar and just wanted to put something quick in place.
The user is marking the resource private or public when he's uploading it.
We do something very simple:
public resources have a "tinyurl like" URL, say: abcd.org/aZ3tD (part of the point of the very short tinyurl-link thing is so that people who want to cut/paste/twitter it don't have to use an additional layer of bit.ly or tinyurl)
private resources aren't meant to be shared nor archived, so users don't care about a URL looking like: abcd.org/private/da499c3314e2fdce6a10a8b985489671971c187d
The first part of that URL is the user's ID.
So only the user da499c3314e2 (which must be logged in) can access resource fdce6a10a8b985489671971c187d
You asked for some way to do this without hitting the DB, this should work...
To avoid having to go to the database so often, how about the following URL patterns:
1. http://www.mycompany.com/images/<user_id>/public/<image_key>
2. http://www.mycompany.com/images/<user_id>/private/<image_key>
3. http://www.mycompany.com/images/<user_id>/shared/<image_key>
For 1, obviously no DB lookup is required - the image can be served to anybody immediately. For 2, your servlet would have to check that the ID of the active user matches the user_id in the request - again, hopefully no DB lookup required, just a check of a session variable.
The only case in which a DB call would be needed is 3, to check the relationship between the requesting user and the user who owns the image.
Of course, you'll need to be very careful about caching to ensure that your cache doesn't serve up private or shared images to unauthorised users...
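The decision logic for the three URL patterns could be sketched in Java like this. The names `ImageAccess`, `FriendLookup` and `mayView` are hypothetical, and the sketch assumes that images are actually stored under the same public/private/shared path segments, so that changing "private" to "public" in a URL simply points at a file that does not exist.

```java
// Hypothetical access check for paths of the form
// /images/<user_id>/<visibility>/<image_key>
class ImageAccess {
    // Placeholder for the one case that needs a database lookup.
    interface FriendLookup {
        boolean areFriends(String ownerId, String viewerId);
    }

    // viewerId is the logged-in user's ID from the session, or null for anonymous requests.
    static boolean mayView(String path, String viewerId, FriendLookup friends) {
        String[] parts = path.split("/");
        // Expected parts: "", "images", user_id, visibility, image_key
        if (parts.length != 5 || !parts[0].isEmpty() || !"images".equals(parts[1])) {
            return false;
        }
        String ownerId = parts[2];
        switch (parts[3]) {
            case "public":
                return true; // no lookup at all
            case "private":
                return ownerId.equals(viewerId); // session check only
            case "shared":
                return viewerId != null
                        && (ownerId.equals(viewerId)
                            || friends.areFriends(ownerId, viewerId)); // DB lookup
            default:
                return false;
        }
    }
}
```

A servlet mapped to /images could call `mayView` before opening the file stream and answer with 403 or 404 when it returns false.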
Another alternative can be to store such information in image metadata.
Image metadata API: http://download-llnw.oracle.com/javase/1.4.2/docs/api/javax/imageio/metadata/package-summary.html
You can find a related example:
http://download-llnw.oracle.com/javase/1.5.0/docs/guide/imageio/spec/apps.fm5.html
Write dpi metadata to a jpeg image in Java