Is it good practice to validate user input using domain constraints such as email(unique: true) and then rely on a message.properties entry such as className.email.unique=Email address already in use to produce an error message? Or is it better practice to have some client-side validation, or a check carried out in a web service, before trying to persist to the domain?
It is common practice to use both client-side and server-side validation.
Client-side validation adds convenience for the user and can reduce bandwidth and improve the workflow, but it is not 100% reliable.
Client-side validation also has significant aesthetic appeal: it can alert users to mistakes before the POST, which looks better and is nicer for users, but it will not stop bad input. It is essentially a usability improvement that also cuts down on the bandwidth wasted by submitting several bad inputs before getting it right.
The source of a page can be edited locally to disable, bypass, or completely suppress even the most well-formed validation, so nothing you do on the client side will stop a determined user from making a mess of your system.
This means you also need good server-side validation. It is good practice to protect yourself against injection and the other sorts of nonsense users can intentionally or accidentally pull off, especially since you are out on the web. Using both kinds of validation is the preferred approach because each adds value and together they reduce the points of failure.
You should look into using command objects in your controller actions when accepting a request payload.
http://grails.org/doc/latest/guide/single.html#commandObjects
Command objects allow you to put validation rules/constraints on the request payload. This is good because you can apply constraints that are specific to the web request payload without letting bad data reach your domain logic. A nice feature is that you can inherit domain constraints.
@grails.validation.Validateable
class LoginCommand {
    String username
    String password

    static constraints = {
        username(blank: false, minSize: 6)
        password(blank: false, minSize: 6)
    }
}
I'm developing a JavaEE REST microservice-oriented CQRS + Event Sourcing app. I have an entity (Artwork) with many fields, and I have to record each update to this entity according to the Event Sourcing pattern (basically each update creates a new event, and the artwork is then rebuilt from these events).
My approach basically works, but I'm stuck on "compliance" with HTTP standards: basically I want to avoid a "generic" update in which you update the whole entity, because it would be a mess to handle each single field update (and the consequent event generation).
So this is what I did.
Let's say that I have this entity:
public class Entity {
    int id;
    String field1;
    String field2;
    ...
}
Then I created as many request classes as there are fields I have to update (not all fields can be updated, such as the ID):
public class field1UpdateRequest {
    String newValue; // the new value for field1 (same type as the entity field)
}
and the same for field 2.
These updates are handled using a PUT request; when such a request arrives, it flows through something like this:
HTTP → Controller → Service → (DAOs, etc.)
So in the controller class I have a PUT http://...//updatefield1 method that accepts field1UpdateRequest objects.
My question is:
Is this the right thing to do? How can I justify it (if it is)? Should these requests be PATCH rather than PUT? Should a generic PUT request also be included, even though I'm afraid that would make the event sourcing part more difficult?
In a CQRS approach, it's important to remember that the C stands for Command. Every request to your "write side" is thus a command. A generic "here is the new value for this resource" request (which is what REST tends to lead to) can be interpreted as a "use this value henceforth" command, but it is a bit of an impedance mismatch with CQRS, because it's a fairly anemic command. There are definitely cases where having that in an API can be called for (and if it's an exceptionally rare request, you may even be able to get away with modeling it as a single "new beginning" event rather than teasing out finer-grained events; this has the cost of shifting some complexity out to consumers of the events).
With that in mind, an alternative approach that updates parts of an object is a little more of a fit with CQRS (though in this case you are shifting some complexity to the requestor, at least if that requestor wants to do wholesale updates). HTTP PUT sounds proper to me: the command is "use this value for this field/component of the entity".
That said, an even more CQRSy API would instead focus on the higher-level activities which motivate a change to the entity. For instance, if you're tracking the current owner of the artwork and when that ownership was acquired, you might have currentOwner and currentOwnerAcquired fields in your artwork entity. When recording a sale, you would want to update both, so a POST /artworks/{artworkId}/transferOwnership endpoint taking something like
{
"transferor": "Joe Bloggs",
"transferee": "Jack Schmoe",
"date": "2021-12-24T00:00:01Z"
}
would allow the update to be a single transaction and allow you to encode the "why" as well as the "what" in your events (which is an advantage of event sourcing).
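As a rough sketch of what such a task-based endpoint could look like in a JavaEE/JAX-RS controller (illustrative only; the request class and the ArtworkCommandService are assumptions, not from the answer):

import javax.inject.Inject;
import javax.ws.rs.Consumes;
import javax.ws.rs.POST;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;

// Hypothetical command payload matching the JSON above.
class TransferOwnershipRequest {
    public String transferor;
    public String transferee;
    public String date;
}

// Hypothetical application service that records the ownership-transferred event.
interface ArtworkCommandService {
    void transferOwnership(long artworkId, String transferor, String transferee, String date);
}

@Path("/artworks/{artworkId}/transferOwnership")
public class TransferOwnershipResource {

    @Inject
    ArtworkCommandService commandService;

    @POST
    @Consumes(MediaType.APPLICATION_JSON)
    public Response transferOwnership(@PathParam("artworkId") long artworkId,
                                      TransferOwnershipRequest request) {
        // One command, one transaction, one event; the "why" (a transfer of ownership)
        // is captured instead of a pile of per-field updates.
        commandService.transferOwnership(artworkId,
                request.transferor, request.transferee, request.date);
        return Response.accepted().build();
    }
}

Whether you return 202 Accepted or 200 with the updated representation is a separate design choice.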
So in the controller class I have a PUT http://...//updatefield1 method that accepts field1UpdateRequest objects.
Is this right to do?
It might be, but it probably isn't.
Here's the key idea: your REST API is a facade; it supports the illusion that your server stores and produces documents. In other words, you're providing an interface to your data that makes it look like every other site on the web.
The good news: when you do that, you get (for free!) the benefits of a bunch of general purpose work that has already been done for you.
But the cost (of these things that you get for free) is that, for everything to "just work", you need to handle the impedance mismatch between HTTP (which is based on an idiom of documents) and your domain model.
So I send you messages to "edit your documents", and you in turn figure out how to translate those messages into commands for your domain model.
In HTTP, both PUT and PATCH have remote authoring semantics. Both of those messages mean "make your copy of the document look like my copy". They are the flavor of HTTP messages you would use to (for example) edit the title of an HTML document on your web server.
The semantics are fundamentally anemic. My message tells you how I want your copy of the document to look, your server is responsible for figuring out how to achieve that.
And that's fine when you are working with documents, or using documents as a facade in front of a data model. But matching remote authoring requests to a domain model is a lot harder.
(Recommended reading: Greg Young 2010 on task based user interfaces).
In the case of a domain model, you normally want to send to the server a representation of a command message. HTTP really wants you to deal with command messages in one of two ways:
treat the command message as a document/resource of its own, to be stored on the server (the changes to the domain model are a side effect of storing a new command message)
POST the command message to the resource most directly impacted by the change.
(See Fielding, 2009; it is okay to use POST).
In both cases, the HTTP application itself knows nothing about what's going on at the domain level; it is only concerned with the transfer of documents over the network.
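For instance, the first option could look roughly like the following (an illustrative JAX-RS sketch; the command class, resource path, and in-memory storage are made up):

import java.net.URI;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import javax.ws.rs.Consumes;
import javax.ws.rs.POST;
import javax.ws.rs.Path;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;

// Hypothetical payload: the command message itself.
class TransferOwnershipCommand {
    public long artworkId;
    public String transferor;
    public String transferee;
    public String date;
}

// Option 1: the command message is stored as a resource of its own;
// updating the domain model is a side effect of accepting it.
@Path("/ownership-transfers")
public class OwnershipTransferResource {

    private static final Map<String, TransferOwnershipCommand> accepted = new ConcurrentHashMap<>();

    @POST
    @Consumes(MediaType.APPLICATION_JSON)
    public Response submit(TransferOwnershipCommand command) {
        String id = UUID.randomUUID().toString();
        accepted.put(id, command); // persist the command document (in-memory here for the sketch)
        // ... apply the command to the domain model / event store as a side effect ...
        return Response.created(URI.create("/ownership-transfers/" + id)).build();
    }
}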
HTTP doesn't really place any constraints on your resource model - if you want to publish all of your information in one document, that's fine. If you want to distribute your information across many documents, that's also fine.
So taking a single entity in your domain, and distributing its information across many resources is fine.
BUT: remember caching. HTTP has simple rules for automatically invalidating previously cached responses; separating the resource you use for reading information from the resource that you use for editing information makes caching harder (caution: caching is already one of the two hard problems).
In other words: trade offs.
Is it good practice to defensively check for null/empty inputs in the client? On the server this check already happens (the input is checked for null and an exception is thrown accordingly); should the client also make the check in order to avoid a call to the web service?
Under the best circumstances, it is a performance improvement, and nothing else.
Under the worst circumstances, the client side checking can drift away from what the server accepts, actually introducing bugs due to inconsistent deployments.
In either case, you don't typically have control over the client environment, so you cannot assume the client-side check was performed. Malicious users can inject their own client-side code which will permit non-valid inputs to be sent to the server, so server-side checking is still strongly required.
I would recommend that you do client-side checks, but I would also recommend that you take care to ensure that your client-side checks are synchronized with your server-side checks, so that your client doesn't start filtering inputs differently than your server would. If that becomes too problematic, err on the side of making the server-side checking correct. It's the only real defense point.
It's good practice to do whatever you need to do to protect your server, whatever that may be.
Always do checking server side, you never know where data is going to come from.
Do checking client side if you have some reason to notify the user of their mistake before sending data to the server. For example, a client-side validation of an integer input can update a warning label as the user types, without requiring a round trip to the server. Client-side checks are essentially a first line of action for displaying clear validation errors to the user, but really they are nothing more than UI and performance improvements. If you don't want to do that, then you don't need to do that. If you only want to do that for certain values, you only need to do that for certain values.
Perhaps your server already generates reasonable information about validation errors, in which case you could display those to the client. It really depends on your situation and needs.
For example, let's say the client displays a series of dialogs asking for input before finally sending a request to the server. It's irritating for the user if they aren't notified of an invalid input until after they go through the entire series of dialogs. This is a good case for client-side validation at each step of the input.
Note that the cost of client-side validation is that you need to make sure to maintain it to match the actual server-side rules if they change.
It's also good practice to think a little about your specific requirements and choose an appropriate course of action to make sure those requirements are met, rather than asking vague questions about generic, situation-agnostic "good practice".
Personally, I try my best to have server-side validation report useful information, and I don't do any initial client-side validation. I then add client-side validation later, after higher priority work is complete and after determining that the UX would clearly benefit from it.
Yes, in order to keep the bandwidth and the server load as low as possible, you should always add client-side validation as well. Even a thin-client can do easy validations like null/empty-checks.
If you have some complex validation depending on many different inputs (cross-validation) or maybe complicated checksum calculations, you might skip the client-side validation and do it only on server side.
Server-side validation is always needed though, because, as you can see, the client cannot be trusted to do the validating if you were to decide not to do it on the server.
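As a minimal sketch of what that division of labour might look like in Java (the class and method names are made up for illustration):

// Server side: the authoritative check; throws if the input is null or empty.
class UserService {
    public void register(String email) {
        if (email == null || email.trim().isEmpty()) {
            throw new IllegalArgumentException("email must not be null or empty");
        }
        // ... continue with registration ...
    }
}

// Client side: an optional convenience check that merely saves a round trip;
// it must never be the only check, since it can be bypassed.
class RegistrationClient {
    public void submit(String email) {
        if (email == null || email.trim().isEmpty()) {
            showError("Please enter an email address");
            return;
        }
        callRegisterWebService(email); // the actual remote call is omitted here
    }

    private void showError(String message) {
        System.out.println(message); // stand-in for whatever the UI does
    }

    private void callRegisterWebService(String email) {
        // remote call omitted
    }
}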
The HATEOAS principle states: "Clients make state transitions only through actions that are dynamically identified within hypermedia by the server."
Now I have a problem with the word dynamically, though I guess it's the single most important word there.
If I change one of my parameters from say optional to mandatory in the API, I HAVE to fix my client else the request would fail.
In short, all HATEOAS does is give the server side developer extreme liberty to change the API at will, at the cost of all clients using his/her API.
Am I right in saying this, or am I missing something like versioning or maybe some other media-type than JSON which the server has to adopt?
Any time you change a parameter from optional to mandatory in an API, you will break consumers of that API. That it is a REST API that follows HATEOAS principles does not change this in any way. Instead, if you wish to maintain compatibility you should avoid making such changes; ensure that any call made or message sent by a client written against the old API will continue to function as expected.
On the other hand, it is also a good idea to not write clients to expect the set of returned elements to always be identical. They should be able to ignore additional information given by a server if the server chooses to provide it. Again, this is just good API design.
HATEOAS isn't the problem. Excessively rigid API expectations are the problem. HATEOAS is just part of the solution to the problem (as it potentially relieves clients from having to know vast amounts about the state model of the service, even if it doesn't necessarily make it straight-forward).
Donal Fellows has a good answer, but there's another side to the same coin. The HATEOAS principle doesn't itself have anything to say about the format of your messages (other parts of REST do); rather, it means essentially that the client should not try to know, out of band, which URIs to act upon. Instead, the server should tell the client which URIs are of interest via hyperlinks (or forms/templates which construct hyperlinks). How it works:
The client starts at state 0.
The client requests a well-known resource.
The server's response moves the client to a new state N. There may be multiple states achievable at this point depending on the response code and payload.
The response includes links (or forms/templates) which tell the client, in band, the set of potential next states.
The client selects one of the potential next states by issuing a method on a URI.
Repeat 3 through 5 to states N+1 and beyond until the client's application needs are met.
In this fashion, the server is free to change the URI that moves the client from state N to state N+1 without breaking the client.
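For example (an illustrative representation only, not a format mandated by HATEOAS), a response for a book resource might carry the available next actions in band:

{
  "title": "The Design of Everyday Things",
  "status": "available",
  "links": [
    { "rel": "self",   "href": "/books/42" },
    { "rel": "borrow", "href": "/books/42/loans" },
    { "rel": "author", "href": "/authors/7" }
  ]
}

A client that only keys off the rel values it understands can keep working even if the server later moves the borrow action to a different URI.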
It seems to me that you misunderstood the quoted principle. Your question suggests that you think about the resources and that they could be "dynamically" defined, like a mandatory property added to a certain resource type at application runtime. This is not what the principle says, and this was correctly pointed out in other answers. The quoted principle says that the actions within the hypermedia should be dynamically identified.
The actions available for a given resource may change over time (e.g. because someone added/removed a relationship in the meantime), and there may be different actions available for the same resource for different users (e.g. because users have different authorization levels). The idea of HATEOAS is that clients should not make any assumptions about the actions available for a certain resource at any given time. The client should identify the available actions each time it reads that resource.
Edit: The below paragraph may be wrong. See the comments for discussion about it.
On the other hand, clients may have expectations about the data available in the resource, like that a book resource must have a title and that there may be links to the book's author or authors. There is no way to avoid the coupling introduced by these assumptions, but both service providers and clients should use backward-compatibility and versioning techniques to deal with it.
We are currently working on a very simple webapp, and we would like to "obfuscate" (what would be the right term?) or somehow encode the request parameters, so we can reduce the chance of an idle user sending arbitrary data.
For instance, the url looks like /webapp?user=Oscar&count=3
We would like to have something like /webapp?data=EDZhjgzzkjhGZKJHGZIUYZT and have that value decoded on the server into the real request info.
Before going into implementing something like this ourselves (and probably doing it wrong), I would like to know if there is already something that does this.
We have Java on the server and JavaScript on the client.
No, don't do this. If you can build something in your client code to obfuscate the data being transmitted back to the server, then so can a willful hacker. You simply can't trust data being sent to your server, no matter what your official client does. Stick to escaping client data and validating it against a whitelist on the server side. Use SSL, and if you can, put your request parameters in a POST instead of GET.
Expansion edit
Your confusion stems from conflating the goal of blocking users from tampering with request data with the need to implement standard security measures. Standard security measures for web applications involve using a combination of authentication, privilege and session management, audit trails, data validation, and secure communication channels.
Using SSL doesn't prevent the client from tampering with the data, but it does prevent middle-men from seeing or tampering with it. It also instructs well-behaved browsers not to cache sensitive data in the URL history.
It seems you have some sort of simple web application that has no authentication, and passes around request parameters that control it right in the GET, and thus some non-technically savvy people could probably figure out that user=WorkerBee can simply be changed to user=Boss in their browser bar, and thus they can access data they shouldn't see, or do things they shouldn't do. Your desire (or your customer's desire) to obfuscate those parameters is naïve, as it is only going to foil the least-technically savvy person. It is a half-baked measure and the reason you haven't found an existing solution is that it isn't a good approach. You're better off spending time implementing a decent authentication system with an audit trail for good measure (and if this is indeed what you do, mark Gary's answer as correct).
So, to wrap it up:
Security by obfuscation isn't security at all.
You can't trust user data, even if it is obscured. Validate your data.
Using secure communication channels (SSL) helps block other related threats.
You should abandon your approach and do the right thing. The right thing, in your case, probably means adding an authentication mechanism with a privilege system to prevent users from accessing things they aren't privileged enough to see - including things they might try to access by tampering with GET parameters. Gary R's answer, as well as Dave and Will's comment, hit this one on the head.
If your goal is to "reduce the chance an idle user from sending arbitrarily data," there's another simpler approach I would try. Make a private encryption key and store it in your application server side. Whenever your application generates a url, create a hash of the url using your private encryption key and put that hash in the query string. Whenever a user requests a page with parameters in the url, recompute the hash and see if it matches. This will give you some certainty that your application computed the url. It will leave your query string parameters readable though. In pseudo-code,
SALT = "so9dnfi3i21nwpsodjf";

function computeUrl(url) {
    return url + "&hash=" + md5(url + SALT);
}

function checkUrl(url) {
    hash = /&hash=(.+)/.match(url)[1];   // the hash that came with the request
    oldUrl = url.strip(/&hash=.+/);      // the url as it was before the hash was appended
    return md5(oldUrl + SALT) == hash;
}
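A rough Java translation of that pseudo-code (a sketch under the same assumptions; the salt value is made up, and a real implementation would be better served by an HMAC such as HmacSHA256 than by a bare MD5):

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public class SignedUrls {
    // Server-side secret; illustrative value only, keep the real one out of source control.
    private static final String SALT = "so9dnfi3i21nwpsodjf";

    // Append a hash of the url + secret so the server can later detect tampering.
    public static String computeUrl(String url) throws Exception {
        return url + "&hash=" + md5Hex(url + SALT);
    }

    // Recompute the hash from the url (minus the hash parameter) and compare.
    public static boolean checkUrl(String url) throws Exception {
        int idx = url.lastIndexOf("&hash=");
        if (idx < 0) {
            return false; // no signature present
        }
        String hash = url.substring(idx + "&hash=".length());
        String oldUrl = url.substring(0, idx);
        return MessageDigest.isEqual(
                md5Hex(oldUrl + SALT).getBytes(StandardCharsets.UTF_8),
                hash.getBytes(StandardCharsets.UTF_8));
    }

    private static String md5Hex(String input) throws Exception {
        MessageDigest md = MessageDigest.getInstance("MD5");
        byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
        StringBuilder sb = new StringBuilder();
        for (byte b : digest) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }
}

With this, computeUrl("/webapp?user=Oscar&count=3") yields the same URL with &hash=... appended, and checkUrl rejects any request whose parameters were altered after signing.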
If you're trying to restrict access to data then use some kind of login mechanism with a cookie providing a Single Sign On authentication key. If the client sends the cookie with the key then they can manipulate the data in accordance with the authorities associated with their account (admin, public user etc). Just look at Spring Security, CAS etc for easy to use implementations of this in Java. The tokens provided in the cookie are usually encrypted with the private key of the issuing server and are typically tamper proof.
Alternatively, if you want your public user (unauthenticated) to be able to post some data to your site, then all bets are off. You must validate on the server side. This means restricting access to certain URIs and making sure that all input is cleaned.
The golden rule here is disallow everything, except stuff you know is safe.
If the goal is to prevent "static" URLs from being manipulated, then you can simply encrypt the parameters, or sign them. It's likely "safe enough" to tack on an MD5 of the URL parameters, along with some salt. The salt can be a random string stored in the session, say.
Then you can just:
http://example.com/service?x=123&y=Bob&sig=ABCD1324
This technique exposes the data (i.e. they can "see" that xyz=123), but they can not change the data.
There is an advantage to "encryption" (and I use that term loosely). This is where you encrypt the entire parameter section of the URL.
Here you can do something like:
http://example.com/service?data=ABC1235ABC
The nice thing about using encryption is twofold.
One, it protects the data (the user can never see that xyz=123, for example).
The other feature, though, is that it's extensible:
http://example.com/service?data=ABC1235ABC&newparm=123&otherparm=abc
Here, you can decode the original payload, and do a (safe) merge with the new data.
So, requests can ADD data to the request, just not change EXISTING data.
You can do the same via the signing technique; you would just need to consolidate the entire request into a single "blob", and that blob is implicitly signed. That's "effectively" encrypted, just weak encryption.
Obviously you don't want to do ANY of this on the client. There's no point. If you can do it, "they" can do it and you can't tell the difference, so you may as well not do it at all -- unless you want to "encrypt" data over a normal HTTP port (vs TLS, but then folks will wisely wonder "why bother").
For Java, all this work goes in a Filter; that's the way I did it. The back end is isolated from this.
If you wish, you can make the back end completely isolated from this with an outbound filter that handles the URL encryption/signing on the way out.
That's also what I did.
The downside is that it's very involved to get it right and performant. You need a lightweight HTML parser to pull out the URLs (I wrote a streaming parser to do it on the fly so it didn't copy the entire page into RAM).
The bright side is all of the content side "just works", as they don't know anything about it.
There's also some special handling when dealing with JavaScript (as your filter won't easily "know" where there's a URL to encrypt). I resolved this by requiring URLs that need signing to be declared in a specific form, "var signedURL='....'", so I can find them easily in the output. Not as crushing a burden on designers as you might think.
The other bright side of the filter is that you can disable it. If you have some "odd behavior" happening, simply turn it off. If the behavior continues, you've found a bug related to encryption. It also let developers work in plain text and leave the encryption for integration testing.
Pain to do, but it's nice overall in the end.
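For the inbound half, a bare-bones servlet Filter might look like the sketch below (it reuses the hypothetical SignedUrls.checkUrl helper from the earlier sketch; the outbound rewriting/signing half and the HTML parsing are omitted):

import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Rejects any request whose query string fails the signature check,
// so the back end never sees tampered parameters.
public class SignedUrlFilter implements Filter {

    @Override
    public void init(FilterConfig filterConfig) { }

    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
            throws IOException, ServletException {
        HttpServletRequest req = (HttpServletRequest) request;
        HttpServletResponse resp = (HttpServletResponse) response;

        String query = req.getQueryString();
        boolean valid = false;
        try {
            valid = query != null && SignedUrls.checkUrl(req.getRequestURI() + "?" + query);
        } catch (Exception e) {
            valid = false; // treat any failure during verification as a bad signature
        }

        if (!valid) {
            resp.sendError(HttpServletResponse.SC_BAD_REQUEST, "Missing or invalid signature");
            return;
        }
        chain.doFilter(request, response); // signature OK: pass through untouched
    }

    @Override
    public void destroy() { }
}

Disabling it is then just a matter of removing the filter mapping, which matches the point above about being able to turn it off while debugging.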
You can encode the data using Base64 or something similar. I would serialize the arguments themselves as JSON before encoding.
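Keep in mind that Base64 is encoding, not encryption: anyone can decode it, so it only hides the parameters from casual eyes. A minimal Java sketch of the idea (the JSON string is just an example):

import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class ParamEncoding {
    public static void main(String[] args) {
        String json = "{\"user\":\"Oscar\",\"count\":3}"; // arguments serialized as JSON

        // URL-safe Base64 so the value can sit in a query string.
        String encoded = Base64.getUrlEncoder().withoutPadding()
                .encodeToString(json.getBytes(StandardCharsets.UTF_8));
        System.out.println("/webapp?data=" + encoded);

        // The server decodes it back; this hides nothing from a determined user.
        String decoded = new String(Base64.getUrlDecoder().decode(encoded), StandardCharsets.UTF_8);
        System.out.println(decoded);
    }
}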
Something like jCryption ?
http://www.jcryption.org/examples/
I just started learning the MVC pattern today (for GUI applications, not web), and have a few questions about where data validation should take place.
From what I have read, it seems like most people are saying that all the validation should take place in the controller, and the model should pretty much only hold the state of the data. However, it seems like in some situations it would make more sense to do the verification in the model.
For example, let's say that a client changes the IPv4 address of the server they want to connect to from the GUI. We want to verify that this is in fact an IPv4 address, and not just random characters. If the IP address is valid, then we want to change that data in the model to the new IP address, and if it isn't valid we want to have the view display an error (or something).
If you did the verification in the controller, then if in the future you decided you wanted to have a different controller/view (because from what I can tell, they are coupled pretty closely together), you would have to include this same verification code in both controllers, and would thus have to maintain two copies of the same code. That would of course be more prone to bugs than maintaining just one piece of code, as you would if the verification were done in the model.
Should I be doing it this way? Or am I missing something that makes doing it in the controller make more sense? Or should some data be handled in the model, and some data handled in the controller?
Thanks
I would say that verification in the model is always necessary. Sometimes it is also nice to have it in the view, so that the user is not even allowed to input something other than what the field is asking for. One way or the other, generic verification such as IP address validation makes a great candidate for a global/static/utility method, to be used not only in the view and in the model but across several different models. Sometimes it can even be shared between client-side view components and server components, as in a GWT application.
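As a sketch of that suggestion (class and method names are invented for illustration), a shared IPv4 utility used by the model might look like this:

// Shared, reusable validation utility; usable from models, views, or other components.
public final class IpValidator {
    private IpValidator() { }

    // Returns true if the string is a dotted-quad IPv4 address (e.g. "192.168.0.1").
    public static boolean isValidIpv4(String candidate) {
        if (candidate == null) {
            return false;
        }
        String[] parts = candidate.split("\\.", -1);
        if (parts.length != 4) {
            return false;
        }
        for (String part : parts) {
            if (!part.matches("\\d{1,3}")) { // one to three ASCII digits
                return false;
            }
            if (Integer.parseInt(part) > 255) {
                return false;
            }
        }
        return true;
    }
}

// The model enforces the rule itself, so every controller/view gets it for free.
class ServerSettings {
    private String serverAddress;

    public void setServerAddress(String address) {
        if (!IpValidator.isValidIpv4(address)) {
            throw new IllegalArgumentException("Not a valid IPv4 address: " + address);
        }
        this.serverAddress = address;
    }

    public String getServerAddress() {
        return serverAddress;
    }
}

With the rule enforced in the model (and the reusable check in a utility), any future controller or view gets the same validation without duplicating it.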