One day I was talking with a friend about one of his server applications for a little flash game.
The server communicates with a MySQL database, and I found this query:
"UPDATE phpbb_users SET patojdur = '" + this.score + "' WHERE user_id = '" + this.user_id + "'"
As this.score is data entered by the user, I asked him whether it was unsafe to put it directly into the SQL query and risk an SQL injection.
But he answered: "No, because this.score is user_request.split("'")[1]; the split protects me, and you can't put a ' in to inject."
My question isn't whether he made the right choice, because I know he won't change his mind, but what he said made me curious: is split really safe? Does it really prevent the split character from getting through, whatever you do? Or, even if it's risky, does var.split("'") ultimately protect you from ' injection?
Edit: I've read the following question, but mine is specific to the split method and doesn't apply only to SQL databases. In other words, my question is:
Does var.split('c') really prevent c to be in the final string?
Still dangerous. Of course it depends on the SQL variant. In MySQL, backslash is an escape character by default. An easy attempt would be \x27 (if that works) for the apostrophe; but havoc is already possible if the injected string ends with a backslash.
'ʼ;DROP TABLE myTable--'
There are cases where a Unicode conversion might let a single quote slip through, since you are only explicitly removing one representation of the single quote character (that is what split does).
See: https://siderite.dev/blog/why-doubling-single-quotes-is-not.html
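To make the answers above concrete, here is a minimal Java sketch. The original server code is not shown, so the class and method names are hypothetical. It illustrates that split("'") does keep the quote character out of each piece, but that a trailing backslash still gets through, which is enough to break the statement on a database that treats backslash as an escape.

```java
class SplitDemo {
    // Mimics the approach from the question: take the second piece of a quote-delimited request.
    static String extractScore(String userRequest) {
        return userRequest.split("'")[1];
    }

    // Builds the statement by concatenation, as in the question (unsafe, shown only for illustration).
    static String buildQuery(String score, String userId) {
        return "UPDATE phpbb_users SET patojdur = '" + score
                + "' WHERE user_id = '" + userId + "'";
    }

    public static void main(String[] args) {
        // The piece between the first two quotes of this request ends with a backslash.
        String score = extractScore("x'12\\'y");
        System.out.println(score.contains("'")); // false: the delimiter never survives the split
        System.out.println(buildQuery(score, "7"));
    }
}
```

The second line printed contains '12\' followed by the rest of the statement, so where backslash escapes are enabled the closing quote of the value is swallowed. With a parameterized statement (PreparedStatement and setString in JDBC) the value never becomes part of the SQL text, so neither the split nor any escaping is needed.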
I'm implementing a "hack" in a Spring Boot application to implement a 3-part login (company ID, username, password) by mashing up the company ID and username into one text string with a separator, and parsing them out later. This approach is described here by Chris Oberle on Baeldung.com.
Oberle uses Character.LINE_SEPARATOR as the separator character. I'm afraid that could become annoying if it's a character someone can easily enter into the login form (for example by hitting Enter or by copy-pasting their username).
My question is: Is Character.LINE_SEPARATOR a good choice of a Character constant that's not likely to accidentally find its way into user input? And if not, I'm open to suggestions of better choices.
No.
Character.LINE_SEPARATOR is not even a character. It’s a constant corresponding to a category, as returned by the getType method. In other words, the following code prints true:
System.out.println(Character.getType('\u2028') == Character.LINE_SEPARATOR);
The type of this constant is byte, in other words a numerical value, not a character. The linked article contains an expression like
String.format("%s%s%s", username.trim(), String.valueOf(Character.LINE_SEPARATOR), domain)
which is an unnecessarily expensive way to get the same result as the straightforward expression
username.trim() + Character.LINE_SEPARATOR + domain
Both convert the numerical value to its string representation, so they are equivalent to the following code:
username.trim() + "13" + domain
Yes, the actual value is 13, which will get converted to the string "13".
So now, it should become obvious that this is not a good separator, as a legal user name could contain the string "13".
The other question is whether an actual line separator would make a good separator. This depends on whether the value would get transported to the target correctly when containing such a separator. Mind that this has never been tested—the author of the article actually used the string "13" as separator.
My suggestion is to just use the # sign to separate user name and domain. That’s a well established pattern and no-one would be surprised when you disallow an # sign in user and domain names.
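A small sketch of the point made in this answer. The class and method names are hypothetical and not taken from the linked article. The first method shows that concatenating the byte constant appends the text "13"; the second shows one way to use a visible separator while rejecting it in both parts.

```java
class SeparatorDemo {
    // What the article's expression effectively does: the byte constant is appended as a number.
    static String joinWithConstant(String username, String domain) {
        return username.trim() + Character.LINE_SEPARATOR + domain;
    }

    // Hypothetical alternative: a visible separator that is rejected in both parts.
    static String joinWithSeparator(String username, String domain, char separator) {
        if (username.indexOf(separator) >= 0 || domain.indexOf(separator) >= 0) {
            throw new IllegalArgumentException("separator not allowed in user or domain name");
        }
        return username.trim() + separator + domain;
    }

    public static void main(String[] args) {
        System.out.println(joinWithConstant("alice", "acme"));  // alice13acme
        System.out.println(joinWithConstant("user13", "acme")); // user1313acme, ambiguous to split
        System.out.println(joinWithSeparator("alice", "acme", '#'));
    }
}
```

Because the separator is validated before joining, splitting the combined string on its single occurrence later is unambiguous.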
Character.LINE_SEPARATOR is a good choice because most inputs natively should not allow it to be entered, and it is OS-agnostic. If you are experiencing problems, you can also use another special character like
%
Then use a regex to ensure users are not allowed to type that in.
But really, Character.LINE_SEPARATOR is your best bet.
I came across the same issue as the author of this question (PreparedStatement IN clause alternatives?), and wondered whether MySQL's REGEXP would be an elegant way of getting the same functionality as IN while using only one PreparedStatement for a varying number of values to match. Some example SQL to show what I am talking about:
SELECT first_name, last_name
FROM people
WHERE first_name REGEXP ?
Multiple values could be supplied using a string like "Robert|Janice|Michael". I did not see REGEXP mentioned anywhere in that post.
Technically, yes, it is an alternative.
Note, however, that using a regex for matching is less efficient than the in operator; it incurs more work for the database, which needs to initialize the regex engine and run it against each and every value (it cannot take advantage of an index). You might not notice it on small volumes, but as your data grows larger this might become an issue. So I would not recommend that as a general solution: instead, just write a few more lines of code in your application to properly use the in operator, and use regexes only where they are truly needed.
Aside: if you want to match the entire string, as the in operator does, you need to surround the list of values with ^ and $, so the equivalent for:
first_name in ('Robert', 'Janice', 'Michael')
Would be:
first_name regexp '^(Robert|Janice|Michael)$'
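The "few more lines of code" for the in operator could look roughly like this in JDBC. This is a sketch, not the only way to do it: the class and method names are hypothetical, and the table and column names are taken from the question. One placeholder is generated per value and each value is bound with setString.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Collections;
import java.util.List;

class InClauseBuilder {
    // Builds the query text with one ? per value, e.g. IN (?, ?, ?).
    static String buildQuery(int valueCount) {
        if (valueCount < 1) {
            throw new IllegalArgumentException("at least one value is required");
        }
        String placeholders = String.join(", ", Collections.nCopies(valueCount, "?"));
        return "SELECT first_name, last_name FROM people WHERE first_name IN ("
                + placeholders + ")";
    }

    // Prepares the statement and binds each name to its placeholder (JDBC indexes start at 1).
    static PreparedStatement prepare(Connection conn, List<String> names) throws SQLException {
        PreparedStatement ps = conn.prepareStatement(buildQuery(names.size()));
        for (int i = 0; i < names.size(); i++) {
            ps.setString(i + 1, names.get(i));
        }
        return ps;
    }
}
```

The statement text differs per value count, so it is not a single prepared statement for all cases, but the values themselves are still bound rather than concatenated.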
Another approach:
FIND_IN_SET(name, 'Robert,Janice,Michael')
Yes, that could be substituted in. But it must be a comma-separated list of the desired values. This also works for FIND_IN_SET(foo, '1,123,45'). Note that 12 will not match.
Is there a built-in method to escape a string for SQL? I would use setString, but it happens I am using setString multiple times in the same combined SQL statement and it would be better performance (I think) if the escape happened only once instead of each time I say setString. If I had the escaped string in a variable, I could re-use it.
Is there no way to do this in Java?
Current method, multi-source search. In reality they are three entirely different where statements including joins, but for this example I will just show the same where for each table.
String q = '%' + request.getParameter("search") + '%';
PreparedStatement s = s("SELECT a,b,c FROM table1 where a = ? UNION select a,b,c from table2 where a = ? UNION select a,b,c FROM table3 where a = ?");
s.setString(1, q);
s.setString(2, q);
s.setString(3, q);
ResultSet r = s.executeQuery();
I know this is not a big deal, but I like to make things efficient and also there are situations where it is more readable to use " + quote(s) + " instead of ? and then somewhere down the line you find setString.
If you use setString for a parameter (e.g. PreparedStatement.setString), there may well be no actual escaping required - it's likely that the data will be passed separately from the SQL itself, in a way that doesn't require escaping.
Do you have any concrete indication that this really is a performance bottleneck? It seems very unlikely that within a database query, the expensive part is setting the parameters locally...
Short answer: I wouldn't bother. It's best to do escaping at the last possible moment. When you try to escape a string early and keep it around, it becomes much more difficult to verify that all strings have been escaped exactly once. (Escaping a string twice is almost as bad as not escaping it at all!) I've seen plenty of programs that try to escape strings early and then run into trouble because they need to update the string and then the programmer forgets to re-do the escape, or they update the escaped version of the string, or they have four strings and they escape three of them, etc. (I was just working on a bug where a programmer did HTML escapes on a string early, then decided he had to truncate the string to fit on a form, and ended up trying to output a string that ended with "&am". That is, he truncated his escape sequence so it was no longer valid.)
The CPU time to escape a string should be trivial. Unless you have a very large number of records or very big strings that are re-used, I doubt the savings would be worth worrying about. You'd probably be better off spending your time optimizing queries: saving a read of one record would probably be worth far more than eliminating 1000 trips through the string escape logic.
Longer answer: There's no built-in function. You could write one easily enough: most flavors of SQL just need you to double any single quotes. You may also need to double backslashes or one or two other special characters. The fact that this can differ between SQL engines is one of the big arguments for using PreparedStatements and letting JDBC worry about it. (Personally I think there should be a JDBC function to do escaping that could then know any requirements specific to the DB engine. But there isn't, so that's how it is.)
In any case, it's not clear how it would work with a PreparedStatement. There'd have to be some way to tell the PreparedStatement not to escape this string because it's already been escaped. And who really knows what's happening under the table in the conversation between JDBC and the DB engine: Maybe it never really escapes it at all, but passes it separately from the query. I suppose there could be an extra parameter on the setString that says "this string was pre-escaped", but that would add complexity and potential errors for very little gain.
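For illustration only, the "double any single quotes" helper mentioned above might look like the sketch below. The name quote is hypothetical (borrowed from the question's wording), and the helper deliberately ignores engine-specific rules such as backslash escapes in MySQL, which is exactly why binding parameters is the safer route.

```java
class NaiveSqlQuote {
    // Hypothetical helper: wraps a value in single quotes and doubles embedded quotes.
    // It does NOT handle engine-specific escapes such as backslashes in MySQL.
    static String quote(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    public static void main(String[] args) {
        System.out.println(quote("O'Brien"));        // 'O''Brien'
        System.out.println(quote(quote("O'Brien"))); // escaped twice, no longer the same value
    }
}
```

The second call shows the "escaped exactly once" problem from the short answer: applying the helper to an already escaped string silently changes the data.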
Do not use org.apache.commons.lang.StringEscapeUtils.escapeSql(yourUnscapedSQL);
It does not escape characters like \
You can use StringEscapeUtils from Apache commons:
org.apache.commons.lang.StringEscapeUtils.escapeSql(yourUnscapedSQL);
I am having a problem querying a value that contains a single quote when using the SQL LIKE operator.
This is my SQL query for searching the music files on the SD card:
final Uri uri = MediaStore.Audio.Media.EXTERNAL_CONTENT_URI;
final String[] cursor_cols = {
MediaStore.Audio.Media.TITLE
};
where = MediaStore.Audio.Media.TITLE + " like ('%"+SomeSongTitle+"%')";
cursor = getContentResolver().query(uri, cursor_cols, where, null, null);
SomeSongTitle is some arbitrary text that a user enters.
My question is: why does it crash when SomeSongTitle contains a single quote (for example SomeSongTitle=don't)?
And how do I fix it?
Thanks for reading, and I hope to hear some solutions from you guys =D
If you don't want to do String substitution you can use SQLiteDatabase.rawQuery to get your Cursor object and bind the value to a ? placeholder. The placeholder must not sit inside quotes, so the % wildcards go into the bound value. Something like:
String query = "select * from your_table_name where " + MediaStore.Audio.Media.TITLE + " like ?";
cursor = yourDB.rawQuery(query, new String[] {"%" + SomeSongTitle + "%"});
That should get around the quoting issue.
To fix it you need to replace the single quote with two single quotes. Try using something like...
SomeSongTitle = SomeSongTitle.replace("'", "''");
If you use bindings (?) for the argument(s) in the where clause, then you do not need and should not use any single quotes, because the binding already takes care of that.
In particular, the second argument in a binding is an array of strings, String[], providing one String for each ?. In the binding process, each of those Strings is treated by SQL as if it had single quotes around it. Binding creates a compiled SQL statement with variable substitution, so it is more efficient to write your SQL as a fixed String with bindings than to build a different statement on each call.
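Applied to the question's query, the binding approach could be sketched as follows. The helper class is hypothetical and kept free of Android classes so it can be read on its own. It builds the selection string with a bare ? and puts the % wildcards into the bound value.

```java
class TitleSearch {
    // Selection with a placeholder instead of the concatenated title; no quotes around the ?.
    static String selection(String column) {
        return column + " LIKE ?";
    }

    // The wildcards belong to the bound value, not to the SQL text.
    static String[] selectionArgs(String title) {
        return new String[] { "%" + title + "%" };
    }

    public static void main(String[] args) {
        System.out.println(selection("title"));
        System.out.println(selectionArgs("don't")[0]); // %don't%
    }
}
```

In the question's code these two values would be passed as the third and fourth arguments of getContentResolver().query(uri, cursor_cols, where, args, null), with MediaStore.Audio.Media.TITLE as the column, so the quote in don't never reaches the SQL text.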
You'll need to escape the single quote. There are much more sophisticated methods to do this, but an easy way to start is simply to do a find and replace that adds a backslash (\) before the quote mark, so that it looks like this: (\').
You can read more about it under SQL Injection. Specifically, look at the section on Mitigation.
Android's database API sits on top of sqlite. In its FAQ, you can see that to "escape" a single quote, you just use two single quotes. See here.
As in the title. To be sure, I debugged my application, and on the line where I put the strings into the PreparedStatement, special characters change to "?". I don't really know where to look for a fix, so I don't know whether code is required. Anyway, I'll put some here:
PreparedStatement stm = null;
String sql = "";
try{
sql = "INSERT INTO methods (name, description) VALUES (?, ?)";
stm = connection.prepareStatement(sql);
stm.setString(1, method.getName());
stm.setString(2, method.getDescription());
//...
}catch(Exception e){}
While debugging, the 'name' field was correct in the method object, but after adding it to the stm variable, its characters changed to '?'.
I have found one topic about a similar situation on SO, but none of the answers helped me, since I know exactly that something goes wrong when adding the string to the statement, not in the database. But I don't know what.
Any suggestions?
PS. I'm using netbeans 6.7.1 version
EDIT: I was debugging with standard netbeans debugger, and was checking state of variables before adding strings to 'stm' variable. I was even changing getName() method to static string with special characters. So for sure everything is ok with Method class.
EDIT2: I've made one more test. I checked the stm variable, and one of its properties is "charEncoding", which is set to "cp1252". So the main question is: how do I change that?
This normally happens when different charsets are used in different places. It sounds like you're getting your input as UTF-8 and converting it to another charset (maybe your database is set to something else), which breaks the special characters.
To fix this: use the same charset everywhere*. (I would recommend UTF-8.)
*Take a look at this or my answer to another thread (that's about a problem in PHP, but in Java it's almost the same).
Sounds like a character encoding issue to me. Perhaps the driver is transcoding your strings into the appropriate encoding for the field/table/schema/database rather than letting the server do it? If you are trying to store a character which has no representation in the encoding of the field/table/schema/database, that would explain the '?' characters.
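The effect described here can be reproduced without a database. The sketch below uses a hypothetical class name and US-ASCII as a stand-in for whatever narrower encoding the driver is using. It encodes a string and decodes it again; characters the charset cannot represent come back as '?'.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingDemo {
    // Encodes the text in the given charset and decodes it again, as a lossy layer would.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        // A word with three non-ASCII letters, written with escapes to avoid source-encoding issues.
        String name = "\u0141\u00f3d\u017a";
        System.out.println(roundTrip(name, StandardCharsets.US_ASCII));           // ??d?
        System.out.println(roundTrip(name, StandardCharsets.UTF_8).equals(name)); // true
    }
}
```

How to change the encoding depends on the driver, which the question does not name. If it happens to be MySQL Connector/J, the characterEncoding connection property (for example characterEncoding=UTF-8 in the JDBC URL) is the usual place to look, but treat that as an assumption to verify against your driver's documentation.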
Are you using Oracle? I have had similar situations, if the environment variables regarding character sets weren't defined correctly.
By default, an Oracle connection is ASCII (7-bit characters, A-Z, a-z, numbers, punctuation, ...). If you use any character outside of that (e.g. European accents, Chinese characters, ..) then you need to use something other than ASCII. UTF-8 is best. If you don't, your characters will get replaced by "?".
You'd need to get your sysadmin to set this up for you. Alternatively take a look here:
http://arjudba.blogspot.com/2009/02/what-is-nlslang-environmental-variable.html