SQL Injection in Java and MySQL when using multiple queries


I've got a web application with an SQL injection as part of an INSERT statement. It looks like this:
INSERT INTO table1 VALUES ('str1', 1, 'INJECTION HERE')
I can insert the usual multiple-query injections such as ');truncate table1;-- but because Java + MySQL is used, stacking multiple queries is not allowed, so the injection above results in an error from MySQL and the second query never gets executed.
So basically it seems that all one can achieve from such an injection in the aforementioned architecture is injecting "junk data", which is possible without an injection as well.
There are more techniques such as using load_file() but that would still not allow me to manipulate the database to the extent I'm looking for.
Am I missing something here? Is there some other way to use this injection for gaining control over the database?
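For reference, the vulnerable pattern described above presumably comes from string concatenation along these lines. This is only a sketch: the class and method names are hypothetical, and only the shape of the INSERT is taken from the question. The stacked payload does end up in the SQL text, but MySQL Connector/J rejects multi-statement strings unless the allowMultiQueries connection property is enabled, which would explain the error.

```java
final class VulnerableInsert {
    // Hypothetical helper: pastes user input straight into the SQL text.
    static String build(String userInput) {
        return "INSERT INTO table1 VALUES ('str1', 1, '" + userInput + "')";
    }

    public static void main(String[] args) {
        // Prints: INSERT INTO table1 VALUES ('str1', 1, '');truncate table1;--')
        System.out.println(build("');truncate table1;--"));
    }
}
```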

Of course, if you change your database/driver combination from your current implementation to something supporting multiple requests, then you'll activate a dormant security hole that (no doubt) people will have forgotten about!
Ignoring the nefarious, malicious scenarios, the above will cause you problems when inserting regular data that includes quote characters etc., i.e. it simply won't work for particular sets of data (unless cleansed/escaped etc.). I would correct it simply for functionality purposes.
You should have a look at PreparedStatement, and the data insertion methods for this (setString()) etc.
e.g.:
PreparedStatement pstmt = con.prepareStatement(
    "UPDATE EMPLOYEES SET SALARY = ? WHERE ID = ?");
pstmt.setBigDecimal(1, new BigDecimal("153833.00"));
pstmt.setString(2, "Insert what you like here");
pstmt.executeUpdate();
The setString() method will support any string without escaping/injection issues.

SQL injection doesn't have to delete something from the database. The attacker might want to retrieve some valuable data that he's not supposed to have access to.
For example, consider the following post-injection form (I'm not familiar with MySQL syntax, but something like this should be possible in general - add casts as needed):
INSERT INTO table1 VALUES ('str1', 1,
-- injected stuff --
'' || (SELECT valuable_info FROM admin_only_table WHERE id=1) || ''
-- end injected stuff --
)
Now table1 - which can be, say, where some publicly accessible info is retrieved from, so anyone can see the values - contains a potentially sensitive value from a presumably secure table admin_only_table.
Of course, this assumes that your server doesn't do any tricks such as user impersonation or otherwise limits permissions on SQL level for the queries, but rather performs them all with full privileges.

As explained in this post, there are more bad things that can happen to your application than the classic table DROP:
call a sleep function so that all your database connections will be busy, therefore making your application unavailable
extracting sensitive data from the DB
bypassing the user authentication
Bottom line, you should never use string concatenation when building SQL statements. Use a dedicated API for that purpose.
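Applied to the INSERT from the question, the "dedicated API" could look roughly like the sketch below. The class and method names are hypothetical, and the column types are guessed from the literal values in the question. The user-supplied value is bound as a parameter, so it is never parsed as SQL.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

final class Table1Dao {
    private static final String INSERT_SQL = "INSERT INTO table1 VALUES (?, ?, ?)";

    // Binds every value as a parameter instead of concatenating it into the SQL text.
    static int insert(Connection con, String label, int number, String userInput)
            throws SQLException {
        try (PreparedStatement ps = con.prepareStatement(INSERT_SQL)) {
            ps.setString(1, label);
            ps.setInt(2, number);
            ps.setString(3, userInput); // quotes and semicolons are stored as plain data
            return ps.executeUpdate();
        }
    }
}
```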

Related

Single line select using string builder or Stored Procedure

I have a lot of single line select queries in my application with multiple joins spanning 5-6 tables. These queries are built with StringBuilders, based on many conditions that depend on input from a form etc. However, my team lead, who happens to be a SQL developer, has asked me to convert those single line queries to stored procedures.
Is there any advantage in moving the single line select queries to the backend and performing all the if/else logic there in a stored procedure?

One advantage of having all your SQL in stored procedures is that you keep your queries in one place, the database, so it would be a lot easier to change or modify them without making a lot of changes in the application layer or front end layer.
Besides, DBAs or SQL developers can fine-tune the SQL if it is stored in database procedures. You could keep all your functions/stored procedures in a package, which would be better in terms of performance and for organizing your objects (similar to creating packages in Java). And of course with packages you can restrict direct access to their objects.
Where to keep the SQL, in the front end or in the database itself, is more a matter of team or department policy, and of course, as @Gimby mentioned, many people could have different views.
Update 1
If you have a select statement which returns something use a function, if you have INSERT/UPDATE/DELETE or similar stuff like sending emails or other business rules then use a procedure and call these from front end by passing parameters.

I'm afraid that is a question that will result in many different answers based on many different personal opinions.
It's business logic you are talking about here in any case, and in -my- opinion that belongs in the application layer. But I know a whole club of Oracle devs who wholeheartedly disagree with me.

If you use PreparedStatement in Java, there is no big difference in performance between Java queries and stored procedures. (If you use Statement in Java, then you have a problem.)
But stored procedures are a good way to organize and reuse your SQL code. You can group them in packages, you can change them without recompiling the Java code, and your DBA or SQL specialist can tune them.
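If the queries stay in Java, one way to combine StringBuilder-based dynamic SQL with PreparedStatement is to append only placeholders and collect the values separately. The sketch below is an illustration under assumptions: the table, columns and class name are made up, and the resulting SQL and parameter list would then be handed to prepareStatement and setObject.

```java
import java.util.ArrayList;
import java.util.List;

final class EmployeeSearch {
    final String sql;          // SQL text containing only "?" placeholders
    final List<Object> params; // values to bind, in placeholder order

    private EmployeeSearch(String sql, List<Object> params) {
        this.sql = sql;
        this.params = params;
    }

    // Adds a condition only for the filters the form actually supplied.
    static EmployeeSearch build(String lastName, Integer deptId) {
        StringBuilder sql =
            new StringBuilder("SELECT e.id, e.last_name FROM emp e WHERE 1 = 1");
        List<Object> params = new ArrayList<>();
        if (lastName != null) {
            sql.append(" AND e.last_name = ?");
            params.add(lastName);
        }
        if (deptId != null) {
            sql.append(" AND e.dept_id = ?");
            params.add(deptId);
        }
        return new EmployeeSearch(sql.toString(), params);
    }
}
```

The form input never becomes part of the SQL text, so the number of distinct statements stays small and the driver can reuse them.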

Better option to fetch results from database tables

Is there any performance improvement in calling a procedure which returns a SYS_REFCURSOR compared to calling a query directly?
For example
CREATE OR REPLACE PROCEDURE my_proc
(
p_id number,
emp_cursor IN OUT SYS_REFCURSOR
)
AS
BEGIN
OPEN emp_cursor for
select * from emp where emp_number=p_id;
end;
/
and call the above from Java by registering the OUT parameter, passing the IN parameter and fetching the results.
Or
From Java get the results from emp table by
preparedStatement = prepareStatement(connection, "select * from emp where emp_number=?", values);
resultSet = preparedStatement.executeQuery();
Which one of the above is a better option to call from Java?

There is no performance difference assuming your prepareStatement method is using the appropriate type for all bind variables. That is, you would need to ensure that you are calling setLong, setDate, setString, etc. depending on the data type of the parameter. If you bind the data incorrectly (i.e. calling setString to bind a numeric value), you may force Oracle to do data type conversion which may prevent the optimizer from using an index that would improve performance.
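As a small illustration of binding with the matching type, the sketch below assumes emp_number is a numeric column and that the table has an ename column; both the column name and the class name are assumptions, not something stated in the question.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

final class EmpDao {
    // emp_number is assumed numeric, so it is bound with setLong rather than setString.
    static List<String> findNames(Connection con, long empNumber) throws SQLException {
        String sql = "select ename from emp where emp_number = ?";
        try (PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setLong(1, empNumber);
            try (ResultSet rs = ps.executeQuery()) {
                List<String> names = new ArrayList<>();
                while (rs.next()) {
                    names.add(rs.getString(1));
                }
                return names;
            }
        }
    }
}
```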
From a code organization and maintenance standpoint, however, I would rather have the queries in the database rather than in the Java application. If you find that a query is using a poor plan, for example, it's likely to be much easier for a DBA to address the problem if the query is in a stored procedure than if the query is embedded in a Java application. If the query is stored in the database, you can also use the database's dependency tracking functions to more easily do an impact analysis if you need to do something like determine what would be impacted if the emp table needs to change.

Well, I don't think there is a significant difference from the Java invocation standpoint.
Some differences I can think of are:
You will now have to maintain two different code bases: your Java code and your stored procedures. In case of errors, you will have to debug in two different places, and fix problems in two different places.
Once production-ready, making changes to the database is probably going to require some additional formalities besides those required to change the deployed Java code.
Another important matter to take into account is database independence: if you are building a product to work with different kinds of databases, you would be forced to write different versions of your stored procedures and you will have more code to maintain (debug, bugfix, change, etc).
This is very important if you're building a product that you intend to deploy in the different environments of different (possibly yet unknown) clients, where you cannot predict which RDBMS they will be using.
If you want to use an ORM framework (e.g. Hibernate, EclipseLink), it will generate pretty optimized queries for you. Plus, it would be more difficult to integrate it later on if you use stored procedures.
With a proper amount of logging it is easy to analyze your queries for optimization purposes. You could use JDBC logging or the logging provided by your ORM provider and actually see how the query is being used by the application, how many times, how often, etc, and optimize where it matters.

Hibernate produce different SQL for every query

I've just tested my application under the profiler and found out that sql strings use about 30% of my memory! This is bizarre.
There are a lot of strings like this stored in app memory. These are SQL queries generated by Hibernate; note the different numbers and trailing underscores:
select avatardata0_.Id as Id4305_0_,...... where avatardata0_.Id=? for update
select avatardata0_.Id as Id4347_0_,...... where avatardata0_.Id=? for update
Here is the part I can't understand. Why does hibernate have to generate different sql strings with different identifiers like "Id4305_0_" for each query? Why can't it use one query string for all identical queries? Is this some kind of trick to bypass query caching?
I would greatly appreciate it if someone could explain why this is happening and how to avoid wasting resources like this.
UPDATE
Ok. I found it. I was wrong to assume a memory leak; it was my fault. Hibernate is working as intended.
My app created 121(!) SessionFactories in 10 threads, they produced about 2300 instances of SingleTableEntityPersisters. And each SingleTableEntityPersister generates about 15 SQL queries with different identifiers. Hibernate was forced to generate about 345.000 different SQL queries. Everything is fine, nothing weird :)
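A common way to avoid this situation is to build the expensive factory once and share it between threads. The sketch below is a generic, Hibernate-free illustration of that idea; the class name Once is hypothetical, and in the Hibernate case the supplier would presumably be something like () -> new Configuration().configure().buildSessionFactory().

```java
import java.util.function.Supplier;

// Thread-safe "create once, share everywhere" holder (double-checked locking).
final class Once<T> implements Supplier<T> {
    private final Supplier<T> factory;
    private volatile T value;

    Once(Supplier<T> factory) {
        this.factory = factory;
    }

    @Override
    public T get() {
        T v = value;
        if (v == null) {
            synchronized (this) {
                v = value;
                if (v == null) {
                    v = factory.get(); // runs at most once, even under contention
                    value = v;
                }
            }
        }
        return v;
    }
}
```

Every thread that calls get() on the same Once instance receives the same object, so per-factory metadata such as the generated SQL strings exists only once.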

There is a logic behind the query string that Hibernate generates. Its primary aim is to get unique aliases for table and column names.
From your query,
select avatardata0_.Id as Id4305_0_,...... where avatardata0_.Id=?
avatardata0_ ==> avatardata is the alias of the table and 0_ is appended to indicate it is the first table in the query. So if it were the second table(or Entity) in the query it should have been shown as avatardata1_. It uses the same logic for the column aliases.
So, this way all the possible conflicts are avoided.
You are seeing these queries because you have turned on the show_sql flag in the configuration. It is intended for debugging queries; once your application is working you are supposed to turn it off.
Read more on the API docs here.
I am not well informed about the memory consumption part, but you could repeat your tests with the above flag turned off and see if there is any improvement.

Assuming you are using SQL Server, you might want to check the parameter type declaration for '?', making sure it results in the same fixed-length declaration every time.
Dynamic length parameters would result in separate execution plans for each query. This could possibly consume a lot of resources. What we see as the same procedure gets interpreted by SQL Server as a different query, producing a separate execution plan.
Thus,
exec myprocedure @p1 varchar(3)='foo'
and
exec myprocedure @p1 varchar(6)='foobar'
would result in different plans, simply because the declarations of @p1 differ in size.
There is a lot to know about this behaviour. If the above applies to you, I would recommend you read up on 'parameter sniffing'.

No... you can define a common query yourself in Hibernate. The logic behind it is to map to the table and fetch the records from there, so the same query can be used for all databases. Please create a common query like this:
Example :
select t.Id as Id4305_0_,...... from t where t.Id=?

Is HibernateCallback best for executing SQL/procedures?

I'm working on a web based application that belongs to an automobile manufacturer, developed in Spring-Hibernate with an MS SQL Server 2005 database.
There are three kinds of use cases:
1) Through this application, end users can request the creation of a Car, Bus, Truck etc through web based interfaces. When a user logs in, an HTML form gets displayed for capturing the technical specification of the vehicle, e.g. if someone wants to request a Car, he can specify the Engine Make/Model, Tire, Chassis details etc and submit the form. I'm using Hibernate here for persistence, i.e. I have a Car Entity that gets saved in the DB for each such request.
2) This part of the application deals with generation of reports. These reports mainly deal with the number of requests received in a day and their summary. Some of the reports calculate turnaround time for individual Create vehicle requests.
I'm using plain JDBC calls with PreparedStatement (if the report can be generated with SQL), CallableStatement (if the report is complex enough to need a DB procedure/function to fetch all details) and HibernateCallback to execute the SQL/procedures and display information on screen.
3) Search: This part of the application allows end users to search for various request data, i.e. how many vehicles have been requested in a year etc. I'm using DB procedures with CallableStatement, once again executing these procedures within HibernateCallback, then populating and returning the search result to the GUI in a POJO.
I'm using native SQL in (2) and (3) above, because for reporting/search purposes the data structure to display on screen does not match any of my Entities. For example, the Car entity has more than 100 attributes, but for reporting purposes I don't need more than 10 of them, so I just thought loading all 100 attributes does not make any sense, so why not use plain SQL and retrieve just the data needed for displaying on screen.
Similarly for Search, I had to write procedures/functions because the search algorithm is not straightforward and Hibernate has no way to write a stored procedure kind of thing.
This is working fine for the prototype, however I would like to know
a. Whether my approach of using native SQL and DB procedures is fine for cases 2 and 3, based on my judgement.
b. Also whether executing SQL in HibernateCallback is the correct approach.
I need an expert's help.

I would like to know (...) if my approach for using native SQLs and DB procedures are fine for case 2 and 3 based on my judgment
Nothing forces you to use a stored procedure for case 2; you could use HQL and projections as already pointed out:
select f.id, f.firstName from Foo f where ...
Which would return an Object[] or a List<Object[]> depending on the where condition.
And if you want type safe results, you could use a SELECT NEW expression (assuming you're providing the appropriate constructor):
select new Foo(f.id, f.firstName) from Foo f
And you can even return non entities
select new com.acme.LigthFoo(f.id, f.firstName) from Foo f
For case 3, the situation seems different. Just in case, note that the Criteria API is more appropriate than HQL to build dynamic queries. But it looks like this won't help here.
I would like to know (...) whether executing SQLs in HibernateCallback is correct approach?
First of all, there are several restrictions when using stored procedures and I prefer to avoid them when possible. Secondly, if you want to return entities, it is neither the only way nor the simplest solution, as we saw. So for case 2, I would consider using HQL.
For case 3, since you aren't returning entities at all, I would consider not using Hibernate API but the JDBC support from Spring which offers IMHO a cleaner API than Session#connection() and the HibernateCallback.
More interesting readings:
References
Hibernate Core reference guide
14.6. The select clause (about the select new)
16.1.5. Returning non-managed entities (about ResultTransformer)
16.2.2. Using stored procedures for querying
Resources
Hibernate 3.2: Transformers for HQL and SQL
Related questions
hibernate SQLquery extract variable
hibernate query language or using criteria

You should strive to use as much HQL as possible, unless you have a good argument (like performance, but do a benchmark first). If the use of native queries becomes excessive, you should consider whether Hibernate has been a good choice.
Note a few things:
You can have native queries and stored procedures that result in Hibernate entities. You just have to map the query / stored procedure call to a class and call it by name with session.getNamedQuery(queryName) (session.createSQLQuery(..) takes the SQL text itself rather than a query name).
If you really need to construct native queries at runtime, the newest versions of Hibernate have a doWork(..) method, by which you can do JDBC work.

You say
For example, the Car entity has more than 100 attributes, but for reporting purposes I don't need more than 10 of them, so I just thought loading all 100 attributes does not make any sense
but HQL in hibernate allows you to do a projection (select only a subset of the columns back). You don't have to pull the entire entity if you don't want to.
Then you get all the benefits of HQL (typing of results, HQL join syntax) but you can pretty much write SQLish code.
See here for the HQL docs and here for the select syntax. If you're used to SQL it's pretty easy.
So to answer you directly
a - No, I think you should be using HQL
b - Becomes irrelevant if you go with my suggestion for a.

How to limit number of rows returned from Oracle at the JDBC data source level?

Is there a way to limit the rows returned at the Oracle datasource level in a Tomcat application?
It seems maxRows is only available if you set it on the datasource in the Java code. Putting maxRows="2" on the datasource doesn't apply.
Is there any other way limit the rows returned? Without a code change?

It isn't something that is available at the configuration level. You may want to double check that it does what you want it to do anyway: see the javadoc for setMaxRows. With Oracle it is still going to fetch every row back for the query and then just drop the ones outside the range. You would really need to use rownum to make it work well with Oracle and you can't do that either in the configuration.

The question is why you want to limit the number of rows returned. There could be many reasons to do this. The first would be to just limit the data returned by the database. In my opinion this makes no sense in most cases: if I wanted certain data only, I would use a different statement or add a filter condition. E.g. if you use Oracle's rownum, you don't know exactly which data is in the rows you get and which is not included, as you just tell the database that you want rows x to y.
The second reason is to limit memory usage and increase performance, so that the ResultSet you get from the JDBC driver does not hold all the data at once. You can limit the number of rows held by the ResultSet using Statement.setFetchSize(). If you move the cursor in the ResultSet beyond the number of rows fetched, the JDBC driver will fetch the missing data from the database. (In the case of Oracle, the database stores the data in a ref cursor which is directly accessed by the JDBC driver.)
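The two JDBC knobs mentioned in this thread can be set side by side, as in the rough sketch below (the class and method names are hypothetical). setMaxRows caps how many rows the ResultSet will ever yield, with excess rows silently dropped, while setFetchSize is only a hint for how many rows the driver fetches per round trip.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

final class CappedQuery {
    // Runs the query with a hard row cap and a per-round-trip fetch hint,
    // then returns how many rows were actually iterated.
    static int countRows(Connection con, String sql, int maxRows, int fetchSize)
            throws SQLException {
        try (Statement st = con.createStatement()) {
            st.setMaxRows(maxRows);     // rows beyond this limit are dropped by the driver
            st.setFetchSize(fetchSize); // hint only; does not change the result
            try (ResultSet rs = st.executeQuery(sql)) {
                int rows = 0;
                while (rs.next()) {
                    rows++;
                }
                return rows;
            }
        }
    }
}
```

Both calls require a code change, which matches the observation above that neither can be configured on the data source.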

Beware: the code below is provided purely as an example. It has not been tested. It may thus harm you or your computer, or even punch you in the face.
If you want to avoid modifying your SQL queries but still want to have clean code (which means that your code stay maintainable), you may design the solution using wrappers. That is, by using a small set of classes wrapping existing ones, you may achieve what you want seamlessly for the rest of the application which will still think it is working with real DataSource, Connection and Statement.
1 - implement a StatementWrapper or PreparedStatementWrapper class, depending on what your application already uses. Those classes are wrappers around normal Statement or PreparedStatement instances. They simply use the inner statement as a delegate which does all the work, except when this is a QUERY statement (Statement.executeQuery() method). Only in that precise situation, the wrapper surrounds the query with the two following strings: "SELECT * FROM (" and ") WHERE ROWNUM < "+maxRowLimit. For basic wrapper code, see how it looks for the DataSourceWrapper below.
2 - write one more wrapper : ConnectionWrapper which wraps a Connection which returns StatementWrapper in createStatement() and PreparedStatementWrapper in prepareStatement(). Those are the previously coded classes taking ConnectionWrapper's delegateConnection.createStatement()/prepareStatement() as construction arguments.
3 - repeat the step with a DataSourceWrapper. Here is a simple code example.
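import java.sql.Connection;
import java.sql.SQLException;
import javax.sql.DataSource;

public class DataSourceWrapper implements DataSource
{
    private final DataSource mDelegate;

    public DataSourceWrapper( DataSource delegate )
    {
        if( delegate == null ) { throw new NullPointerException( "Delegate cannot be null" ); }
        mDelegate = delegate;
    }

    // Hand out wrapped connections so callers never see the raw one.
    public Connection getConnection(String username, String password) throws SQLException
    {
        return new ConnectionWrapper( mDelegate.getConnection( username, password ) );
    }

    public Connection getConnection() throws SQLException
    {
        ... <same as getConnection(String, String)> ...
    }

    // The remaining DataSource methods simply delegate to mDelegate.
}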
4 - Finally, use that DataSourceWrapper as your application's DataSource. If you're using JNDI (NamingContext), this change should be trivial.
Coding all this is quick and very straightforward, especially if you're using smart IDE like Eclipse or IntelliJ which will implement the delegating methods automagically.
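As an alternative to hand-written wrapper classes, the statement-level rewrite from step 1 can also be sketched with a JDK dynamic proxy. This swaps in a different technique from the one described above, and it is just as untested against a real Oracle database; the names RowLimit, wrap and limiting are hypothetical. It uses ROWNUM <= so that exactly maxRows rows are allowed.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.sql.Statement;

final class RowLimit {
    // Surrounds the original query with an Oracle ROWNUM filter.
    static String wrap(String query, int maxRows) {
        return "SELECT * FROM (" + query + ") WHERE ROWNUM <= " + maxRows;
    }

    // Returns a Statement that delegates everything, but rewrites executeQuery(String).
    static Statement limiting(Statement delegate, int maxRows) {
        return (Statement) Proxy.newProxyInstance(
            RowLimit.class.getClassLoader(),
            new Class<?>[] { Statement.class },
            (proxy, method, args) -> {
                Object[] callArgs = args;
                if (method.getName().equals("executeQuery")
                        && args != null && args.length == 1) {
                    callArgs = new Object[] { wrap((String) args[0], maxRows) };
                }
                try {
                    return method.invoke(delegate, callArgs);
                } catch (InvocationTargetException e) {
                    throw e.getCause(); // surface the delegate's original exception
                }
            });
    }
}
```

A ConnectionWrapper.createStatement() could then return RowLimit.limiting(delegate.createStatement(), limit). PreparedStatement is not covered by this sketch, because there the SQL text is supplied earlier, in Connection.prepareStatement().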

If you know you will be dealing with only one table, then define a view with rownum in the WHERE clause to limit the number of rows. In this way, the number of rows is controlled in the DB and does not need to be specified as part of any query from a client application. If you want to change the number of rows returned, then redefine the view prior to executing the query.
A more dynamic method would be to develop a procedure and pass in a number of rows, and have the procedure return a ref_cursor to your client. This would have the advantage of avoiding hard parsing on the DB, and increase performance.

Ok, a code change it'll have to be then.
The scenario is limiting an ad hoc reporting tool so that the end user doesn't pull back too many records and generate a report which is unusable.
We already use oracle cost based resource management.

Take a look at this page with a description of limiting how much is sucked into the Java App at a time. As another post points out, the DB will still pull all of the data, this is more for controlling network use, and memory on the Java side.
