Cleanest way to build an SQL string in Java

Cleanest way to build an SQL string in Java - java

I want to build an SQL string to do database manipulation (updates, deletes, inserts, selects, that sort of thing) - instead of the awful string concat method using millions of "+"'s and quotes which is unreadable at best - there must be a better way.
I did think of using MessageFormat - but its supposed to be used for user messages, although I think it would do a reasonable job - but I guess there should be something more aligned to SQL type operations in the java sql libraries.
Would Groovy be any good?

First of all consider using query parameters in prepared statements:
PreparedStatement stm = c.prepareStatement("UPDATE user_table SET name=? WHERE id=?");
stm.setString(1, "the name");
stm.setInt(2, 345);
stm.executeUpdate();
The other thing that can be done is to keep all queries in properties file. For example
in a queries.properties file can place the above query:
update_query=UPDATE user_table SET name=? WHERE id=?
Then with the help of a simple utility class:
public class Queries {
private static final String propFileName = "queries.properties";
private static Properties props;
public static Properties getQueries() throws SQLException {
InputStream is =
Queries.class.getResourceAsStream("/" + propFileName);
if (is == null){
throw new SQLException("Unable to load property file: " + propFileName);
}
//singleton
if(props == null){
props = new Properties();
try {
props.load(is);
} catch (IOException e) {
throw new SQLException("Unable to load property file: " + propFileName + "\n" + e.getMessage());
}
}
return props;
}
public static String getQuery(String query) throws SQLException{
return getQueries().getProperty(query);
}
}
you might use your queries as follows:
PreparedStatement stm = c.prepareStatement(Queries.getQuery("update_query"));
This is a rather simple solution, but works well.

For arbitrary SQL, use jOOQ. jOOQ currently supports SELECT, INSERT, UPDATE, DELETE, TRUNCATE, and MERGE. You can create SQL like this:
String sql1 = DSL.using(SQLDialect.MYSQL)
.select(A, B, C)
.from(MY_TABLE)
.where(A.equal(5))
.and(B.greaterThan(8))
.getSQL();
String sql2 = DSL.using(SQLDialect.MYSQL)
.insertInto(MY_TABLE)
.values(A, 1)
.values(B, 2)
.getSQL();
String sql3 = DSL.using(SQLDialect.MYSQL)
.update(MY_TABLE)
.set(A, 1)
.set(B, 2)
.where(C.greaterThan(5))
.getSQL();
Instead of obtaining the SQL string, you could also just execute it, using jOOQ. See
http://www.jooq.org
(Disclaimer: I work for the company behind jOOQ)

One technology you should consider is SQLJ - a way to embed SQL statements directly in Java. As a simple example, you might have the following in a file called TestQueries.sqlj:
public class TestQueries
{
public String getUsername(int id)
{
String username;
#sql
{
select username into :username
from users
where pkey = :id
};
return username;
}
}
There is an additional precompile step which takes your .sqlj files and translates them into pure Java - in short, it looks for the special blocks delimited with
#sql
{
...
}
and turns them into JDBC calls. There are several key benefits to using SQLJ:
completely abstracts away the JDBC layer - programmers only need to think about Java and SQL
the translator can be made to check your queries for syntax etc. against the database at compile time
ability to directly bind Java variables in queries using the ":" prefix
There are implementations of the translator around for most of the major database vendors, so you should be able to find everything you need easily.

I am wondering if you are after something like Squiggle (GitHub). Also something very useful is jDBI. It won't help you with the queries though.

I would have a look at Spring JDBC. I use it whenever I need to execute SQLs programatically. Example:
int countOfActorsNamedJoe
= jdbcTemplate.queryForInt("select count(0) from t_actors where first_name = ?", new Object[]{"Joe"});
It's really great for any kind of sql execution, especially querying; it will help you map resultsets to objects, without adding the complexity of a complete ORM.

I tend to use Spring's Named JDBC Parameters so I can write a standard string like "select * from blah where colX=':someValue'"; I think that's pretty readable.
An alternative would be to supply the string in a separate .sql file and read the contents in using a utility method.
Oh, also worth having a look at Squill: https://squill.dev.java.net/docs/tutorial.html

I second the recommendations for using an ORM like Hibernate. However, there are certainly situations where that doesn't work, so I'll take this opportunity to tout some stuff that i've helped to write: SqlBuilder is a java library for dynamically building sql statements using the "builder" style. it's fairly powerful and fairly flexible.

I have been working on a Java servlet application that needs to construct very dynamic SQL statements for adhoc reporting purposes. The basic function of the app is to feed a bunch of named HTTP request parameters into a pre-coded query, and generate a nicely formatted table of output. I used Spring MVC and the dependency injection framework to store all of my SQL queries in XML files and load them into the reporting application, along with the table formatting information. Eventually, the reporting requirements became more complicated than the capabilities of the existing parameter mapping frameworks and I had to write my own. It was an interesting exercise in development and produced a framework for parameter mapping much more robust than anything else I could find.
The new parameter mappings looked as such:
select app.name as "App",
${optional(" app.owner as "Owner", "):showOwner}
sv.name as "Server", sum(act.trans_ct) as "Trans"
from activity_records act, servers sv, applications app
where act.server_id = sv.id
and act.app_id = app.id
and sv.id = ${integer(0,50):serverId}
and app.id in ${integerList(50):appId}
group by app.name, ${optional(" app.owner, "):showOwner} sv.name
order by app.name, sv.name
The beauty of the resulting framework was that it could process HTTP request parameters directly into the query with proper type checking and limit checking. No extra mappings required for input validation. In the example query above, the parameter named serverId
would be checked to make sure it could cast to an integer and was in the range of 0-50. The parameter appId would be processed as an array of integers, with a length limit of 50. If the field showOwner is present and set to "true", the bits of SQL in the quotes will be added to the generated query for the optional field mappings. field Several more parameter type mappings are available including optional segments of SQL with further parameter mappings. It allows for as complex of a query mapping as the developer can come up with. It even has controls in the report configuration to determine whether a given query will have the final mappings via a PreparedStatement or simply ran as a pre-built query.
For the sample Http request values:
showOwner: true
serverId: 20
appId: 1,2,3,5,7,11,13
It would produce the following SQL:
select app.name as "App",
app.owner as "Owner",
sv.name as "Server", sum(act.trans_ct) as "Trans"
from activity_records act, servers sv, applications app
where act.server_id = sv.id
and act.app_id = app.id
and sv.id = 20
and app.id in (1,2,3,5,7,11,13)
group by app.name, app.owner, sv.name
order by app.name, sv.name
I really think that Spring or Hibernate or one of those frameworks should offer a more robust mapping mechanism that verifies types, allows for complex data types like arrays and other such features. I wrote my engine for only my purposes, it isn't quite read for general release. It only works with Oracle queries at the moment and all of the code belongs to a big corporation. Someday I may take my ideas and build a new open source framework, but I'm hoping one of the existing big players will take up the challenge.

Why do you want to generate all the sql by hand? Have you looked at an ORM like Hibernate Depending on your project it will probably do at least 95% of what you need, do it in a cleaner way then raw SQL, and if you need to get the last bit of performance you can create the SQL queries that need to be hand tuned.

You can also have a look at MyBatis (www.mybatis.org) . It helps you write SQL statements outside your java code and maps the sql results into your java objects among other things.

Google provides a library called the Room Persitence Library which provides a very clean way of writing SQL for Android Apps, basically an abstraction layer over underlying SQLite Database. Bellow is short code snippet from the official website:
#Dao
public interface UserDao {
#Query("SELECT * FROM user")
List<User> getAll();
#Query("SELECT * FROM user WHERE uid IN (:userIds)")
List<User> loadAllByIds(int[] userIds);
#Query("SELECT * FROM user WHERE first_name LIKE :first AND "
+ "last_name LIKE :last LIMIT 1")
User findByName(String first, String last);
#Insert
void insertAll(User... users);
#Delete
void delete(User user);
}
There are more examples and better documentation in the official docs for the library.
There is also one called MentaBean which is a Java ORM. It has nice features and seems to be pretty simple way of writing SQL.

Read an XML file.
You can read it from an XML file. Its easy to maintain and work with.
There are standard STaX, DOM, SAX parsers available out there to make it few lines of code in java.
Do more with attributes
You can have some semantic information with attributes on the tag to help do more with the SQL. This can be the method name or query type or anything that helps you code less.
Maintaince
You can put the xml outside the jar and easily maintain it. Same benefits as a properties file.
Conversion
XML is extensible and easily convertible to other formats.
Use Case
Metamug uses xml to configure REST resource files with sql.

If you put the SQL strings in a properties file and then read that in you can keep the SQL strings in a plain text file.
That doesn't solve the SQL type issues, but at least it makes copying&pasting from TOAD or sqlplus much easier.

How do you get string concatenation, aside from long SQL strings in PreparedStatements (that you could easily provide in a text file and load as a resource anyway) that you break over several lines?
You aren't creating SQL strings directly are you? That's the biggest no-no in programming. Please use PreparedStatements, and supply the data as parameters. It reduces the chance of SQL Injection vastly.

Related

Handling dynamic queries with Spring

The problem I'm trying to solve here is, filtering the table using dynamic queries supplied by the user.
Entities needed to describe the problem:
Table: run_events
Columns: user_id, distance, time, speed, date, temperature, latitude, longitude
The problem statement is to get the run_events for a user, based on a filterQuery.
Query is of the format,
((date = '2018-06-01') AND ((distance < 20) OR (distance > 10))
And this query can combine multiple fields and multiple AND/OR operations.
One approach to solving this is using hibernate and concatenating the filterQuery with your query.
"select * from run_events where user_id=:userId and "+filterQuery;
This needs you to write the entire implementation and use sessions, i.e.
String q = select * from run_events where user_id=:userId and "+filterQuery;
Query query = getSession().createQuery(q);
query.setParameter("userId", userId);
List<Object[]> result = query.list();
List<RunEvent> runEvents = new ArrayList<>();
for(Object[] obj: result){
RunEvent datum = new RunEvent();
int index = -1;
datum.setId((long) obj[++index]);
datum.setDate((Timestamp) obj[++index]);
datum.setDistance((Long) obj[++index]);
datum.setTime((Long) obj[++index]);
datum.setSpeed((Double) obj[++index]);
datum.setLatitude((Double) obj[++index]);
datum.setLongitude((Double) obj[++index]);
datum.setTemperature((Double) obj[++index]);
runEvents.add(datum);
}
This just doesn't seem very elegant and I want to use the #Query annotation to do this i.e.
#Query(value = "select run_event from RunEvent where user_id = :userId and :query order by date asc")
List<RunEvent> getRunningData(#Param("userId") Long userId,
#Param("query") String query,
);
But this doesn't work because query as a parameter cannot be supplied that way in the query.
Is there a better, elegant approach to getting this done using JPA?
Using Specifications and Predicates seems very complicated for this sort of a query.

To answer the plain question: This is not possible with #Query.
It is also in at least 99% of the cases a bad design decision because constructing SQL queries by string concatenation using strings provided by a user (or any source not under tight control) opens you up for SQL injection attacks.
Instead you should encode the query in some kind of API (Criteria, Querydsl, Query By Example) and use that to create your query. There are plenty of questions and answers about this on SO so I won't repeat them here. See for example Dynamic spring data jpa repository query with arbitrary AND clauses
If you insist on using a SQL or JPQL snippet as input a custom implementation using String concatenation is the way to go.

This opens up attack for SQL injection. Maybe that’s why this feature is not possible.
It is generally a bad idea to construct query by appending random filters at the end and running them.
What if the queryString does something awkward like
Select * from Foo where ID=1234 or true;
thereby returning all the rows and bringing a heavy load on DB possibly ceasing your whole application?
Solution: You could use multiple Criteria for filtering it dynamically in JPA, but you’ll need to parse the queryString yourself and add the necessary criteria.

You can use kolobok and ignore fields with null values.
For example create one method like bellow
findByUserIdAndDistanceaLessThanAndDistancebGreaterThan....(String userid,...)
and call that method only with the filter parameters while other parameters are null

Add basic value to Ontology individuals #Jena

I have an Ontology with some Classes and everything setup to run. What is a good way to fill it up with Individuals and Data?? In Short do a one-way Mapping from Database (as Input) to an Ontology.
public class Main {
static String SOURCE = "http://www.umingo.de/ontology/bento.owl";
static String NS = SOURCE+"#";
public static void main(String[] args) throws Exception {
OntModel model = ModelFactory.createOntologyModel( OntModelSpec.OWL_MEM );
// read the RDF/XML file
model.read(SOURCE);
OntologyPreLoader loader = new OntologyPreLoader();
model = loader.init(model);
model.write(System.out,"RDF/XML");
}
}
My Preloader has a Method init with the goal to copy data from a database into the ontology. Here is the Excerpt.
public OntModel init(OntModel model) throws SQLException{
Resource r = model.getResource( Main.NS + "Tag" );
Property tag_name = model.createProperty(Main.NS + "Tag_Name");
OntClass tag = r.as( OntClass.class );
// statements allow to issue SQL queries to the database
statement = connect.createStatement();
// resultSet gets the result of the SQL query
resultSet = statement
.executeQuery("select * from niuu.tags");
// resultSet is initialised before the first data set
while (resultSet.next()) {
// it is possible to get the columns via name
// also possible to get the columns via the column number
// which starts at 1
// e.g., resultSet.getSTring(2);
String id = resultSet.getString("id");
String name = resultSet.getString("name");
Individual tag_tmp = tag.createIndividual(Main.NS+"Tag_"+id);
tag_tmp.addProperty(tag_name,name);
System.out.println("id: " + id);
System.out.println("name: " + name);
}
return model;
}
Everything is working, but I feel really unsure about this way to preload ontologies. Also every Individual should get its own ID so that i can match it with the database at a later point.
Can i simply define a Property ID and add it to every Individual?
I thought about Adding ID to "Thing" as it is the most basic Type in OWL ontologies.

At first sight it seems ok. One tip is to try convert the Jena model into a RDF serialization and run it through Protégé to get a more clear picture on how your ontology mapping looks like.
You can definitely make your own property to describe the id of every individual.
Beneath is an example on how you can create a similar property in turtle format.(I did not add the prefixes for OWL and rdfs since they are some common)
You can add this in Jena aswell if needed. (or load this into your model in Jena.)
#prefix you: <your domain> .
you:dbIdentificator a owl:DatatypeProperty .
you:dbIdentificator rdfs:label "<Your database identifcator>"#en .
you:dbIdentificator rdfs:comment "<Some valuable information if needed>"#en .
you:dbIdentificator rdfs:isDefinedBy <your domain> .
you:dbIdentificator rdfs:domain owl:Thing .
You could also add owl:Thing to every resource, but that is not the best practice because it is a vague definition of a resource. I would look around for vocabularies that defines more what the resource is. Take a look at GoodRelations. It is a very good defined vocabulary that can describe information even though it is not for commercial use. Especially check out the classes there.
Hope that answered some of your question.

Programatically generating URIs is always somewhat unsettling. If you have Guava, use Preconditions to make some fail-fast assertions about what's coming out of the database (so that your code will let you know if it gets out of alignment with your schema). Use the JDK's URLEncoder to ensure that the id you get from the database is converted to a URI-friendly format (Note that if your data contains characters that cannot be printed in xml and have no percent encoding, you'll need to manually handle them).
For your property/column values, use explicitly create the literal. This makes it very clear whether you are using plain literals, language literals, or typed literals:
// If things can have multiple names in multiple languages, for example
tag_tmp.addProperty(tag_name,model.createTypedLiteral(name, "en"));
Note that you may not wish to define your schema so that it implies things about owl:Thing, because that would have implications outside of your domain. Instead, define a domain-specific notion like a :DatabaseResource. Set the domains of your properties to be that and it's subclasses rather than thing. This way the use of your property implies that the subject with within your domain, rather than simply an owl individual (which is implied by the domain of owl:DatatypeProperty anyway).
EDIT: It's absolutely acceptable to create a representation of the database's unique ID and place it into the RDF model. If you are using owl2, you can define an OWL-2 Key on that property for your :DatabaseResources and keep the same semantics that you had in the database.
EDIT: Noting a portion of your post on the Jena mailing list:
I have a huge MYSQL-Database for read only purpose and want to extract some Data into the Ontology.
I would highly recommend using the TDB Java API to construct a Dataset that backed by your disk. I've worked on very large database exports before, and it's quite possible that your data size won't be tractable otherwise. TDB's indexing requires a lot of disk space, but the memory-mapped IO makes it very difficult to kill due to OOM errors. Finally, once you have constructed the database on disk, you won't have to perform this expensive import operation again (or could at least optimize it).
If you find database creation times to be prohibitive, then you may with to utilize the bulk loader in creative ways. This answer has an example of using the bulk loader from java.

How to keep the sql queries out of the java code

I've written a small code which functions like follows:
Read an input file
Read the first line in it and check if it
has value AAA at a certain position
if it satisfies the
condition call the method insertAAAmethod and load the data to the
oracle table
read the next line if it is record BBB call the insertBBBrmethod which has different insert query
The problem is I have 15 different records in the input file so I have 15 different methods like insertAAAmethod each with different insert query:
public static void insertAAARecord() throws Exception {
String sqlQuery = "insert into my_table(ColumnA,ColumnB,ColumnC,ColumnD,ColumnE,ColumnF,ColumnG)"
+ "values (?,?,?,?,?,?,?)";
try {
pstmt = conn.prepareStatement(sqlQuery);
pstmt.setString(1, "AAA");
pstmt.setDate(2,
StringtoDate("AAA", CurrentLine.substring(150, 158)));
pstmt.setDate(3,
StringtoDate("AAA", CurrentLine.substring(158, 166)));
pstmt.setString(4, CurrentLine.substring(24, 34));
pstmt.setString(5, CurrentLine.substring(37, 45));
pstmt.setString(6, CurrentLine.substring(147, 150));
pstmt.setDate(7, headerDate);
pstmt.executeUpdate();
} catch (SQLException e) {
e.printStackTrace(System.out);
} finally {
pstmt.close();
}
}
Is it possible to keep the sql query out of the java code? (Like in a properties file, for example)
Note: The insert query changes as per the record if it is record 'AAA' it should follow one insert query if the record is different insert query should change..
Let me know how to optimize my code.

Yes, you should create a properties file with all your queries and load them on startup using Properties object, then use the Properties to look them up. You can also you Spring to inject the queries into your configuration objects.
I have always built DB object that integrate your code to DB data and queries are kept there, it is much easier to debug and manage as everything you need is in one place.
Make your life simple, avoid ORMs and tune your SQL queries (or let DBAs do that that's what they do and they are good at it). However if you do not like SQL or don't care how efficient it is, then ORM like Hibernate may be what you need.

Maybe this will take more time than other approaches, but it would help you with your code optimization.
Every query/process have common attributes.
SQL Query
String pattern
Parameters
and each parameter has
Position
Datatype
etc
If you're using spring already, you'll be able to define these as beans in your config.xml.
if not, you can use xml configuration anyway (instead of property files)
Then, you will have to create some class to parse these beans and create the custom queries.
Hope it helps.

Try to use any ORM framework like Hibernate/JPA, Ibatis?

You want to use JPA. I would recommend walking through a simple JPA tutorial such as the following: http://glassfish.java.net/javaee5/persistence/persistence-example.html
That tutorial will demonstrate the basics of how to create or modify objects and persist those changes in a database. Good luck.

How to Manipulate SQL specified in jrxml template

Problem: Report Templates are created by Admin Users who only decide what data to show where as the filter for the data is specified by the business user. In simpler SQL terms, query is specified by the Admin User, Business User specifies the WHERE clause.
Jasper allows user to specify parameters in SQL query like $P{city}. I have tried to retrieve the query dynamically using the method specfied in the link.
Possible Solution can be
Use WHERE clause parameters in JRXML and replace them while report creation - This will save me SQL parsing but I don't want to guide the admin user with this complexity. Parsing is not a huge problem.
Use my custom jdbc query executor and factory, only created to allow me extension point before jasper fire SQL query. I will be completely relying on vanilla Jasper JDBC data source but will only modify query before execution. JRAbstractQueryExecuter simplifies the query and replace the jasper related tokens before firing query - This will be very dirty and will force me to be implementation specific.
Do the same replacement as it is done in JRAbstractQueryExecuter in my application code base, parse the SQL query, modify it and set it again as specified in link
Can you please suggest a better way of doing this? I have a feeling that this can definitly be done in cleaner way.
Thanks for your help

You could create an input control to determine the desired WHERE clause and use a parameter to hold the contents of that WHERE clause. The default value expression would be something like:
$P{theParameter} == "value_1" ?
(" AND CONDITION_1 IN ('A', 'B', 'C') AND CONDITION_2 = 'Yes' "
) : " AND CONDITION_3 = 'Other' AND CONDITION_4 = 'No' "
Then in your WHERE clause you would reference it like:
WHERE
.... = .....
AND .... = ....
AND .... = ....
$P!{theParameter}
If your constraint columns are the same across your WHERE clauses, you could use $P! to bring in the parameter value literally, and reference it in your query:
WHERE
.... = .....
AND .... = ....
AND .... = ....
AND thisValue = $P!{theParameter}

How can I generically detect if a database is 'empty' from Java

Can anyone suggest a good way of detecting if a database is empty from Java (needs to support at least Microsoft SQL Server, Derby and Oracle)?
By empty I mean in the state it would be if the database were freshly created with a new create database statement, though the check need not be 100% perfect if covers 99% of cases.
My first thought was to do something like this...
tables = metadata.getTables(null, null, null, null);
Boolean isEmpty = !tables.next();
return isEmpty;
...but unfortunately that gives me a bunch of underlying system tables (at least in Microsoft SQL Server).

There are some cross-database SQL-92 schema query standards - mileage for this of course varies according to vendor
SELECT COUNT(*) FROM [INFORMATION_SCHEMA].[TABLES] WHERE [TABLE_TYPE] = <tabletype>
Support for these varies by vendor, as does the content of the columns for the Tables view. SQL implementation of Information Schema docs found here:
http://msdn.microsoft.com/en-us/library/aa933204(SQL.80).aspx
More specifically in SQL Server, sysobjects metadata predates the SQL92 standards initiative.
SELECT COUNT(*) FROM [sysobjects] WHERE [type] = 'U'
Query above returns the count of User tables in the database. More information about the sysobjects table here:
http://msdn.microsoft.com/en-us/library/aa260447(SQL.80).aspx

I don't know if this is a complete solution ... but you can determine if a table is a system table by reading the table_type column of the ResultSet returned by getTables:
int nonSystemTableCount = 0;
tables = metadata.getTables(null, null, null, null);
while( tables.next () ) {
if( !"SYSTEM TABLE".equals( tables.getString( "table_type" ) ) ) {
nonSystemTableCount++;
}
}
boolean isEmpty = nonSystemTableCount == 0;
return isEmpty;
In practice ... I think you might have to work pretty hard to get a really reliable, truly generic solution.

Are you always checking databases created in the same way? If so you might be able to simply select from a subset of tables that you are familiar with to look for data.
You also might need to be concerned about static data perhaps added to a lookup table that looks like 'data' from a cursory glance, but might in fact not really be 'data' in an interesting sense of the term.
Can you provide any more information about the specific problem you are trying to tackle? I wonder if with more data a simpler and more reliable answer might be provided.
Are you creating these databases?
Are you creating them with roughly the same constructor each time?
What kind of process leaves these guys hanging around, and can that constructor destruct?
There is certainly a meta data process to loop through tables, just through something a little more custom might exist.

In Oracle, at least, you can select from USER_TABLES to exclude any system tables.

I could not find a standard generic solution, so each database needs its own tests set.
For Oracle for instance, I used to check tables, sequences and indexes:
select count(*) from user_tables
select count(*) from user_sequences
select count(*) from user_indexes
For SqlServer I used to check tables, views and stored procedures:
SELECT * FROM sys.all_objects where type_desc in ('USER_TABLE', 'SQL_STORED_PROCEDURE', 'VIEW')
The best generic (and intuitive) solution I got, is by using ANT SQL task - all I needed to do is passing different parameters for each type of database.
i.e. The ANT build file looks like this:
<project name="run_sql_query" basedir="." default="main">
<!-- run_sql_query: -->
<target name="run_sql_query">
<echo message="=== running sql query from file ${database.src.file}; check the result in ${database.out.file} ==="/>
<sql classpath="${jdbc.jar.file}"
driver="${database.driver.class}"
url="${database.url}"
userid="${database.user}"
password="${database.password}"
src="${database.src.file}"
output="${database.out.file}"
print="yes"/>
</target>
<!-- Main: -->
<target name="main" depends="run_sql_query"/>
</project>
For more details, please refer to ANT:
https://ant.apache.org/manual/Tasks/sql.html

We Keep Coding

Java is a programming language and computing platform first released by Sun Microsystems in 1995.