I am developing a portlet that accesses a database quite often. The query has to support filtering in response to user input. There are two filter parameters at the moment, but this number can grow in the future.
At the moment my construction works well for all inputs; however, I don't think I am doing it in the right or most effective way, since I do not use a prepared statement and just construct the query manually.
This is an example of my code (serviceFilter is an ArrayList and typeFlag is a String):
private String prepareQuery() {
String query = "SELECT * from messages ";
// check filters
if (!typeFlag.equals("ALL")) {
if (typeFlag.equals("XML")) {
query += "WHERE type='" + TYPE_XML + "'";
} else {
query += "WHERE type='" + TYPE_JAVA + "'";
}
}
// lets see if user specifies some service filtering
if (serviceFilter.size() > 0) {
if (!typeFlag.equals("ALL")) {
query += " AND (";
} else {
query += " WHERE (";
}
for (int i = 0; i < serviceFilter.size(); i++) {
if (i>0) {
query += " OR ";
}
String service = serviceFilter.get(i);
System.out.println("Filter: " + service);
query += "sender='" + service + "' OR receiver='" + service + "'";
}
query += ")";
}
query += " ORDER BY id DESC LIMIT " + String.valueOf(limit);
System.out.println(query);
return query;
}
The first problem is that this does nothing to prevent SQL injection (which is not such a big problem here, since all the inputs come from checkboxes and scrollbars, so the user does not actually type anything). I am not sure how to use a prepared statement here, because my ArrayList can get quite long and changes for every query.
Because of this, the query itself can get really long. Here is an example of a query for just two arguments (imagine this for 20 items):
SELECT * from messages WHERE (sender='GreenServiceESB#GreenListener' OR receiver='GreenServiceESB#GreenListener' OR sender='queue/DeadMessageQueue' OR receiver='queue/DeadMessageQueue') ORDER BY id DESC LIMIT 50
So basically, my question is: is this an effective way of constructing my query (probably not, right)? What approach would you suggest?
PS: I am using JDBC to connect to the database and execute the query, in case that matters.
Thanks for any tips!
If you want to use something like
create.selectFrom(BOOK)
.where(PUBLISHED_IN.equal(2011))
.orderBy(TITLE)
instead of
SELECT * FROM BOOK
WHERE PUBLISHED_IN = 2011
ORDER BY TITLE
you can look at http://www.jooq.org/. It will simplify your code, and you can avoid things like "if (something) { sql += " WHERE ..." }". This is an antipattern and should be avoided whenever possible.
First of all, you hinted at one of your issues - not using a PreparedStatement. Taking user input and using it directly in a SQL statement opens a site to SQL injection attacks.
I think what you're wanting is this:
select * from (
select messages.*, row_number() over (order by id desc) as rowNum
from messages
where sender in (?,?,?,?,?,?,?,?) --8, 16 or however many ?'s you'll need
or receiver in (?,?,?,?,?,?,?,?)
) results
where rowNum between 1 and ?
order by rowNum
Now you bind the parameters to whatever the user input is, and if you have extra spots left over in your IN operator, you bind them to some value that can't (or likely won't) be in your table, such as null or HiMom#2$#. If you need to support an arbitrary number of values, you run the query multiple times and sort the results in memory yourself.
As for the row_number function, it may not work in MySQL (I'm not a MySQL guy), but MySQL does have an equivalent (or it could be that limit is parameterizable, I don't know).
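An alternative to padding a fixed number of placeholders is to generate exactly as many ? marks as there are filter values and keep the values in a list that is bound afterwards. The following is only a sketch under assumptions: the class name MessageQuery and its methods are made up for illustration, a null type stands for the "ALL" case, and it assumes the driver accepts a bound parameter for LIMIT (if it does not, the already validated int can be appended instead).

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: builds the SQL text and the matching parameter list together.
final class MessageQuery {
    final String sql;
    final List<Object> params;

    private MessageQuery(String sql, List<Object> params) {
        this.sql = sql;
        this.params = params;
    }

    /** type == null means "no type filter"; services may be empty. */
    static MessageQuery build(String type, List<String> services, int limit) {
        List<String> conditions = new ArrayList<>();
        List<Object> params = new ArrayList<>();

        if (type != null) {
            conditions.add("type = ?");
            params.add(type);
        }
        if (!services.isEmpty()) {
            // One "?" per service, e.g. "?, ?, ?" for three services.
            String marks = String.join(", ", Collections.nCopies(services.size(), "?"));
            conditions.add("(sender IN (" + marks + ") OR receiver IN (" + marks + "))");
            params.addAll(services); // values for the sender list
            params.addAll(services); // values for the receiver list
        }

        StringBuilder sql = new StringBuilder("SELECT * FROM messages");
        if (!conditions.isEmpty()) {
            sql.append(" WHERE ").append(String.join(" AND ", conditions));
        }
        sql.append(" ORDER BY id DESC LIMIT ?");
        params.add(limit);
        return new MessageQuery(sql.toString(), params);
    }

    /** Binds the collected values in order; JDBC parameters are 1-based. */
    PreparedStatement prepare(Connection con) throws SQLException {
        PreparedStatement ps = con.prepareStatement(sql);
        for (int i = 0; i < params.size(); i++) {
            ps.setObject(i + 1, params.get(i));
        }
        return ps;
    }
}
```

Only the structure of the statement (the number of placeholders) depends on user input here; the values themselves never become part of the SQL text. Adding a third filter later means adding one more condition and its values to the two lists.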
I'm trying to write an SQL query that checks whether an id is in a set, but it fails with an error saying that only 1000 items are allowed in the list. I'm trying to solve it but I got stuck here:
for (int i = 0; i < e.getEmployeeSet().size(); i+=1000) {
sqlQuery.append("AND employee.id");
if(!e.incudeEmployee()){
sqlQuery.append("NOT ");
}
sqlQuery.append("IN (");
for(Employee employee: e.getEmployeeSet().){
sqlQuery.append(employee.getEmployeeId())
.append(",");
}
sqlQuery.deleteCharAt(sqlQuery.length()-1)
.append(") ");
}
I still have to figure out how to emit AND id ... the first time and OR ... the other times, and how to go over the set so that the first pass covers only the first 1000 employees, and so on. Is there a clean way to fix this?
SQL allows up to 1000 values in an IN list here. Besides, a long IN list is not an efficient way to pass the values.
It is better to store the data in a temporary table and join to it in your query.
Temporary table creation:
create table temp_emprecords as
select * from a, b,c
where clause...;
Now add the temp_emprecords table to your query and join on the employee id.
select *
from employee emp,
temp_emprecords tmp
where emp.id = tmp.id;
You can modify your sql to be like:
SELECT /* or UPDATE (whatever you do) */
...
WHERE
employee.id IN (... first thousand elements ...)
OR
employee.id IN (... next thousand elements ...)
OR
... and so on ...
Your Java code will be slightly different, to produce an "OR employee.id IN" block for each thousand ids.
UPD: to do that, just introduce another counter, like this (pseudocode):
counter=0;
for each employeeId {
if counter equals 1000 {
complete current IN block;
counter=0;
if not first thousand {
start new OR block;
}
start new IN block;
}
add employeeId into IN block;
counter++;
}
But important: I do not recommend building the query the way you do, with or without the OR blocks.
Constructing SQL like this is a direct path to SQL injection.
To avoid it, just follow a simple rule:
No actual data may be inlined in the SQL string. All data must be passed to the query as parameters.
You have to use a prepared SQL statement with parameters for the employee.id values.
Also: a simple way is to run a separate query for each 1000 ids in a loop.
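Both ideas (OR-ed IN blocks of at most 1000 placeholders, or one query per 1000 ids) can be sketched with two small helpers. This is an illustration only: the class name InClauseChunker and its methods are made up, and the values are meant to be bound afterwards with PreparedStatement setters rather than inlined.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for working around a per-list limit such as 1000.
final class InClauseChunker {

    /** Splits items into consecutive chunks of at most chunkSize elements. */
    static <T> List<List<T>> partition(List<T> items, int chunkSize) {
        List<List<T>> chunks = new ArrayList<>();
        for (int from = 0; from < items.size(); from += chunkSize) {
            int to = Math.min(from + chunkSize, items.size());
            chunks.add(new ArrayList<>(items.subList(from, to)));
        }
        return chunks;
    }

    /**
     * Builds "(col IN (?, ...) OR col IN (?, ...))" with valueCount placeholders
     * in total and at most chunkSize placeholders per IN list.
     */
    static String inClause(String column, int valueCount, int chunkSize) {
        if (valueCount == 0) {
            return "(1 = 0)"; // an empty set matches nothing
        }
        List<String> blocks = new ArrayList<>();
        for (int done = 0; done < valueCount; done += chunkSize) {
            int n = Math.min(chunkSize, valueCount - done);
            String marks = String.join(", ", Collections.nCopies(n, "?"));
            blocks.add(column + " IN (" + marks + ")");
        }
        return "(" + String.join(" OR ", blocks) + ")";
    }
}
```

With inClause the ids are bound in their original order, one setter call per placeholder. With partition, the same short statement can be executed once per chunk and the results collected in Java. For the NOT IN variant the blocks would have to be joined with AND instead of OR.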
So the solution that works is like this:
int counter = 1;
String queryString= "WHERE (employeeId IN ( ";
Iterator<Long> it = getEmployeeSet().iterator();
while (it.hasNext()) {
if (counter % 999 == 0)
queryString= queryString.substring(0, queryString.length() - 1) + " ) or "
+ "employeeId IN ( '" + it.next()+ "',";
else
queryString+= "'" + it.next() + "',";
counter++;
}
append(queryString.substring(0, queryString.length() - 1) + " )) ");
I have EVENTS table in my database. A regular insertion of a new event would look like this:
private static final String INSERT_EVENT_SQL = "INSERT INTO EVENTS"
+ "(EVENT_ID, AGGREGATE_ID, AGGREGATE_VERSION, EVENT_TYPE, EVENT_PAYLOAD) VALUES"
+ "(?,?,?,?,?)";
pst = conn.prepareStatement(INSERT_EVENT_SQL);
pst.setString(1, event.getEventId().toString());
pst.setString(2, event.getAggregateId().toString());
pst.setLong(3, event.getAggregateVersion());
pst.setString(4, event.getEventType());
pst.setString(5, event.getPayload());
I would like to make this insertion conditional and atomic.
The condition that must be satisfied is that event.getAggregateVersion() is equal to the current aggregate version stored in the database plus 1.
The current aggregate version can be calculated from database entries in either of two ways:
Find the latest event having the same AGGREGATE_ID and get its AGGREGATE_VERSION
Maximal value of AGGREGATE_VERSION among all events having the same AGGREGATE_ID
The version comparison and insertion should be done atomically in order to prevent concurrent insertion of two events having the same AGGREGATE_ID and AGGREGATE_VERSION.
Nice to have: if the insertion fails because the comparison fails, it would be nice to have an exception thrown that is not a general SQLException (in order to handle the version violation in a special way).
You can probably write it like:
private static final String INSERT_EVENT_SQL = ""
+ "WITH data (event_id, aggregate_id, aggregate_version, event_type, event_payload) AS "
+ "("
+ " VALUES(?, ?, ?, ?, ?)"
+ ") "
+ "INSERT INTO "
+ " events"
+ " (event_id, aggregate_id, aggregate_version, event_type, event_payload) "
+ "SELECT"
+ " * "
+ "FROM"
+ " data "
+ "WHERE"
+ " data.aggregate_version = "
+ " coalesce ((SELECT max(events.aggregate_version)"
+ " FROM events "
+ " WHERE events.aggregate_id = data.aggregate_id"
+ " ), 0) + 1 "
+ "RETURNING"
+ " event_id, aggregate_id ;";
// bind parameters
// execute
// get results
Then check whether any rows were returned, and act accordingly. You should wrap this in a TRANSACTION and also issue SET TRANSACTION ISOLATION LEVEL SERIALIZABLE, as mentioned by Craig Ringer, to avoid concurrency risks. Even though this is just one statement, another process working in parallel (or an external client) might also INSERT a row while the max is being computed.
I don't fully grasp the meaning of your columns, nor why you're doing what you're doing. So I've made a few assumptions of my own that might not do the right thing when seeking the max. You will probably need to change the WHERE.
See dbfiddle here for a simulation of how this will behave.
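The "bind, execute, check the returned rows" steps could look roughly like the sketch below. It rests on several assumptions: VersionConflictException and EventInserter are made-up names, insertSql is expected to be a conditional INSERT ... RETURNING statement with five placeholders in the column order shown in the question, and the JDBC driver is assumed to hand back the RETURNING rows through executeQuery (the PostgreSQL driver does).

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical exception type, so a version violation is not a plain SQLException.
class VersionConflictException extends Exception {
    VersionConflictException(String aggregateId, long attemptedVersion) {
        super("Aggregate " + aggregateId + " is not at version " + (attemptedVersion - 1));
    }
}

final class EventInserter {
    static void insert(Connection conn, String insertSql, String eventId, String aggregateId,
                       long version, String type, String payload)
            throws SQLException, VersionConflictException {
        // Set the isolation level before the transaction starts.
        conn.setTransactionIsolation(Connection.TRANSACTION_SERIALIZABLE);
        conn.setAutoCommit(false);
        try (PreparedStatement pst = conn.prepareStatement(insertSql)) {
            pst.setString(1, eventId);
            pst.setString(2, aggregateId);
            pst.setLong(3, version);
            pst.setString(4, type);
            pst.setString(5, payload);

            boolean inserted;
            try (ResultSet rs = pst.executeQuery()) {
                inserted = rs.next(); // a returned row means the condition held
            }
            if (!inserted) {
                conn.rollback();
                throw new VersionConflictException(aggregateId, version);
            }
            conn.commit();
        } catch (SQLException e) {
            conn.rollback();
            throw e;
        } finally {
            conn.setAutoCommit(true);
        }
    }
}
```

Under SERIALIZABLE isolation a concurrent writer can still make the commit fail with a serialization error (SQLState 40001 in PostgreSQL); callers would typically treat that case as a conflict or retry it.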
The normal way to have an incrementing column in Postgres is to declare it as SERIAL:
create table . . . (
tableId serial primary key,
. . .
);
This is functionally equivalent to auto_increment and identity in other databases. You can read about serial in the documentation.
As noted in the documentation, there can be gaps under some circumstances. However, this is usually the best way to get an increasing sequence number into the table -- and the database does most of the work.
I need to choose one of three integer values based on the value of a nullable column of a table.
There are at least two approaches: 1) use SQL to do all the work: test for null and choose between the other values, or 2) read the value and use code (in this case Java) to choose.
Which one is "better", i.e. easier to understand and more maintainable? Do you have any other metric you use to decide?
As an example, I have the following code:
// If id is equal to:
// -1, then make v = 1
// null, then make v = 2
// in any other case, make v = 3
// Option 1:
int v;
String query = "SELECT CASE min(Id) WHEN NULL THEN 2 WHEN -1 THEN 1 ELSE 3 END AS Id"
+ " FROM TableA WHERE SomeField IN (SELECT ...blah blah...)";
ResultSet rs = // execute query
if (rs.next()) {
v = rs.getInt("Id");
} else {
// TODO something went *very* wrong...
}
// Option 2:
int v;
String query = "SELECT min(Id) Id"
+ " FROM TableA WHERE SomeField IN (SELECT ...blah blah...)";
ResultSet rs = // execute query
if (rs.next()) {
final int id = rs.getInt("Id");
if (rs.wasNull()) {
v = 2;
} else if (id == -1) {
v = 1;
} else {
v = 3;
}
} else {
// TODO something went *very* wrong...
}
I’d say have SQL do the work. It’s fairly trivial and won’t soak up CPU time, and SQL will have to load the pertinent info in memory anyway so it’s already there for processing. Doing it on the app side, to a certain extent it seems like you have to “re-stage” the data for analysis, and (imho) the java code seems more difficult to read through and understand.
Note that there’s a minor flaw in your SQL code, you can’t use WHEN NULL that way in a case statement. You’d want something like
...case
when min(Id) is null then 2
when min(Id) = -1 then 1
else 3
end
I'd go for SQL if the query is not very complex (meaning it executes in a reasonable time). If the query takes 10 minutes, better try Java. But I would always prefer the SQL approach if the DB can do the job for me.
(just copy of my comment)
I'm having a problem with a Java OutOfMemoryError. The program basically looks at MySQL tables that are running on MySQL Workbench, queries them to get certain information, and then writes it to CSV files.
The program works just fine with a smaller data set, but once I use a larger one (hours of logging information as opposed to perhaps 40 minutes) I get this error, which suggests to me that the problem comes from the size of the data set and the way the program handles it. Or perhaps it is simply not possible to handle this amount of data the way I do.
Setting the Java VM argument -Xmx1024m worked for a slightly larger data set, but I need it to handle even bigger ones, and there it still gives the error.
Here is the method which I am quite sure is the cause of the problem:
// CSV is csvwriter (external lib), sment are Statements, rs is a ResultSet
public void pidsforlog() throws IOException
{
String[] procs;
int count = 0;
String temp = "";
System.out.println("Commence getting PID's out of Log");
try {
sment = con.createStatement();
sment2 = con.createStatement();
String query1a = "SELECT * FROM log, cpuinfo, memoryinfo";
rs = sment.executeQuery(query1a);
procs = new String[countThrough(rs)];
// SIMPLY GETS UNIQUE PROCESSES OUT OF TABLES AND STORES IN ARRAY
while (rs.next()) {
temp = rs.getString("Process");
if(Arrays.asList(procs).contains(temp)) {
} else {
procs[count] = temp;
count++;
}
}
// BELIEVE THE PROBLEM LIES BELOW HERE. SIZE OF THE RESULTSET TOO BIG?
for(int i = 0; i < procs.length; i++) {
if(procs[i] == null) {
} else {
String query = "SELECT DISTINCT * FROM log, cpuinfo, memoryinfo WHERE log.Process = " + "'" + procs[i] + "'" + " AND cpuinfo.Process = " + "'" + procs[i] + "'" + " AND memoryinfo.Process = " + "'" + procs[i] + "' AND log.Timestamp = cpuinfo.Timestamp = memoryinfo.Timestamp";
System.out.println(query);
rs = sment.executeQuery(query);
writer = new CSVWriter(new FileWriter(procs[i] + ".csv"), ',');
writer.writeAll(rs, true);
writer.flush();
}
}
writer.close();
} catch (SQLException e) {
notify("Error pidslog", e);
}
}; // end of method
Please feel free to ask if you want source code or more information as I'm desperate to get this fixed!
Thanks.
SELECT * FROM log, cpuinfo, memoryinfo will surely give a huge result set. It will give a Cartesian product of all rows in all 3 tables.
Without seeing the table structure (or knowing the desired result) it's hard to pinpoint a solution, but I suspect that you either want some kind of join conditions to limit the result set, or a UNION, à la:
SELECT Process FROM log
UNION
SELECT Process FROM cpuinfo
UNION
SELECT Process FROM memoryinfo
...which will just give you all distinct values for Process in all 3 tables.
Your second SQL statement also looks a bit strange;
SELECT DISTINCT *
FROM log, cpuinfo, memoryinfo
WHERE log.Process = #param1
AND cpuinfo.Process = #param1
AND memoryinfo.Process = #param1
AND log.Timestamp = cpuinfo.Timestamp = memoryinfo.Timestamp
Looks like you're trying to select from all 3 logs simultaneously, but ending up with another cartesian product. Are you sure you're getting the result set you're expecting?
You could limit the results returned by your SQL queries with the LIMIT clause.
For example:
SELECT * FROM `your_table` LIMIT 100
This will return the first 100 results.
SELECT * FROM `your_table` LIMIT 100, 200
This will skip the first 100 rows and return the next 200 (in MySQL the syntax is LIMIT offset, row_count).
Obviously you can iterate with those values so you get through all the elements in the database no matter how many there are.
I think you are loading too much data into memory at the same time. Try to use offset and limit in your SQL statement so that you can avoid this problem.
Your Java code is doing things that the database could do more efficiently. From query1a, it looks like all you really want is the unique processes. select distinct Process from ... should be sufficient to do that.
Then, think carefully about what table or tables are needed in that query. Do you really need log, cpuinfo, and memoryinfo? As Joachim Isaksson mentioned, this is going to return the Cartesian product of those three tables, giving you x * y * z rows (where x, y, and z are the row counts in each of those three tables) and a + b + c columns (where a, b, and c are the column counts in each of the tables). I doubt that's what you want or need. I assume you could get those unique processes from one table, or a union (rather than join) of the three tables.
Lastly, your second loop and query are essentially doing a join, something again better and more efficiently left to the database.
Like others said, fetching the data in smaller chunks might resolve the issue.
This is one of the other threads on Stack Overflow that talks about this issue:
How to read all rows from huge table?
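Besides paging with LIMIT, the rows can be streamed so that the driver does not buffer the whole result in memory. The sketch below assumes MySQL Connector/J, which streams row by row when the statement is forward-only, read-only and has a fetch size of Integer.MIN_VALUE. StreamingCsvExport and csvField are made-up names, the query is assumed to take the process name as its single parameter, and the CSV writing is done by hand only to keep the example self-contained.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical exporter: writes one row at a time instead of holding all rows in memory.
final class StreamingCsvExport {

    /** Quotes a field when it contains a comma, a quote or a line break. */
    static String csvField(String value) {
        if (value == null) {
            return "";
        }
        boolean mustQuote = value.contains(",") || value.contains("\"")
                || value.contains("\n") || value.contains("\r");
        String escaped = value.replace("\"", "\"\"");
        return mustQuote ? "\"" + escaped + "\"" : escaped;
    }

    static void export(Connection con, String sql, String process, String file)
            throws SQLException, IOException {
        try (PreparedStatement ps = con.prepareStatement(
                sql, ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {
            ps.setFetchSize(Integer.MIN_VALUE); // Connector/J hint: stream rows one by one
            ps.setString(1, process);
            try (ResultSet rs = ps.executeQuery();
                 BufferedWriter out = Files.newBufferedWriter(Paths.get(file))) {
                int cols = rs.getMetaData().getColumnCount();
                while (rs.next()) {
                    StringBuilder line = new StringBuilder();
                    for (int c = 1; c <= cols; c++) {
                        if (c > 1) {
                            line.append(',');
                        }
                        line.append(csvField(rs.getString(c)));
                    }
                    out.write(line.toString());
                    out.newLine();
                }
            }
        }
    }
}
```

While a streaming result set is open, the same connection cannot run other statements, so the list of processes would have to be read completely (or on a second connection) before the export loop starts.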
I have a SELECT query which combines three tables. I want to add the rows to a JTable, separated by the MySQL table they came from. I just want to know how I can identify the table name in a ResultSet.
resultLoad("SELECT sqNum,RegDate,JobNo,DecName,NoOfLines,EntryType,EntrySubType,EntrySubType2,Amount,Status FROM paysheet WHERE TypeBy='" + cmbStaff.getSelectedItem().toString() + "' AND CommissionStatus='no' UNION SELECT sqNum,RegDate,JobNo,DecName,NoOfLines,EntryType,EntrySubType,EntrySubType2,Amount,Status FROM creditsheet WHERE TypeBy='" + cmbStaff.getSelectedItem().toString() + "' AND CommissionStatus='no' ORDER BY RegDate UNION SELECT sqNumber,date,JobNumber,DecName,noOfLines,type,type,type,CommissionAmount,status FROM newjobsheet WHERE TypeBy='" + cmbStaff.getSelectedItem().toString() + "' AND CommissionStatus='no' ORDER BY RegDate");
private void resultLoad(String loadQuery) {
try {
Connection c = DB.myConnection();
Statement s = c.createStatement();
ResultSet r = s.executeQuery(loadQuery);
while (r.next()) {
Vector v = new Vector();
v.addElement(r.getString("sqNum"));
v.addElement(r.getString("RegDate"));
v.addElement(r.getString("JobNo"));
v.addElement(r.getString("DecName"));
v.addElement(r.getString("NoOfLines"));
v.addElement(r.getString("EntryType") + " - " + r.getString("EntrySubType") + " - " + r.getString("EntrySubType2"));
v.addElement(r.getString("Amount"));
v.addElement(r.getString("Status"));
tm.addRow(v);
totalComAmount = totalComAmount + Integer.parseInt(r.getString("Amount"));
}
} catch (Exception e) {
// e.printStackTrace();
JOptionPane.showMessageDialog(this, e, "Error!", JOptionPane.ERROR_MESSAGE);
}
}
I want to add the rows to the JTable like this, sorted by date. But the three tables contain different columns.
From your result set, you can get ResultSetMetaData. It looks like this:
rs.getMetaData().getTableName(int Column);
"I want to add them to a table by separating the tables."
Not sure what you mean by that, but:
"I just want to know how can I identify the table name in a Resultset?"
the answer is no, not unless you rewrite the query so that it does it for you.
A SELECT statement yields a new (virtual) table - any columns it delivers are technically columns of that new virtual table. The result does not remember the origin of those columns.
However, you can write the query so as to give every expression in the SELECT list a column alias that allows you to identify the origin. For instance, you could do:
SELECT table1.column1 AS table1_column1
, table1.column2 AS table1_column2
, ...
, table2.column1 AS table2_column1
, ...
FROM table1 INNER JOIN table2 ON ... etc
If the underscore is not suitable as a separator for your purpose, then you could consider quoting the aliases and picking whatever character you like as separator, including a dot:
SELECT table1.column1 AS `table1.column1`
, ....
etc.
UPDATE:
I just noticed your query is a UNION over several tables. Obviously this method won't work in that case. I think your best bet is still to rewrite the query so that it reads:
SELECT 'paysheet' as table_name, ...other columns...
FROM paysheet
UNION
SELECT 'creditsheet', ...other columns...
...
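On the Java side, the extra table_name column can then be read like any other column. The following is a sketch only: the class name CommissionRows is made up, the column list is shortened to three columns for brevity, and the staff name is bound as a parameter instead of being concatenated into the query.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical loader: groups the UNION rows by the table they came from.
final class CommissionRows {
    // Column names of the result come from the first SELECT of the UNION.
    static final String SQL =
          "SELECT 'paysheet' AS table_name, sqNum, RegDate, Amount "
        + "FROM paysheet WHERE TypeBy = ? AND CommissionStatus = 'no' "
        + "UNION SELECT 'creditsheet', sqNum, RegDate, Amount "
        + "FROM creditsheet WHERE TypeBy = ? AND CommissionStatus = 'no' "
        + "UNION SELECT 'newjobsheet', sqNumber, date, CommissionAmount "
        + "FROM newjobsheet WHERE TypeBy = ? AND CommissionStatus = 'no' "
        + "ORDER BY RegDate";

    static Map<String, List<String[]>> load(Connection c, String staff) throws SQLException {
        Map<String, List<String[]>> byTable = new LinkedHashMap<>();
        try (PreparedStatement ps = c.prepareStatement(SQL)) {
            for (int i = 1; i <= 3; i++) {
                ps.setString(i, staff); // same staff name in each branch
            }
            try (ResultSet r = ps.executeQuery()) {
                while (r.next()) {
                    String[] row = {
                        r.getString("sqNum"), r.getString("RegDate"), r.getString("Amount")
                    };
                    byTable.computeIfAbsent(r.getString("table_name"), k -> new ArrayList<>())
                           .add(row);
                }
            }
        }
        return byTable;
    }
}
```

Each map entry holds the rows of one source table, still in RegDate order, so they can be added to the JTable together or separately.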
It seems to me like you want to SELECT data from three separate tables and have it returned by one query as one ResultSet, and then you want to separate the results back out into individual tables.
This is completely counterintuitive.
If you want to have your results separated so that you are able to look at each table's results separately, your best bet is to perform 3 different SELECT queries. Basically, what you are asking is to combine 3 things and then separate them again in one pass. Don't do this; if you leave them uncombined in the first place, separating them back out won't be an issue for you.