Apache Calcite to Find Selected Columns in an SQL String - java

I have a use case where I want to know which columns have been selected in an SQL string. For instance, if the SQL is like this:
SELECT name, age*5 as intelligence FROM bla WHERE bla=bla
Then, after parsing the above String, I just want the output to be: name, intelligence.
Firstly, is it possible through Calcite?
Any other option is also welcome.
PS: I want to know this before actually running the query on database.

This is definitely doable with Calcite. You'll want to start by creating an instance of SqlParser and parsing the query:
SqlParser parser = SqlParser.create(query);
SqlNode parsed = parser.parseQuery();
From there, you'll probably have the most success implementing the SqlVisitor interface. You'll want to first find a SqlSelect instance and then visit each expression being selected by passing your visitor to accept on each element of getSelectList.
From there, how you proceed will depend on the complexity of expressions you want to support. However, it's probably sufficient to recursively visit all SqlCall nodes and their operands and then collect any SqlIdentifier values that you see.

It can be as simple as:
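SqlParser parser = SqlParser.create(yourQuery);
// parseQuery() throws SqlParseException if the SQL is invalid
SqlSelect selectNode = (SqlSelect) parser.parseQuery();
SqlNodeList list = selectNode.getSelectList();
for (int i = 0; i < list.size(); i++) {
    // Prints each select item in full, so an aliased expression includes its AS clause
    System.out.println("Column " + (i + 1) + ": " + list.get(i).toString());
}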

Getting WHERE clause after Update statement

I am required to fetch the where clause after an update statement e.g.
UPDATE user_accounts SET bio='This is my bio' WHERE user_id = 1 OR name = 'Alex';
At the moment, I am able to get everything after the where clause with the following code:
String query = "UPDATE user_accounts SET bio='This is my bio' WHERE user_id = 1 OR name = 'Alex'";
int index = query.toUpperCase().indexOf("WHERE");
if (index != -1) {
    System.out.println(query.substring(index));
}
But then I discovered that this was significantly flawed, since these sample queries would fail:
UPDATE user_accounts SET bio='This is where my bio is' WHERE user_id = 1 OR name = 'Alex';
UPDATE user_accounts SET whereColumn='' WHERE user_id = 1 OR name = 'Alex'
UPDATE user_whereabouts SET columnName='' WHERE user_id = 1 OR name = 'Alex'
Essentially, this fails if table name or any column name or column value under SET contains the word 'where' (case insensitive).
My thinking has currently been along the lines of a regex that does the following:
Checks if the word where is in between ' ' or " " (e.g. bio = "This is where my bio is") and skips it to move to the one which isn't inside the quotes. This will help eliminate the where words found in the SET values. Of course the Java quotes surrounding a string do not apply since they aren't part of the string itself.
Checks that the word where is sandwiched between spaces (e.g. ... WHERE ...). This will help eliminate the where words found in either table name or column name (SQL syntax itself can't allow table name or column name to solely be a reserved word).
Finally, returns the index of the wanted WHERE in order to get the substring (The objective of the question).
I am not very conversant with regexes and thus, I am in need of assistance. Please note that any other ways of achieving the objective will be highly appreciated as well.
You need some kind of SQL parser, because regexes will fall short in many cases (for instance, what if the WHERE clause has a sub-select?).
I've never tried it, but JSqlParser seems like a good solution.
Using JSqlParser this would look like:
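// Imports needed: java.util.Arrays, java.util.List,
// net.sf.jsqlparser.parser.CCJSqlParserUtil,
// net.sf.jsqlparser.statement.Statement,
// net.sf.jsqlparser.statement.update.Update
List<String> sqls = Arrays.asList(
    "UPDATE user_accounts SET bio='This is my bio' WHERE user_id = 1 OR name = 'Alex';",
    "UPDATE user_accounts SET whereColumn='' WHERE user_id = 1 OR name = 'Alex'",
    "UPDATE user_whereabouts SET columnName='' WHERE user_id = 1 OR name = 'Alex'");
for (String sql : sqls) {
    // parse() throws JSQLParserException if the statement cannot be parsed
    Statement stmt = CCJSqlParserUtil.parse(sql);
    Update update = (Update) stmt;
    System.out.println("sql=" + stmt.toString());
    System.out.println(" where=" + update.getWhere());
}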
However, you could improve your WHERE search by using a regular expression with word boundaries, e.g.:
\bwhere\b
But again, you are right: this version is flawed as well, e.g. for set col = ' test where test'.
The right way to do it is to parse the whole SQL with a parser (https://github.com/JSQLParser/JSqlParser).
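If pulling in a parser is not an option, the quote-skipping idea from the question can be sketched without regexes. This is only a rough sketch under stated assumptions: the class and method names (WhereFinder, indexOfWhere) are made up for illustration, and it does not handle SQL comments, sub-selects inside the SET clause, or dialect-specific quoting, so a real parser remains the safer choice.

```java
final class WhereFinder {
    /**
     * Returns the index of the first WHERE keyword that is outside quotes
     * and stands alone as a word, or -1 if there is none.
     */
    static int indexOfWhere(String sql) {
        char quote = 0; // 0 means "not inside a quoted region"
        for (int i = 0; i < sql.length(); i++) {
            char c = sql.charAt(i);
            if (quote != 0) {
                // A doubled quote ('' or "") closes and immediately reopens
                // the region, so plain toggling also covers escaped quotes.
                if (c == quote) quote = 0;
            } else if (c == '\'' || c == '"') {
                quote = c;
            } else if (sql.regionMatches(true, i, "WHERE", 0, 5)
                    && !isWordChar(sql, i - 1)
                    && !isWordChar(sql, i + 5)) {
                return i;
            }
        }
        return -1;
    }

    // True if position i exists and holds a letter, digit, or underscore.
    private static boolean isWordChar(String s, int i) {
        if (i < 0 || i >= s.length()) return false;
        char c = s.charAt(i);
        return Character.isLetterOrDigit(c) || c == '_';
    }
}
```

Calling query.substring(WhereFinder.indexOfWhere(query)) after checking for -1 then yields the clause for the three problem queries from the question.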

Building PreparedStatement in Java With Variable Number of Columns for Inserting Data into Database [duplicate]

What is a good design pattern to achieve this without endless code?
Given the scenario whereby the user may input 1...100 columns, maybe 23 one time, 32 on another insert, and 99 fields on another insert etc. All of which may be different fields each time too.
The PreparedStatement in Java needs to know what column names to enter first, how many ?'s to put into the values part of the INSERT query, the data types of the database field names to ensure the correct setInt and setString etc are entered.
For less than around 10 columns, you can kind of get around this challenge with the following logic;
1) If variableEnteredForFieldName is not null, then append to the relevant parts of the query in the form of a String builder type setup;
fieldName_1
?
2) Do the same for all entered field names
3) Strip out the final trailing , that will naturally be present in both the field names and the ?s
4) Create the PreparedStatement
5) Run through the same input parameters again to determine whether each variableEnteredForFieldName is not null; if it is not null, run a setInt or setString based on the known data type that the database requires, and set this at the correct index number for the ?s.
As long as the query builder logic and the query filler logic have the names/values in the correct order in part 1 and part 2, then all works well. It does however mean duplicating the entire code that relates to this logic, one for generating the SQL to use when creating the PreparedStatement and another for filling the PreparedStatement.
This is manageable for a small number of input parameters, but this soon gets unmanageable for larger number of input parameters.
Is there a better design pattern to achieve the same logic?
The code below is an outline of all of the above for reference;
String fieldName1 = request.getParameter("fieldName1");
String fieldName2 = request.getParameter("fieldName2");
//Build Query
String fieldNames = "";
String fieldQuestionMarks = "";
if (fieldName1 != null) {
    fieldNames = fieldNames + " FIELD_NAME_1 ,";
    fieldQuestionMarks = fieldQuestionMarks + " ? ,";
}
if (fieldName2 != null) {
    fieldNames = fieldNames + " FIELD_NAME_2 ,";
    fieldQuestionMarks = fieldQuestionMarks + " ? ,";
}
//Trim the leading space and the trailing ,
fieldNames = fieldNames.substring(1, fieldNames.length() - 1);
fieldQuestionMarks = fieldQuestionMarks.substring(1, fieldQuestionMarks.length() - 1);
try {
    String completeCreateQuery = "INSERT INTO TABLE_NAME ( " + fieldNames + " ) VALUES ( " + fieldQuestionMarks + " );";
    Connection con = DriverManager.getConnection(connectionURL, user, password);
    PreparedStatement preparedStatement = con.prepareStatement(completeCreateQuery);
    int parameterIndex = 1;
    //Fill Query
    if (fieldName1 != null) {
        preparedStatement.setString(parameterIndex, fieldName1);
        parameterIndex++;
    }
    if (fieldName2 != null) {
        preparedStatement.setInt(parameterIndex, Integer.parseInt(fieldName2));
        parameterIndex++;
    }
    //Run the insert
    preparedStatement.executeUpdate();
} catch (SQLException e) {
    e.printStackTrace();
}
As you can see, it's do-able. But even with just 2 optional fields, this code is huge.
The way I see it, if the user is able to omit any of the columns from the list, then all columns are optional and can safely be set to NULL during an insert. Therefore, all you need is one prepared statement with the "monster" INSERT, with all columns listed; then, during the actual insert operation, you loop through the user-provided data, setting values for the columns provided and calling setNull() for omitted columns. You'll need to maintain a structure somewhere (most likely in your DAO class) mapping column names to their order in the SQL statement.
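A minimal sketch of that idea, assuming the two columns from the question and a hypothetical helper class named FixedInsert. The ordered map plays the role of the structure that maps column names to their position, and the input map stands in for the request parameters; none of these names come from the original code.

```java
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Types;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

final class FixedInsert {
    // Hypothetical column list: column name -> java.sql.Types code, in SQL order.
    static final Map<String, Integer> COLUMNS = new LinkedHashMap<>();
    static {
        COLUMNS.put("FIELD_NAME_1", Types.VARCHAR);
        COLUMNS.put("FIELD_NAME_2", Types.INTEGER);
    }

    /** Builds the single INSERT that lists every known column once. */
    static String buildSql(String table) {
        String names = String.join(", ", COLUMNS.keySet());
        String marks = String.join(", ", Collections.nCopies(COLUMNS.size(), "?"));
        return "INSERT INTO " + table + " (" + names + ") VALUES (" + marks + ")";
    }

    /** Binds supplied values; columns missing from the input are set to NULL. */
    static void bind(PreparedStatement ps, Map<String, String> input) throws SQLException {
        int index = 1;
        for (Map.Entry<String, Integer> column : COLUMNS.entrySet()) {
            String raw = input.get(column.getKey());
            int sqlType = column.getValue();
            if (raw == null) {
                ps.setNull(index, sqlType);
            } else if (sqlType == Types.INTEGER) {
                ps.setInt(index, Integer.parseInt(raw));
            } else {
                ps.setString(index, raw);
            }
            index++;
        }
    }
}
```

Adding a column then means adding one entry to the map, and the build and fill steps can no longer drift out of order because both iterate over the same structure.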

Efficient way to compare input Data with Sql table in Java

First of all I will explain my use case:
I will get a String array of names from the user (it can be of any size, e.g. 2, 5 or 1).
For example, suppose the user input is like this:
String[] names={"Micheal", "Joe","Jim"}
Now after taking input from user, I have to hit SQL table called "USERS" and check whether all of these names are present in USERS table or not. If any single name is not present then return false. If all names are present in USERS table then return true.
My Idea:
My idea is to hit USERS table. Get all names of USERS table in a String array (named as all_names) and then compare my input string(i.e names) with this all_names String. So if names is subset of all_names then return true else return false.
Problem:
But I think this is not an efficient solution. When this table grows, it will hold thousands of records, so this technique will become very expensive. Is there a better and more efficient solution for this?
Updated Solution:
Suppose names in USERS table are unique.
Thanks for your replies. I have now adopted the following approach after getting help from your answers. I want to know whether this solution is a better approach or not:
String[] names={"Micheal","Jim","Joe"};
// Wrap each name in single quotes so the IN list is valid SQL
String list2string = "'" + StringUtils.join(names, "', '") + "'";
//connection was established previously
stmt = conn.createStatement();
System.out.println(list2string);
rs = stmt.executeQuery("SELECT COUNT(*) AS rowcount FROM USERS WHERE name IN (" +
list2string +
")");
rs.next();
int count = rs.getInt("rowcount");
rs.close();
if(names.length==count){
System.out.println("All names are in users table");
}else{
System.out.println("All names are not present in users table");
}
Want your comments on this updated solution please.
Regards
You are right, this is not really efficient.
It is the database's job to do such things.
You can either make a select statement for each name, eg.
SELECT name FROM users WHERE name = 'Micheal'
or
SELECT name FROM users WHERE name IN ('Micheal', 'Joe', 'Jim')
and check the returned rows.
It might be quite different depending on which framework you use to query the database.
You can build a string from the string array using a loop.
For example, if you have a string array like this:
String[] names={"Micheal", "Joe","Jim"}
build a string, say s, containing 'Micheal', 'Joe', 'Jim'
Now query like this:
String sql = "SELECT name FROM users WHERE name IN (" + s + ")";
Get the output collection and compare it with the given collection.
One way to do it could be:
SELECT
COUNT(DISTINCT name)
FROM
users
WHERE
name IN ('Micheal', 'Joe', 'Jim')
Then check if the count is equal to your parameter count, in our case, we should get 3.
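The count comparison can also be written with bound parameters instead of string concatenation. The following is a sketch, assuming the users table and name column from the question; the class and method names (NameCheck, buildCountSql, allPresent) are invented for illustration. Duplicate input names are collapsed first, since COUNT(DISTINCT name) can never exceed the number of distinct names asked for.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

final class NameCheck {
    /** Builds the count query with one ? per name to look up. */
    static String buildCountSql(int nameCount) {
        if (nameCount < 1) {
            throw new IllegalArgumentException("need at least one name");
        }
        String marks = String.join(", ", Collections.nCopies(nameCount, "?"));
        return "SELECT COUNT(DISTINCT name) FROM users WHERE name IN (" + marks + ")";
    }

    /** Returns true when every distinct input name exists in the users table. */
    static boolean allPresent(Connection conn, String[] names) throws SQLException {
        Set<String> unique = new LinkedHashSet<>(Arrays.asList(names));
        if (unique.isEmpty()) {
            return true; // nothing to look for
        }
        try (PreparedStatement ps = conn.prepareStatement(buildCountSql(unique.size()))) {
            int i = 1;
            for (String name : unique) {
                ps.setString(i++, name);
            }
            try (ResultSet rs = ps.executeQuery()) {
                rs.next();
                return rs.getInt(1) == unique.size();
            }
        }
    }
}
```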
I will get a String Array of names from user(Can of size 2,5,1)
You get the input from user, you hit the database with query:
SELECT (WHATEVER_YOU_NEED) FROM SCHEMA_NAME.TABLE_NAME WHERE COLUMN IN
(USER_PROVIDED_INPUT);
You store this result in List.
Get all names of USERS table in a String array (named as all_names)
and then compare my input string(i.e names) with this all_names
String. So if names is subset of all_names then return true else
return false.
Yes, you are right, so you can use Collection.containsAll():
boolean isSubset = listA.containsAll(listB);
And if the names in your database are unique (I suspect they may contain duplicates), you can simply get the count from the SQL query and compare it with the number of names the user entered.
I hope this will help.
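As a small illustration of the containsAll() check, here is a sketch in which found stands for the names returned by the query and wanted for the user's input. Both parameter names and the SubsetCheck class are assumptions made for the example. Copying the query result into a HashSet first keeps each lookup cheap.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class SubsetCheck {
    /** Returns true when every wanted name appears among the found names. */
    static boolean isSubset(List<String> found, String[] wanted) {
        Set<String> foundSet = new HashSet<>(found);
        return foundSet.containsAll(Arrays.asList(wanted));
    }
}
```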
SELECT IF(
( SELECT COUNT(DISTINCT name) FROM users WHERE name IN ({toSearch}) ) = {count},
1, 0
) AS Result
Replace {toSearch} with e.g. 'Micheal', 'Joe', 'Jim'.
{count} is the number of names searched for, 3 in this example. If all of them exist, the column "Result" has the value 1, otherwise 0.

Why does my Play Framework (1.2.4) count query fail?

I have a simple model involving title and description. It extends play.db.jpa.Model
The following search method works perfectly
public static SearchResults search(String search, Integer page) {
    String likeSearch = "%" + search + "%";
    long count = find("title like ? OR description like ? order by " +
        "title ASC", likeSearch, likeSearch).fetch().size();
    List<Must> items = find("title like ? OR description like ? order by " +
        "title ASC", likeSearch, likeSearch).fetch(page, 20);
    return new SearchResults(items, count);
}
However when I tweak count as follows
long count = count("title like ? OR description like ? order by " +
"title ASC", likeSearch, likeSearch);
I get
PersistenceException occured :
org.hibernate.exception.SQLGrammarException: could not execute query
ERROR ~ ERROR: column "must0_.title" must appear in the GROUP BY
clause or be used in an aggregate function
Why is the error asking me to use an aggregate function when the query has not changed at all?
This is because in the first query, all the records are returned and then counted in the result list.
In your second query the count is done in the database so your sql must be formed correctly.
I think the order by is causing the error you described; try removing it. You are trying to order on a column that is not part of the result (count returns a number, not columns).
You can set the jpa.debugSQL=true in your application.conf if you need to see the sql generated.

How to replace single quote in Java with Postgres?

How to replace single quote in Java with Postgres?
select * from where id in ('<45646300.KDSFJJSKJSDF95'fdgdfgdfgd>', 'fdgdfgdg');
I always use params like
select * from where id = ?;
But in this case I have a problem, because I have an 'in' clause with strings passed to it.
I wish to replace all dangerous characters.
It would be better to continue using PreparedStatements rather than to escape characters manually.
In the case of IN clause you can generate a query with appropriate number of ?s dynamically.
String[] input = ...;
StringBuilder b = new StringBuilder();
b.append("select * from where id in (");
b.append("?"); // Assume that input contains at least one element
for (int i = 1; i < input.length; i++) {
    b.append(", ?");
}
b.append(")");
PreparedStatement s = c.prepareStatement(b.toString());
for (int i = 0; i < input.length; i++) {
    s.setString(i + 1, input[i]);
}
The Apache Commons API provides multiple ways to escape dangerous characters for specific languages such as CSS, JavaScript, SQL, etc.
Take a look at this if it helps : http://commons.apache.org/lang/api-2.4/org/apache/commons/lang/StringEscapeUtils.html
Use the standard SQL quoting for single quotes:
select *
from the_table
where id in ('<45646300.KDSFJJSKJSDF95''fdgdfgdfgd>', 'fdgdfgdg');
So any embedded single quote needs to be written twice.
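If a literal really has to be built by hand, the doubling rule can be sketched as a one-line helper. The SqlLiteral class and quote method are hypothetical names, and the sketch assumes PostgreSQL's standard_conforming_strings setting is on (the default since 9.1), so that backslashes in the value need no extra treatment. Bound parameters remain the safer option.

```java
final class SqlLiteral {
    /** Wraps a value in single quotes, doubling any embedded single quote. */
    static String quote(String value) {
        return "'" + value.replace("'", "''") + "'";
    }
}
```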
Did you try?
select * from TABLE_NAME where id in (?);
There's a page explaining the different options over here on the javaranch site
