Parsing mysql using ANTLR4 simple example - java

I am using the MySQL grammar from https://github.com/antlr/grammars-v4/tree/master/mysql and have generated the Java files using Maven. Now I am trying to parse a query, but I don't understand how to do so.
I basically want to 'get' all the different components of a query: the list of selected columns, where conditions, subqueries, table names, etc. But I have no idea how to proceed. I have written the code below so far. Can someone please suggest a simple example so that I can understand the usage and take on more complex tasks? Here is my code:
public static void main( String[] args )
{
    String sql="select cust_name from database..table where cust_name like 'Kash%'";
    ANTLRInputStream input = new ANTLRInputStream(sql);
    MySqlLexer mySqlLexer = new MySqlLexer(input);
    CommonTokenStream tokens = new CommonTokenStream(mySqlLexer);
    MySqlParser mySqlParser = new MySqlParser(tokens);
    ParseTree tree = mySqlParser.dmlStatement();
    ParseTreeWalker walker = new ParseTreeWalker();
    MySqlParserBaseListener listener=new MySqlParserBaseListener();
    ParseTreeWalker.DEFAULT.walk(listener, tree);
    System.out.println(?);
}
Using the above code, I am getting the following output:
line 1:11 no viable alternative at input '_'
(dmlStatement _ . . _ 'Kash%')
Thanks For Help :)

I basically want to 'get' all the different components of a query, like the list columns selected, where conditions, sub queries, table names, etc.
Your tree variable holds all that data: ParseTree tree = mySqlParser.dmlStatement();
line 1:11 no viable alternative at input '_'
If you look at the lexer rules:
SELECT: 'SELECT';
ID: ID_LITERAL;
fragment ID_LITERAL: [A-Z_$0-9]*?[A-Z_$]+?[A-Z_$0-9]*;
it appears that keywords and identifiers cannot contain lowercase letters.
If you run it like this:
String sql = "SELECT CUST_NAME FROM CUSTOMERS WHERE CUST_NAME LIKE 'Kash%'";
MySqlLexer lexer = new MySqlLexer(CharStreams.fromString(sql));
MySqlParser parser = new MySqlParser(new CommonTokenStream(lexer));
ParseTree root = parser.dmlStatement();
System.out.println(root.toStringTree(parser));
you will see the following output (indented for easier reading):
(dmlStatement
  (selectStatement
    (querySpecification SELECT
      (selectElements
        (selectElement
          (fullColumnName
            (uid
              (simpleId CUST_NAME)))))
      (fromClause FROM
        (tableSources
          (tableSource
            (tableSourceItem
              (tableName
                (fullId
                  (uid
                    (simpleId CUSTOMERS))))))) WHERE
        (expression
          (predicate
            (predicate
              (expressionAtom
                (fullColumnName
                  (uid
                    (simpleId CUST_NAME))))) LIKE
            (predicate
              (expressionAtom
                (constant
                  (stringLiteral 'Kash%'))))))))))
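Because the lexer rules quoted above only match uppercase keywords and identifiers, one possible workaround is to upper-case the query text before it reaches the lexer while leaving quoted literals such as 'Kash%' untouched. The sketch below is plain Java with no ANTLR dependency. The class and method names are hypothetical, and it assumes that upper-casing unquoted identifiers is acceptable for your schema (it would not be where table names are case-sensitive).

```java
final class SqlCaseNormalizer {
    private SqlCaseNormalizer() {}

    /** Upper-cases everything outside '...', "..." and `...` spans. */
    static String upperCaseOutsideQuotes(String sql) {
        StringBuilder out = new StringBuilder(sql.length());
        char quote = 0; // 0 means "not inside a quoted span"
        for (int i = 0; i < sql.length(); i++) {
            char c = sql.charAt(i);
            if (quote == 0) {
                if (c == '\'' || c == '"' || c == '`') {
                    quote = c; // remember which character closes the span
                }
                out.append(Character.toUpperCase(c));
            } else {
                out.append(c); // inside quotes: copy unchanged
                if (c == '\\' && quote != '`' && i + 1 < sql.length()) {
                    out.append(sql.charAt(++i)); // keep a backslash-escaped character as-is
                } else if (c == quote) {
                    quote = 0; // a doubled quote ('') simply closes and reopens the span
                }
            }
        }
        return out.toString();
    }
}
```

The result can then be handed to CharStreams.fromString(...) in place of the raw query string.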

How can I translate a TupleExpr or a ParsedTupleQuery into the Query String?

I want to parse a query using rdf4j's SPARQLParser, modify the underlying query tree (=TupleExpr) and translate it back into a query string. Is there a way to do that with rdf4j?
I tried the following but it didn't work
SPARQLParser parser = new SPARQLParser();
ParsedQuery originalQuery = parser.parseQuery(query, null);
if (originalQuery instanceof ParsedTupleQuery) {
    TupleExpr queryTree = originalQuery.getTupleExpr();
    queryTree.visit(myQueryModelVisitor());
    originalQuery.setTupleExpr(queryTree);
    System.out.println(queryTree);
    ParsedQuery tsQuery = new ParsedTupleQuery(queryTree);
    System.out.println(tsQuery.getSourceString());
}
The output printed by getSourceString() is null.
You'll want to use the org.eclipse.rdf4j.queryrender.sparql.experimental.SparqlQueryRenderer which is specifically designed to transform a TupleExpr back into a SPARQL query string.
Roughly, like this:
SPARQLParser parser = new SPARQLParser();
ParsedQuery originalQuery = parser.parseQuery(query, null);
if (originalQuery instanceof ParsedTupleQuery) {
    TupleExpr queryTree = originalQuery.getTupleExpr();
    queryTree.visit(myQueryModelVisitor());
    originalQuery.setTupleExpr(queryTree);
    System.out.println(queryTree);
    ParsedQuery tsQuery = new ParsedTupleQuery(queryTree);
    String transformedQuery = new SparqlQueryRenderer().render(tsQuery);
}
Note that this component is still experimental, and does not have guaranteed complete coverage of all SPARQL 1.1 features.
As an aside, the reason getSourceString() does not work here is that the method is designed to return the input source string from which the parsed query was generated. Since in your case you've just created a new ParsedQuery object from scratch, there is no source string.

Text normalization in python using Normalizer.Form.NFKD

A field in the table is normalized using Java as shown below,
String name = customerRecord.getName().trim();
name = name.replaceAll("œ", "oe");
name = name.replaceAll("æ", "ae");
name = Normalizer.normalize(name, Normalizer.Form.NFKD).replaceAll("[^\\p{ASCII}]", "");
name = name.toLowerCase();
Now I'm trying to query the same db using Python. How do I do Normalizer.normalize(name, Normalizer.Form.NFKD) in Python so that it is compatible with the way the field was written?
An almost complete translation of the above Java code to Python would be as follows:
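import unicodedata

ASCII_REPLACEMENTS = {
    'œ': 'oe',
    'æ': 'ae',
}

def normalize_name(search_term):
    # Apply the manual replacements first, as the Java code does.
    text = ''.join([ASCII_REPLACEMENTS.get(c, c) for c in search_term])
    # NFKD-decompose, then drop every non-ASCII code point.
    ascii_term = (
        unicodedata.normalize('NFKD', text)
        .encode('ascii', errors='ignore')
        .decode()
    )
    return ascii_term.lower()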
ASCII_REPLACEMENTS should be extended with whatever characters unicodedata.normalize does not translate the same way as Java's Normalizer.normalize(name, Normalizer.Form.NFKD). This way we can ensure compatibility between the two.
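To compare the two sides, it may help to wrap the Java snippet from the question in a method and run the same sample names through both implementations. The sketch below does that; the class name is hypothetical, and the non-ASCII literals are written as \u escapes only so that the result does not depend on the source file encoding. Characters with no NFKD decomposition, such as ø, are simply dropped by the ASCII filter, which makes them candidates for the replacement table.

```java
import java.text.Normalizer;

final class NameNormalizer {
    private NameNormalizer() {}

    /** Mirrors the Java normalization shown in the question, step by step. */
    static String normalize(String raw) {
        String name = raw.trim();
        name = name.replaceAll("\u0153", "oe"); // œ
        name = name.replaceAll("\u00e6", "ae"); // æ
        // Decompose (e.g. É becomes E plus a combining accent), then strip non-ASCII.
        name = Normalizer.normalize(name, Normalizer.Form.NFKD)
                .replaceAll("[^\\p{ASCII}]", "");
        return name.toLowerCase();
    }
}
```

For example, NameNormalizer.normalize("Cœur") gives "coeur" and NameNormalizer.normalize("Søren") gives "sren"; the Python function should return the same strings for the same inputs.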

Find a mongodb Document that field value match with some item of input array

I have a string list with the dates of the days of a given week.
String daysweek[] = {"10/05/2020", "11/05/2020", "12/05/2020", "13/05/2020", "14/05/2020", "15/05/2020", "16/05/2020"};
My goal is to be able to find several documents that belong to a certain week. The comparison field is "firstday".
(The original post included an image of the document structure in the database here.) This is my current query code:
Document insert = new Document().append("$elemMatch", Arrays.asList(daysweek));
Document filterstar = new Document().append("id_motorista", idmotorista).append("pagamento", false).append("firstday", insert);
coll.find(filterstar).projection(new Document().append("_id", 1).append("origem",1).append("destino", 1).append("formadepagamento", 1).append("valordaviagem",1)
.append("notamotorista",1).append("pagamento",1).append("iniciodaviagem", 1).append("fimdaviagem",1).append("viagemcancelada", 1).append("horadaaceitacao",1)
.append("horacancelamentomotorista", 1).append("horacancelamentousuario", 1).append("taxadecancelamento", 1).append("valordaviagemmotorista", 1).append("valordaviagemusuario", 1).append("id_acompanhamento",1)
.append("taxaaplicativo", 1).append("taxacartao", 1).append("taxamotorista", 1)).sort(new Document().append("firstday", 1)).limit(100)
.into(docs).addOnSuccessListener(new OnSuccessListener<List<Document>>() {
@Override
public void onSuccess(List<Document> documents) {}
But the search finds no documents. The number of documents expected would be 35.
I would like to know if there is a way to find documents where a given field matches any of the items in an array list.
$elemMatch is used when you're querying against an array field. In your scenario you're querying a string field and the input is an array, so you can just use the $in operator.
Mongo Shell Syntax :
db.collection.find({firstday : {$in : ["10/05/2020", "11/05/2020", "12/05/2020", "13/05/2020", "14/05/2020", "15/05/2020", "16/05/2020"]}})
Test : mongoplayground
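Since $in only matches a stored string that equals one of the listed values, every entry has to use the same dd/MM/yyyy format as the firstday field. A minimal sketch of building the seven strings from the first day of the week, assuming java.time is available on the target platform (on Android that means API 26+ or core library desugaring) and using a hypothetical helper name:

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

final class WeekDays {
    // Same day/month/year layout as the strings in the question.
    private static final DateTimeFormatter FORMAT = DateTimeFormatter.ofPattern("dd/MM/yyyy");

    private WeekDays() {}

    /** Returns the given day and the six days after it, formatted as dd/MM/yyyy. */
    static List<String> daysOfWeekStarting(LocalDate firstDay) {
        List<String> days = new ArrayList<>(7);
        for (int i = 0; i < 7; i++) {
            days.add(firstDay.plusDays(i).format(FORMAT));
        }
        return days;
    }
}
```

The returned list can be used directly as the value of $in, for example new Document().append("$in", WeekDays.daysOfWeekStarting(LocalDate.of(2020, 5, 10))).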
The advice of @whoami works for me :D
So I changed part of the code.
I changed this:
Document insert = new Document().append("$elemMatch", Arrays.asList(daysweek));
to this:
Document insert = new Document().append("$in", Arrays.asList(daysweek));
FINAL CODE:
Document insert = new Document().append("$in", Arrays.asList(daysweek));
Document filterstar = new Document().append("id_motorista", idmotorista).append("pagamento", false).append("firstday", insert);
coll.find(filterstar).projection(new Document().append("_id", 1).append("origem",1).append("destino", 1).append("formadepagamento", 1).append("valordaviagem",1)
.append("notamotorista",1).append("pagamento",1).append("iniciodaviagem", 1).append("fimdaviagem",1).append("viagemcancelada", 1).append("horadaaceitacao",1)
.append("horacancelamentomotorista", 1).append("horacancelamentousuario", 1).append("taxadecancelamento", 1).append("valordaviagemmotorista", 1).append("valordaviagemusuario", 1).append("id_acompanhamento",1)
.append("taxaaplicativo", 1).append("taxacartao", 1).append("taxamotorista", 1)).sort(new Document().append("firstday", 1)).limit(100)
.into(docs).addOnSuccessListener(new OnSuccessListener<List<Document>>() {
@Override
public void onSuccess(List<Document> documents) {}

using MultiMatchQueryBuilder for 'and' keyword query search

I have a bunch of documents stored in Elasticsearch with the fields title and abstract. I have to search documents for queries like 'word1 word2 ..'. Currently I am using Spring Data:
MultiMatchQueryBuilder multiMatchQueryBuilder = new MultiMatchQueryBuilder(query, "abstract", "title");
Iterable<Document> result = documentRepository.search(multiMatchQueryBuilder);
This gives me all the documents that contain word1 or word2. How can I match all the keywords? It should return only the documents that contain every word in the query, i.e. word1 and word2. Basically I want an AND, not an OR, of all the keywords in the search query.
You can specify the AND operator like this:
MultiMatchQueryBuilder multiMatchQueryBuilder = new MultiMatchQueryBuilder(query, "abstract", "title")
.operator(Operator.AND); // <---- add this
Iterable<Document> result = documentRepository.search(multiMatchQueryBuilder);

How to "split" in J2ME a String data containing new line characters?

I want to call a PHP webservice from my J2ME program. Here is the PHP function called :
...
$req="SELECT DISTINCT a.adc_id FROM adc a INNER JOIN utilisateur u ON a.adc_id=u.adc_id INNER JOIN transfert t ON u.user_code = t.user_code
WHERE t.user_code ='". $user_code ."' AND t.date_transfert='".$datejour."'";
$query=mysql_query($req) ;
while($ligne = mysql_fetch_array($query))
{
$chaine .=$ligne['adc_id'].';';
$chaine .= "\r\n" ;
}
return $chaine;
As you can see, the web service returns the "\r\n" newline sequence among the column data. For example, the returned String is:
12011;Michael;12/12/2012;
13455;Sue;14/05/2011;
So how can I "split" this String in J2ME so that I get a String[] array containing the values:
12011;Michael;12/12/2012; and 13455;Sue;14/05/2011; ?
You will either have to write your own tokenizer or use one of the many available on the net.
One example could be this one from nokia's page
Usage example:
Tokenizer t = new Tokenizer(yourString, "\r\n");
while (t.hasMoreTokens()) {
String token = t.nextToken();
//do something with token
}
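If pulling in a third-party class is not an option, a hand-written splitter needs only indexOf, substring and Vector. The sketch below is one way to write it; the class name is hypothetical, and the raw (non-generic) Vector is deliberate because CLDC predates generics. It treats the separator as one whole sequence and skips empty pieces, so the trailing "\r\n" does not produce an empty last element.

```java
import java.util.Vector;

final class LineSplitter {
    private LineSplitter() {}

    /** Splits text on every occurrence of separator, dropping empty pieces. */
    static String[] split(String text, String separator) {
        if (separator.length() == 0) {
            throw new IllegalArgumentException("separator must not be empty");
        }
        Vector parts = new Vector();
        int start = 0;
        int idx;
        while ((idx = text.indexOf(separator, start)) >= 0) {
            if (idx > start) {
                parts.addElement(text.substring(start, idx));
            }
            start = idx + separator.length(); // continue after the separator
        }
        if (start < text.length()) {
            parts.addElement(text.substring(start)); // text after the last separator
        }
        String[] result = new String[parts.size()];
        parts.copyInto(result);
        return result;
    }
}
```

Calling LineSplitter.split(chaine, "\r\n") on the response from the question yields a two-element array holding 12011;Michael;12/12/2012; and 13455;Sue;14/05/2011;.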
