Does the order of function calls matter in Firestore query? - java

I'm trying to implement pagination as explained in a web tutorial, but I don't understand in which order the query methods should be called for maximum speed. This is my code:
Query query = db.orderBy("name", Query.Direction.ASCENDING)
        .startAt("John").endAt("John" + "\uf8ff")
        .startAfter(lastVisible)
        .limit(10); // Called last
Or:
Query query = db.orderBy("name", Query.Direction.ASCENDING)
        .limit(10) // Called first
        .startAt("John").endAt("John" + "\uf8ff")
        .startAfter(lastVisible);
Or:
Query query = db.orderBy("name", Query.Direction.ASCENDING)
        .limit(10) // Called first
        .startAfter(lastVisible)
        .startAt("John").endAt("John" + "\uf8ff"); // Called last
All three compile without errors. Which one is correct for fast pagination?

From the perspective of performance, the order in which you build the Query object doesn't matter. The end result is a Query with the same internal configuration.
However, startAt is not compatible with startAfter. Only one or the other will take effect, likely the one that appears last in the builder chain.


Handling dynamic queries with Spring

The problem I'm trying to solve here is, filtering the table using dynamic queries supplied by the user.
Entities needed to describe the problem:
Table: run_events
Columns: user_id, distance, time, speed, date, temperature, latitude, longitude
The problem statement is to get the run_events for a user, based on a filterQuery.
The query has the following format:
((date = '2018-06-01') AND ((distance < 20) OR (distance > 10)))
And this query can combine multiple fields and multiple AND/OR operations.
One approach to solving this is using hibernate and concatenating the filterQuery with your query.
"select * from run_events where user_id=:userId and "+filterQuery;
This requires you to write the entire implementation yourself and use sessions, i.e.
String q = "select * from run_events where user_id=:userId and " + filterQuery;
Query query = getSession().createQuery(q);
query.setParameter("userId", userId);
List<Object[]> result = query.list();
List<RunEvent> runEvents = new ArrayList<>();
for (Object[] obj : result) {
    RunEvent datum = new RunEvent();
    int index = -1;
    datum.setId((long) obj[++index]);
    datum.setDate((Timestamp) obj[++index]);
    datum.setDistance((Long) obj[++index]);
    datum.setTime((Long) obj[++index]);
    datum.setSpeed((Double) obj[++index]);
    datum.setLatitude((Double) obj[++index]);
    datum.setLongitude((Double) obj[++index]);
    datum.setTemperature((Double) obj[++index]);
    runEvents.add(datum);
}
This just doesn't seem very elegant, and I want to use the @Query annotation to do this, i.e.
@Query(value = "select run_event from RunEvent where user_id = :userId and :query order by date asc")
List<RunEvent> getRunningData(@Param("userId") Long userId,
        @Param("query") String query);
But this doesn't work, because a query fragment cannot be supplied as a bind parameter that way.
Is there a better, elegant approach to getting this done using JPA?
Using Specifications and Predicates seems very complicated for this sort of a query.
To answer the plain question: This is not possible with @Query.
It is also, in at least 99% of cases, a bad design decision, because constructing SQL queries by concatenating strings provided by a user (or any source not under tight control) opens you up to SQL injection attacks.
Instead, you should encode the query in some kind of API (Criteria, Querydsl, Query By Example) and use that to create your query. There are plenty of questions and answers about this on SO, so I won't repeat them here. See, for example, "Dynamic spring data jpa repository query with arbitrary AND clauses".
If you insist on using a SQL or JPQL snippet as input, a custom implementation using String concatenation is the way to go.
This opens the door to SQL injection attacks. Maybe that's why this feature is not possible.
It is generally a bad idea to construct a query by appending arbitrary filters at the end and running it.
What if the queryString does something awkward like
Select * from Foo where ID=1234 or true;
thereby returning all the rows and putting a heavy load on the DB, possibly bringing down your whole application?
Solution: You could use multiple Criteria for filtering it dynamically in JPA, but you’ll need to parse the queryString yourself and add the necessary criteria.
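As a rough illustration of the idea above (accept structured filter conditions instead of a raw SQL snippet), here is a minimal sketch in plain Java. It is not the JPA Criteria API; it is a simplified stand-in that only shows the principle: column names and operators are checked against a whitelist, and values are only ever passed as bind parameters. The class name RunEventFilter, its methods, and the whitelists are hypothetical; only the table and column names come from the question. For brevity it supports only AND, so nested AND/OR groups would need a real parser or the Criteria API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical helper: builds a parameterized WHERE clause from structured conditions.
final class RunEventFilter {
    // Assumed whitelist, taken from the run_events columns listed in the question.
    private static final Set<String> COLUMNS = Set.of(
            "distance", "time", "speed", "date", "temperature", "latitude", "longitude");
    private static final Set<String> OPERATORS = Set.of("=", "<", ">", "<=", ">=");

    private final StringBuilder where = new StringBuilder("user_id = ?");
    private final List<Object> params = new ArrayList<>();

    RunEventFilter(long userId) {
        params.add(userId);
    }

    // Adds "AND <column> <operator> ?"; rejects anything that is not whitelisted.
    RunEventFilter and(String column, String operator, Object value) {
        if (!COLUMNS.contains(column)) {
            throw new IllegalArgumentException("Unknown column: " + column);
        }
        if (!OPERATORS.contains(operator)) {
            throw new IllegalArgumentException("Unsupported operator: " + operator);
        }
        where.append(" AND ").append(column).append(' ').append(operator).append(" ?");
        params.add(value);
        return this;
    }

    // SQL text containing only placeholders, never user-supplied values.
    String sql() {
        return "SELECT * FROM run_events WHERE " + where + " ORDER BY date ASC";
    }

    // Values to bind, in the same order as the placeholders in sql().
    List<Object> params() {
        return new ArrayList<>(params);
    }
}
```

The resulting SQL string and parameter list could then be handed to whatever executes the query (for example a prepared statement), binding the values in order.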
You can use kolobok and ignore fields with null values.
For example, create one method like the one below
findByUserIdAndDistanceaLessThanAndDistancebGreaterThan....(String userid,...)
and call that method with only the filter parameters you need, leaving the other parameters null.

Firestore order by and limit not working in kotlin

I have the following firestore setup:
- Root
  - Queue
    - item1
      - time : 20
    - item2
      - time : 1
    - 2000 more items, with a random time value
What I want is to show 40 items, with the smallest time first, so I do the following in Kotlin:
val ref = firestore.collection("Queue")
orderBy?.let{
    ref.orderBy(it)
}
limit?.let{
    ref.limit(it)
}
return ref.get().get().toObjects(Queue::class.java)
It completely ignores my orderBy and limit statements and returns all items in the Queue collection. What am I doing wrong?
The documentation here:
https://firebase.google.com/docs/reference/android/com/google/firebase/firestore/Query
says that the orderBy and limit methods return a new query object, so maybe you should try
val ref = firestore.collection("Queue").orderBy("time").limit(40)
As per the update to your question, you could create a function that returns the query you want based on whether or not the orderBy and limit query modifiers are present. You would have to make that query object a var in order to make it mutable.
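The effect can be sketched with a small immutable stand-in class. MiniQuery below is hypothetical and is not the Firestore API; it only mimics the documented behaviour that orderBy and limit return a new query object instead of modifying the one they are called on. The sketch is in Java, but the same reassignment pattern applies in Kotlin when the query is held in a var.

```java
// Hypothetical stand-in for an immutable query object; NOT the Firestore API.
final class MiniQuery {
    private final String orderBy; // null means "no ordering"
    private final int limit;      // -1 means "no limit"

    MiniQuery() {
        this(null, -1);
    }

    private MiniQuery(String orderBy, int limit) {
        this.orderBy = orderBy;
        this.limit = limit;
    }

    // Each modifier returns a NEW object; the receiver is left unchanged.
    MiniQuery orderBy(String field) {
        return new MiniQuery(field, limit);
    }

    MiniQuery limit(int n) {
        return new MiniQuery(orderBy, n);
    }

    String describe() {
        return "orderBy=" + orderBy + ", limit=" + limit;
    }

    // Applies the optional modifiers by reassigning the result each time.
    static MiniQuery build(String orderBy, Integer limit) {
        MiniQuery q = new MiniQuery();
        if (orderBy != null) {
            q = q.orderBy(orderBy); // keep the returned object
        }
        if (limit != null) {
            q = q.limit(limit);     // keep the returned object
        }
        return q;
    }
}
```

Calling base.orderBy("time") on its own and discarding the result leaves base unmodified, which mirrors the code in the question; build keeps each returned object, so both modifiers end up in the final query.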

java looping through multiple sql queries

I'm trying to loop through multiple sql queries that are executed. I want to first get all the question information for a certain task and then get the keywords for that question. I have three records in my Questions table, but when the while loop at the end of list.add(keyword); is done, it jumps to the SELECT Questions.Question loop (as it should) and then just jumps out and gives me only one record and not the other 2.
What am I doing wrong? Can someone maybe help me fix my code? I've thought of doing batch sql executes (maybe that is the solution), but within each while loop, I need information from the previous sql statement, so I can't just do it all at the end of the batch.
SQL Code:
String TaskTopic = eElement.getElementsByTagName("TaskTopic").item(0).getTextContent();
// perform query on database and retrieve results
String sql = "SELECT Tasks.TaskNo FROM Tasks WHERE Tasks.TaskTopic = '" + TaskTopic + "';";
System.out.println(" Performing query, sql = " + sql);
result = stmt.executeQuery(sql);
Document doc2 = x.createDoc();
Element feedback = doc2.createElement("Results");
while (result.next())
{
    String TaskNo = result.getString("TaskNo");
    // perform query on database and retrieve results
    String sqlquery = "SELECT Questions.Question, Questions.Answer, Questions.AverageRating, Questions.AverageRating\n" +
            "FROM Questions\n" +
            "INNER JOIN TaskQuestions ON TaskQuestions.QuestionID = Questions.QuestionID \n" +
            "INNER JOIN Tasks ON Tasks.TaskNo = '" + TaskNo + "';";
    result = stmt.executeQuery(sqlquery);
    while (result.next())
    {
        String Question = result.getString("Question");
        String Answer = result.getString("Answer");
        String AverageRating = result.getString("AverageRating");
        String sqlID = "SELECT QuestionID FROM Questions WHERE Question = '" + Question + "';";
        result = stmt.executeQuery(sqlID);
        while (result.next())
        {
            String ID = result.getString("QuestionID");
            String sqlKeywords = "SELECT Keyword FROM LinkedTo WHERE QuestionID = '" + ID + "';";
            result = stmt.executeQuery(sqlKeywords);
            while (result.next())
            {
                String keyword = result.getString("Keyword");
                list.add(keyword);
            }
        }
        feedback.appendChild(x.CreateQuestionKeyword(doc2, Question, Answer, AverageRating, list));
    }
}
Why this should be done in SQL
Looping in application code and issuing a separate query for every row is usually far less efficient than writing a single SQL query. SQL is built to pull back this type of data, and the database can plan out how it is going to get the data (this plan is called an execution plan).
Allowing Sql to do its job and determine the best way to pull back the data instead of explicitly determining what tables you are going to use first and then calling them one at a time is better in terms of the amount of resources you will use, how much time it will take to get the results, code readability, and maintainability in the future.
What information you are looking for
In the pseudocode you provided, you are using the Keyword, Question, Answer, and AverageRating values. Finding these values should be the focus of the SQL query. Based on the code you have written, Question, Answer, and AverageRating come from the Questions table and Keyword comes from the LinkedTo table, so both of these tables need to be available to pull data from.
You can note at this point that we have essentially just mapped out what the Select and From portions of your query should look like.
It also looks like you have a parameter called TaskTopic so we need to include the table Tasks to make sure the correct data is returned. Lastly, the TaskQuestions table is the link between the tasks and the questions. Now that we know what the query should look like, let's see what the results are using sql syntax.
The Code
You did not include the declaration of stmt, but I assume that it is a PreparedStatement. You can add parameters to a prepared statement. Notice the ? in the SQL code below? The parameters you provide will be substituted for the ?. To do this, you should use stmt.setString(1, TaskTopic);. Note that if there were more than one parameter, you would need to set them in the order in which they appear in the SQL query (using 1, 2, ...).
SELECT l.Keyword,
       q.Question,
       q.Answer,
       q.AverageRating
FROM LinkedTo l
INNER JOIN Questions q
        ON l.QuestionID = q.QuestionID
WHERE EXISTS (SELECT 1
              FROM TaskQuestions tq
              INNER JOIN Tasks t
                      ON tq.TaskNo = t.TaskNo
              WHERE t.TaskTopic = ?
                AND tq.QuestionID = q.QuestionID)
This is one way that you can write the query to return the same results. There are other ways to write this to get what you are looking for.
What's Going On?
There are a few things in this query you may not be familiar with. First are table aliases. Instead of writing the table name over and over again, you can alias your tables. I used the letter q to represent the Questions table. Any time you see q. you should recognize that I am referring to a column from Questions. The q after Questions is what gives the table its alias.
Exists: Instead of doing a bunch of inner joins with tables that you are not selecting information from, you can use an exists to check whether what you are looking for is in those tables. You can continue to do inner joins if you need data from the tables, but if you don't, Exists is often more efficient.
I suspect you had issues with the query before (and probably the one you provided) because you did not provide any information to join TaskQuestions and Tasks together. That most likely resulted in the duplicates. I joined on TaskNo but this may not be the correct column depending on how the tables are set up.
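As a minimal sketch of how the single query might be run from Java, assuming plain JDBC and the table and column names used in this answer: the class and method names (QuestionKeywordLoader, keywordsByQuestion, addKeyword) are made up for illustration, and the join column TaskNo carries the same caveat as above. One JDBC detail is relevant to the original code: executing a new query on a Statement implicitly closes that statement's current ResultSet, so reusing one stmt and one result variable for nested queries ends the outer loops early. A single query avoids that situation altogether.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; names are illustrative only.
class QuestionKeywordLoader {

    // The query from the answer; the single '?' is the TaskTopic parameter.
    static final String SQL =
            "SELECT l.Keyword, q.Question, q.Answer, q.AverageRating "
          + "FROM LinkedTo l INNER JOIN Questions q ON l.QuestionID = q.QuestionID "
          + "WHERE EXISTS (SELECT 1 FROM TaskQuestions tq "
          + "INNER JOIN Tasks t ON tq.TaskNo = t.TaskNo "
          + "WHERE t.TaskTopic = ? AND tq.QuestionID = q.QuestionID)";

    // Runs the query once and groups the returned keywords by question text.
    static Map<String, List<String>> keywordsByQuestion(Connection conn, String taskTopic)
            throws SQLException {
        Map<String, List<String>> byQuestion = new LinkedHashMap<>();
        try (PreparedStatement ps = conn.prepareStatement(SQL)) {
            ps.setString(1, taskTopic); // bound as a parameter, not concatenated
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    addKeyword(byQuestion, rs.getString("Question"), rs.getString("Keyword"));
                }
            }
        }
        return byQuestion;
    }

    // Appends a keyword to the list for its question, creating the list on first use.
    static void addKeyword(Map<String, List<String>> byQuestion, String question, String keyword) {
        byQuestion.computeIfAbsent(question, k -> new ArrayList<>()).add(keyword);
    }
}
```

Because the query uses an inner join on LinkedTo, questions that have no keywords would not appear in the result; whether that matters depends on the data.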

Determine which parameter failed in a Lucene BooleanQuery?

I need to determine which part of a Lucene BooleanQuery failed if the entire query returns no results.
I'm using a BooleanQuery made up of 4 NumericRangeQueries and a PhraseQuery. Each is added to the query with Occur.MUST.
If I don't get any results for a query, is there a way to tell which part of the query failed to match anything? Do I need to run queries individually and compare results to get the one that failed?
Edit - Added PhraseQuery code.
if( row.getPropertykey_tx() != null && !row.getPropertykey_tx().trim().isEmpty()){
PhraseQuery pQuery = new PhraseQuery();
String[] words = row.getPropertykey_tx().trim().split(" ");
for( String word : words ){
pQuery.add(new Term(TitleRecordColumns.SA_SITE_ADDR.toString(), word));
}
pQuery.setSlop(2);
topBQuery.add(pQuery, BooleanClause.Occur.MUST);
}
Running individual parts of the query is probably the simplest approach, to my mind.
Another tool available is getting an Explanation. You can call IndexSearcher.explain to get an Explanation of the scoring for the query against a particular document. If you can provide the docid of a document you believe should match the query, you can analyze Explanation.toString (or toHtml, if you prefer) to determine which subqueries are not matching against it.
If you want to automatically keep a record of which clause of a BooleanQuery doesn't produce results, I believe you will need to run each query independently. If you no longer have access to the subqueries used to create it, you can get the clauses of it instead:
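void findTroublesomeQuery(BooleanQuery query) {
    for (BooleanClause clause : query.clauses()) {
        Query subquery = clause.getQuery();
        // searchHoweverYouDo and log are placeholders for your own search and logging code.
        TopDocs docs = searchHoweverYouDo(subquery);
        // totalHits is a plain number in the Lucene versions that still have NumericRangeQuery.
        if (docs.totalHits == 0) {
            // If you want to dig down recursively...
            if (subquery instanceof BooleanQuery)
                findTroublesomeQuery((BooleanQuery) subquery);
            else
                log(subquery); // Or do whatever you want to keep track of it.
        }
    }
}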
DisjunctionMaxQuery is a commonly used query that wraps multiple subqueries as well, so might be worth considering for this sort of approach.

How to use Hibernate to query a MySQL database with indexes

I have an application based on MySQL that connects to the database through Hibernate. I use DAO utility code to query the database. Now I need to optimize my database queries with indexes. My question is: how can I query data through the Hibernate DAO utility code and make sure that indexes are used in the MySQL database when the queries are executed? Any hints or pointers to existing examples are appreciated!
Update: Just to make the question a little more understandable, the following is the code I use to query the MySQL database through the Hibernate DAO utility code. I'm not directly using HQL here. Any suggestions for a better solution? If needed, I will rewrite the database query code and use HQL directly instead.
public static List<Measurements> getMeasurementsList(String physicalId, String startdate, String enddate) {
    List<Measurements> listOfMeasurements = new ArrayList<Measurements>();
    Timestamp queryStartDate = toTimestamp(startdate);
    Timestamp queryEndDate = toTimestamp(enddate);
    MeasurementsDAO measurementsDAO = new MeasurementsDAO();
    PhysicalLocationDAO physicalLocationDAO = new PhysicalLocationDAO();
    short id = Short.parseShort(physicalId);
    List physicalLocationList = physicalLocationDAO.findByProperty("physicalId", id);
    Iterator ite = physicalLocationList.iterator();
    while(ite.hasNext()) {
        PhysicalLocation physicalLocation = (PhysicalLocation)ite.next();
        List measurementsList = measurementsDAO.findByProperty("physicalLocation", physicalLocation);
        Iterator jte = measurementsList.iterator();
        while(jte.hasNext()){
            Measurements measurements = (Measurements)jte.next();
            if(measurements.getMeasTstime().after(queryStartDate)
                    && measurements.getMeasTstime().before(queryEndDate)) {
                listOfMeasurements.add(measurements);
            }
        }
    }
    return listOfMeasurements;
}
Just like with SQL, you don't need to do anything special. Just execute your queries as usual, and the database will use the indices you've created to optimize them, if possible.
For example, let's say you have a HQL query that searches all the products that have a given name:
select p from Product p where p.name = :name
This query will be translated by Hibernate to SQL:
select p.id, p.name, p.price, p.code from product p where p.name = ?
If you don't have any index set on product.name, the database will have to scan the whole table of products to find those that have the given name.
If you have an index set on product.name, the database will determine that, given the query, it's useful to use this index, and will thus know which rows have the given name thanks to the index. It will thus be able to read only a small subset of the rows to return the query's data.
This is all transparent to you. You just need to know which queries are slow and frequent enough to justify the creation of an index to speed them up.
