Profile Neo4j database hits using Java

I am trying to profile Neo4j database hits using the following code:
public int calculateHits(List<ExecutionPlanDescription> list) {
    int hits = 0;
    int head = 0;
    if (list.isEmpty())
        return 0;
    if (list.get(head).hasProfilerStatistics()) {
        hits += list.get(head).getProfilerStatistics().getDbHits();
        System.out.println(hits);
    }
    hits += calculateHits(list.get(head).getChildren()); // recurse over the children of the head
    list.remove(head); // remove the head to recurse on the rest of the list
    hits += calculateHits(list);
    return hits;
}
In main I call it this way:
Result result = neo4jGraph.execute(query);
int hits = calculateHits(result.getExecutionPlanDescription().getChildren());
However, the method always returns 0 hits. I logged the names of the execution plan operators and found EagerAggregation, Filter, Expand(All), Filter, and NodeByLabelScan plans. But it seems that profiler statistics do not exist for any of the plans, as the condition is never entered and the hit count never increases.
Is there any problem in the code, or do I need to make some configuration first to profile the DB hits? I appreciate your help. Thanks!

I finally figured out the problem! I have to use PROFILE in the query to fill in the statistics and get the DB hits, so the query should be:
PROFILE MATCH (e:Event)-[r:has_metadata]->(s:EventMetadata) WHERE s.type STARTS WITH 'ELec' AND s.eventLocation IN ["GW", "GW32", "FW", "FW29", "SW", "SW00"] AND e.date="1/11/2016" RETURN SUM(e.reading)
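For completeness, a minimal sketch of the call site against the embedded Java API (the MATCH clause here is a placeholder, and note the assumption that the Result generally has to be fully consumed before the profiler statistics become available):
Result result = neo4jGraph.execute("PROFILE MATCH (e:Event) RETURN count(e)");
result.resultAsString(); // exhaust the result so the profiler statistics get populated
ExecutionPlanDescription plan = result.getExecutionPlanDescription();
int hits = calculateHits(plan.getChildren());
// the root plan can carry statistics of its own, so consider checking
// plan.hasProfilerStatistics() as well, not only the children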

Related

Vaadin's DataProvider.fromFilteringCallbacks hangs forever in a loop, how to make it complete?

I have the DataProvider below:
DataProvider<WebLogFileRow, WebLogFileFilter> dataProvider = DataProvider.fromFilteringCallbacks(
        query -> {
            int offset = query.getOffset();
            int limit = query.getLimit();
            return webLogFileService.getLogFileRows(query.getFilter().get(), offset, limit).stream();
        },
        query -> {
            int offset = query.getOffset();
            int limit = query.getLimit();
            return webLogFileService.getLogFileRowsCount(query.getFilter().get(), offset, limit);
        }
);
The methods, inside, in fact:
call the data repository with the offset and limit values passed;
filter the results based on some conditions, so in fact not the whole set reaches the grid output.
The hang happens when the count query determines that exactly one row should be present: the DataProvider then passes limit 1 to the data query and retrieves one row, so at step 1 only one piece of data is taken from the DB, and at step 2 it is filtered out, making the total number of rows 0. Instead of throwing an exception, the DataProvider starts an eternal loop. Is there a way to throw an exception when the data query's result doesn't match the expected count, instead of it trying again and again?
My first mistake was passing the limit and offset params to the count query; however, that by itself didn't break the counting results.
The main mistake was associating the grid's limit and offset with the DB limit and offset: the list of items is filtered afterwards and is no longer the same length that was reported to the count query. So I can no longer use offset and limit there, since it isn't known in advance how much data will be filtered out. I had to fetch everything, which is potentially wrong on large data grids:
DataProvider<WebLogFileRow, WebLogFileFilter> dataProvider = DataProvider.fromFilteringCallbacks(
        query -> {
            // the query's offset and limit are deliberately ignored now
            return webLogFileService.getLogFileRows(query.getFilter().get(), null, null).stream();
        },
        query -> {
            return webLogFileService.getLogFileRowsCount(query.getFilter().get(), null, null);
        }
);
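One way to keep real paging and still have consistent counts is to filter inside the service before applying offset and limit, so both callbacks see the same filtered row set. A minimal sketch under that assumption (loadAllRows and matchesFilter are hypothetical helpers, and the signatures are adapted for illustration):
import java.util.List;
import java.util.stream.Collectors;

abstract class WebLogFileServiceSketch {
    // hypothetical data access and predicate, not part of the original service
    abstract List<WebLogFileRow> loadAllRows();
    abstract boolean matchesFilter(WebLogFileRow row, WebLogFileFilter filter);

    public List<WebLogFileRow> getLogFileRows(WebLogFileFilter filter, int offset, int limit) {
        List<WebLogFileRow> filtered = loadAllRows().stream()
                .filter(row -> matchesFilter(row, filter))
                .collect(Collectors.toList());
        int from = Math.min(offset, filtered.size());
        int to = Math.min(offset + limit, filtered.size());
        return filtered.subList(from, to); // page over the already-filtered list
    }

    public int getLogFileRowsCount(WebLogFileFilter filter) {
        // count exactly the same filtered set the fetch callback pages over
        return (int) loadAllRows().stream().filter(row -> matchesFilter(row, filter)).count();
    }
}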

Understanding "CancellationException: Task was cancelled" error while doing a Google Datastore query

I'm using Google App Engine v1.9.48. During some of my datastore queries, I randomly get a "CancellationException: Task was cancelled" error, and I'm not really sure what exactly is causing it. From other Stack Overflow posts, I vaguely understand that this has to do with timeouts, but I'm not entirely sure what triggers it. I'm not using any TaskQueues, if that helps.
Below is the stack trace:
java.util.concurrent.CancellationException: Task was cancelled.
at com.google.common.util.concurrent.AbstractFuture.cancellationExceptionWithCause(AbstractFuture.java:1126)
at com.google.common.util.concurrent.AbstractFuture.getDoneValue(AbstractFuture.java:504)
at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:407)
at com.google.common.util.concurrent.AbstractFuture$TrustedFuture.get(AbstractFuture.java:86)
....
at com.sun.proxy.$Proxy14.size(Unknown Source)
at main.java.com.continentalist.app.model.Model.getEntitySentimentCounts(Model.java:285)
at main.java.com.continentalist.app.model.Model.access$100(Model.java:37)
at main.java.com.continentalist.app.model.Model$2.vrun(Model.java:251)
at com.googlecode.objectify.VoidWork.run(VoidWork.java:14)
at com.googlecode.objectify.VoidWork.run(VoidWork.java:11)
at com.googlecode.objectify.ObjectifyService.run(ObjectifyService.java:81)
...
My App Engine code that throws the error is below. I added inline comments where the error is thrown (typically at one of the list().size() calls):
private EntityAnalysis getEntitySentimentCounts(ComboCall comboCall) {
    Query<ObjectifyArticle> queryArticles = ofy().load().type(ObjectifyArticle.class);
    queryArticles = queryArticles.filter("domain", comboCall.getDomain());
    Set<Entity> entitySet = comboCall.getEntitySet();
    SentimentCount[] allSentimentCounts = new SentimentCount[entitySet.size()];
    int index = 0;
    for (Entity eachEntity : entitySet) {
        SentimentCount sentimentCount = new SentimentCount();
        String eachEntityName = eachEntity.getText();
        Query<ObjectifyArticle> newQuery = queryArticles;
        newQuery = newQuery.filter("entityName", eachEntityName);
        sentimentCount.setEntityName(eachEntityName);
        Query<ObjectifyArticle> positiveFilter = newQuery;
        positiveFilter = positiveFilter.filter("entityType", POSITIVE);
        int positive = positiveFilter.list().size(); // ERROR EITHER HERE
        sentimentCount.setPositiveCount(positive + "");
        Query<ObjectifyArticle> negativeFilter = newQuery;
        negativeFilter = negativeFilter.filter("entityType", NEGATIVE);
        int negative = negativeFilter.list().size(); // OR HERE
        sentimentCount.setNegativeCount("" + negative);
        Query<ObjectifyArticle> neutralFilter = newQuery;
        neutralFilter = neutralFilter.filter("entityType", NEUTRAL);
        int neutral = neutralFilter.list().size(); // OR HERE
        sentimentCount.setNeutralCount("" + neutral);
        allSentimentCounts[index] = sentimentCount;
        index++;
    }
    EntityAnalysis entityAnalysis = new EntityAnalysis();
    entityAnalysis.setDomain(comboCall.getDomain());
    entityAnalysis.setSentimentCount(allSentimentCounts);
    return entityAnalysis;
}
You don't need to call .list().size(); you can simply call count().
If you only need a count, use a keys-only query: it's free and much faster.
Don't forget to set chunkAll() on your query when you expect to process a large number of entities; it's much faster than the default setting.
If you still run into these exceptions, you need to use cursors in your query.
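To illustrate the counting advice above on the question's own code, one of the per-entity counts could be written roughly as below. (Assumption to verify: Objectify's count() iterates keys rather than full entities, which is what makes it cheap.)
int positive = ofy().load().type(ObjectifyArticle.class)
        .filter("domain", comboCall.getDomain())
        .filter("entityName", eachEntityName)
        .filter("entityType", POSITIVE)
        .count(); // counts matches without materializing the entities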
Such an error occurs for the following common reasons:
An API call, such as a datastore list, times out.
Solution: read the data in a paged manner via a cursor.
The parent method hits the request timeout limit, e.g. a cron job hits the 10-minute limit on standard App Engine.
Solution: increase the timeout, use multi-threading, or use a cache to speed up execution.
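A rough sketch of the cursor advice with Objectify (the page size of 500 and the process() hook are illustrative assumptions; Cursor is com.google.appengine.api.datastore.Cursor):
Query<ObjectifyArticle> query = ofy().load().type(ObjectifyArticle.class).limit(500);
Cursor cursor = null;
while (true) {
    if (cursor != null) {
        query = query.startAt(cursor); // resume where the previous page ended
    }
    QueryResultIterator<ObjectifyArticle> it = query.iterator();
    boolean sawAny = false;
    while (it.hasNext()) {
        process(it.next()); // hypothetical per-entity work
        sawAny = true;
    }
    if (!sawAny) break;      // no more results
    cursor = it.getCursor(); // remember the position for the next page
}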

JDBC connector is too slow on SELECT

This is not a rare question on the net, but I did some optimization work on the MySQL server to solve this problem and got no results. First of all: I use the Maven package mysql:mysql-connector-java:6.0.6.
I try just to run this code:
try {
    mysqlConnection = DriverManager.getConnection(DatabaseUtils.mysqlUrl, DatabaseUtils.mysqlUser, DatabaseUtils.mysqlPassword);
    try (PreparedStatement valuesStatement = mysqlConnection.prepareStatement("SELECT * FROM `test` ORDER BY `id`");
         ResultSet cursor = valuesStatement.executeQuery()) {
        double value = 0;
        if (cursor.next())
            value = cursor.getDouble("value");
    }
} catch (SQLException sqlEx) {
    sqlEx.printStackTrace();
}
I have a lot of records in the table: about a million, and every day about a thousand more are added. So I was very surprised that this simple example took 30 seconds to execute. I googled my problem and found only "use a pool", "tune the MySQL server", "try EXPLAIN SELECT". But I noticed that the execution time is related to the row count, so I looked into the driver's code and found this:
TextResultsetReader::read():
while (true) {
    if (row == null) {
        rows = new ResultsetRowsStatic(rowList, cdef);
        break;
    }
    if (maxRows == -1 || rowList.size() < maxRows) {
        rowList.add(row);
    }
    row = (ResultsetRow) this.protocol.read(ResultsetRow.class, trf);
}
This means that even if I want to fetch only one row, the driver fetches all the queried rows and gives me the first of them. The manuals suggest using setFetchSize to fetch only n records, but it doesn't work: the driver code fetches all the data anyway. Then I found that there are two row-data implementations: ResultsetRowsStatic and ResultsetRowsStreaming. The second one seems to fetch data only when I actually request it. How do you enable ResultsetRowsStreaming? I found the answer only in the code: the fetchSize parameter must equal -2147483648 (Integer.MIN_VALUE). I tried it and it worked! The execution time of executeQuery() is now about 0.0007 sec. That's very fast for me.
But wait... my program still takes 30 seconds. Why? I debugged the code after the query execution; there are only two close() calls after that. What can go wrong? And indeed, cursor.close() takes the rest of the time. I looked into the library code again and reached ResultsetRowsStreaming::close():
boolean hadMore = false;
int howMuchMore = 0;
synchronized (mutex) {
    while (this.next() != null) {
        hadMore = true;
        ++howMuchMore;
        if (howMuchMore % 100 == 0) {
            Thread.yield();
        }
    }
    if (conn != null) {
        if (!((Boolean) this.protocol.getPropertySet().getBooleanReadableProperty("clobberStreamingResults").getValue()).booleanValue() && ((Integer) this.protocol.getPropertySet().getIntegerReadableProperty("netTimeoutForStreamingResults").getValue()).intValue() > 0) {
            int oldValue = this.protocol.getServerSession().getServerVariable("net_write_timeout", 60);
            this.protocol.clearInputStream();
            try {
                this.protocol.sqlQueryDirect((StatementImpl) null, "SET net_write_timeout=" + oldValue, (String) this.protocol.getPropertySet().getStringReadableProperty("characterEncoding").getValue(), (PacketPayload) null, -1, false, (String) null, (ColumnDefinition) null, (GetProfilerEventHandlerInstanceFunction) null, this.resultSetFactory);
            } catch (Exception var9) {
                throw ExceptionFactory.createException(var9.getMessage(), var9, this.exceptionInterceptor);
            }
        }
        if (((Boolean) this.protocol.getPropertySet().getBooleanReadableProperty("useUsageAdvisor").getValue()).booleanValue() && hadMore) {
            ProfilerEventHandler eventSink = ProfilerEventHandlerFactory.getInstance(conn.getSession());
            eventSink.consumeEvent(new ProfilerEventImpl(0, "", this.owner.getCurrentCatalog(), this.owner.getConnectionId(), this.owner.getOwningStatementId(), -1, System.currentTimeMillis(), 0L, Constants.MILLIS_I18N, (String) null, (String) null, Messages.getString("RowDataDynamic.2") + howMuchMore + Messages.getString("RowDataDynamic.3") + Messages.getString("RowDataDynamic.4") + Messages.getString("RowDataDynamic.5") + Messages.getString("RowDataDynamic.6") + this.owner.getPointOfOrigin()));
        }
    }
}
This code unconditionally fetches all the remaining data, only to log how many records I did not fetch. Really weird. It would be justified if a logger were attached, but in my case this code spends 30 seconds counting unfetched rows and then... does nothing with the count. And I cannot fix this problem, because there is no parameter that tells the code not to count the rows.
Now I don't know what to do next. The query time is far too slow for me. For example, the MySQL driver for PHP executes this query in 0.0004-0.001 seconds.
So, people who use mysql-connector for Java, tell me please: have you had these problems? If not, could you post examples of what I should do to get around them? Maybe you use other connectors. Please tell me what to do.
Your SQL query says
SELECT * FROM test ORDER BY id
You are, with that query, instructing your MySQL server to serialize every column of every row of your test table and send it to your Java program. So, MySQL obeys. You have a large table, so your instruction to MySQL takes time. And yes, the more rows in your table, the longer it takes. This is not a problem with JDBC or the driver; it's a problem with the SQL you're using.
It seems from your sample code that you want one column, named value, from one row, the first one, in your table. You could accomplish that using this SQL statement:
SELECT value FROM test ORDER BY id LIMIT 1
If your id column is your table's primary key, this will be fast.
The whole point of SQL is to allow your tables to contain so many rows that it's unreasonable to fetch them all into your Java (or other) program in a short amount of time. That's why SQL has WHERE and LIMIT clauses.
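For reference, a minimal sketch of both options against the question's connection: fetching only the needed row with LIMIT, and, where the full result set really is needed, switching Connector/J to row streaming, which is the behavior the question uncovered; setting the fetch size to Integer.MIN_VALUE on a forward-only, read-only statement is the documented way to request it.
// Option 1: ask MySQL for just the row you need.
try (PreparedStatement ps = mysqlConnection.prepareStatement(
        "SELECT `value` FROM `test` ORDER BY `id` LIMIT 1");
     ResultSet rs = ps.executeQuery()) {
    double value = rs.next() ? rs.getDouble("value") : 0;
}

// Option 2: stream rows instead of buffering them all client-side.
try (Statement st = mysqlConnection.createStatement(
        ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {
    st.setFetchSize(Integer.MIN_VALUE); // i.e. -2147483648, enables streaming mode
    try (ResultSet rs = st.executeQuery("SELECT * FROM `test` ORDER BY `id`")) {
        while (rs.next()) {
            // process each row as it arrives
        }
    }
}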

Java List values change when a change is made to one of the elements

I am having a weird issue with a Java List. Please see the code below:
for (int i = 0; i < tripList.size(); i++) {
    ModeChoiceTrip trip = tripList.get(i);
    int newUniqueId = tripListIds[trip.uniqueId];
    int newLinkedId = trip.linkedId >= 0 ? tripListIds[trip.linkedId] : -1;
    int jointTripNum = trip.linkedId >= 0 && trip.tourType != TourTypes.SPECIAL_EVENT ? jointTripListIds[trip.linkedId] : 0;
    trip.uniqueId = newUniqueId;
    trip.linkedId = newLinkedId;
    trip.jointTripNum = jointTripNum;
}
In the above code, the values in tripList seem correct at first, but after executing a few iterations (up to i = 6), the values in tripList change at all positions.
I cannot provide the whole source code here, but this is the snippet where I have the issue.
I found that there are some duplicate trips in tripList: the same trip object appears at more than one position. When one of the trips is changed, its duplicate (located at a different position) changes as well.
If this piece of code is executed by multiple threads, there is every chance that the List could be modified by another thread while this loop is running.
You could try synchronizing the loop and see if the issue gets resolved.
Also, you could try using a for-each loop instead of the loop with a counter:
for (ModeChoiceTrip trip : tripList) {
.....
}
The issue was the duplicate values in the list: the list held the same object at more than one position, so when I updated a value through one reference, the "copy" of that value changed as well.
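A minimal, self-contained demonstration of that aliasing effect (Trip here is a hypothetical stand-in for ModeChoiceTrip):
import java.util.ArrayList;
import java.util.List;

class AliasingDemo {
    static class Trip { int uniqueId; }

    public static void main(String[] args) {
        List<Trip> trips = new ArrayList<>();
        Trip trip = new Trip();
        trips.add(trip);
        trips.add(trip); // "duplicate" entry: the same reference, not a copy
        trips.get(0).uniqueId = 99;
        System.out.println(trips.get(1).uniqueId); // prints 99; both slots see the change
    }
}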
You set the linked id to -1. So watch the case where an id of -1 is used as an index, like tripListIds[-1]: in Java that throws an ArrayIndexOutOfBoundsException rather than wrapping around to the end of the array, as it would in some languages.

Problem with recursive backtracking

Hey guys, I recently posted about a problem with my algorithm:
Finding the numbers from a set which give the minimum amount of waste
I've amended the code slightly, so it now backtracks to an extent; however, the output is still flawed. I've debugged this considerably, checking all the variable values, and can't seem to find the issue.
Again, advice as opposed to an outright solution would be of great help. I think there are only a couple of problems with my code, but I can't work out where.
//from previous post:
Basically a set is passed to the method below, along with the length of a bar. The solution should output the numbers from the set that give the minimum amount of waste if certain numbers from the set were cut from the bar length. So, for bar length 10 and a set including 6, 1, 4, the solution is 6 and 4, and the wastage is 0. I'm having some trouble with the conditions to backtrack through the set, though. I've also tried to use a wastage "global" variable to help with the backtracking aspect, but to no avail.
SetInt is a manually made set implementation, which can add, remove, check if the set is empty and return the minimum value from the set.
/*
 * To change this template, choose Tools | Templates
 * and open the template in the editor.
 */
package recursivebacktracking;

/**
 *
 * @author User
 */
public class RecBack {
    int WASTAGE = 10;
    int BESTWASTAGE;
    int BARLENGTH = 10;

    public void work() {
        int[] nums = {6, 1, 2, 5}; // order numbers
        // set declaration
        SetInt ORDERS = new SetInt(nums.length);
        SetInt BESTSET = new SetInt(nums.length);
        SetInt SOLUTION = new SetInt(nums.length);
        // populate the set
        for (int item : nums) ORDERS.add(item);
        SetInt result = tryCutting(ORDERS, SOLUTION, BARLENGTH, WASTAGE);
        result.printNumbers();
    }

    public SetInt tryCutting(SetInt possibleOrders, SetInt solution, int lengthleft, int waste) {
        for (int i = 0; i < possibleOrders.numberInSet(); i++) { // the repeat
            int a = possibleOrders.min(); // select next candidate
            System.out.println(a);
            if (a <= lengthleft) { // if acceptable
                solution.add(a); // record candidate
                lengthleft -= a;
                WASTAGE = lengthleft;
                possibleOrders.remove(a); // remove from original set
                if (!possibleOrders.isEmpty()) { // solution not complete
                    System.out.println("this time");
                    tryCutting(possibleOrders, solution, lengthleft, waste); // try recursive call
                    BESTWASTAGE = WASTAGE;
                    if (BESTWASTAGE <= WASTAGE) { // if not successful
                        lengthleft += a;
                        solution.remove(a);
                        System.out.println("never happens");
                    }
                } // solution not complete
            }
        } // for loop
        return solution;
    }
}
Instead of using backtracking, have you considered using a bitmask algorithm instead? I think it would make your algorithm much simpler.
Here's an outline of how you would do this:
Let N be the number of elements in your set. So if the set is {6,1,2,5}, then N would be 4. Let max_waste be the maximum waste we can eliminate (10 in your example).
int best = 0; // the best sum found so far (the waste is then max_waste - best)
for (int mask = 1; mask <= (1 << N) - 1; ++mask) {
    // loop over each bit in the mask to see if it's set and add to the sum
    int sm = 0;
    for (int j = 0; j < N; ++j) {
        if (((1 << j) & mask) != 0) {
            // the bit is set, add this amount to the total
            sm += your_set[j];
            // possible optimization: if sm is greater than max_waste, then break
            // out of the loop since there's no need to continue
        }
    }
    // if sm <= max_waste, see if this result is better than our current best,
    // and store it accordingly
    if (sm <= max_waste) {
        best = max(best, sm);
    }
}
This algorithm is very similar to backtracking and has similar complexity; it just doesn't use recursion.
The bitmask basically is a binary representation where 1 indicates that we use the item in the set, and 0 means we don't. Since we are looping from 1 to (1<<N)-1, we are considering all possible subsets of the given items.
Note that the running time of this algorithm increases very quickly as N gets larger, but with N up to around 20 it should be OK. The same limitation applies to backtracking, by the way. If you need faster performance, you'd need to consider another technique like dynamic programming; a small sketch follows below.
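For what it's worth, a minimal subset-sum sketch of that dynamic-programming idea in Java (the item values and bar length are taken from the question; the class name is illustrative):
public class MinWasteDp {
    public static void main(String[] args) {
        int barLength = 10;
        int[] items = {6, 1, 2, 5};
        // reachable[s] == true if some subset of the items sums to exactly s
        boolean[] reachable = new boolean[barLength + 1];
        reachable[0] = true;
        for (int item : items) {
            for (int s = barLength; s >= item; s--) { // downward: each item used at most once
                reachable[s] |= reachable[s - item];
            }
        }
        int best = barLength;
        while (!reachable[best]) best--; // largest achievable sum not exceeding the bar
        System.out.println("minimum waste = " + (barLength - best)); // prints 1 for this input
    }
}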
For the backtracking, you just need to keep track of which element of the set you are on, and you either use that element or you don't. If you use it, you add it to your total; if not, you proceed to the next recursive call without increasing your total. Then you decrement the total (if you incremented it), which is where the backtracking comes in.
It's very similar to the bitmask approach above, and I provided the bitmask solution to help give you a better understanding of how the backtracking algorithm would work.
EDIT
OK, I didn't realize you were required to use recursion.
Hint1
First, I think you can simplify your code considerably by just using a single recursive function and putting the logic in that function. There's no need to build all the sets ahead of time and then process them (I'm not totally sure that's what you're doing, but it seems that way from your code). You can just build the sets as you go and keep track of where you are in the set. When you get to the end of the set, see if your result is better than the best so far.
Hint2
If you still need more hints, try to think of what your backtracking function should be doing. What are the terminating conditions? When we reach the terminating condition, what do we need to record (e.g. did we get a new best result, etc.)?
Hint3
Spoiler Alert
Below is a C++ implementation to give you some ideas, so stop reading here if you want to work on it some more by yourself.
#include <iostream>
#include <vector>
using namespace std;

int bestDiff = 999999999;
int N;
vector<int> cur_items;
int cur_tot = 0;
int items[] = {6, 1, 2, 5};
vector<int> best_items;
int max_waste;

void go(int at) {
    if (cur_tot > max_waste)
        // we've exceeded max_waste, so no need to continue
        return;
    if (at == N) {
        // we're at the end of the input; see if we got a better result and,
        // if so, record it
        if (max_waste - cur_tot < bestDiff) {
            bestDiff = max_waste - cur_tot;
            best_items = cur_items;
        }
        return;
    }
    // use this item
    cur_items.push_back(items[at]);
    cur_tot += items[at];
    go(at + 1);
    // here's the backtracking part
    cur_tot -= items[at];
    cur_items.pop_back();
    // don't use this item
    go(at + 1);
}

int main() {
    // 4 items in the set, so N is 4
    N = 4;
    // maximum waste we can eliminate is 10
    max_waste = 10;
    // call the backtracking algo
    go(0);
    // output the results
    cout << "bestDiff = " << bestDiff << endl;
    cout << "The items are:" << endl;
    for (int i = 0; i < (int) best_items.size(); ++i) {
        cout << best_items[i] << " ";
    }
    return 0;
}
