How to use the MongoDB Java driver's Projections.slice

I am trying to use Aggregates.project to slice the arrays in my documents.
My documents look like this:
{
"date":"",
"stype_0":[1,2,3,4]
}
(A MongoChef screenshot of the document appeared here.)
My code in Java is:
Aggregates.project(Projections.fields(
    Projections.slice("stype_0", pst-1, pen-pst),
    Projections.slice("stype_1", pst-1, pen-pst),
    Projections.slice("stype_2", pst-1, pen-pst),
    Projections.slice("stype_3", pst-1, pen-pst)))
Finally, I get this error:
First argument to $slice must be an array, but is of type: int
I guess that is because the first element in stype_0 is an int, but I really do not know why. Thanks a lot!

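$slice has two versions: $slice (aggregation) and $slice (projection). You are using the wrong one. Projections.slice builds the projection form, { field: { $slice: [skip, limit] } }, which is meant for find(). Inside an aggregation $project stage, $slice is an expression whose first argument must be the array itself, so the server reads your skip value as the array and complains that it is an int.
The Projections helpers have no built-in support for the aggregation form of $slice, so build the expression yourself. Below is an example for one field. Do the same for all the other projected fields.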
// Aggregation form: { $slice: [ <array expression>, <position>, <n> ] }
List<Object> stype_0 = Arrays.<Object>asList("$stype_0", 1, 1);
Bson project = Aggregates.project(Projections.fields(new Document("stype_0", new Document("$slice", stype_0))));
AggregateIterable<Document> iterable = dbCollection.aggregate(Arrays.asList(project));
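To apply this to all four fields with the asker's pst and pen values, the stage can be assembled in a loop. The sketch below uses only java.util maps so that the shape of the stage is easy to inspect. SliceShapes, aggregationSlice and projectStage are hypothetical names, and it assumes pst and pen are 1-based start and end positions, as the original pst-1 and pen-pst arithmetic suggests.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SliceShapes {
    // Aggregation form for one field: { $slice: ["$field", position, n] }.
    static Map<String, Object> aggregationSlice(String field, int position, int n) {
        List<Object> args = Arrays.<Object>asList("$" + field, position, n);
        return Collections.<String, Object>singletonMap("$slice", args);
    }

    // Whole $project stage for several array fields, reusing the pst/pen arithmetic
    // from the question (position = pst - 1, n = pen - pst).
    static Map<String, Object> projectStage(int pst, int pen, String... fields) {
        Map<String, Object> projected = new LinkedHashMap<>();
        for (String field : fields) {
            projected.put(field, aggregationSlice(field, pst - 1, pen - pst));
        }
        return Collections.<String, Object>singletonMap("$project", projected);
    }

    public static void main(String[] args) {
        // Prints the stage as nested maps so the argument order can be checked by eye.
        System.out.println(projectStage(2, 4, "stype_0", "stype_1", "stype_2", "stype_3"));
    }
}
```

With the driver on the classpath, the resulting map should be usable as a pipeline stage by wrapping it in new Document(map). Note that the aggregation form expects n to be positive when a position is given.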


Obtain java primitive from mongo aggregation without a new output class

I have an aggregation:
AggregationResults<Integer> result = mongoTemplate.aggregate(
    Aggregation.newAggregation(
        Aggregation.group().count().as("value"),
        Aggregation.project("value").andExclude("_id")),
    MyData.class, Integer.class);
In the mongo shell, when I don't have to map an object, I get: { "value" : 2 }
However, I get the following error when trying to map this lone value: org.springframework.data.mapping.model.MappingException: No mapping metadata found for java.lang.Integer
Can I get around having to create a new output type class, when I only want to get a single java primitive?
Note: I'm taking this approach instead of db.collection.count() because of the sharding inaccuracies described here - https://docs.mongodb.com/manual/reference/method/db.collection.count/#sharded-clusters
Using DBObject as the output type works:
AggregationResults<DBObject> result = mongoTemplate.aggregate(
    Aggregation.newAggregation(
        Aggregation.group().count().as("value"),
        Aggregation.project("value").andExclude("_id")),
    MyData.class, DBObject.class);
int count = (Integer) result.getUniqueMappedResult().get("value");
So, not exactly what I wanted, because I still have to traverse over an object, but it's not any more code than I had before and I didn't need to make another class as the outputType.

what is this error and how do I prevent this? The bucket expression values are not comparable and no comparator specified

I'm using JasperReports with DynamicReports and I want to build a crosstab report. So far I have figured out that this error happens when I add numeric columns to rowGroups or columnGroups. I don't know why it happens or how to solve it.
The error is:
The bucket expression values are not comparable and no comparator specified
My code is:
CrosstabValues crosstabValues = report.getCrosstab().getCrosstabValues();
Collection<CrosstabRowGroupBuilder> rowGroup = generateRowGroup(crosstabValues);
Collection<CrosstabColumnGroupBuilder> columnGroup = generateColumnGroup(crosstabValues);
Collection<CrosstabMeasureBuilder> measures = generateMeasures(crosstabValues);
CrosstabBuilder crosstab = ctab.crosstab();
for (CrosstabRowGroupBuilder row : rowGroup)
    crosstab.addRowGroup(row);
for (CrosstabColumnGroupBuilder columnGroupBuilder : columnGroup)
    crosstab.addColumnGroup(columnGroupBuilder);
for (CrosstabMeasureBuilder measure : measures)
    crosstab.addMeasure(measure);
crosstab.headerCell(cmp.text(crosstabValues.getHeader())
    .setStyle(getCrosstabHeaderCellStyle(report.getTemplate().getReportTemplateValues())));
The problem was the class I was passing to this method:
CrosstabRowGroupBuilder cTabRow = ctab.rowGroup(column.getName(),
    getColumnTypeClass(column));
I was using the Number class for all numeric data. The funny thing is that it worked for measures but not for rowGroup or columnGroup, which is why I got confused.
Now, with Integer.class or Long.class, it works.
The crosstab must know in which order to display the row headers and column headers, and in which cell to put each measure. That is possible only if it can compare the rowGroup (and columnGroup) values.
So the classes used in rowGroup and columnGroup must implement the Comparable interface. java.lang.Number does not, while Integer and Long do.
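The difference between Number and its concrete subclasses can be checked with the JDK alone. This is a minimal sketch. ComparableCheck and isNaturallyOrdered are hypothetical names, and whether the reporting library performs exactly this test internally is an assumption.

```java
final class ComparableCheck {
    // True when instances of the given class have a natural ordering,
    // i.e. the class implements Comparable and needs no separate Comparator.
    static boolean isNaturallyOrdered(Class<?> type) {
        return Comparable.class.isAssignableFrom(type);
    }

    public static void main(String[] args) {
        Class<?>[] candidates = { Number.class, Integer.class, Long.class, String.class };
        for (Class<?> type : candidates) {
            System.out.println(type.getSimpleName() + " comparable: " + isNaturallyOrdered(type));
        }
    }
}
```

A check like this could sit inside a method such as getColumnTypeClass to catch non-comparable group types before the report is built.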

Easiest way to extract fields from JSON

Update: I should have mentioned this right off the bat: I first considered a Java/JSON mapping framework, but my manager does not want me adding any more dependencies to the project, so that is out as an option. The JSON-Java jar is already on our classpath, so I could use that, but I still can't see the forest for the trees on how it could be used.
My Java program is being handed JSON of the following form (although the values will change all the time):
{"order":{"booze":"1","handled":"0","credits":"0.6",
"execute":0,"available":["299258"],"approved":[],
"blizzard":"143030","reviewable":["930932","283982","782821"],
"units":"6","pending":["298233","329449"],"hobbit":"blasphemy"}}
I'm looking for the easiest, efficient, surefire way of cherry-picking specific values out of this JSON string and aggregating them into a List<Long>.
Specifically, I'm looking to extract-and-aggregate all of the "ids", that is, all the numeric values that you see for the available, approved, reviewable and pending fields. Each of these fields is an array of 0+ "ids". So, in the example above, we see the following breakdown of ids:
available: has 1 id (299258)
approved: has 0 ids
reviewable: has 3 ids (930932, 283982, 782821)
pending: has 2 ids (298233, 329449)
I need some Java code to run and produce a List<Long> with all 6 of these extracted ids, in no particular order. The ids just need to make it into the list.
This feels like an incredibly complex, convoluted regex, and I'm not even sure where to begin. Any help at all is enormously appreciated. Thanks in advance.
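The easiest way, IMO, is to use a JSON library such as gson, jackson, json.org, etc., parse the JSON into an object, and create a new List<Long> with the values of the properties you need.
Pseudocode with gson:
class Order {
    long[] available;
    long[] approved;
    ...
}
// The sample JSON nests everything under "order", so a small wrapper is needed.
class Wrapper { Order order; }
Wrapper wrapper = gson.fromJson("{ your json goes here }", Wrapper.class);
List<Long> result = new ArrayList<Long>();
// add() takes a single Long, so copy each array element by element.
for (long id : wrapper.order.available) result.add(id);
for (long id : wrapper.order.approved) result.add(id);
...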
Pseudocode with json.org/java:
JSONObject myobject = new JSONObject("{ your json goes here }");
JSONObject order = myobject.getJSONObject("order");
List<Long> result = new ArrayList<Long>();
// Repeat for "available", "reviewable" and "pending", or loop over the four field names.
JSONArray approved = order.getJSONArray("approved");
for (int i = 0; i < approved.length(); i++) {
    result.add(approved.getLong(i));
}
...
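For comparison, since the question asks about a regex: a JDK-only fallback is possible for this particular flat shape, though it is brittle and a JSON library remains the safer choice. The sketch below assumes the ids are always runs of digits inside the four named arrays and that no array contains a nested "]". IdExtractor, extractIds and SAMPLE are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class IdExtractor {
    // The sample payload from the question, used for demonstration only.
    static final String SAMPLE =
        "{\"order\":{\"booze\":\"1\",\"handled\":\"0\",\"credits\":\"0.6\","
      + "\"execute\":0,\"available\":[\"299258\"],\"approved\":[],"
      + "\"blizzard\":\"143030\",\"reviewable\":[\"930932\",\"283982\",\"782821\"],"
      + "\"units\":\"6\",\"pending\":[\"298233\",\"329449\"],\"hobbit\":\"blasphemy\"}}";

    private static final String[] ID_FIELDS = { "available", "approved", "reviewable", "pending" };
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Collects every run of digits found inside the arrays of the four id fields.
    static List<Long> extractIds(String json) {
        List<Long> ids = new ArrayList<>();
        for (String field : ID_FIELDS) {
            // Matches "field" : [ ... ] and captures the text between the brackets.
            Pattern arrayPattern =
                Pattern.compile("\"" + Pattern.quote(field) + "\"\\s*:\\s*\\[([^\\]]*)\\]");
            Matcher array = arrayPattern.matcher(json);
            while (array.find()) {
                Matcher id = DIGITS.matcher(array.group(1));
                while (id.find()) {
                    ids.add(Long.parseLong(id.group()));
                }
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        System.out.println(extractIds(SAMPLE));
    }
}
```

Scalar fields such as "blizzard" are ignored because only text inside the four named arrays is scanned. Any change in the JSON layout (nested objects, escaped quotes, non-numeric ids) would break this approach.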

Faceting using SolrJ and Solr4

I've gone through the related questions on this site but haven't found a relevant solution.
When querying my Solr4 index using an HTTP request of the form
&facet=true&facet.field=country
The response contains all the different countries along with counts per country.
How can I get this information using SolrJ?
I have tried the following but it only returns total counts across all countries, not per country:
solrQuery.setFacet(true);
solrQuery.addFacetField("country");
The following does seem to work, but I do not want to have to explicitly set all the groupings beforehand:
solrQuery.addFacetQuery("country:usa");
solrQuery.addFacetQuery("country:canada");
Secondly, I'm not sure how to extract the facet data from the QueryResponse object.
So two questions:
1) Using SolrJ how can I facet on a field and return the groupings without explicitly specifying the groups?
2) Using SolrJ how can I extract the facet data from the QueryResponse object?
Thanks.
Update:
I also tried something similar to Sergey's response (below).
List<FacetField> ffList = resp.getFacetFields();
log.info("size of ffList:" + ffList.size());
for(FacetField ff : ffList){
String ffname = ff.getName();
int ffcount = ff.getValueCount();
log.info("ffname:" + ffname + "|ffcount:" + ffcount);
}
The above code shows ffList with size=1 and the loop goes through 1 iteration. In the output ffname="country" and ffcount is the total number of rows that match the original query.
There is no per-country breakdown here.
I should mention that on the same solrQuery object I am also calling addField and addFilterQuery. Not sure if this impacts faceting:
solrQuery.addField("user-name");
solrQuery.addField("user-bio");
solrQuery.addField("country");
solrQuery.addFilterQuery("user-bio:" + "(Apple OR Google OR Facebook)");
Update 2:
I think I got it, again based on what Sergey said below. I extracted the List object using FacetField.getValues().
List<FacetField> fflist = resp.getFacetFields();
for(FacetField ff : fflist){
String ffname = ff.getName();
int ffcount = ff.getValueCount();
List<Count> counts = ff.getValues();
for(Count c : counts){
String facetLabel = c.getName();
long facetCount = c.getCount();
}
}
In the above code, the facetLabel variable holds the name of each facet group and facetCount is the corresponding count for that group.
Actually, you only need to set the facet field and faceting will be activated (check the SolrJ source code):
solrQuery.addFacetField("country");
Where did you look for the facet information? It should be in QueryResponse.getFacetFields() (then getValues() and getCount() on each entry).
In the Solr response, use QueryResponse.getFacetFields() to get the list of FacetField objects, one of which is "country". With a single facet field, "country" is QueryResponse.getFacetFields().get(0).
Then iterate over it to get the Count objects using
QueryResponse.getFacetFields().get(0).getValues().get(i)
Get the facet value's name using QueryResponse.getFacetFields().get(0).getValues().get(i).getName()
and the corresponding count using
QueryResponse.getFacetFields().get(0).getValues().get(i).getCount()

hbase: querying for specific value with dynamically created qualifier

Hi,
HBase allows a column family to have different qualifiers in different rows. In my case, a column family has the following specification:
abc[cnt] # where cnt is an integer that can be any positive integer
What I want to achieve is to get all the data from a different column family, but only if the value under the described qualifier (which lives in another column family) matches.
To narrow the Scan down, I just add the two families I need for the query, but that is as far as I have got.
I already achieved the same behaviour with a SingleColumnValueFilter, but then the qualifier was known in advance. Here the qualifier can be abc1, abc2, and so on; there would be too many options, and thus too many SingleColumnValueFilters.
Then I tried using the ValueFilter, but this filter only returns the columns that match the value, so I get the wrong column family.
Can you think of any way to achieve my goal: querying for a value within a dynamically created qualifier in a column family, and returning the contents of that column family and another column family (as specified when creating the Scan)? Preferably with a single query.
Thanks in advance for any input.
UPDATE: (for clarification as discussed in the comments)
in a more graphical way, a row may have the following:
colfam1:aaa
colfam1:aab
colfam1:aac
colfam2:abc1
colfam2:abc2
I want to get all of family colfam1 if any column in colfam2 has, e.g., the value x, bearing in mind that colfam2:abc[cnt] is created dynamically, with cnt being any positive integer.
I see two approaches for this: client-side filtering or server-side filtering.
Client-side filtering is more straightforward. The Scan adds only the two families "colfam1" and "colfam2". Then, for each Result you get from scanner.next(), you must filter according to the qualifiers in "colfam2".
import java.util.LinkedList;
import java.util.Map.Entry;
import java.util.NavigableMap;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

byte[] queryValue = Bytes.toBytes("x");
Scan scan = new Scan();
scan.addFamily(Bytes.toBytes("colfam1"));
scan.addFamily(Bytes.toBytes("colfam2"));
ResultScanner scanner = myTable.getScanner(scan);
Result res;
while ((res = scanner.next()) != null) {
    // Look for the query value under any qualifier of colfam2.
    NavigableMap<byte[],byte[]> colfam2 = res.getFamilyMap(Bytes.toBytes("colfam2"));
    boolean foundQueryValue = false;
    SearchForQueryValue: while (!colfam2.isEmpty()) {
        Entry<byte[], byte[]> cell = colfam2.pollFirstEntry();
        if (Bytes.equals(cell.getValue(), queryValue)) {
            foundQueryValue = true;
            break SearchForQueryValue;
        }
    }
    if (foundQueryValue) {
        // Rebuild a Result that contains only the colfam1 cells of this row.
        NavigableMap<byte[],byte[]> colfam1 = res.getFamilyMap(Bytes.toBytes("colfam1"));
        LinkedList<KeyValue> listKV = new LinkedList<KeyValue>();
        while (!colfam1.isEmpty()) {
            Entry<byte[], byte[]> cell = colfam1.pollFirstEntry();
            listKV.add(new KeyValue(res.getRow(), Bytes.toBytes("colfam1"), cell.getKey(), cell.getValue()));
        }
        Result filteredResult = new Result(listKV);
    }
}
(This code was not tested)
Finally, filteredResult is what you want. This approach is not elegant and may also cause performance problems if those families hold a lot of data: if "colfam1" is large, you do not want to transfer it to the client only to discard it because value "x" is not in any qualifier of "colfam2".
Server-side filtering. This requires you to implement your own Filter class; I believe the provided filter types cannot do this. Implementing your own Filter takes some work, and you also need to package it as a .jar and make it available to all RegionServers. In return, it avoids sending loads of "colfam1" data in vain.
It is too much work for me to show how to implement a custom Filter here, so I recommend reading a good book (HBase: The Definitive Guide, for example). However, the Filter code will look much like the client-side filtering shown above, so that is half of the work done.
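The per-row check that both approaches share ("does any qualifier in this family hold the query value?") can be isolated and tried out with the JDK alone. This is a sketch under assumptions: FamilyValueMatcher, anyValueEquals, familyOf, utf8 and BYTES_ORDER are hypothetical names, and in real code the map would come from Result.getFamilyMap(...) rather than from the helper used here. Unlike pollFirstEntry, iterating over values() leaves the map unchanged.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Comparator;
import java.util.NavigableMap;
import java.util.TreeMap;

final class FamilyValueMatcher {
    // Unsigned lexicographic ordering so that byte[] can serve as a TreeMap key.
    static final Comparator<byte[]> BYTES_ORDER = (a, b) -> {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    };

    // True if any qualifier in the family map holds exactly the query value.
    static boolean anyValueEquals(NavigableMap<byte[], byte[]> familyMap, byte[] queryValue) {
        if (familyMap == null) {
            return false;
        }
        for (byte[] value : familyMap.values()) {
            if (Arrays.equals(value, queryValue)) {
                return true;
            }
        }
        return false;
    }

    // Demo helper: builds a family map from qualifier/value string pairs.
    static NavigableMap<byte[], byte[]> familyOf(String... qualifierValuePairs) {
        NavigableMap<byte[], byte[]> map = new TreeMap<>(BYTES_ORDER);
        for (int i = 0; i + 1 < qualifierValuePairs.length; i += 2) {
            map.put(utf8(qualifierValuePairs[i]), utf8(qualifierValuePairs[i + 1]));
        }
        return map;
    }

    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        NavigableMap<byte[], byte[]> colfam2 = familyOf("abc1", "y", "abc2", "x");
        System.out.println(anyValueEquals(colfam2, utf8("x")));
    }
}
```

On the client side, this check would decide whether to keep a row's "colfam1" cells. In a custom Filter, the same comparison would run on the RegionServer before any data is sent.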
