Updating values of given field using expression in mongodb using java - java

Consider a collection of objects having fields like:
{
id: // String
type: //Integer
score: //Double value
}
I would like to query the collection by type and, for the returned documents, divide each score by the maximum score among them. Consider the following query object:
DBObject searchQuery = new BasicDBObject("type", 2);
collection.find(searchQuery);
The above query returns some documents. I want to get the maximum score among those documents and then divide each document's score by that maximum.
How can I do this?
I could find maximum using aggregation as follows:
String propertyToOperateOn = "score";
DBObject match = new BasicDBObject("$match", searchQuery);
DBObject groups = new BasicDBObject("_id", null);
DBObject operation = new BasicDBObject("$max", "$" + propertyToOperateOn);
groups.put("maximum", operation);
DBObject apply = new BasicDBObject("$group", groups);
AggregationOutput output = mongoConstants.IAScores.aggregate(match, apply);
Here output will contain the maximum value. But how can I then update all documents' scores by dividing them by this maximum?
I hope there is a better way to do this, but I haven't found one, as I'm very new to MongoDB (or to databases in general).

This is technically the same issue as "mongodb: java: How to update a field in MongoDB using expression with existing value", but I'll repeat the answer:
At the moment, MongoDB doesn't allow an update to set a field based on that field's existing value. That means you can't do the equivalent of the following SQL:
UPDATE foo SET field1 = field1 / 2;
In MongoDB, you will need to do this in your application, but be aware that this is no longer an atomic operation, as you need to read and then write. (MongoDB 4.2 and later accept an aggregation pipeline as the update, which lifts this restriction; the advice here applies to earlier versions.)
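The read-then-write approach described above can be sketched in plain Java. This is only an illustration under stated assumptions: the list of maps stands in for the documents returned by collection.find(searchQuery), and the class and method names (ScoreNormalizer, maxScore, normalize, doc) are hypothetical, not part of the MongoDB driver. With a real collection, each put below would become one update call per document.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory stand-in for the matched documents; NOT the MongoDB driver API.
final class ScoreNormalizer {

    // Step 1 (read): find the largest score among the matched documents.
    static double maxScore(List<Map<String, Object>> docs) {
        double max = Double.NEGATIVE_INFINITY;
        for (Map<String, Object> doc : docs) {
            max = Math.max(max, ((Number) doc.get("score")).doubleValue());
        }
        return max;
    }

    // Step 2 (write): divide every score by that maximum.
    // Against a real collection, each put(...) would be one update per document.
    static void normalize(List<Map<String, Object>> docs) {
        if (docs.isEmpty()) {
            return;
        }
        double max = maxScore(docs);
        if (max == 0.0) {
            return; // avoid dividing by zero; leave the scores untouched
        }
        for (Map<String, Object> doc : docs) {
            double score = ((Number) doc.get("score")).doubleValue();
            doc.put("score", score / max);
        }
    }

    // Hypothetical helper that builds one sample document.
    static Map<String, Object> doc(String id, int type, double score) {
        Map<String, Object> d = new HashMap<>();
        d.put("id", id);
        d.put("type", type);
        d.put("score", score);
        return d;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> docs = new ArrayList<>();
        docs.add(doc("a", 2, 4.0));
        docs.add(doc("b", 2, 8.0));
        normalize(docs);
        System.out.println(docs);
    }
}
```

Because the maximum is read first and the writes happen afterwards, another client could change a score in between; that is the loss of atomicity the answer warns about.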

Related

DynamoDB: Batch query items with highest range key given a set of hash key

I have a table Book with bookId and lastBorrowed as hash and range keys, respectively.
Let's say each time a book is borrowed, a new row is created.
(Yes, this is NOT sufficient, and I could just add a column to keep track of the count and update the lastBorrowed date. But let's just say I'm stuck with this design and there's nothing I can do about it.)
Given a set of bookIds (or hashKeys), I would like to be able to query the last time each book was borrowed.
I attempted to use QueryRequest, but kept getting com.amazonaws.AmazonServiceException: Attempted conditional constraint is not an indexable operation:
final Map<String, Condition> keyConditions =
Collections.singletonMap(hashKeyFieldName, new Condition()
.withComparisonOperator(ComparisonOperator.IN)
.withAttributeValueList(hashKeys.stream().map(hashKey -> new AttributeValue(hashKey)).collect(Collectors.toList())));
I also tried using BatchGetItemRequest, but it didn't work, either:
final KeysAndAttributes keysAndAttributes = new KeysAndAttributes() .withConsistentRead(areReadsConsistent);
hashKeys.forEach(hashKey -> { keysAndAttributes.addExpressionAttributeNamesEntry(hashKeyFieldName, hashKey); });
final Map<String, KeysAndAttributes> requestedItemsByTableName = newHashMap();
requestedItemsByTableName.put(tableName, keysAndAttributes);
final BatchGetItemRequest request = new BatchGetItemRequest().withRequestItems(requestedItemsByTableName);
Any suggestion would be much appreciated!
Or if someone can tell me this is currently not supported at all, then I guess I'll just move on!
You can do this; in fact it's very easy. All you have to do is execute a Query for your bookId and then take the first result.
By the way, your table design sounds absolutely fine; the only problem is that the attribute should probably be called borrowed rather than lastBorrowed.
You can have multiple results for a single bookId, but because lastBorrowed is your range key, the results will come back ordered by that attribute.
You seem to be using legacy functions. Are you editing old code?
If not, execute your Query something like this:
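import com.amazonaws.regions.Regions;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;
import com.amazonaws.services.dynamodbv2.document.DynamoDB;
import com.amazonaws.services.dynamodbv2.document.Item;
import com.amazonaws.services.dynamodbv2.document.ItemCollection;
import com.amazonaws.services.dynamodbv2.document.QueryOutcome;
import com.amazonaws.services.dynamodbv2.document.Table;
import com.amazonaws.services.dynamodbv2.document.spec.QuerySpec;
import com.amazonaws.services.dynamodbv2.document.utils.ValueMap;
import java.util.Iterator;

// Set up the DynamoDB connection
AmazonDynamoDB client = AmazonDynamoDBClientBuilder.standard()
    .withRegion(Regions.US_WEST_2).build();
DynamoDB dynamoDB = new DynamoDB(client);
Table table = dynamoDB.getTable("YOURTABLE");
// Define the query. withScanIndexForward(false) returns items in descending
// range-key (lastBorrowed) order, so the most recent borrow comes first.
QuerySpec spec = new QuerySpec()
    .withKeyConditionExpression("bookId = :book_id")
    .withValueMap(new ValueMap()
        .withString(":book_id", "12345"))
    .withScanIndexForward(false);
// Execute the query
ItemCollection<QueryOutcome> items = table.query(spec);
// Print the first item only, which is the most recent borrow
Iterator<Item> iterator = items.iterator();
if (iterator.hasNext()) {
    System.out.println(iterator.next().toJSONPretty());
}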
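For a set of bookIds, the same idea is repeated once per key. The sketch below is an in-memory model only, not the DynamoDB API: a TreeSet per bookId plays the role of a partition whose items are kept sorted by the range key, and the class and method names (LastBorrowedModel, addBorrow, lastBorrowed, lastBorrowedFor) are hypothetical. It is meant to show why taking one item from the descending end of each partition answers the question.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.NavigableSet;
import java.util.Optional;
import java.util.TreeSet;

// In-memory model of a table with bookId as hash key and lastBorrowed as range key.
final class LastBorrowedModel {

    // bookId -> borrow timestamps, kept sorted the way a partition is sorted by range key.
    private final Map<String, NavigableSet<Long>> partitions = new HashMap<>();

    // Each borrow adds a new row to the book's partition.
    void addBorrow(String bookId, long borrowedAt) {
        partitions.computeIfAbsent(bookId, k -> new TreeSet<>()).add(borrowedAt);
    }

    // Models one query on a single hash key, read in descending order, first item only.
    Optional<Long> lastBorrowed(String bookId) {
        NavigableSet<Long> rows = partitions.get(bookId);
        if (rows == null || rows.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(rows.last());
    }

    // One lookup per requested bookId; ids that have no rows are skipped.
    Map<String, Long> lastBorrowedFor(Collection<String> bookIds) {
        Map<String, Long> result = new LinkedHashMap<>();
        for (String id : bookIds) {
            lastBorrowed(id).ifPresent(ts -> result.put(id, ts));
        }
        return result;
    }
}
```

In this model the cost is one lookup per bookId, which mirrors issuing one Query per hash key rather than a single batched request.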

Storing numeric values in Lucene 6.5.0

I need to store a numeric field in my Lucene documents, but in Lucene 6.5.1 the signature of the numeric field class is
NumericDocValuesField(String name, long value)
In older Lucene versions the constructor was
NumericField(String, Field.Store, boolean)
Can someone guide me on how to store numeric values in a document using Lucene 6.5.1?
Regards,
Raghavan
NumericDocValuesField is used for scoring/sorting only:
http://lucene.apache.org/core/6_5_0/core/org/apache/lucene/document/NumericDocValuesField.html
If you want to store any kind of value (including numeric), you have to use a StoredField:
https://lucene.apache.org/core/6_5_0/core/org/apache/lucene/document/StoredField.html
Depending on what you need, you have to add multiple fields for multiple purposes. If you have a numeric value as a long and you want to do range queries and sort, you would do something like this:
// for range queries
new LongPoint(field, value);
// for storing the value
new StoredField(field, value);
// for sorting / scoring
new NumericDocValuesField(field, value);
Use the type-specific numeric fields (available in Lucene 4.x and 5.x; Lucene 6 renamed them to LegacyIntField, LegacyLongField and so on, and deprecated them in favor of the point fields):
IntField intField = new IntField("int_value", 100, Field.Store.YES);
LongField longField = new LongField("long_value", 100L, Field.Store.YES);
FloatField floatField = new FloatField("float_value", 100.0F, Field.Store.YES);
DoubleField doubleField = new DoubleField("double_value", 100.0D, Field.Store.YES);
You can store their values and sort on them if you need to. All of these fields are indexed.

Sorting search result in Lucene based on a numeric field

I have some docs with two fields: text, count.
I've used Lucene to index docs and now I want to search in text and get the result sorted by count in descending order. How can I do that?
The default search implementation of Apache Lucene returns results sorted by score (the most relevant result first), then by id (the oldest result first).
This behavior can be customized at query time with an additional Sort parameter:
TopFieldDocs Searcher#search(Query query, Filter filter, int n, Sort sort)
The Sort parameter specifies the fields or properties used for sorting. The default implementation is defined this way:
new Sort(new SortField[] { SortField.FIELD_SCORE, SortField.FIELD_DOC });
To change the sorting, you just have to replace the fields with the ones you want:
new Sort(new SortField[] {
SortField.FIELD_SCORE,
new SortField("field_1", SortField.STRING),
new SortField("field_2", SortField.STRING) });
This sounds simple, but it will not work unless the following conditions are met:
You have to specify the type parameter of SortField(String field, int type) to make Lucene find your field, even though it is normally optional.
The sort fields must be indexed but not tokenized:
document.add (new Field ("byNumber", Integer.toString(x), Field.Store.NO, Field.Index.NOT_ANALYZED));
The content of the sort fields must be plain text only. If a single element has a special character or accent in one of the fields used for sorting, the whole search will return unsorted results.
Check this tutorial.
The line below will do the trick. The last parameter is the boolean reverse; if you set it to true, it sorts in reverse order, i.e. descending in your case.
SortField longSort = new SortedNumericSortField(FIELD_NAME_LONG, SortField.Type.LONG, true);
Sample code:
IndexSearcher searcher = new IndexSearcher(reader);
Query q = new MultiFieldQueryParser(new String[] { FIELD_NAME_NAME }, analyzer).parse("YOUR_QUERY");
SortField longSort = new SortedNumericSortField(FIELD_NAME_LONG, SortField.Type.LONG, true);
Sort sort = new Sort(longSort);
ScoreDoc[] hits = searcher.search(q, 10 , sort).scoreDocs;
It is also necessary to add your sort-enabled field as a NumericDocValuesField when you create your index.
doc.add(new NumericDocValuesField(FIELD_NAME_LONG, longValue));//sort enabled field
Code is as per lucene-core-5.0.0
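First, index count as a numeric field:
Fieldable count = new NumericField("count", Store.YES, true);
Second, define a sort on that field (passing true as the third argument reverses the order, which gives descending):
SortField field = new SortField("count", SortField.INT, true);
Sort sort = new Sort(field);
Third, pass the sort to the search:
TopFieldDocs docs = searcher.search(query, 20, sort);
ScoreDoc[] sds = docs.scoreDocs;
This works.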

How to get just the desired field from an array of sub documents in Mongodb using Java

I have just started using MongoDB. Below is my data structure.
It has an array of skillIDs, each of which has an array of activeCampaigns, and each activeCampaign has an array of callsByTimeZone.
What I am looking for, in SQL terms, is:
Select activeCampaigns.callsByTimeZone.label,
activeCampaigns.callsByTimeZone.loaded
from X
where skillID=50296 and activeCampaigns.campaign_id= 11371940
and activeCampaigns.callsByTimeZone='PT'
The output I am expecting is:
{"label":"PT", "loaded":1 }
The command I used is:
db.cd.find({ "skillID" : 50296 , "activeCampaigns.campaignId" : 11371940,
"activeCampaigns.callsByTimeZone.label" :"PT" },
{ "activeCampaigns.callsByTimeZone.label" : 1 ,
"activeCampaigns.callsByTimeZone.loaded" : 1 ,"_id" : 0})
The output I am getting is everything under activeCampaigns.callsByTimeZone, while I expect only the PT entry.
Data structure:
{
"skillID":50296,
"clientID":7419,
"voiceID":1,
"otherResults":7,
"activeCampaigns":
[{
"campaignId":11371940,
"campaignFileName":"Aaron.name.121.csv",
"loaded":259,
"callsByTimeZone":
[{
"label":"CT",
"loaded":6
},
{
"label":"ET",
"loaded":241
},
{
"label":"PT",
"loaded":1
}]
}]
}
I tried the same in Java.
QueryBuilder query = QueryBuilder.start().and("skillID").is(50296)
.and("activeCampaigns.campaignId").is(11371940)
.and("activeCampaigns.callsByTimeZone.label").is("PT");
BasicDBObject fields = new BasicDBObject("activeCampaigns.callsByTimeZone.label",1)
.append("activeCampaigns.callsByTimeZone.loaded",1).append("_id", 0);
DBCursor cursor = coll.find(query.get(), fields);
String campaignJson = null;
while(cursor.hasNext()) {
DBObject campaignDBO = cursor.next();
campaignJson = campaignDBO.toString();
System.out.println(campaignJson);
}
The value obtained is everything under the callsByTimeZone array. I am currently parsing the returned JSON and extracting only the PT values. Is there a way to query just the PT fields inside activeCampaigns.callsByTimeZone?
Thanks in advance. Sorry if this question has already been raised in the forum; I have searched a lot and failed to find a proper solution.
There are several ways of doing it, but you should not be using String manipulation (i.e. indexOf), the performance could be horrible.
The results in the cursor are nested Maps, representing the document in the database - a Map is a good Java-representation of key-value pairs. So you can navigate to the place you need in the document, instead of having to parse it as a String. I've tested the following and it works on your test data, but you might need to tweak it if your data is not all exactly like the example:
while (cursor.hasNext()) {
    DBObject campaignDBO = cursor.next();
    // Assumes activeCampaigns is a list with at least one entry
    List callsByTimezone = (List) ((DBObject) ((List) campaignDBO.get("activeCampaigns")).get(0)).get("callsByTimeZone");
    DBObject valuesThatIWant = null;
    for (Object o : callsByTimezone) {
        DBObject call = (DBObject) o;
        // Constant-first equals avoids a NullPointerException if label is missing
        if ("PT".equals(call.get("label"))) {
            valuesThatIWant = call;
        }
    }
}
Depending upon your data, you might want to add protection against null values as well.
The thing you were looking for ({"label":"PT", "loaded":1 }) is in the variable valuesThatIWant. Note that this, too, is a DBObject, i.e. a Map, so if you want to see what's inside it you need to use get:
valuesThatIWant.get("label"); // will return "PT"
valuesThatIWant.get("loaded"); // will return 1
Because DBObject is effectively a Map of String to Object (i.e. Map<String, Object>), you need to cast the values that come out of it (hence the ugliness in the first bit of code in my answer). With numbers, the type depends on how the data was loaded into the database: it might come out as an int or as a double, so casting to Number handles both:
String theValueOfLabel = (String) valuesThatIWant.get("label"); // will return "PT"
double theValueOfLoaded = ((Number) valuesThatIWant.get("loaded")).doubleValue(); // will return 1.0
I'd also like to point out the following from my answer:
((List) campaignDBO.get("activeCampaigns")).get(0)
This assumes that "activeCampaigns" is a) a list and in this case b) only has one entry (I'm doing get(0)).
You will also have noticed that the fields values you've set are almost entirely being ignored, and the result is most of the document, not just the fields you asked for. I'm pretty sure you can only define the top-level fields you want the query to return, so your code:
BasicDBObject fields = new BasicDBObject("activeCampaigns.callsByTimeZone.label",1)
.append("activeCampaigns.callsByTimeZone.loaded",1)
.append("_id", 0);
is actually exactly the same as:
BasicDBObject fields = new BasicDBObject("activeCampaigns", 1).append("_id", 0);
I think some of the points that will help you to work with Java & MongoDB are:
When you query the database, it returns the whole document that matches your query, i.e. everything from "skillID" downwards. If you want to select the fields to return, I think those will only be top-level fields. See the documentation for more detail.
To navigate the results, you need to know that DBObjects are returned, and that these are effectively a Map<String, Object> in Java. You can use get to navigate to the correct node, but you will need to cast the values into the correct shape.
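The navigation described in this answer, together with the null protection it mentions, can be sketched with plain Java collections. This is an assumption-laden illustration: ordinary Map and List objects stand in for the DBObject and list values the driver returns, and NestedLookup, findByLabel, entry and sampleDocument are hypothetical names. Unlike the get(0) version, it walks every campaign rather than only the first.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain Maps and Lists stand in for the nested DBObject structure.
final class NestedLookup {

    // Returns the first callsByTimeZone entry with the given label, or null if none is found.
    @SuppressWarnings("unchecked")
    static Map<String, Object> findByLabel(Map<String, Object> doc, String label) {
        Object campaigns = doc.get("activeCampaigns");
        if (!(campaigns instanceof List)) {
            return null; // field missing or not an array
        }
        for (Object c : (List<Object>) campaigns) {
            if (!(c instanceof Map)) {
                continue;
            }
            Object calls = ((Map<String, Object>) c).get("callsByTimeZone");
            if (!(calls instanceof List)) {
                continue;
            }
            for (Object o : (List<Object>) calls) {
                if (o instanceof Map && label.equals(((Map<String, Object>) o).get("label"))) {
                    return (Map<String, Object>) o;
                }
            }
        }
        return null;
    }

    // Hypothetical helper: one callsByTimeZone entry.
    static Map<String, Object> entry(String label, int loaded) {
        Map<String, Object> m = new HashMap<>();
        m.put("label", label);
        m.put("loaded", loaded);
        return m;
    }

    // Sample data shaped like the document in the question (trimmed to the relevant fields).
    static Map<String, Object> sampleDocument() {
        List<Object> calls = new ArrayList<>();
        calls.add(entry("CT", 6));
        calls.add(entry("ET", 241));
        calls.add(entry("PT", 1));
        Map<String, Object> campaign = new HashMap<>();
        campaign.put("campaignId", 11371940);
        campaign.put("callsByTimeZone", calls);
        List<Object> campaigns = new ArrayList<>();
        campaigns.add(campaign);
        Map<String, Object> doc = new HashMap<>();
        doc.put("skillID", 50296);
        doc.put("activeCampaigns", campaigns);
        return doc;
    }
}
```

Calling NestedLookup.findByLabel(NestedLookup.sampleDocument(), "PT") yields the map holding label "PT" and loaded 1, and a label that is not present yields null instead of throwing.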
Replacing the while loop in your Java code with the one below seems to give "PT" as output:
while(cursor.hasNext()) {
    DBObject campaignDBO = cursor.next();
    campaignJson = campaignDBO.get("activeCampaigns").toString();
    int labelInt = campaignJson.indexOf("PT", -1);
    String label = campaignJson.substring(labelInt, labelInt+2);
    System.out.println(label);
}

How to retrieve the schema of a DBObject?

In MySQL the DESCRIBE statement can be used to retrieve the schema of a given table. Unfortunately, I could not find similar functionality in the MongoDB Java driver.
Let's say I have the following BSON documents:
{
_id: {$oid:49},
values: { a:10, b:20}
}
,
{
_id: {$oid:50},
values: { b:21, c:31}
}
Now let's suppose I do:
DBObject obj = cursor.next();
DBObject values_1 = (DBObject) obj.get("values");
and the schema should be something like this:
>a : int
>b : int
and for the next document:
DBObject obj = cursor.next();
DBObject values_2 = (DBObject) obj.get("values");
the schema should be:
>b : int
>c : int
Now that I have explained what schema retrieval is, can someone tell me how to do it?
If it helps, in my case I only need to know the field names (the datatype is always the same), but it would be nice to also know how to retrieve the datatypes.
A workaround is to convert the DBObject to a Map, then the Map to a Set, the Set to an Iterator, and extract the attribute names/values. I still have no idea how to extract the data types.
That is:
DBObject values_1 = (DBObject) obj.get("values");
Map _map = values_1.toMap();
// Set set = _map.entrySet(); // if you want the <key, value> pairs
Set _set_keys = _map.keySet();
Iterator _iterator = _set_keys.iterator();
while (_iterator.hasNext())
System.out.println("-> " + _iterator.next());
A DBObject has a method called keySet (documentation). You shouldn't need to convert to a Map first.
There's no exposed method to determine the underlying BSON data type at this point, so you'd need to investigate using instanceof or getClass to determine the underlying data type of the Object that is returned from get.
If you look at the source code for BasicBSONObject for example, you'll see how the helper functions that cast do some basic checks and then force the cast.
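The instanceof/getClass approach suggested above might look like the following sketch. It works on a plain Map<String, Object>, which is what a DBObject effectively is (and what toMap() in the question's code produces); FieldTypes, describe and typeName are hypothetical names, and the type labels are illustrative rather than official BSON type names.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Derives a rough per-document "schema" from the runtime classes of the values.
final class FieldTypes {

    // Maps each field name to a coarse type label, sorted by field name.
    static Map<String, String> describe(Map<String, Object> values) {
        Map<String, String> schema = new TreeMap<>();
        for (Map.Entry<String, Object> e : values.entrySet()) {
            schema.put(e.getKey(), typeName(e.getValue()));
        }
        return schema;
    }

    // instanceof checks for common cases, with getClass as the fallback.
    static String typeName(Object value) {
        if (value == null) {
            return "null";
        }
        if (value instanceof Integer) {
            return "int";
        }
        if (value instanceof Long) {
            return "long";
        }
        if (value instanceof Double) {
            return "double";
        }
        if (value instanceof String) {
            return "string";
        }
        if (value instanceof Map) {
            return "document";
        }
        if (value instanceof List) {
            return "array";
        }
        return value.getClass().getSimpleName();
    }
}
```

For the first document in the question, describing the values sub-document gives a and b both labelled int; since each document can have different fields, the result only describes the one document it was computed from.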
