How to fetch version from search results using spring-data-elasticsearch - java

I am executing a NativeSearchQuery in Spring Data Elasticsearch to run a search operation, and I am trying to fetch the "_version" value from the response through the Java API. The following is a sample request and response for this use case:
GET /my-index/_doc/foo-bar
{
  "_index" : "my-index",
  "_type" : "_doc",
  "_id" : "foo-bar",
  "_version" : 88,
  "_seq_no" : 169,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "_class" : "fooBar",
    "value" : 1250087,
    "creatdDateTime" : "20210203T124928.913Z",
    "ModifiedDateTime" : "20210203T124928.913Z"
  }
}
I am trying to fetch _version=88 from the above response through the Java API using NativeSearchQuery. How can I get this value?

When using Spring Data Elasticsearch, you need to have a property of type Long annotated with @Version. Spring Data Elasticsearch will then populate this property with the version value.
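For illustration, a minimal entity sketch (the index name is taken from the sample above; the FooBar class itself is hypothetical):

import org.springframework.data.annotation.Id;
import org.springframework.data.annotation.Version;
import org.springframework.data.elasticsearch.annotations.Document;

@Document(indexName = "my-index")
public class FooBar {

    @Id
    private String id;

    // populated by Spring Data Elasticsearch with the document's _version value
    @Version
    private Long version;

    private Long value;

    // getters and setters omitted for brevity
}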

At the request level you don't need to do anything, as the _version field is always returned for search queries.
Each search hit in the response is transformed into a Document instance, and if the _version value is >= 0, the version value is set on that Document instance.
The ElasticsearchOperations.search() method will then return a SearchHits instance containing the individual SearchHit objects, and that class provides a getVersion() method.
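Putting the two parts together, a minimal search sketch, assuming Spring Data Elasticsearch 4.x, the FooBar entity above, and an injected ElasticsearchOperations bean:

import org.elasticsearch.index.query.QueryBuilders;
import org.springframework.data.elasticsearch.core.ElasticsearchOperations;
import org.springframework.data.elasticsearch.core.SearchHit;
import org.springframework.data.elasticsearch.core.SearchHits;
import org.springframework.data.elasticsearch.core.query.NativeSearchQuery;
import org.springframework.data.elasticsearch.core.query.NativeSearchQueryBuilder;

NativeSearchQuery query = new NativeSearchQueryBuilder()
        .withQuery(QueryBuilders.idsQuery().addIds("foo-bar"))
        .build();

SearchHits<FooBar> hits = elasticsearchOperations.search(query, FooBar.class);

for (SearchHit<FooBar> hit : hits) {
    // the @Version property of the mapped entity now carries the _version value
    Long version = hit.getContent().getVersion();
}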

Related

Getting java.lang.IllegalArgumentException "Name must not be null!" when updating MongoDB collection

I am facing the below issue when updating an existing collection entry in MongoDB. I am using Spring Boot 2.0.
My existing MongoDB collection entry is shown below. I'm going to update the "external_item" of this collection. It has an empty key in that JSON portion.
{
  "TestItem" : {
    "item1" : "value1"
  },
  "external_item" : {
    "" : "keyIsEmptyOfThisValue",
    "key2" : false
  },
  "links" : [],
  "createdDate" : ISODate("2020-05-10T05:24:44.014Z"),
  "updatedDate" : ISODate("2020-05-10T05:24:44.014Z")
}
For that I'm using the below payload with the PUT method of a REST API:
{
  "external_item" : {
    "" : "keyIsEmptyOfThisValue",
    "key2" : true
  }
}
Updating gives the below issue. It says "Name must not be null!". How can I get MongoDB to update the content in this way?
java.lang.IllegalArgumentException: Name must not be null!
at org.springframework.util.Assert.hasText(Assert.java:162)
at org.springframework.data.mongodb.core.convert.QueryMapper$Field.<init>(QueryMapper.java:591)
at org.springframework.data.mongodb.core.convert.QueryMapper.createPropertyField(QueryMapper.java:216)
at org.springframework.data.mongodb.core.convert.UpdateMapper.createPropertyField(UpdateMapper.java:169)
at org.springframework.data.mongodb.core.convert.QueryMapper.getMappedObject(QueryMapper.java:122)
at org.springframework.data.mongodb.core.convert.QueryMapper.convertSimpleOrDBObject(QueryMapper.java:359)
at org.springframework.data.mongodb.core.convert.UpdateMapper.getMappedObjectForField(UpdateMapper.java:81)
at org.springframework.data.mongodb.core.convert.QueryMapper.getMappedObject(QueryMapper.java:123)
at org.springframework.data.mongodb.core.MongoTemplate$11.doInCollection(MongoTemplate.java:1016)
at org.springframework.data.mongodb.core.MongoTemplate$11.doInCollection(MongoTemplate.java:1007)
at org.springframework.data.mongodb.core.MongoTemplate.execute(MongoTemplate.java:410)
at org.springframework.data.mongodb.core.MongoTemplate.doUpdate(MongoTemplate.java:1007)
at org.springframework.data.mongodb.core.MongoTemplate.updateFirst(MongoTemplate.java:985)
at com.pearson.socket.core.dao.MongoDriverImpl.updateFirst(MongoDriverImpl.java:127)
at com.pearson.socket.core.dao.AbstractDAOImpl.updateFirst(AbstractDAOImpl.java:92)
There seems to be a bug somewhere in old versions of spring-data-mongodb that causes this.
Consider updating to newer Spring Boot and spring-data-mongodb versions; for example, I know that in Spring Boot 2.2, which uses spring-data-mongodb 2.2.6, updating an entity with an empty map key works.
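For reference, a sketch of the kind of update that trips the bug on old versions (field and collection names are taken from the question and are illustrative only):

import static org.springframework.data.mongodb.core.query.Criteria.where;
import static org.springframework.data.mongodb.core.query.Query.query;

import java.util.Map;
import org.springframework.data.mongodb.core.query.Update;

// a map with an empty-string key ends up in the $set document; old
// QueryMapper versions reject the empty field name with "Name must not be null!"
Map<String, Object> externalItem = Map.of("", "keyIsEmptyOfThisValue", "key2", true);
Update update = new Update().set("external_item", externalItem);
mongoTemplate.updateFirst(query(where("_id").is(id)), update, "items");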

MongoDb Java Driver toJson() and $oid

I'm building a Java Jersey API which uses MongoDb and MongoDb driver.
The resources should output JSON of the stored MongoDb document to be used in the frontend project using Svelte.
Due to the standard org.bson.Document.toJson() implementation, the output of my documents looks something like:
[{ "_id" : { "$oid" : "5e97f08f2175aa9174dbec0e" }, "hour" : 8, "minute" : 15, "enabled" : true, "duration" : 120 }
I would rather like it to be:
[{ "_id" : "5e97f08f2175aa9174dbec0e", "hour" : 8, "minute" : 15, "enabled" : true, "duration" : 120 }
That way it's easier to handle the id in the frontend. So how do I get rid of the $oid object?
I already managed to get the format I want by using:
JsonWriterSettings settings = JsonWriterSettings.builder()
        .outputMode(JsonMode.RELAXED)
        .objectIdConverter((value, writer) -> writer.writeString(value.toHexString()))
        .build();
System.out.println(doc.toJson(settings));
But how do I register this settings object globally so that every doc.toJson() call will use it?
And what will happen if I send modified or new documents from the frontend to the API and do:
Document document = Document.parse(doc);
Is my modified _id field automatically converted again to an ObjectId? Or do I need an org.bson.codecs.Decoder or CodecRegistry? How would this be done?
$oid refers to the ObjectId field type in the BSON spec. As far as I know, you need to manipulate your document to replace the ObjectId in your _id with a String:
String oidAsString = document.getObjectId("_id").toString();
document.put("_id", oidAsString);
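For the reverse direction you asked about, Document.parse() will not turn that plain string back into an ObjectId by itself; with this manual approach you would convert it yourself, e.g. with a sketch like this (assuming the _id is a valid 24-character hex string):

import org.bson.Document;
import org.bson.types.ObjectId;

Document parsed = Document.parse(json);
// the frontend sends _id as a plain string, so restore the ObjectId
// before writing the document back to MongoDB
parsed.put("_id", new ObjectId(parsed.getString("_id")));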

ElasticSearch IO How to remove id from JSON document before writing

I have an Apache Beam streaming job which reads data from Kafka and writes to ElasticSearch using ElasticSearchIO.
The issue I'm having is that messages in Kafka already have a key field, and using ElasticSearchIO.Write.withIdFn() I'm mapping this field to the document _id field in ElasticSearch.
Given the big volume of data, I don't want the key field to also be written to ElasticSearch as part of _source.
Is there an option/workaround that would allow doing that?
Using the Ingest API and the remove processor, you'll be able to solve this pretty easily using only your Elasticsearch cluster. You can also simulate an ingest pipeline and preview the results.
I've prepared an example which will probably cover your case:
POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "description": "remove id from incoming docs",
    "processors": [
      {
        "remove": {
          "field": "id",
          "ignore_failure": true
        }
      }
    ]
  },
  "docs": [
    { "_source": { "id": "123546", "other_field": "other value" } }
  ]
}
As you can see, there is one test document containing a field "id". This field is no longer present in the response:
{
  "docs" : [
    {
      "doc" : {
        "_index" : "_index",
        "_type" : "_type",
        "_id" : "_id",
        "_source" : {
          "other_field" : "other value"
        },
        "_ingest" : {
          "timestamp" : "2018-12-03T16:33:33.885909Z"
        }
      }
    }
  ]
}
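Once the simulated result looks right, you can store the processor as a named pipeline and reference it at index time; a sketch (the pipeline name remove-id is arbitrary):

PUT _ingest/pipeline/remove-id
{
  "description": "remove id from incoming docs",
  "processors": [
    { "remove": { "field": "id", "ignore_failure": true } }
  ]
}

PUT /my-index/_doc/1?pipeline=remove-id
{ "id": "123546", "other_field": "other value" }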
I've created a ticket in the Apache Beam JIRA describing this issue.
For now the original issue cannot be resolved as part of the indexation process using the Apache Beam API.
The workaround that Etienne Chauchot, one of the maintainers, proposed is to
have a separate task which will clear the indexed data afterwards.
See Remove a field from an Elasticsearch document for an example.
If someone would also like such a feature in the future, you might want to follow the linked ticket.

mongodb java driver pullByFilter

I have a document schema such as:
{
  "_id" : 18,
  "name" : "Verdell Sowinski",
  "scores" : [
    {
      "type" : "exam",
      "score" : 62.12870233109035
    },
    {
      "type" : "quiz",
      "score" : 84.74586220889356
    },
    {
      "type" : "homework",
      "score" : 81.58947824932574
    },
    {
      "type" : "homework",
      "score" : 69.09840625499065
    }
  ]
}
I have a solution using pull that copes with removing a single element at a time, but I want a general solution that copes with an irregular schema where there would be between one and many elements in the array, and I would like to remove all elements based on a condition.
I'm using MongoDB driver 3.2.2 and saw this pullByFilter, which sounded good:
Creates an update that removes from an array all elements that match the given filter.
I tried this:
Bson filter = and(eq("type", "homework"), lt("score", highest));
Bson u = Updates.pullByFilter(filter);
UpdateResult ur = collection.updateOne(studentDoc, u);
Unsurprisingly, this did not have any effect, since I wasn't specifying the array scores.
I get the error
The positional operator did not find the match needed from the query. Unexpanded update: scores.$.type
when I change the filter to:
Bson filter = and(eq("scores.$.type", "homework"), lt("scores.$.score", highest));
Is there a one step solution to this problem?
There seems to be very little info on this particular method that I can find. This question may relate to How to Update Multiple Array Elements in mongodb.
After some more "thinking" (and a little trial and error), I found the correct Filters method to wrap my basic filter. I think I was focusing on array operators too much.
I'll not post it here in case of flaming.
Clue: think "matches..." (as in regex pattern matching) when dealing with Filters helper methods ;)
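For readers who would rather not puzzle it out: following the clue, the wrapper is presumably Filters.elemMatch (my reading, not confirmed by the answerer), which would make the one-step update look something like this sketch:

import static com.mongodb.client.model.Filters.*;

import com.mongodb.client.model.Updates;
import com.mongodb.client.result.UpdateResult;
import org.bson.conversions.Bson;

// match the embedded documents inside the scores array, then pull all of them
Bson condition = and(eq("type", "homework"), lt("score", highest));
Bson update = Updates.pullByFilter(elemMatch("scores", condition));
UpdateResult ur = collection.updateOne(eq("_id", 18), update);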

mongoDB: $inc of a nonexistent document in an array

I was not able to write code which would increment a non-existent value in an array.
Consider the following structure in a mongo collection. (This is not the actual structure we use, but it reproduces the issue.)
{
  "_id" : ObjectId("527400e43ca8e0f79c2ce52c"),
  "content" : "Blotted Science",
  "tags_with_ratings" : [
    {
      "ratings" : {
        "0" : 6154,
        "1" : 4974
      },
      "tag_name" : "math_core"
    },
    {
      "ratings" : {
        "0" : 154,
        "1" : 474
      },
      "tag_name" : "progressive_metal"
    }
  ]
}
Example issue: we want to increment, in this document's tags_with_ratings attribute, a rating of a tag which is not yet present in the array. For example, we would want to increment the "0" value for the tag_name "dubstep".
So the expected behaviour would be that mongo would upsert a document like this into the tags_with_ratings attribute:
{
  "ratings" : {
    "0" : 1
  },
  "tag_name" : "dubstep"
}
At the moment we need one read operation which checks if the nested document for the tag is there. If it's not, we pull the tags_with_ratings array out, create a new one, re-add the values from the previous one and add the new nested document. Shouldn't we be able to do this with one upsert operation, without the expensive read?
Incrementing the values takes up 90% of the process, and more than half of that is consumed by reading, because we are unable to use the $inc capability of creating an attribute if it is non-existent in the array.
You cannot achieve what you want in one step with this schema.
You could do it, however, if you used the tag_name value as the key name instead of ratings, but then you may have a different issue when querying.
If the tag_name value were the field name (replacing ratings), you'd have {"dubstep":{"0":1}} instead of {"ratings":{"0":1},"tag_name":"dubstep"}, which you can update dynamically the way you want to. Just keep in mind that this schema makes querying more difficult: you have to know the tag names in advance to be able to query by key name.
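To make that concrete, a sketch with the Java driver, assuming the restructured schema where tags_with_ratings is an embedded document keyed by tag name:

import static com.mongodb.client.model.Filters.eq;

import com.mongodb.client.model.Updates;
import org.bson.types.ObjectId;

// $inc creates the whole missing path ("dubstep" and its "0" counter)
// in one write, so no prior read is needed
collection.updateOne(
        eq("_id", new ObjectId("527400e43ca8e0f79c2ce52c")),
        Updates.inc("tags_with_ratings.dubstep.0", 1));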
