Fetch Paginated view data after applying reduce in couchdb with ektorp - java

Hi, I want to fetch data from a CouchDB view with reduce and pagination applied.
With reduce enabled, my view returns results with complex keys, as follows:
{"rows":[
{"key":{"attribute":"Attribute1"},"value":20},
{"key":{"attribute":"Attribute2"},"value":1},
{"key":{"attribute":"Attribute3"},"value":1}
]}
I am trying to fetch the data from CouchDB using Ektorp with the following code:
PageRequest pageRequest = PageRequest.firstPage(10);
ViewQuery query = new ViewQuery()
.designDocId("_design/medesign")
.viewName("viewname")
.includeDocs(false)
.reduce(true)
.group(true);
Page<ViewResult> rs1 = db.queryForPage(query, pageRequest, ViewResult.class);
rs1.forEach(v -> {
System.out.println(v.getSize());
});
I am getting the following error:
org.ektorp.DbAccessException: com.fasterxml.jackson.databind.JsonMappingException:
Can not construct instance of org.ektorp.ViewResult:
no int/Int-argument constructor/factory method to deserialize from Number value (20)
at [Source: N/A; line: -1, column: -1]

CouchDB does not return pagination details (total_rows, offset) for a reduced view, so you have to paginate reduced data yourself.
Request with pagination and included docs:
group=false & reduce=false & include_docs=true
URL : http://localhost:5984/dn_anme/_design/design_name/_view/viewname?include_docs=true&reduce=false&skip=0&group=false&limit=2
Response :
{
"total_rows":81,
"offset":0,
"rows":[
{
"id":"906a74b8019716f1240a7117580ec172",
"key":{
"attribute":"BuildArea"
},
"value":1,
"doc":{
"_id":"906a74b8019716f1240a7117580ec172",
"_rev":"3-7e0a1da0c2260040f8a9787636385785",
"country":"POL",
"recordStatus":"MATCHED"
}
},
{
"id":"906a74b8019716f1240a7117580eaefb",
"key":{
"attribute":"Area"
},
"value":1,
"doc":{
"_id":"906a74b8019716f1240a7117580eaefb",
"_rev":"3-165ea3a3ed07ad8cce1f3e095cd476b5",
"country":"POL",
"recordStatus":"MATCHED"
}
}
]
}
Request with reduce:
group=true & reduce=true & include_docs=false
URL : http://localhost:5984/dn_anme/_design/design_name/_view/viewname?include_docs=false&reduce=true&group=true&limit=2
Response :
{
"rows":[
{
"key":[
"BuildArea"
],
"value":1
},
{
"key":[
"Area"
],
"value":1
}
]
}
Difference between the two requests:
The paginated request with include_docs returns page metadata: {"total_rows":81, "offset":0, "rows":[{...},{...}]}
AND
the request with reduce returns only {"rows":[{...},{...}]}
How to get paginated reduced data:
Step 1: Request rows_per_page + 1 rows from the view.
Step 2: If the response contains one more row than the page size, there are more records.
Step 3: Calculate and update the skip value, then go to step 1 for the next page.
Note: skip is not a good option when there are many records; instead, find the start key of the next page and pass it as startkey, which performs better.
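The steps above can be sketched in plain Java as follows. This is only an illustration under assumptions: the class ReducePager, its Page type, and the fetch helper are hypothetical names, and an in-memory list stands in for the reduced view rows. With Ektorp, the fetch step would presumably be a ViewQuery with limit(pageSize + 1) and skip(...) passed to db.queryView(query), which is not shown here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** In-memory illustration of "request pageSize + 1 rows" pagination (hypothetical helper). */
class ReducePager {

    /** One page of rows plus a flag saying whether another page exists. */
    static final class Page<T> {
        final List<T> rows;
        final boolean hasNext;

        Page(List<T> rows, boolean hasNext) {
            this.rows = rows;
            this.hasNext = hasNext;
        }
    }

    /** Stand-in for the view request: returns at most `limit` rows starting at `skip`. */
    static <T> List<T> fetch(List<T> allRows, int skip, int limit) {
        int from = Math.min(skip, allRows.size());
        int to = Math.min(from + limit, allRows.size());
        return new ArrayList<>(allRows.subList(from, to));
    }

    /** Steps 1 to 3: fetch one extra row, use it only to detect a next page. */
    static <T> Page<T> page(List<T> allRows, int pageIndex, int pageSize) {
        int skip = pageIndex * pageSize;                      // step 3: skip for this page
        List<T> fetched = fetch(allRows, skip, pageSize + 1); // step 1: pageSize + 1 rows
        boolean hasNext = fetched.size() > pageSize;          // step 2: extra row means more data
        List<T> rows = hasNext ? fetched.subList(0, pageSize) : fetched;
        return new Page<>(new ArrayList<>(rows), hasNext);
    }

    public static void main(String[] args) {
        List<String> reduced = Arrays.asList("Attribute1=20", "Attribute2=1", "Attribute3=1");
        int pageIndex = 0;
        Page<String> current;
        do {
            current = page(reduced, pageIndex++, 2);
            System.out.println(current.rows + " hasNext=" + current.hasNext);
        } while (current.hasNext);
    }
}
```

The extra row is never shown to the caller; it only signals that another request is worthwhile. For large views, the same idea works with startkey: remember the key of the extra row and use it as the start key of the next request instead of increasing skip.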

Elasticsearch 7.13 - elastic search response with old data after update api

We are using Elasticsearch 7.13.
We periodically update the index using upsert.
The sequence of operations:
Create a new index with a dynamic mapping in which all strings are mapped as text:
"dynamic_templates": [
{
"strings_as_keywords": {
"match_mapping_type": "string",
"mapping": {
"type": "text",
"analyzer": "autocomplete",
"search_analyzer": "search_term_analyzer",
"copy_to": "_all",
"fields": {
"keyword": {
"type": "keyword",
"normalizer": "lowercase_normalizer"
}
}
}
}
}
]
Upsert in bulk with the code attached below (I don't have a REST equivalent).
Search on a specific field:
localhost:9200/mdsearch-vitaly123/_search
{
"query": {
"match": {
"fullyQualifiedName": "value_test"
}
}
}
Got 1 result.
Upsert again, now with "fullyQualifiedName": "value_test1234" (as in step 2).
Search as in step 3.
Got 2 results: one doc with "fullyQualifiedName": "value_test"
and another with "fullyQualifiedName": "value_test1234".
Snippet of the upsert (step 2) below:
@Override
public List<BulkItemStatus> updateDocumentBulk(String indexName, List<JsonObject> indexDocuments) throws MDSearchIndexerException {
BulkRequest request = new BulkRequest().setRefreshPolicy(WriteRequest.RefreshPolicy.IMMEDIATE);
ofNullable(indexDocuments).orElseThrow(NullPointerException::new)
.forEach(x -> {
var id = x.get("_id").getAsString();
x.remove("_id");
request.add(new UpdateRequest(indexName, id)
.docAsUpsert(true)
.doc(x.toString(), XContentType.JSON)
.retryOnConflict(3)
);
});
BulkResponse bulk = elasticsearchRestClient.bulk(request, RequestOptions.DEFAULT);
return stream(bulk.getItems())
.map(r -> new BulkItemStatus(r.getId(), isSuccess(r), r.getFailureMessage()))
.collect(Collectors.toList());
}
I can search by the updated properties.
But the problem is that searches return the document with the updated fields and the previous one as well.
How can I solve this?
Maybe by somehow limiting the version number to 1?
I set setRefreshPolicy(WriteRequest.RefreshPolicy.IMMEDIATE), but it didn't help.
Here in picture we can see result
P.S. Both the old and the updated data are retrieved.
Any suggestions?
Regards,
What is happening is that the following line must yield null:
var id = x.get("_id").getAsString();
In other words, there is no _id field in the JSON documents you pass in indexDocuments. It is not allowed to have fields with an initial underscore character in the source documents. If it was the case, you'd get the following error:
Field [_id] is a metadata field and cannot be added inside a document. Use the index API request parameters.
Hence, your update request cannot update any document (since there's no ID to identify the document to update) and will simply insert a new one (i.e. what docAsUpsert does), which is why you're seeing two different documents.

Create sheet and update data with one request

I want to create a sheet and update its data with a single Google Sheets API call.
I managed to implement this code:
List<Request> requests = new ArrayList<>();
List<CellData> values = new ArrayList<>();
values.add(new CellData()
.setUserEnteredValue(new ExtendedValue()
.setStringValue("Hello World!")));
requests.add(new Request().setAddSheet(new AddSheetRequest()
.setProperties(new SheetProperties()
.setTitle("scstc")))
.setUpdateCells(new UpdateCellsRequest()
.setStart(new GridCoordinate()
.setSheetId(0)
.setRowIndex(0)
.setColumnIndex(0))
.setRows(Arrays.asList(
new RowData().setValues(values)))
.setFields("userEnteredValue,userEnteredFormat.backgroundColor"))
);
BatchUpdateSpreadsheetRequest body = new BatchUpdateSpreadsheetRequest().setRequests(requests);
BatchUpdateSpreadsheetResponse response = service.spreadsheets().batchUpdate(spreadsheetId, body).execute();
But I get this error:
400 Bad Request
{
"code" : 400,
"errors" : [ {
"domain" : "global",
"message" : "Invalid value at 'requests[0]' (oneof), oneof field 'kind' is already set. Cannot set 'updateCells'",
"reason" : "badRequest"
} ],
"message" : "Invalid value at 'requests[0]' (oneof), oneof field 'kind' is already set. Cannot set 'updateCells'",
"status" : "INVALID_ARGUMENT"
}
at com.google.sheet.impl.GoogleSheetBasicTest1_____1.hello(GoogleSheetBasicTest1_____1.java:133)
Do you know how I can fix this issue?
Each Request object is intended to have just a single value set within it. You are setting two values:
requests.add(new Request()
.setAddSheet(...)
.setUpdateCells(...));
Instead of doing the above, you need to use two request objects:
requests.add(new Request().setAddSheet(...));
requests.add(new Request().setUpdateCells(...));
@Sam is correct; however, if you are using the JSON representation, make sure the dictionaries you build are formatted correctly. I found the following formatting helpful, taken from the Google Developers blog post "Formatting cells with the Google Sheets API":
reqs = {'requests': [
# frozen row 1, request #1
{'updateSheetProperties': {
'properties': {'gridProperties': {'frozenRowCount': 1}},
'fields': 'gridProperties.frozenRowCount',
}},
# embolden row 1, request #2
{'repeatCell': {
'range': {'endRowIndex': 1},
'cell': {'userEnteredFormat': {'textFormat': {'bold': True}}},
'fields': 'userEnteredFormat.textFormat.bold',
}},
]}
*I am new to adding information to this site. Sorry if this is not the best way to add the information, but I just want to help out. I had this problem while using Python instead of Java and found that it was a simple error in where the brackets were.

Cloudera Navigator API fail to fetch nested data

I am working with the Cloudera Manager Navigator REST API. Extracting results works fine, but I am unable to get any nested values.
The types of the extracted data are as below.
{
"parentPath": "String",
"customProperties": "Map[string,string]",
"sourceType": "String",
"entityType": "String"
}
And the data should look like this:
{
"parentPath": "abcd",
"customProperties": {
"nameservice" : "xyz"
},
"sourceType": "rcs",
"entityType": "ufo"
}
But I am getting the key-value result as follows.
parentPath :abcd
customProperties : null
sourceType : rcs
entityType : ufo
In the above response, "customProperties" comes back as null, whereas it should return a map object containing ["nameservice" : "xyz"]. The problem is with the following code snippet.
MetadataResultSet metadataResultSet = extractor.extractMetadata(null, null,"sourceType:HDFS", "identity:*");
Iterator<Map<String, Object>> entitiesIt = metadataResultSet.getEntities().iterator();
while(entitiesIt.hasNext()){
Map<String, Object> result = entitiesIt.next();
for(String data : result.keySet()){
System.out.println(" key:"+data+" value:"+result.get(data));
}
}
Can you suggest how to get the nested value when the data type is complex?
Have you checked how the data looks in the Navigator UI? You can verify that first, and also try the Cloudera /entities/entity-id REST API in a browser to check how the JSON response comes back.

Scan and scroll query not working when combined with range query in Elasticsearch

I am trying to use Java API for executing scan and scroll query while combining it with range query but I am not getting any hits. I am not sure if it's a problem with my query or my data but below is my code:
String from = "1429192800000";
String to = "1429193100000";
QueryBuilder qb = rangeQuery("myField")
.gte(from)
.lte(to);
SearchRequestBuilder searchBuilder = client.prepareSearch(oldIndex)
.setTypes("myType")
.setSearchType(SearchType.SCAN)
.setScroll(new TimeValue(60000))
.setQuery(qb)
.setSize(100);
And here's my output:
{
"_scroll_id" : my64BitScrolliD,
"took" : 576,
"timed_out" : false,
"_shards" : {
"total" : 15,
"successful" : 15,
"failed" : 0
},
"hits" : {
"total" : 0,
"max_score" : 0.0,
"hits" : [ ]
}
}
I have tried querying without the Java API and it still doesn't return any hits. Here's my query:
curl -XGET "http://localhost:9220/path/to/cluster/myIndex/myType/_search?search_type=scan&scroll=1m" -d '{
"size": 100,
"query": {
"range": {
"myField": {
"from": "1429192800000",
"to": "1429193100000"
}
}
}
}'
And I get the same output as above. The field to which I apply the range query holds an epoch timestamp, and data exists within the specified range, so I don't think the query itself is wrong (I tested it with the plain _search API, without scan and scroll, and it returned hits!).
I would like to add that the scan and scroll query works fine if I don't combine a range query with it! Since there is a really large amount of data, I want to get the results in batches by specifying the time range. Am I missing something basic here? Let me know if any more information is required.

Change Value of Nested Object - MongoDB Java

I'm making a MongoDB statistic system using the Java driver, and I am wondering if it is possible (and how) to change the value of a key nested inside many objects. Here is how my database is formatted:
{
location : "chicago",
stats : [
{
"employee" : "rob",
"stat1" : 1,
"stat2" : 3,
"stat3" : 2
},
{
"employee" : "krista",
"stat1" : 1,
"stat2" : 3,
"stat3" : 2
}
]
}
So, for example, how could I change Rob's "stat2" to another value? I am new to JSON and the MongoDB Java driver. Any help is appreciated!
You need to use the positional $ operator and $set in order to update what you want.
db.collection.update(
{ _id: <docId>, "stats.employee": "rob" },
{ "$set": { "stats.$.stat2": <value> } }
)
So you match your document and the required element of the array. The update side uses the index of the matched array element to know which element to update. The $set operator only updates the specified field.
In Java, build the query and the update with BasicDBObject:
BasicDBObject query = new BasicDBObject("_id", id);
// append takes a key and a value, not another BasicDBObject
query.append("stats.employee", "rob");
BasicDBObject update = new BasicDBObject("$set",
new BasicDBObject("stats.$.stat2", value));
collection.update(query,update);
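To make the behaviour of the positional $ operator concrete, here is a small in-memory sketch in plain Java. It does not use the MongoDB driver: the class PositionalUpdateDemo and its methods are hypothetical, and a list of maps stands in for the stats array. It only mimics the idea that the query side picks the first matching array element and $set changes one field of that element.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** In-memory imitation of {"$set": {"stats.$.stat2": value}} (hypothetical helper, no driver involved). */
class PositionalUpdateDemo {

    /**
     * Sets `field` to `newValue` on the first element whose `matchKey` equals `matchValue`.
     * Returns true if an element was updated, false if nothing matched.
     */
    static boolean setOnFirstMatch(List<Map<String, Object>> array,
                                   String matchKey, Object matchValue,
                                   String field, Object newValue) {
        for (Map<String, Object> element : array) {
            if (matchValue.equals(element.get(matchKey))) {
                element.put(field, newValue); // only this field changes, like $set
                return true;                  // only the first match is touched, like $
            }
        }
        return false;
    }

    /** Builds one entry of the stats array from the question. */
    static Map<String, Object> stats(String employee, int stat1, int stat2, int stat3) {
        Map<String, Object> entry = new LinkedHashMap<>();
        entry.put("employee", employee);
        entry.put("stat1", stat1);
        entry.put("stat2", stat2);
        entry.put("stat3", stat3);
        return entry;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> statsArray = new ArrayList<>();
        statsArray.add(stats("rob", 1, 3, 2));
        statsArray.add(stats("krista", 1, 3, 2));

        // Equivalent in spirit to matching "stats.employee": "rob" and setting "stats.$.stat2"
        setOnFirstMatch(statsArray, "employee", "rob", "stat2", 7);
        System.out.println(statsArray);
    }
}
```

In the sketch, Krista's entry and Rob's other fields are left untouched, which is the same effect the real update is meant to have on the stored document.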
