kairosdb and elasticsearch integration - java

I'm using KairosDB as my primary database. Now I want to add Elasticsearch functionality on top of my data inside KairosDB. As stated in the docs, I have to duplicate all entries of my primary database inside the Elasticsearch database.
Update
What I mean is that, if I want to index something inside elasticsearch, I have to do, for example:
Retrieve data from KairosDB, for example the JSON {"name": "hi","value": "6","tags"}
and then put it inside Elasticsearch:
curl -XPUT 'http://localhost:9200/firstIndex/test/1' -d '{"name": "hi","value": "6","tags"}'
If I want to search I have to do this:
curl 'http://localhost:9200/_search?q=name:hi&pretty=true'
I'm wondering if it is possible to avoid duplicating my data inside Elasticsearch, so that I can:
get data from KairosDB
index it using Elasticsearch without duplicating the data.
How can I go about that?

It sounds like you're hoping to use Elasticsearch as a secondary (and external) fulltext index for your primary datastore (KairosDB).
Since KairosDB is remaining your primary datastore, each record you load into Elasticsearch needs two pieces of information (at minimum):
The primary key field(s) for locating the corresponding KairosDB record(s). In the mapping, make sure to set "store": true, "index": "not_analyzed"
Any fields which you wish to be searchable (in your example, only name is searched). For these, set "store": false, "index": "analyzed"
If you want to reduce your index size further, consider disabling the _source field
Then your search workflow becomes a two-step process:
Query Elasticsearch for name:hi and retrieve the KairosDB primary key field(s) for each of the matching record(s).
Query/return KairosDB time-series data using key fields returned from Elasticsearch.
But to be clear: you don't need an exact duplicate of each KairosDB record loaded into Elasticsearch, just the searchable fields, along with a means to locate the original record in KairosDB.
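The resulting two-step workflow can be sketched with plain Java collections. This is a toy illustration only: the two maps stand in for Elasticsearch and KairosDB, and all names here are made up, not real client APIs.

```java
import java.util.*;

// Sketch of the two-step search: an inverted index (standing in for
// Elasticsearch) maps searchable values to primary keys, and the primary
// store (standing in for KairosDB) holds the full records.
public class SecondaryIndexSketch {
    // "Elasticsearch": searchable value -> set of KairosDB primary keys
    static Map<String, Set<String>> index = new HashMap<>();
    // "KairosDB": primary key -> full record
    static Map<String, String> primaryStore = new HashMap<>();

    static void load(String key, String name, String record) {
        primaryStore.put(key, record); // the full record stays only in the primary store
        index.computeIfAbsent(name, k -> new HashSet<>()).add(key); // index only the searchable field
    }

    // Step 1: resolve keys from the index; step 2: fetch records from the primary store
    static List<String> search(String name) {
        List<String> results = new ArrayList<>();
        for (String key : index.getOrDefault(name, Collections.emptySet())) {
            results.add(primaryStore.get(key));
        }
        return results;
    }

    public static void main(String[] args) {
        load("metric-1", "hi", "{\"name\": \"hi\", \"value\": \"6\"}");
        // The match is found via the index, but the data comes from the primary store
        System.out.println(search("hi"));
    }
}
```

The point of the sketch: the index holds only keys plus searchable fields, so nothing but those fields is duplicated.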

Related

MongoWriteException when inserting into Mongodb with composite custom _id

I have a MongoDB remote server that I am using.
My KEY is a custom object that has other nested objects in it.
Simple inserting works fine, although if I try to run
collection.replaceOne(eq("_id", KEY), document, new UpdateOptions().upsert(true));
I get com.mongodb.MongoWriteException: After applying the update, the (immutable) field '_id' was found to have been altered to _id: .......
If I only have primitives in the key it works fine. Of course the value of the KEY is not changed (traced all the way down).
Is this a Mongo Java Driver bug with the ReplaceOne function?
As it turns out, for Mongo filters the order of JSON properties matters. With debugging it is possible to see what the actual order of the properties in the filters is, and then you can set your model properties' order with @JsonPropertyOrder({"att1", "att2"}) so they match.
This was confirmed by a member of the MongoDB team.
MongoDB ticket: https://jira.mongodb.org/browse/JAVA-3392
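The gist of the ticket is that the comparison is byte-wise, so two documents with identical entries in a different order don't compare equal once serialized. A stdlib-only sketch of that effect (the toy serialize method merely mimics an order-sensitive encoding such as BSON; it is not driver code):

```java
import java.util.*;

// Sketch: why property order can matter. Two maps with identical entries but
// different insertion order produce different serialized byte sequences,
// which is how a byte-wise document comparison can fail to match.
public class FieldOrderSketch {
    // Naive order-sensitive serialization, standing in for BSON encoding
    static String serialize(Map<String, Object> doc) {
        StringBuilder sb = new StringBuilder("{");
        for (Map.Entry<String, Object> e : doc.entrySet()) {
            if (sb.length() > 1) sb.append(",");
            sb.append("\"").append(e.getKey()).append("\":").append(e.getValue());
        }
        return sb.append("}").toString();
    }

    public static void main(String[] args) {
        Map<String, Object> a = new LinkedHashMap<>();
        a.put("att1", 1); a.put("att2", 2);
        Map<String, Object> b = new LinkedHashMap<>();
        b.put("att2", 2); b.put("att1", 1);

        System.out.println(a.equals(b));                       // true: same entries
        System.out.println(serialize(a).equals(serialize(b))); // false: different byte order
    }
}
```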

Cannot get _id of MongoDB object with java driver

I'm inserting objects into MongoDB without specifying the _ids, because I want it to create them automatically. The problem is that at a certain point of the program I need to retrieve the _ids, but I can't get them. The code I use is the following:
List<DBObject> objs = collection.find(filter).toArray();
DBObject obj = objs.get(0);
String id = obj.get("_id").toString(); // get() returns Object, so convert it
//now id is something like 2d938830-2732-44fd-84b0-aa56b95c5df0
Eventually the id variable contains a GUID, but it's different from the one I see in RoboMongo, so it's wrong. What I see in RoboMongo is something like:
"_id": LUUID("cada0d4f-a72d-47ad-8ea8-239c3e5795dd")

Mongo DB / No duplicates

I have a mongo collection that keeps state records for devices, so there can be multiple records per device. What I would like to do is create a query through the mongoTemplate that gets the latest record for each device.
Here are the constraints:
Pass in a Set<String> of name_ids; name_id is a regular field within the mongo collection, not the _id and not part of the _id
Get only the latest record for each device with a matching name_id
Return a List<DeviceStateData> (no duplicates with the same name_id should be returned)
example of collection object:
{
  _id: "241324123412",
  name_id: "flyingMan",
  powerState: "ON",
  timeStamp: ISODate('')
}
Thanks
You should look at the distinct function.
Here you can find the details for using it with Spring.
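Whatever server-side query you settle on, the grouping rule being asked for can be sketched with plain collections. This uses a simplified, hypothetical DeviceStateData; real code would push the work into the Mongo query rather than loading every record:

```java
import java.util.*;

// Sketch of the desired result: keep only the record with the latest
// timestamp for each name_id, restricted to the passed-in set of name_ids.
public class LatestPerDevice {
    static class StateRecord {
        final String nameId; final String powerState; final long timeStamp;
        StateRecord(String nameId, String powerState, long timeStamp) {
            this.nameId = nameId; this.powerState = powerState; this.timeStamp = timeStamp;
        }
    }

    static Collection<StateRecord> latest(List<StateRecord> records, Set<String> nameIds) {
        Map<String, StateRecord> byDevice = new HashMap<>();
        for (StateRecord r : records) {
            if (!nameIds.contains(r.nameId)) continue;      // filter on the passed-in name_ids
            StateRecord current = byDevice.get(r.nameId);
            if (current == null || r.timeStamp > current.timeStamp) {
                byDevice.put(r.nameId, r);                  // keep only the newest per device
            }
        }
        return byDevice.values();                           // one entry per name_id, no duplicates
    }

    public static void main(String[] args) {
        List<StateRecord> records = Arrays.asList(
            new StateRecord("flyingMan", "OFF", 100),
            new StateRecord("flyingMan", "ON", 200),
            new StateRecord("otherDevice", "ON", 150));
        for (StateRecord r : latest(records, Collections.singleton("flyingMan"))) {
            System.out.println(r.nameId + " " + r.powerState); // flyingMan ON
        }
    }
}
```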

Partial JSON update on DynamoDB

I have a simple dynamo table that consists of cookies and attributes:
customer
cookie
attribute_1
attribute_2
...
attribute_n
Right now, these attributes are variable and need to be updated upon receiving a partial JSON through an endpoint.
I made my mind into using the new JSON type field in DynamoDB (since that's our main datastore choice), and I intend to reshape the table into:
customer
cookie
attributes
Where attributes is just a JSON document.
Main issues:
I have no way of knowing which attributes are going to be added
I have no way of knowing which items already exist (short of making an extra query)
I'd like to avoid a super complex code to do this
Main goal:
In an ideal world, there would be some way of having (or not having) an item in Dynamo, passing the primary key along with some JSON, and then having the DB partially update the existing JSON.
So far I've seen this kind of code:
DynamoDB dynamo = new DynamoDB(new AmazonDynamoDBClient(...));
Table table = dynamo.getTable("people");
table.updateItem(
    new UpdateItemSpec()
        .withPrimaryKey("person_id", 123)
        .withUpdateExpression("SET document.current_city = :city")
        .withValueMap(new ValueMap().withString(":city", "Seattle")));
But I'd like to avoid making an extra query (to know if I need to create or update) and constructing all the update expressions.
Is there a way to do this?
Here is a full example just in case:
1) Receive the following JSON in the API:
{"name": "John"}
Expected dynamo attribute:
attributes={"name": "John"}
2) Receive the following JSON in the API:
{"age": 12}
Expected dynamo attribute:
attributes={"name": "John", "age": 12}
And so on. The primary key is constructed from the request cookie / customer.
My hope that this exists comes from the fact that Dynamo supports the smart updateItem (which I'm currently using), which allows you to specify only some attributes to update, or to create the item if it doesn't exist.
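UpdateItem does create the item when the key doesn't exist, so a single call can cover both create and update without a prior query; the remaining work is turning the incoming partial JSON into an update expression. A minimal sketch of such a builder follows; the attribute name attributes and the :vN placeholder scheme are this example's assumptions, not AWS API:

```java
import java.util.*;

// Sketch: build a DynamoDB-style SET update expression from a partial JSON
// document (represented here as a Map), collecting placeholder values along
// the way for the accompanying value map.
public class UpdateExpressionSketch {
    static String buildSetExpression(Map<String, Object> partial, Map<String, Object> valueMap) {
        StringBuilder expr = new StringBuilder("SET ");
        int i = 0;
        for (Map.Entry<String, Object> e : partial.entrySet()) {
            if (i > 0) expr.append(", ");
            String placeholder = ":v" + i++;
            expr.append("attributes.").append(e.getKey()).append(" = ").append(placeholder);
            valueMap.put(placeholder, e.getValue()); // collected for withValueMap(...)
        }
        return expr.toString();
    }

    public static void main(String[] args) {
        Map<String, Object> partial = new LinkedHashMap<>();
        partial.put("name", "John");
        partial.put("age", 12);
        Map<String, Object> values = new LinkedHashMap<>();
        System.out.println(buildSetExpression(partial, values));
        // SET attributes.name = :v0, attributes.age = :v1
    }
}
```

One caveat: a SET on a nested path like attributes.name can fail on a brand-new item if the parent attributes map doesn't exist yet; a common workaround is to first SET attributes = if_not_exists(attributes, :empty).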

MongoDB: insert documents with specific id instead of auto generated ObjectID

I need to insert documents into MongoDB (with a specific id instead of the auto-generated ObjectID) using Java.
To insert one document, or update it if it exists, I tried using findOne to search for the id and, if it doesn't exist, insert the id and then findAndModify. It works, but I don't feel that it's an efficient way; it's time-consuming. Is there a better way to achieve that?
To insert multiple documents at once, I'm following this solution, but I don't know how I can insert my custom id instead of the ObjectID.
Any help will be appreciated
For your first problem, MongoDB has upsert:
db.collection.update(
  {query for id},
  {document},
  {upsert: true}
)
or in the Java driver
yourCollection.update(searchObject, modifiedObject, true, false);
If you want to set a custom ID, you just set the _id key manually, i.e.
yourBasicDBObject.put("_id", yourCustomId)
you just have to ensure it is unique for each document.
You will also need to set the _id in your modifiedObject otherwise a new one will be generated.
As for the bulk operations, just setting a custom ID for each document by giving the _id key should also work.
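The upsert flow above collapses the find-then-modify round trip into a single operation. A toy sketch of those semantics, with a map standing in for the collection (illustration only, not driver code):

```java
import java.util.*;

// Sketch of upsert semantics with a custom _id: one operation either
// replaces the existing document or inserts a new one, so no prior
// existence check is needed.
public class UpsertSketch {
    static Map<String, String> collection = new HashMap<>();

    // Like update(query, doc, upsert=true): insert if absent, replace if present
    static String upsert(String customId, String document) {
        return collection.put(customId, document) == null ? "inserted" : "updated";
    }

    public static void main(String[] args) {
        System.out.println(upsert("user-42", "{\"name\": \"a\"}")); // inserted
        System.out.println(upsert("user-42", "{\"name\": \"b\"}")); // updated
    }
}
```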
Try this #ebt_dev:
db.collection("collectionname").insertOne(data, { forceServerObjectId: false })
