I have to change the format of the Elasticsearch document id, and I was wondering if it's possible without deleting and re-indexing all the documents.
You have to reindex. The simplest way to apply these changes to your existing data is to create a new index with the new settings and copy all of your documents from the old index to the new index with the bulk API; see:
http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/reindex.html
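For illustration, a minimal sketch of that copy, assuming the 7.x Java high-level REST client; it scrolls through the old index and bulk-indexes each document into the new one, where newIdFor() is a hypothetical function producing your new id format and the index names are placeholders:

import org.elasticsearch.action.bulk.BulkRequest;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.action.search.SearchScrollRequest;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.common.unit.TimeValue;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.builder.SearchSourceBuilder;

static void copyWithNewIds(RestHighLevelClient client) throws Exception {
    SearchRequest search = new SearchRequest("old-index");
    search.scroll(TimeValue.timeValueMinutes(1L));
    search.source(new SearchSourceBuilder().size(1000));
    SearchResponse resp = client.search(search, RequestOptions.DEFAULT);

    while (resp.getHits().getHits().length > 0) {
        BulkRequest bulk = new BulkRequest();
        for (SearchHit hit : resp.getHits().getHits()) {
            bulk.add(new IndexRequest("new-index")
                    .id(newIdFor(hit.getId()))       // apply the new id format
                    .source(hit.getSourceAsMap()));  // copy the document body as-is
        }
        client.bulk(bulk, RequestOptions.DEFAULT);

        SearchScrollRequest next = new SearchScrollRequest(resp.getScrollId());
        next.scroll(TimeValue.timeValueMinutes(1L));
        resp = client.scroll(next, RequestOptions.DEFAULT);
    }
}

static String newIdFor(String oldId) {  // hypothetical id transformation
    return "v2-" + oldId;
}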
Yes, you can do it by fetching the data and re-indexing it. But if you have GBs of data, you should run it as a long-running job.
Alternatively, you can fetch the ids of the indexed documents in the old format and store/index a mapping from each new-format id to the old one in external storage such as Cassandra, MongoDB, or even SQL (whatever your application needs); then, when you fetch the data for use or display, replace the old id with the mapped newer one.
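Roughly, that display-time replacement could look like this, given a page of search hits and where idMappingStore is a hypothetical lookup backed by whatever store you chose:

for (SearchHit hit : response.getHits().getHits()) {
    String oldId = hit.getId();
    String newId = idMappingStore.lookupNewId(oldId);  // hypothetical mapping lookup
    render(newId, hit.getSourceAsMap());               // display under the new id
}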
I'm new to Elasticsearch and I'm doing a task in which I need to upload a large number of documents to ES. Every time I upload, I need to specify a document id for that document in the IndexRequest API. Is there any way in Java to insert documents without giving an id (i.e., creating random document ids for my documents)?
Please have a look at https://www.elastic.co/guide/en/elasticsearch/client/java-rest/7.13/java-rest-high-document-index.html
In order to have the id autogenerated, just omit this call:
request.id("1");
This should do the trick for single document operations.
If you need bulk changes, see https://www.elastic.co/guide/en/elasticsearch/client/java-rest/master/java-rest-high-document-bulk.html
In this case, also remove the
.id("1")
method call.
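A minimal sketch, assuming the high-level REST client and a RestHighLevelClient named client (the index name and field are placeholders): with no .id(...) call, Elasticsearch assigns a random id, which comes back on the response.

IndexRequest request = new IndexRequest("posts")
        .source(XContentType.JSON, "field", "value");   // note: no .id(...) call
IndexResponse response = client.index(request, RequestOptions.DEFAULT);
String generatedId = response.getId();                  // the autogenerated id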
I have a web page with multiple controls containing text, images, and attachments, and we allow customers to upload (drag and drop) attachments. I need to save this data to MongoDB. Until now, I have been saving text data to one collection and saving attachments separately using GridFS.
I want to save all the data (text/images), including attachments (as JSON with Base64-encoded data), as a single record.
Can I save this entire data as a single record in MongoDB? The total file size could be more than 20 MB (in case it has attachments). How can I achieve this?
Can I write the entire data into a JSON file and save it into MongoDB using GridFS?
Can I save this entire data as a single record in MongoDB? The total file size could be more than 20 MB (in case it has attachments). How can I achieve this?
The maximum size of a single BSON document in current versions of MongoDB is 16 MB, so I'm afraid you cannot save it in a single document.
Can I write the entire data into a JSON file and save it into MongoDB using GridFS?
This, on the other hand, you can do, though I don't know why you would. Your initial scheme (one document plus files in GridFS) is the "normal" way of handling such cases; bundling everything into a single GridFS file doesn't give you any edge, since GridFS itself just splits files into multiple chunks automatically.
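A minimal sketch of that initial scheme with the MongoDB sync Java driver (database, collection, and field names are assumptions for illustration):

import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoDatabase;
import com.mongodb.client.gridfs.GridFSBucket;
import com.mongodb.client.gridfs.GridFSBuckets;
import org.bson.Document;
import org.bson.types.ObjectId;
import java.io.FileInputStream;
import java.io.InputStream;
import java.util.List;

MongoDatabase db = MongoClients.create("mongodb://localhost:27017").getDatabase("app");
GridFSBucket bucket = GridFSBuckets.create(db);

// Attachments of any size go into GridFS, which chunks them automatically.
ObjectId fileId;
try (InputStream in = new FileInputStream("attachment.pdf")) {
    fileId = bucket.uploadFromStream("attachment.pdf", in);
}

// One small metadata document (well under 16 MB) references the GridFS files by id.
Document record = new Document("text", "page text here")
        .append("attachmentIds", List.of(fileId));
db.getCollection("pages").insertOne(record);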
I have a Cloudant database with some already-populated documents in use. I'm using the Cloudant Java client to fetch data from it. I plan to change the indexes that are currently used; basically, I plan to change over from using createIndex() to https://github.com/cloudant/java-cloudant#cloudant-search. I would also like to change the fields on which the documents are indexed.
Would changing the index impact the underlying data or cause any migration issues with existing data when I start to use the new index?
It sounds like you want to change from using Cloudant Query to Cloudant Search. This should be straightforward and safe.
Adding a new index will not change or affect the existing data; the main thing to be careful of is not deleting your old index before you've migrated your code. The easiest way to do this is by using a new design document for your new search indexes:
1. Create a new design document containing your search index and upload it to Cloudant (https://github.com/cloudant/java-cloudant#creating-a-search-index).
2. Migrate your app to use the new search index.
3. (Optionally) remove the design document containing the indexes that you no longer need. Cloudant will then clean up the index files that are no longer needed (https://github.com/cloudant/java-cloudant#comcloudantclientapidatabaseremovedoc-idrev-id).
I included links to the relevant parts of the Java API, but obviously you could do this through the dashboard.
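As a rough sketch with java-cloudant (the design document file, index name, query string, and MyDoc result class are all illustrative assumptions):

import com.cloudant.client.api.Database;
import com.cloudant.client.api.DesignDocumentManager;
import com.cloudant.client.api.model.DesignDocument;
import java.io.File;
import java.util.List;

Database db = client.database("mydb", false);  // client: an existing CloudantClient

// 1. Upload the new design document containing the search index.
DesignDocument ddoc = DesignDocumentManager.fromFile(new File("new-search-ddoc.json"));
db.getDesignDocumentManager().put(ddoc);

// 2. Point the migrated code at the new index.
List<MyDoc> hits = db.search("newSearch/byField")
        .limit(20)
        .includeDocs(true)
        .query("field:value", MyDoc.class);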
Is there any way to get all documents from a DB, rather than specifying an id and retrieving a single document, using the LightCouch API in Java? Presently I am using the method
JsonObject json = dbClient.find(JsonObject.class, "some-id")
to retrieve a single document.
Thanks in advance.
How about the _all_docs view? That will return a list of all the docs in the database, and if you add include_docs=true to the request, you will also get the contents of the documents.
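With LightCouch that would look roughly like this, querying the built-in _all_docs view with include_docs enabled:

List<JsonObject> allDocs = dbClient.view("_all_docs")
        .includeDocs(true)           // fetch document bodies, not just ids/revs
        .query(JsonObject.class);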
Try this: use a view that emits documents, but don't send any key values; there is a chance you will get all the documents of the type you specified. Also, a warning: if you use a LightCouch view, you might need to retrieve the ids of your documents first and then get your actual data by "finding" it with those ids.
I want to store Java objects as part of the Solr document.
They don't need to be parsed or searched, only returned as part of the document.
I can convert them to JSON or XML and store the text, but I would prefer something more efficient.
If I could use Java serialization and then add the binary blob to the document, that would be ideal.
I'm aware of the option to convert the binary blob with Base64, but I was wondering if there is a more efficient way.
I do not share the opinions of the first two answers.
An additional database call can in some scenarios be completely unnecessary; Solr can act as a NoSQL database, too.
It can even use compression for some fields, which costs some CPU but saves cache memory for some kinds of binary data.
Take a look at BinaryField and the lazy loading field declarations within your schema.xml.
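As a sketch, assuming a schema field of type solr.BinaryField declared as stored but not indexed (e.g. a field named "payload"; the name is an assumption), SolrJ should accept the serialized bytes directly:

import java.io.ByteArrayOutputStream;
import java.io.ObjectOutputStream;
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.common.SolrInputDocument;

static void store(SolrClient solr, String id, Object obj) throws Exception {
    // obj must implement java.io.Serializable
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
        oos.writeObject(obj);                        // plain Java serialization
    }
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", id);
    doc.addField("payload", bos.toByteArray());      // byte[] for the binary field
    solr.add(doc);
    solr.commit();
}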
Since you can construct an id in Solr to pass with any document, you can store the object some other way (in a database, for example) and query it with the id you get back from Solr.
For example, we store web pages in Solr. When we index one, we create an id which matches the id of a WebPage object created by the ORM in the database.
When a search is performed, we get the id back and load the Java object from the database.
There is no need to store it in Solr (which was made to store and index documents).
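A minimal sketch of that flow, where WebPageRepository is a hypothetical ORM/DAO and the query string is a placeholder:

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;

static void search(SolrClient solr, WebPageRepository repo) throws Exception {
    QueryResponse rsp = solr.query(new SolrQuery("content:example"));
    for (SolrDocument d : rsp.getResults()) {
        String id = (String) d.getFieldValue("id");  // id stored at index time
        WebPage page = repo.findById(id);            // load the object from the DB
        // render page ...
    }
}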