Is there any way to get all documents from a database with the LightCouch API in Java, rather than specifying an id and retrieving a single document? At present I am using the method
JsonObject json = dbClient.find(JsonObject.class, "some-id");
to retrieve a single document.
Thanks in advance.
How about the _all_docs view? It returns a list of all the docs in the database. If you add include_docs=true to the request, you also get the contents of the documents.
Try using a view that emits documents, without passing any key values. That may return all the documents of the type the view emits. One caveat: with a LightCouch view you might need to retrieve the ids of your documents first, then get the actual data by "finding" each document with those ids.
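As a rough illustration of the request described above, here is a minimal sketch that talks to CouchDB over plain HTTP instead of going through LightCouch. The class name CouchAllDocs, its methods, and the base URL and database name passed in are hypothetical, and the sketch assumes the database name needs no URL escaping.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper for illustration; not part of LightCouch.
final class CouchAllDocs {
    private CouchAllDocs() {}

    // Builds the _all_docs URI; assumes dbName needs no URL escaping.
    static URI allDocsUri(String baseUrl, String dbName, boolean includeDocs) {
        String base = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
        String query = includeDocs ? "?include_docs=true" : "";
        return URI.create(base + "/" + dbName + "/_all_docs" + query);
    }

    // Sends the GET request and returns the raw JSON response body.
    static String fetchAllDocs(String baseUrl, String dbName) {
        HttpRequest request =
                HttpRequest.newBuilder(allDocsUri(baseUrl, dbName, true)).GET().build();
        try {
            return HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString())
                    .body();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while calling CouchDB", e);
        }
    }
}
```

The response is a JSON object whose "rows" array has one entry per document. With include_docs=true each row also carries the document itself under "doc".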
I'm new to Elasticsearch and I'm working on a task in which I need to upload a large number of documents to ES. Every time I upload, I have to specify a document id for that document in the IndexRequest API. Is there any way in Java to insert documents without giving an id (i.e. having random document ids generated for my documents)?
Please have a look at https://www.elastic.co/guide/en/elasticsearch/client/java-rest/7.13/java-rest-high-document-index.html
To have the id autogenerated, just omit this call:
request.id("1");
This should do the trick for single document operations.
If you need bulk changes, see https://www.elastic.co/guide/en/elasticsearch/client/java-rest/master/java-rest-high-document-bulk.html
In this case, also remove the
.id("1")
method call.
Is it possible to get data from a MarkLogic XML database using the MarkLogic API for Java?
I have read the documentation, but it only shows how to add an XML document to a collection or delete it. It doesn't show how to get all XML documents from one selected collection.
This looks like it will do the job
https://docs.marklogic.com/javadoc/client/index.html?com/marklogic/client/query/StructuredQueryBuilder.html
StructuredQueryBuilder.CollectionConstraintQuery collectionConstraint(String constraintName, String... uris)
Matches documents belonging to at least one of the criteria collections with the specified constraint.
I have been following the tutorial regarding the Google Search API at https://developers.google.com/appengine/docs/java/search/overview. The information I have found is very clear on how to build the document and load it into an index. What I am not sure of is how to load the datastore data into the document.
What I am trying to achieve is a simple %LIKE% query on a few fields. For example, I am working on a music library. If the user types in "glory", then I would like to use the Search API to return all entities with "glory" somewhere in the title. I have implemented the "starts with" workaround by appending "\uFFFD" to the search text, but I find this insufficient. My users will be very inexperienced, and it would also help if they didn't have to pick a field as in a traditional search. So full text search seems to be the solution.
Here are my questions:
Should each record in my datastore be a document? Or should all the records go into one document? My datastore has a fairly fixed size of only 1000 records. Could anyone provide an example of the correct method?
I would like to return the entire datastore entity (it's only 8 fields) as an Iterable of the type of my entity. Do we specify each field we need to return? The example just says:
for (ScoredDocument scoredDocument : results) {
// process scoredDocument
}
Does anybody have an example of what comes out of the stored document? Is it exactly what we put in, or must you identify each field again? Or an example of processing a ScoredDocument to return datastore entities?
If anybody could help fill in these blanks for me, I would appreciate it.
Thank you for looking at this with me.
What I am trying to achieve is a simple %LIKE% query on a few fields
To achieve this you need to "tokenize" your record's name. GAE provides full text search, which matches whole tokens, so to get partial matches you need to add the partial strings as tokens for every record:
If your record's name is "Glory" you should add the tokens "G", "Gl", "Glo", "Glor", "y", "ry", "ory", "lory"
Here's a very basic implementation I use to provide partial search results (only for "starts with"; it does not implement "ends with")
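public void addField(String name, String value, boolean tokenize) {
    // Always index the full value under the requested field name.
    addField(Field.newBuilder().setName(name).setText(value));
    if (tokenize) {
        // Add one extra text field per prefix of the value ("starts with" matches only).
        // startTokenIndex and lastTokenIndex are fields of the enclosing class (not shown).
        for (int i = startTokenIndex; i < value.length(); i++) {
            addField(Field.newBuilder().setName("token" + (lastTokenIndex++))
                    .setText(value.substring(0, i)));
        }
    }
}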
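The token list given for "Glory" also contains suffixes, which the method above does not generate. A minimal, library-independent sketch of producing both sets could look like the following. PartialTokens and prefixesAndSuffixes are hypothetical names, not part of the Search API, and attaching the tokens to a document is left out.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper for illustration; not part of the App Engine Search API.
final class PartialTokens {
    private PartialTokens() {}

    // Returns every proper prefix and proper suffix of value that is at
    // least minLength characters long, prefixes first, without duplicates.
    static Set<String> prefixesAndSuffixes(String value, int minLength) {
        int min = Math.max(1, minLength); // never emit the empty string
        Set<String> tokens = new LinkedHashSet<>();
        // Prefixes: "G", "Gl", "Glo", "Glor" for "Glory".
        for (int end = min; end < value.length(); end++) {
            tokens.add(value.substring(0, end));
        }
        // Suffixes: "lory", "ory", "ry", "y" for "Glory".
        for (int start = 1; start <= value.length() - min; start++) {
            tokens.add(value.substring(start));
        }
        return tokens;
    }
}
```

Note that prefixes and suffixes alone still do not cover an infix query such as "lor". A true "contains" match would need every substring as a token, which grows quadratically with the length of the value.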
Should each record in my datastore be a document?
Yes. You could even use the entity's datastore ID as the document ID for quick matching (or just add it as a separate field).
I would like to return the entire datastore entity (it's only 8
fields) as an Iterable of the type of my entity. Do we specify each
field we need to return?
You need to store your entity's ID in your document, that way when your search returns a set of documents you just retrieve all entities with those IDs.
Does anybody have an example of what comes out of the stored document?
Is it exactly what we put in, or must you identify each field again? Or
an example of processing a ScoredDocument to return datastore
entities?
Documents return all the fields you stored in them, plus extra data such as the score and the id. The "processing" in your case would consist of getting the entity ID from the Document.
If you are certain your records won't grow beyond 1000, you could store virtually everything in your search index. Just bear in mind that the index is not designed for that and will hit serious limitations when scaling, which the datastore does not.
I want to store java objects as part of the Solr document.
They don't need to be parsed or searched, only be returned as part of the document.
I can convert them to json or XML and store the text but I prefer something more efficient.
If I could use Java serialization and then add the binary blob to the document, that would be ideal.
I'm aware of the option to convert the binary blob with base64 but I was wondering if there is a more efficient way.
I do not share the opinions of the first two answers.
An additional database call can be completely unnecessary in some scenarios; Solr can act as a NoSQL database, too.
It can even compress some fields, which costs CPU but saves cache memory for some kinds of binary data.
Take a look at BinaryField and the lazy loading field declarations within your schema.xml.
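For the serialization step the question mentions, a minimal sketch using only standard Java serialization might look like this. BlobCodec and its methods are hypothetical names, and how the resulting byte array is attached to the Solr document depends on the client in use, which is not shown here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical helper for illustration; not part of Solr or SolrJ.
final class BlobCodec {
    private BlobCodec() {}

    // Serializes the object into a byte array using Java serialization.
    static byte[] toBytes(Serializable value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Restores an object from bytes previously produced by toBytes.
    static Object fromBytes(byte[] bytes) {
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("Class of stored object not found", e);
        }
    }
}
```

Only deserialize bytes that your own application wrote, since Java deserialization of untrusted data is unsafe.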
Since you can assign your own id to any document you pass to Solr, you can store the object elsewhere (in a database, for example) and look it up with the id you get back from Solr.
For example, we store web pages in Solr. When we index a page, we create an id that matches the id of a WebPage object created by the ORM in the database.
When a search is performed, we get the id back and load the Java object from the database.
There is no need to store it in Solr (which was made to store and index documents).
I am using XML in my project to insert, update and delete data.
Currently I am using XPath to perform these operations from my Java application.
I am facing a problem while retrieving the data from the XML. If there are 1000 records in the XML file, I want to read them with a limit on the rows (like LIMIT in a MySQL select query) to implement pagination in the view page. I want to display 100 records at a time, so that the end user can click a next button to go through all 1000 records.
Can anyone tell me the best way to fulfil this requirement?
Yes, we can do it with the "position()" function, but the problem is that I want the data in sorted order. position() returns the data by its position in the XML file, and the data in the file may not be in order. So I want to read the data in sorted order. I am not able to find an XPath query that both sorts and paginates the data.
You can consider using JAXB instead of direct XML manipulation.
As you are using XPath to access your XML data, one possibility is the position() function to get "paginated" data from the XML. For example, the second page of 100 records:
/path/to/some/element[position() > 100 and position() <= 200]
Of course you then have to keep track of the boundaries (here 100 and 200) between user requests.
OK, if you need sorted output as well: as far as I know there is no sort function in pure XPath (1.0/2.0). Maybe you are using a library that offers one as an extension. Or you may be able to use XSLT and xsl:sort. Or you can use XML binding, as described in the other answer.
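Another option, when the file is small enough to hold in memory, is to select the values with XPath and then sort and slice them in Java. The following is only a sketch under that assumption. XmlPager and sortedPage are hypothetical names, pages are counted from 0, and the values are compared as plain strings.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration.
final class XmlPager {
    private XmlPager() {}

    // Evaluates expr against xml, sorts the text of the matched nodes,
    // and returns the requested zero-based page of at most pageSize values.
    static List<String> sortedPage(String xml, String expr, int page, int pageSize) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                    expr, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent());
            }
            values.sort(Comparator.naturalOrder());
            // Clamp the window so out-of-range pages yield an empty list.
            int from = Math.min(Math.max(0, page) * pageSize, values.size());
            int to = Math.min(from + pageSize, values.size());
            return new ArrayList<>(values.subList(from, to));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Invalid XPath or XML input", e);
        }
    }
}
```

For example, sortedPage(xml, "/records/record/name", 1, 100) would return the second block of 100 names in sorted order, where the path is a placeholder for your own document structure.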