Using the Elasticsearch Scroll API in Java

I tried to use the example here:
https://www.elastic.co/guide/en/elasticsearch/client/java-api/current/java-search-scrolling.html
on how to use scroll with Java in Elasticsearch.
This is the code:
QueryBuilder qb = termQuery("multi", "test");
SearchResponse scrollResp = client.prepareSearch("test")
.addSort(FieldSortBuilder.DOC_FIELD_NAME, SortOrder.ASC)
.setScroll(new TimeValue(60000))
.setQuery(qb)
.setSize(100).get(); //max of 100 hits will be returned for each scroll
//Scroll until no hits are returned
do {
for (SearchHit hit : scrollResp.getHits().getHits()) {
//Handle the hit...
}
scrollResp = client.prepareSearchScroll(scrollResp.getScrollId()).setScroll(new TimeValue(60000)).execute().actionGet();
} while(scrollResp.getHits().getHits().length != 0); // Zero hits mark the end of the scroll and the while loop.
However, I get an error that says: The method prepareSearch(String) is undefined for the type RestHighLevelClient.
My client variable is indeed a RestHighLevelClient, but according to the tutorial that is what it should be.
Any ideas what the problem is?

RestHighLevelClient works differently from TransportClient.
These are the steps to follow if you want to use scroll with RestHighLevelClient:
Create a SearchRequest:
SearchRequest request = new SearchRequest("test").scroll(new TimeValue(60000));
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
searchSourceBuilder.query(qb);
searchSourceBuilder.sort(FieldSortBuilder.DOC_FIELD_NAME, SortOrder.ASC);
request.source(searchSourceBuilder);
Perform the first search:
SearchResponse scrollResp = client.search(request);
Here client is the RestHighLevelClient.
For subsequent scroll searches create a SearchScrollRequest and then use it for scroll:
scrollResp = client.searchScroll(new SearchScrollRequest(scrollResp.getScrollId()).scroll(new TimeValue(60000)));
For more information, refer to the Search Scroll API documentation.
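The control flow of the steps above (one initial search, then repeated scroll calls until a page comes back empty) can be sketched without the Elasticsearch library. This is only an illustration: Page, ScrollSource, and the in-memory source are hypothetical stand-ins, not part of any client API. In real code the two interface methods would wrap the client's search and scroll calls.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ScrollLoopSketch {

    /** One page of hits plus the cursor used to request the next page. */
    static final class Page {
        final String scrollId;
        final List<String> hits;

        Page(String scrollId, List<String> hits) {
            this.scrollId = scrollId;
            this.hits = hits;
        }
    }

    /** Hypothetical stand-in for the two client calls: initial search, then scroll. */
    interface ScrollSource {
        Page search();
        Page scroll(String scrollId);
    }

    /** Collects every hit: handle a page, request the next, stop on an empty page. */
    static List<String> drain(ScrollSource source) {
        List<String> all = new ArrayList<>();
        Page page = source.search();
        while (!page.hits.isEmpty()) {
            all.addAll(page.hits);
            // Always pass the most recent scroll id to the next call.
            page = source.scroll(page.scrollId);
        }
        return all;
    }

    /** In-memory source for demonstration; its scroll id simply encodes the next offset. */
    static ScrollSource inMemory(final List<String> docs, final int pageSize) {
        return new ScrollSource() {
            public Page search() {
                return pageAt(0);
            }

            public Page scroll(String scrollId) {
                return pageAt(Integer.parseInt(scrollId));
            }

            private Page pageAt(int from) {
                int start = Math.min(from, docs.size());
                int end = Math.min(start + pageSize, docs.size());
                return new Page(String.valueOf(end), new ArrayList<>(docs.subList(start, end)));
            }
        };
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("d1", "d2", "d3", "d4", "d5");
        System.out.println(drain(inMemory(docs, 2))); // prints [d1, d2, d3, d4, d5]
    }
}
```

The point of the sketch is that the scroll id from the latest response is what gets passed to the next scroll call, and that an empty page is the only termination signal.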

As of Elasticsearch 6 there are two Java client APIs:
One is the REST API.
The other is the transport API.
The error says that you are using the client of the REST API with code written for the transport API.
To keep that code, you need to use this client API: https://www.elastic.co/guide/en/elasticsearch/client/java-api/current/transport-client.html
But it would be better to use the REST API, as Elasticsearch will remove the transport API in the future.
Here is the scroll request for the REST API: https://www.elastic.co/guide/en/elasticsearch/client/java-rest/6.3/java-rest-high-search-scroll.html

Related

Get results without id in org.elasticsearch.client.RestHighLevelClient

I'm using org.elasticsearch.client.RestHighLevelClient to get data from Elasticsearch.
Is it possible to get all documents for a given index using RestHighLevelClient,
like http://localhost:9200/test/_search?
It is definitely possible.
First of all, you need to initialize the client:
RestHighLevelClient client = new RestHighLevelClient(
RestClient.builder(
new HttpHost("localhost", 9200, "http"),
new HttpHost("localhost", 9201, "http")));
Then you need to execute a search query.
If you would like to fetch all docs, you will have to use the scrolling API.
You can find a complete example here.
If you do not need all of them, you can simply use the search API.
And don't forget to close the connection when the work is done:
client.close();

How to get the total count of documents present in an index using the Java High Level REST Client

I want to know the count of all the documents present in an index. Is it possible to get the count using the Java High Level REST Client's Count API?
You can get the count of all the documents in an index using either cat count or the Count API.
If you're using Elasticsearch version 6.6 or above, you can follow this link to get the count using the Java High Level REST Client's Count API.
If you're using an older version, you have to use the Java Low Level REST Client to get the doc count.
This works because RestHighLevelClient is built on top of the Low Level REST Client.
RestHighLevelClient client = new RestHighLevelClient(
RestClient.builder(
new HttpHost("localhost", 9200, "http"),
new HttpHost("localhost", 9201, "http")));
You can use this to get the low-level client from RestHighLevelClient:
RestClient lowLevelClient = client.getLowLevelClient();
For Elasticsearch version 6.3 and lower, perform the following:
Response response = client.getLowLevelClient().performRequest("GET", indexName+"/_count");
For Elasticsearch versions 6.3 through 6.5, perform the following:
Request request = new Request("GET", indexName+"/_count");
Response response = client.getLowLevelClient().performRequest(request);
Convert the response into a String:
String responseBody = EntityUtils.toString(response.getEntity());
Then you can parse the responseBody to get the count value.
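As a rough sketch of that last parsing step: the _count endpoint returns a small JSON object with a top-level "count" field, so the value can be pulled out with the JDK alone. The class and method names below are made up for illustration, and the regex is a shortcut that assumes this simple response shape. With a JSON library on the classpath you would normally parse the body properly instead.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CountResponseSketch {

    // Matches the first "count" field, e.g. {"count":42,"_shards":{...}}
    private static final Pattern COUNT = Pattern.compile("\"count\"\\s*:\\s*(\\d+)");

    /** Returns the document count found in a _count response body. */
    static long parseCount(String responseBody) {
        Matcher m = COUNT.matcher(responseBody);
        if (!m.find()) {
            throw new IllegalArgumentException("no \"count\" field in: " + responseBody);
        }
        return Long.parseLong(m.group(1));
    }

    public static void main(String[] args) {
        // Sample body in the shape returned by GET <index>/_count
        String body = "{\"count\":42,\"_shards\":{\"total\":5,\"successful\":5,\"skipped\":0,\"failed\":0}}";
        System.out.println(parseCount(body)); // prints 42
    }
}
```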

Using ES's UpdateByQueryRequestBuilder with Rest High Level Client

In Elasticsearch v5.5, we used the Transport Client when defining
UpdateByQueryRequestBuilder and it worked fine:
UpdateByQueryRequestBuilder updateByQuery = UpdateByQueryAction.INSTANCE
.newRequestBuilder(transportClient);
Since we're upgrading to RestHighLevelClient, the above builder no longer works and gives this error: "The method newRequestBuilder(ElasticsearchClient) in the type UpdateByQueryAction is not applicable for the arguments (RestHighLevelClient)".
Does anyone know if I can simply cast it like below:
UpdateByQueryRequestBuilder updateByQuery = UpdateByQueryAction.INSTANCE
.newRequestBuilder((ElasticsearchClient) restHighLevelClient);
or is there some other way to do it? Thanks
From the documentation, it looks like you should prepare the request directly:
UpdateByQueryRequest request = new UpdateByQueryRequest("source1", "source2");
request.set...
and later execute the request:
BulkByScrollResponse bulkResponse = client.updateByQuery(request, RequestOptions.DEFAULT);
I think UpdateByQueryRequestBuilder is a class specific to the TransportClient.

Elasticsearch Java Rest Client: how to get the list of all indices

How do I get the list of all indices in Elasticsearch using the Rest Client?
(All answers I've found online seem to deal with the old type of client.
I fail to find the direct answer in the doc,
https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/index.html
can't figure out which section to look into, either Cluster or Index APIs etc.)
Via the REST API, you can check with this URL: http://elasticsearch:9200/_cat/indices?v
Via the Java Client API (I just realised this is what you asked for), you can rely on the Cluster Health API: https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/java-rest-high-cluster-health.html
And use:
ClusterHealthRequest request = new ClusterHealthRequest();
ClusterHealthResponse response = client.cluster().health(request, RequestOptions.DEFAULT);
Set<String> indices = response.getIndices().keySet();
And you will get the list of indices ;)
In the current Java High Level REST Client, you can list all indices simply by sending a GetIndexRequest with "*" as the index name.
GetIndexRequest request = new GetIndexRequest().indices("*");
GetIndexResponse response = client.indices().get(request, RequestOptions.DEFAULT);
String[] indices = response.getIndices();
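If you go through the _cat/indices endpoint instead (for example with the low-level client), the response is plain text rather than JSON. The sketch below assumes the request was made as _cat/indices?h=index, so that each line of the body contains only an index name. That query parameter and the helper names are illustrative assumptions and do not come from the answers above.

```java
import java.util.ArrayList;
import java.util.List;

final class CatIndicesSketch {

    /** Splits the plain-text body of GET _cat/indices?h=index into index names. */
    static List<String> parseIndexNames(String body) {
        List<String> names = new ArrayList<>();
        for (String line : body.split("\\r?\\n")) {
            String name = line.trim(); // cat output may pad cells with spaces
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Hypothetical response body with two indices
        String body = "test\n.kibana\n";
        System.out.println(parseIndexNames(body)); // prints [test, .kibana]
    }
}
```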

Elasticsearch 6 RestHighLevelClient: How to know when the result of an IndexRequest is ready to be read?

I'm writing a unit test where I need to write to an Elasticsearch 6 index using a RestHighLevelClient in the Java Elasticsearch 6 library, then read from the index. How can I know when the results of an IndexRequest are ready to be read from the index via RestHighLevelClient.search? For example:
RestHighLevelClient client;
//client initialization
BulkRequest request = new BulkRequest();
request.add(new IndexRequest(...));
BulkResponse response = client.bulk(request);
//process response
SearchRequest searchRequest = new SearchRequest(...);
SearchResponse scrollResponse = client.search(searchRequest);
//scrollResponse is empty!
Basically, if I put a Thread.sleep between the write and the read, the response has the content I wrote, so I think the requests are being made properly. Is there a way I can be sure to wait until the client.bulk(request) part has completely finished writing before I do the read operation?
Setting the refresh policy on the bulk request will force a refresh as part of that request:
request.setRefreshPolicy(WriteRequest.RefreshPolicy.IMMEDIATE);
Figured it out. If anyone in the future happens to have this very specific problem, you need to include:
client.refreshIndex(indexName)
in between the write and the read. Elasticsearch refreshes every 1 second by default, but you can also refresh explicitly if you need to read less than 1 second after writing.
