ElasticSearch how to set specific index for a QueryBuilder (Java API)

I'm trying to convert this multi-index Elasticsearch search request into the equivalent Java/Kotlin code:
curl "localhost:9200/index1,index2/_search?pretty=true" -d '{
"query" : {
"bool" : {
"must" : [
{
"indices" : {
"indices" : ["index1"],
"query" : {
SOMETHING1
}
}
},
{
"indices" : {
"indices" : ["index2"],
"query" : {
SOMETHING2
}
}
}
]
}
}
}'
My current attempt is:
val searchRequest = SearchRequest("index1", "index2")
val searchSourceBuilder = SearchSourceBuilder()
val qb: BoolQueryBuilder = QueryBuilders.boolQuery()
val qbFirst: BoolQueryBuilder = QueryBuilders.boolQuery()
qbFirst.must().add(SOMETHING1)
val qbSecond: BoolQueryBuilder = QueryBuilders.boolQuery()
qbSecond.must().add(SOMETHING2)
qb.must().add(qbFirst)
qb.must().add(qbSecond)
searchSourceBuilder.query(qb)
searchRequest.source(searchSourceBuilder)
The contents of the SOMETHING1 and SOMETHING2 blocks are not important here.
The problem with my solution is that I can't make qbFirst consider ONLY index1 for the search. Instead it searches both index1 and index2 (and the same goes for qbSecond).
Any ideas?

What you need to do is the following:
val qb: BoolQueryBuilder = QueryBuilders.boolQuery()
var q1: XXXBuilder = QueryBuilders.xxxQuery()
var q2: XXXBuilder = QueryBuilders.xxxQuery()
qb.must().add(QueryBuilders.indicesQuery(q1, "index1"))
qb.must().add(QueryBuilders.indicesQuery(q2, "index2"))
Note that the indices query was deprecated in ES 5 and removed in ES 6, so if you ever upgrade you'll need to filter on the _index field instead, which goes like this:
val qb: BoolQueryBuilder = QueryBuilders.boolQuery()
var q1: BoolQueryBuilder = QueryBuilders.boolQuery()
q1.must().add(QueryBuilders.termQuery("_index", "index1"))
q1.must().add(QueryBuilders.xxxQuery("SOMETHING1"))
var q2: BoolQueryBuilder = QueryBuilders.boolQuery()
q2.must().add(QueryBuilders.termQuery("_index", "index2"))
q2.must().add(QueryBuilders.xxxQuery("SOMETHING2"))
qb.must().add(q1)
qb.must().add(q2)
PS: Also note that unless a document is present in both indices at the same time and satisfies both conditions, you should use should instead of must in the top-level query.
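To make the shape of the `_index`-based request body concrete, here is a minimal sketch that assembles the JSON by hand, using only the JDK. The class and method names (`IndexScopedQuery`, `scoped`, `anyOf`) are made up for illustration, and the inner queries are assumed to be valid JSON already. It combines the per-index clauses with should, as the PS suggests.

```java
import java.util.List;

final class IndexScopedQuery {
    // Wrap an inner query (already JSON) so it only matches documents from one index.
    // Assumes the index name contains no characters that need JSON escaping.
    static String scoped(String index, String innerQueryJson) {
        return "{\"bool\":{\"must\":[{\"term\":{\"_index\":\"" + index + "\"}},"
                + innerQueryJson + "]}}";
    }

    // OR the per-index clauses together; at least one of them has to match.
    static String anyOf(List<String> clauses) {
        return "{\"query\":{\"bool\":{\"should\":[" + String.join(",", clauses)
                + "],\"minimum_should_match\":1}}}";
    }

    public static void main(String[] args) {
        // match_all stands in for SOMETHING1 and SOMETHING2 from the question.
        String body = anyOf(List.of(
                scoped("index1", "{\"match_all\":{}}"),
                scoped("index2", "{\"match_all\":{}}")));
        System.out.println(body);
    }
}
```

The printed body can be sent to `index1,index2/_search` to check the behaviour before porting it to the query builders.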

Related

Java ElasticSearch API search multiple possible values

I'm searching in multiple fields, and I want to get results if the record matches a specific value (entry.getValue()) or the String "ALL"
Here is my code, but it's not working.
SearchRequest searchRequest = new SearchRequest(MY_INDEX);
final BoolQueryBuilder booleanQuery = QueryBuilders.boolQuery();
searchRequest.source().query(booleanQuery);
for (Map.Entry<String, String> entry : params.entrySet()) {
// match either the requested value or the literal "ALL"
booleanQuery.should(QueryBuilders.termsQuery(entry.getKey(), entry.getValue(), "ALL"));
}
I'm using JDK 11 and ES 7.1.

Here is some sample code, written for a country index, that searches for the values provided in a map. Customize it to your needs.
//using map for country
Map<String, String> map = new HashMap<>();
map.put("country" , "FRANCE");
map.put("countryCode", "FR");
//List of should queries this will go in should clause of bool query
List<Query> shouldQueryList = new ArrayList<>();
for (Map.Entry<String, String> entry :map.entrySet()) {
//list of terms to match i.e value from map and all.
List<FieldValue> list = Arrays.asList(FieldValue.of(entry.getValue()), FieldValue.of("ALL"));
//Terms query
Query query = new Query.Builder().terms(termsQueryBuilder -> termsQueryBuilder
.field(entry.getKey())
.terms(termQueryField -> termQueryField
.value(list))).build();
shouldQueryList.add(query);
}
try {
//running search from elastic search java client 7.16.3
SearchResponse<Country> response = elasticsearchClient.search(searchRequest -> searchRequest
.query(qBuilder -> qBuilder
.bool(boolQueryBuilder -> boolQueryBuilder
//using should query list here
.should(shouldQueryList)))
, Country.class);
response.hits().hits().forEach(a -> {
//Print matching country name in console
System.out.println(a.source().getCountry());
});
} catch (IOException e) {
log.info(e.getMessage());
}
The code above generates a query like this:
{"query":{"bool":{"should":[{"terms":{"country":["FRANCE","ALL"]}},{"terms":{"countryCode":["FR","ALL"]}}]}}}
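If you want to inspect the request body without a client at hand, the same structure can be reproduced with plain JDK string handling. This is only a sketch: `TermsShouldQuery` and `build` are hypothetical names, and it assumes the field names and values contain no characters that need JSON escaping.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

final class TermsShouldQuery {
    // One terms clause per map entry, each matching the entry's value or "ALL".
    static String build(Map<String, String> params) {
        String clauses = params.entrySet().stream()
                .map(e -> "{\"terms\":{\"" + e.getKey() + "\":[\""
                        + e.getValue() + "\",\"ALL\"]}}")
                .collect(Collectors.joining(","));
        return "{\"query\":{\"bool\":{\"should\":[" + clauses + "]}}}";
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so the clause order is predictable.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("country", "FRANCE");
        params.put("countryCode", "FR");
        System.out.println(build(params));
    }
}
```

With the two entries above, `build` produces the same JSON as the query shown in the answer.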

how to insert unique data in Elasticsearch index document in java

This follows on from my previous question, "how to insert data in elastic search index".
The index mapping is as follows:
{
"test" : {
"mappings" : {
"properties" : {
"name" : {
"type" : "keyword"
},
"info" : {
"type" : "nested"
},
"joining" : {
"type" : "date"
}
}
}
}
}
How can I check whether a field's value is already present before uploading a document to the index?
Note: I don't maintain an id field in the index. I need to check the name in each document, and if it is already present, not insert the document.
Thanks in advance.

As you don't have an id field in your mapping, you have to search on the name field. You can use the code below to search on it.
public List<SearchResult> search(String searchTerm) throws IOException {
SearchRequest searchRequest = new SearchRequest(INDEX_NAME);
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
// MatchQueryBuilder takes the field name first, then the text to match
MatchQueryBuilder matchQueryBuilder = new MatchQueryBuilder("firstName", searchTerm);
searchSourceBuilder.query(matchQueryBuilder);
searchRequest.source(searchSourceBuilder);
SearchResponse searchResponse = esclient.search(searchRequest, RequestOptions.DEFAULT);
return getSearchResults(searchResponse);
}
Note: as your name field is a keyword field, you can use a TermQueryBuilder instead of a match query.
The search method uses a utility method to parse the SearchResponse from ES; its code is below:
private List<SearchResult> getSearchResults(SearchResponse searchResponse) {
// Response metadata, available if you need it
RestStatus status = searchResponse.status();
TimeValue took = searchResponse.getTook();
Boolean terminatedEarly = searchResponse.isTerminatedEarly();
boolean timedOut = searchResponse.isTimedOut();
// Start fetching the documents matching the search results.
// https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/java-rest-high-search.html#java-rest-high-search-response-search-hits
SearchHits hits = searchResponse.getHits();
SearchHit[] searchHits = hits.getHits();
List<SearchResult> results = new ArrayList<>();
for (SearchHit hit : searchHits) {
// do something with the SearchHit
String index = hit.getIndex();
String id = hit.getId();
float score = hit.getScore();
//String sourceAsString = hit.getSourceAsString();
Map<String, Object> sourceAsMap = hit.getSourceAsMap();
String firstName = (String) sourceAsMap.get("firstName");
// SearchResult is your own POJO; this constructor is a placeholder for however you build it
results.add(new SearchResult(firstName));
}
return results;
}
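Searching first and inserting afterwards leaves a window in which two writers can both miss the existing document. A different technique, offered here as an alternative rather than as what the answer above does, is to derive the document ID from the name and index with op_type=create, which Elasticsearch rejects when a document with that ID already exists. The sketch below covers only the ID derivation with the JDK; `NameId` and `idFor` are hypothetical names, and hashing with SHA-256 is an assumption, since any stable function of the name would do.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class NameId {
    // Derive a stable, fixed-length document ID from the name field.
    static String idFor(String name) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(name.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-256.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The same name always yields the same ID, so a second create would conflict.
        System.out.println(idFor("alice").equals(idFor("alice")));
    }
}
```

Existing documents indexed with auto-generated IDs would not be covered by this, so it only helps if every document is written this way.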

Elasticsearch Java query with combination of AND/OR

I am trying to write a query in Elasticsearch via Spring and Java (Elasticsearch client).
The query is somewhat like:
SELECT * FROM elasticsearch_index
WHERE isActive = 1 AND
(
(store_code = 41 AND store_genre IN ('01', '03') )
OR (store_code = 40 AND store_genre IN ('02') )
OR (store_code = 42 AND store_genre IN ('05', '06') )
)
AND LATITUDE ...
AND LONGITUDE...
Note that the parameters within the outer parentheses come from a Map<Integer, String[]>, so I would iterate over the map to build the AND + OR conditions.
I tried an equivalent Java approach, but it does not seem to work:
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
boolQueryBuilder.must(QueryBuilders.matchQuery("isActive", 1));
BoolQueryBuilder orQuery = QueryBuilders.boolQuery();
for (Entry<Integer, String[]> entry : cvsDepoMapping.entrySet()) {
int key = entry.getKey();
String[] value = entry.getValue();
orQuery.must(QueryBuilders.matchQuery("storeCode", key));
orQuery.must(QueryBuilders.termsQuery("storeGenre", value)); // IN clause
boolQueryBuilder.should(orQuery);
}
But this is not working, and I am not certain of the solution.
I am struggling to find the Java equivalent of the condition above.
I am using:
Spring Boot 2.1.1.RELEASE
Elasticsearch 6.4.3

Within your OR query you need a nested AND query for each entry.
Untested, but something like this:
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
boolQueryBuilder.must(QueryBuilders.matchQuery("isActive", 1));
BoolQueryBuilder orQuery = QueryBuilders.boolQuery();
for (Entry<Integer, String[]> entry : cvsDepoMapping.entrySet()) {
BoolQueryBuilder storeQueryBuilder = QueryBuilders.boolQuery();
int key = entry.getKey();
String[] value = entry.getValue();
storeQueryBuilder.must(QueryBuilders.matchQuery("storeCode", key));
storeQueryBuilder.must(QueryBuilders.termsQuery("storeGenre", value)); // IN clause
orQuery.should(storeQueryBuilder);
}
boolQueryBuilder.must(orQuery);
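To see why each map entry needs its own AND group, it can help to model the condition in plain Java first. The sketch below involves no Elasticsearch at all; `StoreFilter` and `matches` are made-up names, and the mapping uses `List<String>` instead of `String[]` purely for brevity.

```java
import java.util.List;
import java.util.Map;

final class StoreFilter {
    // isActive AND ( (code AND genre IN ...) OR (code AND genre IN ...) OR ... )
    static boolean matches(boolean isActive, int storeCode, String storeGenre,
                           Map<Integer, List<String>> mapping) {
        // Each entry is one AND group; the groups are ORed together.
        boolean anyGroup = mapping.entrySet().stream()
                .anyMatch(e -> e.getKey() == storeCode
                        && e.getValue().contains(storeGenre));
        return isActive && anyGroup;
    }

    public static void main(String[] args) {
        Map<Integer, List<String>> mapping = Map.of(
                41, List.of("01", "03"),
                40, List.of("02"));
        System.out.println(matches(true, 41, "03", mapping));
        System.out.println(matches(true, 41, "02", mapping));
    }
}
```

Adding every entry's conditions as must clauses on one shared builder, as in the question, corresponds to requiring all groups at once, which a single store code can never satisfy.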

Elastic Search : Highlighted field not always returned

Using Java API, I need to be able to retrieve the field/highlighted field associated with the query. So I'm adding the _all field (or else *) to the query and highlighted field to the response.
It works most of the time, but not always. Here is a snippet :
final BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
Arrays.asList(query.split(" "))
.stream()
.map(QueryParser::escape)
.map(x -> String.format("*%s*", x))
.forEach(x -> {
boolQueryBuilder.should(
QueryBuilders.queryStringQuery(x)
.field("_all")
.allowLeadingWildcard(true));
});
SearchResponse response = client
.prepareSearch()
.setSize(10)
.addHighlightedField("*")
.setHighlighterRequireFieldMatch(false)
.setQuery(boolQueryBuilder)
.setHighlighterFragmentSize(40)
.setHighlighterNumOfFragments(40)
.execute()
.actionGet();
Any idea why the field, as well as the highlighted field, is not always accessible in the response, given that it is technically always queried?

Not sure, but I think you might be looking for this:
String aQueryWithPartialSearch = null;
final BoolQueryBuilder aBoolQueryBuilder = new BoolQueryBuilder();
// Enable partial search by wrapping each token in wildcards
if (query.contains(" ")) {
List<String> aTokenList = Arrays.asList(query.split(" "));
aQueryWithPartialSearch = String.join(" ", aTokenList.stream().map(p -> "*" + p + "*").collect(Collectors.toList()));
} else {
aQueryWithPartialSearch = "*" + query + "*";
}
aBoolQueryBuilder.should(QueryBuilders.queryStringQuery(aQueryWithPartialSearch));
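The wildcard wrapping on its own can be written as a single stream pipeline, which also removes the need for the if/else. This is a sketch using only the JDK; `WildcardTerms` and `wrap` are hypothetical names. It does not escape query_string special characters, so the escaping step from the question would still be needed on each token.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

final class WildcardTerms {
    // Wrap every whitespace-separated token in leading and trailing wildcards.
    static String wrap(String query) {
        return Arrays.stream(query.trim().split("\\s+"))
                .map(token -> "*" + token + "*")
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        System.out.println(wrap("foo bar"));
    }
}
```

The result can be passed to `QueryBuilders.queryStringQuery(...)` in place of the hand-built string.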

CloudSearch deleteByQuery

The official Solr Java API has a deleteByQuery operation where we can delete documents that satisfy a query. The AWS CloudSearch SDK doesn't seem to have matching functionality. Am I just not seeing the deleteByQuery equivalent, or is this something we'll need to roll our own?
Something like this:
SearchRequest searchRequest = new SearchRequest();
searchRequest.setQuery(queryString);
searchRequest.setReturn("id,version");
SearchResult searchResult = awsCloudSearch.search(searchRequest);
JSONArray docs = new JSONArray();
for (Hit hit : searchResult.getHits().getHit()) {
JSONObject doc = new JSONObject();
doc.put("id", hit.getId());
// is version necessary?
doc.put("version", hit.getFields().get("version").get(0));
doc.put("type", "delete");
docs.put(doc);
}
UploadDocumentsRequest uploadDocumentsRequest = new UploadDocumentsRequest();
StringInputStream documents = new StringInputStream(docs.toString());
uploadDocumentsRequest.setDocuments(documents);
UploadDocumentsResult uploadResult = awsCloudSearch.uploadDocuments(uploadDocumentsRequest);
Is this reasonable? Is there an easier way?

You're correct that CloudSearch doesn't have an equivalent to deleteByQuery. Your approach looks like the next best thing.
And no, version is not necessary -- it was removed with the CloudSearch 2013-01-01 API (aka v2).

CloudSearch doesn't provide delete-by-query. It supports deletion in a slightly different way: build a JSON object containing only the ID of the document to delete, with the operation type set to delete. These JSON objects can be batched together, but the batch size has to be less than 5 MB.
The following PHP class supports this; just pass its deleteDocs method the array of IDs to delete:
class AWS_CS
{
protected $client;
function connect($domain)
{
try{
$csClient = CloudSearchClient::factory(array(
'key' => 'YOUR_KEY',
'secret' => 'YOUR_SECRET',
'region' => 'us-east-1'
));
$this->client = $csClient->getDomainClient(
$domain,
array(
'credentials' => $csClient->getCredentials(),
'scheme' => 'HTTPS'
)
);
}
catch(Exception $ex){
echo "Exception: ";
echo $ex->getMessage();
}
//$this->client->addSubscriber(LogPlugin::getDebugPlugin());
}
function search($queryStr, $domain){
$this->connect($domain);
$result = $this->client->search(array(
'query' => $queryStr,
'queryParser' => 'lucene',
'size' => 100,
'return' => '_score,_all_fields'
))->toArray();
return json_encode($result['hits']);
//$hitCount = $result->getPath('hits/found');
//echo "Number of Hits: {$hitCount}\n";
}
function deleteDocs($idArray, $operation = 'delete'){
$batch = array();
foreach($idArray as $id){
//dumpArray($song);
$batch[] = array(
'type' => $operation,
'id' => $id);
}
$batch = array_filter($batch);
$jsonObj = json_encode($batch, JSON_HEX_TAG | JSON_HEX_APOS | JSON_HEX_QUOT | JSON_HEX_AMP);
// keep the response so its status can be checked below
$result = $this->client->uploadDocuments(array(
'documents' => $jsonObj,
'contentType' =>'application/json'
));
return $result['status'] == 'success' ? mb_strlen($jsonObj) : 0;
}
}
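Since the question is about the Java SDK, here is a rough Java sketch of the batching step only: it turns a list of document IDs into JSON delete batches that each stay within a byte limit. It uses only the JDK; `DeleteBatcher`, `quote` and `batches` are made-up names, and the limit is passed in as a parameter rather than assumed. Each returned string could then be uploaded the way the question's code uploads `docs.toString()`.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class DeleteBatcher {
    // Escape the two characters that would break a JSON string literal.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Split ids into JSON arrays of delete operations, each at most maxBytes in UTF-8.
    // A single operation larger than maxBytes still gets a batch of its own.
    static List<String> batches(List<String> ids, int maxBytes) {
        List<String> out = new ArrayList<>();
        StringBuilder current = new StringBuilder("[");
        int currentBytes = 1; // counts the opening '['
        for (String id : ids) {
            String op = "{\"type\":\"delete\",\"id\":" + quote(id) + "}";
            int opBytes = op.getBytes(StandardCharsets.UTF_8).length;
            boolean empty = current.length() == 1;
            // separator comma (if any) + operation + closing ']'
            int needed = currentBytes + (empty ? 0 : 1) + opBytes + 1;
            if (!empty && needed > maxBytes) {
                out.add(current.append(']').toString());
                current = new StringBuilder("[");
                currentBytes = 1;
                empty = true;
            }
            if (!empty) {
                current.append(',');
                currentBytes++;
            }
            current.append(op);
            currentBytes += opBytes;
        }
        if (current.length() > 1) {
            out.add(current.append(']').toString());
        }
        return out;
    }

    public static void main(String[] args) {
        // A tiny limit is used here only to make the splitting visible.
        batches(List.of("a", "b", "c"), 60).forEach(System.out::println);
    }
}
```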

Modified for C#: deleting uploaded documents in CloudSearch
public void DeleteUploadedDocuments(string location)
{
SearchRequest searchRequest = new SearchRequest { Query = "resourcename:'filepath'", QueryParser = QueryParser.Lucene, Size = 10000 };
searchClient = new AmazonCloudSearchDomainClient( ConfigurationManager.AppSettings["awsAccessKeyId"] , ConfigurationManager.AppSettings["awsSecretAccessKey"] , new AmazonCloudSearchDomainConfig { ServiceURL = ConfigurationManager.AppSettings["CloudSearchEndPoint"] });
SearchResponse searchResponse = searchClient.Search(searchRequest);
JArray docs = new JArray();
foreach (Hit hit in searchResponse.Hits.Hit)
{
JObject doc = new JObject();
doc.Add("id", hit.Id);
doc.Add("type", "delete");
docs.Add(doc);
}
UpdateIndexDocument<JArray>(docs, ConfigurationManager.AppSettings["CloudSearchEndPoint"]);
}
public void UpdateIndexDocument<T>(T document, string DocumentUrl)
{
AmazonCloudSearchDomainConfig config = new AmazonCloudSearchDomainConfig { ServiceURL = DocumentUrl };
AmazonCloudSearchDomainClient searchClient = new AmazonCloudSearchDomainClient( ConfigurationManager.AppSettings["awsAccessKeyId"] , ConfigurationManager.AppSettings["awsSecretAccessKey"] , config);
using (Stream stream = GenerateStreamFromString(JsonConvert.SerializeObject(document)))
{
UploadDocumentsRequest upload = new UploadDocumentsRequest()
{
ContentType = "application/json",
Documents = stream
};
searchClient.UploadDocuments(upload);
};
}
