Elasticsearch get all data with filters - java

I want to get all data from Elasticsearch with filters, without paging. What is the best way to do that? My default limit is set to 2000. I read that I should use scan, but I don't know how. How should I use scan and scroll to get all the data?
public Map searchByIndexParams(AuctionIndexSearchParams searchParams, Pageable pageable) {
final List<FilterBuilder> filters = Lists.newArrayList();
final NativeSearchQueryBuilder searchQuery = new NativeSearchQueryBuilder().withQuery(matchAllQuery());
Optional.ofNullable(searchParams.getCategoryId()).ifPresent(v -> filters.add(boolFilter().must(termFilter("cat", v))));
Optional.ofNullable(searchParams.getCurrency()).ifPresent(v -> filters.add(boolFilter().must(termFilter("curr", v))));
Optional.ofNullable(searchParams.getTreeCategoryId()).ifPresent(v -> filters.add(boolFilter().must(termFilter("tcat", v))));
Optional.ofNullable(searchParams.getUid()).ifPresent(v -> filters.add(boolFilter().must(termFilter("uid", v))));
//access for many uids
if(searchParams.getUids() != null){
Optional.ofNullable(searchParams.getUids().split(",")).ifPresent(v -> {
filters.add(boolFilter().must(termsFilter("uid", v)));
});
}
//access for many categories
if(searchParams.getCategories() != null){
Optional.ofNullable(searchParams.getCategories().split(",")).ifPresent(v -> {
filters.add(boolFilter().must(termsFilter("cat", v)));
});
}
final BoolQueryBuilder boolQueryBuilder = new BoolQueryBuilder();
if (Optional.ofNullable(searchParams.getTitle()).isPresent()) {
boolQueryBuilder.should(queryStringQuery(searchParams.getTitle()).analyzeWildcard(true).field("title"));
}
if (Optional.ofNullable(searchParams.getStartDateFrom()).isPresent()
|| Optional.ofNullable(searchParams.getStartDateTo()).isPresent()) {
filters.add(rangeFilter("start_date").from(searchParams.getStartDateFrom()).to(searchParams.getStartDateTo()));
}
if (Optional.ofNullable(searchParams.getEndDateFrom()).isPresent()
|| Optional.ofNullable(searchParams.getEndDateTo()).isPresent()) {
filters.add(rangeFilter("end_date").from(searchParams.getEndDateFrom()).to(searchParams.getEndDateTo()));
}
if (Optional.ofNullable(searchParams.getPriceFrom()).isPresent()
|| Optional.ofNullable(searchParams.getPriceTo()).isPresent()) {
filters.add(rangeFilter("price").from(searchParams.getPriceFrom()).to(searchParams.getPriceTo()));
}
searchQuery.withQuery(boolQueryBuilder);
FilterBuilder[] filterArr = new FilterBuilder[filters.size()];
filterArr = filters.toArray(filterArr);
searchQuery.withFilter(andFilter(filterArr));
final FacetedPage<AuctionIndex> search = auctionIndexRepository.search(searchQuery.build());
response.put("content", search.map(index ->auctionRepository
.findAuctionById(Long.valueOf(index.getId())))
.getContent());
return response;
}
Edit:
I now have:
String scrollId = searchTemplate.scan(searchQuery.build(), 1000, false);
Page<AuctionIndex> page = searchTemplate.scroll(scrollId, 15000L, AuctionIndex.class);
Integer i = 0;
if (page != null && page.hasContent()) {
while(page.hasContent()){
page = searchTemplate.scroll(scrollId, 15000L, AuctionIndex.class);
if(page.hasContent()){
System.out.println(i);
i++;
}
}
}
but the iteration reaches 166 and stops. What's wrong?

The scroll API is the most efficient way to go through all the documents. The scroll_id identifies a session that is kept on the server for your specific scroll request.
Here is a sample showing how to use the Elasticsearch Java scroll API to fetch all the results matching your query.
SearchResponse searchResponse = client.prepareSearch(<INDEX>)
.setQuery(<QUERY>)
.setSearchType(SearchType.SCAN)
.setScroll(SCROLL_TIMEOUT)
.setSize(SCROLL_SIZE)
.execute()
.actionGet();
while (true) {
searchResponse = client
.prepareSearchScroll(searchResponse.getScrollId())
.setScroll(SCROLL_TIMEOUT)
.execute().actionGet();
if (searchResponse.getHits().getHits().length == 0) {
break; //Break condition: No hits are returned
}
for (SearchHit hit : searchResponse.getHits()) {
// process response
}
}
Sample using Spring-data-elasticsearch
@Autowired
private ElasticsearchTemplate searchTemplate;
String scrollId = searchTemplate.scan(<SEARCH_QUERY>, 1000, false);
Page<ExampleItem> page = searchTemplate.scroll(scrollId, 5000L, ExampleItem.class);
if (page != null && page.hasContent()) {
// process first batch
while (page != null && page.hasContent()) {
page = searchTemplate.scroll(scrollId, 5000L, ExampleItem.class);
if (page != null && page.hasContent()) {
// process remaining batches
}
}
}
Here, ExampleItem specifies the entity that is to be fetched.
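Note that the counter in the question's loop is incremented once per batch, not once per document, so it does not show how many documents were fetched, and the loop also never processes the first page it receives. The batch-draining pattern itself can be sketched with plain JDK types. In this sketch the Supplier is a hypothetical stand-in for a call such as searchTemplate.scroll(...); the class and method names are made up for illustration and are not part of any Elasticsearch API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.function.Supplier;

final class ScrollDrain {

    /**
     * Calls nextBatch repeatedly and collects every element until an
     * empty (or null) batch signals that the scroll is exhausted.
     */
    static <T> List<T> drain(Supplier<List<T>> nextBatch) {
        List<T> all = new ArrayList<>();
        List<T> batch = nextBatch.get();
        while (batch != null && !batch.isEmpty()) {
            all.addAll(batch);        // process the batch that was just fetched
            batch = nextBatch.get();  // then ask for the next one
        }
        return all;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the scroll call: two batches, then an empty one.
        Iterator<List<Integer>> batches = Arrays.asList(
                Arrays.asList(1, 2, 3),
                Arrays.asList(4, 5),
                Collections.<Integer>emptyList()).iterator();
        List<Integer> all = drain(batches::next);
        System.out.println(all.size() + " documents in total"); // prints 5 documents in total
    }
}
```

Fetching first and testing afterwards means every batch, including the first, is handled exactly once, and the count is taken from the documents rather than from the number of round trips.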


How can I convert it to a Java stream

I am pretty new to Java 8 streams. I was trying to process a collection of objects using a stream, but I could not get it quite right.
Below is the snippet I came up with (which gives the wrong result). The expected end result is a List<String> of "Names email@test.com".
recordObjects is a collection of objects.
choices = recordObjects.stream()
.filter(record -> record.getAttribute
(OneRecord.AT_RECORD_SUBMITTER_TABLE_EMAIL) != null)
.filter(record -> !record.getAttributeAsString
(OneRecord.AT_RECORD_SUBMITTER_TABLE_EMAIL).isEmpty())
.map(record -> record.getMultiValuedAttribute
(OneRecord.AT_RECORD_SUBMITTER_TABLE_EMAIL, String.class))
.flatMap(Collection::stream)
.map(email -> getFormattedEmailAddress(ATTRI_AND_RECORD_CONTACT_DEFAULT_NAME, email))
.collect(Collectors.toList());
Below is the exact logic I want to implement using streams:
for (CallerObject record : recordObjects) {
List<String> emails = record.getMultiValuedAttribute(
OneRecord.AT_RECORD_SUBMITTER_TABLE_EMAIL, String.class);
List<String> names = record.getMultiValuedAttribute(
OneRecord.AT_RECORD_SUBMITTER_TABLE_NAME, String.class);
int N = emails.size();
for (int i = 0 ; i < N ; i++) {
if(!isNullOrEmpty(emails.get(i)))
{
choices.add(getFormattedEmailAddress(isNullOrEmpty(names.get(i)) ?
ATTRI_AND_RECORD_CONTACT_DEFAULT_NAME : names.get(i) , emails.get(i)));
}
}
}
Since we don't know the getFormattedEmailAddress method, I used String.format instead to achieve the desired representation "Names email@test.com":
// the mapper function: using String.format
Function<CallerObject, String> toEmailString = r -> {
// single-valued lookups, using the getAttributeAsString accessor shown in the question
String email = r.getAttributeAsString(OneRecord.AT_RECORD_SUBMITTER_TABLE_EMAIL);
String name = r.getAttributeAsString(OneRecord.AT_RECORD_SUBMITTER_TABLE_NAME);
if (email != null) {
return String.format("%s %s", name, email);
} else {
return null;
}
};
choices = recordObjects.stream()
.map(toEmailString) // map to email-format or null
.filter(Objects::nonNull) // exclude null strings where no email was found
.collect(Collectors.toList());
Here is your loop-based code converted to Java 8 streams:
final Function<CallerObject, List<String>> filteredEmail = ro -> {
final List<String> emails = ro.getMultiValuedAttribute(
OneRecord.AT_RECORD_SUBMITTER_TABLE_EMAIL, String.class);
final List<String> names = ro.getMultiValuedAttribute(
OneRecord.AT_RECORD_SUBMITTER_TABLE_NAME, String.class);
return IntStream.range(0, emails.size())
.filter(index -> !isNullOrEmpty(emails.get(index)))
.mapToObj(index -> getFormattedEmailAddress(isNullOrEmpty(names.get(index)) ?
ATTRI_AND_RECORD_CONTACT_DEFAULT_NAME : names.get(index) , emails.get(index)))
.collect(Collectors.toList());
};
recordObjects
.stream()
.map(filteredEmail)
.flatMap(Collection::stream)
.collect(Collectors.toList());
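The index-based pairing of names and emails can be tried out in isolation. The following is a minimal sketch with plain lists: EmailChoices, DEFAULT_NAME and the "name email" formatting are assumptions made for illustration, since getFormattedEmailAddress and the record type are not shown in the question.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

final class EmailChoices {

    // Hypothetical default, standing in for ATTRI_AND_RECORD_CONTACT_DEFAULT_NAME.
    static final String DEFAULT_NAME = "Unknown";

    static boolean isNullOrEmpty(String s) {
        return s == null || s.isEmpty();
    }

    /** Pairs names.get(i) with emails.get(i), skipping entries whose email is blank. */
    static List<String> format(List<String> names, List<String> emails) {
        return IntStream.range(0, emails.size())
                .filter(i -> !isNullOrEmpty(emails.get(i)))
                // mapToObj (not map) is needed to go from an int index to a String
                .mapToObj(i -> (isNullOrEmpty(names.get(i)) ? DEFAULT_NAME : names.get(i))
                        + " " + emails.get(i))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Ann", "", "Bob");
        List<String> emails = Arrays.asList("ann@test.com", "x@test.com", null);
        System.out.println(format(names, emails));
        // prints [Ann ann@test.com, Unknown x@test.com]
    }
}
```

Like the original loop, this assumes that the names list is at least as long as the emails list.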

Replace SQL Exception with empty object

I use this SQL query to get simple object:
@Override
public Optional<PaymentTransactions> paymentTransactionByWpfPaymentId(Integer id) {
String hql = "SELECT t FROM " + PaymentTransactions.class.getName() + " t "
+ " WHERE wppt.wpf_payment_id = :id ";
TypedQuery<PaymentTransactions> query = entityManager.createQuery(hql, PaymentTransactions.class).setParameter("id", id);
List<PaymentTransactions> wpfPayments = query.getResultList();
return wpfPayments.isEmpty() ? Optional.empty() : Optional.of(wpfPayments.get(0));
}
I use this End point
@GetMapping("/{id}")
public ResponseEntity<List<PaymentTransactionsDTO>> getWpf_reference_transactions(@PathVariable String id) throws NumberFormatException, Exception {
Optional<PaymentTransactions> tnx = wpfPaymentsService.paymentTransactionByWpfPaymentId(Integer.parseInt(id));
if(tnx.get().getId() != 0) {
return ResponseEntity.ok(transactionService
.get(Integer.valueOf(tnx.get().getId())).stream()
.collect(Collectors.toList()));
}
return ResponseEntity.notFound().build();
}
But when the database is empty I get java.util.NoSuchElementException: No value present. Is there a way to return just an empty object without this exception?
You can simplify your return statement using
return tnx.map(PaymentTransactions::getId)
.filter(id -> id != 0)
.map(id -> transactionService.get(id)
.stream()
.collect(Collectors.toList()))
.map(ResponseEntity::ok)
.orElse(ResponseEntity.notFound().build());
For a cleaner approach.
Also, this
id -> transactionService.get(id)
.stream()
.collect(Collectors.toList())
can become
id -> new ArrayList<>(transactionService.get(id))
and so you have
tnx.map(PaymentTransactions::getId)
.filter(id -> id != 0)
.map(id -> new ArrayList<>(transactionService.get(id)))
.map(ResponseEntity::ok)
.orElse(ResponseEntity.notFound().build());
I also doubt you need
id -> new ArrayList<>(transactionService.get(id))
Instead, this is sufficient
id -> transactionService.get(id)
because you never modify that List afterwards.
Optional.get() throws NoSuchElementException if no value is present, so use isPresent() to check whether the Optional holds a value:
if(tnx.isPresent() && tnx.get().getId() != 0) {
return ResponseEntity.ok(transactionService
.get(Integer.valueOf(tnx.get().getId())).stream()
.collect(Collectors.toList()));
}
return ResponseEntity.notFound().build();
Yes. Wherever your exception originates from, wrap it in a try/catch block, as follows:
try {
<code that throws exception>
} catch (NoSuchElementException e) {
return new MyObject();
}
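The map/filter/orElse chain suggested earlier in this thread can be checked without Spring. In this minimal sketch, Tx is a hypothetical stand-in for PaymentTransactions, and the strings "200 <id>" and "404" stand in for the ResponseEntity results; only java.util.Optional is real API here.

```java
import java.util.Optional;

final class OptionalLookup {

    /** Hypothetical stand-in for the PaymentTransactions entity. */
    static final class Tx {
        private final int id;

        Tx(int id) {
            this.id = id;
        }

        int getId() {
            return id;
        }
    }

    /** Returns "200 <id>" for a present, non-zero id and "404" otherwise; never calls get(). */
    static String respond(Optional<Tx> tnx) {
        return tnx.map(Tx::getId)
                .filter(id -> id != 0)
                .map(id -> "200 " + id)
                .orElse("404");
    }

    public static void main(String[] args) {
        System.out.println(respond(Optional.of(new Tx(7)))); // prints 200 7
        System.out.println(respond(Optional.of(new Tx(0)))); // prints 404
        System.out.println(respond(Optional.empty()));       // prints 404
    }
}
```

Because the empty case flows through orElse, no exception is thrown and no try/catch is needed.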

Spring amqp reject message outside a listener

The application uses java 10, spring amqp and rabbitmq.
The system has a dead letter queue where we send some messages (they couldn't be processed as expected because of database unavailability).
For now, database availability is checked every X seconds and, only if it is available, we re-queue messages to their original queue. Otherwise we do nothing and the messages stay in the dead letter queue.
When re-queued to original queue, messages can go back to dead letter queue again and see the x-death header count growing.
For some reasons, we would like to process dead-lettered messages that have count >= 5 (for example) and re-queue others to the dead letter queue.
I need to basic ack the message first to check the x-death count header, then send it to the original queue if count is big enough, else re-queue in dead letter queue.
I can't manage to re-queue to the dead letter queue because the basic get is not inside a listener: throwing AmqpRejectAndDontRequeueException doesn't work because the exception is not thrown inside a RabbitMQ listener object.
I tried throwing the exception inside a receiveAndReply callback, but this does not work any better:
rabbitTemplate.receiveAndReply(queueName, new ReceiveAndReplyCallback<Message, Object>() {
@Override
public Object handle(Message message) {
Long messageXdeathCount = null;
if (null != message.getMessageProperties() && null != message.getMessageProperties().getHeaders()) {
List<Map<String, ?>> xdeathHeader =
(List<Map<String, ?>>) message.getMessageProperties().getHeaders().get(
"x-death");
if (null != xdeathHeader && null != xdeathHeader.get(0)) {
messageXdeathCount = (Long) xdeathHeader.get(0).get("count");
}
}
if (messageXdeathCount == null) {
messageXdeathCount = 0L;
}
if (messageXdeathCount >= 5) {
resendsMessage(message);
} else {
// this does not reject the message
throw new AmqpRejectAndDontRequeueException("rejected");
}
return null;
}
});
return receive;
}
After this method execution, the message is not rejected as I expect and is away from the queue (it has been acked).
Here is the exchange and queue declaration:
@Bean
public Exchange exchange() {
TopicExchange exchange = new TopicExchange(EXCHANGE, true, false);
admin().declareExchange(exchange);
Map<String, Object> args = new HashMap<String, Object>();
args.put("x-dead-letter-exchange", EXCHANGE);
Queue queue = new Queue("queueName", true, false, false, args);
admin().declareQueue(queue);
Binding binding = BindingBuilder.bind(queue).to(exchange).with(routingKey).noargs();
admin().declareBinding(binding);
return exchange;
}
How can I reject the messages in the dead letter queue without using AmqpRejectAndDontRequeueException?
Is it possible for an exchange to have x-dead-letter-exchange set to itself?
Thanks for your help
UPDATE
I tried another way, with channel get and reject:
// exchange creation
@Bean
public Exchange exchange() throws IOException {
Connection connection = connectionFactory().createConnection();
Channel channel = channel();
channel.exchangeDeclare(EXCHANGE, ExchangeTypes.TOPIC, true, false, null);
Map<String, Object> args = new HashMap<String, Object>();
args.put("x-dead-letter-exchange", EXCHANGE);
channel.queueDeclare("queueName", true, false, false, args);
channel.queueBind("queueName", EXCHANGE, routingKey);
return exchange;
}
Message get and ack or reject:
GetResponse response = channel.basicGet(queueName, false);
Long messageXdeathCount = null;
if(null != response.getProps() && null != response.getProps().getHeaders()) {
List<Map<String, ?>> xdeathHeader =
(List<Map<String, ?>>) response.getProps().getHeaders().get("x-death");
if(null != xdeathHeader && null != xdeathHeader.get(0)) {
messageXdeathCount = (Long) xdeathHeader.get(0).get("count");
}
}
if (messageXdeathCount == null) {
messageXdeathCount = 0L;
}
if (messageXdeathCount >= 5) {
MessagePropertiesConverter messagePropertiesConverter = new DefaultMessagePropertiesConverter();
MessageProperties messageProps =
messagePropertiesConverter.toMessageProperties(response.getProps(),
response.getEnvelope(), "UTF-8");
resendsMessage(new Message(response.getBody(), messageProps));
channel.basicAck(response.getEnvelope().getDeliveryTag(), false);
} else {
if(response.getProps().getHeaders().get("x-death") == null) {
response.getProps().getHeaders().put("x-death", new ArrayList<>());
}
if(((List<Map<String, Object>>) response.getProps().getHeaders().get("x-death")).get(0) == null) {
((List<Map<String, Object>>)response.getProps().getHeaders().get("x-death")).add(new HashMap<>());
}
((List<Map<String, Object>>) response.getProps().getHeaders().get("x-death")).get(0).put(
"count", messageXdeathCount + 1);
channel.basicReject(response.getEnvelope().getDeliveryTag(), true);
}
First I realized that it was quite ugly, then that messages cannot be updated between get and reject. Is there a way to use channel.basicReject and update the x-death count header?
receiveAndReply() methods currently do not provide control over the acknowledging of the received message. Feel free to open a New Feature Request.
You can use a listener container instead to get the flexibility you need.
EDIT
You can drop down to the rabbitmq API...
rabbitTemplate.execute(channel -> {
// basicGet, basicPublish, ack/nack etc here
});
I could use the channel basic methods:
GetResponse response = channel.basicGet(queueName, false);
Long messageXdeathCount = 0L;
if(null != response.getProps() && null != response.getProps().getHeaders()) {
List<Map<String, ?>> xdeathHeader =
(List<Map<String, ?>>) response.getProps().getHeaders().get("x-death");
if(null != xdeathHeader && null != xdeathHeader.get(0)) {
for (Map<String, ?> map : xdeathHeader) {
Long count = (Long) map.get("count");
messageXdeathCount += count == null ? 0L : count;
}
}
}
if (messageXdeathCount >= 5) {
MessagePropertiesConverter messagePropertiesConverter = new DefaultMessagePropertiesConverter();
MessageProperties messageProps = messagePropertiesConverter.toMessageProperties(response.getProps(), response.getEnvelope(), "UTF-8");
resendsMessage(new Message(response.getBody(), messageProps));
channel.basicAck(response.getEnvelope().getDeliveryTag(), false);
} else {
channel.basicReject(response.getEnvelope().getDeliveryTag(), false);
}
The issue in the update part of my question was in the last line:
channel.basicReject(response.getEnvelope().getDeliveryTag(), true);
The boolean indicates whether the message should be requeued: if it is not requeued, it goes to the dead letter exchange and the x-death count header is incremented, as expected. Changing the boolean to false fixed the issue.
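The header inspection used above can be isolated into a small helper that avoids unchecked casts. This is a sketch only: it assumes, as the code in this thread does, that the x-death header is a list of maps that each carry a numeric "count" entry. XDeath and totalDeaths are hypothetical names, not part of Spring AMQP or the RabbitMQ client.

```java
import java.util.List;
import java.util.Map;

final class XDeath {

    /** Sums the "count" entries of the x-death header; returns 0 if the header is absent. */
    static long totalDeaths(Map<String, Object> headers) {
        if (headers == null) {
            return 0L;
        }
        Object raw = headers.get("x-death");
        if (!(raw instanceof List)) {
            return 0L; // header missing or not in the assumed list form
        }
        long total = 0L;
        for (Object entry : (List<?>) raw) {
            if (entry instanceof Map) {
                Object count = ((Map<?, ?>) entry).get("count");
                if (count instanceof Number) {
                    total += ((Number) count).longValue();
                }
            }
        }
        return total;
    }
}
```

A caller could then compare totalDeaths(response.getProps().getHeaders()) against the threshold of 5 before deciding between basicAck and basicReject.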

elasticsearch: return TotalPages not correct

I have 107 documents in my index. I created a method to return all of these documents with pagination. With a page size of 20, I should logically get 6 pages: the first 5 pages contain 20 documents each and the 6th page contains only 7. The problem is that the method always returns 1 page, not 6.
@Override
@Transactional(readOnly = true)
public Page<Convention> findAll(Pageable pageable) throws UnknownHostException {
String[] parts = pageable.getSort().toString().split(":");
SortOrder sortOrder;
if ("DESC".equalsIgnoreCase(parts[1].trim())) {
sortOrder = SortOrder.DESC;
} else {
sortOrder = SortOrder.ASC;
}
SearchResponse searchResponse = elasticsearchConfiguration.getTransportClient()
.prepareSearch("convention")
.setTypes("convention")
.setQuery(QueryBuilders.matchAllQuery())
.addSort(SortBuilders.fieldSort(parts[0])
.order(sortOrder))
.setSize(pageable.getPageSize())
.setFrom(pageable.getPageNumber() * pageable.getPageSize())
.setSearchType(SearchType.QUERY_THEN_FETCH)
.get();
return searchResults(searchResponse);
}
private Page<Convention> searchResults(SearchResponse searchResponse) {
List<Convention> conventions = new ArrayList<>();
for (SearchHit hit : searchResponse.getHits()) {
if (searchResponse.getHits().getHits().length <= 0) {
return null;
}
String sourceAsString = hit.getSourceAsString();
if (sourceAsString != null) {
ObjectMapper mapper = new ObjectMapper();
Convention convention = null;
try {
convention = mapper.readValue(sourceAsString, Convention.class);
} catch (IOException e) {
LOGGER.error("Error", e);
}
conventions.add(convention);
}
}
return new PageImpl<>(conventions);
}
http://localhost:8081/api/conventions?page=0&size=20&sort=shortname,DESC
When I call this API, I get TotalElements=20, Number=0, TotalPages=1, and Size=0.
@GetMapping("/conventions")
public ResponseEntity<List<Convention>> getAllConventions(final Pageable pageable) throws UnknownHostException {
final Page<Convention> page = conventionService.findAll(pageable);
System.out.println("-------------- 1:" + page.getTotalElements()); // 20
System.out.println("-------------- 2:" + page.getNumber()); // 0
System.out.println("-------------- 3:" + page.getTotalPages()); // 1
System.out.println("-------------- 4:" + page.getSize()); // 0
HttpHeaders headers = new HttpHeaders();
headers.add("X-Total-Count", Long.toString(page.getTotalElements()));
return new ResponseEntity<>(page.getContent(), headers, HttpStatus.OK);
}
This issue has been fixed in the current stable version of spring-data-elasticsearch, 3.0.7.
See https://jira.spring.io/browse/DATAES-402
I think it comes from this line: return new PageImpl<>(conventions);
Maybe you should pass in the total number of hits from the response, because this constructor takes the total from the size of the list you give it.
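To make the arithmetic concrete: the page count follows from the total number of matching documents and the page size, not from the number of items on the current page. The sketch below uses a hypothetical helper, PageMath.totalPages, to show the calculation; treating a page size of 0 as a single page is an assumption chosen to match the Size=0, TotalPages=1 output reported in the question. In the service itself, the three-argument constructor new PageImpl<>(content, pageable, total) is the place to supply the total hit count from the search response.

```java
final class PageMath {

    /** Ceiling division of totalElements by pageSize; a page size of 0 is treated as one page. */
    static int totalPages(long totalElements, int pageSize) {
        if (pageSize <= 0) {
            return 1; // assumption: mirrors the Size=0, TotalPages=1 result seen in the question
        }
        return (int) ((totalElements + pageSize - 1) / pageSize);
    }

    public static void main(String[] args) {
        System.out.println(totalPages(107, 20)); // prints 6
        System.out.println(totalPages(20, 0));   // prints 1
    }
}
```

With only the 20 hits of the current page and no page size, the result is 1 page, which is what the question observed; with the real total of 107 and a size of 20, it is 6.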

Efficient way to get only _ids in ElasticSearch - Java

Is this the most efficient way to retrieve only ids from ElasticSearch?
requestBuilder.setQuery(queryBuilder);
requestBuilder.setFrom(start);
requestBuilder.setSize(limit);
requestBuilder.setFetchSource(false);
SearchResponse response = requestBuilder.execute().actionGet();
SearchHit[] hits = response.getHits().getHits();
List<Long> refugeeIds = new ArrayList<>();
for (SearchHit hit : hits) {
if (hit.getId() != null) {
refugeeIds.add(Long.parseLong(hit.getId().toString()));
}
}
That should be the best way. You don't return the _source and ES will only return the _type, _index, _score and _id.
