Spring Data Mongo - Query methods and Distinct field - java

I'm currently working on a project using Spring Data Mongo.
My repository is just an interface extending MongoRepository. I would like to add a custom query method in order to retrieve all distinct values for one of my collection's fields.
I tried something like this:
@RepositoryRestResource(path = "devices", collectionResourceRel = "deviceInfos")
public interface DeviceInfoRepository extends MongoRepository<DeviceInfo, String> {
    @RestResource(path = "distinctUnitIds")
    List<String> findDistinctUnitIdBy();
}
With that code, Spring gives me an error because it's not able to build my list. So I tried this:
@RepositoryRestResource(path = "devices", collectionResourceRel = "deviceInfos")
public interface DeviceInfoRepository extends MongoRepository<DeviceInfo, String> {
    @RestResource(path = "distinctUnitIds")
    List<DeviceInfo> findDistinctUnitIdBy();
}
That code works but the distinct seems to be totally ignored.
The documentation about Distinct in query methods is really not clear...
Did I do something wrong? What's the best way to get the distinct values of a field using Spring Data?
Thanks!

You will have to use Spring Data's MongoTemplate. The MongoRepository interfaces are made only for basic functionality; for more fine-grained control over what you are querying, it's best to use MongoTemplate.
Here is an example of how one would get distinct values from a collection:
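// Criteria.where(...) is a static factory, so its result has to be kept
Criteria criteria = Criteria.where("dataset").is("d1");
Query query = new Query();
query.addCriteria(criteria);
// distinct(field, filter) on the underlying driver collection
List list = mongoTemplate.getCollection("collectionName")
        .distinct("source", query.getQueryObject());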
Here is the link to more info: mongodb mongoTemplate get distinct field with some criteria

In Spring Boot 2 you can do the following:
DistinctIterable<String> iterable = mongoTemplate.getCollection(COLLECTION_NAME).distinct("source", query.getQueryObject(), String.class);
MongoCursor<String> cursor = iterable.iterator();
List<String> list = new ArrayList<>();
while (cursor.hasNext()) {
list.add(cursor.next());
}
return list;
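If loading the documents through the repository is acceptable (for example, for a small collection), the distinct values can also be derived in memory from the List<DeviceInfo> that the second repository method in the question already returns. This is only a minimal sketch of that fallback, not a replacement for a server-side distinct: the DeviceInfo class below is a hypothetical stand-in with a single unitId field, and the helper name distinctUnitIds is an assumption.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-in for the real DeviceInfo document class.
class DeviceInfo {
    private final String unitId;

    DeviceInfo(String unitId) {
        this.unitId = unitId;
    }

    String getUnitId() {
        return unitId;
    }
}

class DistinctUnitIds {
    // Deduplicates unit IDs in memory, keeping the first-seen order.
    static List<String> distinctUnitIds(List<DeviceInfo> devices) {
        return devices.stream()
                .map(DeviceInfo::getUnitId)
                .distinct()
                .collect(Collectors.toList());
    }
}
```

Every document is still transferred from MongoDB, so this trades network and memory cost for simplicity.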


How to know the missing items from Spring Data JPA's findAllById method in an efficient way?

Consider this code snippet below:
List<String> usersList = Arrays.asList("john", "jack", "jill", "xxxx", "yyyy");
List<User> userEntities = userRepo.findAllById(usersList);
The User class is a simple entity annotated with @Entity and has an @Id field of type String.
Assume that in the db I have rows corresponding to "john", "jack" and "jill". Even though I passed 5 items in usersList (including "xxxx" and "yyyy"), the findAllById method would only return the 3 entities corresponding to "john", "jack" and "jill".
Now, after the call to findAllById, what's the best, easiest and most efficient way (better than O(n^2), perhaps) to find out which items findAllById did not return? (In this case, it would be "xxxx" and "yyyy".)
Using Java Sets
You could use a set as the source of filtering:
Set<String> usersSet = new HashSet<>(Arrays.asList("john", "jack", "jill", "xxxx", "yyyy"));
And now you could collect the IDs that were actually found:
Set<String> foundIds = userRepo.findAllById(usersSet)
.stream()
.map(User::getId)
.collect(Collectors.toSet());
I assume the filter should be O(n) to go over the entire results.
Or you could change your repository to return a set of users ideally using a form of distinct clause:
Set<String> foundIds = userRepo.findDistinctById(usersSet)
.stream()
.map(User::getId)
.collect(Collectors.toSet());
And then you can just apply a set operator:
usersSet.removeAll(foundIds);
And now usersSet contains the users not found in your result.
And a HashSet has an O(1) complexity to find an item. So, I assume this should be O(sizeOf(usersSet)) to remove them all.
Alternatively, you could iterate over the foundIds and gradually remove items from usersSet. Then you could short-circuit the loop in the event you realize that there are no more usersSet items to remove (i.e. the set is empty).
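The set-difference idea from this section can be tried without a database. The following is a minimal sketch under stated assumptions: the class and method names (MissingIds, missing) are made up for illustration, and the found list stands in for the IDs of the entities that findAllById returned.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class MissingIds {
    // Returns the requested IDs that do not appear among the found IDs.
    // Hash lookups keep this at O(requested + found) expected time.
    static Set<String> missing(Collection<String> requested, Collection<String> found) {
        Set<String> result = new LinkedHashSet<>(requested); // copy, so the input is not mutated
        result.removeAll(new HashSet<>(found));
        return result;
    }

    public static void main(String[] args) {
        List<String> requested = List.of("john", "jack", "jill", "xxxx", "yyyy");
        List<String> found = List.of("john", "jack", "jill"); // stand-in for the returned entity IDs
        System.out.println(missing(requested, found)); // prints [xxxx, yyyy]
    }
}
```

Copying into a new set leaves the caller's collection untouched, which matters if the original list of IDs is still needed afterwards.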
Filtering Directly from Database
Now to avoid all this, you can probably define a native query and run it in your JPA repository to retrieve only users from your list which didn't exist in the database. The query would be somewhat as follows that I did in PostgreSQL:
WITH my_users AS(
SELECT 'john' AS id UNION SELECT 'jack' UNION SELECT 'jill'
)
SELECT id FROM my_users mu
WHERE NOT EXISTS(SELECT 1 FROM users u WHERE u.id = mu.id);
Spring Data: JDBC Example
Since the query is dynamic (i.e. the filtering set could be of different sizes every time), we need to build the query dynamically. And I don't believe JPA has a way to do this, but a native query might do the trick.
You could either pack a JdbcTemplate query directly into your repository or use JPA native queries manually.
@Repository
public class UserRepository {
    private final JdbcTemplate jdbcTemplate;
    public UserRepository(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
    }
    public Set<String> getUserIdNotFound(Set<String> userIds) {
        // One "SELECT ? AS id" per requested ID, joined with UNION
        StringBuilder sql = new StringBuilder();
        for (String userId : userIds) {
            if (sql.length() > 0) {
                sql.append(" UNION ");
            }
            sql.append("SELECT ? AS id");
        }
        // %s (not %sql) is the placeholder for the generated sub-selects
        String query = String.format("WITH my_users AS (%s) ", sql) +
                "SELECT id FROM my_users mu WHERE NOT EXISTS(SELECT 1 FROM users u WHERE u.id = mu.id)";
        List<String> myUsers = jdbcTemplate.queryForList(query, userIds.toArray(), String.class);
        return new HashSet<>(myUsers);
    }
}
Then we just do:
Set<String> usersIds = Set.of("john", "jack", "jill", "xxxx", "yyyy");
Set<String> notFoundIds = userRepo.getUserIdNotFound(usersIds);
There is probably a way to do it with JPA native queries. Let me see if I can do one of those and put it in the answer later on.
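The string-building step of the JdbcTemplate approach can be isolated and checked without a database. The sketch below is an illustration only: the class name NotFoundQuery and the method build are hypothetical, and the table and column names (users, id) are simply the ones used in the query above.

```java
import java.util.Collections;

class NotFoundQuery {
    // Builds the CTE-based query with one "SELECT ? AS id" per requested ID.
    static String build(int idCount) {
        if (idCount < 1) {
            throw new IllegalArgumentException("at least one id is required");
        }
        String unions = String.join(" UNION ", Collections.nCopies(idCount, "SELECT ? AS id"));
        return "WITH my_users AS (" + unions + ") "
                + "SELECT id FROM my_users mu "
                + "WHERE NOT EXISTS (SELECT 1 FROM users u WHERE u.id = mu.id)";
    }

    public static void main(String[] args) {
        // Two placeholders, to be bound to two IDs by the caller
        System.out.println(build(2));
    }
}
```

Only the number of placeholders depends on the input; the IDs themselves are still bound as parameters, so the query text never contains user-supplied values.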
You can write your own algorithm that finds missing users. For example:
List<String> missing = new ArrayList<>(usersList);
for (User user : userEntities){
String userId = user.getId();
missing.remove(userId);
}
In the result you will have a list of user-ids that are missing:
"xxxx" and "yyyy"
You can just add a method to your repo:
findByIdNotIn(Collection<String> ids) and Spring will make the query:
See here:
https://docs.spring.io/spring-data/jpa/docs/current/reference/html/#jpa.query-methods
Note (from the docs):
In and NotIn also take any subclass of Collection as a parameter as well as arrays or varargs.

Dynamic search term SQL query with Spring JPA or QueryDSL

I am trying to learn QueryDSL in order to return the results of a single search term from Postgres:
@GetMapping("/product/contains/{matchingWords}")
public List<ProductModel> findByTitleContaining(@PathVariable String matchingWords) {
    QProductModel product = QProductModel.productModel;
    JPAQuery<?> query = new JPAQuery<>(em);
    List<ProductModel> products = query.select(product)
            .from(product)
            .where(product.title.toLowerCase()
                    .contains(matchingWords.toLowerCase()))
            .fetch();
    return products;
}
But I also want to search for any number of search terms, for example, say this is my list of search terms divided by the plus symbol:
String[] params = matchingWords.split("[+]");
How can I dynamically create contains(params[0]) AND/OR contains(params[1]) AND/OR ... contains(params[n]) using either QueryDSL or any Java/Spring query framework? I see QueryDSL has a predicate system, but I still don't understand how to dynamically create a query based on a variable number of parameters searching in the same column.
I figured it out. It's a little non-intuitive, but using BooleanBuilder and JPAQuery<?> you can create a dynamic series of boolean predicates, which return a list of matches.
Example:
QProductModel product = QProductModel.productModel;
JPAQuery<?> query = new JPAQuery<>(em); // em is your EntityManager
// example list of search parameters separated by +
String[] params = matchingWords.split("[+]");
BooleanBuilder builder = new BooleanBuilder();
for(String param : params) {
builder.or(product.title.containsIgnoreCase(param));
}
// then you can put together the query like so:
List<ProductModel> products = query.select(product)
.from(product)
.where(builder)
.fetch();
return products;
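The loop that accumulates OR-ed predicates can be illustrated in plain Java, without QueryDSL or a database. This is only an in-memory analogue of the BooleanBuilder pattern, using java.util.function.Predicate; the class and method names (TitleSearch, anyTermMatches, filter) are hypothetical.

```java
import java.util.List;
import java.util.Locale;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class TitleSearch {
    // Builds one predicate that is true when a title contains ANY of the "+"-separated terms.
    static Predicate<String> anyTermMatches(String matchingWords) {
        Predicate<String> combined = title -> false; // neutral element for OR
        for (String param : matchingWords.split("[+]")) {
            String needle = param.toLowerCase(Locale.ROOT);
            combined = combined.or(title -> title.toLowerCase(Locale.ROOT).contains(needle));
        }
        return combined;
    }

    // Keeps only the titles that match at least one search term.
    static List<String> filter(List<String> titles, String matchingWords) {
        return titles.stream()
                .filter(anyTermMatches(matchingWords))
                .collect(Collectors.toList());
    }
}
```

Starting from an always-false predicate and switching or(...) to and(...) with an always-true start gives the "all terms must match" variant.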

Spring mongo repository slice

I am using spring-data-mongodb 1.8.2 with MongoRepository and I am trying to use the Mongo $slice option to limit a list's size when querying, but I can't find this option in MongoRepository.
My classes look like this:
public class InnerField{
public String a;
public String b;
public int n;
}
@Document(collection="Record")
public class Record{
public ObjectId id;
public List<InnerField> fields;
public int numer;
}
As you can see, I have one collection named "Record" and each document contains a list of InnerField. The InnerField list is growing all the time, so I want to limit the number of selected fields when I am querying.
I saw that: https://docs.mongodb.org/v3.0/tutorial/project-fields-from-query-results/
which is exactly what I need but I couldn't find the relevant reference in mongorepository.
Any ideas?
Providing an abstraction for the $slice operator in Query is still an open issue. Please vote for DATAMONGO-1230 and help us prioritize.
For now you still can fall back to using BasicQuery.
String qry = "{ \"_id\" : \"record-id\"}";
String fields = "{\"fields\": { \"$slice\": 2} }";
BasicQuery query = new BasicQuery(qry, fields);
Use the slice functionality that the MongoDB Java driver provides through projections, as in the code below.
For example:
List<Entity> entities = new ArrayList<Entity>();
// Return only the last 10 weeks of data; db is assumed to be a MongoDatabase
FindIterable<Document> documents = db.getCollection("COLLECTION").find()
        .projection(Projections.fields(Projections.slice("count", -10)));
MongoCursor<Document> cursor = documents.iterator();
while (cursor.hasNext()) {
    entities.add(new Gson().fromJson(cursor.next().toJson(), Entity.class));
}
The above query fetches all documents of the Entity class, and the "count" list of each document will contain only its last 10 records.
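For readers unsure what the slice argument means, the single-argument form keeps the first n elements for a positive n and the last |n| elements for a negative n. The following plain-Java sketch mimics that behaviour on an in-memory list; it is an illustration of the semantics only, and the names SliceSemantics and slice are hypothetical.

```java
import java.util.List;

class SliceSemantics {
    // Mimics {"$slice": n}: n > 0 keeps the first n elements, n < 0 keeps the last |n|.
    // If |n| exceeds the list size, the whole list is returned.
    static <T> List<T> slice(List<T> items, int n) {
        int size = items.size();
        if (n >= 0) {
            return items.subList(0, Math.min(n, size));
        }
        // long arithmetic avoids overflow for Integer.MIN_VALUE
        int from = (int) Math.max(0L, size + (long) n);
        return items.subList(from, size);
    }
}
```

With a list of five weekly values, slice(values, 2) corresponds to the "$slice": 2 projection and slice(values, -10) to Projections.slice("count", -10), which here returns all five values because fewer than ten exist.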
I found a way to use slice in a unit test file (DATAMONGO-1457). Something like this:
newAggregation(
UserWithLikes.class,
match(new Criteria()),
project().and("likes").slice(2)
);

Getting count and list of ids using ElasticsearchTemplate in spring-data-elasticsearch

I am using spring-data-elasticsearch for a project to provide it with full text search functionality. We keep the real data in a relational database and relevant metadata along with respective id in elasticsearch. So for search results, only id field is required as the actual data will be retrieved from the relational database.
I am building the search query based on search criteria and then performing a queryForIds():
SearchQuery searchQuery = new NativeSearchQueryBuilder()
.withIndices(indexName)
.withTypes(typeName)
.withQuery(getQueryBuilder(searchParams))
.withPageable(pageable)
.build();
return elasticsearchTemplate.queryForIds(searchQuery);
If I also need the total count for that specific searchQuery, I can do another elasticsearchTemplate.count(searchQuery) call, but that will be redundant as I understand. I think there is a way to get both the list of id and total count by using something like elasticsearchTemplate.queryForPage() in a single call.
Also, can I use a custom class in queryForPage(SearchQuery query, Class<T> clazz, SearchResultMapper mapper) which is not annotated with @Document? The actual document class is really big, and I am not sure whether passing large classes will cause extra load on the engine, since there are over 100 fields to be JSON-mapped while all I need is the id field. I will have a .withFields("id") in the query builder anyway.
If you want to avoid two calls to Elasticsearch, I would suggest writing a custom ResultsExtractor:
SearchQuery searchQuery = new NativeSearchQueryBuilder().withIndices(indexName)
.withTypes(typeName)
.withQuery(queryBuilder)
.withPageable(pageable)
.build();
SearchResult result = template.query(searchQuery, new ResultsExtractor<SearchResult>() {
    @Override
    public SearchResult extract(SearchResponse response) {
        long totalHits = response.getHits()
                .totalHits();
        List<String> ids = new ArrayList<String>();
        for (SearchHit hit : response.getHits()) {
            if (hit != null) {
                ids.add(hit.getId());
            }
        }
        return new SearchResult(ids, totalHits);
    }
});
System.out.println(result.getIds());
System.out.println(result.getCount());
where SearchResult is a custom class:
public class SearchResult {
List<String> ids;
long count;
//getter and setter
}
This way you can get the information you need from the elasticsearch SearchResponse.
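The SearchResult holder is only outlined above ("getter and setter"). A possible complete version, matching the new SearchResult(ids, totalHits), getIds() and getCount() calls in the extractor, might look like the sketch below. Making it immutable with a defensive copy is a design assumption of this sketch, not something the answer prescribes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Simple holder for the IDs of the current page plus the total hit count.
class SearchResult {
    private final List<String> ids;
    private final long count;

    SearchResult(List<String> ids, long count) {
        // Defensive copy, so later changes to the caller's list do not leak in
        this.ids = Collections.unmodifiableList(new ArrayList<>(ids));
        this.count = count;
    }

    List<String> getIds() {
        return ids;
    }

    long getCount() {
        return count;
    }
}
```

Note that count is the total number of hits for the query, while ids only contains the hits of the requested page, so the two sizes will usually differ.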
Regarding your second question: As far as I can see, when calling queryForPage(SearchQuery query, Class<T> clazz, SearchResultMapper mapper)
the passed class is not checked for the @Document annotation. Just try it out!
One may also consider using AggregatedPage<T>. You can get the total number of records, total pages, current page records, etc. just like with Page<T>.
SearchQuery searchQuery = new NativeSearchQueryBuilder().withIndices(indexName)
.withTypes(typeName)
.withQuery(queryBuilder)
.withPageable(pageable)
.build();
AggregatedPage<ElasticDTO> queryResult = elasticsearchTemplate.queryForPage(searchQuery , ElasticDTO.class);

Spring MongoDB query sorting

I'm fairly new to MongoDB and I'm trying to build a sorted MongoDB query, but Spring Data MongoDB's sort method is deprecated. So I used org.springframework.data.domain.Sort:
Query query = new Query();
query.with(new Sort(Sort.Direction.ASC,"pdate"));
return mongoTemplate.find(query, Product.class);
I used this code block, but it's not sorting the data. Can you suggest a method that works for this?
You can define your sort in this manner to ignore case:
new Sort(new Order(Direction.ASC, FIELD_NAME).ignoreCase())
NEW ANSWER - Spring Data Moore
Use Sort.by
Query().addCriteria(Criteria.where("field").`is`(value)).with(Sort.by(Sort.Direction.DESC, "sortField"))
When you've written a custom query in your repository, you can pass the sort at invocation time. For example:
Repository
@Query("{ 'id' : ?0}")
List<Student> findStudent(String id, Sort sort);
During invocation
Sort sort = new Sort(Sort.Direction.ASC, "date");
List<Student> students = studentRepo.findStudent("1", sort);
I hope this helps! :)
query.with(new Sort(Sort.Direction.ASC, "timestamp"));
Remember that MongoDB's own sort parameter takes a field name with 1 or -1 to specify an ascending or descending sort, respectively.
This is how you sort by a field value in descending order.
My field value is "nominationTimestamp", but for you it could be "firstName" for example.
List<Movie> result = myMovieRepository.findAll(Sort.by(Sort.Direction.DESC, "nominationTimestamp"));
myMovieRepository is an instance of whatever class extends MongoRepository<>.
You can use aggregation for sorting your data. You have to use matching and grouping criteria for aggregation query and unwind your field.
AggregationOperation match = Aggregation.match(matching criteria);
AggregationOperation group = Aggregation.group("fieldname");
AggregationOperation sort = Aggregation.sort(Sort.Direction.ASC, "fieldname");
Aggregation aggregation = Aggregation.newAggregation(Aggregation.unwind("fieldname"),match,group,sort);
This one worked for me:
query.with(Sort.by(Sort.Order.asc("pdate")));
spring-data-commons version 2.3.5
The Sort constructor is private, so:
Query query = new Query();
query.with(Sort.by(Sort.Direction.ASC,"pdate"));
return mongoTemplate.find(query, Product.class);
You can use Aggregation in repository
Repository
@Aggregation(pipeline = {
    "{$match: { id : ?0 }}",
    "{$sort: {date: 1}}",
    "{$limit: 1}"
})
Optional<Student> findStudent(String id);
I'm using TypedAggregation with mongoTemplate in spring data to sort and limit the results set.
import static org.springframework.data.mongodb.core.aggregation.Aggregation.limit;
import static org.springframework.data.mongodb.core.aggregation.Aggregation.match;
import static org.springframework.data.mongodb.core.aggregation.Aggregation.newAggregation;
import static org.springframework.data.mongodb.core.aggregation.Aggregation.project;
import static org.springframework.data.mongodb.core.aggregation.Aggregation.sort;
import org.springframework.data.domain.Sort.Direction;
TypedAggregation<ObjectType> agg = newAggregation(ObjectType.class,
match(matching Criteria),
project("_id", ...),
sort(Direction.ASC, "_id"),
limit(pageSize));
List<RESULT_OBJECT> mappedResult = mongoTemplate.aggregate(agg, COLLECTION_NAME, RESULT_OBJECT.class).getMappedResults();
