I want to search the contents of a forum, specifically the forum questions.
For example:
searchString = "Hibernate Session Configuration";
should return the matching forum questions.
The words need not be consecutive in the forum content, so I store the search string in a java.util.Set containing each word:
String[] searchArray= searchString.toLowerCase().split(" ");
Set<String> searchSet = new HashSet<String>();
// searchSet contains words of searchString
for(String string : searchArray){
searchSet.add(string);
}
I wrote the Hibernate query as:
DetachedCriteria detachedCriteria = DetachedCriteria.forClass(ForumQuestion.class);
for(String searchUnitString : searchSet)
{
detachedCriteria= detachedCriteria.add(Restrictions.disjunction().add(Restrictions.ilike("forumQuestion", "%"+searchUnitString+"%")));
}
return template.findByCriteria(detachedCriteria);
But this query does not work properly: it applies only the last restriction and ignores the previous ones.
In this example it only searches for '%Configuration%', but I need '%Hibernate%' or '%Session%' or '%Configuration%' together.
Note: if I run one query per word, the number of database hits will be high.
You're not adding a disjunction. You're adding N disjunctions containing only one restriction each, and criteria added separately are combined with AND. The code should be:
DetachedCriteria detachedCriteria = DetachedCriteria.forClass(ForumQuestion.class);
Disjunction disjunction = Restrictions.disjunction();
for (String searchUnitString : searchSet) {
disjunction.add(Restrictions.ilike("forumQuestion", "%"+searchUnitString+"%"));
}
detachedCriteria.add(disjunction);
return template.findByCriteria(detachedCriteria);
Note that unless you have few questions in your forum, these searches will be slow, because a LIKE pattern with a leading % cannot use an ordinary index. SQL queries are not the best way to handle full-text search. I would look at Lucene (which Hibernate Search uses, BTW) for such a task.
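One detail worth guarding against before building the disjunction: splitting on a single space yields empty strings when the input contains consecutive spaces, and an empty word becomes the pattern '%%', which matches every row. Below is a minimal sketch of a more defensive tokenizer. SearchTerms and tokenize are hypothetical names used for illustration, not part of Hibernate.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Hibernate.
class SearchTerms {
    /** Splits on any run of whitespace, lowercases, and drops empty words and duplicates. */
    static Set<String> tokenize(String searchString) {
        if (searchString == null) {
            return new LinkedHashSet<>();
        }
        return Arrays.stream(searchString.trim().toLowerCase(Locale.ROOT).split("\\s+"))
                .filter(word -> !word.isEmpty()) // "".split(...) yields one empty string
                .collect(Collectors.toCollection(LinkedHashSet::new));
    }
}
```

The resulting set can be used in place of searchSet in the loop that fills the disjunction.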
I am playing around with the Criteria API but having trouble achieving the following:
I want to compare a lowercased search term with lowercased values from the database.
Let's say I want to find some people in my database.
The following code builds up a predicate with a certain key (e.g. "firstName") and value (e.g. "John").
Here is a simplified version of what I want to achieve:
CriteriaBuilder builder = entityManager.getCriteriaBuilder();
CriteriaQuery<People> query = builder.createQuery(People.class);
Root<People> root = query.from(People.class);
Predicate predicate = builder.conjunction();
predicate = builder.and(predicate,
builder.like(root.get(key), ("%"+ value +"%").toLowerCase() ) );
Later I say that I want all people matching these criteria:
query.where(predicate);
List<People> result = entityManager.createQuery(query).getResultList();
The problem is that I also want to lowercase the people's firstName in the database, so that I can find all 'John's with a search term of:
'JOHN'
'john'
'Oh'
and so on
Thanks for help!
Cheers
EDIT:
Thanks to the answer of @Neil, I finalised the solution to:
// key and value are whitespace trimmed
predicate = builder.and(predicate,
builder.like(builder.lower(root.get(key)),
builder.lower(builder.literal("%"+ value +"%"))));
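If the search value can contain the LIKE wildcards % or _, they are treated as wildcards rather than literal characters. Here is a rough sketch of a helper that escapes them before wrapping the value in %...%. LikePatterns, containsPattern and the choice of backslash as the escape character are assumptions made for illustration. The escape character would then have to be passed to the three-argument builder.like(expression, pattern, escapeChar) overload.

```java
import java.util.Locale;

// Hypothetical helper; not part of JPA.
class LikePatterns {
    // Assumed escape character; it must match the one handed to builder.like(...).
    static final char ESCAPE = '\\';

    /** Builds a lowercased "contains" pattern with %, _ and the escape character escaped. */
    static String containsPattern(String value) {
        StringBuilder sb = new StringBuilder("%");
        for (char c : value.trim().toLowerCase(Locale.ROOT).toCharArray()) {
            if (c == '%' || c == '_' || c == ESCAPE) {
                sb.append(ESCAPE);
            }
            sb.append(c);
        }
        return sb.append('%').toString();
    }
}
```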
I need to determine which part of a Lucene BooleanQuery failed if the entire query returns no results.
I'm using a BooleanQuery made up of 4 NumericRangeQueries and a PhraseQuery. Each is added to the query with Occur.MUST.
If I don't get any results for a query, is there a way to tell which part of the query failed to match anything? Do I need to run queries individually and compare results to get the one that failed?
Edit - Added PhraseQuery code.
if( row.getPropertykey_tx() != null && !row.getPropertykey_tx().trim().isEmpty()){
PhraseQuery pQuery = new PhraseQuery();
String[] words = row.getPropertykey_tx().trim().split(" ");
for( String word : words ){
pQuery.add(new Term(TitleRecordColumns.SA_SITE_ADDR.toString(), word));
}
pQuery.setSlop(2);
topBQuery.add(pQuery, BooleanClause.Occur.MUST);
}
Running individual parts of the query is probably the simplest approach, to my mind.
Another available tool is getting an Explanation. You can call IndexSearcher.explain to get an Explanation of the scoring for the query against a particular document. If you can provide the docid of a document you believe should match the query, you can analyze Explanation.toString (or toHtml, if you prefer) to determine which subqueries are not matching against it.
If you want to automatically keep a record of which clause of a BooleanQuery doesn't produce results, I believe you will need to run each query independently. If you no longer have access to the subqueries used to create it, you can get the clauses from it instead:
void findTroublesomeQuery(BooleanQuery query) {
    for (BooleanClause clause : query.clauses()) {
        Query subquery = clause.getQuery();
        // searchHoweverYouDo and log are placeholders for your own code.
        TopDocs docs = searchHoweverYouDo(subquery);
        // In Lucene 8+, totalHits is an object; compare docs.totalHits.value instead.
        if (docs.totalHits == 0) {
            // If you want to dig down recursively...
            if (subquery instanceof BooleanQuery)
                findTroublesomeQuery((BooleanQuery) subquery);
            else
                log(subquery); // Or do whatever you want to keep track of it.
        }
    }
}
DisjunctionMaxQuery is a commonly used query that wraps multiple subqueries as well, so might be worth considering for this sort of approach.
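Running each clause on its own finds clauses with zero hits, but it is also possible that every MUST clause matches something and only their intersection is empty. A plain-Java sketch of that second check follows, assuming you have already collected the matching doc ids per clause. ClauseDiagnostics, firstEmptyingClause and the clause names are hypothetical, and no Lucene API is used here.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustration only: works on doc-id sets gathered from your own per-clause searches.
class ClauseDiagnostics {
    /**
     * Intersects the hit sets in iteration order and returns the name of the
     * clause at which the running intersection becomes empty, or null if the
     * conjunction still has hits. Pass a LinkedHashMap to control the order.
     */
    static String firstEmptyingClause(Map<String, Set<Integer>> hitsPerClause) {
        Set<Integer> running = null;
        for (Map.Entry<String, Set<Integer>> entry : hitsPerClause.entrySet()) {
            if (running == null) {
                running = new HashSet<>(entry.getValue());
            } else {
                running.retainAll(entry.getValue());
            }
            if (running.isEmpty()) {
                return entry.getKey();
            }
        }
        return null;
    }
}
```

The reported clause depends on the order in which the clauses are intersected, so it indicates where the conjunction breaks rather than a single definitive culprit.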
I have integrated the hibernate search 3.1.1 with my existing application with Spring 2.5 and Hibernate core 3.3.2 GA. With hibernate-search-3.1.1, I am using apache lucene 2.4.1.
The problem I am facing is that when I search for a single word or for multiple words in order, it searches perfectly and returns the result, but when I search for multiple words out of order, separated by spaces, it does not return any result. For example, if I have a text indexed as
"Hello great world!"
and I search for "Hello" or "great world", it returns the result successfully, but if I search for "world Hello", it returns no result.
What I want is to return a result if any of the complete or partial words match the indexed text. My source code is as below:
FullTextEntityManager fullTextEntityManager = Search.getFullTextEntityManager(this.entityManager);
// create native Lucene query
String[] fields = new String[] { "text", "description", "standard.title", "standard.briefPurpose", "standard.name" };
MultiFieldQueryParser parser = new MultiFieldQueryParser(fields, new StandardAnalyzer());
org.apache.lucene.search.Query query = null;
try {
query = parser.parse(searchTerm);
} catch (ParseException e) {
e.printStackTrace();
}
// wrap Lucene query in a javax.persistence.Query
FullTextQuery persistenceQuery = fullTextEntityManager.createFullTextQuery(query, Requirement.class);
// execute search
@SuppressWarnings("unchecked")
List<Requirement> result = persistenceQuery.getResultList();
return result;
Please help if I need to add anything to support what I want.
I know it's a very old question, but this short description may be helpful to others.
You should use the analyze = Analyze.YES attribute with the @Field annotation in your POJO class. It divides the string value of that field into tokens, i.e. into single words, so you can search in any sequence. Note, however, that you have to enter whole words. For example, 'United States of America' can be found with 'America', 'States', 'States United', 'America United', etc., but Hibernate Search requires whole words: 'Uni' alone will not work.
EDIT:
For older versions of the Hibernate Search API:
One has to use Index.TOKENIZED instead of analyze = Analyze.YES with the @Field annotation.
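To illustrate why a tokenized field matches words in any order but not word fragments, here is a small plain-Java model. It only imitates the idea (lowercase, split into words, compare whole words) and is not how Lucene's analyzers are actually implemented. TokenMatchDemo and its methods are hypothetical names.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Plain-Java illustration only; the real work is done by Lucene's analyzers.
class TokenMatchDemo {
    /** Lowercases the text and splits it on anything that is not a letter or digit. */
    static Set<String> tokens(String text) {
        Set<String> result = new HashSet<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                result.add(token);
            }
        }
        return result;
    }

    /** True if at least one whole query word occurs among the indexed tokens. */
    static boolean matchesAny(String indexedText, String query) {
        Set<String> indexed = tokens(indexedText);
        for (String word : tokens(query)) {
            if (indexed.contains(word)) {
                return true;
            }
        }
        return false;
    }
}
```

In this model "world Hello" matches "Hello great world!" because word order plays no role, while "Uni" does not match "United States of America" because only whole tokens are compared.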
Have you tried to use PhraseQuery.setSlop(int)? This should allow word reordering. Check out the Javadoc for more information.
I'm learning the Hibernate Search Query DSL, and I'm not sure how to construct queries using boolean arguments such as AND or OR.
For example, let's say that I want to return all person records that have a firstName value of "bill" or "bob".
Following the hibernate docs, one example uses the bool() method w/ two subqueries, such as:
QueryBuilder b = fts.getSearchFactory().buildQueryBuilder().forEntity(Person.class).get();
Query luceneQuery = b.bool()
.should(b.keyword().onField("firstName").matching("bill").createQuery())
.should(b.keyword().onField("firstName").matching("bob").createQuery())
.createQuery();
logger.debug("query 1:{}", luceneQuery.toString());
This ultimately produces the Lucene query that I want, but is this the proper way to use boolean logic with Hibernate Search? Is "should()" the equivalent of "OR" (and, similarly, does "must()" correspond to "AND")?
Also, writing a query this way feels cumbersome. For example, what if I had a collection of firstNames to match against? Is this type of query a good match for the DSL in the first place?
Yes, your example is correct. The boolean operators are called should instead of OR because of the names they have in the Lucene API and documentation, and because it is more appropriate: a should clause not only influences a boolean decision, it also affects the scoring of the result.
For example, if you search for cars "of brand Fiat" OR "blue", the cars that are branded Fiat AND blue will also be returned, with a higher score than those which are blue but not Fiat.
It might feel cumbersome because it's programmatic and provides many detailed options. A simpler alternative is to use a simple string for your query and use the QueryParser to create the query. Generally the parser is useful for parsing user input, while the programmatic API is easier to use with well-defined fields; for example, if you have the collection you mentioned, it's easy to build the query in a for loop.
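For the query-string route, the OR expression for a collection of values can be assembled with plain string handling before it is handed to the parser. This is a minimal sketch under the assumption that the values are single plain words such as "bill" and "bob"; anything containing query-syntax characters would need escaping first, for example with Lucene's QueryParser.escape. OrQueryStrings and anyOf are hypothetical names.

```java
import java.util.Collection;
import java.util.stream.Collectors;

// Hypothetical helper; it only builds the text that a query parser would receive.
class OrQueryStrings {
    /** Joins field:value pairs with OR, e.g. firstName:bill OR firstName:bob. */
    static String anyOf(String field, Collection<String> values) {
        return values.stream()
                .map(value -> field + ":" + value) // assumes value needs no escaping
                .collect(Collectors.joining(" OR "));
    }
}
```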
You can also use BooleanQuery. I prefer this because you can build it in a loop over a list.
org.hibernate.search.FullTextQuery hibque = null;
org.apache.lucene.search.BooleanQuery bquery = new BooleanQuery();
QueryBuilder qb = fulltextsession.getSearchFactory().buildQueryBuilder()
.forEntity(entity.getClass()).get();
for (String keyword : list) {
bquery.add(qb.keyword().wildcard().onField(entityColumn).matching(keyword)
.createQuery() , BooleanClause.Occur.SHOULD);
}
if (!filterColumn.equals("") && !filterValue.equals("")) {
bquery.add(qb.keyword().wildcard().onField(filterColumn).matching(filterValue).createQuery()
, BooleanClause.Occur.MUST);
}
hibque = fulltextsession.createFullTextQuery(bquery, entity.getClass());
int num = hibque.getResultSize();
To answer your secondary question:
For example, what if I had a collection of firstNames to match against?
I'm not an expert, but according to (the third example from the end of) 5.1.2.1. Keyword queries in Hibernate Search Documentation, you should be able to build the query like so:
Collection<String> namesCollection = getNames(); // Contains "billy" and "bob", for example
StringBuilder names = new StringBuilder(100);
for(String name : namesCollection) {
names.append(name).append(" "); // Never mind the space at the end of the resulting string.
}
QueryBuilder b = fts.getSearchFactory().buildQueryBuilder().forEntity(Person.class).get();
Query luceneQuery = b.bool()
.should(
// Searches for multiple possible values in the same field
b.keyword().onField("firstName").matching( names.toString() ).createQuery()
)
.must(b.keyword().onField("lastName").matching("thornton").createQuery())
.createQuery();
and get, as a result, Persons with (firstName preferably "billy" or "bob") AND (lastName = "thornton"), although I don't think it will give good ol' Billy Bob Thornton a higher score ;-).
I was looking into the same topic, but my problem is somewhat different from the one presented. I was looking for an actual OR junction. The should case didn't work for me, because results that matched neither of the two expressions were still returned, only with a lower score. I wanted to omit these results completely. You can, however, create an actual boolean OR expression by using a separate boolean expression for which you disable scoring:
val booleanQuery = cb.bool();
val packSizeSubQuery = cb.bool();
packSizes.stream().map(packSize -> cb.phrase()
.onField(LUCENE_FIELD_PACK_SIZES)
.sentence(packSize.name())
.createQuery())
.forEach(packSizeSubQuery::should);
booleanQuery.must(packSizeSubQuery.createQuery()).disableScoring();
FullTextQuery persistenceQuery = fullTextEntityManager.createFullTextQuery(booleanQuery.createQuery(), Product.class);
return persistenceQuery.getResultList();
The title says it all: I want to do a multi-field phrase search in Lucene. How do I do it?
For example:
I have fields as String s[] = {"title","author","content"};
I want to search for harry potter across all fields. How do I do it?
Can someone please provide an example snippet?
Use MultiFieldQueryParser; it's a QueryParser that constructs queries to search multiple fields.
Another way is to create a BooleanQuery consisting of TermQuery clauses (in your case, PhraseQuery clauses).
A third way is to include the content of the other fields in your default content field.
Add
Generally speaking, querying on multiple fields isn’t the best practice for user-entered queries. More commonly, all words you want searched are indexed into a contents or keywords field by combining various fields.
Update
Usage:
Query query = MultiFieldQueryParser.parse(Version.LUCENE_30, new String[] {"harry potter","harry potter","harry potter"}, new String[] {"title","author","content"},new SimpleAnalyzer());
IndexSearcher searcher = new IndexSearcher(...);
TopDocs hits = searcher.search(query, 10); // Hits was removed in Lucene 3.0; 10 is an arbitrary result limit
The MultiFieldQueryParser will resolve the query in this way: (See javadoc)
Parses a query which searches on the fields specified. If x fields are specified, this effectively constructs:
(field1:query1) (field2:query2) (field3:query3)...(fieldx:queryx)
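Because this static parse method pairs queries[i] with fields[i], both arrays need the same length. When the same text should be searched in every field, the queries array can be filled programmatically instead of repeating the literal. A small sketch follows; PerFieldQueries and repeat are hypothetical names.

```java
import java.util.Arrays;

// Hypothetical helper for building the queries array of the static parse method.
class PerFieldQueries {
    /** Returns an array holding the same query text once per field. */
    static String[] repeat(String query, int fieldCount) {
        String[] queries = new String[fieldCount];
        Arrays.fill(queries, query);
        return queries;
    }
}
```

With String[] fields = {"title", "author", "content"}, the first array argument would then be PerFieldQueries.repeat("harry potter", fields.length).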
Hope this helps.
Intensified googling revealed this:
http://lucene.472066.n3.nabble.com/Phrase-query-on-multiple-fields-td2292312.html.
Since it is the latest and best, I'll go with his approach, I guess. Nevertheless, it might help someone who is looking for the same thing I am.
You need to use MultiFieldQueryParser with the search string wrapped in double quotes, so that it is parsed as a phrase. I have tested it with Lucene 8.8.1 and it works like magic.
String queryStr = "harry potter";
queryStr = "\"" + queryStr.trim() + "\"";
Query query = new MultiFieldQueryParser(new String[]{"title","author","content"}, new StandardAnalyzer()).parse(queryStr);
System.out.println(query);
It will print:
(title:"harry potter") (author:"harry potter") (content:"harry potter")
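If the phrase comes from user input, it may itself contain double quotes or backslashes, which would end the phrase early or be read as escapes by the parser. A hedged sketch of a quoting helper that escapes those two characters first; PhraseQuotes and toPhrase are hypothetical names, and other query-syntax characters are deliberately left untouched.

```java
// Hypothetical helper; it only prepares the string passed to parse(...).
class PhraseQuotes {
    /** Trims the input, escapes backslashes and double quotes, then wraps it in double quotes. */
    static String toPhrase(String userInput) {
        String escaped = userInput.trim()
                .replace("\\", "\\\\")  // escape backslashes first
                .replace("\"", "\\\""); // then escape embedded double quotes
        return "\"" + escaped + "\"";
    }
}
```

In the snippet above, queryStr could then be built as PhraseQuotes.toPhrase("harry potter").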