Solr search not working properly

I am searching for the string Kansas City in the description field with
"q":"description: *Kansas City*", but I get results that match either Kansas or City. Results also come back from the content field, and I am not sure why. Please tell me if there is an error in my query.

Your quoting is wrong. Use a phrase query, for example:
description:"kansas city"
What are the stars for?

When the query is parsed, Kansas City is split into two tokens, "Kansas" and "City", and the filters from the field type definition are applied.
Each token is then searched in the field specified for it. Only the first token has a field name:
description:*Kansas
After word splitting, "City" becomes a separate term for which you didn't specify a field name, so it is searched in the default search field (which may be content in your case):
defaultsearchfield:City*
So after parsing, your query is effectively description:kansas and content:city. You can see this yourself by adding debugQuery=on to the request URL in your browser.
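
To illustrate the quoting point, here is a minimal sketch of building a field-scoped phrase query string in Java before sending it to Solr. The class and method names are hypothetical (they are not part of SolrJ), and the escaping assumes the standard query parser, where a backslash and a double quote are the characters that need escaping inside a quoted phrase.

```java
// Hypothetical helper, not part of SolrJ: builds a field-scoped phrase query.
final class SolrPhraseQuery {
    private SolrPhraseQuery() {}

    /** Returns field:"phrase", escaping backslashes and double quotes in the phrase. */
    static String phrase(String field, String phrase) {
        String escaped = phrase.replace("\\", "\\\\").replace("\"", "\\\"");
        return field + ":\"" + escaped + "\"";
    }

    public static void main(String[] args) {
        // Both words stay attached to the description field as one phrase.
        System.out.println(phrase("description", "Kansas City"));
        // prints: description:"Kansas City"
    }
}
```

Because the whole phrase sits inside the quotes, no part of it falls through to the default search field.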

Related

Spring data mongodb sort by a substring of a String field if it contains a specific character

I would like to query the database for a list of data ordered by name.
Some name fields contain a code, like this: [code] name.
So I want to sort the data by name only and ignore the code if it exists.
Example data:
[CODE1] John
Xavi
Arnold
[CODE 2] Ben
The order must be: Arnold, Ben, John, Xavi.
Is it possible?
My code now:
query.with(Sort.by(Sort.Direction.DESC, "name")).with(pageable);

Realm.io [java] search by combined field

I have a User model with name and surname properties, and I need a query that searches by name.
My code is now:
query
    .beginGroup()
    .contains("name", search, Case.INSENSITIVE)
    .or()
    .contains("surname", search, Case.INSENSITIVE)
    .endGroup()
    .findAll();
But if I want to find Jerry Smith and type "jerry smi", I won't get what I want, obviously because of the or. How should I do this?
I'm thinking of creating and maintaining a fullname field, set in the setters of name/surname, and searching by it. Is that a good path?

The easiest way would be to add a third field called fullName where you put name + surname. Then you'd only search like this:
query.contains("fullName", search, Case.INSENSITIVE).findAll();
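
A minimal sketch of how the derived field could be kept in sync from the setters, as the question proposes. This is plain Java so it runs on its own; in a real Realm model the class would extend RealmObject, and the field name fullName is an assumption chosen for illustration.

```java
// Plain-Java sketch. Assumption: in a Realm model this class would extend
// RealmObject; the fullName field name is illustrative.
class User {
    private String name = "";
    private String surname = "";
    private String fullName = ""; // derived: name + " " + surname

    public void setName(String name) {
        this.name = (name == null) ? "" : name;
        refreshFullName();
    }

    public void setSurname(String surname) {
        this.surname = (surname == null) ? "" : surname;
        refreshFullName();
    }

    public String getFullName() {
        return fullName;
    }

    // Recomputed on every change so the searchable field never goes stale.
    private void refreshFullName() {
        fullName = (name + " " + surname).trim();
    }
}
```

With name "Jerry" and surname "Smith", the derived value is "Jerry Smith", so a case-insensitive contains search for "jerry smi" on that one field would match.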

RealmResults<Event> toEdit = realm.where(Event.class)
    .equalTo("name", day)
    .equalTo("surname", month)
    .findAll();
Hope this helps! Cheers!

Using Lucene, how to index TXT files into different fields?

I am using the NSF data, which is in txt format. I have indexed the data and can send a query and get several results. But how can I search within a selected field (e.g. title)? All of the NSF data are plain txt files, so I do not think Lucene can recognize which part of a file is the "title" or anything else. Should I first convert the txt files to XML files (with tags telling Lucene which part is the "title")? Can Lucene do that? I have no idea how to split the txt files into several fields. Can anyone please give me some suggestions? Thanks a lot!
BTW, every txt file looks like this:
---begin---
Title: Mitochondrial DNA and Historical Demography
Type: Award
Date: August 1, 1991
Number: 9000006
Abstract: asdajsfhsjdfhsjngfdjnguwiehfrwiuefnjdnfsd
----end----

You have to split the text into its parts yourself. Use each resulting string to create a field for that part of the text, e.g. title.
Create your Lucene document with the fields like this:
Document doc = new Document();
// each field is indexed and tokenized, but its value is not stored
doc.add(new Field("title", titleString, Field.Store.NO, Field.Index.TOKENIZED));
doc.add(new Field("abstract", abstractString, Field.Store.NO, Field.Index.TOKENIZED));
and so on. After indexing the document you can search in the title like this: title:dna
More complex queries that mix multiple fields are also possible: +title:dna +abstract:"some example text" -number:935353
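
The splitting step itself could look roughly like the sketch below. It assumes each record follows the "Key: value" layout shown in the question, with ---begin--- and ----end---- marker lines, and that a line without a leading "Key:" continues the previous field. The class name NsfRecordParser is hypothetical, and a continuation line that happens to start with a word and a colon would be misread as a new field.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical parser for the record layout shown in the question.
final class NsfRecordParser {
    private NsfRecordParser() {}

    /** Splits one record into lower-cased field names mapped to their values. */
    static Map<String, String> parse(String text) {
        Map<String, String> fields = new LinkedHashMap<>();
        String currentKey = null;
        for (String line : text.split("\\R")) {
            String trimmed = line.trim();
            // Skip blank lines and the begin/end marker lines.
            if (trimmed.isEmpty() || trimmed.matches("-+(begin|end)-+")) {
                continue;
            }
            int colon = trimmed.indexOf(':');
            if (colon > 0 && trimmed.substring(0, colon).matches("[A-Za-z ]+")) {
                // "Key: value" line starts a new field.
                currentKey = trimmed.substring(0, colon).trim().toLowerCase(Locale.ROOT);
                fields.put(currentKey, trimmed.substring(colon + 1).trim());
            } else if (currentKey != null) {
                // Anything else is treated as a continuation of the previous field.
                fields.merge(currentKey, trimmed, (a, b) -> a.isEmpty() ? b : a + " " + b);
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        String record = "---begin---\nTitle: Mitochondrial DNA and Historical Demography\n"
                + "Type: Award\nDate: August 1, 1991\nNumber: 9000006\n----end----\n";
        parse(record).forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

Each map entry can then be passed to a Lucene field, for example the "title" value as titleString in the snippet above.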

How to retrieve the Field that "hit" in Lucene

Maybe I'm really missing something.
I have indexed a bunch of key/value pairs in Lucene (v4.1 if it matters). Say I have
key1=value1 and key2=value2, e.g. as read from a properties file.
They get indexed both as specific fields and into a catchall "ALL" field, e.g.
new Field("key1", "value1", aFieldTypeMimickingKeywords);
new Field("key2", "value2", aFieldTypeMimickingKeywords);
new Field("ALL", "key1=value1", aFieldTypeMimickingKeywords);
new Field("ALL", "key2=value2", aFieldTypeMimickingKeywords);
// then get added to the Document of course...
I can then do a wildcard search, using
new WildcardQuery(new Term("ALL", "*alue1"));
and it will find the hit.
But it would be nice to get more info, like "what was the complete value (e.g. "key1=value1") that goes with that hit?".
The best I can figure out is to get the Document, then get the list of IndexableFields, then loop over all of them and see if the field.stringValue().contains("alue1"). (I can look at the data structures in the debugger and all the info is there.)
This seems completely insane, because isn't that what Lucene just did? Shouldn't the hit information return some of the Fields?
Is Lucene missing what seems like "obvious" functionality? Google and staring at the APIs haven't revealed anything straightforward, but I feel like I must be searching for the wrong stuff.

You might want to try the IndexSearcher.explain() method. Once you have the ID of the matching document, prepare a query for each field (using the same search keywords) and call Explanation.isMatch() for each query: the ones that yield true give you the matched fields. Example:
for (String field : fields) {
    // same search term, but scoped to one candidate field at a time
    Query query = new WildcardQuery(new Term(field, "*alue1"));
    Explanation ex = searcher.explain(query, docID);
    if (ex.isMatch()) {
        // your query matched this field
    }
}

Inconsistent Apache Solr query results

I'm new to Apache Solr and trying to make a query using search terms against a field called "normalizedContents" and of type "text".
All of the search terms must exist in the field. Problem is, I'm getting inconsistent results.
For example, the solr index has only one document with normalizedContents field with value = "EDOUARD SERGE WILFRID EDOS0004 UNE MENTION COMPLEMENTAIRE"
I tried these queries in solr's web interface:
normalizedContents:(edouard AND une) returns the result
normalizedContents:(edouar* AND une) returns the result
normalizedContents:(EDOUAR* AND une) doesn't return anything
normalizedContents:(edouar AND une) doesn't return anything
normalizedContents:(edouar* AND un) returns the result (although there's no "un" word)
normalizedContents:(edouar* AND uned) returns the result (although there's no "uned" word)
Here's the declaration of normalizedContents in schema.xml:
<field name="normalizedContents" type="text" indexed="true" stored="true" multiValued="false"/>
So wildcards and the AND operator do not follow the expected behavior. What am I doing wrong?
Thanks.

By default the field type text applies stemming to the content (solr.SnowballPorterFilterFactory), so 'un' and 'uned' match une. You might also not have the solr.LowerCaseFilterFactory filter on both the query and the index analyzer, which is why EDOUAR* does not match. The fourth query doesn't match because edouard is not stemmed to edouar. If you want exact matches, you should copy the data into another field whose type has a more limited set of filters, e.g. only a solr.WhitespaceTokenizerFactory.
Posting the <fieldType name="text"> section from your schema might be helpful to understand everything.
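
If the schema cannot be changed right away, one possible client-side workaround for the case-sensitivity symptom is to lower-case the terms before building the query. The sketch below assumes the index analyzer lower-cases tokens while wildcard terms are not lower-cased at query time; the class and method names are hypothetical. It does nothing about the stemming effects described above.

```java
import java.util.Locale;
import java.util.StringJoiner;

// Hypothetical helper: builds field:(t1 AND t2 ...) with lower-cased terms.
final class SolrAndQuery {
    private SolrAndQuery() {}

    static String allTerms(String field, String... terms) {
        StringJoiner joined = new StringJoiner(" AND ", field + ":(", ")");
        for (String term : terms) {
            // Only the terms are lower-cased; the AND operator stays upper-case.
            joined.add(term.toLowerCase(Locale.ROOT));
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        System.out.println(allTerms("normalizedContents", "EDOUAR*", "une"));
        // prints: normalizedContents:(edouar* AND une)
    }
}
```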

