lucene 1.3 working with java - java

I am using Lucene 1.3. I am trying to index one key whose value is a list. Whenever I search for the key, I should get the list of results, but I am getting 0 results. I am declaring priceListKey as a Long.
Here is the code I tried:
String fieldName = pliBean.getProductID();
Field skuField = new Field(String.valueOf(priceListKey),fieldName, true, true, false);
doc.add(skuField);
writer.addDocument(doc);
This is indexed in the form key:listOfValues.
The query I am passing is a term query:
TermQuery qry = new TermQuery(new Term(key,Key));
Search(qry);

Related

Lucene QueryParser builds the query correctly but search doesn't work

I have indexed an IntPoint field using Lucene, which I am able to fetch using the query below:
Query query = IntPoint.newRangeQuery("field1", 0, 40);
TopDocs topDocs = searcher.search(query, 10); // second argument: maximum number of hits to return
System.out.println(topDocs.totalHits);
It fetches the relevant documents correctly.
If I build the query using the parser, it doesn't work:
Query query = new QueryParser(Version.LUCENE_8_11_0.toString(), new StandardAnalyzer()).parse("field1:[0 TO 40]");
I checked the string representation of both queries; they look identical:
field1:[0 TO 40]
Does anyone know what I am doing wrong?
An IntPoint field requires a custom query parser.
The following solves the problem:
StandardQueryParser parser = new StandardQueryParser();
parser.setAnalyzer(new StandardAnalyzer());
PointsConfig indexableFields = new PointsConfig(new DecimalFormat(), Integer.class);
Map<String, PointsConfig> indexableFieldMap = new HashMap<>();
indexableFieldMap.put("field1", indexableFields);
parser.setPointsConfigMap(indexableFieldMap);
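One way to picture why the two queries can print identically and still behave differently: the classic QueryParser turns the range syntax into a range over terms (text), while IntPoint.newRangeQuery compares numeric values, and point fields are not indexed as terms at all. The sketch below uses only the JDK, not Lucene, to contrast a lexicographic range check with a numeric one. The class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Illustrative only: plain-JDK comparisons, not Lucene code.
final class RangeComparisonSketch {

    // Range check on the text form, the way a range over terms compares values.
    static boolean inTextRange(String value, String lower, String upper) {
        return value.compareTo(lower) >= 0 && value.compareTo(upper) <= 0;
    }

    // Range check on the numeric value, the way a point range compares values.
    static boolean inNumericRange(int value, int lower, int upper) {
        return value >= lower && value <= upper;
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(5, 12, 40, 100);

        List<Integer> textMatches = values.stream()
                .filter(v -> inTextRange(String.valueOf(v), "0", "40"))
                .collect(Collectors.toList());
        List<Integer> numericMatches = values.stream()
                .filter(v -> inNumericRange(v, 0, 40))
                .collect(Collectors.toList());

        System.out.println(textMatches);    // prints [12, 40, 100]
        System.out.println(numericMatches); // prints [5, 12, 40]
    }
}
```

The text comparison drops 5 and keeps 100, which is why numeric fields need a parser that has been told the field's numeric type, as the PointsConfig map above does.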

How to define custom analyzer to do global search with hibernate-search and elasticsearch

I have an implementation of hibernate-search-orm (5.9.0.Final) with hibernate-search-elasticsearch (5.9.0.Final).
I defined a custom analyzer on an entity (see below) and I indexed two entities:
id: "1"
title: "Médiatiques : récit et société"
abstract:...
id: "2"
title: "Mediatique Com'7"
abstract:...
The search works fine when I search on title field :
"title:médiatique" => 2 results.
"title:mediatique" => 2 results.
My problem is when I do a global search, with or without accents:
search on "médiatique" => 1 result (id:1)
search on "mediatique" => 1 result (id:2)
Is there a way to resolve this?
Thanks.
Entity definition:
@Entity
@Table(name="bibliographic")
@DynamicUpdate
@DynamicInsert
@Indexed(index = "bibliographic")
@FullTextFilterDefs({
@FullTextFilterDef(name = "fieldsElasticsearchFilter",
impl = FieldsElasticsearchFilter.class)
})
@AnalyzerDef(name = "customAnalyzer",
tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),
filters = {
@TokenFilterDef(factory = LowerCaseFilterFactory.class),
@TokenFilterDef(factory = ASCIIFoldingFilterFactory.class),
})
@Analyzer(definition = "customAnalyzer")
public class BibliographicHibernate implements Bibliographic {
...
@Column(name="title", updatable = false)
@Fields( {
@Field,
@Field(name = "titleSort", analyze = Analyze.NO, store = Store.YES)
})
@SortableField(forField = "titleSort")
private String title;
...
}
Search method :
FullTextEntityManager ftem = Search.getFullTextEntityManager(entityManager);
QueryBuilder qb = ftem.getSearchFactory().buildQueryBuilder().forEntity(Bibliographic.class).get();
QueryDescriptor q = ElasticsearchQueries.fromQueryString(queryString);
FullTextQuery query = ftem.createFullTextQuery(q, Bibliographic.class).setFirstResult(start).setMaxResults(rows);
if (filters!=null){
filters.stream().map((filter) -> filter.split(":")).forEach((f) -> {
query.enableFullTextFilter("fieldsElasticsearchFilter")
.setParameter("field", f[0])
.setParameter("value", f[1]);
}
);
}
if (facetFields!=null){
facetFields.stream().map((facet) -> facet.split(":")).forEach((f) ->{
query.getFacetManager()
.enableFaceting(qb.facet()
.name(f[0])
.onField(f[0])
.discrete()
.orderedBy(FacetSortOrder.COUNT_DESC)
.includeZeroCounts(false)
.maxFacetCount(10)
.createFacetingRequest() );
}
);
}
List<Bibliographic> bibs = query.getResultList();
To be honest I'm more surprised document 1 would match at all, since there's a trailing "s" on "Médiatiques" and you don't use any stemmer.
You are in a special case here: you are using a query string and passing it directly to Elasticsearch (that's what ElasticsearchQueries.fromQueryString(queryString) does). Hibernate Search has very little impact on the query being run, it only impacts the indexed content and the Elasticsearch mapping here.
When you run a QueryString query on Elasticsearch and you don't specify any field, it uses all fields in the document. I wouldn't bet that the analyzer used when analyzing your query is the same analyzer that you defined on your "title" field. In particular, it may not be removing accents.
An alternative solution would be to build a simple query string query using the QueryBuilder. The syntax of queries is a bit more limited, but is generally enough for end users. The code would look like this:
FullTextEntityManager ftem = Search.getFullTextEntityManager(entityManager);
QueryBuilder qb = ftem.getSearchFactory().buildQueryBuilder().forEntity(Bibliographic.class).get();
Query q = qb.simpleQueryString()
.onFields("title", "abstract")
.matching(queryString)
.createQuery();
FullTextQuery query = ftem.createFullTextQuery(q, Bibliographic.class).setFirstResult(start).setMaxResults(rows);
Users would still be able to target specific fields, but only in the list you provided (which, by the way, is probably safer, otherwise they could target sort fields and so on, which you probably don't want to allow). By default, all the fields in that list would be targeted.
This may lead to the exact same result as the query string, but the advantage is, you can override the analyzer being used for the query. For instance:
FullTextEntityManager ftem = Search.getFullTextEntityManager(entityManager);
QueryBuilder qb = ftem.getSearchFactory().buildQueryBuilder().forEntity(Bibliographic.class)
.overridesForField("title", "customAnalyzer")
.overridesForField("abstract", "customAnalyzer")
.get();
Query q = qb.simpleQueryString()
.onFields("title", "abstract")
.matching(queryString)
.createQuery();
FullTextQuery query = ftem.createFullTextQuery(q, Bibliographic.class).setFirstResult(start).setMaxResults(rows);
... and this will use your analyzer when querying.
As an alternative, you can also use a more advanced JSON query by replacing ElasticsearchQueries.fromQueryString(queryString) with ElasticsearchQueries.fromJsonQuery(json). You will have to craft the JSON yourself, though, taking precautions to avoid any injection from the user (use Gson to build the JSON) and taking care to follow the Elasticsearch query syntax.
You can find more information about simple query string queries in the official documentation.
Note: you may want to add FrenchMinimalStemFilterFactory to your list of token filters in your custom analyzer. It's not the cause of your problem, but once you manage to use your analyzer in search queries, you will very soon find it useful.
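To see why it matters which analyzer is applied at query time, here is a rough JDK-only approximation of what lowercasing plus ASCII folding does to a token. It is not the Lucene filter implementation; the fold helper is hypothetical and only strips combining marks after Unicode decomposition. Accented characters are written as Unicode escapes so the file compiles regardless of source encoding.

```java
import java.text.Normalizer;
import java.util.Locale;

// Illustrative only: approximates lowercase + accent folding with the JDK.
final class FoldingSketch {

    // Lowercase the token, decompose accented characters, drop combining marks.
    static String fold(String token) {
        String lower = token.toLowerCase(Locale.ROOT);
        String decomposed = Normalizer.normalize(lower, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        // "\u00e9" is the letter e with an acute accent.
        System.out.println(fold("M\u00e9diatiques")); // prints mediatiques
        System.out.println(fold("m\u00e9diatique"));  // prints mediatique
        System.out.println(fold("Mediatique"));       // prints mediatique
    }
}
```

If the query text is folded the same way as the indexed text, the accented and unaccented queries become the same token. The trailing "s" in the first title survives folding, which is where a stemmer would come in.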

SolrJ Getting document score of all the resulting documents from solr query

I am able to fetch all the documents for a Solr query in Solr 6.3.0 using the Java API SolrJ. I want an additional "score" field, calculated by Solr (using tf, idf and field norm), to rank the documents. I am getting a score of 1.0 for all the documents. Can you help me get the correct "score" field?
Below is my code snippet and the output.
String urlString = "http://localhost:8983/solr/mycore2";
SolrClient solr = new HttpSolrClient.Builder(urlString).build();
SolrQuery query = new SolrQuery();
query.setQuery( "*" );
query.set("fl", "id,house,postcode,score");
String s="house=".concat(address.getHouseNumber().getCoveredText());
query.addFilterQuery(s);
QueryResponse resp = solr.query(query);
SolrDocumentList list = resp.getResults();
if(list!=null)
{
System.out.println(list.toString());
}
Output
{numFound=4,start=0,maxScore=1.0,docs=[SolrDocument{id=1, house=[150-151], postcode=[641044], score=1.0}, SolrDocument{id=2, house=[150/151], postcode=[641044], score=1.0}, SolrDocument{id=3, house=[151/150], postcode=[641044], score=1.0}, SolrDocument{id=4, house=[151/150], postcode=[641044], score=1.0}]}
Edit
After MatsLindh's suggestion, here is the tweaked code and the output.
String urlString = "http://localhost:8983/solr/mycore2";
SolrClient solr = new HttpSolrClient.Builder(urlString).build();
SolrQuery query = new SolrQuery();
query.setQuery(address.getHouseNumber().getCoveredText().concat(" ").concat(address.getPostcode().getCoveredText()));
query.set("fl", "id,house,postcode,score");
QueryResponse resp = solr.query(query);
SolrDocumentList list = resp.getResults();
if(list!=null)
{
System.out.println(list.toString());
}
Output
{numFound=3,start=0,maxScore=2.4800222,docs=[SolrDocument{id=6, house=[34], postcode=[641006], score=2.4800222}, SolrDocument{id=5, house=[34], postcode=[641005], score=1.2400111}, SolrDocument{id=7, house=[2-11A], postcode=[641006], score=1.1138368}]}
Since you're not querying for anything, you're not getting a score (each score is the same, 1.0f). You're only applying a filter, which does not affect the score.
There is no tf-idf score to calculate if there are no tokens to match in the actual query. (Remember, too, that Solr now uses BM25 rather than tf-idf as its default similarity model.)
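For reference, the textbook form of the BM25 score can be sketched as below. This is not Solr's or Lucene's actual implementation, which differs in details such as how document lengths are stored. The values k1 = 1.2 and b = 0.75 are the commonly used defaults, and all class and method names are hypothetical. The point is that the score is a sum over query terms, so a query with no terms to match contributes nothing to rank by.

```java
// Illustrative only: textbook BM25, not Solr's or Lucene's implementation.
final class Bm25Sketch {
    static final double K1 = 1.2;  // term-frequency saturation
    static final double B = 0.75;  // document-length normalization strength

    // Inverse document frequency: rarer terms weigh more.
    static double idf(long docCount, long docFreq) {
        return Math.log(1.0 + (docCount - docFreq + 0.5) / (docFreq + 0.5));
    }

    // Contribution of one query term to one document's score.
    static double termScore(int tf, long docCount, long docFreq,
                            double docLen, double avgDocLen) {
        double norm = K1 * (1.0 - B + B * docLen / avgDocLen);
        return idf(docCount, docFreq) * (tf * (K1 + 1.0)) / (tf + norm);
    }

    // Document score: sum of the contributions of all query terms.
    static double score(int[] termFreqs, long[] docFreqs, long docCount,
                        double docLen, double avgDocLen) {
        double total = 0.0;
        for (int i = 0; i < termFreqs.length; i++) {
            total += termScore(termFreqs[i], docCount, docFreqs[i], docLen, avgDocLen);
        }
        return total;
    }

    public static void main(String[] args) {
        // Two query terms found in a document of average length.
        System.out.println(score(new int[] {2, 1}, new long[] {3, 10}, 100, 8.0, 8.0));
        // No query terms: the sum is empty.
        System.out.println(score(new int[0], new long[0], 100, 8.0, 8.0)); // prints 0.0
    }
}
```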

MongoDB: Query using $gte and $lte in java

I want to query a field for values that are greater than or equal to one bound AND less than or equal to another (>= and <=), using Java. As I understand it, MongoDB has $gte and $lte operators, but I can't find the proper syntax for using them. The field I'm accessing is a top-level field.
I have managed to get this to work:
FindIterable<Document> iterable = db.getCollection("1dag").find(new Document("timestamp", new Document("$gt", 1412204098)));
as well as...
FindIterable<Document> iterable = db.getCollection("1dag").find(new Document("timestamp", new Document("$lt", 1412204098)));
But how do you combine these with each other?
Currently I'm playing around with a statement like this, but it does not work:
FindIterable<Document> iterable5 = db.getCollection("1dag").find(new Document( "timestamp", new Document("$gte", 1412204098).append("timestamp", new Document("$lte",1412204099))));
Any help?
Basically you require a range query like this:
db.getCollection("1dag").find({
"timestamp": {
"$gte": 1412204098,
"$lte": 1412204099
}
})
Since you need multiple query conditions for this range query, you can specify a logical conjunction (AND) by appending conditions to the query document using the append() method:
FindIterable<Document> iterable = db.getCollection("1dag").find(
new Document("timestamp", new Document("$gte", 1412204098).append("$lte", 1412204099)));
The constructor new Document(key, value) only gets you a document with one key-value pair. But in this case you need to create a document with more than one. To do this, create an empty document, and then add pairs to it with .append(key, value).
Document timespan = new Document();
timespan.append("$gt", 1412204098);
timespan.append("$lt", 1412204998);
// timespan in JSON:
// { $gt: 1412204098, $lt: 1412204998}
Document condition = new Document("timestamp", timespan);
// condition in JSON:
// { timestamp: { $gt: 1412204098, $lt: 1412204998} }
FindIterable<Document> iterable = db.getCollection("1dag").find(condition);
Or if you really want to do it with a one-liner without temporary variables:
FindIterable<Document> iterable = db.getCollection("1dag").find(
new Document()
.append("timestamp", new Document()
.append("$gt",1412204098)
.append("$lt",1412204998)
)
);
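To see why the attempt in the question does not work, it helps to look at the shape of the document each call chain builds. The sketch below uses a plain LinkedHashMap as a stand-in for the driver's Document class (which is itself a Map<String, Object>), so it runs without the MongoDB driver. The helper names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: LinkedHashMap stands in for org.bson.Document.
final class RangeFilterShapeSketch {

    // Like new Document(key, value): a map with a single entry.
    static Map<String, Object> doc(String key, Object value) {
        Map<String, Object> m = new LinkedHashMap<>();
        m.put(key, value);
        return m;
    }

    // Shape built in the question: "timestamp" is appended inside the operator document.
    static Map<String, Object> attemptFromQuestion(long from, long to) {
        Map<String, Object> inner = doc("$gte", from);
        inner.put("timestamp", doc("$lte", to));
        return doc("timestamp", inner);
    }

    // Shape built in the answers: both operators are siblings under "timestamp".
    static Map<String, Object> rangeFilter(long from, long to) {
        Map<String, Object> inner = doc("$gte", from);
        inner.put("$lte", to);
        return doc("timestamp", inner);
    }

    public static void main(String[] args) {
        // prints {timestamp={$gte=1412204098, timestamp={$lte=1412204099}}}
        System.out.println(attemptFromQuestion(1412204098L, 1412204099L));
        // prints {timestamp={$gte=1412204098, $lte=1412204099}}
        System.out.println(rangeFilter(1412204098L, 1412204099L));
    }
}
```

In the first shape the second condition ends up nested where an operator is expected, instead of sitting next to $gte.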

MongoDB-Java: How to make $geoNear to first do query, then distance?

I'm trying to query and sort documents as follows:
Query only for documents older than SOMETIME.
Within range of AROUNDME_RANGE_RADIUS_IN_RADIANS.
Get distance for each document.
Sort them by time. New to Old.
Overall it should return up to 20 results.
But it seems that since $geoNear is by default limited to 100 results, I get unexpected results.
I see $geoNear working in the following order:
Gets docs from the entire collection, by distance.
And only then executes the given Query.
Is there a way to reverse the order?
MongoDB v2.6.5
Java Driver v2.10.1
Thank you.
Example document in my collection:
{
"timestamp" : ISODate("2014-12-27T06:52:17.949Z"),
"text" : "hello",
"loc" : [
34.76701564815013,
32.05852053407342
]
}
I'm using aggregate since from what I understood it's the only way to sort by "timestamp" and get the distance.
BasicDBObject query = new BasicDBObject("timestamp", new BasicDBObject("$lt", SOMETIME));
// aggregate: geoNear
double[] currentLoc = new double[] {
Double.parseDouble(myLon),
Double.parseDouble(myLat)
};
DBObject geoNearFields = new BasicDBObject();
geoNearFields.put("near", currentLoc);
geoNearFields.put("distanceField", "dis");
geoNearFields.put("maxDistance", AROUNDME_RANGE_RADIUS_IN_RADIANS);
geoNearFields.put("query", query);
//geoNearFields.put("num", 5000); // FIXME: a temp solution I would really like to avoid
DBObject geoNear = new BasicDBObject("$geoNear", geoNearFields);
// aggregate: sort by timestamp
DBObject sortFields = new BasicDBObject("timestamp", -1);
DBObject sort = new BasicDBObject("$sort", sortFields);
// aggregate: limit
DBObject limit = new BasicDBObject("$limit", 20);
AggregationOutput output = col.aggregate(geoNear, sort, limit);
You could add a $match stage at the top of the pipeline to filter the documents before the $geoNear stage.
BasicDBObject match = new BasicDBObject("timestamp",
new BasicDBObject("$lt", SOMETIME));
AggregationOutput output = col.aggregate(match,geoNear, sort, limit);
The following line is then no longer required:
geoNearFields.put("query", query);
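The ordering concern raised in the question can be illustrated without MongoDB. The sketch below is a toy model using Java streams and says nothing about how the server actually executes $geoNear. It only shows that "take the nearest N, then filter" and "filter, then take the nearest N" can return different results. The Post class, the sample data, and the method names are invented for the example.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Illustrative only: a toy model of stage ordering, not MongoDB behavior.
final class StageOrderSketch {

    static final class Post {
        final String id;
        final double distance;
        final long timestamp;

        Post(String id, double distance, long timestamp) {
            this.id = id;
            this.distance = distance;
            this.timestamp = timestamp;
        }
    }

    static List<Post> samplePosts() {
        return Arrays.asList(
                new Post("a", 1.0, 50),
                new Post("b", 2.0, 10),
                new Post("c", 3.0, 5),
                new Post("d", 4.0, 60));
    }

    // Cap by distance first, then apply the time filter.
    static List<String> nearestThenFilter(List<Post> posts, int nearLimit, long before) {
        return posts.stream()
                .sorted(Comparator.comparingDouble((Post p) -> p.distance))
                .limit(nearLimit)
                .filter(p -> p.timestamp < before)
                .map(p -> p.id)
                .collect(Collectors.toList());
    }

    // Apply the time filter first, then cap by distance.
    static List<String> filterThenNearest(List<Post> posts, int nearLimit, long before) {
        return posts.stream()
                .filter(p -> p.timestamp < before)
                .sorted(Comparator.comparingDouble((Post p) -> p.distance))
                .limit(nearLimit)
                .map(p -> p.id)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(nearestThenFilter(samplePosts(), 2, 20)); // prints [b]
        System.out.println(filterThenNearest(samplePosts(), 2, 20)); // prints [b, c]
    }
}
```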
