Couchbase: get documents within a given date range - Java

This is the document structure in my bucket:
{
  "_class": "com.link.pojo.Event",
  "year": "2015",
  "start": 1440115200000,
  "name": "129811",
  "domain": "5000$3$2015$Exhibition",
  "sporttype": "Indoor",
  "eventtype": "Exhibition",
  "end": 1440151199000
}
Here start means the event start date, and its type is java.util.Date. An example date value is 2015-08-10T09:45:00.000+0000.
Now I want to fetch all the documents that start on the current date using a Couchbase view. This is how I'm trying to get it:
// Create the CouchbaseClient Query object & Pass the time range to fetch events.
Query query = new Query();
// Filter on the start date; this value has to be within the range params given below.
query.setIncludeDocs(true);
query.setDescending(true);
query.setInclusiveEnd(true);
query.setRange(ComplexKey.of(""), ComplexKey.of(""));
List<Event> eventList = eventService.getEventsByCurrentDate(query);
What values should I pass to query.setRange(), and what view do I need to implement?
function (doc, meta) {
  if (doc._class == "com.link.pojo.Event") {
    emit(doc.start, null);
  }
}

You're doing it wrong :]
A query is just a way to filter the results of a view, so start by defining a view and then work out which query gets you just what you need.
Start by creating the view in the Couchbase UI.
Then look at the results of the view, again using the Couchbase UI. There should be a link you can click to see the results of the view in a new browser tab.
You can then edit that URL to "query" the results of your view: add "&key=123" to get just that key. setRange just means "get the keys that fall in that range of numbers".
In your case, since your view emits the "start" field, your keys (or range) will have to be in the same format, so something like &key=1440115200000.
I hope this helps.
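To make that concrete, here is a minimal sketch, assuming the Couchbase Java SDK 1.x view API that the question uses: since the view emits doc.start in epoch milliseconds, the keys passed to setRange() must be millisecond timestamps in the same format, e.g. the bounds of the current day.
import java.util.Calendar;
import com.couchbase.client.protocol.views.ComplexKey;
import com.couchbase.client.protocol.views.Query;

// Compute the millisecond bounds of the current day.
Calendar cal = Calendar.getInstance();
cal.set(Calendar.HOUR_OF_DAY, 0);
cal.set(Calendar.MINUTE, 0);
cal.set(Calendar.SECOND, 0);
cal.set(Calendar.MILLISECOND, 0);
long dayStart = cal.getTimeInMillis();
long dayEnd = dayStart + 24L * 60 * 60 * 1000 - 1;

Query query = new Query();
query.setIncludeDocs(true);
query.setInclusiveEnd(true);
// Note: setDescending(true) reverses the traversal, so the start and end
// keys would have to be swapped; ascending order is simpler here.
query.setRange(ComplexKey.of(dayStart), ComplexKey.of(dayEnd));
With that, eventService.getEventsByCurrentDate(query) only has to run the query against the view above.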

Related

Query Elastic document field with and without characters

I have the following documents stored in my Elasticsearch index (my_index):
{
  "name": "111666"
},
{
  "name": "111A666"
},
{
  "name": "111B666"
}
and I want to be able to query these documents using both the exact value of the name field as well as a character-trimmed version of the value.
Examples
GET /my_index/my_type/_search
{
  "query": {
    "match": {
      "name": {
        "query": "111666"
      }
    }
  }
}
should return all of the (3) documents mentioned above.
On the other hand:
GET /my_index/my_type/_search
{
  "query": {
    "match": {
      "name": {
        "query": "111a666"
      }
    }
  }
}
should return just one document (the one that matches exactly the provided value of the name field).
I didn't find a way to configure the settings of my_index to support such functionality (custom search/index analyzers, etc.).
I should mention that I am using Elasticsearch's Java API (QueryBuilders) to implement the above-mentioned queries, so I thought of doing it the Java way.
Logic
1) Check if the provided query string contains a letter.
2) If yes (e.g. 111A666), then search for 111A666 using a standard search analyzer.
3) If not (e.g. 111666), then use a custom search analyzer that trims the characters of the `name` field.
Questions
1) Is it possible to implement this by somehow configuring how the data is stored/indexed in Elasticsearch?
2) If not, is it possible to conditionally change the analyzer of a field at runtime? (using Java)
You can easily use any built-in analyzer or any custom analyzer to map your document in Elasticsearch. More information on analyzers is here.
The "term" query searches for an exact match. You can find more information about exact matches here (Finding Exact Values).
But you cannot change an index once it is created. If you want to change an index, you have to create a new index and migrate all your data to it.
Your question is about using different logic for the analyzer at index time and query time.
The solution for your Q1 is to generate two tokens at index time (111a666 -> [111a666, 111666]) but only one token at query time (111a666 -> 111a666 and 111666 -> 111666).
IMHO, you would have to build a token filter like https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-pattern_replace-tokenfilter.html that supports "preserve_original" the way https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-pattern-capture-tokenfilter.html does.
Or you could use two fields (one with the original and one without letters) and search over both.
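To illustrate the two-field route with the Java QueryBuilders API mentioned in the question, here is a rough sketch; the subfield name name.digits is an assumption and would have to be mapped with a letter-stripping analyzer (e.g. a pattern_replace char filter):
import org.elasticsearch.index.query.QueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;

// Hypothetical helper: route the query to the right field depending on
// whether the input contains a letter. "name.digits" is assumed to be a
// subfield indexed with an analyzer that strips letters, so all three
// example documents index "111666" there.
public static QueryBuilder nameQuery(String input) {
    boolean hasLetter = input.chars().anyMatch(Character::isLetter);
    // Letters present -> exact match on the original field;
    // digits only -> match the letter-stripped subfield.
    return QueryBuilders.matchQuery(hasLetter ? "name" : "name.digits", input);
}
With that mapping, nameQuery("111666") hits all three documents via name.digits, while nameQuery("111A666") matches only the exact document via name.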

Query documents in MongoDB by object matching

How can I find the document that contains a given JSON object?
Example:
Suppose that in the database test there is a document like this:
{
  "identification": {
    "componentId": "3a4f6199-6141-4179-ac5f-f1bbcf627bb2",
    "componentType": "PivotTable",
    "dataDate": "2016-06-15T15:29:51.139+0200",
    "dataType": "PTF",
    "properties": {
      "contextId": "0329fe70-92f0-4b60-b3c2-79377adb8f95",
      "tags": ["tag1", "tag2"]
    }
  },
  "viewData": {
    "lineGroups": []
  }
}
Now, given only the identification part of the document, with a partial set of keys and values:
{
  "componentType": "PivotTable",
  "properties": {
    "tags": ["tag1"]
  }
}
Since the above document's identification part matches the given identification, that document should be returned.
If I do db.test.find({identification: {/*the given identification segment*/}}), MongoDB will compare the identification part directly, checking every entry in the document exactly. In this case the document will not be returned.
Is there a way in the MongoDB query language to do this in a relatively straightforward or easy way, or do I have to parse the entries in the identification object recursively in order to construct a query?
Mongo will try to match the WHOLE properties subdocument, so in this case we would have to supply an identical document.
The way to get this working is to unwind every element into dot notation and add it to the query's filter section:
{
  "componentType": "PivotTable",
  "properties.tags": { $in: ["tag1"] }
}
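If you do go the recursive route, a sketch with the MongoDB Java driver might look like this (the flatten helper is hypothetical): it walks the partial document and builds exactly the dot-notation filter shown above, turning arrays into $in clauses so a partial tag list still matches.
import java.util.List;
import org.bson.Document;

// Recursively flatten a partial subdocument into dot-notation paths, e.g.
// {"properties": {"tags": ["tag1"]}} under the prefix "identification"
// becomes {"identification.properties.tags": {"$in": ["tag1"]}}.
static void flatten(String prefix, Document partial, Document filter) {
    for (String key : partial.keySet()) {
        Object value = partial.get(key);
        String path = prefix.isEmpty() ? key : prefix + "." + key;
        if (value instanceof Document) {
            flatten(path, (Document) value, filter);         // nested subdocument
        } else if (value instanceof List) {
            filter.append(path, new Document("$in", value)); // partial array match
        } else {
            filter.append(path, value);                      // exact scalar match
        }
    }
}
Calling flatten("identification", givenPart, filter) on an empty filter Document and passing the result to collection.find(filter) then reproduces the dot-path query above.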

Logstash + Kibana terms panel without breaking words

I have a Java application that writes to a log file in JSON format.
The fields that come in the logs are variable.
Logstash reads this log file and sends it to Kibana.
I've configured Logstash with the following file:
input {
  file {
    path => ["[log_path]"]
    codec => "json"
  }
}
filter {
  json {
    source => "message"
  }
  date {
    match => [ "data", "dd-MM-yyyy HH:mm:ss.SSS" ]
    timezone => "America/Sao_Paulo"
  }
}
output {
  elasticsearch_http {
    flush_size => 1
    host => "[host]"
    index => "application-%{+YYYY.MM.dd}"
  }
}
I've managed to show everything correctly in Kibana without any mapping.
But when I try to create a terms panel to show a count of the servers that sent those messages, I have a problem.
I have a field called server in my JSON that shows the server's name (like a1-name-server1), but the terms panel splits the server name on the "-".
I would also like to count the number of times an error message appears, but the same problem occurs: the terms panel splits the error message on the spaces.
I'm using Kibana 3 and Logstash 1.4.
I've searched a lot on the web and couldn't find any solution.
I also tried using the .raw from logstash, but it didn't work.
How can I manage this?
Thanks for the help.
Your problem here is that your data is being tokenized, which is helpful for searching over it. By default, ES will split your message field into different parts to make them searchable. For example, you may want to search for the word ERROR in your logs, so you would probably like messages like "There was an error in your cluster" or "Error processing whatever" to show up in the results. If you don't analyze the data for that field with tokenizers, you won't be able to search like this.
This analyzed behaviour is helpful when you want to search things, but it doesn't let you group different messages that have the same content, which is your use case. The solution is to update your mapping, setting not_analyzed for the specific field that you don't want split into tokens. This will probably work for your host field, but it will probably break the search.
What I usually do in these kinds of situations is use index templates and multifields. An index template lets me set a mapping for every index that matches a regex, and multifields let me have both the analyzed and not_analyzed behaviour in the same field.
Using the following query would do the job for your problem:
curl -XPUT https://example.org/_template/name_of_index_template -d '
{
  "template": "indexname*",
  "mappings": {
    "type": {
      "properties": {
        "field_name": {
          "type": "multi_field",
          "fields": {
            "field_name": {
              "type": "string",
              "index": "analyzed"
            },
            "untouched": {
              "type": "string",
              "index": "not_analyzed"
            }
          }
        }
      }
    }
  }
}'
And then in your terms panel you can use field_name.untouched to consider the entire content of the field when you calculate the count of the different elements.
If you don't want to use index templates (maybe your data is in a single index), setting the mapping with the Put Mapping API would do the job too. And if you use multifields, there is no need to reindex the data, because from the moment you set the new mapping for the index, new data will be duplicated into the two subfields (field_name and field_name.untouched). If you just change the mapping from analyzed to not_analyzed, you won't see any change until you reindex all your data.
Since you didn't define a mapping in Elasticsearch, the default settings take effect for every field in your type in your index. The default for string fields (like your server field) is to analyze the field, meaning that Elasticsearch will tokenize the field's contents. That is why it's splitting your server names into parts.
You can overcome this issue by defining a mapping. You don't have to define all your fields, only the ones that you don't want Elasticsearch to analyze. In your particular case, sending the following PUT request will do the trick:
http://[host]:9200/[index_name]/_mapping/[type]
{
  "type": {
    "properties": {
      "server": { "type": "string", "index": "not_analyzed" }
    }
  }
}
You can't do this on an already existing index because switching from analyzed to not_analyzed is a major change in the mapping.

External paging for jqGrid-jQuery in Java

I have developed the code below.
jQuery(document).ready(function(){
    jQuery("#records").jqGrid({
        height: 350,
        datatype: 'local',
        colNames: ['Policy Name', 'Policy Type', 'Time allowed (HH:mm)', 'Expiration Duration (days)', 'Session Pulse(minutes)', 'Description'],
        colModel: [
            {name:'pName', index:'pName', editable:true, sorttype:'text', width:150, editoptions:{size:10}, formatter:'showlink', formatoptions:{baseLinkUrl:'javascript:', showAction: "GetAndShowUserData(jQuery('#list2'),'", addParam: "');"}},
            {name:'pType', index:'pType', sorttype:'text', editable:true, width:150, editoptions:{size:10}},
            {name:'timeAllowed', index:'timeAllowed', sorttype:'text', editable:true, width:200, align:"right", editoptions:{size:10}},
            {name:'expDuration', index:'expDuration', sorttype:'text', editable:true, width:200, align:"right", editoptions:{size:10}},
            {name:'sessionPulse', index:'sessionPulse', sorttype:'int', editable:true, width:200, align:"right", editoptions:{size:10}},
            {name:'description', index:'description', sortable:false, editable:true, width:200, editoptions:{size:10}}
        ],
        pager: jQuery('#pager'),
        rowNum: 10,
        sortname: 'pName',
        autowidth: true,
        altRows: true,
        drag: true,
        sortorder: "asc",
        rowList: [2, 5, 10, 20],
        viewrecords: true,
        loadonce: false,
        multiselect: true,
        /*
        onSelectRow: function(id){
            var gr = jQuery("#records").jqGrid('getGridParam', 'selrow');
            if (gr != null) jQuery("#records").jqGrid('editGridRow', gr, {height:280, reloadAfterSubmit:false});
            else alert("Please Select Row");
        },
        editurl: "server.php",
        */
        caption: 'Manage Policy'
    });
});
Now, I want to make an Ajax request to the servlet for the next records when the user presses the >> (next) button of the jqGrid. I have searched a lot on the Internet and found plenty of PHP code, but I cannot understand PHP; I want to develop this in Java. How can I do that?
As GPS said, paging in jqGrid works by paging through its current dataset. You have to load a big set of data, and it will page through that dataset. There may be a way to get it to behave the way you want, but I don't know how.
For my grids, I use the pagination plugin to trigger an Ajax call to get the next page of data. When the data comes back, I just clear the grid (clearGridData) and add the new set of data using addRowData.
I'm a .NET programmer, so I don't know how you'd do the database calls in Java, but that's not really a jqGrid question.
To determine how many pages there are, you take the count of all the records that you'll be paging through and divide that by the number of records you will show on the grid per page.
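Since the question asks for the Java side, here is a rough sketch of a servlet that answers jqGrid's paging requests. PolicyDao and its methods are hypothetical placeholders; the grid would also need datatype: 'json' and a url pointing at the servlet instead of datatype: 'local'.
import java.io.IOException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Sketch: jqGrid sends "page", "rows", "sidx" and "sord" as request
// parameters, and expects {"page", "total", "records", "rows"} back.
public class PolicyGridServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        int page = Integer.parseInt(req.getParameter("page"));   // 1-based page index
        int rows = Integer.parseInt(req.getParameter("rows"));   // page size
        String sidx = req.getParameter("sidx");                  // sort column
        String sord = req.getParameter("sord");                  // "asc" or "desc"

        int records = PolicyDao.countPolicies();                 // hypothetical DAO
        int total = (int) Math.ceil((double) records / rows);    // total pages
        // Fetch only the requested page, e.g. via SQL LIMIT/OFFSET.
        String rowsJson = PolicyDao.pageAsJson(page, rows, sidx, sord);

        resp.setContentType("application/json");
        resp.getWriter().write("{\"page\":" + page + ",\"total\":" + total
                + ",\"records\":" + records + ",\"rows\":" + rowsJson + "}");
    }
}
The Math.ceil line is exactly the page-count calculation described above: the total record count divided by the number of records shown per page.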

Why won't this sort in Solr work?

I need to sort on a date field whose name is "mod_date".
It works like this in the browser address bar:
http://localhost:8983/solr/select/?&q=bmw&sort=mod_date+desc
But I am using a phpSolr client which sends an URL to Solr, and the url sent is this:
fq=+category%3A%22Bilar%22+%2B+car_action%3AS%C3%A4ljes&version=1.2&wt=json&json.nl=map&q=%2A%3A%2A&start=0&rows=5&sort=mod_date+desc
// This won't work and is echoed after this in PHP:
$queryString = http_build_query($params, null, $this->_queryStringDelimiter);
$queryString = preg_replace('/%5B(?:[0-9]|[1-9][0-9]+)%5D=/', '=', $queryString);
This won't work, and I don't know why!
Everything else works fine and all the right fields are returned, but the sort doesn't work.
Any ideas?
Thanks
BTW: The field "mod_date" contains something like:
2010-03-04T19:37:22.5Z
EDIT:
First I use PHP to send this to the SolrPhpClient, which is another PHP file called service.php:
require_once('../SolrPhpClient/Apache/Solr/Service.php');
$solr = new Apache_Solr_Service('localhost', 8983, '/solr/');
$results = $solr->search($querystring, $p, $limit, $solr_params);
$solr_params is an array which contains the Solr parameters (q, fq, etc.).
Now, in service.php:
$params['version'] = self::SOLR_VERSION;
// common parameters in this interface
$params['wt'] = self::SOLR_WRITER;
$params['json.nl'] = $this->_namedListTreatment;
$params['q'] = $query;
$params['sort'] = 'mod_date desc'; // HERE IS THE SORT I HAVE A PROBLEM WITH
$params['start'] = $offset;
$params['rows'] = $limit;
$queryString = http_build_query($params, null, $this->_queryStringDelimiter);
$queryString = preg_replace('/%5B(?:[0-9]|[1-9][0-9]+)%5D=/', '=', $queryString);
if ($method == self::METHOD_GET)
{
    return $this->_sendRawGet($this->_searchUrl . $this->_queryDelimiter . $queryString);
}
else if ($method == self::METHOD_POST)
{
    return $this->_sendRawPost($this->_searchUrl, $queryString, FALSE, 'application/x-www-form-urlencoded');
}
$results contains the results from Solr...
So this is what I need to get working (via PHP).
The code below (also at the top of this question) works, but that's because I paste it into the address bar manually, not via the PHP client. That's just for debugging; I need to get it to work via the PHP client:
http://localhost:8983/solr/select/?&q=bmw&sort=mod_date+desc // Not via the PHP client, but works
UPDATE (2010-03-08):
I have tried Donovan's code (the URLs) and it works fine.
Now, I have noticed that one of the parameters is causing the sort not to work.
That parameter is "wt". If we take the URL from the top of this question (fq=+category%3A%22Bilar%22+%2B+car_action%3AS%C3%A4ljes&version=1.2&wt=json&json.nl=map&q=%2A%3A%2A&start=0&rows=5&sort=mod_date+desc) and simply remove the "wt" parameter, then the sort works.
BUT the results appear in a different structure, which I believe keeps my PHP code from recognizing them. Donovan would know this, I think. I am guessing that for the PHP client to work, the results must be in a specific structure, which gets messed up as soon as I remove the wt parameter.
Donovan, help me please...
Here is some background on what I use your SolrPhpClient for:
I have a classifieds website which uses MySQL, but for searching I use Solr to search some indexed fields. Solr returns an array of ID numbers (for all matches of the search criteria). I then use those ID numbers to find everything in the MySQL db and fetch all the other information (for example, non-searchable information).
So, simplified: search -> Solr returns all matches as an array of IDs -> the IDs from Solr are the same as the IDs in the MySQL db, so I can simply match every record whose ID matches an ID from the Solr results array.
I don't use faceting, boosting, relevancy or other fancy stuff. I only sort by the latest classified posted, and give users the option to also sort by the cheapest price. Nothing more.
Then I use the "fq" parameter to query different fields in Solr depending on the category chosen by users (for example "cars", which in my language is "Bilar").
I am really stuck with this problem here...
Thanks for all help
As pointed out in the Stack Overflow comments, your browser query is different from your PHP client based query; to remove that from the equation you should test with this corrected. To get the same results as the browser based query, your PHP code should have looked something like this:
$solr = new Apache_Solr_Service(...);
$searchOptions = array(
    'sort' => 'mod_date desc'
);
$results = $solr->search("bmw", 0, 10, $searchOptions);
Instead, I imagine it looks more like:
$searchOptions = array(
    'fq' => 'category:"Bilar" + car_action:Säljes',
    'sort' => 'mod_date desc'
);
$solr->search("*:*", 0, 10, $searchOptions);
What I expect you to see is that the PHP client results will be the same as the browser based results, and I imagine the same would happen if you did it the opposite way: take your current parameters from the PHP client and apply them correctly to the browser based query.
Now onto your problem: you don't see documents sorted properly.
I would try this query, which is the equivalent of the PHP client based code:
http://localhost:8983/solr/select/?&q=%2A%3A%2A&fq=+category%3A%22Bilar%22+%2B+car_action%3AS%C3%A4ljes&sort=mod_date+desc
versus this query, which moves the filter query into the main query:
http://localhost:8983/solr/select/?&q=+category%3A%22Bilar%22+%2B+car_action%3AS%C3%A4ljes&sort=mod_date+desc
and see if there is a difference. If there is, then it might be a bug in how results from cached filtered queries are used and sorted by Solr, which wouldn't be a problem with the client but with the Solr service itself.
Hope this gets you closer to an answer.
Use session values to save the sort parameters.
The quick answer in case someone is attempting to sort via solr-php-client:
$searchOptions = array('sort' => 'field_date desc');
Ditch the + sign that you would usually put in the URL. It took me a while to figure that out as well; I was encoding it and putting it all over the place...
It's possible it's related to the json.nl=map parameter. When the response is set to JSON with wt=json and json.nl=map, facets are not sorted as expected with the facet.sort or f.<field_name>.facet.sort=count|index options.
e.g. with facet.sort=count and wt=json only, I get:
"dc_coverage": [
"United States",
5,
"19th century",
1,
"20th century",
1,
"Detroit (Mich.)",
1,
"Pennsylvania",
1,
"United States--Michigan--Detroit",
1,
"United States--Washington, D.C.",
1
]
But with facet.sort=count, wt=json, and json.nl=map as an option, you can see the sorting is lost:
"dc_coverage": {
"19th century": 1,
"20th century": 1,
"Detroit (Mich.)": 1,
"Pennsylvania": 1,
"United States": 5,
"United States--Michigan--Detroit": 1,
"United States--Washington, D.C.": 1
}
There is more information here about formatting the JSON response when using json.nl=map: https://cwiki.apache.org/confluence/display/solr/Response+Writers#ResponseWriters-JSONResponseWriter
