I already ingested the file into the druid, greatfully it shows the ingestion is success. However when I checked in the reports of the ingestion, there are all rows are processed with error yet the Datasource is display in the "Datasource" tab.
I have tried to minimise the rows from 20M to 20 rows only. Here is my configuration file:
"type" : "index",
"spec" : {
"ioConfig" : {
"type" : "index",
"firehose" : {
"type" : "local",
"baseDir" : "/home/data/Salutica",
"filter" : "outDashboard2RawV3.csv"
}
},
"dataSchema" : {
"dataSource": "DaTRUE2_Dashboard_V3",
"granularitySpec" : {
"type" : "uniform",
"segmentGranularity" : "WEEK",
"queryGranularity" : "none",
"intervals" : ["2017-05-08/2019-05-17"],
"rollup" : false
},
"parser" : {
"type" : "string",
"parseSpec": {
"format" : "csv",
"timestampSpec" : {
"column" : "Date_Time",
"format" : "auto"
},
"columns" : [
"Main_ID","Parameter_ID","Date_Time","Serial_Number","Status","Station_ID",
"Station_Type","Parameter_Name","Failed_Date_Time","Failed_Measurement",
"Database_Name","Date_Time_Year","Date_Time_Month",
"Date_Time_Day","Date_Time_Hour","Date_Time_Weekday","Status_New"
],
"dimensionsSpec" : {
"dimensions" : [
"Date_Time","Serial_Number","Status","Station_ID",
"Station_Type","Parameter_Name","Failed_Date_Time",
"Failed_Measurement","Database_Name","Status_New",
{
"name" : "Main_ID",
"type" : "long"
},
{
"name" : "Parameter_ID",
"type" : "long"
},
{
"name" : "Date_Time_Year",
"type" : "long"
},
{
"name" : "Date_Time_Month",
"type" : "long"
},
{
"name" : "Date_Time_Day",
"type" : "long"
},
{
"name" : "Date_Time_Hour",
"type" : "long"
},
{
"name" : "Date_Time_Weekday",
"type" : "long"
}
]
}
}
},
"metricsSpec" : [
{
"name" : "count",
"type" : "count"
}
]
},
"tuningConfig" : {
"type" : "index",
"partitionsSpec" : {
"type" : "hashed",
"targetPartitionSize" : 5000000
},
"jobProperties" : {}
}
}
}
Report:
{"ingestionStatsAndErrors":{"taskId":"index_DaTRUE2_Dashboard_V3_2019-09-10T01:16:47.113Z","payload":{"ingestionState":"COMPLETED","unparseableEvents":{},"rowStats":{"determinePartitions":{"processed":0,"processedWithError":0,"thrownAway":0,"unparseable":0},"buildSegments":{"processed":0,"processedWithError":20606701,"thrownAway":0,"unparseable":1}},"errorMsg":null},"type":"ingestionStatsAndErrors"}}
I'm expecting this:
{"processed":20606701,"processedWithError":0,"thrownAway":0,"unparseable":1}},"errorMsg":null},"type":"ingestionStatsAndErrors"}}
instead of this:
{"processed":0,"processedWithError":20606701,"thrownAway":0,"unparseable":1}},"errorMsg":null},"type":"ingestionStatsAndErrors"}}
Below is my input data from csv;
"Main_ID","Parameter_ID","Date_Time","Serial_Number","Status","Station_ID","Station_Type","Parameter_Name","Failed_Date_Time","Failed_Measurement","Database_Name","Date_Time_Year","Date_Time_Month","Date_Time_Day","Date_Time_Hour","Date_Time_Weekday","Status_New"
1,3,"2018-10-05 15:00:55","1840SDF00038","Passed","ST1","BLTBoard","1.8V","","","DaTRUE2Left",2018,10,5,15,"Friday","Passed"
1,4,"2018-10-05 15:00:55","1840SDF00038","Passed","ST1","BLTBoard","1.35V","","","DaTRUE2Left",2018,10,5,15,"Friday","Passed"
1,5,"2018-10-05 15:00:55","1840SDF00038","Passed","ST1","BLTBoard","Isc_VChrg","","","DaTRUE2Left",2018,10,5,15,"Friday","Passed"
1,6,"2018-10-05 15:00:55","1840SDF00038","Passed","ST1","BLTBoard","Isc_VBAT","","","DaTRUE2Left",2018,10,5,15,"Friday","Passed"
Related
I'm new to elastic search.
So this is how the index looks:
{
"scresults-000001" : {
"aliases" : {
"scresults" : { }
},
"mappings" : {
"properties" : {
"callType" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"code" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"data" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"esdtValues" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"gasLimit" : {
"type" : "long"
},
AND MORE OTHER Fields.......
If I'm trying to create a search query in Java that looks like this:
{
"bool" : {
"filter" : [
{
"term" : {
"sender" : {
"value" : "sendervalue",
"boost" : 1.0
}
}
},
{
"term" : {
"data" : {
"value" : "YWRkTGlxdWlkaXR5UHJveHlAMDAwMDAwMDAwMDAwMDAwMDA1MDBlYmQzMDRjMmYzNGE2YjNmNmE1N2MxMzNhYjdiOGM2ZjgxZGM0MDE1NTQ4M0A3ZjE1YjEwODdmMjUwNzQ4QDBjMDU0YjcwNDhlMmY5NTE1ZWE3YWU=",
"boost" : 1.0
}
}
}
],
"adjust_pure_negative" : true,
"boost" : 1.0
}
}
If I run this query I get 0 hits. If I change the field "data" with other field it works. I don't understand what's different.
How I actually create the query in Java+SpringBoot:
QueryBuilder boolQuery = QueryBuilders.boolQuery()
.filter(QueryBuilders.termQuery("sender", "sendervalue"))
.filter(QueryBuilders.termQuery("data",
"YWRkTGlxdWlkaXR5UHJveHlAMDAwMDAwMDAwMDAwMDAwMDA1MDBlYmQzMDRjMmYzNGE2YjNmNmE1N2MxMzNhYjdiOGM2ZjgxZGM0MDE1NTQ4M0A3ZjE1YjEwODdmMjUwNzQ4QDBjMDU0YjcwNDhlMmY5NTE1ZWE3YWU="));
Query searchQuery = new NativeSearchQueryBuilder()
.withFilter(boolQuery)
.build();
SearchHits<ScResults> articles = elasticsearchTemplate.search(searchQuery, ScResults.class);
Since you're trying to do an exact match on a string with a term query, you need to do it on the data.keyword field which is not analyzed. Since the data field is a text field, hence analyzed by the standard analyzer, not only are all letters lowercased but the = sign at the end also gets stripped off, so there's no way this can match (unless you use a match query on the data field but then you'd not do exact matching anymore).
POST _analyze
{
"analyzer": "standard",
"text": "YWRkTGlxdWlkaXR5UHJveHlAMDAwMDAwMDAwMDAwMDAwMDA1MDBlYmQzMDRjMmYzNGE2YjNmNmE1N2MxMzNhYjdiOGM2ZjgxZGM0MDE1NTQ4M0A3ZjE1YjEwODdmMjUwNzQ4QDBjMDU0YjcwNDhlMmY5NTE1ZWE3YWU="
}
Results:
{
"tokens" : [
{
"token" : "ywrktglxdwlkaxr5uhjvehlamdawmdawmdawmdawmdawmda1mdblymqzmdrjmmyznge2yjnmnme1n2mxmznhyjdiogm2zjgxzgm0mde1ntq4m0a3zje1yjewoddmmjuwnzq4qdbjmdu0yjcwndhlmmy5nte1zwe3ywu",
"start_offset" : 0,
"end_offset" : 163,
"type" : "<ALPHANUM>",
"position" : 0
}
]
}
For example, I want to display the below format:
Italy Format : 1.325.000
Us Format : 1,325,000
etc.
But highchart display another format using the below response.
Based on locale it shows the value in tooltip dynamically.
I am trying with below response.
Highcharts.chart('container', {
"tooltip" : {
"shared" : true
},
"legend" : {
"enabled" : true,
"reversed" : false
},
"credits" : {
"enabled" : false
},
"exporting" : {
"enabled" : false
},
"chart" : {
"zoomType" : "xy"
},
"title" : {
"text" : "Financial analytic"
},
"xAxis" : [ {
"categories" : [ "Amar", "Kiran", "Venkatesh" ],
"crosshair" : true
} ],
"yAxis" : [ {
"title" : {
"text" : "Financial"
},
"labels" : {
"format" : "${value:.2f}USD"
}
} ],
"series" : [ {
"type" : "column",
"name" : "Financial",
"data" : [ 1325000.0, 1740000.0, 1560000.0 ],
"tooltip" : {
"valueDecimals" : 2,
"valuePrefix" : "$",
"valueSuffix" : "USD"
},
"yAxis" : 0
}]
});
You need to set thousandsSep in the lang options:
Highcharts.setOptions({
lang: {
thousandsSep: ','
}
});
Live demo: http://jsfiddle.net/BlackLabel/4m79t50g/1/
API Reference: https://api.highcharts.com/gantt/lang.thousandsSep
I have the following union:
{
"name" : "price",
"type" : [ "null", {
"type" : "array",
"items" : {
"type" : "record",
"name" : "PriceType",
"fields" : [ {
"name" : "text_value",
"type" : [ "null", "double" ],
"source" : "element text_value"
}, {
"name" : "currency",
"type" : [ "null", "string" ],
"default" : null,
"source" : "attribute currency"
} ]
}
} ],
"default" : null,
"source" : "element price"
}
From this union I get the schema of price field using this code:
Schema new_schema=schema.getField("price").schema();
Now I want to obtain the schema of the Union:
{
"type" : "array",
"items" : {
"type" : "record",
"name" : "PriceType",
"fields" : [ {
"name" : "text_value",
"type" : [ "null", "double" ],
"source" : "element text_value"
}, {
"name" : "currency",
"type" : [ "null", "string" ],
"default" : null,
"source" : "attribute currency"
}
How can I do this? And I How Do I insert a Union in a Record?
Since it's a union type you can extract the schema as follows:
new_schema.getTypes().get(0);
I am trying to build an AVRO's complex record with Union data type supported member record type.
{
"namespace": "proj.avro",
"protocol": "app_messages",
"doc" : "application messages",
"types": [
{
"name": "record_request",
"type" : "record",
"fields":
[
{
"name" : "request_id",
"type" : "int"
},
{
"name" : "message_type",
"type" : int,
},
{
"name" : "users",
"type" : "string"
}
]
},
{
"name" : "request_response",
"type" : "record",
"fields" :
[
{
"name" : "request_id",
"type" : "int"
},
{
"name" : "response_code",
"type" : "string"
},
{
"name" : "response_count",
"type" : "int"
},
{
"name" : "reason_code",
"type" : "string"
}
]
}
]
"messages" :
{
"published_msgs" :
{
"doc" : "My Messages",
"fields" :
[
{
"name" : "message_type",
"type" : "int"
},
{
"name" : "message_body",
"type" :
[
"record_request", "request_response"
]
}
]
}
}
}
I am getting error while trying to read this kind of schema.
I would like to know - is it possible to declare such AVRO schema - which has one of field which type is union of complex user defined message structure.
If its possible then could you please let me know what i am doing wrong or an example of such structure with union type field's type definition?
I want to use AVRO's dynamically schema usage - so specify this schema file run-time and parse the incoming buffer as "request"/"response".
Thanks,
It is possible define an union of complex types, the problem with your schema is that it is not defined at field level. Your schema must looks like this to achieve the union of complex types
{
"namespace": "proj.avro",
"protocol": "app_messages",
"doc" : "application messages",
"name": "myRecord",
"type" : "record",
"fields": [
{
"name": "requestResponse",
"type": [
{
"name": "record_request",
"type" : "record",
"fields":
[
{
"name" : "request_id",
"type" : "int"
},
{
"name" : "message_type",
"type" : "int"
},
{
"name" : "users",
"type" : "string"
}
]
},
{
"name" : "request_response",
"type" : "record",
"fields" :
[
{
"name" : "request_id",
"type" : "int"
},
{
"name" : "response_code",
"type" : "string"
},
{
"name" : "response_count",
"type" : "int"
},
{
"name" : "reason_code",
"type" : "string"
}
]
}
]
}
]
}
I have such structure of document:
{
"_id" : "4e76fd1e927e1c9127d1d2e8",
"name" : "***",
"embedPhoneList" : [
{
"type" : "家庭",
"number" : "00000000000"
},
{
"type" : "手机",
"number" : "00000000000"
}
],
"embedAddrList" : [
{
"type" : "家庭",
"addr" : "山东省诸城市***"
},
{
"type" : "工作",
"addr" : "深圳市南山区***"
}
],
"embedEmailList" : [
{
"email" : "********#gmail.com"
},
{
"email" : "********#gmail.com"
},
{
"email" : "********#gmail.com"
},
{
"email" : "********#gmail.com"
}
]
}
What I wan't to do is find the document by it's sub document,such as email in embedEmailList field.
Or if I have structure like this
{
"_id" : "4e76fd1e927e1c9127d1d2e8",
"name" : "***",
"embedEmailList" : [
"123#gmail.com" ,
"********#gmail.com" ,
]
}
the embedEmailList is array,how to find if there is 123#gmail.com?
Thanks.
To search for a specific value in an array, mongodb supports this syntax:
db.your_collection.find({embedEmailList : "foo#bar.com"});
See here for more information.
To search for a value in an embedded object, it supports this syntax:
db.your_collection.find({"embedEmailList.email" : "foo#bar.com"});
See here for more information.