Druid - longSum metrics are not populating - Java

I am doing batch ingestion in druid, by using the wikiticker-index.json file which comes with the druid quickstart.
Following is my data schema in wikiticker-index.json file.
{
  "type": "index_hadoop",
  "spec": {
    "ioConfig": {
      "type": "hadoop",
      "inputSpec": {
        "type": "static",
        "paths": "quickstart/wikiticker-2015-09-12-sampled.json"
      }
    },
    "dataSchema": {
      "dataSource": "wikiticker",
      "granularitySpec": {
        "type": "uniform",
        "segmentGranularity": "day",
        "queryGranularity": "none",
        "intervals": ["2015-09-12/2015-09-13"]
      },
      "parser": {
        "type": "hadoopyString",
        "parseSpec": {
          "format": "json",
          "dimensionsSpec": {
            "dimensions": [
              "channel",
              "cityName",
              "comment",
              "countryIsoCode",
              "countryName",
              "isAnonymous",
              "isMinor",
              "isNew",
              "isRobot",
              "isUnpatrolled",
              "metroCode",
              "namespace",
              "page",
              "regionIsoCode",
              "regionName",
              "user"
            ]
          },
          "timestampSpec": {
            "format": "auto",
            "column": "time"
          }
        }
      },
      "metricsSpec": [
        { "name": "count", "type": "count" },
        { "name": "added", "type": "longSum", "fieldName": "added" },
        { "name": "deleted", "type": "longSum", "fieldName": "deleted" },
        { "name": "delta", "type": "longSum", "fieldName": "delta" },
        { "name": "user_unique", "type": "hyperUnique", "fieldName": "user" }
      ]
    },
    "tuningConfig": {
      "type": "hadoop",
      "partitionsSpec": {
        "type": "hashed",
        "targetPartitionSize": 5000000
      },
      "jobProperties": {}
    }
  }
}
After ingesting the sample JSON, only some of the metrics show up. I am unable to find the longSum metrics, i.e. added, deleted and delta.
Any particular reason? Does anybody know about this?

The OP confirmed that this comment from Slim Bougerra worked:
You need to add the metrics yourself in the Superset UI; Superset doesn't populate the metrics automatically.
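If you want to rule out an ingestion problem before blaming the UI, you can query the datasource directly on the Druid Broker. The following is only a sketch, assuming the quickstart Broker on localhost:8082 and a Java 11+ HttpClient (the JSON text block needs Java 15+). It posts a native timeseries query that sums the three longSum metrics; a non-empty, non-zero result confirms that added, deleted and delta were ingested, which points at the Superset side rather than at Druid.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class DruidMetricCheck {
    public static void main(String[] args) throws Exception {
        // Native timeseries query summing the three longSum metrics over the ingested interval.
        String query = """
            {
              "queryType": "timeseries",
              "dataSource": "wikiticker",
              "granularity": "all",
              "intervals": ["2015-09-12/2015-09-13"],
              "aggregations": [
                {"type": "longSum", "name": "added",   "fieldName": "added"},
                {"type": "longSum", "name": "deleted", "fieldName": "deleted"},
                {"type": "longSum", "name": "delta",   "fieldName": "delta"}
              ]
            }
            """;

        // The quickstart Broker listens on port 8082; adjust host/port for your cluster.
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8082/druid/v2/"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(query))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}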

Related

How to upload JSON data or a file into Elasticsearch using Java?

This is my sample JSON data coming from a .json file. I want to bulk insert it into Elasticsearch dynamically so that I can perform operations on it. Can someone help me with Java code to add this data dynamically? This is just a piece of 5-6 objects like this; I have more than 500 objects.
[{
"data1" : "developer",
"data2" : "categorypos",
"data3" : "1001"
},
{
"data1" : "developer",
"data2" : "developerpos",
"data3" : "1002"
},
{
"data1" : "developer",
"data2" : "developpos",
"data3" : "1003"
},
{
"data1" : "support",
"data2" : "datapos",
"data3" : "1004"
}
]
Elasticsearch provides bulk operations; the following documentation might help:
https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html
You need to read the file from your application, iterate over the array and, for each document, send it to Elasticsearch.
To do the latter, you can use the BulkProcessor class.
// esClient is the RestHighLevelClient already used by the application.
BulkProcessor bulkProcessor = BulkProcessor.builder(
        (request, bulkListener) -> esClient.bulkAsync(request, RequestOptions.DEFAULT, bulkListener),
        new BulkProcessor.Listener() {
            @Override
            public void beforeBulk(long executionId, BulkRequest request) { }

            @Override
            public void afterBulk(long executionId, BulkRequest request, BulkResponse response) { }

            @Override
            public void afterBulk(long executionId, BulkRequest request, Throwable failure) { }
        })
        .setBulkActions(10000)
        .setFlushInterval(TimeValue.timeValueSeconds(5))
        .build();
For each json document, call:
bulkProcessor.add(new IndexRequest("INDEXNAME").source(json, XContentType.JSON));
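To tie this to the file from the question, here is a hedged sketch that reads the JSON array with Jackson and feeds each object to the bulk processor built above. It assumes the 7.x high-level REST client and Jackson are on the classpath; the data.json path is a placeholder, while INDEXNAME comes from the snippet above.

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.elasticsearch.action.bulk.BulkProcessor;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.common.xcontent.XContentType;

import java.io.File;

public class BulkLoader {

    // Reads the JSON array from disk and queues one index request per object.
    public static void loadFile(BulkProcessor bulkProcessor) throws Exception {
        ObjectMapper mapper = new ObjectMapper();
        JsonNode array = mapper.readTree(new File("data.json")); // placeholder path

        for (JsonNode doc : array) {
            bulkProcessor.add(new IndexRequest("INDEXNAME")
                    .source(doc.toString(), XContentType.JSON));
        }

        // Flush whatever is still buffered and release the scheduler threads.
        bulkProcessor.close();
    }
}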

Groovy Jenkins script println doesn't work

As part of my promote-artifact task I am creating a query to send to Artifactory. Unfortunately, it does not work and I am trying to debug it step by step. Here is the message I am preparing to send. Somehow println prints "check", but does not show anything for message in the logs. Why is that?
stage('Promote') {
    id = "0.1-2020-01-28-18-08.zip"
    try {
        message = """
        items.find(
            {
                "$and":[
                    { "repo": {"$eq": "generic-dev-local"} },
                    { "path": {"$match": "mh/*"} },
                    { "name": {"$eq": ${id}}}
                ]
            }
        ).include("artifact.module.build.number")
        """
        println "check"
        println message
    } catch (e) {
        return [''] + e.toString().tokenize('\n')
    }
}

DynamoDB updated only partially

I want to update my DynamoDB table from Java using the DynamoDBMapper library.
What I do is push messages (the updates I want executed) to an SQS queue and let my Java code consume those messages and update DynamoDB.
I found that when I push more than 150 messages in a short time using a script, all the messages are consumed, but only some of the records in DynamoDB are updated.
The code that updates DynamoDB looks like this:
@Service
public class PersistenceMessageProcessingServiceImpl implements PersistenceMessageProcessingService {

    @Override
    public void process(TextMessage textMessage) {
        String eventData = textMessage.getText();
        updateEventStatus(eventData);
    }

    /*
     * Each input is a case-detail message for the Event table:
     * get the data, parse it and partially update the related records in DynamoDB.
     * Finally, check whether any cases are still open; if not, change the state of the event.
     */
    private void updateEventStatus(String eventData) throws ParseException, IOException {
        RetryUtils retryUtils = new RetryUtils(maxRetries, waitTimeInMilliSeconds, influxService);
        SNowResponse serviceNowResponse = parseData(eventData);
        // sysId is taken from the parsed event data (extraction omitted here)
        EventCaseMap eventCaseMap = eventCaseMapRepository.findBySysId(sysId);
        if (eventCaseMap != null) {
            Event event = eventRepository.findByEventId(eventCaseMap.getSecurityEventManagerId());
            CaseManagementDetails caseManagementDetails = event.getCaseManagementDetails();
            Case caseDetails = getCaseByCaseSystemId(caseManagementDetails, sysId);
            caseDetails.setCaseStatus("Resolved");
            caseDetails.setResolution(serviceNowResponse.getCloseCode());
            caseDetails.setResolvedBy("A");
            caseDetails.setAssessment(serviceNowResponse.getAssessment());
            caseDetails.setResolutionSource("SEM");
            retryUtils.run(() -> {
                return eventRepository.updateEvent(event);
            }, RETRY_MEASUREMENT);

            boolean stillOpen = false;
            for (Case existingCase : caseManagementDetails.getCases()) {
                if ("OPEN".equals(existingCase.getCaseStatus().toString())) {
                    stillOpen = true;
                    break;
                }
            }
            if (!stillOpen) {
                event.setState("CLOSED");
            }
        }
    }

    private Case getCaseByCaseSystemId(CaseManagementDetails caseManagementDetails, String sysId) {
        Case caseDetails = null;
        if (caseManagementDetails != null) {
            List<Case> caseList = caseManagementDetails.getCases();
            for (Case c : caseList) {
                if (c.getCaseSystemId() != null && c.getCaseSystemId().equalsIgnoreCase(sysId)) {
                    caseDetails = c;
                    break;
                }
            }
        }
        return caseDetails;
    }
}
/* EventCaseMap Table in my DynamoDB
data model is like this for EventCaseMap Table:
{
"caseSystemId": "bb9cc488dbf67b40b3d57709af9619f8",
"securityEventManagerId": "756813a4-4e48-4abb-b37e-da00e931583b"
}
*/
@Repository
public class EventCaseMapRepositoryImpl implements EventCaseMapRepository {

    @Autowired
    DynamoDBMapper dynamoDBMapper;

    @Override
    public EventCaseMap findBySysId(String sysId) {
        EventCaseMap eventCaseMap = new EventCaseMap();
        eventCaseMap.setCaseSystemId(sysId);
        return dynamoDBMapper.load(eventCaseMap, DynamoDBMapperConfig.ConsistentReads.CONSISTENT.config());
    }
}
/*
data model is like this for Event Table:
{
"caseManagementDetails": {
"cases": [
{
"caseId": "SIR0123456",
"caseStatus": "OPEN",
},
{
"caseId": "SIR0654321",
"caseStatus": "OPEN",
},
{
... many other cases (about two hundred) ...
}
]
},
"state": "OPEN",
"securityEventManagerId": "756813a4-4e48-4abb-b37e-da00e931583b"
}
*/
@Repository
public class EventRepositoryImpl implements EventRepository {

    @Autowired
    DynamoDBMapper dynamoDBMapper;

    @Override
    public Event findByEventId(String eventId) {
        Event event = new Event();
        event.setSecurityEventManagerId(eventId);
        return dynamoDBMapper.load(event, DynamoDBMapperConfig.ConsistentReads.CONSISTENT.config());
    }

    @Override
    public boolean updateEvent(Event event) {
        dynamoDBMapper.save(event, DynamoDBMapperConfig.SaveBehavior.UPDATE_SKIP_NULL_ATTRIBUTES.config());
        return false;
    }
}
I already tried pushing and consuming the messages one by one, in both 'Run' and 'Debug' mode in IntelliJ. Everything works fine and all the cases are updated.
So I was wondering whether there is an inconsistency problem in DynamoDB, but I am already using strongly consistent reads in my code.
Does anybody know what is happening in my code?
Here are the input, the output and the expected output:
input:
many JSON files like this:
{
"number": "SIR0123456",
"state": "Resolved",
"sys_id": "bb9cc488dbf67b40b3d57709af9619f8",
"MessageAttributes": {
"TransactionGuid": {
"Type": "String",
"Value": "093ddb36-626b-4ecc-8943-62e30ffa2e26"
}
}
}
{
"number": "SIR0654321",
"state": "Resolved",
"sys_id": "bb9cc488dbf67b40b3d57709af9619f7",
"MessageAttributes": {
"TransactionGuid": {
"Type": "String",
"Value": "093ddb36-626b-4ecc-8943-62e30ffa2e26"
}
}
}
output for Event Table:
{
"caseManagementDetails": {
"cases": [
{
"caseId": "SIR0123456",
"caseStatus": "RESOLVED",
},
{
"caseId": "SIR0654321",
"caseStatus": "OPEN"
},
{
... many other cases (about two hundred) ...
}
]
},
"state": "OPEN",
"securityEventManagerId": "756813a4-4e48-4abb-b37e-da00e931583b"
}
Expected output for Event Table:
{
"caseManagementDetails": {
"cases": [
{
"caseId": "SIR0123456",
"caseStatus": "RESOLVED",
},
{
"caseId": "SIR0654321",
"caseStatus": "RESOLVED"
},
{
... many other cases (about two hundred) ...
}
]
},
"state": "OPEN",
"securityEventManagerId": "756813a4-4e48-4abb-b37e-da00e931583b"
}
I think the problem is that the messages are consumed and persisted to DynamoDB in a multi-threaded way. If all of the messages are consumed within a short time, several threads read and write the same item concurrently and some of them do not finish before others overwrite their changes, so the result we see is only that of the last thread rather than of all threads.
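If that last-write-wins hypothesis is correct, one common mitigation (a sketch under that assumption, not the poster's actual fix) is DynamoDBMapper's optimistic locking: add a version attribute to the Event model so every save becomes a conditional write, and retry when another consumer saved first. The OptimisticEventWriter class and the version field below are hypothetical names.

import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBMapper;
import com.amazonaws.services.dynamodbv2.model.ConditionalCheckFailedException;

import java.util.function.Consumer;

public class OptimisticEventWriter {

    // In the Event model you would add a hypothetical version field:
    // @DynamoDBVersionAttribute
    // private Long version;
    // With it, every DynamoDBMapper.save() becomes a conditional write.

    private final DynamoDBMapper dynamoDBMapper;

    public OptimisticEventWriter(DynamoDBMapper dynamoDBMapper) {
        this.dynamoDBMapper = dynamoDBMapper;
    }

    // Re-reads the event and re-applies the change until the conditional write
    // succeeds, so concurrent consumers stop overwriting each other's case updates.
    public void updateWithRetry(String eventId, Consumer<Event> change) {
        while (true) {
            Event key = new Event();
            key.setSecurityEventManagerId(eventId);
            Event current = dynamoDBMapper.load(key);

            change.accept(current); // e.g. mark one case as Resolved

            try {
                dynamoDBMapper.save(current); // fails if another thread saved first
                return;
            } catch (ConditionalCheckFailedException e) {
                // Version mismatch: loop and retry on fresh data.
            }
        }
    }
}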

Parsing JSON from API with GSON [duplicate]

This question already has answers here:
Why does Gson fromJson throw a JsonSyntaxException: Expected BEGIN_OBJECT but was BEGIN_ARRAY?
(2 answers)
Closed 3 years ago.
I'm making an API call with Retrofit; the problem is the JSON I get from the call. Originally it was a simple JSON response with an array in it.
[
{
"herbID": 1,
"nameTrival": "Baldrian",
"nameWissenschaft": "Valeriana officinalis",
....
},
{
"herbID": 2,
"nameTrival": "Ringelblume",
"nameWissenschaft": "Calendula officinalis",
....
},
....
]
The new call looks like this
[
[
{
"nameTrival": "Baldrian",
"nameWissenschaft": "Valeriana officinalis",
"hoeheFrom": 20,
"hoeheTo": 200,
"familie": "Baldriangewaechse",
"pflanzentype": "Staude",
"auffaelligkeiten": "Je nach Standort sind die Fiederblätter schmäler oder breiter sowie dunkel- oder hellgrün, oft auch unterschiedlich geformt."
}
],
[
{
"standort": "Ufer"
},
{
"standort": "Graben"
},
{
"standort": "Wiesen"
},
{
"standort": "Waldrand"
}
],
[
{
"gebiet": "Nordeuropa"
},
{
"gebiet": "Südeuropa"
},
{
"gebiet": "Westeuropa"
},
{
"gebiet": "Osteuropa"
},
{
"gebiet": "Südosteuropa"
},
{
"gebiet": "Mitteleuropa"
},
{
"gebiet": "Südwesteuropa"
},
{
"gebiet": "Nordosteuoropa"
},
{
"gebiet": "Nordwesteuropa"
}
],
{
"fieldCount": 0,
"affectedRows": 0,
"insertId": 0,
"serverStatus": 34,
"warningCount": 0,
"message": "",
"protocol41": true,
"changedRows": 0
}
]
I parsed the first JSON with the following code:
Call<List<Herb>> call = service.getAllHerbs();
call.enqueue(new Callback<List<Herb>>() {
    @Override
    public void onResponse(Call<List<Herb>> call, Response<List<Herb>> response) {
        herbList = response.body();
        loadDataList(herbList);
    }

    @Override
    public void onFailure(Call<List<Herb>> call, Throwable t) {
        Toast.makeText(PlantListActivity.this, "Unable to load herbs\nCheck your internet connection", Toast.LENGTH_LONG).show();
    }
});
The new class looks like this:
public class Herb {

    ArrayList<Botanical> botanical;
    ArrayList<Location> locations;
    ArrayList<Area> areas;

    public ArrayList<Botanical> getBotanical() {
        return botanical;
    }

    public ArrayList<Location> getLocations() {
        return locations;
    }

    public ArrayList<Area> getAreas() {
        return areas;
    }
}
With the new JSON and this class it always fails with the error "java.lang.IllegalStateException: Expected BEGIN_OBJECT but was BEGIN_ARRAY at line 1 column 2 path $".
Didn't I tell Gson that the following is an array by declaring the fields as ArrayLists? What is wrong with my class?
EDIT:
The difference between the possible duplicate and my question is that the other one has named arrays. My JSON arrays don't have names, which is why I have a hard time parsing them.
If you want to handle the response with your Herb class, then you need to modify the service response.
Rather than
[{},{},{}] // this is much more difficult to handle
try this, which has to be done at the server end:
{
"botanical":[],
"area" : [],
"location" : []
}
Then the call will be Call<Herb> someMethod();
If that change is not possible, then handle it as a JsonArray:
Call<JsonArray> someMethod();
This way you can handle the existing response, fetch values by key and parse them as required. But it is not recommended, because it is difficult to maintain if the response changes in the future and it requires a lot of manual work.
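If the backend cannot change, here is a hedged sketch of the JsonArray route: the outer array is read by position (positions taken from the sample response above) and each inner array is converted to the matching list type. It assumes the Botanical, Location and Area classes from the question and that this helper lives in the same package as Herb, since Herb's fields are package-private.

import com.google.gson.Gson;
import com.google.gson.JsonArray;
import com.google.gson.reflect.TypeToken;

import java.lang.reflect.Type;
import java.util.ArrayList;

public class HerbParser {

    private static final Gson GSON = new Gson();

    // Builds a Herb from the positional outer array of the new response:
    // index 0 = botanical details, 1 = locations, 2 = areas, 3 = server status (ignored).
    public static Herb fromResponse(JsonArray response) {
        Type botanicalType = new TypeToken<ArrayList<Botanical>>() {}.getType();
        Type locationType = new TypeToken<ArrayList<Location>>() {}.getType();
        Type areaType = new TypeToken<ArrayList<Area>>() {}.getType();

        Herb herb = new Herb();
        herb.botanical = GSON.fromJson(response.get(0), botanicalType);
        herb.locations = GSON.fromJson(response.get(1), locationType);
        herb.areas = GSON.fromJson(response.get(2), areaType);
        return herb;
    }
}

With Call<JsonArray> someMethod(); in the Retrofit interface, the callback can then do Herb herb = HerbParser.fromResponse(response.body());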
The problem is you're telling Gson you have an object of your type. You don't. You have an array of objects of your type. You can't just cast the result like that and expect it to magically work ;)
The user guide for Gson explains how to deal with this.
Your new response is very awkward. It is very hard to parse that kind of JSON automatically: it represents an array of objects of different types, so Gson cannot map it on its own; you have to create a class (or custom deserialization) that matches the entire response.
Your first JSON response is fine.
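Another option, sketched on the assumption that the endpoint returns one such outer array per request, is to register a custom JsonDeserializer with the Gson instance Retrofit uses, so the service interface can keep a typed Call<Herb>. The base URL is a placeholder, and the positional work is delegated to the HerbParser helper sketched above.

import com.google.gson.Gson;
import com.google.gson.GsonBuilder;
import com.google.gson.JsonDeserializationContext;
import com.google.gson.JsonDeserializer;
import com.google.gson.JsonElement;
import retrofit2.Retrofit;
import retrofit2.converter.gson.GsonConverterFactory;

import java.lang.reflect.Type;

public class HerbDeserializer implements JsonDeserializer<Herb> {

    @Override
    public Herb deserialize(JsonElement json, Type typeOfT, JsonDeserializationContext context) {
        // The top-level element is the positional array, not an object.
        return HerbParser.fromResponse(json.getAsJsonArray());
    }

    // Wiring it into Retrofit (base URL is a placeholder):
    public static Retrofit buildRetrofit() {
        Gson gson = new GsonBuilder()
                .registerTypeAdapter(Herb.class, new HerbDeserializer())
                .create();
        return new Retrofit.Builder()
                .baseUrl("https://example.com/api/")
                .addConverterFactory(GsonConverterFactory.create(gson))
                .build();
    }
}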

How can I convert my existing MongoDB query into Spring Boot using the Aggregation class

I wrote a MongoDB query and am facing some issues while converting it to Spring Boot using the Aggregation class. Please help me convert it.
db.api_audit.aggregate([{
$match: {
merchant_id: '015994832961',
request_time: {$gte: ISODate("2017-05-11T00:00:00.0Z"),
$lt: ISODate("2017-05-12T00:00:00.0Z")}}},
{
$group:
{
_id: {
SERVICE_NAME: "$service_name",
STATUS: "$status"
},
count: {
"$sum": 1
}
}
}, {
$group: {
_id: "$_id.SERVICE_NAME",
STATUSCOUNT: {
$push: {
Service_name: "$_id.STATUS",
count: "$count"
}
}
}
},
{ $sort : { "STATUSCOUNT.count" : -1} }
])
Below is the db query response
{
"_id" : "sendOTP",
"STATUSCOUNT" : [
{
"status" : "SUCCESS",
"count" : 2.0
}
]
}
Thanks in advance.
First you create all the required operations and then you add them to an aggregation pipeline. Then you feed it to an autowired MongoTemplate.
Something like this:
// Static imports assumed:
// import static org.springframework.data.mongodb.core.aggregation.Aggregation.*;
// import static org.springframework.data.mongodb.core.query.Criteria.where;
@Autowired
private MongoTemplate mongoTemplate;

void aggregate() {
    Date from = Date.from(Instant.parse("2017-05-11T00:00:00Z"));
    Date to = Date.from(Instant.parse("2017-05-12T00:00:00Z"));
    MatchOperation matchOperation = match(where("merchant_id").is("015994832961")
            .and("request_time").gte(from).lt(to));
    GroupOperation groupOperation1 = group(fields().and("SERVICE_NAME", "service_name").and("STATUS", "status"))
            .count().as("count");
    GroupOperation groupOperation2 = ...//(not sure how push works here, but it should not be hard to figure out)
    SortOperation sortOperation = sort(Sort.Direction.DESC, "STATUSCOUNT.count");
    Aggregation aggregation = Aggregation.newAggregation(matchOperation, groupOperation1, groupOperation2, sortOperation);
    List<Result> results = mongoTemplate.aggregate(aggregation, ObjectOfCollectionToRunOn.class, Result.class).getMappedResults();
}
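For the second $group that the snippet above leaves as "...", a hedged completion could look like the following. The field mapping is a best-effort guess from the shell pipeline, so it is worth printing the rendered stages with aggregation.toString() to confirm they match the original query.

// Hypothetical completion: group on the service name exposed by the first $group and
// $push one {Service_name, count} document per status, mirroring the shell query.
GroupOperation groupOperation2 = group("SERVICE_NAME")
        .push(new org.bson.Document("Service_name", "$_id.STATUS").append("count", "$count"))
        .as("STATUSCOUNT");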
