I am using Apache Calcite to read data from Excel.
The Excel workbook has a 'salary' table with the following fields:
Integer id
Integer emp_id
Integer salary
I have the following model.json:
{
"version": "1.0",
"defaultSchema": "excelSchema",
"schemas": [{
"name" : "excelSchema",
"type": "custom",
"factory": "com.syncnicia.testbais.excel.ExcelSchemaFactory",
"operand": {
"directory": "sheets/"
}
}]
}
This is my Calcite connection code:
Connection connection = DriverManager.getConnection("jdbc:calcite:model=src/main/resources/model.json");
CalciteConnection calciteConnection = connection.unwrap(CalciteConnection.class);
I am able to read data over this connection using the following code:
Statement st1 = calciteConnection.createStatement();
ResultSet resultSet = st1.executeQuery("select * from \"excelSchema\".\"salary\"");
System.out.println("SALARY DATA IS");
while (resultSet.next()) {
    System.out.print("SALARY data is : ");
    for (int i2 = 1; i2 <= resultSet.getMetaData().getColumnCount(); i2++) {
        System.out.print(resultSet.getMetaData().getColumnLabel(i2) + " = " + resultSet.getObject(i2) + ", ");
    }
    System.out.println();
}
The above code works fine and shows all entries from the salary table. However, when I try to insert into the same Excel table with the following code:
String insertSql = "INSERT INTO \"excelSchema\".\"salary\" values(5,345,0909944)";
Statement insertSt = calciteConnection.createStatement();
boolean insertResult = insertSt.execute(insertSql);
System.out.println("InsertResult is "+insertResult);
I am getting the following exception:
Exception in execute qry Error while executing SQL "INSERT INTO "employeeSchema"."salary" values(5,345,0909944)": There are not enough rules to produce a node with desired properties: convention=ENUMERABLE, sort=[].
Missing conversion is LogicalTableModify[convention: NONE -> ENUMERABLE]
There is 1 empty subset: rel#302:Subset#1.ENUMERABLE.[], the relevant part of the original plan is as follows
299:LogicalTableModify(table=[[employeeSchema, salary]], operation=[INSERT], flattened=[false])
293:LogicalValues(subset=[rel#298:Subset#0.NONE.[]], tuples=[[{ 5, 345, 909944 }]])
Root: rel#302:Subset#1.ENUMERABLE.[]
Original rel:
LogicalTableModify(table=[[employeeSchema, salary]], operation=[INSERT], flattened=[false]): rowcount = 1.0, cumulative cost = {2.0 rows, 1.0 cpu, 0.0 io}, id = 296
LogicalValues(tuples=[[{ 5, 345, 909944 }]]): rowcount = 1.0, cumulative cost = {1.0 rows, 1.0 cpu, 0.0 io}, id = 293
Sets:
Set#0, type: RecordType(INTEGER id, INTEGER emp_id, INTEGER salary)
rel#298:Subset#0.NONE.[], best=null, importance=0.81
rel#293:LogicalValues.NONE.[[0, 1, 2], [1, 2], [2]](type=RecordType(INTEGER id, INTEGER emp_id, INTEGER salary),tuples=[{ 5, 345, 909944 }]), rowcount=1.0, cumulative cost={inf}
rel#305:Subset#0.ENUMERABLE.[], best=rel#304, importance=0.405
rel#304:EnumerableValues.ENUMERABLE.[[0, 1, 2], [1, 2], [2]](type=RecordType(INTEGER id, INTEGER emp_id, INTEGER salary),tuples=[{ 5, 345, 909944 }]), rowcount=1.0, cumulative cost={1.0 rows, 1.0 cpu, 0.0 io}
Set#1, type: RecordType(BIGINT ROWCOUNT)
rel#300:Subset#1.NONE.[], best=null, importance=0.9
rel#299:LogicalTableModify.NONE.[](input=RelSubset#298,table=[employeeSchema, salary],operation=INSERT,flattened=false), rowcount=1.0, cumulative cost={inf}
rel#302:Subset#1.ENUMERABLE.[], best=null, importance=1.0
rel#303:AbstractConverter.ENUMERABLE.[](input=RelSubset#300,convention=ENUMERABLE,sort=[]), rowcount=1.0, cumulative cost={inf}
Graphviz:
digraph G {
root [style=filled,label="Root"];
subgraph cluster0{
label="Set 0 RecordType(INTEGER id, INTEGER emp_id, INTEGER salary)";
rel293 [label="rel#293:LogicalValues.NONE.[[0, 1, 2], [1, 2], [2]]\ntype=RecordType(INTEGER id, INTEGER emp_id, INTEGER salary),tuples=[{ 5, 345, 909944 }]\nrows=1.0, cost={inf}",shape=box]
rel304 [label="rel#304:EnumerableValues.ENUMERABLE.[[0, 1, 2], [1, 2], [2]]\ntype=RecordType(INTEGER id, INTEGER emp_id, INTEGER salary),tuples=[{ 5, 345, 909944 }]\nrows=1.0, cost={1.0 rows, 1.0 cpu, 0.0 io}",color=blue,shape=box]
subset298 [label="rel#298:Subset#0.NONE.[]"]
subset305 [label="rel#305:Subset#0.ENUMERABLE.[]"]
}
subgraph cluster1{
label="Set 1 RecordType(BIGINT ROWCOUNT)";
rel299 [label="rel#299:LogicalTableModify\ninput=RelSubset#298,table=[employeeSchema, salary],operation=INSERT,flattened=false\nrows=1.0, cost={inf}",shape=box]
rel303 [label="rel#303:AbstractConverter\ninput=RelSubset#300,convention=ENUMERABLE,sort=[]\nrows=1.0, cost={inf}",shape=box]
subset300 [label="rel#300:Subset#1.NONE.[]"]
subset302 [label="rel#302:Subset#1.ENUMERABLE.[]",color=red]
}
root -> subset302;
subset298 -> rel293;
subset305 -> rel304[color=blue];
subset300 -> rel299; rel299 -> subset298;
subset302 -> rel303; rel303 -> subset300;
} caused by org.apache.calcite.plan.RelOptPlanner$CannotPlanException: There are not enough rules to produce a node with desired properties: convention=ENUMERABLE, sort=[].
Missing conversion is LogicalTableModify[convention: NONE -> ENUMERABLE]
Please help me understand how to insert data into Excel using Apache Calcite.
Unfortunately, Calcite doesn't support insertion for most of the available adapters. (I believe only for JDBC data sources at the moment.) That is exactly what the planner error says: there is no rule to convert the LogicalTableModify into an executable (ENUMERABLE) node. To make a custom adapter writable, its tables would have to implement an interface such as org.apache.calcite.schema.ModifiableTable.
Related
I have a list of parameters to query, and I want the results sorted in the same order as that parameter list.
In MySQL it looks like this:
select * from tableA
where id in (3, 1, 2)
order by field(id, 3, 1, 2)
How can I achieve the same effect in Elasticsearch?
"query":{
"bool":{
"must":{
{"terms":{ "xId" : #[givenIdList] }}
}
}
}
"sort":{how to sort by #[givenIdList]?}
Thanks for any suggestions.
The idea is that, given the sorted list [3, 1, 2], you should return the smallest score for 3, a bigger one for 1, and the biggest for 2. The simplest such function maps each array element to its index: for 3 return 0, for 1 return 1, and for 2 return 2.
Concretely, you need a function that may look like this:
def myList = [3, 1, 2];
// Declares a map literal
Map m= [:];
// Generate from [3, 1, 2] the mapping { 3 = 0.0, 1 = 1.0, 2 = 2.0 }
def i = 0;
for (x in myList) {
m[x] = (double)i++;
}
// Extract the xId from the document
def xId = (int)doc['xId'].value;
// Return the mapped value, e.g., for 3 return 0
return m[xId];
Obviously, you can improve the performance by passing the map directly as a parameter to the script, as reported here.
In this case the script reduces to:
def xId = doc['xId'].value.toString();
return params.m[xId];
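Outside Elasticsearch, the same index-map trick can be sketched in plain Java (a hypothetical helper, not part of any ES client; names are my own):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SortByGivenOrder {
    // Builds an element -> position map from the given order list,
    // then sorts the results by that position.
    static List<Integer> sortByListOrder(List<Integer> results, List<Integer> order) {
        Map<Integer, Integer> rank = new HashMap<>();
        for (int i = 0; i < order.size(); i++) {
            rank.put(order.get(i), i);
        }
        List<Integer> sorted = new ArrayList<>(results);
        // Ids missing from the order list are pushed to the end.
        sorted.sort(Comparator.comparingInt(id -> rank.getOrDefault(id, Integer.MAX_VALUE)));
        return sorted;
    }

    public static void main(String[] args) {
        // prints [3, 1, 2]
        System.out.println(sortByListOrder(Arrays.asList(1, 2, 3), Arrays.asList(3, 1, 2)));
    }
}
```

This is exactly what the Painless script does inside the sort: the map lookup replaces the document's id with its rank in the given list.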
FULL EXAMPLE
Index the data
POST _bulk
{"index": { "_index": "test", "_id": 1}}
{"xId": 1}
{"index": { "_index": "test", "_id": 2}}
{"xId": 2}
{"index": { "_index": "test", "_id": 3}}
{"xId": 3}
{"index": { "_index": "test", "_id": 4}}
{"another_field": "hello"}
Complete example with the list approach:
GET test/_search
{
"query": {
"terms": {
"xId": [3, 1, 2]
}
},
"sort": {
"_script": {
"type": "number",
"script": {
"lang": "painless",
"params": {
"list": [3, 1, 2]
},
"source": """
Map m= [:];
def i = 0;
for (x in params.list) {
m[x] = (double)i++;
}
def xId = (int)doc['xId'].value;
m[xId];
"""
},
"order": "asc"
}
}
}
Complete example with the map approach
GET test/_search
{
"sort": {
"_script": {
"type": "number",
"script": {
"lang": "painless",
"params": {
"list": [3, 1, 2],
"map": {
"3": 0.0,
"1": 1.0,
"2": 2.0
}
},
"source": """
def xId = doc['xId'].value.toString();
params.map[xId];
"""
},
"order": "asc"
}
},
"query": {
"terms": {
"xId": [3, 1, 2]
}
}
}
FINAL NOTES
The script is simplified by the fact that there is a terms query which guarantees that only documents whose ids are present in the map are considered. If that is not the case, you should handle both a missing xId and a missing key in the map.
You should also be careful with types. When you retrieve a field from a stored document, it comes back with its indexed type; e.g., xId is stored as long and is retrieved as long. In the second example the map goes from string to double, therefore xId is converted into a string before being used as a key in the map.
Hi, I'm attempting to deserialize (via deserializer.deserialize) this data from Google Analytics:
[[/s417, 14945, 93.17823577906019], [/s413, 5996, 72.57178438000356],
[/s417/, 3157, 25.690567351200837], [/s420, 2985, 44.12472727272727],
[/s418, 2540, 64.60275150472916], [/s416, 2504, 69.72643979057591],
[/s415, 2379, 44.69660861594867], [/s422, 2164, 57.33786505538772],
[/s421, 2053, 48.18852894317578], [/s414, 1839, 93.22588376273218],
[/s412, 1731, 54.8431860609832], [/s411, 1462, 71.26186830015314],
[/s419, 1423, 51.88551401869159], [/, 63, 11.303571428571429],
[/s420/, 22, 0.3333333333333333], [/s413/, 21, 7.947368421052632],
[/s416/, 16, 96.0], [/s421/, 15, 0.06666666666666667], [/s411/, 13,
111.66666666666667], [/s422/, 13, 0.07692307692307693], [/g150, 11, 0.09090909090909091], [/s414/, 10, 2.0], [/s418/, 10, 0.4444444444444444], [/s415/, 9, 0.2222222222222222], [/s412/, 8, 0.6666666666666666], [/s45, 6, 81.0], [/s164, 5, 45.25], [/s28, 5, 16.2], [/s39, 5, 25.2], [/s27, 4, 59.5], [/s29, 4, 26.5], [/s365, 3, 31.666666666666668], [/s506, 3, 23.333333333333332], [/s1139, 2, 30.5], [/s296, 2, 11.0], [/s311, 2, 13.5], [/s35, 2, 55.0], [/s363, 2, 15.5], [/s364, 2, 17.5], [/s419/, 2, 0.0], [/s44, 2, 85.5], [/s482, 2, 28.5], [/s49, 2, 29.5], [/s9, 2, 77.0], [/s146, 1, 13.0], [/s228, 1, 223.0], [/s229, 1, 54.0], [/s231, 1, 0.0], [/s30, 1, 83.0], [/s312, 1, 15.0], [/s313, 1, 155.0], [/s316, 1, 14.0], [/s340, 1, 22.0], [/s350, 1, 0.0], [/s362, 1, 24.0], [/s43, 1, 54.0], [/s442, 1, 87.0], [/s465,
1, 14.0], [/s468, 1, 67.0], [/s47, 1, 41.0], [/s71, 1, 16.0], [/s72,
1, 16.0], [/s87, 1, 48.0], [/s147, 0, 0.0], [/s417, 0, 0.0]]
With this:
@Immutable
private static JSONDeserializer<List<List<String>>> deserializer = new JSONDeserializer<List<List<String>>>();
And it fails silently on deserialization.
The only error I'm getting is from the XHTML:
com.sun.faces.context.PartialViewContextImpl$PhaseAwareVisitCallback
visit
SEVERE: javax.el.ELException: /views/guide/edit.xhtml #257,102 value="#{GuideEditController.visitsByScene}": flexjson.JSONException:
Missing value at character 2
Any clues?
marekful had the right idea:
a replaceAll to strip the offending characters did the trick (in Java source, something like replaceAll("[^\\d.,\\[\\]]+", ""), keeping only digits, dots, commas and brackets).
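The root problem is that the path tokens like /s417 are unquoted, so the payload is not valid JSON. An alternative to stripping them is quoting them; here is a minimal sketch in plain Java (the regex is my own assumption based on the data shown; flexjson is not used here):

```java
public class GaCleanup {
    // Wraps the unquoted path tokens (e.g. /s417) in double quotes so the
    // nested array becomes valid JSON; numbers, commas and brackets are untouched.
    static String clean(String raw) {
        return raw.replaceAll("(/[A-Za-z0-9/]*)", "\"$1\"");
    }

    public static void main(String[] args) {
        String raw = "[[/s417, 14945, 93.17], [/s413, 5996, 72.57]]";
        // prints [["/s417", 14945, 93.17], ["/s413", 5996, 72.57]]
        System.out.println(clean(raw));
    }
}
```

After this cleanup the string can be fed to any JSON deserializer without losing the path labels, which plain stripping would destroy.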
What is a good way to produce a crosstab in SQLite via Android/Java?
I have:
[TABLE people]
_id, NAME
1, "mary"
2, "juan"
3, "jose"
[TABLE GLASSES]
_id, COLOR
1, "BLACK"
2, "BLUE"
3, "GRAY"
4, "YELLOW"
...
[TABLE PEOPLE_GLASSES]
_id, idpeople, idglass, qty
1, 1, 1, 50
2, 1, 3, 30
3, 1, 4, 25
...
I need:
[crosstab]
NAME | BLACK | GRAY | YELLOW
"mary" | 50 | 30 | 25
...
how to do this?
SQLite does not have any built-in function to convert rows into columns.
You have to read the GLASSES table first and then, based on that data, dynamically build another query like this:
SELECT NAME,
(SELECT qty
FROM PEOPLE_GLASSES
WHERE idpeople = people._id
AND idglass = 1
) AS "BLACK",
(SELECT qty
FROM PEOPLE_GLASSES
WHERE idpeople = people._id
AND idglass = 2
) AS "BLUE",
(SELECT qty
FROM PEOPLE_GLASSES
WHERE idpeople = people._id
AND idglass = 3
) AS "GRAY",
(SELECT qty
FROM PEOPLE_GLASSES
WHERE idpeople = people._id
AND idglass = 4
) AS "YELLOW"
FROM people
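Building that query dynamically can be sketched in plain Java like this (table and column names are taken from the question; the glasses map would come from a prior SELECT on the GLASSES table, and the class name is my own):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CrosstabQueryBuilder {
    // Builds the crosstab SELECT from a map of glass id -> color name,
    // producing one correlated subquery column per glass.
    static String buildCrosstab(Map<Integer, String> glasses) {
        StringBuilder sql = new StringBuilder("SELECT NAME");
        for (Map.Entry<Integer, String> g : glasses.entrySet()) {
            sql.append(",\n  (SELECT qty FROM PEOPLE_GLASSES")
               .append(" WHERE idpeople = people._id AND idglass = ")
               .append(g.getKey())
               .append(") AS \"").append(g.getValue()).append("\"");
        }
        sql.append("\nFROM people");
        return sql.toString();
    }

    public static void main(String[] args) {
        Map<Integer, String> glasses = new LinkedHashMap<>();
        glasses.put(1, "BLACK");
        glasses.put(2, "BLUE");
        System.out.println(buildCrosstab(glasses));
    }
}
```

On Android the resulting string would then be passed to SQLiteDatabase.rawQuery. Note that the color names come from your own table here; if they could ever contain quotes you would need to escape them.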
Well, I have the following question: to perform the join between the tables I set an alias, but in the projection I need to use a DECODE, which is plain SQL, so Criteria does not recognize the alias I used.
How do I return the grouping column under the name I define for it? I'm using sqlGroupProjection; feel free to suggest another way.
Criteria criteria = dao.getSessao().createCriteria(Chamado.class, "c");
criteria.createAlias("c.tramites", "t").setFetchMode("t", FetchMode.JOIN);
// projection list (declaration was elided in the original snippet)
ProjectionList projetos = Projections.projectionList();
projetos.add(Projections.rowCount(), "qtd");
criteria.add(Restrictions.between("t.dataAbertura", Formata.getDataD(dataInicio, "dd/MM/yyyy"), Formata.getDataD(dataFim, "dd/MM/yyyy")));
projetos.add(Projections.sqlGroupProjection("decode(t.cod_estado, 0, 0, 1, 1, 2, 1, 3, 2, 4, 1, 5, 3) as COD_ESTADO",
        "decode(t.cod_estado, 0, 0, 1, 1, 2, 1, 3, 2, 4, 1, 5, 3)",
        new String[]{"COD_ESTADO"},
        new Type[]{Hibernate.INTEGER}));
criteria.setProjection(projetos);
List<Relatorio> relatorios = criteria.setResultTransformer(Transformers.aliasToBean(Relatorio.class)).list();
SQL generated by criteria:
select count(*) as y0_,
decode(t.cod_estado, 0, 0, 1, 1, 2, 1, 3, 2, 4, 1, 5, 3) as COD_ESTADO
from CHAMADOS this_
inner join TRAMITES t1_ on this_.COD_CHAMADO = t1_.COD_CHAMADO
where t1_.DT_ABERTURA between ? and ?
group by decode(t.cod_estado, 0, 0, 1, 1, 2, 1, 3, 2, 4, 1, 5, 3)
I have a typical web application in which I am trying to generate facets from a MongoDB collection. This is currently done with the aggregation framework via the Java driver (v2.10.1). The facets are generated correctly, except for documents containing sub-arrays. For instance, I have the following JSON documents:
{name: polo, fueltypes:[benzin, lpg], color: black}
{name: golf, fueltypes:[benzin, cng], color: blue}
{name: a4, fueltypes:[diesel], color: blue}
The returned result set is:
name:
{_id: polo, count: 1}
{_id: golf, count: 1}
{_id: a4, count: 1}
color:
{_id: black, count: 1}
{_id: blue, count: 2}
fueltypes:
{_id: [benzin,lpg,cng,diesel], count: 3}
The aggregated result for the fueltypes field treats the whole array as a single value.
However, the desired result should be:
fueltypes:
{_id: benzin, count: 2}
{_id: lpg, count: 1}
{_id: diesel, count: 1}
{_id: cng, count: 1}
and the corresponding java code:
String str = "name"; // or "fueltypes", "color"
// create the $match stage (matches everything here)
BasicDBObject match = new BasicDBObject();
match.put("$match", new BasicDBObject());
// build the $project stage: keep only the faceted field
DBObject fields = new BasicDBObject();
fields.put(str, 1);
DBObject project = new BasicDBObject();
project.put("$project", fields);
// build the $group stage: group by the field and count occurrences
DBObject groupFields = new BasicDBObject();
groupFields.put("_id", "$" + str);
groupFields.put("count", new BasicDBObject("$sum", 1));
DBObject group = new BasicDBObject("$group", groupFields);
AggregationOutput output = serviceCollection.aggregate(match, project, group);
Grouping by the array "fueltypes" counts occurrences of the array as a whole.
To count its elements individually, you have to use the $unwind operator, like so:
// create unwind
BasicDBObject unwind = new BasicDBObject();
unwind.put("$unwind", "$" + str);
and include this stage before the $group operator. Alternatively, you could apply the $unwind only when str is "fueltypes".
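For reference, the resulting pipeline (shown in shell syntax, a sketch of the same stages the Java code builds) looks like:

```json
[
  { "$match": {} },
  { "$project": { "fueltypes": 1 } },
  { "$unwind": "$fueltypes" },
  { "$group": { "_id": "$fueltypes", "count": { "$sum": 1 } } }
]
```

With $unwind in place, each array element becomes its own document before grouping, which yields the per-element counts shown in the desired result.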
For more information about unwind, see http://docs.mongodb.org/manual/reference/aggregation/