I'm trying to find duplicate entries in the values of a map. The thing is that the values are lists of records with multiple attributes/properties. Basically, if a title shows up more than once in the database, I want to mark one entry as unique and mark the rest as duplicates.
Here's my current code:
// I have a Map that looks like...
host1 : id | title | host1 | url | state | duplicate
        id | title | host1 | url | state | duplicate
host2 : id | title | host2 | url | state | duplicate
        id | title | host2 | url | state | duplicate
for (Map.Entry<String, List<Record>> e : recordsByHost.entrySet()) {
    boolean executed = false;
    for (Record r : e.getValue()) {
        int frequency = Collections.frequency(
                e.getValue()
                 .stream()
                 .map(Record::getTitle)
                 .collect(Collectors.toList()),
                r.getTitle());
        if ((frequency > 1) && (!executed)) {
            markDuplicates(r.getId(), r.getTitle());
            executed = true;
        } else {
            executed = false;
        }
    }
}
The issue is that when the frequency is more than 2 (three records with the same title), the `(frequency > 1) && (!executed)` check evaluates to false and the third record (the second duplicate) is treated as "unique".
I've been trying to rework my logic but I'm afraid I'm stuck. Any help / suggestions to get me unstuck would be greatly appreciated.
Set.add (and in fact, Collection.add) returns true if and only if the value was actually added to the Set. Since a Set always enforces uniqueness, you can use this to find duplicates:
void markDuplicates(Iterable<? extends Record> records) {
    Set<String> foundTitles = new HashSet<>();
    for (Record r : records) {
        String title = r.getTitle();
        if (title != null && !foundTitles.add(title)) {
            // title was not added, because it's already been found.
            markAsDuplicate(r);
        }
    }
}
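For the per-host map in the question, a minimal sketch of how this could be wired up (assuming the same recordsByHost map and Record type as above):

// Sketch: run the Set-based duplicate check once per host
for (Map.Entry<String, List<Record>> e : recordsByHost.entrySet()) {
    markDuplicates(e.getValue()); // flags every repeat of a title after its first occurrence
}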
I have a Hive table and it is partitioned on multiple columns.
I need to fetch the partition list with a partial partition name.
Ex: table foo_table is partitioned on
PARTITIONED BY (
  `dt` string COMMENT 'Custom Partition.',
  `h` string COMMENT 'Custom Partition.',
  `b` string COMMENT 'Custom Partition.',
  `sv` string COMMENT 'Custom Partition',
  `p` string COMMENT 'Custom Partition',
  `dc` string COMMENT 'Custom Partition')
Now I need to fetch all the partitions, let's say where dt=somevalue.
The code below works if I give the values of all the partition columns.
List<String> list = ...
list.add("dt=2021-02-01/h=19/b=30/sv=1/p=03/dc=aa")
List<Partition> partitions = HiveMetaStoreClient.getPartitionsByNames(database, tableName, list)
But if I want to fetch partitions by giving only dt=2021-02-01/h=19, this doesn't work:
List<String> list = ...
list.add("dt=2021-02-01/h=19")
//OR
list.add("dt=2021-02-01/h=19/")
//OR
list.add("dt=2021-02-01/h=19/*")
//OR
list.add("dt=2021-02-01/h=19/b=*/sv=*/p=*/dc=*")
List<Partition> partitions = HiveMetaStoreClient.getPartitionsByNames(database, tableName, list)
How to achieve this?
You can use the listPartitions method available in HiveMetaStoreClient:
public List<Partition> listPartitions(String db_name,
                                      String tbl_name,
                                      List<String> part_vals,
                                      short max_parts)
                               throws NoSuchObjectException,
                                      MetaException,
                                      org.apache.thrift.TException
You can query for partitions by passing partition values in the same order in which the partition columns were added to the Hive table.
Example: table ProductByTypeHeight partitioned by [type String, Height int]
Drawback: you can only query one set of partition values per call.
List<String> pvals1 = new ArrayList<>();
pvals1.add("fruit");

List<String> pvals2 = new ArrayList<>();
pvals2.add("vegetable");
pvals2.add("450");

List<List<String>> vals = new ArrayList<>();
vals.add(pvals1);
vals.add(pvals2);

for (List<String> pval : vals) {
    List<Partition> pNameList = hiveMetaStoreConnector.listPartitions("productdb", "productbytypeheight", pval);
    pNameList.forEach(partition -> {
        List<String> partitionValues = partition.getValues();
        for (int i = 0; i < partitionValues.size(); i++) {
            System.out.print(partitionValues.get(i) + " ");
        }
        System.out.println();
    });
}
Output:
fruit 600
fruit 450
fruit 400
vegetable 450
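For the table in the question, a direct call passing only the leading partition values might look like the following. This is a sketch under the assumption that client is an already-connected HiveMetaStoreClient and that database and tableName refer to foo_table; with listPartitions, a partial list of values acts as a prefix over the declared partition columns (dt, h, b, sv, p, dc).

// Hypothetical: client is an initialized HiveMetaStoreClient
List<String> partVals = new ArrayList<>();
partVals.add("2021-02-01"); // dt
partVals.add("19");         // h
// b, sv, p and dc are omitted, so all partitions matching dt and h are returned
List<Partition> partitions = client.listPartitions(database, tableName, partVals, (short) -1);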
ArrayList testdata_1 = new ArrayList<>();
for (int i = 0; i < testdata_1.size(); i++) {
System.out.println(testdata_1.get(i));
}
My output is
[]
[[Username,Password], [user_1, Test#100]]
I want to delete the first blank value, iterate the ArrayList, identify the key "Username", fetch the value "user_1", and assign it to a local String variable username.
Datatable
| Username | user_1 |
| Password | Test#100 |
List<List<String>> data = new ArrayList<>();
if (step.getRows() != null)
    step.getRows().forEach(row -> data.add(row.getCells()));
Resolved. I used data encapsulation to resolve this.
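For completeness, a minimal sketch of one way to do the lookup described above (this is an illustration, not the poster's actual resolution, and it assumes each row of data ends up as a key/value pair, key in the first cell and value in the second, as in the datatable):

String username = null;
for (List<String> row : data) {
    if (row == null || row.isEmpty()) {
        continue; // skip the blank entry
    }
    if ("Username".equals(row.get(0)) && row.size() > 1) {
        username = row.get(1); // "user_1"
    }
}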
I made a table view that looks like the following:
For doing this I did the following:
1.- Create an observable list of the POJO that represents the table "Modulo" in the MySQL database; with this list I created the columns of the table view with this code:
public ObservableList<Modulo> cargaTablaHabilitadosMasHoras(Connection conex){
    ObservableList<Modulo> listaMod = FXCollections.observableArrayList();
    String sql = "SELECT id_Mod, nombre_Mod, \n" +
                 "       Turno_Mod, capacidad_Mod, id_Us, \n" +
                 "       status_Mod \n" +
                 "FROM Modulo;";
    //Static first column
    listaMod.add(new Modulo("123", "Horario", "Rhut", 10, "123", true));
    try (PreparedStatement stta = conex.prepareStatement(sql);
         ResultSet res = stta.executeQuery()) {
        while (res.next()) {
            if (res.getBoolean("status_Mod")) {
                listaMod.add(new Modulo(res.getString("id_Mod"),
                                        res.getString("nombre_Mod"),
                                        res.getString("Turno_Mod"),
                                        res.getInt("capacidad_Mod"),
                                        res.getString("id_Us"),
                                        res.getBoolean("status_Mod")));
            }
        }
    } catch (SQLException ex) {
        ex.printStackTrace();
    }
    return listaMod;
}
2.- Create a table with the custom data with this code:
public void otraTabla(Connection conex){
    // loads the observable list of the POJO that represents the table Modulo
    columns = modu.cargaTablaHabilitadosMasHoras(conex);
    /*
     * creates an observable list that is going to be the base of the table view,
     * forming a grid of 8 rows x the number of columns obtained from the first
     * list + 1 column that represents the hours
     */
    ObservableList<String> row = FXCollections.observableArrayList();
    row.addAll("1","2","3","4","5","6","7","8");
    // for loop that iterates over the table view columns
    for (Modulo columName : columns) {
        // creates a column object to be integrated and manipulated,
        // named after the entry in the first list
        TableColumn<String, String> col = new TableColumn(columName.getNombre_Mod());
        // verify whether this is the first column, which contains the hours
        if (columName.getNombre_Mod().equals("Horario")) {
            // if it is, create the rows with the hours starting at 6 am
            col.setCellValueFactory(cellData -> {
                // start at 6 am
                LocalTime lol = LocalTime.of(6, 0);
                // get the value of ObservableList<String> row to add it to the LocalTime
                Integer p = Integer.valueOf(cellData.getValue());
                // adds the value to the LocalTime
                lol = lol.plusHours(p);
                // gives a format to the hour
                DateTimeFormatter kk = DateTimeFormatter.ofPattern("hh:mm");
                // returns the new String
                return new SimpleStringProperty(lol.format(kk));
            });
        } else {
            // if it is a dynamically loaded column, get the next date
            // where there is space in the column at that time
            col.setCellValueFactory(cellData -> {
                String regresaFecha = "";
                // connection to the database; it has to be created inside the
                // loop, or else the connection is lost
                try (Connection localConnection = dbConn.conectarBD()) {
                    // get the level of the row, in this case the hour
                    LocalTime lol = LocalTime.of(6, 0);
                    Integer p = Integer.valueOf(cellData.getValue());
                    lol = lol.plusHours(p);
                    // calls the method that calculates the next date with free space in the database table
                    LocalDate fechaApunter = rehab.compararDiaADia(localConnection, Date.valueOf(LocalDate.now()),
                            Time.valueOf(lol), columName.getId_Mod(), columName.getCapacidad_Mod(), 30);
                    // date sent to the row of the table view
                    regresaFecha = fechaApunter.toString();
                } catch (SQLException e) {
                    e.printStackTrace();
                }
                return new SimpleStringProperty(regresaFecha);
            });
        }
        // change the color of the date depending on how far away
        // it is from the day the query to the database is made
        if (!columName.getNombre_Mod().equals("Horario")) {
            col.setCellFactory(coli -> {
                TableCell<String, String> cell = new TableCell<String, String>() {
                    @Override
                    public void updateItem(String item, boolean empty) {
                        super.updateItem(item, empty);
                        if (item != null) {
                            LocalDate lol = LocalDate.parse(item);
                            Text text = new Text(item);
                            if (lol.isAfter(LocalDate.now())) {
                                if (lol.isAfter(LocalDate.now().plusDays(5))) {
                                    text.setStyle(" -fx-fill: #990000;" +
                                                  " -fx-text-alignment:center;");
                                } else {
                                    text.setStyle(" -fx-fill: #cccc00;" +
                                                  " -fx-text-alignment:center;");
                                }
                            }
                            this.setGraphic(text);
                        }
                    }
                };
                return cell;
            });
        }
        // add the column to the table view
        tvDisponivilidad.getColumns().addAll(col);
    }
    // set the observable list as the items (placeholder rows)
    tvDisponivilidad.setItems(row);
}
For loading the data I used this method:
public LocalDate compararDiaADia(Connection conex, Date fecha, Time hora,
                                 String id_Mod, int capacidad, int dias){
    LocalDate contador = fecha.toLocalDate();
    LocalDate disDeHoy = LocalDate.now();
    for (int i = 0; i < dias; i++) {
        contador = fecha.toLocalDate();
        contador = contador.plusDays(i);
        String sttm = "SELECT COUNT(id_Reab) AS Resultado\n" +
                      "FROM Rehabilitacion\n" +
                      "WHERE '"+contador+"' BETWEEN inicio_Reab AND fin_Reab\n" +
                      "AND horario_Reab = '"+hora+"'\n" +
                      "AND id_Modulo = '"+id_Mod+"';";
        try (PreparedStatement stta = conex.prepareStatement(sttm);
             ResultSet res = stta.executeQuery()) {
            if (res.next()) {
                if (res.getInt("Resultado") < capacidad || res.getInt("Resultado") == 0) {
                    disDeHoy = contador;
                    break;
                } else {
                    disDeHoy = contador;
                }
            }
        } catch (SQLException ex) {
            ex.printStackTrace();
        }
    }
    return disDeHoy;
}
What this method does: for each column, it finds the next day on which the module has fewer bookings than its capacity (each module has a different capacity) at a certain hour, and returns that day. The hour changes on each call, in order to populate all the rows of the table.
There are several problems with my approach. The first is the time it takes to load the table: about one minute to run the queries and populate it. This is a combination of factors, but the main one is that I make a query to the database for every day. An example of this:
Here is my table where I made the queries:
mysql> SELECT * FROM imssRehab.Rehabilitacion;
+---------+-------------+------------+-----------------+---------+-----------+
| id_Reab | inicio_Reab | fin_Reab | horario_Reab | id_Prog | id_Modulo |
+---------+-------------+------------+-----------------+---------+-----------+
| 1 | 2016-06-01 | 2016-06-10 | 07:00:00.000000 | 1 | 215A3 |
| 2 | 2016-06-01 | 2016-06-10 | 07:00:00.000000 | 1 | 215A3 |
| 3 | 2016-06-01 | 2016-06-10 | 07:00:00.000000 | 1 | 215A3 |
| 4 | 2016-06-01 | 2016-06-10 | 07:00:00.000000 | 1 | 215A3 |
| 5 | 2016-06-01 | 2016-06-10 | 07:00:00.000000 | 1 | 215A3 |
| 6 | 2016-06-01 | 2016-06-10 | 07:00:00.000000 | 1 | 215A3 |
+---------+-------------+------------+-----------------+---------+-----------+
here is my query:
SELECT COUNT(id_Reab) AS Resultado
FROM Rehabilitacion
WHERE '2016-06-01' BETWEEN inicio_Reab AND fin_Reab
AND horario_Reab = '07:00'
AND id_Modulo = '215A3';
The result is 6. In this module my capacity is 5, so I have to advance a day and ask again until I find a day with fewer than 5; in this example that is 2016-06-11. To get there I have to make 10 queries and open 10 connections. I use a connection pool and it's very efficient, but it gets overwhelmed: those 10 queries are only for the first row of the first column, and normally there are between 15 and 20 columns. Even assuming only one query per row (8 rows x 15-20 columns), that is still around 120-160 connections.
I try to reuse a connection every time I can. My first instinct was to use the connection that gets passed to the method that loads the ObservableList of modules, but when I do this, the method that queries the dates receives the connection already closed, without any apparent reason. After many tests I concluded that it has something to do with the lambda passed to setCellValueFactory, and that if I want a connection there it has to be opened inside it, creating even more connections. I tried to alleviate this by loading the table in a different thread with a Task, but the results were similar.
A solution would be to make a POJO especially for the table, but I don't think it's possible to create a class dynamically. I could have a POJO with 20 possible columns and only load the columns I actually use, but what happens when there are more than 20 columns or the names of the modules change?
So my question is this: how do I make the creation of the table faster? And is there a better way to build this table? I don't like my solution; the code is more complex than I would like, and I'm hoping for a better and cleaner way.
I'm currently stuck on my project, creating a Fuseki triple store browser. I need to visualize all the data from a triple store and make the app browsable. The only problem is that the QuerySolution leaves out the "< >" that are in the triple store.
If I use the ResultSetFormatter.asText(ResultSet) it returns this:
-------------------------------------------------------------------------------------------------------------------------------------
| subject | predicate | object |
=====================================================================================================================================
| <urn:animals:data> | <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> | <http://www.w3.org/1999/02/22-rdf-syntax-ns#Seq> |
| <urn:animals:data> | <http://www.w3.org/1999/02/22-rdf-syntax-ns#_1> | <urn:animals:lion> |
| <urn:animals:data> | <http://www.w3.org/1999/02/22-rdf-syntax-ns#_2> | <urn:animals:tarantula> |
| <urn:animals:data> | <http://www.w3.org/1999/02/22-rdf-syntax-ns#_3> | <urn:animals:hippopotamus> |
-------------------------------------------------------------------------------------------------------------------------------------
Notice that some of the data contains the less-than/greater-than signs "<" and ">". As soon as I try to parse the data from the ResultSet, it removes those signs, so that the data looks like this:
-------------------------------------------------------------------------------------------------------------------------------
| subject | predicate | object |
===============================================================================================================================
| urn:animals:data | http://www.w3.org/1999/02/22-rdf-syntax-ns#type | http://www.w3.org/1999/02/22-rdf-syntax-ns#Seq |
| urn:animals:data | http://www.w3.org/1999/02/22-rdf-syntax-ns#_1 | urn:animals:lion |
| urn:animals:data | http://www.w3.org/1999/02/22-rdf-syntax-ns#_2 | urn:animals:tarantula |
| urn:animals:data | http://www.w3.org/1999/02/22-rdf-syntax-ns#_3 | urn:animals:hippopotamus |
As you can see, the data doesn't contain the "<" and ">" signs.
This is how I parse the data from the ResultSet:
while (rs.hasNext()) {
    // Moves onto the next result
    QuerySolution sol = rs.next();
    // Return the value of the named variable in this binding.
    // A return of null indicates that the variable is not present in
    // this solution
    RDFNode object = sol.get("object");
    RDFNode predicate = sol.get("predicate");
    RDFNode subject = sol.get("subject");
    // Fill the table with the data
    DefaultTableModel modelTable = (DefaultTableModel) this.getModel();
    modelTable.addRow(new Object[] { subject, predicate, object });
}
It's quite hard to explain this problem, but is there a way to keep the "< >" signs after parsing the data?
The '<>' are used by the formatter to indicate that the value is a URI rather than a string: so "http://example.com/" is a literal text value, whereas <http://example.com/> is a URI.
You can do the same yourself:
RDFNode node; // subject, predicate, or object
if (node.isURIResource()) {
    return "<" + node.asResource().getURI() + ">";
} else {
    ...
}
But it's much easier to use FmtUtils:
String nodeAsString = FmtUtils.stringForRDFNode(subject); // or predicate, or object
What you need to do is get that code invoked when the table cell is rendered: currently the table is using Object::toString().
In outline, the steps needed are:
modelTable.setDefaultRenderer(RDFNode.class, new MyRDFNodeRenderer());
Then see http://docs.oracle.com/javase/tutorial/uiswing/components/table.html#renderer about how to create a simple renderer. Note that value will be an RDFNode:
static class MyRDFNodeRenderer extends DefaultTableCellRenderer {
    public MyRDFNodeRenderer() { super(); }

    public void setValue(Object value) {
        setText((value == null) ? "" : FmtUtils.stringForRDFNode((RDFNode) value));
    }
}
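One possible way to wire this into the table from the question (a sketch, not part of the original answer: it assumes the question's class extends JTable with the three RDFNode columns shown, and it overrides getColumnClass so that the class-based default renderer is actually consulted):

// Hypothetical wiring: the model must report RDFNode.class for
// setDefaultRenderer(RDFNode.class, ...) to take effect.
DefaultTableModel modelTable = new DefaultTableModel(
        new Object[] { "subject", "predicate", "object" }, 0) {
    @Override
    public Class<?> getColumnClass(int columnIndex) {
        return RDFNode.class;
    }
};
setModel(modelTable);
setDefaultRenderer(RDFNode.class, new MyRDFNodeRenderer());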
I am writing my DSL's model inferrer, which extends AbstractModelInferrer. Until now I have successfully generated classes for some grammar constructs; however, when I try to generate an interface, the type inferrer does not work and I get the following exception:
0 [Worker-2] ERROR org.eclipse.xtext.builder.BuilderParticipant - Error during compilation of 'platform:/resource/pascani/src/org/example/namespaces/SLA.pascani'.
java.lang.IllegalStateException: equivalent could not be computed
The Model inferrer code is:
def dispatch void infer(Namespace namespace, IJvmDeclaredTypeAcceptor acceptor, boolean isPreIndexingPhase) {
    acceptor.accept(processNamespace(namespace, isPreIndexingPhase))
}

def JvmGenericType processNamespace(Namespace namespace, boolean isPreIndexingPhase) {
    namespace.toInterface(namespace.fullyQualifiedName.toString) [
        if (!isPreIndexingPhase) {
            documentation = namespace.documentation
            for (e : namespace.expressions) {
                switch (e) {
                    Namespace: {
                        members +=
                            e.toMethod("get" + Strings.toFirstUpper(e.name), typeRef(e.fullyQualifiedName.toString)) [
                                abstract = true
                            ]
                        members += processNamespace(e, isPreIndexingPhase);
                    }
                    XVariableDeclaration: {
                        members += processNamespaceVarDecl(e)
                    }
                }
            }
        }
    ]
}

def processNamespaceVarDecl(XVariableDeclaration decl) {
    val EList<JvmMember> members = new BasicEList();
    val field = decl.toField(decl.name, inferredType(decl.right))[initializer = decl.right]
    // members += field
    members += decl.toMethod("get" + Strings.toFirstUpper(decl.name), field.type) [
        abstract = true
    ]
    if (decl.isWriteable) {
        members += decl.toMethod("set" + Strings.toFirstUpper(decl.name), typeRef(Void.TYPE)) [
            parameters += decl.toParameter(decl.name, field.type)
            abstract = true
        ]
    }
    return members
}
I have tried using the lazy initializer after the acceptor.accept method, but it still does not work.
When I uncomment the line members += field, which adds a field to an interface, the model inferrer works fine; however, as you know, interfaces cannot have fields.
This seems like a bug to me. I have read tons of posts in the Eclipse forum but nothing seems to solve my problem. In case it is needed, this is my grammar:
grammar org.pascani.Pascani with org.eclipse.xtext.xbase.Xbase
import "http://www.eclipse.org/xtext/common/JavaVMTypes" as types
import "http://www.eclipse.org/xtext/xbase/Xbase"
generate pascani "http://www.pascani.org/Pascani"
Model
: ('package' name = QualifiedName ->';'?)?
imports = XImportSection?
typeDeclaration = TypeDeclaration?
;
TypeDeclaration
: MonitorDeclaration
| NamespaceDeclaration
;
MonitorDeclaration returns Monitor
: 'monitor' name = ValidID
('using' usings += [Namespace | ValidID] (',' usings += [Namespace | ValidID])*)?
body = '{' expressions += InternalMonitorDeclaration* '}'
;
NamespaceDeclaration returns Namespace
: 'namespace' name = ValidID body = '{' expressions += InternalNamespaceDeclaration* '}'
;
InternalMonitorDeclaration returns XExpression
: XVariableDeclaration
| EventDeclaration
| HandlerDeclaration
;
InternalNamespaceDeclaration returns XExpression
: XVariableDeclaration
| NamespaceDeclaration
;
HandlerDeclaration
: 'handler' name = ValidID '(' param = FullJvmFormalParameter ')' body = XBlockExpression
;
EventDeclaration returns Event
: 'event' name = ValidID 'raised' (periodically ?= 'periodically')? 'on'? emitter = EventEmitter ->';'?
;
EventEmitter
: eventType = EventType 'of' emitter = QualifiedName (=> specifier = RelationalEventSpecifier)? ('using' probe = ValidID)?
| cronExpression = CronExpression
;
enum EventType
: invoke
| return
| change
| exception
;
RelationalEventSpecifier returns EventSpecifier
: EventSpecifier ({RelationalEventSpecifier.left = current} operator = RelationalOperator right = EventSpecifier)*
;
enum RelationalOperator
: and
| or
;
EventSpecifier
: (below ?= 'below' | above ?= 'above' | equal ?= 'equal' 'to') value = EventSpecifierValue
| '(' RelationalEventSpecifier ')'
;
EventSpecifierValue
: value = Number (percentage ?= '%')?
| variable = QualifiedName
;
CronExpression
: seconds = CronElement // 0-59
minutes = CronElement // 0-59
hours = CronElement // 0-23
days = CronElement // 1-31
months = CronElement // 1-2 or Jan-Dec
daysOfWeek = CronElement // 0-6 or Sun-Sat
| constant = CronConstant
;
enum CronConstant
: reboot // Run at startup
| yearly // 0 0 0 1 1 *
| annually // Equal to #yearly
| monthly // 0 0 0 1 * *
| weekly // 0 0 0 * * 0
| daily // 0 0 0 * * *
| hourly // 0 0 * * * *
| minutely // 0 * * * * *
| secondly // * * * * * *
;
CronElement
: RangeCronElement | PeriodicCronElement
;
RangeCronElement hidden()
: TerminalCronElement ({RangeCronElement.start = current} '-' end = TerminalCronElement)?
;
TerminalCronElement
: expression = (IntLiteral | ValidID | '*' | '?')
;
PeriodicCronElement hidden()
: expression = TerminalCronElement '/' elements = RangeCronList
;
RangeCronList hidden()
: elements += RangeCronElement (',' elements +=RangeCronElement)*
;
IntLiteral
: INT
;
UPDATE
The use of a field was just a way to continue working on other things until I found a solution. The actual code is:
def processNamespaceVarDecl(XVariableDeclaration decl) {
    val EList<JvmMember> members = new BasicEList();
    val type = if (decl.right != null) inferredType(decl.right) else decl.type
    members += decl.toMethod("get" + Strings.toFirstUpper(decl.name), type) [
        abstract = true
    ]
    if (decl.isWriteable) {
        members += decl.toMethod("set" + Strings.toFirstUpper(decl.name), typeRef(Void.TYPE)) [
            parameters += decl.toParameter(decl.name, type)
            abstract = true
        ]
    }
    return members
}
From the answer in the Eclipse forum:
I don't know if what you are doing is a good idea. The inferrer maps
your concepts to Java concepts, and this enables the scoping for the
expressions. If you do not have a place for your expressions then it
won't work; their types will never be computed.
Thus I think you have a use case which is not possible using Xbase
without customizations. Your semantics is not quite clear to me.
Christian Dietrich
My answer:
Thanks Christian, I thought I was doing something wrong. If it is not a common use case, then there is no problem; I will make sure the user explicitly defines a variable type.
Just to clarify a little bit, a Namespace is intended to define variables that are used in Monitors. That's why a Namespace becomes an interface, and a Monitor becomes a class.
Read the Eclipse forum thread