OrientDB Java Batch Import


I have a problem with batch import in OrientDB when I use Java.
My data is a collection of record IDs and tokens. Each ID has a set of tokens, and a token can belong to several IDs.
Example:
ID Tokens
1 2,3,4
2 3,5,7
3 1,2,4
My graph should have two types of vertices: rIDClass and tokenClass. I want to give each vertex an ID corresponding to the record ID or the token. So the total number of tokenClass vertices should equal the number of unique tokens in the data. (Each token is created only once!)
How can I implement this? I tried the "Custom Batch Insert" from the original documentation, and I tried the "Batch Implementation" method described in the Blueprints documentation.
The problem with the first method is that OrientDB creates a separate vertex for each inserted token, with an ID that is assigned by the system itself.
The problem with the second method is that when I try to add a vertex to the BatchGraph I can't set the corresponding vertex class, and I also get an exception. This is my code for the second method:
BatchGraph<OrientGraph> bgraph = new BatchGraph<OrientGraph>(graph, VertexIDType.STRING, 1);
Vertex vertex1 = bgraph.addVertex(1);
vertex1.setProperty("uid", 1);
Vertex vertex2 = bgraph.addVertex(2);
vertex2.setProperty("uid", 2);
Edge edge1 = graph.addEdge(12, vertex1 , vertex2, "EdgeConnectClass");
And I get the following exception:
Exception in thread "main" java.lang.ClassCastException: com.tinkerpop.blueprints.util.wrappers.batch.BatchGraph$BatchVertex cannot be cast to com.tinkerpop.blueprints.impls.orient.OrientVertex
at com.tinkerpop.blueprints.impls.orient.OrientBaseGraph.addEdge(OrientBaseGraph.java:612)
at App.indexRecords3(App.java:83)
at App.main(App.java:47)
Maybe someone has a solution.

I'm not sure I understood correctly, but if you want a schema in which rIDClass vertices are connected to tokenClass vertices by EdgeConnectClass edges, try this:
Vertex vertex1 = g.addVertex("class:rIDClass");
vertex1.setProperty("uid", 1);
Vertex token2 = g.addVertex("class:tokenClass");
token2.setProperty("uid", 2);
Edge edge1 = g.addEdge(null, vertex1, token2, "EdgeConnectClass");
Hope it helps
Regards
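To make sure each token becomes exactly one tokenClass vertex, one common approach is to cache the vertex created for each token and reuse it on later occurrences. Below is a minimal sketch in plain Java, not a definitive implementation: the `TokenGraphSketch` class and its string "vertices" are hypothetical stand-ins so that the example runs without OrientDB. With Blueprints, the cached value would instead be the `Vertex` returned by `addVertex`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the graph: vertices are plain strings and
// edges are "record->token" strings, so the sketch runs without OrientDB.
class TokenGraphSketch {
    final Map<Integer, String> recordVertices = new LinkedHashMap<>();
    final Map<Integer, String> tokenVertices = new LinkedHashMap<>();
    final List<String> edges = new ArrayList<>();

    // Adds one record and links it to its tokens,
    // creating each token vertex only on its first occurrence.
    void addRecord(int recordId, List<Integer> tokens) {
        String record = recordVertices.computeIfAbsent(recordId, id -> "rIDClass:" + id);
        for (int token : tokens) {
            // computeIfAbsent creates the vertex once and reuses it afterwards.
            String tokenVertex = tokenVertices.computeIfAbsent(token, t -> "tokenClass:" + t);
            edges.add(record + "->" + tokenVertex);
        }
    }

    public static void main(String[] args) {
        // The example data from the question.
        TokenGraphSketch g = new TokenGraphSketch();
        g.addRecord(1, List.of(2, 3, 4));
        g.addRecord(2, List.of(3, 5, 7));
        g.addRecord(3, List.of(1, 2, 4));
        System.out.println(g.tokenVertices.size() + " token vertices, " + g.edges.size() + " edges");
    }
}
```

The map keeps memory proportional to the number of unique tokens, which is usually acceptable for a one-off import.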

Related

JAVA - MongoDB Select a value randomly according to the value of a field in my Data Base

I just started coding in Java and I am struggling with this step of a class assignment:
"The application queries the database on Atlas for two more foods (food2,
food3). These foods must meet the following conditions:
a. The composition of food2 and food3 is measured in the same unit as
food1. There are two distinct units: gram (g) and milliliter (ml).
b. The value of the selected property must be non-zero and must differ at
least 50% from the value of food1"
Here is one of the documents in my database:
{"_id":{"$oid":"624d84c9135dfc3bf0fb72c7"},"Name":"Branntwein aus Wein (z.B.Cognac, Brandy)","Kategorie":"Alkoholhaltige Getränke/Spirituosen","Bezugseinheit":"pro 100 ml","Kalorien":"228","Fett":"0","Fettsäuren, gesättigt":"0","Fettsäuren, einfach ungesättigt":"0","Fettsäuren, mehrfach ungesättigt":"0","Cholesterin":"0","Kohlenhydrate":"1.9","Zucker":"1.9","Stärke":"0","Nahrungsfasern":"0","Protein":"0","Salz":"0","Alkohol":"31.4","Wasser":"61.6","Betacarotin":"0","Vitamin B1":"0","Vitamin B2":"0","Vitamin B6":"0","Vitamin B12":"0","Niacin":"0","Folat":"0","Pantothensäure":"0","Vitamin C":"0","Vitamin D":"0","Vitamin E":"0","Kalium":"1.9","Natrium":"1.9","Chlorid":"2.9","Calcium":"0","Magnesium":"1","Phosphor":"0","Eisen":"0","Jod":"0","Zink":"0"}
To give more context, I have chosen to work with the field "Eisen". I succeeded in picking one value at random (my first food), but now I need to pick two other values (food2 and food3) at random, subject to the constraints in the step quoted above.
Unfortunately, despite all the answers I found online, I have not managed to implement this. Here is my code to find one value at random:
AggregateIterable Rando = collection.aggregate(Arrays.asList(
    new Document("$match", new Document("Eisen", new Document("$gt", "0"))),
    new Document("$sample", new Document("size", 1L)),
    new Document("$unwind", new Document("path", "$Eisen"))));
ArrayList randoList = Rando.into(new ArrayList());
System.out.println("Random" + randoList);
Thank you very much for your help.
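The two conditions from the assignment can also be checked in Java once candidate documents are loaded. The following is a minimal sketch under stated assumptions: "differ at least 50%" is read as an absolute difference of at least half of food1's value, the unit is taken to be the last word of `Bezugseinheit` (as in "pro 100 ml"), and `FoodPicker` and `Food` are hypothetical names. Values are parsed from strings because the sample document stores its numbers as strings.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class FoodPicker {
    // Hypothetical holder for the fields the two conditions need.
    static final class Food {
        final String name;
        final String unit;   // "g" or "ml"
        final double value;  // value of the selected property, e.g. Eisen

        Food(String name, String bezugseinheit, String rawValue) {
            this.name = name;
            // Assumption: the unit is the last word of Bezugseinheit, e.g. "pro 100 ml".
            String[] parts = bezugseinheit.trim().split("\\s+");
            this.unit = parts[parts.length - 1];
            this.value = Double.parseDouble(rawValue);
        }
    }

    // True if the candidate meets conditions (a) and (b) relative to food1.
    static boolean qualifies(Food food1, Food candidate) {
        boolean sameUnit = candidate.unit.equals(food1.unit);
        boolean nonZero = candidate.value != 0.0;
        // Assumption: "differ at least 50%" means |candidate - food1| >= 0.5 * food1.
        boolean differs = Math.abs(candidate.value - food1.value) >= 0.5 * food1.value;
        return sameUnit && nonZero && differs;
    }

    // Picks up to `count` qualifying foods at random (fewer if not enough qualify).
    static List<Food> pick(Food food1, List<Food> all, int count, Random random) {
        List<Food> candidates = new ArrayList<>();
        for (Food f : all) {
            if (f != food1 && qualifies(food1, f)) {
                candidates.add(f);
            }
        }
        Collections.shuffle(candidates, random);
        return candidates.subList(0, Math.min(count, candidates.size()));
    }
}
```

Calling `FoodPicker.pick(food1, foods, 2, new Random())` would then return food2 and food3. Filtering in Java keeps the numeric comparison away from the string-typed fields in the database, at the cost of loading the candidates first.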

JGraphT: Inconsistent graph size for same dataset

Version: 1.3.0
Graph Requirement: Child will have many parents or none.
Data Node: Id + List [Parent ids]
class Node {
String id;
List<String> parents;
}
Total dataset: 3500 nodes.
GraphType is selected using : Directed + No Multiple Edges + No Self Loops + No Weights + DefaultEdge
Graph Building logic:
Iterate through the 3500 nodes. For each node:
    Create the vertex using the node ID.
    Graph.addVertex(childVertex)
    Then check if parents exist.
    If they do, iterate through the parents. For each parent:
        Create the parent vertex using the parent ID.
        Graph.addVertex(parentVertex)
        Graph.addEdge(parentVertex, childVertex)
However, when I run the same dataset (3500 nodes) five times, I get a different graph size each time from graph.vertexSet().size(). I expect 3500 every time, but the result is inconsistent.
All 3500 IDs are unique, so the graph size should be 3500, but we got results like this:
The graph type is SimpleDirectedGraph and the size varies: 3500, 3208, 3283, 2856, 3284.
Any help would be appreciated.
Thanks
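For reference, the building logic described above can be sketched with plain collections standing in for the JGraphT graph. This is an assumption made so that the example is self-contained: `GraphBuildSketch` is a hypothetical class and `Node` mirrors the class in the question. Built sequentially like this, the vertex count is deterministic: it equals the number of distinct IDs that appear as a node or as a parent.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class GraphBuildSketch {
    // Mirrors the Node class from the question.
    static final class Node {
        final String id;
        final List<String> parents;

        Node(String id, List<String> parents) {
            this.id = id;
            this.parents = parents;
        }
    }

    // Stand-ins for the graph: a vertex set and parent -> children edges.
    final Set<String> vertices = new LinkedHashSet<>();
    final Map<String, Set<String>> childrenOf = new LinkedHashMap<>();

    void build(List<Node> nodes) {
        for (Node node : nodes) {
            vertices.add(node.id);                 // corresponds to addVertex(childVertex)
            if (node.parents == null) {
                continue;                          // a child may have no parents
            }
            for (String parent : node.parents) {
                if (parent.equals(node.id)) {
                    continue;                      // self-loops are skipped in this sketch
                }
                vertices.add(parent);              // corresponds to addVertex(parentVertex)
                // A set per parent means a repeated parent->child edge is stored once.
                childrenOf.computeIfAbsent(parent, p -> new LinkedHashSet<>()).add(node.id);
            }
        }
    }
}
```

If the sequential version is stable while the real code is not, one thing worth checking (a guess, since the question does not show the loop) is whether the graph is being filled from several threads, for example from a parallel stream, without synchronization.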

how to use mongodb java-driver Projections.slice

I am trying to use Aggregates.project to slice the array in my documents.
My documents look like this:
{
"date":"",
"stype_0":[1,2,3,4]
}
My code in Java is:
Aggregates.project(Projections.fields(
    Projections.slice("stype_0", pst-1, pen-pst),
    Projections.slice("stype_1", pst-1, pen-pst),
    Projections.slice("stype_2", pst-1, pen-pst),
    Projections.slice("stype_3", pst-1, pen-pst)))
Finally, I get this error:
First argument to $slice must be an array, but is of type: int
I guess this is because the first element in stype_0 is an int, but I really do not know why. Thanks a lot!

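There are two versions of $slice: $slice (aggregation) and $slice (projection). You are using the wrong one. Projections.slice builds the projection form, {$slice: [skip, limit]}, so inside a $project stage the server reads the first number as the array argument, which produces the error.
The driver has no built-in helper for the aggregation $slice. Below is an example for one such projection. Do the same for all the other projection fields.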
List stype_0 = Arrays.asList("$stype_0", 1, 1);
Bson project = Aggregates.project(Projections.fields(new Document("stype_0", new Document("$slice", stype_0))));
AggregateIterable<Document> iterable = dbCollection.aggregate(Arrays.asList(project));
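Applied to the four fields and the offsets from the question, the same idea could look like the sketch below. It builds the projection from plain `java.util` maps so that it runs without the driver. `SliceProjection` and `build` are hypothetical names, and the result is meant to be wrapped in `new Document(...)` before being passed to `Aggregates.project`.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SliceProjection {
    // Builds { field: { $slice: ["$field", pst - 1, pen - pst] } } for each field.
    static Map<String, Object> build(List<String> fields, int pst, int pen) {
        Map<String, Object> projection = new LinkedHashMap<>();
        for (String field : fields) {
            // Aggregation $slice takes [array expression, start position, element count].
            List<Object> sliceArgs = Arrays.asList("$" + field, pst - 1, pen - pst);
            Map<String, Object> slice = new LinkedHashMap<>();
            slice.put("$slice", sliceArgs);
            projection.put(field, slice);
        }
        return projection;
    }

    public static void main(String[] args) {
        List<String> fields = Arrays.asList("stype_0", "stype_1", "stype_2", "stype_3");
        // Example offsets; pst and pen come from the caller in the question.
        System.out.println(build(fields, 2, 4));
    }
}
```

Note that the aggregation form requires the element count (`pen - pst` here) to be positive, so it may be worth validating the offsets before building the stage.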

Composite index of edges & property (tinkerpop / orientDB)

I have a graph in OrientDB (uses Tinkerpop stack), and need to enable very fast lookups of edge values / properties / fields and edge in/out vertices.
So, basically the user will need to lookup as follows:
SELECT FROM myEdges WHERE inVertex = {VertexIdentity} AND outVertex = {VertexIdentity} AND property1 = 'xyz'
Essentially, it is a composite index on the edge class over three properties: inVertex, outVertex and property1.
Basically, if the user already has the VertexIdentity of two vertices (perhaps in the form #CLUSTER_ID:RECORD_ID) and the property value (in this case, xyz), the index would allow a very fast lookup to check whether the combination already exists in the graph (that is, whether the two vertices are linked with property1), without a traversal.
So far I have found the following code to help with composite indexes, but I can't see whether it is possible to include the in/out vertices of a graph edge in such an index.
https://github.com/orientechnologies/orientdb/blob/master/tests/src/test/java/com/orientechnologies/orient/test/database/auto/SQLSelectCompositeIndexDirectSearchTest.java
Is it possible?

This is working fine for defining edge uniqueness:
OCommandSQL declareIn= new OCommandSQL();
declareIn.setText("CREATE PROPERTY E.in LINK");
OCommandSQL declareOut= new OCommandSQL();
declareOut.setText("CREATE PROPERTY E.out LINK");
OCommandSQL createIndexUniqueEdge= new OCommandSQL();
createIndexUniqueEdge.setText("CREATE INDEX unique_edge ON E (in, out) UNIQUE");
graph.command(declareIn).execute();
graph.command(declareOut).execute();
graph.command(createIndexUniqueEdge).execute();
In your case, just add another property to the edge class and include it in the index.

You can do it with OrientDB: just create the composite index so that it also covers the in and out properties (declare them in the E class first).
This is also used as a constraint to prevent multiple edges from connecting the same pair of vertices.
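Extending the approach above to the three-part lookup from the question might look like the following sketch, which only assembles the SQL strings. `EdgeIndexSql` is a hypothetical helper; `myEdges` and `property1` are taken from the question, while the STRING type for `property1` and the index name are assumptions. Each statement would then be executed with `graph.command(new OCommandSQL(...)).execute()`, as in the answer above.

```java
import java.util.Arrays;
import java.util.List;

class EdgeIndexSql {
    // Statements that declare in, out and the extra property,
    // then create one composite index over all three.
    static List<String> schemaStatements(String edgeClass, String property, boolean unique) {
        String indexName = edgeClass + "_in_out_" + property;   // assumed naming scheme
        return Arrays.asList(
            "CREATE PROPERTY " + edgeClass + ".in LINK",
            "CREATE PROPERTY " + edgeClass + ".out LINK",
            "CREATE PROPERTY " + edgeClass + "." + property + " STRING",  // assumed type
            "CREATE INDEX " + indexName + " ON " + edgeClass
                + " (in, out, " + property + ") " + (unique ? "UNIQUE" : "NOTUNIQUE"));
    }

    // Lookup with positional parameters for the two vertex RIDs and the property value.
    static String lookupQuery(String edgeClass, String property) {
        return "SELECT FROM " + edgeClass
            + " WHERE in = ? AND out = ? AND " + property + " = ?";
    }

    public static void main(String[] args) {
        schemaStatements("myEdges", "property1", true).forEach(System.out::println);
        System.out.println(lookupQuery("myEdges", "property1"));
    }
}
```

Passing `unique = false` gives a plain lookup index, for the case where several edges with the same property value between the same two vertices should remain allowed.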

Faceting using SolrJ and Solr4

I've gone through the related questions on this site but haven't found a relevant solution.
When querying my Solr4 index using an HTTP request of the form
&facet=true&facet.field=country
The response contains all the different countries along with counts per country.
How can I get this information using SolrJ?
I have tried the following but it only returns total counts across all countries, not per country:
solrQuery.setFacet(true);
solrQuery.addFacetField("country");
The following does seem to work, but I do not want to have to explicitly set all the groupings beforehand:
solrQuery.addFacetQuery("country:usa");
solrQuery.addFacetQuery("country:canada");
Secondly, I'm not sure how to extract the facet data from the QueryResponse object.
So two questions:
1) Using SolrJ how can I facet on a field and return the groupings without explicitly specifying the groups?
2) Using SolrJ how can I extract the facet data from the QueryResponse object?
Thanks.
Update:
I also tried something similar to Sergey's response (below).
List<FacetField> ffList = resp.getFacetFields();
log.info("size of ffList:" + ffList.size());
for(FacetField ff : ffList){
String ffname = ff.getName();
int ffcount = ff.getValueCount();
log.info("ffname:" + ffname + "|ffcount:" + ffcount);
}
The above code shows ffList with size=1 and the loop goes through 1 iteration. In the output ffname="country" and ffcount is the total number of rows that match the original query.
There is no per-country breakdown here.
I should mention that on the same solrQuery object I am also calling addField and addFilterQuery. Not sure if this impacts faceting:
solrQuery.addField("user-name");
solrQuery.addField("user-bio");
solrQuery.addField("country");
solrQuery.addFilterQuery("user-bio:" + "(Apple OR Google OR Facebook)");
Update 2:
I think I got it, again based on what Sergey said below. I extracted the List object using FacetField.getValues().
List<FacetField> fflist = resp.getFacetFields();
for(FacetField ff : fflist){
String ffname = ff.getName();
int ffcount = ff.getValueCount();
List<Count> counts = ff.getValues();
for(Count c : counts){
String facetLabel = c.getName();
long facetCount = c.getCount();
}
}
In the above code, the facetLabel variable holds the name of each facet group and facetCount is the corresponding count for that group.

Actually, you only need to add the facet field and faceting will be activated (check the SolrJ source code):
solrQuery.addFacetField("country");
Where did you look for the facet information? It should be in QueryResponse.getFacetFields() (then getValues() and getCount()).

In the Solr response, use QueryResponse.getFacetFields() to get the list of FacetField objects, one of which is "country". Since it is the only facet field here, "country" is QueryResponse.getFacetFields().get(0).
Then iterate over it to get the Count objects, using
QueryResponse.getFacetFields().get(0).getValues().get(i)
Get the name of the facet value using QueryResponse.getFacetFields().get(0).getValues().get(i).getName()
and the corresponding count using
QueryResponse.getFacetFields().get(0).getValues().get(i).getCount()
