I have a graph in OrientDB (uses Tinkerpop stack), and need to enable very fast lookups of edge values / properties / fields and edge in/out vertices.
So, basically the user will need to lookup as follows:
SELECT FROM myEdges WHERE inVertex = {VertexIdentity}, outVertex = {VertexIdentity}, property1 = 'xyz'
Essentially it's a composite index for the edge class, of 3 properties: inVertex, outVertex & property1
Basically - if the user already has a VertexIdentity for 2 vertices (maybe, in the form: #CLUSTER_ID:RECORD_ID) - and the the property value (in this case, xyz) - it will allow very fast lookup to see if the combination already exists in the graph (if 2 vertices are linked with property1) - without making a traversal.
So far I found the following code to help with composite indexes, but I cant see if it's possible to include in/out vertices in this (for a graph edge).
https://github.com/orientechnologies/orientdb/blob/master/tests/src/test/java/com/orientechnologies/orient/test/database/auto/SQLSelectCompositeIndexDirectSearchTest.java
Is it possible??
This is working fine for defining edge uniqueness:
OCommandSQL declareIn= new OCommandSQL();
declareIn.setText("CREATE PROPERTY E.in LINK");
OCommandSQL declareOut= new OCommandSQL();
declareOut.setText("CREATE PROPERTY E.out LINK");
OCommandSQL createIndexUniqueEdge= new OCommandSQL();
createIndexUniqueEdge.setText("CREATE INDEX unique_edge ON E (in, out) UNIQUE");
graph.command(declareIn).execute();
graph.command(declareOut).execute();
graph.command(createIndexUniqueEdge).execute();
In you case just add another property to the Edge class and consequently in the index
You can do it with OrientDB, just create the composite index against the in and out properties too (declare them in E class before).
This is used also as constraints to avoid multiple edges connect the same vertices.
Related
I just started at coding in Java and I have a class assignment I struggle with because of this step:
"The application queries the database on Atlas for two more foods (food2,
food3). These foods must meet the following conditions:
a. The composition of food2 and food3 is measured in the same unit as
food1. There are two distinct units2: gram (g) and milliliter (ml).
b. The value of the selected property must be non-zero and must differ at
least 50% from the value of food1"
Here is one of my data in my data base :
{"_id":{"$oid":"624d84c9135dfc3bf0fb72c7"},"Name":"Branntwein aus Wein (z.B.Cognac, Brandy)","Kategorie":"Alkoholhaltige Getränke/Spirituosen","Bezugseinheit":"pro 100 ml","Kalorien":"228","Fett":"0","Fettsäuren, gesättigt":"0","Fettsäuren, einfach ungesättigt":"0","Fettsäuren, mehrfach ungesättigt":"0","Cholesterin":"0","Kohlenhydrate":"1.9","Zucker":"1.9","Stärke":"0","Nahrungsfasern":"0","Protein":"0","Salz":"0","Alkohol":"31.4","Wasser":"61.6","Betacarotin":"0","Vitamin B1":"0","Vitamin B2":"0","Vitamin B6":"0","Vitamin B12":"0","Niacin":"0","Folat":"0","Pantothensäure":"0","Vitamin C":"0","Vitamin D":"0","Vitamin E":"0","Kalium":"1.9","Natrium":"1.9","Chlorid":"2.9","Calcium":"0","Magnesium":"1","Phosphor":"0","Eisen":"0","Jod":"0","Zink":"0"}
To give more context I have chosen to work with the field "Eisen", I succeeded in picking a value randomly (Which is my first food) but now I am supposed to pick 2 other values (food 2 and food 3) randomly but with the constraints in the step I have introduced in my first paragraphe.
Unfortunately, despite all the answers I find on internet I fail to implement one, here is my code to find one value randomly:
AggregateIterable Rando =
collection.aggregate(Arrays.asList(new Document("$match",
new Document("Eisen",
new Document("$gt", "0"))),
new Document("$sample",
new Document("size", 1L)),
new Document("$unwind",
new Document("path", "$Eisen")))); ArrayList randoList = Rando.into(new ArrayList());
System.out.println("Random" + randoList );
Thank you a lot for your help,
Version: 1.3.0
Graph Requirement: Child will have many parents or none.
Data Node: Id + List [Parent ids]
class Node {
String id;
List<String> parents;
}
Total dataset: 3500 nodes.
GraphType is selected using : Directed + No Multiple Edges + No Self Loops + No Weights + DefaultEdge
Graph Building logic:
Iterate through 3500 nodes
Create the vertex using node ID.
Graph.addVertex(childVertex)
then check if parents exists
if they do, iterate through parents
Create parent id vertex using parent id.
Graph.addVertex(parentVertex)
Graph.addEdge(parentVertex, childvertex)
However, running the same dataset (3500), for 5 times, I am getting different graph size calculated every time using: graph.vertexSet().size(). Expectation is 3500 everytime but its inconsistent.
All 3500 ids are unique and we should have graph size = 3500 but we got something like:
GraphType is - SimpleDirectedGraph and size is variable - 3500, 3208, 3283, 2856, 3284.
Any help would appreciated.
Thanks
I have a bigger problem with the batch import in OrientDB when I use Java.
My data is a collection of recordID's and tokens. For each ID exists a set of tokens but tokens can be in several ID'S.
Example:
ID Tokens
1 2,3,4
2 3,5,7
3 1,2,4
My graph should have two types of verticies: rIDClass and tokenClass. I want to give each vertex an ID corresponding to the recordID and the token. So the total number of tokenClass vertices should be the total number of unique tokens in the data. (Each token is only created once!)
How can I realize this problem? I tried the "Custom Batch Insert" from the original documentation and I tried the method "Batch Implementation", described in the blueprints documentation.
The problem at the first method is that OrientDB creates for each inserted token a separate vertex with a custom ID, which is set by the system itself.
The problem at the second method is when I try to add a vertex to the batchgraph I can't set the corresponding vertex Class and additionally I get an Exception. This is my code from the second method:
BatchGraph<OrientGraph> bgraph = new BatchGraph<OrientGraph>(graph, VertexIDType.STRING, 1);
Vertex vertex1 = bgraph.addVertex(1);
vertex1.setProperty("uid", 1);
Maybe someone has a solution.
Vertex vertex2 = bgraph.addVertex(2);
vertex2.setProperty("uid", 2);
Edge edge1 = graph.addEdge(12, vertex1 , vertex2, "EdgeConnectClass");
And I get the following Exception:
Exception in thread "main" java.lang.ClassCastException:
com.tinkerpop.blueprints.util.wrappers.batch.BatchGraph$BatchVertex cannot be cast to com.tinkerpop.blueprints.impls.orient.OrientVertex
at com.tinkerpop.blueprints.impls.orient.OrientBaseGraph.addEdge(OrientBaseGraph.java:612)
at App.indexRecords3(App.java:83)
at App.main(App.java:47)
I don't know if I understood correctly but, if you want a schema like this:
try this:
Vertex vertex1 = g.addVertex("class:rIDClass");
vertex1.setProperty("uid", 1);
Vertex token2 = g.addVertex("class:tokenClass");
token2.setProperty("uid", 2);
Edge edge1 = g.addEdge("class:rIDClass", vertex1, token2, "EdgeConnectClass");
Hope it helps
Regards
I have searched for two days and haven't seen any code that is closed to this. This is the only code in java that I seen and it's not exactly what I wanted.
conn.transact(list(list("db.fn/cas", datomic_id, "attribute you want to update", old value, new value))).get();
I have tried this code with a single value in the old value and a single value in the new value but it just stack the information instead of overlaying it.
Example: old value is chicken and new value is fish. After the transaction, it's [chicken, fish] instead of what I expected to be just [fish] and chicken will be move into archive(history).
So the question is, how do you ref the old array value and how do you give the new value an array so it'll update as what I expected to be as stated above.
I remember reading somewhere that under the hood it's just a series of values linking to one attribute. If this is the case does that mean that I have to find the datomic id of the string and change it? Also have to remove it if it's not on the new list?
FYI, these are the generic transaction functions I currently use for this kind of task (declared from Clojure, but should be fairly easy to adapt to Java if required):
[{:db/ident :bsu.fns/replace-to-many-scalars,
:db/doc "Given an entity's lookup ref, a to-many (scalar) attribute, and a list of new values,
yields a transaction that replaces the old values by new ones"
:db/id (d/tempid :db.part/user),
:db/fn (d/function
'{:lang :clojure,
:imports [],
:requires [[datomic.api :as d]],
:params [db entid attr new-vals],
:code (let [old-vals (if-let [e (d/entity db entid)] (get e attr) ())
to-remove (remove (set (seq new-vals)) old-vals)]
(concat
(for [ov to-remove] [:db/retract entid attr ov])
(for [nv new-vals] [:db/add entid attr nv]))
)}),
}
{:db/ident :bsu.fns/to-many-retract-all-but,
:db/doc "Given an entity lookup ref, a to-many (entity) attribute, and a list of lookup refs
expands to a transaction which will retract all the [origin `to-many-attr` target] relationships but those for which target is among the `to-spare-lookup-refs`"
:db/id (d/tempid :db.part/user),
:db/fn (d/function
'{:lang :clojure,
:imports [],
:requires [[datomic.api :as d]],
:params [db origin to-many-attr to-spare-lookup-refs],
:code (let [old-targets-ids (d/q '[:find [?t ...] :in $ ?to-many-attr ?origin :where [?origin ?to-many-attr ?t]]
db to-many-attr origin)
to-spare-ids (for [lr to-spare-lookup-refs] (:db/id (d/entity db lr)))
to-delete (->> old-targets-ids (remove (set to-spare-ids)))]
(for [eid to-delete] [:db/retract origin to-many-attr eid])
#_[old-targets-ids to-update-ids to-delete])}),
}]
I don't claim at all they're optimal performance or design-wise, but they've worked for me so far. HTH.
If you need a "last write wins" style consistent solution to replace all values for a particular entity for a card many attribute, your best bet is to go with a transaction function. You could take the following approach:
Get all datoms matching entity + attribute you want to retract all values for.
Generate retractions for all of them.
Create add transactions for all new values (e.g. from a passed collection)
Remove any conflicts (i.e. if you have the same EAV with both an add and a result)
Return the resulting transaction data.
I am using orient 2.1-rc4, executed the following commands: My motive is to fetch only the outgoing vertices path.
1. Correct Result with Simple Graph
create class Depends extends E
create vertex set name="persians"
create vertex set name="vikings"
create vertex set name="teutons"
create vertex set name="mayans"
create vertex set name="aztecs"
select * from v
\# |#RID|#CLASS|name
0 |#9:0|V |persians
1 |#9:1|V |vikings
2 |#9:2|V |teutons
3 |#9:3|V |mayans
4 |#9:4|V |aztecs
create edge Depends from #9:0 to #9:2
create edge Depends from #9:1 to #9:2
create edge Depends from #9:2 to #9:3
create edge Depends from #9:2 to #9:4
SELECT #this.toJSON('fetchPlan:in_*:-2 *:-1') FROM #9:2
{"out_Depends":[{"out":"#9:2","in":{"name":"mayans","in_Depends":["#11:2"]}},{"out":"#9:2","in":{"name":"aztecs","in_Depends":["#11:3"]}}],"name":"teutons"}
Only the outgoing nodes are fetched as expected.
2. Incorrect result
Adding two more vertices:
create vertex set name="britons"
create vertex set name="mongols"
create edge Depends from #9:5 to #9:0
create edge Depends from #9:6 to #9:4
select * from e
----+-----+-------+----+----
\# |#RID |#CLASS |out |in
----+-----+-------+----+----
0 |#11:0|Depends|#9:0|#9:2
1 |#11:1|Depends|#9:1|#9:2
2 |#11:2|Depends|#9:2|#9:3
3 |#11:3|Depends|#9:2|#9:4
4 |#11:4|Depends|#9:5|#9:0
5 |#11:5|Depends|#9:6|#9:4
----+-----+-------+----+----
Trying to fetch the out vertices as per http://orientdb.com/docs/last/Fetching-Strategies.html
SELECT #this.toJSON('fetchPlan:in_*:-2') FROM #9:2
{"out_Depends":["#11:2","#11:3"],"name":"teutons"}
Not all out vertices are fetched
SELECT #this.toJSON('fetchPlan:in_*:-2 *:-1') FROM #9:2
{"out_Depends":[{"out":"#9:2","in":{"name":"mayans","in_Depends":["#11:2"]}},{"out":"#9:2","in":{"in_Depends":["#11:3",{"out":{"name":"mongols","out_Depends":["#11:5"]},"in":"#9:4"}],"name":"aztecs"}}],"name":"teutons"}
Extra vertex mongols fetched, which means the rule has not being applied at other levels. (out_Depends is excluded only from the 0th level)
Adding a [*] to apply the exclusion rule on all levels as per the documentation
SELECT #this.toJSON('fetchPlan:[*]in_*:-2 *:-1') FROM #9:2
{"out_Depends":[{"out":"#9:2","in":{"name":"mayans","in_Depends":["#11:2"]}},{"out":"#9:2","in":{"in_Depends":["#11:3",{"out":{"name":"mongols","out_Depends":["#11:5"]},"in":"#9:4"}],"name":"aztecs"}}],"in_Depends":[{"out":{"name":"persians","out_Depends":["#11:0"],"in_Depends":[{"out":{"name":"britons","out_Depends":["#11:4"]},"in":"#9:0"}]},"in":"#9:2"},{"out":{"name":"vikings","out_Depends":["#11:1"]},"in":"#9:2"}],"name":"teutons"}
This however fetches the entire tree.
Can someone give a suggestion?
I followed your instructions and your query works, can you post the graph's image of your db,please?