Gremlin Java: Get vertices by max edge count

Suppose we have a graph like this.
(User)-[:KNOWS]->(Friend)
I want to count all outgoing relationships from each User, group them by user, and then filter on a condition (for example, more than 10 KNOWS edges).
This is what I did:
g.V().hasLabel("Friend").in("KNOWS").hasLabel("User").groupCount().next()
This returns a map, so I can apply the filter condition to the results. My question is: is there a more efficient way to do this?

I am not sure if I understood your question correctly, but it sounds like you just want to filter all users based on their number of outgoing edges with the label KNOWS.
In that case you can directly start at the User vertices and filter them based on the number of their KNOWS edges instead of doing a groupCount:
g.V().hasLabel('User').where(outE('KNOWS').count().is(gt(10)))
So far I have ignored any performance constraints. But as Paul Jackson mentioned in his comment, it is not efficient to execute such a query in OLTP mode like this. Neo4j will probably iterate over all vertices, check whether each one has the label User, and then count its KNOWS edges.
You basically have two options to speed this up:
As Paul Jackson suggested: Add the edge count as a property to the vertices, pre-compute it and then index this property or
Use something like Spark-Gremlin if you really want to compute the edge count on the fly.
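The first option (pre-computing the count) can be illustrated outside of any graph database. The following is a plain-Java sketch, not Gremlin code: the class KnowsCounter and its methods are hypothetical names, and it assumes the count is updated every time a KNOWS edge is written. In a real setup the count would live in a vertex property with an index on it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for a pre-computed "KNOWS count" vertex property.
class KnowsCounter {
    private final Map<String, Integer> knowsCount = new HashMap<>();

    // Call whenever a KNOWS edge is written, so the count is maintained at write time.
    // friendId is not needed for counting; it is kept only to mirror the edge shape.
    void addKnowsEdge(String userId, String friendId) {
        knowsCount.merge(userId, 1, Integer::sum);
    }

    // Read side: no edge traversal, only a scan over the stored counts.
    // (A database with an index on the property could answer this with a range lookup.)
    List<String> usersWithMoreThan(int threshold) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : knowsCount.entrySet()) {
            if (entry.getValue() > threshold) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
}
```

The trade-off is the usual one: writes become slightly more expensive and the stored count has to be kept consistent when edges are deleted, but the filter no longer has to count edges per vertex.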

Related

Alphanumeric sorting in Gremlin

I have a database and am trying to find a safe way to read its data into an XML file. Reading the vertices is not the problem. But when reading the edges, the app runs for a while and then fails with a heap space exception.
g.V().has(Timestamp)
.order()
.by(Timestamp, Order.asc)
.range(startIndex, endIndex)
.outE()
Data to be taken: e.g. label, ID, properties.
I use the timestamp of the vertices as a reference for sorting to avoid capturing duplicates.
But because a few vertices account for most of the out-edges, the program runs out of heap space. My question is: can I use the alphanumeric ID (UUID) of the edges to sort them safely, or is there another or better way to reach the goal?
Something like this:
g.E().order().by(T.id, Order.asc).range(startIndex, endIndex)
T.id would be a fine property to order the edges by. However, the problem is not related to the property chosen; it is related to the sheer number of edges being exported. Neptune, as with other databases, has to retrieve all the edges in order to then order them. Retrieving all these edges is why you are running out of heap. To solve this problem you can either increase the instance size to get additional memory for the query or export the data differently. If you take a look at this utility https://github.com/awslabs/amazon-neptune-tools/tree/master/neptune-export you can see the current recommended best practice. This utility will export the entire graph as a CSV file. In your use case you may be able to either use this utility and then transform the CSV into an XML document, or modify the code to export as an XML document.

Count number of agents at a node

I have set up a polygonal node (called area_wait) that a single type of agent remains at whilst in a queue. I'm trying to find the number of agents at a node using a function. I don't want to count the agents in the queue, as I have set up one queue for all waiting agents, which might be at different nodes.
I'm using the following code which always returns zero.
int count_X = area_wait.agents().size();
In fact the list is empty when I check with:
List list_X = area_wait.agents();
What am I doing wrong? Thanks in advance.
I will give you the same answer I gave in the AnyLogic users group, which can be found here: https://www.linkedin.com/feed/update/urn:li:activity:6721800348408791040
The function you are trying to use doesn't work unless the thing inside the node is a transporter, and only if the node has a speed or access restriction. This might either be a bug or something explained poorly in the documentation, but it sounds like a bug to me.
If you want to know the number of agents in a node, you can use the alternative method count(myAgents,a->a.getNetworkNode()!=null && a.getNetworkNode().equals(yourNode)), but this fails if you change the node position without a moveTo block or some other natural movement (such as defining your node in the agent location parameter of a block). So that's another bug, but maybe it won't apply to you.
In summary: no easy and safe solution as far as I know.

Clustering of images to evaluate diversity (Weka?)

Within a university course I have some features of images (as text files). I have to rank those images according to their diversity.
The idea I have in mind is to feed a k-means clusterer with the images and then compute the Euclidean distance from the images within a cluster to the cluster's centroid. Then rotate between the clusters and always take the (next) closest image to the centroid. I.e., return the closest to centroid 1, then the closest to centroid 2, then 3, and so on, then the second closest to centroids 1, 2, 3, and so on.
First question: would this be a clever approach? Or am I on the wrong path?
Second question: I'm a bit confused. I thought I'd feed the data to Weka and it'd tell me "hey, if I were you, I'd split this data into 7 clusters", or something like that. I mean, that it'd be able to give me some information about the clusters I need. Instead, to use simplekmeans I'm supposed to know a priori how many clusters I'll use... how could I possibly know that?
One example of what I mean: let's say I have 3 mono-color images: light-blue, blue, red.
I thought Weka would notice that the 2 blues are similar and cluster them together.
Btw I'm kind of new to Weka (as you might have seen), so if you could provide some information on which functions I might want to use (and why :P) I'd be grateful!
Thank you!
Simple K-means is an algorithm where you have to specify the number of clusters in the data set.
If you don't know how many clusters there might be, it's better to use a different algorithm or to find out the number of clusters first.
You can use X-means, where you don't need to specify the k parameter (http://weka.sourceforge.net/doc.packages/XMeans/weka/clusterers/XMeans.html):
X-Means is K-Means extended by an Improve-Structure part. In this part of the algorithm the centers are attempted to be split in its region. The decision between the children of each center and itself is done comparing the BIC-values of the two structures.
Or you can inspect a cut point chart based on AHC, the hierarchical clustering algorithm (https://en.wikipedia.org/wiki/Hierarchical_clustering),
and then deduce the number of clusters from it.
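Whichever clusterer ends up being used, the rotation scheme described in the question can be sketched in plain Java. This is only an illustration under assumptions: the class DiversityRanking and the type ClusteredImage are hypothetical, and the cluster index and the distance to the centroid are assumed to have been computed already (for example by Weka).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class DiversityRanking {
    // Hypothetical holder: an image already assigned to a cluster,
    // with its Euclidean distance to that cluster's centroid.
    static final class ClusteredImage {
        final String id;
        final int cluster;
        final double distanceToCentroid;

        ClusteredImage(String id, int cluster, double distanceToCentroid) {
            this.id = id;
            this.cluster = cluster;
            this.distanceToCentroid = distanceToCentroid;
        }
    }

    static List<String> rank(List<ClusteredImage> images) {
        // Group by cluster; the TreeMap keeps the cluster order stable.
        Map<Integer, List<ClusteredImage>> byCluster = new TreeMap<>();
        for (ClusteredImage img : images) {
            byCluster.computeIfAbsent(img.cluster, k -> new ArrayList<>()).add(img);
        }
        // Within each cluster, closest to the centroid first.
        for (List<ClusteredImage> members : byCluster.values()) {
            members.sort(Comparator.comparingDouble((ClusteredImage i) -> i.distanceToCentroid));
        }
        // Round-robin: closest of every cluster, then second closest, and so on.
        List<String> ranking = new ArrayList<>();
        for (int round = 0; ranking.size() < images.size(); round++) {
            for (List<ClusteredImage> members : byCluster.values()) {
                if (round < members.size()) {
                    ranking.add(members.get(round).id);
                }
            }
        }
        return ranking;
    }
}
```

Clusters of different sizes simply drop out of the rotation once they are exhausted, so the tail of the ranking is dominated by the largest clusters.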

How to calculate what entity is drawn first (2D java rendering using Slick2D)

Explanation:
EDIT3: MASSIVE CLEAN UP as this was not clearly explained.
I'm trying to build up a 2D level out of tiles and entities. Where the entities are for example trees that can be cut. I need to store the data (how many chops are left for example) for each entity. I want them to have a more dynamic position (doubles) and a more dynamic sprite-width and height. My tiles are 32x32 pixels whilst my trees are not going to be one tile but a sprite with greater height than width.
I want objects that are closer to the top of the level to be drawn before the other objects. That way a character behind a tree cannot be rendered inside or in front of the tree. This also applies to other objects of the same kind (like trees).
I think it might be too inefficient to loop through the entities and calculate each entity's position, since there may be a LOT of entities in the level.
While doing some research I found that certain libraries allow storing both the object and its position in a map (BiMap in Google's Guava).
Questions:
Is this an efficient manner, but are there some changes that can be applied to make the rendering more efficient (if so, what could be optimized)?
Or is this an inefficient manner to render the entities, and is there a better way (if so, what other methods are there in Java)?
Or is there something else that I haven't listed?
EDIT2: I looked through the link I've posted in the edit below.
It seems that Google's Guava (I think that's all correct) has BiMaps. Is there an equivalent to this in regular Java? Otherwise Google's Library will probably be able to fix this for me. But I'd rather not install such a huge library for this one interface.
At last:
It's very much possible that the answer has been right in front of my nose here on StackOverflow or somewhere else on the internet. I've tried my best searching but found nothing.
If you've got any suggestions for search queries or any relevant links that might be of use to me, I would appreciate it if you'd post them in the comments.
Thanks for taking the time to read through this/helping me ;)
EDIT:
I have looked at: Efficient mapping of game entity positions in Java.
I think it's narrowly related to this question. But I think it's just not what I'm looking for. I am going to look through the second answer very closely since that might be able to solve this for me.. but I'm not sure.
SOLUTION
The solution is to have an array, ArrayList or some other structure to keep track of your entities. Every tick/update you take all the objects' Y coordinates and store them in another array/ArrayList/map/other with the same size as the one the entities are stored in. At the position corresponding to each entity you store its Y. Then you sort it with another loop or using http://www.leepoint.net/notes-java/data/arrays/70sorting.html .
Then when rendering:
for(int i = 0; i < entityArray.length; i++)
entityArray[i].render();
Of course, you can render more efficiently by rendering only what's on or near your screen.
But that's basically how one does this in 2D top-view/front-view.
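As a minimal sketch of that Y-ordering step, assuming a hypothetical Entity class with a y field (none of these names come from Slick2D), the entities can be sorted directly instead of keeping a parallel array of Y values:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class YSortDemo {
    // Hypothetical minimal entity: only the fields needed for draw ordering.
    static class Entity {
        final String name;
        double x;
        double y;

        Entity(String name, double x, double y) {
            this.name = name;
            this.x = x;
            this.y = y;
        }
    }

    // Returns a new list ordered by ascending Y, so entities nearer
    // the top of the level come first and are drawn first.
    static List<Entity> drawOrder(List<Entity> entities) {
        List<Entity> sorted = new ArrayList<>(entities);
        sorted.sort(Comparator.comparingDouble((Entity e) -> e.y));
        return sorted;
    }
}
```

Which Y to compare is a design choice; for sprites taller than one tile it is common to use the Y of the sprite's bottom edge, so that a character standing below a tree's base is drawn over it.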
In my own 2D game attempts I came up with the following solution:
use an enum to specify the different types of objects in the game and give them priorities (sample order: grass, rivers, trees, critters, characters, clouds, birds, GUI)
make all visual objects implement an interface which allows for getting this DrawPriority enum
use a sorted list implementation with a comparator based on the enum
use the list to draw all elements
That way computing the order is not very expensive, because it is done only on visual object insertion (which in my case happens while loading a level).
And since you will already be using a comparator, do an x/y comparison when the enum priority values are the same. This should solve your Y-order draw problem.
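The priority-plus-Y comparator could look roughly like this. It is a sketch under assumptions: DrawPriority comes from the description above, while Drawable, Sprite and DRAW_ORDER are hypothetical names, and a plain list sorted on demand stands in for the sorted list implementation.

```java
import java.util.Comparator;

class DrawPriorityDemo {
    // Declaration order doubles as draw order: earlier constants are drawn first.
    enum DrawPriority { GRASS, RIVER, TREE, CRITTER, CHARACTER, CLOUD, BIRD, GUI }

    // Hypothetical interface implemented by every visual object.
    interface Drawable {
        DrawPriority priority();
        double y();
    }

    // Compare by layer first, then by Y inside the same layer.
    static final Comparator<Drawable> DRAW_ORDER =
        Comparator.comparing(Drawable::priority).thenComparingDouble(Drawable::y);

    // Hypothetical concrete visual object used for illustration.
    static class Sprite implements Drawable {
        final String name;
        final DrawPriority priority;
        final double y;

        Sprite(String name, DrawPriority priority, double y) {
            this.name = name;
            this.priority = priority;
            this.y = y;
        }

        @Override public DrawPriority priority() { return priority; }
        @Override public double y() { return y; }
    }
}
```

Usage would be something like items.sort(DrawPriorityDemo.DRAW_ORDER) after loading the level. Note that once the Y tie-break is part of the comparator, moving objects can change the order, so the list has to be re-sorted (or the moved object re-inserted) when positions change.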

Draw nodes in e.g. a Chord ring

I have a set of nodes that I would like to put into a ring. They all have a numeric property which I would like to use as a reference when putting them into the ring.
E.g., the node with param 32 comes after the node with param 22.
What I really need is a library (or something like that) which can make it possible to have the correct "distance" between the nodes, e.g: between 22 and 32 is 10 "units", and between 32 and 35 is 3 "units" where "units" may be an empty numeric slot.
Sounds like you need a sorted list where the end links to the start. I know of no standard implementation, but it would be pretty easy to implement one yourself.
Something like a doubly linked list with the head and tail connected would work. Add operations would have to traverse the list to find the appropriate position to insert into, making insert an O(n) operation. This would make your list perform relatively poorly, with pretty much all standard list operations being O(n).
You could implement a distanceToNext and/or distanceToPrevious pretty easily by just getting the values of the current and next/previous nodes and returning the difference.
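A possible sketch of such a model follows. Note that it swaps the doubly linked list for a java.util.TreeSet, which keeps the values sorted and gives O(log n) insertion; the wrap-around from the largest to the smallest value is handled by hand. The class SortedRing and the ringSize parameter (the total number of numeric slots on the ring) are assumptions made for illustration.

```java
import java.util.TreeSet;

class SortedRing {
    private final TreeSet<Integer> ids = new TreeSet<>();
    private final int ringSize; // assumed total number of slots on the ring

    SortedRing(int ringSize) {
        this.ringSize = ringSize;
    }

    void add(int id) {
        // Normalise into [0, ringSize) so every id maps to a slot.
        ids.add(Math.floorMod(id, ringSize));
    }

    // Successor on the ring: the next larger id, wrapping to the smallest one.
    // Throws NoSuchElementException if the ring is empty.
    int next(int id) {
        Integer higher = ids.higher(id);
        return higher != null ? higher : ids.first();
    }

    // Number of slots between id and its successor, going around the ring.
    // For a ring with a single node this returns 0.
    int distanceToNext(int id) {
        return Math.floorMod(next(id) - id, ringSize);
    }
}
```

With a ring of 64 slots holding the nodes 22, 32 and 35, distanceToNext(22) is 10 and distanceToNext(32) is 3, matching the example in the question, while the distance from 35 wraps around the end of the ring back to 22.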
Edit:
Just realised from the question title that you are probably looking for some GUI library to draw these and I just hinted at the model you might use. I'll have a think about the GUI.
Edit 2:
Your problem boils down to how do you draw a polygon when you only know the length of the sides. I asked on the maths stack exchange for you.
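If the nodes are allowed to sit on a circle rather than on the corners of a polygon with prescribed side lengths, the layout becomes much simpler: each value is mapped to an angle proportional to its slot, so the arc between two neighbours is proportional to their numeric distance. This is an alternative to the polygon formulation above, not a solution of it, and the names RingLayout, position and ringSize are hypothetical.

```java
class RingLayout {
    // Returns {x, y} for a node id on a circle with the given centre and radius.
    // Slot 0 is placed at the top; in screen coordinates (y grows downward)
    // increasing ids move clockwise.
    static double[] position(int id, int ringSize, double cx, double cy, double radius) {
        double angle = 2 * Math.PI * id / ringSize - Math.PI / 2;
        return new double[] {
            cx + radius * Math.cos(angle),
            cy + radius * Math.sin(angle)
        };
    }
}
```

The resulting coordinates can be passed to whatever drawing API is in use, for example to draw a small circle per node and a line or arc between neighbours.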
