I'm trying to write a query that returns a fairly large amount of data (200ish nodes). The nodes are very simple:
public class MyPojo {
    private String type;
    private String value;
    private Long createdDate;
    ...
}
I originally used the Spring Data Neo4j template interface, but found that it was very slow after around 100 nodes were returned.
public interface MyPojoRepository extends GraphRepository<MyPojo> {
    public List<MyPojo> findByType(String type);
}
I turned on debugging to see why it was so slow, and it turned out SDN was making a separate query for each node's labels. This made sense: as I understand it, SDN needs the labels to do its duck-typing. However, Cypher returns all pertinent data in one go, so there should be no need for this.
So, I tried rewriting it as a Cypher query instead:
public interface MyPojoRepository extends GraphRepository<MyPojo> {
    @Query("MATCH (n:MyPojo) WHERE n.type = {0} RETURN n")
    public List<MyPojo> findByType(String type);
}
This had the same problem. I dug a little deeper: while this query returned all node data in one go, it left out the labels. There is a way to get them, which works in the Neo4j console, so I tried it with SDN:
"MATCH(n:MyPojo) WHERE n.type = {0} RETURN n, labels(n)"
Unfortunately, this caused an exception about having multiple columns. After looking through the source code, this makes sense because Neo4j returns a map of returned columns which in my case looked like: n, labels(n). SDN would have no way of knowing that there was a label column to read.
So, my question is this: is there a way to provide labels as part of the Cypher query to prevent needing to query again for each node? Or, barring that, is there a way to feed SDN a Node containing labels and properties and convert it to a POJO?
Note: I realize that the SDN team is working on using Cypher completely under the hood in a future release. As of now, a lot of the codebase uses the old (and, I believe, deprecated) REST services. If there is any future work going on that would affect this, I would be overjoyed to hear about it.
You're right, it would be solvable for the simple use-case and should also be solved.
Unfortunately the current APIs don't return the labels as part of the node so we would have to rewrite the inner workings to generate the extra meta-information and return all of that correctly.
One idea is to use RETURN {id: id(n), labels: labels(n), data: n} as n for the full representation.
The problem is this breaks with user defined queries.
Not sure when and how we can schedule that work. Feel free to raise it as JIRA issue or watch/upvote any related issues.
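For what it's worth, SDN 3.x documents a @MapResult/@ResultColumn mechanism for mapping multi-column query results, which at least avoids the "multiple columns" exception. A sketch under that assumption; the wrapper interface and method names are illustrative, and whether this also eliminates the per-node label lookups depends on SDN's internals, so treat it as an experiment rather than a fix:

import java.util.Collection;
import java.util.List;
import org.springframework.data.neo4j.annotation.MapResult;
import org.springframework.data.neo4j.annotation.Query;
import org.springframework.data.neo4j.annotation.ResultColumn;
import org.springframework.data.neo4j.repository.GraphRepository;

// Wrapper for the two returned columns; the @ResultColumn values must match the RETURN clause.
@MapResult
public interface MyPojoWithLabels {
    @ResultColumn("n")
    MyPojo getPojo();

    @ResultColumn("labels(n)")
    Collection<String> getLabels();
}

public interface MyPojoRepository extends GraphRepository<MyPojo> {
    @Query("MATCH (n:MyPojo) WHERE n.type = {0} RETURN n, labels(n)")
    List<MyPojoWithLabels> findByTypeWithLabels(String type);
}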
I'm coming from a C# background and trying to implement an Android app. In C#/.NET, retrieving specific data from a database is relatively easy using Entity Framework and LINQ; my usual approach is something like this (simplified for clarity):
public IQueryable<T> GetElements<T>()
    where T : class, IDBKeyProvider
{
    return this.db.Set<T>().Where(e => e.Dbstate == (int)DBState.Active);
}
This method call returns a generic IQueryable, and later on I can use the power of deferred execution and expression trees to specify exactly which elements I want using a predicate, loading only the desired elements into memory.
This is something I would very much like to go for in my Android App, however, I'm not exactly sure how I could arrive at a similar result, if I can at all.
I looked into some Java Predicate examples, which seemed promising, and I also found Room to be delightfully familiar. My problem, however, is that I cannot make my queries fully customizable, because Room still needs some hard-coded info about my db (original here):
@Dao
public interface MyDao {
    @Query("SELECT first_name, last_name FROM user WHERE region IN (:regions)")
    public LiveData<List<User>> loadUsersFromRegionsSync(List<String> regions);
}
I could perhaps extract the relevant pieces of information with Java Reflection from the predicate parameter, but I feel this to be a hack rather than a proper solution.
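One possibility, if you can move the predicate-to-SQL translation into your own code, is Room's @RawQuery (available since Room 1.1), which accepts a query built at runtime. A minimal sketch; GenericUserDao and the WHERE clause here are illustrative, not from the original code:

import androidx.lifecycle.LiveData;
import androidx.room.Dao;
import androidx.room.RawQuery;
import androidx.sqlite.db.SimpleSQLiteQuery;
import androidx.sqlite.db.SupportSQLiteQuery;
import java.util.List;

@Dao
public interface GenericUserDao {
    // Room maps the result to User; observedEntities makes the LiveData re-emit on table changes.
    @RawQuery(observedEntities = User.class)
    LiveData<List<User>> loadUsers(SupportSQLiteQuery query);
}

// Usage: assemble the WHERE clause from your predicate at runtime.
SimpleSQLiteQuery query = new SimpleSQLiteQuery(
        "SELECT * FROM user WHERE region = ?",
        new Object[] { "EMEA" });
LiveData<List<User>> users = dao.loadUsers(query);

The trade-off is that you lose the compile-time verification of the SQL that Room's hard-coded @Query strings provide; that seems inherent to making the query fully dynamic.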
Using Spring 5.0.6 and Spring-Data-Mongo 2.0.7, I have an issue when fetching entities being transformed into the wrong class. See the following simplified scenario:
Entity setup:
public class PersistableObject {
    @Id @Field("_id") private String id;
}
@Document(collection = "myapp_user")
public class User extends PersistableObject {...}
public class RealUser extends User {...}
public class VirtualUser extends User {...}
So, there is a common MongoDB collection storing both types of User, discriminated by the automatically added _class property.
Furthermore, there is a Repository into which the MongoTemplate is injected.
@Autowired
private org.springframework.data.mongodb.core.MongoTemplate template;
Everything is fine so far. Now, if I want to fetch all documents that contain a RealUser, I could call this:
template.findAll(RealUser.class)
I'd expect the template to find all documents that have the discriminator property _class set to com.myapp.domain.RealUser.
But this doesn't work as expected: I get all the VirtualUsers as well, squeezed into objects of type RealUser, with all VirtualUser-specific properties missing and all RealUser-specific properties set to null.
Furthermore, when I then save such a User, which is actually a VirtualUser in MongoDB but has been squeezed into a RealUser class, Spring changes the _class property to the wrong type, magically converting a VirtualUser into a RealUser.
So both methods here would load the entire collection and squeeze all objects into the specified class, even if it is the wrong one:
template.findAll(VirtualUser.class)
template.findAll(RealUser.class)
This behavior is probably not desired, or if it is, it's extremely misleading and harmful. You can easily shred your data with this.
Can anyone shed some light on this?
I've created a ticket in Spring's JIRA. Find Oliver's comment below:
The method actually works as expected but I agree that we need to improve the JavaDoc. The method is basically specified as "Load the documents from the collection the given type is configured to be persisted in and map all of them (hence the name) to the given type". The type given to it is not used as a type mapping criterion at the same time. Every restriction you want to apply on the documents returned needs to be applied through a Query instance, which exposes a ….restrict(…) method that allows to only select documents that carry type information.

The reason that findAll works the way it works is that, generally speaking (i.e. without an inheritance scenario in place), we need to be able to read all documents, even if they don't carry any type information. Assume a collection with documents representing people that have not been written using Spring Data. If findAll(Person.class) applied type restrictions, the call would return no documents even though documents were present. Unfortunately we don't know whether the collection about to be queried carries type information; in fact, some documents might carry it and some might not. The only way to reasonably control this is to let the user decide, which she can do by either calling Query.restrict(…) or not: the former selects only documents with type information, the latter all of them.

As I said, I totally see that the JavaDoc might be misleading here. I'm gonna use this ticket to improve on that. Would love to hear if the usage of Query.restrict(…) allows you to achieve what you want.
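Following Oliver's suggestion, a minimal sketch of the restricted query, using the Query API from spring-data-mongodb:

import java.util.List;
import org.springframework.data.mongodb.core.query.Query;

// Only documents whose _class marks them as RealUser (or a subtype) are returned.
Query onlyRealUsers = new Query().restrict(RealUser.class);
List<RealUser> realUsers = template.find(onlyRealUsers, RealUser.class);

With the restriction in place, the VirtualUser documents are filtered out by the query itself instead of being force-mapped into RealUser objects.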
I'm struggling to find any type of documentation on how to query more complex attributes in my models.
For example I have
public class MyEmbedded {
    @EmbeddedId
    private MyEmbeddedPK embeddedPK;
}

@Embeddable
public class MyEmbeddedPK {
    private Integer age;
    private Integer zip;
}
In my repository I am implementing the CrudRepository and would expect to be able to do
public List<MyEmbedded> findByageAndZip(String age, String zip);
But that doesn't seem to work. The documentation doesn't really say anything about @EmbeddedId. The same goes for querying a @OneToMany attribute; I never found anything for that either.
Documentation I am referencing: http://docs.spring.io/spring-data/jpa/docs/current/reference/html/#repository-query-keywords
Is there any better documentation on how this query creation works?
I'm not sure whether Spring Data JPA supports this functionality; deriving queries from embedded id properties seems a bit complex, as the same names could equally refer to state fields of the enclosing entity itself. But this can be achieved easily in JPQL by specifying the query with @Query and @Param:
#Query("SELECT m FROM MyEmbedded m WHERE m.embeddedPK.age = :age AND m.embeddedPK.zip = :zip")
public List<MyEmbedded> findByageAndZip(#Param("age") String age, #Param("zip") String zip);
Also, don't forget to declare your repository with the following signature, as the Spring Data runtime needs to know the actual type of the ID class:
@Repository
public interface MyEmbeddedRepository extends CrudRepository<MyEmbedded, MyEmbeddedPK> {..}
I think I found my answer. Oddly enough, it was in the documentation, but I just didn't pick up on it: you just need to combine the nested properties via camel case. I could have sworn I tried this, but apparently I had my cases mixed up.
http://docs.spring.io/spring-data/jpa/docs/current/reference/html/#repositories.query-methods.query-property-expressions
Section 4.4.3. Property expressions
However, you can also define constraints by traversing nested properties. Assume a Person has an Address with a ZipCode. In that case a method name of
List<Person> findByAddressZipCode(ZipCode zipCode);
creates the property traversal x.address.zipCode. The resolution algorithm starts by interpreting the entire part (AddressZipCode) as the property and checks the domain class for a property with that name (uncapitalized). If the algorithm succeeds, it uses that property. If not, the algorithm splits the source at the camel-case parts from the right side into a head and a tail and tries to find the corresponding property, in our example, AddressZip and Code. If the algorithm finds a property with that head, it takes the tail and continues building the tree down from there, splitting the tail up in the way just described. If the first split does not match, the algorithm moves the split point to the left (Address, ZipCode) and continues.
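Applied to the entity above, the derived method just traverses the embedded id property by name. A sketch, assuming the field names from the question (embeddedPK.age and embeddedPK.zip):

public interface MyEmbeddedRepository extends CrudRepository<MyEmbedded, MyEmbeddedPK> {
    // Resolved as x.embeddedPK.age and x.embeddedPK.zip
    List<MyEmbedded> findByEmbeddedPKAgeAndEmbeddedPKZip(Integer age, Integer zip);
}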
I am struggling with Solr to build a better search than my code's current implementation.
The current code looks into some caches/hashmaps to retrieve data, and what I want to do is optimize the query response time.
I already indexed two versions of documents (simple documents which do not contain other objects inside them, only strings and ints), and everything works great.
But now I'm facing another problem while trying to index another core for a more complex bean.
I have a bean like:
public class Person {
    String name;
    String surname;
    List<Adresse> adress;
    List<Stuff> stuff;
    List<HashMap<String, String>> otherStuff;
}
Solr only helped me by mapping the simple lists and the list of maps, so I manually mapped the remaining members (lists) by transforming each object into a list of strings and, vice versa, each string back into an object set on the instance being built.
But this approach caused really slow response times for my queries.
I am also facing another problem: execution gets very slow as soon as I retrieve more than 10 documents from the index.
Can you please help me with suggestions/ideas on how to make all this faster?
If you have a very complex structure, you might be better off not trying to get it back from Solr. Instead, define the fields with stored=false and just get back IDs, then round-trip to your original source to fetch the actual objects.
Solr then becomes just the way to search, and you can skip sending it any fields you are not searching against.
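A minimal SolrJ sketch of that pattern, assuming an "id" field and an already configured solrClient (both are illustrative names): fetch only the IDs from Solr, then resolve the full objects against the primary store.

import java.util.ArrayList;
import java.util.List;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;

SolrQuery query = new SolrQuery("name:john");
query.setFields("id");   // don't ask Solr to serialize the complex fields back
query.setRows(50);
QueryResponse response = solrClient.query(query);

List<String> ids = new ArrayList<String>();
for (SolrDocument doc : response.getResults()) {
    ids.add((String) doc.getFieldValue("id"));
}
// now look the ids up in the original caches/maps/database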
I'm working on a desktop application in Java 6, using H2 as the db and Hibernate 3.6.
Because of a construct in a third-party library involving JNI, and some interesting decisions made a priori, I am unable to pass long identifiers through their index code; I can only pass int. These indexes are generated quickly and repeatedly (not my choice) and get handed around via callbacks. However, I can split my expected dataset along the lines of a string value and keep my id size at int without blowing out my ids. To this end, I'm keeping a long value as the pk on the core object, and then using that as a one-to-one into another table, which maps the int id back to the core entity; that int, combined with the string, is unique.
So I've considered embedded compound keys and such in hibernate, but what I REALLY want is to just have this "extra" id that is unique within the context of the extra string key, but not necessarily universally unique.
So something like (not adding extraneous code/annotations):
@Entity
public class Foo {
    ...
    @Id
    public Long getId() {...}
    ...
    @OneToOne
    @PrimaryKeyJoinColumn
    public ExtraKey getExtra() {...}
}

@Entity
public class ExtraKey {
    ...
    @Id
    public Long getFooId() {...}
    ...
    public Integer getExtraId() {...}
    ...
    public String getMagicString() {...}
}
In that case, I could even remove the magicString and just have the fooId -> extraId mapping in this table, with the extraId + magicString pair in another table where magicString is unique. However, I want Hibernate to allow the creation of new magicStrings at whim (an app requirement), ideally one per row in a table, and then have Hibernate update the extraId associated with that magicString via incrementing or some other strategy.
Perusing the Hibernate manuals and trying a few tests on my own in a separate environment has not yielded what I want (dynamically created, named, sequential ids, basically), so I was hoping for SO's input. It's entirely possible I'll have to hand-code all of it myself in the db with sequences, or by splitting a long and doing logic on the upper and lower halves, but I'd really rather not, as I might have to maintain this code someday (really likely).
Edit/Addendum
As a sneaky way of getting around this, I'm just adding the extraId to the Foo object (ditching the ExtraKey class) and generating it from a singleton that, at load time, does a group-by select over the backing Foo table, returning each magicKey and its max(extraId). When I create a new Foo, I ask that object (thread-safe) to hand me the next extraId for the given magicKey, push it into the Foo, and store it; this updates my effective extraId for each magicKey on the next app reload without an extra table. It costs me one group-by query on the first request for a new extraId, which is suboptimal, but it's fast enough for what I need, simple enough to maintain in the future, and all contained in an external class, so I could replace it in one place if I ever come up with something more clever (a sketch of the idea is below). I do dislike having the extra "special query" in my DAO for this purpose, but it's easy enough to remove in the future, and well documented.
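A rough sketch of that allocator (Java 6 era, so no lambdas; the class name and HQL are illustrative of the approach, not the exact code):

import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import org.hibernate.Session;

public class ExtraIdAllocator {
    private final ConcurrentHashMap<String, AtomicInteger> counters =
            new ConcurrentHashMap<String, AtomicInteger>();

    // One group-by query at load time seeds the per-magicKey high-water marks.
    public ExtraIdAllocator(Session session) {
        List<Object[]> rows = session.createQuery(
                "select f.magicString, max(f.extraId) from Foo f group by f.magicString")
                .list();
        for (Object[] row : rows) {
            counters.put((String) row[0], new AtomicInteger(((Number) row[1]).intValue()));
        }
    }

    // Thread-safe: hands out the next extraId for the given magicKey.
    public int next(String magicKey) {
        AtomicInteger counter = counters.get(magicKey);
        if (counter == null) {
            AtomicInteger fresh = new AtomicInteger(0);
            counter = counters.putIfAbsent(magicKey, fresh);
            if (counter == null) {
                counter = fresh;
            }
        }
        return counter.incrementAndGet();
    }
}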
Maybe I still didn't understand your problem properly, but I think you can consider using Hibernate's hilo algorithm. It will generate identifiers that are unique across the whole database, based on a table that Hibernate creates and manages. More details here:
http://docs.jboss.org/hibernate/core/3.5/reference/en/html/mapping.html#mapping-declaration-id
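For reference, a hilo-generated id in Hibernate 3.6 annotations might look like the sketch below. The generator name and the table/column parameters are illustrative (the values shown are the legacy defaults), and note that the ids it produces are unique database-wide, not per magicString:

import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import org.hibernate.annotations.GenericGenerator;
import org.hibernate.annotations.Parameter;

@Entity
public class Foo {
    @Id
    @GeneratedValue(generator = "foo_hilo")
    @GenericGenerator(name = "foo_hilo", strategy = "hilo",
            parameters = {
                    @Parameter(name = "table", value = "hibernate_unique_key"),
                    @Parameter(name = "column", value = "next_hi"),
                    @Parameter(name = "max_lo", value = "100")
            })
    private Long id;

    // ... rest of the entity
}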