java complex search

I am trying to build a Java-based web application using Spring Boot, with a REST architecture built on Spring MVC, for the following purpose:
Search for car parts by multiple sets of criteria.
I'll try to explain it through different scenarios:
Find part A of Brand B for Make C, Model D, year X.
Find out which parts of Brand B are available for Make C, Model D, year X.
Search for multiple items at once for Vehicle C, Model D, year X. For example, if an engine is damaged, I want to quickly find out whether I have the parts (pistons, cylinders, gaskets, etc.) in stock. The result of this search is a list of the parts with their brands and prices.
My primary concerns at the moment are the following two questions:
How should I model the data so that these search scenarios can be handled efficiently? That is, what should the relationships between the entities look like, both in Java and in the persistence layer?
What kind of database should I use? SQL or NoSQL?
All the REST endpoints will return JSON objects.
I will be using Angular with Bootstrap for the front end.

Isn't this scenario a typical "faceted search"? I think that any solution designed to implement faceted search should work fine. For example Solr or Elasticsearch.
The advantage of the "faceted search" for the end users is the option to refine the search. Users can start with a broad search and the system will provide refining filter criteria, based on the current search results.
Today, all the major e-commerce sites have some kind of faceted search, and most search engines support this type of browsing.
It seems to me that engines like Solr and Elasticsearch are the more natural solution, but even a standard RDBMS like Oracle has support for faceted search.
Faceted search in Solr
Filters vs. Facets: Definitions
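
To make the filter/facet distinction concrete, here is a minimal in-memory sketch of what an engine like Solr or Elasticsearch computes for you on the server side. The Part record, its field names, and the sample values in the usage note are assumptions for illustration, not anything from the question:

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical in-memory stand-in for what a search engine does server-side.
class FacetSketch {

    // Illustrative part record; the field names are assumptions.
    record Part(String name, String brand, String make, String model, int year) {}

    // Filter step: keep only the parts matching the criteria chosen so far
    // (null means "not chosen yet").
    static List<Part> filter(List<Part> parts, String make, String model) {
        return parts.stream()
                .filter(p -> make == null || p.make().equals(make))
                .filter(p -> model == null || p.model().equals(model))
                .collect(Collectors.toList());
    }

    // Facet step: count the remaining results per value of one field,
    // so the UI can offer refinements together with their hit counts.
    static Map<String, Long> facet(List<Part> results, Function<Part, String> field) {
        return results.stream()
                .collect(Collectors.groupingBy(field, TreeMap::new, Collectors.counting()));
    }
}
```

Calling FacetSketch.facet(FacetSketch.filter(parts, "MakeC", null), FacetSketch.Part::brand) would give the brand counts to show next to the results for that make. A real engine does the same thing against an index instead of a list.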

I would put the focus on modelling it cleanly rather than efficiently - unless you already know that you will have a massive amount of data. Having it structured cleanly will make it easy to optimise if that is required later.
Normalise your data - there will be plenty of information out there on how to do this. The car industry is becoming more consolidated so many parts are now shared across different models and even different brands.
An ORM like Hibernate can be used to map your entities to your tables. Spring provides extra support in this area, which you might consider since you already plan on using Spring MVC.
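
As a rough illustration of what a normalised model could look like, here is a plain-Java sketch that runs without any dependencies. With Hibernate, Vehicle and Part would presumably become @Entity classes and Fitment the join table of a many-to-many relation. All class, field and method names are made up for the example:

```java
import java.math.BigDecimal;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Plain-Java sketch of a normalised model (names are assumptions).
class PartCatalogSketch {

    record Vehicle(long id, String make, String model, int year) {}
    record Part(long id, String name, String brand, BigDecimal price) {}
    // One row per (part, vehicle) pair: a shared part is stored once and linked many times.
    record Fitment(long partId, long vehicleId) {}

    private final List<Vehicle> vehicles;
    private final List<Part> parts;
    private final List<Fitment> fitments;

    PartCatalogSketch(List<Vehicle> vehicles, List<Part> parts, List<Fitment> fitments) {
        this.vehicles = vehicles;
        this.parts = parts;
        this.fitments = fitments;
    }

    // Parts fitting the given vehicle; brand and names are optional filters (null = any).
    // One method covers all three scenarios from the question.
    List<Part> find(String make, String model, int year, String brand, Set<String> names) {
        Set<Long> vehicleIds = vehicles.stream()
                .filter(v -> v.make().equals(make) && v.model().equals(model) && v.year() == year)
                .map(Vehicle::id)
                .collect(Collectors.toSet());
        Set<Long> partIds = fitments.stream()
                .filter(f -> vehicleIds.contains(f.vehicleId()))
                .map(Fitment::partId)
                .collect(Collectors.toSet());
        return parts.stream()
                .filter(p -> partIds.contains(p.id()))
                .filter(p -> brand == null || p.brand().equals(brand))
                .filter(p -> names == null || names.contains(p.name()))
                .collect(Collectors.toList());
    }
}
```

The in-memory streams only stand in for the two joins a relational database would perform; the point is that the multi-item search is the same query with a set of part names instead of a single one.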

Related

Generate Search SQL from HTTP GET request parameters

We have a Java web app with a hibernate backend that provides REST resources. Now we're facing the task to implement a generic search that is controlled by the query parameters in our get request:
some/rest/resource?name_like=foo&created_on>=2012-09-12&sort_by_asc=something
or similar.
We don't want to predefine all possible parameters (name, created_on, something).
We don't want to have to analyze the request string to pick up control characters (like >=).
Nor do we want to implement our own grammar to reflect things like _eq, _like, _goe and so on (as an alternative or addition to control characters).
Is there some kind of framework that provides help with this mapping from GET request parameters to database query?
Since we know which REST resource we're GETing we have the entity / table (select). It probably will also be necessary to predefine the JOINs that will be executed in order to limit the depths of a search.
But other than that we want the REST consuming client to be able to execute any search without us having to predefine how a certain parameter and a certain control sequence will get translated into a search.
Right now I'm trying a semi-automatic solution built on Mysema's Querydsl. It allows me to predefine the where columns and sort columns, and I'm working on a simple string comparison to detect things like '_like', '_loe', ... in a parameter and then activate the corresponding predefined part of the search. It's not much different from an SQL string, except that it's SQL-injection-proof and type-safe.
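
The suffix detection described above could look roughly like the following sketch. It is plain Java and deliberately stops before the Querydsl part: the Op enum, the Criterion record and the suffix set are all hypothetical names, and the set of allowed fields is assumed to come from somewhere like the entity's metadata rather than being typed by hand:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch: split "name_like=foo" into (field, operator, value) before building a query.
class QueryParamSketch {

    enum Op { EQ, LIKE, GOE, LOE }

    record Criterion(String field, Op op, String value) {}

    // Hypothetical suffix grammar; a parameter without a suffix means equality.
    private static final Map<String, Op> SUFFIXES = Map.of(
            "_eq", Op.EQ,
            "_like", Op.LIKE,
            "_goe", Op.GOE,
            "_loe", Op.LOE);

    static List<Criterion> parse(Map<String, String> params, Set<String> allowedFields) {
        List<Criterion> criteria = new ArrayList<>();
        for (Map.Entry<String, String> param : params.entrySet()) {
            String field = param.getKey();
            Op op = Op.EQ;
            for (Map.Entry<String, Op> suffix : SUFFIXES.entrySet()) {
                if (field.endsWith(suffix.getKey())) {
                    field = field.substring(0, field.length() - suffix.getKey().length());
                    op = suffix.getValue();
                    break;
                }
            }
            // Reject anything that is not a known column, so clients cannot probe arbitrary fields.
            if (!allowedFields.contains(field)) {
                throw new IllegalArgumentException("Unknown search field: " + field);
            }
            criteria.add(new Criterion(field, op, param.getValue()));
        }
        return criteria;
    }
}
```

Each Criterion would then be translated into a predicate by whatever query layer is in use; since the value is only ever passed on as a bound parameter, the injection safety mentioned above is preserved.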
However, I still have to tell my search object that it should be able to potentially handle a query like "look for a person with name like '???'". Right now this is okay, as we only consume the REST resource internally and isolate the actual search creation quite well. If we need to make a search do more, we can just add more predefinitions for now. But should we make our REST resources public at some time in the future, that won't be so great.
So we're wondering: there has to be some framework, best practice or recommended solution for approaching this. We're not the first who want this. Redmine, for example, offers all of its resources via a REST interface, and I can query at will. Or Facebook with its Graph API. I'm sure those guys didn't just predefine all possibilities but rather created some generic grammar. We'd like to save as much as possible on that effort and use available solutions instead.
Like I said, we're using Hibernate so an SQL or HQL solution would be fine or anything that builds on entities like QueryDsl. (Also there's the security issue concerning SQL injection)
Any suggestions? Ideas? Will we just have to do it all ourselves?

From a .NET perspective the closest thing I can think of would be a WCF data service.
Take a look at the URI conventions specified on the OData website. There is some good information in section 4.5, Filter System Query Option. You'll notice that a lot of the examples on this site are .NET related, but there are other suggestions for getting this to work with Java.

How to model entity relationships in GAEJ?

I would like to know the following (an example would be highly appreciated):
How to model relationships in Google App Engine for Java?
-One to Many
-Many to Many
I searched all over the web and found nothing about Java; all the guides and tutorials are about Python.
I understood from this article that in Python the relationships are modeled using ReferenceProperty. However, I found nothing about this class in the Javadoc reference.
Furthermore, in this article they discussed the following:
there's currently a shortage of tools for Java users, largely due to the relative newness of the Java platform for App Engine.
However, that was written in 2009.
In the end, I modeled the relationships using the ancestor path of each entity. I discovered afterwards that this approach has problems and limits the scalability of the app.
Can you please point me to the Java equivalent of Python's ReferenceProperty class? Or can you give me an example of how to model the relationships in App Engine using the Java datastore low-level API?
Thanks in advance for your help.

Creating relationships between entities in GAE/J depends on the datastore API that you are using:
JDO: entity relationships.
JPA: see docs.
Objectify: single-value relationships.
Low-level API: add a Key of one Entity as a property to another Entity: see property types.

Just a tip. When defining your data model think in terms of end-user queries and define your data model accordingly.
For example, take a store renting books. In a traditional application, you would have three main entities:
--> Book
--> Client
--> Rent (to solve the many-to-many)
To display a report of which client is renting which book, you would issue a query joining the Rent, Book and Client tables.
However, in GAE that won't work because the join operation is not supported.
The solution I found (there may be others) is to model the same three tables but embed the book and client definitions in the Rent table.
This way, displaying the list of books being rented and by whom is extremely fast and inexpensive. The only drawback is that if, for example, the title of a book changes, I have to go through all the embedded objects. However, how often does that happen compared to read-only queries?
In summary, think in terms of end-user queries.
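
Here is a small plain-Java sketch of that trade-off, without any App Engine classes. The Rent record and its fields are invented for the example; in the datastore they would be properties on the Rent entity:

```java
import java.util.List;
import java.util.stream.Collectors;

// Sketch of the denormalised Rent entity described above (names are assumptions).
class RentalSketch {

    // Copies of the book title and client name are embedded next to the ids,
    // so the report below needs no join.
    record Rent(long bookId, String bookTitle, long clientId, String clientName) {}

    // Cheap read path: one pass over the rents, no lookups in other tables.
    static List<String> report(List<Rent> rents) {
        return rents.stream()
                .map(r -> r.clientName() + " rents " + r.bookTitle())
                .collect(Collectors.toList());
    }

    // Expensive write path: renaming a book must touch every rent that embeds it.
    static List<Rent> renameBook(List<Rent> rents, long bookId, String newTitle) {
        return rents.stream()
                .map(r -> r.bookId() == bookId
                        ? new Rent(r.bookId(), newTitle, r.clientId(), r.clientName())
                        : r)
                .collect(Collectors.toList());
    }
}
```

The read is a single scan, while the rename rewrites as many rents as reference the book, which is exactly the cost being accepted in exchange for fast reports.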

How to manage two different entities in SOLR?

I have several different entities I want to index in SOLR, for example:
users
products
blogs
All are completely different in schema.
All are searched for in different places in my app.
Is there a way to do it in the same core? Is that the right approach?
Is a core the conceptual equivalent of a table in a relational DB (in which case the answer is obvious)?

Really depends on how you will search this data. The main question is: What will you search for?
If you will search for products (i.e. the search results are products), then design the schema around products. If you search for products by users or blogs, model users/blogs as dynamic/multivalued fields.
If you have an app that searches for products, and another app that searches for blogs, and they're completely unrelated, put them in separate cores.
From the Solr wiki:
The more heterogeneous (different kinds of data) you have in one field or in one index, the less useful it is.
So don't blindly put everything in a single core. Carefully consider what your search scenarios are.

Here is some guidance from the Solr Wiki on Flattening Data into a Single Index. The key takeaway on flattening data is:
This type of approach can be particularly well suited to situations where you need to "blend" results from conceptually distinct sets of documents.
If you want to index your three types and keep them separate and distinct, you can leverage Cores within Solr to keep them fairly isolated, but allow you to manage them under one Solr container.

Implementing a named search criteria

I'm looking for implementation ideas for the following scenario:
I have a search screen with a bunch of dropdowns and free-form text fields/text areas. I would like to provide the option of saving the search criteria for the logged-in user so they can reuse those criteria later if they want to. Some options I could think of are name/value pairs, XML and serialized objects. Are there other options, and what would you recommend as the best one? We use GlassFish, Hibernate, Oracle, J2EE 1.4 and Java 6. Thank you for your help!

I think you can persist the name/value pairs that were used for the search as part of your validation logic. For enterprise applications there are no off-the-shelf personalization features that can be easily adopted; depending on the requirements, you have to go with one of the options above.
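
As a sketch of the name/value option, the criteria could be flattened into a single string that fits in one column next to the user id and a name for the saved search. The class and method names below are made up, and the code assumes a current JDK (on Java 6 the encode/decode calls take the charset name as a String instead):

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Sketch: store search criteria as one URL-encoded "key=value&key=value" string.
class SavedSearchSketch {

    static String encode(Map<String, String> criteria) {
        return criteria.entrySet().stream()
                .map(e -> enc(e.getKey()) + "=" + enc(e.getValue()))
                .collect(Collectors.joining("&"));
    }

    static Map<String, String> decode(String stored) {
        // LinkedHashMap keeps the criteria in the order they were saved.
        Map<String, String> criteria = new LinkedHashMap<>();
        if (stored.isEmpty()) {
            return criteria;
        }
        for (String pair : stored.split("&")) {
            int eq = pair.indexOf('=');
            criteria.put(dec(pair.substring(0, eq)), dec(pair.substring(eq + 1)));
        }
        return criteria;
    }

    // Encoding escapes '&' and '=' inside keys and values, so splitting is unambiguous.
    private static String enc(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    private static String dec(String s) {
        return URLDecoder.decode(s, StandardCharsets.UTF_8);
    }
}
```

Compared with serialized objects, a string like this survives class changes and can be inspected directly in the database; compared with XML, it is less expressive but needs no schema.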

XML vs. object trees

In my current project (an order management system built from scratch), we are handling orders in the form of XML objects which are saved in a relational database.
I would outline the requirements like this:
Selecting various details from anywhere in the order
Updating / enriching data (e.g. from the CRM system)
Keeping a record of the changes (invalidating old data, inserting new values)
Details of orders should be easily selectable by SQL queries (for 2nd level support)
What we did:
The serialization is done with proprietary code, disassembling the order into tables like customer, address, phone_number, order_position etc.
Whenever an order is processed a bit further (e.g. due to an incoming event), it is read completely from the database and assembled back into an XML document.
Selection of data is done by XPath (scattered over code).
Most updates are done directly in the database (the order will then be reloaded for the next step).
The problems we face:
The order structure (XSD) evolves with every release. Therefore the XPaths and the custom persistence often break and produce bugs.
We ended up with a mixture of working with the document and working with the database (because the persistence layer cannot persist the changes made in the document).
Performance is not really an issue (yet), since it is an offline system and orders are often intentionally delayed by days.
I do not expect free consultancy here, but I am a little confused on how the approach could be improved (next time, basically).
What would you think is a good solution for handling these requirements?
Would working with an object graph, something like JXPath and OGNL and an OR mapper be a better approach? Or using XML support of e.g. the Oracle database?

If your schema changes often, I would advise against using any kind of object-mapping. You'd keep changing boilerplate code just for the heck of it.
Instead, use the declarative schema definition to validate data changes and access.
Consider an order as a single datum, expressed as an XML document.
Use a document-oriented store like MongoDB (or a wide-column store like Cassandra) or one of the many XML databases to manipulate the document directly. Don't bother with cutting it into pieces to store it in a relational db.
Making the data accessible via reporting tools in a relational database might be considered secondary. A simple map-reduce job on a MongoDB, for example, could populate the required order details into a relational database whenever required, separating the two use cases quite naturally.
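
Treating the order as one XML document still leaves the problem of XPath expressions scattered over the code. One way to contain that, sketched here with the XPath API that ships with the JDK, is to keep every expression in a single place so a schema change touches one file. The element names and the OrderField enum are assumptions, not the real order schema:

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Sketch: all XPath expressions for the order document live in one enum.
class OrderDocumentSketch {

    // Hypothetical paths; they would have to follow the actual XSD.
    enum OrderField {
        CUSTOMER_NAME("/order/customer/name"),
        FIRST_POSITION_SKU("/order/positions/position[1]/sku");

        final String xpath;

        OrderField(String xpath) {
            this.xpath = xpath;
        }
    }

    // Evaluates one field against the whole order document and returns its text value
    // (an empty string when the path matches nothing).
    static String select(String orderXml, OrderField field) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            return xpath.evaluate(field.xpath, new InputSource(new StringReader(orderXml)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Cannot evaluate " + field + " on order", e);
        }
    }
}
```

Callers then ask for OrderField.CUSTOMER_NAME rather than spelling out a path, which keeps the rest of the code unaware of where the value sits in the document.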

The standard Java EE approach is to represent your data as POJOs and use JPA for the database access and JAXB to convert the objects to/from XML.
JPA
Object-to-Relational standard
Supported by all the application server vendors.
Multiple implementations available: EclipseLink, Hibernate, etc.
Powerful query language, JPQL (which is very similar to SQL)
Handles query optimization for you.
JAXB
Object-to-XML standard
Supported by all the application server vendors.
Multiple implementations available: EclipseLink MOXy, Metro, Apache JaxMe, etc.
Example
http://bdoughan.blogspot.com/2010/08/creating-restful-web-service-part-15.html
