How to properly outline an Apache Solr document? - java

What is the difference between a delta import and a full import in Apache Solr?
What naming conventions should be followed when writing deltaImportQuery and deltaQuery (ID, TXT_ID, etc.)? Are there any references or tutorials that explain in detail the differences and relationship between deltaImportQuery and deltaQuery: what each one is, what it does, and how to write it?
How do I configure multiple entities in one document? Suppose the database has three tables, T1, T2 and T3: how should schema.xml be configured, given that only one <uniqueKey>somename</uniqueKey> is considered per schema.xml file?
How should BLOB input from MySQL be parsed? Using CONVERT(column_name USING utf8) AS alias_name solves this, but what is the right convention? Other methods are also available, such as using TikaEntityProcessor or writing a custom BLOB transformer.
As with ORM, are there any concepts explaining how to denormalize and outline an Apache Solr document? Is there a showcase project covering all use cases and features?
How do I configure entities like this in data-config.xml?
<dataConfig>
  <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost:3306/solrdemo" user="root" password="123" batchSize='1' />
  <document name="user">
    <entity name="user1" query="SELECT * FROM user1">
      <field column='id' name='id1' />
    </entity>
    <entity name="user2" query="SELECT * FROM user2">
      <field column='id' name='id2' />
    </entity>
    <entity name="user3" query="SELECT * FROM user3">
      <field column='id' name='id3' />
    </entity>
  </document>
</dataConfig>
With the above configuration, which id should be configured as the <uniqueKey></uniqueKey> in schema.xml?
The result of the above configuration is:
Indexing completed. Added/Updated: 2,866 documents. Deleted 0 documents. (Duration: 03s)
Indexing completes successfully, but 0 documents are added or updated. How can I resolve this issue?
Overall, are there any references for proper conventions and configurations when working with Apache Solr?
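As one possible starting point for the delta-import part of the question, here is a minimal sketch of a single DataImportHandler entity that supports both full and delta imports. The table name `item`, its `name` column, and the `last_modified` timestamp column are assumptions made for illustration and do not come from the configuration above. In this sketch, `query` is what a full-import runs, `deltaQuery` returns only the primary keys of rows changed since the last import, and `deltaImportQuery` then fetches each changed row by that key.

```xml
<dataConfig>
  <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost:3306/solrdemo" user="root" password="123" />
  <document>
    <!-- "item", "name" and "last_modified" are hypothetical names -->
    <entity name="item" pk="id"
            query="SELECT id, name FROM item"
            deltaQuery="SELECT id FROM item
                        WHERE last_modified &gt; '${dataimporter.last_index_time}'"
            deltaImportQuery="SELECT id, name FROM item
                              WHERE id = '${dataimporter.delta.id}'">
      <!-- The Solr field "id" would be the one named in <uniqueKey> -->
      <field column="id" name="id" />
      <field column="name" name="name" />
    </entity>
  </document>
</dataConfig>
```

With a configuration along these lines, the handler's `full-import` command rebuilds from `query`, while `delta-import` runs `deltaQuery` first and then `deltaImportQuery` once per returned key.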

Related

Solr data-config multiple statements in query

Solr example:
<entity name="item" query="select * from item">
If I want to do a query like this:
<entity name="item" query="BEGIN some_package.some_function(x,y); END;
select * from item">
How can I achieve this in Solr? In case it isn't obvious what I want to do: I want to run some_function(x,y) right before the standard select * query executes against my table. I have achieved similar functionality in JDBC.

Less-than operator not working in Solr query

I'm using Solr 4.7 with CoreContainer to create the core, EmbeddedSolrServer to connect, and ModifiableSolrParams to fetch the data.
I have configured solrconfig.xml with an "import" requestHandler to import the data, and a separate data config file as follows:
<dataConfig>
<dataSource driver="com.mysql.jdbc.Driver"
url="jdbc:mysql://localhost:3306/koupon"
user="root"
password="root" />
<document>
<entity name="koupon"
query="SELECT k.kouponid as kid, k.name, k.image,k.description,k.discount as discount,
k.startdate,k.enddate,k.actualamount,
k.discountamount,k.discountamount, c.name as category
FROM koupon.koupon k
INNER JOIN koupon.category c on c.categoryid = k.categoryid
where
k.startDate <= NOW() and endDate >= NOW()
AND c.isActive=true AND c.isDeleted=false AND k.isActive =true AND lower(k.status)=lower('approved')
order by k.kouponid">
<field column="kid" name="kid"/>
<field column="name" name="name"/>
<field column="image" name="image"/>
<field column="description" name="description"/>
<field column="startdate" name="startdate"/>
<field column="enddate" name="enddate"/>
<field column="actualamount" name="actualamount"/>
<field column="discountamount" name="discountamount"/>
<field column="discount" name="discount"/>
<field column="category" name="category"/>
</entity>
</document>
</dataConfig>
This configuration uses "k.startDate <= NOW() and endDate >= NOW()" to fetch the records that fall within the date range, but Solr does not accept the query.
I have one workaround, a start "TO" end range, but it is not an exact solution.
I have been stuck on this issue for a while. Does anyone know how to solve it?
You can try using this MySQL condition for indexing:
WHERE (NOW() BETWEEN k.startDate and k.endDate)
instead of:
k.startDate <= NOW() and endDate >= NOW()
to prevent the error:
The value of attribute "query" associated with an element type "entity" must not contain the '<' character
For your Solr query
startdate:[* TO NOW]
to work, make sure your startdate field conforms to the Solr dateField type. For more info on this, check this link.
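Another option, not part of the answer above, is to keep the original comparison and escape the operators as XML entities, since the error comes from the XML parser rather than from MySQL. The sketch below is a shortened version of the entity from the question with only two of its columns kept; `&lt;` and `&gt;` are decoded back to `<` and `>` before the SQL reaches the database.

```xml
<entity name="koupon"
        query="SELECT k.kouponid AS kid, k.name
               FROM koupon.koupon k
               WHERE k.startdate &lt;= NOW() AND k.enddate &gt;= NOW()
               ORDER BY k.kouponid">
  <!-- Remaining columns and field mappings omitted for brevity -->
  <field column="kid" name="kid"/>
  <field column="name" name="name"/>
</entity>
```

Only `<` (and `&`) strictly has to be escaped inside an attribute value; escaping `>` as well is optional but keeps the query easier to scan.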

Solr query only returns Id's

I want to import data from a table and index it using Solr.
I am using the solr-tomcat admin panel.
But whenever I query, it returns only the IDs and value.
I have also tried adding the fields to fl, but that does not help either.
Here is my data-config.xml file:
<dataConfig>
  <dataSource type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://127.0.0.1:3306/{DB_NAME}"
              user="{DB_USER}"
              password="{DB_PASSS}"
  />
  <document>
    <entity name="id" query="select s3_location,file_name from video">
      <field column="s3_location" name="s3_location"/>
      <field column="file_name" name="file_name"/>
    </entity>
  </document>
</dataConfig>
Is there any way to get the s3_location and file_name fields as well?
You need to specify the actual field names in the fl parameter or use * to indicate all fields. Also, please note that the fields must have been defined with stored=true in your schema.xml file for them to be returned/visible during a query.
fl=id,s3_location,file_name
fl=*
Are you sure you are importing the data at all? If you start with an empty index, do you get anything?
The reason I ask is because you are not mapping the id field explicitly. Now, I believe there is implicit mapping of the fields by Jdbc data source based on names, but relying on it is risky when you are just starting.
Otherwise, like Paige said, make sure you defined those fields in your schema and that they are actually stored.
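To make the stored=true advice concrete, a schema.xml fragment along the following lines is one way the fields could be declared. The `string` type and the presence of an `id` field are assumptions here (they match common example schemas but may not match yours), so treat this as a sketch rather than the poster's actual schema.

```xml
<!-- Inside <fields> in schema.xml; the "string" type is assumed to be defined in <types> -->
<field name="id"          type="string" indexed="true" stored="true" required="true"/>
<field name="s3_location" type="string" indexed="true" stored="true"/>
<field name="file_name"   type="string" indexed="true" stored="true"/>
```

After changing stored on an existing field, the data has to be re-imported before the stored values show up in query results.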

Solr search is returning partial string matches

Using Solr 3.6.1, I have this field in my schema.xml:
<field name="names" type="text_general" indexed="true" stored="false" multiValued="true"/>
<dynamicField name="names_*" type="text_general" indexed="true" stored="true"/>
The documentation in the schema.xml states that "text_general" should:
tokenize with StandardTokenizer
remove stop words from the case-insensitive "stopwords.txt" (which is currently empty)
lowercase the string
At query time only, it also applies synonyms (the synonyms file is also empty at this time)
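For reference, a field type matching that description would look roughly like the sketch below. It is reconstructed from the description above and from the general shape of the example schemas shipped with Solr, not copied from the poster's file, so attribute details may differ in a real installation.

```xml
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <!-- Synonyms are applied at query time only -->
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Note that a chain like this contains no stemming filter, which is worth checking in your own schema.xml when reasoning about unexpected partial matches.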
I have two documents indexed in Solr with this data for the field:
<!-- doc 1 -->
<str name="names_data">Name ABC Dev Loc</str>
<!-- doc 2 -->
<str name="names_data">Name ABC Dev Location</str>
When I execute the following query:
id:(doc1 OR doc2) AND names:Dev+Location)
Both documents are returned. I would have expected that only doc2 would have been returned based on my understanding of how Solr's StandardTokenizer works.
Why does "Dev+Location" match "Dev Loc" and "Dev Location"?
The type text_general is probably configured to use a stemmer, which is treating Loc as a variant of Location.
You could configure the type to not use a stemmer, or try searching for the whole string using names:"Dev Location"
This might be why.
In the query names:Dev+Location, only Dev is searched in the names field. Because the Location term has no field name qualifier, it is searched against whatever <defaultSearchField> is set to in schema.xml.
So you could either quote the value as a phrase, names:"Dev Location", or qualify each term: names:Dev AND names:Location

How to do inheritance in Hibernate?

I have some 10 tables with the following schema
ID
Year
Code
Section
Period
Date
Status
The tables are named data_1, data_2 and so on. Now I want to write Hibernate mappings for these tables. Since all of them have the same schema and differ only in name, I wrote a POJO with data as the superclass and the other 10 classes inheriting from it.
What do I do now with the hbm files? Do I have to write one hbm file for each table? I tried union-subclass, but somehow I couldn't get it right. I am getting a lot of unexplained errors in Hibernate.
How can I write the Hibernate mapping in this type of scenario? I am new to Hibernate, and please note that the choice of database design is not in my hands. I have 30 similar hierarchies.
First you must understand that there is no such thing as inheritance in a relational database system. But there are strategies to map the inheritance structure to the database.
Check out the hibernate documentation at http://docs.jboss.org/hibernate/core/3.3/reference/en/html/inheritance.html
As far as I understand, your strategy is "Table per concrete class".
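A table-per-concrete-class mapping with union-subclass could be sketched in a single hbm file as follows. The class names `Data`, `Data1` and `Data2`, the package `com.example`, the property names, and the `assigned` id generator are all assumptions for illustration; only the table and column names come from the question. The DOCTYPE declaration is left out to keep the fragment short.

```xml
<hibernate-mapping package="com.example">
  <!-- abstract="true": the superclass itself has no table -->
  <class name="Data" abstract="true">
    <id name="id" column="ID">
      <!-- Assumption: ids are set by the application -->
      <generator class="assigned"/>
    </id>
    <property name="year"    column="Year"/>
    <property name="code"    column="Code"/>
    <property name="section" column="Section"/>
    <property name="period"  column="Period"/>
    <property name="date"    column="Date"/>
    <property name="status"  column="Status"/>

    <!-- One union-subclass per physical table; repeat for data_3 and so on -->
    <union-subclass name="Data1" table="data_1"/>
    <union-subclass name="Data2" table="data_2"/>
  </class>
</hibernate-mapping>
```

The shared columns are declared once on the superclass and inherited by every union-subclass, so the 10 tables do not need 10 separate hbm files. Column names such as Date and Year may be reserved words in some databases and might need quoting.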
First of all you need to choose what type of strategy you want for the inheritance. There are a few options.
Have a look at this link here, which describes a little about inheritance in JPA. Hibernate supports JPA, so the mappings should be the same.
Note, however, that the following mapping is for EclipseLink:
<entity name="Project" class="Project" access="FIELD">
  <table name="PROJECT"/>
  <inheritance strategy="JOINED"/>
  <discriminator-value>P</discriminator-value>
  <discriminator-column name="TYPE"/>
  <attributes>
    <id name="id"><column name="ID"/> </id>
  </attributes>
</entity>
<entity name="LargeProject" class="LargeProject" access="FIELD">
  <table name="L_PROJECT"/>
  <discriminator-value>L</discriminator-value>
</entity>
