Solr - Index MySQL Database - java

Is it possible to index a complete database without naming the tables explicitly in data-config.xml? New tables are added every day, and I cannot edit data-config.xml every day to add them.

Having table names based on the date smells like there is something wrong in your design. But given the requirement in your question, you can add data to your Solr server without telling it that you have a DB. You just have to make sure each record in Solr has a unique ID with which you can identify the corresponding record in your DB, something like abcd_2011_03_19.uniqueid. You can post the data to Solr from Java with SolrJ, or as plain XML or JSON.
Example:
            --------------
            | User Input |
            --------------
                  |post
                  V
 -----------------------------------
 | My Backend (generate unique id) |
 -----------------------------------
      |post(sql)        |post (e.g. solrj)
      V                 V
    ------           --------
    | DB |           | solr |
    ------           --------
My ascii skillz are mad :D
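The ID scheme described above could be sketched in Java roughly like this. It is only a minimal sketch, not the answerer's code: the class name SolrDocIds, its methods, and the choice of the row's primary key as the "uniqueid" part are assumptions made for illustration. Posting the document to Solr (for example with SolrJ) is left out.

```java
// Hypothetical helper: builds a Solr document ID of the form <table>.<rowId>,
// e.g. abcd_2011_03_19.42, so that a Solr hit can be traced back to its DB row.
final class SolrDocIds {
    private SolrDocIds() {}

    // Joins the (date-based) table name and the row's primary key with a dot.
    static String build(String tableName, long rowId) {
        if (tableName == null || tableName.isEmpty()) {
            throw new IllegalArgumentException("tableName must not be empty");
        }
        return tableName + "." + rowId;
    }

    // Returns the table part of an ID (everything before the last dot).
    static String tableOf(String docId) {
        return docId.substring(0, lastDot(docId));
    }

    // Returns the row part of an ID (everything after the last dot).
    static long rowIdOf(String docId) {
        return Long.parseLong(docId.substring(lastDot(docId) + 1));
    }

    private static int lastDot(String docId) {
        int dot = docId.lastIndexOf('.');
        if (dot < 0) {
            throw new IllegalArgumentException("not a <table>.<rowId> ID: " + docId);
        }
        return dot;
    }
}
```

The backend would call SolrDocIds.build(...) once per record and use the result as the document's unique key in Solr; tableOf and rowIdOf go the other way when a search hit has to be looked up in the DB.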

Related

How to format data in column for WHERE clause just before executing SELECT?

I am using Microsoft SQL Server with already stored data.
In one of my tables I can find data like:
+--------+------------+
| id | value |
+--------+------------+
| 1 | 12-34 |
| 2 | 5678 |
| 3 | 1-23-4 |
+--------+------------+
I realized that the VALUE column was not properly formatted when inserted.
What I am trying to achieve is to get id by given value:
SELECT d.id FROM data d WHERE d.value = '1234';
Is there any way to format the data in the column just before the SELECT compares it?
Should I create a new view and modify the column in that view, or use a complicated regex (with LIKE) to keep only the digits?
P.S. I manage database in Jakarta EE project using Hibernate.
P.S.2. I am not able to modify stored data.
One method is to use replace() before the comparison:
WHERE REPLACE(d.value, '-', '') = '1234'
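To see which of the sample rows this condition would match, the same normalization can be mimicked in plain Java. This is only an illustrative sketch: the class HyphenMatch and its method names are hypothetical, and in the real application the REPLACE would run inside the query rather than in Java.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class mirroring REPLACE(d.value, '-', '') = '1234'.
final class HyphenMatch {
    private HyphenMatch() {}

    // Removes every hyphen, like REPLACE(value, '-', '') in SQL.
    static String normalize(String value) {
        return value.replace("-", "");
    }

    // Returns the IDs of all rows whose normalized value equals `wanted`.
    static List<Integer> matchingIds(Map<Integer, String> rows, String wanted) {
        List<Integer> ids = new ArrayList<>();
        for (Map.Entry<Integer, String> row : rows.entrySet()) {
            if (normalize(row.getValue()).equals(wanted)) {
                ids.add(row.getKey());
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        // The three sample rows from the question.
        Map<Integer, String> rows = new LinkedHashMap<>();
        rows.put(1, "12-34");
        rows.put(2, "5678");
        rows.put(3, "1-23-4");
        System.out.println(matchingIds(rows, "1234")); // prints [1, 3]
    }
}
```

Note that both 12-34 and 1-23-4 normalize to 1234, so the query can return more than one id for a given value.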

Compare Two Table and Update Second Table

I am new to Java and I am working with two tables: one is on the server side and the other is on the client side. What I want is that any update in the server database is reflected in the client database, and that client rows not found on the server get deleted. In the code below I fetch the data of both tables.
Server Table 1
ID | NAME
1 | ABC
2 | ABC
4 | ABC
5 | ABC
Local Table 2
ID | NAME
1 | ABC
2 | ABC
3 | ABC
4 | ABC
Expected Result on Local Table
ID | NAME
1 | ABC
2 | ABC
4 | ABC
5 | ABC
Here is my code
// LOCAL DATA
ResultSet local_data = stmth2.executeQuery("SELECT * FROM emoticon");
local_data.next();
//SERVER DATA
Class.forName("com.mysql.jdbc.Driver");
Connection con9 = DriverManager.getConnection("jdbc:mysql://localhost:3306/mood","root","");
Statement stmt9 = con9.createStatement();
ResultSet server_data = stmt9.executeQuery("SELECT * FROM emo");
Is there any standard way to achieve this?
If both tables are on different MySQL servers, you might use replication.
Otherwise, if the local table is not in another MySQL server, you might find a way to mimic a MySQL server in your Java program, that is, make it usable as a replication slave anyway.
Or, and this should always be possible in your Java program, use polling: have a thread check the remote table and synchronize it with the local one every n seconds. Of course, that is not too gentle on resources, and there will be a delay until changes are propagated to the client.
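The synchronization step of the polling approach could look roughly like this. It is a minimal sketch under stated assumptions: both tables are represented as in-memory maps from ID to NAME (as if already read from the two ResultSets), and the class TableSync and its method sync are hypothetical names. In a real program the two steps would become DELETE and INSERT/UPDATE statements against the local database.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: makes the local table content identical to the server's.
final class TableSync {
    private TableSync() {}

    // `server` and `local` map a row's ID to its NAME.
    static void sync(Map<Integer, String> server, Map<Integer, String> local) {
        // Step 1: delete local rows whose ID no longer exists on the server.
        local.keySet().retainAll(server.keySet());
        // Step 2: insert missing rows and overwrite rows that changed.
        local.putAll(server);
    }

    public static void main(String[] args) {
        // Sample data from the question.
        Map<Integer, String> server = new TreeMap<>();
        for (int id : new int[] {1, 2, 4, 5}) server.put(id, "ABC");
        Map<Integer, String> local = new TreeMap<>();
        for (int id : new int[] {1, 2, 3, 4}) local.put(id, "ABC");

        sync(server, local);
        System.out.println(local.keySet()); // prints [1, 2, 4, 5]
    }
}
```

To poll, sync could be run every n seconds, for example from a java.util.concurrent.ScheduledExecutorService, after re-reading both tables.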

Get TEXT column value in psql

I have created a simple entity with Hibernate with a @Lob String field. Everything works fine in Java; however, I am not able to check the values directly in the DB with psql or pgAdmin.
Here is the definition from DB:
=> \d+ user_feedback
Table "public.user_feedback"
Column | Type | Modifiers | Storage | Stats target | Description
--------+--------+-----------+----------+--------------+-------------
id | bigint | not null | plain | |
body | text | | extended | |
Indexes:
"user_feedback_pkey" PRIMARY KEY, btree (id)
Has OIDs: no
And here is what I get from select:
=> select * from user_feedback;
id | body
----+-------
34 | 16512
35 | 16513
36 | 16514
(3 rows)
The actual "body" content is for all rows "normal" text, definitely not these numbers.
How to retrieve actual value of body column from psql?
This will store the content of LOB 16512 in the file out.txt:
\lo_export 16512 out.txt
Although using @Lob is usually not recommended here (database backup issues ...). See store-strings-of-arbitrary-length-in-postgresql for alternatives.
Hibernate is storing the values as out-of-line objects in the pg_largeobject table, and storing the Object ID for the pg_largeobject entry in your table. See PostgreSQL manual - large objects.
It sounds like you expected inline byte array (bytea) storage instead. If so, you may want to map a byte[] field without a @Lob annotation, rather than a @Lob String. Note that this change will not be backward compatible - you'll have to export your data from the database then drop the table and re-create it with Hibernate's new definition.
The selection of how to map your data is made by Hibernate, not PostgreSQL.
See related:
proper hibernate annotation for byte[]
How to store image into postgres database using hibernate

Storing List data in Hbase?

I'm trying to store a list (collection) of data objects in HBase. For example, a User table where the userId is the row key, with a column family Contacts and a column Contacts:EmailIds, where EmailIds is a list of emails such as
{abcd@example.com,bpqrs@gmail.com....etc}
How do we model this in HBase? How do we do this in Java or Python? I've tried pickling and unpickling the data in Python, but this is one solution I do not want to use due to performance issues.
You can model it in the following manner:
| userid | contacts |
| test | c:email1=test@example.com; c:email2=te.st@example.com |
or
| userid | contacts |
| test | c:test@example.com=1; c:te.st@example.com=2 |
This way you can use versioning, add or remove as many email addresses as you want, use filters, and it is really easy to iterate over these KV pairs in the client code.
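The second layout (the email address itself is the column qualifier) can be illustrated with a plain-Java stand-in. To be clear, this sketch does not use the HBase client API: the class ContactsRow is hypothetical, a sorted map plays the role of one row's "c" column family, and the stored value "1" is just a placeholder.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// In-memory stand-in for the "c" column family of one user row:
// each map key is a column qualifier (an email), each map value a cell value.
final class ContactsRow {
    private final NavigableMap<String, String> family = new TreeMap<>();

    // Adding an email adds one column; adding it again overwrites the same cell.
    void addEmail(String email) {
        family.put(email, "1"); // placeholder value, the qualifier carries the data
    }

    // Removing an email deletes exactly one column, the rest is untouched.
    void removeEmail(String email) {
        family.remove(email);
    }

    // Iterating over the qualifiers yields the list of emails, in sorted order.
    List<String> emails() {
        return new ArrayList<>(family.keySet());
    }
}
```

The point of the design is that the list never has to be serialized as one blob: each element is its own key-value pair, so a single address can be added or removed without reading and rewriting the whole list.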

Looking for a database design approach

I am working on a log analyzer system that reads the Tomcat logs and displays them as charts and tables on a web page.
(I know there are existing log analyzer systems and that I am reinventing the wheel. But this is my job; my boss wants it.)
Our Tomcat logs are saved by day. For example:
2011-01-01.txt
2011-01-02.txt
......
The following is my approach for exporting the logs to the DB and reading them:
1 The DB structure
I have three tables:
1) log_current: stores the logs generated today.
2) log_past: stores the logs generated before today.
The above two tables have the SAME schema.
+-------+-----------+----------+----------+--------+-----+----------+----------+--------+---------------------+---------+----------+-------+
| Id | hostip | username | datasend | method | uri | queryStr | protocol | status | time | browser | platform | refer |
+-------+-----------+----------+----------+--------+-----+----------+----------+--------+---------------------+---------+----------+-------+
| 44359 | 127.0.0.1 | - | 0 | GET | / | | HTTP/1.1 | 404 | 2011-02-17 08:08:25 | Unknown | Unknown | - |
+-------+-----------+----------+----------+--------+-----+----------+----------+--------+---------------------+---------+----------+-------+
3) log_record: stores information about log_past; it records the days whose logs have been exported to the log_past table.
+-----+------------+
| Id | savedDate |
+-----+------------+
| 127 | 2011-02-15 |
| 128 | 2011-02-14 |
..................
+-----+------------+
The table shows that the logs of 2011-02-15 have been exported.
2 Export (to DB)
I have two scheduled jobs.
1) The daily job.
At 00:05:00, check the Tomcat log directory (/tomcat/logs) to find the log files of the latest 30 days (which of course includes yesterday's log).
Check the log_record table to see whether the logs of each day have been exported. For example, if 2011-02-16 is not found in log_record, I read 2011-02-16.txt and export it to log_past.
After exporting yesterday's log, I start the file monitor for today's log (2011-02-17.txt), whether or not the file exists.
2) The file monitor.
Once the monitor is started, it reads the file hour by hour. Each log entry it reads is saved in the log_current table.
3 Tomcat server restart
Sometimes we have to restart Tomcat, so once Tomcat has started, I delete all logs in log_current and then run the daily job.
4 My problem
1) Two tables (log_current and log_past).
I use two tables because, if I saved today's logs to log_past, I could not be sure that every log file (xxxx-xx-xx.txt) has been exported to the DB; the check at 00:05:00 every day only ensures that the logs before today have been exported.
But this makes it difficult to query logs across yesterday and today.
For example, for a query from 2011-02-14 00:00:00 to 2011-02-15 00:00:00, the logs must be in log_past.
But what about from 2011-02-14 00:00:00 to 2011-02-17 08:00:00? (Suppose it is 2011-02-17 09:00:00 now.)
It is complex to query across tables.
Also, I keep thinking that my table design and workflow (the scheduled export/read jobs) are not perfect, so can anyone give a good suggestion?
I just need to export and read the logs and do an almost real-time analysis, where real-time means I have to make the logs of the current day visible in charts, tables, etc.
First of all, IMO you don't need two different tables, log_current and log_past. You can insert all the rows into the same table, say logs, and retrieve them using
select * from logs where id = (select id from log_record where savedDate = 'YOUR_DATE')
This will give you all the logs of the particular day.
Now, once you remove the distinction between the current and past tables in this way, I think the problem you are asking about would be solved. :)
