Automatically syncing ElasticSearch with SQL - java

I ran this request and it worked well.
curl -XPUT 'localhost:9200/_river/my_jdbc_river/_meta' -d '{
"type" : "jdbc",
"jdbc" : {
"url" : "jdbc:mysql://localhost:3306/test",
"user" : "myaccount",
"password" : "myaccount",
"sql" : "select * from orders"
}
}'
Everything seems to be indexed. However, when I changed a row in the orders table, the change was not reflected in the corresponding document in ElasticSearch. Is it possible to automatically sync updated or changed data?

You need to add a schedule parameter to the river definition so that jdbc-river pulls the data periodically.
Here is a reference to this.
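As a rough sketch, the river definition from the question could be rebuilt with a schedule entry. The jdbc-river documentation describes schedule as a cron expression; the value used here (fire at second 0 of every minute) and the RiverDefinition helper class are assumptions for illustration only. The resulting body would be sent with the same PUT request as in the question.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper that builds the JSON body of the jdbc river definition. */
final class RiverDefinition {

    /** Wraps a value in double quotes, escaping backslashes and quotes. */
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Renders a flat map of string settings as a JSON object. */
    static String toJson(Map<String, String> settings) {
        StringJoiner fields = new StringJoiner(", ", "{ ", " }");
        for (Map.Entry<String, String> e : settings.entrySet()) {
            fields.add(quote(e.getKey()) + " : " + quote(e.getValue()));
        }
        return fields.toString();
    }

    /** Same settings as in the question, plus the schedule parameter. */
    static String build(String cronSchedule) {
        Map<String, String> jdbc = new LinkedHashMap<>();
        jdbc.put("url", "jdbc:mysql://localhost:3306/test");
        jdbc.put("user", "myaccount");
        jdbc.put("password", "myaccount");
        jdbc.put("sql", "select * from orders");
        jdbc.put("schedule", cronSchedule); // assumed value, see lead-in
        return "{ \"type\" : \"jdbc\", \"jdbc\" : " + toJson(jdbc) + " }";
    }

    public static void main(String[] args) {
        // Assumed cron expression: second 0 of every minute, every hour.
        System.out.println(build("0 0-59 0-23 ? * *"));
    }
}
```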

I'm using ElasticSearch on Windows 7 and I have trouble loading river-jdbc to sync ElasticSearch with MySQL. I tried loading the river-jdbc river, but the result is:
Execution Error
[java.lang.NoClassDefFoundError:org/elasticsearch/rest/XcontentThrowableRestResponse], NoClassDefFoundError[org/elasticsearch/rest/XcontentThrowableRestResponse], ClassNotFoundException[org/elasticsearch/rest/XcontentThrowableRestResponse]
P.S. OS: Windows 7, elasticsearch 1.2.1, mysql-connector-java-5.1.25-bin.jar


How to delete all the content and not mapping in elasticsearch index using Java API

I have an index called event which has around 250 records. I want to delete these 250 records without deleting the mapping.
DeleteIndex delete = new DeleteIndex.Builder("events").build();
client.execute(delete);
The code above deletes the whole event index. How can I delete only the content?
According to the ElasticSearch documentation for the Delete By Query API, you can delete all documents within an index using a query as follows:
POST twitter/tweet/_delete_by_query
{
"query": {
"match_all": {}
}
}
With the ElasticSearch Java API (example from the "Delete By Query API" documentation):
BulkByScrollResponse response =
DeleteByQueryAction.INSTANCE.newRequestBuilder(client)
.filter(QueryBuilders.matchQuery("gender", "male"))
.source("persons")
.get();
long deleted = response.getDeleted();
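To delete every document, as asked in the question, replace the filter with QueryBuilders.matchAllQuery() and the source with your own index name.
Please note that this answer applies to ElasticSearch 6.1 and may differ for other versions of ElasticSearch.
I hope this will help you.
Since I am using Spring Data, I was able to resolve this issue with the repository's deleteAll function.
eventRepo.deleteAll();
That did the trick for me.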

Load a big json file in Mysql or Oracle database

At work, we supply files to other services. Their sizes range from 5 MB to 500 MB.
We want to use JSON instead of XML, but I'm wondering how our customers could
load those files into their database, Oracle or MySQL, in an easy way.
I can't find any API, function, or tool on the web, for MySQL or Oracle, that does this.
I know that it's easy to load a small JSON file item by item: decode each
object or array and put it in the right place in the database.
But is there another way to do this, like SQL*Loader in Oracle?
And if so, aren't our files too big to be produced as JSON, in Java for example?
I guess it might be difficult to do this load job automatically, especially because of arrays like this:
{"employees":[
{"firstName":"John", "lastName":"Doe", "salaryHistory":[1000,2000,3000]},
{"firstName":"Anna", "lastName":"Smith", "salaryHistory":[500,800]},
{"firstName":"Peter", "lastName":"Jones", "salaryHistory":[400]}
]}
where salaryHistory is bound to cause problems because the arrays have different sizes, and the data do not necessarily belong
in the same table.
Any ideas or help would be welcome!
Edit
I'm looking for a solution that puts each value in the right column of a table; I don't need to store a JSON structure in a single column of a simple table.
Like this:
table employees: columns are id, firstName, lastName, and
table salaryHistory: columns are id, order, salary
and each value must go in the right column: "John" in firstName, "Doe" in lastName, then "1000" in a new row of table salaryHistory, "2000" in another new row of salaryHistory, and so on.
Starting with MySQL 5.7 there is a new data type: JSON.
Take a look here for more details.
Example for Oracle 12c:
create table transactions (
id number not null primary key,
trans_msg clob,
constraint
check_json check (trans_msg is json)
);
Regular insert:
insert into transactions (id, trans_msg)
values
(
1,
'{
"TransId" : 3,
"TransDate" : "01-JAN-2015",
"TransTime" : "10:05:00",
"TransType" : "Deposit",
"AccountNumber" : 125,
"AccountName" : "Smith, Jane",
"TransAmount" : 300.00,
"Location" : "website",
"CashierId" : null,
"ATMDetails" : null,
"WebDetails" : {
"URL" : "www.proligence.com/acme/dep.htm"
},
"Source" : "Transfer",
"TransferDetails" :
{
"FromBankRouting" : "012345678",
"FromAccountNo" : "1234567890",
"FromAccountType" : "Checking"
}
}'
)
/
SQL*Loader control file and data file:
load data into table transactions
fields terminated by ','
(
id sequence(max,1),
fname filler char(80),
trans_msg lobfile(fname) terminated by EOF
)
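Neither answer above covers the two-table layout requested in the question's edit. A minimal sketch of that flattening step in Java follows, assuming the JSON has already been parsed into objects by a JSON library of your choice (for files of several hundred MB, a streaming parser would presumably be needed). The class names and the generated ids are hypothetical, and the first column of each salaryHistory row is assumed to be the id of the owning employee. The resulting rows would then be written with ordinary batched inserts.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: turns parsed employees into rows for two tables. */
final class EmployeeFlattener {

    /** One parsed element of the "employees" array. */
    static final class Employee {
        final String firstName;
        final String lastName;
        final List<Integer> salaryHistory;

        Employee(String firstName, String lastName, List<Integer> salaryHistory) {
            this.firstName = firstName;
            this.lastName = lastName;
            this.salaryHistory = salaryHistory;
        }
    }

    /** Rows for table employees (id, firstName, lastName) and salaryHistory (id, order, salary). */
    static final class Rows {
        final List<Object[]> employees = new ArrayList<>();
        final List<Object[]> salaryHistory = new ArrayList<>();
    }

    static Rows flatten(List<Employee> parsed) {
        Rows rows = new Rows();
        int id = 0; // assumed: ids are generated by a simple counter
        for (Employee e : parsed) {
            id++;
            rows.employees.add(new Object[] { id, e.firstName, e.lastName });
            int order = 0;
            // Arrays of any length become that many child rows.
            for (int salary : e.salaryHistory) {
                order++;
                rows.salaryHistory.add(new Object[] { id, order, salary });
            }
        }
        return rows;
    }

    /** The sample data from the question, as if already parsed. */
    static List<Employee> sample() {
        return Arrays.asList(
                new Employee("John", "Doe", Arrays.asList(1000, 2000, 3000)),
                new Employee("Anna", "Smith", Arrays.asList(500, 800)),
                new Employee("Peter", "Jones", Arrays.asList(400)));
    }

    public static void main(String[] args) {
        Rows rows = flatten(sample());
        for (Object[] row : rows.employees) System.out.println(Arrays.toString(row));
        for (Object[] row : rows.salaryHistory) System.out.println(Arrays.toString(row));
    }
}
```

Because each array element becomes its own row, the differing lengths of salaryHistory stop being a problem.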

MongoDB + Morphia - full text search using AND instead of OR

I've set up full text search in MongoDB and it's working quite well (Mongo 2.6.5).
However, it does an OR instead of an AND.
1) Is it possible to make the query an AND query, while still getting all the benefits of full text search (stemming etc.)?
2) If so, is it possible to add this option via the Morphia wrapper library?
EDIT
I see that the full text search includes a 'score' for each document returned. Is it possible to return only docs with a certain score or above? Is there some score that would represent a 'fuzzy' AND query, that is, one where usually, but not absolutely always, all tokens are in the document? If so, this would solve the problem as well.
Naturally, if it is possible to do this via Morphia, that would be super helpful. But I can use the native Java driver as well.
Any pointers in the right direction are much appreciated.
EDIT
Code looks like this, I'm using Morphia 1.0.1:
Datastore ds = Dao.instance().getDatabase();
Query<Product> q = ds.createQuery(Product.class).search("grey vests");
List<Product> prods = q.asList();
Printing the query gives:
{ "$text" : { "$search" : "grey vests"}}
Note: I am able to take the intersection of multiple result sets to create an AND query. However, this is very slow, since a term like "grey" returns a massive result set that takes a long time to be fed back.
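The intersection workaround mentioned in the note might look like the sketch below. It assumes each single-term search has already been run and reduced to a set of document ids; the class name and the ids are hypothetical. It only illustrates the AND semantics and does not remove the cost of fetching the large per-term result sets.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch: AND semantics by intersecting per-term result sets. */
final class TextSearchIntersection {

    /** Returns the ids that occur in every per-term result set. */
    static Set<String> intersect(List<Set<String>> perTermIds) {
        if (perTermIds.isEmpty()) {
            return new LinkedHashSet<>();
        }
        // Start from the smallest set so fewer elements have to be checked.
        Set<String> smallest = perTermIds.get(0);
        for (Set<String> ids : perTermIds) {
            if (ids.size() < smallest.size()) {
                smallest = ids;
            }
        }
        Set<String> result = new LinkedHashSet<>(smallest);
        for (Set<String> ids : perTermIds) {
            result.retainAll(ids);
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed ids returned by searching "grey" and "vests" separately.
        Set<String> grey = new LinkedHashSet<>(Arrays.asList("p1", "p2", "p3"));
        Set<String> vests = new LinkedHashSet<>(Arrays.asList("p2", "p3", "p4"));
        System.out.println(intersect(Arrays.asList(grey, vests))); // prints [p2, p3]
    }
}
```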
EDIT
I've tried chaining the search() calls, passing a single token to each call, but I get a runtime error. The code becomes:
q.search("grey").search("vests");
The query I get is (which seems like it's doing the right thing) ...
{ "$and" : [ { "$text" : { "$search" : "grey"}} , { "$text" : { "$search" : "vests"}}]}
The error is:
com.mongodb.MongoQueryException: Query failed with error code 17287 and error message 'Can't canonicalize query: BadValue Too many text expressions' on server ...
at com.mongodb.connection.ProtocolHelper.getQueryFailureException(ProtocolHelper.java:93)

Difference between _id & $oid ; $date & IsoDate in mongo database

We are using MongoDB to store certain records in our production database.
We see our records having "_id" : { "$oid" : "50585fbcb046b2709a534502"} in the production database, while we see the same record as "_id" : ObjectId("50585fbcb046b2709a534502") in the QA database.
For dates we see "ld" : { "$date" : "2011-12-03T17:00:00Z"} in the prod database, but "ld" : ISODate("2011-12-03T17:00:00Z") in the QA database.
We have tested our queries successfully in the QA environment, but we are worried they might fail in production.
1) Will my Java queries work seamlessly on both prod and QA? (I am using the Morphia APIs to query.)
2) Are the values stored internally in the same way?
To answer the two questions:
1) Yes, they will.
2) Yes, they are the same. The difference is merely how the tool you are looking at (console or app) displays them. The console (later versions anyway, about 1.4+) will normally display ObjectId and ISODate, whereas reading the value directly from the server language (Java in your case) will tend to show the full object's properties ($oid and $date in this case).
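To illustrate the second point, the sketch below renders one value in both notations using only the JDK. The class and method names are hypothetical and no MongoDB driver is involved; it merely shows that the hex string and the timestamp are identical and that only the wrapper text differs.

```java
import java.time.Instant;

/** Hypothetical sketch: one stored value, two display notations. */
final class TwoNotations {

    /** Notation shown by the mongo console. */
    static String shellObjectId(String hex) {
        return "ObjectId(\"" + hex + "\")";
    }

    /** Notation shown when the value is printed as JSON from Java. */
    static String jsonObjectId(String hex) {
        return "{ \"$oid\" : \"" + hex + "\"}";
    }

    static String shellDate(Instant t) {
        return "ISODate(\"" + t + "\")";
    }

    static String jsonDate(Instant t) {
        return "{ \"$date\" : \"" + t + "\"}";
    }

    public static void main(String[] args) {
        String id = "50585fbcb046b2709a534502";
        Instant ld = Instant.parse("2011-12-03T17:00:00Z");
        // Same id and same instant, printed in each notation.
        System.out.println(shellObjectId(id) + "  vs  " + jsonObjectId(id));
        System.out.println(shellDate(ld) + "  vs  " + jsonDate(ld));
    }
}
```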

Unexpected MongoDB "OR" query behaviour

I'm testing out spring-data and its MongoDB support.
I have a question about query creation when using or-queries. Consider the following:
Query query = new Query().or(new Query(where("receiverId").is(userId)), new Query(where("requesterId").is(userId)));
query.and(where("status").is(status));
This will result in the following mongodb query:
{ "$or" : [ { "receiverId" : { "$oid" : "4d78696025d0d46b42d9c579"}} , { "requesterId" : { "$oid" : "4d78696025d0d46b42d9c579"}}] , "status" : "REQUESTED"}
This returns zero results while one is expected. Running this query in the mongo shell results in the following error:
error: { "$err" : "invalid operator: $oid", "code" : 10068 }
Modifying the query and running it in the mongo shell works fine:
{ "$or" : [ { "receiverId" : ObjectId("4d78696025d0d46b42d9c579")} , { "requesterId" : ObjectId("4d78696025d0d46b42d9c579")}] , "status" : "REQUESTED"}
Notice the use of ObjectId("...") instead of $oid.
Am I going about something the wrong way? Maybe setting up the query wrong?
Are you inspecting that query variable at runtime or is that what you are seeing in MongoDB's logs?
In the C# driver, if you inspect the query variable, you see $oid as well, but that is not the actual query that is sent to the server. At some point, the driver changes it to a valid MongoDB query.
If you are running on Linux, you may want to start up mongosniff, which will show you queries, updates, and inserts in real time as they happen. If you are on Windows, you should start mongod.exe with the -vvvv flag, which makes it log every query, update, insert, or command to the log file.
Then you can actually see the exact query that is being submitted.
