What we're trying to do is what Meteor does with MongoDB via Livequery, which is this:
Livequery can connect to the database, pretend to be a replication
slave, and consume the replication log. Most databases support some
form of replication so this is a widely applicable approach. This is
the strategy that Livequery prefers with MongoDB, since MongoDB does
not have triggers.
Source of that quote here
So is there a way, using com.mongodb.* in Java, to create such a replication slave so that it receives a notification for each update that happens on the primary MongoDB server?
Also, I don't see any replication log in the local database. Is there a way to turn it on?
If it's not possible in Java, is it possible to create such a solution in another language (C++ or Node.js maybe)?
You need to start your database with the --replSet rsName option and then run rs.initiate(). After that you will see the oplog.rs collection in the local database.
What you are describing is commonly referred to as "tailing the oplog", which is based on using a Tailable Cursor on a capped collection (the MongoDB oplog in this case). The mechanics are relatively simple, and there are numerous oplog-tailing examples written in Java; here are a few, followed by a minimal sketch:
Event Streaming with MongoDB
TailableCursorExample
Wordnik mongo-admin-utils
IncrementalBackupUtil
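For reference, a minimal sketch of such a tailer using the com.mongodb.client driver (the connection string is a placeholder; it assumes a replica-set member whose oplog lives in local.oplog.rs):

```java
import com.mongodb.CursorType;
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.MongoCursor;
import org.bson.BsonTimestamp;
import org.bson.Document;

import static com.mongodb.client.model.Filters.gt;

public class OplogTailer {
    public static void main(String[] args) {
        // Placeholder connection string: point it at a replica-set member.
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
            MongoCollection<Document> oplog =
                    client.getDatabase("local").getCollection("oplog.rs");

            // Start from "now" so only operations applied after startup are replayed.
            BsonTimestamp startAt =
                    new BsonTimestamp((int) (System.currentTimeMillis() / 1000), 0);

            // A tailable-await cursor blocks waiting for new entries in the capped
            // oplog collection instead of returning when it reaches the end.
            try (MongoCursor<Document> cursor = oplog.find(gt("ts", startAt))
                    .cursorType(CursorType.TailableAwait)
                    .iterator()) {
                while (cursor.hasNext()) {
                    Document op = cursor.next();
                    // "op" is the operation type (i/u/d), "ns" the affected namespace.
                    System.out.println(op.getString("op") + " " + op.getString("ns") + " " + op.toJson());
                }
            }
        }
    }
}
```

In a real tailer you would also persist the last "ts" value you processed so the cursor can resume from that point after a restart.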
Related
I have an application written in Java.
We are using MySQL DB.
Is it possible to integrate that MySQL DB with Apache Ignite as an in-memory cache and use that configuration without any updates to the Java application (of course some DB connection details would have to change)?
So my application would do the same stuff, and the only difference would be connecting to Apache Ignite instead of MySQL?
Is this kind of configuration possible?
I suppose you are looking for the write-through feature. I'm not sure what your use case is, but you should be aware of some limitations, such as your data having to be preloaded into Ignite before running SELECT queries. From a very abstract perspective, you need to define POJOs and implement a custom CacheStore interface, though GridGain Control Center can do the latter for you automatically; check this demo as a reference.
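To make that concrete, here is a rough sketch (not a drop-in configuration) of a read/write-through cache backed by MySQL via the built-in CacheJdbcPojoStoreFactory; the PERSON table, schema, cache name and connection details are invented for the example:

```java
import java.sql.Types;
import com.mysql.cj.jdbc.MysqlDataSource;
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.store.jdbc.CacheJdbcPojoStoreFactory;
import org.apache.ignite.cache.store.jdbc.JdbcType;
import org.apache.ignite.cache.store.jdbc.JdbcTypeField;
import org.apache.ignite.cache.store.jdbc.dialect.MySQLDialect;
import org.apache.ignite.configuration.CacheConfiguration;

public class IgniteMySqlWriteThrough {
    public static void main(String[] args) {
        // Store factory: tells Ignite how to read/write the PERSON table in MySQL.
        CacheJdbcPojoStoreFactory<Long, Person> storeFactory = new CacheJdbcPojoStoreFactory<>();
        storeFactory.setDialect(new MySQLDialect());
        storeFactory.setDataSourceFactory(() -> {
            MysqlDataSource ds = new MysqlDataSource();
            ds.setURL("jdbc:mysql://localhost:3306/app"); // placeholder connection details
            ds.setUser("app");
            ds.setPassword("secret");
            return ds;
        });

        // Mapping between the (hypothetical) PERSON table and the Person POJO.
        JdbcType personType = new JdbcType();
        personType.setCacheName("personCache");
        personType.setDatabaseSchema("app");
        personType.setDatabaseTable("PERSON");
        personType.setKeyType(Long.class);
        personType.setValueType(Person.class);
        personType.setKeyFields(new JdbcTypeField(Types.BIGINT, "id", Long.class, "id"));
        personType.setValueFields(new JdbcTypeField(Types.VARCHAR, "name", String.class, "name"));
        storeFactory.setTypes(personType);

        CacheConfiguration<Long, Person> ccfg = new CacheConfiguration<>("personCache");
        ccfg.setCacheStoreFactory(storeFactory);
        ccfg.setReadThrough(true);   // cache misses are loaded from MySQL
        ccfg.setWriteThrough(true);  // cache writes are propagated to MySQL

        try (Ignite ignite = Ignition.start()) {
            IgniteCache<Long, Person> cache = ignite.getOrCreateCache(ccfg);
            cache.loadCache(null); // preload existing rows so SQL/scan queries see them
            System.out.println("Preloaded " + cache.size() + " persons");
        }
    }

    /** POJO mapped onto the PERSON table. */
    public static class Person {
        private Long id;
        private String name;
        public Long getId() { return id; }
        public void setId(Long id) { this.id = id; }
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
    }
}
```

Note the preload step: without loadCache (or some other warm-up), SQL queries against Ignite will only see data that has already been brought into the cache.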
I need to make a web application in Java that offers a dashboard based on the content of a DB table.
It needs to be "auto-refreshing" and always synchronized with the actual data in the DB.
For the browser <-> servlet interaction I can use WebSockets or at least long polling to achieve the "freshness", but I'm stuck on the Java <-> DB communication.
I could do some polling, but I would really like some "notification" from the DB itself.
Is there some way / some library to achieve this?
In my case the DB is Oracle, but I'm also interested in solutions for Postgres.
To monitor DB changes, the Debezium connector is a good fit. By using it you will get every database change event in Kafka topics; a minimal consumer sketch follows the tutorial links below.
For Oracle, look at this tutorial
For PostgreSQL, look at this tutorial
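Once Debezium is publishing changes to Kafka, the dashboard backend only needs an ordinary Kafka consumer. A minimal sketch, assuming a local broker and a hypothetical topic dbserver1.public.orders (Debezium names topics <server>.<schema>.<table>):

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class DashboardChangeListener {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "dashboard");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "latest");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // One topic per table; "dbserver1.public.orders" is a made-up example name.
            consumer.subscribe(Collections.singletonList("dbserver1.public.orders"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    // record.value() is the Debezium change event (JSON); push it to the
                    // dashboard clients over the WebSocket / long-polling channel here.
                    System.out.println("Change event: " + record.value());
                }
            }
        }
    }
}
```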
I would like to use Apache Ignite as a failover read-only storage so my application will be able to access the most sensitive data if the main storage (Oracle) is down.
So I need to
Start nodes
Create schema (execute DDL queries)
Load data from Oracle to Ignite
It seems like this is not the same as database caching, so I don't need to use a cache. However, this page says that I need to implement a store to load a large amount of data from 3rd-party sources.
So, my questions are:
How to effectively transfer data from Oracle to Ignite? Data Streamers?
Who should initiate this transfer? The first started node? How do I do that? (Tutorials explain how to achieve this via clients; should I follow that advice?)
Actually, I think using a cache store without read/write-through would be a suitable option here. You can configure a CacheJdbcPojoStore, for example, and call IgniteCache#loadCache(...) on your cache once the cluster is up. More on this topic: https://apacheignite.readme.io/docs/3rd-party-store
If you don't want to use a cache store, then IgniteDataStreamer could be a good choice. This is the fastest way to upload a large amount of data to the cluster. Data loading is usually performed from a client node once all server nodes are up and running; a sketch follows.
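A rough sketch of the streamer approach, run from a client node once the servers are started (the JDBC URL, query and cache name are placeholders, and the personCache cache is assumed to already exist on the cluster):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteDataStreamer;
import org.apache.ignite.Ignition;

public class OracleToIgniteLoader {
    public static void main(String[] args) throws Exception {
        // Run the loader as a client node once all server nodes are up.
        Ignition.setClientMode(true);

        try (Ignite ignite = Ignition.start();
             Connection conn = DriverManager.getConnection(
                     "jdbc:oracle:thin:@//dbhost:1521/ORCL", "app", "secret"); // placeholders
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT id, name FROM person");
             // "personCache" is assumed to already exist on the cluster.
             IgniteDataStreamer<Long, String> streamer = ignite.dataStreamer("personCache")) {

            streamer.allowOverwrite(false); // initial load: keys are assumed to be new

            while (rs.next()) {
                streamer.addData(rs.getLong("id"), rs.getString("name"));
            }
            // Closing the streamer (try-with-resources) flushes any remaining buffered entries.
        }
    }
}
```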
I am using Postgres 9.3 on my production servers. I would like to achieve high availability of the Postgres DB using a master-master configuration where each master runs in active-active mode with bidirectional replication.
I have 2 Java Spring REST web services pointed to 2 separate database engines, each having its own storage. Each web service points to its own database plus the other one in an HA configuration.
Now if either of the databases fails, I want the active database server to keep working, and when the failed one recovers, the data should be synced back to it.
I tried doing bidirectional replication using Bucardo 5.3.1 but the recovered database does not get updated with the new data and Bucardo syncs need to be kicked again. (see bug: https://github.com/bucardo/bucardo/issues/88)
Is there any way I can achieve this with some other bi-directional replication tool?
Or is there any other way I can have 2 Postgres engines pointing to shared storage running in an active-active configuration?
2ndQuadrant released Postgres-BDR, a patched version of PostgreSQL that can do multi-master replication using logical WAL decoding. You will find more information here: https://www.2ndquadrant.com/fr/resources/bdr/
I finally decided to move to EnterpriseDB Postgres (a paid licence), which provides GUI-based replication tools that are easy to use and configure.
What are the options to index large data from an Oracle DB into an Elasticsearch cluster? The requirement is to index 300 million records one time into multiple indices, plus incremental updates of approximately 1 million changes every day.
I have tried the JDBC plugin for the Elasticsearch river/feeder; both seem to run inside, or require, a locally running Elasticsearch instance. Please let me know if there is a better option for running the Elasticsearch indexer as a standalone job (probably Java based). Any suggestions will be very helpful.
Thanks.
We use ES as a reporting DB, and when new records are written to SQL we take the following steps to get them into ES:
Write the primary key into a queue (we use RabbitMQ)
A consumer picks up the primary key from Rabbit (when it has time), queries the relational DB to get the info it needs, and then writes the data into ES
This process works great because it handles both new data and old data. For old data, just write a quick script that pushes the 300M primary keys into Rabbit and you're done!
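A minimal sketch of the consumer side of that pipeline, assuming a queue named es-index that carries primary keys, a hypothetical orders table, and the Elasticsearch high-level REST client (all host names, credentials and field names are placeholders):

```java
import java.nio.charset.StandardCharsets;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.util.Map;
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.ConnectionFactory;
import com.rabbitmq.client.DeliverCallback;
import org.apache.http.HttpHost;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;

public class EsIndexWorker {
    public static void main(String[] args) throws Exception {
        ConnectionFactory rabbit = new ConnectionFactory();
        rabbit.setHost("localhost"); // placeholder broker host
        Channel channel = rabbit.newConnection().createChannel();
        channel.queueDeclare("es-index", true, false, false, null);

        RestHighLevelClient es =
                new RestHighLevelClient(RestClient.builder(new HttpHost("localhost", 9200, "http")));
        Connection db = DriverManager.getConnection("jdbc:oracle:thin:@//dbhost:1521/ORCL", "app", "secret");

        DeliverCallback onMessage = (tag, delivery) -> {
            String id = new String(delivery.getBody(), StandardCharsets.UTF_8);
            try (PreparedStatement ps = db.prepareStatement("SELECT name, total FROM orders WHERE id = ?")) {
                ps.setString(1, id);
                try (ResultSet rs = ps.executeQuery()) {
                    if (rs.next()) {
                        // Re-read the row from the relational DB and index it into ES under the same id.
                        es.index(new IndexRequest("orders").id(id)
                                        .source(Map.of("name", rs.getString("name"), "total", rs.getBigDecimal("total"))),
                                RequestOptions.DEFAULT);
                    }
                }
            } catch (Exception e) {
                e.printStackTrace(); // a real worker would retry or dead-letter the message
            }
            channel.basicAck(delivery.getEnvelope().getDeliveryTag(), false);
        };

        // basicConsume is asynchronous; the client's consumer threads keep the JVM alive.
        channel.basicConsume("es-index", false, onMessage, consumerTag -> { });
    }
}
```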
There are many integration options - I've listed a few to give you some ideas; the solution is really going to depend on your specific resources and requirements though.
Oracle GoldenGate will read the Oracle DB transaction logs and feed changes to ES in real time.
An ETL tool, for example Oracle Data Integrator, could run on a schedule, pull data from your DB, transform it and send it to ES.
Create triggers in the Oracle DB so that data updates can be written to ES using a stored procedure. Or use the trigger to write flags to a "changes" table that some external process (e.g. a Java application) monitors and uses to extract data from the Oracle DB (see the sketch after this list).
Get the application that writes to the Oracle DB to also feed ES. Ideally your application and Oracle DB should be loosely coupled - do you have an integration platform that can feed the messages to both ES and Oracle?
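For the trigger-plus-flags variant in the third option, the external Java process can be a simple scheduled poller; a sketch assuming a hypothetical CHANGES(id, table_name, row_id, processed) table and placeholder connection details:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class ChangesTablePoller {
    public static void main(String[] args) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        // Poll the CHANGES table every 10 seconds for rows the triggers have flagged.
        scheduler.scheduleWithFixedDelay(ChangesTablePoller::pollOnce, 0, 10, TimeUnit.SECONDS);
    }

    static void pollOnce() {
        String url = "jdbc:oracle:thin:@//dbhost:1521/ORCL"; // placeholder connection details
        try (Connection conn = DriverManager.getConnection(url, "app", "secret");
             PreparedStatement select = conn.prepareStatement(
                     "SELECT id, table_name, row_id FROM changes WHERE processed = 0");
             PreparedStatement done = conn.prepareStatement(
                     "UPDATE changes SET processed = 1 WHERE id = ?")) {
            try (ResultSet rs = select.executeQuery()) {
                while (rs.next()) {
                    // Fetch the changed row identified by table_name/row_id, send it to
                    // Elasticsearch (e.g. via the bulk API), then mark the flag as processed.
                    done.setLong(1, rs.getLong("id"));
                    done.executeUpdate();
                }
            }
        } catch (Exception e) {
            e.printStackTrace(); // keep the scheduler alive on transient DB errors
        }
    }
}
```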