Asynchronous multiple query from different datasources or databases - java

I'm having trouble finding an appropriate solution for the following problem.
I have several databases with the same structure but different data. When my web app executes a query, it must split the query across the databases, execute it on each one asynchronously, then aggregate the results from all databases and return them as a single result. Additionally, I want to be able to pass a list of the databases the query should run against, as well as a maximum execution time for the query. The result must also contain meta-information for each database, such as whether the execution time was exceeded.
It would be great if it were possible to use another kind of data source, such as a remote web service with a specific API, rather than only relational databases.
I use Spring/Grails and need a Java solution, but I will be glad of any advice.
UPD: I am looking for a ready-made solution, maybe a framework or something like that.

This is basic OO. You need to abstract what you are trying to achieve (loading data) from the mechanism you use to achieve it (a database query or a web-service call).
Such a design would usually involve an interface that defines the contract of what can be done and then multiple implementing classes that make it happen according to their implementation.
For example, you'd end up with an interface that looked something like:
public interface DataLoader
{
public Collection<Data> loadData() throws DataLoaderException;
}
You would then have implementations like JdbcDataLoader, WebServiceDataLoader, etc. In your case you would need another type of implementation that, given one or more instances of DataLoader, runs them all simultaneously and aggregates the results. This implementation would look something like:
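import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
public class AggregatingDataLoader implements DataLoader
{
    private final Collection<DataLoader> dataLoaders;
    private final ExecutorService executorService;
    public AggregatingDataLoader(ExecutorService executorService, Collection<DataLoader> dataLoaders)
    {
        this.executorService = executorService;
        this.dataLoaders = dataLoaders;
    }
    public Collection<Data> loadData() throws DataLoaderException
    {
        // Wrap each loader in a Callable so the executor can run them in parallel.
        Collection<DataLoaderCallable> dataLoaderCallables = new ArrayList<DataLoaderCallable>();
        for (DataLoader dataLoader : dataLoaders)
        {
            dataLoaderCallables.add(new DataLoaderCallable(dataLoader));
        }
        Collection<Data> data = new ArrayList<Data>();
        try
        {
            // invokeAll blocks until every task has completed.
            List<Future<Collection<Data>>> futures = executorService.invokeAll(dataLoaderCallables);
            for (Future<Collection<Data>> future : futures)
            {
                data.addAll(future.get());
            }
        }
        catch (InterruptedException e)
        {
            Thread.currentThread().interrupt();
            // Assumes DataLoaderException has a constructor that takes a cause.
            throw new DataLoaderException(e);
        }
        catch (ExecutionException e)
        {
            throw new DataLoaderException(e.getCause());
        }
        return data;
    }
    private class DataLoaderCallable implements Callable<Collection<Data>>
    {
        private final DataLoader dataLoader;
        public DataLoaderCallable(DataLoader dataLoader)
        {
            this.dataLoader = dataLoader;
        }
        public Collection<Data> call() throws DataLoaderException
        {
            return dataLoader.loadData();
        }
    }
}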
You'll need to add some timeout and exception handling logic to this, but you get the gist.
The other important thing is that your calling code should only ever use the DataLoader interface, so that you can swap different implementations in and out or use mocks during testing.
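The timeout handling mentioned above could be sketched as follows. This is only an illustration under assumptions: TimedAggregation, SourceResult and loadAll are hypothetical names, rows are plain strings instead of a Data type, and the sketch relies on the timed variant of ExecutorService.invokeAll, which cancels every task that has not finished when the deadline passes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class TimedAggregation {

    /** Hypothetical per-source outcome: the rows plus the metadata the question asks for. */
    static final class SourceResult {
        final String source;
        final List<String> rows;
        final boolean timedOut;
        final String error; // null when the source loaded successfully

        SourceResult(String source, List<String> rows, boolean timedOut, String error) {
            this.source = source;
            this.rows = rows;
            this.timedOut = timedOut;
            this.error = error;
        }
    }

    /** Runs all loaders in parallel and waits at most timeoutMillis for the whole batch. */
    static List<SourceResult> loadAll(ExecutorService executor, List<String> names,
            List<Callable<List<String>>> loaders, long timeoutMillis) {
        List<SourceResult> results = new ArrayList<>();
        try {
            // Futures come back in the same order as the loaders were passed in.
            List<Future<List<String>>> futures =
                    executor.invokeAll(loaders, timeoutMillis, TimeUnit.MILLISECONDS);
            for (int i = 0; i < futures.size(); i++) {
                Future<List<String>> future = futures.get(i);
                String name = names.get(i);
                if (future.isCancelled()) {
                    // The task was still running at the deadline and was cancelled.
                    results.add(new SourceResult(name, new ArrayList<>(), true, null));
                    continue;
                }
                try {
                    results.add(new SourceResult(name, future.get(), false, null));
                } catch (ExecutionException e) {
                    // One failing source should not discard the results of the others.
                    results.add(new SourceResult(name, new ArrayList<>(), false,
                            String.valueOf(e.getCause())));
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for the loaders", e);
        }
        return results;
    }
}
```

Note that the timed invokeAll applies one deadline to the whole batch. If each source needs its own limit, calling Future.get(timeout, unit) on individually submitted tasks is an alternative.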

Related

How do I implement it like Mockito?

I am currently working on a project that manages a reservation system.
There is a new requirement: keep track of all booking status changes.
I would like this to leave the existing logic untouched and live in an independent module.
At first I thought of AOP, but there are some problems.
The requirement is to record which data was changed, how, and by which action.
I thought I could extract the changed data by applying AOP to the save method of the repository.
However, this does not work because many different actions update the data.
For example, a reservation is updated through the repository's save method, but that method is called from various actions such as check-in, check-out, and so on.
Therefore, the difference in the data can be obtained, but it is not possible to tell which action updated it.
@Service
public class BookingService {
@Autowired
private BookingRepository bookingRepository;
public Booking create(Booking booking) {
return bookingRepository.save(booking);
}
public void update(Booking booking) {
Booking oldBooking = bookingRepository.findById(booking.getId()).orElseThrow(() -> new RuntimeException("Entity not found"));
oldBooking.update(booking);
bookingRepository.save(oldBooking);
}
public void checkIn(long id) {
Booking booking = bookingRepository.findById(id).orElseThrow(() -> new RuntimeException("Entity not found"));
booking.setStatus(Booking.Status.CheckIn);
bookingRepository.save(booking);
}
}
And since I use AOP, I don't want to force the parameters or return values of the existing logic into a particular shape.
While thinking about how to solve this, I wondered about using the approach Mockito uses.
With Mockito, we can tell when a method is called from within another method.
Wouldn't it be possible to write something like this, for example?
@Aspect
public class BookingHistory {
@Autowired
private BookingRepository bookingRepository;
@Around("execution(* *Service.update(..))")
public void update(ProceedingJoinPoint proceedingJoinPoint) {
long id = getBookingId(proceedingJoinPoint);
Booking origin = getBooking(id);
final DiffData diffData;
when(bookingRepository::save).thenReturn(result -> diffData = diff(origin, result));
saveHistory("UPDATE", "Booking", diffData);
}
}
But I have no idea how to implement something like Mockito's "when", "thenReturn", etc.
Could I get some hints on how to implement this the way Mockito does?
And if this is not the way, is there another good approach?
Mockito is a testing framework and should only be used for unit testing. If you want to keep track of which method changes the data using Spring's AOP, you can use custom annotations. With a custom annotation, you can pass a value that identifies the action and do whatever you want with it, e.g. log it, publish it to an MQ for analytics, etc. Try the following article on creating custom annotations and this on how to get the method's caller information.
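To make the custom-annotation idea concrete, here is a minimal sketch. So that it runs without any dependencies, it uses a JDK dynamic proxy in place of Spring AOP; in a Spring application the same annotation would typically be picked up by an @Around("@annotation(...)") advice. The names TrackAction, BookingOperations and tracked are hypothetical and do not come from the question.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.List;

class ActionTrackingSketch {

    /** Hypothetical marker annotation naming the business action a method performs. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface TrackAction {
        String value();
    }

    /** Hypothetical service contract; the annotations carry the action names. */
    interface BookingOperations {
        @TrackAction("CHECK_IN")
        void checkIn(long id);

        @TrackAction("CHECK_OUT")
        void checkOut(long id);

        String describe(long id); // not annotated, so never recorded
    }

    /** Wraps a target so that every annotated call appends "ACTION:firstArgument" to history. */
    static BookingOperations tracked(BookingOperations target, List<String> history) {
        InvocationHandler handler = (proxy, method, args) -> {
            Object result;
            try {
                result = method.invoke(target, args);
            } catch (InvocationTargetException e) {
                throw e.getCause(); // rethrow the original exception unchanged
            }
            // The action name comes from the annotation, not from the method's parameters.
            TrackAction action = method.getAnnotation(TrackAction.class);
            if (action != null) {
                history.add(action.value() + ":" + args[0]);
            }
            return result;
        };
        return (BookingOperations) Proxy.newProxyInstance(
                BookingOperations.class.getClassLoader(),
                new Class<?>[] {BookingOperations.class},
                handler);
    }
}
```

The point of the design is that the existing methods keep their signatures: the action is identified by metadata on the method, so the history code never has to guess it from a generic save call.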

Should I always have a separate "DataService" that invokes another service?

I am building a new RESTful service that interacts with other microservices.
The routine task is to fetch some data from another RESTful service, filter it, match it against existing data and return a response.
My question is: is it a good design pattern to always separate the steps "get data" and "filter it" into two different classes, naming one EntityDataService and the other simply EntityService?
For instance, I can make a call to a service that returns a list of countries that has to be filtered against some conditions, such as EU membership or date of creation.
In this case, which option is better:
- a separate CountryDataService class that has only one method, getAllCountries, and an EUCountryService that filters the results
- one CountryService class with public methods getEUCountries and getCountriesCreatedInDateRange and a private getAllCountries
Which one is better?
I'm trying to follow the KISS principle but also want to make my solution maintainable and extensible.
In systems with lots of data, having a getAllSomething method is not a good idea.
If you don't have lots of data, it's OK to have it, but still be careful.
If you have 50 records, it's not that bad, but if you have millions of records it would be a problem.
Having a Service or Repository with getBySomeCriteria methods is the better way to go.
If you have lots of different queries to perform, you may end up with lots of methods: getByCriteria1, getByCriteria2, ..., getByCriteria50. Also, each time you need a different query you will have to add a new method to the Service.
In this case you can use the Specification Pattern. Here's an example:
public enum Continent { None, Europe, Africa, NorthAmerica, SouthAmerica, Asia }
public class CountrySpecification {
public DateRange CreatedInRange { get; set; }
public Continent Location { get; set; }
}
public class CountryService {
public IEnumerable<Country> Find(CountrySpecification spec) {
var url = "https://api.myapp.com/countries";
url = AddQueryParametersFromSpec(url, spec);
var results = SendGetRequest(url);
return CreateCountryFromApiResults(results);
}
}
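Since the question is about a Java service, the same idea could be sketched in Java roughly as follows. The query parameter names (continent, createdFrom, createdTo) and the class names are assumptions for illustration only; the remote API in the question is not specified. A null field means that criterion is not applied.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.ArrayList;
import java.util.List;

class CountryQuerySketch {

    enum Continent { EUROPE, AFRICA, NORTH_AMERICA, SOUTH_AMERICA, ASIA }

    /** Criteria object: unset (null) fields mean "do not filter on this". */
    static final class CountrySpecification {
        Continent location;
        String createdFrom; // ISO date such as "1990-01-01"
        String createdTo;
    }

    /** Appends one query parameter per criterion that is set; parameter names are hypothetical. */
    static String buildUrl(String baseUrl, CountrySpecification spec) {
        List<String> params = new ArrayList<>();
        if (spec.location != null) {
            params.add("continent=" + encode(spec.location.name()));
        }
        if (spec.createdFrom != null) {
            params.add("createdFrom=" + encode(spec.createdFrom));
        }
        if (spec.createdTo != null) {
            params.add("createdTo=" + encode(spec.createdTo));
        }
        return params.isEmpty() ? baseUrl : baseUrl + "?" + String.join("&", params);
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }
}
```

A new kind of query then means a new field on the specification rather than a new method on the service.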

Vertx add handler to Spring data API

In Vert.x, if we want to execute a JDBC operation without blocking the event loop, we use the following code:
client.getConnection(res -> {
if (res.succeeded()) {
SQLConnection connection = res.result();
connection.query("SELECT * FROM some_table", res2 -> {
if (res2.succeeded()) {
ResultSet rs = res2.result();
// Do something with results
}
});
} else {
// Failed to get connection - deal with it
}
});
Here we add a handler that is executed when the operation completes.
Now I want to use the Spring Data API. Can it be used in the same way as above?
For now I use it as follows:
@Override
public void start() throws Exception {
    final EventBus eventBus = this.vertx.eventBus();
    eventBus.<String>consumer(Addresses.BEGIN_MATCH.asString(), message -> {
        // Run the blocking repository call on a worker thread, not on the event loop.
        this.vertx.executeBlocking(future -> {
            final String body = message.body();
            final JsonObject resJO = this.json.asJson(body);
            final int matchId = Integer.parseInt(resJO.getString("matchid"));
            this.matchService.beginMatch(matchId); // this service calls a CrudRepository method
            log.info("Match [{}] is started", matchId);
            future.complete();
        }, result -> {});
    });
}
Here I used executeBlocking, but it uses a thread from the worker pool. Is there any alternative way to wrap blocking code?
To answer the question: the need for executeBlocking goes away if:
- You run multiple instances of your verticle in separate processes (using systemd, Docker, or anything else that lets you run independent Java processes safely with recovery) listening on the same event bus address in cluster mode (with Hazelcast, for example).
- You run multiple instances of your verticle as worker verticles, as suggested by tsegismont in a comment on this answer.
Also, this is unrelated to the question and really a personal opinion, but I'll give it anyway: I think it's a bad idea to use Spring dependencies inside a Vert.x application. Spring is relevant for Servlet-based applications using at least Spring Core; that is, it makes sense in an ecosystem built entirely on Spring. Otherwise you'll pull a lot of large, unused dependencies into your jar files.
For almost every Spring module there are smaller, lighter, independent libraries with the same purpose. For example, for IoC you have Guice, HK2, Weld...
Personally, if I needed to use a SQL database, I'd take inspiration from Spring's JdbcTemplate and RowMapper model without using any Spring dependencies. It's pretty simple to reproduce with an interface like this:
import java.io.Serializable;
import java.sql.ResultSet;
import java.sql.SQLException;
public interface RowMapper<T extends Serializable> {
T map(ResultSet rs) throws SQLException;
}
And another interface, DatabaseProcessor, with a method like this:
<T extends Serializable> List<T> execute(String query, List<QueryParam> params, RowMapper<T> rowMapper) throws SQLException;
And a QueryParam class holding the value, the order, and the name of each query parameter (so that values are bound as parameters, which avoids SQL injection vulnerabilities).
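One possible shape for these pieces, using only java.sql, is sketched below. The field layout of QueryParam and the choice to pass the Connection into a static execute method are assumptions; the answer only names the types. Values are bound through PreparedStatement.setObject, so they never become part of the SQL text.

```java
import java.io.Serializable;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

class JdbcProcessorSketch {

    interface RowMapper<T extends Serializable> {
        T map(ResultSet rs) throws SQLException;
    }

    /** Hypothetical parameter holder: 1-based bind position, descriptive name, value to bind. */
    static final class QueryParam {
        final int order;
        final String name;
        final Object value;

        QueryParam(int order, String name, Object value) {
            this.order = order;
            this.name = name;
            this.value = value;
        }
    }

    /** Runs a parameterized query and maps every row with the given mapper. */
    static <T extends Serializable> List<T> execute(Connection connection, String query,
            List<QueryParam> params, RowMapper<T> rowMapper) throws SQLException {
        try (PreparedStatement statement = connection.prepareStatement(query)) {
            for (QueryParam param : params) {
                // Binding by position keeps user input out of the SQL string.
                statement.setObject(param.order, param.value);
            }
            try (ResultSet rs = statement.executeQuery()) {
                List<T> rows = new ArrayList<>();
                while (rs.next()) {
                    rows.add(rowMapper.map(rs));
                }
                return rows;
            }
        }
    }
}
```

Because execute performs blocking JDBC calls, in a Vert.x application it would still need to run on a worker thread.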

Subscribe to an Observable without triggering it and then passing it on

This could get a little bit complicated and I'm not that experienced with Observables and the RX pattern so bear with me:
Suppose you've got some arbitrary SDK method which returns an Observable. You consume the method from a class which is - among other things - responsible for retrieving data and, while doing so, does some caching, so let's call it DataProvider. Then you've got another class which wants to access the data provided by DataProvider. Let's call it Consumer for now. So there we've got our setup.
Side note for all the pattern friends out there: I'm aware that this is not MVP, it's just an example for an analogous, but much more complex problem I'm facing in my application.
That being said, in Kotlin-like pseudo code the described situation would look like this:
class Consumer(val provider: DataProvider) {
fun logic() {
provider.getData().subscribe(...)
}
}
class DataProvider(val sdk: SDK) {
fun getData(): Observable {
val observable = sdk.getData()
observable.subscribe(/*cache data as it passes through*/)
return observable
}
}
class SDK {
fun getData(): Observable {
return fetchDataFromNetwork()
}
}
The problem is that by calling subscribe() in the DataProvider I'm already triggering the Observable, which I don't want. I want the DataProvider to just silently listen; in this example the triggering should be done by the Consumer.
So what's the best RX-compatible solution for this problem? The one outlined in the pseudo code above definitely isn't, for various reasons, one of which is the premature triggering of the network request before the Consumer has subscribed to the Observable. I've experimented with publish().autoConnect(2) before calling subscribe() in the DataProvider, but that doesn't seem to be the canonical way to do this kind of thing. It just feels hacky.
Edit: Through SO's excellent "related" feature I've just stumbled across another question pointing in a different direction, but with a solution that could also be applicable here, namely flatMap(). I knew about it before, but never actually had to use it. It seems like a viable way to me. What's your opinion on that?
If the caching step is not supposed to modify events in the chain, the doOnNext() operator can be used:
class DataProvider(val sdk: SDK) {
fun getData(): Observable<*> = sdk.getData().doOnNext(/*cache data as it passes through*/)
}
Yes, flatMap could be a solution. Moreover, you could split your stream into a chain of small Observables:
public class DataProvider {
private Api api;
private Parser parser;
private Cache cache;
public Observable<List<User>> getUsers() {
return api.getUsersFromNetwork()
.flatMap(parser::parseUsers)
.map(cache::cacheUsers);
}
}
public class Api {
public Observable<Response> getUsersFromNetwork() {
//makes https request or whatever
}
}
public class Parser {
public Observable<List<User>> parseUsers(Response response) {
//parse users
}
}
public class Cache {
public List<User> cacheUsers(List<User> users) {
//cache users
}
}
This is easy to test and maintain, and implementations are easy to replace (by using interfaces). You can also easily insert an additional step into the stream (for instance, to log, convert, or change the data you receive from the server).
Another quite convenient operator is map. The function you pass to it returns plain Data instead of an Observable<Data>, which can make your code even simpler.

Is it OK to use static "database helper" class?

I have some Android projects, and most of them use SQLite databases. I'd like to know whether it is good programming practice (or a bad habit) to use a static class such as DatabaseHelper that holds static methods for all database manipulation. For example:
public static int getId(Context context, String name) {
dbInit(context);
Cursor result = db.rawQuery("SELECT some_id FROM table WHERE some_name = '" + name + "'", null);
result.moveToFirst();
int id = result.getInt(result.getColumnIndex("some_id"));
result.close();
return id;
}
where dbInit(context) (which is used in all my static database methods) is
private static void dbInit(Context context) {
if (db == null) {
db = context.openOrCreateDatabase(DATABASE_NAME, Context.MODE_PRIVATE, null);
}
}
Then when I need something I can easily call those methods, for example:
int id = DatabaseHelper.getId(this, "Abc");
EDIT: Do I have to call dbClose after every query, or should I open the connection per activity and close it per activity? Would I have to change the code above to something like this?
...
dbClose();
return id;
}
private static void dbClose() {
if (db != null) {
db.close();
}
}
I would suggest you get into the habit of getting a database connection every time you need one, and releasing it back when you are done with it. The usual name for such a facility is a "database connection pool".
This moves the connection logic out of your actual code and into the pool, and allows you to do many things later when you need them. One simple example: the pool could log how long each connection object was in use, giving you information about database usage.
Your initial pool can be very simple if you only need a single connection.
I would definitely have your database related code in a separate class, but would really recommend against using a static class or Singleton. It might look good at first because of the convenience, but unfortunately it tightly couples your classes, hides their dependencies, and also makes unit testing harder.
The drawbacks section in Wikipedia gives you a small overview of why you might want to explore other techniques. You can also head over here or over there where they give concrete examples of a class that uses a database access singleton, and how using dependency injection instead can solve some of the issues I mentioned.
As a first step, I would recommend using a normal class that you instantiate in your constructor, for example:
public class MyActivity extends Activity {
private DBAccess dbAccess;
public MyActivity() {
dbAccess = new DBAccess(this);
}
}
As a second step, you might want to investigate frameworks like RoboGuice to break the hard dependency. Your code would look something like:
public class MyActivity extends Activity {
@Inject private DBAccess dbAccess;
public MyActivity() {
}
}
Let us know if you want more details!
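To illustrate why injecting the dependency helps with testing, here is a framework-free sketch in plain Java. DbAccess, NameResolver and InMemoryDbAccess are hypothetical names, and Android classes are left out so that the example runs on a plain JVM.

```java
import java.util.HashMap;
import java.util.Map;

class InjectionSketch {

    /** Hypothetical abstraction over the database code. */
    interface DbAccess {
        int getId(String name);
    }

    /** The dependency is passed in, so it is visible in the constructor signature. */
    static final class NameResolver {
        private final DbAccess dbAccess;

        NameResolver(DbAccess dbAccess) {
            this.dbAccess = dbAccess;
        }

        String describe(String name) {
            return name + " has id " + dbAccess.getId(name);
        }
    }

    /** In-memory stand-in that a test can use instead of a SQLite-backed implementation. */
    static final class InMemoryDbAccess implements DbAccess {
        private final Map<String, Integer> ids = new HashMap<>();

        void put(String name, int id) {
            ids.put(name, id);
        }

        @Override
        public int getId(String name) {
            Integer id = ids.get(name);
            if (id == null) {
                throw new IllegalArgumentException("Unknown name: " + name);
            }
            return id;
        }
    }
}
```

With a static helper, NameResolver could only be tested against a real database; here a test simply hands it the in-memory implementation.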
If you're going to use a singleton, the very minimum requirement is that you make it stateless/thread-safe. If you use your getId method as it stands, concurrent invocations could potentially cause all manner of strange bugs...
dbInit(context);
may be called by Thread A, which then stops processing before reaching the query statement. Subsequently Thread B executes getId and also calls dbInit, passing in a different context altogether. Thread A would then resume and attempt to execute the query on B's context.
Maybe this isn't a problem in your application, but I'd recommend sticking a synchronized modifier on that getId method!
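The effect of the synchronized modifier can be sketched in plain Java as follows. Resource stands in for the SQLiteDatabase from the question, and all names here are hypothetical. The counter exists only to show that initialization runs once even when several threads call get() at the same moment.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

class LazyInitSketch {

    /** Stand-in for the expensive shared resource (the database handle in the question). */
    static final class Resource { }

    private static Resource resource;
    static final AtomicInteger initCount = new AtomicInteger();

    /** synchronized makes the null check and the assignment one atomic step. */
    static synchronized Resource get() {
        if (resource == null) {
            initCount.incrementAndGet();
            resource = new Resource();
        }
        return resource;
    }

    /** Calls get() from several threads at once and returns how many times init has run. */
    static int initializationsAfterConcurrentAccess(int threads) {
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    start.await(); // hold all threads until they can race together
                    get();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[i].start();
        }
        start.countDown();
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return initCount.get();
    }
}
```

Without synchronized, two threads could both see null and both create a resource, which is the kind of race described above.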
