I am developing a program that, based on a configuration file, allows different types of databases (e.g., YAML, MySQL, SQLite, and others to be added in the future) to be used to store data.
Currently it is all running on the main thread but I would like to start delegating to secondary threads so as not to block the execution of the program.
For supported databases that use a connection, I use HikariCP so that the process is not slowed down too much by opening a new connection every time.
The main problem is the multitude of available databases. For example, for some databases it might be sufficient to store the query string in a queue and have an executor check it every X seconds; if it is not empty it executes all the queries. For others, however, it is not, because perhaps they require other operations (e.g., YAML files that use a key-value system with a map).
What I can't work out is something "universal" that doesn't cause problems with query ordering (I can't just create a thread per operation and run it, because a fetch thread might then execute before an earlier insertion thread and return stale data) and that can return data on completion (in the case of get functions).
I currently have an abstract Database class that contains all the get() and set(...) methods for the various data to be stored. Some methods need to be executed synchronously (must be blocking) while others can and should be executed asynchronously.
Example:
public abstract class Database {
public abstract boolean hasPlayedBefore(@Nonnull final UUID uuid);
}
public final class YAMLDatabase extends Database {
@Override
public boolean hasPlayedBefore(@Nonnull final UUID uuid) { return getFile(uuid).exists(); }
}
public final class MySQLDatabase extends Database {
@Override
public boolean hasPlayedBefore(@Nonnull final UUID uuid) {
try (
final Connection conn = getConnection(); // Get a connection from the pool
final PreparedStatement statement = conn.prepareStatement("SELECT 1 FROM " + TABLE_NAME + " WHERE UUID = ?")
) {
statement.setString(1, uuid.toString()); // bind the UUID instead of concatenating it into the SQL
try (final ResultSet result = statement.executeQuery()) {
return result.next(); // true iff a row exists for this UUID
}
} catch (final SQLException e) {
// Notifies the error
Util.sendMessage("Database error: " + e.getMessage() + ".");
writeLog(e, uuid, "attempt to check whether the user is new or has played before");
}
return true;
}
}
// Simple example class that uses the database
public final class Usage {
private final Database db;
public Usage(@Nonnull final Database db) { this.db = db; }
public User getUser(@Nonnull final UUID uuid) {
if(db.hasPlayedBefore(uuid))
return db.getUser(uuid); // Sync query
else {
// Set default starting balance
final User user = new User(uuid, startingBalance);
db.setBalance(uuid, startingBalance); // Example of sync query that I would like to be async
return user;
}
}
}
Any advice? I am already somewhat familiar with Future, CompletableFuture and Callback.
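Since operations must stay ordered but gets still need to return a value, one pattern worth sketching is a single-threaded executor wrapped in CompletableFuture: all writes and reads are funneled through one worker thread, so a read submitted after a write always observes it. This is a minimal sketch, not your actual Database API; the class and method names are illustrative:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

// Serializes all database work on one worker thread, preserving submission order.
public final class SerialDbExecutor {
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    // Fire-and-forget writes: ordered, non-blocking for the caller.
    public CompletableFuture<Void> submit(Runnable write) {
        return CompletableFuture.runAsync(write, worker);
    }

    // Reads: ordered relative to earlier writes, result delivered via the future.
    public <T> CompletableFuture<T> query(Supplier<T> read) {
        return CompletableFuture.supplyAsync(read, worker);
    }

    public void shutdown() {
        worker.shutdown();
    }
}
```

A call site that truly must block (your "sync" methods) can simply join() the returned future; everything else attaches a callback with thenAccept(...). The per-backend quirks (YAML map updates vs. SQL queries) stay inside the submitted lambdas, so the executor itself remains backend-agnostic.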
I am experimenting with the integration of MongoDB on Android using Java as the language.
I followed the guide provided by MongoDB to configure the Atlas account and the Realm to communicate with.
After that I tried implementing CRUD methods, for insertions I did not encounter any problems, while for queries I did.
In particular to get all the objects of a certain class in a certain collection.
I used this method, as suggested by the wiki (https://www.mongodb.com/docs/realm/sdk/java/quick-start-local/)
RealmResults<Contact> contacts = backgroundThreadRealm.where(Contact.class).findAll();
inserted in a class made to handle background tasks:
public class BackgroundTasks implements Runnable {
@Override
public void run() {
String realmName = "MyApp";
RealmConfiguration config = new RealmConfiguration.Builder().name(realmName).build();
Realm backgroundThreadRealm = Realm.getInstance(config);
// all Tasks in the realm
RealmResults<Contact> contacts = backgroundThreadRealm.where(Contact.class).findAll();
Log.v("Contacts", String.valueOf(contacts.size()));
backgroundThreadRealm.close();
}
}
While in the MainActivity I inserted this (looks like a battlefield, maybe I inserted stuff I won't even need, but I was experimenting):
// initialize mongodb realm
Realm.init(this);
// open realm
String realmName = "MyApp";
RealmConfiguration config = new RealmConfiguration.Builder().name(realmName).build();
backgroundThreadRealm = Realm.getInstance(config);
app = new App(new AppConfiguration.Builder(appId).build());
User user = app.currentUser();
mongoClient = user.getMongoClient("mongodb-atlas");
mongoDatabase = mongoClient.getDatabase("MyApp");
MongoCollection<Document> mongoCollection = mongoDatabase.getCollection("Contacts");
FutureTask<String> task = new FutureTask<>(new BackgroundTasks(), "test");
ExecutorService executorService = Executors.newFixedThreadPool(2);
executorService.execute(task);
@Override
protected void onDestroy() {
super.onDestroy();
// the ui thread realm uses asynchronous transactions, so we can only safely close the realm
// when the activity ends and we can safely assume that those transactions have completed
backgroundThreadRealm.close();
}
I get no exceptions but the Log:
Log.v("Contacts", String.valueOf(contacts.size()));
results in 0.
Yet I have these contacts in the DB (they have different IDs):
And the related model in java:
@RealmClass
public class Contact extends RealmObject implements Serializable {
@PrimaryKey
private String nameSurname;
private int age;
// Drawable resource ID
private int imageResourceId;
public Contact() {
}
public Contact(String name, String surname, int age, int imageResourceId) {
this.nameSurname = name+" "+surname;
this.age = age;
this.imageResourceId = imageResourceId;
}
// In addition, all the getters and setters
}
Can you help me?
It would also help to understand when it's appropriate to make synchronous and asynchronous calls, because I guess I've confused the implementations in general.
I'd like to use synchronous calls to get all the objects from the DB and then display them in the app, but that seems ill-advised from what I've read online, so I tried asynchronous calls instead, although I'm sure I did something wrong.
Thanks
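On the sync-vs-async question: blocking queries on the UI thread are discouraged because they freeze rendering, which is why the Realm docs steer you toward asynchronous queries with a change listener for anything you want to display. A hedged sketch against the Realm Java SDK (assuming `uiThreadRealm` is a Realm opened on a looper thread such as the UI thread):

```java
// On a looper thread (e.g. the UI thread): run the query asynchronously and
// react when the results are ready instead of reading size() immediately.
RealmResults<Contact> contacts = uiThreadRealm.where(Contact.class).findAllAsync();
contacts.addChangeListener(results -> {
    // Called on the UI thread once the query has finished (and on later changes).
    Log.v("Contacts", String.valueOf(results.size()));
});
```

Note that logging `contacts.size()` immediately after a query on a freshly opened realm can legitimately print 0 if the data has not yet been written (or synced) into that same realm file; the listener fires again when the results change.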
Motivation
I have a service which I want to make @Transactional. My service stores complex data in multiple tables, with referential integrity between the tables.
public class MyData { // simplified
MasterData masterData;
List<Detail> details;
public static class MasterData {
UUID id;
String field1;
String field2;
String field3;
}
public static class Detail {
UUID masterId;
String fieldA;
String fieldB;
}
}
First I save the master data into one table using R2dbcRepository<MasterData, UUID>. The INSERT command is simple and I may use the R2dbcRepository.
Then a list of details into another table using DatabaseClient. Each detail has a foreign key constraint to the master table. I want to use batch INSERT and I complete the SQL using more complex approach in DatabaseClient.
Problem
The problem is that I cannot save the detail data - I get the error
insert or update on table "detail" violates foreign key constraint
I suspect that the reason is that each SQL command is executed in a different connection so the master data are not yet visible when the details are stored.
Question
Is it really the root cause? How do I make R2DBC always use the same connection across all the calls to the database inside one @Transactional service call, even if it goes via various instances of R2dbcRepository and DatabaseClient?
If the solution is completely wrong, how do I correctly implement @Transactional in R2DBC?
I prefer calling all the INSERTs into the detail table in a batch.
Code
My (simplified) code looks like this:
@Service
public class MyService {
private final MasterRepository masterRepository;
private final DbConnector dbConnector;
@Transactional
public Mono<Void> saveMasterAndDetails(MyData data) {
return Mono.just(data)
.map(MyData::getMaster)
.flatMap(masterRepository::insertMasterData)
.thenReturn(data)
.map(MyData::getDetails)
.flatMap(dbConnector::insertDetails)
.then()
;
}
}
The code of MasterRepository is something like
import org.springframework.data.r2dbc.repository.R2dbcRepository;
public interface MasterRepository extends R2dbcRepository<MasterData, UUID> {
@Query("""
INSERT INTO master(id, col_1, col_2, col_3)
VALUES (
:#{#masterData.id},
:#{#masterData.field1},
:#{#masterData.field2},
:#{#masterData.field3})
""")
Mono<Void> insertMasterData(MasterData masterData);
}
And the code of DbConnector is more complex, but maybe overly complex? DatabaseClient is still missing direct support for batches and prepared statements: spring-data-r2dbc-259, spring-framework-27229
import org.springframework.r2dbc.core.DatabaseClient;
import io.r2dbc.spi.Connection;
import io.r2dbc.spi.Result;
import io.r2dbc.spi.Statement;
public class DbConnector {
private final DatabaseClient databaseClient;
public Mono<Integer> insertDetails(List<Detail> details) {
// usingWhen() is the reactive analogue of "try-with-resources"
return Flux.usingWhen(
// "try(Create the resource)"
databaseClient.getConnectionFactory().create(),
// "{ the body }"
connection -> {
final Statement statement = connection.createStatement("""
insert into detail (masterId, col_A, col_B)
values ($1, $2, $3)
""");
details.forEach(detail ->
statement
.bind("$1", detail.getMasterId())
.bind("$2", detail.getColA())
.bind("$3", detail.getColB())
.add()
);
return statement.execute();
},
// "finally close()"
Connection::close)
.flatMap(Result::getRowsUpdated)
.reduce(0, Integer::sum);
}
}
TL;DR:
I replaced
return Flux.usingWhen(
databaseClient.getConnectionFactory().create(),
connection -> {
statement = ... // prepare the statement
return statement.execute();
},
Connection::close
)
with
return databaseClient.inConnection(connection -> {
statement = ... // prepare the statement
return statement.execute();
});
Detailed answer
I have discovered a method which I was not aware of: DatabaseClient.inConnection(). The DatabaseClient interface inherits it from ConnectionAccessor:
Execute a callback Function within a Connection scope. The function is responsible for creating a Mono. The connection is released after the Mono terminates (or the subscription is cancelled). Connection resources must not be passed outside of the Function closure, otherwise resources may get defunct.
I changed my code using the DatabaseClient and it seems that the SQL commands are executed in the same connection.
However, I would still like to understand it better. I am not sure, if I am just lucky now and if it can change with the next implementation. I still do not know how to have the full control over the connections and hence over the transactional code.
import org.springframework.r2dbc.core.DatabaseClient;
import io.r2dbc.spi.Connection;
import io.r2dbc.spi.Result;
import io.r2dbc.spi.Statement;
public class DbConnector {
private final DatabaseClient databaseClient;
public Mono<Integer> insertDetails(List<Detail> details) {
return databaseClient.inConnection(connection -> {
final Statement statement = connection.createStatement("""
insert into detail (masterId, col_A, col_B)
values ($1, $2, $3)
""");
details.forEach(detail ->
statement
.bind("$1", detail.getMasterId())
.bind("$2", detail.getColA())
.bind("$3", detail.getColB())
.add()
);
return Flux.from(statement.execute())
.flatMap(Result::getRowsUpdated)
.reduce(0, Integer::sum)
;
});
}
}
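If you want explicit rather than annotation-driven control over the boundary, Spring's reactive TransactionalOperator makes the transaction visible in code: everything composed inside transactional(...) runs within one transaction, and hence on the same connection managed by the ReactiveTransactionManager. A hedged sketch (the TransactionalOperator bean, built from your R2dbcTransactionManager, and the accessor names on MyData are assumptions):

```java
import org.springframework.transaction.reactive.TransactionalOperator;
import reactor.core.publisher.Mono;

public class MyTransactionalService {
    private final MasterRepository masterRepository;
    private final DbConnector dbConnector;
    // Assumed to be configured from the application's R2dbcTransactionManager.
    private final TransactionalOperator txOperator;

    public MyTransactionalService(MasterRepository masterRepository,
                                  DbConnector dbConnector,
                                  TransactionalOperator txOperator) {
        this.masterRepository = masterRepository;
        this.dbConnector = dbConnector;
        this.txOperator = txOperator;
    }

    public Mono<Void> saveMasterAndDetails(MyData data) {
        return masterRepository.insertMasterData(data.getMasterData())
                .then(dbConnector.insertDetails(data.getDetails()))
                .then()
                // One transaction, therefore one connection, for the whole chain.
                .as(txOperator::transactional);
    }
}
```

Compared to relying on @Transactional alone, this makes it explicit which part of the reactive chain participates in the transaction, which addresses the "am I just lucky?" concern.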
My flink program should do a Cassandra look up for each input record and based on the results, should do some further processing.
But I'm currently stuck at reading data from Cassandra. This is the code snippet I've come up with so far.
ClusterBuilder secureCassandraSinkClusterBuilder = new ClusterBuilder() {
@Override
protected Cluster buildCluster(Cluster.Builder builder) {
return builder.addContactPoints(props.getCassandraClusterUrlAll().split(","))
.withPort(props.getCassandraPort())
.withAuthProvider(new DseGSSAPIAuthProvider("HTTP"))
.withQueryOptions(new QueryOptions().setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM))
.build();
}
};
for (int i=1; i<5; i++) {
CassandraInputFormat<Tuple2<String, String>> cassandraInputFormat =
new CassandraInputFormat<>("select * from test where id=hello" + i, secureCassandraSinkClusterBuilder);
cassandraInputFormat.configure(null);
cassandraInputFormat.open(null);
Tuple2<String, String> out = new Tuple2<>();
cassandraInputFormat.nextRecord(out);
System.out.println(out);
}
But the issue with this is, it takes nearly 10 seconds for each look up, in other words, this for loop takes 50 seconds to execute.
How do I speed up this operation? Alternatively, is there any other way of looking up Cassandra in Flink?
I came up with a solution that is fairly fast at querying Cassandra with streaming data. Would be of use to someone with the same issue.
Firstly, Cassandra can be queried with as little code as,
Session session = secureCassandraSinkClusterBuilder.getCluster().connect();
ResultSet resultSet = session.execute("SELECT * FROM TABLE");
But the problem with this is, creating Session is a very time-expensive operation and something that should be done once per key space. You create Session once and reuse it for all read queries.
Now, since Session is not serializable, it cannot be passed as an argument to Flink operators like map() or a ProcessFunction. There are a few ways of solving this: you can use a RichFunction and initialize the session in its open() method, or use a singleton. I will use the second solution.
Make a Singleton Class as follows where we create the Session.
public class CassandraSessionSingleton {
private static CassandraSessionSingleton cassandraSessionSingleton = null;
public Session session;
private CassandraSessionSingleton(ClusterBuilder clusterBuilder) {
Cluster cluster = clusterBuilder.getCluster();
session = cluster.connect();
}
public static CassandraSessionSingleton getInstance(ClusterBuilder clusterBuilder) {
if (cassandraSessionSingleton == null)
cassandraSessionSingleton = new CassandraSessionSingleton(clusterBuilder);
return cassandraSessionSingleton;
}
}
You can then make use of this session for all future queries. Here I'm using the ProcessFunction to make queries as an example.
public class SomeProcessFunction extends ProcessFunction<Object, ResultSet> {
ClusterBuilder secureCassandraSinkClusterBuilder;
// Constructor
public SomeProcessFunction(ClusterBuilder secureCassandraSinkClusterBuilder) {
this.secureCassandraSinkClusterBuilder = secureCassandraSinkClusterBuilder;
}
@Override
public void processElement(Object obj, Context ctx, Collector<ResultSet> out) throws Exception {
ResultSet resultSet = CassandraLookUp.cassandraLookUp("SELECT * FROM TEST", secureCassandraSinkClusterBuilder);
out.collect(resultSet);
}
}
Note that you can pass ClusterBuilder to ProcessFunction as it is Serializable. Now for the cassandraLookUp method where we execute the query.
public class CassandraLookUp {
public static ResultSet cassandraLookUp(String query, ClusterBuilder clusterBuilder) {
CassandraSessionSingleton cassandraSessionSingleton = CassandraSessionSingleton.getInstance(clusterBuilder);
Session session = cassandraSessionSingleton.session;
ResultSet resultSet = session.execute(query);
return resultSet;
}
}
The singleton object is created only the first time the query is run, after that, the same object is reused, so there is no delay in look up.
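If the lookups are still too slow even with a shared Session, the DataStax driver's executeAsync() lets several queries be in flight at once on that same session instead of blocking on each one in turn. A sketch, assuming driver 3.x (where executeAsync returns a ResultSetFuture) and the table and ids from the question:

```java
// Issue all lookups first, then collect the results: the round-trips overlap
// instead of running back-to-back.
Session session = CassandraSessionSingleton.getInstance(clusterBuilder).session;
List<ResultSetFuture> futures = new ArrayList<>();
for (int i = 1; i < 5; i++) {
    futures.add(session.executeAsync("select * from test where id='hello" + i + "'"));
}
for (ResultSetFuture future : futures) {
    ResultSet rs = future.getUninterruptibly(); // waits only if this query is still in flight
    // process rs ...
}
```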
I have designed and implemented a simple webstore based on traditional MVC Model 1 architecture using pure JSP and JavaBeans (Yes, I still use that legacy technology in my pet projects ;)).
I am using the DAO design pattern to implement the persistence layer for the webstore. But I am not sure if I have implemented the classes in my DAO layer correctly. I am specifically concerned about the QueryExecutor.java and DataPopulator.java classes (shown below). All the methods in both these classes are defined as static, which makes me wonder whether this is the correct approach in a multithreaded environment. Hence, I have the following questions regarding the static methods.
Will there be synchronization issues when multiple users try to check out different products at the same time? If the answer is yes, how can I actually reproduce this synchronization issue?
Are there any testing/tracing tools available which will actually show that a specific piece of code will/might create synchronization issues in a multithreaded environment? Can I see that a User1 was trying to access Product-101 but was displayed Product-202 because of non thread-safe code?
Assuming there are synchronization issues: should these methods be made non-static and the classes instantiable, so that we can create an instance using the new operator, or should a synchronized block be placed around the non-thread-safe code?
Please guide.
MasterDao.java
public interface MasterDao {
Product getProduct(int productId) throws SQLException;
}
BaseDao.java
public abstract class BaseDao {
protected DataSource dataSource;
public BaseDao(DataSource dataSource) {
this.dataSource = dataSource;
}
}
MasterDaoImpl.java
public class MasterDaoImpl extends BaseDao implements MasterDao {
private static final Logger LOG = Logger.getLogger(MasterDaoImpl.class);
public MasterDaoImpl(DataSource dataSource) {
super(dataSource);
}
#Override
public Product getProduct(int productId) throws SQLException {
Product product = null;
String sql = "select * from products where product_id= " + productId;
//STATIC METHOD CALL HERE, COULD THIS POSE A SYNCHRONIZATION ISSUE ??????
List<Product> products = QueryExecutor.executeProductsQuery(dataSource.getConnection(), sql);
if (!GenericUtils.isListEmpty(products)) {
product = products.get(0);
}
return product;
}
}
QueryExecutor.java
public final class QueryExecutor {
private static final Logger LOG = Logger.getLogger(QueryExecutor.class);
//SO CANNOT NEW AN INSTANCE
private QueryExecutor() {
}
static List<Product> executeProductsQuery(Connection cn, String sql) {
Statement stmt = null;
ResultSet rs = null;
List<Product> al = new ArrayList<>();
LOG.debug(sql);
try {
stmt = cn.createStatement();
rs = stmt.executeQuery(sql);
while (rs != null && rs.next()) {
//STATIC METHOD CALL HERE, COULD THIS POSE A SYNCHRONIZATION ISSUE ???????
Product p = DataPopulator.populateProduct(rs);
al.add(p);
}
LOG.debug("al.size() = " + al.size());
return al;
} catch (Exception ex) {
LOG.error("Exception while executing products query....", ex);
return null;
} finally {
try {
if (rs != null) {
rs.close();
}
if (stmt != null) {
stmt.close();
}
if (cn != null) {
cn.close();
}
} catch (Exception ex) {
LOG.error("Exception while closing DB resources rs, stmt or cn.......", ex);
}
}
}
}
DataPopulator.java
public class DataPopulator {
private static final Logger LOG = Logger.getLogger(DataPopulator.class);
//SO CANNOT NEW AN INSTANCE
private DataPopulator() {
}
//STATIC METHOD DEFINED HERE, COULD THIS POSE A SYNCHRONIZATION ISSUE FOR THE CALLING METHODS ???????
public static Product populateProduct(ResultSet rs) throws SQLException {
String productId = GenericUtils.nullToEmptyString(rs.getString("PRODUCT_ID"));
String name = GenericUtils.nullToEmptyString(rs.getString("NAME"));
String image = GenericUtils.nullToEmptyString(rs.getString("IMAGE"));
String listPrice = GenericUtils.nullToEmptyString(rs.getString("LIST_PRICE"));
Product product = new Product(new Integer(productId), name, image, new BigDecimal(listPrice));
LOG.debug("product = " + product);
return product;
}
}
Your code is thread-safe.
The reason, and the key to thread-safety, is that your (static) methods do not maintain state, i.e. your methods only use local variables (not fields).
It doesn't matter if the methods are static or not.
Assuming there are synchronization issues; should these methods be made non-static and the classes instantiable so that we can create an instance using the new operator
This won't help. Multiple threads can do as they please with a single object just as they can with a static method, and you will run into synchronization issues.
OR Should a synchronized block be placed around the non thread-safe code?
Yes, this is the safe way. Any code inside a synchronized block is guaranteed to have at most one thread executing it at any given time.
Looking through your code, I don't see many data structures that could possibly be shared amongst threads. Assuming you had something like
public final class QueryExecutor {
int numQueries = 0;
public void doQuery() {
numQueries++;
}
}
Then you run into trouble, because 4 threads could execute doQuery() at the same moment, and so you would have 4 threads modifying the value of numQueries: a big problem.
However, with your code, the only shared class field is the logger, which has its own thread-safe synchronization built in; therefore the code you have provided looks good.
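To make that concrete: if such a shared counter did exist, the lost-update race could be removed either with a synchronized block or, more simply, with an AtomicInteger. A minimal sketch of the latter:

```java
import java.util.concurrent.atomic.AtomicInteger;

public final class CountingQueryExecutor {
    private final AtomicInteger numQueries = new AtomicInteger();

    public void doQuery() {
        // Atomic read-modify-write: safe under concurrent callers,
        // unlike a plain numQueries++ on an int field.
        numQueries.incrementAndGet();
    }

    public int queryCount() {
        return numQueries.get();
    }
}
```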
There is no state within your code (no mutable member variables or fields, for example), so Java synchronisation is irrelevant.
Also as far as I can tell there are no database creates, updates, or deletes, so there's no issue there either.
There's some questionable practice, for sure (e.g. the non-management of the database Connection object, the wide scope of some variables, not to mention the statics), but nothing wrong as such.
As for how you would test, or determine thread-safety, you could do worse than operate your site manually using two different browsers side-by-side. Or create a shell script that performs automated HTTP requests using curl. Or create a WebDriver test that runs multiple sessions across a variety of real browsers and checks that the expected products are visible under all scenarios...
For university, my exercise is to develop a multiplayer game in Java. The communication between the clients shall not be handled with sockets or the like, but via a MySQL database where the clients record their moves in the game. Because it is a dice game, not many queries are needed (approximately 30 queries per gaming session).
I have never used MySQL together with Java before, so this may be a beginner's mistake. But I often get an exception during the execution of my Java project.
com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: User my_username already has more than 'max_user_connections' active connections
My queries are executed in a DatabaseHelper.java class. The results are returned and evaluated in another class of the project. Since I use an MVC pattern, I evaluate the results in a controller or model class.
This, for example, is one of my queries in the DatabaseHelper.java class. The other queries are similar:
private static Connection conn;
private Connection getConn() {
return conn;
}
private void setConn(Connection c) {
conn = c;
}
public void db_connect() throws ClassNotFoundException, SQLException {
// Load the JDBC driver class
Class.forName(dbClassName);
// Set the login timeout to 5 seconds
DriverManager.setLoginTimeout(5);
this.setConn(DriverManager.getConnection(CONNECTION, p)); // p contains the username and password
}
public void db_close(){
try {
this.getConn().close();
} catch (SQLException e) {
if(GLOBALVARS.DEBUG)
e.printStackTrace();
}
}
public String[] query_myHighscores(int gameid, PlayerModel p) throws SQLException{
List<String> rowValues = new ArrayList<String>();
PreparedStatement stmnt;
if(gameid == GLOBALVARS.DRAGRACE)
stmnt = this.getConn().prepareStatement("SELECT score FROM highscore WHERE gid = ? and pname = ? ORDER BY score ASC LIMIT 0,3");
else
stmnt = this.getConn().prepareStatement("SELECT score FROM highscore WHERE gid = ? and pname = ? ORDER BY score DESC LIMIT 0,3");
stmnt.setInt(1, gameid);
stmnt.setString(2, p.getUname());
ResultSet rs = stmnt.executeQuery();
while (rs.next()) {
rowValues.add(rs.getString(1));
}
rs.close();
stmnt.close();
return (String[])rowValues.toArray(new String[rowValues.size()]);
}
The CONNECTION string is a string which looks like jdbc:mysql://my_server/my_database
In the HighscoreGUI.java class, I request the data like this:
private void actualizeHighscores(){
DatabaseHelper db = new DatabaseHelper();
try{
db.db_connect();
String[] myScoreDragrace = db.query_myHighscores(GLOBALVARS.GAME1); // id of the game as parameter
// using the string
} finally {
db.db_close();
}
}
So I tried:
Closing the statement and the ResultSet after each query
Used db_close() to close the connection to the dabase in the finally-block
Never returning a ResultSet from the helper (I found out this can leak resources)
The stacktrace leads in the DatabaseHelper.java class to the line
this.setConn(DriverManager.getConnection(CONNECTION,p));
But I cannot find my mistake why I still get this exception.
I cannot change every settings for the database since this is a shared host. So I'd prefer a solution on Java side.
The problem is that you exceed your allowed set of connections to that database. Most likely this limit is exactly or very close to "1". So as soon as you request your second connection your program crashes.
You can solve this by using a connection pooling system like commons-dbcp.
That is the recommended way of doing it; the solution below applies only if you may not use external libraries.
If external libraries are off-limits for your solution, you can do this:
Create a "Database" class. This class, and only this class, ever connects to the DB, and it does so only once per program run. You set it up, it connects to the database, and then all queries are created and run through this class; in Java we call this construct a "singleton". It usually has a private constructor and a public static method that returns the one and only instance of itself. You keep this connection up for the entire lifetime of your program and only re-establish it if it goes stale. Basically, you implement a "connection pool" for the special case of pool size 1.
public class Database {
private static final Database INSTANCE = new Database();
private Database() {}
public static Database getInstance() {
return INSTANCE;
}
// add your methods here.
}
When the program terminates, close the Connection (using a shutdown hook).
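Putting those pieces together, a pool-of-one singleton might look like the sketch below. The JDBC URL and credentials are placeholders; the connection is opened lazily, re-opened if it was closed, and closed by a shutdown hook when the JVM exits:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public final class Database {
    private static final Database INSTANCE = new Database();
    private Connection conn;

    private Database() {
        // Close the single connection when the JVM exits.
        Runtime.getRuntime().addShutdownHook(new Thread(this::close));
    }

    public static Database getInstance() {
        return INSTANCE;
    }

    // Lazily opens the connection, re-opening it if it has been closed.
    public synchronized Connection getConnection() throws SQLException {
        if (conn == null || conn.isClosed()) {
            conn = DriverManager.getConnection(
                    "jdbc:mysql://my_server/my_database", "user", "password"); // placeholders
        }
        return conn;
    }

    private synchronized void close() {
        try {
            if (conn != null) conn.close();
        } catch (SQLException ignored) {
        }
    }
}
```

With this in place, every query in DatabaseHelper goes through Database.getInstance().getConnection() instead of calling DriverManager.getConnection() per request, so the program never holds more than one connection.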