How do I insert a lot of entities in a Play! Job? - java

In my application I have to simulate various situations for analysis, which means inserting a (very) large number of rows into a database. (We're talking about a very large amount of data...several billion rows.)
Model
@Entity
public class Case extends Model {
    public String url;

    public Case(String url) { // constructor used by the Job below
        this.url = url;
    }
}
Job
public class Simulator extends Job {
    public void doJob() {
        for (int i = 0; i < total; i++) { // loop bound was missing in the original; "total" is the number of rows to insert
            // Some stuff
            new Case(someString).save();
        }
    }
}
After half an hour, there is still nothing in the database. But debug traces show Play inserts some stuff. I suspect it is some kind of cache.
I've tried just about everything:
Model.em().flush();
Changes nothing.
Model.em().getTransaction().commit();
throws TransactionRequiredException: no transaction is in progress
Model.em().setFlushMode(FlushModeType.COMMIT);
Model.em().setFlushMode(FlushModeType.AUTO);
Changes nothing.
I've also tried @NoTransaction annotations everywhere:
Class & functions in Controller
Class Case
Overriding save method in Model
Class & functions of my Job
Getting quite desperate. Every kind of advice is welcome.
EDIT: After a little research, the first row has appeared in the database. The associated ID is about 550,000. That means about half a million rows are somewhere in between my application and the database.

Try
em.getTransaction().begin();
em.persist(model);
em.getTransaction().commit();
You can't commit a transaction before you begin it.

As per the documentation, a job has its own transaction enabled just like a Play request does, so that's not the issue. Try doing this:
for (int i = 0; i < total; i++) { // loop bound was missing in the original
    // Some stuff
    Case tmp = new Case(someString);
    tmp = JPA.em().merge(tmp);
    tmp.save();
}
The idea is that you add the newly created object to the EntityManager explicitly before saving, making sure the object is part of the "dirty objects" that will be persisted.
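If the loop really has to insert millions of rows, a complementary trick (a sketch only, not part of the original answer; batchSize and total are made-up names) is to flush and clear the persistence context every few thousand saves so it does not keep hundreds of thousands of managed entities in memory:
int batchSize = 1000; // hypothetical batch size
for (int i = 0; i < total; i++) {
    new Case(someString).save();
    if (i % batchSize == 0) {
        JPA.em().flush(); // push the pending INSERTs to the database
        JPA.em().clear(); // detach managed entities so the persistence context stays small
    }
}
Note that flush() sends the SQL but does not commit; in Play 1.x the job's transaction is still committed only when the job ends, so other connections will not see the rows before that.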

You need to instruct Play! when it should run your job by annotating your class with one of these annotations: @OnApplicationStart, @Every or @On.
Please check the Play! documentation on jobs.
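For example, a minimal sketch of scheduling the Simulator job from the question (the "1h" interval is just an example value):
import play.jobs.Every;
import play.jobs.Job;

@Every("1h") // or @OnApplicationStart, or @On("0 0 12 * * ?") for a cron schedule
public class Simulator extends Job {
    @Override
    public void doJob() {
        // the bulk insert work goes here
    }
}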

Related

How to create a reusable Map

Is there a way to populate a Map once from the DB data (through a Mongo repository) and reuse it when required from multiple classes, instead of hitting the database through the repository each time?
As per your comment, what you are looking for is a caching mechanism. Caches are components that allow data to live in memory, as opposed to files, databases or other mediums, so as to allow fast retrieval of information (at the cost of a higher memory footprint).
There are probably various tutorials online, but usually caches all have the following behaviour:
1. They are key-value pair structures.
2. Each entity living in the cache also has a Time To Live (TTL), that is, how long it is considered to be valid.
You can implement this in the repository layer, so the cache mechanism will be transparent to the rest of your application (but you might want to consider exposing functionality that allows clearing/invalidating part or all of the cache).
So basically, when a query comes to your repository layer, check in the cache. If it exists in there, check the time to live. If it is still valid, return that.
If the key does not exist or the TTL has expired, you add/overwrite the data in the cache. Keep in mind that when you update the data model yourself, you should also invalidate the cache accordingly so that fresh data will be pulled from the DB on the next call.
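As a rough illustration of that flow (a sketch only; the class name, key/value types and TTL are placeholders, not taken from the question):
import java.util.HashMap;
import java.util.Map;

public class CachingRepository {
    private static final long TTL_MILLIS = 60_000; // hypothetical: cached data is valid for one minute

    private Map<String, Object> cache = new HashMap<>();
    private long lastLoaded = 0;

    public synchronized Object find(String key) {
        long now = System.currentTimeMillis();
        if (cache.isEmpty() || now - lastLoaded > TTL_MILLIS) {
            cache = loadAllFromDb(); // hit the Mongo repository only when the cache is empty or expired
            lastLoaded = now;
        }
        return cache.get(key);
    }

    public synchronized void invalidate() {
        cache.clear(); // force a reload on the next read, e.g. after a write
    }

    private Map<String, Object> loadAllFromDb() {
        // query the Mongo repository here and index the results by key
        return new HashMap<>();
    }
}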
You can declare the map field as public static, which would allow application-wide access via ClassLoadingData.mapField.
I think a better solution, if I understood the problem, would be a memoized function, that is, a function that stores the result of its call. Here is a sketch of how this could be done (note this does not handle possible synchronization problems in a multi-threaded environment):
class ClassLoadingData {
    private static Map<KeyType, ValueType> memoizedData = new HashMap<>();

    public Map<KeyType, ValueType> getMyData() {
        if (memoizedData.isEmpty()) { // you can use a more complex check to handle data refresh
            populateData(memoizedData);
        }
        return memoizedData;
    }

    private void populateData(Map<KeyType, ValueType> data) {
        // do your query, and put the results into the map
    }
}
Premise: I suggest using an object-relational mapping tool like Hibernate in your Java project to map the object-oriented domain model to a relational database and let the tool handle the cache mechanism implicitly. Hibernate specifically implements a multi-level caching scheme (take a look at the following link for more information: https://www.tutorialspoint.com/hibernate/hibernate_caching.htm ).
Regardless of the suggestion in the premise, you can also manually create a singleton class that will be used by every class in the project that interacts with the DB:
public class MongoDBConnector {
private static final Logger LOGGER = LoggerFactory.getLogger(MongoDBConnector.class);
private static MongoDBConnector instance;
//Cache period in seconds
public static int DB_ELEMENTS_CACHE_PERIOD = 30;
//Latest cache update time
private DateTime latestUpdateTime;
//The cache data layer from DB
private Map<KType,VType> elements;
private MongoDBConnector() {
}
public static synchronized MongoDBConnector getInstance() {
if (instance == null) {
instance = new MongoDBConnector();
}
return instance;
}
}
Here you can then define a load method that updates the map with the values stored in the DB, and also a write method that writes values to the DB, with the following characteristics:
1- These methods should be synchronized in order to avoid issues if multiple calls are performed.
2- The load method should apply cache-period logic (maybe with a configurable period) to avoid loading the data from the DB on every method call.
Example: Suppose your cache period is 30s. This means that if 10 reads are performed from different points of the code within 30s, you will load data from the DB only on the first call, while the others will read from the cached map, improving performance.
Note: The longer the cache period, the better the performance of your code, but if the DB is also modified externally (from another tool or manually) the cache will become inconsistent with it. So choose the value that works best for you.
public synchronized Map<KType, VType> getElements() throws ConnectorException {
final DateTime currentTime = new DateTime();
if (latestUpdateTime == null || (Seconds.secondsBetween(latestUpdateTime, currentTime).getSeconds() > DB_ELEMENTS_CACHE_PERIOD)) {
LOGGER.debug("Cache is expired. Reading values from DB");
//Read from DB and update cache
//....
latestUpdateTime = currentTime;
}
return elements;
}
3- The store method should automatically update the cache if the insert is performed correctly, regardless of whether the cache period has expired:
public synchronized void storeElement(final VType object) throws ConnectorException {
//Insert object on DB ( throws a ConnectorException if insert fails )
//...
//Update cache regardless the cache period
loadElementsIgnoreCachePeriod();
}
Then you can get elements from every point in your code as follows:
Map<KType,VType> liveElements = MongoDBConnector.getInstance().getElements();

Java: mock database for testing a Spring application

I have made a simple application for study purposes and I want to write some unit/integration tests. I have read that I can mock the database instead of creating a new DB for tests. I will copy the code I wrote. I hope someone can explain to me how to mock the database.
public class UserServiceImpl implements UserService {

    @Autowired
    private UserOptionsDao uod;

    @Override
    public User getUser(int id) throws Exception {
        if (id < 1) {
            throw new InvalidParameterException();
        }
        return uod.getUser(id);
    }

    @Override
    public User changeUserEmail(int id, String email) {
        if (id < 1) {
            throw new InvalidParameterException();
        }
        String[] emailParts = email.split("@");
        if (emailParts[0].length() < 5) {
            throw new InvalidParameterException();
        } else if (!emailParts[1].equals("email.com")) {
            throw new InvalidParameterException();
        }
        return uod.changeUserEmail(id, email);
    }
}
The above is part of the code that I want to test with a mock database.
Generally you have three options:
Mock the data returned by UserOptionsDao as @Betlista suggested, thus creating a "fake" DAO object.
Use an in-memory database like HSQLDB to create a database with mock data when the test starts, or
Use something like a Docker container to spin up an instance of MySQL or the like and populate it with data, so you can restart it as necessary.
None of these solutions are perfect.
With #1, your test will skip the intermediate steps of authenticating to the database and looking for data. That leaves a part of your code untested, and as they say, "the devil is in the details." People often run into problems at deployment time when they mock DAOs like this.
With #2, you connect to an actual database, but you have to make sure that either you are using the exact same type of database in your production code or something compatible. It also makes debugging a pain because you have to pause the test to see the contents of the database if something goes wrong.
With #3, you avoid all the problems with #1 and #2, but then you have to wire up all the Docker stuff. (I'm doing this right now, and I'm having problems too.) The advantage, though, is that like #2 you can set up all of your test data at once, and be guaranteed that the database you use in production is exactly the same as the one in your unit tests.
In your case, I would go with #2 since the application is for study purposes. Yes, I know this is a long-winded answer, but as you gain experience, you will probably want to know how to "scale up."
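For option #2, a minimal sketch using Spring's embedded-database builder (assuming spring-jdbc and an HSQLDB dependency on the test classpath; the script names are hypothetical):
import org.springframework.jdbc.datasource.embedded.EmbeddedDatabase;
import org.springframework.jdbc.datasource.embedded.EmbeddedDatabaseBuilder;
import org.springframework.jdbc.datasource.embedded.EmbeddedDatabaseType;

public class TestDatabaseFactory {
    public static EmbeddedDatabase create() {
        return new EmbeddedDatabaseBuilder()
                .setType(EmbeddedDatabaseType.HSQL)
                .addScript("schema.sql")    // hypothetical DDL for the user tables
                .addScript("test-data.sql") // hypothetical seed data
                .build();
    }
}
The returned EmbeddedDatabase is a DataSource, so you can hand it to your DAO in the test setup and call shutdown() on it afterwards.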
What you can do very easily is to have your own implementation of UserOptionsDao in the test package and set this one on UserServiceImpl. This new implementation can return a fixed set of data, for example...
This is a high-level idea. You probably do not want to have many implementations (one per test, in general), so you should use a mocking framework like Mockito or EasyMock; look at their documentation for more details.
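For example, a rough Mockito sketch for option #1 (assuming JUnit 4 and Mockito on the test classpath, and that User has a no-arg constructor; the test values are made up):
import static org.junit.Assert.assertSame;
import static org.mockito.Mockito.verify;
import static org.mockito.Mockito.when;

import org.junit.Before;
import org.junit.Test;
import org.mockito.InjectMocks;
import org.mockito.Mock;
import org.mockito.MockitoAnnotations;

public class UserServiceImplTest {

    @Mock
    private UserOptionsDao uod;       // fake DAO, no database involved

    @InjectMocks
    private UserServiceImpl service;  // the mock gets injected into the @Autowired field

    @Before
    public void setUp() {
        MockitoAnnotations.initMocks(this);
    }

    @Test
    public void getUserDelegatesToDao() throws Exception {
        User expected = new User();
        when(uod.getUser(1)).thenReturn(expected);

        assertSame(expected, service.getUser(1));
        verify(uod).getUser(1);
    }
}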

App Engine: different results for same objectify query

I am developing a small backend with app engine. Now I get some weird behaviour after saving an entity multiple times with different values.
My code to load entities is the same for all entities - each entity gets a changeId so I can transfer only changed entities to the clients:
public class VersionableRecordHelper<T extends VersionableRecord> {
final Class<T> clazz;
public VersionableRecordHelper(Class<T> clazz) {
this.clazz = clazz;
}
Query<T> load() {
return ofy().load().type(clazz);
}
List<T> loadOrdered() {
return load().order("changeId").list();
}
public List<T> loadOrdered(Long since) {
return since == null ? loadOrdered() : load().filter("changeId >", since).order("changeId").list();
}
}
The clients then can query all objects of a class by providing a since value. For example:
private final VersionableRecordHelper<Cat> helper
= new VersionableRecordHelper<>(Cat.class);
// actually an #ApiMethod, simplified here
public List<Cat> getCats(Long since) {
return helper.loadOrdered(since);
}
My Cat entity looks like the following:
@Entity
@Cache
@JsonSerialize(include = JsonSerialize.Inclusion.ALWAYS)
public class Cat extends VersionableRecord {
    // some fields, getters, setters
}

public class VersionableRecord {
    @Id
    private String id;

    @Index
    private Long changeId;

    // getters, setters and more
}
Now, if I do the same REST request with since == 4, I get completely different results - sometimes with changeId == 5, but also with 2, 3 or 4 - which should not even be possible!
I am completely lost here. This is what I checked yet:
I did not change the records during the test. In fact, I completely left the records alone for more than 90 minutes.
I checked that only one app engine instance was running.
I tried to flush the memcache - but the same 2 ObjectifyCache keys keep hanging around.
The memcache service level is 'Shared'.
I checked that the value of since is not null. So the code definitely gets executed.
Currently I am using objectify version 5.0.3. From my build.gradle: compile 'com.googlecode.objectify:objectify:5.0.3'
I also made sure the entity has the correct changeId in the datastore by checking https://console.developers.google.com/project/project-id/datastore/query?authuser=0
Does anyone have a helpful idea? I also checked other types of entities - same behaviour.
Wild guess, this is related to FAQ #3:
https://code.google.com/p/objectify-appengine/wiki/FrequentlyAskedQuestions#Strange_things_are_showing_up_in_my_session_cache!_(or_missing_f
or, when googlecode dies, the third one down:
https://github.com/objectify/objectify/wiki/FrequentlyAskedQuestions
You need to have the ObjectifyFilter installed, otherwise you will bleed session data into subsequent requests. Upgrade to a more recent version of Objectify; it will give you a more explicit error (at the cost of complicating test and remote API usage, but that's a different story).
If this isn't your issue, you need to describe your exact code in more detail.

Coherence and container managed transactions

I'm implementing simultaneous writes into a database and Oracle Coherence 3.7.1 and want to make the whole operation transactional.
I would like to have a critique on my approach.
Currently, I've created a façade class like this:
public class Facade {

    @EJB
    private JdbcDao jdbcDao;

    @EJB
    private CoherenceDao coherenceDao;

    @TransactionAttribute(TransactionAttributeType.REQUIRES_NEW)
    private void updateMethod(List<DomainObject> list) {
        jdbcDao.update(list);
        coherenceDao.update(list);
    }
}
I guess the JDBC DAO would not need to do anything specific about transactions; if something happens, Hibernate would throw some kind of RuntimeException.
public class JdbcDao {
private void update(List<DomainObject> list) {
// I presume there is nothing specific I have to do about transactions.
// if I don't catch any exceptions it would work just fine
}
}
Here is interesting part. How do I make Coherence support transactions?
I guess I should open a Coherence transaction inside the update() method and, on any exception inside it, throw a RuntimeException myself?
I'm currently thinking of something like this:
public class CoherenceDao {
private void update(List<DomainObject> list) {
// how should I make it transactional?
// I guess it should somehow throw RuntimeException?
TransactionMap mapTx = CacheFactory.getLocalTransaction(cache);
mapTx.setTransactionIsolation(TransactionMap.TRANSACTION_REPEATABLE_GET);
mapTx.setConcurrency(TransactionMap.CONCUR_PESSIMISTIC);
// gather the cache(s) into a Collection
Collection txnCollection = Collections.singleton(mapTx);
try {
mapTx.begin();
// put into mapTx here
CacheFactory.commitTransactionCollection(txnCollection, 1);
} catch (Throwable t) {
CacheFactory.rollbackTransactionCollection(txnCollection);
throw new RuntimeException();
}
}
}
Would this approach work as expected?
I know you asked this question a year ago, and my answer might not be as valuable to you now, but I'll still give it a try.
What you are trying to do works as long as there is no RuntimeException after the call to coherenceDao.update(list). You might be assuming that you don't have any lines of code after that line, but that's not the whole story.
As an example: you might have some deferrable constraints in your database. Those constraints will be applied when the container tries to commit the transaction, which happens on exit of updateMethod(List<DomainObject> list) and therefore after your call to coherenceDao.update(list). Another case would be a connection timeout to the database after coherenceDao.update(list) is executed but before the transaction commit.
In both cases the update method of your CoherenceDao class executes safe and sound and the Coherence transaction can no longer be rolled back, which will put your cache in an inconsistent state: those DB or Hibernate exceptions will surface as a RuntimeException and cause the container-managed transaction to be rolled back, while the cache keeps the already-committed data!

Update row in Play Framework when running a job

I'm running a Job in Play! that takes a while (more than 1 hour). During this job, I want to save some IDs to my MySQL database. However, after writing the data, if I query the database in my terminal (using the MySQL monitor), the rows don't seem to be updated. So... I need to update this data while the job is running.
This is the code:
this.ga.lastUID = msgnums[i-1];
this.ga.count = count[i-1];
this.ga.merge();
where this.ga is the instance of my JPA model.
If I use the save() method, I get a detached exception. If I use the findById(this.ga.id) method to get a new object, I can use save(), but it has the same result as merge() (i.e. nothing is updated). Does anyone have an idea how to fix this?
Thanks!
A job execution is transactional. Everything you do in your job will be committed at the end of the job.
To achieve what you want, you can create a subjob class and call it from the main one.
Something like
public class SubJob extends Job {

    private Ga ga;

    public SubJob(Ga ga) {
        this.ga = ga;
    }

    @Override
    public void doJob() throws Exception {
        ga = ga.merge();
        ga.lastUID = msgnums[i - 1]; // msgnums, count and i come from the surrounding job in the question
        ga.count = count[i - 1];
        ga.save();
    }
}
And in your main job you call
new SubJob(ga).now().get();
Your subjob will execute the save in another transaction that will be committed at the end of the subjob.
Be careful: with this way of working you can't roll back the whole main job.
