JPA Best Practice to update an Entity with Collections - java

I am using JPA in a Glassfish container. I have the following model (not complete):
@Entity
public class Node {
    @Id
    private String serial;
    @Version
    @Column(updatable = false)
    protected Integer version;
    private String name;
    @ManyToMany(cascade = {CascadeType.PERSIST, CascadeType.MERGE})
    private Set<LUN> luns = new HashSet<LUN>();
}

@Entity
public class LUN {
    @Id
    private String wwid;
    @Version
    @Column(updatable = false)
    protected Integer version;
    private String vendor;
    private String model;
    private Long capacity;
    @ManyToMany(mappedBy = "luns")
    private Set<Node> nodes = new HashSet<Node>();
}
This information will be updated daily. Now my question is: what is the best practice for doing this?
My first approach was to generate the Node objects (with LUNs) anew on the client every day and merge them into the database via a service (I wanted to let JPA do the work).
So far I have run some tests without LUNs. I have the following service in a stateless EJB:
public void updateNode(Node node) {
    if (!nodeInDB(node)) {
        LOGGER.log(Level.INFO, "persisting node {0} the first time", node.toString());
        em.persist(node);
    } else {
        LOGGER.log(Level.INFO, "merging node {0}", node.toString());
        node = em.merge(node);
    }
}
The test:
@Test
public void addTest() throws Exception {
    Node node = new Node();
    node.setName("hostname");
    node.setSerial("serial");
    nodeManager.updateNode(node);
    nodeManager.updateNode(node);
    node.setName("newhostname");
    nodeManager.updateNode(node);
}
This works without the @Version field. With the @Version field I get an OptimisticLockException.
Is that the wrong approach? Do I always have to perform an em.find(...) first and then modify the managed entity via its getters and setters?
Any help is appreciated.
BR Rene

The @Version annotation is used to enable optimistic locking.
When you use optimistic locking, each successful write to your table increases a version counter, which is read and compared every time you persist your entities. If the version read when you first find your entity doesn't match the version in the table at write time, an exception is thrown.
Your program updates the table several times after reading the version column only once. Therefore, by the second time you call persist() or merge(), the version numbers no longer match and the query fails. This is the expected behavior with optimistic locking: you were trying to overwrite a row that had changed since you first read it.
To answer your last question: you need to re-read the changed @Version information after every write to your database. You can do this by calling em.refresh().
You should, however, consider rethinking your strategy: optimistic locks are best used in transactions that ensure data consistency while a user performs changes. Such a flow usually reads the data, displays it to the user, waits for changes, and persists the data once the user has finished the task. In that context you wouldn't really want or need to write the same data rows several times, because each of those write calls could make the transaction fail due to optimistic locking - it would complicate things rather than simplify them.
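The em.find(...) approach asked about above can be sketched like this (a fragment, not runnable outside a container; it assumes the em and Node from the question, and that Node exposes getSerial()/getName()):

```java
// Sketch only: upsert via em.find instead of merging a detached instance.
public void updateNode(Node detachedNode) {
    // serial is the @Id, so em.find does a primary-key lookup
    Node managed = em.find(Node.class, detachedNode.getSerial());
    if (managed == null) {
        em.persist(detachedNode);                 // first sighting: insert
    } else {
        managed.setName(detachedNode.getName());  // copy state onto the managed
        // instance; the @Version column is checked and bumped at flush/commit
    }
}
```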

Related

Multi-Tenant Sequence generator in Hibernate

I have a Client entity with orgId and clientId as a composite key. When I insert a new client object, I have to generate the clientId sequentially for each orgId. To do that, I maintain the last clientId of every orgId in a separate table and generate a new one by selecting that value, adding 1, and updating it.
@Entity
@Table(name = "ftb_client")
public class Client implements Serializable {
    @Id
    @Column(name = "ORG_ID")
    protected String orgId;
    @Id
    @Column(name = "CLIENT_ID")
    protected int clientId;
    @Column(name = "CLIENT_NAME_ENG")
    private String clientNameEng;
    //....
}

@Entity
@Table
public class MySeq implements Serializable {
    @Id
    protected String orgId;
    private int lastClientId;
    //....
}
public Long getNewClientId(String orgId) {
    MySeq mySeq = getSession()
            .createQuery("from MySeq where orgId = :orgId", MySeq.class)
            .setParameter("orgId", orgId)
            .setLockMode(LockModeType.PESSIMISTIC_WRITE)
            .uniqueResult();
    mySeq.setLastClientId(mySeq.getLastClientId() + 1);
    return (long) mySeq.getLastClientId();
}
But this leads to duplicate id generation when there are thousands of concurrent requests. To make it thread-safe I have to use pessimistic locking, so that multiple requests do not generate the same clientId. But then the problem is that the lock is not released until the transaction ends, and concurrent requests stay pending for a long time.
If, instead of using a lock, I could use a separate sequence per orgId, I could make id generation concurrent as well. I would like to determine the sequence name at runtime, e.g. client_sequence_[orgId], and execute that sequence manually to generate the id.
And I also want to make it database-independent, or at least work on Oracle, MySQL, and Postgres.
I want to know whether this is possible, or whether there is any other approach.
It doesn't matter whether you use PESSIMISTIC_WRITE or not; a lock will be acquired anyway if you update the entity. The difference is that in the case you describe the lock is acquired eagerly, which prevents lost writes.
Usually this is solved by running the sequence increment in a separate transaction. To improve performance, you should increment by a batching factor, e.g. 10, and keep those 10 values in an in-memory queue to serve from. When the queue is empty, you fetch another 10 values, and so on.
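The chunk-and-queue idea can be illustrated with a plain-Java sketch (ChunkedIdAllocator and chunkStartFetcher are made-up names; in a real application the fetcher would run the table increment in its own, short transaction):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.LongSupplier;

// Serves ids from an in-memory pool; when the pool runs dry, it reserves the
// next block of chunkSize values with a single call to the fetcher.
class ChunkedIdAllocator {
    private final Deque<Long> pool = new ArrayDeque<>();
    private final LongSupplier chunkStartFetcher; // reserves [start, start + chunkSize)
    private final int chunkSize;

    ChunkedIdAllocator(LongSupplier chunkStartFetcher, int chunkSize) {
        this.chunkStartFetcher = chunkStartFetcher;
        this.chunkSize = chunkSize;
    }

    synchronized long nextId() {
        if (pool.isEmpty()) {
            long start = chunkStartFetcher.getAsLong();
            for (long i = 0; i < chunkSize; i++) {
                pool.add(start + i);
            }
        }
        return pool.poll();
    }
}
```

This way the row in the sequence table is locked once per chunkSize ids instead of once per id.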
Hibernate implements this behind the scenes with org.hibernate.id.enhanced.TableGenerator together with org.hibernate.id.enhanced.PooledOptimizer. So if you know the sequences you need upfront, I would recommend using these tools for that purpose. You can also implement something similar yourself if you like.
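If plain JPA is enough, a similar pooled, table-backed generator can be declared with annotations; a hedged sketch (the generator, table, and column names here are invented, and note this declares one shared sequence rather than one per orgId):

```java
@Id
@GeneratedValue(strategy = GenerationType.TABLE, generator = "client_id_gen")
@TableGenerator(
    name = "client_id_gen",
    table = "id_sequences",           // one row per generator name
    pkColumnName = "sequence_name",
    valueColumnName = "next_val",
    allocationSize = 10)              // batch of ids handed out per DB round-trip
protected Long clientId;
```

This is also database-independent, since it uses an ordinary table rather than native sequences.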

H2 Database generation strategy is leaving gaps between id values

I'm working on a REST API using Spring. I have this class, whose ids are being generated automatically:
@Entity
public class Seller implements Serializable {
    private static final long serialVersionUID = 1L;
    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    private Long id;
    private String name;
    private double tasa;
    public Long getId() {
        return id;
    }
    public String getName() {
        return name;
    }
    public void setName(String name) {
        this.name = name;
    }
    public double getTasa() {
        return tasa;
    }
    public void setTasa(double tasa) {
        this.tasa = tasa;
    }
}
I added some endpoints to create, delete, and get a seller from the DB. My problem arises when I delete a seller: when I then create a new one, I expected it to get the lowest available id, but the database actually uses some kind of counter/sequence. Let me show you:
So in my second POST instruction I was expecting a JSON with id = 1; instead I received 2. I tried the TABLE and IDENTITY strategies, but the unwanted behavior continued. So my question is: how can I achieve the behavior I desire? I don't want gaps between my sellers' ids.
In general, database id generation is designed to be incremental. A generated id is not based on the content of the table; instead, it comes from a sequence (or something similar). In your example you have only a few records, but imagine a database with a lot of records: the database generates ids from a sequence precisely to avoid reading the data, which would be expensive.
If the id is not relevant to the business (like a message id in a chat), this behavior doesn't affect your process.
If the id is important, I recommend redefining the delete process; you probably need to preserve all ids, as with a customer id.
If you want to preserve the sequence and still allow deleting records, the recommendation is to generate the id yourself, but then you have to deal with problems like concurrency.
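To make the "generate the id yourself" option concrete, here is a hypothetical in-memory illustration of gap-free allocation (GapFreeIdPool is a made-up name; a real implementation would need a database-level lock or serialized transaction, which is exactly the concurrency problem mentioned above):

```java
import java.util.TreeSet;

// Hands out the lowest id not currently in use, so deleted ids are reused.
class GapFreeIdPool {
    private final TreeSet<Long> inUse = new TreeSet<>();

    synchronized long acquire() {
        long candidate = 1;
        for (long used : inUse) {
            if (used != candidate) {
                break;            // found a gap before 'used'
            }
            candidate++;
        }
        inUse.add(candidate);
        return candidate;
    }

    synchronized void release(long id) {  // e.g. on delete
        inUse.remove(id);
    }
}
```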
I tried using TABLE and IDENTITY strategies but the unwanted behavior continued.
This is not unwanted behaviour - check how primary keys are generated.
So my question is: how can I achieve the behavior I desire? I don't want gaps between my seller's ids
One way to achieve this is to not use @GeneratedValue(strategy = GenerationType.AUTO) and to set the id manually from your program, where you can apply any logic you want.
Setting the primary key manually is not recommended, though. If you want this behaviour, you can use another field, such as seller_code, instead.

How to index a boolean #Field in Hibernate Search if the field is not in the db table

I have a problem indexing a boolean @Field in Hibernate Search: when the object changes, the rest of the fields are updated in the index, but the boolean field keeps the old state of the object.
@JsonIgnore
@Field(name = "isWarning", index = Index.YES)
@SortableField(forField = "isWarning")
private boolean isWarning() {
    //some logic
}
What is the right way to approach this problem?
I assume this "logic" you mention accesses other entities. You need to tell Hibernate Search that those entities are included in the entity with the isWarning method.
Let's say the isWarning method is defined in an entity called MainEntity, and it accesses data from another entity called SomeOtherEntity.
In SomeOtherEntity, you will have the reverse side of the association:
public class SomeOtherEntity {
    @ManyToOne // Or @OneToOne, or whatever
    private MainEntity mainEntity;
}
Just add #ContainedIn and you should be good:
public class SomeOtherEntity {
    @ManyToOne // Or @OneToOne, or whatever
    @ContainedIn
    private MainEntity mainEntity;
}
Note that, unfortunately, this can have a significant impact in terms of performance if SomeOtherEntity is frequently updated: Hibernate Search will not be aware of exactly which part of SomeOtherEntity is used in MainEntity, and thus will reindex MainEntity each time SomeOtherEntity changes, even if the changes in SomeOtherEntity don't affect the result of isWarning. A ticket has been filed to address this issue, but it's still pending.

Hibernate/JPA version concurrency control and DTO/change command patterns

I would like to use @Version for optimistic concurrency control with JPA & Hibernate.
I know how it works in the typical scenario of two parallel transactions. I also know that if I have a CRUD form with a 1:1 mapping between the form and the entity, I can just pass the version along as a hidden field and use it to prevent concurrent modifications by users.
What about more interesting cases, which use DTOs or change-command patterns? Is it possible to use @Version in this scenario as well, and how?
Let me give you an example.
@Entity
public class MyEntity {
    @Id private int id;
    @Version private int version;
    private String someField;
    private String someOtherField;
    // ...
}
Now let's say two users open the GUI for this, make some modifications and save changes (not at the same time, so the transactions don't overlap).
If I pass the entire entity around, the second transaction will fail:
@Transactional
public void updateMyEntity(MyEntity newState) {
    entityManager.merge(newState);
}
That's good, but I don't like the idea of passing entities everywhere and sometimes would use DTOs, change commands etc.
For simplicity, the change command is a map, eventually used in a call like this on some service:
@Transactional
public void updateMyEntity(int entityId, int version, Map<String, Object> changes) {
    MyEntity instance = loadEntity(entityId);
    for (String field : changes.keySet()) {
        setWithReflection(instance, field, changes.get(field));
    }
    // version is unused - can I use it somehow?
}
Obviously, if two users open my GUI, both make a change, and execute it one after another, in this case both changes will be applied, and the last one will "win". I would like this scenario to detect concurrent modification as well (the second user should get an exception).
How can I achieve it?
If I understand your question correctly, all you need is a setter for your private int version field; when you update the entity, you set the version on it. Of course your DTO must always transport the version data. Eventually, you would also do something like:
MyEntity instance = loadEntity(entityId);
entityManager.detach(instance);
for (String field : changes.keySet()) {
    setWithReflection(instance, field, changes.get(field));
}
// set also the version field, if the loop above does not set it
entityManager.merge(instance);
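Alternatively, instead of detach-and-merge, the version can be checked explicitly before applying the change command. A minimal plain-Java sketch (MyEntity here is a stripped-down stand-in for the question's entity, and VersionGuard is a made-up helper; in a real JPA context you would more likely throw OptimisticLockException):

```java
import java.util.ConcurrentModificationException;

// Minimal stand-in for the question's entity; no JPA runtime involved.
class MyEntity {
    private final int version;
    MyEntity(int version) { this.version = version; }
    int getVersion() { return version; }
}

class VersionGuard {
    // Rejects the update when the version the client read is stale.
    static void checkVersion(MyEntity instance, int clientVersion) {
        if (instance.getVersion() != clientVersion) {
            throw new ConcurrentModificationException(
                "stale update: client saw version " + clientVersion
                + ", current is " + instance.getVersion());
        }
    }
}
```

Calling such a check before the reflection loop means the field copying only runs against fresh data.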

JPA EclipseLink 2 query performance

APPLICATION and ENVIRONMENT
A Java EE / JSF 2.0 / JPA enterprise application, which contains a web and an EJB module. I am generating PDF documents that contain evaluated data queried via JPA.
I am using MySQL as the database, with the MyISAM engine on all tables. The JPA provider is EclipseLink with the cache set to ALL. FetchType.EAGER is used on relationships.
AFTER RUNNING NETBEANS PROFILER
Profiler results show that the following method is called the most. In this session it was invoked 3858 times, with ~80 seconds from request to response, accounting for 80% of the CPU time. There are 680 entries in the Question table.
public Question getQuestionByAzon(String azon) {
    try {
        return (Question) em.createQuery("SELECT q FROM Question q WHERE q.azonosito = :a")
                .setParameter("a", azon)
                .getSingleResult();
    } catch (NoResultException e) {
        return null;
    }
}
The Question entity:
@Entity
@Inheritance(strategy = InheritanceType.SINGLE_TABLE)
public abstract class Question implements Serializable {
    private static final long serialVersionUID = 1L;
    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    private Long id;
    @Column(unique = true)
    private String azonosito;
    @Column(nullable = false)
    @Basic(optional = false)
    private String label;
    @Lob
    @Column(columnDefinition = "TEXT")
    private String help;
    private int quizNumber;
    private String type;
    @ManyToOne
    private Category parentQuestion;
    ...
    //getters and setters, equals() and hashCode() implementations
}
There are four entities extending Question.
The azonosito column should really be the primary key, but I don't see that as the main reason for the low performance.
I am interested in suggestions for optimization. Feel free to ask if you need more information!
EDIT See my answer summarizing the best results
Thanks in advance!
Using LAZY is a good start, I would recommend you always make everything LAZY if you are at all concerned about performance.
Also ensure that you are using weaving, (Java SE agent, or Java EE/Spring, or static), as LAZY OneToOne and ManyToOne depend on this.
Changing the Id to your other field would be a good idea, if you always query on it and it is unique. You should also check why your application keeps executing the same query over and over.
You should make the query a NamedQuery instead of using a dynamic query.
In EclipseLink you can also enable the query cache on the query (once it is a named query); this will enable cache hits on the query result.
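Such a named query with the result cache enabled might look like this (a sketch; the query name is invented, and the hint key is the string behind EclipseLink's QueryHints.QUERY_RESULTS_CACHE constant):

```java
@NamedQuery(
    name = "Question.byAzon",
    query = "SELECT q FROM Question q WHERE q.azonosito = :a",
    hints = @QueryHint(name = "eclipselink.query-results-cache", value = "true"))
```

The service method would then call em.createNamedQuery("Question.byAzon", Question.class) instead of building the JPQL string on every invocation.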
Do you have a unique index on the azonosito column in your database? Maybe that will help.
I would also suggest fetching only the fields you really need, so maybe some of them could be lazy, e.g. Category.
Since changing the fetch type of the relationship to LAZY dramatically improved the performance of your application, perhaps you don't have an index on the foreign key of that relationship. If so, you need to create it.
In this answer I will summarize what was the best solution for that particular query.
First of all, I made the azonosito column the primary key and modified my entities accordingly. This is necessary because the EclipseLink object cache works with em.find:
public Question getQuestionByAzon(String azon) {
    try {
        return em.find(Question.class, azon);
    } catch (NoResultException e) {
        return null;
    }
}
Now, instead of using a QUERY_RESULT_CACHE hint on a @NamedQuery, I configured the Question entity like this:
@Entity
@Inheritance(strategy = InheritanceType.SINGLE_TABLE)
@Cache(size = 1000, type = CacheType.FULL)
public abstract class Question implements Serializable { ... }
This means an object cache of at most 1000 entries will be maintained for all Question entities.
Profiler results (~16000 invocations):
QUERY_RESULT_CACHE: ~28000 ms
@Cache(size=1000, type=CacheType.FULL): ~7500 ms
Of course, execution times get shorter after the first execution.
