How to properly modify attached hibernate collections without too many side effects - java

I have a unidirectional one-to-many relationship between Article and Comment:
@Entity
public class Article {
    @OneToMany(orphanRemoval = true)
    @JoinColumn(name = "article_id")
    private List<Comment> comments = new ArrayList<>();
    …
}
I set orphanRemoval=true so that a "child" entity is removed once it is no longer referenced by the "parent" entity, e.g. when you remove the child entity from the corresponding collection of the parent entity.
Here is an example:
@Service
public class MyService {
    public Article modifyComment(Long articleId) {
        Article article = repository.findById(articleId);
        List<Comment> comments = article.getComments();
        // Calls a method which removes some comments from the collection based on some logic
        removeSomeComments(comments); // side effect
        modifyComments(comments); // side effect
        .....
        return repository.save(article);
    }
}
So I have some statements that perform some actions on the collection, which will then get persisted in the database. In the example above I am getting the article from the database, performing some mutations on the object, by deleting/modifying some comments and then saving it in the database.
I am not sure what the cleanest way is to modify collections of objects without having too many side effects, which lead to error-prone code (my code is more complex and requires multiple mutations on the collection).
Since I am inside a transaction, any changes to the collection (adding, deleting or modifying children) will be persisted the next time the transaction commits.
However, I tried to refactor this code and write it in more expressive functional style:
public Article modifyComment(Long articleId) {
    Article article = repository.findById(articleId);
    List<Comment> updatedComments = article.getComments().stream()
        .filter(some logic..) //remove some comments from the list based on a filter
        .sorted()
        .filter(again some logic) //do more stuff
        .collect(Collectors.toList());
    article.setComments(updatedComments); // assigns a new list instance
    return repository.save(article);
}
I like this approach more, as it is short, concise and more expressive.
However, this won't work, since it throws:
A collection with cascade="all-delete-orphan" was no longer referenced by the owning entity instance
That's because I am assigning a new list (updatedComments).
If I want to remove or modify children of the parent, I have to modify the contents of the existing list instead of assigning a new list.
So I had to do this at the end:
article.getComments().clear();
article.getComments().addAll(updatedComments);
repository.save(article);
Do you consider the second example a good practice?
I am not sure how to work with collections in JPA.
My business logic is more complex, and I want to avoid having 3-4 methods that mutate a given collection (attached to a Hibernate session) which was passed in as a parameter.
I think the second example has less potential for side effects because it doesn't mutate any input parameter. What do you think?
(I am using Spring-Boot 2.2.5)

You can turn the predicate logic used in your filter
.filter(some logic..) //remove some comments from the list based on a filter
into an argument for removeIf and perform the modification in place:
Article article = repository.findById(articleId);
article.getComments().removeIf(...inverse of some logic...); // mutates the existing list
return repository.save(article);
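As a minimal sketch of the in-place approach, assuming hypothetical plain-Java stand-ins for Article and Comment (no JPA annotations, and a made-up spam flag as the filter criterion): the "keep" predicate from the stream version can be negated and passed to removeIf, so the list instance is never replaced. The retainComments helper is an illustrative name, not part of any library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-ins for the entities in the question (JPA annotations omitted).
class Comment {
    final String text;
    final boolean spam; // made-up attribute used as the filter criterion

    Comment(String text, boolean spam) {
        this.text = text;
        this.spam = spam;
    }
}

class Article {
    // Created once and never reassigned, so the same list instance stays attached.
    private final List<Comment> comments = new ArrayList<>();

    List<Comment> getComments() {
        return comments;
    }

    // Keeps only the comments matching 'keep' by mutating the existing list in place.
    void retainComments(Predicate<Comment> keep) {
        comments.removeIf(keep.negate());
    }
}
```

With this shape the mutation lives inside the entity, and callers pass a predicate instead of receiving the attached list as a parameter to mutate.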

Related

How to cache methods that return the same list of objects, but based on different conditions?

For example, I have 3 methods. All return data from the same repository and convert it to DTO.
How should I annotate them?
Would it be OK to annotate all three with the same @Cacheable("Dishes_DTO")? And what happens when one of the methods executes after another: will it overwrite the data or create duplicates?
public List<DishResponseDTO> getAllToday() {
    List<Dish> dishList = dishRepository.getAllByDateAdded(LocalDate.now(clock));
    return dishList.stream()
        .map(DishMapper::toDishResponseDTO)
        .collect(Collectors.toList());
}
public List<DishResponseDTO> getAll() {
    List<Dish> dishList = dishRepository.findAll();
    return dishList.stream()
        .map(DishMapper::toDishResponseDTO)
        .collect(Collectors.toList());
}
public List<DishResponseDTO> getDishHistoryByRestaurant(int restaurantId) {
    return dishRepository.getAllByRestaurantId(restaurantId)
        .stream()
        .map(DishMapper::toDishResponseDTO)
        .collect(Collectors.toList());
}
If you use a single cache, each method needs its own key. One approach:
@Cacheable(value = "dishdto", key = "-1")
public List<DishResponseDTO> getAllToday();
@Cacheable(value = "dishdto", key = "-2")
public List<DishResponseDTO> getAll();
@Cacheable("dishdto")
public List<DishResponseDTO> getDishHistoryByRestaurant(int restaurantId);
This uses integer keys and assumes that restaurantId is never negative.
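To illustrate why the keys must not collide, here is a rough plain-Java model of one cache region shared by three lookups. It is not Spring's cache abstraction; DishDtoCache, its methods, and the use of String as a stand-in for the DTO type are all hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical model of a single cache shared by three lookups (not Spring code).
class DishDtoCache {
    static final int KEY_ALL_TODAY = -1; // reserved key, mirrors key = "-1"
    static final int KEY_ALL = -2;       // reserved key, mirrors key = "-2"

    private final Map<Integer, List<String>> entries = new ConcurrentHashMap<>();

    // Returns the cached list for 'key', computing it via 'loader' on a miss.
    List<String> get(int key, Supplier<List<String>> loader) {
        return entries.computeIfAbsent(key, k -> loader.get());
    }

    // Restaurant ids share the key space with the reserved keys,
    // so negative ids are rejected instead of silently colliding.
    List<String> getByRestaurant(int restaurantId, Supplier<List<String>> loader) {
        if (restaurantId < 0) {
            throw new IllegalArgumentException("restaurantId must be non-negative");
        }
        return get(restaurantId, loader);
    }
}
```

If all three lookups used the same key, whichever ran first would populate the entry and the other two would return its result.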
Your question has a lot more aspects to it:
Caching queries
List results
Time dependent results
Reporting on movable data, in general
Design problems:
Your current design holds duplicate data in the cache and in memory, because all three method results might contain the same dish.
Since it is moving data, you need to refresh the cache often, e.g. by setting an expiry parameter (or TTL) of 5 minutes on the cache. This means you will reread the same dish data, although it will probably not change any more. This can be solved by a cache within the repository or database. Still, you generate the DTO for the same data entry many times.
If things get too involved, it is better to keep separate caches for dish DTO objects and for query results.
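One possible shape for that separation, sketched in plain Java under the assumption that every dish has an integer id: the query cache stores only lists of ids, and each DTO is stored once in a second cache. TwoLevelDishCache and its method names are hypothetical, and expiry is reduced to a manual evictQuery call.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical two-level cache: query results hold only ids, while each DTO
// is stored once and shared by every query that returns it.
class TwoLevelDishCache<D> {
    private final Map<String, List<Integer>> idsByQuery = new HashMap<>();
    private final Map<Integer, D> dtoById = new HashMap<>();

    // 'queryIds' runs the query and returns the matching dish ids;
    // 'loadDto' builds the DTO for a single id. Both are supplied by the caller.
    List<D> get(String queryKey,
                Function<String, List<Integer>> queryIds,
                Function<Integer, D> loadDto) {
        List<Integer> ids = idsByQuery.computeIfAbsent(queryKey, queryIds);
        List<D> result = new ArrayList<>();
        for (Integer id : ids) {
            result.add(dtoById.computeIfAbsent(id, loadDto));
        }
        return result;
    }

    // Expiring a query result leaves the already built DTOs in place.
    void evictQuery(String queryKey) {
        idsByQuery.remove(queryKey);
    }
}
```

This sketch is not thread-safe and never evicts DTOs; a real setup would need both.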

Java streams: map with side effect and collect, or forEach and populate a result list

I have a piece of code that looks something like this.
I have read two contradicting(?) "rules" regarding this:
That .map should not have side effects
That .forEach should not update a mutable variable (so if I refactor to use forEach and populate a result list, then that breaks this rule), as mentioned in http://files.zeroturnaround.com/pdf/zt_java8_streams_cheat_sheet.pdf
How can I solve this so that I use streams and still return a list, or should I simply skip streams?
@Transactional
public Collection<Thing> save(Collection<Thing> things) {
    return things.stream().map(this::save).collect(Collectors.toList());
}
@Transactional
public Thing save(Thing thing) {
    // org.springframework.data.repository.CrudRepository.save
    // Saves a given entity. Use the returned instance for further operations as the save operation might have changed the entity instance completely.
    Thing saved = thingRepo.save(thing);
    return saved;
}
Doesn't that paper say shared mutable state? In your case, if you declare the list inside the method and then use forEach, everything is fine. The second answer here mentions exactly what you are trying to do.
There is little to no reason to collect an entirely new List if you don't mutate it at all. Besides, your use case is basically iterating over every element in a collection and saving it, which could be achieved with a simple for-each loop.
If for some reason thingRepo.save(thing) mutates the object, you can still return the same collection, but at that point it is a hidden mutation that is not clearly visible, since thingRepo.save(thing) does not suggest it.
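A minimal sketch of the for-each variant, assuming the repository call can be represented by a UnaryOperator (BatchSaver and saveOne are hypothetical names): the result list is declared inside the method, so nothing mutable is shared, and the returned instances are collected in case save hands back a different object.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.function.UnaryOperator;

class BatchSaver {
    // 'saveOne' stands in for a repository call such as thingRepo.save(thing);
    // it may return a different instance than the one passed in.
    static <T> List<T> saveAll(Collection<T> things, UnaryOperator<T> saveOne) {
        // The result list is local to this call, so no mutable state is shared.
        List<T> saved = new ArrayList<>(things.size());
        for (T thing : things) {
            saved.add(saveOne.apply(thing));
        }
        return saved;
    }
}
```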

Updating only some attributes of objects in cache

Let's say I have a class Item. Items have simple object attributes and attributes that are collections of other objects:
public class Item
{
    //Object attributes
    String name;
    int id;
    Color color;
    //Collection of object attributes
    List<Parts> parts;
    Map<Integer, Owner> ownersById;
}
I have a fairly simple web application that allows crud operations on these items. This is split up into separate operations:
a page where you can update the simple object attributes (name, id...).
a page where you can edit the collection of parts.
a page where you can edit the map of owners.
Because the server load was getting too high, I implemented a cache in the application which holds the "most recently used item objects" with their simple attributes and their collection attributes.
Whenever an edit is made to the name of an item, I want to do the following things:
Persist the change to the item's name. This is done by converting the item object to XML (without any collection attributes) and calling a web service named "updateItemData".
Update the current user's cache by updating the relevant item's name inside the cache. This way the cache stays relevant without having to load the item again after persisting it.
To do this I created the following method:
public void updateItem(Item itemWithoutCollectionData)
{
    WebServiceInvoker.updateItemService(itemWithoutCollectionData);
    Item cachedItemWithCollectionData = cache.getItemById(itemWithoutCollectionData.getId());
    cachedItemWithCollectionData.setName(itemWithoutCollectionData.getName());
    cachedItemWithCollectionData.setColor(itemWithoutCollectionData.getColor());
}
This method is very annoying because I have to copy the attributes one by one, since I cannot know beforehand which ones the user just updated. Bugs arose because the objects changed in one place but not in this piece of code. I also cannot just do the following: cachedItem = itemWithoutCollectionData; because this would make me lose the collection information, which is not present in the itemWithoutCollectionData variable.
Is there a way to either:
Iterate, perhaps by reflection, over all the non-collection attributes in a class, and thus write the code in a way that it does not matter if fields are later added to or removed from the Item class
Make a warning appear in the class that deals with the caching whenever my Item class gains a new attribute, to signal "hey, you need to update me too!"
Use an alternative which might seem a bit overkill: wrap all the non-collection attributes in a class, for example ItemSimpleData, and use that object instead of separate attributes. However, this doesn't work well with inheritance. How would you implement this method in the following structure?
classes:
public class BaseItem
{
    String name;
    int id;
}
public class ColoredItem extends BaseItem
{
    Color color;
}
There are many things that can be done to enhance what you currently have, but I am going to point out just two that may help you with your problem.
First, I am assuming that public void updateItem is a simplified version of your production code. Make sure this method is thread-safe, since thread safety is a common source of problems when it comes to caching.
Second, you mentioned:
Perhaps by reflection, to iterate over all the non-collection attributes in a class and thus write the code in a way that it does not matter if future fields are added or removed in the Item class.
If I understand the problem correctly, you can achieve this using BeanUtils.copyProperties(). Here is an example:
http://www.mkyong.com/java/how-to-use-reflection-to-copy-properties-from-pojo-to-other-java-beans/
I hope it helps.
Cheers,
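For comparison, a plain-reflection sketch of the same idea without any library. It assumes the cached object and the edited object are of the same class; SimpleFieldCopier is a hypothetical helper, and the two item classes are simplified stand-ins for the ones in the question (color is a String here).

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;

// Hypothetical classes mirroring the question's structure.
class BaseItem {
    String name;
    int id;
    List<String> parts = new ArrayList<>();
}

class ColoredItem extends BaseItem {
    String color;
}

class SimpleFieldCopier {
    // Copies every instance field that is not a Collection or Map from source
    // to target, including fields declared in superclasses.
    // Assumes target is an instance of source's class (or a subclass of it).
    static void copySimpleFields(Object source, Object target) {
        for (Class<?> type = source.getClass(); type != Object.class; type = type.getSuperclass()) {
            for (Field field : type.getDeclaredFields()) {
                int modifiers = field.getModifiers();
                if (Modifier.isStatic(modifiers) || Modifier.isFinal(modifiers)) {
                    continue; // constants and final fields are left alone
                }
                Class<?> fieldType = field.getType();
                if (Collection.class.isAssignableFrom(fieldType)
                        || Map.class.isAssignableFrom(fieldType)) {
                    continue; // keep the cached collection data untouched
                }
                try {
                    field.setAccessible(true);
                    field.set(target, field.get(source));
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException("Cannot copy field " + field.getName(), e);
                }
            }
        }
    }
}
```

Because the loop walks up the superclass chain, a field added later to BaseItem or ColoredItem is picked up without touching the caching code.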

To initialize or not initialize JPA relationship mappings?

In one-to-many JPA associations, is it considered a best practice to initialize relationships to empty collections? For example:
@Entity
public class Order {
    @Id
    private Integer id;
    // should the line items be initialized with an empty array list or not?
    @OneToMany(mappedBy = "order")
    List<LineItem> lineItems = new ArrayList<>();
}
In the above example is it better to define lineItems with a default value of an empty ArrayList or not? What are the pros and cons?
JPA itself doesn't care whether the collection is initialized or not. When retrieving an Order from the database with JPA, JPA will always return an Order with a non-null list of OrderLines.
Why: because an Order can have 0, 1 or N lines, and that is best modeled with an empty, one-sized or N-sized collection. If the collection was null, you would have to check for that everywhere in the code. For example, this simple loop would cause a NullPointerException if the list was null:
for (OrderLine line : order.getLines()) {
...
}
So it's best to make that an invariant by always having a non-null collection, even for newly created instances of the entity. That makes the production code creating new orders safer and cleaner. That also makes your unit tests, using Order instances not coming from the database, safer and cleaner.
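The invariant can be shown with a plain-Java stand-in for the entity (JPA annotations omitted so it runs without a persistence provider; PurchaseOrder and totalDescriptionLength are hypothetical names): a freshly constructed instance can be iterated without any null check.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical plain-Java version of the entity; JPA annotations are omitted.
class PurchaseOrder {
    // Initialized at construction, so getLines() never returns null.
    private List<String> lines = new ArrayList<>();

    List<String> getLines() {
        return lines;
    }

    // Sums the lengths of all line descriptions without a null check.
    int totalDescriptionLength() {
        int total = 0;
        for (String line : getLines()) {
            total += line.length();
        }
        return total;
    }
}
```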
I would also recommend using Guava's immutable collections, e.g.,
import com.google.common.collect.ImmutableList;
// ...
@OneToMany(mappedBy = "order")
List<LineItem> lineItems = ImmutableList.of();
This idiom never creates a new empty list, but reuses a single instance representing an empty list (the type does not matter). This is a very common practice of functional programming languages (Scala does this too) and reduces to zero the overhead of having empty objects instead of null values, making any efficiency argument against the idiom moot.
I would rather prefer a utility like this:
public static <T> void forEach(Collection<T> values, Consumer<T> consumer) {
if (values != null) values.stream().forEach(consumer);
}
and use it in code like:
Utils.forEach(entity.getItems(), item -> {
// deal with item
});
My suggestion would be to not initialize them.
We ran into a situation where we initialized our collections, then retrieved the same entity twice in succession. After the second retrieval, a lazy-loaded collection that should have had data was empty when we called its getter. If we called the getter after the first retrieval, on the other hand, the collection did load the data. Our theory is that the second retrieval got a managed entity from the session whose collection had been initialized to empty and therefore appeared to be already loaded or modified, so no lazy load took place. The solution was to NOT initialize the collections. This way we could retrieve the entity multiple times in the transaction and have its lazy-loaded collections load correctly.
One more item to note: in a different environment, the behavior was different. The collection was lazy loaded just fine when calling the collection's getter on the entity that was retrieved the second time in the same transaction.
Unfortunately I don't have information on what was different between the two environments. It appears - although we didn't prove it 100% and didn't identify the implementations - that different JPA implementations work differently with respect to initialized collections.
We were using hibernate - just don't know which version we were using on each of the two platforms.

How do I guarantee the order of items in a collection

I have a list of objects, and every object in the list has a position which may not change unless explicitly changed; in other words, the objects are in a queue. What collection should I use in my entity class to store these objects, and how should that collection be annotated?
I currently have this
@Entity
class Foo {
    ...
    @OneToMany(mappedBy = "foo", cascade = CascadeType.ALL)
    List<Bar> bars = new ArrayList<Bar>();
    ...
}
If this is not possible with pure JPA, I'm using EclipseLink as my JPA provider, so EclipseLink-specific annotations will be OK if nothing else helps.
EDIT: Note, people, the problem is not that Java wouldn't preserve the order (I know most collections do); the problem is that I don't know a smart way for JPA to preserve the order. Having an order id and making the query order by it is possible, but in my case maintaining the order id is laborious (because the user interface allows reordering of items), and I'm looking for a smarter way to do it.
If you want this to be ordered after round-tripping to SQL, you should provide some sort of ordering ID within the entity itself - SQL is naturally set-oriented rather than list-oriented. You can then either sort after you fetch them back, or make sure your query specifies the ordering too.
If you give the entity an auto-generated integer ID this might work, but I wouldn't like to guarantee it.
Use a sort order id, as Jon suggested, then add an @OrderBy annotation below the @OneToMany. This orders the collection by the specified field whenever it is loaded.
As an example, if you add a new field called "sortId" to Bar, Foo would look like this:
@Entity
class Foo {
    ...
    @OneToMany(mappedBy = "foo", cascade = CascadeType.ALL)
    @OrderBy("sortId ASC")
    List<Bar> bars = new ArrayList<>();
    ...
}
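Since the question calls maintaining the order id laborious, here is one way the bookkeeping could be kept in a single place. This is a plain-Java sketch with a hypothetical Bar stand-in and BarOrdering helper (no JPA annotations): after every reorder, each sortId is rewritten to the element's list index, and reloadOrdered mimics what an ORDER BY sortId ASC would return.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical plain-Java stand-in for the Bar entity with a persisted sort key.
class Bar {
    final String name;
    int sortId;

    Bar(String name) {
        this.name = name;
    }
}

class BarOrdering {
    // Moves the element at index 'from' to index 'to', then renumbers the list.
    static void move(List<Bar> bars, int from, int to) {
        Bar moved = bars.remove(from);
        bars.add(to, moved);
        renumber(bars);
    }

    // Rewrites every sortId so that it equals the element's position in the list.
    static void renumber(List<Bar> bars) {
        for (int i = 0; i < bars.size(); i++) {
            bars.get(i).sortId = i;
        }
    }

    // Mimics what ORDER BY sortId ASC would return after a reload.
    static List<Bar> reloadOrdered(List<Bar> unordered) {
        List<Bar> copy = new ArrayList<>(unordered);
        copy.sort(Comparator.comparingInt((Bar b) -> b.sortId));
        return copy;
    }
}
```

JPA 2.0 also defines an @OrderColumn annotation that asks the provider to maintain such an index column for a List; whether it fits a mappedBy association like this one is something to check against the provider's documentation.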
You can
Sort a List before creation
Sort a List after creation
Use a collection that performs a sort on insert. TreeMap, TreeSet
A LinkedList implements the Queue interface in Java and allows you to add things in the middle...
TBH most of the collections are ordered, aren't they...
Check the docs; most say whether or not they are ordered.
It's worth trying LinkedList instead of ArrayList; however, as Jon said, you need to find a way of persisting the order information.
A solution will probably involve issuing an order number to each entry and storing the entries in a SortedMap, then converting to a List if a List is what you need.
However, the ORM could potentially be clever enough to do all the conversions for you if you stored the collection as a LinkedList.
