Using Objectify to concurrently write data on GAE - java

Let's say, for example, that I have the following Objectify model:
@Cache
@Entity
public class CompanyViews implements Serializable, Persistence {
    @Id
    private Long id;
    private Date created;
    private Date modified;
    private Long companyId;
    ........
    private Integer counter;
    ........

    @Override
    public void persist() {
        persist(false);
    }

    @Override
    public void persist(Boolean async) {
        ObjectifyService.register(Feedback.class);
        // setup some variables
        setUuid(UUID.randomUUID().toString().toUpperCase());
        setModified(new Date());
        if (getCreated() == null) {
            setCreated(new Date());
        }
        // do the persist
        if (async) {
            ofy().save().entity(this);
        } else {
            ofy().save().entity(this).now();
        }
    }
}
I want to use the counter field to track the number of views, the number of opens, or basically to count something using an integer field.
What happens now is that for one GAE instance, the following will be called:
A:
CompanyViews views = CompanyViews.findByCompanyId(...);
views.setCounter(views.getCounter() + 1);
views.persist();
and for another instance:
B:
CompanyViews views = CompanyViews.findByCompanyId(...);
views.setCounter(views.getCounter() + 1);
views.persist();
If they both read the counter at the same time or read the counter before the other instance has persisted it, they will overwrite each other.
In MySQL / Postgres you get row-level locking, how does one do a "row-level lock" for Objectify entities on GAE?

You need to use transactions when concurrently updating entities.
Note that since you are updating the same entity, you will be limited to roughly 1 write/s. To work around that, look into sharded counters.
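A minimal sketch of the transactional increment, reusing the names from the question. Note that inside a Datastore transaction entities are loaded by key, so a query-based lookup like findByCompanyId would need to resolve the key first; the viewsId below is a hypothetical Datastore ID standing in for that step. Objectify retries the work if a concurrent commit conflicts, which is what gives you the row-level-lock-like behaviour:

// Runs the closure in a Datastore transaction; a conflicting concurrent
// commit causes the whole closure to be retried with fresh data
ofy().transact(() -> {
    CompanyViews views = ofy().load().type(CompanyViews.class).id(viewsId).now();
    views.setCounter(views.getCounter() + 1);
    ofy().save().entity(views).now();
});

With sharded counters, each shard is its own entity incremented the same way, and reads sum the values across all shards.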

Related

Hibernate/Spring - excessive memory usage in org.hibernate.engine.internal.StatefulPersistenceContext

I have an application that uses Hibernate and it's running out of memory with a medium volume dataset (~3 million records). When analysing the memory dump using Eclipse's Memory Analyser I can see that StatefulPersistenceContext appears to be holding a copy of the record in memory in addition to the object itself, doubling the memory usage.
I'm able to reproduce this on a slightly smaller scale with a defined workflow, but am unable to simplify it to the level that I can put the full application here. The workflow is:
Insert ~400,000 records (Fruit) into the database from a file
Get all of the Fruits from the database and find if there are any complementary items to create ~150,000 Baskets (containing two Fruits)
Retrieve all of the data - Fruits & Baskets - and save to a file
It's running out of memory at the final stage, and the heap dump shows StatefulPersistenceContext has hundreds of thousands of Fruits in memory, in addition to the Fruits we retrieved to save to the file.
I've looked around online and the suggestion appears to be to use QueryHints.READ_ONLY on the query (I put it on the getAll), or to wrap it in a transaction with the readOnly property set, but neither of these seems to have stopped the massive StatefulPersistenceContext.
Is there something else I should be looking at?
Examples of the classes / queries I'm using:
public interface ShoppingService {
    public void createBaskets();
    public void loadFromFile(ObjectInput input);
    public void saveToFile(ObjectOutput output);
}
@Service
public class ShoppingServiceImpl implements ShoppingService {
    @Autowired
    private FruitDAO fDAO;
    @Autowired
    private BasketDAO bDAO;

    @Override
    public void createBaskets() {
        bDAO.add(Basket.generate(fDAO.getAll()));
    }

    @Override
    public void loadFromFile(ObjectInput input) {
        SavedState state = ((SavedState) input.readObject());
        fDAO.add(state.getFruits());
        bDAO.add(state.getBaskets());
    }

    @Override
    public void saveToFile(ObjectOutput output) {
        output.writeObject(new SavedState(fDAO.getAll(), bDAO.getAll()));
    }
    public static void main(String[] args) throws Throwable {
        ShoppingService service = null; // placeholder; the real application obtains the Spring-managed bean
        try (ObjectInput input = new ObjectInputStream(new FileInputStream("path\\to\\input\\file"))) {
            service.loadFromFile(input);
        }
        service.createBaskets();
        try (ObjectOutput output = new ObjectOutputStream(new FileOutputStream("path\\to\\output\\file"))) {
            service.saveToFile(output);
        }
    }
}
@Entity
public class Fruit {
    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE)
    private Long id;
    private String name;
    // ~ 200 string fields
}

public interface FruitDAO {
    public void add(Collection<Fruit> elements);
    public List<Fruit> getAll();
}

@Repository
public class JPAFruitDAO implements FruitDAO {
    @PersistenceContext
    private EntityManager em;

    @Override
    @Transactional
    public void add(Collection<Fruit> elements) {
        elements.forEach(em::persist);
    }

    @Override
    public List<Fruit> getAll() {
        return em.createQuery("FROM Fruit", Fruit.class).getResultList();
    }
}

@Entity
public class Basket {
    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE)
    private Long id;
    @OneToOne
    @JoinColumn(name = "arow")
    private Fruit aRow;
    @OneToOne
    @JoinColumn(name = "brow")
    private Fruit bRow;

    public static Collection<Basket> generate(List<Fruit> fruits) {
        // Some complicated business logic that does things
        return null;
    }
}

public interface BasketDAO {
    public void add(Collection<Basket> elements);
    public List<Basket> getAll();
}

@Repository
public class JPABasketDAO implements BasketDAO {
    @PersistenceContext
    private EntityManager em;

    @Override
    @Transactional
    public void add(Collection<Basket> elements) {
        elements.forEach(em::persist);
    }

    @Override
    public List<Basket> getAll() {
        return em.createQuery("FROM Basket", Basket.class).getResultList();
    }
}

public class SavedState {
    private Collection<Fruit> fruits;
    private Collection<Basket> baskets;
}
Have a look at this answer here... How does Hibernate detect dirty state of an entity object?
Without access to the heap dump or your complete code, I believe you are seeing exactly what you say you are seeing. As long as Hibernate believes it is possible that the entities will change, it keeps a complete copy of each one in memory so that it can compare the current state of the object to the state as it was originally loaded from the database. Then, at the end of the transaction (the transactional block of code), it automatically writes the changes to the database. To do this, it needs to know what the state of the object used to be, in order to avoid a large number of (potentially expensive) write operations.
I believe that making the transaction block read-only is a step in the right direction. I am not completely sure, but I hope the information here helps you at least understand why you are seeing large memory consumption.
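For reference, a sketch of that read-only transaction approach applied to the DAO from the question; how much it helps depends on the Spring version, since full Session-level propagation of the flag arrived around Spring 5.1:

@Override
@Transactional(readOnly = true)
public List<Fruit> getAll() {
    // With Spring 5.1+ the readOnly flag is propagated to the underlying
    // Hibernate Session, so Fruits loaded here are not snapshotted for dirty checking
    return em.createQuery("FROM Fruit", Fruit.class).getResultList();
}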
1: Fetching all Fruits from the database at once, or persisting a large set of Baskets in one go, will hurt both database and application performance because of the huge number of objects held in heap memory (young gen + old gen, depending on how long objects survive). Process the data in batches instead of handling everything at once.
Use Spring Batch, or implement custom logic, to process the data in chunks.
2: The persistence context stores newly created and modified entities in memory. Hibernate sends these changes to the database when the transaction is synchronized, which generally happens at the end of a transaction; however, calling EntityManager.flush() also triggers a synchronization.
Secondly, the persistence context serves as an entity cache, also referred to as the first-level cache. To clear the entities in the persistence context, call EntityManager.clear().
You can take a reference for batch processing from here; a sketch follows after item 3.
3: If you don't plan on modifying Fruit, you can fetch the entities in read-only mode: Hibernate will not retain the dehydrated state it normally uses for its dirty-checking mechanism, so you get roughly half the memory footprint. A sketch of both the chunked processing and the read-only fetch follows below.
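A rough sketch of both suggestions, assuming the question's EntityManager and a fruits list being persisted; the batch size is an arbitrary example value:

// Chunked writes: flush and clear periodically so the persistence
// context does not accumulate every entity in memory
int batchSize = 50; // example value; align with hibernate.jdbc.batch_size
for (int i = 0; i < fruits.size(); i++) {
    em.persist(fruits.get(i));
    if (i > 0 && i % batchSize == 0) {
        em.flush();  // push the pending inserts to the database
        em.clear();  // detach the entities so they can be garbage collected
    }
}

// Read-only fetch: Hibernate skips the dehydrated copy it normally
// keeps for dirty checking, roughly halving the footprint per entity
List<Fruit> allFruits = em.createQuery("FROM Fruit", Fruit.class)
        .setHint("org.hibernate.readOnly", true)
        .getResultList();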
Quick solution: if you only execute this method once to create the database, increase the JVM's -Xmx value.
Real solution: when you try to persist everything in a single transaction, all of the data is kept in memory until the commit, and memory is easily exhausted. Instead, save the data in parts, for example:
EntityManager em = ...;
try {
    for (Fruit fruit : fruits) {
        try {
            em.getTransaction().begin();
            em.persist(fruit);
            em.getTransaction().commit();
        } finally {
            if (em.getTransaction().isActive()) {
                em.getTransaction().rollback();
            }
        }
    }
} finally {
    if (em.isOpen()) {
        em.close(); // close once, after all entities are saved
    }
}

Axon: Create and Save another Aggregate in Saga after creation of an Aggregate

Update: The issue seems to be the id that I'm using twice, or in other words, the id from the product entity that I want to use for the productinventory entity. As soon as I generate a new id for the productinventory entity, it seems to work fine. But I want to have the same id for both, since they're the same product.
I have 2 Services:
ProductManagementService (saves a Product entity with product details)
1.) For saving the Product entity, I implemented an EventHandler that listens to ProductCreatedEvent and saves the product to a MySQL database.
ProductInventoryService (saves a ProductInventory entity with the stock quantities for a product, tied to a certain productId defined in ProductManagementService)
2.) For saving the ProductInventory entity, I also implemented an EventHandler, which listens to ProductInventoryCreatedEvent and saves the inventory to a MySQL database.
What I want to do:
When a new Product is created in ProductManagementService, I want to create a ProductInventory entity in ProductInventoryService directly afterwards and save it to my MySQL table. The new ProductInventory entity shall have the same id as the Product entity.
To accomplish that, I created a Saga which listens to a ProductCreatedEvent and sends a new CreateProductInventoryCommand. As soon as the CreateProductInventoryCommand triggers a ProductInventoryCreatedEvent, the EventHandler described in 2.) should catch it. Except it doesn't.
The only thing that gets saved is the Product entity, so in summary:
1.) works, 2.) doesn't. A ProductInventory aggregate does get created, but it doesn't get saved, since the saving process connected to the EventHandler isn't triggered.
I also get an exception, though the application doesn't crash: Command 'com.myApplication.apicore.command.CreateProductInventoryCommand' resulted in org.axonframework.commandhandling.CommandExecutionException(OUT_OF_RANGE: [AXONIQ-2000] Invalid sequence number 0 for aggregate 3cd71e21-3720-403b-9182-130d61760117, expected 1)
My Saga:
@Saga
@ProcessingGroup("ProductCreationSaga")
public class ProductCreationSaga {
    @Autowired
    private transient CommandGateway commandGateway;

    @StartSaga
    @SagaEventHandler(associationProperty = "productId")
    public void handle(ProductCreatedEvent event) {
        System.out.println("ProductCreationSaga, SagaEventHandler, ProductCreatedEvent");
        String productInventoryId = event.productId;
        SagaLifecycle.associateWith("productInventoryId", productInventoryId);
        // takes ID from product entity and sets all 3 stock attributes to zero
        commandGateway.send(new CreateProductInventoryCommand(productInventoryId, 0, 0, 0));
    }

    @SagaEventHandler(associationProperty = "productInventoryId")
    public void handle(ProductInventoryCreatedEvent event) {
        System.out.println("ProductCreationSaga, SagaEventHandler, ProductInventoryCreatedEvent");
        SagaLifecycle.end();
    }
}
The EventHandler that works as intended and saves a Product Entity:
@Component
public class ProductPersistenceService {
    @Autowired
    private ProductEntityRepository productRepository;

    // works as intended
    @EventHandler
    void on(ProductCreatedEvent event) {
        System.out.println("ProductPersistenceService, EventHandler, ProductCreatedEvent");
        ProductEntity entity = new ProductEntity(event.productId, event.productName, event.productDescription, event.productPrice);
        productRepository.save(entity);
    }

    @EventHandler
    void on(ProductNameChangedEvent event) {
        System.out.println("ProductPersistenceService, EventHandler, ProductNameChangedEvent");
        ProductEntity existingEntity = productRepository.findById(event.productId).get();
        ProductEntity entity = new ProductEntity(event.productId, event.productName, existingEntity.getProductDescription(), existingEntity.getProductPrice());
        productRepository.save(entity);
    }
}
The EventHandler that should save a ProductInventory Entity, but doesn't:
@Component
public class ProductInventoryPersistenceService {
    @Autowired
    private ProductInventoryEntityRepository productInventoryRepository;

    // doesn't work
    @EventHandler
    void on(ProductInventoryCreatedEvent event) {
        System.out.println("ProductInventoryPersistenceService, EventHandler, ProductInventoryCreatedEvent");
        ProductInventoryEntity entity = new ProductInventoryEntity(event.productInventoryId, event.physicalStock, event.reservedStock, event.availableStock);
        System.out.println(entity.toString());
        productInventoryRepository.save(entity);
    }
}
Product-Aggregate:
@Aggregate
public class Product {
    @AggregateIdentifier
    private String productId;
    private String productName;
    private String productDescription;
    private double productPrice;

    public Product() {
    }

    @CommandHandler
    public Product(CreateProductCommand command) {
        System.out.println("Product, CommandHandler, CreateProductCommand");
        AggregateLifecycle.apply(new ProductCreatedEvent(command.productId, command.productName, command.productDescription, command.productPrice));
    }

    @EventSourcingHandler
    protected void on(ProductCreatedEvent event) {
        System.out.println("Product, EventSourcingHandler, ProductCreatedEvent");
        this.productId = event.productId;
        this.productName = event.productName;
        this.productDescription = event.productDescription;
        this.productPrice = event.productPrice;
    }
}
ProductInventory-Aggregate:
@Aggregate
public class ProductInventory {
    @AggregateIdentifier
    private String productInventoryId;
    private int physicalStock;
    private int reservedStock;
    private int availableStock;

    public ProductInventory() {
    }

    @CommandHandler
    public ProductInventory(CreateProductInventoryCommand command) {
        System.out.println("ProductInventory, CommandHandler, CreateProductInventoryCommand");
        AggregateLifecycle.apply(new ProductInventoryCreatedEvent(command.productInventoryId, command.physicalStock, command.reservedStock, command.availableStock));
    }

    @EventSourcingHandler
    protected void on(ProductInventoryCreatedEvent event) {
        System.out.println("ProductInventory, EventSourcingHandler, ProductInventoryCreatedEvent");
        this.productInventoryId = event.productInventoryId;
        this.physicalStock = event.physicalStock;
        this.reservedStock = event.reservedStock;
        this.availableStock = event.availableStock;
    }
}
What you are noticing right now is the uniqueness requirement of the [aggregate identifier, sequence number] pair within a given Event Store. This requirement is in place to safeguard you from potential concurrent access to the same aggregate instance, as several events for the same aggregate all need a unique overall sequence number. This number is furthermore used to determine the order in which events need to be handled, guaranteeing that the Aggregate is recreated consistently.
So, you might think this amounts to a "sorry, there is no solution in place", but that is luckily not the case. There are roughly three things you can do in this setup:
Live with the fact that both aggregates will have unique identifiers.
Use distinct bounded contexts between both applications.
Change the way aggregate identifiers are written.
Option 1 is arguably the most pragmatic and used by the majority. You have however noted that reusing the identifier is necessary, so I am assuming you have already disregarded this as an option entirely. Regardless, I would try to revisit this approach, as using UUIDs by default for each new entity you create can save you from trouble in the future.
Option 2 reflects the Bounded Context notion pulled in by DDD. Letting the Product aggregate and the ProductInventory aggregate reside in distinct contexts means you will have distinct event stores for both. Thus the uniqueness constraint is kept, as no single store contains both aggregates' event streams. Whether this approach is feasible depends on whether both aggregates actually belong to separate contexts. If they do, you could for example use Axon Server's multi-context support to create two distinct applications.
Option 3 requires a little insight into what Axon does. When it stores an event, it invokes the toString() method on the @AggregateIdentifier annotated field within the Aggregate. As your @AggregateIdentifier annotated field is a String, you are given the identifier as is. What you could do is use typed identifiers, whose toString() method doesn't return just the identifier but appends the aggregate type to it. Doing so makes the stored aggregateIdentifier unique, whereas from the usage perspective it still seems like you are reusing the identifier.
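A minimal sketch of such a typed identifier, reusing names from the question (the "ProductInventory-" prefix is an arbitrary choice):

public class ProductInventoryId {
    private final String identifier;

    public ProductInventoryId(String identifier) {
        this.identifier = identifier;
    }

    @Override
    public String toString() {
        // Axon stores this value as the aggregateIdentifier, so prefixing the
        // aggregate type keeps the two event streams from colliding while both
        // aggregates still share the same underlying product id
        return "ProductInventory-" + identifier;
    }
}

The @AggregateIdentifier annotated field in ProductInventory would then be of type ProductInventoryId rather than String.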
Which of the three options suits your solution best is hard to deduce from my perspective; I have ordered them from most to least reasonable as I see them.
Hoping this will help you further, @Jan!

Spring JPA inserting to wrong column when saving multiple entities

I have encountered an issue where I save a list of entities all at once, and sometimes some rows have values written to the wrong columns.
Basically, I have a Movie entity, which extends Show (annotated with @MappedSuperclass), which extends TraceableEntity, also annotated with @MappedSuperclass, as shown below:
@MappedSuperclass
@EntityListeners(TraceableEntity.TraceableEntityListener.class)
public abstract class TraceableEntity {
    @Column
    private Date createdOn;
    @Column
    private Date dateUpdated;

    public Date getCreatedOn() {
        return createdOn;
    }

    public void setCreatedOn(Date createdOn) {
        this.createdOn = createdOn;
    }

    public Date getDateUpdated() {
        return dateUpdated;
    }

    public void setDateUpdated(Date dateUpdated) {
        this.dateUpdated = dateUpdated;
    }

    public static class TraceableEntityListener {
        @PrePersist
        public void beforeInsert(TraceableEntity entity) {
            if (entity.getCreatedOn() == null) {
                entity.setCreatedOn(new Date());
            }
        }

        @PreUpdate
        public void beforeUpdate(TraceableEntity entity) {
            entity.setDateUpdated(new Date());
        }
    }
}
Now, on some occasions, the value of createdOn ends up in dateUpdated, as shown in this screenshot.
In a nutshell, my application is a scraper that retrieves data from an API. I'm using RestTemplate inside CompletableFuture to download data concurrently and then save everything in one go. The method in which .save(...) is invoked is annotated with @Transactional. When the size of the list is under approximately 1500, saving works fine; things seem to go wrong when the size exceeds 1500 for some reason. I'd really appreciate your time and help in this matter!
Does it always happen in the same place? Are you missing some rows, for example? Perhaps the text of some of the content you're scraping has special characters that were not escaped properly? You may want to turn on your logging to see exactly what is being sent to the server.
If you scrape just the "problematic" movies and everything turns out fine, that is probably because the actual problem occurred earlier in the batch, before the movies in question.

Spring-Boot: a #Component class to hold a List of Database table objects

I need to find a proper solution to have a Spring-Boot @Component (singleton) class hold a List of database table objects that can be accessed throughout the life of the application. I need to get the value of a certain language column (there could be many language columns) depending on the parameters.
My idea was to do it like this:
@Component
public class CardTypeValueComponent {
    private List<CardTypesTabModel> listOfCardTypes;
    private CardTypesModelRepository cardTypesModelRepository;
    private static final String UNKNOWN = "-";

    @Autowired
    public CardTypeValueComponent(CardTypesModelRepository cardTypesModelRepository) {
        Assert.notNull(cardTypesModelRepository, "CardTypesModelRepository cannot be null");
        this.cardTypesModelRepository = cardTypesModelRepository;
    }

    @PostConstruct
    private void getAllCardTypesFromDb() {
        this.listOfCardTypes = cardTypesModelRepository.findAll();
    }

    public String getCardTypeLanguageValue(int cardType, String language) {
        String cardTypeLangValue = UNKNOWN;
        for (CardTypesTabModel cardTypesTabModel : listOfCardTypes) {
            if (cardTypesTabModel.getTypeId() == cardType) {
                cardTypeLangValue = "spanish".equals(language)
                        ? cardTypesTabModel.getSpanishValue()
                        : cardTypesTabModel.getEnglishValue();
                break; // stop at the first matching type
            }
        }
        return cardTypeLangValue;
    }
}
Is it a proper way of completing such a task whilst keeping in mind that the table object column count could increase in the future?
Excuse me for the pseudo code. Thanks.
Added more details:
CardTypesTabModel Entity class:
@Entity
public class CardTypesTabModel {
    private int type;
    private String englishValue;
    private String spanishValue;
    // other values, getters & setters
}
What you're trying to do is reinvent the caching mechanism.
You may consider relying on the Spring Cache Abstraction (http://docs.spring.io/spring/docs/current/spring-framework-reference/html/cache.html) and then choosing JCache (JSR-107) as the implementation.
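A minimal sketch of that approach, reusing the repository and entity names from the question; the cache name "cardTypes" is an arbitrary choice, and @EnableCaching plus a cache provider (e.g. a JCache implementation) must be configured:

import java.util.List;
import org.springframework.cache.annotation.Cacheable;
import org.springframework.stereotype.Service;

@Service
public class CardTypeService {
    private final CardTypesModelRepository cardTypesModelRepository;

    public CardTypeService(CardTypesModelRepository cardTypesModelRepository) {
        this.cardTypesModelRepository = cardTypesModelRepository;
    }

    @Cacheable("cardTypes") // the result is cached; the database is hit only on a cache miss
    public List<CardTypesTabModel> getAllCardTypes() {
        return cardTypesModelRepository.findAll();
    }
}

Compared to the hand-rolled @PostConstruct list, this also gives you eviction and refresh for free via the cache provider's configuration.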

Multiple calls to entities "setter" methods by EclipseLink

Can somebody explain this behaviour?
Given an Entity MyEntity below, the following code
EntityManagerFactory emf = Persistence.createEntityManagerFactory("greetingPU");
EntityManager em = emf.createEntityManager();
MyEntity e = new MyEntity();
e.setMessage1("hello"); e.setMessage2("world");
em.getTransaction().begin();
em.persist(e);
System.out.println("-- Before commit --");
em.getTransaction().commit();
System.out.println("-- After commit --");
results in output indicating multiple calls to the "setter" methods of MyEntity by EclipseLink's EntityManager or its associates. Is this behaviour to be expected? Possibly for some internal performance or structural reasons? Do other JPA implementations show the same behaviour?
-- Before commit --
setId
setId
setMessage1
setMessage2
setId
setMessage1
setMessage2
-- After commit --
There seem to be two different kinds of reassignments. First, an initial set of the Id. Second, two consecutive settings of the whole Entity.
Debugging shows that all calls of a given "setter" have the same object as their parameter.
@Entity
public class MyEntity {
    private Long id;
    private String message1;
    private String message2;

    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE)
    public Long getId() { return id; }

    public void setId(Long i) {
        System.out.println("setId");
        id = i;
    }

    public String getMessage1() { return message1; }

    public void setMessage1(String m) {
        message1 = m;
        System.out.println("setMessage1");
    }

    public String getMessage2() { return message2; }

    public void setMessage2(String m) {
        message2 = m;
        System.out.println("setMessage2");
    }
}
Are you using weaving? http://wiki.eclipse.org/EclipseLink/UserGuide/JPA/Advanced_JPA_Development/Performance/Weaving
EclipseLink must call setId once to set the generated ID on the managed entity instance. It also creates an instance and sets its values for the shared cache, which explains another round of setId and setter calls. If you are not using weaving, then because the EntityManager still exists, EclipseLink also creates a backup instance to compare against for future changes: any changes to the managed entity after the transaction commits still need to be tracked.
If this isn't desirable, weaving allows attribute change tracking to be used instead, so that backup copies aren't needed to track changes. You could also turn off the shared cache, but unless you are running into performance or stale-data issues, this is not recommended.
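A minimal sketch of enabling weaving when bootstrapping the EntityManagerFactory, reusing the persistence unit name from the question; the "static" value assumes the entity classes were woven at build time (dynamic weaving in Java SE would additionally require the EclipseLink agent):

Map<String, String> props = new HashMap<>();
// "eclipselink.weaving" accepts true / false / static
props.put("eclipselink.weaving", "static"); // assumes build-time static weaving
EntityManagerFactory emf = Persistence.createEntityManagerFactory("greetingPU", props);

With weaving active, rerunning the test above should show fewer setter invocations, since no backup copy is built for change tracking.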
