I am developing a small backend with App Engine, and I am seeing some weird behaviour after saving an entity multiple times with different values.
My code to load entities is the same for all entities - each entity gets a changeId so I can transfer only changed entities to the clients:
public class VersionableRecordHelper<T extends VersionableRecord> {

    final Class<T> clazz;

    public VersionableRecordHelper(Class<T> clazz) {
        this.clazz = clazz;
    }

    Query<T> load() {
        return ofy().load().type(clazz);
    }

    List<T> loadOrdered() {
        return load().order("changeId").list();
    }

    public List<T> loadOrdered(Long since) {
        return since == null
                ? loadOrdered()
                : load().filter("changeId >", since).order("changeId").list();
    }
}
The clients then can query all objects of a class by providing a since value. For example:
private final VersionableRecordHelper<Cat> helper
        = new VersionableRecordHelper<>(Cat.class);

// actually an @ApiMethod, simplified here
public List<Cat> getCats(Long since) {
    return helper.loadOrdered(since);
}
My Cat entity looks like the following:
@Entity
@Cache
@JsonSerialize(include = JsonSerialize.Inclusion.ALWAYS)
public class Cat extends VersionableRecord {
    // some fields, getters, setters
}

public class VersionableRecord {

    @Id
    private String id;

    @Index
    private Long changeId;

    // getters, setters and more
}
Now, if I repeat the same REST request with since == 4, I get completely different results - sometimes with changeId == 5, but also with 2, 3 or 4 - which should not even be possible!
I am completely lost here. This is what I have checked so far:
I did not change the records during the test. In fact, I completely left the records alone for more than 90 minutes.
I checked that only one app engine instance was running.
I tried to flush the memcache - but the same 2 ObjectifyCache keys keep hanging around.
The memcache service level is 'Shared'.
I checked that the value of since is not null by any chance. So the code definitely gets executed.
Currently I am using objectify version 5.0.3. From my build.gradle: compile 'com.googlecode.objectify:objectify:5.0.3'
I also made sure the entity has the correct changeId in the datastore by checking https://console.developers.google.com/project/project-id/datastore/query?authuser=0
Does anyone have a helpful idea? I also checked different types of entities - same behaviour.
Wild guess: this is related to FAQ #3:
https://code.google.com/p/objectify-appengine/wiki/FrequentlyAskedQuestions#Strange_things_are_showing_up_in_my_session_cache!_(or_missing_f
or, when googlecode dies, the third one down:
https://github.com/objectify/objectify/wiki/FrequentlyAskedQuestions
You need to have the ObjectifyFilter installed; otherwise you will bleed session data into subsequent requests. Upgrade to a more recent version of Objectify; it will give you a more explicit error (at the cost of complicating test and remote API usage, but that's a different story).
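For reference, the filter is registered in web.xml roughly like this (a minimal sketch; adjust the url-pattern to your app):

<filter>
    <filter-name>ObjectifyFilter</filter-name>
    <filter-class>com.googlecode.objectify.ObjectifyFilter</filter-class>
</filter>
<filter-mapping>
    <filter-name>ObjectifyFilter</filter-name>
    <url-pattern>/*</url-pattern>
</filter-mapping>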
If this isn't your issue, you need to describe your exact code in more detail.
Related
We have our Flink application (version 1.13.2) deployed on AWS KDA. The strategy is that we do not want the application to stop at all, so we always recover the application from a snapshot when updating the jar with new changes.
Recently, we found a problem where a lower-level POJO class is corrupted: it contains a few getters and setters with wrong names. This early mistake essentially prevents us from adding new fields to the POJO class. So we decided to rename the getters/setters directly, but that led us to the following exception after updating the application.
org.apache.flink.util.StateMigrationException: The new state serializer (org.apache.flink.api.common.typeutils.base.ListSerializer@46c65a77) must not be incompatible with the old state serializer (org.apache.flink.api.common.typeutils.base.ListSerializer@30c9146c).
    at org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend.updateRestoredStateMetaInfo(RocksDBKeyedStateBackend.java:704)
    at org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend.tryRegisterKvStateInformation(RocksDBKeyedStateBackend.java:624)
    at org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend.createInternalState(RocksDBKeyedStateBackend.java:837)
    at org.apache.flink.runtime.state.KeyedStateFactory.createInternalState(KeyedStateFactory.java:47)
    at org.apache.flink.runtime.state.ttl.TtlStateFactory.createStateAndWrapWithTtlIfEnabled(TtlStateFactory.java:71)
    at org.apache.flink.runtime.state.AbstractKeyedStateBackend.getOrCreateKeyedState(AbstractKeyedStateBackend.java:301)
    at org.apache.flink.streaming.api.operators.StreamOperatorStateHandler.getOrCreateKeyedState(StreamOperatorStateHandler.java:315)
    at org.apache.flink.streaming.api.operators.AbstractStreamOperator.getOrCreateKeyedState(AbstractStreamOperator.java:494)
    at org.apache.flink.streaming.runtime.operators.windowing.WindowOperator.open(WindowOperator.java:243)
    at org.apache.flink.streaming.runtime.tasks.OperatorChain.initializeStateAndOpenOperators(OperatorChain.java:442)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:582)
    at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.call(StreamTaskActionExecutor.java:55)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.executeRestore(StreamTask.java:562)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:647)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:537)
    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:764)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:571)
    at java.base/java.lang.Thread.run(Thread.java:829)
As far as we understand, the failure happens specifically in the two CoGroup functions we implemented. They both consume the corrupted POJO class nested in another POJO class, Session. A snippet of the CoGroup function is shown below. By the way, we are using a Google Guava list here; we are not sure whether it causes the ListSerializer problem.
public class OutputCoGroup implements CoGroupFunction<Session, Event, OutputSession> {

    @Override
    public void coGroup(Iterable<Session> sessions, Iterable<Event> events,
                        Collector<OutputSession> collector) throws Exception {
        // we are using a Google Guava list here, not sure if it causes the list serializer problem
        if (Lists.newArrayList(sessions).size() > 0) {
            ...
            if (events.iterator().hasNext()) {
                List<Event> eventList = Lists.newArrayList(events);
                ...
As we can see in the input, the session is the POJO class that contains the problematic POJO class.
public class Session {
    private problematicPojo problematicPojo;
    ...
}
The problematic POJO class has two Boolean fields whose getters/setters are named wrongly (they are literally missing the 'is' :´<). The other fields in the class can be ignored; they do not have any issues.
public class problematicPojo {

    private Boolean isA;
    private Boolean isB;
    ...

    public Boolean getA() { ... }
    public void setA(Boolean a) { ... }
    public Boolean getB() { ... }
    public void setB(Boolean b) { ... }
    ...
}
We have looked up some possible solutions:
Using the State Processor API -> AWS does not provide access to KDA snapshots, so we are not able to modify the snapshot this way.
Providing TypeInformation to the problematic POJO class -> did not seem to work.
We are now thinking of specifying a ListStateDescriptor in the CoGroup function (changing it to a RichCoGroupFunction) so we can manually update the state when recovering from a snapshot, roughly along the lines sketched below. But we could not get much insight from the official docs. Is anyone here familiar with this method and able to help us out?
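The direction we have in mind looks roughly like this (a sketch only; it is untested, the state name and field names are made up, and whether keyed state is reachable from a windowed CoGroup this way is exactly what we are unsure about):

public class OutputCoGroup extends RichCoGroupFunction<Session, Event, OutputSession> {

    private transient ListState<Session> bufferedSessions;

    @Override
    public void open(Configuration parameters) throws Exception {
        // Explicit TypeInformation instead of relying on Flink's POJO analysis
        // of the corrupted class.
        ListStateDescriptor<Session> descriptor = new ListStateDescriptor<>(
                "bufferedSessions",
                TypeInformation.of(new TypeHint<Session>() {}));
        bufferedSessions = getRuntimeContext().getListState(descriptor);
    }

    @Override
    public void coGroup(Iterable<Session> sessions, Iterable<Event> events,
                        Collector<OutputSession> collector) throws Exception {
        // ... read/migrate bufferedSessions here ...
    }
}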
Thank you!
I'm attempting to use both the @Cacheable and @PostFilter annotations in Spring. The desired behavior is that the application will cache the full, unfiltered list of Segments (it's a very small and very frequently referenced list, so performance is the desire), but a User will only have access to certain Segments based on their roles.
I started out with both @Cacheable and @PostFilter on a single method. When that wasn't working, I broke them out into two separate classes so I could have one annotation on each method. However, it behaves the same either way I do it: when User A hits the service for the first time they get their correct filtered list, but when User B hits the service next they get NO results, because the cache is only storing User A's filtered results and User B does not have access to any of them. (So the PostFilter still runs, but the cache seems to be storing the filtered list, not the full list.)
So here's the relevant code:
configuration:
@Configuration
@EnableCaching
@EnableGlobalMethodSecurity(prePostEnabled = true)
public class BcmsSecurityAutoConfiguration {

    @Bean
    public CacheManager cacheManager() {
        SimpleCacheManager cacheManager = new SimpleCacheManager();
        cacheManager.setCaches(Arrays.asList(
                new ConcurrentMapCache("bcmsSegRoles"),
                new ConcurrentMapCache("bcmsSegments")
        ));
        return cacheManager;
    }
}
Service:
@Service
public class ScopeService {

    private final ScopeRepository scopeRepository;

    public ScopeService(final ScopeRepository scopeRepository) {
        this.scopeRepository = scopeRepository;
    }

    // Filters the list of segments based on User Roles. A User will have 1 role for each
    // segment they have access to, and then it's just a simple equality check between
    // the role and the Segment model.
    @PostFilter(value = "@bcmsSecurityService.canAccessSegment( principal, filterObject )")
    public List<BusinessSegment> getSegments() {
        List<BusinessSegment> segments = scopeRepository.getSegments();
        return segments; // Debugging shows 4 results for User A (post-filtered to 1), and 1 result for User B (post-filtered to 0)
    }
}
Repository:
@Repository
public class ScopeRepository {

    private final ScopeDao scopeDao; // This is a MyBatis interface.

    public ScopeRepository(final ScopeDao scopeDao) {
        this.scopeDao = scopeDao;
    }

    @Cacheable(value = "bcmsSegments")
    public List<BusinessSegment> getSegments() {
        List<BusinessSegment> segments = scopeDao.getSegments(); // Simple SELECT * FROM TABLE; works as expected.
        return segments; // Shows 4 results for User A; breakpoint not hit for User B because the cache takes over.
    }
}
Does anyone know why the cache seems to be storing the result of the Service method after the filter runs, rather than storing the full result set at the Repository level as I expect it should? Or do you have another way to achieve my desired behavior?
Bonus points if you know how I could gracefully achieve both caching and filtering on the same method in the Service. I only built the superfluous Repository because I thought splitting the methods would resolve the caching problem.
Turns out that the contents of Spring caches are mutable, and the @PostFilter annotation modifies the returned list in place; it does not filter into a new one.
So when @PostFilter ran after my Service method call above, it was actually removing items from the list stored in the cache. The second request therefore only had 1 result to start with, and the third would have had zero.
My solution was to modify the Service to return new ArrayList<>(scopeRepository.getSegments()); so that PostFilter wasn't changing the cached list.
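Concretely, the Service method above became (a minimal sketch of the fix; everything else stays as posted):

@PostFilter(value = "@bcmsSecurityService.canAccessSegment( principal, filterObject )")
public List<BusinessSegment> getSegments() {
    // Copy the cached list so @PostFilter prunes the copy, not the list inside the cache.
    return new ArrayList<>(scopeRepository.getSegments());
}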
(NOTE, that's not a deep clone of course, so if someone modified a Segment model upstream from the Service it would likely change in the model in the cache as well. So this may not be the best solution, but it works for my personal use case.)
I can't believe Spring Caches are mutable...
After detecting a flaw in one of my web services I tracked down the error to the following one-liner:
return this.getTemplate().getDomains().stream().anyMatch(domain -> domain.getName().equals(name));
This line was returning false when I positively knew that the list of domains contained a domain whose name was equal to the provided name. So after scratching my head for a while, I ended up splitting the whole line up to see what was going on. I got the following in my debugging session:
Please notice the following line:
List<Domain> domains2 = domains.stream().collect(Collectors.toList());
According to the debugger, domains is a list with two elements. But after applying .stream().collect(Collectors.toList()) I get a completely empty list. Correct me if I'm wrong, but from what I understand, that should be an identity operation and return the same list (or a copy of it, if we are strict). So what is going on here???
Before you ask: No, I haven't manipulated that screenshot at all.
To put this in context, this code is executed in a stateful, request-scoped EJB using JPA managed entities with field access in an extended persistence context. Here are some parts of the code relevant to the problem at hand:
@Stateful
@RequestScoped
@Consumes(MediaType.APPLICATION_JSON)
@Produces(MediaType.APPLICATION_JSON)
public class DomainResources {

    @PersistenceContext(type = PersistenceContextType.EXTENDED)
    @RequestScoped
    private EntityManager entityManager;

    public boolean templateContainsDomainWithName(String name) { // Extra code included to diagnose the problem
        MetadataTemplate template = this.getTemplate();
        List<Domain> domains = template.getDomains();
        List<Domain> domains2 = domains.stream().collect(Collectors.toList());
        List<String> names = domains.stream().map(Domain::getName).collect(Collectors.toList());
        boolean exists1 = names.contains(name);
        boolean exists2 = this.getTemplate().getDomains().stream().anyMatch(domain -> domain.getName().equals(name));
        return this.getTemplate().getDomains().stream().anyMatch(domain -> domain.getName().equals(name));
    }

    @POST
    @RolesAllowed({"root"})
    public Response createDomain(@Valid @EmptyID DomainDTO domainDTO, @Context UriInfo uriInfo) {
        if (this.getTemplate().getLastVersionState() != State.DRAFT) {
            throw new UnmodifiableTemplateException();
        } else if (templateContainsDomainWithName(domainDTO.name)) {
            throw new DuplicatedKeyException("name", domainDTO.name);
        } else {
            Domain domain = this.getTemplate().createNewDomain(domainDTO.name);
            this.entityManager.flush();
            return Response.created(uriInfo.getAbsolutePathBuilder().path(domain.getId()).build())
                    .entity(new DomainDTO(domain))
                    .type(MediaType.APPLICATION_JSON)
                    .build();
        }
    }
}
@Entity
public class MetadataTemplate extends IdentifiedObject {

    @OneToMany(cascade = CascadeType.ALL, fetch = FetchType.EAGER, mappedBy = "metadataTemplate", orphanRemoval = true)
    @OrderBy(value = "creationDate")
    private List<Version> versions = new LinkedList<>();

    @OneToMany(cascade = CascadeType.ALL, fetch = FetchType.LAZY, orphanRemoval = true)
    @OrderBy(value = "name")
    private List<Domain> domains = new LinkedList<>();

    public List<Version> getVersions() {
        return Collections.unmodifiableList(versions);
    }

    public List<Domain> getDomains() {
        return Collections.unmodifiableList(domains);
    }
}
I've included both the getVersions and getDomains methods because I have similar operations running flawlessly on versions. The only significant difference I'm able to find is that versions are eagerly fetched while domains are lazily fetched. But as far as I know the code is executed inside a transaction, and the list of domains is loaded. If it weren't, I'd get a lazy initialization exception, wouldn't I?
UPDATE: Following @Ferrybig's suggestion I've investigated the issue a bit further, and it doesn't seem to have anything to do with improper lazy loading. Even when I traverse the collection in the classic way first, I still can't get proper results using streams:
boolean found = false;
for (Domain domain : this.getTemplate().getDomains()) {
    if (domain.getName().equals(name)) {
        found = true;
    }
}

List<Domain> domains = this.getTemplate().getDomains();
long estimatedSize = domains.spliterator().estimateSize(); // This returns 0!
domains.spliterator().forEachRemaining(domain -> {
    // Execution flow never reaches this point!
});
So it seems that even when the collection has been loaded you still have that odd behavior. This seems to be a missing or empty spliterator implementation in the proxy used to manage lazy collections. What do you think?
BTW, this is deployed on Glassfish / EclipseLink
The problem here comes from a combination of somebody else's mistakes in several places. The sum of all those mistakes provokes this buggy behavior.
First mistake: Dubious inheritance. EclipseLink seems to create a proxy to manage lazy collections of type org.eclipse.persistence.indirection.IndirectList. This class extends java.util.Vector although it overrides everything but removeRange. Why on earth, dear Eclipse developers, do you extend a class to override almost everything in the parent, instead of declaring that class to implement a suitable interface (Iterable<E>, Collection<E> or List<E>)?
Second mistake: Hey, I inherit from you but don't give a $#|T about your internals. So IndirectList does its magic of lazily loading things using a delegate. But, oh my! How do I compute size? Do I use (and keep updated) the parent's elementCount field? No, of course not, I just delegate that task to my delegate... so if the parent class needs to do anything related to size, well, bad luck. Anyway, I've overridden everything... and they won't add anything new to that class, will they?
Third mistake: Encapsulation breakage. Enter Vector. In Java 1.8 this class is augmented and now provides a spliterator method to support the new stream functionality. They create a static inner class (VectorSpliterator) that lets clients traverse the vector using the shiny new API. Everything is fine until you notice that, in order to know when to finish the traversal, it uses the protected instance variable elementCount instead of the public API method size(). Because who would extend a non-final class and return a size not based on elementCount? Do you see the disaster coming?
So here we are: IndirectList unwittingly inherits new functionality from Vector (remember that it probably shouldn't inherit from it in the first place), and this combination of mistakes breaks things.
Summing up: it seems that stream traversal of lazy collections won't work, even for already loaded collections, when using EclipseLink (the default JPA provider in Glassfish). Remember that these products come from the same vendor. Hooray!
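If you want to see the mechanics in isolation, here is a toy reproduction of the pattern (my own sketch, not EclipseLink code; it assumes Java 8's Vector internals):

import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Vector;

// Toy model of the IndirectList pattern: extend Vector, delegate everything to an
// internal list, and never touch the inherited elementCount/elementData fields.
class DelegatingVector<E> extends Vector<E> {
    private final List<E> delegate = new ArrayList<>();

    @Override public boolean add(E e) { return delegate.add(e); }
    @Override public int size() { return delegate.size(); }
    @Override public Iterator<E> iterator() { return delegate.iterator(); }
    // spliterator() is NOT overridden, so Vector's VectorSpliterator runs
    // and reads the stale protected elementCount field.
}

public class SpliteratorDemo {
    public static void main(String[] args) {
        DelegatingVector<String> v = new DelegatingVector<>();
        v.add("a");
        v.add("b");
        System.out.println(v.size());           // prints 2
        System.out.println(v.stream().count()); // prints 0 on Java 8
    }
}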
WORKAROUND: In case you face this problem and still want to leverage the functional programming style provided by stream(), you can make a copy of the collection so that a proper iterator is built. In my case I was able to keep all similar uses of domains as one-liners by modifying the getDomains method. I favor code readability (with functional style) over performance in this case:
public List<Domain> getDomains() {
    return Collections.unmodifiableList(new ArrayList<>(domains));
}
NOTE TO THE READER: Sorry for the sarcasm, but I hate to lose my precious development time with these things.
Thanks to @Ferrybig for the initial clue
UPDATE: Bug reported. If this has hit you, you can follow its progress at https://bugs.eclipse.org/bugs/show_bug.cgi?id=487799
I've hit a very similar issue with this code in a unit test:
Optional<ChildTable> ct = st.getChildren().stream()
        .filter(i -> i.getId().equals(20001000L))
        .findFirst();
ct.get() failed with a NoSuchElementException.
Updating EclipseLink from 2.5.2 to 2.6.2 solved this issue. You didn't mention the EclipseLink version.
I think your bug report is a duplicate of https://bugs.eclipse.org/bugs/show_bug.cgi?id=433075.
See also the unresolved bug with EclipseLink and Java 8 stream API https://bugs.eclipse.org/bugs/show_bug.cgi?id=467470.
In a GWT app I present items that can be edited by users. Loading and saving the items is performed using the GWT RequestFactory. What I now want to achieve is optimistic concurrency control: if two users concurrently edit an item, the user who saves first wins. When the second user saves his changes, the RequestFactory backend should recognize that the version or presence of the item stored in the backend has changed since it was transferred to the client, and the RequestFactory/backend should then prevent the item from being updated/saved.
I tried to implement this in the service method that is used to save the items, but this does not work, because RequestFactory hands in the items just retrieved from the backend with the user's changes already applied. The versions of these items are therefore the current versions from the backend, and a comparison is pointless.
Are there any hooks in the RequestFactory processing I could leverage to achieve the requested behaviour? Any other ideas? Or do I have to use GWT-RPC instead...
No: http://code.google.com/p/google-web-toolkit/issues/detail?id=6046
Until the proposed API is implemented (EntityLocator, in comment #1, but it's not clear to me how the version info could be reconstructed from its serialized form), you'll have to somehow send the version back to the server.
As I said in the issue, this cannot be done by simply making the version property available in the proxy and setting it. But you could add another property: getting it would always return null (or a similar nonexistent value), so that setting it on the client side to the value of the "true" version property would always produce a change. That guarantees the value will be sent to the server as part of the "property diff". On the server side, you can then handle things either in the setter (when RequestFactory applies the "property diff" and calls the setter, throw an exception if the value is different from the "true" version), or in the service methods (compare the version sent from the client - which you'd get from a different getter than the one mapped on the client, as that one must always return null - to the "true" version of the object, and raise an error if they don't match).
Something like:
@ProxyFor(MyEntity.class)
interface MyEntityProxy extends EntityProxy {
    String getServerVersion();
    String getClientVersion();
    void setClientVersion(String clientVersion);
    …
}

@Entity
class MyEntity {
    private String clientVersion;

    @Version private String serverVersion;

    public String getServerVersion() { return serverVersion; }

    public String getClientVersion() { return null; }

    public void setClientVersion(String clientVersion) {
        this.clientVersion = clientVersion;
    }

    public void checkVersion() {
        // Throw when the version sent by the client does not match the true version.
        if (!Objects.equal(serverVersion, clientVersion)) {
            throw new OptimisticConcurrencyException();
        }
    }
}
Note that I haven't tested this; it's pure theory.
We came up with another workaround for optimistic locking in our app. Since the version can't be passed with the proxy itself (as Thomas explained), we are passing it via an HTTP GET parameter to the request factory.
On the client:
MyRequestFactory factory = GWT.create(MyRequestFactory.class);
RequestTransport transport = new DefaultRequestTransport() {
    @Override
    public String getRequestUrl() {
        return super.getRequestUrl() + "?version=" + getMyVersion();
    }
};
factory.initialize(new SimpleEventBus(), transport);
On the server we create a ServiceLayerDecorator and read version from the RequestFactoryServlet.getThreadLocalRequest():
public static class MyServiceLayerDecorator extends ServiceLayerDecorator {

    @Override
    public final <T> T loadDomainObject(final Class<T> clazz, final Object domainId) {
        HttpServletRequest threadLocalRequest = RequestFactoryServlet.getThreadLocalRequest();
        String clientVersion = threadLocalRequest.getParameter("version");

        T domainObject = super.loadDomainObject(clazz, domainId);
        String serverVersion = ((HasVersion) domainObject).getVersion();

        if (versionMismatch(serverVersion, clientVersion)) {
            report("Version error!");
        }

        return domainObject;
    }
}
The advantage is that loadDomainObject() is called before any changes are applied to the domain object by RF.
In our case we're just tracking one entity, so we're using a single version parameter, but the approach can be extended to multiple entities, for example as sketched below.
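For instance (a sketch only; the versions parameter format and the helper are our invention, not RequestFactory API), the client could append one id:version pair per entity, e.g. "?versions=17:3,42:7", and the decorator could pick out the entry matching domainId:

// Hypothetical helper inside MyServiceLayerDecorator: parse a "versions"
// parameter of comma-separated id:version pairs.
private static Map<String, String> parseClientVersions(HttpServletRequest request) {
    Map<String, String> versions = new HashMap<>();
    String param = request.getParameter("versions");
    if (param != null && !param.isEmpty()) {
        for (String pair : param.split(",")) {
            String[] idAndVersion = pair.split(":", 2);
            versions.put(idAndVersion[0], idAndVersion[1]);
        }
    }
    return versions;
}

loadDomainObject() would then look up parseClientVersions(threadLocalRequest).get(String.valueOf(domainId)) instead of the single version parameter.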
In my application I have to simulate various situations for analysis, and thus insert a very large number of rows into a database. (We're talking about a very large amount of data... several billion rows.)
Model
@Entity
public class Case extends Model {
    public String url;
}
Job
public class Simulator extends Job {

    public void doJob() {
        for (int i = 0; i != count; i++) { // upper bound elided in the original
            // Somestuff
            new Case(someString).save();
        }
    }
}
After half an hour, there is still nothing in the database. But debug traces show Play inserting some stuff, so I suspect some kind of cache.
I've tried just about everything:
Model.em().flush();
Changes nothing.
Model.em().getTransaction().commit();
throws TransactionRequiredException: no transaction is in progress
Model.em().setFlushMode(FlushModeType.COMMIT);
Model.em().setFlushMode(FlushModeType.AUTO);
Changes nothing.
I've also tried @NoTransaction annotations everywhere:
Class & functions in Controller
Class Case
Overriding save method in Model
Class & functions of my Job
Getting quite desperate. Every kind of advice is welcome.
EDIT: After a little research, the first row appeared in the database. The associated ID is about 550,000. That means about half a million rows are somewhere in between my application and the database.
Try
em.getTransaction().begin();
em.persist(model);
em.getTransaction().commit();
You can't commit a transaction before you begin it.
As per the documentation, the job should have its own transaction enabled, as Play requests do, so that's not the issue. Try doing this:
for (int i = 0; i != count; i++) { // upper bound elided in the original
    // Somestuff
    Case tmp = new Case(someString);
    tmp = JPA.em().merge(tmp);
    tmp.save();
}
The idea is that you add the newly created object to the EntityManager explicitly before saving, making sure the object is part of the "dirty objects" that will be persisted.
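Independently of that, for inserts of this size it usually helps to flush and clear the persistence context in batches so it doesn't grow without bound. A rough sketch (the batch size and loop bound are arbitrary placeholders):

int batchSize = 1000; // arbitrary; tune for your setup
for (int i = 0; i < total; i++) {
    Case tmp = new Case(someString);
    JPA.em().persist(tmp);
    if (i % batchSize == 0) {
        JPA.em().flush(); // push the pending INSERTs to the database
        JPA.em().clear(); // detach the inserted entities so the context stays small
    }
}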
You need to instruct Play! when it should run your job by annotating your class with one of these annotations: @OnApplicationStart, @Every or @On.
Please check the Play! documentation on jobs.