GAE @Persistent(valueStrategy = IdGeneratorStrategy.SEQUENCE) not generating sequential numbers - java

I have a field annotated with @Persistent(valueStrategy = IdGeneratorStrategy.SEQUENCE).
When I persist entities containing it, the value does not get generated sequentially.
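Judging from the logs below, the field is DAOJobEvent.sequence; a minimal sketch of such a declaration (the enclosing class and the field type are assumptions, not from the original post):

import javax.jdo.annotations.IdGeneratorStrategy;
import javax.jdo.annotations.Persistent;

// Sketch only: field name taken from the DAOJobEvent.sequence seen in the logs.
@Persistent(valueStrategy = IdGeneratorStrategy.SEQUENCE)
private Long sequence;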
Any reason why the sequence numbers are not in order? I'll add the logs in two separate posts because they are too big for this post.
1:1422624487.413000 [s~server-test-killesk/12.381882801386260762].<stdout>: 13:28:07,413 DEBUG Request FC14E205 ValueGeneration:debug:58 - Creating ValueGenerator instance of "com.google.appengine.datanucleus.valuegenerator.SequenceGenerator" for "com.eurekaapp.server.dao.mappedobjects.DAOJobEvent.sequence"
1:1422624487.424000 [s~server-test-killesk/12.381882801386260762].<stdout>: 13:28:07,423 DEBUG Request FC14E205 ValueGeneration:debug:58 - Generated value for field "com.eurekaapp.server.dao.mappedobjects.DAOJobEvent.sequence" using strategy="sequence" (Generator="com.google.appengine.datanucleus.valuegenerator.SequenceGenerator") : value=1,050,002
1:1422624487.624000 [s~server-test-killesk/12.381882801386260762].<stdout>: 13:28:07,624 DEBUG Request FC14E205 ValueGeneration:debug:58 - Generated value for field "com.eurekaapp.server.dao.mappedobjects.DAOJobEvent.sequence" using strategy="sequence" (Generator="com.google.appengine.datanucleus.valuegenerator.SequenceGenerator") : value=1,040,003
1:1422624487.908000 [s~server-test-killesk/12.381882801386260762].<stdout>: 13:28:07,907 DEBUG Request FC14E205 ValueGeneration:debug:58 - Generated value for field "com.eurekaapp.server.dao.mappedobjects.DAOJobEvent.sequence" using strategy="sequence" (Generator="com.google.appengine.datanucleus.valuegenerator.SequenceGenerator") : value=3,010,003

IdGeneratorStrategy.SEQUENCE is implemented on top of DatastoreService.allocateIds(), which is how Cloud Datastore internally assigns IDs. Everything that applies to Cloud Datastore auto-ID allocation applies to SEQUENCE.
Sequences are only guaranteed to be unique, not monotonically increasing.
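For illustration, a sketch against the low-level App Engine Datastore API (the kind name is taken from the question's logs):

import com.google.appengine.api.datastore.DatastoreService;
import com.google.appengine.api.datastore.DatastoreServiceFactory;
import com.google.appengine.api.datastore.Key;
import com.google.appengine.api.datastore.KeyRange;

DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();
// Reserves a block of ids for the kind. Each block is unique, but blocks
// handed out to different server instances are not ordered relative to
// each other, which is why values can jump around as in the logs above.
KeyRange range = datastore.allocateIds("DAOJobEvent", 3);
for (Key key : range) {
    System.out.println(key.getId());
}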


Limiting the nested fields in Elasticsearch

I am trying to index JSON documents in Elasticsearch with dynamic mapping on. Some of the documents have an unpredictable number of keys (nesting levels), because of which I started getting this error from the ES Java API.
[ElasticsearchException[Elasticsearch exception [type=illegal_argument_exception, reason=Limit of total fields [1000] in index [my_index] has been exceeded]]]failure in bulk execution
I was wondering if there is an option that can be configured at the index level to map fields only up to a certain depth (maybe 2) and store the rest of the document as a string or in flattened form. I did come across settings like index.mapping.depth.limit, but it seems that if I set it to 2, this setting rejects any document that has more levels.
For the total fields limit:
PUT <index_name>/_settings
{
  "index.mapping.total_fields.limit": 2000
}
For the depth limit:
PUT <index_name>/_settings
{
  "index.mapping.depth.limit": 2
}
https://www.elastic.co/guide/en/elasticsearch/reference/master/mapping.html
Add this to your index _settings:
"settings": {
"index.mapping.nested_fields.limit": 150,
...
}
"mappings": {
...
}
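Separately, since the question mentions storing the rest of the document in flattened form: assuming Elasticsearch 7.3 or later, the flattened field type maps an entire JSON object as a single field, so its inner keys never count toward the field limit. A sketch, with a hypothetical field name payload:
PUT <index_name>
{
  "mappings": {
    "properties": {
      "payload": { "type": "flattened" }
    }
  }
}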

What is the right way to fetch an entry to process in my route

This is my first question here, so if my question is missing some core information, please bear with me. I'll try to add the needed info as fast as possible.
I am setting up a Camel route that retrieves a URL from my database and sends this URL to a crawler that collects some product data. The data that I receive from the crawler is then saved back to the database, into a different collection.
When I crawl another URL that contains the same product, I want to add the newly received data to the existing object in my database.
I tried to save the newly received data in a property and then set two constants to find the corresponding item in the database. After the object is found, I want to send the received data to my processor, where I add the new data to the retrieved object and then save it back to the DB.
from("direct:myRoute")
...
.setProperty("newItem").simple("${body}")
.setBody().constant("{ 'name': $.name}")
.setBody().constant("{ 'brand': $.brand}")
.to("mongodb:mongoBean?database={{db}}&collection=
{{col}}&operation=findOneByQuery")
...
The expected result is that my database object stores both the old info and the newly received info, but what I get is that { 'name': $.name} is not a valid BasicDBObject, and the same for the brand.
Update
So, I found out that setting two constants is not possible for what I wanted to achieve. I experimented a bit and was able to make it work with hard-coded values.
from("direct:myRoute")
...
.setProperty("newItem").simple("${body}")
.setBody().constant("{ 'name': 'product', 'brand': 'manufacturer'}")
.to("mongodb:mongoBean?database={{db}}&collection=
{{col}}&operation=findOneByQuery")
...
But the problem remains that I want to set 'name' and 'brand' as JSON values extracted from the body.
Update 2
I changed the code around a bit and tried the following example.
from("direct:myRoute")
...
.setProperty("newItem").simple("${body}")
.setBody().constant("{ 'name' : '{$.name}', 'brand' : '{$.brand}' }")
.to("mongodb:mongoBean?database={{db}}&collection=
{{col}}&operation=findOneByQuery")
...
I don't get an error from this, but it does not work as expected. I hoped that $.name and $.brand would get replaced with the values stored in the body, but it seems they get used as-is; constant() sets the literal string and never evaluates expressions.
2019-11-06 13:27:43.363 INFO 2132 --- [ XNIO-1 task-1] DEBUG : Exchange[ExchangePattern: InOut, BodyType: String, Body: { 'name' : '{$.name}', 'brand' : '{$.brand}' }]
OK, so we found a solution to this problem that works for our use case.
We set the name and the brand as properties and used simple to set the body from these properties.
from("direct:myRoute")
...
.setProperty("newItem").simple("${body}")
.setProperty("name").jsonpath("$.name")
.setProperty("brand").jsonpath("$.brand")
.setBody().simple("{'name':'${property.name}','brand':'${property.brand}'}")
.convertBodyTo(String.class)
.to("mongodb:mongoBean?database={{db}}&collection=
{{col}}&operation=findOneByQuery")
...
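For reference, a variant that avoids hand-building the JSON query string is to create the query document in a processor (a sketch under the same route assumptions, not the poster's code; BasicDBObject is com.mongodb.BasicDBObject):

.setProperty("newItem").simple("${body}")
.process(exchange -> {
    // Parse the saved body and build the Mongo query document directly,
    // instead of concatenating a JSON string by hand.
    String json = exchange.getProperty("newItem", String.class);
    BasicDBObject item = BasicDBObject.parse(json);
    exchange.getIn().setBody(new BasicDBObject()
            .append("name", item.get("name"))
            .append("brand", item.get("brand")));
})
.to("mongodb:mongoBean?database={{db}}&collection={{col}}&operation=findOneByQuery")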

Cassandra, Java and MANY async requests: is this good?

I'm developing a Java application with Cassandra. My table:
id      | registration | name
1       | 1            | xxx
1       | 2            | xxx
1       | 3            | xxx
2       | 1            | xxx
2       | 2            | xxx
...     | ...          | ...
100,000 | 34           | xxx
My table has a very large number of rows (more than 50,000,000). I have a list myListIds of String ids to iterate over. I could use:
SELECT * FROM table WHERE id IN (1, 7, 18, 34, ...)
// imagine more than 10,000,000 numbers in 'IN'
But this is a bad pattern. So instead I'm issuing async requests this way:
// mapFutures: key = id, value = future for the data from Cassandra
Map<String, ResultSetFuture> mapFutures = new HashMap<>();
for (String id : myListIds)
{
    ResultSetFuture resultSetFuture = session.executeAsync(statement.bind(id));
    mapFutures.put(id, resultSetFuture);
}
Then I will process my data with the getUninterruptibly() method.
Here is my problem: I'm making maybe more than 10,000,000 Cassandra requests (one request for each id), and I'm putting all these results inside a Map.
Can this cause a heap memory error? What's the best way to deal with that?
Thank you
Note: your question is "is this a good design pattern".
If you are having to perform 10,000,000 Cassandra data requests, then you have structured your data incorrectly. Ultimately you should design your database from the ground up so that you only ever have to perform 1-2 fetches.
Now, granted, if you have 5,000 Cassandra nodes this might not be a huge problem (it probably still is), but it still reeks of bad database design. I think the solution is to take a look at your schema.
I see the following problems with your code:
An overloaded Cassandra cluster: it won't be able to process so many async requests, and your requests will fail with NoHostAvailableException.
An overloaded Cassandra driver: your client app will fail with IO exceptions, because it will not be able to process so many async requests (see the details about connection tuning at https://docs.datastax.com/en/developer/java-driver/3.1/manual/pooling/).
And yes, memory issues are possible. It depends on the data size.
A possible solution is to limit the number of async requests and process the data in chunks, as in the sketch below (see also this answer).
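A minimal sketch of that approach (the chunk size is arbitrary and the names are mine; it assumes the DataStax 3.x driver objects from the question):

// Bound the number of in-flight async requests by working in chunks.
final int CHUNK = 1000; // tune against your cluster and pool settings
Map<String, Row> results = new HashMap<>();
for (int from = 0; from < myListIds.size(); from += CHUNK) {
    List<String> ids = myListIds.subList(from, Math.min(from + CHUNK, myListIds.size()));
    Map<String, ResultSetFuture> inFlight = new HashMap<>();
    for (String id : ids) {
        inFlight.put(id, session.executeAsync(statement.bind(id)));
    }
    // Drain this chunk before issuing the next one, so at most CHUNK
    // requests are outstanding at any time.
    for (Map.Entry<String, ResultSetFuture> entry : inFlight.entrySet()) {
        Row row = entry.getValue().getUninterruptibly().one();
        // Process or aggregate 'row' here instead of keeping every Row,
        // if holding all 10,000,000 results would not fit on the heap.
        results.put(entry.getKey(), row);
    }
}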

Select the right architecture for a simple JavaBean application

I need to make a simple Java application, and now I am working on its architecture. Please help me find a good way to build my app. I only need advice on how to structure it (what classes I need, what methods to include); the code I will write myself. If it is not difficult for you, please share your opinion about the best and right way to build my app. Thanks!
My technical task below:
Given: a TEST table in any database (using in-memory databases is not recommended), containing one integer column (FIELD).
You must write a console application in Java, using only the standard library of JDK 7 (preferably) or JDK 8, that implements the following functionality:
The main application class must follow JavaBean conventions, i.e. be initialized through setters. The initialization parameters are the database connection data and a number N.
Upon launch, the application inserts into TEST N records with values 1..N. If the TEST table already contains records, they are removed before inserting.
The application then reads the data from TEST.FIELD and generates a well-formed XML document of the form:
<entries>
  <entry>
    <field>value of FIELD</field>
  </entry>
  ...
  <entry>
    <field>value of FIELD</field>
  </entry>
</entries>
(with N nested entry elements). The document is saved in the file system as "1.xml".
By means of XSLT, the application converts the contents of "1.xml" to the following form:
<entries>
  <entry field="value of FIELD"/>
  ...
  <entry field="value of FIELD"/>
</entries>
(with N nested entry elements). The new document is saved in the file system as "2.xml".
The application parses "2.xml" and outputs to the console the arithmetic sum of the values of all the field attributes.
For large N (~1,000,000) the application's running time should not exceed five minutes.
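A minimal skeleton consistent with these requirements could look like the following sketch (every class and method name here is a suggestion, not part of the task):

// Sketch of one possible structure; bodies are left as stubs.
public class App {
    // JavaBean-style: initialized through setters only.
    private String url;      // JDBC connection string
    private String user;
    private String password;
    private int n;

    public void setUrl(String url) { this.url = url; }
    public void setUser(String user) { this.user = user; }
    public void setPassword(String password) { this.password = password; }
    public void setN(int n) { this.n = n; }

    public void run() throws Exception {
        insertRecords();                  // JDBC: clear TEST, batch-insert 1..N
        generateXml("1.xml");             // stream TEST.FIELD out (e.g. StAX, to handle N ~ 1,000,000)
        transform("1.xml", "2.xml");      // javax.xml.transform with an XSLT stylesheet
        System.out.println(sum("2.xml")); // streaming pass summing the field attributes
    }

    private void insertRecords() { /* DELETE FROM TEST; batched INSERTs */ }
    private void generateXml(String file) { /* XMLStreamWriter to avoid building a DOM */ }
    private void transform(String in, String out) { /* TransformerFactory */ }
    private long sum(String file) { return 0; /* accumulate attribute values */ }

    public static void main(String[] args) throws Exception {
        App app = new App();
        // populate the setters from args or a properties file, then:
        app.run();
    }
}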

java: convert HashMap with dynamic keys to Bean

I'm trying to convert a large Map<String, List<String>> to some JavaBean. The key of the map corresponds to some property of the JavaBean, and the value is somehow decoded into the property value. So I decided to use some utility for that, but I don't know what will work. There are some requirements I have for this utility (or framework):
1. All configuration must be in separate files.
2. It should be possible to map a dynamic quantity of keys:
there is a map:
key   | value
quan  | n
key_1 | value_1
key_2 | value_2
...   | ...
key_n | value_n
where n is any number,
and the JavaBean has a List of some beans, each of which has a property. value_1, value_2, ... must be mapped into this property, and in the end there must be as many beans in the list as there are key/value pairs in the map (see the plain-Java sketch after this list).
3. It should be possible to set up a custom decoder for property mapping, because in most cases the value in the map is a List with one element, so I need to take the first item of the list (if it's not empty).
4. It should be possible to run some script to execute extraordinary mappings, for example:
there is a map as described in point 2,
and the JavaBean has a property of type HashMap, where value_1 is mapped to Bean1 and some analogous value from the input map is mapped to Bean2.
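To make point 2 concrete, here is what the desired unpacking would look like in plain Java (a sketch: the bean and property names are taken from the Smooks fragment below, the rest is illustrative):

// Illustrative only; assumes Map<String, List<String>> map as described above.
int n = Integer.parseInt(map.get("quan").get(0));
List<SomeBeanItem> someBeans = new ArrayList<>();
for (int i = 1; i <= n; i++) {
    List<String> values = map.get("key_" + i);
    SomeBeanItem item = new SomeBeanItem();
    // requirement 3: take the first element of the list, if present
    item.setProperty1(values.isEmpty() ? null : values.get(0));
    someBeans.add(item);
}
javaBean.setSomeBeans(someBeans);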
I've tried to use Smooks, but when I started, all these requirements were not yet clear, and Smooks was new to me; I hadn't worked with it until now. So the Smooks config doesn't contain the whole business logic (because of the second requirement) and looks ugly, which I don't like. I can show the ugliest fragment, for point 2:
<jb:bean beanId="javaBean" class="com.example.JavaBean" createOnElement="map">
<jb:wiring property="someBeans" beanIdRef="someBeanItems"/>
</jb:bean>
<jb:bean beanId="someBeanItems" class="java.util.ArrayList" createOnElement="map/entry">
<jb:wiring beanIdRef="someBeanItem"/>
</jb:bean>
<jb:bean beanId="someBeanItem" class="com.example.someBeanItem" createOnElement="map/entry">
<condition>map.quan[0]>0</condition>
<jb:expression property="property1">
index = map.quan[0]-1;
value = additionalProperties.property1_List[index];
map.quan[0] = map.quan[0] - 1;
return value;
</jb:expression>
</jb:bean>
Here "property1_List" is builded before executing smooks.
Now I'm looking for something nicer and need your help: maybe you know how to do this better with Smooks? Or what other mapping frameworks can you recommend for my problem?
