I've developed a Java application that uses a MongoDB Atlas serverless database.
This application performs an aggregation query with the following stages:
$match
$project
$addFields
$sort
$facet
$project
When I perform a query that returns a lot of results, I get this exception: QueryExceededMemoryLimitNoDiskUseAllowed.
I've tried to modify my code, adding allowDiskUse: true to the aggregation, but it didn't resolve the exception.
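For reference, this is roughly how I set it (a minimal sketch with the MongoDB sync Java driver; collection and pipeline are placeholders for my real collection handle and the stage list above):
// collection: com.mongodb.client.MongoCollection<org.bson.Document>
// pipeline:   java.util.List<org.bson.conversions.Bson> built from the stages listed above
List<Document> results = collection.aggregate(pipeline)
        .allowDiskUse(true)        // added this, but the exception is still thrown
        .into(new ArrayList<>());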
I've tried to replicate my aggregation pipeline in the Atlas console and found that everything works fine until the $facet step, which returns:
Reason: PlanExecutor error during aggregation :: caused by :: Sort exceeded memory limit of 33554432 bytes, but did not opt in to external sorting.
This is my $facet step:
{$facet: {
paginatedResults: [{ $skip: 0 }, { $limit: 50 }],
totalCount: [
{
$count: 'count'
}
]
}
}
As you can see, I'm using it to paginate my query results.
Any suggestions to avoid this problem?
I was thinking about making two different queries, one for the results and one for the total count, but I'm not sure this is the best solution.
EDIT: added query
db.vendor_search.aggregate(
{$match: {
$or: [
{'searchKeys.value': {$regex: "vendor"}},
{'searchKeys.value': {$regex: "test"}},
{'searchKeys.valueClean': {$regex: "vendor"}},
{'searchKeys.valueClean': {$regex: "test"}},
],
buyerId: 7
}},
{$project: {
companyId: 1,
buyerId: 1,
companyName: 1,
legalForm: 1,
country: 1,
supplhiCompanyCode: 1,
vat: 1,
erpCode: 1,
visibility: 1,
businessStatus: 1,
city: 1,
logo: 1,
location: {$concat : ["$country.value",'$city']},
searchKeys: {
"$filter": {
"input": "$searchKeys",
"cond": {
"$or": [
{$regexMatch: {input: "$$this.value",regex: "vendor"}},
{$regexMatch: {input: "$$this.value",regex: "test"}}
{$regexMatch: {input: "$$this.valueClean",regex: "vendor"}},
{$regexMatch: {input: "$$this.valueClean",regex: "test"}}
]
}
}
}
}},
{$addFields: {
searchMatching: {
$reduce: {
input: "$searchKeys.type",
initialValue: [],
in: {
$concatArrays: [
"$$value",
{$cond: [{$in: ["$$this", "$$value"]},[],["$$this"]]}
]
}
}
},
'sort.supplhiId': { $toLower: "$supplhiCompanyCode" },
'sort.companyName': { $toLower: "$companyName" },
'sort.location': { $toLower: {$concat : ["$country.value"," ","$city"]}},
'sort.vat': { $toLower: "$vat" },
'sort.companyStatus': { $toLower: "$businessStatus" },
'sort.erpCode': { $toLower: "$erpCode" }
}},
{$sort: {"sort.companyName": 1}},
{$facet: {
paginatedResults: [{ $skip: 0 }, { $limit: 50 }],
totalCount: [
{
$count: 'count'
}
]
}
},
{$project: {paginatedResults:1, 'totalCount': {$first : '$totalCount.count'}}}
)
EDIT: Added model
{
"buyerId": 1,
"companyId": 869048,
"address": "FP8R+52H",
"businessStatus": "AC",
"city": "Chiffa",
"companyName": "Test Algeria 25 agosto",
"country": {
"lookupId": 78,
"code": "DZA",
"value": "Algeria"
},
"erpCode": null,
"legalForm": "Ltd.",
"logo": "fc4d821a-e814-49e4-96d1-f32421fdaa6d_1.jpg",
"searchKeys": [
{
"type": "contact",
"value": "pebiw81522#xitudy.com",
"valueClean": "pebiw81522xitudycom"
},
{
"type": "company_registration_number",
"value": "112211331144",
"valueClean": "112211331144"
},
{
"type": "vendor_name",
"value": "test algeria 25 agosto ltd.",
"valueClean": "test algeria 25 agosto ltd"
},
{
"type": "contact",
"value": "tredicisf2#ottobre2022.com",
"valueClean": "tredicisf2ottobre2022com"
},
{
"type": "contact",
"value": "ty#s.com",
"valueClean": "tyscom"
},
{
"type": "contact",
"value": "info#x.com",
"valueClean": "infoxcom"
},
{
"type": "tin",
"value": "00112341675",
"valueClean": "00112341675"
},
{
"type": "contact",
"value": "hatikog381#rxcay.com",
"valueClean": "hatikog381rxcaycom"
},
{
"type": "supplhi_id",
"value": "100059410",
"valueClean": "100059410"
},
{
"type": "contact",
"value": "tredici#ottobre2022.com",
"valueClean": "trediciottobre2022com"
},
{
"type": "country_key",
"value": "00112341675",
"valueClean": "00112341675"
},
{
"type": "vat",
"value": "00112341675",
"valueClean": "00112341675"
},
{
"type": "address",
"value": "fp8r+52h",
"valueClean": "fp8r52h"
},
{
"type": "city",
"value": "chiffa",
"valueClean": "chiffa"
},
{
"type": "contact",
"value": "prova#supplhi.com",
"valueClean": "provasupplhicom"
},
{
"type": "contact",
"value": "saraxo2669#dmonies.com",
"valueClean": "saraxo2669dmoniescom"
}
],
"supplhiCompanyCode": "100059410",
"vat": "00112341675",
"visibility": true
}
In Atlas M0 free clusters and M2/M5 shared clusters, the in-memory sort limit is 32 MB (ref); this limit seems to apply to Serverless instances as well.
On a mongod that is not limited this way (e.g. a self-managed deployment), you can usually increase this limit, for example from 32 MB to 320 MB, as follows:
db.adminCommand({setParameter: 1, internalQueryExecMaxBlockingSortBytes: 335544320})
You can check the current value with:
db.runCommand( { getParameter : 1, "internalQueryExecMaxBlockingSortBytes" : 1 } )
But it is best to optimize your queries so they do not hit this limit. If you post your full query and indexes (db.collection.getIndexes()), perhaps there is a better way...
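For example, one way to stay under the limit with the pipeline posted above (essentially the two-query approach you were already considering) is to split the page and the count into two aggregations and keep $skip/$limit directly after $sort, so the sort should only have to keep the top skip + limit documents in memory. A sketch with the sync Java driver, where matchStage, projectStage and addFieldsStage stand for the stages already posted:
// page query: $sort followed by $skip/$limit can be coalesced into a bounded top-k sort
List<Document> page = coll.aggregate(Arrays.asList(
        matchStage, projectStage, addFieldsStage,
        Aggregates.sort(Sorts.ascending("sort.companyName")),
        Aggregates.skip(0),
        Aggregates.limit(50)))
    .into(new ArrayList<>());

// count query: no sort at all, so no sort memory limit to hit
Document countDoc = coll.aggregate(Arrays.asList(matchStage, Aggregates.count("count"))).first();
int totalCount = countDoc == null ? 0 : countDoc.getInteger("count");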
I have this JSON:
{
"title": "Regular Python Developer",
"street": "Huston 10",
"city": "Miami",
"country_code": "USA",
"address_text": "Huston 10, Miami",
"marker_icon": "python",
"workplace_type": "remote",
"company_name": "Merixstudio",
"company_url": "http://www.merixstudio.com",
"company_size": "200+",
"experience_level": "mid",
"latitude": "52.4143773",
"longitude": "16.9610657",
"published_at": "2020-04-21T10:00:07.446Z",
"remote_interview": true,
"id": "merixstudio-regular-django-developer",
"employment_types": [
{
"type": "b2b",
"salary": {
"from": 8000,
"to": 13500,
"currency": "usd"
}
},
{
"type": "permanent",
"salary": {
"from": 6500,
"to": 11100,
"currency": "usd"
}
}
],
"company_logo_url": "https://bucket.justjoin.it/offers/company_logos/thumb/07dd4eaf9a6ffb6b85bd03c5bd5c95016d5804ce.png?1628853121",
"skills": [
{
"name": "REST",
"level": 4
},
{
"name": "Python",
"level": 4
},
{
"name": "Django",
"level": 4
}
],
"remote": true
}
An online JSON-to-POJO converter splits this into 4 classes. I have a problem with Salary.
I need the Salary class not to be separate from Root; it needs to be inside the Root class.
What should the Root class look like?
In Java you can nest classes:
class Root {
    String title;
    List<EmpType> employmentTypes;

    // static, so a JSON mapper can instantiate them without an enclosing Root instance
    static class EmpType {
        String type;
        Salary salary;

        static class Salary {
            int from;
            int to;
        }
    }
}
In practice the difference compared to creating a separate file for each class is often negligible. The main purpose is stronger encapsulation, or grouping classes that belong together.
You could look at the source code of, for example, java.util.ImmutableCollections for a scenario where this makes sense: there are several nested classes that belong together and should not be accessible from anywhere else.
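For completeness, here is a minimal sketch of reading the JSON above into such a nested structure, assuming Jackson (2.12+) as the mapper; the snake_case names like employment_types are handled by a naming strategy, and fields not modelled in Root are simply ignored:
ObjectMapper mapper = new ObjectMapper();
// map the package-private fields of the sketch above directly (there are no getters)
mapper.setVisibility(PropertyAccessor.FIELD, JsonAutoDetect.Visibility.ANY);
// employment_types -> employmentTypes, etc.
mapper.setPropertyNamingStrategy(PropertyNamingStrategies.SNAKE_CASE);
// ignore fields that are not modelled in Root (skills, company_logo_url, ...)
mapper.configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false);
Root root = mapper.readValue(json, Root.class);   // json holds the document from the question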
EDIT: Filter by salary.from:
// given:
List<EmpType> employmentTypes = ...;
List<EmpType> filtered = employmentTypes.stream()
        .filter(et -> et.salary.from > 3000)
        .collect(Collectors.toList());
EDIT 2:
// given:
List<Root> roots = ...;
List<Root> filtered = roots.stream()
        .filter(r -> r.employmentTypes.stream()
                .anyMatch(e -> e.salary.from > 3000))
        .collect(Collectors.toList());
I have this JSON:
{
"Id": "xxx",
"Type": "Transaction.Create",
"Payload": {
"result": 2,
"description": "Pending",
"body": {
"redirect": {
"url": "xxx",
"fields": {
"MD": "8a829449620619e80162252adeb66a39"
}
},
"card": {
"expiryMonth": "1",
"expiryYear": "2033"
},
"order": {
"amount": 1
}
}
}
}
And I want to remove the card info from it, like this:
{
"Id": "xxx",
"Type": "Transaction.Create",
"Payload": {
"result": 2,
"description": "Pending",
"body": {
"redirect": {
"url": "xxx",
"fields": {
"MD": "8a829449620619e80162252adeb66a39"
}
},
"order": {
"amount": 1
}
}
}
}
How can I do this with Apache Velocity?
What works is:
#set($content = $util.urlEncode($input.json('$')))
#set($new = $content.replaceAll("2033","2055"))
Action=SendMessage&MessageBody={"body": "$new","Event-Signature": "$util.urlEncode($input.params('Event-Signature'))"}
This gives me
{
"Id": "xxx",
"Type": "Transaction.Create",
"Payload": {
"result": 2,
"description": "Pending",
"body": {
"redirect": {
"url": "xxx",
"fields": {
"MD": "8a829449620619e80162252adeb66a39"
}
},
"card": {
"expiryMonth": "1",
"expiryYear": "2050"
},
"order": {
"amount": 1
}
}
}
}
But now I want to remove the card part, and that does not work:
#set($content = $util.urlEncode($input.json('$')))
#set($new = $content.delete("$.Payload.body.card"))
Action=SendMessage&MessageBody={"body": "$new","Event-Signature": "$util.urlEncode($input.params('Event-Signature'))"}
What am I doing wrong?
The main goal is to transform the webhook payload with a mapping template in API Gateway. The webhook contains too much information, and we want to remove some parts of the JSON of the POST call.
Try using the below. Note that .remove() works on the object returned by $input.path('$') (not on the url-encoded string produced by $input.json('$')), and the card node sits under Payload.body:
#set($content = $input.path('$'))
#set($dummy = $content.Payload.body.remove("card"))
I have an object with this structure:
#JsonProperty("id")
private Long codigoCategoria;
#JsonProperty("parentId")
private Long codigoCategoriaPai;
#JsonProperty("name")
private String nomeCategoria;
#JsonInclude(JsonInclude.Include.NON_EMPTY)
private ComissaoPadraoEntity comissao;
#JsonProperty("categories")
private List<CategoriaDTO> subCategorias;
As you can see, it has a list of its own type. I need to map these categories into a Map<Long, List<Long>>, where the key is codigoCategoria and the value is a List of the codigoCategoria values inside subCategorias.
This is the payload structure:
{
"categories": [
{
"id": "1813",
"parentId": null,
"name": "Malas e Mochilas",
"items": 12,
"categories": [
{
"id": "1827",
"parentId": "1813",
"name": "Conjuntos de Malas",
"items": 0,
"categories": [
],
"attributes": null
},
{
"id": "1830",
"parentId": "1813",
"name": "Mochilas",
"items": 4,
"categories": [
{
"id": "1831",
"parentId": "1830",
"name": "Mochila Esportiva",
"items": 0,
"categories": [
],
So far I have tried many different ways. This is the code I have written, but it doesn't even compile:
private Map<Long, List<Long>> mapATreeofCategories() {
List<CategoriaDTO> categories = getAll();
Map<Long, List<Long>> treeCategories = categories.forEach(categoriaDTO -> {
categories.stream()
.collect(Collectors.toMap(categoriaDTO.getCodigoCategoria(),
categoriaDTO.getSubCategorias().forEach(categoriaDTO1 -> categoriaDTO1.getCodigoCategoria())));
});
return treeCategories;
}
Thanks for any help guys.
The forEach method has a void return type, so it cannot be used as the return value of the valueMapper function.
Instead, it seems like you want to extract the codigoCategoria values from the subCategorias collection, in which case you need to do it as follows:
Map<Long, List<Long>> treeCategories = categories.stream()
        .collect(Collectors.toMap(
                k -> k.getCodigoCategoria(),
                v -> v.getSubCategorias().stream()
                        .map(e -> e.getCodigoCategoria())
                        .collect(Collectors.toList())));
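If the nested levels (e.g. 1830 -> [1831] in the payload above) should end up in the map as well, the stream above is not enough, because it only looks at the top-level categories; a small recursive walk is needed. A sketch, assuming the getters from the question:
private void collectCategories(List<CategoriaDTO> categories, Map<Long, List<Long>> acc) {
    for (CategoriaDTO categoria : categories) {
        acc.put(categoria.getCodigoCategoria(),
                categoria.getSubCategorias().stream()
                        .map(CategoriaDTO::getCodigoCategoria)
                        .collect(Collectors.toList()));
        collectCategories(categoria.getSubCategorias(), acc);   // recurse into the children
    }
}
// usage:
Map<Long, List<Long>> treeCategories = new HashMap<>();
collectCategories(getAll(), treeCategories);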
{
"task": {
"vendorConnectorId": 9901,
"operation": "query",
"objectType": "",
"attributes": {
"q": "SELECT Name FROM Asset where AccountId = '00128000005O5uPAAS'"
}
},
"httpStatusCode": 200,
"resultSet": {
"result": {
"totalSize": 1,
"records": [
{
"attributes": {
"type": "Asset",
"url": "/services/data/v34.0/sobjects/Asset/02i280000007BcpAAE"
},
"Name": "Flight To LA"
}
],
"done": true
}
},
"messages": [],
"startedTime": 1441969739375,
"endTime": 1441969750317
}
I want to map the records node via Jackson's polymorphic deserialization feature, but the type information is nested inside the record node (the attributes node).
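For reference, the standard annotations-only setup would look roughly like the sketch below (the class and type names are made up for illustration); it assumes the type discriminator is a direct property of the polymorphic value, which is exactly what is missing here, since the type sits inside the nested attributes object:
// works only when "type" is a sibling of the other record fields
@JsonTypeInfo(use = JsonTypeInfo.Id.NAME, include = JsonTypeInfo.As.PROPERTY, property = "type")
@JsonSubTypes({
        @JsonSubTypes.Type(value = AssetRecord.class, name = "Asset")
})
public abstract class SObjectRecord { }

public class AssetRecord extends SObjectRecord {
    @JsonProperty("Name")
    public String name;
}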