Storing a large Mongo document using GridFS - Java

I have a large MongoDB document that I want to store using the GridFS library.
For small documents, we use MongoTemplate as:
DBObject dbObject = new BasicDBObject();
dbObject.put("user", "alex");
mongoDbTemplate.save(dbObject, "collectionName");
For large documents, we use GridFsTemplate as:
DBObject metaData = new BasicDBObject();
metaData.put("user", "alex");
InputStream inputStream = new FileInputStream("src/main/resources/test.png");
gridFsTemplate.store(inputStream, "test.png", "image/png", metaData).toString();
Here we don't define any collection name. Is there any way to store large documents within a given collection?

GridFS stores data in a pair of files and chunks collections. The .files / .chunks suffixes are fixed, but the bucket prefix in front of them is configurable (the default is fs, giving fs.files and fs.chunks), and you can also choose which database the data is stored in.
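If the goal is to keep GridFS data separated per logical "collection", the bucket prefix can play that role. A minimal sketch of the naming convention (the bucket name largeDocs is hypothetical, and the commented Spring wiring is an assumption based on GridFsTemplate's (factory, converter, bucket) constructor):

```java
// GridFS collection names are always "<bucket>.files" and "<bucket>.chunks";
// only the bucket prefix is configurable (default "fs").
public class GridFsNaming {
    public static void main(String[] args) {
        String bucket = "largeDocs"; // hypothetical bucket name
        String filesCollection = bucket + ".files";
        String chunksCollection = bucket + ".chunks";
        System.out.println(filesCollection + " / " + chunksCollection);
        // With Spring Data MongoDB the bucket is passed when building the template:
        // new GridFsTemplate(mongoDatabaseFactory, mongoConverter, bucket);
    }
}
```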

Related

Execute MongoTemplate.aggregate without row retrieval

I'm using the Spring Mongo driver to execute a large mongo aggregation statement that will run for a period of time. The output stage of this aggregation writes the output of the aggregation into a new collection. At no point do I need to retrieve the results of this aggregation in-memory.
When I run this in Spring boot, the JVM is running out of memory doing row retrieval, although I'm not using or storing any of the results.
Is there a way to skip row retrieval using MongoTemplate.aggregate?
Ex:
mongoTemplate.aggregate(
    Aggregation.newAggregation(
        Aggregation.sort(new Sort(new Sort.Order(Sort.Direction.DESC, "createdOn"))),
        Aggregation.group("accountId")
            .first("bal").as("bal")
            .first("timestamp").as("effectiveTimestamp"),
        Aggregation.project("_id", "effectiveTimestamp")
            .andExpression("trunc(bal * 10000 + 0.5) / 100").as("bal"),
        aggregationOperationContext -> new Document("$addFields", new Document("history", Arrays.asList(historyObj))),
        // Write results out to a new collection - do not store in memory
        Aggregation.out("newBalance")
    ).withOptions(Aggregation.newAggregationOptions().allowDiskUse(true).build()),
    "account", Object.class
);
Use the aggregation option skipOutput(). This will not return a result when the aggregation pipeline contains an $out/$merge operation.
mongoTemplate.aggregate(aggregation.withOptions(newAggregationOptions().skipOutput().allowDiskUse(true).build()), "collectionName", EntityClass.class);
If you are using the MongoDB driver without a framework:
MongoClient client = MongoClients.create("mongodb://localhost:27017");
MongoDatabase database = client.getDatabase("my-database");
MongoCollection<Document> model = database.getCollection(collectionName);
AggregateIterable<Document> aggregateResult = model.aggregate(bsonListOfAggregationPipeline);
// instead of iterating over the results, call toCollection() to skip retrieving them
aggregateResult.toCollection();
References:
https://jira.mongodb.org/browse/JAVA-3700
https://developer.mongodb.com/community/forums/t/mongo-java-driver-out-fails-in-lastpipelinestage-when-aggregate/9784
I was able to resolve this by using
MongoTemplate.aggregateStream(...).withOptions(Aggregation.newAggregationOptions().cursorBatchSize(0).build())

How do I convert a MongoDB Document object to a String or integer in Java?

I am fetching data from MongoDB into a document object, and I want to use that data further. I need to convert a value from that object into an integer or a String.
MongoClient mongoClient = new MongoClient("localhost", 27017);
MongoDatabase database = mongoClient.getDatabase("OLTP");
MongoCollection<org.bson.Document> collection = database.getCollection("Item");
Document document = collection // the fetched document
    .find(new BasicDBObject("i_name", "Mobile"))
    .projection(Projections.fields(Projections.include("i_price"), Projections.excludeId()))
    .first();
In this code, I fetched the price of a Mobile. I want to multiply that price by the quantity, which is not possible without converting the value into integer form.
NOTE: I have tried:
price=document.toString();
or
price=Integer.parseInt(document);
or
price=Integer.parceInt(toStrint(document));
Please give the relevant answer, with related to MongoDB and Java.
Thank you.
MongoDB's Document class represents a document as a Map, so you need to get the value by its key.
document.getInteger("your_value_key");
http://api.mongodb.com/java/current/org/bson/Document.html#getInteger-java.lang.Object-
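A minimal, self-contained illustration of reading a numeric value out of a Document and using it in arithmetic (the field name i_price comes from the question; the price and quantity values here are made up):

```java
import org.bson.Document;

public class PriceExample {
    public static void main(String[] args) {
        // Simulate the fetched document by parsing a JSON string
        Document document = Document.parse("{\"i_price\": 500}");
        int price = document.getInteger("i_price"); // typed accessor, no string parsing needed
        int quantity = 3;                           // hypothetical quantity
        int total = price * quantity;
        System.out.println(total);                  // prints 1500
    }
}
```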

Apache Lucene Sort Issues with GAE-Lucene addDocuments

I have been trying to get sorting working with Apache Lucene on Google App Engine. I am using https://github.com/UltimaPhoenix/luceneappengine to integrate Lucene into GAE. Here is what I am doing.
I have a list of Documents which I am putting into Lucene via the IndexWriter's addDocuments() method.
for (Object object : objects) {
    Document doc = new Document();
    doc.add(new Field("id", generateDocId(object), idType));
    doc.add(new NumericDocValuesField("sortLong", <Long Value>));
    documents.add(doc);
}
I am basically aggregating all the documents into a list and writing them to the index using
IndexWriter writer = getWriter();
writer.addDocuments(documents);
I am trying to query a few documents, based on some Query as well as Sort
Sort sort = new Sort(new SortField("sortLong", SortField.Type.LONG, true));
TopFieldDocs docs = searcher.search(new MatchAllDocsQuery(),2000,sort);
Problem:
When I use addDocuments() to bulk-index the documents, my sort queries do not return the data in the correct sort order; they are simply wrong. However, if I index each document using addDocument(), the sort queries work correctly.
This has led me to deduce that there is something inherently wrong with addDocuments(). The sort won't work unless I open the IndexWriter, call addDocument(), and close the IndexWriter for each document, which I am unwilling to do because I may have thousands of records to index.
Is there any solution to this problem? Or is it a known defect?

How to update a document in MongoDB using ObjectID in Java

What I am trying to accomplish here is pretty simple. I am trying to update a single document in a MongoDB collection. When I look up the document using any field, such as "name", the update query succeeds. Here is the query:
mongoDB.getCollection("restaurants").updateOne(
    new BasicDBObject("name", "Morris Park Bake Shop"),
    new BasicDBObject("$set", new BasicDBObject("zipcode", "10462"))
);
If I try to look up the document by its ObjectId, it never works, as it doesn't match any document.
mongoDB.getCollection("restaurants").updateOne(
    new BasicDBObject("_id", "56110fe1f882142d842b2a63"),
    new BasicDBObject("$set", new BasicDBObject("zipcode", "10462"))
);
Is it possible to make this query work with Object IDs?
I agree that my question is a bit similar to "How to query documents using "_id" field in Java mongodb driver?", however I am not getting any errors while trying to update a document. It just doesn't match anything.
You're currently trying to update based on a string, not an ObjectId.
Make sure to initialise a new ObjectId from the string when building your query:
mongoDB.getCollection("restaurants").updateOne(
    new BasicDBObject("_id", new ObjectId("56110fe1f882142d842b2a63")),
    new BasicDBObject("$set", new BasicDBObject("zipcode", "10462"))
);
#sheilak's answer is the best one, but you could also use {"_id": {"$oid": "56110fe1f882142d842b2a63"}} as the filter for the update query if you want to keep the id in string format.
Convert the string to an ObjectId (this example is PyMongo):
from bson.objectid import ObjectId
db.collection.find_one({"_id":ObjectId('5a61bfadef860e4bf266edb2')})
{u'_id': ObjectId('5a61bfadef860e4bf266edb2'), ...
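The snippet above is PyMongo; in the Java driver the equivalent conversion is the ObjectId constructor, which round-trips cleanly to and from the 24-character hex string:

```java
import org.bson.types.ObjectId;

public class ObjectIdRoundTrip {
    public static void main(String[] args) {
        ObjectId id = new ObjectId("56110fe1f882142d842b2a63"); // parse the hex string
        String hex = id.toHexString();                          // back to the same string
        System.out.println(hex);                                // prints 56110fe1f882142d842b2a63
    }
}
```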

MongoInternalException while inserting into mongoDB

I was inserting data into MongoDB but suddenly encountered this error. I don't know how to fix it. Is this due to the maximum document size being exceeded? If not, why am I getting this error? Does anyone know how to fix it? Below is the error I encountered:
Exception in thread "main" com.mongodb.MongoInternalException: DBObject of size 163745644 is over Max BSON size 16777216
I know my dataset is large... but is there any other solution?
The document you are trying to insert exceeds the maximum BSON document size, i.e. 16 MB.
Here is the reference documentation: http://docs.mongodb.org/manual/reference/limits/
To store documents larger than the maximum size, MongoDB provides the GridFS API.
The mongofiles utility makes it possible to manipulate files stored in
your MongoDB instance in GridFS objects from the command line. It is
particularly useful as it provides an interface between objects stored
in your file system and GridFS.
Ref : MongoFiles
To insert a document larger than 16 MB you need to use GridFS. GridFS is an abstraction layer over MongoDB that divides data into chunks (255 KB by default). Since you are using Java, it is simple to use with the Java driver too. Here I am inserting an Elasticsearch archive (about 20 MB) into MongoDB. Sample code:
MongoClient mongo = new MongoClient("localhost", 27017);
DB db = mongo.getDB("testDB");
String newFileName = "elasticsearch-Jar";
File imageFile = new File("/home/impadmin/elasticsearch-1.4.2.tar.gz");
GridFS gfs = new GridFS(db);
//Insertion
GridFSInputFile inputFile = gfs.createFile(imageFile);
inputFile.setFilename(newFileName);
inputFile.put("name", "devender");
inputFile.put("age", 23);
inputFile.save();
//Fetch back
GridFSDBFile outputFile = gfs.findOne(newFileName);
Find out more here.
If you want to load the file from the command line rather than through the driver, use mongofiles as mentioned in the other answer.
Hope that helps. :)
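For a sense of scale, the 163745644-byte document from the exception would be split into this many chunks at the 255 KB default chunk size (a back-of-the-envelope sketch, not driver code):

```java
public class ChunkCount {
    public static void main(String[] args) {
        long docSize = 163_745_644L;   // size reported in the exception message
        long chunkSize = 255L * 1024L; // GridFS default chunk size (255 KB)
        long chunks = (docSize + chunkSize - 1) / chunkSize; // ceiling division
        System.out.println(chunks);    // prints 628
    }
}
```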
