How to get single GridFS file using Java driver 3.7+? - java

I need to get a single GridFS file using Java driver 3.7+.
I have two collections with files in a database: photo.files and photo.chunks.
The photo.chunks collection contains the binary file data.
The photo.files collection contains the metadata of the document.
To find a document in an ordinary collection I wrote:
Document doc = collection_messages.find(eq("flag", true)).first();
String messageText = (String) Objects.requireNonNull(doc).get("message");
I tried to find the file in the same way as in the example above, adapted to my collections:
MongoDatabase database_photos = mongoClient.getDatabase("database_photos");
GridFSBucket photos_fs = GridFSBuckets.create(database_photos,
"photos");
...
...
GridFSFindIterable gridFSFile = photos_fs.find(eq("_id", new ObjectId()));
String file = Objects.requireNonNull(gridFSFile.first()).getMD5();
And like:
GridFSFindIterable gridFSFile = photos_fs.find(eq("_id", new ObjectId()));
String file = Objects.requireNonNull(gridFSFile.first()).getFilename();
But I get an error:
java.lang.NullPointerException
at java.util.Objects.requireNonNull(Objects.java:203)
at project.Bot.onUpdateReceived(Bot.java:832)
at java.util.ArrayList.forEach(ArrayList.java:1249)
I also checked the docs for the 3.7 driver, but the example there shows how to find several files, and I need a single one:
gridFSBucket.find().forEach(
new Block<GridFSFile>() {
public void apply(final GridFSFile gridFSFile) {
System.out.println(gridFSFile.getFilename());
}
});
Can someone show me an example of how to do this properly?
I mean getting the data, e.g. from the chunks collection by ObjectId, and the md5 field, also by ObjectId, from the metadata collection.
Thanks in advance.

To find and use specific files:
photos_fs.find(eq("_id", objectId)).forEach(
(Block<GridFSFile>) gridFSFile -> {
// to do something
});
Or, as an alternative, I can read a specific field of the file.
First get the ObjectId of the first file, then pass it to find() to get a GridFSFindIterable, take the matching GridFSFile from it, and finally read the field as a String.
MongoDatabase database_photos =
mongoClient.getDatabase("database_photos");
GridFSBucket photos_fs = GridFSBuckets.create(database_photos,
"photos");
...
...
ObjectId objectId = Objects.requireNonNull(photos_fs.find().first()).getObjectId();
GridFSFindIterable gridFSFindIterable = photos_fs.find(eq("_id", objectId));
GridFSFile gridFSFile = Objects.requireNonNull(gridFSFindIterable.first());
String file = Objects.requireNonNull(gridFSFile).getMD5();
But it reads files from photo.files, not from the photo.chunks collection.
And I'm not sure this approach is safe, because the IDE reports the following warning, although the code works:
Inconvertible types; cannot cast 'com.mongodb.client.gridfs.model.GridFSFile' to 'com.mongodb.client.gridfs.GridFSFindIterableImpl'

Related

MongoDB "Invalid BSON Field Name"

I know there's probably a better way to do this, but I'm completely stumped. I'm writing a Discord bot in which a user is able to add points to other users, but I can't figure out how to replace a user's "points". My code is as follows:
BasicDBObject cursor = new BasicDBObject();
cursor.put(user.getAsMember().getId(), getMongoPoints(user.getAsMember()));
if(cursor.containsKey(user.getAsMember().getId())) {
Document old = new Document(user.getAsMember().getId(), getMongoPoints(user.getAsMember()));
Document doc = new Document(user.getAsMember().getId(), getMongoPoints(user.getAsMember()) + Integer.parseInt(amount.getAsString()));
collection.findOneAndUpdate(old, doc);
}
My getMongoPoints function:
public static int getMongoPoints(Member m) {
ConnectionString connectionString = new ConnectionString("database");
MongoClientSettings settings = MongoClientSettings.builder()
.applyConnectionString(connectionString)
.build();
MongoClient mongoClient = MongoClients.create(settings);
MongoDatabase database = mongoClient.getDatabase("SRU");
MongoCollection<Document> collection = database.getCollection("points");
DistinctIterable<Integer> docs = collection.distinct(m.getId(), Integer.class);
MongoCursor<Integer> result = docs.iterator();
return result.next();
}
I've tried findOneAndReplace, however that simply makes a new entry without deleting the old one. The error I receive is: Invalid BSON field name 262014495440896000
Everything else works, including writing to the database itself, which is why I'm stumped. Any help would be greatly appreciated and I apologize if this is written poorly.
BSON field names must be strings. From the spec:
Zero or more modified UTF-8 encoded characters followed by '\x00'. The (byte*) MUST NOT contain '\x00', hence it is not full UTF-8.
To use 262014495440896000 as a field name, convert it to a string first.
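A related point that may also matter here: as far as I know, update methods such as findOneAndUpdate expect the second argument to be built from update operators ($set, $inc, ...) rather than a plain replacement document, and the 3.x Java driver reports a top-level key without a leading $ as an invalid field name. The sketch below only shows the shape of the filter and the update. It uses plain java.util.Map in place of Document so that it runs without the driver; the class name UpdateShape and both helper methods are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class UpdateShape {
    // Filter matching documents that have a field named after the user id.
    public static Map<String, Object> filterFor(String userId) {
        Map<String, Object> exists = new LinkedHashMap<>();
        exists.put("$exists", true);
        Map<String, Object> filter = new LinkedHashMap<>();
        filter.put(userId, exists);
        return filter;
    }

    // Update that adds `amount` to that field using the $inc operator.
    public static Map<String, Object> incrementFor(String userId, int amount) {
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put(userId, amount);
        Map<String, Object> update = new LinkedHashMap<>();
        update.put("$inc", fields);
        return update;
    }

    public static void main(String[] args) {
        // Field names are strings, so the numeric id is converted first.
        String userId = Long.toString(262014495440896000L);
        System.out.println(filterFor(userId));
        System.out.println(incrementFor(userId, 5));
    }
}
```

With the driver, the same shapes would be new Document(userId, new Document("$exists", true)) for the filter and new Document("$inc", new Document(userId, amount)) for the update.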

List all files in (sub)directories in Azure

I am developing an Azure Function using Java. I need to iterate over all the files in the following folder:
aDirectory/aSubdirectoryWithManyFiles/
There are many files in that path:
aDirectory/aSubdirectoryWithManyFiles/file1
aDirectory/aSubdirectoryWithManyFiles/file2
aDirectory/aSubdirectoryWithManyFiles/file3
aDirectory/aSubdirectoryWithManyFiles/file4
aDirectory/aSubdirectoryWithManyFiles/file5
so I wrote the following code in order to get them all:
// myCloudBlobContainer is a CloudBlobContainer
// I expected to get all files thanks to the next row
Iterable<ListBlobItem> blobs = myCloudBlobContainer.listBlobs();
// The only blob found in the container is the directory itself
for (ListBlobItem blob : blobs) {
//log the current blob URI
if (blob instanceof CloudBlob) { // this never happens
CloudBlob cloudBlob = (CloudBlob) blob;
//make nice things with every found file
}
}
The only blob iterated in the for loop is the directory, none of the expected files. So in the logs I get only the following URI:
https://blablablabla.blob.core.windows.net/aDirectory/aSubdirectoryWithManyFiles/
What should I do in order to access every file?
And what if I had more than one subdirectory, as in the following example?
aDirectory/aSubdirectoryWithManyFiles/files(1-5)
aDirectory/anotherSubdirectoryWithManyFiles/files(6-10)
Thanks in advance
Edit
In order to make methods testable, the project uses wrappers and interfaces instead of using a CloudBlobContainer directly; basically, the CloudBlobContainer is given by CloudBlobClient.getContainerReference("containername")
After the answer to this question, I changed the code:
I used listBlobs with parameters, myCloudBlobContainer.listBlobs("aDirectory", true), and wrote the following code in order to get them all:
// myCloudBlobClient is a CloudBlobClient
CloudBlobContainer myCloudBlobContainer = myCloudBlobClient.getContainerReference("containername");
// I expected to get all files thanks to the next row
Iterable<ListBlobItem> blobs = myCloudBlobContainer.listBlobs("aDirectory", true); // HERE THE CHANGE
// No blob found this time
for (ListBlobItem blob : blobs) { // NEVER IN THE FOR
//log the current blob URI
if (blob instanceof CloudBlob) {
CloudBlob cloudBlob = (CloudBlob) blob;
//make nice things with every found file
}
}
But this time, it doesn't enter the for loop at all...
I must say that the previous answer cost me some time; the problem was that a single for loop is not enough to find files inside folders. The first loop finds the folders and subfolders, plus (maybe, I didn't check) files that are in the "root" (let's call it that).
Having the folders, we have to cast each of them to CloudBlobDirectory in order to iterate over the files it contains with another for loop.
Here is the solution that works for me:
// myCloudBlobClient is a CloudBlobClient
CloudBlobContainer myCloudBlobContainer = myCloudBlobClient.getContainerReference("containername");
// list the top level of the container
Iterable<ListBlobItem> blobs = myCloudBlobContainer.listBlobs();
// only directories here, another for needed to scan files
for (ListBlobItem blob : blobs) {
if (blob instanceof CloudBlobDirectory) {
CloudBlobDirectory directory = (CloudBlobDirectory)blob;
//next is in try/catch
Iterable<ListBlobItem> fileBlobs = directory.listBlobs();
for (ListBlobItem fileBlob : fileBlobs) {
if (fileBlob instanceof CloudBlob) {
CloudBlob cloudBlob = (CloudBlob) fileBlob;
//make nice things with every found file
}
}
} // else: may be we found a cloudBlob in root?
}
This helped me to find the right way:
https://social.msdn.microsoft.com/Forums/en-US/1cfdc91f-e588-4839-a878-9650339a0a06/list-all-blobs-in-c?forum=windowsazuredata
Try using the following overload of the listBlobs method:
listBlobs(String prefix, boolean useFlatBlobListing)
So your code would be:
Iterable<ListBlobItem> blobs = myCloudBlobContainer.listBlobs("aDirectory", true);
This will list all blobs inside the "aDirectory" virtual folder in your blob container.
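To illustrate why the first loop saw only a directory: a blob container has no real folders, only blob names that contain "/". A hierarchical listing groups names by the next "/" and returns virtual directories, while a flat listing returns every blob whose name starts with the prefix. The sketch below simulates both over a plain list of names. It does not call the Azure SDK, the class name BlobListingSketch and its methods are hypothetical, and the blob names are assumed to be relative to the container.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class BlobListingSketch {
    // Flat listing: every blob whose name starts with the prefix.
    public static List<String> listFlat(List<String> blobNames, String prefix) {
        List<String> result = new ArrayList<>();
        for (String name : blobNames) {
            if (name.startsWith(prefix)) {
                result.add(name);
            }
        }
        return result;
    }

    // Hierarchical listing: only the next level under the prefix.
    // Entries ending in "/" stand for virtual directories.
    public static Set<String> listHierarchical(List<String> blobNames, String prefix) {
        Set<String> result = new LinkedHashSet<>();
        for (String name : blobNames) {
            if (!name.startsWith(prefix)) {
                continue;
            }
            int slash = name.indexOf('/', prefix.length());
            result.add(slash < 0 ? name : name.substring(0, slash + 1));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
                "aSubdirectoryWithManyFiles/file1",
                "aSubdirectoryWithManyFiles/file2",
                "anotherSubdirectoryWithManyFiles/file6");
        System.out.println(listHierarchical(names, ""));
        System.out.println(listFlat(names, "aSubdirectoryWithManyFiles/"));
        System.out.println(listFlat(names, "aDirectory"));
    }
}
```

In this simulation the hierarchical call returns only the two directory entries, and the flat call with the prefix "aDirectory" returns nothing because no blob name starts with it. If aDirectory is in fact the container name (the logged URI would be consistent with that, but this is a guess), that could explain the empty result in the edit above.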

How to search file inside a specific folder in google API v3

I am using v3 of the Google Drive API, so instead of the parents and children lists I have to use FileList. Now I want to list the files inside a specific folder.
Can someone suggest what to do?
Here is the code I am using to search for the file:
private String searchFile(String mimeType,String fileName) throws IOException{
Drive driveService = getDriveService();
String fileId = null;
String pageToken = null;
do {
FileList result = driveService.files().list()
.setQ(mimeType)
.setSpaces("drive")
.setFields("nextPageToken, files(id, name)")
.setPageToken(pageToken)
.execute();
for(File f: result.getFiles()) {
System.out.printf("Found file: %s (%s)\n",
f.getName(), f.getId());
if(f.getName().equals(fileName)){
//fileFlag++;
fileId = f.getId();
}
}
pageToken = result.getNextPageToken();
} while (pageToken != null);
return fileId;
}
But this method gives me all the files that exist, which I don't want. I want to create a FileList that contains only the files inside a specific folder.
It is now possible to do it with the parents term in the q parameter of files.list. For example, if you want to find all spreadsheets in a folder with id folder_id, you can do so using the following q parameter (I am using Python in my example):
q="mimeType='application/vnd.google-apps.spreadsheet' and '{}' in parents".format(folder_id)
Remember that you first need the id of the folder whose files you are looking for. You can find it using the same files.list method.
More information on the files.list method and on the other terms you can put in the q parameter is available in the Drive API documentation.
To search in a specific directory you have to specify the following:
q : name = '2021' and mimeType = 'application/vnd.google-apps.folder' and '1fJ9TFZOe8G9PUMfC2Ts06sRnEPJQo7zG' in parents
This example searches for a folder called "2021" inside the folder with id 1fJ9TFZOe8G9PUMfC2Ts06sRnEPJQo7zG.
In my case, I'm writing the code in C++ and the request URL would be:
string url = "https://www.googleapis.com/drive/v3/files?q=name+%3d+%272021%27+and+mimeType+%3d+%27application/vnd.google-apps.folder%27+and+trashed+%3d+false+and+%271fJ9TFZOe8G9PUMfC2Ts06sRnEPJQo7zG%27+in+parents";
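For the Java client in the question, the same kind of query string can be assembled and passed to setQ. The sketch below only builds and URL-encodes the string; it does not call the Drive API. The class name DriveQuerySketch, its methods, and the placeholder FOLDER_ID are made up for illustration, and the escaping of single quotes and backslashes follows my reading of the Drive search syntax.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class DriveQuerySketch {
    // Wrap a value in single quotes, escaping backslashes and single quotes.
    public static String quote(String value) {
        return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'";
    }

    // q expression for non-trashed files with a given name and MIME type in one folder.
    public static String inFolder(String folderId, String name, String mimeType) {
        return "name = " + quote(name)
                + " and mimeType = " + quote(mimeType)
                + " and trashed = false"
                + " and " + quote(folderId) + " in parents";
    }

    // URL-encode the expression for use in a raw HTTP request.
    public static String encode(String q) {
        try {
            return URLEncoder.encode(q, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        String q = inFolder("FOLDER_ID", "2021", "application/vnd.google-apps.folder");
        System.out.println(q);
        System.out.println("https://www.googleapis.com/drive/v3/files?q=" + encode(q));
    }
}
```

In the searchFile method from the question, the unencoded string would go into .setQ(q); the encoded form is only needed when building the URL by hand.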
Searching files by folder name is not yet supported. It has been requested in this Google forum, but nothing has come of it so far. However, try the other search filters listed in Search for Files.
Be creative. For example, make sure the files within a certain folder contain a unique keyword, which you can then query using
fullText contains 'my_unique_keyword'
You can use this method to search for files in Google Drive:
Files.List request = this.driveService.files().list();
int noOfRecords = 100; // page size
request.setPageSize(noOfRecords);
request.setPageToken(nextPageToken); // token from the previous page, null for the first page
String searchQuery = "(name contains 'Hello')";
if (StringUtils.isNotBlank(searchQuery)) {
request.setQ(searchQuery);
}
FileList result = request.execute(); // matching files are in result.getFiles()

Solr Read Index Files

I am working with Solr and I am interested in understanding all the nitty-gritty details of the Solr index. I am using SolrCloud and the index folder contains several files, including:
_k.fdt -> field data
_k.fnm -> fields
segments_5
_k.fdx -> field index
_k.si -> segment info
...
They all look like binary/serialized objects. I tried to follow this code to read the index file but it failed with the following error. Can anyone help me with that?
public class Readfdt {
public static void main(String[] args) throws IOException {
final byte segmentID[];
Path indexpath = Paths.get(
"<solrhome>/example/cloud/node1/solr/gettingstarted_shard1_replica1/data/indexbackup");
String indexfile = "_k.fdt";
Codec codec = new Lucene54Codec();
Directory dir = FSDirectory.open(indexpath);
String segmentName = "_k";
segmentID = new byte[StringHelper.ID_LENGTH];
IOContext ioContext = new IOContext();
SegmentInfo segmentInfos = codec.segmentInfoFormat().read(dir, segmentName, segmentID, ioContext.READ);
System.out.println(segmentInfos);
}
}
And the error message is:
Exception in thread "main" org.apache.lucene.index.CorruptIndexException: file mismatch, expected id=0, got=2umd1rtwuv6lu48qbzywr533s (resource=BufferedChecksumIndexInput(MMapIndexInput(path="<solrhome>/example/cloud/node1/solr/gettingstarted_shard1_replica1/data/indexbackup/_k.si")))
at org.apache.lucene.codecs.CodecUtil.checkIndexHeaderID(CodecUtil.java:266)
at org.apache.lucene.codecs.CodecUtil.checkIndexHeader(CodecUtil.java:256)
at org.apache.lucene.codecs.lucene50.Lucene50SegmentInfoFormat.read(Lucene50SegmentInfoFormat.java:86)
at com.datafireball.Readfdt.main(Readfdt.java:29)
Suppressed: org.apache.lucene.index.CorruptIndexException: checksum passed (13f6e228). possibly transient resource issue, or a Lucene or JVM bug (resource=BufferedChecksumIndexInput(MMapIndexInput(path="<solrhome>/example/cloud/node1/solr/gettingstarted_shard1_replica1/data/indexbackup/_k.si")))
at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:379)
at org.apache.lucene.codecs.lucene50.Lucene50SegmentInfoFormat.read(Lucene50SegmentInfoFormat.java:117)
... 1 more
Last but not least, I am new to Java in general and wonder what the best practice is for quickly locating the right class/code to deserialize any serialized object.

Java MongoDB getting value for sub document

I am trying to get the value of a key from a sub-document and I can't seem to figure out how to use the BasicDBObject.get() function since the key is embedded two levels deep. Here is the structure of the document
File {
name: file_1
report: {
name: report_1,
group: RnD
}
}
Basically a file has multiple reports and I need to retrieve the names of all reports in a given file. I am able to do BasicDBObject.get("name") and I can get the value "file_1", but how do I do something like this BasicDBObject.get("report.name")? I tried that but it did not work.
You should first get the "report" object and then access its contents. See the sample code below.
DBCursor cur = coll.find();
for (DBObject doc : cur) {
String fileName = (String) doc.get("name");
System.out.println(fileName);
DBObject report = (BasicDBObject) doc.get("report");
String reportName = (String) report.get("name");
System.out.println(reportName);
}
I found a second way of doing it in another post (I didn't save the link, otherwise I would have included it).
((BasicDBObject) query.get("report")).getString("name")
where query = (BasicDBObject) cursor.next()
You can also use queries, as in the case of MongoTemplate and so on...
Query query = new Query(Criteria.where("report.name").is("some value"));
You can try this, this worked for me
BasicDBObject query = new BasicDBObject("report.name", "some value");
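To make the difference concrete: dotted names such as "report.name" are understood in query filters, but get("report.name") on an already fetched document looks for a key with that literal name. The sketch below walks a nested structure one key at a time. It uses plain java.util.Map instead of BasicDBObject so it runs without the driver (as far as I recall, BasicDBObject is itself a Map<String, Object>); the class name DottedPathSketch and the getPath helper are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DottedPathSketch {
    // Walk a nested map one key at a time; returns null if any step is missing.
    public static Object getPath(Map<String, Object> doc, String dottedPath) {
        Object current = doc;
        for (String key : dottedPath.split("\\.")) {
            if (!(current instanceof Map)) {
                return null;
            }
            current = ((Map<?, ?>) current).get(key);
        }
        return current;
    }

    public static void main(String[] args) {
        Map<String, Object> report = new LinkedHashMap<>();
        report.put("name", "report_1");
        report.put("group", "RnD");
        Map<String, Object> file = new LinkedHashMap<>();
        file.put("name", "file_1");
        file.put("report", report);

        // No key is literally called "report.name", so this prints null.
        System.out.println(file.get("report.name"));
        // Walking the path step by step prints report_1.
        System.out.println(getPath(file, "report.name"));
    }
}
```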
