Best approach to upload multiple files in Spring Boot - java

I'm working on a Spring Boot project where I have two entities.
Client entity :
@Entity
public class Client {
    // mapping annotation ...
    private Long id;
    // mapping annotation ...
    private String firstName;
    // mapping annotation ...
    private String lastName;
    // mapping annotation ...
    private Set<Document> listDocument;
    ....
}
Document entity :
@Entity
public class Document {
    // mapping annotation ...
    private Long id;
    // mapping annotation ...
    private String name;
    // mapping annotation ...
    private int size;
    // mapping annotation ...
    private Client client;
    ....
}
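For reference, a minimal sketch of how the relationship between the two entities is commonly mapped with standard JPA annotations; the identity-generated ids and the client_id join column are assumptions, since the question elides the mapping annotations.

import java.util.HashSet;
import java.util.Set;
import javax.persistence.*;

// Client.java
@Entity
public class Client {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    private String firstName;
    private String lastName;

    // one client owns many documents; "client" refers to the field on Document below
    @OneToMany(mappedBy = "client", cascade = CascadeType.ALL, orphanRemoval = true)
    private Set<Document> listDocument = new HashSet<>();
}

// Document.java
@Entity
public class Document {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    private String name;
    private int size;

    // many documents belong to one client; the FK column name is an assumption
    @ManyToOne(fetch = FetchType.LAZY)
    @JoinColumn(name = "client_id")
    private Client client;
}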
My app has a form where I enter all the client's information. I also have a file input where I upload multiple documents. So, when I click the submit button, the client information has to be persisted to the database, together with all the information (document name, size, ...) about the documents (linked to the client id), and then the files have to be uploaded to the server.
I'm using Spring Boot with Angular. I'm not asking for code; I just want to know the best approach to achieve this according to best practices.

I also had a similar use case. We handled it with a file-zipping approach (it requires less storage and is fast for small documents). When the client uploads the documents, we create a new zip file and give it a unique name
(without changing the names of the original documents). For example, you can build a unique name from clientId + uploadTime.
Now, to store the zip files there are multiple options (for rapid document retrieval):
Create only one directory (not an ideal way)
Create directories according to ClientId
Create directories according to UploadTime (DayWise, MonthWise)
If all the documents are uploaded successfully, then you can save the information about the documents in the database. Note that storing the path of a document can create a problem if the path changes in the future, so store only the name of the document. Since you need to store the details of each document, you can create two tables: one with id (PK), clientId and zip filename, and another with id (FK), document name, size, etc.
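For illustration, a minimal sketch of what the upload endpoint for this approach could look like; the /clients path, the storage root, the per-client directory layout and the commented-out clientService call are assumptions, not part of the answer.

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

import org.springframework.http.HttpStatus;
import org.springframework.http.MediaType;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.multipart.MultipartFile;

@RestController
@RequestMapping("/clients")
public class ClientUploadController {

    // assumed storage root; split into per-client directories as suggested above
    private static final Path STORAGE_ROOT = Paths.get("/var/app/uploads");

    @PostMapping(consumes = MediaType.MULTIPART_FORM_DATA_VALUE)
    public ResponseEntity<Void> uploadDocuments(@RequestParam("clientId") Long clientId,
                                                @RequestParam("files") List<MultipartFile> files) throws IOException {
        // unique zip name built from the client id (or a temporary key) and the upload time
        String zipName = clientId + "-" + System.currentTimeMillis() + ".zip";
        Path clientDir = STORAGE_ROOT.resolve(String.valueOf(clientId));
        Files.createDirectories(clientDir);

        try (ZipOutputStream zip = new ZipOutputStream(Files.newOutputStream(clientDir.resolve(zipName)))) {
            for (MultipartFile file : files) {
                zip.putNextEntry(new ZipEntry(file.getOriginalFilename())); // keep the original names
                zip.write(file.getBytes());
                zip.closeEntry();
            }
        }

        // only after the zip has been written successfully, persist the client and document metadata
        // clientService.save(clientId, zipName, files);   // assumed service
        return ResponseEntity.status(HttpStatus.CREATED).build();
    }
}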
You can configure the max file size and max request size in application.properties as below:
# MULTIPART (MultipartProperties)
spring.servlet.multipart.enabled=true # Whether to enable support of multipart uploads.
spring.servlet.multipart.file-size-threshold=0B # Threshold after which files are written to disk.
spring.servlet.multipart.location= # Intermediate location of uploaded files.
spring.servlet.multipart.max-file-size=1MB # Max file size.
spring.servlet.multipart.max-request-size=10MB # Max request size.
spring.servlet.multipart.resolve-lazily=false # Whether to resolve the multipart request lazily at the time of file or parameter access.

I did not understand the essence of the question.
In my opinion, you should upload the files to the storage first. The upload operation should be transactional (all or nothing): an error during any file upload fails the whole upload. If the upload was successful, then save the information about the files to the database.
I suggest storing the following additional information about the uploaded files:
Date and time the file was uploaded
Id of the request, so you know that multiple files were uploaded within one request. You can use the time in milliseconds (System.currentTimeMillis()) or a UUID (UUID.randomUUID().toString())
Also, if the system contains a lot of files, I recommend storing them in separate directories to speed up the search. You can split directories by creation time (for example, a new directory every month) or by user id. It all depends on the search criteria for the files.
I would also recommend renaming files with a unique id (a UUID, for example) before storing them, to avoid collisions. Of course, you should store both the original and the renamed file names in the database. This approach also prevents a user from guessing another user's file names if the directory with files is exposed, e.g. https://file-storage/user-john-dou/logo.jpg. A sketch of this renaming step is shown below.
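A minimal sketch of that renaming step, assuming a Spring MultipartFile upload; the /var/app/files root, the per-user directory and the StoredFile holder are hypothetical names for illustration.

import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.time.Instant;
import java.util.UUID;

import org.springframework.web.multipart.MultipartFile;

public class FileStorageService {

    // hypothetical holder for the original and the generated (stored) file name
    public static class StoredFile {
        public final String originalName;
        public final String storedName;
        public final Instant uploadedAt;

        public StoredFile(String originalName, String storedName, Instant uploadedAt) {
            this.originalName = originalName;
            this.storedName = storedName;
            this.uploadedAt = uploadedAt;
        }
    }

    public StoredFile store(MultipartFile file, long userId) throws IOException {
        // rename to a UUID so names never collide and cannot be guessed from public URLs
        String storedName = UUID.randomUUID().toString();
        Path userDir = Paths.get("/var/app/files", String.valueOf(userId)); // per-user directory
        Files.createDirectories(userDir);
        file.transferTo(new File(userDir.toFile(), storedName));
        // persist both names (plus date/time and request id) in the database
        return new StoredFile(file.getOriginalFilename(), storedName, Instant.now());
    }
}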
If you are working with images, you can think about resizing them before storing.

Related

SDMX-ML: SAS libname XML

Eurostat data can be downloaded via a REST API. The response format of the API is an XML file formatted according to the SDMX-ML standard. With SAS, very conveniently, one can access XML files with the libname statement and the XML or XMLv2 engine.
Currently, I am using the XMLv2 engine together with the automap= option to generate an xmlmap to access the data. It works, but the resulting SAS data sets are very unstructured, and for another data set to be downloaded the data structure might change. Also, the request might depend on the DSD file that Eurostat provides for each database item within a different XML file.
Here comes the code:
%let path = /your/working/directory/;
filename map "&path.map.txt";
filename resp "&path.resp.txt";
proc http
URL="http://ec.europa.eu/eurostat/SDMX/diss-web/rest/data/cdh_e_fos/..PC.FOS1.BE/?startperiod=2005&endPeriod=2011"
METHOD="GET"
OUT=resp;
run;quit;
libname resp XMLv2 automap=REPLACE xmlmap=map;
proc datasets;
copy out=WORK in=resp;
run;quit;
With the code above, you can view all the downloaded data in your WORK library. It's a mess.
To download another time series change parameters of the URL according to Eurostat's description.
So here is my question
Is there a way to easily generate an xmlmap from a call to the DSD file so that the data are stored in a well-structured way?
As the SDMX-ML standard is widely used in public institutions such as the ECB, Eurostat and the OECD, I am wondering if somebody has already implemented requests to these databases. I know about the tool from Banca Italia which uses a javaObject. However, I was wondering if there might be a solution without the javaObject.

Handle file upload in RFC 6902 Json Patch

I'm working on an application in which users can update their information. For the moment, RFC 6902 JSON Patch is used to update textual information (first name, last name, phone, ...) via a basic HTML form.
Users can now add images to their profile. Is there any way to use JSON Patch to perform multipart operations?
Note: the images are stored in a file system, so on the client side only the image path is given, and it can only be updated after the form submission. My DTO is as below:
public class ProfileDto {
    private Integer id;
    private String firstname;
    private String lastname;
    private String defaultMedia; // <-- image path
    ...
}
The solution I'm considering:
Since defaultMedia is of type String, JSON Patch can be used to update the image path. The idea is that, when the form is submitted, a multipart POST request uploads the image and returns its URL; defaultMedia of my DTO is then set to the new URL.
This solution can leave unused images behind if an error happens server-side during form submission, so I would need to add something to clean up the file system.
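A minimal sketch of that two-step flow: a separate multipart endpoint returns the stored image path, and the form submission then patches defaultMedia with it. The /images and /media paths and the storage directory are illustrative, not part of the question.

import java.io.File;
import java.io.IOException;
import java.util.UUID;

import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.multipart.MultipartFile;

@RestController
public class ImageUploadController {

    // Step 1: upload the image on its own and return the path to store in defaultMedia
    @PostMapping("/images")
    public ResponseEntity<String> upload(@RequestParam("file") MultipartFile file) throws IOException {
        String storedName = UUID.randomUUID() + "-" + file.getOriginalFilename();
        file.transferTo(new File("/var/app/images", storedName)); // assumed storage location
        return ResponseEntity.ok("/media/" + storedName);
    }
}

// Step 2: the client submits a regular JSON Patch with the returned path, e.g.
// PATCH /profiles/42
// [ { "op": "replace", "path": "/defaultMedia", "value": "/media/<returned-name>" } ]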
Is there any easier solution that meets my needs?
I'm using :
Spring Boot : 1.5.1
Angular 2 : 2.4.5

Why is no entity of kind _BlobInfo_ created in the datastore when the application is deployed on GAE?

When we upload files to the Blobstore on Google App Engine, we find that with every upload an entity of kind _BlobInfo_ is created, which can be seen in the local development console under the datastore viewer at http://localhost:8888/_ah/admin. However, after the application is deployed to App Engine, no such entities are created when we upload files to the Blobstore. This looks strange to me and I wanted to know if I'm missing something here.
_BlobInfo_ is not a special name and most likely your app doesn't create entities with this name.
In the production environment, __BlobInfo__ is the internal name used for storing information about blobs stored in the Blobstore. Note that there are two underscore characters (_) before and after the word BlobInfo. This entity is only created if your app creates and saves blobs into the Blobstore.
Since this is an internal entity, it is excluded from the Datastore Viewer by default. It is also excluded from the Datastore Statistics page, but these entities do appear as BlobInfo under Kind: "All Entities".
By using a little trick, you can also show detailed statistics for the __BlobInfo__ entity: choose any entity from the dropdown list, and after the page has reloaded, change the kind=XXX parameter in the URL to kind=__BlobInfo__ and hit Enter. The page will reload showing statistics for this kind even though it is hidden from the dropdown list.
However, you can list these entities. For example, go to the Datastore Viewer of your admin console and check "By GQL" so you can enter a GQL query to list your entities. Now enter the following GQL query:
SELECT * FROM __BlobInfo__
This will list your BlobInfo entities.
Note that the Blob Viewer page of your admin console also displays blobs based on the entities stored under the name __BlobInfo__. __BlobInfo__ entities also contain more properties than just the ones displayed on the Blob Viewer page.
All the properties are the following:
ID/Name
content_type
creation
creation_handle
filename
md5_hash
size
upload_id
These are also available from your application if you happen to query these entities.
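For example, a minimal sketch using the App Engine low-level Datastore API to read the properties listed above; the printing is just illustrative.

import com.google.appengine.api.datastore.DatastoreService;
import com.google.appengine.api.datastore.DatastoreServiceFactory;
import com.google.appengine.api.datastore.Entity;
import com.google.appengine.api.datastore.Query;

public class BlobInfoLister {

    public void listBlobInfos() {
        DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();
        // __BlobInfo__ is the internal kind described above
        Query query = new Query("__BlobInfo__");
        for (Entity blobInfo : datastore.prepare(query).asIterable()) {
            String filename = (String) blobInfo.getProperty("filename");
            Long size = (Long) blobInfo.getProperty("size");
            System.out.println(filename + " (" + size + " bytes)");
        }
    }
}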

Document processing in Liferay portal

I've been using Liferay a lot for the past 2 years, but I have never needed any extensive document management.
Now I have a portlet where users upload documents (MS Office OLE2 documents, ODS documents, PDF, etc.) and I have to persist them with all the metadata available.
I know how I would do that without using Liferay: I'd probably use Apache Solr with Apache Tika (UpdateRichDocuments and ExtractingRequestHandler) or Apache Jackrabbit, which uses Apache Tika under the hood (org.apache.jackrabbit.extractor.*).
The problem is that if I look at the Liferay trunk, there are some key classes:
Hooks (JCRHook, FileSystemHook, CMISHook, s3Hook) that are employed from within DLLocalServiceImpl more or less directly
Another alternative is using DLAppLocalServiceImpl, which employs DLRepositoryLocalServiceImpl; the files are persisted into the repository also via hooks, but a lot of additional stuff is done in there.
There is no jackrabbit-text-extractors library in Liferay, so I suppose that if I wanted metadata to be extracted from PDF, DOC and ODS documents, I would have a very hard time... because the DL service layer doesn't accept additional properties.
I think I'd have to avoid using the DL services and the JCR hook and access Jackrabbit directly... but I would lose the compatibility and the possibility to migrate my repository, etc.
Could anybody please elaborate on this one? Thank you
SOLR for indexing, Jackrabbit for document storage. Managing the Liferay Document Library in code is fairly easy; just look at the DL*LocalServiceUtil classes, namely DLFolderLocalServiceUtil and DLFileLocalServiceUtil. By default Liferay just creates a matching folder/file structure on the hard drive (with names changed), so you'd only need to write code or use Jackrabbit if you wanted more than this, since Liferay allows upload/download and viewing out of the box via the control panel and various portlets.
I haven't used Jackrabbit with Liferay, but once configured, everything should be managed under the covers and you shouldn't need to worry about it on the front end.
When you say "with all metadata available" I'm not sure what is retained, but aside from renaming the file so that it can be tracked, there shouldn't be any other changes. It should be quick and easy to test by uploading a file of each type and checking the entries in the LIFERAY/data/document_library directory and its subdirectories. Again, this would be different if Jackrabbit is used.
Those two services, DLLocalServiceImpl and DLAppLocalServiceImpl, both are and will remain, I suppose, important. The former is for direct access to the repository. Notice that when adding a file via this service you need to persist the corresponding DLFileEntry into the database and then reference it via addFile(..., fileEntryId, ...).
The latter service does additional stuff for you, mainly asset management and workflow.
Regarding your use case, I would avoid using the document library, because no metadata can go down into the JCR repository. Actually, the only metadata/custom properties that you could store would be custom properties, a.k.a. the Expando feature of Liferay portal.
The best way for you seems to be to implement your own Jackrabbit hook to store data in the repository and let the Liferay document library use that repository.
I think Edgar is correct. If you check the current trunk via http://svn.liferay.com/repos/public/portal/trunk/portal-service/src/com/liferay/documentlibrary/service/DLLocalService.java (log in as guest with no password), you will no longer find the class DLFolderLocalServiceUtil. We are using the existing DLFolderLocalServiceUtil class as well. Thanks for the heads up. We will refactor our code so that when 6.1 comes around we can still use the Document Library services.
You always need to use DLAppServiceUtil (as Liferay specifically instructs). Here is my working code that saves a file to the CMS:
public static void saveFileToCMS(ActionRequest aReq, long groupId, String fileName, File filenameWithPath) {
    try {
        ServiceContext serviceContext = ServiceContextFactory.getInstance(
                Group.class.getName(), aReq);

        // random suffix prevents duplicate entries based on the unique title name
        Random rand = new Random();
        Integer suffix = new Integer(rand.nextInt(10000));

        DLAppServiceUtil.addFileEntry(groupId, 0, fileName, "application/vnd.ms-excel",
                fileName + suffix.toString(), "description goes here", "changelogname",
                filenameWithPath, serviceContext);
        //log.info("Successfully added the new file");
    } catch (PortalException pe) {
        log.error("Portal Exception occurred while saving file to CMS");
        pe.printStackTrace();
    } catch (SystemException e) {
        log.error("System Exception occurred while saving file to CMS");
        e.printStackTrace();
    }
}

mapping byte[] in Hibernate and adding file chunk by chunk

I have a web service which receives a 100 MB video file in chunks:
public void addFileChunk(Long fileId, byte[] buffer)
How can I store this file in a PostgreSQL database using Hibernate?
Using regular JDBC it is very straightforward; I would use the following code inside my web service method:
// PostgreSQL JDBC large-object API (org.postgresql.largeobject.*)
LargeObject largeObject = largeObjectManager.open(fileId, LargeObjectManager.READWRITE);
int size = largeObject.size();
largeObject.seek(size);      // append after the existing data
largeObject.write(buffer);
largeObject.close();
How can I achieve the same functionality using Hibernate, storing the file chunk by chunk?
Storing each file chunk in a separate row as bytea does not seem like a smart idea to me. Please advise.
It's not advisable to store 100 MB files in the database. I would instead store them in the filesystem, but since the operation has to be transactional, something along the following lines seems reasonable:
process the HTTP request so that the received file is stored in some temporary location.
open a transaction, persist the file metadata including the temporary location, close the transaction.
using some external process which monitors the temporary files, transfer the file to its final destination, from which it will be available to the user through some Servlet.
see http://in.relation.to/Bloggers/PostgreSQLAndBLOBs
Yeah, byteas would be bad. Hibernate has a way to keep using large objects, and you get to keep the streaming interface.
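A minimal sketch of that large-object route, assuming an entity with a java.sql.Blob field mapped with @Lob and Hibernate's LobHelper to create the blob from a stream; the entity and field names are illustrative.

import java.io.InputStream;
import java.sql.Blob;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Lob;
import org.hibernate.Session;

@Entity
public class VideoFile {

    @Id
    private Long id;

    @Lob
    private Blob content;   // stored as a PostgreSQL large object rather than bytea

    public void setId(Long id) { this.id = id; }
    public void setContent(Blob content) { this.content = content; }
}

class VideoFileDao {

    // streams the data into the database without loading it all into memory
    void save(Session session, Long id, InputStream data, long length) {
        VideoFile file = new VideoFile();
        file.setId(id);
        file.setContent(session.getLobHelper().createBlob(data, length));
        session.save(file);
    }
}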
