mapping byte[] in Hibernate and adding file chunk by chunk

I have a web service that receives a 100 MB video file in chunks:
public void addFileChunk(Long fileId, byte[] buffer)
How can I store this file in a PostgreSQL database using Hibernate?
With plain JDBC this is straightforward. I would use the following code inside my web service method:
// Open the existing large object, seek to its end, and append the chunk.
LargeObject largeObject = largeObjectManager.open(fileId, LargeObjectManager.READWRITE);
int size = largeObject.size();
largeObject.seek(size);
largeObject.write(buffer);
largeObject.close();
How can I achieve the same functionality using Hibernate and store this file chunk by chunk?
Storing each file chunk in a separate row as bytea does not seem like a smart idea to me. Please advise.

It's not advisable to store 100 MB files in the database. I would store them in the filesystem instead, but since transactions are involved, using Servlets seems reasonable:
Process the HTTP request so that the received file is stored in some temporary location.
Open a transaction, persist the file metadata including the temporary location, and close the transaction.
Use an external process that monitors the temporary files to transfer each file to its final destination, from which it will be available to the user through some Servlet.
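As a rough sketch of the first step, each incoming chunk could simply be appended to a temporary file. The class name ChunkAppender, the "upload-<id>.part" naming scheme, and the directory layout are assumptions made for illustration; they are not part of the original question or of any library.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class ChunkAppender {

    /**
     * Appends one chunk to the temporary file that belongs to fileId
     * and returns the size of that file after the write.
     */
    public static long appendChunk(Path tempDir, long fileId, byte[] buffer) {
        try {
            Files.createDirectories(tempDir);
            // Hypothetical naming scheme: one temporary file per upload id.
            Path target = tempDir.resolve("upload-" + fileId + ".part");
            // CREATE makes the file on the first chunk, APPEND adds later chunks at the end.
            Files.write(target, buffer,
                    StandardOpenOption.CREATE,
                    StandardOpenOption.APPEND);
            return Files.size(target);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path dir = Paths.get(System.getProperty("java.io.tmpdir"),
                "chunks-" + System.nanoTime());
        appendChunk(dir, 42L, new byte[] {1, 2, 3});
        long size = appendChunk(dir, 42L, new byte[] {4, 5});
        System.out.println("Bytes stored so far: " + size);
    }
}
```

The web service method would call appendChunk for every received buffer. The metadata row then only needs the temporary path, not the bytes themselves.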

See http://in.relation.to/Bloggers/PostgreSQLAndBLOBs
Yes, bytea would be a bad fit. Hibernate has a way to keep using large objects, so you get to keep the streaming interface.

Related

Best approach to upload multiples files in Spring Boot

I'm working on a Spring Boot project where I have two entities.
Client entity:
@Entity
public class Client {
// mapping annotation ...
private Long id;
// mapping annotation ...
private String firstName;
// mapping annotation ...
private String lastName;
// mapping annotation ...
private Set<Document> listDocument;
....
}
Document entity:
@Entity
public class Document {
// mapping annotation ...
private Long id;
// mapping annotation ...
private String name;
// mapping annotation ...
private int size;
// mapping annotation ...
private Client client;
....
}
My app has a form where I enter all the information about a client. It also has a file input where I have to upload multiple documents. So, when I click the submit button, the client information should be persisted to the database along with the information about each document (name, size, ...) and the client id, and then the files should be uploaded to the server.
I'm using Spring Boot with Angular. I'm not asking for code; I just want to know the best approach to achieve this according to best practices.

I had a similar use case. We solved it with a file-zipping approach (it requires less storage and is fast for small documents). When the client uploads the documents, we create a new zip file and give it a unique name, without changing the names of the original documents. For example, you can build the unique name from clientID + uploadTime.
There are several ways to organize the storage (for rapid document retrieval):
Create only one directory (not ideal)
Create directories by ClientId
Create directories by UploadTime (day-wise, month-wise)
If all the documents are uploaded successfully, you can save the document information in the table. Note that storing the path of a document can cause problems if the path changes in the future, so store only the name of the document. Because you need to store the details of each document, you can create two tables: one with id (pk), client id, and zip filename; another with id (fk), document name, size, etc.
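A minimal sketch of the zipping step, using only java.util.zip, might look like this. The class name DocumentZipper, the "clientId_uploadTime.zip" naming pattern, and keeping the archive in memory are assumptions for illustration. Real code would more likely write to a file stream in one of the directories described above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class DocumentZipper {

    /** Builds a unique archive name from the client id and the upload time. */
    public static String zipName(long clientId, long uploadTimeMillis) {
        return clientId + "_" + uploadTimeMillis + ".zip";
    }

    /** Packs the documents (original name -> content) into one zip archive. */
    public static byte[] zip(Map<String, byte[]> documents) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            for (Map.Entry<String, byte[]> doc : documents.entrySet()) {
                // The entry keeps the original document name unchanged.
                zip.putNextEntry(new ZipEntry(doc.getKey()));
                zip.write(doc.getValue());
                zip.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Lists the entry names of an archive, e.g. to fill the document table. */
    public static List<String> entryNames(byte[] archive) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(archive))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }
}
```

Only the value returned by zipName would go into the first table. The per-document rows can be filled from the same map that was zipped.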
You can configure the maximum file size and maximum request size in application.properties, as below:
# MULTIPART (MultipartProperties)
spring.servlet.multipart.enabled=true # Whether to enable support of multipart uploads.
spring.servlet.multipart.file-size-threshold=0B # Threshold after which files are written to disk.
spring.servlet.multipart.location= # Intermediate location of uploaded files.
spring.servlet.multipart.max-file-size=1MB # Max file size.
spring.servlet.multipart.max-request-size=10MB # Max request size.
spring.servlet.multipart.resolve-lazily=false # Whether to resolve the multipart request lazily at the time of file or parameter access.

I did not understand the essence of the question.
In my opinion, the files should be uploaded to the storage first. The upload operation should be transactional (all or nothing): an error during any single file upload fails the whole upload. If the upload was successful, then save the information about the files to the database.
I suggest storing the following additional information about uploaded files:
Date and time the file was uploaded
Id of the request, so you know which files were uploaded within one request. You can use the time in milliseconds (System.currentTimeMillis()) or a UUID (UUID.randomUUID().toString())
Also, if the system contains a lot of files, I recommend storing them in separate directories to speed up searching. You can split directories by creation time (for example, a new directory every month) or by user id. It all depends on the search criteria for the files.
I would recommend renaming files to some unique id (a UUID, for example) before storing them, to avoid collisions. Of course, you should store both the original and the renamed file names in the database. This approach also prevents users from guessing file names if the directory with the files is publicly accessible. In other words, users can't guess other users' file URLs, such as https://file-storage/user-john-dou/logo.jpg
If you are working with images, you can consider resizing them before storing.
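To make the renaming and directory suggestions concrete, here is a small sketch. The class name StoragePaths, the "yyyy-MM" directory pattern, and the choice to keep the original extension are assumptions for illustration only.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.UUID;

public class StoragePaths {

    // One directory per month, computed in UTC so the result does not
    // depend on the server's default time zone.
    private static final DateTimeFormatter MONTH =
            DateTimeFormatter.ofPattern("yyyy-MM").withZone(ZoneOffset.UTC);

    /** Returns the extension including the dot, or "" if there is none. */
    public static String extensionOf(String originalName) {
        int dot = originalName.lastIndexOf('.');
        return dot > 0 ? originalName.substring(dot) : "";
    }

    /** Builds "<month>/<uuid><extension>" for the stored copy of a file. */
    public static String storedPath(String originalName, Instant uploadedAt, UUID id) {
        return MONTH.format(uploadedAt) + "/" + id + extensionOf(originalName);
    }

    public static void main(String[] args) {
        // The original name ("logo.jpg") would be saved in the database
        // next to the generated path.
        System.out.println(storedPath("logo.jpg", Instant.now(), UUID.randomUUID()));
    }
}
```

Passing the timestamp and the UUID in as parameters, rather than generating them inside the method, keeps the function deterministic and lets the same values be written to the database row.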

SDMX-ML: SAS libname XML

Eurostat data can be downloaded via a REST API. The response format of the API is an XML file formatted according to the SDMX-ML standard. With SAS, very conveniently, one can access XML files with the libname statement and the XML or XMLv2 engine.
Currently, I am using the xmlv2 engine together with the automap= option to generate an xmlmap to access the data. It works. But the resulting SAS data sets are very unstructured, and for another data set to be downloaded the data structure might change. Also the request might depend on the DSD-file that Eurostat provides for each database item within a different XML file.
Here comes the code:
%let path = /your/working/directory/;
filename map "&path.map.txt";
filename resp "&path.resp.txt";
proc http
URL="http://ec.europa.eu/eurostat/SDMX/diss-web/rest/data/cdh_e_fos/..PC.FOS1.BE/?startperiod=2005&endPeriod=2011"
METHOD="GET"
OUT=resp;
run;quit;
libname resp XMLv2 automap=REPLACE xmlmap=map;
proc datasets;
copy out=WORK in=resp;
run;quit;
With the code above, you can view all the downloaded data in your WORK library. It's a mess.
To download another time series change parameters of the URL according to Eurostat's description.
So here is my question
Is there a way to easily generate an xmlmap from a call to the DSD file so that the data are stored in a well-structured way?
As the SDMX-ML standard is widely used by public institutions such as the ECB, Eurostat, the OECD... I am wondering whether somebody has already implemented requests to these databases. I know about the tool from Banca d'Italia, which uses a javaObject. However, I was wondering if there might be a solution without the javaObject.

Uploading a picture to mysql using jsp [duplicate]

I have to create a form using JavaScript in which a user uploads a JPG file and submits it along with other info such as name, email, etc. When the user clicks submit, all the information from the form is loaded into a value object. For the image file I've set the field to be byte[].
So assuming:
public String name;
public String email;
public byte[] logo;
I've set up a servlet as well to handle the submission but I'm not sure how to get started. How does the upload work? When user submits, how do I get the information for the image? Here's a screenshot: http://imageshack.us/f/32/77675354.png/ I need to convert that image and save it to a byte[] then convert to blob so I can insert it to a table.

For the file upload part, you need to set enctype="multipart/form-data" on the HTML form so that the web browser will send the file content, and then use request.getPart() in the servlet's doPost() method to get the file as an InputStream. For a concrete code example, see also How to upload files to server using JSP/Servlet?
Then, to save this InputStream in the DB, just use PreparedStatement#setBinaryStream() on a BLOB/varbinary/bytea column or whatever column represents "binary data" in your favorite DB engine.
preparedStatement = connection.prepareStatement("INSERT INTO user (name, email, logo) VALUES (?, ?, ?)");
preparedStatement.setString(1, name);
preparedStatement.setString(2, email);
preparedStatement.setBinaryStream(3, logo);
// ...
You don't necessarily need to convert this InputStream to byte[]; doing so would not be memory efficient either. Imagine that 100 users simultaneously upload images of 10 MB each: 1 GB of server memory would be allocated at that point.
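To illustrate why streaming is cheaper, here is a plain-Java sketch that copies an InputStream to an OutputStream through a fixed 8 KB buffer. The class name StreamCopy and the buffer size are assumptions; this is not what any particular JDBC driver does internally, it only shows that memory use is bounded by the buffer rather than by the size of the upload.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

public class StreamCopy {

    /** Copies everything from in to out and returns the number of bytes copied. */
    public static long copy(InputStream in, OutputStream out) {
        byte[] buffer = new byte[8192]; // the only per-upload allocation
        long total = 0;
        try {
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
                total += read;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    public static void main(String[] args) {
        // Stand-in for the uploaded part; a real servlet would pass the part's stream.
        InputStream upload = new ByteArrayInputStream(new byte[20000]);
        long copied = copy(upload, new ByteArrayOutputStream());
        System.out.println("Copied " + copied + " bytes");
    }
}
```

With this pattern, 100 simultaneous uploads hold roughly 100 buffers of 8 KB each in this loop, regardless of how large the images are.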

You probably should not be storing an image in a database. The database is literally the most expensive place to store binary data. The DB size will grow fast and the querying cost is high. You might end up with a non-scalable and barely efficient image hosting solution.
Store the image on a separate resource server, such as Amazon S3 or anything else (a local Nginx, Tomcat, etc.).
In the database, store only unique file names and/or the full path to the file. That lightens the DB's workload and keeps the column data readable, so you can quickly find the desired picture. Not to mention performance in general; a simple benchmark will easily prove it.

Writing to a PDF from inside a GAE app

I need to read several megabytes (raw text strings) out of my GAE Datastore and then write them all to a new PDF file, and then make the PDF file available for the user to download.
I am well aware of the sandbox restrictions that prevent you from writing to the file system. I am wondering if there is a crafty way of creating a PDF in-memory (or a combo of memory and the blobstore) and then storing it somehow so that the client-side (browser) can actually pull it down as a file and save it locally.
This is probably a huge stretch, but my only other option is to farm this task out to a non-GAE server, which I would like to avoid at all cost, even if it takes a lot of extra development on my end. Thanks in advance.

You can definitely achieve your use case using GAE itself. Here are the steps that you should follow at a high level:
Download the excellent iText library, which is a Java library to work with PDFs. First build out your Java code to generate the PDF content. Check out various examples at : http://itextpdf.com/book/toc.php
Since you cannot write to a file directly, you need to generate your PDF content in bytes and then write a Servlet which will act as a Download Servlet. The Servlet will use the Response object to open a stream, manipulate the Mime Headers (filename, filetype) and write the PDF contents to the stream. A browser will automatically present a download option when you do that.
Your Download Servlet will have high level code that looks like this:
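import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class DownloadPDF extends HttpServlet {
    public void doGet(HttpServletRequest req, HttpServletResponse res)
            throws ServletException, IOException {
        // Extract some request parameters, fetch your data and generate your document
        String fileName = "<SomeFileName>.pdf";
        res.setContentType("application/pdf");
        // "attachment" makes the browser offer a download instead of rendering inline
        res.setHeader("Content-Disposition", "attachment;filename=\"" + fileName + "\"");
        writePDF(<SomeObjectData>, res.getOutputStream());
    }
}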
Remember that the writePDF method above is your own method, in which you use iText's Document and other classes to generate the data and write it to the output stream that you passed as the second parameter.

I'm not familiar with PDF generation on Google App Engine, especially in Java, but once you have the PDF you can definitely store it and serve it later.
I suppose generating the PDF will take more than 30 seconds, so you will have to consider using the Task Queue Java API for this process.
After you have the file in memory you can simply write it to the Blobstore and later serve it as a regular blob. In the overview you will find a fully functional example on how to upload, write and serve your binary data (blobs) on Google App Engine.

I found a couple of solutions by googling. Please note that I have not actually tried these libraries, but hopefully they will be of help.
PDFJet (commercial)
Write a Google Drive document and export to PDF

The Process to Store & Retrieve an Image in a Database

I have never saved and retrieved an image to and from the database before. I wrote down what I guessed would be the process. I would just like to know if this is correct though:
Save image:
Select & Upload image file from jsp (Struts 2) which will save it as a .tmp file.
Convert the .tmp file to a byte[] array (Java Server-Side)
Store the byte[] array as a blob in the database (Java Server-Side)
Get image:
Get the byte[] array from the database (Java Server-Side)
Convert the byte[] array to an image file (Java Server-Side)
Create the file in a location (Java Server-Side)
Use an img tag to display the file (JSP Client-Side)
Delete the file after it's finished being used? (Java Server-Side)
I'm aware of the fact that it is highly recommended to not save & retrieve images to and from the database. I would like to know how to do it anyway.
Thanks

Almost correct.
It's expensive and not so great to create the file on the fly and then delete it.
Yes, you store it as the raw bytes in the database, but the way to retrieve it and display it to a client machine is to implement a web handler that sets the content-type of the response to the appropriate MIME type and then dumps the bytes out to the response stream.

Yes, you got it right.
Save image:
The decision about how to save the image depends very much on how it will be used later. One option is to save the file on the file system. The location of the saved file should then be stored with the metadata in the database table.
Get image:
You do not have to write the file data to any temporary location. It can easily be rendered straight from the database. Just send a request from the client and intercept that request in a specially designed Servlet. This Servlet will read the file metadata and the corresponding file and, if successful, write the file to the response stream.
