MIME type via Java Apache Tika

I have a problem with file type detection.
On the development server and on the production servers, Apache Tika detects all kinds of files correctly. But on the test server, most of the time I get:
'application/octet-stream'
public static String detectMimeType(final File file) throws IOException {
TikaInputStream tikaIS = null;
try {
tikaIS = TikaInputStream.get(file);
final Metadata metadata = new Metadata();
return DETECTOR.detect(tikaIS, metadata).toString();
} finally {
if (tikaIS != null) {
tikaIS.close();
}
}
}
I can't understand the problem. Please help.

application/octet-stream is the fallback when no more specific MIME type could be detected. It simply means that the file is being treated as an arbitrary series of octets (bytes).
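To make the fallback behaviour concrete, here is a minimal sketch of signature-based detection written with the plain JDK. It is not how Tika is implemented internally; the class name `MagicSniffer` and the three signatures it checks are assumptions chosen for illustration. Anything that matches no known signature ends up as application/octet-stream.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration only; this is NOT Tika's implementation.
class MagicSniffer {
    static final String FALLBACK = "application/octet-stream";

    // Returns a MIME type for a few well-known leading signatures,
    // otherwise the generic fallback.
    static String detect(byte[] head) {
        if (startsWith(head, "%PDF-".getBytes(StandardCharsets.US_ASCII))) {
            return "application/pdf";
        }
        if (startsWith(head, new byte[] {(byte) 0x89, 'P', 'N', 'G'})) {
            return "image/png";
        }
        if (startsWith(head, new byte[] {'P', 'K', 3, 4})) {
            return "application/zip";
        }
        return FALLBACK; // nothing matched: "just bytes"
    }

    // True if data begins with the given prefix bytes.
    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        return Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }
}
```

A detector that knows fewer signatures, or that cannot read the file content, would return the fallback for more files.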

Spring Batch Java configuration - Writing to remote sftp xml file without local file

I have a requirement to write an XML file to an SFTP server in a Spring Batch application. Currently the code below writes the XML file to the local file system using StaxEventItemWriter. I need to write directly to the remote server instead of writing locally and then moving the file to the SFTP server. I referred to this link (Writing to a remote file using Spring Integrations Sftp Streaming java configuration), but I am not sure how to use StaxEventItemWriter with it, or how to set up a Resource object for the remote file.
public void write(List<? extends UserDTO> items) throws Exception {
for(UserDTO item : items) {
StaxEventItemWriter<UserDTO> staxWriter = getStaxEventItemWriter(item);
staxWriter.write(Arrays.asList(item));
}
}
private StaxEventItemWriter<UserDTO> getStaxEventItemWriter(UserDTO user) {
String key = user.getDomain();
StaxEventItemWriter<UserDTO> writer = writers.get(key);
if (writer == null) {
writer = new StaxEventItemWriter<>();
try {
UrlResource resource = new UrlResource("file:"+outputDir+"/"+key+"_"+fileName+".xml");
writer.setResource(resource);
writer.setRootTagName("customerSet");
Jaxb2Marshaller UserMarshaller = new Jaxb2Marshaller();
UserMarshaller.setClassesToBeBound(UserDTO.class);
writer.setMarshaller(UserMarshaller);
writer.setOverwriteOutput(Boolean.TRUE);
writer.open(executionContext);
} catch (MalformedURLException e) {
e.printStackTrace();
}
writers.put(key, writer);
}
return writer;
}
You can probably try to use SftpResource which is based on Spring Integration (similar to the solution in the link you shared) and use it in your StaxEventItemWriter.

FTPS storeFile always returns false (Java)

I'm trying to send files to an FTPS server.
Connection method: FTPS, ACTIVE, EXPLICIT
setFileType(FTP.BINARY_FILE_TYPE);
setFileTransferMode(FTP.BLOCK_TRANSFER_MODE);
Checking the reply string right after connecting, I get:
234 AUTH command ok. Expecting TLS Negotiation.
That code is described as:
234 Specifies that the server accepts the authentication mechanism specified by the client, and the exchange of security data is complete. A higher level nonstandard code created by Microsoft.
When I try to send a file with storeFile or storeUniqueFile, I get false.
Checking the reply string right after storeFile, I get: 501 Server cannot accept argument.
What is weird is that I was able to create a directory without any issues,
using makeDirectory("test1");
I tried the suggestions in both of these links: link1, link2
For example, when I called ftp.enterLocalPassiveMode(); before ftp.storeFile(destinationfile, in);
I got a timeout error.
Does anyone have any idea how to solve it?
public static void main(String[] args) throws Exception {
FTPSProvider ftps = new FTPSProvider();
String json = "connection details";
DeliveryDetailsFTPS details = gson.fromJson(json, DeliveryDetailsFTPS .class);
File file = File.createTempFile("test", ".txt");
FileUtils.write(file, " some test", true);
try (FileInputStream stream = new FileInputStream(file)) {
ftps.sendInternal(ftps.getClient(details), details, stream, file.getName());
}
}
protected void sendInternal(FTPClient client, DeliveryDetailsFTPS details, InputStream stream, String filename) throws Exception {
try {
// release the enc
DeliveryDetailsFTPS ftpDetails = (DeliveryDetailsFTPS) details;
setClient(client, ftpDetails);
boolean isSaved = false;
try (BufferedInputStream bis = new BufferedInputStream(stream)) {
isSaved = client.storeFile(filename, bis);
}
client.makeDirectory("test1");
client.logout();
if (!isSaved) {
throw new IOException("Unable to upload file to FTP");
}
} catch (Exception ex) {
LOG.debug("Unable to send to FTP", ex);
throw ex;
} finally {
client.disconnect();
}
}
@Override
protected FTPClient getClient(DeliveryDetails details) {
return new FTPSClient(isImplicitSSL((DeliveryDetailsFTPS ) details));
}
protected void setClient(FTPClient client, DeliveryDetailsFTPS details) throws Exception {
DeliveryDetailsFTPS ftpDetails = (DeliveryDetailsFTPS ) details;
client.setConnectTimeout(100000);
client.setDefaultTimeout(10000 * 60 * 2);
client.setControlKeepAliveReplyTimeout(300);
client.setControlKeepAliveTimeout(300);
client.setDataTimeout(15000);
client.connect(ftpDetails.host, ftpDetails.port);
client.setBufferSize(1024 * 1024);
client.login(ftpDetails.username, ftpDetails.getSensitiveData());
client.setControlEncoding("UTF-8");
int code = client.getReplyCode();
if (code == 530) {
throw new IOException(client.getReplyString());
}
// Set binary file transfer
client.setFileType(FTP.BINARY_FILE_TYPE);
client.setFileTransferMode(FTP.BLOCK_TRANSFER_MODE);
if (ftpDetails.ftpMode == FtpMode.PASSIVE) {
client.enterLocalPassiveMode();
}
client.changeWorkingDirectory(ftpDetails.path);
}
I have tried this solution as well, but it didn't solve the problem.
The only way I was able to send a file is with FileZilla, which uses FTPES.
But I need my Java code to do it. Can anyone give me a clue?
I tried almost every solution offered on different websites and could not make it work with the Apache FTPS client.
I had to use a different class, which worked like a charm. Here is a snippet using com.jscape.inet.ftps:
private Ftps sendWithFtpsJSCAPE(ConnDetails details, InputStream stream, String filename) throws FtpException, IOException {
Ftps ftp;
FtpConnectionDetails ftpDetails = (FtpConnectionDetails) details;
ftp = new Ftps(ftpDetails.getHost(), ftpDetails.getUsername(), ftpDetails.getPassword());
if (ftpDetails.getSecurityMode().equals(FtpConnectionDetails.SecurityMode.EXPLICIT)) {
ftp.setConnectionType(Ftps.AUTH_TLS);
} else {
ftp.setConnectionType(Ftps.IMPLICIT_SSL);
}
ftp.setPort(ftpDetails.getPort());
if (!ftpDetails.getFtpMode().equals(FtpMode.ACTIVE)) {
ftp.setPassive(true);
}
ftp.setTimeout(FTPS_JSCAPE_TIME_OUT);
ftp.connect();
ftp.setBinary();
ftp.setDir(ftpDetails.getPath());
ftp.upload(stream, filename);
return ftp;
}

Generating a PDF with wkhtmltopdf and downloading it

I am working on an old Spring MVC project. In it I have to generate a PDF file from a JSP page, store it in a location, and then download that file. I am using the wkhtmltopdf tool to convert one specific JSP page to PDF. Sometimes wkhtmltopdf works fine and generates the PDF in the specified location, but sometimes it needs more time. Also, when I try to download the file from that location, sometimes I get a 0 KB file, sometimes the downloaded file has some size but can't be opened, and sometimes the download works perfectly. If I check the file at the defined location, it exists and opens normally.
Here is my code in the controller class.
@RequestMapping(value="/dwn.htm",method=RequestMethod.GET)
public void dwAppFm(HttpSession session,HttpServletRequest request,HttpServletResponse response,@RequestParam String id) throws IOException,InterruptedException
{
final int BUFFER_SIZES=4096;
ServletContext context=request.getServletContext();
String savePath="/tmp/";//PDF file Generate Path
String fileName="PDFFileName"; //Pdf file name
FileInputStream inputStream=null;
BufferedInputStream bufferedInputStream=null;
OutputStream outputStream=null;
printApp(id,fileName);
Thread.sleep(1000);
printApp(id,fileName);
File download=new File(savePath+fileName+".pdf");
while(!download.canRead())
{
Thread.sleep(1000);
printApp(id,fileName);
download=new File(savePath+fileName+".pdf");
}
if(download.canRead()){//if the file can read
try{
Thread.sleep(1000);
inputStream=new FileInputStream(download);
bufferedInputStream=new BufferedInputStream(inputStream);
String mimeType = context.getMimeType(savePath+fileName+".pdf");
if (mimeType == null) {
mimeType = "application/octet-stream";
}
System.out.println("MIME type: " + mimeType);
response.setContentType(mimeType);
response.setContentLength((int)download.length());
String headerKey="Content-Disposition";
String headerValue=String.format("attachment;filename=\"%s\"", download.getName());
response.setHeader(headerKey, headerValue);
outputStream=response.getOutputStream();
byte[] buffer=new byte[BUFFER_SIZES];
int bytesRead=-1;
while ((bytesRead = bufferedInputStream.read(buffer)) != -1) {
outputStream.write(buffer, 0, bytesRead);
}
}catch(Exception e)
{
e.printStackTrace();
}
finally
{
try{
if(inputStream!=null)inputStream.close();
if(bufferedInputStream!=null)bufferedInputStream.close();
if(outputStream!=null)outputStream.close();
}
catch(Exception e)
{
e.printStackTrace();
}
}
}
}
public void printApp(String id,String fileName)
{
try{
String urlPath="http://localhost:8080/proj";
urlPath+="/genApp.htm?id="+id;//generate url to execute wkhtmltopdf
String wxpath="/home/exm/wkhtmltopdf";//the path where wkhtmltopdf located
String save="/tmp/"+fileName+".pdf";//File save Pathname
Process process=null;
process=Runtime.getRuntime().exec(wxpath+" "+urlPath+" "+save);
}catch(Exception e)
{}
}
@RequestMapping(value="/genApp.htm",method=RequestMethod.GET)
public String getApplicationPDF(HttpServletRequest request,HttpSession session,@RequestParam String id)
{
UDets uDets=uService.getAllById(Long.parseLong(id));//Methods to get details
request.setAttribute("uDets",uDets );
return "makeApp";//Name of the jsp page
}
In my code I call Thread.sleep(1000) and printApp(id,fileName) three times, because wkhtmltopdf sometimes fails to generate the PDF within a certain time, and then the probability of downloading a 0 KB file is higher. I haven't shared the JSP page, since it is an ordinary JSP page with a lot of lines (the generated PDF is two pages long).
So the question is: what should I change in my code so that the PDF file is generated and downloaded without failure, even under heavy load on the server?
If there is a better procedure or idea, please share it.
I would rather not use iText, since the JSP page has a complex design. Any advice is appreciated. Thanks in advance.
I would say that your code is seriously flawed. You check whether a file can be read and, if not, start another process that writes to the same file (at least twice). At some point you will end up with multiple processes trying to write to the same file, which results in strange behavior.
I would refactor the printApp method to return the Process it created, then call waitFor on that process. If waitFor returns 0 and isn't interrupted, the process completed successfully and you should be able to download the file.
@RequestMapping(value="/dwn.htm",method=RequestMethod.GET)
public void dwAppFm(HttpSession session,HttpServletRequest request,HttpServletResponse response,@RequestParam String id) throws IOException,InterruptedException
{
String savePath="/tmp/";//PDF file Generate Path
String fileName="PDFFileName.pdf"; //Pdf file name
File download = new File(savePath, fileName);
try {
Process process = printApp(id, download.getPath());
int status = process.waitFor(); // block until wkhtmltopdf has exited
if (status == 0) {
response.setContentType("application/pdf");
response.setContentLength((int)download.length());
String headerKey="Content-Disposition";
String headerValue=String.format("attachment;filename=\"%s\"", download.getName());
response.setHeader(headerKey, headerValue);
// try-with-resources closes the file stream after copying
try (InputStream in = new FileInputStream(download)) {
StreamUtils.copy(in, response.getOutputStream());
}
} else {
// do something if it fails.
}
} catch (IOException ioe) {
// Do something to handle exception
} catch (InterruptedException ie) {
// Do something to handle exception
}
}
public Process printApp(String id, String pdf) throws IOException {
String urlPath="http://localhost:8080/proj";
urlPath+="/genApp.htm?id="+id;//generate url to execute wkhtmltopdf
String wxpath="/home/exm/wkhtmltopdf";//the path where wkhtmltopdf located
String command = wxpath+" "+urlPath+" "+pdf;
return Runtime.getRuntime().exec(command);
}
Something like the code above should do the trick.
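As a variation on the same idea, the wait can be given an upper bound so that a hung converter does not block the request forever. The following is only a sketch under assumptions: the class name `ExternalTool`, the method `runAndWait`, and the timeout parameter are hypothetical, and `ProcessBuilder.Redirect.DISCARD` requires Java 9 or later.

```java
import java.io.IOException;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical helper; names are illustrative, not part of any framework.
class ExternalTool {
    // Runs the command once and blocks until it exits or the timeout elapses.
    // Returns true only if the process finished in time with exit code 0.
    static boolean runAndWait(List<String> command, long timeoutSeconds)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .redirectErrorStream(true)                        // merge stderr into stdout
                .redirectOutput(ProcessBuilder.Redirect.DISCARD)  // drop output so a full pipe cannot stall the child
                .start();
        if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            process.destroyForcibly(); // timed out: do not leave the process running
            return false;
        }
        return process.exitValue() == 0;
    }
}
```

With the paths from the question, a call might look like `ExternalTool.runAndWait(List.of("/home/exm/wkhtmltopdf", urlPath, pdfPath), 60)`, where the 60-second limit is an arbitrary choice. Under concurrent requests, a per-request output name (for example from `File.createTempFile`) would also keep two processes from writing to the same file.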

File download returns corrupted file (I think) in Play framework 2.2.2

I'm struggling with getting file upload/download to work properly in Play framework 2.2.2. I have a Student class with a field called "cv". It's annotated with @Lob, like this:
@Lob
public byte[] cv;
Here are the upload and download methods:
public static Result upload() {
MultipartFormData body = request().body().asMultipartFormData();
FilePart cv = body.getFile("cv");
if (cv != null) {
filenameCV = cv.getFilename();
String contentType = cv.getContentType();
File file = cv.getFile();
Http.Session session = Http.Context.current().session();
String studentNr = session.get("user");
Student student = Student.find.where().eq("studentNumber", studentNr).findUnique();
InputStream is;
try {
is = new FileInputStream(file);
student.cv = IOUtils.toByteArray(is);
} catch (IOException e) {
Logger.debug("Error converting file");
}
student.save();
flash("ok", "Vellykket! Filen " + filenameCV + " ble lastet opp til din profil");
return redirect(routes.Profile.profile());
} else {
flash("error", "Mangler fil");
return redirect(routes.Profile.profile());
}
}
public static Result download() {
Http.Session session = Http.Context.current().session();
Student student = Student.find.where().eq("studentNumber", session.get("user")).findUnique();
File f = new File("/tmp/" +filenameCV);
FileOutputStream fos;
try {
fos = new FileOutputStream(f);
fos.write(student.cv);
fos.flush();
fos.close();
} catch(IOException e) {
}
return ok(f);
}
The file seems to be correctly saved to the database (the cv field is populated with data, but it's obviously cryptic to me so I don't know for sure that the content is what it's supposed to be)
When I go to my website and click the "Download CV" link (which runs the download action), the file gets downloaded but can't be opened - saying the PDF viewer can't recognize the file etc. (Files uploaded have to be PDF)
Any ideas on what might be wrong?
Don't keep your files in the DB; the filesystem is much better for that! Save the uploaded file on disk under some unique name, then keep only the path to the file in your database as a String.
It's cheaper in the long run (as has been said many times).
It's easier to handle downloads, i.e. in Play all you need to serve a PDF is:
public static Result download() {
File file = new File("/full/path/to/your.pdf");
return ok(file);
}
It will set the proper headers, like Content-Disposition, Content-Length and Content-Type, and not only for PDFs.
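The upload side of this suggestion could be sketched as follows, using only the JDK. The class name `CvStore`, the method `store`, and the choice of a random UUID as the unique name are assumptions for illustration; the returned string is what would go into the database column instead of the byte array.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.UUID;

// Hypothetical helper; not part of the Play API.
class CvStore {
    // Copies the uploaded file into baseDir under a random name and
    // returns the path that would be saved in the database.
    static String store(Path uploaded, Path baseDir) throws IOException {
        Files.createDirectories(baseDir);
        Path target = baseDir.resolve(UUID.randomUUID() + ".pdf");
        Files.copy(uploaded, target); // byte-for-byte copy, no text decoding involved
        return target.toString();
    }
}
```

In the upload action from the question, `cv.getFile().toPath()` would be the natural first argument, and the download action would then open `new File(savedPath)`.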

HTTP GET to retrieve compressed files

I've developed an HTTP communication object for the purpose of downloading files via a GET request.
This works just fine when downloading a text file. However, downloading a compressed file such as zip, gz or tar.gz appears to download the file, but the file is not valid.
In the case of zip, I get a message saying it tried to move the pointer before the beginning of the file.
In the case of .tar.gz, the message is: Data error in file.tar. File is broken.
In all cases, the download links I use do allow a complete and correct download from the URL. Yet the Java-based download brings the file down in an invalid state.
The code is as follows:
public class HTTPCommunicationGet {
private URIBuilder sendData;
private URI target;
private HttpGet getConnection;
public HTTPCommunicationGet(String url, TreeMap<String, String> components) {
super(url, components);
}
public HTTPCommunicationGet(String url, String queryString) {
super(url, queryString);
}
protected void defineSendData() throws URISyntaxException, IOException {
this.sendData = new URIBuilder(new URI(this.getUrl()));
if (this.getComponents() != null && this.getComponents().size() > 0) {
for (Map.Entry<String, String> component : this.getComponents().entrySet()) {
this.sendData.setParameter(component.getKey(), component.getValue());
}
}
}
protected void retrieveRemoteData() throws IOException, MalformedURLException, URISyntaxException, DataMapHTTPGetException {
this.target = this.sendData.build();
this.getConnection = new HttpGet(target);
HttpResponse response = client.execute(this.getConnection);
if (response.getStatusLine().toString().toUpperCase().contains("200 OK")) {
this.setResponse(response.getStatusLine().toString(), "Data Retrieved");
BufferedReader rd = new BufferedReader(new InputStreamReader(response.getEntity().getContent()));
String line = "";
while ((line = rd.readLine()) != null) {
this.remoteData.append(line);
}
} else {
String message = String.format("%s: Provider connection exception; response returned was not 200 OK", this.target.toASCIIString());
this.setResponse(response.getStatusLine().toString(), message);
DataMapHTTPGetException ex = new DataMapHTTPGetException(target.toString(), message);
throw ex;
}
}
public void downloadFiles(String localFile) throws DataMapConnectionException, FileNotFoundException, IOException, URISyntaxException {
// check that we have remoteData set
this.defineSendData();
this.retrieveRemoteData(); // everything is bubbled up to the controller class that is calling this.
File localMetaFile = new File(localFile);
switch (this.archiveMetaFile(localMetaFile)) {
case -1:
IOException ex = new IOException(String.format("The file %s could not be moved", localFile));
throw ex;
//break;
case 0:
infoLog.info(String.format("%s: this file did not already exist", localFile));
break;
case 1:
infoLog.info(String.format("%s: this file was found and successfully archived to the processed directory", localFile));
break;
}
BufferedWriter fileWriter = new BufferedWriter(new FileWriter(localFile));
fileWriter.write(this.remoteData.toString());
fileWriter.close();
}
}
As you can see this is called via downloadFiles after the object has been initialised. I've cut out the code that is not needed for this example such as the archiveMetaFile method.
Any pointers on why this is not working for compressed files is much appreciated.
Cheers
Nathan
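The problem is likely that you are using a BufferedReader instead of an InputStream. Readers are meant for text data and impose a character encoding, whereas InputStreams handle raw binary data. In addition, readLine() discards the line terminators, and FileWriter re-encodes the text on the way out.
Try switching to a BufferedInputStream and writing the bytes with an OutputStream (for example a FileOutputStream) instead. Using any Reader class on binary data will corrupt it.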
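The difference can be demonstrated with the JDK alone. This is a minimal sketch, not the asker's class: `BinaryCopy` and its two methods are hypothetical names, and UTF-8 is fixed explicitly in the reader variant only to keep the demonstration deterministic.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical demonstration class.
class BinaryCopy {
    // Byte-for-byte copy: safe for zip, gz, and any other binary payload.
    static void copyBytes(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
    }

    // The pattern from the question: decode to text, read line by line, re-encode.
    static byte[] copyViaReader(InputStream in) throws IOException {
        StringBuilder text = new StringBuilder();
        BufferedReader reader =
                new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        String line;
        while ((line = reader.readLine()) != null) {
            text.append(line); // line terminators are dropped here
        }
        return text.toString().getBytes(StandardCharsets.UTF_8);
    }
}
```

Feeding both methods the first bytes of a gzip file (0x1F 0x8B ...) shows the effect: `copyBytes` reproduces the input exactly, while `copyViaReader` returns different bytes because 0x8B is not valid UTF-8 and any 0x0A byte is swallowed as a line break. With HttpClient, `response.getEntity().getContent()` from the question would be the `InputStream` passed to `copyBytes`, and a `FileOutputStream` the target.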
