Creating mp4 file doesn't remove tmp files - java

I'm trying to write an InputStream containing an mp4 that I get from calling an external SOAP service. Whenever I do, tmp files are generated in my chosen temporary directory (java.io.tmpdir) that aren't removable and remain after the writing is done.
Writing images that I also get from the same SOAP service works normally, without permanent tmp files in the directory. I'm using Java 1.8 with Spring Boot.
This is what I'm doing:
File targetFile = new File("D:/archive/video.mp4");
targetFile.getParentFile().mkdirs();
targetFile.setWritable(true);
InputStream inputStream = filesToWrite.getInputStream();
OutputStream outputStream = new FileOutputStream(targetFile);
try {
    int byteRead;
    while ((byteRead = inputStream.read()) != -1) {
        outputStream.write(byteRead);
    }
} catch (IOException e) {
    logger.fatal("Error# SaveFilesThread for guid: " + guid, e);
} finally {
    try {
        inputStream.close();
        outputStream.flush();
        outputStream.close();
    } catch (Exception e) {
        e.printStackTrace();
    }
}
I also tried:
byte[] data = IOUtils.toByteArray(inputStream);
Path file = Paths.get("video.mp4");
Files.write(file, data);
And this, from Apache Commons IO:
FileUtils.copyInputStreamToFile(initialStream, targetFile);

When your code starts, the damage is already done. Your code is not the source of the temporary files; it's the framework that ends up handing you that filesToWrite variable. (Your code is also a ton of work for something that could be done much more simply, see below.)
It is somewhat likely that you can hook in at an earlier point and get the raw InputStream representing the socket or HTTP connection, and start saving the files straight from there. Alternatively, perhaps filesToWrite has a way to get at the files themselves, in which case you can just move them into place instead of copying them over.
That said, your code for the copy is a mess: it has bad exception handling, it leaks resources, it is far too much code for a simple job, and it is possibly 2000x to 10000x slower than needed depending on your hard disk (no exaggeration: calling single-byte read() on unbuffered streams is thousands of times slower!).
// add `throws IOException` to your method signature.
// it saves files, it's supposed to throw IOException,
// 'doing I/O' is in the very definition of your method!
try (InputStream in = filesToWrite.getInputStream();
        OutputStream out = new FileOutputStream(targetFile)) {
    in.transferTo(out);
}
That's it. That solves all the problems: no leaks, no speed loss, a tiny amount of code, and it fixes the deplorable error handling (which, here, was: log something, then print something to standard out, then potentially leak a bunch of resources, then tell the calling code nothing went wrong and return exactly as if the copy operation succeeded).
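One caveat on the snippet above: InputStream.transferTo(OutputStream) only exists since Java 9, and the question mentions Java 1.8. On Java 8 the same one-call, buffered copy is available through java.nio.file.Files.copy (around since Java 7); a minimal sketch:
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;

// Java 8-friendly equivalent of transferTo: Files.copy streams through an
// internal buffer, so it is just as fast and just as short.
try (InputStream in = filesToWrite.getInputStream()) {
    Files.copy(in, targetFile.toPath(), StandardCopyOption.REPLACE_EXISTING);
}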

Related

How to download monthly Treasury Files

Up till early this year, the US Treasury web site posted monthly US Receipts and Outlays data in txt format. It was easy to write a program to read and store the info. All I used was:
URL url = new URL("https://www.fiscal.treasury.gov/fsreports/rpt/mthTreasStmt/mts1214.txt");
URLConnection connection = url.openConnection();
InputStream is = connection.getInputStream();
Then I just read the InputStream into a local file.
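A minimal sketch of that read-into-a-file step (the local filename is an assumption for illustration):
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Stream the URL body straight into a local file.
URL url = new URL("https://www.fiscal.treasury.gov/fsreports/rpt/mthTreasStmt/mts1214.txt");
try (InputStream is = url.openConnection().getInputStream()) {
    Files.copy(is, Paths.get("mts1214.txt"), StandardCopyOption.REPLACE_EXISTING);
}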
Now when I try the same code for May, I get an InputStream with nothing in it.
Just clicking on "https://www.fiscal.treasury.gov/fsreports/rpt/mthTreasStmt/mts0415.xlsx" opens an Excel worksheet (the download path has since changed).
Which is great if you don't mind clicking on each link separately ... saving the file somewhere ... opening it manually to enable editing ... then saving it again as a real .xlsx file (because they really hand you an .xls file).
But when I create a URL from that link and use it to get an InputStream, the stream is empty. I also tried url.openStream() directly. No difference.
Can anyone see a way I can resume using a program to read the new format?
In case it's of interest, I now use this code to write the stream to the file bit by bit ... but there are no bits, so I don't know if it works.
static void copyInputStreamToFile(InputStream in, File file) {
    try {
        OutputStream out = new FileOutputStream(file);
        byte[] buf = new byte[1024];
        // Note: this debug read consumes the first chunk, so anything it reads
        // never reaches the file. It prints -1 here, which is what tells me the
        // stream is empty, i.e. the loop below is never entered.
        System.out.println("reading: " + in.read(buf));
        int len;
        while ((len = in.read(buf)) > 0) {
            out.write(buf, 0, len);
        }
        out.close();
        in.close();
    } catch (Exception e) {
        e.printStackTrace();
    }
}
Any help is appreciated.

new FileOutputStream slow, is there a better way?

I'm writing a bunch of relatively small files (about 50k or so each).
The total processing time for writing all of these files is about 400 seconds.
I put in some checks to see what's taking the most time, and of those 400 total seconds, 12 seconds are spent writing the data to the files and 380 seconds are spent just executing this line:
fos = new FileOutputStream(fileObj);
I would expect the writing and closing of the file to take most of the time, but it looks like just creating the FileOutputStream takes by far the most time.
Is there a better way to create my files, or is file creation just generally a slow operation? This is the total time for thousands of files, by the way, not the time for a single file.
What you are seeing is pretty much normal behavior; it's not Java-specific.
When a file is created, the file system needs to add a file entry to its structures, and in the process modify existing structures (e.g. the directory the file is contained in) to take note of the new entry.
On a typical hard disk this requires some head movements, and a single seek takes on the order of milliseconds. On the other hand, once you start writing to the file, the file system will assign new blocks to the file in a linear fashion (as long as possible), so you can write sequential data at about the maximum speed the drive can handle.
The only way to make major improvements in speed is to use a faster device (e.g. an SSD drive).
You can observe this effect pretty much everywhere; Windows Explorer and similar tools all show the same behavior: large files are copied at speeds close to the device's limits, while tons of small files go painfully slow.
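A rough way to see the split for yourself (a sketch, not a rigorous benchmark; fileObj and data stand in for the question's variables, and IOException handling is omitted):
long t0 = System.nanoTime();
FileOutputStream fos = new FileOutputStream(fileObj); // creates the directory entry: seek-bound
long t1 = System.nanoTime();
fos.write(data);                                      // sequential write: bandwidth-bound
fos.close();
long t2 = System.nanoTime();
System.out.printf("create: %d us, write+close: %d us%n",
        (t1 - t0) / 1000, (t2 - t1) / 1000);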
One way to avoid that problem, and to spend roughly the same time on every file, is to strip the extension from the destination path before creating the file, and once the copy is finished, rename the file back with the extension you removed. Here is an example:
public static void copiarArchivo(String pathOrigen, String pathDestino)
{
    InputStream in = null;
    OutputStream out = null;
    // ultPunto holds the index of the last dot in the destination path.
    // Before the last dot is the file name; after it is the extension.
    int ultPunto = pathDestino.lastIndexOf(".");
    // take the extension of the file
    String extension = pathDestino.substring(ultPunto, pathDestino.length());
    // take the file name without the extension
    String pathSinExtension = pathDestino.substring(0, ultPunto);
    try
    {
        in = new FileInputStream(pathOrigen);
        // create the new file without its extension, since that is faster
        // as explained above
        out = new FileOutputStream(pathSinExtension);
        byte[] buf = new byte[8192]; // buffer size; the original referenced an undefined 'buffer' variable
        int len;
        // binary copy of the file content
        while ((len = in.read(buf)) > 0)
        {
            out.write(buf, 0, len);
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
    // whether the copy finished or an exception occurred, the streams must
    // be closed to free resources
    finally
    {
        try
        {
            if (in != null)
                in.close();
            if (out != null)
                out.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
    // the file was copied without its extension, which must now be added
    // back to the file name
    new File(pathSinExtension).renameTo(new File(pathSinExtension + extension));
}
Here pathOrigen is the path of the file you want to copy, and pathDestino is the path it is going to be copied to.
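Hypothetical usage, with placeholder paths:
// Writes the data as "D:/backup/a" first, then renames it to "D:/backup/a.bin"
// once the copy completes.
copiarArchivo("C:/source/a.bin", "D:/backup/a.bin");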

reduce number of opened files in java code

Hi, I have some code that uses this block:
RandomAccessFile file = new RandomAccessFile("some file", "rw");
FileChannel channel = file.getChannel();
// some code
String line = "some data";
ByteBuffer buf = ByteBuffer.wrap(line.getBytes());
channel.write(buf);
channel.close();
file.close();
but a specific requirement of the application is that I have to generate a large number of temporary files, more than 4000 on average (used for Hive inserts into a partitioned table).
The problem is that sometimes I catch this exception while the app is running:
Failed with exception Too many open files
I wonder if there is any way to tell the OS that a file is already closed and no longer used, and why
channel.close();
file.close();
does not reduce the number of open files. Is there any way to do this in Java code?
I have already increased the maximum number of open files in
#/etc/sysctl.conf:
kern.maxfiles=204800
kern.maxfilesperproc=200000
kern.ipc.somaxconn=8096
Update:
I tried to isolate the problem, so I split the code apart to investigate each part of it (create files, upload to Hive, delete files).
Using the class 'File' or 'RandomAccessFile' fails with the exception "Too many open files".
Finally I used this code:
FileOutputStream s = null;
FileChannel c = null;
try {
    s = new FileOutputStream(filePath);
    c = s.getChannel();
    // do writes (note: FileChannel.write takes a ByteBuffer, not a String)
    c.write(ByteBuffer.wrap("some data".getBytes()));
    c.force(true);
    s.getFD().sync();
} catch (IOException e) {
    // handle exception
} finally {
    if (c != null)
        c.close();
    if (s != null)
        s.close();
}
And this works with large numbers of files (tested on 20K files of 5KB each). The code itself does not throw the exception, unlike the previous two classes.
But the production code (with Hive) still hit the exception, and it appears that the Hive connection through JDBC is the cause of it.
I will investigate further.
The number of open file handles the OS allows in total is not the same thing as the number of file handles that can be opened by a single process. Most Unix systems restrict the number of file handles per process; most likely it is something like 1024 file handles for your JVM.
a) You need to set the ulimit in the shell that launches the JVM to some higher value (something like 'ulimit -n 4000').
b) You should verify that you don't have any resource leaks that are preventing your files from being 'finalized'.
Make sure to use a finally{} block. If there is an exception for some reason, the close will never happen in the code as written.
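A sketch of the original block rewritten with try-with-resources, which releases both handles even when write() throws (resources are closed in reverse order of declaration; assumes the enclosing method declares throws IOException):
try (RandomAccessFile file = new RandomAccessFile("some file", "rw");
        FileChannel channel = file.getChannel()) {
    ByteBuffer buf = ByteBuffer.wrap("some data".getBytes());
    channel.write(buf);
}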
Is this the exact code? I can think of one scenario where you might be opening all the files in a loop but closing all of them only at the end, which would cause this problem. Please post the full code.

Best way to design Java file download manager

I would like to write a simple Java downloader for my backup website. What is important: the applet should be able to download many files at once.
So, here is my problem. Such an applet seems to me easy to hack or infect. What is more, it will surely need a lot of system resources to run. So, I would like to hear your opinions on the best, most optimal and most secure way to do it.
I thought about something like this:
//user chose directory to download his files
//fc is a FileChooser
//fc.showSaveDialog(this)==JFileChooser.APPROVE_OPTION
for (int i = 0; i <= urls.length - 1; i++) {
    String fileName = "..."; //obtaining filename and extension
    fileName = fileName.replaceAll(" ", "_");
    //I am not sure if the line above resolves all problems with file names...
    String path = file.getAbsolutePath() + File.separator + fileName;
    try {
        InputStream is = null;
        FileOutputStream os = new FileOutputStream(path);
        URLConnection uc = urls[i].openConnection();
        is = uc.getInputStream();
        int a = is.read();
        while (a != -1) {
            os.write(a);
            a = is.read();
        }
        is.close();
        os.close();
    }
    catch (InterruptedIOException iioe) {
        //TODO User cancelled.
    }
    catch (IOException ioe) {
        //TODO
    }
}
but I am sure that there is a better solution.
There is one more thing: when a user wants to download a really huge number of files (e.g. 1000 files of between 10MB and 1GB each), there will be several problems. So, I thought about setting a limit, but I don't really know how to decide how many files at once is OK. Should I check the user's Internet connection or the computer's load?
Thanks in advance
BroMan
I would like to write simple Java downloader for my backup website.
What is important, applet should be able to download many files at once.
I hope you mean sequentially, the way your code is written. There would be no advantage in this situation to running multiple download streams in parallel.
Such applet seems to me easily to hack or infect.
Make sure to encrypt your communication stream. Since it looks like you are just accessing URLs on the server, maybe configure your server to use HTTPS.
What is more, it for sure will need many system
resources to run.
Why do you assume that? The network bandwidth will be the limiting factor; you are not going to tax your other resources very much. Maybe you meant avoiding saturating the user's bandwidth. You can implement simple throttling by giving the user a configurable delay that you insert between every file, or even every iteration of your read/write loop; use Thread.sleep to implement the delay, as sketched below.
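A minimal sketch of that throttling idea; delayMillis is an assumed parameter, not something from the question:
static void copyThrottled(InputStream in, OutputStream out, long delayMillis)
        throws IOException, InterruptedException {
    byte[] buf = new byte[8192];
    for (int count = in.read(buf); count != -1; count = in.read(buf)) {
        out.write(buf, 0, count);
        Thread.sleep(delayMillis); // crude rate limiting between 8 KB chunks
    }
}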
So, I thought about setting a limit for it, but I don't
really know how to decide how many files at once is OK.
Assuming you are doing the downloads sequentially, setting limits isn't a technical question; it's more about what kind of service you want to provide. More files just means the download takes longer.
int a=is.read();
Your implementation of the stream read/write is very inefficient. You want to read and write in chunks rather than single bytes; see the versions of the read/write methods that take byte[].
Here is the basic logic flow to copy data from an input stream to an output stream.
InputStream in = null;
OutputStream out = null;
try
{
    in = ...
    out = ...
    final byte[] buf = new byte[1024];
    for (int count = in.read(buf); count != -1; count = in.read(buf))
    {
        out.write(buf, 0, count);
    }
}
finally
{
    if (in != null)
    {
        in.close();
    }
    if (out != null)
    {
        out.close();
    }
}

Java - Reading multiple images from a single zip file and eventually turning them into BufferedImage objects. Good idea?

I'm working on a game, and I need to load multiple image files (png, gif, etc.) that I'll eventually want to convert into BufferedImage objects. In my setup, I'd like to load all of these images from a single zip file, "Resources.zip". That resource file will contain images, map files, and audio files - all contained in various neatly ordered sub-directories. I want to do this because it will (hopefully) make resource loading easy in both applet and application versions of my program. I'm also hoping that for the applet version, this method will make it easy for me to show the loading progress of the game resources zip file (which could eventually amount to 10MB depending on how elaborate this game gets, though I'm hoping to keep it under that size so that it's browser-friendly).
I've included my zip handling class below. The idea is, I have a separate resource handling class, and it creates a ZipFileHandler object that it uses to pull specific resources out of the Resources.zip file.
import java.io.BufferedInputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class ZipFileHandler
{
    private ZipFile zipFile;

    public ZipFileHandler(String zipFileLocation)
    {
        try
        {
            zipFile = new ZipFile(zipFileLocation);
        }
        catch (IOException e) {System.err.println("Unable to load zip file at location: " + zipFileLocation);}
    }

    public byte[] getEntry(String filePath)
    {
        ZipEntry entry = zipFile.getEntry(filePath);
        int entrySize = (int) entry.getSize();
        try
        {
            BufferedInputStream bis = new BufferedInputStream(zipFile.getInputStream(entry));
            byte[] finalByteArray = new byte[entrySize];
            int bufferSize = 2048;
            byte[] buffer = new byte[bufferSize];
            int chunkSize = 0;
            int bytesRead = 0;
            while (true)
            {
                //Read a chunk into the buffer
                chunkSize = bis.read(buffer, 0, bufferSize); //read() returns the number of bytes read
                if (chunkSize == -1)
                {
                    //read() returns -1 when the end of the stream has been reached
                    break;
                }
                //Write that chunk to the finalByteArray
                //System.arraycopy(src, srcPos, dest, destPos, length)
                System.arraycopy(buffer, 0, finalByteArray, bytesRead, chunkSize);
                bytesRead += chunkSize;
            }
            bis.close(); //close the BufferedInputStream
            System.err.println("Entry size: " + finalByteArray.length);
            return finalByteArray;
        }
        catch (IOException e)
        {
            System.err.println("No zip entry found at: " + filePath);
            return null;
        }
    }
}
And I use the ZipFileHandler class like this:
ZipFileHandler zfh = new ZipFileHandler(the_resourceRootPath + "Resources.zip");
InputStream in = new ByteArrayInputStream(zfh.getEntry("Resources/images/bg_tiles.png"));
try
{
    BufferedImage bgTileSprite = ImageIO.read(in);
}
catch (IOException e)
{
    System.err.println("Could not convert zipped image bytearray to a BufferedImage.");
}
And the good news is, it works!
But I feel like there might be a better way to do what I'm doing (and I'm fairly new to working with BufferedInputStreams).
In the end, my question is this:
Is this even a good idea?
Is there a better way to load a whole bunch of game resource files in a single download/stream, in an applet- AND application-friendly way?
I welcome all thoughts and suggestions!
Thanks!
Taking multiple resources and putting them in one compressed file is how several web frameworks work (e.g. GWT). It is less expensive to load one large file than many small ones. This assumes that you are going to use all those resources in your app; if not, lazy loading is also a viable alternative.
That being said, it is usually best to get the app working and then profile to find where the bottlenecks are. Otherwise you will end up with a lot of complicated code, and it will take you a lot longer to get your app working. 10%-20% of the code takes 80%-90% of the time to execute; you just don't know which 10%-20% that is until the project is mostly complete.
If your goal is to learn the technologies and tinker, then good going; it looks good.
If you are shipping a Java program, it is usually considered good practice to bundle it as a jar file anyway. So why not simply put your resources inside that jar file (in directories, of course)? Then you can simply use
InputStream stream = MyClass.class.getResourceAsStream(imagePath);
to load the data for each image, instead of having to handle the zip yourself (and it also works for jars that are not actually on the file system, such as those loaded from an HTTP URL in applets).
I also assume the jar will be cached, but you should measure and compare the performance against your solution with an external zip file.
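A minimal sketch of that approach, assuming the images live under /Resources/images inside the jar (mirroring the zip layout from the question; MyClass is a placeholder for any class in the jar):
// getResourceAsStream resolves the path against the classpath, i.e. the jar;
// a leading slash makes the path absolute within it.
try (InputStream stream = MyClass.class.getResourceAsStream("/Resources/images/bg_tiles.png")) {
    if (stream == null) {
        throw new IOException("resource not found in jar");
    }
    BufferedImage bgTileSprite = ImageIO.read(stream);
} catch (IOException e) {
    System.err.println("Could not load bundled image: " + e);
}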
