I have a question regarding reading images in Java. I am trying to read an image using threads, and I was curious whether by doing this:
myInputFile = new FileInputStream(myFile);
I have already read the whole data or not. I already read it in 4 chunks using threads, and I am curious whether I just read it twice, once with the threads and once with the FileInputStream, or what exactly FileInputStream does. Thanks in advance!
The FileInputStream does not read your file yet just by being constructed like this: myInputFile = new FileInputStream(myFile);.
It basically only gives you a handle to the underlying file and prepares to read from it by opening a connection to that file. It also runs some basic checks, including whether the file exists and whether it is a proper file rather than a directory.
The following is stated in the JavaDocs, which you can find here:
Creates a FileInputStream by opening a connection to an actual file,
the file named by the File object file in the file system. A new
FileDescriptor object is created to represent this file connection.
First, if there is a security manager, its checkRead method is called
with the path represented by the file argument as its argument.
If the named file does not exist, is a directory rather than a regular
file, or for some other reason cannot be opened for reading then a
FileNotFoundException is thrown.
Only when you call one of the FileInputStream.read methods does it start to read and return the contents of the file.
The FileInputStream.read() method reads only a single byte of the file, while the FileInputStream.read(byte[] b) method reads at most as many bytes as the size of the byte array b.
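As a minimal sketch (the chunk size and the processing loop are only illustrative), nothing is transferred until read is called:
try (FileInputStream in = new FileInputStream(myFile)) { // opening the stream reads nothing yet
    byte[] chunk = new byte[4096];
    int bytesRead;
    while ((bytesRead = in.read(chunk)) != -1) { // each call reads at most chunk.length bytes
        // process chunk[0 .. bytesRead - 1]
    }
}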
Edit:
Because reading a file byte by byte is pretty slow, and using the plain FileInputStream.read(byte[] b) method can be a bit cumbersome, it is good practice to wrap the stream in a BufferedInputStream when processing files in Java.
By default it reads the next 8192 bytes of the file and buffers them in memory for faster access. So BufferedInputStream.read() still returns only a single byte per call, but that byte is mostly served from an internal buffer. As long as the requested bytes are in this buffer, they are served from it; the underlying file is accessed again only when really needed (i.e., when the requested byte is no longer in the buffer). This drastically reduces the number of read accesses to the hardware (by far the slowest operation in this process) and therefore boosts reading performance considerably.
The initialization looks like this:
InputStream i = new BufferedInputStream(new FileInputStream(myFile));
The handling is exactly the same as with the 'plain' FileInputStream, since both extend the same InputStream base class.
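For example (just a sketch), even a byte-by-byte read loop stays the same; only the construction changes, and the single-byte reads are now served from the internal buffer:
try (InputStream in = new BufferedInputStream(new FileInputStream(myFile))) {
    int b;
    while ((b = in.read()) != -1) { // mostly served from the 8192-byte buffer, not the disk
        // process b
    }
}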
Related
I want to know whether an object of the File class loads the entire file into main memory. I thought of creating File objects for two files, one big and one small, and then comparing the sizes of these two objects. But apparently there is no straightforward way to determine the size of an object in Java.
A File object is just a plain object holding a reference to a file path. The referenced file may or may not actually exist in the file system. A File object does not hold the content of the file.
When you read a file using an InputStream (e.g. FileInputStream) or a Reader (e.g. FileReader), possibly in conjunction with a buffer (e.g. BufferedReader), you start reading the actual file content. It is then up to you whether you keep the whole file content in memory or process it chunk by chunk and discard each chunk. So whether or not the full file content is loaded into memory depends on your application.
In order to know the file size in bytes upfront, you can call file.length().
In order to know the size of the content after reading, store the content into a byte array (byte[]) while reading and measure the length of that array with mybytes.length.
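As a small sketch (the file name is illustrative; java.io.File and java.nio.file.Files are assumed to be imported), here are the two measurements side by side:
File file = new File("data.bin");                    // illustrative path
long sizeOnDisk = file.length();                     // size in bytes, nothing is loaded yet

byte[] mybytes = Files.readAllBytes(file.toPath()); // now the content is in memory
int sizeInMemory = mybytes.length;                   // equals sizeOnDisk for a regular file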
Update
You have mentioned in the comment that you want to find out the size of the File object itself. A File object is just another ordinary object. Still, if you want to measure it, use java.lang.instrument.Instrumentation#getObjectSize().
Please refer to this article How to use the Java Instrumentation API to understand how to determine object size using java.lang.instrument classes.
If you want to compare the size of the file's content in memory, you need to read from the file object. If you store the data in a byte array, you can measure the length of the byte array.
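As a rough sketch of how getObjectSize() is usually wired up (the agent class name here is made up; the agent JAR needs a Premain-Class manifest entry and must be loaded with -javaagent):
import java.lang.instrument.Instrumentation;

public class ObjectSizeAgent {
    private static volatile Instrumentation instrumentation;

    // Called by the JVM before main() when the agent JAR is passed via -javaagent
    public static void premain(String args, Instrumentation inst) {
        instrumentation = inst;
    }

    public static long sizeOf(Object o) {
        return instrumentation.getObjectSize(o);
    }
}
Measuring a File object this way gives you only the shallow size of the object itself (a few dozen bytes), no matter how big the referenced file is.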
If I had a directory filled with different object files, is there a way I could input them into my application without opening a new stream every time? I am currently using ObjectInputStream, but I don't mind using another form of IO.
For example, if I stored my users directly onto my harddrive as objects (each having their own file: name.user), is there a way I could load them all back in using the same stream? Or would it be impossible seeing how a new File object would be needed for each individual file? Is there a way around this?
Each file will need its own stream behind the scenes; there's no way round that. But that doesn't stop you creating your own InputStream that manages this for you, and then allows you to read everything off from one stream.
The idea would be that when you try to read from your CompoundObjectInputStream or whatever, it looks to see if there are any more files that it hasn't yet processed, and opens one if so using another stream, and passes the data through. When it reaches the point where there are no more files in that directory, the CompoundObjectInputStream indicates end-of-stream.
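A simpler variation of that idea is sketched below (the helper name and the ".user" extension are only illustrative, and the usual java.io and java.util imports are assumed): instead of one pass-through byte stream, the helper opens one ObjectInputStream per file internally, so the caller makes a single call and never touches the individual streams:
static List<Object> loadAllUsers(File dir) throws IOException, ClassNotFoundException {
    List<Object> users = new ArrayList<>();
    for (File f : dir.listFiles((d, name) -> name.endsWith(".user"))) {
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(f))) {
            users.add(in.readObject()); // one serialized object per file, as in the question
        }
    }
    return users;
}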
No, there is not. Each physical file requires its own FileInputStream, FileChannel, or other corresponding native accessor.
Note that File has no direct link to a physical file, it is just an abstract path name.
I was reading a book and it contains the following lines:
A MemoryMappedBuffer directly reflects the disk file with which it
is associated. If the file is structurally modified while the mapping
is in effect, strange behavior can result (exact behaviors are, of
course, operating system- and filesystem-dependent). A
MemoryMappedBuffer has a fixed size, but the file it's mapped to is
elastic. Specifically, if a file's size changes while the mapping is
in effect, some or all of the buffer may become inaccessible,
undefined data could be returned, or unchecked exceptions could be
thrown.
So my questions are:
Can't I append text to a file which I have already mapped? If yes, then how?
Can somebody please guide me on what the real use cases of memory mapped files are? It would be great if you could mention what specific problem you have solved with them.
Please bear with me if the questions are pretty naive. Thanks.
Memory mapped files are much faster than the regular ByteBuffer approach, but the whole mapped region is allocated up front: if you map a 4 MB region, for example, the operating system creates a 4 MB file on the file system, maps it into memory, and you can then write to the file simply by writing to that memory. This is handy when you know exactly how much data you want to write, because if you write less than the mapped size, the rest of the region stays filled with zeros. Also, Windows will lock the file (it cannot be deleted until the JVM exits); this is not the case on Linux.
Below is an example of appending to a file with a memory mapped buffer; for the mapping position, use the current size of the file you are writing to:
int BUFFER_SIZE = 4 * 1024 * 1024; // 4MB
Path path = Paths.get("C:\\temp.txt");
long fileSize = Files.exists(path) ? Files.size(path) : 0; // append: map the region starting at the current end of the file
FileChannel dataFileChannel = FileChannel.open(path, StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE);
MappedByteBuffer writeBuffer = dataFileChannel.map(FileChannel.MapMode.READ_WRITE, fileSize, BUFFER_SIZE);
writeBuffer.put(arrayOfBytes); // arrayOfBytes: the bytes to append
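After writing, calling writeBuffer.force() flushes the changes to the underlying file. Keep in mind that the file has grown by the full mapped size, so anything you did not overwrite remains as zero bytes, as mentioned above.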
I'm trying to use a BufferedInputStream to load an external DICOM file, but it eventually runs out of memory. When I used an InputStream, this never came up (I did this when I was loading the file through the assets folder).
I created my own producer-consumer threads to buffer the file, so I don't actually need the BufferedInputStream, but I DO need to use mark() and reset(), which are not supported by FileInputStream.
How should I go around this? Is there another kind of InputStream that I can use with a File which has the mark()/reset() functions? Can I empty the buffer somehow before the BufferedInputStream throws the error? Or should I find a way around using mark() instead?
Thanks for your input.
For mark and reset to work with buffered input, the file contents between the mark and the reset need to remain in memory.
Workarounds depend on what you're actually trying to do; if you just need to start reading again from a known location, a RandomAccessFile may be enough.
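A small sketch of that approach (the file name and the 132-byte read are only illustrative): remember the position, read ahead, and seek back, without keeping the bytes in between buffered in memory:
try (RandomAccessFile raf = new RandomAccessFile("image.dcm", "r")) {
    long markPosition = raf.getFilePointer(); // the "mark"
    byte[] header = new byte[132];
    raf.readFully(header);                    // read ahead as far as you like
    raf.seek(markPosition);                   // the "reset": jump back to the remembered position
}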
I have a "processor" component that can process a single File, InputStream, Reader, etc.
For various reasons, I end up with several large files instead of one huge file.
Is there a way to construct an input stream (or reader) that transparently "appends" all these files, so that:
1) The "processor" does not know where one file started or another ended
2) No changes occur in the file system (e.g., no actual appending of files)
3) Each file is read in order, so that I do not pay the cost of loading all of them into memory and appending them before the processor starts reading?
I'm sure it is possible to write something like this, but I'm wondering if one exists already; it's been a while since I did file based IO.
SequenceInputStream concatenates multiple streams.
List<InputStream> opened = new ArrayList<InputStream>(files.size());
for (File f : files)
opened.add(new FileInputStream(f));
InputStream is = new SequenceInputStream(Collections.enumeration(opened));
Exception handling (not shown) when opening each file is important; make sure that all files will eventually be closed, even if the operation is aborted before the SequenceInputStream is created.
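A sketch of one way to handle that (just illustrating the pattern): close whatever has already been opened if opening a later file fails:
List<InputStream> opened = new ArrayList<>(files.size());
try {
    for (File f : files)
        opened.add(new FileInputStream(f));
} catch (IOException e) {
    for (InputStream s : opened) {
        try { s.close(); } catch (IOException ignored) { } // best-effort cleanup
    }
    throw e;
}
InputStream is = new SequenceInputStream(Collections.enumeration(opened));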
You can use something like SequenceInputStream to read one stream after the other.