"File" class object size? - java

I want to know whether a File object loads the entire file into main memory. I thought of creating File objects for two files, one big and one small, and then comparing the sizes of the two objects. But apparently there is no straightforward way to determine the size of an object in Java.

A File object is just a plain object that holds a file path. The referenced file may or may not actually exist in the file system, and the File object does not hold the content of the file.
When you read a file using an InputStream (e.g. FileInputStream) or a Reader (e.g. FileReader), usually wrapped in a buffered class (e.g. BufferedReader), you start reading the actual file content. It is then up to you whether you keep the whole content in memory or process it chunk by chunk and discard each chunk. So whether or not the full file content is loaded into memory depends on your application.
To get the file size in bytes up front, call file.length().
To get the size of the content after reading, store the content in a byte array (byte[]) while reading and check the array's length with mybytes.length.
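As a minimal sketch of the points above: a File can be created for a path that does not exist, length() reports the size without reading anything, and the content can be processed chunk by chunk. The class name FileVsContent, the helper countBytes and the path no-such-file.bin are made up for illustration.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;

class FileVsContent {
    /** Reads the file in 8 KB chunks and returns the byte count; only one chunk is held in memory at a time. */
    static long countBytes(File file) throws IOException {
        byte[] chunk = new byte[8192];
        long total = 0;
        try (InputStream in = new FileInputStream(file)) {
            int n;
            while ((n = in.read(chunk)) != -1) {
                total += n; // process chunk[0..n) here; the next read overwrites it
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Creating a File does not touch the disk; the path is assumed not to exist.
        File missing = new File("no-such-file.bin");
        System.out.println(missing.exists() + " " + missing.length());

        // length() asks the file system for the size; countBytes actually reads the content.
        File temp = File.createTempFile("demo", ".bin");
        temp.deleteOnExit();
        Files.write(temp.toPath(), new byte[20_000]);
        System.out.println(temp.length() + " " + countBytes(temp));
    }
}
```

The File object itself stays the same small size in both cases; only the byte[] buffer you choose to allocate determines how much of the content sits in memory.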
Update
You mentioned in a comment that you want to find the size of the File object itself. A File object is just an ordinary object. Still, if you want to measure its size, use java.lang.instrument.Instrumentation#getObjectSize(Object), which returns an implementation-specific approximation of the object's own (shallow) size.
Please refer to the article "How to use the Java Instrumentation API" to understand how to determine object size using the java.lang.instrument classes.

If you want to compare the size of the file's content in memory, you need to read from the file object. If you store the data in a byte array, you can measure the length of the byte array.

Related

Reading all object files in directory with single stream

If I had a directory filled with different object files, is there a way I could input them into my application without opening a new stream every time? I am currently using ObjectInputStream, but I don't mind using another form of IO.
For example, if I stored my users directly onto my harddrive as objects (each having their own file: name.user), is there a way I could load them all back in using the same stream? Or would it be impossible seeing how a new File object would be needed for each individual file? Is there a way around this?
Each file will need its own stream behind the scenes; there's no way round that. But that doesn't stop you creating your own InputStream that manages this for you, and then allows you to read everything off from one stream.
The idea would be that when you try to read from your CompoundObjectInputStream or whatever, it looks to see if there are any more files that it hasn't yet processed, and opens one if so using another stream, and passes the data through. When it reaches the point where there are no more files in that directory, the CompoundObjectInputStream indicates end-of-stream.
No, there is not. Each physical file requires its own FileInputStream, FileChannel, or other corresponding native accessor.
Note that File has no direct link to a physical file; it is just an abstract path name.
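The "compound stream" idea from the first answer could look roughly like the sketch below. The class name DirectoryObjectReader and its glob parameter are hypothetical. It does not subclass ObjectInputStream; it opens a new ObjectInputStream per file behind the scenes (each serialized file carries its own stream header, so the files cannot simply be concatenated) and exposes a single readObject() to the caller. It assumes every matching file is a valid, non-empty serialization stream.

```java
import java.io.BufferedInputStream;
import java.io.Closeable;
import java.io.EOFException;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

/** Hypothetical helper: reads serialized objects from every matching file in a directory, one file at a time. */
class DirectoryObjectReader implements Closeable {
    private final Iterator<Path> files;
    private ObjectInputStream current; // stream for the file being read, or null

    DirectoryObjectReader(Path dir, String glob) throws IOException {
        List<Path> found = new ArrayList<>();
        try (DirectoryStream<Path> listing = Files.newDirectoryStream(dir, glob)) {
            for (Path p : listing) {
                found.add(p);
            }
        }
        Collections.sort(found); // deterministic order
        this.files = found.iterator();
    }

    /** Returns the next object, or throws EOFException once every file has been consumed. */
    Object readObject() throws IOException, ClassNotFoundException {
        while (true) {
            if (current == null) {
                if (!files.hasNext()) {
                    throw new EOFException("no more files");
                }
                current = new ObjectInputStream(
                        new BufferedInputStream(Files.newInputStream(files.next())));
            }
            try {
                return current.readObject();
            } catch (EOFException endOfThisFile) {
                current.close(); // this file is exhausted; move on to the next one
                current = null;
            }
        }
    }

    @Override
    public void close() throws IOException {
        if (current != null) {
            current.close();
            current = null;
        }
    }
}
```

With the question's layout this would be used as new DirectoryObjectReader(usersDir, "*.user"), calling readObject() in a loop until EOFException is thrown.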

Memory mapped files in java

I was reading the book and it has got the below lines:
A MemoryMappedBuffer directly reflects the disk file with which it
is associated. If the file is structurally modified while the mapping
is in effect, strange behavior can result (exact behaviors are, of
course, operating system- and filesystem-dependent). A
MemoryMappedBuffer has a fixed size, but the file it's mapped to is
elastic. Specifically, if a file's size changes while the mapping is
in effect, some or all of the buffer may become inaccessible,
undefined data could be returned, or unchecked exceptions could be
thrown.
So my questions are:
Can't I append text to a file that I have already mapped? If I can, how?
Can somebody please guide me what are the real use cases of memory mapped file and would be great if you can mention what specific problem you have solved by this.
Please bear with me if the questions are pretty naive. Thanks.
Memory-mapped files can be much faster than the regular ByteBuffer approach, but the whole mapped region is allocated up front. For example, if you map 4 MB, the operating system creates a 4 MB region in the file and maps it into memory, so you write to the file simply by writing to memory. This is handy when you know exactly how much data you want to write; if you write less than the mapped size, the rest of the region is filled with zeros. Also, Windows locks the file while it is mapped (it typically can't be deleted until the mapping is released, which in practice often means until the JVM exits); this is not the case on Linux.
Below is an example of appending to a file with a memory-mapped buffer; for the position, pass the current size of the file you are writing to:
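// Needs java.nio.MappedByteBuffer, java.nio.channels.FileChannel, java.nio.file.Path, java.nio.file.Paths, java.nio.file.StandardOpenOption
int BUFFER_SIZE = 4 * 1024 * 1024; // 4MB
Path mainPath = Paths.get("C:\\temp.txt");
// A READ_WRITE mapping needs a FileChannel opened for both reading and writing (APPEND cannot be combined with READ)
try (FileChannel dataFileChannel = FileChannel.open(mainPath, StandardOpenOption.READ, StandardOpenOption.WRITE, StandardOpenOption.CREATE)) {
    long fileSize = dataFileChannel.size(); // start the mapping at the current end of the file
    MappedByteBuffer writeBuffer = dataFileChannel.map(FileChannel.MapMode.READ_WRITE, fileSize, BUFFER_SIZE);
    writeBuffer.put(arrayOfBytes); // arrayOfBytes: the byte[] to append, at most BUFFER_SIZE bytes
}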
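For a self-contained variant, here is a hedged sketch wrapped in a method. The names MappedAppend, append and regionSize are made up for illustration. It relies on the behaviour seen in OpenJDK, where mapping a READ_WRITE region past the current end of the file extends the file, and on the file system zero-filling the new space.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MappedAppend {
    /** Maps regionSize bytes starting at the current end of the file and copies data into that region. */
    static void append(Path file, byte[] data, int regionSize) throws IOException {
        if (data.length > regionSize) {
            throw new IllegalArgumentException("data does not fit in the mapped region");
        }
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE, StandardOpenOption.CREATE)) {
            long end = channel.size(); // append position = current file size
            MappedByteBuffer region = channel.map(FileChannel.MapMode.READ_WRITE, end, regionSize);
            region.put(data);   // writing to memory writes to the file
            region.force();     // ask the OS to flush the mapped pages to disk
        }
    }
}
```

Note that the file grows by regionSize, not by data.length, so the unused tail of the region remains in the file as zero bytes. If the file must end exactly after the data, truncating it afterwards (or mapping exactly data.length bytes) is one option.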

How to persist large strings in a POJO?

If I have a property of an object which is a large String (say the contents of a file ~ 50KB to 1 MB, maybe larger), what is the practice around declaring such a property in a POJO? All I need to do is to be able to set a value from one layer of my application and transfer it to another without making the object itself "heavy".
I was considering if it makes sense to associate an InputStream or OutputStream to get / set the value, rather than reference the String itself - which means when I attempt to read the value of the contents, I read it as a stream of bytes, rather than a whole huge string loaded into memory... thoughts?
What you're describing depends largely on your anticipated use of the data. If you're delivering the contents in raw form, then there may be more efficient ways to manage it.
For example, if your app has a web interface, your app may just provide a URL for a web server to stream the contents to the requester. If it's a CLI-based app, you may be able to get away with a simple file copy. If your app is processing the file, however, then perhaps your POJO could retain only the results of that processing rather than the raw data itself.
If you wish to provide a general pattern along the lines of using POJOs with references to external streams, I would suggest storing in your POJO something akin to a URI that tells where to find the stream (like a row ID in a database or a filename or a URI) rather than storing an instance of the stream itself. In doing so, you'll reduce the number of open file handles, prevent potential concurrency issues, and will be able to serialize those objects locally if needed without having to duplicate the raw data persisted elsewhere.
You could have an object that supplies a stream or an iterator every time you access it. Note that the content has to live on some storage, like a file. I.e., your object stores a pointer (e.g. a file path) to the storage, and every time someone accesses it, you open a stream or create an iterator and let that party read. Note also that in order to save memory, whoever consumes it has to make sure not to store the whole content in memory.
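A minimal sketch of that pattern, assuming the content lives in a file: the POJO keeps only the location and opens a fresh stream on demand. The class name Document and its fields are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Paths;

/** Hypothetical POJO: holds where the large content lives, not the content itself. */
class Document implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String title;
    private final String contentLocation; // a file path here; could equally be a URI or a row ID

    Document(String title, String contentLocation) {
        this.title = title;
        this.contentLocation = contentLocation;
    }

    String getTitle() {
        return title;
    }

    String getContentLocation() {
        return contentLocation;
    }

    /** Opens a new stream on every call; the caller is responsible for closing it. */
    InputStream openContent() throws IOException {
        return Files.newInputStream(Paths.get(contentLocation));
    }
}
```

Because the object only carries two short strings, it stays cheap to pass between layers and to serialize, and no file handle is held open between reads.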
However, 50KB or 1MB is really tiny. Unless you have gigabytes (or maybe hundreds of megabytes) of data, I wouldn't try to do something like that.
Also, even if you have large data, it's often simpler to just use files or whatever storage you'll use.
tl;dr: Just use String.

Java Can I restore a 2 dimensional array from file without knowing its size?

Or is there any way to cast a generic object(that originally was a 2Darray) to a 2d Array without knowing its size?
I have a program where the user enters data, which is stored in a 2D array whose size depends on how much the user entered. I saved the array as an object to a file using what I learned from this tutorial.
http://beginwithjava.blogspot.com/2011/04/java-file-save-and-file-load-objects.html
([to save] open file, open object stream, write objects, close)
([to restore] open file, open object stream from the file, read objects from the stream, cast objects, close the stream and file)
However, since I can't know how large the 2d array will be, I can't figure out how to cast the object back into a 2dArray when it comes time to restore.
There is no way to know the size ahead of reading, so the most common solution is to save the size before saving the data. The reader then reads the size, allocates enough memory, and proceeds with reading the data the regular way.
The other less common way is to save markers in the file to indicate where each row of data ends. In a way, this is similar to null-terminating your strings in C, where the length of the string is not stored explicitly, but must be recomputed each time. This method has a disadvantage of not allowing pre-allocation. In other words, you would either need to allocate enough memory to accommodate the data of any legal size, or re-allocate dynamically as you go.
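The size-prefix approach can be sketched as follows, assuming the elements are ints (the question does not say what the array holds). The class name GridIO is made up. The row count is written first, and each row is preceded by its own length, so jagged arrays work too.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class GridIO {
    /** Writes the row count, then each row as its length followed by its values. */
    static void save(int[][] grid, OutputStream rawOut) throws IOException {
        DataOutputStream out = new DataOutputStream(new BufferedOutputStream(rawOut));
        out.writeInt(grid.length);
        for (int[] row : grid) {
            out.writeInt(row.length);
            for (int value : row) {
                out.writeInt(value);
            }
        }
        out.flush(); // the caller owns and closes rawOut
    }

    /** Reads the sizes first, allocates exactly that much, then fills in the values. */
    static int[][] load(InputStream rawIn) throws IOException {
        DataInputStream in = new DataInputStream(new BufferedInputStream(rawIn));
        int[][] grid = new int[in.readInt()][];
        for (int r = 0; r < grid.length; r++) {
            grid[r] = new int[in.readInt()];
            for (int c = 0; c < grid[r].length; c++) {
                grid[r][c] = in.readInt();
            }
        }
        return grid;
    }
}
```

The reader never needs to know the dimensions in advance, because they are the first thing it encounters in the stream.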
Java's ArrayList grows automatically as needed, and is useful if you are in the unlucky situation that the number of elements is unknown.
After the file is read, you can convert these ArrayLists to an array.
But it is much better to store the size at the beginning of the file.
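For the case where the size really is unknown, a rough sketch of the ArrayList route is shown below. It assumes a text format invented for this example: one row per line, values separated by whitespace, so the newline acts as the end-of-row marker. The class name RowReader is hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;

class RowReader {
    /** Collects rows in a growing list, then converts the list to an int[][] at the end. */
    static int[][] readRows(Reader source) throws IOException {
        List<int[]> rows = new ArrayList<>();
        BufferedReader in = new BufferedReader(source);
        String line;
        while ((line = in.readLine()) != null) {
            line = line.trim();
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            String[] parts = line.split("\\s+");
            int[] row = new int[parts.length];
            for (int i = 0; i < parts.length; i++) {
                row[i] = Integer.parseInt(parts[i]);
            }
            rows.add(row); // the list grows as needed
        }
        return rows.toArray(new int[0][]);
    }
}
```

The trade-off mentioned above applies: the list reallocates as it grows, which storing the size up front would avoid.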

Java Applet random access storage

I have a java project that uses java.io.RandomAccessFile to manage data loading. It seeks through the file creating a map of key points which can then be loaded as needed later. This works great.
I want to make it run as an applet, but it requires security permissions to create a temp file in which a downloaded file could be stored, and that's a huge barrier for its intended usage.
I think I can spare the memory (a few MB) to store the contents in a memory buffer of some sort and then random access it in the same way I treat local files...
Is there a way to create a temp file without requiring security permissions (I assume not)?
What is the best buffering option? How would I get the contents of a URL based input stream into the buffer, read bytes from it, and be able to record and change the current seek position?
Check this discussion first: RandomAccessFile-like API for in-memory byte array?
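One possible buffering option, sketched under the assumption that the whole download fits in memory: copy the stream into a byte array and wrap it in a java.nio.ByteBuffer, whose position can be read and changed much like a file pointer. The class name InMemoryRandomAccess and the method load are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

class InMemoryRandomAccess {
    /** Reads the stream to the end and returns its content as a seekable in-memory buffer. */
    static ByteBuffer load(InputStream in) throws IOException {
        ByteArrayOutputStream collected = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            collected.write(chunk, 0, n);
        }
        return ByteBuffer.wrap(collected.toByteArray());
    }
}
```

In the applet, the input would come from something like url.openStream(). On the returned buffer, position(int) plays the role of seek(), position() the role of getFilePointer(), and get(), getInt() or getLong() the role of the RandomAccessFile read methods; ByteBuffer defaults to big-endian, which matches RandomAccessFile's readInt().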
