I have a file that contains a lot of zeros, and per the requirements the zeros in the file are invalid. I am using the RandomAccessFile API to locate data in the file. Is there a way to remove all the zeros from the file using the same API?
You'll have to stream through the file and write out the content, minus the zeros, to a separate temporary file. You can then close and delete the original and rename the new file to the old file name. That's your best alternative for this particular use case.
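A minimal sketch of that approach, assuming "zeros" means zero bytes (0x00) rather than the character '0'. The class and method names are made up for illustration.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class ZeroStripper {
    /** Copies every non-zero byte of 'file' to a temp file, then moves the temp file over the original. */
    static void stripZeroBytes(Path file) throws IOException {
        // Create the temp file next to the original so the final move stays on one file system.
        Path dir = file.toAbsolutePath().getParent();
        Path temp = Files.createTempFile(dir, "strip", ".tmp");
        try (InputStream in = new BufferedInputStream(Files.newInputStream(file));
             OutputStream out = new BufferedOutputStream(Files.newOutputStream(temp))) {
            int b;
            while ((b = in.read()) != -1) {
                if (b != 0) {          // skip the invalid zero bytes
                    out.write(b);
                }
            }
        }
        // Replace the original only after the copy has been written and closed.
        Files.move(temp, file, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

Calling ZeroStripper.stripZeroBytes(Paths.get("data.bin")) would rewrite data.bin without its zero bytes; the file name is only an example.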
You can use RandomAccessFile to read the file's data, and when you reach a point where you need to change the data you can overwrite the existing bytes with an equal number of new bytes. This works if and only if the new value is exactly the same length as the old value.
With RandomAccessFile it is difficult and complex when the old value and the new value differ in size. It involves a lot of seeks, reads and writes to move data back
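For the equal-length case, something like the following should work. It is only a sketch: the class name, method name and the bounds check are my own additions.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Path;

final class InPlacePatch {
    /** Overwrites the bytes starting at 'offset' with 'replacement' without changing the file length. */
    static void overwrite(Path file, long offset, byte[] replacement) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
            // Refuse writes that would grow the file; this sketch only replaces existing bytes.
            if (offset < 0 || offset + replacement.length > raf.length()) {
                throw new IllegalArgumentException("replacement would extend past the end of the file");
            }
            raf.seek(offset);        // jump to the bytes to be replaced
            raf.write(replacement);  // overwrite them in place
        }
    }
}
```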
Try to read the whole file, change the bits you have to change and write a new file. You might process one line at a time or read the whole file into memory, modify it and write it all back out again. It is a good idea to perform the edit in the following manner:
Read file
Write to Temporary File [just to back-up]
Rename original to back-up
Work on Temporary file.
Remove Backup if you were successful.
Related
I have to read in a .txt file. The file contains 13 parameters separated by ",". I read it line by line, split on "," and write those 13 parameters to a database. But there's one problem:
The file gets a bit bigger every day (~2 MB), so reading it line by line will soon take a lot of time. So I thought of the following:
I want to read the file, remember the byte offset where the file ends, store this "pointer" in a database, and next time start reading after the byte the pointer points to. (This way I don't have to re-read everything I already have.)
How can I do this?
Thanks!
You can do this using RandomAccessFile. It lets you access the file randomly, so you can start reading from wherever you need to (not necessarily from the start).
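Roughly like this, as a sketch. It assumes the file is UTF-8, that the new data since the last run fits in memory, and that a trailing line without a newline is still being written and should wait for the next run. TailReader and readNewLines are hypothetical names; storing the returned offset in the database is left to the caller.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.List;

final class TailReader {
    /**
     * Appends every complete line found after 'startOffset' to 'out' and
     * returns the offset to store for the next run.
     */
    static long readNewLines(Path file, long startOffset, List<String> out) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long length = raf.length();
            if (startOffset >= length) {
                return startOffset;                        // nothing new since last time
            }
            byte[] buf = new byte[(int) (length - startOffset)];
            raf.seek(startOffset);                         // skip everything already processed
            raf.readFully(buf);
            int end = buf.length;
            while (end > 0 && buf[end - 1] != '\n') {
                end--;                                     // ignore a trailing partial line
            }
            if (end == 0) {
                return startOffset;
            }
            String text = new String(buf, 0, end, StandardCharsets.UTF_8);
            for (String line : text.split("\r?\n")) {
                out.add(line);
            }
            return startOffset + end;
        }
    }
}
```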
According to this article: http://bitsofinfo.wordpress.com/2009/04/15/how-to-read-a-specific-line-from-a-very-large-file-in-java/, BufferedReader has a skip() method that could be used to jump into the file (in a seemingly similar way to RandomAccessFile). Note that skip() counts characters, not bytes.
I know how to truncate a RandomAccessFile so that bytes at the end are removed.
raf.getChannel().truncate(file.length() - 4);
or
raf.setLength(file.length() - 4);
But how do I truncate a RandomAccessFile in such a way that bytes at the start are removed? I don't need to write the contents of this file to a new file. I googled and could not find an answer. Please help. Thanks in advance.
It's not an operation most file systems support. The model is a sequence of bytes starting at a particular place on the disc. Files are variable length and can be appended, and so truncation is relatively easy from there.
So you will actually need to copy all the bytes in the file. Avoid this if at all possible. One technique to manage queue files (such as logs) is to keep a sequence of files, start a new file periodically and drop the oldest one off the end.
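If you really must do it within the same file, the copy can be done in place by shifting every remaining byte toward the start and then truncating with setLength(). A sketch with made-up names; note that a crash halfway through leaves the file in a mixed state, and every byte is still read and rewritten.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Path;

final class PrefixDropper {
    /** Removes the first 'n' bytes of 'file' by shifting the rest down and truncating the tail. */
    static void dropPrefix(Path file, long n) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
            long length = raf.length();
            if (n <= 0) {
                return;
            }
            if (n >= length) {
                raf.setLength(0);
                return;
            }
            byte[] buf = new byte[8192];
            long readPos = n;      // where the surviving data currently starts
            long writePos = 0;     // where it has to end up
            while (readPos < length) {
                raf.seek(readPos);
                int read = raf.read(buf, 0, (int) Math.min(buf.length, length - readPos));
                if (read < 0) {
                    break;
                }
                raf.seek(writePos);
                raf.write(buf, 0, read);
                readPos += read;
                writePos += read;
            }
            raf.setLength(writePos);   // cut off the now-duplicated tail
        }
    }
}
```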
I need to implement a simple indexing scheme for a big text file. The text file contains key-value pairs and I need to read back a specific key-value pair without loading the complete file into memory. The text file is huge and contains millions of entries, and the keys are not sorted. Different key-value pairs need to be read depending on user input, so I don't want the complete file to be read every time. Please let me know the exact classes and methods in the Java file handling API that would help to implement this in a simple and efficient way. I want to do this without using an external library such as Lucene.
As the comments pointed out, you're going to need to do a linear search of the entire file in worst case, and half of it on average. But fortunately there are some tricks you can do.
If the file doesn't change much, then create a copy of the file in which the entries are sorted. Ideally make records in the copy the same length, so that you can go straight to the Nth entry in the sorted file.
If you don't have the disk space for that, then create an index file, which has all the keys in the original file as key and the offset into the original file as the value. Again, use fixed-length records. Or better, make this index file a database. Or load the original file into a database. In either case, disk storage is very cheap.
EDIT: To create the index file, open the main file using RandomAccessFile and read it sequentially. Use the 'getFilePointer()' method at the start of each entry to read the position in the file, and store that plus the key in the index file. When looking up something read the file pointer from the index file and use the 'seek(long)' method to jump to the point in the original file.
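Here is a rough sketch of the getFilePointer()/seek(long) idea. To keep it short the index lives in a HashMap instead of an index file, and it assumes one "key=value" entry per line in plain ASCII; both are my assumptions, as are the class and method names. RandomAccessFile.readLine() is unbuffered, so building the index over millions of entries will be slow, but it only has to happen once.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

final class OffsetIndex {
    /** Scans the data file once and records the byte offset at which each key's line starts. */
    static Map<String, Long> build(Path dataFile) throws IOException {
        Map<String, Long> index = new HashMap<>();
        try (RandomAccessFile raf = new RandomAccessFile(dataFile.toFile(), "r")) {
            long pos = raf.getFilePointer();        // offset of the entry about to be read
            String line;
            while ((line = raf.readLine()) != null) {
                int eq = line.indexOf('=');
                if (eq > 0) {
                    index.put(line.substring(0, eq), pos);
                }
                pos = raf.getFilePointer();         // start of the next entry
            }
        }
        return index;
    }

    /** Jumps straight to the entry for 'key' and returns its value, or null if the key is unknown. */
    static String lookup(Path dataFile, Map<String, Long> index, String key) throws IOException {
        Long pos = index.get(key);
        if (pos == null) {
            return null;
        }
        try (RandomAccessFile raf = new RandomAccessFile(dataFile.toFile(), "r")) {
            raf.seek(pos);
            String line = raf.readLine();
            return line.substring(line.indexOf('=') + 1);
        }
    }
}
```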
I'd recommend building an index file. Scan the input file and write every key and its offset into a List, then sort the list and write it to the index file. Then, whenever you want to look up a key, you read in the index file and do a binary search on the list. Once you find the key you need, open the data file as a RandomAccessFile and seek to the position of the key. Then you can read the key and the value.
OK, so I know the value of the line but I don't have the line number. How would I edit only one line?
It's a config file, e.g.
x=y
I want a command to edit x=y to x=y,z.
or even x=z.
In Java you can use the Properties class:
app.config file:
x=y
java:
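import java.io.*;
import java.util.Properties;
public void writeConfig() throws IOException {
    Properties tempProp = new Properties();
    try (InputStream in = new FileInputStream("app.config")) {
        tempProp.load(in);  // close the input before reopening the file for writing
    }
    tempProp.setProperty("x", "y,z");
    try (OutputStream out = new FileOutputStream("app.config")) {
        tempProp.store(out, null);  // note: store() writes a date comment and may reorder keys
    }
}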
If you are using that configuration format, you might want to use the java.util.Properties class to read and write that file.
But if you just want to edit it by hand, you can just read the file line by line and match the variable you want to change.
One way to do it is to:
Read the file into memory; e.g. as an array of Strings representing the lines of the file.
Locate the String/line you want to change.
Use a regex (or whatever) to modify the String/line
Write a new version of the file from the in memory version.
There are many variations on this. You also need to take care when you write the new version of the file to guard against losing everything if something goes wrong during the write. (Typically you write the new version to a temporary file, rename the old version out of the way (e.g. as a backup) and rename the new version in place of the old one.)
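The steps above, including the backup-and-rename dance, could be sketched like this. It assumes the config file is UTF-8 and small enough to hold in memory; the class name and the ".tmp" / ".bak" suffixes are arbitrary choices of mine.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

final class ConfigLineEditor {
    /** Replaces the first line equal to 'oldLine' with 'newLine', keeping the old file as a backup. */
    static void replaceLine(Path config, String oldLine, String newLine) throws IOException {
        // 1. Read the file into memory as a list of lines.
        List<String> lines = new ArrayList<>(Files.readAllLines(config, StandardCharsets.UTF_8));
        // 2. Locate the line to change.
        int i = lines.indexOf(oldLine);
        if (i < 0) {
            return;                                   // nothing to change
        }
        // 3. Modify it.
        lines.set(i, newLine);
        // 4. Write the new version to a temp file, move the old one aside, move the new one in.
        Path temp = config.resolveSibling(config.getFileName() + ".tmp");
        Path backup = config.resolveSibling(config.getFileName() + ".bak");
        Files.write(temp, lines, StandardCharsets.UTF_8);
        Files.move(config, backup, StandardCopyOption.REPLACE_EXISTING);
        Files.move(temp, config);
    }
}
```

For the question's example that would be ConfigLineEditor.replaceLine(path, "x=y", "x=y,z").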
Unfortunately, there is no way to add or remove characters in the middle of a regular text file without rewriting a large part of the file. This "problem" is not specific to Java. It is fundamental to the way that text files are modelled / represented on most mainstream operating systems.
Unless the new line has the exact same length as the old one, your best bet is to
Open a temporary output file
Read the config file, line by line
Search for your key
If the line doesn't contain your key, just write the line you just read to the output file
If it does, write the new value to the temporary file instead
Repeat until you hit EOF
Delete old file
Rename new file to the old file
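In code, the list above might look like the following sketch. It assumes UTF-8 and simple "key=value" lines; ConfigKeyUpdater and setValue are hypothetical names. Files.move with REPLACE_EXISTING stands in for the separate delete and rename steps.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class ConfigKeyUpdater {
    /** Streams 'config' to a temp file, rewriting the line for 'key', then replaces the original. */
    static void setValue(Path config, String key, String newValue) throws IOException {
        Path temp = Files.createTempFile(config.toAbsolutePath().getParent(), "cfg", ".tmp");
        try (BufferedReader in = Files.newBufferedReader(config, StandardCharsets.UTF_8);
             BufferedWriter out = Files.newBufferedWriter(temp, StandardCharsets.UTF_8)) {
            String line;
            while ((line = in.readLine()) != null) {       // until EOF
                if (line.startsWith(key + "=")) {
                    out.write(key + "=" + newValue);       // key found: write the new value
                } else {
                    out.write(line);                       // otherwise copy the line unchanged
                }
                out.newLine();
            }
        }
        // Replace the old file with the new one.
        Files.move(temp, config, StandardCopyOption.REPLACE_EXISTING);
    }
}
```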
If your config file is small, you can also do the whole parsing/modification step in memory and then write the final result back to the config file. That way you skip the temporary file (although a temporary file is a good way to prevent corruption if something breaks while you write the file).
If this is not what you're looking for, you should edit your question to be a lot more clear. I'm just guessing what you're asking for.
If your data is all key and value pairs, for example ...
key1=value1
key2=value2
... then load them into a Properties object. Off the top of my head, you'll need a FileInputStream to load the properties, modify with myProperties.put(key, value) and then save the properties with the use of a FileOutputStream.
Hope this helps!
rh
Is it possible to shift the contents of a file while writing to it using FileWriter?
I need to write data constants to the head of the file and if I do that it overwrites the file.
What technique should I use to do this, or should I make copies of the file (with the new data on top) on every file write?
If you want to overwrite certain bytes of the file and not others, you can use seek and write to do so. If you want to change the content of every byte in the file (by, for example, adding a single byte to the beginning of the file) then you need to write a new file and potentially rename it after you've done writing it.
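For the "add something at the beginning" case, a sketch of the write-a-new-file-then-rename route. The names are invented for illustration, and the temp file is created next to the original so the final move stays on one file system.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class Prepender {
    /** Writes 'header' followed by the current contents of 'file' to a temp file, then swaps it in. */
    static void prepend(Path file, byte[] header) throws IOException {
        Path temp = Files.createTempFile(file.toAbsolutePath().getParent(), "prepend", ".tmp");
        try (OutputStream out = Files.newOutputStream(temp)) {
            out.write(header);        // new data goes first
            Files.copy(file, out);    // then every byte of the old file, shifted by header.length
        }
        Files.move(temp, file, StandardCopyOption.REPLACE_EXISTING);
    }
}
```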
Think of the answer to the question "what will be the contents of the byte at offset x after I'm done?". If, for a large percent of the possible values of x the answer is "not what it used to be" then you need a new file.
Rather than contenting ourselves with the question "what will be the contents of the byte at offset x after I'm done?", let's change the mindset and ask why the file system, or perhaps the hard disk firmware, can't: a) provide another mode of accessing the file (let's say, inline); b) increase the length of the file by the number of bytes added at the front, in the middle or even at the end; c) move each byte from the insertion point onward by newcontent.length positions.
It would be easier and faster to handle these operations at the disk firmware or file system implementation level rather than leaving that job to the application developer. I hope file system writers or hard disk vendors will offer such a feature soon.
Regards,
Samba