Partial reading of file in Java - java

I am creating a Java application in which I need to read the first few lines of a huge text file and process them.
Is it possible to read just the first few lines and fetch the data, instead of loading the entire file? This should be done using the Java API.

Use BufferedReader.

Yes, it can be done. When you use BufferedReader, for example, only about one buffer's worth of data (buffer_size) is read from the file at a time, so you can process each fragment before reading the next.
For an example, see this tutorial.
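A minimal sketch of reading only the first few lines, assuming a UTF-8 text file. The names FirstLines and readFirstLines are made up for illustration. The reader stops as soon as it has enough lines, so the rest of the file is never read:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class FirstLines {
    /** Reads at most maxLines lines from the start of the file, then stops. */
    static List<String> readFirstLines(Path file, int maxLines) throws IOException {
        List<String> lines = new ArrayList<>();
        // try-with-resources closes the reader even if processing fails
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while (lines.size() < maxLines && (line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }
}
```

Calling FirstLines.readFirstLines(path, 10) returns up to ten lines, or fewer if the file is shorter.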

There is also LineNumberReader if you need to keep track of the line numbers
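As a rough illustration of LineNumberReader, the sketch below stops at the first line containing a given token and reports its line number. The class name LineFinder and the method firstLineContaining are hypothetical, and UTF-8 is assumed:

```java
import java.io.IOException;
import java.io.LineNumberReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class LineFinder {
    /** Returns the 1-based number of the first line containing token, or -1 if none does. */
    static int firstLineContaining(Path file, String token) throws IOException {
        try (LineNumberReader reader =
                 new LineNumberReader(Files.newBufferedReader(file, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.contains(token)) {
                    // getLineNumber() counts the lines read so far, including this one
                    return reader.getLineNumber();
                }
            }
        }
        return -1;
    }
}
```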

Related

android specific writing in File

I want to have an internal File like stock features with these properties:
The file has a maximum number of lines, e.g. 400.
New lines are appended to the file until this limit is reached; after that, the first line is removed before the new line is appended.
Removing a line shouldn't require reading and rewriting the whole file.
Is it possible to do this?
Many thanks and best regards
You need a circular file buffer. There is nothing built into Android, but you can use this implementation, which I have used in the past and which works well for me.
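To show the general idea (this is not the linked implementation, just a sketch under simplifying assumptions), a circular file can be built on RandomAccessFile with fixed-size slots. The class RingFile, its header layout, and the fixed maximum line length are all assumptions made for this example:

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical ring file: an 8-byte header (head, count) followed by `capacity` fixed-size slots. */
class RingFile implements AutoCloseable {
    private static final int HEADER_BYTES = 8;  // two ints: index of oldest slot, number of lines
    private final RandomAccessFile raf;
    private final int capacity;
    private final int slotBytes;                // 2-byte length prefix + payload
    private int head;
    private int count;

    /** capacity and maxLineBytes (at most 65535) must be the same every time the file is opened. */
    RingFile(String path, int capacity, int maxLineBytes) throws IOException {
        this.raf = new RandomAccessFile(path, "rw");
        this.capacity = capacity;
        this.slotBytes = 2 + maxLineBytes;
        if (raf.length() >= HEADER_BYTES) {     // existing file: restore head and count
            raf.seek(0);
            head = raf.readInt();
            count = raf.readInt();
        }
    }

    /** Appends a line; once the file is full, the oldest slot is overwritten in place. */
    void append(String line) throws IOException {
        byte[] data = line.getBytes(StandardCharsets.UTF_8);
        if (data.length > slotBytes - 2) {
            throw new IllegalArgumentException("line too long for slot");
        }
        int slot = (head + count) % capacity;   // equals head when the file is full
        if (count == capacity) {
            head = (head + 1) % capacity;       // drop the oldest line
        } else {
            count++;
        }
        raf.seek(HEADER_BYTES + (long) slot * slotBytes);
        raf.writeShort(data.length);
        raf.write(data);
        raf.seek(0);                            // persist the updated header
        raf.writeInt(head);
        raf.writeInt(count);
    }

    /** Returns the stored lines from oldest to newest. */
    List<String> readAll() throws IOException {
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            raf.seek(HEADER_BYTES + (long) ((head + i) % capacity) * slotBytes);
            byte[] data = new byte[raf.readUnsignedShort()];
            raf.readFully(data);
            lines.add(new String(data, StandardCharsets.UTF_8));
        }
        return lines;
    }

    @Override
    public void close() throws IOException {
        raf.close();
    }
}
```

The trade-off in this sketch is that every line must fit in a fixed-size slot, which wastes space for short lines but lets each append touch only one slot and the header.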

Parsing XML file from the end of file

I want to use XML for storing some data. But I do not want to read the full file when I want to get the last data that was inserted, and I do not want to rewrite the full file when adding new data. Is there a standard way in Java to parse an XML file not from the beginning but from the end, so that, for example, a SAX or StAX parser will first encounter the closing root tag and then the last element? Or, if I want to do this, should I read and write everything myself as if it were a regular text file?
Fundamentally, XML is a poor representation choice for this. The format is inherently "contained" like this, and I haven't seen any APIs which encourage you to fight against that.
Options:
Choose a different format entirely (e.g. use a database)
Create lots of small XML files instead - each one self-contained. When you want the whole of the data, read all the files
Just swallow the hit and read/write the whole file each time.
I found a good discussion of this topic with example solutions for what I want:
http://www.oreillynet.com/xml/blog/2007/03/parsing_xml_backwards.html
It seems that XML is not a good file format for what I want to achieve. There is no standard parser that can parse XML from the end instead of the beginning.
Probably the best solution for me will be to store all the XML data in one file made up of the contents of many small XML documents, one per line. The file itself is not well-formed XML, but each line contains well-formed XML that I will parse using a standard XML parser (StAX).
This way I will be able to read just the lines at the end of the file and append new data to the end of the file. When I need all of the data, or only part of it, I will read all of the lines or some of them. I can probably also implement pagination from the end of the file, because the file can be big.
Why XML on each line? I think it is easy to parse with the existing API, and storing data as XML is more human-readable than just separating the values on each line with some symbol.
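A sketch of how the one-document-per-line approach might look, assuming UTF-8 and entries that contain only text. The class XmlLineLog, its method names, and the entry element used in the usage note are invented for this example:

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class XmlLineLog {
    /** Returns the last non-empty line by scanning backwards from the end of the file. */
    static String readLastLine(String path) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            long end = raf.length();
            while (end > 0) {                    // skip trailing line terminators
                raf.seek(end - 1);
                int b = raf.read();
                if (b != '\n' && b != '\r') {
                    break;
                }
                end--;
            }
            long start = end;
            while (start > 0) {                  // walk back to the previous newline
                raf.seek(start - 1);
                if (raf.read() == '\n') {
                    break;
                }
                start--;
            }
            byte[] data = new byte[(int) (end - start)];
            raf.seek(start);
            raf.readFully(data);
            return new String(data, StandardCharsets.UTF_8);
        }
    }

    /** Parses one well-formed XML line with StAX and returns the text of its root element.
     *  Assumes the root element has text content only (no child elements). */
    static String entryText(String xml) throws XMLStreamException {
        XMLStreamReader reader =
            XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    return reader.getElementText();
                }
            }
            return null;
        } finally {
            reader.close();
        }
    }
}
```

For a file whose last line is <entry id="2">second</entry>, readLastLine returns that line and entryText returns "second". Reading one byte per seek is slow for long lines; a real implementation would probably read backwards in larger blocks.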
Why not use SAX/StAX and simply process only your last entry? Yes, it will need to open and go through the whole file, but at least it's fairly efficient compared with loading the whole DOM tree.
Short of doing that, I don't think you can do what you're asking using XML as a source.
Another alternative, apart from the ones provided by Jon Skeet in his answer, would be to keep the same format but insert the latest entries first, and stop processing the files as soon as you've read your entry.

Indexing text files in java

I have a set of text files providing information that is parsed and analysed in order to build a model. Sometimes the user of this model wants to know which part of a text file was used to generate a given model item.
For that, I am thinking of keeping track of the range of line (or byte) ids so that I can read the appropriate part of the text when required.
My question is: is there any Java Reader able to read a file using a start and stop line (or byte) id, instead of reading the file from the beginning and counting the lines (or bytes)?
Best regards
If you know exactly how many bytes should be skipped, you can use the seek method of RandomAccessFile.
To read from a given byte offset, use SeekableByteChannel. Of course, there are no Readers that can start from a line id, because the positions of the line separators are not known in advance.
You can use InputStream.skip() to move forward to a specific position in the file; mark() and reset() let you return to a previously marked position on streams that support them.
But are you sure you really have to implement this yourself? Take a look at Lucene, an indexing library that will probably help you.
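One way to work around the unknown separator positions is to scan the file once, remember the byte offset where each line starts, and later seek straight to the range you need. The sketch below assumes lines end in '\n' and the text is UTF-8. LineIndex and its methods are hypothetical names:

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.channels.SeekableByteChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class LineIndex {
    /** One sequential pass: records the byte offset where each line starts,
     *  plus a final entry equal to the file length. */
    static List<Long> build(Path file) throws IOException {
        List<Long> starts = new ArrayList<>();
        starts.add(0L);
        long pos = 0;
        try (InputStream in = new BufferedInputStream(Files.newInputStream(file))) {
            int b;
            while ((b = in.read()) != -1) {
                pos++;
                if (b == '\n') {
                    starts.add(pos);   // the next line begins right after the newline
                }
            }
        }
        if (starts.get(starts.size() - 1) != pos) {
            starts.add(pos);           // last line has no trailing newline
        }
        return starts;
    }

    /** Reads lines firstLine..lastLine (0-based, inclusive) by seeking to their byte range. */
    static String readLines(Path file, List<Long> starts, int firstLine, int lastLine)
            throws IOException {
        long from = starts.get(firstLine);
        long to = starts.get(lastLine + 1);
        ByteBuffer buffer = ByteBuffer.allocate((int) (to - from));
        try (SeekableByteChannel channel = Files.newByteChannel(file)) {
            channel.position(from);
            while (buffer.hasRemaining() && channel.read(buffer) != -1) {
                // keep reading until the requested range is filled
            }
        }
        return new String(buffer.array(), 0, buffer.position(), StandardCharsets.UTF_8);
    }
}
```

The index only needs to be built once per file, and each model item can then store just a pair of line numbers (or the two byte offsets directly).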

Change the file descriptor offset

I'm writing a small Java program that must read a really big file. Because of this, I want to access the file without reading all of it each time. My question is: can I change the offset of the file descriptor with a simple instruction, or is the only solution to read all the previous lines, which I don't need?
In other words, can I simulate the lseek call on my input file?
I think it's not necessary this time, but if someone wants code, I'll post it.
Regards!
I think you probably want RandomAccessFile.
Specifically, you want the seek(long) method.
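A small sketch of what that might look like, assuming you already know the byte offset you want to jump to. SeekDemo and readLineAt are illustrative names only:

```java
import java.io.IOException;
import java.io.RandomAccessFile;

class SeekDemo {
    /** Jumps to the given byte offset and reads from there to the end of that line. */
    static String readLineAt(String path, long offset) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            raf.seek(offset);       // roughly what lseek(fd, offset, SEEK_SET) does in C
            // Note: RandomAccessFile.readLine() maps each byte to one char,
            // so it is only reliable for ASCII / Latin-1 text.
            return raf.readLine();
        }
    }
}
```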

Java File Update

How can I change part of a file's content, starting at a specific character, without reading and writing the whole file?
Use the java.io.RandomAccessFile class. You can seek() to an arbitrary position in the file and then read or write from/to there. Try looking at writeUTF(String) for writing text (note that it writes a two-byte length prefix followed by modified UTF-8, not just the raw characters), and getFilePointer() for remembering the position in the file. Unfortunately, there is no easy way to "insert" text as you would in an editor; the existing contents are always overwritten.
Also, FileWriter and FileOutputStream support append mode, which you can use for appending extra data to the end of the file without rewriting it. But if you need to change things in the middle, you have to use a random access file.
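Both approaches can be sketched as follows, assuming UTF-8 text and a byte offset that is already known. For non-ASCII text the byte offset differs from the character index. InPlaceEdit and its method names are made up for illustration:

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.RandomAccessFile;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

class InPlaceEdit {
    /** Overwrites bytes starting at offset and returns the file position after the write.
     *  Existing bytes are replaced, never shifted, so nothing is "inserted". */
    static long overwrite(String path, long offset, String replacement) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "rw")) {
            raf.seek(offset);
            raf.write(replacement.getBytes(StandardCharsets.UTF_8));
            return raf.getFilePointer();
        }
    }

    /** Appends text to the end of the file without rewriting what is already there. */
    static void append(String path, String text) throws IOException {
        // The second constructor argument (true) opens the stream in append mode.
        try (Writer out = new OutputStreamWriter(
                 new FileOutputStream(path, true), StandardCharsets.UTF_8)) {
            out.write(text);
        }
    }
}
```

If the replacement is longer or shorter than the text it replaces, the surrounding content has to be rewritten by hand, which is the limitation described above.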
Check out the Scanner class.
It makes it easier to read and parse strings and primitive types using regular expressions.
