How to set an offset with readline in java? - java

Is there a way to read from a file via readline() and an offset in java?
For example, file.readline(10) would return the 10th line of the file without reading the first 9.

readline doesn't provide such functionality. You can use the seek function to set an offset. However, it has no way of knowing about newline symbols or anything else: it simply sets an offset in bytes. If the length of your lines is fixed, you can use it just like you want. Otherwise, you need to call readline several times to get the required string.
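A minimal sketch of the fixed-length case, assuming every line occupies exactly the same number of bytes (terminator included) and the file uses a single-byte encoding. The class name FixedLineSeek and the parameter bytesPerLine are made up for illustration; RandomAccessFile.seek and RandomAccessFile.readLine are the standard java.io calls.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class FixedLineSeek {
    // Returns line number lineNumber (1-based) of a file whose lines all
    // occupy exactly bytesPerLine bytes, line terminator included.
    // Assumption: single-byte encoding, so byte offsets equal character offsets.
    static String readLine(String path, int lineNumber, int bytesPerLine) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(path, "r")) {
            // Jump straight to the first byte of the requested line.
            file.seek((long) (lineNumber - 1) * bytesPerLine);
            return file.readLine(); // null if the offset is past the end of the file
        }
    }

    public static void main(String[] args) throws IOException {
        // Three lines of 4 characters plus '\n' each, so 5 bytes per line.
        Path demo = Files.createTempFile("fixed-lines", ".txt");
        Files.write(demo, "aaaa\nbbbb\ncccc\n".getBytes(StandardCharsets.US_ASCII));
        System.out.println(readLine(demo.toString(), 3, 5)); // prints "cccc"
    }
}
```

If the lines are not all the same length, the computed offset lands in the middle of some line and this approach no longer works.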

I don't think so. How would it know that this line is the 10th line if it hadn't read the first 9? But you can read the file once, store it in an ArrayList or some other data structure, and then access each line on its own. You can store line 1 at the first index, line 2 at the second, and so on.
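A rough sketch of the read-once approach, assuming the file is small enough to fit in memory. The class and method names are hypothetical; Files.readAllLines is the standard java.nio.file call (it decodes as UTF-8). Java lists are 0-based, so line n ends up at index n - 1.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

public class LineIndex {
    // Reads the whole file once; each list element is one line.
    static List<String> load(Path file) throws IOException {
        return Files.readAllLines(file);
    }

    // Line numbers are 1-based, list indexes are 0-based.
    static String line(List<String> lines, int lineNumber) {
        return lines.get(lineNumber - 1);
    }

    public static void main(String[] args) throws IOException {
        Path demo = Files.createTempFile("lines", ".txt");
        Files.write(demo, Arrays.asList("first", "second", "third"));
        List<String> lines = load(demo);
        System.out.println(line(lines, 2)); // prints "second"
    }
}
```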

Related

How is the content in files listed?

I could not find any information on the internet about this. For instance, if I have integers in my file, are those integers listed like an array with an index, or randomly like a LinkedList?
In my project I need to read integers from the file, 4 at a time, and store them in another file, so how can I reference each integer in the file?
A file is nothing but a sequence of bytes. There is no structure to it whatsoever until you load the content into memory and interpret it. How you interpret a file depends on the kind of file you are trying to read. So, for example, in your case, if the numbers are separated by end-of-line characters, you can read the file line by line (which means reading a chunk until you hit an end-of-line character such as '\n').
As Kevin said, a file in Java is a sequence of bytes. You can read it using either a byte stream or a character stream. So if the input file has integers on separate lines, you can read one line at a time and store it temporarily, so that once you have read 4 lines (4 integers) you can do the processing task.
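As a sketch of that idea, assuming one integer per line: the code below collects the numbers into groups of a given size as it reads. The names IntegerGroups and readInGroups are made up, and keeping a final, shorter group when the count is not a multiple of the group size is just one possible choice.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class IntegerGroups {
    // Reads one integer per line and returns them in groups of groupSize.
    // A trailing group may be shorter if the input runs out early.
    static List<List<Integer>> readInGroups(Reader source, int groupSize) throws IOException {
        List<List<Integer>> groups = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                line = line.trim();
                if (line.isEmpty()) {
                    continue; // ignore blank lines
                }
                current.add(Integer.parseInt(line));
                if (current.size() == groupSize) {
                    groups.add(current); // a full group is ready for processing
                    current = new ArrayList<>();
                }
            }
        }
        if (!current.isEmpty()) {
            groups.add(current);
        }
        return groups;
    }

    public static void main(String[] args) throws IOException {
        Reader input = new StringReader("1\n2\n3\n4\n5\n6\n");
        System.out.println(readInGroups(input, 4)); // prints [[1, 2, 3, 4], [5, 6]]
    }
}
```

For a real file, a reader from Files.newBufferedReader(path) can be passed in place of the StringReader.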

How does a parser's buffer work? Matching a regex

One of my students has a task, part of which is to check whether a file contains a string matching a regex.
The trick is that his teacher has forbidden reading the whole file at once and then parsing it. Instead, he said the student is supposed to use a buffer. The problem is that you never know how much input you are supposed to read from the file: there might be a matching sequence if you read just one more character from the file.
So the teacher wrote (translated):
Use a technique known from parsers:
rewrite the second half of the buffer to the first half of the buffer
read the next part of the file into the second half
check if the whole buffer contains the matching sequence
So how is it supposed to be done (the idea)? In my opinion it does not solve the problem stated above, and it is pretty stupid and wasteful.
A Matcher does use an internal buffer of some kind, certainly. But if you look at the prototype to build a Matcher, you see that the only thing it takes as an argument is a simple CharSequence, which has only three operations:
knowing its length,
getting one character at a given offset,
getting a subsequence (another CharSequence).
When reading from a file, one possibility is to map the whole file using FileChannel.map(), then use an appropriate CharsetDecoder to read into a CharBuffer (which implements CharSequence). Or do that in chunks...
... Or use yours truly's crazy idea: this! I have tested it on 800+ MiB files and it works...
What your teacher is saying:
The regex will never need to match anything longer than half the length of the buffer.
The match could lie on a buffer boundary, hence you need to shift.
That seems realistic.
A BufferedReader reading line by line does not seem entirely fitting. You might consider a byte array with a BufferedInputStream instead.
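The teacher's shift-and-refill scheme could be sketched roughly like this. It relies on the assumption stated above: no match is longer than half the buffer. The sketch uses a char array and a Reader rather than raw bytes, because java.util.regex works on characters; the class name, method names and the half parameter are all made up for illustration. Patterns that depend on context outside the window (anchors such as ^ and $, lookbehind) may behave differently at window edges.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.nio.CharBuffer;
import java.util.regex.Pattern;

public class SlidingBufferSearch {
    // Returns true if the input contains a match of pattern.
    // Assumption: every possible match is at most 'half' characters long.
    static boolean containsMatch(Reader source, Pattern pattern, int half) throws IOException {
        char[] buffer = new char[2 * half];
        int filled = fill(source, buffer, 0, buffer.length);
        while (true) {
            // Check the whole window (both halves) for a match.
            if (pattern.matcher(CharBuffer.wrap(buffer, 0, filled)).find()) {
                return true;
            }
            if (filled < buffer.length) {
                return false; // the input ended inside this window
            }
            // Move the second half to the front, then refill the second half.
            System.arraycopy(buffer, half, buffer, 0, half);
            filled = half + fill(source, buffer, half, half);
        }
    }

    // Reads until 'count' chars are stored or the input ends; returns chars read.
    static int fill(Reader source, char[] buffer, int offset, int count) throws IOException {
        int total = 0;
        while (total < count) {
            int n = source.read(buffer, offset + total, count - total);
            if (n < 0) {
                break;
            }
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // With half = 4, "XYZ" straddles the first window boundary.
        Reader input = new StringReader("aaaaaaXYZbbb");
        System.out.println(containsMatch(input, Pattern.compile("XYZ"), 4)); // prints true
    }
}
```

Each window overlaps the previous one by half a buffer, so a match that starts anywhere in the first half of a window always ends inside that same window. That is why the length limit is needed, and it is also where the repeated checking of the overlap comes from.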

Reading integers from a text file

I have a text file in which I write 3 things for each word, e.g. <int,int,char>.
Now, I am reading the file such that I consider a block of 3: the 1st I always consider an integer, the 2nd also an integer, and the 3rd a character. There is no problem when the integers are in the range 0-9, but when one exceeds that, like 10 or 100, my program doesn't work, for obvious reasons.
For example, there is no problem when I have to read this
11a here <1=int,1=int,a=char>
but when something like this comes up, I face a problem
152a here <15=int,2=int,a=char>
I have put the whole text file in a string. Now, how do I read the characters so that I no longer face the above-mentioned problem?
Some more info: My text file contains characters like this
11a22d33f1234f
Given your current description of the problem, there is no way to determine if an entry such as
152a
corresponds to (15,2,a) or (1,52,a).
Why don't you write to the file with some delimiter between elements, and then split() around the delimiter when reading back in from the file?
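A small sketch of the delimiter idea, assuming a comma between the three fields and that the character field is never a comma itself. The DelimitedEntry class and its field names are hypothetical; String.split and Integer.parseInt are the standard library calls.

```java
public class DelimitedEntry {
    final int first;
    final int second;
    final char symbol;

    DelimitedEntry(int first, int second, char symbol) {
        this.first = first;
        this.second = second;
        this.symbol = symbol;
    }

    // Writes the entry with a comma between fields, e.g. "15,2,a".
    String format() {
        return first + "," + second + "," + symbol;
    }

    // Splits around the delimiter, so multi-digit integers are unambiguous.
    static DelimitedEntry parse(String text) {
        String[] parts = text.split(",");
        if (parts.length != 3 || parts[2].length() != 1) {
            throw new IllegalArgumentException("Malformed entry: " + text);
        }
        return new DelimitedEntry(
                Integer.parseInt(parts[0]),
                Integer.parseInt(parts[1]),
                parts[2].charAt(0));
    }

    public static void main(String[] args) {
        // "15,2,a" and "1,52,a" are now clearly different entries.
        DelimitedEntry a = parse("15,2,a");
        DelimitedEntry b = parse("1,52,a");
        System.out.println(a.first + " " + a.second + " " + a.symbol); // prints "15 2 a"
        System.out.println(b.first + " " + b.second + " " + b.symbol); // prints "1 52 a"
    }
}
```

Several entries would need a second separator between them, for example one entry per line.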
Your text file has an improper format, then.
How do you want to distinguish "1 11 a" from "11 1 a", for example?
Can't you use CSV or something like that?

Java File Splitting

What would be the most efficient way to split a file in Java?
Like to get it grid ready...
(Edit)
Modifying the question.
Basically after scouring the net I understand that there are generally two methods followed for file splitting....
Just split them by the number of bytes
I guess the advantage of this method is that it is fast, but say I have all the data in a line, and suppose the file split puts half the data in one split and the other half in another split. Then what do I do?
Read them line by line
This will keep my data intact, fine, but I suppose this ain't as fast as the above method
Well, just read the file line by line and start saving it to a new file. Then when you decide it's time to split, start saving the lines to a new place.
Don't worry about efficiency too much unless it's a real problem later.
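One way to sketch the line-by-line split, assuming "time to split" means a fixed number of lines per part. The class name, the linesPerPart parameter and the "part-N.txt" naming are invented for this example.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LineSplitter {
    // Copies source into files of at most linesPerPart lines each,
    // so no line is ever cut in half. Returns the part files in order.
    static List<Path> split(Path source, Path outputDir, int linesPerPart) throws IOException {
        List<Path> parts = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(source)) {
            BufferedWriter writer = null;
            try {
                int linesInPart = 0;
                String line;
                while ((line = reader.readLine()) != null) {
                    if (writer == null || linesInPart == linesPerPart) {
                        // Time to split: close the current part and start a new one.
                        if (writer != null) {
                            writer.close();
                        }
                        Path part = outputDir.resolve("part-" + parts.size() + ".txt");
                        parts.add(part);
                        writer = Files.newBufferedWriter(part);
                        linesInPart = 0;
                    }
                    writer.write(line);
                    writer.newLine();
                    linesInPart++;
                }
            } finally {
                if (writer != null) {
                    writer.close();
                }
            }
        }
        return parts;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("split-demo");
        Path input = dir.resolve("input.txt");
        Files.write(input, Arrays.asList("a", "b", "c", "d", "e"));
        System.out.println(split(input, dir, 2).size()); // prints 3
    }
}
```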
My first impression is that you have something like a comma separated value (csv) file. The usual way to read / parse those files is to
read them line by line
skip headers and empty lines
use String#split(String reg) to split a line into values (reg is chosen to match the delimiter)
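The three steps above could look roughly like this, assuming exactly one header line and a delimiter that never appears inside a value (quoted fields are not handled). SimpleCsv and parse are hypothetical names.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SimpleCsv {
    // Reads line by line, skips the header and empty lines,
    // and splits every remaining line around delimiterRegex.
    static List<String[]> parse(Reader source, String delimiterRegex) throws IOException {
        List<String[]> rows = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            reader.readLine(); // assumption: the first line is a header, discard it
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // skip empty lines
                }
                rows.add(line.split(delimiterRegex));
            }
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        Reader input = new StringReader("id,name\n1,Ann\n\n2,Bob\n");
        for (String[] row : parse(input, ",")) {
            System.out.println(Arrays.toString(row));
        }
        // prints [1, Ann] and then [2, Bob]
    }
}
```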

Parsing of data structure in a plain text file

How would you parse a structure similar to this in Java?
\\Header (name)\\\
1JohnRide 2MarySwanson
1 password1
2 password2
\\\1 block of data name\\\
1.ABCD
2.FEGH
3.ZEY
\\\2-nd block of data name\\\
1. 123232aDDF dkfjd ksksd
2. dfdfsf dkfjd
....
etc
Suppose, it comes from a text buffer (plain file).
Each line of text is terminated by "\n". A space is used between the words.
The structure is more or less defined. There may sometimes be ambiguity, though: the number of fields in each line of information may differ, sometimes a block of data may be missing, and the number of lines in each block may vary as well.
The question is how to do it most effectively?
The first solution that comes to mind is to use regular expressions.
But are there other solutions? Problem-oriented ones? Maybe some Java library has already been written?
Check out UTAH: https://github.com/sonalake/utah-parser
It's a tool that's pretty good at parsing this kind of semi structured text
As no one recommended any library, my suggestion would be: use regex.
From what you have posted it looks like the data is delimited by whitespace. One idea is to use a Scanner or a StringTokenizer to get one token at a time. You can then check the first char of a token to see if it is a digit (in which case the part of the token after the digit(s) will be the data, if there is any).
This sounds like a homework problem so I'm going to try to answer it in such a way to help guide you (not give the final solution).
First, you need to consider each object of data you're reading. Is it a number then a text field? A number then 3 text fields? Variable numbers and text fields?
After that you need to determine what you're going to use to delimit each field and each object. For example, in many files you'll see something like a semi-colon between the fields and a new line for the end of the object. From what you said it sounds like yours is different.
If an object can go across multiple lines you'll need to bear that in mind (don't stop partway through an object).
Hopefully that helps. If you research this and you're still having problems post the code you've got so far and some sample data and I'll help you to solve your problems (I'll teach you to fish....not give you fish :-) ).
If the fields are fixed length, you could use a DataInputStream to read your file. Or, since your format is line-based, you could use a BufferedReader to read lines and write yourself a state machine which knows what kind of line to expect next, given what it's already seen. Once you have each line as a string, then you just need to split the data appropriately.
E.g., the password can be gotten from your password line like this:
final int pos = line.indexOf(' ');
String passwd = line.substring(pos+1, line.length());
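A very rough sketch of the BufferedReader plus state machine idea, reduced to a single piece of state: the block currently being filled. It assumes that a block header is any line that starts and ends with one or more backslashes, as in the sample, and it leaves the splitting of each data line to the caller. BlockParser and its regex are illustrative guesses, not a tested grammar for the real format.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BlockParser {
    // Assumption: a header line is backslashes, a name, then backslashes.
    private static final Pattern HEADER = Pattern.compile("\\\\+(.*?)\\\\+");

    // Maps each block name to the data lines that follow its header.
    static Map<String, List<String>> parse(Reader source) throws IOException {
        Map<String, List<String>> blocks = new LinkedHashMap<>();
        List<String> current = null; // state: the block being filled, if any
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                line = line.trim();
                if (line.isEmpty()) {
                    continue;
                }
                Matcher header = HEADER.matcher(line);
                if (header.matches()) {
                    // A header switches state: start collecting a new block.
                    current = new ArrayList<>();
                    blocks.put(header.group(1).trim(), current);
                } else if (current != null) {
                    current.add(line);
                }
                // Lines before the first header are ignored.
            }
        }
        return blocks;
    }

    public static void main(String[] args) throws IOException {
        String bs = "\\"; // a single backslash
        String text = bs + bs + "Header (name)" + bs + bs + bs + "\n"
                + "1 password1\n"
                + "2 password2\n"
                + bs + bs + bs + "1 block of data name" + bs + bs + bs + "\n"
                + "1.ABCD\n"
                + "2.FEGH\n";
        System.out.println(parse(new StringReader(text)).keySet());
        // prints [Header (name), 1 block of data name]
    }
}
```

Missing blocks and a varying number of lines per block need no special handling here, since each block simply collects whatever lines follow its header.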
