Files.readAllLines skips last line - java

I am reading an XML file into a List like this:
List<String> file_lines = Files.readAllLines(path);
My file has 317 lines, but only 316 end up in the list. The missing one is the last line, which is empty (probably just a carriage return).
How can I deal with this? I need precisely every line of the file, because I am calculating a CRC32 checksum for file validation.

I need precisely every line of the file, because I am calculating a CRC32 checksum for file validation.
In that case, you are reading it the wrong way. You need to read it and checksum it as bytes rather than characters:
to avoid text transcoding issues,
to avoid the problem of dealing with the last line of the file (which may or may not have an end-of-line marker ...), and
to avoid having to deal with variants in end-of-line markers ... that you typically cannot distinguish with a "read line" or "read all lines" API method.
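For example, a minimal sketch using java.util.zip.CRC32 over the raw file bytes (the file name is just an illustration); this way the line endings and the trailing empty line all go into the checksum:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.CRC32;

public class FileChecksum {
    public static void main(String[] args) throws IOException {
        Path path = Paths.get("data.xml");        // illustrative file name
        byte[] bytes = Files.readAllBytes(path);  // raw bytes, line endings included
        CRC32 crc = new CRC32();
        crc.update(bytes);
        System.out.printf("CRC32 = %08x%n", crc.getValue());
    }
}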

Related

How is the content in a file laid out?

I could not find any information on the internet about this. For instance, if I have integers in my file, are those integers laid out like an array with an index, or scattered like a LinkedList?
In my project I need to read integers from the file, 4 at a time, and store them in another file, so how can I reference each integer in the file?
A file is nothing but a sequence of bytes. There is no structure to it whatsoever until you bring the content into memory and interpret it. How you interpret a file depends on the kind of file you are trying to read. So, for example, in your case, if the numbers are separated by end-of-line characters, you can read the file line by line (which means reading a chunk until you hit an end-of-line character such as '\n').
As Kevin said, a file in Java is a sequence of bytes. You can read the file using either a byte stream or a character stream. So if the input file has integers on different lines, you can read one line at a time and store it temporarily, so that once you have read 4 lines (4 integers), you can do your processing step.
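For instance, a rough sketch (file names are placeholders) that reads integers 4 at a time and writes each group to another file, using a Scanner so it works whether the numbers are separated by spaces or line breaks:

import java.io.File;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.Scanner;

public class FourAtATime {
    public static void main(String[] args) throws IOException {
        try (Scanner in = new Scanner(new File("numbers.txt"));
             PrintWriter out = new PrintWriter("groups.txt")) {
            int[] group = new int[4];
            int count = 0;
            while (in.hasNextInt()) {
                group[count++] = in.nextInt();
                if (count == 4) {                    // one full group of 4 integers
                    out.printf("%d %d %d %d%n", group[0], group[1], group[2], group[3]);
                    count = 0;
                }
            }
            // any leftover integers (fewer than 4) are ignored in this sketch
        }
    }
}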

Identifying points in lines of text

I have a Java program that reads lines of a text file into a buffer, and when the buffer is full it outputs the lines, so that after all lines have passed through the buffer the output is partially sorted.
The output will be in blocks of lines, so I need a way to mark the end of each block in the output. Since the output is lines of text, I'm not sure which character to use as a marker, since the text can contain any characters. I'm thinking of using the ASCII NUL or unit separator character, but I'm not sure this would be reliable, since it could also appear in the text.
You could use a Map, so you can set a key for every buffered block, something like this:
Map<Integer, List<String>> myMap = new HashMap<>();
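A small self-contained sketch of that idea (the input lines and buffer size are made up): each time the buffer fills, it is sorted and stored under its own numeric key, so no marker character is needed inside the text:

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class KeyedBlocks {
    public static void main(String[] args) {
        Map<Integer, List<String>> blocks = new HashMap<>();
        int blockIndex = 0;
        List<String> buffer = new ArrayList<>();
        for (String line : List.of("b", "a", "d", "c")) {  // stand-in input lines
            buffer.add(line);
            if (buffer.size() == 2) {                        // pretend the buffer is full
                buffer.sort(null);                           // natural ordering
                blocks.put(blockIndex++, new ArrayList<>(buffer));
                buffer.clear();
            }
        }
        System.out.println(blocks);                          // two sorted blocks, keyed 0 and 1
    }
}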
If you are not sure how to discriminate lines, I suggest you take a look at a sentence tokenizer tool of the kind used in NLP. These programs contain patterns that distinguish lines from each other. That way, you can send all your data through and get the lines without worrying about which character to use. There are plenty of libraries for Java that do the job perfectly (assuming your text is in English).

write strings to the bottom line of a textfile

I want to write strings to a text file, each time appending to the bottom of the file. Then, if I search for a certain string in the file and find it, I want to replace that line with another.
My idea is to count the rows in the text file, add 1, and then write the string to that index. But is it even possible to write to a certain line number in a text file?
And what about updating a certain row to another string?
thanks!
You do not want to do that: modifying the original file in place is a recipe for disaster. If a write fails partway through, the original file will be corrupted.
Use a double-write protocol: write the modified content to another file, and only if that write succeeds, rename that file over the original.
Provided your file is not too big, for some definition of "big", I'd recommend building a List<String> for the destination file: read the original file line by line and add to that list; once the list processing is complete (your question is unclear about what should really happen), write each String to the other file, flush and close, and if that succeeds, rename it to the original.
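A minimal sketch of that write-then-rename approach (the file names and the "old entry"/"new entry" replacement are assumptions):

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

public class ReplaceLineSafely {
    public static void main(String[] args) throws IOException {
        Path original = Paths.get("notes.txt");            // illustrative names
        Path temp = Paths.get("notes.txt.tmp");

        List<String> lines = Files.readAllLines(original);
        List<String> updated = new ArrayList<>();
        for (String line : lines) {
            updated.add(line.equals("old entry") ? "new entry" : line);
        }

        Files.write(temp, updated);                         // write the whole new version first
        Files.move(temp, original,                          // only replace the original on success
                StandardCopyOption.REPLACE_EXISTING);       // add ATOMIC_MOVE where the filesystem supports it
    }
}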
If you only want to append strings, FileOutputStream has an alternate constructor whose second argument you can set to true to open the file for appending.
If you'd like, say, to replace strings in a file without copying it, your best bet is RandomAccessFile instead. However, if the line length varies, this is unreliable. For fixed-length records it works like this (a sketch follows at the end of this answer):
Move to the offset
Write
You can also truncate the file (via setLength), so if there is a trailing block you need to get rid of, you can discard it that way.
A third solution would be to rely on mmap. This requires a memory-mapped ByteBuffer for the whole file. I'm not considering the full feasibility of that solution here (it works in plain C), but it actually 'looks' the most correct if you consider both the Java platform and the operating system.
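For the fixed-length-record case mentioned above, a rough sketch with RandomAccessFile (the file name, record length, record index and replacement value are all assumptions):

import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

public class FixedRecordUpdate {
    static final int RECORD_LENGTH = 32;                    // assumed fixed record size in bytes

    public static void main(String[] args) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile("records.dat", "rw")) {
            int recordIndex = 2;                             // which record to overwrite
            byte[] replacement = pad("new value", RECORD_LENGTH);

            raf.seek((long) recordIndex * RECORD_LENGTH);    // move to the offset
            raf.write(replacement);                          // write in place
        }
    }

    static byte[] pad(String s, int length) {
        byte[] out = new byte[length];                       // zero-padded record
        byte[] src = s.getBytes(StandardCharsets.UTF_8);
        System.arraycopy(src, 0, out, 0, Math.min(src.length, length));
        return out;
    }
}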

Most elegant way to read a file and operate on lines as bytes

I have a database dump file that I need to operate on in raw form. I need to read the file in, operating on it line by line, but I can't have the whole file in memory (they can theoretically be 10 GB or more).
I want to be able to read it and operate on each line individually as I go, until the end of the file. It has to be friendly to unusual characters (the lines can contain all sorts of bytes).
You could adapt the old NIO Grep example and remove the pattern matching if you don't need it.
If the exact line-break bytes don't interest you, you can use BufferedReader#readLine() and convert each string back to a byte[].
The other way is to use a byte[] as a buffer (it has to be large enough for a line) and use InputStream#read(byte[]) to fill it with bytes. Then you can search the buffer for line feeds and work with that part of the buffer. Once you find no more line feeds, move the remaining data to the left via System#arraycopy(), fill the rest with new data through InputStream#read(byte[], int, int), and go on.
But be careful: depending on the encoding (e.g. Unicode), one byte does not necessarily correspond to one character.
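A simplified, single-byte-read variant of that buffer idea (the file name and the processLine hook are placeholders); lines are kept as raw bytes and never decoded to characters, and the BufferedInputStream keeps the per-byte reads cheap:

import java.io.BufferedInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class RawLineReader {
    public static void main(String[] args) throws IOException {
        try (InputStream in = new BufferedInputStream(new FileInputStream("dump.sql"))) {
            ByteArrayOutputStream line = new ByteArrayOutputStream();
            int b;
            while ((b = in.read()) != -1) {
                if (b == '\n') {                      // end of one line
                    processLine(line.toByteArray());
                    line.reset();
                } else {
                    line.write(b);
                }
            }
            if (line.size() > 0) {
                processLine(line.toByteArray());      // last line without a trailing '\n'
            }
        }
    }

    static void processLine(byte[] rawLine) {
        // Placeholder: operate on the raw bytes of one line here.
        System.out.println("line of " + rawLine.length + " bytes");
    }
}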

Java File Splitting

What would be the most efficient way to split a file in Java?
Like, to get it grid-ready...
(Edit)
Modifying the question.
Basically, after scouring the net, I understand that there are generally two methods for splitting a file:
Just split it by a number of bytes
I guess the advantage of this method is that it is fast, but say all the data is on one line and the split puts half of it in one part and the other half in another part; then what do I do?
Read it line by line
This will keep my data intact, but I suppose it isn't as fast as the method above.
Well, just read the file line by line and start saving it to a new file. Then when you decide it's time to split, start saving the lines to a new place.
Don't worry about efficiency too much unless it's a real problem later.
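A rough sketch of that line-by-line splitting (the input name, output naming scheme and chunk size are assumptions):

import java.io.BufferedReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.file.Files;
import java.nio.file.Paths;

public class LineSplitter {
    public static void main(String[] args) throws IOException {
        int linesPerPart = 10_000;                           // assumed chunk size
        int part = 0;
        int count = 0;
        PrintWriter out = null;
        try (BufferedReader in = Files.newBufferedReader(Paths.get("big.txt"))) {
            String line;
            while ((line = in.readLine()) != null) {
                if (count % linesPerPart == 0) {             // time to start a new part
                    if (out != null) out.close();
                    out = new PrintWriter(Files.newBufferedWriter(
                            Paths.get("part-" + (part++) + ".txt")));
                }
                out.println(line);
                count++;
            }
        } finally {
            if (out != null) out.close();
        }
    }
}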
My first impression is that you have something like a comma-separated-value (CSV) file. The usual way to read / parse those files is to (see the sketch after this list):
read them line by line,
skip headers and empty lines,
use String#split(String reg) to split a line into values (reg is chosen to match the delimiter).
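A minimal sketch of that approach, assuming a comma delimiter and an illustrative file name:

import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class SimpleCsvReader {
    public static void main(String[] args) throws IOException {
        try (BufferedReader in = Files.newBufferedReader(Paths.get("data.csv"))) {
            in.readLine();                                   // skip the header line
            String line;
            while ((line = in.readLine()) != null) {
                if (line.isEmpty()) continue;                // skip empty lines
                String[] values = line.split(",");           // delimiter is an assumption
                // work with the values of one record here
                System.out.println(values.length + " fields");
            }
        }
    }
}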
