Scenario:
1. Create fromX.txt and toY.txt (content is appended to fromX.txt by separate logic).
2. Check fromX.txt every second for new additions; if there are any, write them to toY.txt.
How do I get just the new content from fromX.txt?
I have tried implementing it by counting the number of lines and looking for any change in the count.
public static int countLines(String filename) throws IOException {
InputStream is = new BufferedInputStream(new FileInputStream(filename));
try {
byte[] c = new byte[1024];
int count = 0;
int readChars = 0;
boolean empty = true;
while ((readChars = is.read(c)) != -1) {
empty = false;
for (int i = 0; i < readChars; ++i) {
if (c[i] == '\n') {
++count;
}
}
}
return (count == 0 && !empty) ? 1 : count;
} finally {
is.close();
}
}
You implement it like this:
Open the file using RandomAccessFile.
Seek to where the end-of-file was last time. (If this is the first time, seek to the start of the file.)
Read until you reach the new end-of-file.
Record where the end-of-file is.
Close the RandomAccessFile
Record the position as a byte offset from the start of the file, and use the same value for seeking.
You can modify the above to reuse the RandomAccessFile object rather than opening / closing it each time.
UPDATE - See the javadocs for RandomAccessFile; look for the seek and getFilePointer methods.
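The steps above can be sketched roughly as follows. FileTailer and readNewBytes are hypothetical names chosen for illustration, and restarting from offset 0 when the file shrinks is an assumption about how truncation should be handled, not something the answer specifies.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical helper: remembers how far the file was read last time.
class FileTailer {
    private final String path;
    private long lastPosition = 0; // byte offset where the previous read stopped

    FileTailer(String path) {
        this.path = path;
    }

    /** Returns the bytes appended since the previous call (empty if none). */
    byte[] readNewBytes() throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            if (raf.length() < lastPosition) {
                // Assumption: a shrinking file was truncated, so start over.
                lastPosition = 0;
            }
            raf.seek(lastPosition); // jump to the old end-of-file
            ByteArrayOutputStream fresh = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int n;
            while ((n = raf.read(buffer)) != -1) { // read up to the new end-of-file
                fresh.write(buffer, 0, n);
            }
            lastPosition = raf.getFilePointer(); // record the new end-of-file
            return fresh.toByteArray();
        }
    }
}
```

One way to use it would be to call readNewBytes() once per second and, when the result is non-empty, append it to toY.txt through a FileOutputStream opened in append mode (new FileOutputStream("toY.txt", true)).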
I've got this source:
public static void inBufferBooks() throws IOException
{
Reader inStreamBooks = null;
BufferedReader bufferIn = null;
try
{
inStreamBooks = new FileReader("Files/BufferBook.txt");
bufferIn = new BufferedReader(inStreamBooks);
char text[] = new char[10];
int i = -1;
while ((i = inStreamBooks.read(text, 0, 10)) != -1)
{
System.out.print(text);
}
When I read the file, the console prints extra characters at the end of the text: leftovers from the previous read that still fill the rest of the last array.
How can I read whole text from the file without redundant chars from last array?
Use the value read returns to you to determine how many characters in the array are still valid. From the documentation:
Returns:
The number of characters read, or -1 if the end of the stream has been reached
You need to remember how many characters you read and print only that many.
for (int len; (len = inStreamBooks.read(text, 0, text.length)) != -1; ) {
System.out.print(new String(text, 0, len));
}
To resolve the problem I changed my while loop like this:
while((i = bufferText.read(text, 0, text.length)) != -1){
if(text.length == i){
System.out.print(text);
}else{
System.out.print(Arrays.copyOfRange(text, 0, i));
}
}
Thanks everyone for the help.
My code below only parses the data file once. I'm trying to get it to parse the whole file: every time it finds a marker, it should parse the data and append it to the output file. Currently it successfully parses the data once and then stops, and I can't figure out how to keep it looping until EOF. The data is 4-byte aligned and comes from a binary input file.
private static void startParse(File inFile) throws IOException {
boolean markerFound = false;
for (int offset = 0; !markerFound && offset < 4; offset++){
DataInputStream dis = new DataInputStream(new FileInputStream(inFile));
for (int i = 0; i < offset; i++){
dis.read();
}
try {
int integer;
long l;
while((l = (integer = dis.readInt())) != MARKER) {
//Don't do anything
}
markerFound = true;
for (int i = 0; i < 11; i++){
dis.read();
}
// ********************** data **********************
byte[] data = new byte[1016];
for(int i = 0; i < 1016; i++){
data[i] = (byte) dis.read();
}
for (int i = 0; i < 4; i++){
dis.read();
}
// ***************** output data ********************
if (checksumCheck(checksum) && fecfCheck(fecf)){
FileOutputStream output = new FileOutputStream("ParsedData", true);
try{
output.write(data);
}
finally{
output.close();
}
}
}
catch (EOFException eof) {
}
dis.close();
}
}
markerFound = true;
This line is not inside a conditional, so it is executed on every pass through the outer loop.
That of course ends your loop, because of its condition:
for (int offset = 0; !markerFound && offset < 4; offset++)
First
You are opening the file inside your for loop, so reading always starts at the beginning of the file. Open it before the first for.
Second
Because of the test !markerFound && offset < 4, your loop will run at most 4 times.
Third
This code does not make sense to me:
for (int i = 0; i < offset; i++){
dis.read();
}
The offset is 0 in the first iteration, 1 in the next, and so on. That loop is not necessary, since you are already using another loop to read bytes until you reach the MARKER.
Fourth
If your file has "records" with fixed lengths and the markers occur at predictable positions, use the DataInputStream skipBytes method to move forward to the next marker.
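A rough sketch of the fixed-length approach is below. The record layout (4-byte marker, 11 header bytes, 1016 data bytes, 4 trailer bytes) is taken from the loops in the question; the MARKER value, the RecordParser and copyRecords names, and the skipFully helper are hypothetical, and the checksum checks are left out because their inputs are not shown. It also assumes markers sit on 4-byte boundaries relative to wherever scanning starts.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class RecordParser {
    static final int MARKER = 0x1ACFFC1D; // hypothetical marker value
    static final int HEADER_BYTES = 11;   // skipped after the marker
    static final int DATA_BYTES = 1016;   // payload copied to the output
    static final int TRAILER_BYTES = 4;   // skipped after the payload

    /** Copies the payload of every complete record; returns the record count. */
    static int copyRecords(InputStream in, OutputStream out) throws IOException {
        DataInputStream dis = new DataInputStream(in);
        byte[] data = new byte[DATA_BYTES];
        int records = 0;
        try {
            while (true) {
                if (dis.readInt() != MARKER) {
                    continue; // not a marker yet, keep scanning 4 bytes at a time
                }
                skipFully(dis, HEADER_BYTES);
                dis.readFully(data); // fills the whole array or throws EOFException
                skipFully(dis, TRAILER_BYTES);
                out.write(data);
                records++;
            }
        } catch (EOFException eof) {
            // Normal termination: readInt/readFully signal end of stream this way.
        }
        return records;
    }

    // skipBytes may skip fewer bytes than asked and never throws EOFException,
    // so fall back to readByte(), which does, when no progress is made.
    private static void skipFully(DataInputStream dis, int n) throws IOException {
        while (n > 0) {
            int skipped = dis.skipBytes(n);
            if (skipped == 0) {
                dis.readByte();
                skipped = 1;
            }
            n -= skipped;
        }
    }
}
```

The input stream is opened once by the caller and read straight through, so every record up to EOF is processed; a trailing partial record is dropped rather than written.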
As I posted in an earlier answer to your question ("Java, need a while loop to reach eof. i.e.while !eof, keep parsing"), I'd like to state again that DataInputStream.read() (unlike the other readXxx() methods) does not throw EOFException.
From the JavaDocs: (DataInputStream inherits read() from FilterInputStream)
If no byte is available because the end of the stream has been reached, the value -1 is returned.
So, to correctly check for EOF, usually read(byte[]) is used in a while loop as follows:
int read = 0;
byte[] b = new byte[1024];
while ((read = dis.read(b)) != -1) { // returns numOfBytesRead or -1 at EOF
// fos = FileOutputStream
fos.write(b, 0, read); // (byte[], offset, numOfBytesToWrite)
}
Answer
Now, getting back to your current question: since you haven't shared your binary file format, it's difficult to suggest a better way to parse it. From my limited understanding of how your nested loops currently parse the file, once you've found the marker you need another while loop (as reasoned above) to read, parse, and copy your "data" until you reach EOF.
markerFound = true;
for (int i = 0; i < 11; i++){ // move this loop inside while IF
dis.read(); // these 11 bytes need to be skipped every time
}
// Open the file just ONCE (outside the loop)
FileOutputStream output = new FileOutputStream("ParsedData", true);
// ********************** data **********************
int read = 0;
byte[] data = new byte[1016]; // set byte buffer size
while ((read = dis.read(data)) != -1) { // read and check for EOF
// ***************** output data ********************
if (checksumCheck(checksum) && fecfCheck(fecf)) { // if checksum is valid
output.write(data, 0, read); // write the number of bytes read before
}
// SKIP four bytes
for (int i = 0; i < 4; i++) { // or, dis.skipBytes(4); instead of the loop
dis.read();
}
}
// Close the file AFTER input stream reaches EOF
output.close(); // i.e. all the data has been written
I want to read a text file and store its contents in an array where each element of the array holds up to 500 characters from the file (i.e. keep reading 500 characters at a time until there are no more characters to read).
I'm having trouble doing this because I'm having trouble understanding the difference between all of the different ways to do IO in Java and I can't find any that performs the task I want.
And will I need to use an array list since I don't initially know how many items are in the array?
It would be hard to avoid using ArrayList or something similar. If you know the file is ASCII, you could do
int partSize = 500;
File f = new File("file.txt");
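String[] parts = new String[(int) ((f.length() + partSize - 1) / partSize)]; // length() returns a long, so cast
But if the file uses a variable-width encoding like UTF-8, this won't work, because the byte count no longer equals the character count. The following code will do the job.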
static String[] readFileInParts(String fname) throws IOException {
int partSize = 500;
FileReader fr = new FileReader(fname);
List<String> parts = new ArrayList<String>();
char[] buf = new char[partSize];
int pos = 0;
for (;;) {
int nRead = fr.read(buf, pos, partSize - pos);
if (nRead == -1) {
if (pos > 0)
parts.add(new String(buf, 0, pos));
break;
}
pos += nRead;
if (pos == partSize) {
parts.add(new String(buf));
pos = 0;
}
}
return parts.toArray(new String[parts.size()]);
}
Note that FileReader uses the platform default encoding. To specify a particular encoding, replace it with new InputStreamReader(new FileInputStream(fname), charSet). It's a bit ugly, but that's the best way to do it.
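As a sketch of that substitution, the variant below takes the charset explicitly through an InputStreamReader. ChunkReader and readInParts are hypothetical names, and accepting an InputStream instead of a file name is a choice made here only to keep the example easy to try. Note that "500 characters" means 500 Java chars (UTF-16 code units), so a surrogate pair could in principle be split across two parts.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

class ChunkReader {
    /** Decodes the stream with the given charset and splits it into parts of partSize chars. */
    static List<String> readInParts(InputStream in, Charset charset, int partSize)
            throws IOException {
        List<String> parts = new ArrayList<>();
        try (Reader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            char[] buf = new char[partSize];
            int filled = 0; // number of valid chars currently in buf
            int n;
            // read() may return fewer chars than requested, so keep filling buf
            while ((n = reader.read(buf, filled, partSize - filled)) != -1) {
                filled += n;
                if (filled == partSize) {
                    parts.add(new String(buf, 0, filled));
                    filled = 0;
                }
            }
            if (filled > 0) {
                parts.add(new String(buf, 0, filled)); // final, shorter part
            }
        }
        return parts;
    }
}
```

For a file this could be called as readInParts(new FileInputStream("file.txt"), StandardCharsets.UTF_8, 500), and the list converted with toArray if an array is really required.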
An ArrayList will definitely be more suitable as you don't know how many elements you're going to have.
There are many ways to read a file, but since you want to count characters to get 500 of them at a time, you could use the read() method of a Reader, which reads character by character. Once you have collected the 500 characters you need (in a String, I guess), just add it to your ArrayList (all of that in a loop, of course).
The Reader variable needs to be initialized with an object that extends Reader, such as an InputStreamReader (which takes an InputStream as its parameter: a FileInputStream when the input is a file).
Not sure if this will work, but you might want to try something like this (Caution: untested code):
private void doStuff() {
    ArrayList<String> stringList = new ArrayList<String>();
    BufferedReader in = null;
    try {
        in = new BufferedReader(new FileReader("file.txt"));
        StringBuilder temp = new StringBuilder(); // current chunk, kept across lines
        String str;
        while ((str = in.readLine()) != null) { // note: readLine() drops line terminators
            for (int i = 0; i < str.length(); i++) { // '<', not '<=', to stay in bounds
                temp.append(str.charAt(i));
                if (temp.length() == 500) { // chunk is full
                    stringList.add(temp.toString());
                    temp.setLength(0);
                }
            }
        }
        if (temp.length() > 0) { // keep the final, shorter chunk
            stringList.add(temp.toString());
        }
    } catch (IOException e) {
        // handle
    } finally {
        try {
            if (in != null) { // the constructor may have failed
                in.close();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
Is there a way to split a large binary file, say 50 MB, into ten files of 5 MB each?
Are there any special classes for doing this?
Thanks.
Use a FileInputStream to read the file and a FileOutputStream to write it.
Here is a simple (incomplete) example (no error handling; it copies in 1 KB chunks):
public static int split(File file, String name, int size) throws IOException {
FileInputStream input = new FileInputStream(file);
FileOutputStream output = null;
byte[] buffer = new byte[1024];
int count = 0;
boolean done = false;
while (!done) {
output = new FileOutputStream(String.format(name, count));
count += 1;
for (int written = 0; written < size; ) {
int len = input.read(buffer);
if (len == -1) {
done = true;
break;
}
output.write(buffer, 0, len);
written += len;
}
output.close();
}
input.close();
return count;
}
and called like
File input = new File("C:/data/in.gz");
String name = "C:/data/in.gz.part%02d"; // %02d will be replaced by segment number
split(input, name, 5000 * 1024);
Yes, there are. Basically, count the bytes you write to a file; when the count hits a certain limit, stop writing, reset the counter, and continue writing to another file, using a filename pattern so that you can correlate the files with each other. You can do that in a loop. Once you know how to write to files in Java, the rest is just primary-school maths.
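The counting approach might look roughly like this. FileSplitter is a hypothetical name; the sketch caps each read so that every part except possibly the last is exactly partSize bytes, and it only opens a new part once there is data to put in it, so no empty trailing file is created.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class FileSplitter {
    /** Splits source into parts named by namePattern (e.g. "in.part%02d"); returns the part count. */
    static int split(Path source, String namePattern, long partSize) throws IOException {
        if (partSize <= 0) {
            throw new IllegalArgumentException("partSize must be positive");
        }
        byte[] buffer = new byte[8192];
        int parts = 0;
        try (InputStream in = new BufferedInputStream(Files.newInputStream(source))) {
            OutputStream out = null;
            long written = 0; // bytes written to the current part
            try {
                int n;
                // Never read more than the current part can still hold.
                while ((n = in.read(buffer, 0,
                        (int) Math.min(buffer.length, partSize - written))) != -1) {
                    if (out == null) { // open the next part lazily
                        out = Files.newOutputStream(Paths.get(String.format(namePattern, parts)));
                        parts++;
                    }
                    out.write(buffer, 0, n);
                    written += n;
                    if (written == partSize) { // limit hit: close and reset the counter
                        out.close();
                        out = null;
                        written = 0;
                    }
                }
            } finally {
                if (out != null) {
                    out.close();
                }
            }
        }
        return parts;
    }
}
```

For a 50 MB input and partSize of 5L * 1024 * 1024, this would produce ten parts numbered 00 through 09 by the %02d in the pattern.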
I have a zip file whose contents are presented as byte[] but the original file object is not accessible. I want to read the contents of each of the entries. I am able to create a ZipInputStream from a ByteArrayInputStream of the bytes and can read the entries and their names. However I cannot see an easy way to extract the contents of each entry.
(I have looked at Apache Commons but cannot see an easy way there either).
UPDATE: @Rich's code seems to solve the problem, thanks.
QUERY: why do both examples use a multiplier of 4 (128/512 and 1024 * 4)?
If you want to process nested zip entries from a stream, see this answer for ideas. Because the inner entries are listed sequentially they can be processed by getting the size of each entry and reading that many bytes from the stream.
Updated with an example that copies each entry to Standard out:
ZipInputStream is;//obtained earlier
ZipEntry entry = is.getNextEntry();
while(entry != null) {
copyStream(is, out, entry);
entry = is.getNextEntry();
}
...
private static void copyStream(InputStream in, OutputStream out,
ZipEntry entry) throws IOException {
byte[] buffer = new byte[1024 * 4];
long count = 0;
int n = 0;
long size = entry.getSize();
while (-1 != (n = in.read(buffer)) && count < size) {
out.write(buffer, 0, n);
count += n;
}
}
It actually uses the ZipInputStream as the InputStream (but don't close it at the end of each entry).
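For the case in the question, where the zip arrives as a byte[], a minimal sketch could look like the following. ZipBytes and readEntries are hypothetical names. It reads each entry until read() returns -1, which ZipInputStream does at the end of the current entry, rather than relying on entry.getSize(), since that can be -1 when the size is not known up front.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

class ZipBytes {
    /** Returns entry name -> uncompressed content, in archive order. */
    static Map<String, byte[]> readEntries(byte[] zipBytes) throws IOException {
        Map<String, byte[]> contents = new LinkedHashMap<>();
        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            byte[] buffer = new byte[4096]; // buffer size is an arbitrary choice
            ZipEntry entry;
            while ((entry = zis.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue; // directories carry no content
                }
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                int n;
                while ((n = zis.read(buffer)) != -1) { // -1 marks the end of this entry
                    out.write(buffer, 0, n);
                }
                contents.put(entry.getName(), out.toByteArray());
            }
        }
        return contents;
    }
}
```

On the buffer-size query: as far as the copy loop is concerned, values such as 128, 512, or 1024 * 4 only set how many bytes are moved per read call; any positive size gives the same result.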
It's a little bit tricky to calculate the start of next ZipEntry. Please see this example included in JDK 6,
public static void main(String[] args) {
try {
ZipInputStream is = new ZipInputStream(System.in);
ZipEntry ze;
byte[] buf = new byte[128];
int len;
while ((ze = is.getNextEntry()) != null) {
System.out.println("----------- " + ze);
// Determine the number of bytes to skip and skip them.
int skip = (int)ze.getSize() - 128;
while (skip > 0) {
skip -= is.skip(Math.min(skip, 512));
}
// Read the remaining bytes and if it's printable, print them.
out: while ((len = is.read(buf)) >= 0) {
for (int i=0; i<len; i++) {
if ((buf[i]&0xFF) >= 0x80) {
System.out.println("**** UNPRINTABLE ****");
// This isn't really necessary since getNextEntry()
// automatically calls it.
is.closeEntry();
// Get the next zip entry.
break out;
}
}
System.out.write(buf, 0, len);
}
}
is.close();
} catch (Exception e) {
e.printStackTrace();
}
}