Program is delayed in writing to a .txt file? - java

So, I've searched around stackoverflow for a bit, but I can't seem to find an answer to this issue.
My current homework for my CS class involves reading from a file of 5000 random numbers and doing various things with the data, like putting it into an array, seeing how many times a number occurs, and finding what the longest increasing sequence is. I've got all that done just fine.
In addition to this, I am (for myself) adding in a method that will allow me to overwrite the file and create 5000 new random numbers to make sure my code works with multiple different test cases.
The method mostly works; however, after I call it, it doesn't seem to "activate" until after the rest of the program finishes. If I run the program and tell it to change the numbers, I have to run it again to actually see the changed values. Is there a way to fix this?
Current output showing the delay between changing the data:
Not trying to change the data here (control case):
elkshadow5$ ./CompileAndRun.sh
Create a new set of numbers? Y for yes. n
What number are you looking for? 66
66 was found 1 times.
The longest sequence is [606, 3170, 4469, 4801, 5400, 8014]
It is 6 numbers long.
The numbers should change here but they don't.
elkshadow5$ ./CompileAndRun.sh
Create a new set of numbers? Y for yes. y
What number are you looking for? 66
66 was found 1 times.
The longest sequence is [606, 3170, 4469, 4801, 5400, 8014]
It is 6 numbers long.
Now the data shows that it changed, but only on the run after the one where it should have changed:
elkshadow5$ ./CompileAndRun.sh
Create a new set of numbers? Y for yes. n
What number are you looking for? 1
1 was found 3 times.
The longest sequence is [1155, 1501, 4121, 5383, 6000]
It is 5 numbers long.
My code:
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.Scanner;

public class jeftsdHW2 {

    static Scanner input = new Scanner(System.in);

    public static void main(String args[]) throws Exception {
        jeftsdHW2 random = new jeftsdHW2();
        int[] data;
        data = new int[5000];
        random.readDataFromFile(data);
        random.overwriteRandNums();
    }

    public int countingOccurrences(int find, int[] array) {
        int count = 0;
        for (int i : array) {
            if (i == find) {
                count++;
            }
        }
        return count;
    }

    public int[] longestSequence(int[] array) {
        int[] sequence = new int[0]; // body omitted in the question; initialized so the class compiles
        return sequence;
    }

    public void overwriteRandNums() throws Exception {
        System.out.print("Create a new set of numbers? Y for yes.\t");
        String answer = input.next();
        char yesOrNo = answer.charAt(0);
        if (yesOrNo == 'Y' || yesOrNo == 'y') {
            writeDataToFile();
        }
    }

    public void readDataFromFile(int[] data) throws Exception {
        try {
            java.io.File infile = new java.io.File("5000RandomNumbers.txt");
            Scanner readFile = new Scanner(infile);
            for (int i = 0; i < data.length; i++) {
                data[i] = readFile.nextInt();
            }
            readFile.close();
        } catch (FileNotFoundException e) {
            System.out.println("Please make sure the file \"5000RandomNumbers.txt\" is in the correct directory before trying to run this.");
            System.out.println("Thank you.");
            System.exit(1);
        }
    }

    public void writeDataToFile() throws Exception {
        int j;
        StringBuilder theNumbers = new StringBuilder();
        try {
            PrintWriter writer = new PrintWriter("5000RandomNumbers.txt", "UTF-8");
            for (int i = 0; i < 5000; i++) {
                if (i > 1 && i % 10 == 0) {
                    theNumbers.append("\n");
                }
                j = (int) (9999 * Math.random());
                if (j < 1000) {
                    theNumbers.append(j + "\t\t");
                } else {
                    theNumbers.append(j + "\t");
                }
            }
            writer.print(theNumbers);
            writer.flush();
            writer.close();
        } catch (IOException e) {
            System.out.println("error");
        }
    }
}

It is possible that the file has not been physically written to the disk; using flush is not enough for this. From the Java documentation here:
If the intended destination of this stream is an abstraction provided by the underlying operating system, for example a file, then flushing the stream guarantees only that bytes previously written to the stream are passed to the operating system for writing; it does not guarantee that they are actually written to a physical device such as a disk drive.
Because of the HDD's read and write speed, it is advisable to depend as little as possible on HDD access.
Perhaps storing the random numbers in an in-memory list when re-running and using that would be a solution. You could still write the list to disk, but this way the implementation does not depend on when the file is actually written.
EDIT
After the OP posted more of their code it became apparent that my original answer is not related to the problem. Nonetheless, it is sound.
The code the OP posted writes to the file only after reading it, which of course is what is perceived as the error. Reading after writing would produce a program that does what you want.
That is, this:
random.readDataFromFile(data);
random.overwriteRandNums();
will not be reflected until the next execution. This:
random.overwriteRandNums();
random.readDataFromFile(data);
will use the updated file in the current execution.
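For concreteness, a minimal sketch of the reordered main using the OP's own class and method names; re-reading after the (optional) rewrite is what lets the current run see the fresh numbers:

public static void main(String args[]) throws Exception {
    jeftsdHW2 random = new jeftsdHW2();
    int[] data = new int[5000];
    random.overwriteRandNums();    // optionally rewrite 5000RandomNumbers.txt first
    random.readDataFromFile(data); // then read, so this run already sees the new numbers
}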

Related

Java "Quicksave" Execution

The problem I seem to have hit is one relating to loading times; I'm not running on a particularly fast machine by any means, but I still want to dabble in neural networks. In short, I have to load 336,600,000 integers into one large array (I'm using the MNIST database; each image is 28x28, which amounts to 784 pixels per image, times 45,000 images). It works fine, and surprisingly I don't run out of RAM, but... it takes 4 and a half hours just to get the data into an array.
I can supply the rest of the code if you want me to, but here's the function that runs through the file.
public static short[][] readfile(String fileName) throws FileNotFoundException, IOException {
    short[][] array = new short[10000][784];
    BufferedReader br = new BufferedReader(new FileReader(System.getProperty("user.dir") + "/MNIST/" + fileName + ".csv"));
    br.readLine();
    try {
        for (short i = 1; i < 45000; i++) {
            String line = br.readLine();
            for (short j = 0; j < 784; j++) {
                array[i][j] = Short.parseShort(line.split(",")[j]);
            }
        }
        br.close();
    } catch (IOException e) {
        e.printStackTrace();
    }
    return array;
}
What I want to know is, is there some way to "quicksave" the execution of the program so that I don't have to rebuild the array for every small tweak?
Note: I haven't touched Java in a while, and my code is mostly chunked together from a lot of different sources. I wouldn't be surprised if there were some serious errors (or just Java "no-nos"), it would actually help me a lot if you could fix them if you answer.
Edit: Bad question, I'm just blind... sorry for wasting time
Edit 2: I've decided after a while that instead of loading all of the images, and then training with them one by one, I could simply train one by one and load the next. Thank you all for your ideas!
array[i][j] = Short.parseShort(line.split(",")[j]);
You are calling String#split() for every single integer.
Call it once per line, outside the inner loop, and copy the values into your 2D array.
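A rough sketch of the inner loops with that change applied, reusing the question's variable names:

for (short i = 1; i < 45000; i++) {
    String[] tokens = br.readLine().split(",");    // split each CSV line exactly once
    for (short j = 0; j < 784; j++) {
        array[i][j] = Short.parseShort(tokens[j]); // then just index into the tokens
    }
}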

Reading a text file into an array and performing a sort in Java

I have a homework question I need help with.
We have been given a text file containing a story, one word per line.
We need to read this file into an array, perform a sort on the array and then perform a binary search.
The task also says I'll need to use an overloaded method, but I'm unsure where.
I have a bubble sort that I've tested on a small array of characters, which works:
public static void bubbleV1String(String[] numbers)
{
    for (int i = 0; i < numbers.length - 1; i++)
    {
        for (int j = 0; j < numbers.length - 1; j++)
        {
            if (numbers[j].compareTo(numbers[j+1]) > 0)
            {
                String temp = numbers[j+1];
                numbers[j+1] = numbers[j];
                numbers[j] = temp;
            }
        }
    }
}
And my binary search which I've tested on the same small array
public static String binarySearch(int[] numbers, int wanted)
{
    ArrayUtilities.bucketSort(numbers);
    int left = 0;
    int right = numbers.length - 1;
    while (left <= right)
    {
        int middle = (left + right) / 2;
        if (numbers[middle] == wanted)
        {
            return (wanted + " was found at position " + middle);
        }
        else if (numbers[middle] > wanted)
        {
            right = middle - 1;
        }
        else
        {
            left = middle + 1;
        }
    }
    return wanted + " was not found";
}
Here is my code in an app class to read in a file and sort it
String[] myArray = new String[100000];
int index = 0;
File text = new File("threebears.txt");
try {
    Scanner scan = new Scanner(text);
    while (scan.hasNextLine() && index < 100000)
    {
        myArray[index] = scan.nextLine();
        index++;
    }
    scan.close();
} catch (IOException e) {
    System.out.println("Problem with file");
    e.printStackTrace();
}
ArrayUtilities.bubbleV1String(myArray);
try {
    FileWriter outFile = new FileWriter("sorted1.txt");
    PrintWriter out = new PrintWriter(outFile);
    for (String item : myArray)
    {
        out.println(item);
    }
    out.close();
} catch (IOException e) {
    e.printStackTrace();
}
When I go to run the code, I get a null pointer exception and the following message
Exception in thread "main" java.lang.NullPointerException
at java.base/java.lang.String.compareTo(Unknown Source)
at parrayutilities.ArrayUtilities.bubbleV1String(ArrayUtilities.java:129)
at parrayutilities.binarySearchApp.main(binarySearchApp.java:32)
Line 129 refers to this line of code of my bubblesort
if (numbers[j].compareTo(numbers[j+1]) > 0)
And line 32 refers to the piece of code where I call the bubblesort
ArrayUtilities.bubbleV1String(myArray);
Does anyone know why I'm getting a null pointer exception when I've tested the bubble sort on a small string array? I'm thinking it's possibly something to do with the overloaded method mentioned earlier, but I'm not sure.
Thanks
You are creating an array of length 100000 and filling in the lines as they are read. Initially all elements are null, and after reading the file quite a number of them are likely to still be null. Thus when you sort the array, numbers[j] will eventually be a null element, and calling compareTo(...) on that will throw a NullPointerException.
To fix that you need to know where in the array the non-null part ends. You are already tracking the number of read lines in index so after reading the file that would be the index of the first null element.
Now you basically have 2 options:
Pass index to bubbleV1String() and do for(int i = 0; i < index-1; i++) etc.
Make a copy of the array after reading the lines and before sorting it:
String[] copy = new String[index];
System.arraycopy(myArray, 0, copy, 0, index);
// optional, but it can make the rest of the code easier to handle: replace myArray with copy
myArray = copy;
Finally you could also use a List<String> which would be better than using arrays but I assume that's covered by a future lesson.
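As a small variation on option 2, java.util.Arrays.copyOf does the same trim in one call; a sketch reusing the question's names:

String[] trimmed = Arrays.copyOf(myArray, index); // keeps only the filled part of the array
ArrayUtilities.bubbleV1String(trimmed);           // no null elements left to trip over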
It seems that you have some null values in your numbers array. Try to debug your code (or just print array's content) and verify what you have there. Hard to tell anything not knowing what is in your input file.
Method overloading is when multiple functions have the same name but different parameters.
e.g. (taken from wikipedia - function overloading)
// volume of a cube
int volume(const int s)
{
    return s*s*s;
}

// volume of a cylinder
double volume(const double r, const int h)
{
    return 3.1415926*r*r*static_cast<double>(h);
}
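Since the question is in Java, the same idea there would look roughly like this (method names chosen purely for illustration):

// volume of a cube
static int volume(int s) {
    return s * s * s;
}

// volume of a cylinder: same name, different parameter list, so the method is overloaded
static double volume(double r, int h) {
    return Math.PI * r * r * h;
}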
Regarding your null pointer exception: you've created an array of size 100000, but it's likely you haven't read enough lines to fill it, so part of the array is still empty (null) when you try to access it. There are multiple ways you can go about this; off the top of my head, those include array lists, dynamic arrays, or even moving the contents of the array to another one once you know the size of the contents (however, this is inefficient).

selecting random lines from huge text file

I have a very large text file (18,000,000 lines, about 4 GB), and I want to pick some random lines from it. I wrote the following piece of code to do this, but it is slow:
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class Main {
    public static void main(String[] args) throws IOException {
        int sampleSize = 3000;
        int fileSize = 18000000;
        int[] linesNumber = new int[sampleSize];
        Random r = new Random();
        for (int i = 0; i < linesNumber.length; i++) {
            linesNumber[i] = r.nextInt(fileSize);
        }
        List<Integer> list = Arrays.stream(linesNumber).boxed().collect(Collectors.toList());
        Collections.sort(list);
        BufferedWriter outputWriter = Files.newBufferedWriter(Paths.get("output.txt"));
        for (int i : list) {
            try (Stream<String> lines = Files.lines(Paths.get("huge_text_file"))) {
                String en = lines.skip(i - 1).findFirst().get();
                outputWriter.write(en + "\n");
                lines.close();
            } catch (Exception e) {
                System.err.println(e);
            }
        }
        outputWriter.close();
    }
}
Is there a more elegant, faster method to do this?
Thanks.
There are several things that I find troublesome about your current code.
You are currently loading the entire file into RAM. I don't know much about your sample file, but the one I used crashed my default JVM.
You are skipping the same lines over and over again, more so for the earlier lines - this is horribly inefficient, like O(n^n) or something. I would be surprised if you could handle even a 500MB file with that approach.
Here's what I came up with:
public static void main(String[] args) throws IOException {
    int sampleSize = 3000;
    int fileSize = 50000;
    int[] linesNumber = new int[sampleSize];
    Random r = new Random();
    for (int i = 0; i < linesNumber.length; i++) {
        linesNumber[i] = r.nextInt(fileSize);
    }
    List<Integer> list = Arrays.stream(linesNumber).boxed().collect(Collectors.toList());
    Collections.sort(list);
    BufferedWriter outputWriter = Files.newBufferedWriter(Paths.get("localOutput/output.txt"));
    long t1 = System.currentTimeMillis();
    try (BufferedReader reader = new BufferedReader(new FileReader("extremely large file.txt")))
    {
        int index = 0;        // keep track of what item we're on in the list
        int currentIndex = 0; // keep track of what line we're on in the input file
        while (index < sampleSize) // while we still haven't finished the list
        {
            if (currentIndex == list.get(index)) // if we reach a chosen line
            {
                outputWriter.write(reader.readLine());
                outputWriter.write("\n"); // readLine doesn't include the newline characters
                while (index < sampleSize && list.get(index) <= currentIndex) // have to put this here in case of duplicates in the list
                    index++;
            }
            else
                reader.readLine(); // readLine is dang fast. There may be faster ways to skip a line, but this is still plenty fast.
            currentIndex++;
        }
    } catch (Exception e) {
        System.err.println(e);
    }
    outputWriter.close();
    System.out.println(String.format("Took %d milliseconds", System.currentTimeMillis() - t1));
}
This takes ~87 milliseconds for me on a 4.7GB file running with a sample size of 30 and a fileSize of 50000, and took ~91 milliseconds when I changed the sample size to 3000. It took 122 milliseconds when I increased the fileSize to 10,000. TL;DR for this paragraph: it scales pretty well, and it scales extremely well with larger sample sizes.
In direct answer to your question "is there a more elegant, faster method to do this?": yes, there is. The faster way to do it is to skip lines yourself, don't load the entire file into memory, and make sure to keep using buffered readers and writers. Also, I'd avoid trying to do your own raw array buffers or anything like that - just don't.
Feel free to step through the method I've included if you want to see more of how it works.
My first cut at an approach would be to have a look at random access files in Java, cf. https://docs.oracle.com/javase/tutorial/essential/io/rafs.html. Typically random seeks will be a lot faster than reading the whole file, but you'd then need to read byte by byte to get to the beginning of the next line (for example), then read that line byte by byte up to the next newline, then seek to another random location.
I'm not sure the approach would be more elegant (depends partly on how you code it I guess), but I'd expect it to be faster.
There is no efficient way to seek to a line by its number. The only thing I can think of is using a RandomAccessFile, seeking to a random position and then reading the next 200(?) characters into an array. Then find the line breaks and form a String.
(See the RandomAccessFile documentation.)
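A minimal sketch of that general idea, using RandomAccessFile.readLine() instead of manually scanning a 200-character buffer for line breaks; the input file name is taken from the question, the class name is only for illustration, and the result is biased toward lines that follow long lines:

import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.Random;

public class RandomLineSketch {
    public static void main(String[] args) throws IOException {
        Random r = new Random();
        try (RandomAccessFile raf = new RandomAccessFile("huge_text_file", "r")) {
            long pos = (long) (r.nextDouble() * raf.length());
            raf.seek(pos);                // jump to a random byte offset
            raf.readLine();               // discard the (probably partial) line we landed in
            String line = raf.readLine(); // take the next full line; null means we hit end of file
            System.out.println(line);
        }
    }
}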

Java Sort Data From A File Method writeToFile

I have to write a method to write data to a file. It has to take an array of integers as a parameter and write them to a file, but I am getting an error on these lines:
Integer[] x = val.toArray(new Integer[val.size(25)]);
if (x < 0) break;
public static void writeToFile(String filename, int[] x) throws IOException {
    PrintWriter outputWriter = new PrintWriter("integers.txt");
    System.out.println("Please enter 25 scores.");
    System.out.println("You must hit enter after you enter each score.");
    Scanner sc = new Scanner(System.in);
    int score = 0;
    while (score < 25) {
        int val = sc.nextInt();
        Integer[] x = val.toArray(new Integer[val.size(25)]);
        if (x < 0) break;
        outputWriter.println(x);
        score++;
    }
    outputWriter.flush();
    outputWriter.close();
}
There are a couple of things. First off, you are trying to do things that are not possible to do with an int. Look at the ever-helpful Java API when trying to use a class:
http://docs.oracle.com/javase/7/docs/api/java/lang/Integer.html
Next, if I were writing your program (which I'm not and I do not intend to), I would watch your instantiations. The array is being instantiated every loop, which means that you will have a new array every time the user puts in a value, meaning all the previous numbers are going to be lost. Also, take the integer array out of the parameters. You aren't even using it in the method.
Instantiate your array outside of the loop with a size of 25 elements:
int[] array = new int[25];
Now, you can place the items in this array every loop like this:
array[score] = val;
This places the value in the indexes 0 -> 24. It seems to me that in order to truly understand how to do this program you are going to have to have a refresher on arrays and how they work.
Finally, the computer sees this method as a sequence. So, line by line, think about what is happening in your program. Ideally, this is what should be happening (see the sketch after this list):
Instantiate your objects: the Scanner, the array (where the ints are stored), the PrintWriter.
Give the user instructions on how to use the program.
Run a loop 25 times doing this:
- scanning in an int
- placing the int into the array at the appropriate index
Write the array into the file.
Flush the writer.
Close the writer.
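A minimal sketch of that sequence, with the int[] parameter dropped as suggested and the filename parameter actually used for the output file (requires java.io.PrintWriter, java.io.IOException and java.util.Scanner):

public static void writeToFile(String filename) throws IOException {
    PrintWriter outputWriter = new PrintWriter(filename); // use the parameter instead of a hard-coded name
    Scanner sc = new Scanner(System.in);
    int[] scores = new int[25];                           // instantiate the array once, outside the loop
    System.out.println("Please enter 25 scores.");
    System.out.println("You must hit enter after you enter each score.");
    for (int score = 0; score < 25; score++) {
        scores[score] = sc.nextInt();                     // place each value at the next index
    }
    for (int value : scores) {
        outputWriter.println(value);                      // write the array into the file
    }
    outputWriter.flush();
    outputWriter.close();
}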

How do I use BigInteger in my code for calculating primes?

I'm using this basic and crude code below for calculating prime numbers then exporting them to a text file:
import java.util.Scanner;
import java.io.*;

public class primeGenerator {
    public static void main(String[] args) throws Exception {
        Scanner kb = new Scanner(System.in);
        String prime;
        long num = kb.nextLong();
        long i;
        long z = 0;
        while (z == 0) {
            for (i = 2; i < num; i++) {
                long n = num % i;
                if (n == 0) {
                    break;
                }
            }
            if (i == num) {
                writer(num);
            }
            num = num + 2;
        }
    }

    public static void writer(long num) throws Exception {
        FileWriter writer = new FileWriter("prime.txt", true);
        String prime = "" + num;
        writer.write(prime);
        writer.write(" ");
        writer.flush();
        writer.close();
    }
}
I would like to find primes beyond the primitive long's range, and apparently BigInteger is the way to go about it. So how do I alter my code to do so?
Do you really need this? Having numbers bigger than can be handled by long means you want to test numbers bigger than 9223372036854775807. If your for-loop can test a hundred million divisions per second, it will still take it 2923 years to determine if that number is prime - and longer for larger numbers, of course.
A common optimization is to only test divisors up to sqrt(num). If you haven't found one by then, the number is prime.
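For illustration, that optimization on plain longs looks something like this; a sketch of a helper method, not a drop-in replacement for the code above:

// Trial division up to sqrt(num): if no divisor is found by then, num is prime.
public static boolean isPrime(long num) {
    if (num < 2) {
        return false;
    }
    for (long i = 2; i <= num / i; i++) { // i <= num / i avoids overflowing i * i near Long.MAX_VALUE
        if (num % i == 0) {
            return false;
        }
    }
    return true;
}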
Well, use BigInteger wherever you've currently got long. Instead of using % you'll use mod, instead of incrementing you'll use i = i.add(BigInteger.ONE), instead of == 0 you'll use equals(BigInteger.ZERO) etc.
Use Scanner.nextBigInteger instead of Scanner.nextLong, too.
Given that this looks like homework of some description (possibly self-set, of course) I won't write out the whole code for you - but if you have specific problems, feel free to ask.
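To make the translation concrete, the trial-division loop from the original program would look roughly like this with BigInteger; this is just the loop idiom, not the whole program, and it assumes import java.math.BigInteger plus num read via kb.nextBigInteger():

BigInteger i;
for (i = BigInteger.valueOf(2); i.compareTo(num) < 0; i = i.add(BigInteger.ONE)) {
    if (num.mod(i).equals(BigInteger.ZERO)) {
        break; // found a divisor, so num is not prime
    }
}
if (i.equals(num)) {
    // num is prime; append it to the file as before
}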
