So I am trying to create a program which takes a text file, creates an index (by line numbers) for all the words in the file and writes the index into the output file. Here is the main class:
import java.util.Scanner;
import java.io.*;
public class IndexMaker
{
public static void main(String[] args) throws IOException
{
Scanner keyboard = new Scanner(System.in);
String fileName;
// Open input file:
if (args.length > 0)
fileName = args[0];
else
{
System.out.print("\nEnter input file name: ");
fileName = keyboard.nextLine().trim();
}
BufferedReader inputFile =
new BufferedReader(new FileReader(fileName), 1024);
// Create output file:
if (args.length > 1)
fileName = args[1];
else
{
System.out.print("\nEnter output file name: ");
fileName = keyboard.nextLine().trim();
}
PrintWriter outputFile =
new PrintWriter(new FileWriter(fileName));
// Create index:
DocumentIndex index = new DocumentIndex();
String line;
int lineNum = 0;
while ((line = inputFile.readLine()) != null)
{
lineNum++;
index.addAllWords(line, lineNum);
}
// Save index:
for (IndexEntry entry : index)
outputFile.println(entry);
// Finish:
inputFile.close();
outputFile.close();
keyboard.close();
System.out.println("Done.");
}
}
The program contains two more classes: IndexEntry, which represents one index entry, and DocumentIndex, which represents the entire index for a document, that is, the list of all its index entries. The index entries should always be arranged in alphabetical order. The implementations of these two classes are shown below:
import java.util.ArrayList;
public class IndexEntry {
private String word;
private ArrayList<Integer> numsList;
public IndexEntry(String w) {
word = w.toUpperCase();
numsList = new ArrayList<Integer>();
}
public void add(int num) {
if (!numsList.contains(num)) {
numsList.add(num);
}
}
public String getWord() {
return word;
}
public String toString() {
String result = word + " ";
for (int i=0; i<numsList.size(); i++) {
if (i == 0) {
result += numsList.get(i);
} else {
result += ", " + numsList.get(i);
}
}
return result;
}
}
import java.util.ArrayList;
public class DocumentIndex extends ArrayList<IndexEntry> {
public DocumentIndex() {
super();
}
public DocumentIndex(int c) {
super(c);
}
public void addWord(String word, int num) {
super.get(foundOrInserted(word)).add(num);
}
private int foundOrInserted(String word) {
int result = 0;
for (int i=0; i<super.size(); i++) {
String w = super.get(i).getWord();
if (word.equalsIgnoreCase(w)) {
result = i;
} else if (w.compareTo(word) > 0) {
super.add(i, new IndexEntry(w));
result = i;
}
}
return result;
}
public void addAllWords(String str, int num) {
String[] arr = str.split("[^A-Za-z]+");
for (int i=0; i<arr.length; i++) {
if (arr[i].length() > 0 ) {
addWord(arr[i], num);
}
}
}
}
When I run this program I'm getting an error and I'm not sure where the error came from.
Exception in thread "main" java.lang.IndexOutOfBoundsException: Index 0 out of bounds for length 0
at java.base/jdk.internal.util.Preconditions.outOfBounds(Preconditions.java:64)
at java.base/jdk.internal.util.Preconditions.outOfBoundsCheckIndex(Preconditions.java:70)
at java.base/jdk.internal.util.Preconditions.checkIndex(Preconditions.java:248)
at java.base/java.util.Objects.checkIndex(Objects.java:372)
at java.base/java.util.ArrayList.get(ArrayList.java:459)
at DocumentIndex.addWord(DocumentIndex.java:14)
at DocumentIndex.addAllWords(DocumentIndex.java:35)
at Main.main(Main.java:53)
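The loop in main is not where the problem arises:
String line;
int lineNum = 0;
while ((line = inputFile.readLine()) != null)
{
lineNum++;
index.addAllWords(line, lineNum);
}
lineNum is only a value stored in an index entry. It is never used as a position in a list, and incrementing it before the call correctly produces 1-based line numbers.
The stack trace shows the exception coming from ArrayList.get, called from addWord. For the very first word the index is still empty, so the loop in foundOrInserted never runs, the method returns 0, and super.get(0) fails with "Index 0 out of bounds for length 0". Despite its name, foundOrInserted never inserts into an empty list or after the last entry. It also keeps looping after a match or an insertion instead of returning, it builds the new entry from w (the existing word) rather than from word, and it compares the upper-cased w against word without upper-casing word first.
The method needs to return as soon as it finds or inserts the word, and to append a new entry when the loop finishes without doing either.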
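Since the stack trace ends in ArrayList.get called from addWord, it is also worth making foundOrInserted return a valid index in every case, including for an empty list. The following is only a minimal sketch of that idea: SortedIndexSketch, Entry, lines and words() are hypothetical stand-ins for DocumentIndex and IndexEntry, not the original classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for DocumentIndex/IndexEntry, reduced to the lookup logic.
class SortedIndexSketch {
    static class Entry {
        final String word;
        final List<Integer> lines = new ArrayList<>();
        Entry(String w) { word = w.toUpperCase(); }
    }

    private final List<Entry> entries = new ArrayList<>();

    // Returns the position of the entry for word, inserting a new entry if needed.
    int foundOrInserted(String word) {
        String key = word.toUpperCase();        // entries are stored upper-cased
        for (int i = 0; i < entries.size(); i++) {
            int cmp = entries.get(i).word.compareTo(key);
            if (cmp == 0) {
                return i;                       // already present
            }
            if (cmp > 0) {
                entries.add(i, new Entry(key)); // first entry that sorts after key
                return i;
            }
        }
        entries.add(new Entry(key));            // empty list, or key sorts last
        return entries.size() - 1;
    }

    void addWord(String word, int lineNum) {
        Entry e = entries.get(foundOrInserted(word));
        if (!e.lines.contains(lineNum)) {
            e.lines.add(lineNum);
        }
    }

    // Lists the indexed words in their stored order.
    List<String> words() {
        List<String> result = new ArrayList<>();
        for (Entry e : entries) {
            result.add(e.word);
        }
        return result;
    }
}
```

Every path through foundOrInserted either returns inside the loop or appends after it, so the index handed to get always exists.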
Related
I have a file from which I need to read input. On one of the lines, no name has been added. In this case, I want to print out that no match was found. The problem I'm having is that I don't know how to make sure the program actually reads that part as an empty string. What happens now is that the program just leaves the line empty on the console.
The data input looks like this:
5=20=22=10=2=0=0=1=0=1;Vincent Appel,Johannes Mondriaan
2=30=15=8=4=3=2=0=0=0;
class Administration {
public static final int TOTAL_NUMBER_OF_SIMULARITY_SCORES = 10;
public static final String ZERO_MATCHES = "_";
public static final String LESS_THAN_TWENTY_MATCHES= "-";
public static final String TWENTY_OR_MORE_MATCHES = "^";
PrintStream out;
Administration() {
out = new PrintStream(System.out);
}
void printSimilarityScores (Scanner similarityScoresScanner, String similarityScoresInput) {
similarityScoresScanner.useDelimiter("=|;");
int length = similarityScoresInput.length();
for (int i = 0; i < TOTAL_NUMBER_OF_SIMULARITY_SCORES; i++) {
int grade = similarityScoresScanner.nextInt();
if (grade == 0) {
out.printf(ZERO_MATCHES);
} else if (grade < 20) {
out.printf(LESS_THAN_TWENTY_MATCHES);
} else {
out.printf(TWENTY_OR_MORE_MATCHES);
}
}
System.out.print("\n");
similarityScoresScanner.useDelimiter(";|,");
while(similarityScoresScanner.hasNext()) {
String name = similarityScoresScanner.next();
if (length < 22) {
out.printf("No matches found\n");
} else {
System.out.print("\n" + name);
}
}
}
void start() {
Scanner fileScanner = UIAuxiliaryMethods.askUserForInput().getScanner();
while (fileScanner.hasNext()) {
String finalGradeInput = fileScanner.nextLine();
String similarityScoresInput = fileScanner.nextLine();
Scanner finalGradeInputScanner = new Scanner(finalGradeInput);
Scanner similarityScoresScanner = new Scanner(similarityScoresInput);
printFinalGrade(finalGradeInputScanner);
printSimilarityScores(similarityScoresScanner, similarityScoresInput);
}
}
public static void main(String[] argv) {
new Administration().start();
}
}
An easier solution would be to read the file line by line and handle each line like this:
split it by the separators,
check whether there is more than one element,
if so, print the elements after the first one; otherwise report that no match was found.
Scanner similarityScoresScanner = new Scanner(myFile);
while (similarityScoresScanner.hasNextLine()) {
String[] content = similarityScoresScanner.nextLine().split("[;,]");
if (content.length == 1) {
System.out.println("No matches found");
} else {
for (int i = 1; i < content.length; i++) {
System.out.println(content[i]);
}
}
}
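The length check above works because String.split drops trailing empty strings, so a line that ends in ";" with nothing after it yields a single element. A small sketch of that behaviour, using the two sample lines from the question; the class SplitTrailingDemo and the helper namesOf are made up for illustration.

```java
import java.util.Arrays;

class SplitTrailingDemo {
    // Returns the names after the score field, or an empty array when there are none.
    static String[] namesOf(String line) {
        String[] content = line.split("[;,]");   // trailing empty strings are removed
        return Arrays.copyOfRange(content, 1, content.length);
    }

    public static void main(String[] args) {
        String withNames = "5=20=22=10=2=0=0=1=0=1;Vincent Appel,Johannes Mondriaan";
        String withoutNames = "2=30=15=8=4=3=2=0=0=0;";
        for (String line : new String[] {withNames, withoutNames}) {
            String[] names = namesOf(line);
            if (names.length == 0) {
                System.out.println("No matches found");
            } else {
                System.out.println(String.join(", ", names));
            }
        }
    }
}
```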
I have a program that takes in a file of unindented code and comments. The program reads the specified file and outputs an indented version of the code.
I keep getting the error java.lang.ArrayIndexOutOfBoundsException: 1. This seems to occur when a line contains only a comment, because when the string is split the resulting array only has an element at index 0. I have an if statement in place to handle a comment on a line of its own, but it still throws the exception.
Would I need to implement an if statement to check whether or not the split string has more than 1 part to it?
import java.io.*;
import java.util.*;
class Program
{
public static int spaces = 0;
public static int longestLine = 0;
public static int commentSpaces;
public static String beforeComment;
public static String afterComment;
public static void main(String args[]) throws FileNotFoundException
{
Scanner input2 = new Scanner(new File("C:\\Users\\James\\Music\\code.java")); //get text from file
while (input2.hasNextLine() == true) { //get the longest line
String text = input2.nextLine();
if (text.contains("//")) {
if (text.contains("\"//")) {
printLine(text);
}
String[] parts = text.split("//");
String codeOnly = parts[0];
if (codeOnly.length() > longestLine) {
longestLine = codeOnly.length();
}
}
else {
if (text.length() > longestLine) {
longestLine = text.length();
}
}
if (input2.hasNextLine() == false) {
break;
}
}
Scanner input3 = new Scanner(new File("C:\\Users\\James\\Music\\code.java"));
while (input3.hasNextLine()) { //indent comments
String text = input3.nextLine();
if (text.contains("}")) {
spaces -=2;
}
for (int i = 0; i < spaces; i++) {
System.out.print(" ");
}
if (text.startsWith("//")){
String justComment = text;
commentSpaces = longestLine - spaces + 6;
for (int i = 0; i < commentSpaces; i++) {
System.out.print(" ");
}
printLine(justComment);
System.out.println(" ");
}
if (text.contains("\"//")) {
printLine(text);
}
if (text.contains("//")) {
String[] parts = text.split("(?=//)");
beforeComment = parts[0].trim(); // trim() to get rid of any spaces that are already present within the code
afterComment = parts[1];
printLine(beforeComment);
commentSpaces = longestLine - beforeComment.length() - spaces + 5;
for (int i = 0; i < commentSpaces; i++) {
System.out.print(" ");
}
printLine(afterComment);
System.out.println();
}
else {
printLine(text);
System.out.println();
}
if (text.contains("{")) {
spaces +=2;
}
}
}
public static void printLine(String text) {
Scanner data = new Scanner(text);
while (data.hasNext()) {
System.out.print(" " + data.next());
}
}
public static void yesItContains() {
System.out.print("It contains a string");
System.exit(0);
}
}
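I think that if text starts with "//", meaning the line is a comment on its own, your parts will only have length 1: on Java 8 and later, split("(?=//)") does not produce an empty leading element for a zero-width match at the start of the string. The block for text.startsWith("//") does not prevent this, because the following if (text.contains("//")) is a separate if rather than an else if, so it runs for the same line. So yes, you need to check it, e.g. via afterComment = parts.length > 1 ? parts[1] : "";. Note that lines like "something // something else // blabla" might break that logic as well.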
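A hedged sketch of the length check in context. CommentSplitDemo and splitComment are hypothetical names, and the sketch does not try to recognise "//" inside string literals. Passing a limit of 2 to split keeps everything after the first "//" together, which covers lines with several "//" sequences.

```java
class CommentSplitDemo {
    // Splits a line into {code, comment}; either part may be empty.
    static String[] splitComment(String text) {
        if (text.startsWith("//")) {
            return new String[] {"", text};       // comment on a line of its own
        }
        String[] parts = text.split("(?=//)", 2); // limit 2: split at the first "//" only
        String code = parts[0].trim();
        String comment = parts.length > 1 ? parts[1] : ""; // guard against a single part
        return new String[] {code, comment};
    }

    public static void main(String[] args) {
        for (String line : new String[] {"// only a comment", "int x = 1; // a // b", "int y = 2;"}) {
            String[] parts = splitComment(line);
            System.out.println("[" + parts[0] + "] [" + parts[1] + "]");
        }
    }
}
```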
I am making a multiple string input random swap without using a temp variable.
But when I input, this happens a few times:
This happens more frequently... (note that the first output is always null and some outputs occasionally repeat)
My code:
import java.util.Arrays;
import java.util.Scanner;
public class myFile {
public static boolean contains(int[] array, int key) {
Arrays.sort(array);
return Arrays.binarySearch(array, key) >= 0;
}
public static void println(Object line) {
System.out.println(line);
}
public static void main(String[] args) {
Scanner in = new Scanner(System.in);
String finalText = "";
String[] input = new String[5];
String[] swappedInput = new String[input.length];
int[] usedIndex = new int[input.length];
int swapCounter = input.length, useCounter;
for (int inputCounter = 0; inputCounter < input.length; inputCounter++) { //input
println("Enter input 1 " + (inputCounter + 1) + ": ");
input[inputCounter] = in.nextLine();
}
while (--swapCounter > 0) {
do{
useCounter = (int) Math.floor(Math.random() * input.length);
}
while (contains(usedIndex, useCounter));
swappedInput[swapCounter] = input[swapCounter].concat("#" + input[useCounter]);
swappedInput[useCounter] = swappedInput[swapCounter].split("#")[0];
swappedInput[swapCounter] = swappedInput[swapCounter].split("#")[1];
usedIndex[useCounter] = useCounter;
}
for (int outputCounter = 0; outputCounter < input.length; outputCounter++) {
finalText = finalText + swappedInput[outputCounter] + " ";
}
println("The swapped inputs are: " + finalText + ".");
}
}
Because of randomness, useCounter is sometimes the same as swapCounter. Now look at these lines (assume useCounter and swapCounter are equal):
swappedInput[swapCounter] = input[swapCounter].concat("#" + input[useCounter]);
swappedInput[useCounter] = swappedInput[swapCounter].split("#")[0];
swappedInput[swapCounter] = swappedInput[swapCounter].split("#")[1];
The first line stores xxx#xxx in the slot, and the second line overwrites that same slot with xxx. In the third line, split("#") therefore returns an array with only one element, so indexing it with [1] throws the exception. In addition, you should not use swappedInput, because it defeats the purpose: if I understand correctly, you are not supposed to use temporary values, and an additional array is worse than a temporary variable. The correct solution is to use only the input array. Here is the solution:
import java.util.Arrays;
import java.util.Scanner;
public class myFile {
public static boolean contains(int[] array, int key) {
Arrays.sort(array);
return Arrays.binarySearch(array, key) >= 0;
}
public static void println(Object line) {
System.out.println(line);
}
public static void main(String[] args) {
Scanner in = new Scanner(System.in);
String finalText = "";
String[] input = new String[5];
int[] usedIndex = new int[input.length];
int swapCounter = input.length, useCounter;
for (int inputCounter = 0; inputCounter < input.length; inputCounter++) { //input
println("Enter input 1 " + (inputCounter + 1) + ": ");
input[inputCounter] = in.nextLine();
}
while (--swapCounter >= 0) {
do {
useCounter = (int) Math.floor(Math.random() * input.length);
}
while (contains(usedIndex, useCounter));
// Skip if results are the same
if (useCounter == swapCounter) {
swapCounter++;
continue;
}
input[swapCounter] = input[swapCounter].concat("#" + input[useCounter]);
input[useCounter] = input[swapCounter].split("#")[0];
input[swapCounter] = input[swapCounter].split("#")[1];
usedIndex[useCounter] = useCounter;
}
for (int outputCounter = 0; outputCounter < input.length; outputCounter++) {
finalText = finalText + input[outputCounter] + " ";
}
println("The swapped inputs are: " + finalText + ".");
}
}
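One thing to watch in the listing above: usedIndex starts out filled with zeros, so index 0 always looks used, and contains sorts the array in place on every call. As an alternative that needs no bookkeeping array, here is a rough sketch assuming a Fisher-Yates shuffle is acceptable for the task. ShuffleNoTemp and its methods are hypothetical names, the swap uses string lengths instead of a "#" separator (so inputs containing "#" are safe), and it assumes no element is null.

```java
import java.util.Arrays;
import java.util.Random;

class ShuffleNoTemp {
    // Swaps a[i] and a[j] without a temporary String, using lengths instead of a separator.
    static void swap(String[] a, int i, int j) {
        if (i == j) {
            return;                                              // self-swap would corrupt the value
        }
        a[i] = a[i] + a[j];                                      // both values, concatenated
        a[j] = a[i].substring(0, a[i].length() - a[j].length()); // old a[i]
        a[i] = a[i].substring(a[j].length());                    // old a[j]
    }

    // Fisher-Yates shuffle: position i is swapped with a random position in 0..i.
    static void shuffle(String[] a, Random rnd) {
        for (int i = a.length - 1; i > 0; i--) {
            swap(a, i, rnd.nextInt(i + 1));
        }
    }

    public static void main(String[] args) {
        String[] input = {"one", "two", "three", "four", "five"};
        shuffle(input, new Random());
        System.out.println("The swapped inputs are: " + Arrays.toString(input));
    }
}
```

Every element stays in the array exactly once, so nothing comes out null or repeated.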
Here is my input file
So I am reading in a .txt file and I keep getting a string index out of bounds exception. I have been trying to find duplicate words and keep the array sorted as I add words to it. I thought my problem was trying to sort and search the array when it has no words or only one word in it.
The line with the ** in front of it is the problem line. It's line 129.
import java.io.*;
import java.util.Scanner;
import java.util.regex.*;
public class BuildDict
{
static String dict[] = new String[20];
static int index = 0;
public static void main(String args [])
{
readIn();
print();
}
public static void readIn()
{
File inFile = new File("carol.txt");
try
{
Scanner scan = new Scanner(inFile);
while(scan.hasNext())
{
String word = scan.next();
if(!Character.isUpperCase(word.charAt(0)))
{
checkRegex(word);
}
}
scan.close();
}
catch(IOException e)
{
System.out.println("Error");
}
}
public static void addToDict(String word)
{
if(index == dict.length)
{
String newAr[] = new String[dict.length*2];
for(int i = 0; i < index; i++)
{
newAr[i] = dict[i];
}
if(dict.length < 2)
{
newAr[index] = word;
index++;
}
else
{
bubbleSort(word);
if(!wordHasDuplicate(word))
{
newAr[index] = word;
index++;
}
}
dict = newAr;
}
else
{
dict[index] = word;
index++;
}
}
public static void checkRegex(String word)
{
String regex = ("[^A-Za-z]");
Pattern check = Pattern.compile(regex);
Matcher regexMatcher = check.matcher(word);
if(!regexMatcher.find())
{
addToDict(word);
}
}
public static void print()
{
try
{
FileWriter outFile = new FileWriter("dict.txt");
for(int i = 0; i < index; i++)
{
outFile.write(dict[i]);
outFile.write(" \n ");
}
outFile.close();
}
catch (IOException e)
{
System.out.println("Error ");
}
}
public static void bubbleSort(String word)
{
boolean swap = true;
String temp;
int wordBeforeIndex = 0;
String wordBefore;
while(swap)
{
swap = false;
wordBefore = dict[wordBeforeIndex];
for(int i = 0; (i < word.length()) && (i < wordBefore.length()) i++)
{
**if(word.charAt(i) < wordBefore.charAt(i))**
{
temp = wordBefore;
dict[wordBeforeIndex] = word;
dict[wordBeforeIndex++] = temp;
wordBeforeIndex++;
swap = true;
}
}
}
}
public static boolean wordHasDuplicate(String word)
{
int low = 0;
int high = dict.length - 1;
int mid = low + (high - low) /2;
while (low <= high && dict[mid] != word)
{
if (word.compareTo(dict[mid]) < 0)
{
low = mid + 1;
}
else
{
high = mid + 1;
}
}
return true;
}
}
Error is shown below:
Exception in thread "main" java.lang.StringIndexOutOfBoundsException: String index out of range: 2
at java.lang.String.charAt(String.java:658)
at BuildDict.bubbleSort(BuildDict.java:129)
at BuildDict.addToDict(BuildDict.java:60)
at BuildDict.checkRegex(BuildDict.java:90)
at BuildDict.readIn(BuildDict.java:30)
at BuildDict.main(BuildDict.java:14)
Check the length of wordBefore as a second condition of your for loop, so that charAt is never called with an index past the end of the shorter word:
for(int i = 0; (i < word.length()) && (i < wordBefore.length()); i++)
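To illustrate the bounded comparison in isolation, here is a small sketch. PrefixCompareDemo and comesBefore are hypothetical names, and unlike the loop in the question this version stops at the first differing character, which is an assumption about the intended ordering. For plain letters it agrees with word.compareTo(other) < 0.

```java
class PrefixCompareDemo {
    // Returns true when word sorts strictly before other, character by character.
    static boolean comesBefore(String word, String other) {
        int limit = Math.min(word.length(), other.length()); // never index past the shorter word
        for (int i = 0; i < limit; i++) {
            char a = word.charAt(i);
            char b = other.charAt(i);
            if (a != b) {
                return a < b;                                // first difference decides
            }
        }
        return word.length() < other.length();               // one word is a prefix of the other
    }

    public static void main(String[] args) {
        System.out.println(comesBefore("to", "tomorrow"));
        System.out.println(comesBefore("tomorrow", "to"));
    }
}
```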
I've asked this question before ( Counting distinct words with Threads ) and made the code more appropriate. As described in first question I need to count the distinct words from a file.
Debugging shows that all my words are stored and sorted correctly, but the issue now is an infinite while loop in the Test class that keeps going after all the words have been read (the debugger really helped to figure out some points...).
I'm testing the code on a small file now with no more than 10 words.
DataSet class has been modified mostly.
I need some advice on how to get out of the loop.
Test looks like this:
package test;
import java.io.File;
import java.io.IOException;
import junit.framework.Assert;
import junit.framework.TestCase;
import main.DataSet;
import main.WordReader;
public class Test extends TestCase
{
public void test2() throws IOException
{
File words = new File("resources" + File.separator + "test2.txt");
if (!words.exists())
{
System.out.println("File [" + words.getAbsolutePath()
+ "] does not exist");
Assert.fail();
}
WordReader wr = new WordReader(words);
DataSet ds = new DataSet();
String nextWord = wr.readNext();
// This is the loop
while (nextWord != "" && nextWord != null)
{
if (!ds.member(nextWord))
{
ds.insert(nextWord);
}
nextWord = wr.readNext();
}
wr.close();
System.out.println(ds.toString());
System.out.println(words.toString() + " contains " + ds.getLength()
+ " distinct words");
}
}
Here is my updated DataSet class, especially the member() method. I'm still not sure about it, because at some point I used to get a NullPointerException (I don't know why...):
package main;
import sort.Sort;
public class DataSet
{
private String[] data;
private static final int DEFAULT_VALUE = 200;
private int nextIndex;
private Sort bubble;
public DataSet(int initialCapacity)
{
data = new String[initialCapacity];
nextIndex = 0;
bubble = new Sort();
}
public DataSet()
{
this(DEFAULT_VALUE);
nextIndex = 0;
bubble = new Sort();
}
public void insert(String value)
{
if (nextIndex < data.length)
{
data[nextIndex] = value;
nextIndex++;
bubble.bubble_sort(data, nextIndex);
}
else
{
expandCapacity();
insert(value);
}
}
public int getLength()
{
return nextIndex + 1;
}
public boolean member(String value)
{
for (int i = 0; i < data.length; i++)
{
if (data[i] != null && nextIndex != 10)
{
if (data[i].equals(value))
return true;
}
}
return false;
}
private void expandCapacity()
{
String[] larger = new String[data.length * 2];
for (int i = 0; i < data.length; i++)
{
data = larger;
}
}
}
WordReader class didn't change much. ArrayList was replaced with simple array, storing method also has been modified:
package main;
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
public class WordReader
{
private File file;
private String[] words;
private int nextFreeIndex;
private BufferedReader in;
private int DEFAULT_SIZE = 200;
private String word;
public WordReader(File file) throws IOException
{
words = new String[DEFAULT_SIZE];
in = new BufferedReader(new FileReader(file));
nextFreeIndex = 0;
}
public void expand()
{
String[] newArray = new String[words.length * 2];
// System.arraycopy(words, 0, newArray, 0, words.length);
for (int i = 0; i < words.length; i++)
newArray[i] = words[i];
words = newArray;
}
public void read() throws IOException
{
}
public String readNext() throws IOException
{
char nextCharacter = (char) in.read();
while (in.ready())
{
while (isWhiteSpace(nextCharacter) || !isCharacter(nextCharacter))
{
// word = "";
nextCharacter = (char) in.read();
if (!in.ready())
{
break;
}
}
word = "";
while (isCharacter(nextCharacter))
{
word += nextCharacter;
nextCharacter = (char) in.read();
}
storeWord(word);
return word;
}
return word;
}
private void storeWord(String word)
{
if (nextFreeIndex < words.length)
{
words[nextFreeIndex] = word;
nextFreeIndex++;
}
else
{
expand();
storeWord(word);
}
}
private boolean isWhiteSpace(char next)
{
if ((next == ' ') || (next == '\t') || (next == '\n'))
{
return true;
}
return false;
}
private boolean isCharacter(char next)
{
if ((next >= 'a') && (next <= 'z'))
{
return true;
}
if ((next >= 'A') && (next <= 'Z'))
{
return true;
}
return false;
}
public boolean fileExists()
{
return file.exists();
}
public boolean fileReadable()
{
return file.canRead();
}
public Object wordsLength()
{
return words.length;
}
public void close() throws IOException
{
in.close();
}
public String[] getWords()
{
return words;
}
}
And the bubble sort class has been changed to work with strings:
package sort;
public class Sort
{
public void bubble_sort(String a[], int length)
{
for (int j = 0; j < length; j++)
{
for (int i = j + 1; i < length; i++)
{
if (a[i].compareTo(a[j]) < 0)
{
String t = a[j];
a[j] = a[i];
a[i] = t;
}
}
}
}
}
I suppose the method that actually blocks is the WordReader.readNext(). My suggestion there is that you use Scanner instead of BufferedReader, it is more suitable for parsing a file into words.
Your readNext() method could be redone as such (where scan is a Scanner):
public String readNext() {
if (scan.hasNext()) {
String word = scan.next();
if (!word.matches("[A-Za-z]+"))
word = "";
storeWord(word);
return word;
}
return null;
}
This will have roughly the same functionality as your code, without using isCharacter() or isWhiteSpace(): the regex inside matches() checks that a word contains only letters, and the whitespace handling is built into the next() method, which separates the words. The added functionality is that it returns null when there are no more words in the file.
You'll have to change the while loop in the Test class for this to work properly. Compare strings with equals() or isEmpty() rather than !=, and check for null first, otherwise calling a method on a null reference throws a NullPointerException. Also note that readNext() now returns "" for a token that is not purely alphabetic, so the loop should keep going until it sees null and simply skip empty words instead of stopping at the first one.
To make a Scanner, you can use a BufferedReader as a parameter or the File directly as well, as such:
Scanner scan = new Scanner(file);
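Put together, the reading loop might look roughly like the sketch below. WordLoopSketch and distinct are hypothetical names, an ArrayList stands in for DataSet, and as a small variation readNext here skips tokens that are not purely alphabetic instead of returning "", so the caller only has to test for null.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class WordLoopSketch {
    private final Scanner scan;

    WordLoopSketch(Scanner scan) {
        this.scan = scan;
    }

    // Returns the next purely alphabetic word, or null when the input is exhausted.
    String readNext() {
        while (scan.hasNext()) {
            String token = scan.next();
            if (token.matches("[A-Za-z]+")) {
                return token;
            }
        }
        return null;
    }

    // Collects each distinct word once, in order of first appearance.
    static List<String> distinct(Scanner scan) {
        WordLoopSketch reader = new WordLoopSketch(scan);
        List<String> seen = new ArrayList<>();
        String next;
        while ((next = reader.readNext()) != null) { // the loop ends when readNext returns null
            if (!seen.contains(next)) {
                seen.add(next);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        List<String> words = distinct(new Scanner("the cat saw the dog, the end"));
        System.out.println(words + " contains " + words.size() + " distinct words");
    }
}
```

Because the end of input is signalled by null rather than by repeating the last word, the loop cannot spin forever once the file has been read.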