I am trying to search for words within a text file and replace all upper-case characters with lower-case ones. The problem is that when I use the replaceAll function with a regular expression I get a syntax error. I have tried different tactics, but nothing works. Any tips? I thought that maybe I should write a replaceAll method of my own and invoke it, but I don't really see the point.
public static void main() throws FileNotFoundException {
ArrayList<String> inputContents = new ArrayList<>();
Scanner inFile =
new Scanner(new FileReader("H:\\csc8001\\data.txt"));
while(inFile.hasNextLine())
{
String line = inFile.nextLine();
inputContents.add(inFile.nextLine());
}
inFile.close();
ArrayList<String> dictionary = new ArrayList<>();
for(int i= 0; i <inputContents.size(); i++)
{
String newLine = inFile.nextLine();
newLine = newLine(i).replaceAll("[^A-Za-z0-9]");
dictionary.add(inFile.nextLine());
}
// PrintWriter outFile =
// new PrintWriter("H:\\csc8001\\results.txt");
}
There is a compilation error on this line:
newLine = newLine(i).replaceAll("[^A-Za-z0-9]");
Because replaceAll takes 2 parameters: a regex and a replacement.
(And because newLine(i) is nonsense: newLine is a String variable, not a method.)
This should be closer to what you need:
newLine = newLine.replaceAll("[^A-Za-z0-9]+", " ");
That is, replace non-empty sequences of non-[A-Za-z0-9] characters with a space.
To convert all uppercase letters to lowercase, it's simpler and better to use toLowerCase.
There are many other issues in your code too. For example, some lines in the input will be skipped, due to some inappropriate inFile.nextLine calls. Also, the input file is closed after the first loop, but the second tries to use it, which makes no sense.
With these and a few other issues cleaned up, this should be closer to what you want:
Scanner inFile = new Scanner(new FileReader("H:\\csc8001\\data.txt"));
List<String> inputContents = new ArrayList<>();
while (inFile.hasNextLine()) {
inputContents.add(inFile.nextLine());
}
inFile.close();
List<String> dictionary = new ArrayList<>();
for (String line : inputContents) {
dictionary.add(line.replaceAll("[^A-Za-z0-9]+", " ").toLowerCase());
}
If you want to add words to the dictionary instead of lines, you also need to split the lines on spaces. One simple way to achieve that:
dictionary.addAll(Arrays.asList(line.replaceAll("[^A-Za-z0-9]+", " ").toLowerCase().split(" ")));
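As a rough sketch of the whole normalisation step, the helper below lowercases a line, collapses runs of non-alphanumeric characters into a single space, and splits the result into words. The class and method names are made up for illustration. It trims before splitting, on the assumption that a line starting with punctuation should not produce an empty first word.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the original code.
class LineNormalizer {
    // Returns the lowercase alphanumeric words of one line.
    static List<String> toWords(String line) {
        String cleaned = line.replaceAll("[^A-Za-z0-9]+", " ").trim().toLowerCase();
        if (cleaned.isEmpty()) {
            return new ArrayList<>(); // the line held nothing but punctuation or whitespace
        }
        return new ArrayList<>(Arrays.asList(cleaned.split(" ")));
    }

    public static void main(String[] args) {
        System.out.println(toWords("Hello, World! 2 cats."));
    }
}
```

Inside the loop over inputContents this could be used as dictionary.addAll(LineNormalizer.toWords(line)).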
I have a question about the split call inside my file-reading function.
Here is my code. I tried to use split to put the text into an array, but I get this error: java.lang.NumberFormatException: For input string: "Sug" at java.base/java.lang.NumberFormatException.forInputString
public static SLL<train> readFile() {
SLL<train> list = new SLL();
try {
FileReader fr = new FileReader("Train.dat");
BufferedReader br = new BufferedReader(fr);
String line = "";
while (true) {
line = br.readLine();
if (line == null) {
break;
}
String[] text = line.split(" | ");
String tcode = text[0];
String train_name = text[1];
int seat = Integer.parseInt(text[2]);
int booked = Integer.parseInt(text[3]);
double depart_time = Double.parseDouble(text[4]);
String depart_place = text[5];
train t = new train(tcode, train_name, seat, booked, depart_time, depart_place);
list.addLast(t);
}
br.close();
fr.close();
} catch (Exception e) {
e.printStackTrace();
}
return list;
}
This is my text file:
"SUG" should go into the train name, because I declared train_name as a String. I think this error only appears when the wrong data type is declared. "12" should go into seat, "3" into booked, and so on. Can you explain to me what happened to "SUG"? Thanks a lot :(
Note that String.split(regex) uses a regex to find the split location, i.e. it splits before and after any match produced by the regular expression.
Furthermore, in regex the pipe (|) has the special meaning "or" so your code right now reads as "split on a space or a space". Since this splits on spaces only, splitting "B03 | Sug | 12 ..." will result in the array ["B03","|","Sug","|","12",...] and hence text[2] yields "Sug".
Instead you need to escape the pipe by either using line.split(" \\| ") or line.split(Pattern.quote(" | ")) to make it "split on a sequence of space, pipe, space". That would then result in ["B03","Sug","12",...].
However, you also might need to surround Integer.parseInt() etc. with a try-catch block or do some preliminary checks on the string because you might run into a malformed file.
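To make the difference concrete, here is a small sketch that splits the same record with the unescaped and the escaped pattern, and guards the number parsing. The sample record and the names PipeSplitDemo, fields and parseIntOr are assumptions made for illustration, not part of the question's code.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical demo class.
class PipeSplitDemo {
    // Splits one record on the literal separator " | ".
    static String[] fields(String line) {
        return line.split(Pattern.quote(" | "));
    }

    // Parses an int field, returning the fallback instead of throwing on malformed input.
    static int parseIntOr(String text, int fallback) {
        try {
            return Integer.parseInt(text.trim());
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        String line = "B03 | Sug | 12 | 3"; // sample record, layout assumed from the answer above
        System.out.println(Arrays.toString(line.split(" | ")));   // unescaped: splits on every space
        System.out.println(Arrays.toString(line.split(" \\| "))); // escaped pipe: splits on " | "
        System.out.println(parseIntOr(fields(line)[2], -1));      // the seat count
        System.out.println(parseIntOr("Sug", -1));                // malformed number yields the fallback
    }
}
```

Whether a fallback value, a skipped record, or a reported error is the right reaction to a malformed line depends on the program; the fallback here is only one option.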
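In addition to what @Thomas has said, it would be a whole lot easier to use a Scanner. Note that the delimiter must also match line breaks, otherwise the last field of one line runs into the first field of the next:
Scanner sc = new Scanner(new File("Train.dat"));
// Delimiter: a pipe or a newline, with optional whitespace around it
sc.useDelimiter("\\s*[|\\n]\\s*");
while (sc.hasNext()) {
Train t = new Train(sc.next(), sc.next(), sc.nextInt(), sc.nextInt(), sc.nextDouble(), sc.next());
list.addLast(t);
}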
Naming is important in Java for readability and maintenance.
Note that class names begin with an upper-case letter in Java (Train, not train). Underscores have no place in names, except as separators between the capitalised words of a constant's name.
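A sketch of how the question's class might look with conventional naming. The field names are derived from the question's constructor arguments; the constant MAX_SEATS and the two accessor methods are hypothetical and included only to show the naming styles.

```java
// Illustrative sketch only; the question does not show its own train class.
class Train {
    // Hypothetical constant, present only to show the UPPER_SNAKE_CASE style.
    static final int MAX_SEATS = 500;

    // Fields and parameters use lowerCamelCase rather than train_name, depart_time, ...
    private final String code;
    private final String trainName;
    private final int seats;
    private final int booked;
    private final double departTime;
    private final String departPlace;

    Train(String code, String trainName, int seats, int booked,
          double departTime, String departPlace) {
        this.code = code;
        this.trainName = trainName;
        this.seats = seats;
        this.booked = booked;
        this.departTime = departTime;
        this.departPlace = departPlace;
    }

    String getTrainName() {
        return trainName;
    }

    int getFreeSeats() {
        return seats - booked;
    }
}
```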
I am trying to compare a .txt file that has a list of words, and a String[] array that is also filled with words.
Solved thank you.
Assuming you're ultimately just trying to get a list of words that are in both files:
Scanner fileReader = new Scanner(file);
Set<String> words = new HashSet<>();
while (fileReader.hasNext()) {
String s = fileReader.next();
words.add(s);
}
fileReader.close();
Scanner otherFileReader = new Scanner(otherFile);
List<String> wordsInBothFiles = new ArrayList<>();
while (otherFileReader.hasNext()) {
String s = otherFileReader.next();
if (words.contains(s)) {
wordsInBothFiles.add(s);
}
}
otherFileReader.close();
// Do whatever it is you have to do with the shared words, like printing them:
// for (String s : wordsInBothFiles) {
// System.out.println(s);
// }
If you check the documentation it will usually explain why a method throws an exception. In this case "no line was found" means you've hit the end of your file. There are two possible ways this error could come about:
String nextLine = scanner.nextLine(); //problem 1: reads a file with no lines
while (scanner.hasNextLine()) {
linearSearch(words,nextLine);
System.out.println(nextLine);
}
scanner.nextLine(); //problem 2: reads after there is no next line
Since your loop appears to be infinite, I'd wager you're getting the exception from the first line. You can fix it by adding the following check before String nextLine = scanner.nextLine();:
if(!scanner.hasNextLine()) {
System.out.println("empty file: "+filePath);
return; //or break or otherwise terminate
}
Beyond that you may still have some other issues but hopefully this resolves your present problem.
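One common way to avoid both problems is to call nextLine() only inside a loop guarded by hasNextLine(). A minimal sketch, reading from a String instead of a file so that it runs anywhere; the class and method names are illustrative:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical helper class.
class SafeLineReader {
    // Reads every remaining line; returns an empty list for empty input instead of throwing.
    static List<String> readLines(Scanner scanner) {
        List<String> lines = new ArrayList<>();
        while (scanner.hasNextLine()) {     // check first ...
            lines.add(scanner.nextLine());  // ... then consume exactly one line
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println(readLines(new Scanner("alpha\nbeta\n")));
        System.out.println(readLines(new Scanner(""))); // empty input, no exception
    }
}
```

Because each nextLine() call is preceded by its own hasNextLine() check, the loop also terminates: every iteration consumes a line.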
I'm very new to Java, but this has had me stumped for the last half an hour or so. I'm reading in lines from a text file and storing them as String arrays. From there I'm trying to use the values in the arrays to initialise another class I have. To initialise my Route class (hence routeName) I need to take the first value from the array and pass it as a string. When I try to return s[0] for routeName, I'm given the last line of my text file. Any ideas on how to fix this would be greatly appreciated. I'm still in the process of testing, so that's why my code is barely finished.
My text file is as follows.
66
Uq Lakes, Southbank
1,2,3,4,5
2,3,4,5,6
and my code:
import java.io.*;
import java.util.*;
public class Scan {
public static void main(String args[]) throws IOException {
String routeName = "";
String stationName = " ";
Scanner timetable = new Scanner(new File("fileName.txt"));
while (timetable.hasNextLine()) {
String[] s = timetable.nextLine().split("\n");
routeName = s[0];
}
System.out.println(routeName);
}
}
timetable.nextLine().split("\n") returns an array of String.
Each time round the loop, that array is replaced by one built from the next line of the file, so routeName is overwritten on every iteration and ends up holding the last line.
Below is code you can use instead. It stores every line, so s[0] is the first one (note that the array here holds at most 10 lines).
public static void main(String[] args) throws FileNotFoundException {
String routeName = "";
Scanner timetable;
int count = 0;
String[] s = new String[10];
timetable = new Scanner(new File("fileName.txt"));
while (timetable.hasNextLine()) {
String line = timetable.nextLine();
s[count++] = line;
}
routeName = s[0];
System.out.println(routeName);
}
Scanner.nextLine() returns a single line so splitting by '\n' will always give a single element array, e.g.:
timetable.nextLine().split("\n"); // e.g., "1,2,3,4,5" => ["1,2,3,4,5"]
Try splitting by the ',' instead, e.g.:
timetable.nextLine().split(","); // e.g., "1,2,3,4,5" => ["1", "2", "3", "4", "5"]
NOTE: If you intend the array to contain the individual lines, then check out this SO post. The approach looks like this:
Scanner s = new Scanner(new File(filename));
List<String> lines = new ArrayList<String>(); // A List can be dynamically resized
while(s.hasNextLine()) lines.add(s.nextLine()); // Store each line in the list
String[] arr = lines.toArray(new String[0]); // If you really need an Array, use this
Your while loop iterates over all the lines and assigns each one in turn to routeName. That's why you end up with the last line in your string. What you could do is break out of the loop once you have read the first line of the file. Then routeName will hold the first line.
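A minimal sketch of reading only the first line, using the sample timetable from the question as an in-memory String so that the example runs without a file. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Scanner;

// Hypothetical helper class.
class FirstLineReader {
    // Returns the next line, or an empty string if the input has none.
    static String firstLine(Scanner scanner) {
        return scanner.hasNextLine() ? scanner.nextLine() : "";
    }

    public static void main(String[] args) {
        // Contents copied from the question's text file.
        Scanner timetable = new Scanner("66\nUq Lakes, Southbank\n1,2,3,4,5\n2,3,4,5,6\n");
        String routeName = firstLine(timetable);           // first line only
        String[] stations = firstLine(timetable).split(","); // the call now returns the second line
        System.out.println(routeName);
        System.out.println(Arrays.toString(stations));
    }
}
```

Reading one line per call, instead of looping over the whole file, means no break is needed and nothing is overwritten.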
I need to count the words in a String. For many of you that seems pretty simple but from what I've read in similar questions people are saying to use arrays but I'd rather not. It complicates my program more than it helps as my string is coming from an input file and the program cannot be hardwired to a specific file.
I have this so far:
while(input.hasNext())
{
String sentences = input.nextLine();
int countWords;
char c = " ";
for (countWords = 0; countWords < sentences.length(); countWords++)
{
if (input.hasNext(c))
countWords++;
}
System.out.println(sentences);
System.out.println(countWords);
}
The problem is that what I have here ends up counting the amount of characters in the string. I thought it would count char c as a delimiter. I've also tried using String c instead with input.hasNext but the compiler tells me:
Program04.java:39: incompatible types
found : java.lang.String[]
required: java.lang.String
String token = sentences.split(delim);
I've since deleted the .split method from the program.
How do I delimit (is that the right word?) without using a String array with a scanned in file?
Don't use the Scanner (input) for more than one thing. You're using it to read lines from a file, and also trying to use it to count words in those lines. Use a second Scanner to process the line itself, or use a different method.
The problem is that the scanner consumes its buffer as it reads it. input.nextLine() returns sentences, but after that it no longer has them. Calling input.hasNext() on it gives you information about the characters after sentences.
The simplest way to count the words in sentences is to do:
int wordCount = sentences.split(" ").length;
Using Scanner, you can do:
int wordCount = 0;
Scanner scanner = new Scanner(sentences);
while(scanner.hasNext())
{
scanner.next();
wordCount++;
}
Or use a for loop for best performance (as mentioned by BlackPanther).
Another tip I'd give you is how to better name your variables. countWords should be wordCount. "Count words" is a command, a verb, while a variable should be a noun. sentences should simply be line, unless you know both that the line is composed of sentences and that this fact is relevant to the rest of your code.
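One caveat with split(" "): if words are separated by more than one space, the result contains empty strings, so "a  b".split(" ") has length 3. The sketch below, with hypothetical class and method names, shows both approaches side by side; the split variant trims the line and splits on runs of whitespace to avoid that problem.

```java
import java.util.Scanner;

// Hypothetical helper class.
class WordCounter {
    // Counts words by splitting on runs of whitespace.
    static int countWithSplit(String line) {
        String trimmed = line.trim();
        // "".split("\\s+") yields one empty string, so blank lines are handled separately.
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    // Counts words with a second Scanner that reads only this line.
    static int countWithScanner(String line) {
        int wordCount = 0;
        Scanner scanner = new Scanner(line);
        while (scanner.hasNext()) {
            scanner.next(); // consume one whitespace-delimited token
            wordCount++;
        }
        scanner.close();
        return wordCount;
    }

    public static void main(String[] args) {
        String line = "the  quick brown   fox";
        System.out.println(countWithSplit(line));
        System.out.println(countWithScanner(line));
    }
}
```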
Maybe this is what you are looking for.
while(input.hasNextLine())
{
String sentences = input.nextLine();
System.out.println ("count : " + sentences.split (" ").length);
}
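What you are trying to achieve is not quite clear, but if you are trying to count the number of words in your text file, then try this (it assumes words are separated by single spaces):
int countWords = 0;
while(input.hasNextLine())
{
String sentences = input.nextLine();
if(!sentences.trim().isEmpty()) {
countWords++; // a non-empty line has one more word than it has separating spaces
}
for(int i = 0; i < sentences.length(); i++) {
if(sentences.charAt(i) == ' ') { // char literal: charAt returns a char, not a String
countWords++;
}
}
}
System.out.println(countWords);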
I'm working on the question below and am quite close, but on lines 19 and 32 I get the following error and can't figure it out.
foreach not applicable to expression type
for (String place: s)
Question:
Tax inspectors have available to them two text files, called unemployed.txt and taxpayers.txt, respectively. Each file contains a collection of names, one name per line. The inspectors regard anyone who occurs in both files as a dodgy character. Write a program which prints the names of the dodgy characters. Make good use of Java’s support for sets.
My code:
class Dodgy {
public static void main(String[] args) {
HashSet<String> hs = new HashSet<String>();
Scanner sc1 = null;
try {sc1 = new Scanner(new File("taxpayers.txt"));}
catch(FileNotFoundException e){};
while (sc1.hasNextLine()) {
String line = sc1.nextLine();
String s = line;
for (String place: s) {
if((hs.contains(place))==true){
System.out.println(place + " is a dodgy character.");
hs.add(place);}
}
}
Scanner sc2 = null;
try {sc2 = new Scanner(new File("unemployed.txt"));}
catch(FileNotFoundException e){};
while (sc2.hasNextLine()) {
String line = sc2.nextLine();
String s = line;
for (String place: s) {
if((hs.contains(place))==true){
System.out.println(place + " is a dodgy character.");
hs.add(place);}
}
}
}
}
You're trying to iterate over "each string within a string" - what does that even mean?
It feels like you only need to iterate over each line in each file... you don't need to iterate within a line.
Secondly - in your first loop, you're only looking at the first file, so how could you possibly detect dodgy characters?
I would consider abstracting the problem to:
1. Write a method to read a file and populate a hash set.
2. Call that method twice to create two sets, then find the intersection.
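A sketch of that decomposition, under the assumption that names are compared exactly after trimming. The two files are replaced by in-memory Strings with invented names so that the example runs anywhere, and the class and method names are hypothetical. With real files, a Scanner built from new File("taxpayers.txt") would be passed in instead.

```java
import java.util.Scanner;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper class.
class DodgyFinder {
    // Reads one name per line into a set, skipping blank lines.
    static Set<String> readNames(Scanner scanner) {
        Set<String> names = new TreeSet<>();
        while (scanner.hasNextLine()) {
            String name = scanner.nextLine().trim();
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }

    // Returns the intersection without modifying either argument.
    static Set<String> inBoth(Set<String> first, Set<String> second) {
        Set<String> result = new TreeSet<>(first);
        result.retainAll(second); // keep only the names also present in the second set
        return result;
    }

    public static void main(String[] args) {
        // Stand-ins for taxpayers.txt and unemployed.txt; the names are invented sample data.
        Set<String> taxpayers = readNames(new Scanner("Ann\nBob\nCy\n"));
        Set<String> unemployed = readNames(new Scanner("Bob\nDee\nAnn\n"));
        for (String name : inBoth(taxpayers, unemployed)) {
            System.out.println(name + " is a dodgy character.");
        }
    }
}
```

A TreeSet is used here only so that the output order is predictable; a HashSet works the same way for the intersection itself.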
For-each works only on arrays and on types that implement java.lang.Iterable. String is neither, hence the error.
If your intention is to iterate over the characters in the string, replace s with s.toCharArray(), which returns a char[] (arrays are accepted by for-each even though they do not implement Iterable), and declare the loop variable as char rather than String.
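For completeness, a small sketch of a for-each loop over the characters of a String. The class, the method and the choice of counting upper-case letters are purely illustrative.

```java
// Hypothetical demo class.
class CharLoopDemo {
    // Counts the upper-case letters in a string using for-each over a char array.
    static int countUppercase(String s) {
        int count = 0;
        for (char c : s.toCharArray()) { // the loop variable is a char, not a String
            if (Character.isUpperCase(c)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countUppercase("Uq Lakes"));
    }
}
```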