hashset input java - java

I'm working on the question below and am quite close, but on lines 19 and 32 I get the following error and can't figure it out.
foreach not applicable to expression type
for (String place: s)
Question:
Tax inspectors have available to them two text files, called unemployed.txt and taxpayers.txt, respectively. Each file contains a collection of names, one name per line. The inspectors regard anyone who occurs in both files as a dodgy character. Write a program which prints the names of the dodgy characters. Make good use of Java’s support for sets.
My code:
class Dodgy {
public static void main(String[] args) {
HashSet<String> hs = new HashSet<String>();
Scanner sc1 = null;
try {sc1 = new Scanner(new File("taxpayers.txt"));}
catch(FileNotFoundException e){};
while (sc1.hasNextLine()) {
String line = sc1.nextLine();
String s = line;
for (String place: s) {
if((hs.contains(place))==true){
System.out.println(place + " is a dodgy character.");
hs.add(place);}
}
}
Scanner sc2 = null;
try {sc2 = new Scanner(new File("unemployed.txt"));}
catch(FileNotFoundException e){};
while (sc2.hasNextLine()) {
String line = sc2.nextLine();
String s = line;
for (String place: s) {
if((hs.contains(place))==true){
System.out.println(place + " is a dodgy character.");
hs.add(place);}
}
}
}
}

You're trying to iterate over "each string within a string" - what does that even mean?
It feels like you only need to iterate over each line in each file... you don't need to iterate within a line.
Secondly - in your first loop, you're only looking at the first file, so how could you possibly detect dodgy characters?
I would consider abstracting the problem to:
Write a method to read a file and populate a hash set.
Call that method twice to create two sets, then find the intersection.
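A minimal sketch of that approach, assuming one name per line as the exercise states. The class and method names (DodgyCharacters, readNames, inBoth) are made up for illustration; the intersection itself is done with the standard Set.retainAll.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.HashSet;
import java.util.Scanner;
import java.util.Set;

class DodgyCharacters {

    // Reads every non-blank line from the scanner into a set of names.
    static Set<String> readNames(Scanner in) {
        Set<String> names = new HashSet<>();
        while (in.hasNextLine()) {
            String name = in.nextLine().trim();
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }

    // Returns the names present in both sets without modifying either argument.
    static Set<String> inBoth(Set<String> a, Set<String> b) {
        Set<String> result = new HashSet<>(a);
        result.retainAll(b); // keeps only the elements that are also in b
        return result;
    }

    public static void main(String[] args) {
        try (Scanner taxpayers = new Scanner(new File("taxpayers.txt"));
             Scanner unemployed = new Scanner(new File("unemployed.txt"))) {
            for (String name : inBoth(readNames(taxpayers), readNames(unemployed))) {
                System.out.println(name + " is a dodgy character.");
            }
        } catch (FileNotFoundException e) {
            // Report the problem instead of silently swallowing it.
            System.err.println("Could not open input file: " + e.getMessage());
        }
    }
}
```

Copying the first set before calling retainAll leaves both inputs untouched, which matters if you want to reuse them afterwards.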

The for-each loop only works on arrays and on types that implement java.lang.Iterable. String is neither, hence the error.
If your intention is to iterate over the characters in the string, replace "s" with "s.toCharArray()", which returns a char[] (so the loop variable must be declared as char, not String).
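For illustration, here is a small sketch of the two kinds of expression a for-each loop accepts: an array (here the char[] from toCharArray()) and an Iterable (here a List). The class name ForEachDemo and the method countLetters are invented for this example.

```java
import java.util.Arrays;
import java.util.List;

class ForEachDemo {

    // Counts the letters in a string by iterating over its characters.
    static int countLetters(String s) {
        int letters = 0;
        for (char c : s.toCharArray()) { // an array is a valid for-each target
            if (Character.isLetter(c)) {
                letters++;
            }
        }
        return letters;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Ann", "Bob");
        for (String name : names) {      // List implements Iterable
            System.out.println(name + " has " + countLetters(name) + " letters");
        }
        // for (String part : "Ann") {}  // does not compile: String is neither an array nor Iterable
    }
}
```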

Related

searching for words in a text file in java

I am trying to search for words within a text file and replace all upper-case characters with lower-case ones. The problem is that when I use the replaceAll function with a regular expression I get a syntax error. I have tried different tactics, but it doesn't work. Any tips? I think that maybe I should create a replaceAll method that I would have to invoke, but I don't really see its use.
public static void main() throws FileNotFoundException {
ArrayList<String> inputContents = new ArrayList<>();
Scanner inFile =
new Scanner(new FileReader("H:\\csc8001\\data.txt"));
while(inFile.hasNextLine())
{
String line = inFile.nextLine();
inputContents.add(inFile.nextLine());
}
inFile.close();
ArrayList<String> dictionary = new ArrayList<>();
for(int i= 0; i <inputContents.size(); i++)
{
String newLine = inFile.nextLine();
newLine = newLine(i).replaceAll("[^A-Za-z0-9]");
dictionary.add(inFile.nextLine());
}
// PrintWriter outFile =
// new PrintWriter("H:\\csc8001\\results.txt");
}
There is a compilation error on this line:
newLine = newLine(i).replaceAll("[^A-Za-z0-9]");
Because replaceAll takes two parameters: a regex and a replacement.
(And because newLine(i) is nonsense: newLine is a String variable, not a method.)
This should be closer to what you need:
newLine = newLine.replaceAll("[^A-Za-z0-9]+", " ");
That is, replace non-empty sequences of non-[A-Za-z0-9] characters with a space.
To convert all uppercase letters to lowercase, it's simpler and better to use toLowerCase.
There are many other issues in your code too. For example, some lines in the input will be skipped, due to some inappropriate inFile.nextLine calls. Also, the input file is closed after the first loop, but the second tries to use it, which makes no sense.
With these and a few other issues cleaned up, this should be closer to what you want:
Scanner inFile = new Scanner(new FileReader("H:\\csc8001\\data.txt"));
List<String> inputContents = new ArrayList<>();
while (inFile.hasNextLine()) {
inputContents.add(inFile.nextLine());
}
inFile.close();
List<String> dictionary = new ArrayList<>();
for (String line : inputContents) {
dictionary.add(line.replaceAll("[^A-Za-z0-9]+", " ").toLowerCase());
}
If you want to add words to the dictionary instead of lines, you also need to split the lines on spaces. One simple way to achieve that:
dictionary.addAll(Arrays.asList(line.replaceAll("[^A-Za-z0-9]+", " ").toLowerCase().split(" ")));

if(s1.Contains(s2)) Seems to always be true

Sorry, changed the question slightly.
Essentially I want to know if "aString" contains "String". My issue is that when comparing, say, "aS" (a substring of "aString"), "aS".contains("String") shows true.
String a="st", b="string";
I ran System.out.println(a.contains(b));
That returned false, as expected. I have an understanding of contains, so I must be missing something else.
It had seemed that my program was working properly, but I made some adjustments, came back, and the whole thing stopped working. I checked the usual culprits (brackets, file I/O, etc.). I found that if(string.contains(string)) would run every time, i.e. it is always true. I'm not sure why this is happening; it's probably something I missed in the code.
This is an example of my output (Just a char by char reading of the file):
I
n
t
e
g
e
r
G
;
import java.io.File;
import java.util.ArrayList;
import java.util.Scanner;
public class comp{
public static void main(String[] args){
ArrayList<String> lines = new ArrayList<String>();
ArrayList<String> symbolTable = new ArrayList<String>();
ArrayList<String> parsedFile = new ArrayList<String>();
try {
File file = new File("symbolTable.txt");
Scanner scanner=new Scanner(file);
while (scanner.hasNextLine()&&symbolTable.add(scanner.nextLine().replaceAll("\\s+","").toLowerCase()));
scanner.close();
} catch (Exception ex) {
ex.printStackTrace();
}
try {
File file = new File("APU_CS400_input.txt");
Scanner scanner=new Scanner(file);
while (scanner.hasNextLine()&&lines.add(scanner.nextLine().replaceAll("\\s+","").toLowerCase()));
scanner.close();
} catch (Exception ex) {
ex.printStackTrace();
}
//runs through line by line of the input file
for(String line: lines){
String sBuild = "";
StringBuilder identifier = new StringBuilder("");
//moves through the line char by char
for(int i=0;line.length()>i; i++){
sBuild+=line.charAt(i);
//moves through the symbol table comparing each symbol to each string
//that is built char by char
for(String symbol: symbolTable){
//if the char string matches the symbol then any identifiers are saved and
//symbols are saved, the string is then reset to empty
//This is where i seem to get an issue
***if(sBuild.contains(symbol)){***
if(symbol.length()<sBuild.length()){
identifier.append(sBuild,0,sBuild.length()-symbol.length());
parsedFile.add(identifier.toString());
identifier.delete(0,sBuild.length()-symbol.length());
}
sBuild="";
parsedFile.add(symbol);
}
}
}
}
for(String symbol:parsedFile){
System.out.println(symbol);
}
}
}
Think of it this way.
s1.contains(s2)
should return true, if a substring of s1 can be found such that
s1.substring(i, j).equals(s2)
is true.
If s2 is an empty string, then i = 0, j = 0 is one such substring, so contains() returns true.
As it should.
someString.contains("") should always be true, as long as someString is not null.
essentially i want to know if "aString" contains "String".
Yes, "aString" as a string-value does contain the string-value of "String"
My issue is when comparing say "aS" (a substring of "aString") "aS".contains("String") shows true.
Are you sure? This cannot be, therefore I rather suspect bugs in your code.
To spare yourself from "empty String symbols", consider this:
try {
File file = new File("symbolTable.txt");
Scanner scanner=new Scanner(file);
while (scanner.hasNextLine()) {
// toLowerCase will do nothing for characters that are not letters
// Don't spend CPU cycles with regex
String symbolLine=scanner.nextLine().toLowerCase();
// Collect the symbol only if non-empty?? This will save you from empty symbols
if(symbolLine.trim().length()>0) {
symbolTable.add(symbolLine); // or .add(symbolLine.trim()) ???
}
}
scanner.close();
} catch (Exception ex) {
ex.printStackTrace();
}
You may have to look at this one a bit mathematically to see why s.contains("") is always true. Suppose you think of it this way:
a.contains(b) is true if there are some values i and j such that a.substring(i,j) is equal to b.
If you think about it a bit, you'll see that this is exactly what contains means when the argument is a nonempty string like "xyz". If there is some substring of s that equals "xyz", then s.contains("xyz") is true. If there is no such substring, then s.contains("xyz") is false.
So it makes sense that the same logic would apply for an empty string, since it applies everywhere else. And it's always true that a.substring(0,0) equals "" (if a is not null). That's why a.contains("") should always be true.
It may not be intuitively obvious from the English meaning of "contains", but when you're dealing with "edge cases" like this, you sometimes have to think in different terms. Often, the Javadoc spells things out so that you can easily figure out what happens in the edge cases, without relying on intuition. Unfortunately, in this case, they didn't.
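The behaviour described above is easy to check directly. The sketch below also shows one possible guard, a hypothetical helper named containsNonEmpty (not part of the JDK), for code that wants an empty search string to count as "no match".

```java
class ContainsDemo {

    // Hypothetical helper: treats an empty needle as "no match"
    // instead of the always-true answer that String.contains gives.
    static boolean containsNonEmpty(String haystack, String needle) {
        return !needle.isEmpty() && haystack.contains(needle);
    }

    public static void main(String[] args) {
        System.out.println("aString".contains("String"));     // true
        System.out.println("aS".contains("String"));          // false
        System.out.println("aS".contains(""));                // true
        System.out.println("aS".substring(0, 0).equals(""));  // true
        System.out.println(containsNonEmpty("aS", ""));       // false
    }
}
```

In the symbol-table program from the question, a blank line in symbolTable.txt becomes an empty symbol after whitespace is stripped, and that symbol matches on every character read.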

Java Scanner to print previous and next lines

I am using 'java.util.Scanner' to read and scan for keywords and want to print the previous 5 lines and next 5 lines of the encountered keyword, below is my code
ArrayList<String> keywords = new ArrayList<String>();
keywords.add("ERROR");
keywords.add("EXCEPTION");
java.io.File file = new java.io.File(LOG_FILE);
Scanner input = null;
try {
input = new Scanner(file);
} catch (FileNotFoundException e) {
e.printStackTrace();
}
int count = 0;
String previousLine = null;
while(input.hasNext()){
String line = input.nextLine();
for(String keyword : keywords){
if(line.contains(keyword)){
//print prev 5 lines
system.out.println(previousLine); // this will print only last previous line ( i need last 5 previous lines)
???
//print next 5 lines
system.out.println(input.nextLine());
system.out.println(input.nextLine());
system.out.println(input.nextLine());
system.out.println(input.nextLine());
system.out.println(input.nextLine());
}
previousLine = line;
}
any pointers to print previous 5 lines..?
any pointers to print previous 5 lines..?
Save them in a Deque<String> such as a LinkedList<String>, which gives you "First In First Out (FIFO)" behavior.
Either that or use 5 variables or an array of 5 Strings, manually move Strings from one slot or variable to another, and then print them.
If you use a Deque/LinkedList, use the Deque's addFirst(...) method to add a new String to the beginning and removeLast() to remove the list's last String (if its size is > 5). Iterate through the LinkedList to get the current Strings it contains.
Other suggestions:
Your Scanner's check scanner.hasNextXXX() method should match the get method, scanner.nextXXX(). So you should check for hasNextLine() if you're going to call nextLine(). Otherwise you risk problems.
Please try to post real code here in your questions, not sort-of, will never compile code. i.e., system.out.println vs System.out.println. I know it's a little thing, but it means a lot when others try to play with your code.
Use ArrayList's contains(...) method to get rid of that for loop.
e.g.,
LinkedList<String> fivePrevLines = new LinkedList<>();
java.io.File file = new java.io.File(LOG_FILE);
Scanner input = null;
try {
input = new Scanner(file);
} catch (FileNotFoundException e) {
e.printStackTrace();
}
while (input.hasNextLine()) {
String line = input.nextLine();
if (keywords.contains(line)) {
System.out.println("keyword found!");
for (String prevLine : fivePrevLines) {
System.out.println(prevLine);
}
} else {
fivePrevLines.addFirst(line);
if (fivePrevLines.size() > 5) {
fivePrevLines.removeLast();
}
}
}
if (input != null) {
input.close();
}
Edit
You state in comment:
ok i ran small test program to see if the contains(...) method works ...<unreadable unformatted code>... and this returned keyword not found...!
It's all how you use it. The contains(...) method works to check if a Collection contains another object. It won't work if you feed it a huge String that may or may not use one of the Strings in the collection, but will work on the individual Strings that comprise the larger String. For example:
ArrayList<String> temp = new ArrayList<String>();
temp.add("error");
temp.add("exception");
String s = "Internal Exception: org.apache.tomcat.dbcp.dbcp.SQLNestedException: Cannot get a connection, pool error Timeout waiting for idle object";
String[] tokens = s.split("[\\s\\.:,]+");
for (String token : tokens) {
if (temp.contains(token.toLowerCase())) {
System.out.println("keyword found: " + token);
} else {
System.out.println("keyword not found: " + token);
}
}
Also, you will want to avoid posting code in comments since they don't retain their formatting and are unreadable and untestable. Instead edit your original question and post a comment to alert us to the edit.
Edit 2
As per dspyz:
For stacks and queues, when there isn't any significant functionality/performance reason to use one over the other, you should default to ArrayDeque rather than LinkedList. It's generally faster, takes up less memory, and requires less garbage collection.
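A rough sketch of the same bounded buffer built on ArrayDeque. It assumes a substring match on a single keyword (as in the question's line.contains(keyword)) and stops at the first hit; the class and method names are invented for the example. Lines are added at the tail and dropped from the head so that iteration yields them in file order.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Scanner;

class PreviousLines {

    // Returns up to `max` lines that precede the first line containing `keyword`,
    // in file order. Returns an empty list if the keyword never occurs.
    static List<String> linesBefore(Scanner in, String keyword, int max) {
        Deque<String> recent = new ArrayDeque<>();
        while (in.hasNextLine()) {
            String line = in.nextLine();
            if (line.contains(keyword)) {
                return new ArrayList<>(recent);
            }
            recent.addLast(line);          // newest line goes to the tail
            if (recent.size() > max) {
                recent.removeFirst();      // oldest line falls off the head
            }
        }
        return new ArrayList<>();
    }

    public static void main(String[] args) {
        Scanner log = new Scanner("one\ntwo\nthree\nfour ERROR here\nfive");
        for (String line : linesBefore(log, "ERROR", 2)) {
            System.out.println(line);
        }
    }
}
```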
If your file is small (< a million lines) you are way better off just copying the lines into an ArrayList and then getting the next and previous 5 lines using random access into the array.
Sometimes the best solution is just plain brute force.
Your code is going to get tricky if you have two keyword hits inside your +-5 line window. Let's say you have hits two lines apart. Do you dump two 10-line windows? One 12-line window?
Random access will make implementing this stuff way easier.
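One way this random-access idea could look, as a sketch rather than a finished tool. It assumes the whole file is already in a List<String> and resolves the overlapping-window question by merging: every line within the radius of any hit is printed exactly once. That merging policy, and the names ContextWindows and withContext, are choices made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ContextWindows {

    // Returns the lines within `radius` of any line containing one of the keywords,
    // each line at most once and in file order (overlapping windows are merged).
    static List<String> withContext(List<String> lines, List<String> keywords, int radius) {
        boolean[] keep = new boolean[lines.size()];
        for (int i = 0; i < lines.size(); i++) {
            for (String keyword : keywords) {
                if (lines.get(i).contains(keyword)) {
                    int from = Math.max(0, i - radius);
                    int to = Math.min(lines.size() - 1, i + radius);
                    for (int j = from; j <= to; j++) {
                        keep[j] = true;
                    }
                    break; // one matching keyword is enough for this line
                }
            }
        }
        List<String> out = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            if (keep[i]) {
                out.add(lines.get(i));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("a", "b", "ERROR 1", "c", "EXCEPTION 2", "d", "e");
        for (String line : withContext(lines, Arrays.asList("ERROR", "EXCEPTION"), 1)) {
            System.out.println(line);
        }
    }
}
```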

Reading two lines from an input file using Scanner

Hi, I'm in a programming class over the summer and am required to create a program that reads input from a file. The input file contains DNA sequences (ATCGAGG etc.), and the first line states how many pairs of sequences need to be compared. The rest are pairs of sequences. In class we use Scanner to read lines from a file (I read about BufferedReader, but we have not covered it in class so I'm not too familiar with it), but I am lost on how to write code that compares two lines read from the Scanner at the same time.
My attempt:
public static void main (String [] args) throws IOException
{
File inFile = new File ("dna.txt");
Scanner sc = new Scanner (inFile);
while (sc.hasNextLine())
{
int pairs = sc.nextLine();
String DNA1 = sc.nextLine();
String DNA2 = sc.nextLine();
comparison(DNA1,DNA2);
}
sc.close();
}
Where the comparison method would take a pair of sequences and output whether they had any common characters. Also, how would I proceed to read the next pair? Any insight would be helpful. I'm just stumped, and Google confused me even further. Thanks!
EDIT:
Here's the sample input
7
atgcatgcatgc
AtgcgAtgc
GGcaAtt
ggcaatt
GcT
gatt
aaaaaGTCAcccctccccc
GTCAaaaaccccgccccc
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
gctagtacACCT
gctattacGcct
First, why are you doing this:
while (sc.hasNextLine())
{
int pairs = sc.nextLine();
The number of pairs appears only once, on the first line, so it should not be read on every pass through the loop. Move reading pairs out of the while loop and parse it to an int (nextLine() returns a String). You can then use it to stop reading once you have read as many pairs as the file announces.
Second:
throws IOException
This might be irrelevant, but consider handling the exception with try/catch rather than declaring it, or skip that if you really do not care about exceptions here.
As for the comparison: once you have read the lines as strings, String has an "equals" method with which you can compare two strings.
Google will not help much with a search like "Reading two lines from an input file using Scanner". Search for the basic pieces instead, for example "string comparison java". You have to go step by step and cut the problem into smaller pieces; that is how software developers work.
OK, I have a program that more or less worked for me. It just finds the pairs of lines that have something in common, even a partial match, and prints them out. It is brute force, which is fine for this kind of thing:
import java.io.File;
import java.io.IOException;
import java.util.Scanner;
public class program
{
public static void main (String [] args) throws IOException
{
File inFile = new File ("c:\\dna.txt");
Scanner sc = new Scanner (inFile);
int pairs = Integer.parseInt(sc.nextLine());
for (int i = 0; i < pairs && sc.hasNextLine(); i++)
{
// the first line gives the number of pairs; also stop early if the file is shorter than announced
String DNA1 = sc.nextLine();
String DNA2 = sc.nextLine();
Boolean compareResult = comparison(DNA1,DNA2);
if (compareResult){
System.out.println("found the match in:" + DNA1 + " and " + DNA2) ;
}
}
sc.close();
}
public static Boolean comparison(String dna1, String dna2){
Boolean contains = false;
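for (int i = 1; i <= dna1.length(); i++)
{
// start at length 1: an empty subsequence is contained in every string
if (dna2.contains(dna1.subSequence(0, i)))
{
contains = true;
break;
}
// suffix of length i (the end index of subSequence is exclusive)
if (dna2.contains(dna1.subSequence(dna1.length()-i, dna1.length())))
{
contains = true;
break;
}
}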
}
return contains;
}
}
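The reading part of the advice above (parse the count once, then read the lines two at a time) can be sketched on its own. This is only an illustration: the names DnaPairs and readPairs are invented, the comparison is left to the caller, and equalsIgnoreCase in main is just a stand-in chosen because the sample input mixes upper and lower case.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class DnaPairs {

    // Reads the pair count from the first line, then up to that many pairs of lines.
    // Stops early if the input ends before the announced number of pairs is reached.
    static List<String[]> readPairs(Scanner in) {
        List<String[]> pairs = new ArrayList<>();
        if (!in.hasNextLine()) {
            return pairs;
        }
        int count = Integer.parseInt(in.nextLine().trim());
        for (int i = 0; i < count && in.hasNextLine(); i++) {
            String first = in.nextLine();
            if (!in.hasNextLine()) {
                break; // a dangling line without a partner is ignored
            }
            pairs.add(new String[] { first, in.nextLine() });
        }
        return pairs;
    }

    public static void main(String[] args) {
        Scanner in = new Scanner("2\nGGcaAtt\nggcaatt\nGcT\ngatt\n");
        for (String[] pair : readPairs(in)) {
            System.out.println(pair[0] + " / " + pair[1]
                    + " same ignoring case: " + pair[0].equalsIgnoreCase(pair[1]));
        }
    }
}
```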

Read data in from text file, convert each word to PigLatin

I'm having trouble printing out the final result without each word being on its own line. The output should be formatted just as the input was. Here is the code I used to read the data and print it:
Scanner sc2 = null;
try {
sc2 = new Scanner(new File(dataFile));
} catch (FileNotFoundException e) {
e.printStackTrace();
}
while (sc2.hasNextLine()) {
Scanner s2 = new Scanner(sc2.nextLine());
boolean b;
while (b = s2.hasNext()) {
String s = s2.next();
System.out.println(pig(s));
}
}
The actual instructions were as follows: "Translate the Declaration of Independence ("declaration.txt") into PigLatin. Try to preserve the paragraphs. There are several ways to do this, but they all use nested loops. You may want to look at nextLine, next, split, or StringTokenizer."
We haven't been taught how to use any of the methods listed there, though.
The println method is short for "print line". It prints the given output to the target output device followed by a newline. Check out the other methods in that class for the solution.
Update
The problem here is that to my knowledge java.util.Scanner throws out the whitespace (delimiter) between words. Check out java.util.StringTokenizer for a similar class that can be configured to return the whitespace characters one at a time.
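As a sketch of the simplest fix suggested above: join the words of each input line with single spaces and end the line only once per input line, so that paragraphs and blank lines survive. The pig method here is a hypothetical stand-in, since the asker's own pig(...) is not shown, and runs of whitespace inside a line are collapsed to one space.

```java
import java.util.Scanner;

class PigLatinLines {

    // Hypothetical stand-in for the asker's pig(...) method, which is not shown in the question.
    static String pig(String word) {
        return word + "ay";
    }

    // Translates the text word by word, producing one output line per input line.
    static String translate(Scanner lines) {
        StringBuilder out = new StringBuilder();
        while (lines.hasNextLine()) {
            Scanner words = new Scanner(lines.nextLine());
            boolean firstWord = true;
            while (words.hasNext()) {
                if (!firstWord) {
                    out.append(' ');   // separate words within the same line
                }
                out.append(pig(words.next()));
                firstWord = false;
            }
            out.append('\n');          // end the line only after all its words
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(translate(new Scanner("We hold these\n\ntruths")));
    }
}
```

The equivalent change in the original loop is to call System.out.print(pig(s) + " ") for each word and System.out.println() once after the inner loop.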
