Reading two lines from an input file using Scanner - java

Hi, I'm in a programming class over the summer and am required to create a program that reads input from a file. The input file contains DNA sequences (ATCGAGG etc.), and the first line in the file states how many pairs of sequences need to be compared. The rest are pairs of sequences. In class we use Scanner to read lines from a file (I read about BufferedReader, but we have not covered it in class, so I'm not too familiar with it), but I am lost on how to read two lines at a time with Scanner and compare them.
My attempt:
public static void main (String [] args) throws IOException
{
File inFile = new File ("dna.txt");
Scanner sc = new Scanner (inFile);
while (sc.hasNextLine())
{
int pairs = sc.nextLine();
String DNA1 = sc.nextLine();
String DNA2 = sc.nextLine();
comparison(DNA1,DNA2);
}
sc.close();
}
Where the comparison method would take a pair of sequences and output whether they have any common characters. Also, how would I proceed to read the next pair? Any insight would be helpful. I'm just stumped, and Google confused me even further. Thanks!
EDIT:
Here's the sample input
7
atgcatgcatgc
AtgcgAtgc
GGcaAtt
ggcaatt
GcT
gatt
aaaaaGTCAcccctccccc
GTCAaaaaccccgccccc
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
gctagtacACCT
gctattacGcct

First, why are you doing this:
while (sc.hasNextLine())
{
int pairs = sc.nextLine();
The number of pairs appears only once, on the first line of the file, not before every two lines of input. Move the read of pairs out of the while loop and parse it to an int. You can then use it to stop reading once you have consumed that many pairs.
Second:
throws IOException
This might be irrelevant, but you could handle the exception with try/catch instead of declaring it, or ignore it if you do not care about exceptions.
Comparison: since you read strings, note that String has an equals method with which you can compare two strings.
Google will not help you with a query like "Reading two lines from an input file using Scanner". Search for the basic pieces instead, for example "string comparison java". You have to go step by step and cut the problem into smaller pieces; that is how software developers work.
OK, here is a program that somewhat worked for me. It prints the pairs of lines that have something in common, even if only part of a sequence matches. It is brute force, which is fine for this kind of task:
import java.io.File;
import java.io.IOException;
import java.util.Scanner;
public class program
{
public static void main (String [] args) throws IOException
{
File inFile = new File ("c:\\dna.txt");
Scanner sc = new Scanner (inFile);
int pairs = Integer.parseInt(sc.nextLine());
for (int i = 0; i < pairs; i++)
{
// the first line gives the number of pairs, so read exactly that many pairs of lines
String DNA1 = sc.nextLine();
String DNA2 = sc.nextLine();
Boolean compareResult = comparison(DNA1,DNA2);
if (compareResult){
System.out.println("found the match in:" + DNA1 + " and " + DNA2) ;
}
}
sc.close();
}
public static Boolean comparison(String dna1, String dna2){
Boolean contains = false;
// start at length 1: an empty substring is contained in every string
for (int i = 1; i <= dna1.length(); i++)
{
// does dna2 contain the first i characters of dna1?
if (dna2.contains(dna1.substring(0, i)))
{
contains = true;
break;
}
// does dna2 contain the last i characters of dna1?
if (dna2.contains(dna1.substring(dna1.length() - i)))
{
contains = true;
break;
}
}
return contains;
}
}
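For the question as asked (read the count once, then read the sequences two lines at a time and report whether each pair shares any character), a minimal sketch could look like the following. The class name DnaPairs, the method names, and the choice to ignore case are assumptions made for illustration; a Scanner over a String stands in for new Scanner(new File("dna.txt")).

```java
import java.util.Scanner;

class DnaPairs {
    // Returns true if the two sequences share at least one character.
    // Ignoring case is an assumption; the question does not specify it.
    static boolean haveCommonCharacter(String dna1, String dna2) {
        String a = dna1.toLowerCase();
        String b = dna2.toLowerCase();
        for (int i = 0; i < a.length(); i++) {
            if (b.indexOf(a.charAt(i)) >= 0) {
                return true;
            }
        }
        return false;
    }

    // Reads the pair count from the first line, then exactly that many
    // pairs of lines, and returns how many pairs share a character.
    static int countMatchingPairs(Scanner sc) {
        int pairs = Integer.parseInt(sc.nextLine().trim());
        int matches = 0;
        for (int i = 0; i < pairs; i++) {
            String dna1 = sc.nextLine();
            String dna2 = sc.nextLine();
            if (haveCommonCharacter(dna1, dna2)) {
                matches++;
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // In-memory input in the same layout as the file: count, then pairs.
        Scanner sc = new Scanner("2\natgc\nGGGG\naaaa\ntttt\n");
        System.out.println(countMatchingPairs(sc));
        sc.close();
    }
}
```

Reading the count outside the loop and then calling nextLine() twice per iteration is what moves the scanner on to the next pair.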

Related

How to limit the number of words when reading a line from standard input?

I am new to Stackoverflow and this is my first time asking a question. I have searched my problem thoroughly, however, could not find an appropriate answer. I am sorry if this has been asked. Thank you in advance.
The question is from Hyperskill.com as follows:
Write a program that reads five words from the standard input and outputs each word in a new line.
First, you need to print all the words from the first line, then from the second (from the left to right).
Sample Input 1:
This Java course
is adaptive
Sample Output 1:
This
Java
course
is
adaptive
My attempt to solve it:
import java.util.Scanner;
public class Main {
public static void main(String[] args) {
/* I have not initialized the "userInput" String.
* I know that String is immutable in Java and
* if I initialize it to an empty String ""
* and read a String from user.
* It will not overwrite to the "userInput" String.
* But create another String object to give it the value of the user input,
* and references the new String object to "userInput".
* I didn't want to waste memory like that.
*/
String userInput;
String[] userInputSplitFirstLine = new String[3];
String[] userInputSplitSecondLine = new String[2];
Scanner scan = new Scanner(System.in);
userInput = scan.nextLine();
userInputSplitFirstLine = userInput.split("\\s+");
userInput = scan.nextLine();
userInputSplitSecondLine = userInput.split("\\s+");
for(String firstLineSplitted: userInputSplitFirstLine) {
System.out.println(firstLineSplitted);
}
for(String secondLineSplitted: userInputSplitSecondLine) {
System.out.println(secondLineSplitted);
}
scan.close();
}
}
If you try the sample input above, the output will match the sample output above. However, if you write more than 3 words on the first line and/or more than 2 words on the second line, the userInputSplitFirstLine array of size 3 will store more than 3 words. The same goes for the userInputSplitSecondLine array. My first question is: how can an array of size 3 (userInputSplitFirstLine) and an array of size 2 (userInputSplitSecondLine) hold more than 3 and 2 elements, respectively? My second question is: how can I restrict/limit the number of words that the user can insert in a line? For example, the first line only accepts 3 words and the second line only accepts 2 words.
Also the answer to this question suggested by Hyperskill.com is as follows:
import java.util.Scanner;
class Main {
public static void main(String[] args) {
Scanner scanner = new Scanner(System.in);
String wordOne = scanner.next();
String wordTwo = scanner.next();
String wordThree = scanner.next();
String wordFour = scanner.next();
String wordFive = scanner.next();
System.out.println(wordOne);
System.out.println(wordTwo);
System.out.println(wordThree);
System.out.println(wordFour);
System.out.println(wordFive);
}
}
You can use the next() method of the Scanner object to read each word, and then print it on its own line.
while(true){
if(scanner.hasNext()){
System.out.println(scanner.next());
}
else{
break;
}
}
I think this should do the work. Don't hesitate to ask, if you have some questions.
import java.util.Scanner;
class App {
public static void main(String[] args) {
final StringBuffer line = new StringBuffer();
final StringBuffer words = new StringBuffer();
try (final Scanner sc = new Scanner(System.in)) {
while (sc.hasNextLine()) {
final String currentLine = sc.nextLine();
line.append(currentLine).append(System.lineSeparator());
for (final String word : currentLine.split("\\s+")) {
words.append(word).append(System.lineSeparator());
}
}
} finally {
System.out.println(line.toString());
System.out.println();
System.out.println(words.toString());
}
}
}
My first question is: how can an array of size 3 (userInputSplitFirstLine) and an array of size 2 (userInputSplitSecondLine) hold more than 3 and 2 elements, respectively?
The array here:
String[] userInputSplitFirstLine = new String[3];
is not the same one as the one you got from split:
userInputSplitFirstLine = userInput.split("\\s+");
When you do the above assignment, the reference to the old array is replaced, and userInputSplitFirstLine now refers to the new array, whose length is independent of the old array's. split always returns a new array.
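The reassignment can be seen in a small sketch. The class name SplitReassignment and the helper lengthAfterSplit are made up for illustration; the point is only that the variable ends up referring to a different array, while the original 3-slot array is unchanged.

```java
class SplitReassignment {
    // Returns the length of the array the variable refers to after reassignment.
    static int lengthAfterSplit(String line) {
        String[] words = new String[3];   // a 3-slot array, unreachable after the next line
        words = line.split("\\s+");       // the variable now refers to the array split created
        return words.length;
    }

    public static void main(String[] args) {
        String[] original = new String[3];
        String[] words = original;             // two references to the same array
        words = "one two three four five".split("\\s+");
        System.out.println(original.length);   // the old array still has 3 slots
        System.out.println(words.length);      // the new array has 5 elements
        System.out.println(original == words); // false: two distinct arrays
    }
}
```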
My second question is: how can I restrict/limit the number of words that the user can insert in a line? For example, the first line only accepts 3 words and the second line only accepts 2 words.
It really depends on what you mean by "restrict". If you just want to check if there are exactly three words, and if not, exit the program, you can do this:
userInputSplitFirstLine = userInput.split("\\s+");
if (userInputSplitFirstLine.length != 3) {
System.out.println("Please enter exactly 3 words!");
return;
}
You can do something similar with the second line.
If you want the user to be unable to type more than 3 words, then that's impossible, because this is a command line app.
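A middle ground between exiting and trying to block typing is to ask again until the line has the right number of words. The following is a sketch under that assumption; WordLimit, hasExactly and readWords are hypothetical names, and expected is assumed to be at least 1.

```java
import java.util.Scanner;

class WordLimit {
    // True when the line holds exactly `expected` whitespace-separated words.
    static boolean hasExactly(String line, int expected) {
        String trimmed = line.trim();
        return !trimmed.isEmpty() && trimmed.split("\\s+").length == expected;
    }

    // Keeps reading lines until one has the expected number of words.
    // Returns that line's words, or null if the input ends first.
    static String[] readWords(Scanner sc, int expected) {
        while (sc.hasNextLine()) {
            String line = sc.nextLine();
            if (hasExactly(line, expected)) {
                return line.trim().split("\\s+");
            }
            System.out.println("Please enter exactly " + expected + " words!");
        }
        return null;
    }

    public static void main(String[] args) {
        Scanner sc = new Scanner(System.in);
        String[] first = readWords(sc, 3);
        String[] second = readWords(sc, 2);
        if (first != null && second != null) {
            for (String word : first) System.out.println(word);
            for (String word : second) System.out.println(word);
        }
        sc.close();
    }
}
```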
By the way, the code in the suggested solution works because next() returns the next "word" (or what we generally think of as a word, anyway) by default.
hope this will help you!
import java.util.Scanner;
public class pratice1 {
public static void main (String[]args) {
Scanner sc = new Scanner(System.in);
String input = sc.nextLine();
String input1 = sc.nextLine();
char[]a =input.toCharArray();
char[]a1 = input1.toCharArray();
System.out.println(input +""+ input1);
int a2=0;
if(input!=null) {
for(int i=0;i<input.length();i++) {
if(a[i]==' ') {
a2=i;
for(int j=0;j<a2;j++) {
System.out.println(a[i]);
a2=0;
}
}
else System.out.print(a[i]);
}System.out.println("");
for(int i=0;i<input1.length();i++) {
if(a1[i]==' ') {
a2=i;
for(int j=0;j<a2;j++) {
System.out.println(a1[i]);
a2=0;
}
}
else System.out.print(a1[i]);
}
}
}
}
To solve the problem:
Write a program that reads five words from the standard input and
outputs each word in a new line.
This was my solution:
while(scanner.hasNext()){
System.out.println(scanner.next());
}

Comparing one data structure against another resulting in run time of over 50 mins

I'm writing code which reads in a text file (each line a tweet) and goes through each tweet comparing it against a list of English words to see if the word is misspelled.
So the list of English words is read in from a text file as well, this is then stored in a List. When I run the code for this alone, it operates in less than one second. When I run the code for storing each word in the tweet file (without checking for spelling) for the 1,000,000 tweets, it stores each word and its frequency in a HashMap<String, Integer> in around 20-30sec.
But when I add the line to check if the word is spelled correctly, it causes a ridiculous run time increase, to the point where I could almost watch a movie before it finished running.
All I add is a call to isSpelledCorrectly(x) (which just invokes list.contains(x), which has a worst-case run time of O(n)), yet it seems quite confounding that it causes the code to go from a 30 sec run time to a 50 min run time.
Code:
Spelling:
static List<String> spellCheck = new ArrayList<String>();
public AssignTwo() throws IOException{
spellCheck = initCorrectSpelling("C:\\Users\\Gregs\\InfoRetrieval\\src\\english-words");
}
public static List<String> initCorrectSpelling(String filename) throws IOException { //store correct spelling of words in list
Scanner scanner = new Scanner(new FileInputStream(filename));
try{
while(scanner.hasNextLine()){
String next = scanner.nextLine();
spellCheck.add(next);
}
}
finally{
scanner.close();
}
return spellCheck;
}
public static boolean isSpelledCorrectly(String word){ //check if any given word is spelled correctly by seeing if it is
boolean output = false; //contained within the spellCheck list
if(spellCheck.contains(word)) output = true;
return output;
}
Code storing Tweets:
public static HashMap<String, Integer> misSpell;
public AssignOne() throws IOException { //read in file from path, test functions
index("C:\\Users\\Gregs\\InfoRetrieval\\src\\tweets");
}
public static void index(String filename) throws IOException {
misSpell = new HashMap<String, Integer>();
Scanner scanner = new Scanner(new FileInputStream(filename));
try{
while(scanner.hasNextLine()){
String line = scanner.nextLine();
String[] lineArr = line.split(" ");
for(int i=3; i<lineArr.length; i++){
int count=1;
lineArr[i] = lineArr[i].replaceAll("[^a-zA-Z0-9]", "");
//if(!AssignTwo.isSpelledCorrectly(lineArr[i].toLowerCase())){ //with this line commented out, runtime <30sec, with line >50mins
if(misSpell.containsKey(lineArr[i].toLowerCase())){
count = 1 + misSpell.get(lineArr[i].toLowerCase());
}
misSpell.put(lineArr[i].toLowerCase(), count);
//}
}
}
}
finally{
scanner.close();
}
}
Any suggestion on where to improve code or how to make the comparisons more efficient? Is there a faster data structure for correctly spelled words?
List.contains() is O(N), N being the number of words in the dictionary.
Use a HashSet, where contains() is O(1) on average.
Using a BufferedReader would also speed things up, as would calling toLowerCase() once per word instead of three times.
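A rough sketch of the suggested change, with the dictionary held in a HashSet and each word lower-cased once. The class name SpellIndex and the method countMisspelled are invented for illustration; the sketch also assumes the dictionary is stored in lower case and skips tokens that become empty after cleaning, neither of which the question states.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class SpellIndex {
    // Counts how often each word that is missing from the dictionary occurs.
    static Map<String, Integer> countMisspelled(List<String> dictionaryWords,
                                                List<String> tweetWords) {
        // HashSet lookup is O(1) on average, versus O(N) for a List.
        Set<String> dictionary = new HashSet<>(dictionaryWords);
        Map<String, Integer> misSpell = new HashMap<>();
        for (String raw : tweetWords) {
            // Clean and lower-case once, then reuse the result.
            String word = raw.replaceAll("[^a-zA-Z0-9]", "").toLowerCase();
            if (word.isEmpty() || dictionary.contains(word)) {
                continue; // assumption: empty tokens are ignored
            }
            misSpell.merge(word, 1, Integer::sum);
        }
        return misSpell;
    }

    public static void main(String[] args) {
        List<String> dictionary = Arrays.asList("hello", "world");
        List<String> words = Arrays.asList("Hello", "wrld!", "wrld", "world");
        System.out.println(countMisspelled(dictionary, words));
    }
}
```

With a List, every lookup scans the whole dictionary, so the total work grows with (number of tweet words) × (dictionary size); with a HashSet it grows with the number of tweet words alone.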

my program reads from the file but cant find the words

So my program knows where the file is and it can read how many words it has. However, I am trying to compare words to count the occurrences of a word that I will read in with a Scanner.
The program says I can't convert String to boolean, which I understand, but how would I make it work?
Can I get an answer on why it runs but doesn't let me find the word I'm looking for?
Thanks
import java.util.*;
import java.io.*;
public class wordOccurence {
public static void main(String[] args) throws IOException {
{
int wordCount=0;
int word =0;
Scanner scan=new Scanner(System.in);
System.out.println("Enter file name");
System.out.println("Enter the word you want to scan");
String fileName=scan.next().trim();
Scanner scr = new Scanner(new File(fileName));
// your code goes here ...
while(scr.nextLine()){
String word1 = scr.next();
if (word1.equals(scr)){
word++;
}
}
System.out.println("Total words = " + word);
}
}
}
At present you are only checking if there is a next line available:
while(scr.hasNextLine()){
but you are not fetching it. It's like you are staying at the same position in the file forever.
To fetch the next line, you can make use of
scanner.nextLine()
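Putting check and fetch together for the word-counting task might look like this sketch. It reads word by word with hasNext()/next() rather than line by line, and compares each word with the target string (not with the Scanner object). WordOccurrence and countOccurrences are illustrative names, and a Scanner over a String stands in for the file.

```java
import java.util.Scanner;

class WordOccurrence {
    // Counts how many whitespace-separated words in the source equal the target.
    static int countOccurrences(Scanner source, String target) {
        int count = 0;
        while (source.hasNext()) {        // check that another word exists...
            String word = source.next();  // ...then fetch it, which advances the scanner
            if (word.equals(target)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        Scanner text = new Scanner("the cat and the hat\nthe end");
        System.out.println(countOccurrences(text, "the"));
        text.close();
    }
}
```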

useDelimiter, read up till first delimiter and then change line

I'm trying to use a delimiter to pull out the first number in each line of a document with 31 rows, each looking something like "105878-798##176000##JDOE", and put the numbers in an int array.
The number I'm interested in here is "105878798", and the number of digits is not consistent.
I wrote this but can't figure out how to move on to the next line once I reach the first delimiter of the line.
import java.util.*;
import java.io.*;
public class Test {
public static void main(String[] args) throws Exception {
int n = 0;
String rad;
File fil = new File("accounts.txt");
int[] accountNr = new int[31];
Scanner sc = new Scanner(fil).useDelimiter("##");
while (sc.hasNextLine()) {
rad = sc.nextLine();
rad.replaceAll("-","");
accountNr[n] = Integer.parseInt(rad);
System.out.println(accountNr[n]);
n++;
System.out.println(rad);
}
}
}
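Don't use the Scanner for this; use a StringTokenizer and set the delimiter to ##, then just keep calling .nextElement() and you will get the next token no matter how long it is. (StringTokenizer treats each character of the delimiter string as a separate delimiter, so this gives the same tokens as long as no field contains a lone #.)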
StringTokenizer st2 = new StringTokenizer(str, "##");
while (st2.hasMoreElements()) {
log.info(st2.nextElement());
}
(Of course, you can iterate in different ways..)
I would suggest using line.split("[#][#]")[0] for each line (and, of course, handling your exceptions).
Also, rad.replaceAll(...) returns a new String, because String is immutable. You should call parseInt on the returned String, not on rad.
Just use the following in place of the two equivalent lines in your code:
String newRad = rad.replaceAll("-","");
accountNr[n] = Integer.parseInt(newRad);
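Combining the two suggestions (take the part before the first ## and parse the result of the replacement), a per-line helper could be sketched as below. AccountParser and parseAccountNumber are hypothetical names. Returning long instead of int is a deliberate assumption, since the question says the number of digits varies and a longer number could overflow an int.

```java
class AccountParser {
    // Extracts the first "##"-separated field of a line, drops the dashes,
    // and parses what is left as a number.
    static long parseAccountNumber(String line) {
        String firstField = line.split("##")[0];      // text before the first delimiter
        String digits = firstField.replace("-", "");  // replace returns a new String
        return Long.parseLong(digits);
    }

    public static void main(String[] args) {
        System.out.println(parseAccountNumber("105878-798##176000##JDOE"));
    }
}
```

Called once per line inside the while (sc.hasNextLine()) loop, this avoids the need for useDelimiter altogether.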

hashset input java

I'm working on the question below and am quite close, but on lines 19 and 32 I get the following error and can't figure it out.
foreach not applicable to expression type
for (String place: s)
Question:
Tax inspectors have available to them two text files, called unemployed.txt and taxpayers.txt, respectively. Each file contains a collection of names, one name per line. The inspectors regard anyone who occurs in both files as a dodgy character. Write a program which prints the names of the dodgy characters. Make good use of Java’s support for sets.
My code:
class Dodgy {
public static void main(String[] args) {
HashSet<String> hs = new HashSet<String>();
Scanner sc1 = null;
try {sc1 = new Scanner(new File("taxpayers.txt"));}
catch(FileNotFoundException e){};
while (sc1.hasNextLine()) {
String line = sc1.nextLine();
String s = line;
for (String place: s) {
if((hs.contains(place))==true){
System.out.println(place + " is a dodgy character.");
hs.add(place);}
}
}
Scanner sc2 = null;
try {sc2 = new Scanner(new File("unemployed.txt"));}
catch(FileNotFoundException e){};
while (sc2.hasNextLine()) {
String line = sc2.nextLine();
String s = line;
for (String place: s) {
if((hs.contains(place))==true){
System.out.println(place + " is a dodgy character.");
hs.add(place);}
}
}
}
}
You're trying to iterate over "each string within a string" - what does that even mean?
It feels like you only need to iterate over each line in each file... you don't need to iterate within a line.
Secondly - in your first loop, you're only looking at the first file, so how could you possibly detect dodgy characters?
I would consider abstracting the problem to:
Write a method to read a file and populate a hash set.
Call that method twice to create two sets, then find the intersection.
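The two steps above can be sketched as follows. DodgyNames, readNames and intersection are names chosen for illustration, and Scanners over Strings stand in for new Scanner(new File("unemployed.txt")) and new Scanner(new File("taxpayers.txt")); skipping blank lines is an assumption.

```java
import java.util.HashSet;
import java.util.Scanner;
import java.util.Set;

class DodgyNames {
    // Reads one name per line into a set, skipping blank lines.
    static Set<String> readNames(Scanner sc) {
        Set<String> names = new HashSet<>();
        while (sc.hasNextLine()) {
            String name = sc.nextLine().trim();
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }

    // Returns the names present in both sets without modifying either input.
    static Set<String> intersection(Set<String> first, Set<String> second) {
        Set<String> both = new HashSet<>(first);
        both.retainAll(second); // keeps only elements that are also in `second`
        return both;
    }

    public static void main(String[] args) {
        Set<String> unemployed = readNames(new Scanner("Alice\nBob\n"));
        Set<String> taxpayers = readNames(new Scanner("Bob\nCarol\n"));
        for (String name : intersection(unemployed, taxpayers)) {
            System.out.println(name + " is a dodgy character.");
        }
    }
}
```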
The for-each loop is applicable only to arrays and java.lang.Iterable types. String is neither, hence the error.
If your intention is to iterate over the characters in the string, replace "s" with "s.toCharArray()", which returns a char[] that for-each can iterate over (the loop variable must then be a char, not a String).
