I'm working on a program that counts the occurrences of words in a text file. The program compiles and runs fine, but I'm trying to use the split method to separate special characters such as .,;:!?(){} from the words.
Here is an output example:
6 eyes,
3 eyes.
2 eyes;
1 eyes?
1 eyrie
As you can see, the split function is not working. I have tried debugging, but no luck so far. Can anyone point me in the right direction or tell me what I'm doing wrong? Thank you.
import java.util.*;
import java.io.*;
public class testingForLetters {
public static void main(String[] args) throws FileNotFoundException {
// open the file
Scanner console = new Scanner(System.in);
System.out.print("What is the name of the text file? ");
String fileName = console.nextLine();
Scanner input = new Scanner(new File(fileName));
// count occurrences
Map<String, Integer> wordCounts = new TreeMap<String, Integer>();
while (input.hasNext()) {
input.next().split("[ \n\t\r.,;:!?(){}]" );
String next = input.next().toLowerCase();
if (next.startsWith("a") || next.startsWith("b") || next.startsWith("c") || next.startsWith("d") || next.startsWith("e") ) {
if (!wordCounts.containsKey(next)) {
wordCounts.put(next, 1);
} else {
wordCounts.put(next, wordCounts.get(next) + 1);
}
}
}
// get cutoff and report frequencies
System.out.println("Total words = " + wordCounts.size());
for (String word : wordCounts.keySet()) {
int count = wordCounts.get(word);
System.out.println(count + "\t" + word);
}
}
}
The .split() method returns an array of strings, and right now you aren't assigning the result of input.next().split() to anything, so it is thrown away. You have to store the returned array in a variable and then get the word(s) from that array, just as you store the result of .toLowerCase() in String next = input.next().toLowerCase(). Also note that every call to input.next() reads a new token, so calling it twice per loop iteration skips every other word. Hope this helps.
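A minimal sketch of the counting loop with the array from split actually used. The class name WordSplitSketch and the helper countWords are made up for illustration, the filter on the first letter is left out, and the input is an inline String instead of a file so the example is self-contained:

```java
import java.util.Map;
import java.util.Scanner;
import java.util.TreeMap;

public class WordSplitSketch {

    // Hypothetical helper: counts lower-cased words, treating punctuation as separators.
    static Map<String, Integer> countWords(Scanner input) {
        Map<String, Integer> wordCounts = new TreeMap<>();
        while (input.hasNext()) {
            // Call next() once per iteration and keep the array that split returns.
            String[] parts = input.next().toLowerCase().split("[.,;:!?(){}]+");
            for (String part : parts) {
                if (part.isEmpty()) {
                    continue; // a token like "(eyes" yields a leading empty string
                }
                wordCounts.merge(part, 1, Integer::sum);
            }
        }
        return wordCounts;
    }

    public static void main(String[] args) {
        Scanner input = new Scanner("Eyes, eyes. eyes; (eyes?) eyrie");
        for (Map.Entry<String, Integer> entry : countWords(input).entrySet()) {
            System.out.println(entry.getValue() + "\t" + entry.getKey());
        }
    }
}
```

With this input the sketch prints 4 for eyes and 1 for eyrie, because the punctuation no longer makes "eyes," and "eyes." look like different words.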
I am new to Stackoverflow and this is my first time asking a question. I have searched my problem thoroughly, however, could not find an appropriate answer. I am sorry if this has been asked. Thank you in advance.
The question is from Hyperskill.com as follows:
Write a program that reads five words from the standard input and outputs each word in a new line.
First, you need to print all the words from the first line, then from the second (from the left to right).
Sample Input 1:
This Java course
is adaptive
Sample Output 1:
This
Java
course
is
adaptive
My attempt at solving it:
import java.util.Scanner;
public class Main {
public static void main(String[] args) {
/* I have not initialized the "userInput" String.
* I know that String is immutable in Java and
* if I initialize it to an empty String ""
* and read a String from user.
* It will not overwrite to the "userInput" String.
* But create another String object to give it the value of the user input,
* and references the new String object to "userInput".
* I didn't want to waste memory like that.
*/
String userInput;
String[] userInputSplitFirstLine = new String[3];
String[] userInputSplitSecondLine = new String[2];
Scanner scan = new Scanner(System.in);
userInput = scan.nextLine();
userInputSplitFirstLine = userInput.split("\\s+");
userInput = scan.nextLine();
userInputSplitSecondLine = userInput.split("\\s+");
for(String firstLineSplitted: userInputSplitFirstLine) {
System.out.println(firstLineSplitted);
}
for(String secondLineSplitted: userInputSplitSecondLine) {
System.out.println(secondLineSplitted);
}
scan.close();
}
}
If you try the sample input above, the output matches the sample output. However, if you type more than 3 words on the first line and/or more than 2 words on the second line, the userInputSplitFirstLine array of size 3 ends up storing more than 3 words, and the same goes for the userInputSplitSecondLine array. My first question: how can an array of size 3 (userInputSplitFirstLine) and an array of size 2 (userInputSplitSecondLine) hold more than 3 and 2 elements, respectively? My second question: how can I restrict/limit the number of words that the user can enter on a line, so that, for example, the first line only accepts 3 words and the second line only accepts 2 words?
Also the answer to this question suggested by Hyperskill.com is as follows:
import java.util.Scanner;
class Main {
public static void main(String[] args) {
Scanner scanner = new Scanner(System.in);
String wordOne = scanner.next();
String wordTwo = scanner.next();
String wordThree = scanner.next();
String wordFour = scanner.next();
String wordFive = scanner.next();
System.out.println(wordOne);
System.out.println(wordTwo);
System.out.println(wordThree);
System.out.println(wordFour);
System.out.println(wordFive);
}
}
You can use the next method of the Scanner object to read each word and then print it on a new line.
while(true){
if(scanner.hasNext()){
System.out.println(scanner.next());
}
else{
break;
}
}
I think this should do the work. Don't hesitate to ask, if you have some questions.
import java.util.Scanner;
class App {
public static void main(String[] args) {
final StringBuffer line = new StringBuffer();
final StringBuffer words = new StringBuffer();
try (final Scanner sc = new Scanner(System.in)) {
while (sc.hasNextLine()) {
final String currentLine = sc.nextLine();
line.append(currentLine).append(System.lineSeparator());
for (final String word : currentLine.split("\\s+")) {
words.append(word).append(System.lineSeparator());
}
}
} finally {
System.out.println(line.toString());
System.out.println();
System.out.println(words.toString());
}
}
}
My first question is how can an array of size 3 (userInputSplitFirstLine) and an array of size 2 (userInputSplitSecondLine) hold more than 3 and 2 elements, respectively?
The array here:
String[] userInputSplitFirstLine = new String[3];
is not the same one as the one you got from split:
userInputSplitFirstLine = userInput.split("\\s+");
When you do the above assignment, the reference to the old array is simply replaced: userInputSplitFirstLine now refers to a new array whose length is independent of the old one. split always returns a new array, and the old three-element array is discarded.
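A small sketch of that reassignment. The class name ArrayReassignSketch, the helper lengthAfterSplit and the sample sentence are made up for illustration:

```java
public class ArrayReassignSketch {

    // Hypothetical helper mirroring the question: allocate 3 slots, then assign the split result.
    static int lengthAfterSplit(String line) {
        String[] words = new String[3];
        words = line.split("\\s+"); // words now refers to a brand-new array
        return words.length;
    }

    public static void main(String[] args) {
        String[] words = new String[3];
        String[] original = words; // second reference to the same 3-slot array

        words = "one two three four five".split("\\s+");

        System.out.println(words.length);            // 5
        System.out.println(original.length);         // 3, the old array is untouched
        System.out.println(lengthAfterSplit("a b")); // 2
    }
}
```

The variable only holds a reference, so the size given to new String[3] says nothing about the array that split hands back.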
My second question is how can I restrict/limit the number of words that the user can insert in a line; for example, the first line only accepts 3 words and the second line only accepts 2 words?
It really depends on what you mean by "restrict". If you just want to check if there are exactly three words, and if not, exit the program, you can do this:
userInputSplitFirstLine = userInput.split("\\s+");
if (userInputSplitFirstLine.length != 3) {
System.out.println("Please enter exactly 3 words!");
return;
}
You can do something similar with the second line.
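If exiting feels too strict, another reading of "restrict" is to keep asking until the line has the right number of words. The sketch below assumes that behaviour; the class name ExactWordCountSketch and the helper readWords are hypothetical:

```java
import java.util.Scanner;

public class ExactWordCountSketch {

    // Hypothetical helper: keeps reading lines until one has exactly `expected` words.
    // Returns null if the input ends first.
    static String[] readWords(Scanner scan, int expected) {
        while (scan.hasNextLine()) {
            String[] words = scan.nextLine().trim().split("\\s+");
            // An empty line splits into a single empty string, so rule that out explicitly.
            if (words.length == expected && !words[0].isEmpty()) {
                return words;
            }
            System.out.println("Please enter exactly " + expected + " words!");
        }
        return null;
    }

    public static void main(String[] args) {
        Scanner scan = new Scanner(System.in);
        String[] first = readWords(scan, 3);
        String[] second = readWords(scan, 2);
        if (first == null || second == null) {
            return; // input ended before two valid lines were read
        }
        for (String word : first) {
            System.out.println(word);
        }
        for (String word : second) {
            System.out.println(word);
        }
    }
}
```

The user can still type anything; the program just refuses to move on until the line passes the check.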
If you want the user to be unable to type more than 3 words, then that's impossible, because this is a command line app.
By the way, the code in the suggested solution works because next() returns the next "word" (or what we generally think of as a word, anyway) by default.
Hope this will help you!
import java.util.Scanner;
public class Practice1 {
    public static void main(String[] args) {
        Scanner sc = new Scanner(System.in);
        String input = sc.nextLine();
        String input1 = sc.nextLine();
        printWords(input.toCharArray());
        System.out.println();
        printWords(input1.toCharArray());
        System.out.println();
    }
    // Print the characters one by one, starting a new line at every space.
    private static void printWords(char[] chars) {
        for (char c : chars) {
            if (c == ' ') {
                System.out.println();
            } else {
                System.out.print(c);
            }
        }
    }
}
To solve the problem:
Write a program that reads five words from the standard input and
outputs each word in a new line.
This was my solution:
while(scanner.hasNext()){
System.out.println(scanner.next());
}
I have a Java program that reads a txt file and counts the words in that file. I set up my program so the String read from the txt file is saved in an ArrayList, and my variable word contains that text. The issue with my code is that my if statement does not seem to add a value to my count variable each time it detects a space in the word string; it seems to only run the if statement once. How can I make it so the if statement finds a space, adds 1 to my counter value, removes the space, and looks for the next space in the word variable's string? Here is the code:
import java.io.*;
import java.util.*;
public class FrequencyCounting
{
public static void main(String[] args) throws FileNotFoundException
{
// Read-in text from a file and store each word and its
// frequency (count) in a collection.
Scanner inputFile = new Scanner(new File("phrases.txt"));
String word= " ";
Integer count = 0;
List<String> ma = new ArrayList<String>();
while(
inputFile.hasNextLine()) {
word = word + inputFile.nextLine() + " ";
}
ma.add(word);
System.out.println(ma);
if(word.contains(" ")) {
ma.remove(" ");
count++;
System.out.println("does contain");
}
else {
System.out.println("does not contain");
}
System.out.println(count);
//System.out.println(ma);
inputFile.close();
// Output each word, followed by a tab character, followed by the
// number of times the word appeared in the file. The words should
// be in alphabetical order.
; // TODO: Your code goes here.
}
}
When I execute the program, I get a value of 1 for the variable count and I get a returned string representation of the txt file from my phrases.txt
phrases.txt is :
my watch fell in the water
time to go to sleep
my time to go visit
watch out for low flying objects
great view from the room
the world is a stage
the force is with you
you are not a jedi yet
an offer you cannot refuse
are you talking to me
Your if statement is not inside any loop, so it will only execute once.
A better approach, which would save a lot of runtime, is to read each line like you already do, use the String.split() method to split it on spaces, and then add each element of the returned String[] to your list, for example by passing Arrays.asList(...) of that array to ArrayList.addAll() (or by adding the elements one by one).
Then count by using the ArrayList.size() method to get the number of elements.
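A minimal sketch of that approach. The class name TotalWordCountSketch and the helper readWords are made up for illustration, and two lines of phrases.txt are inlined so the example runs without the file:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Scanner;

public class TotalWordCountSketch {

    // Hypothetical helper: collects every whitespace-separated word from the scanner.
    static List<String> readWords(Scanner input) {
        List<String> words = new ArrayList<>();
        while (input.hasNextLine()) {
            String line = input.nextLine().trim();
            if (line.isEmpty()) {
                continue; // splitting an empty line would add one empty "word"
            }
            words.addAll(Arrays.asList(line.split("\\s+")));
        }
        return words;
    }

    public static void main(String[] args) {
        // Two lines from phrases.txt, inlined so the sketch runs without the file.
        Scanner input = new Scanner("my watch fell in the water\ntime to go to sleep\n");
        List<String> words = readWords(input);
        System.out.println(words.size()); // 11
    }
}
```

To read the real file, the Scanner would be built from new File("phrases.txt") as in the question.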
Based on the comments in your code :
// Read-in text from a file and store each word and its
// frequency (count) in a collection.
// Output each word, followed by a tab character, followed by the
// number of times the word appeared in the file. The words should
// be in alphabetical order.
My understanding is that you need to store a count for every word, rather than a single total count of words. Since the words also have to come out in alphabetical order, it is better to go with a TreeMap, which keeps its keys sorted.
public static void main(String[] args) {
Map<String, Integer> wordMap = new TreeMap<String, Integer>();
try {
Scanner inputFile = new Scanner(new File("phrases.txt"));
while(inputFile.hasNextLine()){
String line = inputFile.nextLine();
String[] words = line.split(" ");
for(int i=0; i<words.length; i++){
String word = words[i].trim();
if(word.length()==0){
continue;
}
int count = 0;
if(wordMap.containsKey(word)){
count = wordMap.get(word);
}
count++;
wordMap.put(word, count);
}
}
inputFile.close();
for(Map.Entry<String,Integer> entry : wordMap.entrySet()){
System.out.println(entry.getKey()+"\t"+entry.getValue());
}
} catch (FileNotFoundException e) {
e.printStackTrace();
}
}
What is your goal here? Do you just want to read the file and count the number of words?
You need to use a while loop instead of an if statement that'll just run once. Here's a better way to do what you want to do:
Scanner inputFile = new Scanner(new File("phrases.txt"));
StringBuilder sb = new StringBuilder();
String line;
int totalCount = 0;
while(inputFile.hasNextLine()) {
line = inputFile.nextLine();
sb.append(line).append("\n"); // This is more efficient than concatenating strings
int spacesOnLine = countSpacesOnLine(line);
totalCount += spacesOnLine;
// print line and spacesOnLine if you wish to here
}
// print text file
System.out.println(sb.toString());
// print total spaces in file
System.out.println("Total spaces: " + totalCount);
inputFile.close();
Then add a method that counts the spaces on a line:
private static int countSpacesOnLine(String line) { // static so it can be called from main
int totalSpaces = 0;
for(int i = 0; i < line.length(); i++) {
if (line.charAt(i) == ' ')
totalSpaces += 1;
}
return totalSpaces;
}
You can achieve your objective with the following one liner too:
int words = Files.readAllLines(Paths.get("phrases.txt"), Charset.forName("UTF-8")).stream().mapToInt(string -> string.split(" ").length).sum();
I am probably late, but here is a simple C# version:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.IO;
namespace StackOverflowAnswers
{
class Program
{
static void Main(string[] args)
{
string contents = File.ReadAllText(@"C:\temp\test.txt");
var arrayString = contents.Split(' ');
Console.WriteLine("Number of Words {0}", arrayString.Length);
Console.ReadLine();
}
}
}
I have 100 text files. 50 of them are called text_H and the others are called text_T. What I would like to do is the following: open the two text files text_T_1 and text_H_1, find the number of common words and write it to a text file, then open text_H_2 and text_T_2 and find the number of common words, and so on until text_H_50 and text_T_50.
I have written the following code, which opens two text files, finds the common words and returns the number of common words between the two files. The results are written to a text file.
For whatever reason, instead of giving me the number of common words for just the two open text files, it gives me the number of common words for all files so far. For example, if the number of common words between fileA_1 and fileB_1 is 10 and the number of common words between fileA_2 and fileB_2 is 5, then the result I get for the second pair of files is 10+5=15.
I'm hoping someone here can catch whatever it is that I'm missing, because I've been through this code many times now without success. Thanks ahead of time for any help!
The code:
package xml_test;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Scanner;
public class app {
private static ArrayList<String> load(String f1) throws FileNotFoundException
{
Scanner reader = new Scanner(new File(f1));
ArrayList<String> out = new ArrayList<String>();
while (reader.hasNext())
{
String temp = reader.nextLine();
String[] sts = temp.split(" ");
for (int i = 0;i<sts.length;i++)
{
if(!sts[i].isEmpty()) // compare String contents, not references; split(" ") can only leave empty strings behind
out.add(sts[i]);
}
}
return out;
}
private static void write(ArrayList<String> out, String fname) throws IOException
{
FileWriter writer = new FileWriter(new File(fname));
//int count=0;
int temp1=0;
for (int ss= 1;ss<=3;ss++)
{
int count=0;
for (int i = 0;i<out.size();i++)
{
//writer.write(out.get(i) + "\n");
//writer.write(new Integer(count).toString());
count++;
}
writer.write("count ="+new Integer(temp1).toString()+"\n");
}
writer.close();
}
public static void main(String[] args) throws IOException
{
ArrayList<String> file1;
ArrayList<String> file2;
ArrayList<String> out = new ArrayList<String>();
//add for loop to loop through all T's and H's
for(int kk = 1;kk<=3;kk++)
{
int count=0;
file1 = load("Training_H_"+kk+".txt");
file2 = load("Training_T_"+kk+".txt");
//int count=1;
for(int i = 0;i<file1.size();i++)
{
String word1 = file1.get(i);
count=0;
//System.out.println(word1);
for (int z = 0; z <file2.size(); z++)
{
//if (file1.get(i).equalsIgnoreCase(file2.get(i)))
if (word1.equalsIgnoreCase(file2.get(z)))
{
boolean already = false;
for (int q = 0;q<out.size();q++)
{
if (out.get(q).equalsIgnoreCase(file1.get(i)))
{
count++;
//System.out.println("count is "+count);
already = true;
}
}
if (already==false)
{
out.add(file1.get(i));
}
}
}
//write(out,"output_"+kk+".txt");
}
//count=new Integer(count).toString();
//write(out,"output_"+kk+".txt");
//write(new Integer(count).toString(),"output_2.txt");
//System.out.println("count is "+count);
}//
}
}
Let me show you what your code is doing and see if you can spot the problem.
List wordsInFile1 = getWordsFromFile();
List wordsInFile2 = getWordsFromFile();
List foundWords = empty;
//Does below for each compared file
for each word in file 1
set count to 0
compare to each word in file 2
if the word matches see if it's also in foundWords
if it is in foundWords, add 1 to count
otherwise, add the word to foundWords
//Write the number of words
prints out the number of words in foundWords
Hint: The issue is with foundWords and where you are adding to count. arunmoezhi's comment is on the right track, as well as board_reader's point #3 in his answer.
As it stands now, your code is doing nothing meaningful with any of the count variables.
1. Use more meaningful variable names in loops; it makes the code more readable.
2. Use HashMaps instead of ArrayLists. That will make the code smaller, faster and a lot easier to follow, and it will also use less memory when words are repeated several times in the files.
3. Shouldn't you increase count in the already==false case?
4. I could not figure out the point of calculating count 3 times in the write method. Isn't count equal to out.size()?
There are probably more issues too...
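One way to act on these points is sketched below. It swaps the suggested HashMap for a HashSet, since only membership matters here, and it is an illustration rather than a drop-in fix for the code above. The class name CommonWordsSketch, the helper countCommonWords and the sample word lists are made up:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class CommonWordsSketch {

    // Hypothetical helper: number of distinct words (ignoring case) present in both lists.
    static int countCommonWords(List<String> file1, List<String> file2) {
        Set<String> seenInFile1 = new HashSet<>();
        for (String word : file1) {
            seenInFile1.add(word.toLowerCase());
        }
        Set<String> common = new HashSet<>(); // created fresh on every call
        for (String word : file2) {
            String lower = word.toLowerCase();
            if (seenInFile1.contains(lower)) {
                common.add(lower);
            }
        }
        return common.size();
    }

    public static void main(String[] args) {
        List<String> h1 = Arrays.asList("the", "cat", "sat", "the");
        List<String> t1 = Arrays.asList("The", "dog", "sat");
        List<String> h2 = Arrays.asList("red", "green");
        List<String> t2 = Arrays.asList("green", "blue");
        System.out.println(countCommonWords(h1, t1)); // 2 ("the" and "sat")
        System.out.println(countCommonWords(h2, t2)); // 1, nothing carries over from the first pair
    }
}
```

Because the set of common words is a local variable, each pair of files starts from an empty collection, which is exactly what the shared out list in the question does not do.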
Hi, I'm in a programming class over the summer and am required to create a program that reads input from a file. The input file contains DNA sequences (ATCGAGG etc.), and the first line in the file states how many pairs of sequences need to be compared. The rest are pairs of sequences. In class we use the Scanner class to read lines from a file (I read about BufferedReader, but we have not covered it in class, so I'm not too familiar with it), but I am lost on how to write the code to compare two lines read with Scanner at the same time.
My attempt:
public static void main (String [] args) throws IOException
{
File inFile = new File ("dna.txt");
Scanner sc = new Scanner (inFile);
while (sc.hasNextLine())
{
int pairs = sc.nextLine();
String DNA1 = sc.nextLine();
String DNA2 = sc.nextLine();
comparison(DNA1,DNA2);
}
sc.close();
}
The comparison method would take a pair of sequences and output whether they have any common characters. Also, how would I proceed to read the next pair? Any insight would be helpful. I'm just stumped, and Google confused me even further. Thanks!
EDIT:
Here's the sample input
7
atgcatgcatgc
AtgcgAtgc
GGcaAtt
ggcaatt
GcT
gatt
aaaaaGTCAcccctccccc
GTCAaaaaccccgccccc
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
gctagtacACCT
gctattacGcct
First, why are you doing this:
while (sc.hasNextLine())
{
int pairs = sc.nextLine();
The number of pairs appears only once, on the first line of the file, not before every pair, so reading it inside the loop is wrong (and nextLine() returns a String, which cannot be assigned to an int). Move the read out of the while loop and parse it to an int. You can then use that value to stop reading once you have processed that many pairs.
Second:
throws IOException
This might be irrelevant, but instead of declaring throws IOException you could wrap the file access in a try/catch block if you would rather handle, or deliberately skip, the exception.
Comparison: since you read the lines as strings, you can compare two of them with the String method equals.
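A short sketch of that comparison, using two pairs from the sample input. It assumes that upper and lower case letters should count as the same base, which the question does not state; the class name DnaEqualsSketch and the helper sameSequence are made up:

```java
public class DnaEqualsSketch {

    // Hypothetical helper: true when both sequences contain the same bases, ignoring letter case.
    static boolean sameSequence(String dna1, String dna2) {
        return dna1.equalsIgnoreCase(dna2);
    }

    public static void main(String[] args) {
        System.out.println("GGcaAtt".equals("ggcaatt"));        // false, equals is case-sensitive
        System.out.println(sameSequence("GGcaAtt", "ggcaatt")); // true
        System.out.println(sameSequence("GcT", "gatt"));        // false
    }
}
```

Never use == for this: it compares references, not the characters in the strings.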
Google will not help much if you search for the whole problem at once. Instead of typing "Reading two lines from an input file using Scanner", search for the basic building blocks, such as "string comparison java". You have to go step by step and cut the problem into smaller pieces; that is how software developers work.
OK, here is a program that worked for me. It is brute force, which is fine for a task like this: it checks each pair of lines and prints the pairs that have something in common:
import java.io.File;
import java.io.IOException;
import java.util.Scanner;
public class program
{
public static void main (String [] args) throws IOException
{
File inFile = new File ("c:\\dna.txt");
Scanner sc = new Scanner (inFile);
int pairs = Integer.parseInt(sc.nextLine());
// Read at most `pairs` pairs, and stop early if the file has fewer lines than the first line claims
for (int i = 0; i < pairs && sc.hasNextLine(); i++)
{
String DNA1 = sc.nextLine();
if (!sc.hasNextLine()) break; // unpaired last line
String DNA2 = sc.nextLine();
Boolean compareResult = comparison(DNA1,DNA2);
if (compareResult){
System.out.println("found the match in:" + DNA1 + " and " + DNA2) ;
}
}
sc.close();
}
// Returns true if the two sequences share at least one character, ignoring case.
public static Boolean comparison(String dna1, String dna2){
String lower2 = dna2.toLowerCase();
for (int i = 0; i < dna1.length(); i++)
{
// indexOf returns -1 when the character does not occur in lower2
if (lower2.indexOf(Character.toLowerCase(dna1.charAt(i))) >= 0)
{
return true;
}
}
return false;
}
}
I have an assignment that pretty much has me stumped early on, the remainder of which is fairly easy (sorting the data once its imported and then saving it again under a different name).
We need to import data from a .txt file into 3 separate Arrays ( name, mascot, alias ) however the lines are not consistent. By consistent I mean one line may have:
Glebe,G Shield,Glebe District
While another line may have:
St George,Knight & Dragon,Saints,Dragons,St George Illawarra
Everything before the first , belongs to the name array.
Everything after the first , but before the second , belongs to the mascot array.
Everything after the second , till the end of the line belongs to the alias array.
I've been able to work out how to import the .txt file so that each array element contains an entire line, and I was then able to change that into importing everything before a "," and a new line (using delimiters). However, the lines that contain more than 3 sets of data ruin the import, as the alias array only ends up holding 1 value instead of everything else.
Thus does anyone know of and can show me a code that pretty much does:
name = Everything before the first ,
Mascot = Everything after the first , but before the second ,
Alias = Everything after the second , till the end of the line
That I could use as a base to work into mine?
After a day of research I keep running into dead ends. The solutions I find generally involve splitting at each comma, but that breaks the import (for lines with more than 1 alias, the second alias is put into the name array, etc.).
This is the code I came up with that imports the entire line into an array:
public static void LoadData() throws IOException
{
String clubtxt = ("NRLclubs.txt");
String datatxt = ("NRLdata.txt");
int i, count;
File clubfile = new File(clubtxt);
File datafile = new File(datatxt);
if (clubfile.exists())
{
count = 0;
Scanner inputFile = new Scanner(clubfile);
i = 0;
while(inputFile.hasNextLine())
{
count++;
inputFile.nextLine();
}
String [] teamclub = new String[count];
inputFile.close();
inputFile = new Scanner(clubfile);
while(inputFile.hasNext())
{
teamclub[i] = inputFile.nextLine();
System.out.println(teamclub[i]);
i++;
}
inputFile.close();
}
else
{
System.out.println("\n" + "The file " + clubfile + " does not exist." + "\n");
}
if (datafile.exists())
{
count = 0;
Scanner inputFile = new Scanner(datafile);
i = 0;
while(inputFile.hasNextLine())
{
count++;
inputFile.nextLine();
}
String [] teamdata = new String[count];
inputFile.close();
inputFile = new Scanner(datafile);
while(inputFile.hasNext())
{
teamdata[i] = inputFile.nextLine();
System.out.println(teamdata[i]);
i++;
}
inputFile.close();
}
else
{
System.out.println("\n" + "The file " + datafile + " does not exist." + "\n");
}
}
Look at String.split method with the parameter limit.
When you have your input line in a variable called line, you can call
String[] tokens = line.split(",", 3);
This will split the line on the commas, while making sure that it will not return more than 3 tokens. It returns an array of String in which the first element will be what is before the first comma, the second will be what is between the first and second commas, and the third element will be what is after the second comma.
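A minimal sketch using the two sample lines from the question. The class name SplitLimitSketch and the helper parseClub are made up, and padding missing fields with empty strings is an assumption about how short lines should be handled:

```java
import java.util.Arrays;

public class SplitLimitSketch {

    // Hypothetical helper: name, mascot and alias fields, padded with "" when commas are missing.
    static String[] parseClub(String line) {
        String[] tokens = line.split(",", 3); // at most 3 tokens; the last keeps its commas
        String[] fields = {"", "", ""};
        System.arraycopy(tokens, 0, fields, 0, tokens.length);
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parseClub("Glebe,G Shield,Glebe District")));
        System.out.println(Arrays.toString(
                parseClub("St George,Knight & Dragon,Saints,Dragons,St George Illawarra")));
    }
}
```

For the second line the third element is "Saints,Dragons,St George Illawarra", so every alias stays together in one string.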
Since you only want to parse on the first 2 commas, you can use String split with a limit.
If you prefer, you can use the String indexOf method to find the first 2 commas, then use the String substring method to get the characters between the commas.
You want to be able to handle a line with one comma, or no commas at all.
Here's one way to parse the String line
public List<String> splitLine(String line) {
List<String> list = new ArrayList<String>();
int firstPos = line.indexOf(",");
int secondPos = line.indexOf(",", firstPos + 1);
if (firstPos >= 0) {
if (secondPos >= 0) {
list.add(line.substring(0, firstPos));
list.add(line.substring(firstPos + 1, secondPos));
list.add(line.substring(secondPos + 1));
} else {
list.add(line.substring(0, firstPos));
list.add(line.substring(firstPos + 1));
list.add("");
}
} else {
list.add(line);
list.add("");
list.add("");
}
return list;
}
You can use the String.split method.
String line = // the line you read here
// Split on commas but only make three elements
String[] elements = line.split(",", 3);
// The first belongs to names
names[linecount] = elements[0];
// The second belongs to mascot
mascot[linecount] = elements[1];
// And the last belongs to aliases
aliases[linecount] = elements[2];
Try looking into the Pattern/Matcher stuff -- you need to come up with an appropriate regex.
http://docs.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html
Something like this might do it:
static final Pattern pattern = Pattern.compile("([^,]*),([^,]*),(.*)");
Matcher matcher = pattern.matcher(line);
if (matcher.matches()) {
// Found the groups. group(0) is the whole match, so the captures start at 1.
name = matcher.group(1);
mascot = matcher.group(2);
alias = matcher.group(3);
} else {
// failed to match line
}
Basically what you want to do is split each line into an array as you read it in, and then parse the data line by line. Something like this (pseudocode):
Scanner inputFile = new Scanner(datafile);
while(inputFile.hasNextLine()) {
String line = inputFile.nextLine();
String[] lineSplit = line.split(",");
//TODO: make sure lineSplit is at least 3 long.
String name = lineSplit[0];
String mascot = lineSplit[1];
//EDIT: Don't just get the last element, get everything after the first two.
// You can do this by just getting the substring that starts after the length of those two strings
// + 2 to account for commas.
//String alias = lineSplit[lineSplit.length() - 1];
String alias = line.substring(name.length() + mascot.length() + 2);
//If you need to do trimming on the strings to remove extra whitespace, do that here:
name = name.trim();
mascot = mascot.trim();
alias = alias.trim();
//TODO: add these into the arrays you need.
}
Hope this helps.