I am sorry if there is something very simple I am missing.
I have the following code:
import java.util.Scanner;
import java.io.File;
import java.util.regex.Pattern;
public class UnJumble
{
String[] ws;
int ind=0;
public static void main(String args[]) throws Exception
{
System.out.println("Enter a jumbled word");
String w = new Scanner(System.in).next();
UnJumble uj = new UnJumble();
uj.ws = new String[uj.fact(w.length())];
uj.makeWords("",w);
int c=1;
Scanner sc = new Scanner(new File("dict.txt"));
for(int i=0; i<uj.ws.length; i++)
{
Pattern pat = Pattern.compile(uj.ws[i].toUpperCase());
if(sc.hasNext(pat))
System.out.println(c+++" : \'"+uj.ws[i]+"\'");
}
System.out.println("Search Completed.");
if(c==1) System.out.println("No word found.");
}
public void makeWords(String p,String s)
{
if(s.length()==0)
ws[ind++] = p;
else
for(int i=0; i<s.length(); i++)
makeWords(p+s.charAt(i),s.substring(0,i)+s.substring(i+1));
}
public int fact(int n)
{
if(n==0) return 1;
else return n*fact(n-1);
}
}
The dict.txt file is the SOWPODS dictionary, which is the official Scrabble dictionary.
I want to take in a jumbled word, and rearrange it to check if it is present in the dictionary. If it is, then print it out.
When I try tra as input, the output says "No word found."
But the output should have the words tar, art and rat.
Please tell me where I am making a mistake. I apologize if I have made a very simple mistake, because this is the first time I am working with Pattern.
This is from the JavaDoc of Scanner.hasNext(Pattern pattern) (with my highlighting)
Returns true if the next complete token matches the specified pattern.
As your Scanner was initialized with the file dict.txt, it is positioned on the first word and never advances, because hasNext() does not consume any input.
The first complete token in dict.txt does not match any of your scrambled words, so no match is found.
Note: This assumes you have one word per line
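The behaviour described above can be reproduced without the real dictionary. This is a minimal sketch, assuming a tiny in-memory stand-in for dict.txt; the class and method names (HasNextDemo, nextTokenMatches) are made up for illustration:

```java
import java.util.Scanner;
import java.util.regex.Pattern;

class HasNextDemo {
    // Reports what Scanner.hasNext(Pattern) returns for a fresh scanner over `text`.
    // Only the next token is inspected; nothing is consumed.
    static boolean nextTokenMatches(String text, String word) {
        try (Scanner sc = new Scanner(text)) {
            return sc.hasNext(Pattern.compile(word));
        }
    }

    public static void main(String[] args) {
        String dict = "AA\nART\nRAT\nTAR"; // hypothetical stand-in for dict.txt
        System.out.println(nextTokenMatches(dict, "ART")); // false: the next token is "AA"
        System.out.println(nextTokenMatches(dict, "AA"));  // true
    }
}
```

Even though ART is in the text, the check fails because only the first token is ever compared.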
I think you want to change your code to look for each scrambled word anywhere in the dictionary text (with start-of-input or a non-word character before it, and a non-word character or end-of-input after it). That gives the pattern "(^|\\W)"+uj.ws[i].toUpperCase()+"(\\W|$)" and something like:
String dictstring = ...; // your dictionary as one string
Pattern p = Pattern.compile("(^|\\W)" + uj.ws[i].toUpperCase() + "(\\W|$)");
Matcher m = p.matcher(dictstring);
if(m.find()) {
...
}
I recommend IOUtils.toString() from Apache Commons IO for reading your file, like this (passing the charset explicitly):
String dictstring = "";
try(InputStream is = new FileInputStream("dict.txt")) {
dictstring = IOUtils.toString(is, StandardCharsets.UTF_8);
}
Here's a small example to get familiar with Pattern and Matcher:
String dictString= "ONE\r\nTWO\r\nTHREE";
Pattern p = Pattern.compile("(^|\\W)TWO(\\W|$)");
Matcher m = p.matcher(dictString);
if(m.find()) {
System.out.println("MATCH: " + m.group());
}
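As an alternative to regular expressions, the dictionary can be loaded into a HashSet once and each permutation looked up directly. This is a different technique from the Pattern approach above, sketched under the assumption that the dictionary holds one upper-case word per line (for example, loaded with Files.readAllLines); DictLookup and knownWords are hypothetical names:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class DictLookup {
    // Keeps only the candidates that appear in the (upper-case) dictionary set.
    static List<String> knownWords(Set<String> dictionary, String[] candidates) {
        List<String> found = new ArrayList<>();
        for (String c : candidates) {
            // null entries are skipped in case the candidate array is only partly filled
            if (c != null && dictionary.contains(c.toUpperCase(Locale.ROOT))) {
                found.add(c);
            }
        }
        return found;
    }
}
```

Calling knownWords(dictionarySet, uj.ws) would then replace the loop over patterns, and each lookup costs roughly constant time instead of a scan of the whole dictionary text.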
I need to figure out a way to turn an input file into a list of sentences which are delimited by more than one character, or more specifically, periods and exclamation points (! or .)
My input file has a layout similar to this:
Sample textfile!
A man, l, a ballot, a catnip, a pooh, a rail, a calamus, a dairyman, a bater, a canal - Panama!
This is a sentence! This one also.
Heres another one?
Yes another one.
How can I put that file into a list sentence by sentence?
Each sentence in my file is finished once a ! or . character is passed.
There are several ways to accomplish what you are asking. Here is one way to read a file and split each line by specific delimiters into a list, while still keeping the delimiters in the sentences.
All of the functionality for turning a file into a list based on multiple delimiters is in the turnSentencesToList() method.
In my example below I split by: ! . ?
import java.io.File;
import java.io.FileNotFoundException;
import java.util.LinkedList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Test{
public static void main(String [] args){
LinkedList<String> list = turnSentencesToList("sampleFile.txt");
for(String s: list)
System.out.println(s);
}
private static LinkedList<String> turnSentencesToList(String fileName) {
LinkedList<String> list = new LinkedList<>();
String regex = "\\.|!|\\?";
File file = new File(fileName);
Scanner scan = null;
try {
scan = new Scanner(file);
while(scan.hasNextLine()){
String line = scan.nextLine().trim();
String[] sentences = null;
//we don't need empty lines
if(!line.equals("")) {
//splits by . or ! or ?
sentences = line.split(regex);
//gather delims because split() removes them
List<String> delims = getDelimiters(line, regex);
if(sentences!=null) {
int count = 0;
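for(String s: sentences) {
//a trailing fragment may have no delimiter after it
String delim = count < delims.size() ? delims.get(count) : "";
list.add(s.trim()+delim);
count++;
}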
}
}
}
} catch (FileNotFoundException e) {
e.printStackTrace();
return null;
}finally {
if(scan!=null)
scan.close();
}
return list;
}
private static List<String> getDelimiters(String line, String regex) {
//this method is used to provide a list of all found delimiters in a line
List<String> allDelims = new LinkedList<String>();
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(line);
String delim = null;
while(matcher.find()) {
delim = matcher.group();
allDelims.add(delim);
}
return allDelims;
}
}
Based on your example input file, the produced output would be:
Sample textfile!
A man, l, a ballot, a catnip, a pooh, a rail, a calamus, a dairyman, a bater, a canal - Panama!
This is a sentence!
This one also.
Heres another one?
Yes another one.
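A more compact variant is to let a single regular expression match each sentence together with its terminator, so the delimiters never have to be collected separately. This is only a sketch of that alternative, not the method used above; SentenceSplitter and split are hypothetical names, and it assumes sentences end in '.', '!' or '?':

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SentenceSplitter {
    // One sentence = a run of non-terminators followed by any terminators (., ! or ?).
    private static final Pattern SENTENCE = Pattern.compile("[^.!?]+[.!?]*");

    static List<String> split(String text) {
        List<String> sentences = new ArrayList<>();
        Matcher m = SENTENCE.matcher(text);
        while (m.find()) {
            String s = m.group().trim();
            if (!s.isEmpty()) {   // skip whitespace-only fragments such as blank lines
                sentences.add(s);
            }
        }
        return sentences;
    }
}
```

Because the character class also matches line breaks, the whole file content can be passed in as one string.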
I've implemented a program that does the following:
Scan all of the words in a web page into a string (using jsoup)
Filter out all of the HTML markup and code
Put these words into a spell checking program and offer suggestions
The spell checking program loads a dictionary.txt file into an array and compares the string input to the words inside the dictionary.
My current problem is that when the input contains the same word multiple times, such as "teh program is teh worst", the code will print out
You entered 'teh', did you mean 'the'?
You entered 'teh', did you mean 'the'?
Sometimes a website will contain the same words over and over again, and this can become messy.
If it's possible, printing the word along with how many times it was spelled incorrectly would be perfect, but putting a limit to each word being printed once would be good enough.
My program has a handful of methods and two classes, but the spell checking method is below:
Note: the original code contains some 'if' statements that remove punctuation marks but I've removed them for clarity.
static boolean suggestWord;
public static String checkWord(String wordToCheck) {
String wordCheck;
String word = wordToCheck.toLowerCase();
if ((wordCheck = (String) dictionary.get(word)) != null) {
suggestWord = false; // no need to ask for suggestion for a correct
// word.
return wordCheck;
}
// If after all of these checks a word could not be corrected, return as
// a misspelled word.
return word;
}
TEMPORARY EDIT: As requested, the complete code:
Class 1:
public class ParseCleanCheck {
static Hashtable<String, String> dictionary;// To store all the words of the
// dictionary
static boolean suggestWord;// To indicate whether the word is spelled
// correctly or not.
static Scanner urlInput = new Scanner(System.in);
public static String cleanString;
public static String url = "";
public static boolean correct = true;
/**
* PARSER METHOD
*/
public static void PageScanner() throws IOException {
System.out.println("Pick an english website to scan.");
// This do-while loop allows the user to try again after a mistake
do {
try {
System.out.println("Enter a URL, starting with http://");
url = urlInput.nextLine();
// This creates a document out of the HTML on the web page
Document doc = Jsoup.connect(url).get();
// This converts the document into a string to be cleaned
String htmlToClean = doc.toString();
cleanString = Jsoup.clean(htmlToClean, Whitelist.none());
correct = false;
} catch (Exception e) {
System.out.println("Incorrect format for a URL. Please try again.");
}
} while (correct);
}
/**
* SPELL CHECKER METHOD
*/
public static void SpellChecker() throws IOException {
dictionary = new Hashtable<String, String>();
System.out.println("Searching for spelling errors ... ");
try {
// Read and store the words of the dictionary
BufferedReader dictReader = new BufferedReader(new FileReader("dictionary.txt"));
while (dictReader.ready()) {
String dictInput = dictReader.readLine();
String[] dict = dictInput.split("\\s"); // create an array of
// dictionary words
for (int i = 0; i < dict.length; i++) {
// key and value are identical
dictionary.put(dict[i], dict[i]);
}
}
dictReader.close();
String user_text = "";
// Initializing a spelling suggestion object based on probability
SuggestSpelling suggest = new SuggestSpelling("wordprobabilityDatabase.txt");
// get user input for correction
{
user_text = cleanString;
String[] words = user_text.split(" ");
int error = 0;
for (String word : words) {
if(!dictionary.contains(word)) {
checkWord(word);
dictionary.put(word, word);
}
suggestWord = true;
String outputWord = checkWord(word);
if (suggestWord) {
System.out.println("Suggestions for " + word + " are: " + suggest.correct(outputWord) + "\n");
error++;
}
}
if (error == 0) {
System.out.println("No mistakes found");
}
}
} catch (IOException e) {
e.printStackTrace();
System.exit(-1);
}
}
/**
* METHOD TO SPELL CHECK THE WORDS IN A STRING. IS USED IN SPELL CHECKER
* METHOD THROUGH THE "WORD" STRING
*/
public static String checkWord(String wordToCheck) {
String wordCheck;
String word = wordToCheck.toLowerCase();
if ((wordCheck = (String) dictionary.get(word)) != null) {
suggestWord = false; // no need to ask for suggestion for a correct
// word.
return wordCheck;
}
// If after all of these checks a word could not be corrected, return as
// a misspelled word.
return word;
}
}
There is a second class (SuggestSpelling.java) which holds a probability calculator but that isn't relevant right now, unless you planned on running the code for yourself.
Use a HashSet to detect duplicates:
Set<String> wordSet = new HashSet<>();
Store each word of the input sentence in it. If a word is already in the HashSet, don't call checkWord(String wordToCheck) for that word again. Something like this:
String[] words = // split input sentence into words
for(String word: words) {
if(!wordSet.contains(word)) {
checkWord(word);
// do stuff
wordSet.add(word);
}
}
Edit
// ....
{
user_text = cleanString;
String[] words = user_text.split(" ");
Set<String> wordSet = new HashSet<>();
int error = 0;
for (String word : words) {
// wordSet is a separate data structure. It's only for duplicate checking; don't mix it up with dictionary
if(!wordSet.contains(word)) {
// put all your logic here
wordSet.add(word);
}
}
if (error == 0) {
System.out.println("No mistakes found");
}
}
// ....
You have other bugs as well. For example, you pass String wordCheck as the argument of checkWord() and then re-declare String wordCheck; inside checkWord(), which is not right. Please check the other parts as well.
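If a count per misspelled word is wanted, as the question mentions, a Map can play the role of the duplicate set and the counter at once. This is a minimal sketch under the assumption that the dictionary is available as a Set of lower-case words; MisspellingCounter and countMisspellings are hypothetical names, not part of the posted program:

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

class MisspellingCounter {
    // Maps each word missing from the dictionary to the number of times it occurs.
    // LinkedHashMap keeps the words in first-seen order for printing.
    static Map<String, Integer> countMisspellings(String[] words, Set<String> dictionary) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String w : words) {
            String word = w.toLowerCase(Locale.ROOT);
            if (!word.isEmpty() && !dictionary.contains(word)) {
                counts.merge(word, 1, Integer::sum); // insert 1 or add 1 to the existing count
            }
        }
        return counts;
    }
}
```

Each entry can then be printed once, for example as "You entered 'teh' 2 times", before asking for a suggestion.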
I am having issues with my synonym map. I want to search a text file for a keyword, or a word related to it, and then output the matching answer. My program answers questions based on the keyword or a synonym: it searches the text file for the keyword in the question and prints the answer, which is the line right after the question in the file. When I search using the main keyword, the program works, but when I ask a question using a related word, the program does not recognize the input. For example, if I enter "how is the major?" I get the answer from the next line, "the major is difficult", but if I enter "how is the focus" the program does not recognize the related word focus. Can someone help me find the issue with searching for a related word? Here is my text file:
what is the textbook name?
the textbook name is Java
how is the major?
the major is difficult
how much did the shoes cost?
the shoes cost two dollars
how is the major when cramer took it?
when cramer took it, it was okay
how is the major when jar took it?
jar said it was fine
what is the color of my bag?
the color of my bag is blue
and here is my code
public static class DicEntry {
String key;
String[] syns;
Pattern pattern;
public DicEntry(String key, String... syns) {
this.key = key;
this.syns = syns;
pattern = Pattern.compile(".*(?:"
+ Stream.concat(Stream.of(key), Stream.of(syns))
.map(x -> "\\b" + Pattern.quote(x) + "\\b")
.collect(Collectors.joining("|")) + ").*");
}
}
public static void parseFile(String s) throws IOException {
List<DicEntry> synonymMap = populateSynonymMap(); // populate the map
File file = new File("data.txt");
Scanner scanner = new Scanner(file);
Scanner forget = new Scanner(System.in);
int flag_found = 0;
while (scanner.hasNextLine()) {
final String lineFromFile = scanner.nextLine();
for (DicEntry entry : synonymMap) { // iterate over each word of the
// sentence.
if (entry.pattern.matcher(s).matches()) {
if (lineFromFile.contains(entry.key)) {
//String bat = entry.key;
if(lineFromFile.contains(s)) {
String temp = scanner.nextLine();
System.out.println(temp);
}
}
}
}
}
}
private static List<DicEntry> populateSynonymMap() {
List<DicEntry> responses = new ArrayList<>();
responses.add(new DicEntry("bag", "purse", "black"));
responses.add(new DicEntry("shoe", "heels", "gas"));
responses.add(new DicEntry("major", "discipline", "focus", "study"));
return responses;
}
public static void getinput() throws IOException {
Scanner scanner = new Scanner(System.in);
String input = null;
/* End Initialization */
System.out.println("Welcome ");
System.out.println("What would you like to know?");
System.out.print("> ");
input = scanner.nextLine().toLowerCase();
parseFile(input);
}
public static void main(String args[]) throws ParseException, IOException {
/* Initialization */
getinput();
}
}
It would seem that after you pass
if (lineFromFile.contains(entry.key))
in your parseFile(String s) method, you would want to check whether the user's input contains any of the entry.syns and, if so, replace the synonym with the key:
// This is case sensitive
boolean synonymFound = false;
for (String synonym : entry.syns) {
if (s.contains(synonym)) {
s = s.replace(synonym, entry.key);
synonymFound = true;
break;
}
}
Since you want to stop searching once you find a match (exact or synonym match), you'll want a return statement to leave the method, or a flag to break out of the while (scanner.hasNextLine()) loop:
if (lineFromFile.contains(s)) {
String temp = scanner.nextLine();
System.out.println(temp);
flag_found = 1;
System.out
.println(" Would you like to update this information ? ");
String yellow = forget.nextLine();
if (yellow.equals("yes")) {
// String black = scanner.nextLine();
removedata(temp);
} else if (yellow.equals("no")) {
System.out.println("Have a good day");
// break;
}
// Add return statement to end the search
return;
}
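The synonym replacement can also be done once, before the file is scanned, so that the rest of the search only ever sees key words. This is a sketch of that idea, assuming a map from each key to its synonyms; SynonymNormalizer and normalize are hypothetical names, and word boundaries are used so that a synonym inside a longer word is left alone:

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SynonymNormalizer {
    // Replaces every whole-word synonym in the question with its key word.
    static String normalize(String question, Map<String, String[]> synonymsByKey) {
        String result = question;
        for (Map.Entry<String, String[]> entry : synonymsByKey.entrySet()) {
            for (String synonym : entry.getValue()) {
                result = result.replaceAll(
                        "\\b" + Pattern.quote(synonym) + "\\b",
                        Matcher.quoteReplacement(entry.getKey()));
            }
        }
        return result;
    }
}
```

With "major" mapped to "discipline", "focus" and "study", the input "how is the focus?" becomes "how is the major?", which then matches the question line in the file. Like the loop above, this is case sensitive.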
I am importing a large list of words and need to write code that recognizes each word in the file. I am using a delimiter to separate the words, but I am getting a (suppressed) warning stating that the values of lineNumber and delimiter are not used. What do I need to do to get the program to read this file and separate each word within it?
public class ASCIIPrime {
public final static String LOC = "C:\\english1.txt";
@SuppressWarnings("null")
public static void main(String[] args) throws IOException {
//import list of words
@SuppressWarnings("resource")
BufferedReader File = new BufferedReader(new FileReader(LOC));
//Create a temporary ArrayList to store data
ArrayList<String> temp = new ArrayList<String>();
//Find number of lines in txt file
String line;
while ((line = File.readLine()) != null)
{
temp.add(line);
}
//Identify each word in file
int lineNumber = 0;
lineNumber++;
String delimiter = "\t";
//assess each character in the word to determine the ascii value
int total = 0;
for (int i=0; i < ((String) line).length(); i++)
{
char c = ((String) line).charAt(i);
total += c;
}
System.out.println ("The total value of " + line + " is " + total);
}
}
This smells like homework, but alright.
Importing a large list of words and I need to create code that will recognize each word in the file. What do I need to do to get the program to read this file and to separate each word within that file?
You need to...
Read the file
Separate the words from what you've read in
... I don't know what you want to do with them after that. I'll just dump them into a big list.
The contents of my main method would be...
BufferedReader File = new BufferedReader(new FileReader(LOC));//LOC is defined as class variable
//Create an ArrayList to store the words
List<String> words = new ArrayList<String>();
String line;
String delimiter = "\t";
while ((line = File.readLine()) != null)//read the file
{
String[] wordsInLine = line.split(delimiter);//separate the words
//delimiter could be a regex here, gotta watch out for that
for(int i=0, isize = wordsInLine.length; i < isize; i++){
words.add(wordsInLine[i]);//put them in a list
}
}
You can use the split method of the String class
String[] split(String regex)
This returns an array of strings that you can handle directly or transform into any other collection you might need.
I also suggest removing the @SuppressWarnings annotations unless you are sure of what you are doing. In most cases it is better to remove the cause of the warning than to suppress the warning.
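Putting split() together with the per-character sum from the question might look like the sketch below. It assumes tab-separated words on each line, as in the question; AsciiTotals, asciiTotal and totalsForLine are hypothetical names. The delimiter is passed through Pattern.quote because split() treats its argument as a regular expression:

```java
import java.util.regex.Pattern;

class AsciiTotals {
    // Sums the character codes of one word.
    static int asciiTotal(String word) {
        int total = 0;
        for (int i = 0; i < word.length(); i++) {
            total += word.charAt(i);
        }
        return total;
    }

    // Splits a line on a literal delimiter and returns one total per word.
    static int[] totalsForLine(String line, String delimiter) {
        String[] words = line.split(Pattern.quote(delimiter));
        int[] totals = new int[words.length];
        for (int i = 0; i < words.length; i++) {
            totals[i] = asciiTotal(words[i]);
        }
        return totals;
    }
}
```

Calling totalsForLine(line, "\t") inside the readLine() loop uses both the line and the delimiter, so neither is left unused.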
I used this great tutorial from thenewboston when I started off reading files: https://www.youtube.com/watch?v=3RNYUKxAgmw
This video seems perfect for you. It covers how to read words of data from a file. Then just add the string data to the ArrayList. Here's what your code should look like:
import java.io.*;
import java.util.*;
public class ReadFile {
static Scanner x;
static ArrayList<String> temp = new ArrayList<String>();
public static void main(String args[]){
openFile();
readFile();
closeFile();
}
public static void openFile(){
try{
x = new Scanner(new File("yourtextfile.txt"));
}catch(Exception e){
System.out.println(e);
}
}
public static void readFile(){
while(x.hasNext()){
temp.add(x.next());
}
}
public static void closeFile(){
x.close();
}
}
One nice thing about java.util.Scanner is that it automatically skips the whitespace between words, which makes it easy to pick out individual words.
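The same idea can be written more compactly with try-with-resources, which closes the Scanner even if reading fails. This is a sketch rather than the tutorial's code; WordReader and readWords are hypothetical names, and the method accepts any Readable so that it works with a FileReader as well as an in-memory string:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class WordReader {
    // Reads whitespace-separated words from the source into a list.
    static List<String> readWords(Readable source) {
        List<String> words = new ArrayList<>();
        try (Scanner in = new Scanner(source)) { // closed automatically, along with the source
            while (in.hasNext()) {
                words.add(in.next());
            }
        }
        return words;
    }
}
```

For a file, the call would be readWords(new FileReader("yourtextfile.txt")), with FileNotFoundException handled by the caller.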
My bad for the title, I am usually not good at making those.
I have a programme that generates all permutations of an inputted word and is supposed to check whether they are real words (by checking a dictionary) and output the ones that are. Really I just need the last part, and I cannot figure out how to parse through a file.
I took out what was there (the code now shows the "String words =" line instead) because it really made things worse (it was an if statement). Right now, all it does is output all permutations.
Edit: I should add that the try/catch was added when I tried turning the file into a list (as opposed to the string format it is currently in). So right now it does nothing.
One more thing: is it possible (well, how, really) to also display permutations with fewer characters than were entered? Sorry for the bad wording. For example, if I enter five characters, show all five-character permutations, and the four-, three-, two- and one-character ones as well.
import java.util.List;
import java.util.Scanner;
import java.io.BufferedReader;
import java.io.File;
import java.io.InputStreamReader;
import java.io.IOException;
import org.apache.commons.io.FileUtils;
import static java.lang.System.out;
public class Permutations
{
public static void main(String[] args) throws Exception
{
out.println("Enter anything to get permutations: ");
Scanner scan = new Scanner(System.in);
String io = scan.nextLine();
String str = io;
StringBuffer strBuf = new StringBuffer(str);
mutate(strBuf,str.length());
}
private static void mutate(StringBuffer str, int index)
{
try
{
String words = FileUtils.readFileToString(new File("wordsEn.txt"));
if(index <= 0)
{
out.println(str);
}
else
{
mutate(str, index - 1);
int currLoc = str.length()-index;
for (int i = currLoc + 1; i < str.length(); i++)
{
change(str, currLoc, i);
mutate(str, index - 1);
change(str, i, currLoc);
}
}
}
catch(IOException e)
{
out.println("Your search found no results");
}
}
private static void change(StringBuffer str, int loc1, int loc2)
{
char t1 = str.charAt(loc1);
str.setCharAt(loc1, str.charAt(loc2));
str.setCharAt(loc2, t1);
}
}
If each word in your file is actually on a different line, maybe you can try this:
BufferedReader br = new BufferedReader(new FileReader(file));
String line = null;
while ((line = br.readLine()) != null)
{
... // check and print here
}
Or if you want to try something else, the Apache Commons IO library has something called LineIterator.
An Iterator over the lines in a Reader.
LineIterator holds a reference to an open Reader. When you have finished with the iterator you should close the reader to free internal resources. This can be done by closing the reader directly, or by calling the close() or closeQuietly(LineIterator) method on the iterator.
The recommended usage pattern is:
LineIterator it = FileUtils.lineIterator(file, "UTF-8");
try {
while (it.hasNext()) {
String line = it.nextLine();
// do something with line
}
} finally {
it.close();
}
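Once the lines of the word file have been collected into a Set, the dictionary check and the shorter permutations asked about in the question can be handled in one recursive pass. This is a sketch under the assumption that the file holds one lower-case word per line; SubPermutations and findWords are hypothetical names, and it builds prefixes recursively instead of swapping characters in a StringBuffer:

```java
import java.util.Set;
import java.util.TreeSet;

class SubPermutations {
    // Returns every dictionary word that can be built from some or all of the letters.
    static Set<String> findWords(String letters, Set<String> dictionary) {
        Set<String> found = new TreeSet<>(); // sorted, and duplicates collapse
        permute("", letters, dictionary, found);
        return found;
    }

    private static void permute(String prefix, String remaining,
                                Set<String> dictionary, Set<String> found) {
        // Every prefix is itself a shorter permutation, so check it before extending it.
        if (!prefix.isEmpty() && dictionary.contains(prefix)) {
            found.add(prefix);
        }
        for (int i = 0; i < remaining.length(); i++) {
            permute(prefix + remaining.charAt(i),
                    remaining.substring(0, i) + remaining.substring(i + 1),
                    dictionary, found);
        }
    }
}
```

The number of permutations grows factorially with the input length, so this is only practical for short inputs.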