compare one file with three other files

I am writing a larger Java program for poem analysis. I have one text file containing a poem and three text files containing word lists, and I want to compare the poem against each of the three lists. The program should report, for example: this poem has three words from the list MomentoMori, 0 words from the list vanitas and 0 words from the list carpe diem.
My problem is: I know how to read files in Java, but I don't know how to compare them.
I thought I should convert the text files into strings and then compare those, but I don't know how. I tried something, but it only compares the words of the first line of the poem with the first word of the word list.
Can anybody help me? It's only one percent of my program, but this step is very important.
Thank you all
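
One possible approach, as a minimal sketch rather than a definitive solution: split the poem into lowercase words, load each word list into a set, and count the overlap. The sketch assumes Java 11 or later, one word per line in each list file, and that "three words from the list" means three distinct words. The class name, method names, and file names are all hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

final class PoemWordLists {

    // Lowercases the text and splits it on anything that is not a letter,
    // so "Death," and "death" count as the same word.
    static List<String> tokenize(String text) {
        List<String> words = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (!token.isEmpty()) {
                words.add(token);
            }
        }
        return words;
    }

    // Counts how many distinct poem words also appear in the word list.
    static int countMatches(Collection<String> poemWords, Collection<String> wordList) {
        Set<String> list = new HashSet<>();
        for (String entry : wordList) {
            list.add(entry.trim().toLowerCase(Locale.ROOT));
        }
        Set<String> found = new HashSet<>(poemWords);
        found.retainAll(list); // keep only the words present in both
        return found.size();
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical file names; adjust to the real ones.
        List<String> poem = tokenize(Files.readString(Path.of("poem.txt")));
        for (String name : List.of("MomentoMori.txt", "vanitas.txt", "carpediem.txt")) {
            int count = countMatches(poem, Files.readAllLines(Path.of(name)));
            System.out.println("This poem has " + count + " words from the list " + name);
        }
    }
}
```

Because the poem is split into single words, a list entry made of several words (such as "carpe diem") would never match in this sketch; such phrases would need a separate check, for example with String.contains on the whole poem text.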

Related

How can I find if a word in a text file also present in another text file?

I have been asked to find whether any of the words in a text file are present in another text file. One of the files contains 10 random words on separate lines which form a 10x10 grid whereas the other file contains 2656 words on separate lines. I need to find if any of the words from the 2656 are a part of any of the 10. It is almost like a word search where I need to compare each of the 2656 words to every line and column of the 10 words in the other text file.
I know how to import the text files and use "try" and "catch" to do so, but I am stuck from there. I know that I need to create two loops, one for the rows and one for the columns. I also need to convert each of the 10 lines into a string so I can use "String.contains(string)" to compare each of the 2656 words with the 10 lines and see if they match. The 10 words can be thought of as a 10x10 grid.
An example of the 10 words might be:
fndgsdgawe
fjshellofj
fslkdfmkls
sfmkbyefkf
fsmflsfmkl
sfmJavadfm
smfjknmfkj
gjforloopj
mgfslgmlgs
gsnmgkjnsg
An example of the 2656 words might be:
Hello
Bye
Java
ForLoop
NestedLoop
As the output, I need to do it in the format of:
Hello: row 1, position 3
The "matching word" will be the word that matches in both files, the row number will correspond to the row of the 10x10 grid of words that it's found in and the same with the position which can be thought of as the column. Both the row and position start from index 0. I need to use trim() to remove trailing and leading spaces and all occurrences that happen more than once need to be output on a separate line. I am very new at coding and understand the logic behind working it out but I am unable to write it. Can you please help a beginner out? Thanks!
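
A minimal sketch of the row and column search described above, assuming that matching ignores case (the sample has "Hello" in the word list but "hello" in the grid) and that all grid rows have the same length. It uses indexOf in a loop rather than contains, because indexOf also gives the position and can find repeated occurrences. The class and method names are hypothetical, and the "column" output label is an assumption, since the question only shows the format for rows.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class GridWordSearch {

    // Builds the columns of the grid as strings, read top to bottom.
    // Assumes every row has the same length.
    static List<String> columns(List<String> rows) {
        List<String> cols = new ArrayList<>();
        for (int c = 0; c < rows.get(0).length(); c++) {
            StringBuilder sb = new StringBuilder();
            for (String row : rows) {
                sb.append(row.charAt(c));
            }
            cols.add(sb.toString());
        }
        return cols;
    }

    // Reports every occurrence of the word in the given lines, ignoring case.
    // Line index and position both start at 0.
    static List<String> find(String word, List<String> lines, String label) {
        List<String> hits = new ArrayList<>();
        String needle = word.trim().toLowerCase(Locale.ROOT);
        if (needle.isEmpty()) {
            return hits;
        }
        for (int i = 0; i < lines.size(); i++) {
            String hay = lines.get(i).trim().toLowerCase(Locale.ROOT);
            // indexOf with a start offset finds repeated occurrences in one line.
            for (int pos = hay.indexOf(needle); pos >= 0; pos = hay.indexOf(needle, pos + 1)) {
                hits.add(word.trim() + ": " + label + " " + i + ", position " + pos);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Sample data copied from the question; real code would read both files.
        List<String> grid = List.of(
                "fndgsdgawe", "fjshellofj", "fslkdfmkls", "sfmkbyefkf", "fsmflsfmkl",
                "sfmJavadfm", "smfjknmfkj", "gjforloopj", "mgfslgmlgs", "gsnmgkjnsg");
        List<String> words = List.of("Hello", "Bye", "Java", "ForLoop", "NestedLoop");
        List<String> cols = columns(grid);
        for (String word : words) {
            find(word, grid, "row").forEach(System.out::println);
            find(word, cols, "column").forEach(System.out::println);
        }
    }
}
```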

How to properly manage what has been read from a .txt

I've been assigned to read from a .txt file and build a structure from what I've read. This is an example of what I should read:
Name1$Surname1$Programming&5.0#Mathematics&6.5#Algebra&7.2#History&6.7#Biology&6.9
I have no problem reading the first two strings, but from that point on I don't know how to split the rest properly and create new objects from it.
Any tips or suggestions on how to do it, please?

Weird structure.
1. Split at '$'.
2. The first element of that split is the Name, the second the Surname.
3. Split the third element at '#'.
4. Split each element of the result of step 3 again at '&' to get course and grade.
See here how to split strings.
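
The splitting steps above can be sketched as follows. One detail worth knowing: String.split takes a regular expression, and '$' is a regex metacharacter, so it has to be quoted. The Student class and its fields are hypothetical stand-ins for whatever structure the assignment requires.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

final class StudentLineParser {

    // Hypothetical holder for one parsed line.
    static final class Student {
        final String name;
        final String surname;
        final Map<String, Double> grades; // course -> grade, in file order

        Student(String name, String surname, Map<String, Double> grades) {
            this.name = name;
            this.surname = surname;
            this.grades = grades;
        }
    }

    static Student parse(String line) {
        // split() takes a regex and '$' means "end of input" there, so quote it.
        String[] parts = line.split(Pattern.quote("$"));
        Map<String, Double> grades = new LinkedHashMap<>();
        for (String entry : parts[2].split("#")) {   // e.g. "Algebra&7.2"
            String[] pair = entry.split("&");        // course, grade
            grades.put(pair[0], Double.parseDouble(pair[1]));
        }
        return new Student(parts[0], parts[1], grades);
    }

    public static void main(String[] args) {
        Student s = parse("Name1$Surname1$Programming&5.0#Mathematics&6.5"
                + "#Algebra&7.2#History&6.7#Biology&6.9");
        System.out.println(s.name + " " + s.surname + " " + s.grades);
    }
}
```

The sketch does no validation; a line with fewer than three '$'-separated parts or a non-numeric grade would throw an exception.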

Finding the 5 most common words in a text

I have a school task, part of which asks us to write a method that finds the 5 most common words in a .txt file.
The task asks us to put all the words in an ArrayList, which I have already done. The real problem is making the program print out the top 5 words in the text file.
The only "clue" I have is the method name, which is:
public Words[] common5(){
}

Iterate through the ArrayList and, for each word in the list, put the word into a HashMap where the key is the word and the value is an Integer that you increase every time you find the word again. At the end, iterate through the HashMap and find the 5 entries with the highest counts. Print those words.
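
A minimal sketch of that counting approach. The Words type from the task's method signature is not shown in the question, so this version takes and returns plain strings; that is an assumption, as are the class name and the alphabetical tie-break for words with equal counts.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class CommonWords {

    // Returns up to five words, most frequent first.
    // Ties are broken alphabetically so the result is deterministic.
    static List<String> common5(List<String> words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) {
            counts.merge(word, 1, Integer::sum); // insert 1 or add 1 to the old count
        }

        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue()); // descending
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });

        List<String> top = new ArrayList<>();
        for (int i = 0; i < Math.min(5, entries.size()); i++) {
            top.add(entries.get(i).getKey());
        }
        return top;
    }

    public static void main(String[] args) {
        List<String> words = List.of("the", "cat", "the", "dog", "the", "cat", "a", "b", "c", "d");
        System.out.println(common5(words));
    }
}
```

Sorting all entries is simple and fine for a school-sized text; for very large vocabularies a bounded priority queue would avoid sorting everything.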

Searching through a text file for a string?

I have a text file with thousands and thousands of lines of gibberish. Hidden somewhere inside is a string of words in English.
What would be the most efficient way to search through the text without having to read it line by line?
Is there a script I could write to read through the file?
I can post the file if anyone's interested.
edit: If someone would be willing to show me how to check for words with a BufferedReader in Java, that would be really cool!

If you know nothing more than that there is one streak of valid English words somewhere in the file, you will have to read in the file and check each word against a set of valid words (a dictionary). On the first hit, you continue to read the file until the first non-valid word occurs.
This assumes there are no accidentally valid words within the gibberish. If there are, you would have to find all streaks of valid words and then probably have a human (you) decide which is the right one.
edit: another thing you can do is define a minimum streak length n, if you know that the string of words you are looking for consists of a minimum of n valid words. This could at least spare you from dealing with all the false-positive 1-word streaks of single accidentally valid words within the gibberish.
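
A minimal sketch of the streak search with a BufferedReader, as asked for in the question's edit. It assumes words are separated by whitespace, that a streak may continue across line breaks, and that the dictionary holds lowercase words. The class name, the tiny dictionary, and the file name in main are hypothetical; a real dictionary would be loaded from a word list file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

final class StreakFinder {

    // Returns every run of at least minLength consecutive dictionary words.
    static List<String> findStreaks(BufferedReader in, Set<String> dictionary, int minLength) {
        List<String> streaks = new ArrayList<>();
        List<String> current = new ArrayList<>();
        try {
            String line;
            while ((line = in.readLine()) != null) {
                for (String token : line.trim().split("\\s+")) {
                    if (token.isEmpty()) {
                        continue; // blank line
                    }
                    if (dictionary.contains(token.toLowerCase(Locale.ROOT))) {
                        current.add(token);
                    } else {
                        flush(current, minLength, streaks); // streak ended
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        flush(current, minLength, streaks); // streak running up to end of file
        return streaks;
    }

    // Stores the current streak if it is long enough, then resets it.
    private static void flush(List<String> current, int minLength, List<String> streaks) {
        if (current.size() >= minLength) {
            streaks.add(String.join(" ", current));
        }
        current.clear();
    }

    public static void main(String[] args) throws IOException {
        Set<String> dictionary = Set.of("the", "cat", "sat", "on", "mat"); // hypothetical
        try (BufferedReader in = Files.newBufferedReader(Path.of("gibberish.txt"))) {
            findStreaks(in, dictionary, 3).forEach(System.out::println);
        }
    }
}
```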

Java parsing text file

I need to write a parser for text files (at least 20 KB each), and I need to determine whether any words from a set of about 400 words and numbers appear in the file. So I am looking for the most efficient way to do this (if a match is found, I need to do some further processing of that line and its previous line).
What I currently do is exclude lines that certainly contain no useful information (metadata lines of a sort) and then compare word by word, but I don't think that comparing word by word is the most efficient way.
Can anyone please provide some tips, hints or ideas?
Thank you very much

It depends on what you mean by "efficient".
If you want a very straightforward way to code it, keep in mind that the String class in Java has the method String.contains(CharSequence sequence).
You could then put the file content into a String and iterate over the keywords you want to check, using contains() to see whether any of them appear in the String.
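
A minimal sketch of this contains() approach, assuming Java 11 or later for Files.readString. The class name, method name, keywords, and file name are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

final class ContainsSearch {

    // Returns the keywords that occur anywhere in the content.
    static List<String> keywordsIn(String content, List<String> keywords) {
        List<String> found = new ArrayList<>();
        for (String keyword : keywords) {
            if (content.contains(keyword)) {
                found.add(keyword);
            }
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        String content = Files.readString(Path.of("input.txt")); // hypothetical file
        System.out.println(keywordsIn(content, List.of("timeout", "42")));
    }
}
```

Two caveats: contains() is case-sensitive and matches substrings, so the keyword "cat" would also be found inside "concatenate". It also gives no line information, so it does not by itself cover the "process this line and the previous one" requirement.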

How about the following:
Put all your keywords in a HashSet (Set<String> keywords;)
Read the file one line at a time
For each line in file:
    Tokenize to words
    For each word in line:
        If word is contained in keywords (keywords.contains(word)):
            Process actual line
            If previous line is available:
                Process previous line
    Keep track of previous line (prevLine = line;)
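
The outline above could look roughly like this in Java. This is a sketch under a few assumptions: words are separated by whitespace, "processing" is represented by collecting the matching line together with its previous line, and each line is processed at most once even if it contains several keywords (the outline does not say either way). The class name and the file name in main are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class KeywordLineScanner {

    // Returns one {previousLine, matchingLine} pair per line that contains
    // a keyword; previousLine is null when the match is on the first line.
    static List<String[]> scan(BufferedReader in, Set<String> keywords) {
        List<String[]> hits = new ArrayList<>();
        String prevLine = null;
        try {
            String line;
            while ((line = in.readLine()) != null) {
                for (String word : line.split("\\s+")) {
                    if (keywords.contains(word)) {   // O(1) lookup in a HashSet
                        hits.add(new String[] { prevLine, line });
                        break;                       // process each line only once
                    }
                }
                prevLine = line;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        Set<String> keywords = Set.of("timeout", "42"); // hypothetical keywords
        try (BufferedReader in = Files.newBufferedReader(Path.of("input.txt"))) {
            for (String[] hit : scan(in, keywords)) {
                System.out.println("previous: " + hit[0] + " | match: " + hit[1]);
            }
        }
    }
}
```

With a HashSet, the cost per word is a single hash lookup regardless of whether there are 400 keywords or 40,000, which is the main advantage over calling contains() once per keyword.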
