How Can I Read the Movie Names Only? [closed] - java

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 5 years ago.
I have data like this:
1|Toy Story (1995)|01-Jan-1995||http://us.imdb.com/M/title-exact?Toy%20Story%20(1995)|0|0|0|1|1|1|0|0|0|0|0|0|0|0|0|0|0|0|0
2|GoldenEye (1995)|01-Jan-1995||http://us.imdb.com/M/title-exact?GoldenEye%20(1995)|0|1|1|0|0|0|0|0|0|0|0|0|0|0|0|0|1|0|0
3|Four Rooms (1995)|01-Jan-1995||http://us.imdb.com/M/title-exact?Four%20Rooms%20(1995)|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|1|0|0
4|Get Shorty (1995)|01-Jan-1995||http://us.imdb.com/M/title-exact?Get%20Shorty%20(1995)|0|1|0|0|0|1|0|0|1|0|0|0|0|0|0|0|0|0|0
5|Copycat (1995)|01-Jan-1995||http://us.imdb.com/M/title-exact?Copycat%20(1995)|0|0|0|0|0|0|1|0|1|0|0|0|0|0|0|0|1|0|0
Suppose the link part is on the same line as the movie name part. I am
only interested in the movie numbers in the leftmost part and the movie names.
How can I read this file in Java and return something like:
1|Toy Story
2|GoldenEye
Thanks for helping in advance.

Pretty easy: just split on " (" and remember to escape the parenthesis with \\, because String.split takes a regular expression.
public static void main(String[] args) {
String result = movie("1|Toy Story (1995)|01-Jan-1995||http://us.imdb.com/M/title-exact?Toy%20Story%20(1995)|0|0|0|1|1|1|0|0|0|0|0|0|0|0|0|0|0|0|0");
System.out.println(result); //prints 1|Toy Story
}
public static String movie(String movieString){
return movieString.split(" \\(")[0];
}

You can use regular expressions to extract the part that you want.
It is assumed that a movie title only contains word characters or spaces.
List<String> movieInfos = Arrays.asList(
"1|Toy Story (1995)|01-Jan-1995||http://us.imdb.com/M/title-exact?Toy%20Story%20(1995)|0|0|0|1|1|1|0|0|0|0|0|0|0|0|0|0|0|0|0",
"2|GoldenEye (1995)|01-Jan-1995||http://us.imdb.com/M/title-exact?GoldenEye%20(1995)|0|1|1|0|0|0|0|0|0|0|0|0|0|0|0|0|1|0|0",
"3|Four Rooms (1995)|01-Jan-1995||http://us.imdb.com/M/title-exact?Four%20Rooms%20(1995)|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|1|0|0",
"4|Get Shorty (1995)|01-Jan-1995||http://us.imdb.com/M/title-exact?Get%20Shorty%20(1995)|0|1|0|0|0|1|0|0|1|0|0|0|0|0|0|0|0|0|0",
"5|Copycat (1995)|01-Jan-1995||http://us.imdb.com/M/title-exact?Copycat%20(1995)|0|0|0|0|0|0|1|0|1|0|0|0|0|0|0|0|1|0|0"
);
Pattern pattern = Pattern.compile("^(\\d+)\\|([\\w\\s]+) \\(\\d{4}\\).*$");
for (String movieInfo : movieInfos) {
Matcher matcher = pattern.matcher(movieInfo);
if (matcher.matches()) {
String id = matcher.group(1);
String title = matcher.group(2);
System.out.println(String.format("%s|%s", id, title));
} else {
System.out.println("Unexpected data");
}
}
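The pattern above rejects titles that contain punctuation (commas, colons, apostrophes). As a hedged sketch, the title group can instead accept anything except the | separator and stop at the " (YYYY)|" suffix. The class and method names (MovieLineParser, idAndTitle) are made up for illustration, and the assumption that every title ends in a four-digit year in parentheses comes only from the sample data.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MovieLineParser {
    // Group 1: numeric id. Group 2: title, matched lazily within the
    // second field up to the " (YYYY)|" suffix (assumed layout).
    private static final Pattern LINE =
            Pattern.compile("^(\\d+)\\|([^|]+?) \\(\\d{4}\\)\\|");

    /** Returns "id|title", or empty if the line does not fit the assumed layout. */
    public static Optional<String> idAndTitle(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(m.group(1) + "|" + m.group(2));
    }

    public static void main(String[] args) {
        String line = "1|Toy Story (1995)|01-Jan-1995||"
                + "http://us.imdb.com/M/title-exact?Toy%20Story%20(1995)|0|0|0";
        System.out.println(idAndTitle(line).orElse("Unexpected data"));
    }
}
```

Using find() with an anchored prefix means the rest of the line (date, URL, flags) never has to be described by the pattern.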

This works only if all the lines are formatted like that.
private static final String FILENAME = "pathToFile";
public static void main(String[] args) {
BufferedReader br = null;
FileReader fr = null;
ArrayList<String> output = new ArrayList<>();
try {
//br = new BufferedReader(new FileReader(FILENAME));
fr = new FileReader(FILENAME);
br = new BufferedReader(fr);
String currentLine;
while ((currentLine= br.readLine()) != null) {
String movie = currentLine.split(" \\(")[0];
output.add(movie);
}
} catch (IOException e) {
e.printStackTrace();
} finally {
try {
if (br != null)
br.close();
if (fr != null)
fr.close();
} catch (IOException ex) {
ex.printStackTrace();
}
}
}

Assuming the file format is the same as the one you have given, read the file line by line, split each line on the "(" parenthesis, and print the first element of the resulting array.
static void readMovieNamesFromFile(String fileName) {
try (BufferedReader br = new BufferedReader(new FileReader(new File(fileName)))) {
String line;
while( (line = br.readLine()) != null){
System.out.println((line.split("\\(")[0]).trim());
}
} catch (IOException e) {
e.printStackTrace();
}
}

Assuming you are reading t.txt
File file = new File("t.txt");
try {
Scanner in = new Scanner(file);
while(in.hasNextLine())
{
String arr[] = in.nextLine().split("\\|");
if(arr.length > 1)
{
System.out.println(arr[0] +"|"+arr[1].split("\\(")[0]);
System.out.println();
}
}
} catch (FileNotFoundException e) {
e.printStackTrace();
}
Will give you as an output
1|Toy Story
2|GoldenEye
3|Four Rooms
4|Get Shorty
5|Copycat
There are two things to take care of here.
(Here we assume we are reading the first line.)
Split by |. Since | is a regex metacharacter, you have to escape it with \\. Hence in.nextLine().split("\\|");
Now arr[0] will contain 1 and arr[1] will contain Toy Story (1995). So we split arr[1] on "(". You need the first part, hence arr[1].split("\\(")[0] (you again have to escape it, as "(" is also a metacharacter).
PS: the if(arr.length > 1) check is there to skip blank lines so that you don't end up with an ArrayIndexOutOfBoundsException.
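The two-step split described above can be wrapped in a small helper. This is only a sketch: the names MovieTitles, idAndTitle and readIdsAndTitles are hypothetical, and it takes a Reader (rather than a file name) so it can be tried without a file on disk. It also trims the space left before the "(", which the plain split keeps.

```java
import java.io.BufferedReader;
import java.io.Reader;
import java.io.StringReader;
import java.util.List;
import java.util.stream.Collectors;

public class MovieTitles {
    /** Turns "1|Toy Story (1995)|..." into "1|Toy Story". */
    public static String idAndTitle(String line) {
        String[] fields = line.split("\\|", 3);   // id, title, everything else
        String title = fields[1];
        int paren = title.lastIndexOf(" (");       // start of the " (1995)" suffix
        if (paren >= 0) {
            title = title.substring(0, paren);
        }
        return fields[0] + "|" + title.trim();
    }

    /** Reads all lines and keeps only those that contain a | separator. */
    public static List<String> readIdsAndTitles(Reader source) {
        return new BufferedReader(source).lines()
                .filter(line -> line.indexOf('|') >= 0) // skip blank lines
                .map(MovieTitles::idAndTitle)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String data = "1|Toy Story (1995)|01-Jan-1995||url|0|0\n"
                + "\n"
                + "2|GoldenEye (1995)|01-Jan-1995||url|0|1\n";
        readIdsAndTitles(new StringReader(data)).forEach(System.out::println);
    }
}
```

For a real file, a reader from Files.newBufferedReader(Paths.get("t.txt")) could be passed in instead of the StringReader.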

You can store the data in a String, for example:
String name = "1|Toy Story (1995)"; // the movie data
Then loop over the characters and stop at the first "(":
for (int i = 0; i < name.length(); i++)
{
if (name.charAt(i) == '(') // stop reading when we reach the ( after the name
{ break; }
else
System.out.print(name.charAt(i));
}
There are other ways to solve it as well.

Related

Java: Searching for specific word in a text file

I currently have a large text file containing many of the most popular names. The user inputs a specific name and I'm trying to print the line that has that name. My problem is that if the user enters a name like Alex, every name that contains Alex, like Alexander, Alexis, or Alexia, gets printed when I only want Alex to be printed. What can I change in "if(line.contains(name)){" to fix this?
Each line contains info like the name, its popularity ranking, and the number of people with that name.
try {
line = reader.readLine();
while (line != null) {
if(line.contains(name)){
text += line;
line = reader.readLine();
}
line = reader.readLine();
}
}catch(Exception e){
System.out.println("Error");
}
System.out.println(text);
A shorthand would be to use Java 8 streams. Here is an example:
public class Test2 {
public static void main(String[] args) {
String fileName = "c://lines.txt";
String name = "nametosearch";
try (Stream<String> stream = Files.lines(Paths.get(fileName))) {
stream.filter(line -> line.contains(" " + name + " ")).forEach(System.out::println);
} catch (IOException e) {
e.printStackTrace();
}
}
}
You can use a regex with word boundaries for this task:
// Pattern.quote keeps special characters in the name from being read as regex syntax
final String regex = String.format("\\b%s\\b", Pattern.quote(name));
final Pattern pattern = Pattern.compile(regex);
final Matcher matcher = pattern.matcher(line);
// find() returns false when there is no match, so check it before using the match
if (matcher.find()) {
text += line;
}
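As a self-contained sketch of the word-boundary idea, the helper below filters a list of lines. The class name, method name, and the sample line layout ("name rank count") are assumptions for illustration, not the asker's real data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class WholeWordSearch {
    /** Returns the lines that contain name as a whole word. */
    public static List<String> linesWithWord(List<String> lines, String name) {
        // \b matches between a word character and a non-word character,
        // so "Alex" does not match inside "Alexander".
        Pattern p = Pattern.compile("\\b" + Pattern.quote(name) + "\\b");
        List<String> hits = new ArrayList<>();
        for (String line : lines) {
            if (p.matcher(line).find()) {
                hits.add(line);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Hypothetical layout: name, popularity rank, number of people
        List<String> lines = Arrays.asList(
                "Alex 12 3000", "Alexander 5 9000", "Alexis 40 1000");
        linesWithWord(lines, "Alex").forEach(System.out::println);
    }
}
```

Compiling the pattern once, outside the loop, avoids rebuilding it for every line of a large file.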
Replace line.contains(name) with line.equals(name). Note that this only matches when the line consists of the name alone.

Reading a text file (~90,000 words) and trying to add each word into an ArrayList of strings

My method read and prints the file, but I am having trouble adding each word to the ArrayList dict.
The reader reads the file one char at a time, so what I have written adds each char to dict: [c,a,t,d,o,g] when I want [cat,dog]. The text file has the words on their own line; how can I distinguish them?
My code so far:
public static List Dictionary() {
ArrayList <String> dict = new ArrayList <String>();
File inFile = new File("C:/Users/Aidan/Desktop/fua.txt");
FileReader ins = null;
try {
ins = new FileReader(inFile);
int ch;
while ((ch = ins.read()) != -1) {
System.out.print((char) ch);
dict.add((char) ch + "");
}
} catch (Exception e) {
System.out.println(e);
} finally {
try {
ins.close();
} catch (Exception e) {
}
}
return dict;
}
Please observe Java naming conventions, so readDictionary instead of Dictionary (which looks like a class name). Next, I would pass the fileName into the method (instead of hard-coding the path in your method). Instead of reinventing the wheel, I would use a Scanner. You can also use the try-with-resources instead of finally here (and the diamond operator). Like,
public static List<String> readDictionary(String fileName) {
List<String> dict = new ArrayList<>();
try (Scanner scan = new Scanner(new File(fileName))) {
while (scan.hasNext()) {
dict.add(scan.next());
}
} catch (Exception e) {
System.out.printf("Caught Exception: %s%n", e.getMessage());
e.printStackTrace();
}
return dict;
}
Alternatively, use a BufferedReader and split each word yourself. Like,
public static List<String> readDictionary(String fileName) {
List<String> dict = new ArrayList<>();
try (BufferedReader br = new BufferedReader(new FileReader(
new File(fileName)))) {
String line;
while ((line = br.readLine()) != null) {
if (!line.isEmpty()) {
Stream.of(line.split("\\s+"))
.forEachOrdered(word -> dict.add(word));
}
}
} catch (Exception e) {
System.out.printf("Caught Exception: %s%n", e.getMessage());
e.printStackTrace();
}
return dict;
}
But that is basically what the first example does.
Check out the answer here which shows how to use Scanner to get words from a file: Read next word in java.
Instead of printing out the words, you'd want to append them to an ArrayList.
As the read method of the FileReader can only read a single character at a time and that's not what you want, then I would suggest you use a Scanner to read the file.
ArrayList<String> dict = new ArrayList<>();
Scanner scanner = new Scanner(new File("C:/Users/Aidan/Desktop/fua.txt"));
while(scanner.hasNext()){
dict.add(scanner.next());
}
You can wrap your FileReader in a BufferedReader, which has a readLine() method that will get you an entire line (word) at a time. readLine() returns null when there are no more lines to read.
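A minimal sketch of that readLine() approach, assuming one word per line as the question describes. The class and method names are hypothetical, and the method takes a Reader so it can be tried with a StringReader; in the asker's code the source would be new FileReader(inFile).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class DictionaryReader {
    /** Reads one word per line, skipping blank lines. */
    public static List<String> readWords(Reader source) {
        List<String> dict = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            // readLine() returns null once the end of the input is reached
            while ((line = br.readLine()) != null) {
                line = line.trim();
                if (!line.isEmpty()) {
                    dict.add(line);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return dict;
    }

    public static void main(String[] args) {
        System.out.println(readWords(new StringReader("cat\ndog\n")));
    }
}
```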

Buffer Reader code to read input file

I have a text file named "message.txt" which is read using a BufferedReader. Each line of the text file contains both a "word" and a "meaning", as in this example:
"PS:Primary school"
where PS - word, Primary school - meaning
When the file is being read, each line is tokenized to "word" and "meaning" from ":".
If the "meaning" is equal to the given input string called "f_msg3", "f_msg3" is displayed on the text view called "txtView". Otherwise, it displays "f_msg" on the text view.
But the "if condition" is not working properly in this code. For example if "f_msg3" is equal to "Primary school", the output on the text view must be "Primary school". But it gives the output as "f_msg" but not "f_msg3". ("f_msg3" does not contain any unnecessary strings.)
Can someone explain where I have gone wrong?
try {
BufferedReader file = new BufferedReader(new InputStreamReader(getAssets().open("message.txt")));
String line = "";
while ((line = file.readLine()) != null) {
try {
/*separate the line into two strings at the ":" */
StringTokenizer tokens = new StringTokenizer(line, ":");
String word = tokens.nextToken();
String meaning = tokens.nextToken();
/*compare the given input with the meaning of the read line */
if(meaning.equalsIgnoreCase(f_msg3)) {
txtView.setText(f_msg3);
} else {
txtView.setText(f_msg);
}
} catch (Exception e) {
txtView.setText("Cannot break");
}
}
} catch (IOException e) {
txtView.setText("File not found");
}
Try this
............
meaning = meaning.replaceAll("\\s+", " ");
/*compare the given input with the meaning of the read line */
if(meaning.equalsIgnoreCase(f_msg3)) {
txtView.setText(f_msg3);
} else {
txtView.setText(f_msg);
}
............
Otherwise comment the else part, then it will work.
I don't see any obvious error in your code; maybe it is just a matter of cleaning the string (i.e. removing leading and trailing spaces, newlines and so on) before comparing it.
Try trimming meaning, e.g. like this:
...
String meaning = tokens.nextToken();
if(meaning != null) {
meaning = meaning.trim();
}
if(f_msg3.equalsIgnoreCase(meaning)) {
txtView.setText(f_msg3);
} else {
txtView.setText(f_msg);
}
...
StringTokenizer is a legacy class, and with ":" as the only delimiter it keeps any whitespace around the tokens (a likely cause of your error), so it may be simpler to use String.split:
String[] pair = line.split("\\s*\\:\\s*", 2);
if (pair.length == 2) {
String word = pair[0];
String meaning = pair[1];
...
}
This splits the line into at most 2 parts (the optional second parameter) using a regular expression. \s* matches zero or more whitespace characters, such as tabs and spaces, so whitespace around the ":" is dropped.
You could also load all in a Properties. In a properties file the format key=value is convention, but also key:value is allowed. However then some escaping might be needed.
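The Properties suggestion could look roughly like the sketch below. The class name MeaningLookup and the second sample line are assumptions; only "PS:Primary school" comes from the question. The text is loaded from a String here; in the Android code the reader would wrap getAssets().open("message.txt") instead.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class MeaningLookup {
    /** Parses "word:meaning" lines into a Properties table. */
    public static Properties load(String text) {
        Properties props = new Properties();
        try {
            // Properties accepts both key=value and key:value lines and
            // skips whitespace around the separator.
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        Properties meanings = load("PS:Primary school\nHS : High school\n");
        System.out.println(meanings.getProperty("PS"));
        System.out.println(meanings.getProperty("HS"));
    }
}
```

Note that this maps word to meaning, so checking whether a given meaning exists would use meanings.containsValue(f_msg3), which is case sensitive, unlike equalsIgnoreCase in the original code.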
ArrayList<String> vals = new ArrayList<>();
String jmeno = "Adam";
vals.add("Honza");
vals.add("Petr");
vals.add("Jan");
if(!(vals.contains(jmeno))){
vals.add(jmeno);
}else{
System.out.println("Adam is already in the list");
}
for (String jmena : vals){
System.out.println(jmena);
}
try (BufferedReader br = new BufferedReader(new FileReader("dokument.txt")))
{
String aktualni = br.readLine();
int pocetPruchodu = 0;
while (aktualni != null)
{
String[] znak = aktualni.split(";");
System.out.println(znak[pocetPruchodu] + " " +znak[pocetPruchodu + 1]);
aktualni = br.readLine();
}
br.close();
}
catch (IOException e)
{
System.out.println("Failed");
}
try (BufferedWriter bw = new BufferedWriter(new FileWriter("dokument2.txt")))
{
int pocetpr = 0;
while (pocetpr < vals.size())
{
bw.write(vals.get(pocetpr));
bw.append(" ");
pocetpr++;
}
bw.close();
}
catch (IOException e)
{
System.out.println("Failed");
}

How to pass info to the constructor from a text file? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
Suppose there're two classes: Exam and MainExam (contains a main method). class Exam has a constructor
public Exam(String firstName, String lastName, int ID)
The class MainExam reads data from a text file. For example, the data can be:
John Douglas 57
How can one pass data to the constructor from a text file?
Here is the code that reads the file (just in case you don't actually have it)
Scanner scanner = new Scanner(new File("C:\\somefolder\\filename.txt"));
String data = scanner.nextLine();
Now, assuming your file lines are in the following format:
<FirstName> <LastName> <id>
without any whitespace in each element, you can use the regex " " to String#split your data
String[] arguments = data.split(" ");
and then pass them into the constructor (String, String, int)
String fn = arguments[0];
String ln = arguments[1];
int id = Integer.parseInt(arguments[2]);
new Exam(fn, ln, id);
You can refer to the following snippet to read the contents of your text file line by line:
BufferedReader br = null;
try {
String sCurrentLine;
br = new BufferedReader(new FileReader("C:\\testing.txt"));
while ((sCurrentLine = br.readLine()) != null) {
// System.out.println(sCurrentLine);
}
} catch (IOException e) {
e.printStackTrace();
} finally {
try {
if (br != null)br.close();
} catch (IOException ex) {
ex.printStackTrace();
}
}
Inside the loop, sCurrentLine holds the line that was just read (after the loop it is null). Using StringTokenizer, you can separate the first name, last name, and ID using a space as the delimiter. Hope this helps!
You can use StringTokenizer to break the data into parts that were read by MainExam.
String str = "John Douglas 57"; // data read by MainExam
String[] values = new String[3]; // size according to your example
StringTokenizer st = new StringTokenizer(str);
int i = 0;
while (st.hasMoreTokens() && i < values.length) {
values[i++] = st.nextToken();
}
Now you have the data separated in the array values.
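Putting the pieces together, a sketch of going from one line of text to an Exam object might look like this. The Exam fields and the parseExam helper are assumptions made for illustration; only the constructor signature and the sample line come from the question.

```java
public class MainExam {
    /** Minimal stand-in for the asker's Exam class. */
    public static class Exam {
        public final String firstName;
        public final String lastName;
        public final int ID;

        public Exam(String firstName, String lastName, int ID) {
            this.firstName = firstName;
            this.lastName = lastName;
            this.ID = ID;
        }
    }

    /** Parses a line such as "John Douglas 57" into an Exam. */
    public static Exam parseExam(String line) {
        String[] parts = line.trim().split("\\s+"); // split on runs of whitespace
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected 3 fields: " + line);
        }
        // Integer.parseInt throws NumberFormatException if the ID is not a number
        return new Exam(parts[0], parts[1], Integer.parseInt(parts[2]));
    }

    public static void main(String[] args) {
        Exam exam = parseExam("John Douglas 57");
        System.out.println(exam.firstName + " " + exam.lastName + " " + exam.ID);
    }
}
```

In the real program, each line returned by the Scanner or BufferedReader would be passed to parseExam in turn.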

java string matching from a large text file issue

I would like to implement a task of string matching from a large text file.
1. Replace all the non-alphanumeric characters.
2. Count the occurrences of a specific term in the text file. For example, take the term "tom". The matching is not case sensitive, so "Tom" should be counted. However, "tomorrow" should not be counted.
code template one:
try {
in = new BufferedReader(new InputStreamReader(new FileInputStream(inputFile)));
} catch (FileNotFoundException e1) {
System.out.println("Not found the text file: "+inputFile);
}
Scanner scanner = null;
try {
while (( line = in.readLine())!=null){
String newline=line.replaceAll("[^a-zA-Z0-9\\s]", " ").toLowerCase();
scanner = new Scanner(newline);
while (scanner.hasNext()){
String term = scanner.next();
if (term.equalsIgnoreCase(args[1]))
countstr++;
}
}
} catch (IOException e) {
e.printStackTrace();
}
code template two:
try {
in = new BufferedReader(new InputStreamReader(new FileInputStream(inputFile)));
} catch (FileNotFoundException e1) {
System.out.println("Not found the text file: "+inputFile);
}
Scanner scanner = null;
try {
while (( line = in.readLine())!=null){
String newline=line.replaceAll("[^a-zA-Z0-9\\s]", " ").toLowerCase();
String[] strArray=newline.split(" ");//split by blank space
for (int i = 0; i < strArray.length; i++) {
if (strArray[i].equalsIgnoreCase(args[1]))
countstr++;
}
}
} catch (IOException e) {
e.printStackTrace();
}
Running the two versions gives different results; the Scanner version appears to be the correct one. But for a large text file, the Scanner runs much slower than the split version. Can anyone tell me the reason and suggest a more efficient solution?
In your first approach, you don't need a second Scanner. Using a Scanner is not a good choice for long lines.
Your line is already converted to lowercase, so you just need to lowercase the key once outside the loop and use equals inside it.
Or match the whole line against a pattern:
String key = String.valueOf(".*?\\b" + "Tom".toLowerCase() + "\\b.*?");
Pattern p = Pattern.compile(key);
word = word.toLowerCase().replaceAll("[^a-zA-Z0-9\\s]", "");
Matcher m = p.matcher(word);
if (m.find()) {
countstr++;
}
Personally i would choose BufferedReader approach for the large file.
String key = String.valueOf(".*?\\b" + args[0].toLowerCase() + "\\b.*?");
Pattern p = Pattern.compile(key);
try (final BufferedReader br = Files.newBufferedReader(inputFile,
StandardCharsets.UTF_8)) {
for (String line; (line = br.readLine()) != null;) {
// processing the line.
line = line.toLowerCase().replaceAll("[^a-zA-Z0-9\\s]", "");
Matcher m = p.matcher(line);
if (m.find()) {
countstr++;
}
}
}
This sample targets Java 7; change it if required.
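Note that m.find() inside an if counts each matching line once, even when the term appears several times on it. If every occurrence should be counted, a sketch along these lines may help. The names TermCounter and countTerm are hypothetical, and the lookarounds are a swapped-in technique: they treat any non-alphanumeric character as a separator, mirroring the replaceAll step in the question, without building a new string per line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TermCounter {
    /** Counts case-insensitive occurrences of term not adjacent to a letter or digit. */
    public static int countTerm(Reader source, String term) {
        Pattern p = Pattern.compile(
                "(?<![A-Za-z0-9])" + Pattern.quote(term) + "(?![A-Za-z0-9])",
                Pattern.CASE_INSENSITIVE);
        int count = 0;
        try (BufferedReader br = new BufferedReader(source)) {
            for (String line; (line = br.readLine()) != null; ) {
                Matcher m = p.matcher(line);
                while (m.find()) {   // one iteration per occurrence on this line
                    count++;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "Tom met tom.\nTomorrow, TOM!\n";
        System.out.println(countTerm(new StringReader(text), "tom"));
    }
}
```

For a file, the reader could come from Files.newBufferedReader as in the answer above.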
