I am trying to write a program that loads a movie database file and then splits that information into the movie title, the year, and all of the associated actors. I have split up all of the info, but I am having trouble converting the year, which is a string, to an int. The year string has the format (****), where **** is a year such as 1999. When I try to parse it, I get a NumberFormatException. I have tried replacing the parentheses, but that just gave me more errors! Any ideas?
public class MovieDatabase {
ArrayList<Movie> allMovie = new ArrayList<Movie>();
//Loading the text file and breaking it apart into sections
public void loadDataFromFile( String aFileName) throws FileNotFoundException{
Scanner theScanner = new Scanner(aFileName);
theScanner = new Scanner(new FileInputStream("cast-mpaa.txt"));
while(theScanner.hasNextLine()){
String line = theScanner.nextLine();
String[] splitting = line.split("/" );
String movieTitleAndYear = splitting[0];
int movieYearIndex = movieTitleAndYear.indexOf("(");
String movieYear = movieTitleAndYear.substring(movieYearIndex);
System.out.println(movieYear);
//this is where I have issues
int theYear = Integer.parseInt(movieYear);
String movieTitle = movieTitleAndYear.substring(0, movieYearIndex);
ArrayList<Actor> allActors = new ArrayList<Actor>();
for ( int i = 1; i < splitting.length; i++){
String[] names = splitting[i].split(",");
String firstName = names[0];
Actor theActor = new Actor(firstName);
ArrayList<Actor> allActor = new ArrayList<Actor>();
allActor.add(theActor);
}
Movie theMovie = new Movie(movieTitle, theYear, allActors);
allMovie.add(theMovie);
}
theScanner.close();
}
output:
(1967)
Here are the errors I am getting:
Exception in thread "main" java.lang.NumberFormatException: For input string: "(1967)"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:481)
at java.lang.Integer.parseInt(Integer.java:527)
at MovieDatabase.loadDataFromFile(MovieDatabase.java:27)
You have brackets around the numbers. You could either correct your file or you could remove brackets using:
String str = "(1967)";
System.out.println(str.substring(1, str.length()-1));
Output:
1967
In your code, you used:
int movieYearIndex = movieTitleAndYear.indexOf("(");
String movieYear = movieTitleAndYear.substring(movieYearIndex);
So if movieTitleAndYear is "hi (1947)", indexOf returns 3 as the index of "(", and substring starts reading from index 3, which includes the "(". One way to skip the opening bracket is to change your substring line to:
String movieYear = movieTitleAndYear.substring(movieYearIndex + 1); // this still leaves the closing bracket
If you are sure the year always has four digits, then you could do something like:
String movieYear = movieTitleAndYear.substring(movieYearIndex + 1, movieYearIndex + 5);
You also need the index of ")", which you can find with indexOf.
Code snippet:
int movieYearOpenBracesIndex = movieTitleAndYear.indexOf("(");
int movieYearCloseBracesIndex = movieTitleAndYear.indexOf(")");
String movieYear = movieTitleAndYear.substring(movieYearOpenBracesIndex + 1, movieYearCloseBracesIndex);
System.out.println(movieYear);
This will give the exact year, e.g. 1967.
Your substring call currently gets a year enclosed by brackets, e.g., (1967). You can avoid this by calling the substring variant that accepts an endIndex, and just get the year's four digits:
String movieYear =
movieTitleAndYear.substring(movieYearIndex + 1, // to get rid of "("
movieYearIndex + 5 // to get rid of ")"
);
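Putting the suggestions above together, here is a minimal sketch of a helper that pulls the title and year out of a "Title (1967)" string. The class name YearParser and the methods parseYear and parseTitle are made up for illustration and are not part of the original MovieDatabase class. The sketch assumes the year is the last parenthesised group in the string, so a title that itself contains parentheses should still work.

```java
// Hypothetical helper, not part of the original MovieDatabase class.
final class YearParser {

    // Returns the year found in the last "(...)" group of titleAndYear.
    // Throws IllegalArgumentException if there is no such group or it is not a number.
    static int parseYear(String titleAndYear) {
        int open = titleAndYear.lastIndexOf('(');
        int close = titleAndYear.indexOf(')', open + 1);
        if (open < 0 || close < 0) {
            throw new IllegalArgumentException("No (year) found in: " + titleAndYear);
        }
        String digits = titleAndYear.substring(open + 1, close).trim();
        try {
            return Integer.parseInt(digits);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Not a year: " + digits, e);
        }
    }

    // Returns the title part, i.e. everything before the last "(".
    static String parseTitle(String titleAndYear) {
        int open = titleAndYear.lastIndexOf('(');
        return (open < 0 ? titleAndYear : titleAndYear.substring(0, open)).trim();
    }

    public static void main(String[] args) {
        String sample = "Hi (1967)"; // made-up sample entry
        System.out.println(parseTitle(sample) + " / " + parseYear(sample)); // prints Hi / 1967
    }
}
```

Inside the loading loop this would replace the indexOf/substring/parseInt lines, with the exception handling deciding what happens to malformed entries.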
I have a string that I want to break down, assigning different parts of it to different variables.
String:
String str ="NAME=Mike|Phone=555.555.555| address 298 Stack overflow drive";
To Extract the Name:
int startName = str.indexOf("=");
int endName = str.indexOf("|");
String name = str.substring(startName +1 , endName ).trim();
But I can't extract the phone number:
int startPhone = arg.indexOf("|Phone");
int endPhone = arg.indexOf("|");
String sip = arg.substring(startPhone + 7, endPhone).trim();
How can I extract the phone number that sits between the "|" delimiters?
Also, is there a different way to extract the name, using the "=" and the first "|" as delimiters?
You can split on both = and | at the same time, and then pick the non-label parts
String delimiters = "[=\\|]";
String[] splitted = str.split(delimiters);
String name = splitted[1];
String phone = splitted[3];
Note that this code assumes that the input is formatted exactly as you posted. You may want to check for whitespace and other irregularities.
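One way to make the split above a little less fragile is sketched here. It assumes the same label=value|label=value layout as the question; the class name, the helper splitFields, and the length check are illustrative only.

```java
final class SplitOnBothDelimiters {

    // Splits on '=' and '|' together and trims every piece.
    static String[] splitFields(String input) {
        String[] parts = input.split("[=|]"); // inside a character class '|' needs no escape
        for (int i = 0; i < parts.length; i++) {
            parts[i] = parts[i].trim();
        }
        return parts;
    }

    public static void main(String[] args) {
        String str = "NAME=Mike|Phone=555.555.555| address 298 Stack overflow drive";
        String[] parts = splitFields(str);
        // Expected layout: [NAME, Mike, Phone, 555.555.555, address ...]
        if (parts.length >= 4) {
            System.out.println("name=" + parts[1] + ", phone=" + parts[3]);
        } else {
            System.out.println("Unexpected format: " + str);
        }
    }
}
```

The length check only guards against lines with too few fields; a value that itself contains "=" or "|" would still shift the indices.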
String[] details = str.split("\\|"); // "|" is a regex metacharacter, so it must be escaped
String namePart = details[0];
String phonePart = details[1];
String addressPart = details[2];
String name = namePart.substring(namePart.indexOf("=") + 1).trim();
String phone = phonePart.substring(phonePart.indexOf("=") + 1).trim();
String address = addressPart.trim();
I hope this helps.
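For the original question of finding the text between two delimiters, indexOf has an overload that takes a start position, so the search for the closing "|" can begin after the label instead of at the start of the string. A small sketch follows; valueAfter is a hypothetical helper name, and it assumes each label is followed directly by "=".

```java
final class BetweenDelimiters {

    // Returns the text between "label=" and the next '|' (or the end of the string),
    // or null if the label is not present.
    static String valueAfter(String input, String label) {
        String key = label + "=";
        int start = input.indexOf(key);
        if (start < 0) {
            return null;
        }
        start += key.length();
        int end = input.indexOf('|', start); // search only after the label
        if (end < 0) {
            end = input.length();
        }
        return input.substring(start, end).trim();
    }

    public static void main(String[] args) {
        String str = "NAME=Mike|Phone=555.555.555| address 298 Stack overflow drive";
        System.out.println(valueAfter(str, "NAME"));  // prints Mike
        System.out.println(valueAfter(str, "Phone")); // prints 555.555.555
    }
}
```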
I have a list of text files from which I need to read a specific string, which is always preceded by the string "SWEUserName=". I have been able to print the entire line from the log, but not just the string I need. I do want to print the line number, just not the whole line.
So far this is what I've got:
public static String [] openFile() throws FileNotFoundException, IOException{
String searchTech = "SWEUserName=";
int s;
String foundTech = "";
File logs = new File("C:\\Users\\wfedric\\Desktop\\GD\\Java\\Learning\\app\\src\\main\\java\\com\\fedrictechnologies\\learning\\FSDS2.txt");
Scanner scnr = new Scanner(logs);
int lineNumber = 1;
while(scnr.hasNextLine()){
String line = scnr.nextLine();
lineNumber++;
if(line.contains(searchTech)){
s = 10;
foundTech = lineNumber +" :"+ searchTech + s;
System.out.println(foundTech);
System.out.println(line);
}else;
}
return null;
}
I know I am missing something, but I can't for the life of me figure out how to capture the next 10 characters. I realize that, as it stands in my code, I am simply printing the line number followed by my searchTech variable and the number 10.
I need s to hold on to the 10 characters following searchTech. Perhaps an array is the best way? Just not sure :(
With the above code, I have the following output, which I should expect:
141 :SWEUserName=10
[09/04/14 EDT:8:15:48 AM- INFO- MASC1050141409832948329] - [ HomePageURL ] - ThinClient Home Page URL - https://wls.rio.directv.com/wpservsm_enu/start.swe?SWECmd=ExecuteLogin&SWENeedContext=false&SWEUserName=masc105014&SWEPassword=%5BNDSEnc-D%5Dji%2Fic25k%2FTB%2Fy7mqG2kcb2ndd1S3hgWC8Rfa4e1DvtwKWMGQmTzngA%3D%3D&
143 :SWEUserName=10
[09/04/14 EDT:8:15:48 AM- INFO- ] - [ webServiceRequest ] - Web service Call - RetryCounter: 0, URL: https://wls.rio.directv.com/wpservsm_enu/start.swe?SWECmd=ExecuteLogin&SWENeedContext=false&SWEUserName=masc105014&SWEPassword=%5BNDSEnc-D%5Dji%2Fic25k%2FTB%2Fy7mqG2kcb2ndd1S3hgWC8Rfa4e1DvtwKWMGQmTzngA%3D%3D&, Type: GET
The 1st and 3rd lines show the general format I want; the 2nd and 4th lines contain the values after searchTech that I am stuck on extracting.
SOLUTION (During this process, I played with the indexOf method to include the date, and left it there)
public class techMatching {
static int s;
static int d;
static String sTech;
static String dTech;
public static String [] openReadFile() throws FileNotFoundException, IOException{
String searchTech = "SWEUserName=";
String foundTech;
File logs = new File("C:\\FSDS2.txt");
Scanner scnr = new Scanner(logs);
int lineNumber = 1;
while(scnr.hasNextLine()){
String line = scnr.nextLine();
lineNumber++;
if(line.contains(searchTech)){
s = line.indexOf(searchTech);
sTech = line.substring(s+12,s+22);
d = line.indexOf("[");
dTech = line.substring(1, 22);
foundTech = lineNumber +": "+ "(" + dTech + ")" + "|"+ sTech.toUpperCase();
System.out.println(foundTech);
}else;
}
return null;
}
Which returned the expected output:
141: (09/04/14 EDT:8:15:48 )|MASC105014
143: (09/04/14 EDT:8:15:48 )|MASC105014
And so on.
I suggest you look at the methods available in the String class. Using indexOf(searchTech), you know where in the line the "SWEUserName=" is. Using substring, you can get a String consisting of part of the line.
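As a rough sketch of that suggestion: find the marker with indexOf, then take everything up to the next "&" instead of a fixed 10 characters. The class name, the helper extractUserName, and the shortened sample URL are made up for illustration; the assumption is that the value ends at "&" or at the end of the line.

```java
final class UserNameExtractor {

    private static final String MARKER = "SWEUserName=";

    // Returns the value following MARKER, or null if the line does not contain it.
    static String extractUserName(String line) {
        int at = line.indexOf(MARKER);
        if (at < 0) {
            return null;
        }
        int start = at + MARKER.length();
        int end = line.indexOf('&', start); // the value stops at the next parameter
        if (end < 0) {
            end = line.length();
        }
        return line.substring(start, end);
    }

    public static void main(String[] args) {
        // Shortened, made-up log line in the same shape as the question's data.
        String line = "URL - https://example.com/start.swe?SWECmd=ExecuteLogin"
                + "&SWEUserName=masc105014&SWEPassword=secret&";
        String user = extractUserName(line);
        if (user != null) {
            System.out.println(user.toUpperCase()); // prints MASC105014
        }
    }
}
```

Using MARKER.length() avoids the hard-coded offsets 12 and 22, so the code keeps working if user names are not always 10 characters long.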
I have a string String a = "(3e4+2e2)sin(30)"; and I want to turn it into a = "(3e4+2e2)*sin(30)";
I am not able to write a regular expression for this.
Try this replaceAll:
a = a.replaceAll("\\) *(\\w+)", ")*$1");
You can go with this
String func = "sin";// or any function you want like cos.
String a = "(3e4+2e2)sin(30)";
a = a.replaceAll("[)]" + func, ")*" + func);
System.out.println(a);
This should work:
a = a.replaceAll("\\)(\\s)*([^*+/-])", ") * $2");
String input = "(3e4+2e2)sin(30)".replaceAll("(\\(.+?\\))(.+)", "$1*$2"); //(3e4+2e2)*sin(30)
Assuming the characters within the first parenthesis will always be in similar pattern, you can split this string into two at the position where you would like to insert the character and then form the final string by appending the first half of the string, new character and second half of the string.
string a = "(3e4+2e2)sin(30)";
string[] splitArray1 = Regex.Split(a, @"^\(\w+[+]\w+\)");
string[] splitArray2 = Regex.Split(a, @"\w+\([0-9]+\)$");
string updatedInput = splitArray2[0] + "*" + splitArray1[1];
Console.WriteLine("Input = {0} Output = {1}", a, updatedInput);
I did not try but the following should work
String a = "(3e4+2e2)sin(30)";
a = a.replaceAll("[)](\\w+)", ")*$1");
System.out.println(a);
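Another option, sketched here, is a zero-width match: a lookbehind for ")" and a lookahead for a letter, so that nothing is consumed and "*" is simply inserted at that position. The class and method names are hypothetical, and the sketch assumes function names start with an ASCII letter.

```java
final class ImplicitMultiplication {

    // Inserts '*' wherever ')' is immediately followed by an ASCII letter.
    static String insertStar(String expression) {
        return expression.replaceAll("(?<=\\))(?=[A-Za-z])", "*");
    }

    public static void main(String[] args) {
        System.out.println(insertStar("(3e4+2e2)sin(30)")); // prints (3e4+2e2)*sin(30)
    }
}
```

Because the match has zero width, there are no capture groups to copy back into the replacement, and an expression that already contains the "*" is left unchanged.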
If I have a dataset with lines like this 199.72.81.55 - - [01/Jul/1995:00:00:01 -0400] "GET /history/apollo/ HTTP/1.0" 200 6245 and I am running a map reduce job with hadoop, how can I get the last element in each line?
I have tried all the obvious answers, such as String lastWord = test.substring(test.lastIndexOf(" ")+1); but this gives me the - character. I have tried splitting it based on a space, and getting the last element, but the last character is still a -.
Can I not expect the data to be delivered to me line by line? In other words, can I not expect a file of the form a b c d \n e f g h\n to be delivered line by line?
And does anyone have any tips on how to get the last word in this line?
This is a snippet from my map function, where I try to get the data:
public void map(LongWritable key, Text value, Context context)
throws IOException, InterruptedException {
String test = value.toString();
StringTokenizer tokenizer = new StringTokenizer(test);
//String lastWord = test.substring(test.lastIndexOf(" ")+1); <--first try
//String [] array = test.split(" ");//<--second try
//one.set(Integer.valueOf(array[8]));
int i = 0;
String candidate = null;
while (tokenizer.hasMoreTokens()) {
candidate = tokenizer.nextToken();
if (i == 3) {
//this works to get the date field
String wholeDate = candidate;
String[] dateArray = wholeDate.split(":");
String date = dateArray[0].substring(1); // get rid of '['
String hour = dateArray[1];
word.set(date + " " + hour);
} else if (i == 7) {
// <-- third try
String replySizeString = candidate;
one.set(Integer.valueOf(replySizeString)); }
}
i++;
Instead of using a StringTokenizer you could just use the String[] String.split(String regex) method to return an array of Strings for each line. Then, assuming that each line of your data has the same number of fields, separated by spaces, you can just look at that array element.
String line = value.toString();
String[] lineArray = line.split(" ");
String lastWord = lineArray[9];
Or if you know that you always want the last token you could see how long the array is and then just grab the last element.
String lastWord = lineArray[lineArray.length - 1];
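A slightly more defensive sketch of the same idea: trim the line, split on runs of whitespace, and treat a non-numeric last field as zero. The helper names are hypothetical. The "-" handling rests on the assumption that some log lines carry "-" in place of a byte count, which may be where the "-" reported in the question comes from.

```java
final class LastField {

    // Returns the last whitespace-separated token of the line, or "" for a blank line.
    static String lastToken(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return "";
        }
        String[] fields = trimmed.split("\\s+");
        return fields[fields.length - 1];
    }

    // Parses the reply size, falling back to 0 when the field is not a number (e.g. "-").
    static int replySize(String line) {
        try {
            return Integer.parseInt(lastToken(line));
        } catch (NumberFormatException e) {
            return 0;
        }
    }

    public static void main(String[] args) {
        String line = "199.72.81.55 - - [01/Jul/1995:00:00:01 -0400] "
                + "\"GET /history/apollo/ HTTP/1.0\" 200 6245";
        System.out.println(replySize(line)); // prints 6245
    }
}
```

In the map function, replySize(value.toString()) could then feed one.set(...) without the tokenizer loop; whether 0 is the right fallback depends on what the job is counting.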
I have a txt file formatted like:
Name 'Paul' 9-years old
How can I get from a "readline":
String the_name="Paul"
and
int the_age=9
in Java, discarding all the rest?
I have:
...
BufferedReader bufferedReader = new BufferedReader(fileReader);
StringBuffer stringBuffer = new StringBuffer();
String line;
while ((line = bufferedReader.readLine()) != null) {
//put the name value in the_name
//put age value in the_age
}
...
Please suggest, thanks.
As you're using BufferedReader and everything is on the one line, you would have to split it to extract the data. Some additional formatting is then required to remove the quotes & extract the year part of age. No need for any fancy regex:
String[] strings = line.split(" ");
if (strings.length >= 3) {
String the_name= strings[1].replace("'", "");
String the_age = strings[2].substring(0, strings[2].indexOf("-"));
}
I notice you have this functionality in a while loop. For this to work, make sure that every line keeps the format:
text 'Name' digit-any other text
The important characters are:
Spaces: the split needs to produce at least 3 tokens
Single quotes: around the name
Hyphen (-): directly after the age digits
Use java.util.regex.Pattern:
Pattern pattern = Pattern.compile("Name '(.*)' (\\d+)-years old"); // "\\d+" requires at least one digit so parseInt never sees an empty string
for (String line : lines) {
Matcher matcher = pattern.matcher(line);
if (matcher.matches()) {
String theName = matcher.group(1);
int theAge = Integer.parseInt(matcher.group(2));
}
}
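For reference, here is a self-contained variant of the regex approach with the imports spelled out. It uses named groups and find() rather than matches(), so extra text around the pattern is tolerated. The Person holder class and the parse method are invented for this sketch, and it assumes the name never contains a single quote.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NameAgeParser {

    // Name in single quotes, then one or more digits followed by "-years".
    private static final Pattern LINE =
            Pattern.compile("'(?<name>[^']*)'\\s+(?<age>\\d+)-years");

    // Simple value holder, hypothetical.
    static final class Person {
        final String name;
        final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    // Returns the parsed Person, or null when the line does not match.
    static Person parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new Person(m.group("name"), Integer.parseInt(m.group("age")));
    }

    public static void main(String[] args) {
        Person p = parse("Name 'Paul' 9-years old");
        System.out.println(p.name + " " + p.age); // prints Paul 9
    }
}
```

Inside the readLine loop, a null result from parse can simply be skipped or logged, which avoids the index checks needed with split.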
You can use the String.substring, String.indexOf, String.lastIndexOf, and Integer.parseInt methods as follows:
String line = "Name 'Paul' 9-years old";
String theName = line.substring(line.indexOf("'") + 1, line.lastIndexOf("'"));
String ageStr = line.substring(line.lastIndexOf("' ") + 2, line.indexOf("-years"));
int theAge = Integer.parseInt(ageStr);
System.out.println(theName + " " + theAge);
Output:
Paul 9