I have the following code. Please keep in mind I'm just starting to learn the language and as such have been looking for fairly simple exercises. Coding etiquette and critiques welcome.
import java.util.*;
import java.io.*;
public class Tron
{
public static void main(String[] args) throws Exception
{
int x,z,y = 0;
File Tron= new File("C:\\Java\\wordtest.txt");
Scanner word = new Scanner(Tron);
HashMap<String, Integer> Collection = new HashMap<String, Integer>();
//noticed that hasNextLine and hasNext both work.....why one over the other?
while (word.hasNext())
{
String s = word.next();
Collection.get(s);
if (Collection.containsKey(s))
{
Integer n = Collection.get(s);
n = n+1;
Collection.put(s,n);
//why does n++ and n+1 give you different results
}else
{
Collection.put(s,1);
}
}
System.out.println(Collection);
}
}
Without the use of useDelimiter() I get my desired output based on the file I have:
Far = 2, ran = 4, Frog = 2, Far = 7, fast = 1, etc...
Inserting the useDelimiter method as follows
Scanner word = new Scanner(Tron);
word.useDelimiter("\\p{Punct} \\p{Space}");
provides the following output as it appears in the text file shown below.
the the the the the
frog frog
ran
ran ran ran
fast, fast fast
far, far, far far far far far
Why such a difference in output if useDelimiter was supposed to account for punctuation, new lines, etc.? Probably pretty simple, but again, this is my first shot at a program. Thanks in advance for any advice.
With word.useDelimiter("\\p{Punct} \\p{Space}") you are actually telling the scanner to look for delimiters consisting of a punctuation character followed by a space followed by another whitespace character. You probably wanted to have one (and only one) of these instead, which would be achieved by something like
word.useDelimiter("\\p{Punct}|\\p{Space}");
or at least one of these, which would look like
word.useDelimiter("[\\p{Punct}\\p{Space}]+");
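As a minimal sketch of the second variant in use: the class name WordCountSketch, the count method and the sample text are made up for illustration, and a TreeMap is used only so the printed order is predictable.

```java
import java.util.Map;
import java.util.Scanner;
import java.util.TreeMap;

class WordCountSketch {
    // Counts tokens, treating any run of punctuation and/or whitespace as one delimiter.
    static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        try (Scanner words = new Scanner(text)) {
            words.useDelimiter("[\\p{Punct}\\p{Space}]+");
            while (words.hasNext()) {
                // merge() inserts 1 for a new word, otherwise adds 1 to the old count.
                counts.merge(words.next(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Sample text mirrors the shape of the file in the question.
        System.out.println(count("fast, fast fast\nfar, far, far far"));
        // prints {far=4, fast=3}
    }
}
```

With the character class and the + quantifier, "far," followed by a space is a single delimiter, so no empty tokens and no tokens with trailing commas end up in the map.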
Update
@Andrzej nicely answered the questions in your code comments (which I forgot about), however he missed one little detail which I would like to expand / put straight here.
why does n++ and n+1 give you different results
This obviously relates to the line
n = n+1;
and my hunch is that the alternative you tried was
n = n++;
which indeed gives confusing results (namely the end result is that n is not incremented).
The reason is that n++ (the postfix increment operator by its canonical name) increments the value of n but the result of the expression is the original value of n! So the correct way to use it is simply
n++;
the result of which is equivalent to n = n+1.
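A small sketch of the two forms side by side, using a plain int for simplicity; the class and method names are made up for illustration.

```java
class PostfixSketch {
    // Mirrors the suspected line from the question: the assignment overwrites the increment.
    static int assignPostfix(int n) {
        n = n++;   // n++ evaluates to the old value, which is then assigned back to n
        return n;
    }

    // The plain statement form: the increment is kept.
    static int plainPostfix(int n) {
        n++;
        return n;
    }

    public static void main(String[] args) {
        System.out.println(assignPostfix(4)); // prints 4
        System.out.println(plainPostfix(4));  // prints 5
    }
}
```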
Here is a thread with code example which hopefully helps you understand better how these operators work.
Péter is right about the regex, you're matching a very specific sequence rather than a class of characters.
I can answer the questions from your source comments:
noticed that hasNextLine and hasNext both work.....why one over the other?
The Scanner class is declared to implement Iterator<String> (so that it can be used in any situation where you want some arbitrary thing that provides Strings). As such, since the Iterator interface declares a hasNext method, the Scanner needs to implement this with the exact same signature. On the other hand, hasNextLine is a method that the Scanner implements of its own volition.
It's not entirely unusual for a class which implements an interface to declare both a "generically-named" interface method and a more domain-specific method, which both do the same thing. (For example, you might want to implement a game-playing client as an Iterator<GameCommand> - in which case you'd have to declare hasNext, but might want to have a method called isGameUnfinished which did exactly the same thing.)
That said, the two methods aren't identical. hasNext returns true if the scanner has another token to return, whereas hasNextLine returns true if the scanner has another line of input to return.
In practice the difference shows up around trailing whitespace. hasNextLine returns true as long as any input remains, including a final line that is not terminated by a newline, so while tokens are still left both methods return true. Once the last token has been consumed, hasNext returns false, but hasNextLine still returns true if anything follows that token, such as the file's final newline or trailing blank lines. So the two are not interchangeable: pair hasNext with next, and hasNextLine with nextLine.
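A quick way to check how the two methods behave is to probe a Scanner after consuming a known number of tokens. This is only a sketch; the class name, the probe helper and the sample inputs are made up for illustration.

```java
import java.util.Scanner;

class HasNextSketch {
    // Consumes the given number of tokens, then reports { hasNext(), hasNextLine() }.
    static boolean[] probe(String input, int tokensToConsume) {
        try (Scanner sc = new Scanner(input)) {
            for (int i = 0; i < tokensToConsume; i++) {
                sc.next();
            }
            return new boolean[] { sc.hasNext(), sc.hasNextLine() };
        }
    }

    public static void main(String[] args) {
        // One token left, no trailing newline: both report true.
        boolean[] a = probe("far ran", 1);
        System.out.println(a[0] + " " + a[1]); // prints true true

        // All tokens consumed, but the final newline is still unread.
        boolean[] b = probe("far ran\n", 2);
        System.out.println(b[0] + " " + b[1]); // prints false true
    }
}
```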
why does n++ and n+1 give you different results
This is quite straightforward.
n + 1 simply evaluates to a value that is one greater than the current value of n, leaving n itself untouched. Whereas n++ sets n to be one greater, and evaluates to the value n had before the increment (the prefix form ++n evaluates to the new value).
So if n was currently 4, then n + 1 would evaluate to 5 and n would still be 4, whereas n++ would evaluate to 4 and n would be 5 afterwards.
In general, it's wise to avoid using the ++ operator except in situations where it's used as boilerplate (such as in for loops over an index). Taking two or three extra characters, or even an extra line, to express your intent more clearly and unambiguously is such a small price that it's almost always worth doing.
Related
I'm fairly inexperienced with using objects so I would really like some input.
I'm trying to remove comments from a list that have certain "unwanted words" in them, both the comments and the list of "unwanted words" are in ArrayList objects.
This is inside a class called FormHelper, which holds the comments in a private ArrayList member, comments. The auditList ArrayList is created locally in a member function called populateComments(), which then calls the function below. populateComments() is called by the constructor, so this function is only called once, when an instance of FormHelper is created.
private void filterComments(ArrayList <String> auditList) {
for(String badWord : auditList) {
for (String thisComment : this.comments) {
if(thisComment.contains(badWord)) {
int index = this.comments.indexOf(thisComment);
this.comments.remove(index);
}
}
}
}
Something about the way I implemented this doesn't feel right, and I'm also concerned that I'm using ArrayList functions inefficiently. Is my suspicion correct?
It is not particularly efficient. However, finding a more efficient solution is not straightforward.
Let's step back to a simpler problem.
private void findBadWords(List <String> wordList, List <String> auditList) {
for(String badWord : auditList) {
for (String word : wordList) {
if (word.equals(badWord)) {
System.err.println("Found a bad word");
}
}
}
}
Suppose that wordList contains N words and auditList contains M words. Some simple analysis will show that the inner loop is executed N x M times. The N factor is unavoidable, but the M factor is disturbing. It means that the more "bad" words you have to check for the longer it takes to check.
There is a better way to do this:
private void findBadWords(List <String> wordList, HashSet<String> auditWords) {
for (String word : wordList) {
if (auditWords.contains(word)) {
System.err.println("Found a bad word");
}
}
}
Why is that better? It is better (faster) because HashSet::contains doesn't need to check all of the audit words one at a time. In fact, in the optimal case it will check none of them (!) and the average case just one or two of them. (I won't go into why, but if you want to understand read the Wikipedia page on hash tables.)
But your problem is more complicated. You are using String::contains to test if each comment contains each bad word. That is not a simple string equality test (as per my simplified version).
What to do?
Well, one potential solution is to split the comments into an array of words (e.g. using String::split) and then use the HashSet lookup approach. However:
That changes the behavior of your code. (In a good way actually: read up on the Scunthorpe problem!) You will now only match the audit words if they are actual words in the comment text.
Splitting a string into words is not cheap. If you use String::split it entails creating and using a Pattern object to find the word boundaries, creating substrings for each word and putting them into an array. You can probably do better, but it is always going to be a non-trivial calculation.
So the real question will be whether the optimization is going to pay off. That is ultimately going to depend on the value of M; i.e. the number of bad words you are looking for. The larger M is, the more likely it is to be worthwhile to split the comments into words and use a HashSet to test the words.
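A minimal sketch of the split-and-lookup idea, assuming whole-word matching is acceptable. The class name, the static method signature and the \W+ split pattern are choices made for this illustration (note that \W is ASCII-based by default), not part of the original code.

```java
import java.util.List;
import java.util.Set;

class WordFilterSketch {
    // Removes every comment that contains at least one audit word as a whole word.
    static void filterComments(List<String> comments, Set<String> auditWords) {
        comments.removeIf(comment -> {
            // Split on runs of non-word characters; each piece is one candidate word.
            for (String word : comment.split("\\W+")) {
                if (auditWords.contains(word)) {
                    return true;   // one hit is enough to drop the comment
                }
            }
            return false;
        });
    }
}
```

Because the lookup is per word, a comment mentioning "Scunthorpe" is not removed just because a shorter audit word happens to be a substring of it.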
Another possible solution doesn't involve splitting the comments. You could take the list of audit words and assemble them into a single regex like this: \b(word-1|word-2|...|word-n)\b. Then use this regex with Matcher::find to search each comment string for bad words. The performance will depend on the optimizing capability of the regex engine in your Java platform. It has the potential to be faster than splitting.
My advice would be to benchmark and profile your entire application before you start. Only optimize:
when the benchmarking says that the overall performance of the requests where this comment checking occurs is concerning. (If it is OK, don't waste your time optimizing.)
when the profiling says that this method is a performance hotspot. (There is a good chance that the real hotspots are somewhere else. If so, you should optimize them rather than this method.)
Note there is an assumption that you have (sufficiently) completed your application and created a realistic benchmark for it before you think about optimizing. (Premature optimization is a bad idea ... unless you really know what you are doing.)
As a general approach, removing individual elements from an ArrayList in a loop is inefficient, because it requires shifting all of the "following" elements along one position in the array.
A B C D E
  ^ if you remove this
    ^---^ you have to shift these 3 along by one
   / / /
A C D E
If you remove lots of elements, this will have a substantial impact on the time complexity. It's better to identify the elements to remove, and then remove them all at once.
I suggest that a neater way to do this would be using removeIf, which (at least for collection implementations such as ArrayList) does this "all at once" removal:
this.comments.removeIf(
c -> auditList.stream().anyMatch(c::contains));
This is concise, but probably quite slow because it has to keep checking the entire comment string to see if it contains each bad word.
A probably faster way would be to use regex:
Pattern p = Pattern.compile(
auditList.stream()
.map(Pattern::quote)
.collect(Collectors.joining("|")));
this.comments.removeIf(
c -> p.matcher(c).find());
This would be better because the compiled regex would search for all of the bad words in a single pass over each comment.
The other advantage of a regex-based approach is that you can check case insensitively, by supplying the appropriate flag when compiling the regex.
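A sketch of the case-insensitive variant, written as a standalone static method so it can be run in isolation; the class name and signature are assumptions for illustration. The guard for an empty audit list is an addition: an empty alternation compiles to a pattern that matches everywhere, which would remove every comment.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class RegexFilterSketch {
    // Removes comments containing any audit word, ignoring upper/lower case.
    static void filterComments(List<String> comments, List<String> auditList) {
        if (auditList.isEmpty()) {
            return; // an empty pattern would match every comment
        }
        Pattern p = Pattern.compile(
                auditList.stream()
                        .map(Pattern::quote)                 // treat each word literally
                        .collect(Collectors.joining("|")),
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
        comments.removeIf(c -> p.matcher(c).find());
    }
}
```

UNICODE_CASE is included so that case folding also applies to non-ASCII letters; drop it if the audit words are plain ASCII.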
The question asked me to return a set containing all the possible combinations of strings made up of "cc" and "ddd" for a given length n.
So for example, if the length given was 5, then the set would include "ccddd" and "dddcc",
and length 6 would return a set containing "cccccc" and "dddddd",
and length 7 would return a set containing "ccdddcc", "dddcccc" and "ccccddd",
and length 12 would return 12 different combinations, and so on.
However, the set returned is empty.
Can you please help?
"Please understand extremely poor coding style"
public static Set<String> set = new HashSet<String>();
public static Set<String> generateset(int n) {
String s = strings(n,n,"");
return set; // change this
}
public static String strings(int n,int size, String s){
if(n == 3){
s = s + ("cc");
return "";}
if(n == 2){
s = s + ("ddd");
return "";}
if(s.length() == size)
set.add(s);
return strings(n-3,size,s) + strings(n-2,size,s);
}
I think you'll need to rethink your approach. This is not an easy problem, so if you're extremely new to Java (and not extremely familiar with other programming languages), you may want to try some easier problems involving sets, lists, or other collections, before you tackle something like this.
Assuming you want to try it anyway: recursive problems like this require very clear thinking about how you want to accomplish the task. I think you have a general idea, but it needs to be much clearer. Here's how I would approach the problem:
(1) You want a method that returns a list (or set) of strings of length N. Your recursive method returns a single String, and as far as I can tell, you don't have a clear definition of what the resulting string is. (Clear definitions are very important in programming, but probably even more so when solving a complex recursive problem.)
(2) The strings will either begin with "cc" or "ddd". Thus, to form your resulting list, you need to:
(2a) Find all strings of length N-2. This is where you need a recursive call to get all strings of that length. Go through all strings in that list, and add "cc" to the front of each string.
(2b) Similarly, find all strings of length N-3 with a recursive call; go through all the strings in that list, and add "ddd" to the front.
(2c) The resulting list will be all the strings from steps (2a) and (2b).
(3) You need base cases. If N is 0 or 1, the resulting list will be empty. If N==2, it will have just one string, "cc"; if N==3, it will have just one string, "ddd".
You can use a Set instead of a list if you want, since the order won't matter.
Note that it's a bad idea to use a global list or set to hold the results. When a method is calling itself recursively, and every invocation of the method touches the same list or set, you will go insane trying to get everything to work. It's much easier if you let each recursive invocation hold its own local list with the results. Edit: This needs to be clarified. Using a global (i.e. instance field that is shared by all recursive invocations) collection to hold the final results is OK. But the approach I've outlined above involves a lot of intermediate results--i.e. if you want to find all strings whose length is 8, you will also be finding strings whose length is 6, 5, 4, ...; using a global to hold all of those would be painful.
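The steps above could be sketched roughly as follows, with each invocation building its own local set. The class and method names are made up for illustration; this is one possible reading of the outline, not the only way to write it.

```java
import java.util.HashSet;
import java.util.Set;

class CcDddSketch {
    // Returns every string of length n that can be built by concatenating "cc" and "ddd".
    static Set<String> generate(int n) {
        Set<String> result = new HashSet<>();
        if (n < 2) {
            return result;               // base case: nothing fits in length 0 or 1
        }
        if (n == 2) {
            result.add("cc");            // base case
            return result;
        }
        if (n == 3) {
            result.add("ddd");           // base case
            return result;
        }
        for (String tail : generate(n - 2)) {
            result.add("cc" + tail);     // step (2a)
        }
        for (String tail : generate(n - 3)) {
            result.add("ddd" + tail);    // step (2b)
        }
        return result;                   // step (2c)
    }

    public static void main(String[] args) {
        System.out.println(generate(7).size());  // prints 3
        System.out.println(generate(12).size()); // prints 12
    }
}
```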
To see why the set is returned empty, simply follow the logic. Say you execute generateset(5);, which will execute strings(5,5,"");:
First call strings(5,5,""); : (s.length() == size) is false, hence nothing is added to set
Second call strings(2,5,""); : (n == 2) is true, so it returns immediately and nothing is added to set
Third call strings(3,5,""); : (n == 3) is true, so it returns immediately and nothing is added to set
So set remains unchanged.
I want to declare integers, while the program is running.
I run the program, and then I give it an integer via System.in and repeat this as long as I want.
I want the program to give those integers a name of a certain type, for example a(i) or a[i], dunno (it should be handy), and then a(i) represents the i'th integer I gave the program.
My idea is then that I can use those elements by their name just like, if I had declared them in the first place.
For example add two integers together.
For example I defined a method add+, which waits for 2 integer and then adds them. For example I write:
add
a(2)
a(47)
(then I would get here the result.)
I don't think implementing the add function is difficult. However I don't know, how to let the program count the number of inputs or how to let it declare and use variables.
First: Welcome to programming java; it will be a long road.
Here are some hints:
Use a List<Integer> to hold the sequence of numbers entered by the user.
Actually instantiate a concrete List class, for example LinkedList<Integer>. If you need to access the elements by index, use an ArrayList.
Each time the user enters a number, create a new Integer and userList.add(newInteger);
Simple sample
List<Integer> userList = new LinkedList<Integer>();
for (int index = 0; index < 9; ++index)
{
// Integer.valueOf replaces the deprecated new Integer(...) constructor.
Integer newInteger = Integer.valueOf(index);
userList.add(newInteger);
}
for (Integer current : userList)
{
System.out.println(current);
}
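Putting the hints together, a rough sketch of reading numbers and then referring to them by position might look like this. The class and method names are made up for illustration, and positions count from 1 to mirror the a(i) notation in the question. In a real program the Scanner would wrap System.in; a fixed string is used in main so the result is repeatable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class NumberStoreSketch {
    // Reads whitespace-separated integers until the input ends or a non-integer token appears.
    static List<Integer> readAll(Scanner in) {
        List<Integer> numbers = new ArrayList<>();   // grows as needed, no fixed length
        while (in.hasNextInt()) {
            numbers.add(in.nextInt());
        }
        return numbers;
    }

    // Adds the i-th and j-th numbers entered, counting from 1.
    static int add(List<Integer> numbers, int i, int j) {
        return numbers.get(i - 1) + numbers.get(j - 1);
    }

    public static void main(String[] args) {
        List<Integer> numbers = readAll(new Scanner("10 20 30"));
        System.out.println(numbers.size());       // prints 3
        System.out.println(add(numbers, 1, 3));   // prints 40
    }
}
```

The list's size() answers the "count the number of inputs" part, and get(index) plays the role of a named variable.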
Yeah, I am following the conversation.
I am just a bit frustrated, because I can't really write any interesting or practical java programs (yet), because my knowledge isn't that big yet.
First I tried to find out, if there was a way to add elements to array, because arrays seemed to me very useful, because each element of an array already has an address. I googled, and it seems that is not possible.
I might be able to use the idea with the list, but it seems to be that the length of the list has to have a limit and actually I wanted to avoid that.
I'm looking at finding very short substrings (pattern, needle) in many short lines of text (haystack). However, I'm not quite sure which method to use outside the naive, brute force method.
Background: I'm doing a side project for fun where I receive text messaging chat logs of multiple users (anywhere from 2000-15000 lines of text and 2-50 users), and I want to find all the various pattern matches in the chat logs based on predetermined words that I've come up with. So far I have about 1600 patterns that I'm looking for, but I may look for more.
So for example, I want to find the number of food related words that are used in an average text message log such as "hamburger", "pizza", "coke", "lunch", "dinner", "restaurant", "McDonalds". While I gave out English examples, I will actually be using Korean for my program. Each of these designated words will have their own respective score, which I put in a hashmap as key and value separately. I then show the top scorers for food related words as well as the most frequent words used by those users for food words.
My current method is to split each line of text on whitespace, and process each individual word from the haystack by using the contains method (which uses the indexOf method and the naive substring search algorithm) to check whether the word contains the pattern.
wordFromInput.contains(wordFromPattern);
To give an example, with 17 users in chat, 13000 lines of text, and the 1600 patterns, I've found that this whole program took 12-13 seconds with this method. And on the Android app that I'm developing, it took 2 minutes and 30 seconds to process, which is far too slow.
Originally, I tried to use a hash map and to merely get the pattern instead of searching for it in the ArrayList, but I then realized that is not possible with a hash table for what I am trying to do with a substring.
I've looked around through Stackoverflow and found a lot of helpful and related questions, such as these two:
1 and 2. I'm somewhat more familiar with the various string algorithms (Boyer Moore, KMP, etc.)
I initially thought then that the naive method would of course be the worst type of algorithm for my case, but having found this question, I've realized that my case (short pattern, short text), might actually be more effective with the naive method. But I wanted to know if there was something that I was neglecting completely.
Here is a snippet of my code though if anyone wants to see my issue more concretely.
While I removed large parts of the code to simplify it, the primary method that I use to actually match substrings is there in the method matchWords().
I know that's really ugly and bad code (5 for loops...), so if there are any suggestions for that, I'm happy to hear it as well.
So to clean it up:
lines of text from chat logs (2000-10,000+), haystack
1600+ patterns, needle(s)
mostly using Korean characters, although some English is included
Brute force naive method is simply too slow, but debating whether there are other alternatives and even if there are, whether they are practical given the nature of short patterns and text.
I just want some input on my thought process, and possibly some general advice. But additionally, I would like some specific suggestion for a particular algorithm or method if that is possible.
You can replace the hashtable with a Trie.
Split the line of text into words using white space to separate words. Then check if the word is in the Trie. If it is in the Trie, update a counter associated with the word. Ideally, the counter would be integrated into the Trie.
This approach is O(C) where C is the number of characters in the text. It's highly unlikely that you can avoid checking each character at least once. Thus this approach should be as good as you can get, at least in terms of big O.
However, it sounds like you may not want to list all of the possible words you are searching for. In that case, you could simply build a counting Trie from all of the words. If nothing else, that will probably make it easier for any pattern matching algorithm you use, although it might require some modifications to the Trie.
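A minimal sketch of a Trie with an integrated counter, matching whole whitespace-separated words only. All names here (CountingTrieSketch, addWord, countLine, countOf) are made up for illustration, and children are kept in a HashMap keyed by char, which is an assumption that suits large alphabets such as Korean better than a fixed array.

```java
import java.util.HashMap;
import java.util.Map;

class CountingTrieSketch {
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean isWord;   // true if a tracked word ends at this node
        int count;        // how many times that word has been seen
    }

    private final Node root = new Node();

    // Registers a word to watch for.
    void addWord(String word) {
        Node node = root;
        for (char c : word.toCharArray()) {
            node = node.children.computeIfAbsent(c, k -> new Node());
        }
        node.isWord = true;
    }

    // Walks the trie once per word of the line and bumps the counter on an exact match.
    void countLine(String line) {
        for (String word : line.split("\\s+")) {
            Node node = find(word);
            if (node != null && node.isWord) {
                node.count++;
            }
        }
    }

    int countOf(String word) {
        Node node = find(word);
        return (node != null && node.isWord) ? node.count : 0;
    }

    // Follows the characters of word from the root; null if the path does not exist.
    private Node find(String word) {
        Node node = root;
        for (int i = 0; i < word.length() && node != null; i++) {
            node = node.children.get(word.charAt(i));
        }
        return node;
    }
}
```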
What you're describing sounds like an excellent use case for the Aho-Corasick string-matching algorithm. This algorithm finds all matches of a set of pattern strings inside of a source string and does so in linear time (plus the time to report the matches). If you have a fixed set of strings to search for, you can do linear preprocessing work up front on the patterns to search for all matches very quickly.
There's a Java implementation of Aho-Corasick available here. I haven't tried it out, but it might be a good match.
Hope this helps!
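For orientation, here is a compact sketch of the Aho-Corasick idea: a trie of the patterns plus failure links, scanned once over the text. It is not the linked implementation; the class and method names are made up for illustration, and it counts substring matches (overlaps included) rather than whole words.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.TreeMap;

class AhoCorasickSketch {
    private static final class Node {
        final Map<Character, Node> next = new HashMap<>();
        Node fail;                                       // longest proper suffix that is also a trie path
        final List<String> outputs = new ArrayList<>();  // patterns ending at this state
    }

    private final Node root = new Node();

    AhoCorasickSketch(List<String> patterns) {
        // Phase 1: build a trie of all patterns.
        for (String p : patterns) {
            if (p.isEmpty()) continue;
            Node node = root;
            for (char c : p.toCharArray()) {
                node = node.next.computeIfAbsent(c, k -> new Node());
            }
            node.outputs.add(p);
        }
        // Phase 2: breadth-first pass that sets failure links level by level.
        Queue<Node> queue = new ArrayDeque<>();
        for (Node child : root.next.values()) {
            child.fail = root;
            queue.add(child);
        }
        while (!queue.isEmpty()) {
            Node node = queue.remove();
            for (Map.Entry<Character, Node> e : node.next.entrySet()) {
                Node child = e.getValue();
                Node f = node.fail;
                while (f != null && !f.next.containsKey(e.getKey())) f = f.fail;
                child.fail = (f == null) ? root : f.next.get(e.getKey());
                child.outputs.addAll(child.fail.outputs); // shorter patterns that end here too
                queue.add(child);
            }
        }
    }

    // Returns how often each pattern occurs in text, in one pass over the characters.
    Map<String, Integer> countMatches(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        Node node = root;
        for (char c : text.toCharArray()) {
            while (node != root && !node.next.containsKey(c)) node = node.fail;
            node = node.next.getOrDefault(c, root);
            for (String p : node.outputs) counts.merge(p, 1, Integer::sum);
        }
        return counts;
    }
}
```

For example, building it from "he", "she", "his" and "hers" and scanning "ushers" reports one match each for "he", "hers" and "she". The automaton is built once and can then be reused for every chat line.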
I'm pretty sure string.contains is already highly optimized, so replacing it with something else is not going to do you a lot of good.
So the way to go, I suspect, is not to look for each and every bank-word in your chat words, but rather do multiple comparisons at once.
The first way to do it would be to create one huge regular expression that will match all your bank-words. Compile it and hope the regular expression package is efficient enough (chances are - it is). You will have a rather lengthy setup stage (the regex compilation), but matches should be a lot faster.
You can build an index of the words you need to match and count them as you process them. If you can use a HashMap to lookup the patterns for each word, the cost will be O(n * m)
You can use a HashMap for all the possible words, you can then dissect the words later.
e.g. say you need to match red and apple, you can combine the sum of
redapple = 1
applered = 0
red = 10
apple = 15
This means that red is actually 11 (10 + 1), and apple is 16 (15 + 1)
I don't know Korean, so I imagine the strategies used to tinker with Strings in English aren't necessarily applicable in the same way to Korean, but perhaps this strategy in pseudocode can be adapted with your knowledge of Korean to make it work. (Java is of course still the same, but for example, in Korean is it still highly likely for the letters "ough" to be in succession? Are there even letters for "ough"? With that being said, hopefully the principle can be applied.)
I would use String.toCharArray to create a two-dimensional array (or ArrayList if variable size needed). The
if (first letter of word matches keyword's first letter)//we have a candidate
skip to last letter of the current word //see comment below
if(last letter of word matches keyword's last letter)//strong candidate
iterate backwards to start+1 checking remainder of letters
The reason I suggest skipping to the last letter is that, statistically, a "consonant, vowel" pair is very common for the first two letters of a word, especially for nouns, which will make up a lot of your keywords since any food is a noun (almost all the keyword examples you gave match that structure of consonant, vowel). And since there are only 5 vowels (plus y), the second letter "i" of the keyword "pizza" is quite likely to show up in a candidate, yet after that point there is still a good chance that the word turns out not to be a match.
However if you know that the first letter and the last letter match, then you probably have a much stronger candidate and can then iterate in reverse. I think over larger sets of data, this would eliminate candidates much faster than checking letters in order. Basically you'd be letting too many fake candidates past the second iteration, thus increasing your overall conditional operations. It might sound like something small, but in a project like this there's lots of reiterating, so micro-optimizations will accumulate very quickly.
If this approach can be applied in a language that's probably structurally very different from English(I'm speaking from ignorance here though), then I think it might provide some efficiency for you whether you make it happen through iterating a char array or with a scanner, or any other construct.
The trick is to realise that if you can describe the string you are searching for as a regular expression you can also, by definition, describe it with a state machine.
At every character in your message start a state machine for every one of your 1600 patterns and pass the character through it. This sounds scary but believe me most of them will terminate immediately anyway so you aren't really doing a huge amount of work. Bear in mind that a state machine can usually be encoded with a simple switch/case or a ch == s.charAt at each step so they are close to the ultimate in light-weight.
Obviously you know what to do whenever one of your search machines terminates at the end of their search. Any that terminate before full-match can be discarded immediately.
private static class Matcher {
private final int where;
private final String s;
private int i = 0;
public Matcher ( String s, int where ) {
this.s = s;
this.where = where;
}
public boolean match(char ch) {
return s.charAt(i++) == ch;
}
public int matched() {
return i == s.length() ? where: -1;
}
}
// Words I am looking for.
String[] watchFor = new String[] {"flies", "like", "arrow", "banana", "a"};
// Test string to search.
String test = "Time flies like an arrow, fruit flies like a banana";
public void test() {
// Use a LinkedList because it is O(1) to remove anywhere.
List<Matcher> matchers = new LinkedList<> ();
int pos = 0;
for ( char c : test.toCharArray()) {
// Fire off all of the matchers at this point.
for ( String s : watchFor ) {
matchers.add(new Matcher(s, pos));
}
// Discard all matchers that fail here.
for ( Iterator<Matcher> i = matchers.iterator(); i.hasNext(); ) {
Matcher m = i.next();
// Should it be removed?
boolean remove = !m.match(c);
if ( !remove ) {
// Still matches! Is it complete?
int matched = m.matched();
if ( matched >= 0 ) {
// Todo - Should use getters.
System.out.println(" "+m.s +" found at "+m.where+" active matchers "+matchers.size());
// Complete!
remove = true;
}
}
// Remove it where necessary.
if ( remove ) {
i.remove();
}
}
// Step pos to keep track.
pos += 1;
}
}
prints
flies found at 5 active matchers 6
like found at 11 active matchers 6
a found at 16 active matchers 2
a found at 19 active matchers 2
arrow found at 19 active matchers 6
flies found at 32 active matchers 6
like found at 38 active matchers 6
a found at 43 active matchers 2
a found at 46 active matchers 3
a found at 48 active matchers 3
banana found at 45 active matchers 6
a found at 50 active matchers 2
There are several simple optimisations. With some simple pre-processing the most obvious is to use the current character to determine which matchers may be applicable.
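One way to sketch that pre-processing step: group the watch words by their first character once, then at each text position only try the words that could start there. This is a simplified stand-in that uses String.startsWith instead of the Matcher objects above; the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class FirstCharIndexSketch {
    // Groups the watch words by first character so each position only tries plausible words.
    static Map<Character, List<String>> indexByFirstChar(String[] words) {
        Map<Character, List<String>> index = new HashMap<>();
        for (String w : words) {
            if (w.isEmpty()) continue;
            index.computeIfAbsent(w.charAt(0), k -> new ArrayList<>()).add(w);
        }
        return index;
    }

    // Returns "word@position" for every occurrence, ordered by starting position.
    static List<String> findAll(String text, String[] words) {
        Map<Character, List<String>> index = indexByFirstChar(words);
        List<String> found = new ArrayList<>();
        for (int pos = 0; pos < text.length(); pos++) {
            List<String> candidates =
                    index.getOrDefault(text.charAt(pos), Collections.emptyList());
            for (String w : candidates) {
                if (text.startsWith(w, pos)) {
                    found.add(w + "@" + pos);
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(findAll("a banana", new String[] {"banana", "a", "arrow"}));
        // prints [a@0, banana@2, a@3, a@5, a@7]
    }
}
```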
This is a pretty broad question, so I won't go into too much detail, but roughly:
Pre-process the haystacks using something like a broad lemmatizer to create "topic word only" versions of the messages by noting which topics the words in each message cover. For example, any occurrences of "hamburger", "pizza", "coke", "lunch", "dinner", "restaurant", or "McDonalds" would cause the "topic" word "food" to be collected for that message. Some words may have multiple topics, e.g. "McDonalds" may be in the topics "food" and "business". Most words won't have any topic.
After this process, you'll have haystacks consisting of only "topic" words. Then create a Map<String, Set<Integer>> and populate it with the topic word and the Set of chat message ids that contain it. This is a reverse index from topic word to the chat messages that contain it.
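A rough sketch of how such an index might be populated. The word-to-topics map stands in for the lemmatizer step and is purely hypothetical, as are the class name, the build method, and the use of list positions as message ids; words are matched exactly after splitting on whitespace.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class TopicIndexSketch {
    // wordTopics maps a word to the topics it belongs to; message ids are list positions.
    static Map<String, Set<Integer>> build(List<String> messages,
                                           Map<String, Set<String>> wordTopics) {
        Map<String, Set<Integer>> index = new HashMap<>();
        for (int id = 0; id < messages.size(); id++) {
            for (String word : messages.get(id).split("\\s+")) {
                // Most words have no topic, so default to an empty set.
                Set<String> topics = wordTopics.getOrDefault(word, Collections.emptySet());
                for (String topic : topics) {
                    index.computeIfAbsent(topic, k -> new HashSet<>()).add(id);
                }
            }
        }
        return index;
    }
}
```

With "pizza" mapped to "food" and "McDonalds" mapped to both "food" and "business", the messages "pizza for lunch" (id 0) and "stock in McDonalds" (id 1) would yield food -> {0, 1} and business -> {1}.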
The runtime code to find all documents that contain all n words is then trivial and super fast - near O(#terms):
private Map<String, Set<Integer>> index; // pre-populated
Set<Integer> search(String... topics) {
Set<Integer> results = null;
for (String topic : topics) {
Set<Integer> hits = index.get(topic);
if (hits == null)
return Collections.emptySet();
if (results == null)
results = new HashSet<Integer>(hits);
else
results.retainAll(hits);
if (results.isEmpty())
return Collections.emptySet(); // exit early
}
return results;
}
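This runs in time roughly proportional to the number of search terms (plus the cost of each retainAll, which is bounded by the size of the running result set), and tells you which messages share all search terms. If you just want the number, use size() on the returned Set.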
I asked about this array a little while ago, and I can't see what the problem is. Too tired. What have I done wrong? Basically, I am taking a string array and trying to check to see if it contains numbers or an x (ISBN number validation). I want to take the number from a given input (bookNum), check the input, and feed any valid input into a new array (book). At the line
'bookNum.charAt[j]==book[i]'
I get the 'not a statement error'. What gives?
String[] book = new String [ISBN_NUM];
bookNum.replaceAll("-","");
if (bookNum.length()!=ISBN_NUM)
throw new ISBNException ("ISBN "+ bookNum + " must be 10 characters");
for (int i=0;i<bookNum.length();i++)
{
if (Character.isDigit(bookNum.charAt(i)))
bookNum.CharAt[j]==book[i];
j++;
if (book[9].isNotDigit()||
book[9]!="x" ||
book[9]!="X")
throw new ISBNException ("ISBN " + bookNum + " must contain all digits" +
"or 'X' in the last position");
== in Java is used for equality comparison. If you want to assign, use a single =.
The first issue here is that charAt is a function, and thus needs parenthesis even though you are accessing with an index like an array.
The other issue is that the line is a boolean expression, which just by itself does not mean anything. A lot of people are suggesting that you mean to make an assignment to that character, but just changing to a single equals causes other problems. The left side of an equals sign needs to be a variable, and the result of a function is not a variable.
Strings are immutable, so you can not simply change one of the characters in the string. Earlier in your code, you have a call to replaceAll(), that returns a new string with the alterations. As written, this altered string is being lost.
There are a few odd problems here. For starters, did you mean for book to be an array of Strings, as opposed to just one String? You're trying (assuming charAt was written properly and the assignment was proper) to assign a character to a String.
Second, instead of copying character by character, why not check the whole string, and copy the whole thing at the end if it is a proper ISBN? Depending on what you do with Exceptions (if you continue regardless), you could add a boolean as a flag that gets set if there is an error. At the end, if there is no error, then make book = to booknumber.replace(etc...)
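A sketch of that "check the whole string, then keep it" approach. IllegalArgumentException is used here in place of the question's ISBNException, whose definition isn't shown; the class and method names are made up for illustration, and only the format is checked, not the ISBN checksum.

```java
class IsbnSketch {
    // Returns the ten ISBN characters without hyphens, or throws if the input is malformed.
    static String normalize(String bookNum) {
        // Strings are immutable: replace() returns a new String, so keep the result.
        String cleaned = bookNum.replace("-", "");
        if (cleaned.length() != 10) {
            throw new IllegalArgumentException("ISBN " + bookNum + " must be 10 characters");
        }
        // The first nine characters must all be digits.
        for (int i = 0; i < 9; i++) {
            if (!Character.isDigit(cleaned.charAt(i))) {
                throw new IllegalArgumentException("ISBN " + bookNum + " must contain all digits");
            }
        }
        // The last character may be a digit or an X (either case).
        char last = cleaned.charAt(9);
        if (!Character.isDigit(last) && last != 'x' && last != 'X') {
            throw new IllegalArgumentException(
                    "ISBN " + bookNum + " must end in a digit or 'X'");
        }
        return cleaned;
    }

    public static void main(String[] args) {
        System.out.println(normalize("0-306-40615-2")); // prints 0306406152
    }
}
```

Note that the last-position check uses && between the negated tests; chaining != comparisons with || would be true for every input.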
bookNum.CharAt[j]==book[i];
Should be
bookNum.CharAt[j]=book[i];
You are using an equality boolean operator, not an assignment one.
Looks like you're using .charAt(i) wrong! Assuming that "bookNum" is a String, you should use:
bookNum.charAt(i)==book[i];
Instead. Note that this is a boolean expression, and not "=".
The line bookNum.CharAt[j]==book[i]; isn't a statement. It's a comparison. Perhaps you want bookNum.CharAt[j]=book[i]; (single = instead of ==).
Edit: That's not going to fix things, though, since you can't assign to bookNum.CharAt[j].