My roommate's teacher gave them an assignment to implement a string length method in Java.
We have thought of two ways:
1. Check each element in turn; when an out-of-bounds exception is thrown, we have reached the end of the string, so we catch the exception and that gives us the length.
2. Every time a string is passed in to have its length calculated, append a special character to the end of it, such as '\0' or "A".
We both think these two ways would complete the assignment, but they feel wrong (using exceptions this way is a bad habit), and neither is elegant.
We have googled it but didn't find what we wanted.
Something like this?
int i = 0;
for (char ch : string.toCharArray()) {
i++;
}
The pseudo-code you probably want is:
counter = 0
for(Character c in string) {
counter = counter + 1
}
This requires you to find a way to turn a Java String into an array of characters.
Likely the teacher is trying to make his or her students think, and will be satisfied with creative solutions that solve the problem.
None of these solutions would be used in the real world, because we have the String.length() method. But the creative, problem-solving process you're learning would be used in real development.
"1. Check the element,and when get the out of bounds exception,it means the end of string,we catch this exception,then we can get the length."
Here, you're causing an exception to be thrown in the normal case. A common style guideline is for exceptions to be thrown only in exceptional cases. Compared to normal flow of control, throwing an exception can be more expensive and harder for people to follow.
That said, this idea has a potential advantage for very long strings. All of the answers posted so far run in linear time and space: the time and/or additional space they take is proportional to the length of the string. With this approach, you could instead implement a search for the length of the string that needs only O(log n) index probes.
Linear or not, it's possible that the teacher would find this approach acceptable for its creativity. Avoid if the teacher has communicated the idea that exceptions are only for exceptional cases.
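A minimal sketch of that O(log n) idea, assuming the only operation allowed on the string is charAt. The names LengthByProbing, inBounds and length are made up for illustration. It doubles a candidate index until a probe fails, then binary-searches between the last valid index and the first invalid one.

```java
class LengthByProbing {
    // Probes a single index; relies on charAt throwing for an invalid index.
    static boolean inBounds(String s, int i) {
        try {
            s.charAt(i);
            return true;
        } catch (IndexOutOfBoundsException e) {
            return false;
        }
    }

    static int length(String s) {
        if (!inBounds(s, 0)) {
            return 0; // even index 0 is invalid, so the string is empty
        }
        long lo = 0; // largest index known to be valid
        long hi = 1; // candidate for the smallest invalid index
        // Exponential phase: double hi until it falls outside the string.
        while (hi <= Integer.MAX_VALUE && inBounds(s, (int) hi)) {
            lo = hi;
            hi *= 2;
        }
        // Binary phase: lo is valid, hi is invalid; narrow the gap to 1.
        while (hi - lo > 1) {
            long mid = lo + (hi - lo) / 2;
            if (inBounds(s, (int) mid)) {
                lo = mid;
            } else {
                hi = mid;
            }
        }
        return (int) hi; // the smallest invalid index equals the length
    }
}
```

The number of probes grows with the logarithm of the length, but roughly half of them throw, so for short strings this is likely slower than a plain loop.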
"2. Every time a string is pass to calculate the length,we add the special character to the end of it,it can be '\0',or "A",etc.."
This idea has a flaw. What happens if the string contains your special character?
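To make the flaw concrete, here is a small sketch of the sentinel idea (the class and method names are hypothetical). It counts correctly for ordinary text but stops early as soon as the input itself contains the sentinel.

```java
class SentinelFlaw {
    // Appends '\0' as an end marker and counts characters until the marker is seen.
    static int sentinelLength(String str) {
        String padded = str + '\0';
        int count = 0;
        while (padded.charAt(count) != '\0') {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(sentinelLength("foo"));   // prints 3
        System.out.println(sentinelLength("fo\0o")); // prints 2, although the string has 4 characters
    }
}
```

Whatever character is chosen as the marker, some valid string can contain it, so the approach cannot be correct for every input.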
EDIT
A simple implementation would be to get a copy of the underlying char array with String.toCharArray(), then simply take its length. Unlike your ideas, this is not an in-place approach - making the copy requires additional space in memory.
String s = "foo";
int length = s.toCharArray().length;
Try this
public static int Length(String str) {
str = str + '\0';
int count = 0;
for (int i = 0; str.charAt(i) != '\0'; i++) {
count++;
}
return count;
}
What about:
"your string".toCharArray().length
I am trying to solve this question on LeetCode:
A string s is nice if, for every letter of the alphabet that s contains, it appears both in uppercase and lowercase. For example, "abABB" is nice because 'A' and 'a' appear, and 'B' and 'b' appear. However, "abA" is not because 'b' appears, but 'B' does not.
Given a string s, return the longest substring of s that is nice. If there are multiple, return the substring of the earliest occurrence. If there are none, return an empty string.
For s = "YazaAay", the expected output is: "aAa"
One of the top voted solutions uses a Divide and Conquer approach:
import java.util.HashSet;
import java.util.Set;

class Solution {
public String longestNiceSubstring(String s) {
if (s.length() < 2) return "";
char[] arr = s.toCharArray();
Set<Character> set = new HashSet<>();
for (char c: arr) set.add(c);
for (int i = 0; i < arr.length; i++) {
char c = arr[i];
if (set.contains(Character.toUpperCase(c)) && set.contains(Character.toLowerCase(c))) continue;
String sub1 = longestNiceSubstring(s.substring(0, i));
String sub2 = longestNiceSubstring(s.substring(i+1));
return sub1.length() >= sub2.length() ? sub1 : sub2;
}
return s;
}
}
I understand how it works, but not the intuition behind using a Divide and Conquer approach. In other words, if I revisit the problem again after a few days/weeks after I have forgotten everything about it, I won't be able to realize it is a Divide and Conquer problem.
What is that 'thing' that makes it solvable by a Divide and Conquer approach?
This is how the algorithm could be described in plain English:
If the entire string is nice, we are done.
Otherwise, there must be a character which exists in only one case. Such a character naturally divides the string into two substrings. Conquer each of them individually, and compare results.
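Edit: BTW, I don't think it is a good example of a D&C problem. Once we encounter the first "bad" character, every character to the left of it has both cases present somewhere in the whole string, so it is tempting to just record that prefix and keep going in a simple loop. Be careful, though: the matching case may sit to the right of the bad character (in "abXAB" the prefix "ab" is not nice on its own), so the prefix still has to be re-checked, which is what the recursive call does.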
Divide-and-conquer, to paraphrase Wikipedia, is most appropriate when a problem can be broken down into "2 or more subproblems". The solution here checks whether the input string meets the condition, then breaks it in two at an offending character, and recursively checks the resulting strings until no further split is possible. Generally, the application of divide-and-conquer is easy to get a feel for when the problem can be subdivided symmetrically, such as in the DeWall algorithm for computing the Delaunay triangulation for a set of points (http://vcg.isti.cnr.it/publications/papers/dewall.pdf - cool stuff).
What sets the substring problem apart in this instance is that it checks all (edit:) possible viable subdivisions by incrementing the line of subdivision. To clarify for anyone who might be confused, this is necessary because the string can't simply be split down the middle, or else you might split a substring like "aAaA" apart and return only half of it in the end. This loosely meets the "or more" part of "two or more subproblems", but I agree it's not intuitive in this instance.
Hope this helps, I had to learn about this a lot recently while implementing the referenced algorithm. Someone with more experience might have a better answer.
I have this code, which checks whether two strings of equal length differ in exactly one character, but it seems pretty unwieldy. Is there a more canonical way of doing this in Java?
public boolean oneDiff(String from, String s) {
if (from.length()!=s.length()) return false;
int differences = 0;
for (int charIndex = 0;charIndex<from.length();charIndex++) {
if (from.charAt(charIndex)!=s.charAt(charIndex)) differences++;
}
return (differences==1);
}
I agree with @mk. However, to minimize loop execution you should not run the loop until the strings end. Instead, you can break out of the loop as soon as the number of differences becomes greater than 1. Like this:
for (int charIndex = 0;charIndex<from.length();charIndex++) {
if (from.charAt(charIndex)!=s.charAt(charIndex)) differences++;
if(differences > 1) break;
}
return (differences==1);
This makes execution faster by cutting the loop short, if that is what you want.
Nope, that really is the best way!
There's nothing built-in because this isn't something you need to do often. The closest trick is doing an xor on two integers and then getting the Hamming weight using bitCount, which tells you how many bits differ between them:
Integer.bitCount(int1 ^ int2)
But there's nothing like that for Strings - it's not a common case, so you have to code your own. And the way you've coded it seems fine - you really do have to loop over every character. I guess you could shorten the variable names and remove the parens around your return, but that's just cosmetic.
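As a rough sketch of both points, the class below (HammingSketch is a made-up name) puts the integer trick next to a hand-written string version that stops as soon as a second difference is seen. It is one possible shape, not a canonical library routine.

```java
class HammingSketch {
    // Number of bit positions in which a and b differ.
    static int bitDifferences(int a, int b) {
        return Integer.bitCount(a ^ b);
    }

    // True only if the strings have equal length and differ in exactly one position.
    static boolean oneDiff(String from, String s) {
        if (from.length() != s.length()) {
            return false;
        }
        int differences = 0;
        for (int i = 0; i < from.length(); i++) {
            // Bail out as soon as a second difference is found.
            if (from.charAt(i) != s.charAt(i) && ++differences > 1) {
                return false;
            }
        }
        return differences == 1;
    }
}
```

For example, bitDifferences(0b1010, 0b0110) is 2, and oneDiff("cat", "cot") is true while oneDiff("cat", "dog") is false.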
I'm trying to search an array of common English words to see if a specific word is contained in it, based on a text file. Since this array has >700,000 words and around 1000 words need to be checked against the array, multiple times, I thought it would be more efficient to separate the words into separate arrays or lists based on length. Is there an easy way to do this without using a switch or lots of if statements? Like so:
for (int i = 0; i < commonWordArray.length; i++) {
    if (commonWordArray[i].length() == 2) {
        twoLetterList.add(commonWordArray[i]);
    } else if (commonWordArray[i].length() == 3) {
        threeLetterList.add(commonWordArray[i]);
    } else if (commonWordArray[i].length() == 4) {
        fourLetterList.add(commonWordArray[i]);
    }
    ...etc
}
Then doing the same thing when checking the words:
for (int i = 0; i < checkWords.length; i++) {
    if (checkWords[i].length() == 2) {
        if (twoLetterList.contains(checkWords[i])) {
            ...etc
        }
    }
}
Step 1
Create word buckets.
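ArrayList<ArrayList<String>> buckets = new ArrayList<>();
// One bucket per length from 0 to maxWordLength inclusive, so that
// buckets.get(word.length()) is valid for the longest word too.
for (int i = 0; i <= maxWordLength; i++) {
    buckets.add(new ArrayList<String>());
}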
Step 2
Add words to your buckets.
buckets.get(word.length()).add(word);
This approach has the downside that some of your buckets may go unused. This is not an issue if you are only filtering common English words, as they do not exceed 30 characters in length. Creating 10-15 extra lists is a trivial overhead for a computer. The longest uncommon but non-technical word is 183 characters. Technical words exceed 180,000 characters, by which point this approach is clearly not practical.
The upside of this approach is that ArrayList.get() runs in constant (O(1)) time and ArrayList.add() in amortized constant time.
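Putting the two steps together, a minimal sketch could look like the following. The class WordBuckets and its method names are assumptions for illustration, and maxWordLength is computed from the input rather than given.

```java
import java.util.ArrayList;
import java.util.List;

class WordBuckets {
    // Builds one bucket per word length; bucket k holds the words of length k.
    static List<List<String>> bucketByLength(String[] words) {
        int maxWordLength = 0;
        for (String word : words) {
            maxWordLength = Math.max(maxWordLength, word.length());
        }
        List<List<String>> buckets = new ArrayList<>();
        for (int i = 0; i <= maxWordLength; i++) {
            buckets.add(new ArrayList<>());
        }
        for (String word : words) {
            buckets.get(word.length()).add(word);
        }
        return buckets;
    }

    // Looks only in the bucket matching the word's length.
    static boolean contains(List<List<String>> buckets, String word) {
        return word.length() < buckets.size()
                && buckets.get(word.length()).contains(word);
    }
}
```

Finding the right bucket is constant time, but contains() on an ArrayList is still a linear scan within that bucket.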
Use a List<Set<String>> sets. That is, given a String word, first find the proper set (Set<String> set = sets.get(word.length())), creating the set and extending the list if needed. Then just do a set.add(word). Done!
Edit/Hint: a (good) programmer should be lazy - if you need to do/write the same thing twice, you're doing something wrong.
Assuming you've got memory for it (which your current approach relies on), why not just a single Set<String>? Simpler, faster.
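A sketch of the single-set version, assuming the word arrays from the question are already loaded. The class and method names are hypothetical; the array parameter names are taken from the question.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class SingleSetLookup {
    // Counts how many of the words to check appear in the dictionary array.
    static int countKnown(String[] commonWordArray, String[] checkWords) {
        // Build the hash set once; each lookup is then expected constant time.
        Set<String> dictionary = new HashSet<>(Arrays.asList(commonWordArray));
        int known = 0;
        for (String word : checkWords) {
            if (dictionary.contains(word)) {
                known++;
            }
        }
        return known;
    }
}
```

String.hashCode() already takes every character into account, so splitting by length first gains little here.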
If you want to use multiple strings to search, you may want to try something like the Aho-Corasick algorithm.
Alternatively, you may want to turn the problem around and check whether each string from the 700k array is in the 1k array. This way you won't have memory issues (imho), and you can do it with a simple dictionary (balanced tree), so you'd have about 700k × log2(1000) comparisons.
Use a Trie, which is a memory-efficient storage mechanism which excels at storing words and checking for whether they exist or not.
Implementing one on your own is a fun exercise, or look at existing implementations.
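For a feel of what such an exercise might look like, here is a bare-bones sketch. WordTrie and its members are made-up names, and it uses a HashMap per node for simplicity, which is not the most compact representation.

```java
import java.util.HashMap;
import java.util.Map;

class WordTrie {
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean isWord; // true if a stored word ends at this node
    }

    private final Node root = new Node();

    // Walks or creates one node per character, then marks the end of the word.
    void insert(String word) {
        Node node = root;
        for (int i = 0; i < word.length(); i++) {
            node = node.children.computeIfAbsent(word.charAt(i), c -> new Node());
        }
        node.isWord = true;
    }

    // Follows the characters of the word; fails if a link is missing
    // or if the path exists only as a prefix of longer words.
    boolean contains(String word) {
        Node node = root;
        for (int i = 0; i < word.length(); i++) {
            node = node.children.get(word.charAt(i));
            if (node == null) {
                return false;
            }
        }
        return node.isWord;
    }
}
```

A lookup costs time proportional to the length of the word, independent of how many words are stored.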
Though the question is generic, I would mention the scenario which sparked the query.
Scenario:
I am interested in analyzing a large number of strings (numeric ones in particular). Therefore, my first job is to filter out those ones which contain even a single character other than numbers.
A simple way to do this is (in Java):
for (String val : stringArray){
try{
int num = Integer.parseInt(val);
doSomething(num);
}
catch(NumberFormatException nfe){}
}
Another point which I must mention is that there are only about 5% of the strings in the array which are purely numeric. Thus there would be, in short, a lot of catching involved.
What I was wondering was whether this is an efficient approach in terms of design, or whether I should be thinking of other ways to do the same thing.
Conclusion based on answers: Exceptions are indeed expensive and it is not a very good design practice to use them as a form of control statement.
Therefore, one should try and look for alternatives wherever possible and if still exceptions seem to be clearer/easier, one should document it well.
What you do here is inherently correct, as there is no other standard way in Java to check if a string is numeric.
If profiling shows that this operation is too slow, you could try to do the check yourself along the lines of the parseInt method, but the JVM may not optimize your code as well, so I don't recommend it. The JVM is heavily optimized to handle exceptions and does this job well.
As a curiosity, here are a few ways to do it in Java:
http://rosettacode.org/wiki/Determine_if_a_string_is_numeric#Java
with links to other languages, but your solution is the standard and idiomatic one and I doubt you'll find a big difference by rewriting it as in the example:
private static final boolean isNumeric(final String s) {
    if (s == null || s.isEmpty()) return false;
    for (int x = 0; x < s.length(); x++) {
        final char c = s.charAt(x);
        // leading minus sign, but a lone "-" is not a number
        if (x == 0 && c == '-' && s.length() > 1) continue;
        if ((c >= '0') && (c <= '9')) continue; // 0 - 9
        return false; // invalid
    }
    return true; // valid
}
Using this, in my opinion, would be a typical case of premature optimization leading to less maintainable code.
It is not efficient.
You can look up lots of resources on the web as to why throwing an exception is considered expensive, for example: http://www.yoda.arachsys.com/csharp/exceptions.html.
Unfortunately Java does not come with such a utility method OOTB (like C#'s tryParse). You can enumerate the characters of the string and use the Character.isDigit method (you can even intertwine the verification and the transformation into an int).
Exceptions should be used for abnormal termination of some flow.
When performing an operation that might raise an exception, you should always consider whether you can perform a check that will save you the cost, and especially the code, of handling an exception. For example, check whether a string is a number instead of trying to parse it and relying on the exception mechanism to tell you if it's not.
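One way the combined verification and conversion could look is sketched below. tryParseInt is a hypothetical helper loosely modeled on C#'s TryParse, not a JDK method. It uses a plain '0'..'9' range check rather than Character.isDigit, which also accepts non-ASCII digits, and unlike Integer.parseInt it does not accept a leading '+'.

```java
import java.util.OptionalInt;

class TryParse {
    // Returns the parsed value, or an empty OptionalInt if s is not a valid int.
    static OptionalInt tryParseInt(String s) {
        if (s == null || s.isEmpty()) {
            return OptionalInt.empty();
        }
        int start = (s.charAt(0) == '-') ? 1 : 0;
        if (start == s.length()) {
            return OptionalInt.empty(); // a lone "-"
        }
        long value = 0; // accumulate in a long so overflow can be detected
        for (int i = start; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < '0' || c > '9') {
                return OptionalInt.empty(); // non-digit character
            }
            value = value * 10 + (c - '0');
            if (value > (long) Integer.MAX_VALUE + 1) {
                return OptionalInt.empty(); // magnitude too large for any int
            }
        }
        if (start == 1) {
            value = -value;
        }
        if (value > Integer.MAX_VALUE) {
            return OptionalInt.empty(); // positive 2147483648 does not fit
        }
        return OptionalInt.of((int) value);
    }
}
```

A caller can then write tryParseInt(val).ifPresent(num -> doSomething(num)) without any try/catch in the loop.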
It's not likely to matter much in the larger context of your application. Micro-optimizations like this are hard to guess at.
A better approach is to write your code as cleanly as possible and then measure to see what its performance is and where bottlenecks, if any, reside. If you find that your performance is not acceptable, find the biggest bottleneck and address it if you can; rinse and repeat until performance is acceptable.
The problem is that none of us are smart enough to "know" where the problems will be. You're better off optimizing with data instead of guessing.
In your case, that's an unchecked exception. You could ignore it, but that would mean that a single bad string would blow you out of the loop. Putting the catch inside the loop allows you to tolerate that small percentage of input strings that fail the numeric parsing and continue on.
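A non-exception-based way to check for numeric-only strings would be to use a regular expression. For example:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// The wrapper class name is arbitrary.
public class NumericFilter {
    public static void main(String[] args) throws Exception {
        String[] array = {
            "abc",
            "123",
            "12A",
        };
        // \d+ requires at least one digit; \d* would also accept the empty string
        Pattern p = Pattern.compile("\\d+");
        for (String s : array) {
            Matcher m = p.matcher(s);
            if (m.matches()) {
                System.out.println(s);
            }
        }
    }
}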
Exception based handling can be expensive.
Regular expressions are not the fastest either.
Try both and see which is faster for you.
Trying to perform a binary search on a sorted array of Book objects.
It's not working well: it returns the correct results for some of the objects, but not all.
I went through the loop on paper and it seems that a number can get missed out due to rounding #.5 upwards.
Any ideas how to make this work?
Book found = null;
/*
* Search at the center of the collection. If the reference is less than that,
* search in the upper half of the collection, else, search in the lower half.
* Loop until found else return null.
*/
int top = numberOfBooks()-1;
int bottom = 0;
int middle;
while (bottom <= top && found == null){
middle = (bottom + top)/2;
if (givenRef.compareTo(bookCollection.get(middle).getReference()) == 0) {
found = bookCollection.get(middle);
} else if (givenRef.compareTo(bookCollection.get(middle).getReference()) < 0){
bottom = middle + 1;
} else if (givenRef.compareTo(bookCollection.get(middle).getReference()) > 0){
top = middle - 1;
}
}
return found;
A couple suggestions for you:
- There's no need to keep a Book variable. In your loop, just return the book when it's found, and return null at the end. You can then also remove the null check on that variable from the while condition.
- The middle variable can be scoped inside the loop; there's no need for it to live longer.
- You're calling bookCollection.get(middle).getReference() three times. Consider storing the result in a variable and reusing it.
- middle = (bottom + top)/2 is a classic mistake in binary search implementations. Even Joshua Bloch, who wrote the Java Collections classes, made that error (see this interesting blog post about it). Instead, use (bottom+top) >>> 1 to avoid integer overflow for very large values (you probably wouldn't encounter this error, but it's for the principle).
As for your actual problem statement, rounding would be downwards (integer division), not upwards. To troubleshoot the problem:
- Are you sure the numberOfBooks() method corresponds to the length of your collection?
- Are you sure the compareTo() method works as expected for the types you are using? (In your code example we do not know what the getReference() return type is.)
- Are you sure your collection is properly sorted according to getReference()?
- And finally, are you sure that using givenRef.compareTo(bookCollection.get(middle).getReference()) < 0 is correct? In standard binary search implementations it would be reversed, e.g. bookCollection.get(middle).getReference().compareTo(givenRef) < 0. This might be what donroby mentions, not sure.
In any case, the way to find the error would be to try out different values and see for which the output is correct and for which it isn't, and thus infer what the problem is. You can also use your debugger to help you step through the algorithm, rather than using pencil and paper if you have to run many tests. Even better, as donroby said, write a unit test.
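Applying the suggestions above, a reworked search might look roughly like this. The nested Book class is a hypothetical stand-in, since the real class is not shown in the question, and the list is assumed to be sorted ascending by getReference().

```java
import java.util.List;

class BookSearch {
    // Hypothetical minimal Book; the asker's real class is not shown.
    static final class Book {
        private final String reference;

        Book(String reference) {
            this.reference = reference;
        }

        String getReference() {
            return reference;
        }
    }

    // Assumes sortedBooks is sorted ascending by getReference().
    static Book findByReference(List<Book> sortedBooks, String givenRef) {
        int bottom = 0;
        int top = sortedBooks.size() - 1;
        while (bottom <= top) {
            int middle = (bottom + top) >>> 1; // overflow-safe midpoint
            Book candidate = sortedBooks.get(middle);
            int cmp = givenRef.compareTo(candidate.getReference());
            if (cmp == 0) {
                return candidate;
            }
            if (cmp < 0) {
                top = middle - 1;    // target sorts before middle: search the lower half
            } else {
                bottom = middle + 1; // target sorts after middle: search the upper half
            }
        }
        return null; // not found
    }
}
```

Returning directly from the loop removes the found variable, and the single cmp value replaces the three repeated compareTo calls.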
What about Collections.binarySearch()?
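A sketch of how that could be wired up, again with a hypothetical minimal Book class. Because Collections.binarySearch compares list elements with the key, the reference being searched for is wrapped in a throwaway Book.

```java
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class LibraryBinarySearch {
    // Hypothetical minimal Book; the asker's real class is not shown.
    static final class Book {
        private final String reference;

        Book(String reference) {
            this.reference = reference;
        }

        String getReference() {
            return reference;
        }
    }

    // Assumes sortedBooks is sorted ascending by getReference().
    static Book findByReference(List<Book> sortedBooks, String givenRef) {
        Comparator<Book> byReference = Comparator.comparing(Book::getReference);
        // A non-negative result is the index of a match; a negative one encodes the insertion point.
        int index = Collections.binarySearch(sortedBooks, new Book(givenRef), byReference);
        return index >= 0 ? sortedBooks.get(index) : null;
    }
}
```

The list must be sorted with the same comparator that is passed to binarySearch, otherwise the result is undefined.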
All of JRL's suggestions are right, but the actual failure is that your comparisons are reversed.
I didn't see this immediately myself, but replicating your code into a function (using strings instead of Books), writing some simple JUnit tests and then running them in the debugger made it really obvious.
Write unit tests!
I found the problem.
It turns out I was binary searching my bookCollection ArrayList, and NOT the new sorted array I had created, sortedLib.
Silly mistake at my end, but thanks for the input and suggestions!