set.add(new String(s) + (ch == 0 ? "" : ch) + new StringBuffer(new String(s)).reverse());
I came across this code written by someone else. It is Java code.
s is a char[].
set is a String set.
So why does he use String and then StringBuffer?
String has a constructor which takes an array of chars, hence why they create a new String first.
Then, to reverse the String, they create a StringBuffer so they can use its built-in reverse() method instead of implementing their own. StringBuffer has no constructor that takes a char[], but it does have one that takes a String, which is why a String is made first and then a StringBuffer.
Let's split the 3 parts on 3 lines to compare:
set.add(
new String(s)
+ (ch == 0 ? "" : ch)
+ new StringBuffer(new String(s)).reverse()
);
Rewritten
It is equivalent to:
String trimZero = ch == 0 ? "" : String.valueOf(ch);
set.add(String.valueOf(s) + trimZero + StringUtils.reverse(String.valueOf(s)));
This uses StringUtils.reverse() from Apache Commons Lang, which takes a String, so the char[] is converted first.
If s is a String, it can simply be added as is, for example in this alternative form (to emphasize the different structures):
if (ch == 0) {
set.add(s + StringUtils.reverse(s));
} else {
set.add(s + String.valueOf(ch) + StringUtils.reverse(s));
}
Output wise
For example:
alphabet gets added as alphabet8tebahpla (here ch happens to hold a non-zero value that prints as 8).
an gets added as anna (given that ch == 0)
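The two outputs above can be reproduced with a small sketch. It assumes s is a char[] and ch is a char, as the question states. MirrorDemo and mirror are hypothetical names, and StringBuilder stands in for StringBuffer because no synchronization is needed here.

```java
// Minimal sketch; MirrorDemo and mirror() are illustrative names, not from the original code.
final class MirrorDemo {

    // Builds s + ch + reverse(s), leaving out ch when it is the zero character.
    static String mirror(char[] s, char ch) {
        String forward = new String(s);                        // char[] -> String
        String middle = (ch == 0) ? "" : String.valueOf(ch);   // skip the zero character
        String backward = new StringBuilder(forward).reverse().toString();
        return forward + middle + backward;
    }

    public static void main(String[] args) {
        System.out.println(mirror("alphabet".toCharArray(), '8'));  // alphabet8tebahpla
        System.out.println(mirror("an".toCharArray(), (char) 0));   // anna
    }
}
```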
Abbreviations and naming-conventions
When guessing the types I would say:
ch is probably a primitive char, an array of those, or a CharSequence
s is most likely a String (rather than the rarely used short integer)
Usually abbreviations in symbols/names are ambiguous and can be considered a code-smell.
However there seems to be a historical and accepted convention or habit, especially for temporary / looping variables like:
int i
String s
char ch
The Java Pocket Guide, 4th Edition, by Robert Liguori and Patricia Liguori (Chapter 1, Naming Conventions) lists them in a table:
Temporary variable names may be single letters such as i, j, k, m, and n for integers and c, d, and e for characters. Temporary and looping variables may be one-character names as shown in Table 1-1.
Even core Java methods have such (ambiguous) abbreviated parameter-names
if the method-context is obvious enough
that the contents and purpose of the parameter is self-explaining
E.g. String.contains(CharSequence s)
Related
I am trying to compare two char arrays lexicographically, using loops and arrays only. I solved the task, but I think my code is bulky and unnecessarily long, and I would like advice on how to optimize it. I am a beginner. See the code below:
//Compare Character Arrays Lexicographically
//Write a program that compares two char arrays lexicographically (letter by letter).
// Research how to convert string to char array.
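import java.util.Scanner;
public class CompareCharArrays {
public static void main(String[] args) {
Scanner scanner = new Scanner(System.in);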
String word1 = scanner.nextLine();
String word2 = scanner.nextLine();
char[] firstArray = word1.toCharArray();
char[] secondArray = word2.toCharArray();
for (char element : firstArray) {
System.out.print(element + " ");
}
System.out.println();
for (char element : secondArray) {
System.out.print(element + " ");
}
System.out.println();
String s = String.valueOf(firstArray);
String b = String.valueOf(secondArray);
int result = s.compareTo(b);
if (result < 0) {
System.out.println("First");
} else if (result > 0) {
System.out.println("Second");
} else {
System.out.println("Equal");
}
}
}
I think it's pretty normal. You've done it right. There's not much code to reduce here; the best you can do is drop the two for loops that print the char arrays. Or, if you do want to print the two arrays, use System.out.println(Arrays.toString(array_name)); instead of two dedicated for-each loops. It loops over the array in the background in the same way (the output format differs slightly: [a, b, c]) but makes your code look a little cleaner, which is what you are looking for.
As commented by tgdavies, your schoolwork assignment was likely intended for you to compare characters in your own code rather than call String#compareTo.
In real life, sorting words alphabetically is quite complex because of various cultural norms across various languages and dialects. For real work, we rely on collation tools rather than write our own sorting code. For example, an e with an acute accent (é) may sort before or after an e without, depending on the cultural context.
But for a schoolwork assignment, the goal of the exercise is likely to have you compare each letter of each word by examining its code point number, the number assigned to identify each character defined in Unicode. These code point numbers are assigned by Unicode in roughly alphabetical order. This code point number ordering is not sufficient to do sorting in real work, but is presumably good enough for your assignment, especially for text using only basic American English using letters a-z/A-Z.
So, if the numbers are the same, move to the next character in each word. When you reach the first position where the two letters differ, then in overly simplistic terms, the word with the smaller number comes first alphabetically. If all the numbers are the same and the words have the same length, the words are the same.
Another real world problem is the char type has been legacy since Java 5, essentially broken since Java 2. As a 16-bit value, char is physically incapable of representing most characters.
So instead of char arrays, use int arrays to hold code point integer numbers.
int[] firstWordCodePoints = firstWord.codePoints().toArray() ;
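A minimal sketch of that loop, assuming both words are available as String values. CodePointCompare and compare are hypothetical names, and the rule that a shorter word sorts before a longer word with the same prefix is an assumption borrowed from String#compareTo.

```java
// Minimal sketch; class and method names are illustrative only.
final class CodePointCompare {

    // Returns a negative number, zero, or a positive number, like String#compareTo.
    static int compare(String firstWord, String secondWord) {
        int[] first = firstWord.codePoints().toArray();
        int[] second = secondWord.codePoints().toArray();
        int shared = Math.min(first.length, second.length);
        for (int i = 0; i < shared; i++) {
            if (first[i] != second[i]) {
                // First differing position decides the order.
                return Integer.compare(first[i], second[i]);
            }
        }
        // One word is a prefix of the other (or they are equal): the shorter one sorts first.
        return Integer.compare(first.length, second.length);
    }

    public static void main(String[] args) {
        System.out.println(compare("apple", "apricot") < 0);  // true: 'p' comes before 'r'
        System.out.println(compare("same", "same") == 0);     // true
        System.out.println(compare("abcd", "abc") > 0);       // true: longer word sorts last
    }
}
```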
I'm doing a problem on HackerRank and the problem is:
Problem Statement
Here we have to count the number of valleys that a person XYZ visits.
A valley is a sequence of consecutive steps below sea level, starting with a step down from sea level and ending with a step up to sea level.
One step up is U, and one step down is D. We are given the number of steps XYZ traveled plus the ups and downs in the form of a string, e.g.
UUDDUDUDDUU
Sample Input
8
UDDDUDUU
Sample Output
1
Explanation
If we represent _ as sea level, a step up as /, and a step down as \, Gary's hike can be drawn as:
_/\      _
   \    /
    \/\/
He enters and leaves one valley.
The code I wrote doesn't work
static int countingValleys(int n, String s) {
int count = 0;
int level = 0;
String[] arr = s.split("");
for(int i = 0; i<n;i++){
if(arr[i] == "U"){
level++;
} else{
level--;
}
if(level==0 && arr[i]=="U"){
count++;
}
}
return count;
}
But another solution I found does, however no matter how I look at it the logic is the same as mine:
static int countingValleys(int n, String s) {
int v = 0; // # of valleys
int lvl = 0; // current level
for(char c : s.toCharArray()){
if(c == 'U') ++lvl;
if(c == 'D') --lvl;
// if we just came UP to sea level
if(lvl == 0 && c == 'U')
++v;
}
return v;
}
So what's the difference I'm missing here that causes mine to not work?
Thanks.
In Java, you need to do this to compare String values:
if ("U".equals(arr[i])) {
And not this:
if(arr[i] == "U") {
The former compares the value "U" to the contents of arr[i].
The latter checks whether the strings reference the same content or more precisely the same instance of an object. You could think of this as do they refer to the same block of memory? The answer, in this case, is they do not.
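Applied to the method from the question, the fix could look like the sketch below. ValleyCounter is a hypothetical wrapper class, and the code assumes Java 8 or later, where s.split("") yields one single-character string per step with no leading empty string.

```java
// Minimal sketch of the question's method with equals() instead of ==.
final class ValleyCounter {

    static int countingValleys(int n, String s) {
        int count = 0;
        int level = 0;
        String[] arr = s.split("");
        for (int i = 0; i < n; i++) {
            boolean up = "U".equals(arr[i]);   // compares contents, not references
            level += up ? 1 : -1;
            // A valley ends when a step up brings us back to sea level.
            if (level == 0 && up) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countingValleys(8, "UDDDUDUU"));  // 1
    }
}
```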
To address the other aspect of your question.
Why this works:
for(char c : s.toCharArray()){
if(c == 'U') ++lvl;
if(c == 'D') --lvl;
when this does not:
String[] arr = s.split("");
for(int i = 0; i<n;i++){
if(arr[i] == "U"){
You state that the logic is the same. Hmmmm, maybe, but the data types are not.
In the first version, the string s is converted into an array of char values. These are primitive values (i.e. an array of values of a primitive data type), just like numbers are (ignoring autoboxing for a moment). Since char is a primitive type, the == operator compares the values themselves. Thus c == 'U' (or "is the primitive character value in c equal to the literal value 'U'?") results in true if c happens to contain the letter 'U'.
In the second version, the string s is split into an array of strings. This is an array of instances (or more precisely, an array of references to instances) of String objects. In this case the == operator compares the reference values (you might think of them as pointers to the two strings). The value of arr[i] (i.e. the reference to the string) is compared to the reference to the string literal "U" (or "D"). Thus arr[i] == "U" (or "is the reference in arr[i] equal to the reference to the String instance that holds the literal "U"?") is false, because the strings produced by split() are separate objects located elsewhere in memory than the literal.
As mentioned above, since they are different instances of String objects the == test is false (the fact that they just happen to contain the same value is irrelevant in Java because the == operator doesn't look at the content). Hence the need for the various equals, equalsIgnoreCase and some other methods associated with the String class that define exactly how you wish to "compare" the two string values. At risk of confusing you further, you could consider a "reference" or "pointer" to be a primitive data type, and thus, the behaviour of == is entirely consistent.
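The difference can be seen in isolation with a short sketch. StringEqualityDemo and its helper methods are hypothetical names; new String("U") is used because it is guaranteed to create an object distinct from the literal.

```java
// Minimal sketch contrasting reference comparison with content comparison.
final class StringEqualityDemo {

    // True only if both variables refer to the very same object.
    static boolean sameReference(String a, String b) {
        return a == b;
    }

    // True if both strings hold the same sequence of characters.
    static boolean sameContent(String a, String b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        String literal = "U";
        String copy = new String("U");  // a separate object with the same content

        System.out.println(sameReference(copy, literal));  // false
        System.out.println(sameContent(copy, literal));    // true
        System.out.println("u".equalsIgnoreCase(copy));    // true

        char c = 'U';
        System.out.println(c == 'U');                      // true: primitives compare by value
    }
}
```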
If this doesn't make sense, then think about it in terms of other object types. For example, consider a Person class which maybe has name, date of birth and zip/postcode attributes. If two instances of Person happen to have the same name, DOB and zip/postcode, does that mean that they are the same Person? Maybe, but it could also mean that they are two different people that just happen to have the same name, same date of birth and just happen to live in the same suburb. While unlikely, it definitely does happen.
FWIW, the behaviour of == in Java is the same behaviour as == in 'C'. For better or worse, right or wrong, this is the behaviour that the Java designers chose for == in Java.
It is worth noting that other languages, e.g. Scala, define the == operator for Strings (again rightly or wrongly, for better or worse) to compare the values of the strings. So, in theory, if you addressed other syntactic issues, your arr[i] == "U" test would work in Scala. It all boils down to understanding the rules that the various operators and methods implement.
Going back to the Person example, assume Person was defined as a case class in Scala. If we created two instances of Person with the same name, DOB and zip/postcode (e.g. p1 and p2), then p1 == p2 would be true (in Scala). To perform a reference comparison (i.e. are p1 and p2 instances of the same object), we would need to use p1.eq(p2) (which would result in false).
Hopefully the Scala reference does not create additional confusion. If it does, then simply think of it this way: the function of an operator (or method) is defined by the designers of the language or library that you are using, and you need to understand what their rules are.
At the time Java was designed, C was prevalent, so it can be argued that replicating the C-like behaviour of == in Java was a sensible choice at that time. As time has moved on, more people have come to think that == should be a value comparison, and thus some languages have implemented it that way.
I have a problem, and I think it lies in comparing a char with a number.
String FindCountry = "BB";
Map<String, String> Cont = new HashMap <> ();
Cont.put("BA-BE", "Angola");
Cont.put("9X-92", "Trinidad & Tobago");
for ( String key : Cont.keySet()) {
if (key.charAt(0) == FindCountry.charAt(0) && FindCountry.charAt(1) >= key.charAt(1) && FindCountry.charAt(1) <= key.charAt(4)) {
System.out.println("Country: "+ Cont.get(key));
}
}
In this case the code prints "Angola", but with
String FindCountry = "9Z"
it doesn't print anything. I am not sure, but I think the problem is that '2' is not greater than 'Z' when chars are compared. In this example I have only two Cont.put() calls, but in my file I have many more, and a lot of them contain digits as well as letters. Those are the ones I have problems with.
What is the smartest and best way to compare a char with a number? Actually, if I could set a rule like "1" is greater than "Z" it would be okay, because I need this ordering: A-Z-9-0.
Thanks!
You can use a lookup "table", I used a String:
private static final String LOOKUP = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
And then compare the chars with indexOf(), but it seems messy and could probably be achieved more easily, I just can't come up with something easier at the moment:
String FindCountry = "9Z";
Map<String, String> Cont = new HashMap<>();
Cont.put("BA-BE", "Angola");
Cont.put("9X-92", "Trinidad & Tobago");
for (String key : Cont.keySet()) {
if (LOOKUP.indexOf(key.charAt(0)) == LOOKUP.indexOf(FindCountry.charAt(0)) &&
LOOKUP.indexOf(FindCountry.charAt(1)) >= LOOKUP.indexOf(key.charAt(1)) &&
LOOKUP.indexOf(FindCountry.charAt(1)) <= LOOKUP.indexOf(key.charAt(4))) {
System.out.println("Country: " + Cont.get(key));
}
}
If you only use the characters A-Z and 0-9, you could add a conversion method in between which will increase the values of the 0-9 characters so they'll be after A-Z:
int applyCharOrder(char c){
// If the character is a digit:
if(c < 58){
// Add 43 to put it after the 'Z' in terms of decimal unicode value:
return c + 43;
}
// If it's an uppercase letter instead: simply return it as is
return c;
}
Which can be used like this:
if(applyCharOrder(key.charAt(0)) == applyCharOrder(findCountry.charAt(0))
&& applyCharOrder(findCountry.charAt(1)) >= applyCharOrder(key.charAt(1))
&& applyCharOrder(findCountry.charAt(1)) <= applyCharOrder(key.charAt(4))){
System.out.println("Country: "+ cont.get(key));
}
Note: Here is a table with the decimal unicode values. Characters '0'-'9' will have the values 48-57 and 'A'-'Z' will have the values 65-90. So the < 58 is used to check if it's a digit-character, and the + 43 will increase the 48-57 to 91-100, putting their values above the 'A'-'Z' so your <= and >= checks will work as you'd want them to.
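As a quick check of the offset approach against the two ranges from the question, here is a minimal sketch. OffsetOrderDemo and inRange are hypothetical names; the key format "XY-XZ" (shared first character, range over the second) is taken from the question, and only 'A'-'Z' and '0'-'9' are assumed to occur.

```java
// Minimal sketch; assumes keys look like "9X-92" and codes like "9Z".
final class OffsetOrderDemo {

    // Shifts digits (48-57) to 91-100 so they sort after 'Z' (90).
    static int applyCharOrder(char c) {
        return (c < 58) ? c + 43 : c;
    }

    // Checks whether the two-character code falls inside the key's range.
    static boolean inRange(String key, String code) {
        int second = applyCharOrder(code.charAt(1));
        return key.charAt(0) == code.charAt(0)
                && second >= applyCharOrder(key.charAt(1))
                && second <= applyCharOrder(key.charAt(4));
    }

    public static void main(String[] args) {
        System.out.println(inRange("BA-BE", "BB"));  // true
        System.out.println(inRange("9X-92", "9Z"));  // true: X <= Z <= 2 in the custom order
        System.out.println(inRange("9X-92", "93"));  // false: 3 sorts after 2
    }
}
```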
Alternatively, you could create a look-up String and use its index for the order:
int applyCharOrder(char c){
return "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789".indexOf(c);
}
PS: As mentioned in the first comment by @Stultuske, variables are usually written in camelCase, so they don't start with an uppercase letter.
As the others stated in the comments, comparison operators on chars work on the numeric (ASCII/Unicode) value of each char. So I'd suggest you refactor your logic using the ASCII table as a reference.
Let's say there is a string like " world ". This String only has blanks at the front and the end. Is trim() faster than replace()?
I used replace() once and my mentor told me not to use it since trim() is probably faster.
If it is not faster, what is the advantage of trim() over replace()?
If we look at the source code for the methods:
replace():
public String replace(CharSequence target, CharSequence replacement) {
String tgtStr = target.toString();
String replStr = replacement.toString();
int j = indexOf(tgtStr);
if (j < 0) {
return this;
}
int tgtLen = tgtStr.length();
int tgtLen1 = Math.max(tgtLen, 1);
int thisLen = length();
int newLenHint = thisLen - tgtLen + replStr.length();
if (newLenHint < 0) {
throw new OutOfMemoryError();
}
StringBuilder sb = new StringBuilder(newLenHint);
int i = 0;
do {
sb.append(this, i, j).append(replStr);
i = j + tgtLen;
} while (j < thisLen && (j = indexOf(tgtStr, j + tgtLen1)) > 0);
return sb.append(this, i, thisLen).toString();
}
Vs trim():
public String trim() {
int len = value.length;
int st = 0;
char[] val = value; /* avoid getfield opcode */
while ((st < len) && (val[st] <= ' ')) {
st++;
}
while ((st < len) && (val[len - 1] <= ' ')) {
len--;
}
return ((st > 0) || (len < value.length)) ? substring(st, len) : this;
}
As you can see replace() calls multiple other methods and iterates throughout the entire String, while trim() simply iterates over the beginning and ending of the String until the character isn't a white space. So in the single respect of trying to only remove white space before and after a word, trim() is more efficient.
We can run some benchmarks on this:
public static void main(String[] args) {
long testStartTime = System.nanoTime();
trimTest();
long trimTestTime = System.nanoTime() - testStartTime;
testStartTime = System.nanoTime();
replaceTest();
long replaceTime = System.nanoTime() - testStartTime;
System.out.println("Time for trim(): " + trimTestTime);
System.out.println("Time for replace(): " + replaceTime);
}
public static void trimTest() {
for(int i = 0; i < 1000000; i ++) {
new String(" string ").trim();
}
}
public static void replaceTest() {
for(int i = 0; i < 1000000; i ++) {
new String(" string ").replace(" ", "");
}
}
Output:
Time for trim(): 53303903
Time for replace(): 485536597
//432,232,694 difference
Assuming that the people writing the Java library code are doing a good job [1], you can assume that a special purpose method (like trim()) will be as fast as, and probably faster than, a general purpose method (like replace(...)) doing the same thing.
Two reasons:
If the special purpose method is slower, its implementation can be rewritten as equivalent calls to the general purpose one, making the performance equivalent in most cases. A competent programmer will do this because it reduces maintenance costs.
In the special purpose method, it is likely that there will be optimizations that can be made that don't apply in the general-purpose case.
In this case we know that trim() only needs to look at the start and end of the string ... whereas replace(...) needs to look at all of the characters in the string. (We can infer this from the description of what the respective methods do.)
If we assume "competence" then we can infer that the developers will have done the analysis and not implemented trim() sub-optimally [2]; i.e. they won't code trim() to examine all characters.
There is another reason to use the special purpose method over the general purpose. It makes your code simpler, easier to read, and easier to inspect for correctness. This may well be more important than performance.
This clearly applies in the case of trim() versus replace(...).
1 - We can in this case. There are lots of eyes looking at this code, and lots of people who will complain loudly about egregious performance issues.
2 - Unfortunately, it is not always as straightforward as this. A library method needs to be optimized for "typical" behavior, but it also needs to avoid pathological performance in edge-cases. It is not always possible to achieve both things.
trim() is definitely faster to type, yes. It doesn't take any parameters.
It is also much faster to understand what you were trying to do. You were trying to trim the string, rather than replace all the spaces it contains with the empty string while knowing from other context that there are only spaces at the beginning and the end of the string.
Indeed much faster no matter how you look at it. Don't complicate the life of the people who are trying to read your code. Most of the time, it will be you months later, or at least someone you don't hate.
trim() prunes the outer characters until it reaches non-whitespace. It removes every leading and trailing character up to and including the space character (U+0020), which covers space, tab, and newline.
replace() scans the entire string (so it could be a whole sentence) and replaces inner " " with "" as well, essentially compressing the words together.
They have different use cases, though: one is for cleaning up user input, the other is for updating a string wherever matches are found.
That being said, run times: replace() runs in O(N) time, as it scans the whole string for matching characters. trim() is also O(N) in the worst case, but it will most likely inspect just a few characters at each end.
The idea behind trim(), I think, came from people who would type input and accidentally press space before submitting their forms, essentially trying to save the field "Foo " instead of "Foo".
s.trim() shortens a String s. This means no characters have to be moved from one index to another. It starts at the first character (s.toCharArray()[0]) of the String and skips character by character until the first non-whitespace character occurs. It works the same way from the end of the String. So it only narrows the String. If a String has no leading or trailing whitespace, trim() is done after checking the first and the last character.
In the case of " world ".trim() two steps are needed: one to remove the leading whitespace at the first index and a second to remove the trailing whitespace at the last index.
" world ".replace(" ", "") will need at least n = " world ".length() steps. It has to check for every character whether it has to be replaced. And if we take into account that the implementation of String.replace(...) needs to compile a Pattern, build a Matcher and then replace all the matched regions, it seems far more complex than shortening a String.
We also have to consider that " world ".replace(" ", "") does not replace whitespaces but only the String " ". Since String replace(CharSequence target, CharSequence replacement) compiles the target using Pattern.LITERAL we cannot use the character class \s. To be more accurate we would have to compare " world ".trim() to " world ".replaceAll("\\s", ""). It is still not the same because a whitespace in String trim() is defined as c <= ' ' for each c in s.toCharArray().
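The differences in what the three calls actually remove can be seen in a short sketch. TrimVsReplaceDemo and its helper methods are hypothetical names; the inputs are chosen only to illustrate the points above.

```java
// Minimal sketch comparing what trim(), replace(" ", "") and replaceAll("\\s", "") remove.
final class TrimVsReplaceDemo {

    static String viaTrim(String s) {
        return s.trim();                 // strips leading/trailing chars <= ' '
    }

    static String viaReplace(String s) {
        return s.replace(" ", "");       // removes every literal space, inner ones too
    }

    static String viaReplaceAll(String s) {
        return s.replaceAll("\\s", "");  // removes every regex whitespace character
    }

    public static void main(String[] args) {
        System.out.println("[" + viaTrim(" hello world ") + "]");     // [hello world]
        System.out.println("[" + viaReplace(" hello world ") + "]");  // [helloworld]

        // Tab and newline are trimmed, but replace(" ", "") leaves them alone.
        System.out.println("[" + viaTrim("\tworld\n") + "]");         // [world]
        System.out.println(viaReplace("\tworld\n").length());         // 7

        // A control character below the space is trimmed, yet \s does not match it.
        System.out.println(viaTrim("\u0001x").length());              // 1
        System.out.println(viaReplaceAll("\u0001x").length());        // 2
    }
}
```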
Summarizing: String.trim() should be faster, especially for long strings.
The description of how the methods work is based on the implementation of String in Java 8. But implementations can change.
But the question should be: What do you intend to do with the string? Do you want to trim it or to replace some characters? Use the method that matches your intent.
I am trying to print the letters of the alphabet in caps. So I wrote this in a for loop:
System.out.print(Character.toChars(i));
//where i starts at 65 and ends at 90
This works fine and prints the letters, but I wanted to put a space between the letters to make it look nicer. So I did this:
System.out.print(Character.toChars(i) + " ");
Why does it print the memory address of the characters instead of the letter?
The solution I came up with was to explicitly convert the char to a new String object:
String character = new String(Character.toChars(i));
System.out.print (character + " ");
but I'm not quite sure why I can't just write "Character.toChars(i)".
In the first one, does the method (Character.toChars()) point to the address of the character, and is System.out.print smart enough to print the value at that address, i.e. the corresponding letter?
System.out.print(Character.toChars(i)) calls PrintStream.print(char[]), an overload that handles char[] specially.
Character.toChars(i) + " " is effectively equivalent to Character.toChars(i).toString() + " "; calling toString() on an array yields the type name followed by @ and the hash code in hexadecimal, which looks like an address (this behaviour is directly inherited from Object).
A simpler solution for your particular case may be this:
System.out.println((char)i + " ");
The Character.toChars method returns char[], which will be represented as [C@<hex hashcode> in String form.
You don't need to use the toChars method (or do any casting at all):
for (char c = 'A'; c <= 'Z'; c++) {
System.out.print(c + " ");
}
You use string concatenation with one side being an array of chars and the other a String. According to the Java Language Specification, since a char array is not a primitive but a reference value (i.e. an object), its toString() method is called. As arrays do not override toString(), they inherit the implementation from java.lang.Object, which prints the type and hash code rather than the contents.
On the other hand, System.out.print(Character.toChars(i)) calls a specific implementation of print for character arrays, see the documentation of PrintStream.
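Both behaviours can be reproduced side by side in a short sketch. AlphabetDemo and its methods are hypothetical names; StringBuilder.append(char[]) is used as the array-aware counterpart of PrintStream.print(char[]).

```java
// Minimal sketch showing the array-aware path and the toString() path.
final class AlphabetDemo {

    // Builds "A B C ... Z " using the overload that understands char[].
    static String spacedAlphabet() {
        StringBuilder sb = new StringBuilder();
        for (int i = 65; i <= 90; i++) {
            sb.append(Character.toChars(i)).append(' ');
        }
        return sb.toString();
    }

    // Concatenating the array itself falls back to Object#toString().
    static String arrayConcat(int codePoint) {
        char[] chars = Character.toChars(codePoint);
        return chars + " ";
    }

    public static void main(String[] args) {
        System.out.println(spacedAlphabet());
        System.out.println(arrayConcat(65).startsWith("[C@"));  // true
    }
}
```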