Reversing a sentence recursively in Java

I want to reverse a sentence recursively, and my code is below. What other base cases should I take care of? And if the string is null, how should that base case be handled?
public String reverse(String s) {
int n = s.indexOf(' ');
if(n == -1)
return s;
return reverse(s.substring(n+1))+" "+s.substring(0,n);
}

The reverse of null is null, so that's easy:
if(s == null) return null;
Because your method can now return null, I would also null-check the result of the recursive call before appending to it in your return statement. Something like:
String reversed = reverse(s.substring(n+1));
if(reversed != null) return reversed + " " + s.substring(0,n);
else return s;
Everything else looks fine. You shouldn't need any other base cases. Of course, this will reverse the sentence exactly as-is, including punctuation and case information. If you want to fix those up as well, more involved processing will be required.
To ensure appropriate upper- and lower-case structure, I'd probably do something like this in your normal base case:
if(n == -1) {
if(s.isEmpty()) return s; // nothing to capitalize
s = s.toLowerCase();
// Upper-case the first character and keep the rest lower-case.
s = s.substring(0, 1).toUpperCase() + s.substring(1);
return s;
}
Punctuation gets a little more complicated, especially if you have more than just an ending period, exclamation point, or question mark.

In your case, if the string is null, you can return an empty string (""). Returning null requires every caller to handle the null, and if you miss a case you may get a NullPointerException.
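Putting the suggestions above together, a minimal sketch might look like the following. The class name ReverseSentence is made up for illustration, it assumes words are separated by single spaces, and returning null for null input is only one of the two options discussed (the other is to return an empty string).

```java
class ReverseSentence {
    // Reverses the word order of a sentence whose words are separated by single spaces.
    static String reverse(String s) {
        if (s == null) {
            return null; // assumption: null in, null out; return "" instead to spare callers a null check
        }
        int n = s.indexOf(' ');
        if (n == -1) {
            return s; // zero or one word left: nothing to reverse
        }
        // The recursive call always receives a non-null substring, so it never returns null.
        return reverse(s.substring(n + 1)) + " " + s.substring(0, n);
    }

    public static void main(String[] args) {
        System.out.println(reverse("the quick brown fox")); // fox brown quick the
        System.out.println(reverse(null));                  // null
    }
}
```

Because the null check sits at the top and substring never returns null, no extra null check is needed around the recursive call in this version.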

Related

StringIndexOutOfBounds when removing adjacent duplicate letters

Here is my code:
public static String removeAdjDuplicates(String s) {
if(s == "" || s == null || s.isEmpty())
return s;
if(s.length() < 2)
return s;
if(s.charAt(0) != s.charAt(1))
s = s.charAt(0) + removeAdjDuplicates(s.substring(1));
if(s.charAt(0) == s.charAt(1)) //line 37
return removeAdjDuplicates(s.substring(2));
return s;
}
With the input string "ull", I get the following error:
Exception in thread "main" java.lang.StringIndexOutOfBoundsException: String index out of range: 1
at java.lang.String.charAt(String.java:658)
at GFG.removeAdjDuplicates(File.java:37)
at GFG.main(File.java:16)
I read and tried answers given to similar questions, but I'm not sure what is wrong.
Judging from the exception that you get, removeAdjDuplicates returns an empty string, invalidating all indexes past zero.
Although your code performs length checking at the top, it also performs this assignment when the two initial characters are different:
s = s.charAt(0) + removeAdjDuplicates(s.substring(1));
This means that s can become a one-character string if removeAdjDuplicates returns an empty string.
When you pass the string "ull" to the method, the first two characters differ ('u' and 'l'), so this code runs:
if(s.charAt(0) != s.charAt(1))
s = s.charAt(0) + removeAdjDuplicates(s.substring(1));
The recursive call on "ll" returns an empty string, so s becomes "u". Because this branch does not return like the other conditions in the method, execution continues to the next condition at line 37.
That condition compares the first and second characters, but s now has only one character, so charAt(1) throws the exception. The solution is to return s inside the first branch, like this:
if(s.charAt(0) != s.charAt(1)){
s = s.charAt(0) + removeAdjDuplicates(s.substring(1));
return s;
}
I think the source of the error is sufficiently explained by @dasblinkenlight's answer.
Although not clearly stated in the question, it looks like you're trying to remove adjacent duplicate letters recursively (one of your comments mentions that you would expect output s for input geegs).
Here's an alternative way to do it:
while(!s.equals(s = s.replaceAll("(.)\\1", "")));
It uses a regular expression to match and remove duplicate characters, and the while loop keeps executing this until the string is no longer being modified by the operation.
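For readability, the same idea can be written as a method with an explicit loop. This is only a sketch: the class name RegexDedup and the null handling are assumptions, and the pattern is precompiled simply to avoid recompiling it on every pass.

```java
import java.util.regex.Pattern;

class RegexDedup {
    // "(.)\\1" matches any character immediately followed by the same character.
    private static final Pattern PAIR = Pattern.compile("(.)\\1");

    static String removeAdjDuplicates(String s) {
        if (s == null) {
            return null; // assumption: pass null through unchanged
        }
        String previous;
        do {
            previous = s;
            s = PAIR.matcher(s).replaceAll(""); // remove every adjacent pair in one pass
        } while (!s.equals(previous));          // stop once a pass changes nothing
        return s;
    }

    public static void main(String[] args) {
        System.out.println(removeAdjDuplicates("geegs")); // s
        System.out.println(removeAdjDuplicates("ull"));   // u
    }
}
```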
You should simplify your code:
public static String removeAdjDuplicates(String s) {
if (s == null || s.length() < 2)
return s;
if (s.charAt(0) != s.charAt(1))
return s.charAt(0) + removeAdjDuplicates(s.substring(1));
return removeAdjDuplicates(s.substring(2));
}
Changes
The first two if statements do the same thing (return s;) and can be combined into one. Some of the conditions are redundant and can be eliminated.
The third if statement should immediately return instead of continuing into the fourth if statement (or you can instead change the fourth if statement into an else), because removeAdjDuplicates can return an empty String, making s a length-one String when the fourth if is expecting at least a length-two String.
The fourth if can be eliminated because if (s.charAt(0) != s.charAt(1)) failed in the third if, then the only alternative is that (s.charAt(0) == s.charAt(1)), so the check for that isn't necessary.
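One thing to watch: the simplified method makes a single pass, so for geegs it returns ggs (removing ee leaves the two g characters next to each other). If the expected output is s, one possible variation is to compare the first character with the already-processed remainder. This is a sketch under that assumption, not the answerer's code, and the class and method names are hypothetical.

```java
class RecursiveDedup {
    static String removeAll(String s) {
        if (s == null || s.length() < 2) {
            return s;
        }
        if (s.charAt(0) == s.charAt(1)) {
            return removeAll(s.substring(2)); // drop the adjacent pair
        }
        String rest = removeAll(s.substring(1));
        // Once the tail has collapsed, the first character may sit next to an equal one.
        if (!rest.isEmpty() && rest.charAt(0) == s.charAt(0)) {
            return rest.substring(1); // drop the newly formed pair
        }
        return s.charAt(0) + rest;
    }

    public static void main(String[] args) {
        System.out.println(removeAll("geegs")); // s
        System.out.println(removeAll("ull"));   // u
    }
}
```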

How to find duplicates inside a string?

I want to find out if a string that is comma separated contains only the same values:
test,asd,123,test
test,test,test
Here the 2nd string contains only the word "test". I'd like to identify these strings.
As I want to iterate over 100GB, performance matters a lot.
Which might be the fastest way of determining a boolean result if the string contains only one value repeatedly?
public static boolean stringHasOneValue(String string) {
String value = null;
for (String split : string.split(",")) {
if (value == null) {
value = split;
} else {
if (!value.equals(split)) return false;
}
}
return true;
}
No need to split the string at all, in fact no need for any string manipulation.
Find the first word (indexOf comma).
Check that the remaining string length is an exact multiple of the word length plus the separating comma, i.e. (length - foundLength) % (foundLength + 1) == 0.
Loop through the remainder of the string checking the found word against each portion of the string. Just keep two indexes into the same string and move them both through it. Make sure you check the commas too (i.e. bob,bob,bob matches bob,bobabob does not).
As assylias pointed out there is no need to reset the pointers, just let them run through the String and compare the 1st with 2nd, 2nd with 3rd, etc.
Example loop, you will need to tweak the exact position of startPos to point to the first character after the first comma:
for (int i=startPos;i<str.length();i++) {
if (str.charAt(i) != str.charAt(i-startPos)) {
return false;
}
}
return true;
You won't be able to do it much faster than this given the format the incoming data is arriving in but you can do it with a single linear scan. The length check will eliminate a lot of mismatched cases immediately so is a simple optimization.
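The steps in this answer can be sketched as a single method. The class and method names are invented for illustration, and treating a string without any comma as "one value" is an assumption the question does not settle.

```java
class SameValueScan {
    static boolean hasOneValue(String str) {
        int wordLen = str.indexOf(',');
        if (wordLen == -1) {
            return true; // assumption: a single value counts as "only one value"
        }
        int startPos = wordLen + 1; // first character after the first comma
        // N equal words joined by commas have length N * (wordLen + 1) - 1.
        if ((str.length() + 1) % startPos != 0) {
            return false;
        }
        // Compare each character with the one a full "word," period earlier.
        // This checks the commas as well as the letters.
        for (int i = startPos; i < str.length(); i++) {
            if (str.charAt(i) != str.charAt(i - startPos)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(hasOneValue("test,test,test"));    // true
        System.out.println(hasOneValue("test,asd,123,test")); // false
        System.out.println(hasOneValue("bob,bobabob"));       // false
    }
}
```

No substrings or arrays are allocated, so the per-line cost is one pass over the characters.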
Calling split might be expensive, especially on 100 GB of data.
Consider something like below (NOT tested and might require a bit of tweaking the index values, but I think you will get the idea) -
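public static boolean stringHasOneValue(String string) {
String separator = ",";
int firstSeparator = string.indexOf(separator); // index of the first separator, i.e. the comma
if (firstSeparator == -1) return true; // no comma: there is only one value
String firstValue = string.substring(0, firstSeparator); // first value of the comma-separated string
int lengthOfIncrement = firstValue.length() + 1; // the value plus one to account for the comma
if ((string.length() + 1) % lengthOfIncrement != 0) return false; // values cannot all have the same length
for (int i = lengthOfIncrement; i < string.length(); i += lengthOfIncrement) {
if (string.charAt(i - 1) != ',') return false; // each value must be preceded by a comma
String currentValue = string.substring(i, i + firstValue.length());
if (!firstValue.equals(currentValue)) {
return false;
}
}
return true;
}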
Complexity is O(n), assuming Java's substring implementation is efficient. If not, you can write your own substring method that takes the required number of characters from the String.
Just for fun, a one-liner:
(@Tim's answer is more efficient)
System.out.println((new HashSet<String>(Arrays.asList("test,test,test".split(","))).size()==1));

Efficient Text Processing Java

I have created an application to process log files, but I am hitting a bottleneck when the number of files reaches about 20.
The issue comes from a particular method that takes roughly a second on average to complete, and as you can imagine this isn't practical when it needs to be done more than 50 times.
private String getIdFromLine(String line){
String[] values = line.split("\t");
String newLine = substringBetween(values[4], "Some String : ", "Value=");
String[] split = newLine.split(" ");
return split[1].substring(4, split[1].length());
}
private String substringBetween(String str, String open, String close) {
if (str == null || open == null || close == null) {
return null;
}
int start = str.indexOf(open);
if (start != -1) {
int end = str.indexOf(close, start + open.length());
if (end != -1) {
return str.substring(start + open.length(), end);
}
}
return null;
}
A line comes from the reading of a file which is very efficient so I don't feel a need to post that code unless someone asks.
Is there any way to improve the performance of this at all?
Thanks for your time
A few things are likely problematic:
Whether or not you realized it, you are using regular expressions. The argument to String.split() is treated as a regex. Using String.indexOf() will almost certainly be a faster way to find the particular portion of the String that you want. As HRgiger points out, Guava's splitter is a good choice because it does just that.
You're allocating a bunch of stuff you don't need. Depending on how long your lines are, you could be creating a ton of extra Strings and String[]s that you don't need (and then garbage collecting them). Another reason to avoid String.split().
I also recommend using String.startsWith() and String.endsWith() rather than all of this stuff that you're doing with indexOf(), if only because it would be easier to read.
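As a rough illustration of the indexOf() suggestion, the helper below pulls out a single tab-separated field without calling split(). The class and method names are hypothetical, and it assumes fields are numbered from zero, so index 4 corresponds to values[4] in the question.

```java
class TabField {
    // Returns the field at the given zero-based index, or null if the line has too few fields.
    static String field(String line, char sep, int index) {
        int start = 0;
        for (int i = 0; i < index; i++) {
            int next = line.indexOf(sep, start);
            if (next == -1) {
                return null; // fewer separators than requested
            }
            start = next + 1; // move just past the separator
        }
        int end = line.indexOf(sep, start);
        // Only the requested field is allocated; the other fields are never copied.
        return end == -1 ? line.substring(start) : line.substring(start, end);
    }

    public static void main(String[] args) {
        String line = "a\tb\tc\td\te\tf";
        System.out.println(field(line, '\t', 4)); // e
    }
}
```

Whether this helps in practice depends on where the time actually goes, so it is worth measuring before and after.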
I would try to use regular expressions.
One of the main problems in this code is the "split" method.
For example this one:
private String getIdFromLine3(String line) {
int t_index = -1;
for (int i = 0; i < 3; i++) {
t_index = line.indexOf("\t", t_index+1);
if (t_index == -1) return null;
}
//String[] values = line.split("\t");
String newLine = substringBetween(line.substring(t_index + 1), "Some String : ", "Value=");
// String[] split = newLine.split(" ");
int p_index = newLine.indexOf(" ");
if (p_index == -1) return null;
int p_index2 = newLine.indexOf(" ", p_index+1);
if (p_index2 == -1) return null;
String split = newLine.substring(p_index+1, p_index2);
// return split[1].substring(4, split[1].length());
return split.substring(4, split.length());
}
UPD: It could be 3 times faster.
I would recommend using VisualVM to find the bottleneck before optimising.
If you need performance in your application, you will need profiling anyway.
As an optimisation, I would write a custom loop to replace your substringBetween method and get rid of the multiple indexOf calls.
Google Guava's Splitter is pretty fast as well.
Could you try the regex anyway and post results please just for comparison:
Pattern p = Pattern.compile("(Some String : )(.*?)(Value=)"); // remove the first and last groups if not needed (and adjust m.group(x) to match)
@Test
public void test2(){
String str = "Long java line with Some String : and some object with Value=154345 ";
System.out.println(substringBetween(str));
}
private String substringBetween(String str) {
Matcher m = p.matcher(str);
if(m.find()){
return m.group(2);
}else{
return null;
}
}
If this is faster, find a regex that combines both functions.

StringBuffer Append Space (" ") Appends "null" Instead

Basically what I'm trying to do is take a String, and replace each letter in the alphabet inside, but preserving any spaces and not converting them to a "null" string, which is the main reason I am opening this question.
If I use the function below and pass the string "a b", instead of getting "ALPHA BETA" I get "ALPHAnullBETA".
I've tried all possible ways of checking if the individual char that is currently iterated through is a space, but nothing seems to work. All these scenarios give false as if it's a regular character.
public String charConvert(String s) {
Map<String, String> t = new HashMap<String, String>(); // Associative array
t.put("a", "ALPHA");
t.put("b", "BETA");
t.put("c", "GAMA");
// So on...
StringBuffer sb = new StringBuffer(0);
s = s.toLowerCase(); // This is my full string
for (int i = 0; i < s.length(); i++) {
char c = s.charAt(i);
String st = String.valueOf(c);
if (st.compareTo(" ") == 1) {
// This is the problematic condition
// The script should just append a space in this case, but nothing seems to invoke this scenario
} else {
sb.append(st);
}
}
s = sb.toString();
return s;
}
compareTo() will return 0 if the strings are equal. It returns a positive number if the first string is "greater than" the second.
But really there's no need to be comparing Strings. You can do something like this instead:
char c = s.charAt(i);
if(c == ' ') {
// do something
} else {
sb.append(c);
}
Or even better for your use case:
String st = s.substring(i,i+1);
if(t.containsKey(st)) {
sb.append(t.get(st));
} else {
sb.append(st);
}
To get even cleaner code, your Map should map from Character to String instead of <String,String>.
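A sketch of the Map<Character, String> variant could look like this. The class name and the static lookup table are illustrative choices, only the three letters from the question are filled in, and StringBuilder stands in for StringBuffer since no synchronization is needed here.

```java
import java.util.HashMap;
import java.util.Map;

class CharConvertSketch {
    // Lookup table keyed by Character; extend it with the remaining letters as needed.
    private static final Map<Character, String> NAMES = new HashMap<>();
    static {
        NAMES.put('a', "ALPHA");
        NAMES.put('b', "BETA");
        NAMES.put('c', "GAMA");
    }

    static String charConvert(String s) {
        StringBuilder sb = new StringBuilder();
        String lower = s.toLowerCase();
        for (int i = 0; i < lower.length(); i++) {
            char c = lower.charAt(i);
            String name = NAMES.get(c); // null for spaces and any unmapped character
            if (name != null) {
                sb.append(name);
            } else {
                sb.append(c); // keep spaces and other characters unchanged
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(charConvert("a b")); // ALPHA BETA
    }
}
```

With this shape there is no special case for spaces at all: anything not in the table is copied through.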
String.compareTo() returns 0 if the strings are equal, not 1; see the String.compareTo documentation.
Note that for this case you don't need to convert the char to a string, you could do
if(c == ' ')
Use
Character.isWhitespace(c)
That solves the issue, and it is best practice.
First of all, what is s in this example? It's hard to follow the code. Also, your compareTo check seems off:
if (st.compareTo(" ") == 1)
Should be
if (st.compareTo(" ") == 0)
since 0 means "equal" (read up on compareTo)
From the compareTo documentation: The result is a negative integer if this String object lexicographically precedes the argument string. The result is a positive integer if this String object lexicographically follows the argument string. The result is zero if the strings are equal;
You have the wrong condition in if (st.compareTo(" ") == 1) {
The compareTo method of a String returns a negative number if the source string precedes the test string, 0 for equality, and a positive number if the source string follows. The non-zero values are not guaranteed to be -1 or 1. Your code checks for 1, and it should be checking for 0.
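A quick way to see why the == 1 check almost never fires is to print a few results. In this small demonstration the class name and the isSpace helper are made up; the values follow from String.compareTo returning the difference between the first pair of differing characters.

```java
class CompareToDemo {
    // Equality is signalled by 0, never by 1.
    static boolean isSpace(String st) {
        return st.compareTo(" ") == 0;
    }

    public static void main(String[] args) {
        System.out.println(" ".compareTo(" "));  // 0   (equal)
        System.out.println("a".compareTo(" "));  // 65  ('a' is 97, ' ' is 32)
        System.out.println("\t".compareTo(" ")); // -23 (tab is 9)
        System.out.println(isSpace(" "));        // true
    }
}
```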

In Java, how to find if first character in a string is upper case without regex

In Java, find if the first character in a string is upper case without using regular expressions.
Assuming s is non-empty:
Character.isUpperCase(s.charAt(0))
or, as mentioned by divec, to make it work for characters with code points above U+FFFF:
Character.isUpperCase(s.codePointAt(0));
Actually, this is subtler than it looks.
The charAt version above gives the incorrect answer for an upper case character whose code point is above U+FFFF (such as U+1D4A9, MATHEMATICAL SCRIPT CAPITAL N). String.charAt returns only one half of the UTF-16 surrogate pair, which is not a character by itself, so Character.isUpperCase reports false. So you have to use String.codePointAt, which returns an int above 0xFFFF (not a char). You would do:
Character.isUpperCase(s.codePointAt(0));
Don't feel bad if you overlooked this; almost all Java coders handle UTF-16 badly, because the terminology misleadingly makes you think that each "char" value represents a character. UTF-16 sucks, because it is almost fixed width but not quite. So non-fixed-width edge cases tend not to get tested. Until one day, some document comes in which contains a character like U+1D4C3, and your entire system blows up.
There are many ways to do that, but the simplest seems to be the following:
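The difference is easy to reproduce. The sketch below builds a string that starts with U+1D4A9 (MATHEMATICAL SCRIPT CAPITAL N, an upper case letter outside the Basic Multilingual Plane) and checks it both ways; the class and method names are only for illustration.

```java
class CodePointCase {
    static boolean startsUpperByChar(String s) {
        return Character.isUpperCase(s.charAt(0)); // sees only the first UTF-16 code unit
    }

    static boolean startsUpperByCodePoint(String s) {
        return Character.isUpperCase(s.codePointAt(0)); // sees the whole code point
    }

    public static void main(String[] args) {
        // Character.toChars turns the code point into its two-char surrogate pair.
        String s = new String(Character.toChars(0x1D4A9)) + "ame";
        System.out.println(startsUpperByChar(s));      // false: a lone surrogate is not a letter
        System.out.println(startsUpperByCodePoint(s)); // true
    }
}
```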
boolean isUpperCase = Character.isUpperCase("My String".charAt(0));
Don't forget to check whether the string is empty or null. Without those checks, you get a NullPointerException for a null String or a StringIndexOutOfBoundsException for an empty one.
public class StartWithUpperCase{
public static void main(String[] args){
String str1 = ""; //StringIndexOutOfBoundsException if
//the empty check is not handled
String str2 = null; //NullPointerException if
//the null check is not handled
String str3 = "Starts with upper case";
String str4 = "starts with lower case";
System.out.println(startWithUpperCase(str1)); //false
System.out.println(startWithUpperCase(str2)); //false
System.out.println(startWithUpperCase(str3)); //true
System.out.println(startWithUpperCase(str4)); //false
}
public static boolean startWithUpperCase(String givenString){
if(null == givenString || givenString.isEmpty() ) return false;
else return (Character.isUpperCase( givenString.codePointAt(0) ) );
}
}
Make sure you first check for null and empty, and then compare the first character with the first character of the string converted to upper case. Use System.out.println if you want to see the output; otherwise return a boolean like Rabiz did.
public static void main(String[] args)
{
System.out.println("Enter name");
Scanner kb = new Scanner (System.in);
String text = kb.next();
if ( null == text || text.isEmpty())
{
System.out.println("Text empty");
}
else if (text.charAt(0) == (text.toUpperCase().charAt(0)))
{
System.out.println("First letter in word "+ text + " is upper case");
}
}
If you have to check it manually, you can do int a = s.charAt(0)
If the value of a is between 65 and 90 (the ASCII codes for 'A' to 'Z'), it is upper case. Note that this only covers unaccented English letters.
We can find an upper case first letter by using a regular expression as well:
private static void findUppercaseFirstLetterInString(String content) {
Matcher m = Pattern
.compile("([a-z])([a-z]*)", Pattern.CASE_INSENSITIVE).matcher(
content);
System.out.println("Given input string : " + content);
while (m.find()) {
if (m.group(1).equals(m.group(1).toUpperCase())) {
System.out.println("First Letter Upper case match found :"
+ m.group());
}
}
}
For a detailed example, please visit http://www.onlinecodegeek.com/2015/09/how-to-determines-if-string-starts-with.html
String yourString = "yadayada";
if (Character.isUpperCase(yourString.charAt(0))) {
// print something
} else {
// print something else
}
