How to get real end index of string found in another string - java

I am trying to get a range of chars found in another string using Java:
String input = "test test2 Test3";
String substring = "test2";
int diffStart = StringUtils.indexOf(input, substring);
int diffEnd = StringUtils.lastIndexOf(input, substring);
I want to get
diffStart = 5
diffEnd = 10
But I am getting
diffStart = 5
diffEnd = 5
According to the Apache Commons documentation for lastIndexOf, it should work:
public static int lastIndexOf(CharSequence seq,
CharSequence searchSeq)
Finds the last index within a CharSequence, handling null. This method
uses String.lastIndexOf(String) if possible.
StringUtils.lastIndexOf("aabaabaa", "ab") = 4
What am I doing wrong?

You probably want:
int diffStart = StringUtils.indexOf(input, substring);
int diffEnd = diffStart + substring.length();
lastIndexOf does not return the end of a match; it returns the start of the last occurrence of the search string.
E.g. for "ab1 ab2 ab3 ab4":
lastIndexOf("ab") finds the 4th ab (position 12)
indexOf("ab") finds the 1st ab (position 0)
Both methods always return the index of the first character of the match.
If the substring occurs only once, lastIndexOf and indexOf return the same index.
(To make your example more robust, also check for -1 in case the substring is not present at all.)
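A minimal sketch of the start/end computation including the -1 check mentioned above. It uses plain String.indexOf instead of StringUtils so it runs without Commons Lang; the class and method names (SubstringRange, rangeOf) are made up for illustration.

```java
import java.util.Arrays;

class SubstringRange {
    // Returns {start, endExclusive} of the first occurrence of needle in input,
    // or null when needle does not occur. Hypothetical helper, not a library API.
    static int[] rangeOf(String input, String needle) {
        int start = input.indexOf(needle);
        if (start == -1) {
            return null; // substring not present at all
        }
        return new int[] { start, start + needle.length() };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(rangeOf("test test2 Test3", "test2"))); // [5, 10]
        System.out.println(Arrays.toString(rangeOf("test test2 Test3", "nope")));  // null
    }
}
```

The end index here is exclusive, so it can be passed straight to input.substring(start, end).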

Related

Java - second last occurrence of char in string

Suppose I've the string
String path = "the/quick/brown/fox/jumped/over/the/lazy/dog/";
I would like the following output
String output = "the/quick/brown/fox/jumped/over/the/lazy/";
I was thinking the following would do
output = path.substring(0, path.lastIndexOf("/", 1));
given how the doc says
Returns the index of the first (last) occurrence of the specified character, searching forward (backward) from the specified index.
but that doesn't seem to work.
Any help would be appreciated.
It seems like every single answer is assuming that you already know the input string and the exact position of the last occurrence of "/" in it, when that is usually not the case...
Here's a more general method to obtain the nth-last (second-last, third-last, etc.) occurrence of a character inside a string:
static int nthLastIndexOf(int nth, String ch, String string) {
    if (nth <= 0) return string.length();
    return nthLastIndexOf(--nth, ch, string.substring(0, string.lastIndexOf(ch)));
}
Usage:
String s = "the/quick/brown/fox/jumped/over/the/lazy/dog/";
System.out.println(s.substring(0, nthLastIndexOf(2, "/", s)+1)); // substring up to 2nd last included
System.out.println(s.substring(0, nthLastIndexOf(3, "/", s)+1)); // up to 3rd last inc.
System.out.println(s.substring(0, nthLastIndexOf(7, "/", s)+1)); // 7th last inc.
System.out.println(s.substring(0, nthLastIndexOf(2, "/", s))); // 2nd last, char itself excluded
Output:
the/quick/brown/fox/jumped/over/the/lazy/
the/quick/brown/fox/jumped/over/the/
the/quick/brown/
the/quick/brown/fox/jumped/over/the/lazy
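One caveat with the recursive helper: when the string has fewer than nth occurrences, lastIndexOf returns -1 and substring(0, -1) throws. A possible iterative variant that returns -1 instead might look like the sketch below. The class name NthLast and the parameter order are my own choices, not part of the answer above.

```java
class NthLast {
    // Index of the nth-last occurrence of ch in s (nth = 1 means the last one),
    // or -1 if s contains fewer than nth occurrences (or nth <= 0).
    static int nthLastIndexOf(String s, char ch, int nth) {
        int from = s.length() - 1; // where the backward search starts
        int idx = -1;
        for (int i = 0; i < nth; i++) {
            idx = s.lastIndexOf(ch, from);
            if (idx == -1) {
                return -1; // ran out of occurrences
            }
            from = idx - 1; // continue just before the match
        }
        return idx;
    }

    public static void main(String[] args) {
        String s = "the/quick/brown/fox/jumped/over/the/lazy/dog/";
        System.out.println(s.substring(0, nthLastIndexOf(s, '/', 2) + 1));
        System.out.println(nthLastIndexOf("a/b", '/', 2)); // -1
    }
}
```

It also avoids creating a new substring on every step, since lastIndexOf(ch, fromIndex) searches the original string in place.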
This works, provided the path is longer than 2 characters:
final String path = "the/quick/brown/fox/jumped/over/the/lazy/dog/";
final int secondLast = path.length()-2;
final String output = path.substring(0, path.lastIndexOf("/",secondLast)+1);
System.out.println(output);
The second parameter of lastIndexOf specifies the index at which the backward search starts, i.e. the highest index that can be returned. This means that in your case
path.lastIndexOf("/", 1)
returns the last index of "/" that is less than or equal to 1.
First of all, lastIndexOf will return an index, not a string. It also searches backwards from the specified index 1, so it will only look at everything before and including the character at index 1. This means that it only checks t and h. Expectedly, it finds nothing and returns -1.
You should just omit the second argument if you want to search the whole string.
In addition, to achieve your desired output string (I assume you want the last path component removed?), you can use replaceAll with a regex:
String output = path.replaceAll("[^/]+/$", "");
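To make the fromIndex behaviour concrete, here is a small sketch that prints what the calls discussed above return and compares the lastIndexOf approach with the regex. The class and method names (LastIndexOfDemo, dropLastComponent) are hypothetical, and it assumes the path ends with "/".

```java
class LastIndexOfDemo {
    // Drops the last component of a path that ends with "/".
    static String dropLastComponent(String path) {
        // Start the backward search just before the trailing "/".
        int secondLast = path.lastIndexOf('/', path.length() - 2);
        return path.substring(0, secondLast + 1);
    }

    public static void main(String[] args) {
        String path = "the/quick/brown/fox/jumped/over/the/lazy/dog/";
        System.out.println(path.lastIndexOf("/", 1)); // -1: only "t" and "h" are examined
        System.out.println(path.lastIndexOf("/"));    // 44: the whole string is searched
        System.out.println(dropLastComponent(path));
        System.out.println(path.replaceAll("[^/]+/$", ""));
    }
}
```

Both of the last two lines print the/quick/brown/fox/jumped/over/the/lazy/ for this input.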
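Using Apache Commons IO (getPath returns the input unchanged when it already ends with a separator, so strip the trailing "/" first):
String output = org.apache.commons.io.FilenameUtils.getPath(path.substring(0, path.length() - 1));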
Not using Apache
public static String removeLastPart(String str) {
    int pos = str.length() - 1;
    while (str.charAt(pos) != '/' || pos + 1 == str.length()) {
        pos--;
    }
    return str.substring(0, pos + 1);
}
If you are dealing with paths and files, why not use the built-in classes? Something like the code below seems easier to me than string manipulation:
Path path = Paths.get("the/quick/brown/fox/jumped/over/the/lazy/dog/");
System.out.println(path.getParent());
// prints (on Windows): the\quick\brown\fox\jumped\over\the\lazy
System.out.println(path.getParent().getParent());
// prints (on Windows): the\quick\brown\fox\jumped\over\the
For example,
String key = "aaa/bbb/ccc/ddd" ;
and I need "ccc/ddd" as the result, i.e. the substring that starts right after the second-last "/". The following code does that:
String key="aaa/bbb/ccc/ddd";
key=key.substring(key.substring(0, key.lastIndexOf("/")).lastIndexOf("/")+1);
The final value of key will be "ccc/ddd".
Here's a use case:
String url = "http://localhost:4000/app/getPTVars";
int secondLastIndexOf = url.substring(0, url.lastIndexOf('/')).lastIndexOf('/');
System.out.println(secondLastIndexOf);
System.out.println(url.substring(secondLastIndexOf, url.length()));
and the output:
21
/app/getPTVars
Try the -1 approach:
String path = "the/quick/brown/fox/jumped/over/the/lazy/dog/";
int j = path.lastIndexOf("/");
int i = path.lastIndexOf("/", j - 1); // the 2nd-last index, searching back from just before the last one
String output = path.substring(0, i + 1); // inclusive
Or as a single expression:
String output = path.substring(0, path.lastIndexOf("/", path.lastIndexOf("/") - 1) + 1);

StringIndexOutOfBoundsException when trying to get string from long string

I am trying to extract a substring from a long string, which is a Firebase URL:
"https://firebasestorage.googleapis.com/v0/b/No-manworld-3577.appspot.com/o/Contacts%2F1510361061636_Julien_Vcf?alt=media&token=c0bff20d-d115-4fef-b58c-4c7ffaef4296"
Notice that there is an underscore before and after the name Julien in the string above. I am trying to get that name, but I am getting
java.lang.StringIndexOutOfBoundsException: String index out of range: -1
Here is my piece of code
String s="https://firebasestorage.googleapis.com/v0/b/No-manworld-3577.appspot.com/o/Contacts%2F1510361061636_Julien_Vcf?alt=media&token=c0bff20d-d115-4fef-b58c-4c7ffaef4296";
String newName=s.substring(s.indexOf("_")+1, s.indexOf("_"));
System.out.println(newName);
As said in my comment, when using substring, the first index must not be greater than the second one.
In your case, you are calling substring with x + 1 and x, where x is s.indexOf("_"). Since x + 1 > x, substring fails.
I understand that you are trying to get the index of the second occurrence of _.
Here is code that would yield Julien in your case:
String s = "...";
int start = s.indexOf("_") + 1;
int end = s.indexOf("_", start);
// name will hold the content of s between the first two `_`s, assuming they exist.
String name = s.substring(start, end);
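The comment in the snippet above says "assuming they exist". If that cannot be guaranteed, a guarded version could look roughly like this sketch. BetweenDelimiters and firstBetween are names I made up, and returning an Optional is just one possible design.

```java
import java.util.Optional;

class BetweenDelimiters {
    // Text between the first two occurrences of delim, or empty if there are fewer than two.
    static Optional<String> firstBetween(String s, char delim) {
        int first = s.indexOf(delim);
        if (first == -1) {
            return Optional.empty();
        }
        int second = s.indexOf(delim, first + 1); // search after the first match
        if (second == -1) {
            return Optional.empty();
        }
        return Optional.of(s.substring(first + 1, second));
    }

    public static void main(String[] args) {
        String s = "https://firebasestorage.googleapis.com/v0/b/No-manworld-3577.appspot.com/o/Contacts%2F1510361061636_Julien_Vcf?alt=media&token=c0bff20d-d115-4fef-b58c-4c7ffaef4296";
        System.out.println(firstBetween(s, '_').orElse("(not found)")); // Julien
    }
}
```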
If the requirements do not make clear which two _ to select, here is a Java 8 Stream way of doing it that only extracts the name when there are exactly two:
public class Check {
    public static void main(String[] args) {
        String s = "https://firebasestorage.googleapis.com/v0/b/No-manworld-3577.appspot.com/o/Contacts%2F1510361061636_Julien_Vcf?alt=media&token=c0bff20d-d115-4fef-b58c-4c7ffaef4296";
        long count = s.chars().filter(ch -> ch == '_').count();
        if (count == 2) {
            System.out.println(s.substring(s.indexOf('_') + 1, s.lastIndexOf('_')));
        } else {
            System.out.println("Expected exactly 2 underscores, found " + count);
        }
    }
}
Why didn't your code work?
Let's assume s.indexOf("_") returns some positive number, say 10. Then the call below translates to:
String newName=s.substring(s.indexOf("_")+1, s.indexOf("_"));
String newName=s.substring(11, 10);
This throws StringIndexOutOfBoundsException because endIndex < beginIndex in the substring call.

Replacing a character in a string from another string with the same char index

I'm trying to search and reveal unknown characters in a string. Both strings are of length 12.
Example:
String s1 = "1x11222xx333";
String s2 = "111122223333";
The program should check for all unknowns in s1 represented by x|X and get the relevant chars in s2 and replace the x|X by the relevant char.
So far my code replaces only the first x|X with the correct char from s2; for the remaining unknowns it prints duplicates of the char used for the first x|X.
Here is my code:
String VoucherNumber = "1111x22xx333";
String VoucherRecord = "111122223333";
String testVoucher = null;
char x = 'x'|'X';
System.out.println(VoucherNumber); // including unknowns
//find x|X in the string VoucherNumber
for(int i = 0; i < VoucherNumber.length(); i++){
if (VoucherNumber.charAt(i) == x){
testVoucher = VoucherNumber.replace(VoucherNumber.charAt(i), VoucherRecord.charAt(i));
}
}
System.out.println(testVoucher); //after replacing unknowns
}
}
I am always a fan of using StringBuilders, so here's a solution using that:
private static String replaceUnknownChars(String strWithUnknownChars, String fullStr) {
    StringBuilder sb = new StringBuilder(strWithUnknownChars);
    int index;
    // Math.max yields -1 only when neither "x" nor "X" is left
    while ((index = Math.max(sb.indexOf("x"), sb.indexOf("X"))) != -1) {
        sb.setCharAt(index, fullStr.charAt(index));
    }
    return sb.toString();
}
It's quite straightforward. You create a new StringBuilder. While an x or X can still be found in it (that is, the larger of the two indexOf results is not -1), you take that index and overwrite the character with setCharAt. This assumes fullStr contains no x or X at those positions; otherwise the loop never terminates.
You are using String.replace(char, char) the wrong way. The doc says:
Returns a new string resulting from replacing all occurrences of oldChar in this string with newChar.
So if the character occurs more than once, every occurrence is replaced with the same value.
You need to change only the character at a specific position. The easiest way is to work on the char array returned by String.toCharArray; on that array you can keep the same loop logic.
Of course, you can also use String.indexOf to find the index of a specific character.
Note: char c = 'x'|'X'; will not give you the expected result. It performs a bitwise OR, which yields a value that is not the one you want.
The OR sets a bit to 1 if that bit is 1 in either operand:
0111 1000 (x)
0101 1000 (X)
OR
0111 1000 (x)
So c is simply 'x', and an uppercase 'X' never matches. (Bitwise operators on char operands also produce an int; look up binary numeric promotion for details.)
You have two solutions here: either use two variables (or an array), or, if you can, call String.toLowerCase and compare only against char c = 'x'.
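The char-array approach described above could be sketched like this. VoucherFill and fillUnknowns are hypothetical names, and the length check is my own assumption based on the question saying both strings have the same length.

```java
class VoucherFill {
    // Replaces every 'x' or 'X' in masked with the character at the same index in full.
    static String fillUnknowns(String masked, String full) {
        if (masked.length() != full.length()) {
            throw new IllegalArgumentException("both strings must have the same length");
        }
        char[] chars = masked.toCharArray();
        for (int i = 0; i < chars.length; i++) {
            // Two explicit comparisons instead of the bitwise 'x'|'X'
            if (chars[i] == 'x' || chars[i] == 'X') {
                chars[i] = full.charAt(i);
            }
        }
        return new String(chars);
    }

    public static void main(String[] args) {
        System.out.println(fillUnknowns("1x11222xx333", "111122223333")); // 111122223333
    }
}
```

Each position is visited once and changed independently, so a replacement at one index cannot affect any other index.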

Get last character of korean word in java

I am trying to get the last character of a Korean word (a String) but it's not working as planned. If I have the string: "사람", I want to get the "ㅁ" but I am getting the "람".
What I already have tried:
word.charAt(word.length() - 1); // gets 람
I have also checked whether "사람" ends with "ㅁ" using word.endsWith("ㅁ"), but it returned false.
It returns true for word.endsWith("람").
This answer uses information from How to convert to Korean initials and The Korean Writing System. As the latter describes, a Hangul syllable is divided into (up to) three parts: initial, vowel, and tail consonant (if present). The tail may consist of 2 consonants, like ㅆ.
The Unicode encoding was, IMHO, quite brilliantly designed, so that a Hangul syllable can be composed and decomposed with a formula, given by The Korean Writing System as:
tail = mod ($hangulCodepoint − 44032, 28)
vowel = 1 + mod ($hangulCodepoint − 44032 − tail, 588) / 28
lead = 1 + int [ ($hangulCodepoint − 44032) / 588 ]
Since I need the same thing as you describe, I implemented the following:
private static String getCharacter(final String character) {
    // the following characters are in the correct (i.e. Unicode) order
    final String initials = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"; // list of initials
    final String vowels = "ᅡᅢᅣᅤᅥᅦᅧᅨᅩᅪᅫᅬᅭᅮᅯᅰᅱᅲᅳᅴᅵ"; // list of vowels
    final String finals = "ᆨᆩᆪᆫᆬᆭᆮᆯᆰᆱᆲᆳᆴᆵᆶᆷᆸᆹᆺᆻᆼᆽᆾᆿᇀᇁᇂ"; // list of tail characters
    final int characterValue = character.codePointAt(0); // Unicode value
    final int hangulUnicodeStartValue = 44032; // U+AC00, first precomposed syllable
    final int hangulUnicodeEndValue = 55203; // U+D7A3, last precomposed syllable
    if (characterValue < hangulUnicodeStartValue || characterValue > hangulUnicodeEndValue)
        return character; // not a precomposed syllable, for instance 32 (space)
    final int offset = characterValue - hangulUnicodeStartValue;
    final int tailIndex = offset % 28 - 1; // -1 when there is no tail character
    final int vowelIndex = (offset % 588) / 28;
    final int initialIndex = offset / 588;
    final String leadString = initials.substring(initialIndex, initialIndex + 1);
    final String vowelString = vowels.substring(vowelIndex, vowelIndex + 1);
    final String tailString = tailIndex == -1 ? "" : finals.substring(tailIndex, tailIndex + 1);
    return leadString + vowelString + tailString;
}
Note that ㅎ (from the initials) is not the same character as ᇂ (from the tails); the same holds for every initial versus its tail counterpart.
Note also that, because indexes start at 0 instead of 1 as in the example from The Korean Writing System, we have to subtract 1 from the tail and must not add 1 for the vowel and lead.
To test the above code, you can use, for instance, the following text, which contains syllables made of two, three, and four jamo:
@Test
public void deconstructKoreanCharacters() {
    final String koreanText = "항성은 항상 혼자 있는 것이 아니라, 두 개 이상의";
    for (int i = 0; i < koreanText.length(); i++) {
        final String character = koreanText.substring(i, i + 1);
        final String decomposedCharacters = getCharacter(character);
        System.out.println(character + ":" + decomposedCharacters);
    }
    Statics.doNothing();
}
If you need both characters of a compound tail such as ᆪ (that is, ㄱ and ㅅ), this takes a bit of manual work, since each of the 27 possible tail characters (single-consonant tails included) has to be mapped by hand.
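For the original question (getting "ㅁ" from "사람"), only the tail part of the formula is needed. Below is a minimal sketch under two assumptions of mine: the word ends in a precomposed Hangul syllable (U+AC00 to U+D7A3), and the result should be a compatibility jamo like the "ㅁ" in the question rather than the conjoining ᆷ. HangulTail and lastFinalConsonant are hypothetical names.

```java
class HangulTail {
    // Compatibility jamo for the 27 possible final consonants, in Unicode tail order.
    private static final String FINALS = "ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ";

    // Final consonant of the last syllable of word, or "" when the last character
    // is not a precomposed Hangul syllable or has no final consonant.
    static String lastFinalConsonant(String word) {
        if (word.isEmpty()) {
            return "";
        }
        char last = word.charAt(word.length() - 1);
        if (last < 0xAC00 || last > 0xD7A3) {
            return "";
        }
        int tail = (last - 0xAC00) % 28; // 0 means "no final consonant"
        return tail == 0 ? "" : String.valueOf(FINALS.charAt(tail - 1));
    }

    public static void main(String[] args) {
        System.out.println(lastFinalConsonant("사람")); // ㅁ
    }
}
```

Compound finals such as ㄳ come back as a single jamo here; splitting them into ㄱ and ㅅ would still need a hand-written table.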

Split by first found String in Java

Is it possible to tell the String.split("(") function to split only at the first "(" found?
Example:
String test = "A*B(A+B)+A*(A+B)";
test.split("(") should result in ["A*B" ,"A+B)+A*(A+B)"]
test.split(")") should result in ["A*B(A+B" ,"+A*(A+B)"]
Yes, absolutely:
test.split("\\(", 2);
As the documentation for String.split(String,int) explains:
The limit parameter controls the number of times the
pattern is applied and therefore affects the length of the resulting
array. If the limit n is greater than zero then the pattern
will be applied at most n - 1 times, the array's
length will be no greater than n, and the array's last entry
will contain all input beyond the last matched delimiter.
test.split("\\(",2);
See javadoc for more info
EDIT: Escaped bracket, as per @Pedro's comment below.
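If the delimiter is not known in advance, escaping it by hand is error-prone. One option is Pattern.quote, which turns any string into a literal pattern. A small sketch combining it with the limit argument (SplitFirst and splitOnce are made-up names):

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitFirst {
    // Splits s at the first occurrence of delimiter, treating the delimiter literally.
    static String[] splitOnce(String s, String delimiter) {
        return s.split(Pattern.quote(delimiter), 2);
    }

    public static void main(String[] args) {
        String test = "A*B(A+B)+A*(A+B)";
        System.out.println(Arrays.toString(splitOnce(test, "("))); // [A*B, A+B)+A*(A+B)]
        System.out.println(Arrays.toString(splitOnce(test, ")"))); // [A*B(A+B, +A*(A+B)]
    }
}
```

When the delimiter does not occur at all, split returns a one-element array containing the original string.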
Try this solution; it's generic, and faster and simpler than using a regular expression:
public static String[] splitOnFirst(String str, char c) {
    int idx = str.indexOf(c);
    if (idx == -1) {
        return new String[] { str }; // delimiter not found: nothing to split
    }
    String head = str.substring(0, idx);
    String tail = str.substring(idx + 1);
    return new String[] { head, tail };
}
Test it like this:
String test = "A*B(A+B)+A*(A+B)";
System.out.println(Arrays.toString(splitOnFirst(test, '(')));
System.out.println(Arrays.toString(splitOnFirst(test, ')')));
