String character swap - java

I have a problem swapping characters in a string.
For example, I have the string "sdgk4e5s3gj6ds3h6fggh" and I need code that swaps each digit with the character that follows it.
The result should look like this: "sdgke4s5g3jd6sh3f6ggh"
I have got to the point where I make a char array out of the String, but I don't know what to do next. Any help?

If I understand correctly what you are asking, a simple regex could solve your problem:
String result = "sdgk4e5s3gj6ds3h6fggh".replaceAll("(\\d)(\\D)", "$2$1");
which basically inverts 2 characters every time it finds one digit followed by one non-digit.
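Since the question starts from a char array, here is a minimal sketch of both routes side by side. The class and method names (SwapDemo, swapWithRegex, swapWithLoop) are made up for illustration, and the loop assumes that "number" means an ASCII digit 0-9 (which is what \d matches by default) and that the text contains no surrogate pairs.

```java
class SwapDemo {
    // Regex route from the answer: swap each digit with the non-digit that follows it.
    static String swapWithRegex(String input) {
        return input.replaceAll("(\\d)(\\D)", "$2$1");
    }

    // Hand-written equivalent working on a char array (assumption: ASCII digits only).
    static String swapWithLoop(String input) {
        char[] chars = input.toCharArray();
        int i = 0;
        while (i + 1 < chars.length) {
            if (isAsciiDigit(chars[i]) && !isAsciiDigit(chars[i + 1])) {
                char tmp = chars[i];
                chars[i] = chars[i + 1];
                chars[i + 1] = tmp;
                i += 2; // skip the pair just swapped, as the regex does
            } else {
                i++;
            }
        }
        return new String(chars);
    }

    private static boolean isAsciiDigit(char c) {
        return c >= '0' && c <= '9';
    }

    public static void main(String[] args) {
        String input = "sdgk4e5s3gj6ds3h6fggh";
        System.out.println(swapWithRegex(input)); // sdgke4s5g3jd6sh3f6ggh
        System.out.println(swapWithLoop(input));  // sdgke4s5g3jd6sh3f6ggh
    }
}
```

Note that a digit followed by another digit is left alone by both versions; only a digit directly followed by a non-digit is swapped.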

Related

Remove Character Occurrences Until the First Different One, Java

So I have some numerical strings like the following ones:
"00000545468" - "00002021" - "000000001990" etc. (I don't know how these strings will be passed to me; the only thing I know is that they will start with some zeros on the left, followed by other digits.)
I want to remove all the leading zeros (0), up to the first different digit in the string.
So if I have for example "00002021", I want to have as a result "2021" and if I have "000000001990" I want "1990".
I excluded the usage of .replace("0", ""), because by doing this I would also remove the zero in "2021" and in "1990", and I don't want this to happen.
Any suggestions?
You can use replaceFirst or replaceAll, but the main point is to anchor your regex so it will only replace the zeroes if they are at the beginning of the string.
"00002021".replaceFirst("^0+", "")
should return what you need, and does not replace zeroes if they are not at the beginning of the string.
You can check this at https://www.regexplanet.com/advanced/java/index.html (disclaimer: I do not own this website and I do not profit by linking to it).
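A small runnable sketch of the anchored pattern, in case it helps. The class name LeadingZeros and the second method are my own additions: the question does not say what should happen for an input made only of zeros, so the variant that keeps one zero is just an assumption about what might be wanted.

```java
class LeadingZeros {
    // Pattern from the answer: removes zeros only at the start of the string.
    static String strip(String s) {
        return s.replaceFirst("^0+", "");
    }

    // Assumed variant: never removes the last character, so "0000" becomes "0".
    static String stripKeepLast(String s) {
        return s.replaceFirst("^0+(?!$)", "");
    }

    public static void main(String[] args) {
        System.out.println(strip("00002021"));      // 2021
        System.out.println(strip("000000001990"));  // 1990
        System.out.println(stripKeepLast("0000"));  // 0
    }
}
```

With the plain pattern, an all-zero input such as "0000" turns into an empty string, which may or may not be acceptable.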

replacing string with regex in java

I think I have a decent handle wrt matching strings using Regex in Java, but now I am trying to replace strings using Regex and not having much success.
Simply put, I am trying to find where there is a digit immediately followed by a constant string "CMR", then adding a space between the digit and the "CMR" substring. "0CMR" should become "0 CMR", "5CMR" should become "5 CMR", etc. Any preceding non-digit should be left as it was.
So my source string is "theStringThat0CMRhas"
my command is:
replaceAll("[0-9]CMR", "[0-9] CMR");
I get the added space in the result, but the result becomes "theStringThat[0-9] CMRhas" which obviously isn't what I need. Somehow I need to tell Regex not to replace with "[0-9]", but with whatever it matched on in the first place.
I know I'm doing this wrong, but I don't know what's right.
Any help appreciated.
Thanks,
Tom
You want to use a capturing group:
replaceAll("([0-9])CMR", "$1 CMR")
$1 references the first group in the match, denoted by parentheses.
Also, [0-9] can be substituted with \d (written as \\d inside a Java string literal).
Try this:
replaceAll("(?<=\\d)(?=\\D)"," ")
It uses a lookbehind for a digit and a lookahead for a non-digit character, so it matches the empty position between them.
If you only want to do it where CMR follows the digit, use:
"(?<=\\d)(?=CMR)"
You should put the digit pattern in a capturing group and reference that group in the replacement argument. Your code becomes:
replaceAll("([0-9])CMR", "$1 CMR");
For more regex knowledge, please read this document
https://www.tutorialspoint.com/java/java_regular_expressions.htm
Good luck!
A good starting point for reading about regex may be here: http://www.regular-expressions.info/java.html
On this site, the page about replacement strings is here: http://www.regular-expressions.info/replacetutorial.html
In the replacement string, $ followed by a number refers to a captured group, and $0 refers to the whole regex match. You can use these to refer to what was matched:
String testString = "theStringThat0CMRhas";
String resultString = testString.replaceAll("[0-9]CMR","$0");
System.out.println(resultString);
This prints theStringThat0CMRhas: each match is replaced with itself, so nothing changes.
You obviously didn't want this, so let's change the replacement a little:
String testString = "theStringThat0CMRhas";
String resultString = testString.replaceAll("([0-9])CMR","$0 CMR");
System.out.println(resultString);
Now the pattern has parentheses, but the replacement doesn't reference them yet, so it replaces what it found with the same thing ($0), a space, and CMR.
Your result would now be: theStringThat0CMR CMRhas
So let's reference the group that captured the digit:
String testString = "theStringThat0CMRhas";
String resultString = testString.replaceAll("([0-9])CMR","$1 CMR");
System.out.println(resultString);
Now your answer will be: theStringThat0 CMRhas
It finds the digit followed by CMR and replaces the whole match with that digit, a space, and then CMR.
You are trying to do what I believe is called a backreference, though I am unsure. Regex is still not my strong suit either.
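To compare the approaches from the answers in one place, here is a minimal sketch. The class and method names are invented for the example; the patterns themselves are the ones quoted in the answers.

```java
class InsertSpaceDemo {
    // Capturing group: re-insert the matched digit via $1.
    static String withGroup(String s) {
        return s.replaceAll("([0-9])CMR", "$1 CMR");
    }

    // Lookarounds: match the empty position between a digit and CMR, insert a space there.
    static String withLookaround(String s) {
        return s.replaceAll("(?<=\\d)(?=CMR)", " ");
    }

    public static void main(String[] args) {
        String source = "theStringThat0CMRhas";
        System.out.println(withGroup(source));      // theStringThat0 CMRhas
        System.out.println(withLookaround(source)); // theStringThat0 CMRhas
    }
}
```

Both give the same result here. The lookaround version consumes no characters, so nothing has to be put back in the replacement.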

Java split by alphabeta char creates an empty value in array

I want to split my string at every occurrence of an alphabetic character.
For example:
"s1l1e13" to an array of: ["s1","l1","e13"]
When trying to use this simple split by regex I get some weird results:
testStr = "s1l1e13"
Arrays.toString(testStr.split("(?=[a-z])"))
gives me the array of:
["","s1","l1","e13"]
How can I create the split without the empty array element?
I tried a couple more things:
testStr = "s1"
Arrays.toString(testStr.split("(?=[a-z])"))
does return the correct array: ["s1"]
But when trying to use substring
testStr = "s1l1e13"
Arrays.toString(testStr.substring(1).split("(?=[a-z])"))
I get ["1","l1","e13"] in return.
What am I missing?
Your lookahead marks each position before any character from a to z, which gives the following positions:
s1l1e13
^ ^ ^
So splitting on just the lookahead returns ["", "s1", "l1", "e13"].
You can add a negative lookbehind here. It checks that the position is not the beginning of the string.
String s = "s1l1e13";
String[] parts = s.split("(?<!\\A)(?=[a-z])");
System.out.println(Arrays.toString(parts)); //=> [s1, l1, e13]
Your problem is that (?=[a-z]) means "place before [a-z]" and in your text
s1l1e13
you have 3 such places. I will mark them with |
|s1|l1|e13
so split (unfortunately correctly) produces "" "s1" "l1" "e13" and doesn't automatically remove the leading empty element for you.
To solve this problem you have at least two options:
make sure that there is something before the place you split on (so that it is not at the start of your string). For instance, you can use (?<=\\d)(?=[a-z]) if you want to split after a digit but before a letter
(PREFERRED SOLUTION) start using Java 8, which no longer adds an empty string at the start of the result array when the regex used in split matches with zero length there (look-arounds are zero-length).
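The first option can be sketched as follows. The class and method names are made up for the example, and it assumes every letter after the first is preceded by a digit, as in the question's input.

```java
import java.util.Arrays;

class SplitDemo {
    // Split only where a digit is behind and a lowercase letter is ahead,
    // so position 0 is never a split point.
    static String[] splitBeforeLetters(String s) {
        return s.split("(?<=\\d)(?=[a-z])");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitBeforeLetters("s1l1e13"))); // [s1, l1, e13]
        System.out.println(Arrays.toString(splitBeforeLetters("s1")));      // [s1]
    }
}
```

Because the lookbehind requires a digit, this gives the same result on Java 7 and Java 8.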
The first match produces "" because the regex only looks ahead for an alphabetic character. This is a zero-width lookahead, so it doesn't need to consume anything. The "s" at the beginning is alphabetic, so the regex matches at position 0, before any text.
If you want the regex to match something always, use ".+(?=[a-z])"
The problem is that the initial "s" counts as an alphabetic character. So, the regex is trying to split at s.
The issue is that there is nothing before the s, so split reports that by adding an empty string as the first element. (Trailing empty strings, by contrast, are dropped by split by default.)
If this is the only string you're splitting, or if every string you split starts with a letter, just drop the first element of the array. Otherwise, you'll probably need to loop through each array as you make it so that you can drop empty elements.
So it seems your tokens have the pattern x###, where x is a letter and ### is one or more digits.
I'd use the following regex to match the tokens directly:
([a-z][0-9]+)
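A minimal sketch of how that pattern could be used with Matcher.find() to collect the pieces. The class and method names are hypothetical; the parentheses are dropped because the whole match is what we want.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TokenMatcher {
    private static final Pattern TOKEN = Pattern.compile("[a-z][0-9]+");

    // Collect every "letter followed by digits" token instead of splitting.
    static List<String> tokens(String s) {
        List<String> result = new ArrayList<>();
        Matcher m = TOKEN.matcher(s);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tokens("s1l1e13")); // [s1, l1, e13]
    }
}
```

Anything in the input that doesn't fit the letter-plus-digits shape is silently skipped, which may or may not be what you want.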

Removing every other character in a string using Java regex

I have this homework problem where I need to use regex to remove every other character in a string.
In one part, I have to delete characters at index 1,3,5,... I have done this as follows:
String s = "1a2b3c4d5";
System.out.println(s.replaceAll("(.).", "$1"));
This prints 12345, which is what I want. Essentially I match two characters at a time and replace them with the first character, using a capturing group.
The problem is, I'm having trouble with the second part of the homework, where I need to delete characters at index 0,2,4,...
I have done the following:
String s = "1a2b3c4d5";
System.out.println(s.replaceAll(".(.)", "$1"));
This prints abcd5, but the correct answer must be abcd. My regex is only incorrect if the input string length is odd. If it's even, then my regex works fine.
I think I'm really close to the answer, but I'm not sure how to fix it.
You are indeed very close to the answer: just make matching the second char optional.
String s = "1a2b3c4d5";
System.out.println(s.replaceAll(".(.)?", "$1"));
// prints "abcd"
This works because:
Regex is greedy by default, it will take the second character if it's there
When the input is of odd length, the second char won't be there at the last replacement, but you'd still match one char (i.e. last char in input)
You can still use backreferences in substitution even if the group fails to match
It will substitute in the empty string, not "null"
This is different from Matcher.group(int), which returns null for failed groups
References
regular-expressions.info/Optional
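To see the difference between the two behaviors, here is a small sketch (class and method names are mine) that runs the same pattern on a one-character input, where group 1 does not take part in the match.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OptionalGroupDemo {
    // Returns what Matcher.group(1) reports when the whole input matches ".(.)?".
    static String groupOneOf(String input) {
        Matcher m = Pattern.compile(".(.)?").matcher(input);
        return m.matches() ? m.group(1) : "no match";
    }

    public static void main(String[] args) {
        // The group did not participate, so group(1) is null...
        System.out.println(groupOneOf("5"));                          // null
        // ...but $1 in a replacement string substitutes the empty string.
        System.out.println("[" + "5".replaceAll(".(.)?", "$1") + "]"); // []
    }
}
```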
A closer look at the first part
Let's take a closer look at the first part of the homework:
String s = "1a2b3c4d5";
System.out.println(s.replaceAll("(.).", "$1"));
// prints "12345"
Here you didn't have to use ? for the second char, but it "works" because even though you didn't match the last char, you didn't have to! The last char can remain unmatched, unreplaced, due to the problem specification.
Now suppose that we want to delete chars at index 1,3,5..., and put the chars at index 0,2,4... in brackets.
String s = "1a2b3c4d5";
System.out.println(s.replaceAll("(.).", "($1)"));
// prints "(1)(2)(3)(4)5"
A-ha!! Now you're experiencing the exact same problem with odd-length input! You couldn't match the last char with your regex, because your regex needs two chars, but there's only one char at the end for odd-length input!
The solution, again, is to make matching the second char optional:
String s = "1a2b3c4d5";
System.out.println(s.replaceAll("(.).?", "($1)"));
// prints "(1)(2)(3)(4)(5)"
my regex is only incorrect if the input string length is odd. if it's even, then my regex works fine.
Change your expression to .(.)? - the question mark makes the second character optional, which means it doesn't matter whether the input length is odd or even.
Your regex needs 2 chars to match, so fails on the final char.
This regex:
".(.{0,1})"
Will make the second char optional, so it will match with your final '5' as well

Regular expression removing all words shorter than n

Well, I'm looking for a regexp in Java that deletes all words shorter than 3 characters.
I thought something like \s\w{1,2}\s would grab all the 1 and 2 letter words (a whitespace, one to two word characters and another whitespace), but it just doesn't work.
Where am I wrong?
I've got it working fairly well, but it took two passes.
public static void main(String[] args) {
    String passage = "Well, I'm looking for a regexp in Java that deletes all words shorter than 3 characters.";
    System.out.println(passage);
    passage = passage.replaceAll("\\b[\\w']{1,2}\\b", "");
    passage = passage.replaceAll("\\s{2,}", " ");
    System.out.println(passage);
}
The first pass deletes all words containing fewer than three characters (it replaces them with an empty string). Note that I had to include the apostrophe in the character class because the word "I'm" was giving me trouble without it. You may find other special characters in your text that you also need to include here.
The second pass is necessary because the first pass left a few spots where there were double spaces. This just collapses all occurrences of 2 or more spaces down to one. It's up to you whether you need to keep this or not, but I think it's better with the spaces collapsed.
Output:
Well, I'm looking for a regexp in Java that deletes all words shorter than 3 characters.
Well, looking for regexp Java that deletes all words shorter than characters.
If you don't want the whitespace matched, you might want to use
\b\w{1,2}\b
to get the word boundaries.
That's working for me in RegexBuddy using the Java flavor; for the test string
"The dog is fun a cat"
it highlights "is" and "a". Similarly for words at the beginning/end of a line.
You might want to post a code sample.
(And, as GameFreak just posted, you'll still end up with double spaces.)
EDIT:
\b\w{1,2}\b\s?
is another option. This will partially fix the space-stripping issue, although words at the end of a string or followed by punctuation can still cause issues. For example, "A dog is fun no?" becomes "dog fun ?" In any case, you're still going to have issues with capitalization (dog should now be Dog).
Try: \b\w{1,2}\b although you will still have to get rid of the double spaces that will show up.
If you have a string like this:
hello there my this is a short word
This regex will match all words in the string greater than or equal to 3 characters in length:
\w{3,}
Resulting in:
hello there this short word
That, to me, is the easiest approach. Why try to match what you don't want, when you can match what you want a lot easier? No double spaces, no leftovers, and the punctuation is under your control. The other approaches break on multiple spaces and aren't very robust.
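In Java that idea could look roughly like this. It is only a sketch: the class name, the method name, and the choice to join the kept words with single spaces are my assumptions, not part of the answer.

```java
import java.util.StringJoiner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class KeepLongWords {
    // Keep every run of at least minLength word characters, joined by single spaces.
    static String keepLongWords(String text, int minLength) {
        Matcher m = Pattern.compile("\\w{" + minLength + ",}").matcher(text);
        StringJoiner joined = new StringJoiner(" ");
        while (m.find()) {
            joined.add(m.group());
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        System.out.println(keepLongWords("hello there my this is a short word", 3));
        // hello there this short word
    }
}
```

Punctuation is dropped along with the short words here, since only the matched word characters are kept.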
