Append a char at the end of each word using regex - java

I am looking for a regex (Java) that appends a '~' character to the end of each word.
My requirements are:
1. Append ~ to the end of each word.
2. If a word contains any special character, do not append '~'.
3. If there are multiple whitespace characters in a row, collapse them into a single space.
Please have a look at my example below:
Input: Hello World How* A1e Y?u
Output: Hello~ World~ How* A1e~ Y?u
With help from a forum I got most of it working, but I am not able to achieve #2.
My code snippet:
pattern = ([^\\s][a-zA-Z0-9])(\\s|$);
pattern.matcher(searchTerm).replaceAll("$1~$2");
How can I skip append operation if word has any special char?
Please suggest.

I suggest using
searchTerm = searchTerm.replaceAll("(?<!\\S)\\w++(?!\\S)", "$0~").replaceAll("\\s{2,}", " ").trim();
Details
(?<!\S) - a negative lookbehind making sure there is either a whitespace or start of string right before the current location
\w++ - 1 or more word chars
(?!\S) - a negative lookahead making sure there is either a whitespace or end of string right after the current location.
The $0 is the whole match value.
The .replaceAll("\\s{2,}", " ") (for regular spaces, just replace \\s with a space) part "shrinks" any two or more whitespace characters to a single space, and .trim() part trims the result from whitespace on both ends.
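As a minimal sketch, the chain above can be wrapped in a helper and run against the input from the question. The class and method names (AppendTildeDemo, appendTilde) are made up for illustration. Note that \w also matches the underscore, so a token such as a_b counts as a plain word here.

```java
class AppendTildeDemo {
    // Appends '~' to every whitespace-delimited token that consists only of
    // word characters, then collapses whitespace runs and trims both ends.
    static String appendTilde(String searchTerm) {
        return searchTerm
                .replaceAll("(?<!\\S)\\w++(?!\\S)", "$0~") // $0 is the whole match
                .replaceAll("\\s{2,}", " ")
                .trim();
    }

    public static void main(String[] args) {
        System.out.println(appendTilde("Hello World   How* A1e Y?u"));
        // prints: Hello~ World~ How* A1e~ Y?u
    }
}
```

The possessive \w++ matters for tokens like How*: once it has consumed How, it does not give characters back, so the lookahead fails on * and the token is left untouched.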

This might help:
public static void main(String[] args) {
String input = "Hello World How* A1e Y?u word";
String extraSpaceInput = String.format(" %s ", input.replaceAll("\\s+", " "));
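// Wanted output: Hello~ World~ How* A1e~ Y?u word~
// Lookarounds check the surrounding whitespace without consuming it,
// so two adjacent words can both match
Pattern pattern = Pattern.compile("(?<=\\s)([a-zA-Z0-9]+)(?=\\s)");
String output = pattern.matcher(extraSpaceInput).replaceAll("$1~");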
String cleanedUpOutput = output.replaceAll("\\s+", " ").trim();
// My output: "Hello~ World~ How* A1e~ Y?u word~"
System.out.println("My output: \"" + cleanedUpOutput + "\"");
}

Related

Replace .(dot) inside number Java between two elements

I have this String:
String str = "<p>23.5</p>";
And I want to replace the dot with a comma, but only inside elements. The output I need is:
<p>23,5</p>
I can't figure it out. I have this:
str = str.replaceAll("(?<=<p>)\\.(?=</p>)", ",");
But it doesn't work. I need to replace the dot only in elements with a particular tag (the XML is held in a String), in this case <p>.
Thank you
You may use capturing groups (escaping the / is optional in Java, since it is not a special character there):
str = str.replaceAll("(?<=<p>)(\\d*)\\.(\\d+)(?=<\\/p>)", "$1,$2");
If you want to replace dot in all numbers, you may just as well use
str = str.replaceAll("(\\d*)\\.(\\d+)", "$1,$2");
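A small sketch comparing the two replacements on a string that has one number inside <p> and one outside it. The class and method names are invented for the example, and the slash is left unescaped because Java does not require escaping it. The pattern from the question fails because the dot is not directly adjacent to the tags: the digits sit between them, which is why the digits are captured and put back.

```java
class DotToCommaDemo {
    // Replaces the decimal dot only when the number is the whole content of <p>...</p>.
    static String commaInsideP(String xml) {
        return xml.replaceAll("(?<=<p>)(\\d*)\\.(\\d+)(?=</p>)", "$1,$2");
    }

    // Replaces the decimal dot in every number, regardless of surrounding tags.
    static String commaEverywhere(String text) {
        return text.replaceAll("(\\d*)\\.(\\d+)", "$1,$2");
    }

    public static void main(String[] args) {
        String sample = "<p>23.5</p> and 12.75";
        System.out.println(commaInsideP(sample));    // <p>23,5</p> and 12.75
        System.out.println(commaEverywhere(sample)); // <p>23,5</p> and 12,75
    }
}
```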
The following regex matches a dot character that sits between two digits:
(?<=\d)\.(?=\d)
Regex Explanation:
\d - match any digit between 0-9
(?<=\d)\. - positive look-behind to match any . character that has a digit just before it
\.(?=\d) - positive look-ahead to match any . character that has a digit just after it
Demo:
https://regex101.com/r/WMEjPl/1
Java Code Example:
public static void main(String args[]) {
String regex = "(?<=\\d)\\.(?=\\d)";
String str = "<p>23.5</p>";
String str2 = "Mr. John <p>23.5</p> Hello";
String str3 = "Mr. John <p>23.5</p> Hello 12.2324";
System.out.println(str.replaceAll(regex, ",")); // <p>23,5</p>
System.out.println(str2.replaceAll(regex, ",")); // Mr. John <p>23,5</p> Hello
System.out.println(str3.replaceAll(regex, ",")); // Mr. John <p>23,5</p> Hello 12,2324
}

Getting the first word in a String after quotation " (Java)

I have the following String:
String sentence = "this is my sentence \"course of math\" of this year";
I need to get the first word after a quote like this one ".
In my example I would get the word : course.
That's really simple. Try this:
/"(\w+)/
You can then get the expected word from capture group 1 ($1).
" matches the character " literally
( ) is a capturing group
\w+ matches one or more word characters [a-zA-Z0-9_]
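In Java the pattern has no surrounding slashes, and the captured word is read with Matcher.group(1). A minimal sketch, with a made-up class and helper name, that returns null when nothing matches:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FirstQuotedWordDemo {
    private static final Pattern QUOTE_THEN_WORD = Pattern.compile("\"(\\w+)");

    // Returns the word following the first double quote that is immediately
    // followed by a word character, or null if there is no such quote.
    static String firstWordAfterQuote(String sentence) {
        Matcher m = QUOTE_THEN_WORD.matcher(sentence);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String sentence = "this is my sentence \"course of math\" of this year";
        System.out.println(firstWordAfterQuote(sentence)); // course
    }
}
```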
An alternative replaceAll approach:
String sentence = "this is my sentence \"course of math\" of this year";
System.out.println(sentence.replaceAll("(?s)[^\"]*\"(\\w+).*", "$1"));
// Or - if there can be a space after the first quote:
sentence = "this is my sentence \" course of math\" of this year";
System.out.println(sentence.replaceAll("(?s)[^\"]*\"\\s*(\\w+).*", "$1"));
It returns course because the pattern grabs any characters up to the first " (with [^"]*), then matches the quote, then matches and captures 1+ alphanumeric or underscore characters (with (\w+)), and then matches any 0+ characters up to the end (with .*), and we replace it all with just the contents of Group 1.
Just in case someone wonders if a non-regex solution is also possible, here is one that does not support spaces between the first " and the word:
String sentence = "this is my sentence \"course of math\" of this year";
String[] MyStrings = sentence.split(" "); // Split with a space
String res = "";
for(int i=0; i < MyStrings.length; i++) // Iterate over the split parts
{
if(MyStrings[i].startsWith("\"")) // Check if the split chunk starts with "
{
res = MyStrings[i].substring(1); // Get a substring from Index 1
break; // Stop the iteration, yield the value found first
}
}
System.out.println(res);
And here is another one that supports spaces between the first " and the next word:
String sentence = "this is my sentence \" course of math\" of this year";
String[] MyStrings = sentence.split("\"");
String res = MyStrings.length == 1 ? MyStrings[0] : // If no split took place use the whole string
MyStrings[1].trim().indexOf(" ") > -1 ? // If the second element has space
MyStrings[1].trim().substring(0, MyStrings[1].trim().indexOf(" ")): // Get substring
MyStrings[1]; // Else, fetch the whole second element
System.out.println(res);

How to replace last letter to another letter in java using regular expression

I have seen "," replaced with "." using ".$"|",$", but this logic does not work with letters.
I need to replace the last letter of every word in the string EXAMPLE_TEST with another letter, using Java.
This is my code:
Pattern replace = Pattern.compile("n$");//here got the real problem
matcher2 = replace.matcher(EXAMPLE_TEST);
EXAMPLE_TEST=matcher2.replaceAll("k");
I also tried "//n$", "\n$", etc.
Please help me find a solution.
input text=>njan ayman
output text=> njak aymak
Instead of the end of string $ anchor, use a word boundary \b
String s = "njan ayman";
s = s.replaceAll("n\\b", "k");
System.out.println(s); //=> "njak aymak"
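To make the difference between the two anchors concrete, here is a short sketch that runs both on the sample text. The helper replaceLastLetter and its parameters are hypothetical. It quotes the letters so they are treated literally, and it assumes the old letter is a word character; otherwise \b would not sit where expected.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LastLetterDemo {
    // Replaces oldLetter with newLetter wherever oldLetter ends a word.
    static String replaceLastLetter(String text, char oldLetter, char newLetter) {
        String regex = Pattern.quote(String.valueOf(oldLetter)) + "\\b";
        return text.replaceAll(regex, Matcher.quoteReplacement(String.valueOf(newLetter)));
    }

    public static void main(String[] args) {
        String text = "njan ayman";
        // $ only matches at the end of the input, so only the last word changes
        System.out.println(text.replaceAll("n$", "k"));      // njan aymak
        // \b matches at the end of every word
        System.out.println(replaceLastLetter(text, 'n', 'k')); // njak aymak
    }
}
```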
You can use lookahead and group matching:
String EXAMPLE_TEST = "njan ayman";
String s = EXAMPLE_TEST.replaceAll("(n)(?=\\s|$)", "k");
System.out.println("s = " + s); // prints: s = njak aymak
Explanation:
(n) - the matched word character
(?=\\s|$) - which is followed by a space or at the end of the line (lookahead)
The above is only an example. If you want to replace every such comma with a period, the middle line should be changed to:
s = s.replaceAll("(,)(?=\\s|$)", "\\.");
Here's how I would set it up:
(?=.\b)\w
Which in Java would need to be escaped as following:
(?=.\\b)\\w
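It translates to something like "a word character (\w) that is immediately followed by a word boundary": the lookahead (?=.\b) requires a word boundary (\b) right after the next single character (.), and \w then matches that character.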
String s = "njan ayman aowkdwo wdonwan. wadawd,.. wadwdawd;";
s = s.replaceAll("(?=.\\b)\\w", "");
System.out.println(s); //nja ayma aowkdw wdonwa. wadaw,.. wadwdaw;
This removes the last character of all words, but leaves following non-alphanumeric characters. You can specify only specific characters to remove/replace by changing the . to something else.
However, the other answers are perfectly good and might achieve exactly what you are looking for.
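// Non-regex alternative for a single word; oldLetter and newLetter are placeholder char variables
if (word.endsWith(String.valueOf(oldLetter))) {
word = word.substring(0, word.length() - 1) + newLetter;
}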
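The endsWith/substring idea can be applied to every word of the sentence without any regex. This is only a sketch: it assumes words are separated by single spaces, and the class and method names are invented for the example.

```java
class LastLetterNoRegexDemo {
    // Splits on single spaces and swaps the last letter of each word
    // when it equals oldLetter.
    static String replaceLastLetter(String text, char oldLetter, char newLetter) {
        String[] words = text.split(" ");
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < words.length; i++) {
            String word = words[i];
            if (word.endsWith(String.valueOf(oldLetter))) {
                word = word.substring(0, word.length() - 1) + newLetter;
            }
            if (i > 0) {
                out.append(' '); // restore the separator removed by split
            }
            out.append(word);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceLastLetter("njan ayman", 'n', 'k')); // njak aymak
    }
}
```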

Remove beginning punctuation from a word

I have seen a couple of threads here that kind of match what I am asking, but none are concrete. If I have a string like "New Delhi", I want my code to extract New Delhi, so the quotes are stripped off. In general, I want to strip off any punctuation at the start and end.
So far, this helps to strip the punctuations at the end: String replacedString = replaceable_string.replaceAll("\\p{Punct}*([a-z]+)\\p{Punct}*", "$1");
What am I doing wrong here? My output is "New Delhi with the beginning quote still there.
The following will remove a punctuation character from both the beginning and end of a String object if present:
String s = "\"New, Delhi\"";
// Output: New, Delhi
System.out.println(s.replaceAll("^\\p{Punct}|\\p{Punct}$", ""));
The ^ part of the Regex represents the beginning of the text, and $ represents the end of the text. So, ^\p{Punct} will match a punctuation that is a first character and \p{Punct}$ will match a punctuation that is a last character. I used | (OR) to match either the first expression or the second one, resulting in ^\p{Punct}|\p{Punct}$.
In case you want to remove all punctuation characters from the beginning and the end of the String object, you can use the following:
String s = "\"[{New, Delhi}]\"";
// Output: New, Delhi
System.out.println(s.replaceAll("^\\p{Punct}+|\\p{Punct}+$", ""));
I simply added the + sign after each \p{Punct}. The + sign means "One or more", so it will match many punctuations if they are present at the beginning or end of the text.
Hope this is what you were looking for :)
class SO {
public static void main(String[] args) {
String input = "\"New Delhi\"";
String output = "";
try {
output = input.replaceAll("(^\\p{P}+)(.+)(\\p{P}+$)", "($1)($2)($3)");
} catch (IndexOutOfBoundsException e) {
}
System.out.println("Input: " + input);
System.out.println("Output: " + output);
}
}
Result:
Input: "New Delhi"
Output: (")(New Delhi)(")
String replacedString = replaceable_string.replaceAll("^\"|\"$", "");
or
String replacedString = replaceable_string.replace("\"", "");
should also work (note that the second form removes every double quote, not only those at the ends).
Try using this :
String data = "\"New Delhi\"";
Pattern pattern = Pattern.compile("[^\\w\\s]*([\\w\\s]+)[^\\w\\s]*");
Matcher matcher = pattern.matcher(data);
while (matcher.find()) {
// Indicates match is found. Do further processing
System.out.println(matcher.group(1));
}
Try:
String s = "\"New Delhi\"".replaceAll("\\p{Punct}*(\\P{Punct}+)\\p{Punct}*", "$1");
Your [a-z] only captures lowercase letters and no spaces. Try ([a-zA-Z ]+).
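To see why the pattern from the question leaves the leading quote in place, it helps to run it next to an anchored pattern. In the sketch below (class and method names are made up), the original pattern cannot match at the opening quote because N is not in [a-z]. It only matches "ew" and "elhi" plus the closing quote, so the opening quote is never part of any match.

```java
class StripPunctuationDemo {
    // The pattern from the question: lowercase runs with optional punctuation around them.
    static String original(String s) {
        return s.replaceAll("\\p{Punct}*([a-z]+)\\p{Punct}*", "$1");
    }

    // Anchored alternative: strip punctuation runs at the very start and very end only.
    static String anchored(String s) {
        return s.replaceAll("^\\p{Punct}+|\\p{Punct}+$", "");
    }

    public static void main(String[] args) {
        String input = "\"New Delhi\"";
        System.out.println(original(input)); // "New Delhi   (leading quote survives)
        System.out.println(anchored(input)); // New Delhi
    }
}
```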

Match only first and last character of a string

I had a look at other stackoverflow questions and couldn't find one that asked the same question, so here it is:
How do you match the first and last characters of a string (can be multi-line or empty).
So for example:
String = "this is a simple sentence"
Note that the string includes the beginning and ending quotation marks.
How do I match the first and last characters when the string begins and ends with a quotation mark (")?
I tried:
^"|$" and \A"\Z"
but these do not produce the desired result.
Thanks for your help in advance :)
Is this what you are looking for?
String input = "\"this is a simple sentence\"";
String result = input.replaceFirst("(?s)^\"(.*)\"$", " $1 ");
This will replace the first and last character of the input string with spaces if it starts and ends with ". It will also work across multiple lines since the DOTALL flag is specified by (?s).
The regex that matches the whole input is ".*". In Java, it looks like this:
String regex = "\".*\"";
System.out.println("\"this is a simple sentence\"".matches(regex)); // true
System.out.println("this is a simple sentence".matches(regex)); // false
System.out.println("this is a simple sentence\"".matches(regex)); // false
If you want to remove the quotes, use this:
String input = "\"this is a simple sentence\"";
input = input.replaceAll("(^\"|\"$)", ""); // this is a simple sentence (without any quotes)
If you want this to work over multiple lines, use this:
String input = "\"this is a simple sentence\"\n\"and another sentence\"";
System.out.println(input + "\n");
input = input.replaceAll("(?m)(^\"|\"$)", "");
System.out.println(input);
which produces output:
"this is a simple sentence"
"and another sentence"
this is a simple sentence
and another sentence
Explanation of regex (?m)(^"|"$):
(?m) means "Caret and dollar match after and before newlines for the remainder of the regular expression"
(^"|"$) means ^" OR "$, which means "start of line then a double quote" OR "double quote then end of line"
Why not use the simple logic of getting the first and last characters based on charAt method of String? Place a few checks for empty/incomplete strings and you should be done.
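A minimal sketch of that charAt approach, with hypothetical helper names (isQuoted, stripQuotes). The length check covers the empty string and a lone quote character, and because no regex is involved, newlines inside the string need no special handling.

```java
class QuoteCheckDemo {
    // True when s has at least two characters and both the first and last are '"'.
    static boolean isQuoted(String s) {
        return s != null
                && s.length() >= 2
                && s.charAt(0) == '"'
                && s.charAt(s.length() - 1) == '"';
    }

    // Removes the surrounding quotes if present; otherwise returns s unchanged.
    static String stripQuotes(String s) {
        return isQuoted(s) ? s.substring(1, s.length() - 1) : s;
    }

    public static void main(String[] args) {
        System.out.println(isQuoted("\"this is a simple sentence\""));    // true
        System.out.println(isQuoted("this is a simple sentence\""));      // false
        System.out.println(stripQuotes("\"this is a simple sentence\"")); // this is a simple sentence
    }
}
```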
String regexp = "(?s)\".*\"";
String data = "\"This is some\n\ndata\"";
Matcher m = Pattern.compile(regexp).matcher(data);
if (m.find()) {
System.out.println("Match starts at " + m.start() + " and ends at " + m.end());
}
