Regex to match a word that includes a dot - Java

I have this sentence, for example:
"HalloAnna daveca.nn dave anna ca. anna"
I only want to match the standalone "ca.".
My regex is:
(?i)\b(ca\.)\b
But this doesn't work and I don't know why. Any ideas?
Update:
I execute it with:
testSource.replaceAll()
and with
pattern.matcher(testSource).replaceAll().
Neither works.

You must escape the dot and assert that a non-word character follows it:
(?i)\bca\.(?=\W)

You should use it like this:
Pattern.compile("(?i)\\b(ca\\.)(?=\\W)").matcher(a).replaceAll("SOME TEXT");
Without the Java string escapes, the regex is: (?i)\b(ca\.)(?=\W).
Every \ in the regex has to be escaped in a Java string literal, so \ becomes \\.
A word boundary (\b) matches only at a position where a word character (letter, digit, or underscore) sits next to a non-word character or the start/end of the input. Your "ca." ends with a dot, which is not a word character, and it is followed by a space, which is not one either, so there is no boundary after the dot and the trailing \b fails. Use \W instead to require that a non-word character follows the dot. To keep that character out of the match (so it won't be replaced), put \W inside a lookahead: (?=\W).
Also note that an unescaped . matches any character. To match a literal dot you have to escape it as \., which in a Java string becomes \\..
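To check the behavior on the sentence from the question, here is a minimal sketch. The class name StandaloneCaDemo, the helper replaceWith, and the replacement text "X" are made up for illustration; only the two patterns come from this thread.

```java
import java.util.regex.Pattern;

class StandaloneCaDemo {
    // Pattern from the question: the trailing \b never matches between "." and " ".
    static final Pattern ORIGINAL = Pattern.compile("(?i)\\b(ca\\.)\\b");

    // Pattern from the answers: a lookahead requires a non-word character after the dot.
    static final Pattern FIXED = Pattern.compile("(?i)\\b(ca\\.)(?=\\W)");

    // Hypothetical helper: replaces every match of the given pattern.
    static String replaceWith(Pattern pattern, String input, String replacement) {
        return pattern.matcher(input).replaceAll(replacement);
    }

    public static void main(String[] args) {
        String testSource = "HalloAnna daveca.nn dave anna ca. anna";
        // prints the sentence unchanged
        System.out.println(replaceWith(ORIGINAL, testSource, "X"));
        // prints "HalloAnna daveca.nn dave anna X anna"
        System.out.println(replaceWith(FIXED, testSource, "X"));
    }
}
```

Note that (?=\W) needs a character after the dot, so a "ca." at the very end of the input would not match; (?=\W|$) is one way to cover that case as well.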

Related

Escaping ampersand in the regexp

I have to match words that end with an ampersand character. I came up with this regex: \w*\&\b. The same approach works for letters, for example \w*a\b, but when I use the escaped ampersand (as in the first example) it won't match words ending with it.
By the way, I was using https://regex101.com/ to test my regexes.
Here's the same regex provided by @MonkeyZeus, with a small correction to accept a whole word ending with multiple ampersands (e.g. wordendingwithmultiple&&&&):
\w+\&+(?=\W|$)
It uses \&+ (one or more ampersands) instead of just &.
You can use this:
\w+&(?=\W|$)
\w+ - require at least one word char
& - followed by an ampersand
(?=\W|$) - positive lookahead after the ampersand for a non-word char or the end of the line
Just make sure to double up on the backslashes for Java string escaping:
\\w+&(?=\\W|$)
https://regex101.com/r/WaSEyJ/1/
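The reason \w*\&\b fails is that & is not a word character, so \b after it only matches when a word character follows. A minimal Java sketch of the lookahead version, using the multiple-ampersand variant; the class name AmpersandWordsDemo, the helper findAll, and the sample input are assumptions for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AmpersandWordsDemo {
    // Regex: \w+&+(?=\W|$)
    static final Pattern WORD_WITH_AMPERSANDS = Pattern.compile("\\w+&+(?=\\W|$)");

    // Hypothetical helper: collects every match found in the input.
    static List<String> findAll(String input) {
        List<String> found = new ArrayList<>();
        Matcher m = WORD_WITH_AMPERSANDS.matcher(input);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // "qux&x" is skipped because a word character follows the ampersand.
        // prints [foo&, baz&&, end&]
        System.out.println(findAll("foo& bar baz&& qux&x end&"));
    }
}
```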

java regex matching &[text(text - text text) !text]

I am working on a regex to split out all occurrences of strings that match the following format: &[text(text - text text) !text]. Here text can be any characters, and the spacing is significant: the text will appear exactly as shown.
I have tried the following regex but I cannot seem to get it to work:
&\\[([^\\]]*)\\]
Any help would be greatly appreciated.
You can replace text with \w+ to match one or more word characters.
Assuming everything else was a literal, the following regular expression should work:
&\[\w+\(\w+ - \w+ \w+\) !\w+\]
You could also use [a-zA-Z] in place of \w if you would like. It is sometimes easier to understand since it explicitly describes the characters to match, a-z and A-Z inclusive.
&\[[a-zA-Z]+\([a-zA-Z]+ - [a-zA-Z]+ [a-zA-Z]+\) ![a-zA-Z]+\]
And for one character only, remove the +
&\[\w\(\w - \w \w\) !\w\]
&\[[a-zA-Z]\([a-zA-Z] - [a-zA-Z] [a-zA-Z]\) ![a-zA-Z]\]
P.S. I can't remember whether -, &, or ! count as special regex characters; if they do, you can make them literal with \-, \&, or \!.
P.P.S. In Java you have to escape \, so \w becomes \\w in a string.
If you want to extract the text parts as groups to work with them afterwards (written as a Java string):
&\\[(\\w+)\\((\\w+)\\s\\-\\s(\\w+)\\s(\\w+)\\)\\s!(\\w+)]
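A small sketch of pulling the five parts out with capture groups. It assumes, as the answers do, that each text is made of word characters (\w+) and that the spaces are literal single spaces; the class name BracketFormatDemo, the helper firstParts, and the sample input are made up for illustration.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BracketFormatDemo {
    // Regex: &\[(\w+)\((\w+) - (\w+) (\w+)\) !(\w+)\]
    static final Pattern FORMAT = Pattern.compile(
            "&\\[(\\w+)\\((\\w+) - (\\w+) (\\w+)\\) !(\\w+)\\]");

    // Hypothetical helper: returns the five captured parts of the first
    // occurrence, or null when the input contains no occurrence.
    static String[] firstParts(String input) {
        Matcher m = FORMAT.matcher(input);
        if (!m.find()) {
            return null;
        }
        String[] parts = new String[m.groupCount()];
        for (int i = 0; i < parts.length; i++) {
            parts[i] = m.group(i + 1); // groups are numbered from 1
        }
        return parts;
    }

    public static void main(String[] args) {
        // prints [abc, def, ghi, jkl, mno]
        System.out.println(Arrays.toString(firstParts("see &[abc(def - ghi jkl) !mno] here")));
    }
}
```

If text really can be any character, \w+ would have to be swapped for something broader, such as a negated character class for each delimiter.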

How to match tab and newline but not space with REGEX?

I am trying to match tab and newline characters, but not spaces, with a regex in Java.
\s matches everything, i.e. tab, space, and newline, but I don't want the space to be matched.
How do I do that?
Thanks.
One way to do it is:
[^\\S ]
The negated character class makes this regex match any character except \\S (non-whitespace) and " " (space). So it matches everything \\s matches except the space.
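A quick way to see this in action is to replace every match with a visible marker. This is only a sketch: the class name, the helper names, and the "_" marker are assumptions, and the second helper lists the characters explicitly with \x0B for the vertical tab, purely for comparison.

```java
class WhitespaceExceptSpaceDemo {
    // Regex [^\S ]: any whitespace character except the plain space.
    static String markWithNegatedClass(String input) {
        return input.replaceAll("[^\\S ]", "_");
    }

    // Regex [\t\n\r\f\x0B]: the same characters listed one by one.
    static String markWithExplicitList(String input) {
        return input.replaceAll("[\\t\\n\\r\\f\\x0B]", "_");
    }

    public static void main(String[] args) {
        String sample = "a b\tc\nd";
        // Both lines print "a b_c_d": the space survives, tab and newline do not.
        System.out.println(markWithNegatedClass(sample));
        System.out.println(markWithExplicitList(sample));
    }
}
```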
Alternatively, list the characters explicitly inside [...] (a character class):
"[\\t\\n\\r\\f\\v]"

How to match ^(d+) in a particular text using regex

For example I have text like below :
case1:
(1) Hello, how are you?
case2:
Hi. (1) How're you doing?
Now I want to match the text which starts with (\d+).
I have tried the following regexes, but neither works:
^[\(\d+\)] and ^\(\d+\)
[] is a character class: it matches any single one of the characters listed inside the brackets, and can optionally be followed by a quantifier.
The second regexp will work: ^\(\d+\), so check your code.
Also check that there's no space in front of the first parenthesis, or add \s* in front.
EDIT: Also, Java can be tricky with escapes, depending on whether the regexp you type is used directly as a regexp or is first a string literal. You may need to double the backslashes.
In Java you have to escape the parentheses, so "\\(\\d+\\)" should match (1) in both case 1 and case 2. Adding ^ as you did, "^\\(\\d+\\)", will match only case 1.
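A minimal sketch of both variants on the two sample lines. The class and method names are invented for illustration. It uses Matcher.find(), which looks for the pattern inside the input; Matcher.matches() requires the whole input to match, which may be one reason a correct pattern appears not to work.

```java
import java.util.regex.Pattern;

class LeadingNumberDemo {
    // Regex ^\(\d+\): a parenthesized number at the very start of the input.
    static final Pattern ANCHORED = Pattern.compile("^\\(\\d+\\)");

    // Regex \(\d+\): a parenthesized number anywhere in the input.
    static final Pattern ANYWHERE = Pattern.compile("\\(\\d+\\)");

    static boolean startsWithNumber(String s) {
        return ANCHORED.matcher(s).find();
    }

    static boolean containsNumber(String s) {
        return ANYWHERE.matcher(s).find();
    }

    public static void main(String[] args) {
        String case1 = "(1) Hello, how are you?";
        String case2 = "Hi. (1) How're you doing?";
        System.out.println(startsWithNumber(case1)); // prints true
        System.out.println(startsWithNumber(case2)); // prints false
        System.out.println(containsNumber(case2));   // prints true
    }
}
```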
You have to use double backslashes within a Java string. Consider this:
"\n" gives you [line break]
"\\n" gives you [backslash][n]
I believe Java's Regex Engine supports Positive Lookbehind, in which case you can use the following regex:
(?<=[(][0-9]{1,9999}[)]\s?)\b.*$
Which matches:
The literal text (
Any digit [0-9], between 1 and 9999 times {1,9999}
The literal text )
A space, between 0 and 1 times \s?
A word boundary \b
Any character, between 0 and unlimited times .*
The end of a string $
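Here is a sketch of that lookbehind pattern in Java. The lookbehind is accepted because its length is bounded (at most 9999 digits plus the parentheses and an optional whitespace character). The class name AfterNumberDemo and the helper textAfterNumber are assumptions for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AfterNumberDemo {
    // Regex: (?<=[(][0-9]{1,9999}[)]\s?)\b.*$
    static final Pattern AFTER_NUMBER =
            Pattern.compile("(?<=[(][0-9]{1,9999}[)]\\s?)\\b.*$");

    // Hypothetical helper: returns the text that follows the first "(digits)"
    // marker, or null when the input has no such marker.
    static String textAfterNumber(String input) {
        Matcher m = AFTER_NUMBER.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(textAfterNumber("(1) Hello, how are you?"));
        System.out.println(textAfterNumber("Hi. (1) How're you doing?"));
    }
}
```

Because the marker sits in a lookbehind, it is only asserted, not consumed, so the match itself contains just the text after it.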

Regex - negate a group within a class

I'm using Java's matcher to group terms in a string using the following regex:
Pattern.compile("(\\\\\"[^\\\\\"]*\\\\\"|[^\\s\\\\\"]+)");
This is the part I'm having trouble with: [^\s\\\"]
I'd like it to only match non-spaces and dangling escaped quotes such as \". Is there any way to group the \\ and \" within a character class so they're only matched together?
I tried to use lookahead/lookbehind, but found that including it within the character class put me back at square one.
A character class matches a single character. If I understood you correctly, you want to match only the string \". To do this, you don't need a character class at all--the regex \\" matches that already! (Inside a Java string, it would look like \\\\\" which is ridiculous, but there you have it.)
You can group things together using parentheses: (\\\\\"). You can also alternate inside a group like this using |. So to match non-spaces or \", you can do this: (\S|\\\\\"). (Note that \S is the same as [^\s].)
EDIT: I wasn't paying enough attention. You can match everything but \" or a space as follows: (\\\\(?!")|[^\s\\]), I think.
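How about this: ([^\\s\\\\]|\\\\(?!")). It is written with Java string escaping (inside a string literal the " needs a backslash as well). It matches any single character that is neither whitespace nor a backslash, or a backslash that is not followed by a ".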
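To see what that alternation accepts, here is a small sketch that repeats it with + and collects the runs it finds. The class name DanglingEscapeDemo, the helper runs, the added + quantifier, and the sample input are assumptions for illustration, not part of the answers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DanglingEscapeDemo {
    // Regex: (?:[^\s\\]|\\(?!"))+
    // Either a character that is not whitespace and not a backslash,
    // or a backslash that is not followed by a double quote.
    static final Pattern RUN = Pattern.compile("(?:[^\\s\\\\]|\\\\(?!\"))+");

    // Hypothetical helper: collects every run matched in the input.
    static List<String> runs(String input) {
        List<String> found = new ArrayList<>();
        Matcher m = RUN.matcher(input);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // The input is: ab\c \"de
        // prints [ab\c, "de]
        System.out.println(runs("ab\\c \\\"de"));
    }
}
```

The backslash in \c is kept because no quote follows it, while the backslash of \" is rejected. The quote itself still matches the first alternative, so if quotes should be excluded too, " would have to be added to the negated class.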
