How to find such lines in a file
######## this_is_a_line.sh ########
I tried the regular expression below:
(#)+( )+(A-Za-z0-9_)+(.sh)( )+(#)+
It doesn't match. Can anyone tell me what is wrong?
You used (...) (a capturing group) instead of [...] (a character class). Use the character class:
#+ +[A-Za-z0-9_]+\.sh +#+
    ^^^^^^^^^^^^^
See the regex demo. Note that most of the capture groups were redundant, so I removed them. Also, . must be escaped to match a literal dot.
The [A-Za-z0-9_]+ matches 1 or more letters, digits or _.
The (A-Za-z0-9_)+ matches 1 or more repetitions of the literal text A-Za-z0-9_ (see demo).
Also, in Java, you can use \w to match [A-Za-z0-9_] and shorten your regex to
#+ +\w+\.sh +#+
Do not forget that you need to double each \ in the pattern string in Java (String pattern = "#+ +\\w+\\.sh +#+";).
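To make the difference concrete, here is a minimal Java sketch that runs both patterns against the sample line. The class name BannerLineDemo and the method isBanner are made up for illustration; only the two regexes come from the question and answer above.

```java
import java.util.regex.Pattern;

class BannerLineDemo {
    // Character-class version from the answer, with [A-Za-z0-9_] shortened to \w.
    static final Pattern BANNER = Pattern.compile("#+ +\\w+\\.sh +#+");

    // Original attempt: (A-Za-z0-9_) is a group that matches that literal text.
    static final Pattern BROKEN =
        Pattern.compile("(#)+( )+(A-Za-z0-9_)+(.sh)( )+(#)+");

    // True when the whole line has the form "#### name.sh ####".
    static boolean isBanner(String line) {
        return BANNER.matcher(line).matches();
    }

    public static void main(String[] args) {
        String line = "######## this_is_a_line.sh ########";
        System.out.println(isBanner(line));                                 // true
        System.out.println(BROKEN.matcher(line).matches());                 // false
        // The broken pattern only accepts the literal text "A-Za-z0-9_".
        System.out.println(BROKEN.matcher("# A-Za-z0-9_.sh #").matches());  // true
    }
}
```

matches() requires the whole line to fit the pattern; use find() instead if the banner may be surrounded by other text.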
Related
I have a question. Take this sentence, for example:
"HalloAnna daveca.nn dave anna ca. anna"
I only want to match the standalone "ca.".
My RegEx is like that :
(?i)\b(ca\.)\b
But this doesn't work and I don't know why. Any ideas ?
//Update
I execute it with:
testSource.replaceAll()
and with
pattern.matcher(testSource).replaceAll().
Neither works.
Escape the dot and use a lookahead to assert that a non-word character follows:
(?i)\bca\.(?=\W)
See live demo.
You should use it like this:
Pattern.compile("(?i)\\b(ca\\.)(?=\\W)").matcher(a).replaceAll("SOME TEXT");
Without the Java string escapes, that regex is (?i)\b(ca\.)(?=\W).
Every \ in a regex has to be doubled inside a Java string literal: \\.
Also, the word boundary \b matches only at a position where a word character (a letter, digit or underscore) sits next to a non-word character or the start or end of the string. After your dot comes a space, and both are non-word characters, so there is no boundary there and \b cannot be used at the end. Instead, use \W to require a non-word character after the dot. To keep that character out of the match (so it is not replaced), wrap it in a lookahead: (?=\W).
Note also that an unescaped . matches any character. To match a literal dot you have to escape it as \., which in a Java string becomes \\..
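As a quick check of the lookahead approach, the following sketch applies it to the sentence from the question. The class name StandaloneCaDemo, the method replaceCa and the replacement text "X" are illustrative assumptions, not part of the original posts.

```java
import java.util.regex.Pattern;

class StandaloneCaDemo {
    // \b before "ca", an escaped dot, and a lookahead so that the following
    // non-word character is checked but not consumed by the match.
    static final Pattern CA = Pattern.compile("(?i)\\bca\\.(?=\\W)");

    // Replaces every standalone "ca." in text with replacement.
    static String replaceCa(String text, String replacement) {
        return CA.matcher(text).replaceAll(replacement);
    }

    public static void main(String[] args) {
        String source = "HalloAnna daveca.nn dave anna ca. anna";
        // "daveca.nn" is left alone: there is no word boundary before its "ca".
        System.out.println(replaceCa(source, "X"));
        // prints: HalloAnna daveca.nn dave anna X anna
    }
}
```

One caveat: (?=\W) needs an actual character after the dot, so a "ca." at the very end of the input is not matched. If that case matters, the negative lookahead (?!\w) is one possible alternative.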
My regex is wrong:
([A-Za-z0-9][A-Za-z0-9\-]*[A-Za-z0-9]*\.)
I need to accept strings like:
a-b.
ab.
a.
But I do not want it to accept this string: a-.
What should I change?
[A-Za-z0-9]+\.|[A-Za-z0-9]+-?[A-Za-z0-9]\.
The idea is:
-? - optional dash
\. - escaped dot, to match literal dot
| - alternation (one or the other)
x+ - one or more repetitions, equivalent to xx*
If you don't mind matching underscores too, you can use the word character set:
\w+\.|\w+-?\w\.
See it in action
You can try like this by using an optional group.
"(?i)[A-Z0-9](?:-?[A-Z0-9]+)*\\."
(?i) flag for caseless matching.
[A-Z0-9] one alphanumeric character
(?:-?[A-Z0-9]+)* any amount of (?: an optional hyphen followed by one or more alnum )
\. literal dot
See demo at Regexplanet (click Java)
This works for your test cases:
[a-zA-Z0-9]([a-zA-Z0-9-]*[a-zA-Z0-9])?\.
See live demo.
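A small Java sketch to verify the last pattern against the four strings from the question. The class name LabelDemo and the method isValid are hypothetical names chosen for this example.

```java
import java.util.regex.Pattern;

class LabelDemo {
    // One alphanumeric, then optionally any mix of alphanumerics and hyphens
    // that ends in an alphanumeric, then a literal dot.
    static final Pattern LABEL =
        Pattern.compile("[a-zA-Z0-9]([a-zA-Z0-9-]*[a-zA-Z0-9])?\\.");

    // True when the whole string fits the pattern.
    static boolean isValid(String s) {
        return LABEL.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"a-b.", "ab.", "a.", "a-."}) {
            System.out.println(s + " -> " + isValid(s));
        }
        // a-b. -> true, ab. -> true, a. -> true, a-. -> false
    }
}
```

The optional group is what rejects a-.: if anything follows the first character, it has to end in a letter or digit before the dot.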
For example I have text like below :
case1:
(1) Hello, how are you?
case2:
Hi. (1) How're you doing?
Now I want to match the text that starts with (\d+).
I have tried the following regexes, but neither works:
^[\(\d+\)] and ^\(\d+\)
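Square brackets [] define a character class, which matches any single one of the characters listed inside. So ^[\(\d+\)] matches just one character at the start: a parenthesis, a digit or a +. Add a quantifier after the class if you want to repeat it.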
The second regexp will work: ^\(\d+\), so check your code.
Check also so there's no space in front of the first parenthesis, or add \s* in front.
EDIT: Also, Java can be tricky with escapes, depending on whether the regexp you type is used directly as a regexp or first passes through a string literal. In a string literal you need to double each backslash.
In Java you have to escape the parentheses, so "\\(\\d+\\)" should match (1) in both case1 and case2. Adding ^ as you did, "^\\(\\d+\\)" will match only case1.
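The behaviour described above can be sketched as follows. The class and method names (LeadingNumberDemo, containsMarker, startsWithMarker) are invented for this example. Note that it uses find(), because matches() would require the entire string to consist of the marker.

```java
import java.util.regex.Pattern;

class LeadingNumberDemo {
    // "(digits)" anywhere in the text.
    static final Pattern ANYWHERE = Pattern.compile("\\(\\d+\\)");
    // "(digits)" only at the very start of the text.
    static final Pattern AT_START = Pattern.compile("^\\(\\d+\\)");

    static boolean containsMarker(String s) {
        return ANYWHERE.matcher(s).find();
    }

    static boolean startsWithMarker(String s) {
        return AT_START.matcher(s).find();
    }

    public static void main(String[] args) {
        String case1 = "(1) Hello, how are you?";
        String case2 = "Hi. (1) How're you doing?";
        System.out.println(containsMarker(case1) + " " + startsWithMarker(case1)); // true true
        System.out.println(containsMarker(case2) + " " + startsWithMarker(case2)); // true false
    }
}
```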
You have to use double backslashes within a Java string. Consider this:
"\n" gives you [line break]
"\\n" gives you [backslash][n]
I believe Java's Regex Engine supports Positive Lookbehind, in which case you can use the following regex:
(?<=[(][0-9]{1,9999}[)]\s?)\b.*$
Which matches:
The literal text (
Any digit [0-9], between 1 and 9999 times {1,9999}
The literal text )
A space, between 0 and 1 times \s?
A word boundary \b
Any character, between 0 and unlimited times .*
The end of a string $
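Here is a hedged sketch of how the lookbehind pattern could be used from Java to extract the text after the marker. The names AfterMarkerDemo and textAfterMarker are assumptions for illustration. The digit count is kept as the bounded {1,9999} from the answer, since Java's lookbehind has traditionally needed a bounded length.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AfterMarkerDemo {
    // Lookbehind for "(digits)" plus an optional whitespace character,
    // then a word boundary and the rest of the string.
    static final Pattern AFTER_MARKER =
        Pattern.compile("(?<=[(][0-9]{1,9999}[)]\\s?)\\b.*$");

    // Returns the text following the first "(n)" marker, or null if none.
    static String textAfterMarker(String s) {
        Matcher m = AFTER_MARKER.matcher(s);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(textAfterMarker("(1) Hello, how are you?"));
        System.out.println(textAfterMarker("Hi. (1) How're you doing?"));
        System.out.println(textAfterMarker("No marker here"));
    }
}
```

Because the marker sits inside the lookbehind, it is required but not included in the returned text.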
I'm trying to compare following strings with regex:
#[xyz="1","2"'"4"] ------- valid
#[xyz] ------------- valid
#[xyz="a5","4r"'"8dsa"] -- valid
#[xyz="asd"] -- invalid
#[xyz"asd"] --- invalid
#[xyz="8s"'"4"] - invalid
The valid pattern should be:
#[xyz, then an = sign, then some characters, then a comma, then some characters, then ', then some characters, and finally ]. In other words, if any characters follow xyz, they must have the format ="XXX","XXX"'"XXX".
Otherwise only #[xyz] is allowed, with no characters after xyz.
I tried the following regex, but it did not work:
String regex = "#[xyz=\"[a-zA-z][0-9]\",\"[a-zA-z][0-9]\"'\"[a-zA-z][0-9]\"]";
Here the quoted part after xyz is optional, and the number of characters between the quotes is not fixed. There may also be other characters before and after the pattern, as in asdadad #[xyz] adadad.
You can use the regex:
#\[xyz(?:="[a-zA-Z0-9]+","[a-zA-Z0-9]+"'"[a-zA-Z0-9]+")?\]
See it
Expressed as a Java string it will be:
String regex = "#\\[xyz(?:=\"[a-zA-Z0-9]+\",\"[a-zA-Z0-9]+\"'\"[a-zA-Z0-9]+\")?\\]";
What was wrong with your regex?
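[...] defines a character class. To match a literal [ or ], you need to escape it with a preceding \.
[a-zA-z][0-9] matches a single letter followed by a single digit, but you want one or more alphanumeric characters, so you need [a-zA-Z0-9]+. Note also that the range A-z covers the ASCII characters between Z and a ([, \, ], ^, _ and `) as well as the letters, so write A-Z instead.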
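To check the pattern against the six sample strings from the question, a sketch along these lines may help. The class name XyzTagDemo and the method isValid are made up for this example, and the character class is written as [a-zA-Z0-9] (an assumption that only ASCII letters and digits are wanted between the quotes).

```java
import java.util.regex.Pattern;

class XyzTagDemo {
    // Literal "#[xyz", an optional ="..","..'".." part, then a literal "]".
    static final Pattern TAG = Pattern.compile(
        "#\\[xyz(?:=\"[a-zA-Z0-9]+\",\"[a-zA-Z0-9]+\"'\"[a-zA-Z0-9]+\")?\\]");

    // True when the whole string is a well-formed tag.
    static boolean isValid(String s) {
        return TAG.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "#[xyz=\"1\",\"2\"'\"4\"]",        // valid
            "#[xyz]",                          // valid
            "#[xyz=\"a5\",\"4r\"'\"8dsa\"]",   // valid
            "#[xyz=\"asd\"]",                  // invalid
            "#[xyz\"asd\"]",                   // invalid
            "#[xyz=\"8s\"'\"4\"]"              // invalid
        };
        for (String s : samples) {
            System.out.println(s + " -> " + isValid(s));
        }
    }
}
```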
Use this:
String regex = "#\\[xyz(=\"[a-zA-Z0-9]+\",\"[a-zA-Z0-9]+\"'\"[a-zA-Z0-9]+\")?\\]";
When you write [a-zA-z][0-9], the regex expects one letter followed by one digit. You also have to escape the first and last square brackets, because square brackets have a special meaning in regexes.
Explanation:
[a-zA-Z0-9]+ means one or more alphanumeric characters (underscore not included).
(=\"[a-zA-Z0-9]+\",\"[a-zA-Z0-9]+\"'\"[a-zA-Z0-9]+\")? means that the expression in parentheses may occur once or not at all.
Square brackets have a special meaning in regex: as you used them yourself, they define character classes. To match them literally, you need to escape them:
String regex = "#\\[xyz=\"[a-zA-z][0-9]\",\"[a-zA-z][0-9]\"'\"[a-zA-z][0-9]\"\\]";
The next problem is [a-zA-z][0-9], which means "first a letter, then a digit". You need to merge those classes and add a quantifier (and write A-Z rather than A-z, which also matches a few punctuation characters):
String regex = "#\\[xyz=\"[a-zA-Z0-9]+\",\"[a-zA-Z0-9]+\"'\"[a-zA-Z0-9]+\"\\]";
See it here on Regexr
there could also be some characters before and after this pattern like
asdadad #[xyz] adadad.
Regex should be:
String regex = "(.)*#\\[xyz(=\"[a-zA-Z0-9]+\",\"[a-zA-Z0-9]+\"'\"[a-zA-Z0-9]+\")?\\](.)*";
The first and last (.)* allow any text before and after the pattern, as you mentioned in your edit. As @ademiban said, (=\"[a-zA-Z0-9]+\",\"[a-zA-Z0-9]+\"'\"[a-zA-Z0-9]+\")? will occur once or not at all. The other mistakes are well explained in the other answers; +1 to all of them.
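As an alternative to wrapping the pattern in (.)*, Matcher.find() searches for the pattern anywhere in the input. The sketch below assumes that approach; the class name XyzTagSearchDemo and the method containsTag are hypothetical.

```java
import java.util.regex.Pattern;

class XyzTagSearchDemo {
    // Same tag pattern as above, without any leading or trailing (.)*.
    static final Pattern TAG = Pattern.compile(
        "#\\[xyz(=\"[a-zA-Z0-9]+\",\"[a-zA-Z0-9]+\"'\"[a-zA-Z0-9]+\")?\\]");

    // True when a well-formed tag occurs somewhere in the text.
    static boolean containsTag(String text) {
        return TAG.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(containsTag("asdadad #[xyz] adadad"));          // true
        System.out.println(containsTag("asdadad #[xyz\"asd\"] adadad"));   // false
    }
}
```

With matches() the (.)* wrappers are needed, because matches() always tests the entire input. With find() they are not.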
How can I create a regular expression to search for strings that match a given pattern? For example, I want to find all strings that match the pattern '*index.tx?'. This should find strings such as index.txt, mainindex.txt and somethingindex.txp.
Pattern pattern = Pattern.compile("*.html");
Matcher m = pattern.matcher("input.html");
This code is obviously not working.
You need to learn regular expression syntax. It is not the same as using wildcards. Try this:
Pattern pattern = Pattern.compile("^.*index\\.tx.$");
There is a lot of information about regular expressions here. You may find the program RegexBuddy useful while you are learning regular expressions.
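If the input really is a wildcard pattern such as '*index.tx?', one option is to translate it into a regex first. The helper below is a hypothetical sketch, not something from the answers: globToRegex and globMatches are invented names, and it assumes * means "any run of characters" and ? means "exactly one character".

```java
import java.util.regex.Pattern;

class GlobDemo {
    // Hypothetical helper: turns the wildcards * and ? into regex syntax
    // and quotes every other character so it is matched literally.
    static String globToRegex(String glob) {
        StringBuilder regex = new StringBuilder();
        for (char c : glob.toCharArray()) {
            if (c == '*') {
                regex.append(".*");   // any run of characters
            } else if (c == '?') {
                regex.append('.');    // exactly one character
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return regex.toString();
    }

    // True when the whole name fits the wildcard pattern.
    static boolean globMatches(String glob, String name) {
        return Pattern.matches(globToRegex(glob), name);
    }

    public static void main(String[] args) {
        String[] names = {"index.txt", "mainindex.txt", "somethingindex.txp", "index.html"};
        for (String name : names) {
            System.out.println(name + " -> " + globMatches("*index.tx?", name));
        }
    }
}
```

For the pattern in the question this behaves like the hand-written ^.*index\\.tx.$: the first three names match and index.html does not.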
The code you posted does not work because:
The dot . is a special regex character. It matches one instance of any character.
* means any number of occurrences of the preceding character. In "*.html" the * has nothing before it to repeat, so the pattern does not even compile.
Therefore, .* means any number of occurrences of any character.
So you would need something like
Pattern pattern = Pattern.compile(".*\\.html.*");
The \\ is needed because we want a literal dot, and the dot is otherwise a special regex character.
This means: match a string that starts with any number of arbitrary characters, followed by a dot, followed by html, followed by anything.
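A short sketch of both points: the original pattern is rejected at compile time, while the corrected one matches. The class and method names (HtmlPatternDemo, compiles, mentionsHtml) are illustrative only.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class HtmlPatternDemo {
    // True when the regex is syntactically valid.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false; // e.g. a leading * with nothing to repeat
        }
    }

    // Anything, a literal dot, "html", then anything.
    static boolean mentionsHtml(String name) {
        return Pattern.matches(".*\\.html.*", name);
    }

    public static void main(String[] args) {
        System.out.println(compiles("*.html"));          // false
        System.out.println(mentionsHtml("input.html"));  // true
        System.out.println(mentionsHtml("inputhtml"));   // false
    }
}
```

Because of the trailing .*, this pattern also accepts names such as input.html.bak; drop it (or anchor with $) if the name has to end in .html.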
* matches zero or more occurrences of the preceding token, so if you want to match zero or more of any character, use .* instead (. matches any char).
Modified regex should look something like this:
Pattern pattern = Pattern.compile("^.*\\.html$");
^ matches the start of the string
.* matches zero or more of any char
\\. matches the dot char (if not escaped it would match any char)
$ matches the end of the string