Function end index search in java using regex - java

I am using a regex to find the start and end of a function whose source is stored in a String in Java, but I am unable to get the correct end index.
String regexPart1 = "((public)|(private)|(protected)) [a-zA-Z_0-9\\<\\>\\,]+ ";
String regexPart2 = "\\(.*\\) (throws .*)?\\{.*}$";
Pattern pattern = Pattern.compile(regexPart1+"run"+regexPart2);
Matcher matcher = pattern.matcher(toEval);
while (matcher.find()) {
System.out.println(" Found: " + matcher.group());
}
With
toEval = "public class ClassEval{public static initialize() throws Exception(){System.out.println(\"Initialize\");}public static void run() throws Exception{System.out.println(\"This should come only\")}public static void main(String[] args){System.out.println(\"Hello\");}}";
Expected output:
Found: public static void run(){System.out.println("This should come only")}
Output coming:
Found: public static void run(){System.out.println("This should come only")}public static void main(String[] args){System.out.println("Hello");}}

* is a greedy quantifier, so \\{.*} matches everything from the first { to the last }. Change it to \\{.*?} if you want it to stop at the first }. The trailing $ has to go as well, because it forces the match to run to the end of the input. Of course, this still won't handle nested braces, but that's a whole other issue.
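To make the difference concrete, here is a minimal sketch that runs a greedy and a lazy version of a simplified pattern against a shortened input. The class name, the helper firstMatch, and the sample string are made up for illustration and are not the asker's code:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class: compares greedy .* with lazy .*? on a brace-delimited body.
class LazyBraceDemo {

    // Returns the first match of regex in input, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Simplified stand-in for the source string in the question.
        String src = "void run(){a();}void main(){b();}}";

        // Greedy: .* runs to the last closing brace in the input.
        System.out.println(firstMatch("run\\(\\)\\{.*\\}", src));
        // prints run(){a();}void main(){b();}}

        // Lazy: .*? stops at the first closing brace after the opening one.
        System.out.println(firstMatch("run\\(\\)\\{.*?\\}", src));
        // prints run(){a();}
    }
}
```

The lazy version only works here because the body of run() contains no nested braces; a method with an if block or a loop would be cut off at the first inner }.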

Related

How can Java remove leading whitespace?

I have been trying to figure out why this Java code won't remove the leading whitespace from my string. I have tried stripLeading(), trim(), and various other methods with the same functionality, but still haven't gotten the result I want. Code:
public static String message(String logLine) {
logLine = (String) logLine.subSequence(logLine.indexOf(" ") + 1, logLine.length());
return logLine;
}
public static void main(String[] args) {
System.out.println(message("[WARNING]: \tTimezone not set \r\n"));
}
The result is almost what I wanted, just the words "Timezone not set", but I want the program to ignore the leading whitespace completely, which for some reason it doesn't. Thank you for any help.
Possible solutions
Use String::replaceFirst to keep only the main part that follows the prefix ([WARNING]:) and the whitespace after it:
public static String message(String logLine) {
    // \\s*$ (rather than \\s+$) also handles lines that have no trailing whitespace
    return logLine.replaceFirst("^\\S*\\s+(\\S+(\\s+\\S+)*)\\s*$", "$1");
}
As the prefix ends with ':', a solution offered in the comment using String::substring + String::trim works too:
public static String message(String logLine) {
return logLine.substring(logLine.indexOf(":") + 1).trim();
}
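The reason the original code leaves whitespace behind is that indexOf(" ") + 1 only skips the first space, so the tab that follows it survives. The sketch below puts the original approach next to the substring + trim fix; the class and method names are hypothetical and chosen only for this illustration:

```java
// Hypothetical demo class: shows why skipping one space is not enough.
class LogMessageDemo {

    // Approach from the question: cut after the first space only.
    static String afterFirstSpace(String logLine) {
        return logLine.substring(logLine.indexOf(" ") + 1);
    }

    // Fix: cut after the colon, then trim whitespace on both ends.
    static String message(String logLine) {
        return logLine.substring(logLine.indexOf(":") + 1).trim();
    }

    public static void main(String[] args) {
        String line = "[WARNING]: \tTimezone not set \r\n";

        // Brackets make the remaining whitespace visible.
        System.out.println("[" + afterFirstSpace(line) + "]"); // tab and line break are still there
        System.out.println("[" + message(line) + "]");         // prints [Timezone not set]
    }
}
```

If the line contains no colon, indexOf returns -1 and message simply trims the whole line.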

Java substring string when specific string occurs

I need help taking the part of a string that comes before a given substring.
Example
Initial string: 123456789abcdefgh
string to substr: abcd
result : 123456789
I checked the substring method, but it only accepts index positions. Do I need to search for the occurrence of the substring and then pass its index?
If you want to split the String after a given character (here "a"), the code would look like this.
You can change "a" to any character in the string:
package nl.testing.startingpoint;
public class Main {
public static void main(String args[]) {
String[] part = getSplitArray("123456789abcdefgh", "a");
System.out.println(part[0]);
System.out.println(part[1]);
}
public static String[] getSplitArray(String toSplitString, String splitChar) {
return toSplitString.split("(?<=" + splitChar + ")");
}
}
Bear in mind that toSplitString.split("(?<=" + splitChar + ")") splits after every occurrence of that character, not just the first, and that the character itself stays at the end of the preceding part.
Hope this might help:
public static void main(final String[] args)
{
searchString("123456789abcdefghabcd", "abcd");
}
public static void searchString(String inputValue, final String searchValue)
{
while (!(inputValue.indexOf(searchValue) < 0))
{
System.out.println(inputValue.substring(0, inputValue.indexOf(searchValue)));
inputValue = inputValue.substring(inputValue.indexOf(searchValue) +
searchValue.length());
}
}
Output:
123456789
efgh
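Use a regular expression that matches the search string and everything after it, and replace the match with an empty string:
static String regex = "abcd.*";
public static String remove(String string, String regex) {
    // replaceAll leaves the string unchanged when the regex does not match
    return string.replaceAll(regex, "");
}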
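For the case in the question, where only the text before the first occurrence is wanted, indexOf combined with substring is probably the most direct route. This is a minimal sketch; the class name and the helper before are made up for illustration:

```java
// Hypothetical helper class: returns the text before the first occurrence of a marker.
class SubstringBefore {

    static String before(String input, String marker) {
        int idx = input.indexOf(marker);
        // indexOf returns -1 when the marker is absent; return the input unchanged then.
        return idx < 0 ? input : input.substring(0, idx);
    }

    public static void main(String[] args) {
        System.out.println(before("123456789abcdefgh", "abcd")); // prints 123456789
        System.out.println(before("123456789", "abcd"));         // prints 123456789
    }
}
```

Unlike the regex-based variants, this treats the marker as plain text, so characters such as . or * in the marker need no escaping.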

Using String.matches to match *, |, and spaces

This is what I am testing my method with. It should check whether a string is valid. I don't even know if matches is what I need, but I am trying to use it to check whether a string contains only *, |, and spaces.
public class TallyTest {
public static void main(String[] args) {
System.out.println(TallyString.isValidGroup("||**|*|"));
System.out.println("Expected true");
System.out.println(TallyString.evaluateGroup("||**|*|"));
System.out.println("Expected 19");
System.out.println(TallyString.makeGroup(19));
System.out.println("Expected '***||||'");
}
}
public class Test
{
public static void main(String[] args)
{
String s = "GT|!ll22";
if(!s.matches("[*| ]+"))
System.out.println("Incorrect");
}
}
String.matches succeeds only if the entire string matches the pattern, so negating it tells you whether any character OTHER THAN the ones you allow was entered.
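Wrapped in a method, the same check might look like the sketch below. The class name TallyCheck is hypothetical, and treating the empty string as invalid is an assumption that follows from using + rather than * in the pattern:

```java
// Hypothetical helper class: validates that a tally group uses only '*', '|' and spaces.
class TallyCheck {

    static boolean isValidGroup(String group) {
        // matches() tests the whole string, so no ^ or $ anchors are needed.
        // Inside a character class, * and | are literal characters.
        return group.matches("[*| ]+");
    }

    public static void main(String[] args) {
        System.out.println(isValidGroup("||**|*|"));  // prints true
        System.out.println(isValidGroup("GT|!ll22")); // prints false
    }
}
```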

Regular expression for starting sequence

package xmlchars;
import java.util.regex.Pattern;
public class TestRegex {
public static final String SPECIAL_CHARACTERS = "(?i)^[^a-z_]|[^a-z0-9-_.]";
public static void main(String[] args) {
// TODO Auto-generated method stub
String name = "#1998St #";
Pattern pattern = Pattern.compile(SPECIAL_CHARACTERS);
System.out.println(pattern.matcher(name).replaceAll(""));//gives wrong output 1998St
}
}
Basically, what I'm trying to achieve is:
String to start only with a-z and _
String to contain a-z 0-9 _ - . after the start
Case insensitive for the whole string
If you only need to check whether a name is valid, rather than strip characters from it, you could say:
... SPECIAL_CHARACTERS = "^[a-z_][a-z0-9_.-]+$";
and define the pattern by saying:
Pattern pattern = Pattern.compile(SPECIAL_CHARACTERS, Pattern.CASE_INSENSITIVE);
I managed to crack the regex with a simple change to the existing one:
"^[^a-z_]*|[^a-z_0-9-._]"
Here you go, with the working proof.
package xmlchars;
import java.util.regex.Pattern;
public class TestRegex {
public static final String SPECIAL_CHARACTERS = "^[^a-z_]*|[^a-z_0-9-._]";
public static void main(String[] args) {
// TODO Auto-generated method stub
String name = " # !`~!##$%^&*()-_=+{}[];:',<>/?19.- 98Cc#19 #/9_-8-.";
Pattern pattern = Pattern.compile(SPECIAL_CHARACTERS, Pattern.CASE_INSENSITIVE);
System.out.println(pattern.matcher(name).replaceAll("")); // output _19.-98Cc199_-8-.
}
}
I'll assume you are trying to remove anything in the String that doesn't fit the pattern. What you have is almost correct. Because | binds more loosely than everything else in a regex, your pattern is already grouped like this:
"(?i)(^[^a-z_])|[^a-z0-9-_.]"
and not like this, which would only match at the start of the String:
"(?i)^([^a-z_]|[^a-z0-9-_.])"
To shorten it, you could use the predefined character class \\w, which is the same as [a-zA-Z_0-9], and negate it together with the two extra characters you want to keep. With that, you wouldn't even need the case-insensitivity.
"(^\\W)|[^\\w.-]"
Given a String called str, str.replaceAll("(^\\W)|[^\\w.-]", ""); will remove those characters. Note that (^\\W) only removes a single leading non-word character, so leading digits are kept.
Test for your string:
class RegexTest
{
    public static void main (String[] args)
    {
        String str = "#1998St #";
        str = str.replaceAll("(^\\W)|[^\\w.-]", "");
        System.out.println(str);
    }
}
Output:
1998St
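If leading digits should go as well, so that the result really starts with a letter or underscore, one option is to let the first alternative consume the whole invalid prefix. This is a sketch under that assumption; the class NameSanitizer and both method names are made up for illustration:

```java
import java.util.regex.Pattern;

// Hypothetical helper class: strips characters that would make a name invalid.
class NameSanitizer {

    // First alternative: the whole run of non-letter, non-underscore characters at the start.
    // Second alternative: any single character outside a-z, 0-9, '_', '.', '-'.
    private static final Pattern INVALID =
            Pattern.compile("^[^a-z_]+|[^a-z0-9_.-]", Pattern.CASE_INSENSITIVE);

    // Validation counterpart: one start character, then any number of allowed characters.
    private static final Pattern VALID =
            Pattern.compile("[a-z_][a-z0-9_.-]*", Pattern.CASE_INSENSITIVE);

    static String sanitize(String name) {
        return INVALID.matcher(name).replaceAll("");
    }

    static boolean isValid(String name) {
        return VALID.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(sanitize("#1998St #")); // prints St
        System.out.println(isValid("1998St"));     // prints false
        System.out.println(isValid("St"));         // prints true
    }
}
```

Placing the hyphen last in the character class keeps it literal, which avoids any doubt about whether it forms a range.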

Given a regex, how can I know from the pattern what the largest number of fields there are that could be matched?

I need to know, based on the regex itself (without any sample data), what the maximum number of fields it could find is.
For example, for the expression
"^(ABC) ?([0-9]{4}|[0-9]{6})?(?:(?:/)([0-9]{4}|[0-9]{6}))?(?:(?: ?XYZ ?)([0-9]{4}))?$"
I'd like some function that would take that as a String (or a Pattern) and return 4, and would take
"^(DEF) ?([0-9A-Z]{1,2})(?:(?:/)([0-9A-Z]{1,2}))?$"
and return 3.
It would be simpler if all of these groups were captured, but not all are, and I'd like to avoid having to write my own parser if possible.
This is very ugly but... seems to do what you need:
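import java.lang.reflect.Field;
import java.util.regex.Pattern;

public class TestRegEx1 {
    public static void main(String[] args) {
        Pattern pat = Pattern.compile("^(ABC) ?([0-9]{4}|[0-9]{6})?(?:(?:/)([0-9]{4}|[0-9]{6}))?(?:(?: ?XYZ ?)([0-9]{4}))?$");
        try {
            // capturingGroupCount is a private, JDK-internal field; it also counts group 0, hence the - 1
            Field groupCount = Pattern.class.getDeclaredField("capturingGroupCount");
            groupCount.setAccessible(true); // may be refused on recent JDKs unless java.util.regex is opened
            int count = ((Integer) groupCount.get(pat)) - 1;
            System.out.println("count : " + count);
        } catch (Exception e) {
            e.printStackTrace(); // report the failure instead of swallowing it
        }
    }
}
Or, to add the non-reflective version, which relies on the Matcher being able to read the count from its Pattern: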
import java.util.regex.Pattern;

public class TestRegEx2 {
    public static void main(String[] args) {
        Pattern pat = Pattern.compile("^(ABC) ?([0-9]{4}|[0-9]{6})?(?:(?:/)([0-9]{4}|[0-9]{6}))?(?:(?: ?XYZ ?)([0-9]{4}))?$");
        int count = pat.matcher("").groupCount(); // it turns out it doesn't matter what input string you pass here
        System.out.println("count : " + count);
    }
}
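The non-reflective approach can be packaged as a small function that takes the regex as a String, which is the shape the question asks for. A minimal sketch, with the class name GroupCounter and the method maxGroups invented for this example:

```java
import java.util.regex.Pattern;

// Hypothetical helper class: counts the capturing groups declared in a regex.
class GroupCounter {

    static int maxGroups(String regex) {
        // groupCount() depends only on the pattern, not on the input,
        // and excludes group 0 (the whole match) as well as (?:...) groups.
        return Pattern.compile(regex).matcher("").groupCount();
    }

    public static void main(String[] args) {
        System.out.println(maxGroups(
                "^(ABC) ?([0-9]{4}|[0-9]{6})?(?:(?:/)([0-9]{4}|[0-9]{6}))?(?:(?: ?XYZ ?)([0-9]{4}))?$")); // prints 4
        System.out.println(maxGroups(
                "^(DEF) ?([0-9A-Z]{1,2})(?:(?:/)([0-9A-Z]{1,2}))?$")); // prints 3
    }
}
```

The result is an upper bound on the fields a match can yield: optional groups that do not participate in a particular match are still counted, and group(n) returns null for them.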
