I am having some trouble converting a PHP preg_match call to Java. I thought I had it all correct, but it doesn't seem to be working. Here is the code:
Original PHP:
/* Pattern for 44 Character UUID */
$pattern = "([0-9A-F\-]{44})";
if (preg_match($pattern, $content)) {
    /*DO ACTION*/
}
My Java code:
final String pattern = "([0-9A-F\\-]{44})";
public static boolean pregMatch(String pattern, String content) {
    Pattern p = Pattern.compile(pattern);
    Matcher m = p.matcher(content);
    boolean b = m.matches();
    return b;
}
if (pregMatch(pattern, line)) {
    //DO ACTION
}
So my test input is:
DBA40365-7346-4DB4-A2CF-52ECA8C64091-0
Using a series of System.outs I get that b = false.
To implement a function as you did in your code:
final String pattern = "[0-9A-F\\-]{44}";
public static boolean pregMatch(String pattern, String content) {
    return content.matches(pattern);
}
And then you can call it as:
if (pregMatch(pattern, line)) {
    //DO ACTION
}
You don't need the parentheses in your pattern because they just create a capture group, which you are not using. If you need access to back references, you would need the parentheses and more advanced regex code using the Pattern and Matcher classes.
You could just use String.matches():
if (line.matches("[0-9A-F-]{44}")) {
    // do action
}
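One thing worth checking, as far as I can tell: PHP's preg_match() looks for the pattern anywhere in the subject, while Matcher.matches() and String.matches() only succeed when the whole input matches. The closer Java analogue is Matcher.find(). Also note that the test input quoted in the question is only 38 characters long, so a {44} quantifier cannot match it either way. Below is a minimal sketch of the difference; the class name, the constant name and the 44-character sample value are invented for illustration.

```java
import java.util.regex.Pattern;

final class PregMatchSketch {
    // Same character class as in the question, without the unused group.
    private static final Pattern UUID_44 = Pattern.compile("[0-9A-F-]{44}");

    // Unanchored search, like preg_match(): true if the pattern occurs anywhere.
    static boolean pregMatch(Pattern pattern, String content) {
        return pattern.matcher(content).find();
    }

    public static void main(String[] args) {
        // Invented sample value, exactly 44 characters long.
        String id = "DBA40365-7346-4DB4-A2CF-52ECA8C64091-0ABCDEF";
        String line = "id=" + id + ";";

        System.out.println(pregMatch(UUID_44, line));       // find(): pattern occurs inside the line
        System.out.println(line.matches("[0-9A-F-]{44}"));  // matches(): whole line must match
    }
}
```

If the whole line is supposed to be the identifier, matches() is the right call and the length of the input is the thing to look at instead.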
I am looking for a way to incrementally apply a regular expression pattern, i.e. I am looking for a matcher which I can update with characters as they come in and which tells me on each character whether it is still matching or not.
Here is an illustration in code (MagicMatcherIAmLookingFor is the thing I am looking for; characterSource is something I can query for new characters, say an InputStreamReader):
final Pattern pattern = Pattern.compile("[0-9]+");
final MagicMatcherIAmLookingFor incrementalMatcher = pattern.magic();
final StringBuilder stringBuilder = new StringBuilder();
char character;
while (characterSource.isNotEOF()) {
    character = characterSource.getNextCharacter();
    incrementalMatcher.add(character);
    if (incrementalMatcher.matches()) {
        stringBuilder.append(character);
    } else {
        return result(
            stringBuilder.toString(),
            remaining(character, characterSource)
        );
    }
}
I did not find a way to use the existing java.util.regex.Pattern like that, but maybe I just missed it. Or is there an alternative library to the built-in regular expressions which provides such a feature?
I did not have any luck searching the web for it. All the results are completely swamped with how to use Java regular expressions in the first place.
I am targeting Java 8+.
Is this the kind of object you are looking for?
import java.util.regex.Pattern;
import java.util.regex.Matcher;

public class MagicMatcher {
    private Pattern pattern;
    private Matcher matcher;
    private String stringToCheck;

    public MagicMatcher(Pattern p, String s) {
        pattern = p;
        stringToCheck = s;
        updateMatcher();
    }

    public boolean matches() {
        return matcher.matches();
    }

    private void updateMatcher() {
        matcher = pattern.matcher(stringToCheck);
    }

    public void setStringToCheck(String s) {
        stringToCheck = s;
        updateMatcher();
    }

    public String getStringToCheck() {
        return stringToCheck;
    }

    public void addCharacterToCheck(char c) {
        stringToCheck += c;
        updateMatcher();
    }

    public void addStringToCheck(String s) {
        stringToCheck += s;
        updateMatcher();
    }
}
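A wrapper like the one above reports false as soon as the text seen so far is not a complete match, even when more input could still complete it (for example "ab" against the pattern abc). Matcher.hitEnd() helps here: after a match attempt it reports whether the engine ran into the end of the input, which means more input might change the result. The sketch below is only an illustration under that assumption; the class and method names are made up, and it re-runs the match over the whole buffer for every character, so it is quadratic in the input length.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class IncrementalMatchSketch {
    /**
     * Feeds the input one character at a time and returns the longest prefix
     * that either matches the pattern or could still match with more input.
     */
    static String takeWhileViable(Pattern pattern, CharSequence input) {
        StringBuilder seen = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            seen.append(input.charAt(i));
            Matcher m = pattern.matcher(seen);
            boolean fullMatch = m.matches();
            // hitEnd() refers to the match attempt made just above.
            boolean moreInputCouldHelp = m.hitEnd();
            if (!fullMatch && !moreInputCouldHelp) {
                // The last character made a match impossible: drop it and stop.
                seen.setLength(seen.length() - 1);
                break;
            }
        }
        return seen.toString();
    }

    public static void main(String[] args) {
        System.out.println(takeWhileViable(Pattern.compile("[0-9]+"), "123abc"));
    }
}
```

The remaining input (starting with the rejected character) is then whatever follows the returned prefix.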
I have a utility class that resolves placeholders in a string, as shown in the example below. All variables are surrounded by { and }.
If my string is
Language is {lang} and version 2 is {version}. Home located at {java.home}
the output is
Language is java and version 2 is 1.8. Home located at C:/java
and if my string is
Language is {lang} and version 2 is {version}. Home located at {{lang}.home}
the output is
Language is java and version 2 is 1.8. Home located at {java.home}
What I am trying to find is a way to resolve nested properties recursively, but I ran into several issues. Can any logic be added to the code so that inner properties are resolved dynamically?
import java.util.*;
import java.util.regex.*;

public class MyClass {
    public static void main(String args[]) {
        System.setProperty("lang", "java");
        System.setProperty("version", "1.8");
        System.setProperty("java.home", "C:/java");
        System.out.println(resolve("Language is {lang} and version 2 is {version}. Home located at {java.home}"));
        System.out.println(resolve("Language is {lang} and version 2 is {version}. Home located at {{lang}.home}"));
    }

    public static String resolve(String input) {
        List<String> tokens = matchers("[{]\\S+[}]", input);
        String value;
        for (String token : tokens) {
            value = getProperty(token);
            if (null != value) {
                input = input.replace(token, value);
            }
            value = "";
        }
        return input;
    }

    private static String getProperty(String key) {
        key = key.substring(1, key.length() - 1);
        return System.getProperty(key);
    }

    public static List<String> matchers(String regex, String text) {
        List<String> matches = new ArrayList<String>();
        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher(text);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static boolean contains(String regex, String text) {
        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher(text);
        return matcher.find();
    }
}
You just have to make the pattern match only placeholders that contain no inner { or }, using [^{}]. No curly bracket inside means no nested value, so you can safely do the replacement.
First, we create a Pattern. We need to escape the outer braces, and we add a capture group for later.
Pattern p = Pattern.compile("\\{([^{}]+)\\}");
Then we create a Matcher for the current value:
Matcher m = p.matcher(s);
Now, we just have to check if there is a match and loop on it.
while (m.find()) {
    ...
}
In there, we need the captured key, so we read the first group and look up its value (let's assume it will always be present):
String key = m.group(1);
String value = properties.get(key); // add some fail-safe
Using Matcher.replaceFirst, we safely replace only the current match (the one we got the key from). If you use replaceAll, it will replace every match with the same value.
s = m.replaceFirst(properties.get(key));
Now, since we have updated the String, we need to run the regex against it again:
m = p.matcher(s);
Here is a full example:
Map<String, String> properties = new HashMap<>();
properties.put("lang", "java");
properties.put("java.version", "1.8");
String s = "This is {{lang}.version}.";
Pattern p = Pattern.compile("\\{([^{}]+)\\}");
Matcher m = p.matcher(s);
while (m.find()) {
    String key = m.group(1);
    s = m.replaceFirst(properties.get(key));
    System.out.println(s);
    m = p.matcher(s); // Reset the matcher
}
This is {java.version}.
This is 1.8.
This has one drawback: it creates a new Matcher after every replacement, so it is most likely not optimal (but that is not the point here).
FYI: using Matcher.replaceFirst instead of String.replaceFirst avoids compiling a new Pattern on every call. Here is the String.replaceFirst code:
public String replaceFirst(String regex, String replacement) {
    return Pattern.compile(regex).matcher(this).replaceFirst(replacement);
}
We already have a Matcher to do that, so use it.
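For the "fail-safe" part, two things can go wrong with the loop as written: properties.get(key) returns null for an unknown key, and Matcher.replaceFirst treats $ and \ in the replacement as special, so a value such as a Windows path would be altered. Below is one possible sketch that leaves unknown placeholders untouched and splices the value in literally. The class and method names are made up, and it still does not protect against values that refer to themselves.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NestedPlaceholderSketch {
    // Matches only innermost placeholders: no { or } between the braces.
    private static final Pattern INNERMOST = Pattern.compile("\\{([^{}]+)\\}");

    static String resolve(String s, Map<String, String> properties) {
        Matcher m = INNERMOST.matcher(s);
        int from = 0;
        while (m.find(from)) {
            String value = properties.get(m.group(1));
            if (value == null) {
                // Unknown key: keep the placeholder and continue after it.
                from = m.end();
            } else {
                // Splice the value in literally (no $ or \ interpretation).
                s = s.substring(0, m.start()) + value + s.substring(m.end());
                m.reset(s);
                from = 0; // an enclosing placeholder may now be resolvable
            }
        }
        return s;
    }

    public static void main(String[] args) {
        Map<String, String> properties = new HashMap<>();
        properties.put("lang", "java");
        properties.put("java.version", "1.8");
        System.out.println(resolve("This is {{lang}.version}.", properties));
    }
}
```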
There are lots of ways you could achieve this.
You need some way to communicate to the caller either whether a replacement is necessary, or whether one was made.
A simple option:
public boolean hasPlaceholder(String s) {
    // return true if s contains a {} placeholder, else false
}
Using this you can repeatedly replace until done:
while (hasPlaceholder(s)) {
    s = replacePlaceholders(s);
}
This does scan through the string more times than is strictly necessary, but you shouldn't optimise prematurely.
A more sophisticated option is for the replacePlaceholders() method to report back whether it succeeded. For that you'll need a response class that wraps the result String and the wasReplaced() boolean:
ReplacementResult replacePlaceholders(String s) {
    // process string into newString, counting placeholders replaced
    return new ReplacementResult(count > 0, newString);
}
(Implementation of ReplacementResult left as an exercise)
Using this you can do:
ReplacementResult result = replacePlaceholders(s);
while (result.wasReplaced()) {
    result = replacePlaceholders(result.string());
}
So, each time you call replacePlaceholders() it will either make at least one replacement, or it will report false having verified that there are no more replacements to make.
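For what it's worth, here is one way the exercise could be filled in. This is a sketch under assumptions, not the only possible design: ReplacementResult is a plain immutable holder, the PlaceholderReplacer class and its replaceUntilDone method are names I made up, the values come from a Map, and only innermost placeholders are replaced on each pass so that nested ones resolve over several passes.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ReplacementResult {
    private final boolean replaced;
    private final String string;

    ReplacementResult(boolean replaced, String string) {
        this.replaced = replaced;
        this.string = string;
    }

    boolean wasReplaced() { return replaced; }

    String string() { return string; }
}

final class PlaceholderReplacer {
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{([^{}]+)\\}");
    private final Map<String, String> values;

    PlaceholderReplacer(Map<String, String> values) {
        this.values = values;
    }

    // One pass over the string; reports whether anything was replaced.
    ReplacementResult replacePlaceholders(String s) {
        Matcher m = PLACEHOLDER.matcher(s);
        StringBuffer newString = new StringBuffer();
        int count = 0;
        while (m.find()) {
            String value = values.get(m.group(1));
            if (value == null) {
                // Unknown key: copy the placeholder through unchanged.
                m.appendReplacement(newString, Matcher.quoteReplacement(m.group()));
            } else {
                m.appendReplacement(newString, Matcher.quoteReplacement(value));
                count++;
            }
        }
        m.appendTail(newString);
        return new ReplacementResult(count > 0, newString.toString());
    }

    // The loop from the answer, wrapped in a method.
    String replaceUntilDone(String s) {
        ReplacementResult result = replacePlaceholders(s);
        while (result.wasReplaced()) {
            result = replacePlaceholders(result.string());
        }
        return result.string();
    }
}
```

Because unknown keys do not count as replacements, the loop stops once only unresolvable placeholders are left.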
You mention recursion in the question. This can of course be done, and it would mean avoiding scanning through the whole string each time -- as you can look at just the replacement fragment. This is untested Java-like pseudocode:
String replaceRecursively(String s) {
    StringBuilder result = new StringBuilder();
    while (Token token = takeTokenFrom(s)) {
        if (token.isPlaceholder()) {
            String rawReplacement = lookupReplacement(token);
            String processedReplacement = replaceRecursively(rawReplacement);
            result.append(processedReplacement);
        } else {
            result.append(token.text());
        }
    }
    return result.toString();
}
For all of these solutions, you should beware of infinite loops or stack-blowing recursion. What if you replace "{foo}" with "{foo}"? (or worse, what if you replace "{foo}" with "{foo}{foo}"!?).
Of course the simplest way is to be in control of the configuration, and simply not trigger that problem. Detecting the problem programmatically is entirely possible, but complex enough that it would warrant another SO question if you want it.
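To give a rough idea of what such detection could involve, here is a sketch of the recursive approach that remembers which keys are currently being expanded and fails when one of them shows up again. Everything in it is an assumption made for illustration: the class name, the brace-matching helper, the Map as the source of values, and the choice to throw IllegalStateException on a cycle.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;

final class RecursiveResolver {
    private final Map<String, String> properties;

    RecursiveResolver(Map<String, String> properties) {
        this.properties = properties;
    }

    String resolve(String s) {
        return resolve(s, new ArrayDeque<String>());
    }

    private String resolve(String s, Deque<String> inProgress) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (c != '{') {
                out.append(c);
                i++;
                continue;
            }
            int close = findMatchingBrace(s, i);
            if (close < 0) {
                // Unbalanced brace: copy the rest through unchanged.
                out.append(s, i, s.length());
                break;
            }
            // The key itself may contain placeholders, e.g. {{lang}.home}.
            String key = resolve(s.substring(i + 1, close), inProgress);
            if (inProgress.contains(key)) {
                throw new IllegalStateException("Circular placeholder: " + key);
            }
            String raw = properties.get(key);
            if (raw == null) {
                out.append('{').append(key).append('}');
            } else {
                inProgress.push(key);
                out.append(resolve(raw, inProgress));
                inProgress.pop();
            }
            i = close + 1;
        }
        return out.toString();
    }

    // Index of the } that closes the { at position open, or -1 if there is none.
    private static int findMatchingBrace(String s, int open) {
        int depth = 0;
        for (int i = open; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '{') {
                depth++;
            } else if (c == '}') {
                depth--;
                if (depth == 0) {
                    return i;
                }
            }
        }
        return -1;
    }
}
```

With "foo" mapped to "{foo}" this throws instead of recursing forever, while a value that simply contains other, unrelated placeholders is expanded as usual.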
I would like to do a simple String replace with a regular expression in Java, but the replacement value is not static; I would like it to be dynamic, as in JavaScript.
I know I can do:
"some string".replaceAll("some regex", "new value");
But I would like something like:
"some string".replaceAll("some regex", new SomeThinkIDontKnow() {
    public String handle(String group) {
        return "my super dynamic string group " + group;
    }
});
Maybe there is a Java way to do this, but I am not aware of it...
You need to use the Java regex API directly.
Create a Pattern object for your regex (this is reusable), then call the matcher() method to run it against your string.
You can then call find() repeatedly to loop through each match in your string, and assemble a replacement string as you like.
Here is how such a replacement can be implemented.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegExCustomReplacementExample
{
    public static void main(String[] args)
    {
        System.out.println(
            new ReplaceFunction() {
                public String handle(String group)
                {
                    return "«" + group.substring(1, group.length() - 1) + "»";
                }
            }.replace("A simple *test* string", "\\*.*?\\*"));
    }
}

abstract class ReplaceFunction
{
    public String replace(String source, String regex)
    {
        final Pattern pattern = Pattern.compile(regex);
        final Matcher m = pattern.matcher(source);
        boolean result = m.find();
        if (result) {
            StringBuilder sb = new StringBuilder(source.length());
            int p = 0;
            do {
                sb.append(source, p, m.start());
                sb.append(handle(m.group()));
                p = m.end();
            } while (m.find());
            sb.append(source, p, source.length());
            return sb.toString();
        }
        return source;
    }

    public abstract String handle(String group);
}
It might look a bit complicated at first, but that doesn't matter as you only need to write it once. The subclasses implementing the handle method look simpler. An alternative is to pass the Matcher instead of the matched String (group 0) to the handle method, as it offers access to all groups matched by the pattern (if the pattern defines groups).
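The alternative of handing over the match rather than the group-0 string could be sketched as follows, using Matcher.appendReplacement and appendTail from the standard API together with a java.util.function.Function (Java 8). The class and method names are made up for illustration. On Java 9 and later, Matcher.replaceAll also accepts a Function<MatchResult, String> directly, so a helper like this is only needed on Java 8.

```java
import java.util.function.Function;
import java.util.regex.MatchResult;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class DynamicReplace {
    // Replaces every match with whatever the callback returns for it.
    static String replaceAll(String source, Pattern pattern,
                             Function<MatchResult, String> replacer) {
        Matcher m = pattern.matcher(source);
        StringBuffer sb = new StringBuffer(source.length());
        while (m.find()) {
            // Matcher implements MatchResult, so the callback can read any group.
            // quoteReplacement keeps $ and \ in the result from being interpreted.
            m.appendReplacement(sb, Matcher.quoteReplacement(replacer.apply(m)));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        Pattern starred = Pattern.compile("\\*(.*?)\\*");
        System.out.println(
            replaceAll("A simple *test* string", starred, r -> "[" + r.group(1) + "]"));
    }
}
```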
I am having some weird issues with a pattern replace.
I have these two patterns:
private static final Pattern CODE_ANY = Pattern.compile("&[0-9a-fk-or]");
private static final Pattern CODE_BLACK = Pattern.compile(ChatColour.BLACK.toString());
ChatColour.BLACK.toString() returns "&0"
Next, I have this code:
public static String Strip(String message)
{
    while (true)
    {
        Matcher matcher = CODE_ANY.matcher(message);
        if (!matcher.matches())
            break;
        message = matcher.replaceAll("");
    }
    return message;
}
I have tried a couple different approaches, but nothing gets replaced.
The initial version just called each CODE_xxx pattern one after the other, but users were bypassing that by doubling up on ampersands.
I just do not understand why this isn't removing anything..
I know it is definitely getting called, as I have printed debug messages to the console to check that.
// Morten
matches() checks whether the complete input string matches the pattern, whereas find() checks whether the pattern can be found somewhere in the input string. Therefore, I would rewrite your method as:
public static String strip(String message) // lowercase strip due to Java naming conventions
{
    Matcher matcher = CODE_ANY.matcher(message);
    if (matcher.find())
        message = matcher.replaceAll("");
    return message;
}
Just realized, this can be done with a one liner:
public static String strip(String message) {
    return message.replaceAll("&[0-9a-fk-or]", "");
}
Using the replaceAll() method you don't need a precompiled pattern (although String.replaceAll compiles the regex on every call), but you could extract the regex to a final field of type String.
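One caveat regarding the doubled ampersands mentioned in the question: a single replaceAll pass turns "&&00" into "&0", because the new code only appears after the inner one has been removed. A possible sketch that keeps the precompiled pattern and repeats the pass until nothing changes is shown below; the class name is made up.

```java
import java.util.regex.Pattern;

final class ColourCodeStripper {
    private static final Pattern CODE_ANY = Pattern.compile("&[0-9a-fk-or]");

    // Removes colour codes repeatedly until the message no longer changes.
    static String strip(String message) {
        String previous;
        do {
            previous = message;
            message = CODE_ANY.matcher(message).replaceAll("");
        } while (!message.equals(previous));
        return message;
    }

    public static void main(String[] args) {
        System.out.println(strip("&&00Hello"));
    }
}
```

Each pass either shortens the message or leaves it unchanged, so the loop always terminates.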
Here is my original method:
- (BOOL)validateEmail:(NSString *)address
{
    NSString *emailRegEx = @"[a-zA-Z0-9!#$%&'*+/=?^_`{|}~-]+(?:\\.[a-zA-Z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-zA-Z0-9](?:[a-zA-Z0-9-]*[a-zA-Z0-9])?\\.)+[a-zA-Z0-9](?:[a-zA-Z0-9-]*[a-zA-Z0-9])?";
    NSPredicate *emailTest = [NSPredicate predicateWithFormat:@"SELF MATCHES %@", emailRegEx];
    return [emailTest evaluateWithObject:address];
}
Here's what I've come up with. Is this correct?
private boolean Validate(String email)
{
    Pattern pattern = Pattern.compile("[a-zA-Z0-9!#$%&'*+/=?^_`{|}~-]+(?:\\.[a-zA-Z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-zA-Z0-9](?:[a-zA-Z0-9-]*[a-zA-Z0-9])?\\.)+[a-zA-Z0-9](?:[a-zA-Z0-9-]*[a-zA-Z0-9])?");
    Matcher matcher = pattern.matcher(email);
    if (matcher.matches())
    {
        return true;
    }
    else
    {
        return false;
    }
}
It looks all right to me, although I would like to point out some changes you should make when using Java.
// use a pattern as a constant instead, using the Java naming conventions (all uppercase and underscores)
private static final String MAIL_PATTERN = "[a-zA-Z0-9!#$%&'*+/=?^_`{|}~-]+(?:\\.[a-zA-Z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-zA-Z0-9](?:[a-zA-Z0-9-]*[a-zA-Z0-9])?\\.)+[a-zA-Z0-9](?:[a-zA-Z0-9-]*[a-zA-Z0-9])?";
// lower case method identifier, does not use field so declare static
private static boolean validate(final String email)
{
    // matches already returns a boolean, you can use matches directly on a string (shorthand notation)
    return email.matches(MAIL_PATTERN);
}
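A small variation, in case the method is called often: String.matches compiles the regex on every call, so the constant can hold a compiled Pattern instead. This is only a sketch; the class name and the null check are my own additions, and it assumes the separator between the local part and the domain is @.

```java
import java.util.regex.Pattern;

final class EmailValidator {
    // Compiled once and reused; Pattern instances are safe to share between threads.
    private static final Pattern MAIL_PATTERN = Pattern.compile(
        "[a-zA-Z0-9!#$%&'*+/=?^_`{|}~-]+(?:\\.[a-zA-Z0-9!#$%&'*+/=?^_`{|}~-]+)*"
        + "@(?:[a-zA-Z0-9](?:[a-zA-Z0-9-]*[a-zA-Z0-9])?\\.)+"
        + "[a-zA-Z0-9](?:[a-zA-Z0-9-]*[a-zA-Z0-9])?");

    static boolean validate(String email) {
        // matches() requires the whole string to match, like SELF MATCHES.
        return email != null && MAIL_PATTERN.matcher(email).matches();
    }
}
```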