I have a string like this:
10:11:22 [UTP][ROX][ID:32424][APP STR]
I want to separate it into its parts. How can I do that with a regex?
I want to get "10:11:22", "UTP", "ROX", "ID:32424", "APP STR" as separate strings.
This would be your matching pattern: /\[([^\]]+)/g
Working demo # regex101.com
Working Java demo:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Main {
    private static final String REGEX = "\\[([^\\]]+)";
    private static final String INPUT = "10:11:22 [UTP][ROX][ID:32424][APP STR]";
    public static void main(String[] args) {
        Pattern pattern = Pattern.compile(REGEX);
        Matcher matcher = pattern.matcher(INPUT);
        while (matcher.find()) {
            // group(1) is the text between the brackets;
            // matcher.toString() would print the Matcher object, not the match
            System.out.println(matcher.group(1));
        }
    }
}
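The simplest solution I can think of is to split on the brackets. In a Java string literal the backslashes must be doubled, and adding \\s* to the delimiter keeps the trailing space off the first entry:
"10:11:22 [UTP][ROX][ID:32424][APP STR]".split("\\s*[\\[\\]]+")
That returns an array like this (Java's split drops the trailing empty string that the final ] would otherwise produce):
["10:11:22",
"UTP",
"ROX",
"ID:32424",
"APP STR"]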
If you want a single regex to do the job, try the following:
(?:^([^\s]*)|\[([^]]*)\])
DEMO
All the strings you want are stored separately in groups.
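As a rough sketch of how those groups could be collected in Java (the class and method names are illustrative and not part of the answer above), the alternation below takes either the leading run of non-whitespace characters or the text between a pair of brackets:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BracketFields {
    // Group 1: leading run of non-whitespace; group 2: text between '[' and ']'.
    private static final Pattern FIELD = Pattern.compile("^(\\S+)|\\[([^\\]]*)\\]");

    // Hypothetical helper: returns whichever group took part in each match.
    static List<String> fields(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = FIELD.matcher(input);
        while (m.find()) {
            out.add(m.group(1) != null ? m.group(1) : m.group(2));
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [10:11:22, UTP, ROX, ID:32424, APP STR]
        System.out.println(fields("10:11:22 [UTP][ROX][ID:32424][APP STR]"));
    }
}
```

Only one of the two groups is non-null for any given match, which is why the helper checks group 1 first.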
I have a set of IP addresses in a special format, and I have to check whether each one matches the required regex pattern. My pattern currently looks like this:
private static final String FIRST_PATTERN = "([0-9]{1,3}\\\\{2}?.[0-9]{1,3}\\\\{2}?.[0-9]{1,3}\\\\{2}?.[0-9]{1,3})";
This pattern lets me recognize fixed (static) IP addresses, for example: "65\\.33\\.42\\.12" or "112\\.76\\.39\\.104, 188\\.35\\.122\\.148".
However, I should also be able to match non-static IPs, like this:
"155\\.105\\.178\\.(8[1-9]|9[0-5])"
or this:
"93\\.23\\.75\\.(1(1[6-9]|2[0-6])),
113\\.202\\.167\\.(1(2[8-9]|[3-8][0-9]|9[0-2]))"
I have tried several approaches, but I always get "false" when I try to match those IPs. I have searched for a solution for quite some time without finding one, and I cannot work out how to do it myself. Can anyone help?
UPDATE: Whole code snippet:
public class IPAdressValidator {
Pattern pattern;
Matcher matcher;
private static final String FIRST_PATTERN = "([0-9]{1,3}\\\\{2}?.[0-9]{1,3}\\\\{2}?.[0-9]{1,3}\\\\{2}?.[0-9]{1,3})";
public IPAdressValidator() {
pattern = Pattern.compile(FIRST_PATTERN);
}
public CharSequence validate(String ip) {
matcher = pattern.matcher(ip);
boolean found = matcher.find();
if (found) {
for (int i = 0; i <= matcher.groupCount(); i++) {
int groupStart = matcher.start(i);
int groupEnd = matcher.end(i);
return ip.subSequence(groupStart, groupEnd);
}
}
return null;
}
}
and my Main:
public class Main {
public static void main(String[] args) {
IPAdressValidator validator = new IPAdressValidator();
String[] ips = {
"53\\\\.22\\\\.14\\\\.43",
"123\\\\.55\\\\.19\\\\.137",
"93\\.152\\.199\\.1",
"(93\\.199\\.(?:1(?:0[6-7]))\\.(?:[0-9]|[1-9][0-9]|1(?:[0-9][0-9])|2(?:[0-4][0-9]|5[0-5])))",
"193\\\\.163\\\\.100\\\\.(8[1-9]|9[0-5])",
"5\\\\.56\\\\.188\\\\.130, 188\\\\.194\\\\.180\\\\.138, 182\\\\.105\\\\.24\\\\.15",
"188\\\\.56\\\\.147\\\\.193,41\\\\.64\\\\.202\\\\.19"
};
for (String ip : ips) {
System.out.printf("%20s: %b%n", ip, validator.validate(ip));
}
}
}
Is there any predefined method stating whether a string contains HTML tags or characters in it?
You can try a regular expression, like this:
private static final String HTML_PATTERN = "<(\"[^\"]*\"|'[^']*'|[^'\">])*>";
private Pattern pattern = Pattern.compile(HTML_PATTERN);
public boolean hasHTMLTags(String text){
Matcher matcher = pattern.matcher(text);
return matcher.find();
}
Either use a regular expression to search for HTML tags in the string:
boolean containsHTMLTag = stringHtml.matches(".*\\<[^>]+>.*");
Or, as Tim suggested, use Jsoup as shown below:
String textOfHtmlString = Jsoup.parse(htmlString).text();
boolean containedHTMLTag = !textOfHtmlString.equals(htmlString);
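One caveat with the matches() variant: by default '.' does not match line terminators, so the check can fail on multi-line input unless the DOTALL flag ((?s)) is set, whereas find() is not affected. A minimal sketch of the difference, with an illustrative class name that is not from the answers here:

```java
import java.util.regex.Pattern;

class HtmlTagCheck {
    // Same tag shape as in the matches() example: '<', one or more non-'>' characters, '>'.
    private static final Pattern TAG = Pattern.compile("<[^>]+>");

    // find() looks for the tag anywhere in the input, across line breaks.
    static boolean containsTag(String text) {
        return TAG.matcher(text).find();
    }

    public static void main(String[] args) {
        String multiLine = "first line\n<b>bold</b>";
        // matches() must cover the whole input, and '.' stops at '\n' by default.
        System.out.println(multiLine.matches(".*<[^>]+>.*"));     // false
        System.out.println(multiLine.matches("(?s).*<[^>]+>.*")); // true
        System.out.println(containsTag(multiLine));               // true
        // A plain comparison is reported as a tag too, which shows the limits of this check.
        System.out.println(containsTag("1 < 2 and 3 > 2"));       // true
    }
}
```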
You should use find()
private static final String HTML_TAG_PATTERN = "<(\"[^\"]*\"|'[^']*'|[^'\">])*>";
static Pattern htmlValidator = TextUtils.isEmpty(HTML_TAG_PATTERN) ? null:Pattern.compile(HTML_TAG_PATTERN);
public static boolean validateHtml(final String text){
if(htmlValidator !=null)
return htmlValidator.matcher(text).find();
return false;
}
Parsing a String with a regex in order to search for HTML (in my case, to reject input that could carry an XSS attack) is not the proper way.
A better way is to escape the input, for example with Spring's HtmlUtils.
Both points are explained in more detail here:
https://codereview.stackexchange.com/questions/112495/preventing-xss-attacks-in-a-spring-mvc-application-controller
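For illustration only, the idea of escaping instead of detecting can be sketched in plain Java. The helper below is a hypothetical, hand-rolled stand-in and not Spring's implementation; in a Spring application you would normally call HtmlUtils.htmlEscape instead.

```java
class EscapeSketch {
    // Hypothetical helper: replaces the five characters that are significant in HTML.
    static String escapeHtml(String input) {
        StringBuilder sb = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                case '\'': sb.append("&#39;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints &lt;script&gt;alert(&#39;x&#39;)&lt;/script&gt;
        System.out.println(escapeHtml("<script>alert('x')</script>"));
    }
}
```

Escaped this way, the input is displayed as text by the browser rather than interpreted as markup, so there is no need to decide whether it "contains HTML".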
I would like to do a simple String replace with a regular expression in Java, but the replacement value is not static: I would like to compute it dynamically for each match, as is possible in JavaScript.
I know I can do:
"some string".replaceAll("some regex", "new value");
But I would like something like:
"some string".replaceAll("some regex", new SomeThinkIDontKnow() {
public String handle(String group) {
return "my super dynamic string group " + group;
}
});
Maybe there is a Java way to do this, but I am not aware of it...
You need to use the Java regex API directly.
Create a Pattern object for your regex (this is reusable), then call the matcher() method to run it against your string.
You can then call find() repeatedly to loop through each match in your string, and assemble a replacement string as you like.
Here is how such a replacement can be implemented.
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class RegExCustomReplacementExample
{
public static void main(String[] args)
{
System.out.println(
new ReplaceFunction() {
public String handle(String group)
{
return "«"+group.substring(1, group.length()-1)+"»";
}
}.replace("A simple *test* string", "\\*.*?\\*"));
}
}
abstract class ReplaceFunction
{
public String replace(String source, String regex)
{
final Pattern pattern = Pattern.compile(regex);
final Matcher m = pattern.matcher(source);
boolean result = m.find();
if(result) {
StringBuilder sb = new StringBuilder(source.length());
int p=0;
do {
sb.append(source, p, m.start());
sb.append(handle(m.group()));
p=m.end();
} while (m.find());
sb.append(source, p, source.length());
return sb.toString();
}
return source;
}
public abstract String handle(String group);
}
This might look a bit complicated at first, but that doesn't matter because you only need to write it once. The subclasses implementing the handle method look simpler. An alternative is to pass the Matcher, instead of the matched String (group 0), to the handle method, since it gives access to all groups matched by the pattern (if the pattern defines groups).
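If Java 9 or later is available (an assumption, since the question does not name a version), Matcher.replaceAll also accepts a function that receives each MatchResult, which covers the "pass all groups to the handler" idea without a helper class. A minimal sketch with illustrative names:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DynamicReplace {
    // Group 1 captures the text between a pair of asterisks.
    private static final Pattern STARRED = Pattern.compile("\\*(.*?)\\*");

    // Hypothetical helper: replaces every *word* with its upper-case form.
    static String upperCaseStarred(String source) {
        // quoteReplacement keeps '$' and '\' in the computed text from being
        // interpreted as group references in the replacement string.
        return STARRED.matcher(source)
                .replaceAll(mr -> Matcher.quoteReplacement(mr.group(1).toUpperCase()));
    }

    public static void main(String[] args) {
        // prints A simple TEST string
        System.out.println(upperCaseStarred("A simple *test* string"));
    }
}
```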
package xmlchars;
import java.util.regex.Pattern;
public class TestRegex {
public static final String SPECIAL_CHARACTERS = "(?i)^[^a-z_]|[^a-z0-9-_.]";
public static void main(String[] args) {
// TODO Auto-generated method stub
String name = "#1998St #";
Pattern pattern = Pattern.compile(SPECIAL_CHARACTERS);
System.out.println(pattern.matcher(name).replaceAll(""));//gives wrong output 1998St
}
}
Basically, what I'm trying to achieve is:
The string may start only with a-z or _
The string may contain a-z, 0-9, _, - and . after the first character
The whole string is case insensitive
You could say:
... SPECIAL_CHARACTERS = "^[a-z_][a-z0-9_.-]+$";
and define the pattern by saying:
Pattern pattern = Pattern.compile(SPECIAL_CHARACTERS, Pattern.CASE_INSENSITIVE);
Note that this pattern validates a whole name rather than removing characters from it.
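A short sketch of how such a validating pattern could be used, assuming '.' and '-' are allowed after the first character as the question states (class and method names are illustrative):

```java
import java.util.regex.Pattern;

class NameCheck {
    // First character: a letter or '_'; then one or more letters, digits, '_', '.' or '-'.
    private static final Pattern VALID =
            Pattern.compile("^[a-z_][a-z0-9_.-]+$", Pattern.CASE_INSENSITIVE);

    // matches() requires the whole name to fit the pattern.
    static boolean isValid(String name) {
        return VALID.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("St_1998.v-2")); // true
        System.out.println(isValid("#1998St #"));   // false
        System.out.println(isValid("1998St"));      // false
    }
}
```

This only reports whether a name is acceptable; it does not produce a cleaned-up name.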
I managed to crack the regex. It takes only a simple change to the existing pattern:
"^[^a-z_]*|[^a-z_0-9-._]"
Here you go, with the working proof.
package xmlchars;
import java.util.regex.Pattern;
public class TestRegex {
public static final String SPECIAL_CHARACTERS = "^[^a-z_]*|[^a-z_0-9-._]";
public static void main(String[] args) {
// TODO Auto-generated method stub
String name = " # !`~!##$%^&*()-_=+{}[];:',<>/?19.- 98Cc#19 #/9_-8-.";
Pattern pattern = Pattern.compile(SPECIAL_CHARACTERS, Pattern.CASE_INSENSITIVE);
System.out.println(pattern.matcher(name).replaceAll("")); // output _19.-98Cc199_-8-.
}
}
I'll assume you are trying to identify anything in the String that doesn't match the pattern. What you have looks almost correct. It looks like your regex might work like this:
"(?i)^([^a-z_]|[^a-z0-9-_.])"
That would only match whenever one of those two groups appear at the start of the String. Instead, try this:
"(?i)(^[^a-z_])|[^a-z0-9-_.]"
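To shorten it even further, you could use the predefined character class \\W, which is the same as [^a-zA-Z_0-9], together with its counterpart \\w. With that, you wouldn't even need the case-insensitivity. Be careful with [\\W-.], though: it also matches '-' and '.', which you want to keep, so put the allowed punctuation inside a negated class instead:
"(^\\W)|[^\\w.-]"
Given a String called str, str.replaceAll("(^\\W)|[^\\w.-]",""); will remove the characters outside the allowed set. Unlike ^[^a-z_], ^\\W does not match a leading digit.
Test for your string:
class RegexTest
{
public static void main (String[] args)
{
String str = "#1998St #";
str = str.replaceAll("(^\\W)|[^\\w.-]","");
System.out.println(str);
}
}
Output:
1998St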
I am having some weird issues with a pattern replace.
I have these two patterns:
private static final Pattern CODE_ANY = Pattern.compile("&[0-9a-fk-or]");
private static final Pattern CODE_BLACK = Pattern.compile(ChatColour.BLACK.toString());
ChatColour.BLACK.toString() returns "&0"
Next, I have this code:
public static String Strip(String message)
{
while (true)
{
Matcher matcher = CODE_ANY.matcher(message);
if (!matcher.matches())
break;
message = matcher.replaceAll("");
}
return message;
}
I have tried a couple different approaches, but nothing gets replaced.
The initial version just called each CODE_xxx pattern one after the other, but users were bypassing that by doubling up on ampersands.
I just do not understand why this isn't removing anything..
I know it is definitely getting called, as I have printed debug messages to the console to check that.
// Morten
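matches() checks whether the complete input string matches the pattern, whereas find() checks whether the pattern can be found somewhere in the input string. Your loop therefore breaks on the first iteration unless the whole message is a single colour code. I would rewrite your method as: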
public static String strip(String message) // lowercase strip due to Java naming conventions
{
Matcher matcher = CODE_ANY.matcher(message);
if (matcher.find())
message = matcher.replaceAll("");
return message;
}
I just realized this can be done with a one-liner:
public static String strip(String message) {
return message.replaceAll("&[0-9a-fk-or]", "");
}
With String.replaceAll() you don't need a precompiled Pattern, although the regex is then compiled again on every call. You could still extract the regex to a final field of type String.
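The question also mentions users doubling up on ampersands. A single replaceAll pass turns "&&00" into "&0", because removing the inner code creates a new one. A minimal sketch that repeats the replacement until nothing changes (the class name is illustrative; the pattern is the CODE_ANY pattern from the question):

```java
import java.util.regex.Pattern;

class ColourStrip {
    private static final Pattern CODE_ANY = Pattern.compile("&[0-9a-fk-or]");

    static String strip(String message) {
        String previous;
        do {
            previous = message;
            // Remove every colour code found in the current text.
            message = CODE_ANY.matcher(message).replaceAll("");
            // Removing a code can join characters into a new code, so repeat.
        } while (!message.equals(previous));
        return message;
    }

    public static void main(String[] args) {
        // prints Hello world
        System.out.println(strip("&&00Hello &cworld"));
    }
}
```

Each pass that changes the message makes it shorter, so the loop always terminates.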