Java store matches in array - java

Hi, I would like to store my matches in an array, but I keep getting NullPointerException or index-out-of-bounds errors.
final String mcontentURI[] = new String[count];
for (int i = 0; i < count; i++) {
Pattern p = Pattern.compile("src=\"(.*?)\"");
Matcher m = p.matcher(content_val);
if (m.find()) {
mcontentURI[i] = (m.group(i+1));
}
}

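The pattern has only one capturing group, so m.group(i+1) throws an IndexOutOfBoundsException as soon as i is 1 or more; re-compiling the same regex on every iteration does not change the group number. Compile the pattern once, always read group 1, and put the result at different indexes of the array: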
final String mcontentURI[] = new String[count];
final Pattern p = Pattern.compile("src=\"(.*?)\"");
for (int i = 0; i < count; i++) {
Matcher m = p.matcher(content_val); // Use different strings here
if (m.find()) {
mcontentURI[i] = m.group(1);
}
}
Note that mcontentURI[i] would remain null for indexes for which the pattern did not match.
If you want to search the same string, do this:
final String mcontentURI[] = new String[count];
final Pattern p = Pattern.compile("src=\"(.*?)\"");
Matcher m = p.matcher(content_val);
int i = 0;
while (i < count && m.find()) {
mcontentURI[i++] = m.group(1);
}
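If the number of matches is not known in advance, a list avoids having to size the array at all. Here is a minimal sketch of that idea; the class name SrcExtractor and the method extractSrc are hypothetical names chosen for illustration, not part of the original code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SrcExtractor {
    // Compiled once; group 1 captures the value of the src attribute.
    private static final Pattern SRC = Pattern.compile("src=\"(.*?)\"");

    // Hypothetical helper: collects every src value found in the given string.
    static List<String> extractSrc(String content) {
        List<String> uris = new ArrayList<>();
        Matcher m = SRC.matcher(content);
        // Each find() call advances to the next match in the same input.
        while (m.find()) {
            uris.add(m.group(1));
        }
        return uris;
    }

    public static void main(String[] args) {
        String html = "<img src=\"a.png\"><img src=\"b.png\">";
        System.out.println(extractSrc(html)); // prints [a.png, b.png]
    }
}
```

If an array is still required, uris.toArray(new String[0]) converts the list once all matches are collected.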

Related

Parsing time with Regex in Java

The code snippet below is trying to extract the hours, minutes and seconds from a string.
Ex:
"PT5M30S"
"PT1H13M59S"
I am getting a NullPointerException in this line (group=null): int number = new Integer(group.substring(0, group.length()-1));
// Create a Pattern object
Pattern pattern = Pattern.compile("PT(\\d+H)?(\\d+M)?(\\d+S)?");
// Now create matcher object.
Matcher matcher = pattern.matcher(duracaoStr);
int hour = 0;
int minute = 0;
int second = 0;
if(matcher.matches()){
for(int i = 1; i<=matcher.groupCount();i++){
String group = matcher.group(i);
int number = new Integer(group.substring(0, group.length()-1));
if(matcher.group(i).endsWith("H")){
hour = number;
} else if(matcher.group(i).endsWith("M")){
minute = number;
} else if(matcher.group(i).endsWith("S")){
second = number;
}
}
}
Try running this code for each of the two Strings individually, one by one.
You'll notice that the program runs successfully for the second String, i.e. PT1H13M59S, whereas it throws a NullPointerException for the first String, i.e. PT5M30S.
You get this NullPointerException for PT5M30S because group 1 does not participate in the match, so matcher.group(1) returns null. Notice that there is no hour value in PT5M30S.
See this Demo:
RegEx
PT(\d+H)?(\d+M)?(\d+S)?
Input
PT5M30S
PT1H13M59S
Match Information
MATCH 1
2. [2-4] `5M`
3. [4-7] `30S`
MATCH 2
1. [10-12] `1H`
2. [12-15] `13M`
3. [15-18] `59S`
Notice that for the first String, in Match 1, there is no output for Group 1.
So you should perform appropriate validation. Enclose the code that throws the NullPointerException in a try/catch block and, if a NullPointerException occurs, give the corresponding variable its default value.
For example:
import java.util.regex.*;
public class HelloWorld {
public static void main(String[] args) {
// Create a Pattern object
Pattern pattern = Pattern.compile("PT(\\d+H)?(\\d+M)?(\\d+S)?");
// Now create matcher object.
Matcher matcher = pattern.matcher("PT5M30S");
int hour = 0;
int minute = 0;
int second = 0;
if (matcher.matches()) {
for (int i = 1; i <= matcher.groupCount(); i++) {
try {
String group = matcher.group(i);
int number = Integer.parseInt(group.substring(0, group.length() - 1));
if (matcher.group(i).endsWith("H")) {
hour = number;
} else if (matcher.group(i).endsWith("M")) {
minute = number;
} else if (matcher.group(i).endsWith("S")) {
second = number;
}
} catch (java.lang.NullPointerException e) {
if (i == 1) {
hour = 0;
} else if (i == 2) {
minute = 0;
} else if (i == 3) {
second = 0;
}
}
}
}
}
}
@rD's solution above is sufficient and well answered (please choose his). Just as an alternative, I was working on a solution here as well before I realized it had been answered properly:
https://github.com/davethomas11/stackoverflow_Q_39443620
// Create a Pattern object
Pattern pattern = Pattern.compile("PT(\\d+H)?(\\d+M)?(\\d+S)?");
// Now create matcher object.
Matcher matcher = pattern.matcher(duracaoStr);
int hour = 0;
int minute = 0;
int second = 0;
if(matcher.matches()){
for(int i = 1; i<=matcher.groupCount();i++){
String group = matcher.group(i);
//Group will be null if not in pattern
if (group != null) {
int number = Integer.parseInt(group.substring(0, group.length()-1));
if(matcher.group(i).endsWith("H")){
hour = number;
} else if(matcher.group(i).endsWith("M")){
minute = number;
} else if(matcher.group(i).endsWith("S")){
second = number;
}
}
}
}
It is the same code, except that I've added a null check.
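Strings such as PT5M30S follow the ISO-8601 duration format, so a different technique from the regex answers above is also possible: let java.time.Duration (Java 8+) do the parsing. The sketch below assumes the input is always a valid ISO-8601 duration; the class name DurationFields and its parse method are hypothetical names for illustration:

```java
import java.time.Duration;
import java.util.Arrays;

class DurationFields {
    // Hypothetical helper: returns {hours, minutes, seconds} for an
    // ISO-8601 duration such as "PT5M30S".
    static long[] parse(String text) {
        // Throws DateTimeParseException if the text is not a valid duration.
        Duration d = Duration.parse(text);
        long total = d.getSeconds();
        return new long[] { total / 3600, (total % 3600) / 60, total % 60 };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parse("PT5M30S")));    // prints [0, 5, 30]
        System.out.println(Arrays.toString(parse("PT1H13M59S"))); // prints [1, 13, 59]
    }
}
```

Note that this normalizes the value, so an input like PT90M yields 1 hour and 30 minutes, whereas the regex approach would report 90 minutes.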

How do you add a delimiter in a given String format in Java?

I have the following String
"12:00:00, 2:30:003:45:00,23:45:00";
I have to update the string to use the following format:
"12:00:00, 2:30:00 |3:45:00,23:45:00 ";
I am able to split each string, but I do not know how to generate the required format. Here is the code I've written so far:
final String s = "12:00:00, 2:30:003:45:00,23:45:00";
final Pattern p = Pattern.compile("\\s*(\\d+:\\d\\d:\\d\\d)");
final Matcher m = p.matcher(s);
final List<String> tokens = new ArrayList<String>();
while (m.find()) {
tokens.add(m.group(1));
}
for (String tok : tokens) {
System.out.printf("[%s]%n", tok);
}
How about this:
final String string = "12:00:00, 2:30:003:45:00,23:45:00";
final Pattern pattern = Pattern.compile("\\s*(\\d+:\\d\\d:\\d\\d)");
final Matcher matcher = pattern.matcher(string);
final List<String> tokens = new ArrayList<String>();
while (matcher.find()) {
tokens.add(matcher.group(1));
}
System.out.println("tokens = " + tokens);
StringBuilder formattedString = new StringBuilder();
formattedString.append(tokens.get(0));
for (int i = 1; i < tokens.size(); i++) {
if (i % 2 == 0) {
formattedString.append(" | ");
} else {
formattedString.append(", ");
}
formattedString.append(tokens.get(i));
}
System.out.println(formattedString);
Edit: I've updated it to use a for loop when constructing the formatted string based on the comments I've read.
If you want to add | after every two times separated by a comma, your code can look like this:
final String s = "12:00:00, 2:30:003:45:00,23:45:00";
final Pattern p = Pattern.compile("(\\d+:\\d\\d:\\d\\d)\\s*,\\s*(\\d+:\\d\\d:\\d\\d)");
final Matcher m = p.matcher(s);
String result = m.replaceAll("$0|");
Or even
String result = s.replaceAll("(?:\\d+:\\d\\d:\\d\\d),\\s*(?:\\d+:\\d\\d:\\d\\d)","$0|");
$0 refers to group 0 which holds entire match.
result is 12:00:00, 2:30:00|3:45:00,23:45:00|
You may consider this replaceAll method using lookarounds:
final String s = "12:00:00, 2:30:003:45:00,23:45:00";
System.out.printf("%s%n", s.replaceAll("(?<=:\\d\\d)(?=(?::\\d{1,2}|$))", "|"));
// 12:00|:00, 2:30|:003:45|:00,23:45|:00|
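The pair-based replaceAll answers append a | after every pair, including the last one. If the delimiter is wanted only where one pair runs directly into the next, a lookahead can restrict the replacement. This is a sketch under that assumption; the class name PairDelimiter and the method insertPipes are hypothetical, and the exact spacing around | is a guess at the desired format:

```java
class PairDelimiter {
    // Hypothetical helper: inserts " |" after a "time, time" pair only when
    // another digit follows immediately, i.e. when two pairs run together.
    static String insertPipes(String s) {
        String time = "\\d+:\\d\\d:\\d\\d";
        // Group 1 is the whole pair; (?=\\d) requires a following digit
        // without consuming it.
        String pair = "(" + time + "\\s*,\\s*" + time + ")(?=\\d)";
        return s.replaceAll(pair, "$1 |");
    }

    public static void main(String[] args) {
        String s = "12:00:00, 2:30:003:45:00,23:45:00";
        System.out.println(insertPipes(s)); // prints 12:00:00, 2:30:00 |3:45:00,23:45:00
    }
}
```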

Java String replace

Let's say I have a string "aabbccaa". I want to replace each occurrence of "aa" in that string with another string, in the following way.
First occurrence of "aa" should be replaced by "1" and next occurrence of "aa" by "2" and so on.
So, the result of the string becomes "1bbcc2".
You can use replaceFirst() in a for loop where counter is incrementing...
for (int i = 1; string.contains("aa"); i++) {
string = string.replaceFirst("aa", "" + i);
}
You can do it using the Matcher's appendReplacement method:
Pattern p = Pattern.compile("aa");
Matcher m = p.matcher("aabbccaahhhaahhhaaahahhahaaakty");
StringBuffer sb = new StringBuffer();
// Variable "i" serves as a counter. It gets incremented after each replacement.
int i = 1;
while (m.find()) {
m.appendReplacement(sb, ""+(i++));
}
m.appendTail(sb);
System.out.println(sb.toString());
This approach lets you avoid creating multiple string objects.
It is possible to do this with Java library methods, although working on a char array with lower-level logic would be faster.
String s = "aabbccaa";
String target = "aa";
int i = 1;
String newS;
for (int j = 0; j < s.length(); j++) {
// replaceFirst expects a String replacement, so convert the counter
newS = s.replaceFirst(target, String.valueOf(i++));
// compensate for the change in length after a replacement
j += newS.length() - s.length();
s = newS;
}
Here is a solution :
public static void main(String[] a) {
int i = 1;
String before = "aabbccaabbaabbaa";
String regex = "aa";
String after = substitute(i, before, regex);
System.out.println(after);
}
private static String substitute(int i, String before, String regex) {
String after = before.replaceFirst(regex, Integer.toString(i++));
while (!before.equals(after)) {
before = after;
after = before.replaceFirst(regex, Integer.toString(i++));
}
return after;
}
Output :
1bbcc2bb3bb4
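On Java 9 or later, Matcher.replaceAll also accepts a function, which allows the numbering to happen in a single pass. The following is a minimal sketch, assuming the target is meant as literal text rather than a regex; the class name NumberedReplace and the method replaceNumbered are hypothetical:

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NumberedReplace {
    // Hypothetical helper: replaces the n-th occurrence of `literal` with n.
    static String replaceNumbered(String input, String literal) {
        // AtomicInteger is used because the lambda cannot modify a local int.
        AtomicInteger counter = new AtomicInteger(1);
        return Pattern.compile(Pattern.quote(literal))
                .matcher(input)
                // Requires Java 9+: replaceAll(Function<MatchResult, String>).
                .replaceAll(mr -> Matcher.quoteReplacement(
                        Integer.toString(counter.getAndIncrement())));
    }

    public static void main(String[] args) {
        System.out.println(replaceNumbered("aabbccaa", "aa")); // prints 1bbcc2
    }
}
```

Because the input is scanned only once, the inserted numbers are never rescanned for further matches, which a loop over replaceFirst cannot guarantee.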

Java, How to split String with shifting

How can I split a string into 2-character pieces, shifting by one character each time?
For example:
My string is: todayiscold
My target is: "to","od","da","ay","yi","is","sc","co","ol","ld"
But with this code:
Arrays.toString("todayiscold".split("(?<=\\G.{2})"));
I get: "to","da","yi","co","ld"
Can anybody help?
Try this:
String e = "example";
for (int i = 0; i < e.length() - 1; i++) {
System.out.println(e.substring(i, i+2));
}
Use a loop:
String test = "abcdefgh";
List<String> list = new ArrayList<String>();
for(int i = 0; i < test.length() - 1; i++)
{
list.add(test.substring(i, i + 2));
}
Following regex based code should work:
String str = "todayiscold";
Pattern p = Pattern.compile("(?<=\\G..)");
Matcher m = p.matcher(str);
int start = 0;
List<String> matches = new ArrayList<String>();
while (m.find(start)) {
matches.add(str.substring(m.end()-2, m.end()));
start = m.end()-1;
}
System.out.println("Matches => " + matches);
Trick is to use end()-1 from last match in the find() method.
Output:
Matches => [to, od, da, ay, yi, is, sc, co, ol, ld]
You can't use split in this case, because all split does is find a place to split and break your string at that place, so you can't make the same character appear in two parts.
Instead you can use Pattern/Matcher mechanisms like
String test = "todayiscold";
List<String> list = new ArrayList<String>();
Pattern p = Pattern.compile("(?=(..))");
Matcher m = p.matcher(test);
while(m.find())
list.add(m.group(1));
Or, even better, iterate over your String's characters and create substrings, as in D-Rock's answer.
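The loop-based answers generalize naturally to any window width. Here is a small sketch, assuming the width is positive; the class name SlidingWindow and the method windows are hypothetical names for illustration:

```java
import java.util.ArrayList;
import java.util.List;

class SlidingWindow {
    // Hypothetical helper: returns every substring of length `size`,
    // moving the start index forward by one character each time.
    // Assumes size > 0.
    static List<String> windows(String s, int size) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i + size <= s.length(); i++) {
            out.add(s.substring(i, i + size));
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [to, od, da, ay, yi, is, sc, co, ol, ld]
        System.out.println(windows("todayiscold", 2));
    }
}
```

A string shorter than the window simply yields an empty list.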

extracting a specific part of a url using regex

I want to extract a part of a URL that sits in the middle of it, using regex in Java.
This is what I tried. The main problem is that the part I want, regex+java, is in the middle of the last part of the URL, and I have no idea how to ignore the characters after it; my regex only ignores what comes before it:
String regex = "https://www\\.google\\.com/(search)?q=([^/]+)/";
String url = "https://www.google.com/search?q=regex+java&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a";
Pattern pattern = Pattern.compile (regex);
Matcher matcher = pattern.matcher (url);
if (matcher.matches ())
{
int n = matcher.groupCount ();
for (int i = 0; i <= n; ++i)
System.out.println (matcher.group (i));
}
The result should be regex+java or even regex java, but my code didn't work.
Try:
String regex = "https://www\\.google\\.com/search\\?q=([^&]+).*";
String url = "https://www.google.com/search?q=regex+java&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a";
Pattern pattern = Pattern.compile (regex);
Matcher matcher = pattern.matcher (url);
if (matcher.matches ())
{
int n = matcher.groupCount ();
for (int i = 0; i <= n; ++i)
System.out.println (matcher.group (i));
}
The result is:
https://www.google.com/search?q=regex+java&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a
regex+java
EDIT
Replacing all pluses before printing:
for (int i = 0; i <= n; ++i) {
String str = matcher.group (i).replaceAll("\\+", " ");
System.out.println (str);
}
String regex = "https://www\\.google\\.com/?(search)\\?q=([^&]+)?";
String url = "https://www.google.com/search?q=regex+java&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(url);
while (matcher.find()) {
System.out.println(matcher.group(2)); // group 2 holds the value of q
}
This should do the job.
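As an alternative to matching the whole URL with a regex, the query string can be taken apart with java.net.URI and URLDecoder, which also turns the + into a space. This is a sketch that assumes the URL is well formed and that the parameter appears at most once; the class name QueryParam and the method get are hypothetical:

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URLDecoder;

class QueryParam {
    // Hypothetical helper: returns the decoded value of the named query
    // parameter, or null if the URL has no such parameter.
    static String get(String url, String name) {
        String query = URI.create(url).getRawQuery();
        if (query == null) {
            return null;
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            String key = eq < 0 ? pair : pair.substring(0, eq);
            if (key.equals(name)) {
                String raw = eq < 0 ? "" : pair.substring(eq + 1);
                try {
                    // Decodes %XX escapes and converts '+' to a space.
                    return URLDecoder.decode(raw, "UTF-8");
                } catch (UnsupportedEncodingException e) {
                    // UTF-8 is always supported, so this should not happen.
                    throw new IllegalStateException(e);
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String url = "https://www.google.com/search?q=regex+java&ie=utf-8&oe=utf-8";
        System.out.println(get(url, "q")); // prints regex java
    }
}
```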
