Substring first, second, third, ... , n match - java

Given: String s = "{aaa}{bbb}ccc";
How can I get an array (or list) whose elements are:
0th element: aaa
1st element: bbb
2nd element: ccc
This is my attempt:
String x = "{aaa}{b}c";
return Arrays.stream(x.split("\\}"))
.map(ss -> {
Pattern pattern = Pattern.compile("\\w*");
Matcher matcher = pattern.matcher(ss);
matcher.find();
return matcher.group();
})
.toArray(String[]::new);
(Assume only Java 8 or earlier is allowed.)

A simple replace should be enough if your strings are well formed like your examples:
String[] myStrings = {"{aaa}bbb", "{aaa}{bbb}{ccc}ddd", "{aaa}{bbb}{ccc}{ddd}eee"};
for(String str : myStrings){
String[] splited = str.replace("}{", "}").replace("{", "").split("}");
System.out.println(Arrays.toString(splited));
}
prints:
[aaa, bbb]
[aaa, bbb, ccc, ddd]
[aaa, bbb, ccc, ddd, eee]

private static List<String> parse ()
{
String x = "{aaa}{b}c";
Pattern pattern = Pattern.compile ("[^{\\}]+(?=})");
List < String > allMatches = new ArrayList < String > ();
Matcher m = pattern.matcher (x);
while (m.find ())
{
allMatches.add (m.group ());
}
String lastPart = x.substring(x.lastIndexOf("}")+1);
allMatches.add(lastPart);
System.out.println (allMatches);
return allMatches;
}
If your string may or may not have a last part outside the braces, check before adding it: lastPart is empty when the string ends with "}", and lastIndexOf returns -1 when the string contains no "}" at all.

This way is a bit simpler than using regex (and may be a bit faster too):
String[] strings = new String[100];
int index = 0;
int last = 0;
for(int i = 1; i < s.length(); i++){
if(s.charAt(i) == '}'){
strings[index++] = s.substring(last + 1, i); // the end index is exclusive
last = i + 1;
}
}
strings[index++] = s.substring(last, s.length());
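If you want to use regex, the pattern needs to identify sequences of one or more letters; you can try the pattern (?:\{([a-z]+)\})*([a-z]+). In Java the opening brace has to be escaped (an unescaped { raises a PatternSyntaxException), and a repeated capturing group keeps only its last match, so this pattern alone cannot return every braced part.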
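As one possible regex-based variant of the answers above, here is a minimal sketch that collects every part with a single find() loop. The class name BraceTokens, the method parse and the TOKEN pattern are hypothetical names chosen for illustration, and the sketch assumes the braces are never nested. It uses only Java 8 features.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BraceTokens {
    // Either a {braced} token (content in group 1) or a run of non-brace characters (group 2).
    private static final Pattern TOKEN = Pattern.compile("\\{([^{}]*)\\}|([^{}]+)");

    static List<String> parse(String s) {
        List<String> parts = new ArrayList<>();
        Matcher m = TOKEN.matcher(s);
        while (m.find()) {
            // Exactly one of the two groups takes part in each match.
            parts.add(m.group(1) != null ? m.group(1) : m.group(2));
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(parse("{aaa}{bbb}ccc")); // [aaa, bbb, ccc]
    }
}
```

If you need an array rather than a list, parts.toArray(new String[0]) converts it.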

Related

find overlapping regex pattern

I'm using a regex to find a pattern, and I need to find all overlapping matches, like this:
input: "word1_word2_word3_..."
result: "word1_word2", "word2_word3", "word3_word4", ...
It can be done using a positive lookahead, (?=...), which matches without consuming any input.
Regex: (?=(?:_|^)([^_]+_[^_]+))
Java code:
String text = "word1_word2_word3_word4_word5_word6_word7";
String regex = "(?=(?:_|^)([^_]+_[^_]+))";
Matcher matcher = Pattern.compile(regex).matcher(text);
while (matcher.find()) {
System.out.println(matcher.group(1));
}
Output:
word1_word2
word2_word3
word3_word4
...
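To see why the lookahead matters, here is a small sketch that runs a plain (consuming) pattern and the lookahead pattern over the same text. The class OverlapDemo and the helper findAll are hypothetical names used only for this comparison.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OverlapDemo {
    // Collects the given capturing group from every match of regex in text.
    static List<String> findAll(String regex, int group, String text) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            found.add(m.group(group));
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "word1_word2_word3_word4";
        // Consuming match: the search resumes after each match, so pairs cannot overlap.
        System.out.println(findAll("[^_]+_[^_]+", 0, text));
        // prints [word1_word2, word3_word4]

        // Lookahead: every match is zero-width, so the search moves on one character at a time.
        System.out.println(findAll("(?=(?:_|^)([^_]+_[^_]+))", 1, text));
        // prints [word1_word2, word2_word3, word3_word4]
    }
}
```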
You can do it without regex, using split:
String input = "word1_word2_word3_word4";
String[] words = input.split("_");
List<String> outputs = new LinkedList<>();
for (int i = 0; i < words.length - 1; i++) {
String first = words[i];
String second = words[i + 1];
outputs.add(first + "_" + second);
}
for (String output : outputs) {
System.out.println(output);
}

How to use regex to split a string containing numbers and letters in java

My task is to split a string that starts with digits and contains both digits and letters into two substrings. The first consists of all the digits before the first letter. The second is the remaining part, which shouldn't be split even if it contains digits.
For example, the string "123abc34de" should be split into "123" and "abc34de".
I know how to write a regular expression for such a string; it might look like this:
[0-9]{1,}[a-zA-Z]{1,}[a-zA-Z0-9]{0,}
I have tried many times but still don't know how to apply a regex in the String.split() method, and there seems to be very little material online about this. Thanks for any help.
You can do it this way:
final String regex = "([0-9]{1,})([a-zA-Z]{1,}[a-zA-Z0-9]{0,})";
final String string = "123ahaha1234";
final Pattern pattern = Pattern.compile(regex);
final Matcher matcher = pattern.matcher(string);
while (matcher.find()) {
System.out.println("Full match: " + matcher.group(0));
for (int i = 1; i <= matcher.groupCount(); i++) {
System.out.println("Group " + i + ": " + matcher.group(i));
}
}
matcher.group(1) contains the first part and matcher.group(2) contains the second.
You can add these values to a list or an array.
You can use a pretty simple pattern, "^(\\d+)(\\w+)", which captures the leading digits and then, starting at the first letter, the remaining word characters:
String string = "123abc34de";
Matcher matcher = Pattern.compile("^(\\d+)(\\w+)").matcher(string);
String firstpart = "";
String secondPart = "";
if (matcher.find()) {
firstpart = matcher.group(1);
secondPart = matcher.group(2);
}
System.out.println(firstpart + " - " + secondPart); // 123 - abc34de
This is not the most idiomatic way, but it gives the result without regex:
public static void main(String[] args) {
String example = "1234abc123";
// Count the leading digits; stop at the first character that is not a digit.
int index = 0;
while (index < example.length() && Character.isDigit(example.charAt(index))) {
index++;
}
// index is now the position of the first non-digit (or the length of the string).
String firstHalf = example.substring(0, index);
String secondHalf = example.substring(index);
System.out.println(firstHalf);
System.out.println(secondHalf);
}
Output will be: 1234 and in next line abc123
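Since the question asks specifically about String.split(), here is a minimal sketch of that route. It relies on a zero-width lookbehind/lookahead pair and on the limit argument of split; the class name DigitPrefixSplit and its split helper are hypothetical, and the sketch assumes the input really starts with digits.

```java
import java.util.Arrays;

class DigitPrefixSplit {
    // Splits at the first boundary between a digit and a letter. The limit of 2
    // stops after one split, so later digit/letter boundaries are left alone.
    static String[] split(String s) {
        return s.split("(?<=[0-9])(?=[a-zA-Z])", 2);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(split("123abc34de"))); // [123, abc34de]
    }
}
```

If the string contains no digit-to-letter boundary, split returns a one-element array holding the whole input, so check the length before reading the second element.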

Java: String to integer array

I have a string, which is a list of coordinates, as follows:
st = "((1,2),(2,3),(3,4),(4,5),(2,3))"
I want this to be converted to an array of coordinates,
a[0] = 1,2
a[1] = 2,3
a[2] = 3,4
....
and so on.
I can do it in Python, but I want to do it in Java.
So how can I split the string into an array in Java?
It can be done fairly easily with a regex, capturing \((\d+),(\d+)\) and then looping over the matches:
String st = "((1,2),(2,3),(3,4),(4,5),(2,3))";
Pattern p = Pattern.compile("\\((\\d+),(\\d+)\\)");
Matcher m = p.matcher(st);
List<String> matches = new ArrayList<>();
while (m.find()) {
matches.add(m.group(1) + "," + m.group(2));
}
System.out.println(matches);
If you genuinely need an array, the list can be converted:
String [] array = matches.toArray(new String[matches.size()]);
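Alternative solution (it takes fixed three-character substrings, so it only works while every coordinate is a single digit):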
String str="((1,2),(2,3),(3,4),(4,5),(2,3))";
ArrayList<String> arry=new ArrayList<String>();
for (int x=0; x<=str.length()-1;x++)
{
if (str.charAt(x)!='(' && str.charAt(x)!=')' && str.charAt(x)!=',')
{
arry.add(str.substring(x, x+3));
x=x+2;
}
}
for (String valInArry: arry)
{
System.out.println(valInArry);
}
If you don't want to use Pattern and Matcher, this should do it:
String st = "((1,2),(2,3),(3,4),(4,5),(2,3))";
String[] array = st.substring(2, st.length() - 2).split("\\),\\(");
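The answers above produce strings such as "1,2". If the coordinates are needed as integers, as the title suggests, the regex approach can be extended along these lines. This is only a sketch: CoordinateParser, parse and PAIR are hypothetical names, and the input is assumed to contain only non-negative integers that fit in an int.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CoordinateParser {
    // One "(x,y)" pair; group 1 is x and group 2 is y.
    private static final Pattern PAIR = Pattern.compile("\\((\\d+),(\\d+)\\)");

    static int[][] parse(String st) {
        List<int[]> pairs = new ArrayList<>();
        Matcher m = PAIR.matcher(st);
        while (m.find()) {
            pairs.add(new int[] { Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2)) });
        }
        return pairs.toArray(new int[0][]);
    }

    public static void main(String[] args) {
        int[][] a = parse("((1,2),(2,3),(3,4),(4,5),(2,3))");
        System.out.println(Arrays.deepToString(a)); // [[1, 2], [2, 3], [3, 4], [4, 5], [2, 3]]
    }
}
```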

Find sequential occurrences of all String[] elements in a given String

I have a pair of Strings in an array to check against another String:
String[] validPair = {"[BOLD]", "[/BOLD]" };
String toCheck = "Example [BOLD]bold long text[/BOLD] other example [BOLD]bold short[/BOLD]";
I need to check that the tags are balanced. I know how to check whether a string occurs inside another string, and I know I could do this by calling indexOf for each entry of validPair across the string and saving the positions, but that is ugly and I don't want to reinvent the wheel.
Something like:
int lastIndex = 0;
while (lastIndex != -1) {
int index = toCheck.findNextOccurrence(validPair, lastIndex); // here use indexOf
System.out.println(index);
lastIndex = index;
}
I was wondering whether there is a way to get the next occurrence of any of the Strings in String[] validPair within the String toCheck.
That is, a kind of Iterator or Tokenizer that does not split the string and returns only the occurrences of the array's contents (the array could also be a List or any other object).
OR:
OwnIterator ownIterator = new OwnIterator<String>(toCheck, validPair);
while (ownIterator.hasNext()) {
String next = ownIterator.findNextOccurrence();
System.out.println(next);
}
OUTPUT:
[BOLD]
[/BOLD]
[BOLD]
[/BOLD]
This is the solution I came up with. It uses an array of regular expressions to search for every item in validPair separately, then combines all found occurrences into one list (and its iterator):
public class OwnIterator implements Iterator
{
private Iterator<Integer> occurrencesItr;
public OwnIterator(String toCheck, String[] validPair) {
// build regex to search for every item in validPair
Matcher[] matchValidPair = new Matcher[validPair.length];
for (int i = 0 ; i < validPair.length ; i++) {
String regex =
"(" + // start capturing group
"\\Q" + // quote entire input string so it is not interpreted as regex
validPair[i] + // this is what we are looking for, duhh
"\\E" + // end quote
")" ; // end capturing group
Pattern p = Pattern.compile(regex);
matchValidPair[i] = p.matcher(toCheck);
}
// do the search, saving found occurrences in list
List<Integer> occurrences = new ArrayList<>();
for (int i = 0 ; i < matchValidPair.length ; i++) {
while (matchValidPair[i].find()) {
occurrences.add(matchValidPair[i].start(0)+1); // +1 if you want index to start at 1
}
}
// sort the list
Collections.sort(occurrences);
occurrencesItr = occurrences.iterator();
}
@Override
public boolean hasNext()
{
return occurrencesItr.hasNext();
}
@Override
public Object next()
{
return occurrencesItr.next();
}
}
A quick test:
public static void main(String[] args)
{
String[] validPair = {"[BOLD]", "[/BOLD]" };
String toCheck = "Example [BOLD]bold long text[/BOLD] other example [BOLD]bold short[/BOLD]";
OwnIterator itr = new OwnIterator(toCheck, validPair);
while (itr.hasNext()) {
System.out.println(itr.next());
}
}
prints the 1-based start index of each occurrence:
9
29
51
67
EDIT:
I found a better solution: a single regular expression that joins all items in validPair with the "or" operator (|). The Matcher's own find() method then serves as the iterator:
String regex = "(";
for (int i = 0 ; i < validPair.length ; i++) {
regex += (i == 0 ? "" : "|") + // add "or" after first item
"\\Q" + // quote entire input string so it is not interpreted as regex
validPair[i] + // this is what we are looking for, duhh
"\\E"; // end quote
}
regex += ")";
System.out.println("using regex : " + regex);
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(toCheck);
while (m.find()) {
System.out.println(m.group(0));
}
This gives the output:
using regex : (\Q[BOLD]\E|\Q[/BOLD]\E)
[BOLD]
[/BOLD]
[BOLD]
[/BOLD]
You can just do:
int first = toCheck.indexOf(validPair[0]);
boolean ok = first > -1 && toCheck.indexOf(validPair[1], first) > 0;
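None of the answers above performs the balance check that the question mentions. A minimal sketch of it, building on the single alternation regex, could look like this. TagBalance and isBalanced are hypothetical names, and "balanced" is assumed to mean that every closing tag has an earlier opening tag and no opening tag is left unclosed.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TagBalance {
    static boolean isBalanced(String text, String open, String close) {
        // Pattern.quote makes the brackets in the tags literal.
        Pattern p = Pattern.compile(Pattern.quote(open) + "|" + Pattern.quote(close));
        Matcher m = p.matcher(text);
        int depth = 0;
        while (m.find()) {
            if (m.group().equals(open)) {
                depth++;
            } else if (--depth < 0) {
                return false; // a closing tag with no matching opening tag
            }
        }
        return depth == 0; // false if an opening tag was never closed
    }

    public static void main(String[] args) {
        String toCheck = "Example [BOLD]bold long text[/BOLD] other example [BOLD]bold short[/BOLD]";
        System.out.println(isBalanced(toCheck, "[BOLD]", "[/BOLD]")); // true
    }
}
```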

Java, How to split String with shifting

How can I split a string into 2-character pieces, shifting by one character each time?
For example:
My string is: todayiscold
My target is: "to","od","da","ay","yi","is","sc","co","ol","ld"
But with this code:
Arrays.toString("todayiscold".split("(?<=\\G.{2})"));
I get: "to","da","yi","co","ld"
Can anybody help?
Try this:
String e = "example";
for (int i = 0; i < e.length() - 1; i++) {
System.out.println(e.substring(i, i+2));
}
Use a loop:
String test = "abcdefgh";
List<String> list = new ArrayList<String>();
for(int i = 0; i < test.length() - 1; i++)
{
list.add(test.substring(i, i + 2));
}
Following regex based code should work:
String str = "todayiscold";
Pattern p = Pattern.compile("(?<=\\G..)");
Matcher m = p.matcher(str);
int start = 0;
List<String> matches = new ArrayList<String>();
while (m.find(start)) {
matches.add(str.substring(m.end()-2, m.end()));
start = m.end()-1;
}
System.out.println("Matches => " + matches);
The trick is to pass end()-1 from the last match to the find(int) method.
Output:
Matches => [to, od, da, ay, yi, is, sc, co, ol, ld]
You can't use split in this case, because all split does is find places to cut and break your string there, so you can't make the same character appear in two parts.
Instead, you can use the Pattern/Matcher mechanism, like this:
String test = "todayiscold";
List<String> list = new ArrayList<String>();
Pattern p = Pattern.compile("(?=(..))");
Matcher m = p.matcher(test);
while(m.find())
list.add(m.group(1));
Or, even better, iterate over your String's characters and create substrings, as in D-Rock's answer.
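The substring loop from the earlier answers generalizes to any window width. A minimal sketch follows; SlidingWindows and windows are hypothetical names, and size is assumed to be at least 1.

```java
import java.util.ArrayList;
import java.util.List;

class SlidingWindows {
    // Returns every substring of the given length, moving one character at a time.
    static List<String> windows(String s, int size) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i + size <= s.length(); i++) {
            result.add(s.substring(i, i + size));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(windows("todayiscold", 2));
        // prints [to, od, da, ay, yi, is, sc, co, ol, ld]
    }
}
```

A string shorter than size simply yields an empty list, so no separate length check is needed.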
