Android get nearby comma, space, or period in a string - java

I have a string and I want to get the first comma, space, or period in it.
int word = title.indexOf(" ", idx);
This gets the first space. How can I make it find whichever of space, comma, or period comes first?
I tried using || but didn't work.
ex.
int word = title.indexOf(" " || "," || ".", idx);

Gets the index of the first occurrence of space, comma or dot, or -1 if none of them could be found:
Pattern pattern = Pattern.compile("[ ,\\.]");
Matcher matcher = pattern.matcher(title);
int index = matcher.find() ? matcher.start() : -1;
Note that you can pre-compile the pattern and reuse it as often as you like.
See also http://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html
Note also that if you want to break a text into single words, you can/should use a BreakIterator instead!
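The question's indexOf call starts searching at idx, which the snippet above does not take into account. A minimal sketch of the same character class with a start offset, using Matcher.find(int); the class and method names are made up for illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SeparatorFinder {
    // Compiled once and reused; matches a space, a comma, or a period.
    private static final Pattern SEPARATOR = Pattern.compile("[ ,.]");

    // Hypothetical helper: index of the first separator at or after 'from', or -1.
    static int indexOfSeparator(String text, int from) {
        if (from < 0) {
            from = 0; // mimic indexOf, which treats negative offsets as 0
        }
        if (from > text.length()) {
            return -1; // Matcher.find(int) would throw for an offset past the end
        }
        Matcher matcher = SEPARATOR.matcher(text);
        return matcher.find(from) ? matcher.start() : -1;
    }
}
```

Inside a character class the period needs no escaping, so "[ ,.]" and "[ ,\\.]" behave the same.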

What you're doing isn't valid Java syntax. Use the indexOf() method with a space, comma and period, then determine the smallest of these 3 values.
int a = title.indexOf(" ", idx);
int b = title.indexOf(",", idx);
int c = title.indexOf(".", idx);
Then just determine which is the smallest, ignoring any value that is -1 (not found).
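Picking the smallest value needs a little care, because indexOf returns -1 when nothing is found and a plain Math.min would then prefer -1. A small sketch of one way to handle that; FirstOf, minFound and indexOfAny are illustrative names, not part of any library:

```java
final class FirstOf {
    // Smaller of two indexOf results, treating -1 as "not found".
    static int minFound(int a, int b) {
        if (a == -1) {
            return b;
        }
        if (b == -1) {
            return a;
        }
        return Math.min(a, b);
    }

    // Index of the first space, comma, or period at or after idx, or -1.
    static int indexOfAny(String title, int idx) {
        int a = title.indexOf(' ', idx);
        int b = title.indexOf(',', idx);
        int c = title.indexOf('.', idx);
        return minFound(a, minFound(b, c));
    }
}
```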
A faster way would be to write your own method. Behind the scenes, indexOf just loops over the characters, and you can do that yourself in a single pass:
public static int findFirstOccurrence(String s) {
    for (int i = 0; i < s.length(); i++) {
        char c = s.charAt(i);
        if (c == ',' || c == '.' || c == ' ') {
            return i; // first comma, period, or space
        }
    }
    return -1; // none of them found
}

Unfortunately, indexOf does not accept an array of characters. You need to call indexOf three times, or match a regex. The code you provided is invalid Java syntax: || is the conditional OR operator, which only works on boolean operands, as in
if(x || y )

Related

Find dash "-" that's not inside round brackets "()" within String

I'm trying to find/determine if a String contains the character "-" that is not enclosed in round brackets "()".
I've tried the regex
[^\(]*-[^\)]*,
but it's not working.
Examples:
100 - 200 mg -> should match because the "-" is not enclosed in round brackets.
100 (+/-) units -> should NOT match
Do you have to use regex? You could try just iterating over the string and keeping track of the scope like so:
public boolean HasScopedDash(String str)
{
int scope = 0;
boolean foundInScope = false;
for (int i = 0; i < str.length(); i++)
{
char c = str.charAt(i);
if (c == '(')
scope++;
else if (c == '-')
foundInScope = scope != 0;
else if (c == ')' && scope > 0)
{
if (foundInScope)
return true;
scope--;
}
}
return false;
}
Edit: As mentioned in the comments, it might be desirable to exclude cases where the dash comes after an opening parenthesis but no closing parenthesis ever follows. (I.e. "abc(2-xyz") The above edited code accounts for this.
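As written, the method above appears to report a dash that sits inside a closed pair of parentheses, which is the inverse of what the question asks for. A sketch of the direct check, under the assumption that everything after an unclosed "(" counts as enclosed; the class and method names are hypothetical:

```java
final class DashCheck {
    // Returns true if some '-' occurs while no parenthesis is open.
    static boolean hasDashOutsideParens(String str) {
        int depth = 0; // number of currently open '('
        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            if (c == '(') {
                depth++;
            } else if (c == ')' && depth > 0) {
                depth--;
            } else if (c == '-' && depth == 0) {
                return true;
            }
        }
        return false;
    }
}
```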
You might not need to check for the parentheses at all to make this pass. Instead, you could check the boundaries around the dash. This expression, for instance, checks for a digit and whitespace before and after the dash, and it is much easier to modify if you want to allow other characters in the middle:
([0-9]\s+[-]\s+[0-9])
It passes your first input and fails the undesired input. You could simply add other chars to its middle char list using logical ORs.
Java supports quantified atomic groups, so this works.
It consumes paired parentheses and their contents,
without giving anything back, until it finds a dash -.
This is done via the atomic group construct (?> ).
^(?>(?>\(.*?\))|[^-])*?-
https://www.regexplanet.com/share/index.html?share=yyyyd8n1dar
(click on the Java button, check the find() function column)
Readable
^
(?>
(?> \( .*? \) )
|
[^-]
)*?
-
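For reference, a sketch of how the compact form of this pattern could be used from Java with find(). The backslashes are doubled for the string literal; the class and method names are only illustrative:

```java
import java.util.regex.Pattern;

final class UnbracketedDash {
    // The atomic-group pattern from above, written as a Java string literal.
    private static final Pattern DASH_OUTSIDE_PARENS =
            Pattern.compile("^(?>(?>\\(.*?\\))|[^-])*?-");

    // True if the pattern finds a dash that is not swallowed by a "(...)" pair.
    static boolean hasDashOutsideParens(String input) {
        return DASH_OUTSIDE_PARENS.matcher(input).find();
    }
}
```

The spaced-out "Readable" layout would additionally need the Pattern.COMMENTS flag, since it relies on whitespace being ignored.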
If you don't mind checking the string with two simple regexes instead of one complicated regex, you can try this instead:
public static boolean match(String input) {
Pattern p1 = Pattern.compile("\\-"); // match dash
Pattern p2 = Pattern.compile("\\(.*\\-.*\\)"); // match dash within bracket
Matcher m1 = p1.matcher(input);
Matcher m2 = p2.matcher(input);
// Match only if there is a dash and no dash enclosed in brackets.
return m1.find() && !m2.find();
}
Test the string
public static void main(String[] args) {
String input1 = "100 - 200 mg";
String input2 = "100 (+/-) units";
System.out.println(input1 + " : " + ( match(input1) ? "match" : "not match") );
System.out.println(input2 + " : " + ( match(input2) ? "match" : "not match") );
}
The output will be
100 - 200 mg : match
100 (+/-) units : not match
Matcher m = Pattern.compile("\\([^()-]*-[^()]*\\)").matcher(s); return !m.find();
https://ideone.com/YXvuem

How to replace a point ('.') in a string, with the word before the point?

If we have
String x="Hello.World";
I'm looking to replace that '.' with "Hello", as to have: "HelloHelloWorld".
The problem is, if I have:
String Y="Hello.beautiful.world.how.are.you";
the answer would have to be "HelloHellobeautifulbeautifulworldworldhowhowareareyouyou"
Keep in mind that I can't use arrays.
I think you can just use regex replacements to achieve that. In a regex, you can use so called "capture groups". You match a word plus a dot with your regex and then you replace it with two times the matched word.
// Match any number of word characters plus a dot
Pattern regex = Pattern.compile("(\\w*)\\.");
Matcher regexMatcher = regex.matcher(text); // text is the input string
// $1 is the matched word, so $1$1 is just two times that word.
String resultText = regexMatcher.replaceAll("$1$1");
Note that I didn't try it out since it would probably take me half an hour to set up the Java environment etc. But I am pretty confident that it works.
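A self-contained sketch of the same capture-group idea, using String.replaceAll so that no arrays are involved. DotDoubler and doubleWordsBeforeDots are made-up names:

```java
final class DotDoubler {
    // Replaces each "word." with "wordword"; text after the last dot is kept as-is.
    static String doubleWordsBeforeDots(String text) {
        // (\w*) captures the word in front of the dot; $1$1 writes it twice.
        return text.replaceAll("(\\w*)\\.", "$1$1");
    }

    public static void main(String[] args) {
        System.out.println(doubleWordsBeforeDots("Hello.World")); // HelloHelloWorld
    }
}
```

With this pattern the final word is not doubled unless the input ends with a dot.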
Think of the problem like a pointer problem. You need to keep a running pointer at the last place you looked (firstIndex in my code) and a pointer at your current place (nextIndex in my code). Call substring() on whatever is between those places (add 1 to firstIndex after the first occurrence, because we don't need to capture the "."), append it twice to a new string, and then move your pointers. There is probably a more elegant solution, but this gets the job done:
String Y="Hello.beautiful.world.how.are.you";
int firstIndex=0;
int nextIndex=Y.indexOf(".",firstIndex);
String newString = "";
while(nextIndex != -1){
newString += Y.substring(firstIndex==0 ? firstIndex : firstIndex+1, nextIndex);
newString += Y.substring(firstIndex==0 ? firstIndex : firstIndex+1, nextIndex);
firstIndex=nextIndex;
nextIndex=Y.indexOf(".", nextIndex+1);
}
System.out.println(newString);
Output (note that the text after the last dot, "you", is never appended by this loop):
HelloHellobeautifulbeautifulworldworldhowhowareare
This is what I have:
public String meowDot(String meow) {
    StringBuilder result = new StringBuilder();
    int wordStart = 0; // index where the current word begins
    for (int i = 0; i < meow.length(); i++) {
        if (meow.charAt(i) == '.') {
            // Replace the dot with the word before it, i.e. emit that word twice.
            String word = meow.substring(wordStart, i);
            result.append(word).append(word);
            wordStart = i + 1;
        }
    }
    // Append whatever follows the last dot unchanged.
    result.append(meow.substring(wordStart));
    return result.toString();
}

Counting comma and any text in java String

I'm trying to write a function to count specific Strings.
The Strings to count look like the following:
first any character except comma at least once -
the comma -
any character, at least once
example string:
test, test, test,
should count to 3
I've tried do that by doing the following:
int countSubstrings = 0;
final Pattern pattern = Pattern.compile("[^,]*,.+");
final Matcher matcher = pattern.matcher(commaString);
while (matcher.find()) {
countSubstrings++;
}
Though my solution doesn't work. It always ends up counting to one and no further.
Try this pattern instead: [^,]+
As you can see in the API, find() will give you the next subsequence that matches the pattern. So this will find your sequences of "non-commas" one after the other.
Your regex, especially the .+ part will match any char sequence of at least length 1. You want the match to be reluctant/lazy so add a ?: [^,]*,.+?
Note that .+? will still match a comma that directly follows a comma, so you might want to replace .+? with [^,]+ instead (since commas can't match, laziness is not needed).
Besides that an easier solution might be to split the string and get the length of the array (or loop and check the elements if you don't want to allow for empty strings):
countSubstrings = commaString.split(",").length;
Edit:
Since you added an example that clarifies your expectations, you need to adjust your regex. You seem to want to count the number of strings followed by a comma so your regex can be simplified to [^,]+,. This matches any char sequence consisting of non-comma chars which is followed by a comma.
Note that this wouldn't match multiple commas or text at the end of the input, e.g. test,,test would result in a count of 1. If you have that requirement you need to adjust your regex.
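To make the behaviour described above concrete, here is a small sketch that counts matches of [^,]+, with find(). CommaCounter and countItems are illustrative names only:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CommaCounter {
    // One or more non-comma characters followed by a comma.
    private static final Pattern ITEM = Pattern.compile("[^,]+,");

    // Counts how many comma-terminated, non-empty items the string contains.
    static int countItems(String commaString) {
        Matcher matcher = ITEM.matcher(commaString);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countItems("test, test, test,")); // 3
        System.out.println(countItems("test,,test"));        // 1
    }
}
```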
Quite good answers have already been given. Something like this should also work. Beware: it's not clean and probably not the fastest way to do this, but it is quite readable. :)
public int countComma(String lots_of_words) {
int count = 0;
for (int x = 0; x < lots_of_words.length(); x++) {
if (lots_of_words.charAt(x) == ',') {
count++;
}
}
return count;
}
Or even better:
public int countChar(String lots_of_words, char the_chosen_char) {
int count = 0;
for (int x = 0; x < lots_of_words.length(); x++) {
if (lots_of_words.charAt(x) == the_chosen_char) {
count++;
}
}
return count;
}

How can I look for two specific characters in a string?

String abc = "||:::|:|::";
It should return true if there's two | and three : appearances.
I'm not sure how to use "regex" or if it's the right method to use. There's no specific pattern in the abc String.
Using a regex would be a bad idea, especially if there's no specific order to them. Make a function that counts the number of times a character appears in a string, and use that:
public int count(String base, char toFind)
{
int count = 0;
char[] haystack = base.toCharArray();
for (int i = 0; i < haystack.length; i++)
if (haystack[i] == toFind)
count++;
return count;
}
String abc = "||:::|:|::";
if (count(abc, '|') >= 2 && count(abc, ':') >= 3)
{
//Do some code here
}
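My favorite method for counting the occurrences of a character in a string is int num = s.length() - s.replace("|", "").length(); (use replace rather than replaceAll here, because replaceAll would treat "|" as a regex alternation and remove nothing). You can do that for both characters and test those ints.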
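A sketch of the length-difference trick as a reusable method. It relies on String.replace, which treats its argument as literal text, so "|" needs no escaping. CharCounter, countOccurrences and hasEnough are hypothetical names:

```java
final class CharCounter {
    // Counts non-overlapping occurrences of a literal by comparing lengths
    // before and after removing it.
    static int countOccurrences(String s, String literal) {
        if (literal.isEmpty()) {
            return 0; // avoid dividing by zero for an empty search string
        }
        return (s.length() - s.replace(literal, "").length()) / literal.length();
    }

    // The check from the question: at least two '|' and at least three ':'.
    static boolean hasEnough(String s) {
        return countOccurrences(s, "|") >= 2 && countOccurrences(s, ":") >= 3;
    }
}
```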
If you want to test all conditions in one regex you can use look-ahead (?=condition).
Your regex can look like
String regex =
"(?=(.*[|]){2})"//contains two |
+ "(?=(.*:){3})"//contains three :
+ "[|:]+";//is build only from : and | characters
Now you can use it with matches like
String abc = "||:::|:|::";
System.out.println(abc.matches(regex));//true
abc = "|::::::";
System.out.println(abc.matches(regex));//false
Anyway, you can avoid regex and write your own method which counts the | and : characters in your string and checks whether these counts are greater than or equal to 2 and 3. You can use StringUtils.countMatches from Apache Commons, so your test code could look like
public static boolean testString(String s){
int pipes = StringUtils.countMatches(s, "|");
int colons = StringUtils.countMatches(s, ":");
return pipes>=2 && colons>=3;
}
or
public static boolean testString(String s){
return StringUtils.countMatches(s, "|")>=2
&& StringUtils.countMatches(s, ":")>=3;
}
This assumes that the two '|' must be adjacent, that the three ':' must be adjacent, and that the pipes come before the colons. You can do that with the following single regular expression (the | must be escaped, because an unescaped | means alternation):
".*\\|\\|.*:::.*"
If you only want to check that the characters are present, irrespective of their order, use the String.matches method with the following two regular expressions, combined with a logical AND:
".*\\|.*\\|.*"
".*:.*:.*:.*"
A regular expression cheat sheet is useful here; it's fairly simple to learn. Look at groups and quantifiers to understand the above expressions.
Haven't tested it, but something like this should work with matcher(...).matches():
Pattern.compile("^(?=(?:.*[|]){2})(?=(?:.*:){3}).*$");
Each lookahead (?=...) scans the string from the start. The first checks whether | occurs at least twice; the second does the same for :, which has to occur at least three times. The trailing .* then consumes the string so that the whole pattern can match.

Regex to check if a single quote is preceded by another single quote

I would like to write a regex to validate if a single quote is preceded by another single quote.
Valid strings:
azerty''uiop
aze''rty''uiop
''azertyuiop
azerty''uiop''
azerty ''uiop''
azerty''''uiop
azerty''''uiop''''
Invalid strings:
azerty'uiop
aze'rty'uiop
'azertyuiop
azerty'uiop'
azerty 'uiop'
azerty'''uiop
It can be done in one line:
inputString.matches("(?:[^']|'')*+");
The regex simply means, the string can contain 0 or more of
Non-quote character [^']
OR
A pair of consecutive quotes ''
I used the possessive version (*+) of the zero-or-more quantifier (*). Explaining possessive quantifiers in full would be lengthy. Simply put, it is an optimization: the engine does not backtrack into the repetition once it has matched.
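A short sketch that runs this one-liner against a few of the sample strings from the question. QuotePairs and isValid are illustrative names:

```java
final class QuotePairs {
    // Valid if the whole string consists of non-quote characters and pairs of quotes.
    static boolean isValid(String input) {
        return input.matches("(?:[^']|'')*+");
    }

    public static void main(String[] args) {
        System.out.println(isValid("azerty''uiop"));  // true
        System.out.println(isValid("azerty'uiop"));   // false
        System.out.println(isValid("azerty'''uiop")); // false
    }
}
```

Note that a string without any single quote is also accepted by this pattern.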
No need for a regex, just use .replace() to replace all sequences of two single quotes by nothing, then test whether you still find a single quote; if yes, the string is invalid:
if (input.replace("''", "").indexOf('\'') != -1)
// Not valid!
Wrapped in a method (a string with no single quotes at all passes this check too):
public boolean isValid(final String input)
{
    // Remove every pair of quotes; any quote left over is unpaired.
    final String s = input.replace("''", "");
    return s.indexOf('\'') == -1;
}
Do you want a very fast solution? Try the following:
public static boolean isValid(String str) {
char[] chars = str.toCharArray();
int found = 0;
for (int i = 0; i < chars.length; i++) {
char c = chars[i];
if (c == '\'') {
found++;
} else {
if (found > 0 && found % 2 != 0) {
return false;
}
found = 0;
}
}
if (found > 0 && found % 2 != 0) {
return false;
}
return true;
}
You can use the code below too:
str.matches("([^\']*(\'){2}[^\']*)+");
I think "([^\']*(\'){2}[^\']*)+" is easy for beginners to grasp. But it is not the best way to do this: it runs into catastrophic backtracking on long input.
