I have a result that is either a text or a numeric value, such as:
String result;
result = "avsds";
result = "123";
result = "345.45";
Sometimes the results also contain commas like:
result = "abc,def";
result = "1,234";
I want to remove the commas from result only if it is a numeric value, and not if it is simple text.
What is the best way of going about this?
Here is your answer:
String regex = "(?<=[\\d])(,)(?=[\\d])";
Pattern p = Pattern.compile(regex);
String str = "Your input";
Matcher m = p.matcher(str);
str = m.replaceAll("");
System.out.println(str);
This removes only commas that sit between two digits, so plain text such as "abc,def" is left untouched, as you asked.
Try adding that in your main method. Or try this one, it receives input:
String regex = "(?<=[\\d])(,)(?=[\\d])";
Pattern p = Pattern.compile(regex);
System.out.println("Value?: ");
Scanner scanIn = new Scanner(System.in);
String str = scanIn.next();
Matcher m = p.matcher(str);
str = m.replaceAll("");
System.out.println(str);
The easiest way is to use two regexes: the first to make sure the value is numeric (something along the lines of [0-9.,]+), and the second to clean it (result.replaceAll(",", "")).
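A minimal sketch of the two-step idea, assuming "numeric" means a string made only of digits, commas and dots with at least one digit in it. The class name NumericCommaCleaner and the method clean are made up for illustration.

```java
import java.util.regex.Pattern;

class NumericCommaCleaner {
    // Assumption: "numeric" = only digits, commas and dots, and at least one digit.
    private static final Pattern NUMERIC = Pattern.compile("(?=.*\\d)[0-9.,]+");

    static String clean(String result) {
        // matches() requires the whole string to fit the pattern, not just a part of it
        if (NUMERIC.matcher(result).matches()) {
            return result.replace(",", ""); // plain (non-regex) replacement of every comma
        }
        return result; // text values are returned unchanged
    }

    public static void main(String[] args) {
        System.out.println(clean("1,234"));   // 1234
        System.out.println(clean("abc,def")); // abc,def
    }
}
```

Using matches() rather than find() is what keeps mixed values such as "abc,1" from being treated as numbers.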
You could try to parse the string with one of the numeric classes (Integer, Double, etc.) after removing the unwanted characters. If the parsing succeeds, the value is numeric and you can remove the unwanted characters from the original string.
Here I have used BigInteger since I am not sure about the precision your requirement needs. Note that BigInteger accepts only whole numbers, so a value such as "345.45" is treated as non-numeric.
public static String removeIfNumeric(final String s, final String toRemove) {
final String result;
if (isNumeric(s, toRemove)) {
result = s.replaceAll(toRemove, "");
} else {
result = s;
}
return result;
}
public static boolean isNumeric(final String s, final String toRemoveRegex) {
try {
new BigInteger(s.replaceAll(toRemoveRegex, ""));
return true;
} catch (NumberFormatException e) {
return false;
}
}
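Because BigInteger only parses whole numbers, the isNumeric check above returns false for "345.45". If decimals should count as numeric too, the same idea might be sketched with BigDecimal. The class and method names here are hypothetical, and the comma is hard-coded instead of being passed in as a regex.

```java
import java.math.BigDecimal;

class DecimalAwareCleaner {
    static String removeCommasIfNumeric(String s) {
        String candidate = s.replace(",", "");
        try {
            new BigDecimal(candidate); // throws NumberFormatException for non-numeric text
            return candidate;
        } catch (NumberFormatException e) {
            return s; // not a number: keep the original, commas included
        }
    }

    public static void main(String[] args) {
        System.out.println(removeCommasIfNumeric("1,234.50")); // 1234.50
        System.out.println(removeCommasIfNumeric("abc,def"));  // abc,def
    }
}
```

One caveat: BigDecimal also accepts scientific notation such as "1e5", which may or may not be what you want to treat as numeric.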
Related
I have several strings in the rough form:
[some text] [some number] [some more text]
I want to extract the text in [some number] using the Java Regex classes.
I know roughly what regular expression I want to use (though all suggestions are welcome). What I'm really interested in are the Java calls to take the regex string and use it on the source data to produce the value of [some number].
EDIT: I should add that I'm only interested in a single [some number] (basically, the first instance). The source strings are short and I'm not going to be looking for multiple occurrences of [some number].
Full example:
private static final Pattern p = Pattern.compile("^([a-zA-Z]+)([0-9]+)(.*)");
public static void main(String[] args) {
// create matcher for pattern p and given string
Matcher m = p.matcher("Testing123Testing");
// if an occurrence of the pattern was found in the given string...
if (m.find()) {
// ...then you can use group() methods.
System.out.println(m.group(0)); // whole matched expression
System.out.println(m.group(1)); // first expression from round brackets (Testing)
System.out.println(m.group(2)); // second one (123)
System.out.println(m.group(3)); // third one (Testing)
}
}
Since you're looking for the first number, you can use such regexp:
^\D+(\d+).*
and m.group(1) will return you the first number. Note that signed numbers can contain a minus sign:
^\D+(-?\d+).*
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Regex1 {
public static void main(String[]args) {
Pattern p = Pattern.compile("\\d+");
Matcher m = p.matcher("hello1234goodboy789very2345");
while(m.find()) {
System.out.println(m.group());
}
}
}
Output:
1234
789
2345
Allain basically has the java code, so you can use that. However, his expression only matches if your numbers are only preceded by a stream of word characters.
"(\\d+)"
should be able to find the first string of digits. You don't need to specify what's before it, if you're sure that it's going to be the first string of digits. Likewise, there is no use to specify what's after it, unless you want that. If you just want the number, and are sure that it will be the first string of one or more digits then that's all you need.
If you expect it to be set off by spaces, specifying that makes the match even more distinct:
"\\s+(\\d+)\\s+"
If you need all three parts, this will do:
"(\\D+)(\\d+)(.*)"
EDIT The Expressions given by Allain and Jack suggest that you need to specify some subset of non-digits in order to capture digits. If you tell the regex engine you're looking for \d then it's going to ignore everything before the digits. If J or A's expression fits your pattern, then the whole match equals the input string. And there's no reason to specify it. It probably slows a clean match down, if it isn't totally ignored.
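To illustrate the point that find() skips whatever precedes the digits, here is a small sketch. The class name FirstNumber and its method are invented for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FirstNumber {
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Returns the first run of digits, or null if the input contains none.
    static String first(String input) {
        Matcher m = DIGITS.matcher(input);
        // find() scans forward from the start, so nothing before the digits has to be described
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(first("Testing123Testing456")); // 123
    }
}
```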
In addition to Pattern, the Java String class also has several methods that can work with regular expressions, in your case the code will be:
"ab123abc".replaceFirst("\\D*(\\d*).*", "$1")
where \\D is a non-digit character.
In Java 1.4 and up:
String input = "...";
Matcher matcher = Pattern.compile("[^0-9]+([0-9]+)[^0-9]+").matcher(input);
if (matcher.find()) {
String someNumberStr = matcher.group(1);
// if you need this to be an int:
int someNumberInt = Integer.parseInt(someNumberStr);
}
This function collects all matching sequences from a string. In this example it takes all email addresses from the string.
static final String EMAIL_PATTERN = "[_A-Za-z0-9-\\+]+(\\.[_A-Za-z0-9-]+)*@"
+ "[A-Za-z0-9-]+(\\.[A-Za-z0-9]+)*(\\.[A-Za-z]{2,})";
public List<String> getAllEmails(String message) {
List<String> result = null;
Matcher matcher = Pattern.compile(EMAIL_PATTERN).matcher(message);
if (matcher.find()) {
result = new ArrayList<String>();
result.add(matcher.group());
while (matcher.find()) {
result.add(matcher.group());
}
}
return result;
}
For message = "adf@gmail.com, <another@osiem.osiem>>>> lalala@aaa.pl" it will create a List of 3 elements.
Try doing something like this:
Pattern p = Pattern.compile("^\\D+(\\d+).+"); // \\D+ instead of a greedy .+, which would leave only the last digit for the group
Matcher m = p.matcher("Testing123Testing");
if (m.find()) {
System.out.println(m.group(1));
}
Simple Solution
// Regexplanation:
// ^ beginning of line
// \\D+ 1+ non-digit characters
// (\\d+) 1+ digit characters in a capture group
// .* 0+ any character
String regexStr = "^\\D+(\\d+).*";
// Compile the regex String into a Pattern
Pattern p = Pattern.compile(regexStr);
// Create a matcher with the input String
Matcher m = p.matcher(inputStr);
// If we find a match
if (m.find()) {
// Get the String from the first capture group
String someDigits = m.group(1);
// ...do something with someDigits
}
Solution in a Util Class
public class MyUtil {
private static Pattern pattern = Pattern.compile("^\\D+(\\d+).*");
private static Matcher matcher = pattern.matcher("");
// Assumptions: inputStr is a non-null String
public static String extractFirstNumber(String inputStr){
// Reset the matcher with a new input String
matcher.reset(inputStr);
// Check if there's a match
if(matcher.find()){
// Return the number (in the first capture group)
return matcher.group(1);
}else{
// Return some default value, if there is no match
return null;
}
}
}
...
// Use the util function and print out the result
String firstNum = MyUtil.extractFirstNumber("Testing4234Things");
System.out.println(firstNum);
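You can also do it using StringTokenizer. Note that it treats every character of the delimiter string ("as:" below) as a separate delimiter: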
String str = "as:"+123+"as:"+234+"as:"+345;
StringTokenizer st = new StringTokenizer(str,"as:");
while(st.hasMoreTokens())
{
String k = st.nextToken(); // you will get first numeric data i.e 123
int kk = Integer.parseInt(k);
System.out.println("k string token in integer " + kk);
String k1 = st.nextToken(); // you will get second numeric data i.e 234
int kk1 = Integer.parseInt(k1);
System.out.println("new string k1 token in integer :" + kk1);
String k2 = st.nextToken(); // you will get third numeric data i.e 345
int kk2 = Integer.parseInt(k2);
System.out.println("k2 string token is in integer : " + kk2);
}
Since we are taking these numeric data into three different variables we can use this data anywhere in the code (for further use)
How about [^\\d]*([0-9]+[\\s]*[.,]{0,1}[\\s]*[0-9]*).* I think it would take care of numbers with fractional part.
I included white spaces and included , as possible separator.
I'm trying to get the numbers out of a string including floats and taking into account that the user might make a mistake and include white spaces while typing the number.
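A rough sketch of how that expression could be used to get a double out of sloppy input. The class name LooseNumberParser and the normalization step (dropping whitespace and turning a comma into a dot) are my own assumptions, not part of the expression above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LooseNumberParser {
    private static final Pattern LOOSE_NUMBER =
            Pattern.compile("[^\\d]*([0-9]+[\\s]*[.,]{0,1}[\\s]*[0-9]*).*");

    // Returns the first number as a Double, or null if the input has no digits.
    static Double parse(String input) {
        Matcher m = LOOSE_NUMBER.matcher(input);
        if (!m.matches()) {
            return null;
        }
        // Assumption: inner whitespace is a typing mistake and a comma is a decimal separator
        String normalized = m.group(1).replaceAll("\\s", "").replace(',', '.');
        return Double.parseDouble(normalized);
    }

    public static void main(String[] args) {
        System.out.println(parse("price: 12 , 5 EUR")); // 12.5
    }
}
```

Keep in mind that the pattern also glues together digit groups separated only by spaces, so "12 34" would come out as 1234.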
Sometimes you can use simple .split("REGEXP") method available in java.lang.String. For example:
String input = "first,second,third";
//To retrieve 'first'
input.split(",")[0]
//second
input.split(",")[1]
//third
input.split(",")[2]
If you are reading from a file, then this can help you:
try{
InputStream inputStream = (InputStream) mnpMainBean.getUploadedBulk().getInputStream();
BufferedReader br = new BufferedReader(new InputStreamReader(inputStream));
String line;
//Ref:03
while ((line = br.readLine()) != null) {
if (line.matches("[A-Z],\\d,(\\d*,){2}(\\s*\\d*\\|\\d*:)+")) {
String[] splitRecord = line.split(",");
//do something
}
else{
br.close();
//error
return;
}
}
br.close();
}
catch (IOException ioExpception){
logger.logDebug("Exception " + ioExpception.getStackTrace());
}
Pattern p = Pattern.compile("(\\D+)(\\d+)(.*)");
Matcher m = p.matcher("this is your number:1234 thank you");
if (m.find()) {
String someNumberStr = m.group(2);
int someNumberInt = Integer.parseInt(someNumberStr);
}
I am a newbie to Java and struggling with a possibly simple thing.
I have strings in different formats. An example string is given below
New_System-Updater-For-19974774.ftw
Basically I want to extract the number "19974774". For this I want to find the index of the ".", as there will be only one dot in the string, and then go back and extract the 8 characters before it.
Is there a simple way of doing it?
String s = "New_System-Updater-For-19974774.ftw";
int positionOfDot = s.indexOf('.');
String withoutDotFtw = s.substring(0,positionOfDot);
String number = s.substring(s.lastIndexOf("-")+1,positionOfDot);
You can try something like this
If you want the 8 characters before the dot, I suggest:
String tst = "New_System-Updater-For-19974774.ftw";
int indexOfDot = tst.indexOf(".");
String extract = tst.substring(indexOfDot-8, indexOfDot);
System.out.println(extract);
If the number is not always 8 digits long, use a regex:
Pattern pattern = Pattern.compile("(\\d+)\\.");
Matcher matcher = pattern.matcher(tst);
if(matcher.find()){
extract = matcher.group(1);
}
System.out.println(extract);
Try something like this:
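public static String getNumberFromName(String name) {
// Keep the part before the first dot, then split it on runs of non-digits;
// the last piece is the digit run right before the dot.
// Assumes the part before the dot ends with at least one digit.
String[] parts = name.split("\\.")[0].split("\\D+");
return parts[parts.length - 1];
}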
You can use the following code to retrieve the numbers you want.
Pattern pattern = Pattern.compile("\\D*(\\d*)\\D*");
Matcher matcher = pattern.matcher(s);
if (matcher.find()) {
System.out.println(matcher.group(1));
}
Assuming there is always a - before the digits this will work:
String initial = "New_System-Updater-For-19974774.ftw";
String firstSplit = initial.substring(initial.lastIndexOf("-") + 1);
String finalSplit = firstSplit.substring(0, firstSplit.indexOf("."));
Assuming the number of digits are fixed to 8, this will work:
String initial = "New_System-Updater-For-19974774.ftw";
String firstSplit = initial.substring(0, initial.lastIndexOf("."));
String finalSplit = firstSplit.substring(firstSplit.length()-8, firstSplit.length());
finalSplit will be your number.
This assumes that you want the number in front of the last period in your string. Ideally you should use a regex, but this may be simpler if regex is unfamiliar.
String string = "New_System-Updater-For-19974774.ftw";
String[] splitString = string.split("\\."); // split takes a regex, so the dot must be escaped
if(splitString.length > 1) {
String secondToLastString = splitString[splitString.length - 2]; //second to last string
int i = secondToLastString.length() - 1;
for(; i >= 0; i--) {
if(!Character.isDigit(secondToLastString.charAt(i))) {//If not digit break out and use that index as bound
break;
}
}
return secondToLastString.substring(i + 1);
} else {
throw new IllegalArgumentException("string doesn't have proper format");
}
Or just use a regex
String string = "New_System-Updater-For-19974774.ftw";
Matcher matcher = Pattern.compile("(\\d*)\\.[^\\.]*").matcher(string);
if(matcher.find()) {
return matcher.group(1);
}
I'm trying to replace certain letter pairs in a generic term (here called tampon).
Rules:
I want to replace "AM" with "AN", "EM" with "AN", "IM" with "IN", "OM" with "ON", "UM" with "UN" and "YM" with "IN".
I want to replace them only if they are followed by a consonant other than "M" or "N".
I also need to replace them when they stand alone or are at the end of the string.
I've tried some regexes but some of my tests still fail (5 of 18).
One failure is with "UMUMMUM": the test expects "UMUMMUM" but I get "UMUMMUN".
Here is my code now :
public class Phonom {
static String[] consonnant={"B","C","D","F","G","H","J","K","L","P","Q","R","S","T","V","W","X","Z",""};
public static String phonom1(final String tampon){
if (tampon == null){
return "";
}
if (tampon.isEmpty()){
return "";
}
int pos=tampon.indexOf("EM");
int pos1=tampon.indexOf("AM");
int pos2=tampon.indexOf("IM");
int pos3=tampon.indexOf("OM");
int pos4=tampon.indexOf("UM");
int pos5=tampon.indexOf("YM");
if(pos==tampon.length()-2 ||pos1==tampon.length()-2|pos2==tampon.length()-2
||pos3==tampon.length()-2||pos4==tampon.length()-2||pos5==tampon.length()-2){
String temp=tampon.replaceAll("AM","AN");
String temp1=temp.replaceAll("EM","AN");
String temp2=temp1.replaceAll("IM","IN");
String temp3=temp2.replaceAll("OM","ON");
String temp4=temp3.replaceAll("UM","UN");
String result=temp4.replaceAll("YM","IN");
return result;
}
String temp=tampon.replaceAll("AM[^AEIOUMNY]","AN");
String temp1=temp.replaceAll("EM[^AEIOUMNY]","AN");
String temp2=temp1.replaceAll("IM[^AEIOUMNY]","IN");
String temp3=temp2.replaceAll("OM[^AEIOUMNY]","ON");
String temp4=temp3.replaceAll("UM[^AEIOUMNY]","UN");
String result=temp4.replaceAll("YM[^AEIOUMNY]","IN");
return result;
}
}
You could have done this in one line if YM was replaced with YN not IN.
tampon.replaceAll("(?<=[AEIOUY])(M)(?![AEIOUYMN])", "N");
Because of the YM to IN rule you will need to use appendReplacement and appendTail instead. The below code uses a negative look ahead to ensure possible replacements aren't followed by a vowel, M or N. If the first group is a Y we replace the match with IN. If not we use a back reference to the character in group 1 and follow it with an N.
public class Phonom {
private static final Pattern PATTERN = Pattern.compile("([AEIOUY])(M)(?![AEIOUYMN])");
public static String phonom1(String tampon) {
Matcher m = PATTERN.matcher(tampon);
StringBuffer sb = new StringBuffer();
while (m.find()) {
if ("Y".equals(m.group(1))) {
m.appendReplacement(sb, "IN");
} else {
m.appendReplacement(sb, "$1N"); // $1 is the group reference; "\1" would be an octal character escape
}
}
m.appendTail(sb);
return sb.toString();
}
}
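The rules in the question also map "EM" to "AN", which a plain group reference does not cover. A possible variation, assuming the rules exactly as written in the question, picks the replacement per vowel. The class name PhonomSketch is made up, and the sketch does not try to resolve the "UMUMMUM" expectation: like the code above, it still replaces a pair at the end of the string.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PhonomSketch {
    // A vowel followed by M, where the M is not followed by a vowel, M or N
    private static final Pattern NASAL = Pattern.compile("([AEIOUY])M(?![AEIOUYMN])");

    static String phonom1(String tampon) {
        if (tampon == null) {
            return "";
        }
        Matcher m = NASAL.matcher(tampon);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String vowel = m.group(1);
            String replacement;
            switch (vowel) {
                case "E": replacement = "AN"; break; // EM -> AN
                case "Y": replacement = "IN"; break; // YM -> IN
                default:  replacement = vowel + "N"; // AM, IM, OM, UM keep their vowel
            }
            m.appendReplacement(sb, replacement);
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(phonom1("EMPIRE"));  // ANPIRE
        System.out.println(phonom1("SYMBOLE")); // SINBOLE
    }
}
```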
In all other related posts people asked about removing leading or/and trailing spaces in String java. Now, my question is how to get the leading or trailing spaces? What I could think is for example to get the trailing spaces using such a function:
private String getTrailingSpaces(String str) {
return str.replace(str.replaceFirst("\\s+$", ""), "");
}
But I'm not sure if this is correct or even if it is, is there any better way to do this?
I'm not an expert, but I think you should use two distinct regexes, one for leading spaces and one for trailing spaces.
private static final Pattern LEADING = Pattern.compile("^\\s+");
private static final Pattern TRAILING = Pattern.compile("\\s+$");
public String getLeadingSpaces(String str) {
Matcher m = LEADING.matcher(str);
if(m.find()){
return m.group(0);
}
return "";
}
public String getTrailingSpaces(String str) {
Matcher m = TRAILING.matcher(str);
if(m.find()){
return m.group(0);
}
return "";
}
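Try this:
// Note: if str has no leading (or trailing) whitespace, the pattern does not match and the whole string is returned unchanged.
String leading = str.replaceAll("^(\\s+).+", "$1");
String trailing = str.replaceAll(".+?(\\s+)$", "$1");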
Try something like this:
private String getTrailingSpaces(String str) {
Pattern p = Pattern.compile(".*?(\\s+)$"); // reluctant .*? so the group captures all trailing whitespace, not just the last character
Matcher m = p.matcher(str);
String trailing = "";
if (m.matches()) {
trailing = m.group(1);
}
return trailing;
}
Try this. You would need ArrayUtils from Apache Commons though.
String str1 = " abc ";
char[] cs = str1.toCharArray();
Character[] characters = ArrayUtils.toObject(cs);
// System.out.println(characters.length); // 6
for (Character character : characters) {
if (Character.isWhitespace(character.charValue())) { // isSpace is deprecated; isWhitespace is the static replacement
// PROCESS ACCORDINGLY
} else {
// PROCESS ACCORDINGLY
}
}
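If you would rather avoid both regexes and an extra library, a plain index scan is another option. This is only a sketch; the class and method names are invented, and Character.isWhitespace may classify some Unicode characters differently from the regex \s.

```java
class WhitespaceEdges {
    // Returns the run of whitespace at the start of s ("" if there is none).
    static String leading(String s) {
        int i = 0;
        while (i < s.length() && Character.isWhitespace(s.charAt(i))) {
            i++;
        }
        return s.substring(0, i);
    }

    // Returns the run of whitespace at the end of s ("" if there is none).
    static String trailing(String s) {
        int i = s.length();
        while (i > 0 && Character.isWhitespace(s.charAt(i - 1))) {
            i--;
        }
        return s.substring(i);
    }

    public static void main(String[] args) {
        String s = "  abc \t";
        System.out.println("[" + leading(s) + "]");  // the two leading spaces in brackets
        System.out.println("[" + trailing(s) + "]"); // a space and a tab in brackets
    }
}
```

For a string that consists only of whitespace, both methods return the whole string.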
I am trying to censor specific strings, and patterns within my application but my matcher doesn't seem to be finding any results when searching for the Pattern.
public String censorString(String s) {
System.out.println("Censoring... "+ s);
if (findPatterns(s)) {
System.out.println("Found pattern");
for (String censor : foundPatterns) {
for (int i = 0; i < censor.length(); i++)
s.replace(censor.charAt(i), (char)42);
}
}
return s;
}
public boolean findPatterns(String s) {
for (String censor : censoredWords) {
Pattern p = Pattern.compile("(.*)["+censor+"](.*)");//regex
Matcher m = p.matcher(s);
while (m.find()) {
foundPatterns.add(censor);
return true;
}
}
return false;
}
At the moment I'm focusing on just the one pattern, if the censor is found in the string. I've tried many combinations and none of them seem to return "true".
"(.*)["+censor+"](.*)"
"(.*)["+censor+"]"
"["+censor+"]"
"["+censor+"]+"
Any help would be appreciated.
Usage: My censored words are "hello", "goodbye"
String s = "hello there, today is a fine day."
System.out.println(censorString(s));
is supposed to print " ***** today is a fine day. "
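Your regex does find a match (although ["+censor+"] is a character class that matches any one letter of the word, not the whole word). The problem is here:
s.replace(censor.charAt(i), (char)42);
Strings in Java are immutable, so this line does not modify s: replace returns a new String, and here that result is discarded. Please check the Javadoc for String.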
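A tiny illustration of why that line has no visible effect. The class and method names are made up for the demo.

```java
class ImmutableStringDemo {
    static String censorIgnoringResult(String s) {
        s.replace('l', '*');     // builds a new String that is immediately thrown away
        return s;                // still the original text
    }

    static String censorKeepingResult(String s) {
        s = s.replace('l', '*'); // reassign to keep the replaced copy
        return s;
    }

    public static void main(String[] args) {
        System.out.println(censorIgnoringResult("hello")); // hello
        System.out.println(censorKeepingResult("hello"));  // he**o
    }
}
```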
Please find below the program which will do what you intend to do. I removed your findpattern method and just used the replaceall with regex in String API. Hope this helps.
public class Regex_SO {
private String[] censoredWords = new String[]{"hello"};
/**
* @param args the command line arguments
*/
public static void main(String[] args) {
Regex_SO regex_SO = new Regex_SO();
regex_SO.censorString("hello there, today is a fine day. hello again");
}
public String censorString(String s) {
System.out.println("Censoring... "+ s);
for(String censoredWord : censoredWords){
String replaceStr = "";
for(int index = 0; index < censoredWord.length();index++){
replaceStr = replaceStr + "*";
}
s = s.replaceAll(censoredWord, replaceStr);
}
System.out.println("Censored String is .. " + s);
return s;
}
}
Since this seems like homework I can't give you working code, but here are a few pointers:
consider using the regex \\b(word1|word2|word3)\\b to find specific words
to create a char representing * you can write '*'; don't use (char)42, to avoid magic numbers
to create a new string that has the same length as the old string but is filled with one specific character, you can use String newString = oldString.replaceAll(".","*")
to replace each match on the fly with a new value you can use the appendReplacement and appendTail methods of the Matcher class. Here is how code using them should look:
StringBuffer sb = new StringBuffer();//buffer for string with replaced values
Pattern p = Pattern.compile(yourRegex);
Matcher m = p.matcher(yourText);
while (m.find()){
String match = m.group(); //this will represent current match
String newValue = ...; //here you need to decide how to replace it
m.appendReplacement(sb, newValue);
}
m.appendTail(sb);
String censoredString = sb.toString();
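For reference, one way the pointers above might fit together is sketched below. The class name WordCensor, the method signature and the use of Pattern.quote are my own choices for the example, and matching is case-sensitive.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class WordCensor {
    static String censor(String text, List<String> censoredWords) {
        if (censoredWords.isEmpty()) {
            return text;
        }
        // Build \b(word1|word2|...)\b, quoting each word so it is matched literally
        String alternatives = censoredWords.stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        Matcher m = Pattern.compile("\\b(" + alternatives + ")\\b").matcher(text);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String stars = m.group().replaceAll(".", "*"); // same length, all asterisks
            m.appendReplacement(sb, Matcher.quoteReplacement(stars));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("hello", "goodbye");
        System.out.println(censor("hello there, today is a fine day.", words));
        // prints: ***** there, today is a fine day.
    }
}
```

The word boundaries keep a word like "othello" from being censored just because it contains "hello".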