Replace multiple non-digit chars with a single non-digit char - Java

I am working on an app that reads the weight value from a weighing indicator. The indicator's output contains symbols, non-digit characters, and numbers. I just want to extract the numbers. I have already turned the non-digits and symbols into pipes using the regex \D. Now I want to turn this string
||||||||||1234||||||||||||||1234||||||||||||||1234||||||||||||||1234||||||||||||||1234||||
into
|1234|1234|1234|1234|1234
How could I possibly do that?

You could try a regex replacement:
String input = "||||||||||1234||||||||||||||1234||||||||||||||1234||||||||||||||1234||||||||||||||1234||||";
String output = input.replaceAll("\\|+", "|").replaceAll("\\|$", "");
System.out.println(output); // |1234|1234|1234|1234|1234
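If the raw indicator output is still available, the two steps can possibly be merged, since \D+ collapses each run of non-digits into one pipe directly. A minimal sketch, where the class name, method name, and sample reading are made up for illustration (the real indicator format is not shown in the question):

```java
class CollapseNonDigits {
    // Replaces each run of non-digits with one pipe, then drops a trailing pipe.
    static String collapse(String raw) {
        return raw.replaceAll("\\D+", "|").replaceAll("\\|$", "");
    }

    public static void main(String[] args) {
        // Hypothetical raw reading, assumed only for this example.
        System.out.println(collapse("ST,GS,+1234kg\r\nST,GS,+1234kg\r\n")); // |1234|1234
    }
}
```

A leading pipe remains whenever the input starts with a non-digit, which matches the shape asked for in the question.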

Related

How can we replace the digits and special characters in a String with a given character?

I have a Java program that finds the digits and special characters in a String. In the same way, I want to replace the digits and special characters with the character 'X'.
This is for Windows 7.
String aplhaonly = s.replaceAll("[^a-zA-Z]+", " ");
String aplhaDigit = s.replaceAll("[^a-zA-Z0-9]+", " ");
If you need to replace digits and symbols with "X", you probably want to use this:
s.replaceAll("[^a-zA-Z]+", "X");
s.replaceAll("[^a-zA-Z0-9]+", "X");
The second argument you put in replaceAll() is the replacement.
If you need to strip digits and symbols only when the String contains "X", you can use:
if (s.contains("X")) {
    // Strings are immutable, so the result must be assigned back.
    s = s.replaceAll("[^a-zA-Z]+", "");
}
For both digits and special characters:
System.out.println(a.replaceAll("[*^%$##!&0-9]","X"));
For special characters only:
System.out.println(a.replaceAll("[*^%$##!&]","X"));
For digits only:
System.out.println(a.replaceAll("[0-9]","X"));
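One detail that may matter here: with a + quantifier, a whole run of unwanted characters becomes a single X, while without it every character gets its own X. A small sketch of the difference, with a made-up class name and method names:

```java
class MaskNonLetters {
    // One X per non-letter character.
    static String maskEach(String s) {
        return s.replaceAll("[^a-zA-Z]", "X");
    }

    // One X per run of consecutive non-letter characters.
    static String maskRuns(String s) {
        return s.replaceAll("[^a-zA-Z]+", "X");
    }

    public static void main(String[] args) {
        System.out.println(maskEach("ab12#c")); // abXXXc
        System.out.println(maskRuns("ab12#c")); // abXc
    }
}
```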

Eliminating Unicode Characters and Escape Characters from String

I want to remove all Unicode characters and escape characters such as \n and \t. In short, I want just the alphanumeric string.
For example :
\u2029My Actual String\u2029
\nMy Actual String\n
I want to fetch just 'My Actual String'. Is there any way to do so, either by using a built in string method or a Regular Expression ?
Try
String stg = "\u2029My Actual String\u2029 \nMy Actual String";
Pattern pat = Pattern.compile("(?!(\\\\(u|U)\\w{4}|\\s))(\\w)+");
Matcher mat = pat.matcher(stg);
String out = "";
while (mat.find()) {
    out += mat.group() + " ";
}
System.out.println(out);
The regex matches runs of word characters (\w), which leaves out the Unicode separator and the whitespace escape characters.
Output:
My Actual String My Actual String
Try this:
anyString = anyString.replaceAll("\\\\u[0-9a-fA-F]{4}|\\\\.", "");
to remove escaped sequences (a backslash followed by u and four hex digits, or by any single character). If you also want to remove all other special characters, use this one:
anyString = anyString.replaceAll("\\\\u[0-9a-fA-F]{4}|\\\\.|[^a-zA-Z0-9\\s]", "");
(I guess you want to keep the whitespace; if not, remove \\s from the one above.)
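The patterns above target a literal backslash in the text. If the string instead holds the actual characters (a real U+2029 or a real newline), a whitelist may be simpler. A minimal sketch under that assumption, keeping only ASCII letters, digits, and spaces; the class and method names are hypothetical:

```java
class KeepAlphanumeric {
    // Drops everything except ASCII letters, digits, and plain spaces, then trims the ends.
    static String clean(String s) {
        return s.replaceAll("[^a-zA-Z0-9 ]", "").trim();
    }

    public static void main(String[] args) {
        System.out.println(clean("\u2029My Actual String\u2029")); // My Actual String
        System.out.println(clean("\nMy Actual String\n"));         // My Actual String
    }
}
```

Note that this also removes accented letters and other non-ASCII text, which may or may not be what you want.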

How to parse a string by words

I need to parse a string by extracting all of its words. So far I have figured out how to split the string into words containing any symbols. But how do I rewrite the code to discard words that contain digits or other characters? Here is my code:
String s = "AaA bbd cDef d1s s/4 +xx_x asdgag 34545rtrtr.";
Pattern p = Pattern.compile("\\b[A-Za-z]+\\b");
System.out.println(Arrays.asList(s.split(p.pattern())));
Not valid words:
“d1s”, “s/4”, “+xx_x”, “34545rtrtr.”
Appropriate words:
“AaA”, “bbd”, “cDef”, “asdgag”
Try something like:
"\\b[A-Za-z]+\\b"
Where,
\b marks a word boundary.
[A-Za-z] means every letter, upper or lower case
+ means "one or more".
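Note that \b also sees a boundary between s and / in "s/4", so the pattern above can still pick up that lone s. One possible alternative is to require whitespace (or the string edge) on both sides and collect matches with Matcher.find() rather than split(). This is a sketch, not the answer's method; the class name and the lookaround pattern are assumptions:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LetterOnlyWords {
    // Letters only, with no non-whitespace character directly before or after.
    private static final Pattern WORD = Pattern.compile("(?<!\\S)[A-Za-z]+(?!\\S)");

    static List<String> extract(String s) {
        List<String> words = new ArrayList<>();
        Matcher m = WORD.matcher(s);
        while (m.find()) {
            words.add(m.group());
        }
        return words;
    }

    public static void main(String[] args) {
        String s = "AaA bbd cDef d1s s/4 +xx_x asdgag 34545rtrtr.";
        System.out.println(extract(s)); // [AaA, bbd, cDef, asdgag]
    }
}
```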

Parse and remove special characters in java regex

So we were looking at some of the other regex posts and we are having trouble removing a special case in one instance; the special character is in the beginning of the word.
We have the following line in our code:
String k = s.replaceAll("([a-z]+)[()?:!.,;]*", "$1");
where s is a singular word. For example, when parsing the sentence "(hi hi hi)" by tokenizing it, and then performing the replaceAll function on each token, we get an output of:
(hi
hi
hi
What are we missing in our regex?
You can use an easier approach - replace the characters that you do not want with spaces:
String k = s.replaceAll("[()?:!.,;]+", " ");
Position matters, so you need to match the excluded characters before the capturing group as well:
String k = s.replaceAll("[()?:!.,;]*([a-z]+)[()?:!.,;]*", "$1");
Your replacement only removed the "special chars" after the [a-z]+, which is why the ( before hi is left there.
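To see the difference side by side, here is a small sketch that applies the question's pattern and the corrected one to the same tokens. The class and method names are made up for the example:

```java
class StripAroundWord {
    // Pattern from the question: only strips punctuation after the word.
    static String trailingOnly(String s) {
        return s.replaceAll("([a-z]+)[()?:!.,;]*", "$1");
    }

    // Corrected pattern: strips punctuation on both sides of the word.
    static String bothSides(String s) {
        return s.replaceAll("[()?:!.,;]*([a-z]+)[()?:!.,;]*", "$1");
    }

    public static void main(String[] args) {
        System.out.println(trailingOnly("(hi")); // (hi
        System.out.println(bothSides("(hi"));    // hi
        System.out.println(bothSides("hi)"));    // hi
    }
}
```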
If you know s is a single word, you could use either:
String k = s.replaceAll("\\W*(\\w+)\\W*", "$1");
or
String k = s.replaceAll("\\W*", "");
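If punctuation inside a token should survive (an apostrophe, for instance), another option might be to anchor the match to the two ends of the token. A minimal sketch of that idea; the class name and the sample tokens are assumptions:

```java
class TrimNonWordEdges {
    // Removes non-word characters only at the start and at the end of the token.
    static String trim(String token) {
        return token.replaceAll("^\\W+|\\W+$", "");
    }

    public static void main(String[] args) {
        System.out.println(trim("(hi"));      // hi
        System.out.println(trim("(don't!)")); // don't
    }
}
```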
This can be simpler. Try this:
String oldString = "Hi There ##$ What is %#your name?##$##$ 0123$$";
System.out.println(oldString.replaceAll("[\\p{Punct}\\s\\d]+", " "));
Output:
Hi There What is your name
Because the character class includes \d, digits are removed as well; drop \\d from the pattern to keep them.
.replaceAll("[\\p{Punct}\\s\\d]+", " ");
replaces every run of punctuation, whitespace, and digits with a single space. \p{Punct} covers the ASCII punctuation characters, which includes almost all of the special characters.

Regex to get first number in string with other characters

I'm new to regular expressions, and was wondering how I could get only the first number in a string like 100 2011-10-20 14:28:55. In this case, I'd want it to return 100, but the number could also be shorter or longer.
I was thinking about something like [0-9]+, but it matches every number separately (100, 2011, 10, ...).
Thank you.
/^[^\d]*(\d+)/
This will start at the beginning, skip any non-digits, and match the first sequence of digits it finds.
EDIT:
This regex will match the first group of digits but, as pointed out in other answers, a parseInt that stops at the first non-digit (such as JavaScript's) is a better solution if you know the number is at the beginning of the string.
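In Java, the same pattern could be applied roughly as follows. This is a sketch rather than the answer's own code; the class name and the Optional return type are choices made for the example:

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FirstNumber {
    // Skip leading non-digits, then capture the first run of digits.
    private static final Pattern FIRST = Pattern.compile("^\\D*(\\d+)");

    static Optional<String> find(String s) {
        Matcher m = FIRST.matcher(s);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(find("100 2011-10-20 14:28:55").orElse("none")); // 100
    }
}
```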
Try this to match the first standalone number in a string (it need not be at the beginning of the string):
String s = "2011-10-20 525 14:28:55 10";
Pattern p = Pattern.compile("(^|\\s)([0-9]+)($|\\s)");
Matcher m = p.matcher(s);
if (m.find()) {
    System.out.println(m.group(2));
}
Just
([0-9]+) .*
If you always have a space after the first number, this will work.
Assuming there's always a space between the first two numbers, then
preg_match('/^(\d+)/', $number_string, $matches);
$number = $matches[1]; // 100
But for something like this, you'd be better off using simple string operations:
$space_pos = strpos($number_string, ' ');
$number = substr($number_string, 0, $space_pos);
Regexes carry more overhead than simple string operations, so prefer the string operations when they are sufficient.
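The code below would do the trick, as long as a space follows the first number. Integer.parseInt throws a NumberFormatException on the whole string, so parse only the first token:
String s = "100 2011-10-20 14:28:55";
int num = Integer.parseInt(s.substring(0, s.indexOf(' '))); // 100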
[0-9] matches any digit from 0 to 9, and + means one or more times. [0-9]{3} matches exactly three digits.
Try ^(?'num'[0-9]+).*$, which anchors at the beginning, reads a number, stores it in the group 'num', and consumes the remainder without capturing it. The (?'num'...) spelling is .NET syntax; in Java the named group is written (?<num>...).
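For reference, a Java version of the named-group idea might look like this. It is a sketch that assumes Java 7 or later (where named groups are available); the class and method names are hypothetical:

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NamedGroupFirstNumber {
    // Java spells the named group as (?<num>...).
    private static final Pattern LEADING = Pattern.compile("^(?<num>[0-9]+).*$");

    static Optional<String> leadingNumber(String s) {
        Matcher m = LEADING.matcher(s);
        return m.matches() ? Optional.of(m.group("num")) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(leadingNumber("100 2011-10-20 14:28:55").orElse("none")); // 100
    }
}
```

Unlike the earlier patterns, this one only matches when the string begins with a digit.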
This C# string extension works even when the string does not start with a number.
It returns 1234 in each of these cases: "1234asdfwewf", "%sdfsr1234", "## # 1234"
// Requires "using System.Linq;" and must be declared inside a static class.
public static string GetFirstNumber(this string source)
{
    if (string.IsNullOrEmpty(source))
    {
        return source;
    }
    // Skip the leading non-digits, then take the first run of digits.
    return new string(source.SkipWhile(c => !char.IsDigit(c)).TakeWhile(char.IsDigit).ToArray());
}
NOTE: In Java, when you define the patterns as string literals, do not forget to use double backslashes to define a regex escaping backslash (\. = "\\.").
To get the number that appears at the start of a string, you may consider using
^[0-9]*\.?[0-9]+ # Float or integer, leading digit may be missing (e.g, .35)
^-?[0-9]*\.?[0-9]+ # Optional - before number (e.g. -.55, -100)
^[-+]?[0-9]*\.?[0-9]+ # Optional + or - before number (e.g. -3.5, +30)
If you want to also match numbers with scientific notation at the start of the string, use
^[0-9]*\.?[0-9]+([eE][+-]?[0-9]+)? # Just number
^-?[0-9]*\.?[0-9]+([eE][+-]?[0-9]+)? # Number with an optional -
^[-+]?[0-9]*\.?[0-9]+([eE][+-]?[0-9]+)? # Number with an optional - or +
To make sure there is no other digit on the right, add a \b word boundary, or a (?!\d) or (?!\.?\d) negative lookahead that will fail the match if there is any digit (or . and a digit) on the right.
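As a rough illustration of how the signed, optionally scientific pattern could be used from Java, here is a sketch that extracts the leading number and converts it with Double.parseDouble. The class name, method name, and Optional return type are assumptions made for the example:

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LeadingDecimal {
    // Optional sign, optional integer part, optional dot, digits, optional exponent.
    private static final Pattern NUMBER =
            Pattern.compile("^[-+]?[0-9]*\\.?[0-9]+(?:[eE][-+]?[0-9]+)?");

    static Optional<Double> parse(String s) {
        Matcher m = NUMBER.matcher(s);
        return m.find() ? Optional.of(Double.parseDouble(m.group())) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(parse("-3.5kg"));   // Optional[-3.5]
        System.out.println(parse("no value")); // Optional.empty
    }
}
```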
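// Requires: import java.util.Scanner; import java.util.regex.Matcher; import java.util.regex.Pattern;
public static void main(String[] args) {
    Scanner s = new Scanner(System.in);
    String str = s.nextLine();
    Pattern p = Pattern.compile("[0-9]+");
    Matcher m = p.matcher(str);
    // Prints every number in the line; use "if" instead of "while" to print only the first.
    while (m.find()) {
        System.out.println(m.group());
    }
}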
\d+
\d stands for any decimal digit, while + extends the match to every digit that follows directly, stopping at the first non-digit character such as a space or a letter.
