I am reading text from an Aadhaar card and a PAN card, and I am getting the string below.
छ र्णां ;; _,
ज्यो
हें ष्ठछ्येप् ऽठमांआ (38७/8र्क्स "
,; जन्म वर्ष / 78६" ०1‘8६र्णीग् : 1992 खा … खा
पुरुष‘ “'व्'व् हैंप्टेंग्‘
` हु; ";:ळुहुं क्रौं र्मं’फु. ‘_य्; ,; ळु
` हं ` .म्च्हें :: "…. 'दृर्दु‘ऱ्क्ष्क्त
» ॰ -। "' ॰॰ ’ '|’ ""
8471 2211 6099 ,_
I have two tasks:
1. Detect whether it contains an Aadhaar card number.
2. If it does, extract that number.
Code I have tried:
String data = "b dn b fsd b fsd 6666 8888 9999 bsnfbsdb";
Pattern p = Pattern.compile( "^[a-zA-Z ]*\\d{4}\\s\\d{4}\\s\\d{4}[a-zA-Z ]*$" );
Matcher m = p.matcher( data );
if ( m.find() ) {
String s = m.group(0);
System.out.println(s);
}
But it is not working: I am getting the whole string.
Is there a better way to do this, or am I doing something wrong?
Thanks in advance.
You may use
(?<!\d)\d{4}(?:\s\d{4}){2}(?!\d)
See the regex demo.
Details
(?<!\d) - no digit immediately to the left is allowed
\d{4} - four digits
(?:\s\d{4}){2} - two repetitions of a whitespace and four digits
(?!\d) - no digit immediately to the right is allowed
See the Java demo:
String data = "b dn b fsd b fsd 6666 8888 9999 bsnfbsdb";
Pattern p = Pattern.compile( "(?<!\\d)\\d{4}(?:\\s\\d{4}){2}(?!\\d)" );
Matcher m = p.matcher( data );
String s = "";
if ( m.find() ) {
s = m.group(0);
}
System.out.println("Result: " + s); // => Result: 6666 8888 9999
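Both tasks from the question (detect, then extract) can be covered by collecting every match into a list. The following is only a sketch: the class name `AadhaarFinder`, the method `findAll`, and the shortened sample text are made up for illustration. It checks the layout of the number only, not whether the number is valid. By default `\d` in Java matches ASCII digits only, so Devanagari digits in the OCR output are not counted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AadhaarFinder {
    // Same pattern as above: three groups of four digits separated by single
    // whitespace characters, with no digit directly before or after.
    private static final Pattern AADHAAR =
            Pattern.compile("(?<!\\d)\\d{4}(?:\\s\\d{4}){2}(?!\\d)");

    // Returns every candidate found in the text, in order of appearance.
    static List<String> findAll(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = AADHAAR.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // Hypothetical, shortened stand-in for the OCR output.
        String ocr = "Year of birth : 1992 ...\n8471 2211 6099 ,_";
        List<String> numbers = findAll(ocr);
        if (numbers.isEmpty()) {
            System.out.println("No number found");
        } else {
            System.out.println("Found: " + numbers.get(0));
        }
    }
}
```

An empty list answers the first task (no number present), and the first element answers the second.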
I would like to get groups from a string that is loaded from txt file. This file looks something like this (notice the space at the beginning of file):
as431431af,87546,3214| 5a341fafaf,3365,54465 | 6adrT43 , 5678 , 5655
The first part of each record, up to the first comma, can contain digits and letters; the second and third parts contain only digits. After each |, the pattern repeats.
First, I load the txt file into a string:
String readFile3 = readFromTxtFile("/resources/file.txt");
Then I remove all whitespace with a regex:
String no_whitespace = readFile3.replaceAll("\\s+", "");
After that I try to get the groups:
Pattern p = Pattern.compile("[a-zA-Z0-9]*,\\d*,\\d*", Pattern.MULTILINE);
Matcher m = p.matcher(no_whitespace);
int lastMatchPos = 0;
while (m.find()) {
System.out.println(m.group());
lastMatchPos = m.end();
}
if (lastMatchPos != no_whitespace.length())
System.out.println("Invalid string!");
Now I would like to remove the "," from each group and store every value in its own variable, but I am getting these groups (notice the null at the start):
nullas431431af,87546,3214
5a341fafaf,3365,54465
6adrT43,5678,5655
What am I doing wrong? Even when I physically remove the space from the beginning of the txt file, the same result occurs.
Is there an easier way to get the groups in this string with a regex and store each comma-separated part in its own variable?
You can split on | surrounded by optional whitespace, and then split each resulting item on , surrounded by optional whitespace:
String str = "as431431af,87546,3214| 5a341fafaf,3365,54465 | 6adrT43 , 5678 , 5655";
String[] items = str.split("\\s*\\|\\s*");
List<String[]> res = new ArrayList<>();
for(String i : items) {
String[] parts = i.split("\\s*,\\s*");
res.add(parts);
System.out.println(parts[0] + " - " + parts[1] + " - " + parts[2]);
}
See the Java demo, which prints:
as431431af - 87546 - 3214
5a341fafaf - 3365 - 54465
6adrT43 - 5678 - 5655
The results are in the res list.
Note that
\s* - matches zero or more whitespaces
\| - matches a pipe char
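The question's input starts with a space and each field has a fixed format, so the two-step split can be combined with a trim and a per-field check. This is a minimal sketch under those assumptions; the class `RecordSplitter` and the method `parse` are illustrative names, not part of the original answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class RecordSplitter {
    // Splits "id,num,num | id,num,num | ..." into records of three fields.
    static List<List<String>> parse(String raw) {
        List<List<String>> records = new ArrayList<>();
        // trim() first: the split patterns only absorb whitespace around the
        // delimiters, not at the very start or end of the input.
        for (String item : raw.trim().split("\\s*\\|\\s*")) {
            String[] parts = item.split("\\s*,\\s*");
            boolean ok = parts.length == 3
                    && parts[0].matches("[a-zA-Z0-9]+")
                    && parts[1].matches("\\d+")
                    && parts[2].matches("\\d+");
            if (!ok) {
                throw new IllegalArgumentException("Invalid record: '" + item + "'");
            }
            records.add(Arrays.asList(parts));
        }
        return records;
    }

    public static void main(String[] args) {
        String str = " as431431af,87546,3214| 5a341fafaf,3365,54465 | 6adrT43 , 5678 , 5655";
        for (List<String> record : parse(str)) {
            System.out.println(String.join(" - ", record));
        }
    }
}
```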
The pattern that you tried uses only the optional quantifier *, so it could also match just the commas (",,").
You also don't need Pattern.MULTILINE, as there are no anchors in the pattern.
You can use three capture groups with + as the quantifier to match one or more occurrences, and after each record either match a pipe | or assert the end of the string with $:
([a-zA-Z0-9]+),([0-9]+),([0-9]+)(?:\||$)
Regex demo | Java demo
For example
String readFile3 = "as431431af,87546,3214| 5a341fafaf,3365,54465 | 6adrT43 , 5678 , 5655";
String no_whitespace = readFile3.replaceAll("\\s+", "");
Pattern p = Pattern.compile("([a-zA-Z0-9]+),([0-9]+),([0-9]+)(?:\\||$)");
Matcher matcher = p.matcher(no_whitespace);
while (matcher.find()) {
for (int i = 1; i <= matcher.groupCount(); i++) {
System.out.println(matcher.group(i));
}
System.out.println("--------------------------------");
}
Output
as431431af
87546
3214
--------------------------------
5a341fafaf
3365
54465
--------------------------------
6adrT43
5678
5655
--------------------------------
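The question also checked whether the whole string was consumed. One way to keep that check, sketched here with the hypothetical class `StrictRecordMatcher`, is to prefix the pattern with \G so that each match has to start exactly where the previous one ended. The input is assumed to have had its whitespace removed already, as in the example above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class StrictRecordMatcher {
    // \G anchors each match to the end of the previous one (or to the start
    // of the input for the first match), so no characters can be skipped.
    private static final Pattern RECORD =
            Pattern.compile("\\G([a-zA-Z0-9]+),([0-9]+),([0-9]+)(?:\\||$)");

    static List<List<String>> parse(String noWhitespace) {
        List<List<String>> records = new ArrayList<>();
        Matcher m = RECORD.matcher(noWhitespace);
        int lastEnd = 0;
        while (m.find()) {
            records.add(Arrays.asList(m.group(1), m.group(2), m.group(3)));
            lastEnd = m.end();
        }
        // Anything left over after the last match means the input is malformed.
        if (lastEnd != noWhitespace.length()) {
            throw new IllegalArgumentException("Invalid input at index " + lastEnd);
        }
        return records;
    }

    public static void main(String[] args) {
        System.out.println(parse("as431431af,87546,3214|5a341fafaf,3365,54465"));
    }
}
```

Note that this sketch still accepts a trailing |, because the pipe is consumed as part of the last record.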
I'm writing Scala code which splits a line based on a colon (:).
Example, for an input which looked like:
sparker0i#outlook.com : password
I was doing line.split(" : ") (which is essentially Java) and printing the email and the password on Console.
Now my requirement has changed and now a line will look like:
(sparker0i#outlook.com,sparker0i) : password
I want to print the email, username and password separately.
I've tried a regex that first splits on the parentheses, but it didn't work because the pattern is not correct (val lt = line.split("[\\\\(||//)]")). Please guide me to the correct regex/split logic.
I'm not a scala user, but instead of split, I think you can use Pattern and matcher to extract this info, your regex can use groups like:
\((.*?),(.*?)\) : (.*)
regex demo
Then you can extract group 1 for email, group 2 for username and the 3rd group for password.
val input = "(sparker0i#outlook.com,sparker0i) : password"
val pattern = """\((.*?),(.*?)\) : (.*)""".r
pattern.findAllIn(input).matchData foreach {
m => println(m.group(1) + " " + m.group(2) + " " + m.group(3))
}
Credit for this post https://stackoverflow.com/a/3051206/5558072
The regex I would use:
\((.*?),([^)]+)\) : (.+)
Regex Demo
\( # matches (
( # start of capture group 1
.*? # lazily captures 0 or more characters, up to the following ,
) # end of capture group 1
, # matches ,
( # start of capture group 2
[^)]+ # captures one or more characters that are not a )
) # end of capture group 2
\) # matches )
 :  # matches ' : ' (space, colon, space)
( # start of capture group 3
.+ # captures the rest of the string
) # end of capture group 3
The Java implementation would be:
import java.util.regex.Pattern;
import java.util.regex.Matcher;
public class Test
{
public static void main(String[] args) {
String s = "(sparker0i#outlook.com,sparker0i) : password";
Pattern pattern = Pattern.compile("\\((.*?),([^)]+)\\) : (.+)");
Matcher m = pattern.matcher(s);
if (m.matches()) {
System.out.println(m.group(1));
System.out.println(m.group(2));
System.out.println(m.group(3));
}
}
}
Prints:
sparker0i#outlook.com
sparker0i
password
Java Demo
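As a variation on the same regex, Java's named groups let the code refer to the parts by name instead of by index. This is only a sketch; the class `CredentialLine` and the group names `email`, `user` and `password` are chosen here for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CredentialLine {
    // Same pattern as above, with (?<name>...) in place of the plain groups.
    private static final Pattern LINE = Pattern.compile(
            "\\((?<email>.*?),(?<user>[^)]+)\\) : (?<password>.+)");

    // Returns {email, username, password}, or null if the line has another shape.
    static String[] parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] {m.group("email"), m.group("user"), m.group("password")};
    }

    public static void main(String[] args) {
        String[] parts = parse("(sparker0i#outlook.com,sparker0i) : password");
        if (parts != null) {
            for (String part : parts) {
                System.out.println(part);
            }
        }
    }
}
```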
In Scala 2.13, there is a simple solution without regex:
Welcome to Scala 2.13.1 (OpenJDK 64-Bit Server VM, Java 1.8.0_222).
Type in expressions for evaluation. Or try :help.
scala> val input = "(sparker0i#outlook.com,sparker0i) : password"
input: String = (sparker0i#outlook.com,sparker0i) : password
scala> val s"($mail,$user) : $pwd" = input
mail: String = sparker0i#outlook.com
user: String = sparker0i
pwd: String = password
This works without changing much:
String s = "(sparker0i#outlook.com,sparker0i) : password";
// do whatever you were doing
String[] sArr = s.split(" : ");
sArr[0] = sArr[0].replaceAll("[()]",""); // just replace those parentheses with an empty string
System.out.println(sArr[0] + " " + sArr[1]);
Output
sparker0i#outlook.com,sparker0i password
I have the string below, which comes from an Excel column:
"\"USE CODE \"\"Gef, sdf\"\" FROM 1/7/07\""
I would like to write a regex pattern that retrieves the entire string, so that my result is exactly
"USE CODE ""Gef, sdf"" FROM 1/7/07"
Below is what I tried
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class RegexMatches
{
public static void main( String args[] ){
// String to be scanned to find the pattern.
String line = "\"USE CODE \"\"Gef, sdf\"\" FROM 1/7/07\", Delete , Hello , How are you ? , ";
String line2 = "Test asda ds asd, tesat2 . test3";
String dpattern = "(\"[^\"]*\")(?:,(\"[^\"]*\"))*,|([^,]+),";
// Create a Pattern object
Pattern d = Pattern.compile(dpattern);
Matcher md = d.matcher(line2);
Pattern r = Pattern.compile(dpattern);
// Now create matcher object.
Matcher m = r.matcher(line);
if (m.find( )) {
System.out.println("Found value: 0 " + m.group(0) );
// System.out.println("Found value: 1 " + m.group(1) );
//System.out.println("Found value: 2 " + m.group(2) );
} else {
System.out.println("NO MATCH");
}
}
}
The match breaks at the first comma, so the output is
Found value: 0 "USE CODE ""Gef,
It should be
Found value: 0 "USE CODE ""Gef sdf"" FROM 1/7/07",
and for the second line Matcher m = r.matcher(line2); the output should be
Found value: 0 "Test asda ds asd",
You may use
(?:"[^"]*(?:""[^"]*)*"|[^,])+
See the regex demo
Explanation:
" - leading quote
[^"]* - 0+ chars other than a double quote
(?:""[^"]*)* - 0+ sequences of "" followed by 0+ chars other than a double quote
" - trailing quote
OR:
[^,] - any char but a comma
And the whole pattern is matched 1 or more times as it is enclosed with (?:...)+ and + matches 1 or more occurrences.
IDEONE demo:
String line = "\"USE CODE \"\"Gef, sdf\"\" FROM 1/7/07\", Delete , Hello , How are you ? , ";
String line2 = "Test asda ds asd, tesat2 . test3";
Pattern pattern = Pattern.compile("(?:\"[^\"]*(?:\"\"[^\"]*)*\"|[^,])+");
Matcher matcher = pattern.matcher(line);
if (matcher.find()){ // if is used to get the 1st match only
System.out.println(matcher.group(0));
}
Matcher matcher2 = pattern.matcher(line2);
if (matcher2.find()){
System.out.println(matcher2.group(0));
}
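The demo above takes only the first match. Calling find() in a loop with the same pattern walks over the remaining fields as well. This is a sketch with made-up names (`QuotedFieldScanner`, `fields`): it trims each match and leaves the quotes in place. Because the pattern needs at least one character, an empty field between two adjacent commas produces no match at all.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class QuotedFieldScanner {
    // Same pattern as above: a quoted run (with "" as an escaped quote) or
    // any single non-comma character, repeated one or more times.
    private static final Pattern FIELD =
            Pattern.compile("(?:\"[^\"]*(?:\"\"[^\"]*)*\"|[^,])+");

    // Returns each matched field with surrounding whitespace trimmed.
    static List<String> fields(String line) {
        List<String> out = new ArrayList<>();
        Matcher m = FIELD.matcher(line);
        while (m.find()) {
            out.add(m.group().trim());
        }
        return out;
    }

    public static void main(String[] args) {
        String line = "\"USE CODE \"\"Gef, sdf\"\" FROM 1/7/07\", Delete , Hello";
        for (String field : fields(line)) {
            System.out.println("[" + field + "]");
        }
    }
}
```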
With Strings like 123.456mm I would like to get one String with the number and the other with the measurement. So in the above case, one String with 123.456 and the other String with mm. So far I have this:
String str = "123.456mm";
String length = str.replaceAll("[\\D|\\.*]+","");
String lengthMeasurement = str.replaceAll("[\\W\\d]+","");
println(length, lengthMeasurement);
The output is:
123456 mm
The dot is gone and I can't get it back.
How can I keep the dots?
You can use:
String str = "123.456mm";
String length = str.replaceAll("[^\\d.]+",""); // 123.456
String lengthMeasurement = str.replaceAll("[\\d.]+",""); // mm
Try,
String str = "123.456mm";
String str1 = str.replaceAll("[a-zA-Z]", "");
String str2 = str.replaceAll("\\d|\\.", "");
System.out.println(str1);
System.out.println(str2);
Output:
123.456
mm
Try Pattern and Matcher with the regex below, and read the matched groups at index 1 and 2.
(\d+\.?\d*)(\D+)
Online demo
Try below sample code:
String str = "123.456mm";
Pattern p = Pattern.compile("(\\d+\\.?\\d*)(\\D+)");
Matcher m = p.matcher(str);
if (m.find()) {
System.out.println("Length: " + m.group(1));
System.out.println("Measurement : " + m.group(2));
}
output:
Length: 123.456
Measurement : mm
Pattern description:
( group and capture to \1:
\d+ digits (0-9) (1 or more times)
\.? '.' (optional (0 or 1 time))
\d* digits (0-9) (0 or more times)
) end of \1
( group and capture to \2:
\D+ non-digits (all but 0-9) (1 or more times)
) end of \2
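If the input should be rejected when it does not look like a number followed by a unit, matches() can be used in place of find(). The sketch below makes some assumptions that are not in the question: the unit consists of ASCII letters only, optional whitespace may separate number and unit, and the names `Measurement` and `split` are invented for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class Measurement {
    // Digits, an optional fractional part, optional whitespace, then letters.
    private static final Pattern VALUE_AND_UNIT =
            Pattern.compile("(\\d+(?:\\.\\d+)?)\\s*([a-zA-Z]+)");

    // Returns {number, unit}; throws if the whole input has a different shape.
    static String[] split(String s) {
        Matcher m = VALUE_AND_UNIT.matcher(s.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a measurement: " + s);
        }
        return new String[] {m.group(1), m.group(2)};
    }

    public static void main(String[] args) {
        String[] parts = split("123.456mm");
        double value = Double.parseDouble(parts[0]); // numeric value of the first group
        System.out.println(value + " / " + parts[1]);
    }
}
```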
I need to extract the domain from hostnames like these.
Example:
www.google.com
maps.google.com
maps.maps.google.com
I need to extract google.com from each of these.
How can I do this in Java?
Split on . and pick the last two bits.
String s = "maps.google.com";
String[] arr = s.split("\\.");
//should check the size of arr here
System.out.println(arr[arr.length-2] + '.' + arr[arr.length-1]);
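The size check mentioned in the comment could look like the sketch below. The class `HostNames` and the method `lastTwoLabels` are illustrative names, and returning the input unchanged for hosts with fewer than two labels is just one possible choice. Taking the last two labels is a heuristic: for a host such as www.example.co.uk it returns co.uk, which is not the registered domain.

```java
final class HostNames {
    // Returns the last two dot-separated labels, or the input unchanged when
    // it has fewer than two labels (for example "localhost").
    static String lastTwoLabels(String host) {
        String[] labels = host.split("\\.");
        if (labels.length < 2) {
            return host;
        }
        return labels[labels.length - 2] + "." + labels[labels.length - 1];
    }

    public static void main(String[] args) {
        String[] hosts = {"www.google.com", "maps.google.com", "maps.maps.google.com"};
        for (String host : hosts) {
            System.out.println(lastTwoLabels(host));
        }
    }
}
```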
Assuming you want to get the last two parts of the hostname (google.com in your examples), you could try this:
Pattern pat = Pattern.compile( ".*\\.([^.]+\\.[^.]+)" ) ;
Matcher mat = pat.matcher( "maps.google.com" ) ;
if( mat.find() ) {
System.out.println( mat.group( 1 ) ) ;
}
If it's the other way round, and you want everything except the last two parts of the domain (in your examples: www, maps, and maps.maps), then just change the first line to:
Pattern pat = Pattern.compile( "(.*)\\.[^.]+\\.[^.]+" ) ;
Extracting a known substring from a string doesn't make much sense ;) Why would you do a
String result = address.replaceAll("^.*(google\\.com)$", "$1");
when this is equivalent:
String result = "google.com";
If you need a test, try:
boolean isGoogle = address.endsWith(".google.com");
If you need the other part from a google address, this may help:
String googleSubDomain = address.replaceAll("\\.google\\.com$", "");
(hint - the first line of code is a solution for your problem!)
String str="www.google.com";
try{
System.out.println(str.substring(str.lastIndexOf(".", str.lastIndexOf(".") - 1) + 1));
}catch(IndexOutOfBoundsException ex){
//handle it
}
Demo