I'm writing Scala code which splits a line based on a colon (:).
For example, for an input that looked like:
sparker0i#outlook.com : password
I was doing line.split(" : ") (which is essentially Java) and printing the email and the password on Console.
Now my requirement has changed and now a line will look like:
(sparker0i#outlook.com,sparker0i) : password
I want to print the email, username and password separately.
I tried a regex that first splits on the parentheses, but it didn't work because the pattern is not correct (val lt = line.split("[\\\\(||//)]")). Please guide me to the correct regex/split logic.
I'm not a Scala user, but instead of split I think you can use Pattern and Matcher to extract this info. Your regex can use groups like:
\((.*?),(.*?)\) : (.*)
Then you can extract group 1 for email, group 2 for username and the 3rd group for password.
val input = "(sparker0i#outlook.com,sparker0i) : password"
val pattern = """\((.*?),(.*?)\) : (.*)""".r
pattern.findAllIn(input).matchData foreach {
m => println(m.group(1) + " " + m.group(2) + " " + m.group(3))
}
Credit for this post https://stackoverflow.com/a/3051206/5558072
The regex I would use:
\((.*?),([^)]+)\) : (.+)
\( # matches (
( # start of capture group 1
.*? # captures 0 or more characters, as few as possible, up to the first ,
) # end of capture group 1
, # matches ,
( # start of capture group 2
[^)]+ # captures one or more characters that are not a )
) # end of capture group 2
\) # matches )
 : # matches ' : ' (space, colon, space)
( # start of capture group 3
.+ # matches the rest of the string
) # end of capture group 3
The Java implementation would be:
import java.util.regex.Pattern;
import java.util.regex.Matcher;
public class Test
{
public static void main(String[] args) {
String s = "(sparker0i#outlook.com,sparker0i) : password";
Pattern pattern = Pattern.compile("\\((.*?),([^)]+)\\) : (.+)");
Matcher m = pattern.matcher(s);
if (m.matches()) {
System.out.println(m.group(1));
System.out.println(m.group(2));
System.out.println(m.group(3));
}
}
}
Prints:
sparker0i#outlook.com
sparker0i
password
In Scala 2.13, there is a simple solution without regex:
Welcome to Scala 2.13.1 (OpenJDK 64-Bit Server VM, Java 1.8.0_222).
Type in expressions for evaluation. Or try :help.
scala> val input = "(sparker0i#outlook.com,sparker0i) : password"
input: String = (sparker0i#outlook.com,sparker0i) : password
scala> val s"($mail,$user) : $pwd" = input
mail: String = sparker0i#outlook.com
user: String = sparker0i
pwd: String = password
This works without changing much of your existing code:
String s = "(sparker0i#outlook.com,sparker0i) : password";
// split on " : " as you were doing before
String[] sArr = s.split(" : ");
sArr[0] = sArr[0].replaceAll("[()]", ""); // strip the parentheses
System.out.println(sArr[0] + " " + sArr[1]);
Output
sparker0i#outlook.com,sparker0i password
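To get the email, username and password as three separate values with plain splits, a minimal sketch could look like the following. The class and method names (CredentialSplit, splitCredentials) are made up for illustration, and the code assumes every line has exactly the shape "(email,username) : password" with no comma inside the email.

```java
public class CredentialSplit {

    // Hypothetical helper: returns {email, username, password}.
    // Assumes the input looks like "(email,username) : password".
    public static String[] splitCredentials(String line) {
        // Limit 2 so that a " : " inside the password is left untouched.
        String[] halves = line.split(" : ", 2);
        if (halves.length != 2) {
            throw new IllegalArgumentException("Missing ' : ' separator: " + line);
        }
        // Drop the parentheses, then separate the email from the username.
        String inner = halves[0].replaceAll("[()]", "");
        String[] idParts = inner.split(",", 2);
        if (idParts.length != 2) {
            throw new IllegalArgumentException("Missing ',' separator: " + line);
        }
        return new String[] { idParts[0], idParts[1], halves[1] };
    }

    public static void main(String[] args) {
        String[] fields = splitCredentials("(sparker0i#outlook.com,sparker0i) : password");
        System.out.println(fields[0]); // sparker0i#outlook.com
        System.out.println(fields[1]); // sparker0i
        System.out.println(fields[2]); // password
    }
}
```

The regex answers above are probably the better choice if the input can be malformed, since a failed match is easier to detect than a wrong split.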
When I replace the user input with a hard-coded string (including the password) and use the same pattern, I have no problems.
Another question: what is the symbol for whitespace in Java regex?
Note: I am new to Java, so my code might look a little messy.
import java.util.Scanner;
import java.util.regex.Pattern;
import java.util.regex.Matcher;
public static void main(String[] args) {
String email_pattern = "\\w{7,20}#(gmail|Hotmail|yahoo)\\.com"; // email pattern
String password_pattern = ".{10,20}"; // password pattern (problem in this statement)
Scanner s = new Scanner(System.in);
String email;
String password;
System.out.println("Welcome to my site"); //
System.out.print("Enter your email: ");
email = s.next(); //email input by user
Pattern p = Pattern.compile(email_pattern);
Matcher m = p.matcher(email);
if (m.matches()) {
System.out.print("\nEnter your password: ");
password = s.next(); //password input by user
Pattern p2 = Pattern.compile(password_pattern);
Matcher m2 = p2.matcher(password);
if (m2.matches()) {
System.out.print("\n You are logged in");
} else {
System.out.print("\n" + m2.matches()); // outputs the matching result if the password has a wrong format
}
} else {
System.out.print("\nWrong email format please re-enter your email");
}
}
//output problem in password matching with white spaces
Your code has one problem, and it is with a Java method, not with the regex pattern. To solve it, replace each s.next() with
s.nextLine()
because s.next() returns only the next token: it reads the input word by word, using whitespace as the separator between tokens.
check out the implementation of scanner.next(): https://www.tutorialspoint.com/java/util/scanner_next.htm
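The difference between the two methods can be seen in a small sketch. A String is used as the Scanner source instead of System.in so that the result is repeatable; the class and method names are made up for illustration.

```java
import java.util.Scanner;

public class NextVsNextLine {

    // Reads with next(): stops at the first whitespace.
    public static String readToken(String source) {
        try (Scanner s = new Scanner(source)) {
            return s.next();
        }
    }

    // Reads with nextLine(): returns everything up to the line break.
    public static String readLine(String source) {
        try (Scanner s = new Scanner(source)) {
            return s.nextLine();
        }
    }

    public static void main(String[] args) {
        // Stands in for what a user would type, followed by Enter.
        String typed = "my secret pass 1\n";
        System.out.println(readToken(typed)); // my
        System.out.println(readLine(typed));  // my secret pass 1
    }
}
```

With next(), only "my" reaches the matcher, so a pattern such as .{10,20} fails even though the full password is long enough.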
Just use a regex for both the email ID and the password. There are many ways to write such regexes; here is a simple example with a description.
Regex for email = ^[_A-Za-z0-9-\\+]+(\\.[_A-Za-z0-9-]+)*#[A-Za-z0-9-]+(\\.[A-Za-z0-9]+)*(\\.[A-Za-z]{2,})$;
Description
^ # start of the line
[_A-Za-z0-9-\\+]+ # must start with string in the bracket [ ], must contains one or more (+)
( # start of group #1
\\.[_A-Za-z0-9-]+ # follow by a dot "." and string in the bracket [ ], must contains one or more (+)
)* # end of group #1, this group is optional (*)
# # must contains a "#" symbol
[A-Za-z0-9-]+ # follow by string in the bracket [ ], must contains one or more (+)
( # start of group #2 - first level TLD checking
\\.[A-Za-z0-9]+ # follow by a dot "." and string in the bracket [ ], must contains one or more (+)
)* # end of group #2, this group is optional (*)
( # start of group #3 - second level TLD checking
\\.[A-Za-z]{2,} # follow by a dot "." and string in the bracket [ ], with minimum length of 2
) # end of group #3
$ # end of the line
Regex for password = ((?=.*\\d)(?=.*[a-z])(?=.*[A-Z])(?=.*[##$%]).{10,20});
Description
( # Start of group
(?=.*\d) # must contains one digit from 0-9
(?=.*[a-z]) # must contains one lowercase characters
(?=.*[A-Z]) # must contains one uppercase characters
(?=.*[##$%]) # must contains one special symbols in the list "##$%"
. # match anything with previous condition checking
{10,20} # length at least 10 characters and maximum of 20
) # End of group
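As a rough check of the password pattern described above, the following sketch compiles it once and tests a few sample strings. The class name, method name and sample passwords are assumptions made for the example; the pattern itself is copied from the description.

```java
import java.util.regex.Pattern;

public class PasswordCheck {

    // Pattern copied from the description above; compiled once and reused.
    private static final Pattern PASSWORD =
            Pattern.compile("((?=.*\\d)(?=.*[a-z])(?=.*[A-Z])(?=.*[##$%]).{10,20})");

    // True if the whole candidate satisfies every look-ahead and the length limit.
    public static boolean isValidPassword(String candidate) {
        return PASSWORD.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidPassword("Passw0rd#2024")); // true
        System.out.println(isValidPassword("password"));      // false: too short, no digit, uppercase or symbol
        System.out.println(isValidPassword("Pass w0rd#24"));  // true: '.' also accepts a space
    }
}
```

Each look-ahead checks one requirement without consuming input, so the final .{10,20} still sees the whole string. To match whitespace explicitly in a Java regex, the class \s can be used (written "\\s" in a String literal).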
Now complete code would be like this
public static void main(String[] args) {
String email_pattern = "^[A-Za-z0-9+_.-]+#(.+)$";
String password_pattern = "((?=.*\\d)(?=.*[a-z])(?=.*[A-Z])(?=.*[##$%]).{10,20})";
String email;
String password;
Scanner s=new Scanner(System.in);
System.out.println("Welcome to my site");
System.out.print("Enter your email: ");
email=s.next();
Pattern p=Pattern.compile(email_pattern);
Matcher m=p.matcher(email);
if(m.matches()){
System.out.print("\nEnter your password: ");
password=s.next();
Pattern p2=Pattern.compile(password_pattern);
Matcher m2=p2.matcher(password);
if(m2.matches()){
System.out.print("\n You are logged in");
}else{
System.out.print("\n"+m2.matches());
}
}else{
System.out.print("\nWrong email format please re-enter your email");
}
}
I am reading text from Aadhar card and Pan card and I am getting below String.
छ र्णां ;; _,
ज्यो
हें ष्ठछ्येप् ऽठमांआ (38७/8र्क्स "
,; जन्म वर्ष / 78६" ०1‘8६र्णीग् : 1992 खा … खा
पुरुष‘ “'व्'व् हैंप्टेंग्‘
` हु; ";:ळुहुं क्रौं र्मं’फु. ‘_य्; ,; ळु
` हं ` .म्च्हें :: "…. 'दृर्दु‘ऱ्क्ष्क्त
» ॰ -। "' ॰॰ ’ '|’ ""
8471 2211 6099 ,_
I have two tasks to do -
1. Detect whether it contains an Aadhar card number or not.
2. If yes, then get that number.
Code I have tried
String data = "b dn b fsd b fsd 6666 8888 9999 bsnfbsdb";
Pattern p = Pattern.compile( "^[a-zA-Z ]*\\d{4}\\s\\d{4}\\s\\d{4}[a-zA-Z ]*$" );
Matcher m = p.matcher( data );
if ( m.find() ) {
String s = m.group(0);
System.out.println(s);
}
But it is not working: I am getting the whole String.
Is there a better way to do this, or am I doing something wrong?
Thanks in advance.
You may use
(?<!\d)\d{4}(?:\s\d{4}){2}(?!\d)
See the regex demo.
Details
(?<!\d) - no digit immediately to the left is allowed
\d{4} - four digits
(?:\s\d{4}){2} - two repetitions of a whitespace and four digits
(?!\d) - no digit immediately to the right is allowed
See the Java demo:
String data = "b dn b fsd b fsd 6666 8888 9999 bsnfbsdb";
Pattern p = Pattern.compile( "(?<!\\d)\\d{4}(?:\\s\\d{4}){2}(?!\\d)" );
Matcher m = p.matcher( data );
String s = "";
if ( m.find() ) {
s = m.group(0);
}
System.out.println("Result: " + s); // => Result: 6666 8888 9999
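Both tasks from the question (detect, then extract) can be covered by one method that returns an Optional. This is only a sketch: the class name, method name and the shortened OCR-like sample text are assumptions made for the example, while the pattern is the one shown above.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AadhaarFinder {

    // Three groups of four digits separated by single whitespace characters,
    // with no digit directly before or after the whole number.
    private static final Pattern NUMBER =
            Pattern.compile("(?<!\\d)\\d{4}(?:\\s\\d{4}){2}(?!\\d)");

    // Empty if no number is present, otherwise the first match.
    public static Optional<String> findNumber(String text) {
        Matcher m = NUMBER.matcher(text);
        return m.find() ? Optional.of(m.group()) : Optional.empty();
    }

    public static void main(String[] args) {
        // Shortened, made-up stand-in for the OCR text.
        String ocr = "Year of Birth : 1992 xx\n8471 2211 6099 ,_";
        Optional<String> number = findNumber(ocr);
        System.out.println(number.isPresent());    // true
        System.out.println(number.orElse("none")); // 8471 2211 6099
    }
}
```

Note that \s also matches a line break, so a four-digit year at the end of one line directly followed by digit groups on the next line could be picked up as part of a match.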
I have these Strings:
String test1=":test:block1:%a1%a2%a3%a4:block2:BL";
and
String test2=":test:block2:BL:block1:%a1%a2%a3%a4";
I've created a regex pattern in order to isolate this piece of the String
block1:%a1%a2%a3%a4:
from the rest of the String, leaving the results like this:
in the case of test1: "block1:%a1%a2%a3%a4:" (with ':' at the end)
in the case of test2: ":block1:%a1%a2%a3%a4" (with ':' at the beginning)
The regex I've created is:
"(block1:(.*?):|:block1:(.*))";
It works with test1, but with test2 it returns this:
block1:%a1%a2%a3%a4:block2:BL
Can someone give me a hand with this?
Cheers!
You may use
block1:([^:]*)
It matches block1: text and then captures into Group 1 any 0 or more chars other than :.
See Java demo:
String patternString = "block1:([^:]*)";
String[] tests = {":test:block1:%a1%a2%a3%a4:block2:BL",
":test:block2:BL:block1:%a1%a2%a3%a4"};
for (int i=0; i<tests.length; i++)
{
Pattern p = Pattern.compile(patternString, Pattern.DOTALL);
Matcher m = p.matcher(tests[i]);
if(m.find())
{
System.out.println(tests[i] + " matched. Match: " +
m.group(0) + ", Group 1: " + m.group(1));
}
}
Output:
:test:block1:%a1%a2%a3%a4:block2:BL matched. Match: block1:%a1%a2%a3%a4, Group 1: %a1%a2%a3%a4
:test:block2:BL:block1:%a1%a2%a3%a4 matched. Match: block1:%a1%a2%a3%a4, Group 1: %a1%a2%a3%a4
I have a string line
String user_name = "id=123 user=aron name=aron app=application";
and I have a list that contains: {user,cuser,suser}
And I have to get the user part from the string, so I have code like this:
List<String> userName = Config.getConfig().getList(Configuration.ATT_CEF_USER_NAME);
String result = null;
for (String param: user_name .split("\\s", 0)){
for(String user: userName ){
String userParam = user.concat("=.*");
if (param.matches(userParam )) {
result = param.split("=")[1];
}
}
}
But the problem is that if the value of user in user_name contains spaces, it does not work.
For example:
String user_name = "id=123 user=aron nicols name=aron app=application";
Here user has the value aron nicols, which contains a space. How can I write code that gets the exact user value, i.e. aron nicols?
If you want to split only on spaces that come right before a token followed by =, such as user=..., then you could add a look-ahead condition like
split("\\s(?=\\S*=)")
This regex will split on
\\s a whitespace character
(?=\\S*=) that is followed by zero or more (*) non-space (\\S) characters ending with =. The look-ahead (?=...) is a zero-length match, which means the part it matches is not consumed as a delimiter and stays in the result.
Demo:
String user_name = "id=123 user=aron nicols name=aron app=application";
for (String s : user_name.split("\\s(?=\\S*=)"))
System.out.println(s);
output:
id=123
user=aron nicols
name=aron
app=application
From your comment on the other answer, it seems that an = escaped with \ shouldn't be treated as a separator between key and value but as part of the value. In that case you can add a negative look-behind to check that there is no \ before the =: placing (?<!\\\\) right before = requires that it is not preceded by \.
BTW, a regex that matches \ has to be written as \\, and in a Java String literal each \ must itself be escaped, which is why we end up with \\\\.
So you can use
split("\\s(?=\\S*(?<!\\\\)=)")
Demo:
String user_name = "user=Dist\\=Name1, xyz src=activedirectorydomain ip=10.1.77.24";
for (String s : user_name.split("\\s(?=\\S*(?<!\\\\)=)"))
System.out.println(s);
output:
user=Dist\=Name1, xyz
src=activedirectorydomain
ip=10.1.77.24
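The double escaping described in this answer can be checked in isolation with a small sketch. The class and method names are made up for illustration; the look-behind is the one used in the split above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EscapedEquals {

    // An '=' that is not preceded by a backslash (negative look-behind).
    private static final Pattern UNESCAPED_EQUALS = Pattern.compile("(?<!\\\\)=");

    // Index of the first '=' not preceded by a backslash, or -1 if there is none.
    public static int firstUnescapedEquals(String text) {
        Matcher m = UNESCAPED_EQUALS.matcher(text);
        return m.find() ? m.start() : -1;
    }

    public static void main(String[] args) {
        // The literal "\\\\" holds two characters; as a regex it matches one backslash.
        System.out.println("\\\\".length());               // 2
        System.out.println(Pattern.matches("\\\\", "\\")); // true
        // Runtime value of the argument: Dist\=Name1=x
        System.out.println(firstUnescapedEquals("Dist\\=Name1=x")); // 11
    }
}
```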
Do it like this:
First split input string using this regex:
" +(?=\\w+(?<!\\\\)=)"
This will give you 4 name=value tokens like this:
id=123
user=aron nicols
name=aron
app=application
Now you can just split on = to get your name and value parts.
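The two steps (split into name=value tokens, then split each token on =) could be combined into a small map-building helper, sketched below. The class and method names are hypothetical, the candidate key list stands in for the configured list from the question, and this version does not handle the escaped \= case.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class KeyValueParser {

    // Splits on whitespace that precedes a "key=" token, then on the first '='.
    public static Map<String, String> parse(String line) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String token : line.split("\\s(?=\\S*=)")) {
            String[] kv = token.split("=", 2); // limit 2 keeps any later '=' in the value
            if (kv.length == 2) {
                fields.put(kv[0], kv[1]);
            }
        }
        return fields;
    }

    // Returns the value of the first key from the list that is present, or null.
    public static String firstValue(Map<String, String> fields, List<String> keys) {
        for (String key : keys) {
            String value = fields.get(key);
            if (value != null) {
                return value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String line = "id=123 user=aron nicols name=aron app=application";
        Map<String, String> fields = parse(line);
        // Candidate keys as in the question: {user, cuser, suser}
        System.out.println(firstValue(fields, Arrays.asList("user", "cuser", "suser"))); // aron nicols
    }
}
```

Looking keys up in a map avoids building and running a separate regex per candidate key, as the nested loops in the question do.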
CODE FISH, this simple regex captures the user in Group 1: user=\s*(.*?)\s+name=
It will capture "Aron", "Aron Nichols", "Aron Nichols The Benevolent", and so on.
It relies on the knowledge that name= always follows user=
However, if you're not sure that the token following user is name, you can use this:
user=\s*(.*?)(?=$|\s+\w+=)
Here is how to use the second expression (for the first, just change the string in Pattern.compile):
String ResultString = null;
try {
Pattern regex = Pattern.compile("user=\\s*(.*?)(?=$|\\s+\\w+=)", Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
Matcher regexMatcher = regex.matcher(subjectString);
if (regexMatcher.find()) {
ResultString = regexMatcher.group(1);
}
} catch (PatternSyntaxException ex) {
// Syntax error in the regular expression
}
I have String user#domain:port
I want to fetch user, domain and port from this String.
So I created regex:
public static final String MATCH_USER_DOMAIN_PORT = "^([0-9,a-zA-Z-.*_]+)#([a-z0-9]+[\\.-][a-z0-9]+\\.[a-z]{2,}+):(6553[0-5]|655[0-2]\\d|65[0-4]\\d{2}|6[0-4]\\d{3}|[1-5]\\d{4}|[1-9]\\d{0,3})$";
and this is my unit test method so far:
public void test____matchesUserDomainWithPort(){
String identityText = "maxim#domain.com:5555";
String user = "";
String domain = "";
String port = "";
if(identityText.matches(MATCH_USER_DOMAIN_PORT))
{
Pattern p = Pattern.compile(MATCH_USER_DOMAIN_PORT);
Matcher m = p.matcher(identityText);
user = m.group(1);
domain= m.group(2);
port= m.group(3);
}
assertEquals("maxim", user);
assertEquals("domain.com", domain);
assertEquals("5555", port);
}
I get error:
java.lang.IllegalStateException: No successful match so far
at java.util.regex.Matcher.ensureMatch(Matcher.java:607)
....
at the line: user = m.group(1);
I opened http://gskinner.com/RegExr/?2v5r0
and there all seems good:
Output:
RegExp: /^([0-9,a-zA-Z-.*_]+#[a-z0-9]+([\.-][a-z0-9]+)*)+\.[a-z]{2,}+:(6553[0-5]|655[0-2]\d|65[0-4]\d{2}|6[0-4]\d{3}|[1-5]\d{4}|[1-9]\d{0,3})$/
pattern: ^([0-9,a-zA-Z-.*_]+#[a-z0-9]+([\.-][a-z0-9]+)*)+\.[a-z]{2,}+:(6553[0-5]|655[0-2]\d|65[0-4]\d{2}|6[0-4]\d{3}|[1-5]\d{4}|[1-9]\d{0,3})$
flags:
3 capturing groups:
group 1: ([0-9,a-zA-Z-.*_]+#[a-z0-9]+([\.-][a-z0-9]+)*)
group 2: ([\.-][a-z0-9]+)
group 3: (6553[0-5]|655[0-2]\d|65[0-4]\d{2}|6[0-4]\d{3}|[1-5]\d{4}|[1-9]\d{0,3})
Am I missing something?
In C I would just write: sscanf(identityText,"%[^#]#%[^:]:%511s",user,domain,port);
Of course I can split this text on # and : and get the 3 values, but I'm interested in how to do it elegantly :)
Please help.
Please use
if(identityText.matches(MATCH_USER_DOMAIN_PORT)){
Pattern p = Pattern.compile(MATCH_USER_DOMAIN_PORT);
Matcher m = p.matcher(identityText);
while(m.find()){
user = m.group(1);
domain= m.group(2);
port= m.group(3);
}
}
thanks
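The exception in the question comes from calling group() on a Matcher that has not yet run a match operation: String.matches() uses its own internal matcher, so the new Matcher still needs a successful matches(), find() or lookingAt() before its groups can be read. A minimal sketch with a single match call follows; the pattern here is a deliberately simplified stand-in modelled on the sscanf format from the question, not the original MATCH_USER_DOMAIN_PORT, and the class and method names are made up.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class IdentityParser {

    // Simplified stand-in pattern (an assumption): user # domain : 1 to 5 digits.
    private static final Pattern IDENTITY =
            Pattern.compile("^([^#]+)#([^:]+):(\\d{1,5})$");

    // Returns {user, domain, port}, or null if the text does not match.
    public static String[] parse(String identityText) {
        Matcher m = IDENTITY.matcher(identityText);
        // matches() both tests the input and makes the groups available.
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String[] parts = parse("maxim#domain.com:5555");
        System.out.println(parts[0]); // maxim
        System.out.println(parts[1]); // domain.com
        System.out.println(parts[2]); // 5555
    }
}
```

This also avoids matching the input twice, once through String.matches() and once through the Matcher.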
Yes, I think your regex is wrong.
public static final String MATCH_USER_DOMAIN_PORT = "^([0-9,a-zA-Z-.*_]+#[a-z0-9]+([\\.-][a-z0-9]+)*)+\\.[a-z]{2,}+:(6553[0-5]|655[0-2]\\d|65[0-4]\\d{2}|6[0-4]\\d{3}|[1-5]\\d{4}|[1-9]\\d{0,3})$";
To break it down:
^(
[0-9,a-zA-Z-.*_]+
any number of these characters, will match "maxim"
#
will match "#"
[a-z0-9]+
any number of these characters, will match "domain"
([\\.-][a-z0-9]+)*
will match ".com" (or theoretically ".somethingelse.com", nice)
)+
will make group #1 "maxim#domain.com", I believe, but what's with the "+" ?
\\.
nothing in the input string here
[a-z]{2,}+
is this for a country code like .eu ? Again, what's with the "+" ?
:
(6553[0-5]|655[0-2]\\d|65[0-4]\\d{2}|6[0-4]\\d{3}|[1-5]\\d{4}|[1-9]\\d{0,3})
seems overly complicated - probably don't do the numeric validation with the regex
$
Take a look at Using a regular expression to validate an email address for some advice on validation of email addresses.
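As for moving the numeric validation out of the regex, one possible sketch is to capture the port as plain digits and check the range in code. The class and method names are made up for illustration; unlike the long alternation in the original pattern, this version also accepts leading zeros such as "0080".

```java
public class PortCheck {

    // True if the text is 1 to 5 digits and its value lies in 1..65535.
    public static boolean isValidPort(String text) {
        if (!text.matches("\\d{1,5}")) {
            return false; // not a short run of digits
        }
        int port = Integer.parseInt(text); // safe: at most 5 digits
        return port >= 1 && port <= 65535;
    }

    public static void main(String[] args) {
        System.out.println(isValidPort("5555"));  // true
        System.out.println(isValidPort("65536")); // false
        System.out.println(isValidPort("0"));     // false
    }
}
```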