Is there a comprehensive way to test if two email addresses are equal? I know that I can universally LOWER both. But there are some other rules that differ from server to server. For example "william.burroughs@gmail.com", "will.iam.burroughs@gmail.com", and "williamburroughs@gmail.com" are all equivalent for Gmail. But I don't think this is true in all cases. So given two email addresses I need to ensure they are equivalent. Currently my code does not consider "william.burroughs@gmail.com" and "williamburroughs@gmail.com" to be the same. I can start special-casing something like "gmail" so they are, but I was hoping there is a better approach.
Gmail only really has two rules for customization:
Any periods ( . ) are ignored. This is easily overcome with a regex for Gmail addresses.
Anything after a + is ignored. Again, a regex will fix this for Gmail addresses.
The biggest challenge as I see it is that Google hosts thousands of cloud-based domains that do not end in googlemail.com or gmail.com. Your only way to recognize these would be to do a DNS lookup and see what MX record the domain points to. Here's a Python example that works:
http://eran.sandler.co.il/2011/07/17/determine-if-an-email-address-is-gmail-or-hosted-gmail-google-apps-for-your-domain/
You could do the same in any other language. Look at the 'MX' record for gmail or googlemail.
For any non-Google domains, you can do a lowercase string compare.
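The two Gmail rules plus the lowercase fallback can be sketched in Java roughly as follows. This is only a minimal sketch: the class and method names are made up for illustration, it assumes that only gmail.com and googlemail.com count as Gmail (and that the two are interchangeable), and it does not attempt the MX lookup needed for hosted domains.

```java
import java.util.Locale;
import java.util.Set;

class EmailComparer {
    // Assumption: only these two domains are treated as Gmail. Hosted Google
    // domains would need the MX lookup and are not handled here.
    private static final Set<String> GMAIL_DOMAINS = Set.of("gmail.com", "googlemail.com");

    /** Returns a canonical form for comparison, or null if there is no usable '@'. */
    static String normalize(String email) {
        int at = email.lastIndexOf('@');
        if (at <= 0 || at == email.length() - 1) {
            return null;
        }
        String local = email.substring(0, at).toLowerCase(Locale.ROOT);
        String domain = email.substring(at + 1).toLowerCase(Locale.ROOT);
        if (GMAIL_DOMAINS.contains(domain)) {
            int plus = local.indexOf('+');
            if (plus >= 0) {
                local = local.substring(0, plus); // rule 2: drop everything from the '+'
            }
            local = local.replace(".", "");       // rule 1: periods are ignored
            domain = "gmail.com";                 // assumption: googlemail.com == gmail.com
        }
        return local + "@" + domain;
    }

    /** True when both inputs normalize to the same address. */
    static boolean sameAddress(String a, String b) {
        String na = normalize(a);
        return na != null && na.equals(normalize(b));
    }
}
```

With this, sameAddress("william.burroughs@gmail.com", "williamburroughs@gmail.com") is true, while the same pair on any other domain is compared only case-insensitively.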
Apart from a custom-coded set of rules for given email providers (e.g. Gmail in your example), I don't think there is any other way...
If you're comparing two different e-mail addresses just as characters, you may want to consider using a regex to split them up for comparison. A simple enough one could be "([\w\.]+)@([\w\.]+\.\w+)". You could run group 2 through a switch to compare group 1 appropriately, defaulting to a more general comparison.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

boolean emailsEquals(String email1, String email2) {
    Pattern address = Pattern.compile("([\\w\\.]+)@([\\w\\.]+\\.\\w+)");
    Matcher match1 = address.matcher(email1);
    Matcher match2 = address.matcher(email2);
    if (!match1.find() || !match2.find()) return false; // Not an e-mail address? Already false
    if (!match1.group(2).equalsIgnoreCase(match2.group(2))) return false; // Not the same server? Already false
    switch (match1.group(2).toLowerCase()) {
        case "gmail.com":
            // Gmail ignores dots in the local part
            String gmail1 = match1.group(1).replace(".", "");
            String gmail2 = match2.group(1).replace(".", "");
            return gmail1.equalsIgnoreCase(gmail2);
        default:
            return match1.group(1).equalsIgnoreCase(match2.group(1));
    }
}
Hope it helps!
Here, this works, but maybe I didn't understand your question:
String email1 = "william.burroughs@gmail.com";
String email2 = "williamburroughs@gmail.com";
Pattern emailFinder = Pattern.compile("gmail\\.com");
Matcher emailmatcher = emailFinder.matcher(email1);
Matcher emailmatcher1 = emailFinder.matcher(email2);
if (emailmatcher.find() && emailmatcher1.find()) {
    // replace() takes a literal; replaceAll(".", "") would treat "." as a regex and delete every character
    email1 = email1.replace(".", "");
    email2 = email2.replace(".", "");
    if (email1.equals(email2)) {
        System.out.println("The values match");
    }
}
Related
I have a method to validate emailAddress.
public boolean isValidEmailAddress(String email) {
    boolean result = true;
    try {
        InternetAddress emailAddr = new InternetAddress(java.net.IDN.toASCII(email));
        emailAddr.validate();
    } catch (AddressException ex) {
        result = false;
    }
    return result;
}
I wrote a unit test, and it accepts abc@cdf; it does not require .com or any other top-level domain.
Assert.assertEquals(
    new CustomRuleEmailService().isValidEmailAddress("abc@cdf"),
    false
);
It returns true instead of false.
When you check the Javadoc for InternetAddress.validate(), you can see:
Validate that this address conforms to the syntax rules of RFC 822.
And Wikipedia describes an email address as:
An email address is generally recognized as having two parts joined
with an at-sign (@), although technical specification detailed in RFC
822 and subsequent RFCs are more extensive
So everything is working as expected.
While foo@server is technically a legal email address, it's quite reasonable to want to restrict addresses to ones under "real domains".
Now the definition of "real domain" is non-trivial and ever-changing, so I wouldn't attempt to code one myself and would rely on existing code for that instead.
For example, Guava provides an InternetDomainName class with an isUnderRegistrySuffix method, which would accept foo.com, foo.bar.com and example.accountant but decline foo.thisIsNotaTld or just foo.
The main class comment of that class also has a good explanation of the problem and likely solutions.
I want to detect URLs on a specific domain in a string, replace each one with another string, and finally make it clickable in a TextView.
What I want:
this is sample text with one type of url mydomain.com/pin/123456. another type of url is mydomain.com/username.
Well, I wrote these regexes:
([Hh][tT][tT][pP][sS]?://)?(?:www\\.)?example\\.com/?.*
([Hh][tT][tT][pP][sS]?://)?(?:www\\.)?example\\.com/pin/?.*
this regex can detect:
http://www.example.com
https://www.example.com
www.example.com
example.com
Hhtp://www.example.com // and all other wrong type in http
with anything after .com
Issues:
1. How do I detect the end of the domain (followed by a space or a dot)?
2. How do I detect the two types of URL, one with /pin/ and one without?
3. How do I replace a detected URL like mydomain.com/pin/123 with PostLink and mydomain.com/username with ProfileLink?
4. I know how to make them clickable with Linkify, but if possible, show me the best way to provide a content provider for the links so that each one opens with the proper activity.
You could try:
([\w_-]+(?:(?:\.[\w_-]+)+))([\w.,@?^=%&:/~+#-]*[\w@?^=%&/~+#-])?
which is a regex I found after a quick search here on Stack Overflow:
Regular expression to find URLs within a string
I just removed the http:// part of that regex to fit your needs.
Be aware, though, that because of that it now matches anything joined by a dot with no whitespace. For example, a.a would also be found.
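To see how that pattern behaves on the sample text from the question, here is a small Java sketch. The class and method names are invented for illustration; the regex is the one quoted in this answer, written with @ in the character classes and escaped for a Java string literal.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UrlFinder {
    // Scheme-less URL pattern from the answer, escaped for a Java string.
    private static final Pattern URL = Pattern.compile(
        "([\\w_-]+(?:(?:\\.[\\w_-]+)+))([\\w.,@?^=%&:/~+#-]*[\\w@?^=%&/~+#-])?");

    /** Collects every substring of text that the pattern matches, in order. */
    static List<String> findUrls(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = URL.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }
}
```

Because the last character class has no dot in it, a sentence-ending period is left out of the match, which covers the "end of domain" part of the question: the sample text yields mydomain.com/pin/123456 and mydomain.com/username. The caveat still applies, since findUrls("a.a") also returns a.a.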
With special thanks to Gildraths
Answer to question 1
String urlRegex = "(https?://)?(?:www\\.)?example\\.com([\\w.,@?^=%&:/~+#-]*[\\w@?^=%&/~+#-])?";
Pattern pattern = Pattern.compile(urlRegex, Pattern.CASE_INSENSITIVE);
Matcher matcher = pattern.matcher(textString);
Answer to question 2, 3
while (matcher.find()) {
    String url = matcher.group();
    // Answer to question 2 - if pinIndex >= 0, the URL contains "/pin/"
    int pinIndex = url.indexOf("/pin/");
    if (pinIndex >= 0) {
        // Everything after "/pin/" is the ID
        String profileId = url.substring(pinIndex + 5);
        // Answer to question 3 - replace the matched URL with custom text
        textString = textString.replace(url, "#" + profileId);
    }
}
Answer to question 4
// Pattern to detect replaced custom text
Pattern profileLink = Pattern.compile("[#]+[A-Za-z0-9-_]+\\b");
// Schema
String Link = "content://" + context.getString(R.string.profile_authority) + "/";
// Make it linkify ;)
Linkify.addLinks(textView, profileLink, Link);
I have a requirement where I need to do a few things only if the given string ends in "Y" or "Years" or "YEARS".
I tried doing it using regex like this.
String text = "1.5Y";
if (Pattern.matches("Y$", text) || Pattern.matches("YEARS$", text) || Pattern.matches("Years", text))
{
    //do
}
However, this fails.
Can someone point out where I have gone wrong or suggest another feasible method?
EDIT:
Thanks. That helps.
Finally I have used "(?i)^.*Y(ears)?$| (?i)^.*M(onths)?$".
But I want to make more changes to make it perfect.
Let's say I have many strings.
Ideally only strings like 1.5Y or 0.5-3.5Y or 2.5/2.5-4.5Y should pass the if check.
It can be a number of years (e.g. 2.5y), a period of years (2.5-3.5y), or a number of years/period of years (e.g. 2.5/3.5-4.5Y), nothing more.
More Examples:
--------------
Y -should fail;
MY - should fail;
1.5CY - should fail;
1.5Y-2.5Y should fail;
1.5-2.5Y should pass;
1.5Y/2.5-3.5Y should fail;
1.5/2.5-3.5Y should pass;
You don't need a regex here:
if(text.endsWith("Y") || ...)
matches method attempts to match full input so use:
^.*Y$
for your first pattern.
btw you can use a single regex for all 3 cases:
if (text.matches( "(?i)^.*Y(ears)?$" ) ) {...}
(?i) makes the match case-insensitive.
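The difference between a full match and a partial match is easy to check in a few lines of Java. This is just an illustrative sketch; the class and method names are made up, and only standard java.util.regex calls are used.

```java
import java.util.regex.Pattern;

class MatchesVsFind {
    /** Pattern.matches succeeds only if the regex covers the whole input. */
    static boolean fullMatch(String regex, String text) {
        return Pattern.matches(regex, text);
    }

    /** Matcher.find succeeds if the regex matches anywhere inside the input. */
    static boolean partialMatch(String regex, String text) {
        return Pattern.compile(regex).matcher(text).find();
    }
}
```

So fullMatch("Y$", "1.5Y") is false because "Y$" does not account for the leading "1.5", partialMatch("Y$", "1.5Y") is true, and fullMatch("(?i)^.*Y(ears)?$", "1.5years") is true.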
.*(?:Y|YEARS|Years)$
You can use this directly. matches() has to match from the beginning of the input, which is why yours fails.
You can simply use the regex pattern:
if (Pattern.matches(".*(Y|YEARS|Years)$",text)) {/*do something*/}
/((?!0)\d+|0)(\.\d+)?(?:years|year|y)/gi
https://regex101.com/r/gJ6xD2/2
var text = "1.6y 1.5years 1year 1.5h";
text.match(/((?!0)\d+|0)(\.\d+)?(?:years|year|y)/gi);
Result: ["1.6y", "1.5years", "1year"]
(?=^(0\.\d+|[1-9](?:\d+)?(?:\.\d+)?)(?:(\s+)?[\/-](\s+)?(?:0\.\d+|[1-9](?:\d+)?(?:\.\d+)?))*(?:\s+)?(?:y(?:(ea)?rs|ears?)?|m(?:onths?)?)$).*
https://regex101.com/r/kL7rQ1/3
The only thing I wasn't sure about is whether the "2.3 - 4 / 6.2 y" format is acceptable, so I've included it.
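For the stricter pass/fail list in the question's edit, one possible Java sketch is below. It is not any of the regexes posted above: it assumes the only accepted shapes are a number, a number-number range, or a number/number-number combination, followed by Y or Years in any letter case, with no whitespace and no months. The class name and the NUM helper are made up for illustration.

```java
import java.util.regex.Pattern;

class YearSpec {
    // One number such as 2 or 2.5 (assumption: no sign, no leading-dot form).
    private static final String NUM = "\\d+(?:\\.\\d+)?";

    // Either "NUM-NUM" with an optional "NUM/" prefix, or a single NUM,
    // followed by Y or Years.
    private static final Pattern YEARS = Pattern.compile(
        "(?:(?:" + NUM + "/)?" + NUM + "-" + NUM + "|" + NUM + ")Y(?:ears)?",
        Pattern.CASE_INSENSITIVE);

    /** True only if the whole string is a years specification. */
    static boolean isYearSpec(String text) {
        return YEARS.matcher(text).matches();
    }
}
```

Checked against the examples listed in the question, 1.5Y, 1.5-2.5Y and 1.5/2.5-3.5Y pass, while Y, MY, 1.5CY, 1.5Y-2.5Y and 1.5Y/2.5-3.5Y fail.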
I have the following REGEX that I'm serving up to java via an xml file.
[a-zA-Z -\(\) \-]+
This regex is used to validate server side and client side (via javascript) and works pretty well at allowing only alphabetic content and a few other characters...
My problem is that it will also allow zero-length (empty) strings through.
Does anyone have a simple yet elegant solution to this?
I already tried...
[a-zA-Z -\(\) \-]{1,}+
but that didn't seem to work.
Cheers!
UPDATE FOLLOWING INVESTIGATION
It appears the code I provided does in fact work...
String inputStr = " ";
String pattern = "[a-zA-Z -\\(\\) \\-]+";
boolean patternMatched = java.util.regex.Pattern.matches(pattern, inputStr);
if ( patternMatched ){
out.println("Pattern MATCHED");
}else{
out.println("NOT MATCHED");
}
After looking at this more closely I think the problem may well be within the logic of some of my java bean coding... It appears the regex is dropped out at the point where the string parse should take place, thereby allowing empty strings to be submitted... And also any other string... EEJIT that I am...
Cheers for the help in peer reviewing my initial stupid thought....!
Have you tried this:
[a-zA-Z -\(\) \-]+
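The conclusion from the update, that the + quantifier already rejects empty input, can be confirmed with a quick check along these lines. This is a sketch only; the class and method names are invented, and the pattern is the one from the question escaped for a Java string.

```java
import java.util.regex.Pattern;

class NonEmptyCheck {
    // The question's character class with a one-or-more quantifier.
    static final String ALLOWED = "[a-zA-Z -\\(\\) \\-]+";

    /** True if the whole input consists of one or more allowed characters. */
    static boolean isAllowed(String input) {
        return Pattern.matches(ALLOWED, input);
    }
}
```

isAllowed("") is false, while isAllowed(" ") and isAllowed("Mary-Jane (Smith)") are true, so an empty string getting through points at the surrounding logic, not at the regex.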
Is there any way to verify in Java code that an e-mail address is valid? By valid, I don't just mean that it's in the correct format (someone@domain.subdomain), but that it's a real active e-mail address.
I'm almost certain that there's no 100% reliable way to do this, because such a technique would be the stuff of spammer's dreams. But perhaps there's some technique that gives some useful indication about whether an address is 'real' or not.
Here is what I have around. To check that the address is a valid format, here is a regex that verifies that it's nearly rfc2822 (it doesn't catch some weird corner cases). I found it on the 'net last year.
private static final Pattern rfc2822 = Pattern.compile(
    "^[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?$"
);
if (!rfc2822.matcher(email).matches()) {
throw new Exception("Invalid address");
}
That will take care of simple syntax (for the most part). The other check I know of will let you check if the domain has an MX record. It looks like this:
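import java.util.Hashtable;
import javax.naming.directory.Attribute;
import javax.naming.directory.Attributes;
import javax.naming.directory.DirContext;
import javax.naming.directory.InitialDirContext;

Hashtable<String, String> env = new Hashtable<String, String>();
env.put("java.naming.factory.initial", "com.sun.jndi.dns.DnsContextFactory");
DirContext ictx = new InitialDirContext(env);
Attributes attrs = ictx.getAttributes(domainName, new String[] {"MX"});
Attribute attr = attrs.get("MX");
if (attr == null) {
    // No MX record
} else {
    // If attr.size() > 0, there is an MX record
}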
This, I also found on the 'net. It came from this link.
If these both pass, you have a good chance of having a valid address. As for whether the address itself (not just the domain) exists, whether its mailbox is full, etc... you really can't check that.
Note that the second check is time intensive. It can take anywhere from milliseconds to >30 seconds (if the DNS does not respond and times out). It's not something to try and run real-time for large numbers of people.
Hope this helps.
EDIT
I'd like to point out that, at least instead of the regex, there are better ways to check basic validity. Don and Michael point out that Apache Commons has something, and I recently found out you can use .validate() on InternetAddress to have Java check that the address really conforms to RFC 822, which is certainly more accurate than my regex.
You cannot really verify that an email exists, see my answer to a very similar question here: Email SMTP validator
Without sending an email, it could be hard to get 100%, but if you do a DNS lookup on the host that should at least tell you that it is a viable destination system.
Apache commons provides an email validator class too, which you can use. Simply pass your email address as an argument to isValid method.
Do a DNS lookup on the hostname to see if that exists. You could theoretically also initiate a connection to the mailserver and see if it tells you whether the recipient exists, but I think many servers pretend they know an address, then reject the email anyway.
The only way you can be certain is by actually sending a mail and have it read.
Let your registration process have a step that requires responding to information found only in the email. This is what others do.
I'm not 100% sure, but isn't it possible to send an RCPT SMTP command to a mail server to determine if the recipient is valid? It would be even more expensive than the suggestion above to check for a valid MX host, but it would also be the most accurate.
If you're using GWT, you can't use InternetAddress, and the pattern supplied by MBCook is pretty scary.
Here is a less scary regex (might not be as accurate):
public static boolean isValidEmail(String emailAddress) {
    return emailAddress.contains(" ") == false && emailAddress.matches(".+@.+\\.[a-z]+");
}