Java InternetAddress not validating email properly - java

I have a method to validate an email address:
    public boolean isValidEmailAddress(String email) {
        boolean result = true;
        try {
            InternetAddress emailAddr = new InternetAddress(java.net.IDN.toASCII(email));
            emailAddr.validate();
        } catch (AddressException ex) {
            result = false;
        }
        return result;
    }
I wrote a unit test, and the method accepts abc@cdf; it does not require .com or any other top-level domain.
    Assert.assertEquals(
        false,
        new CustomRuleEmailService().isValidEmailAddress("abc@cdf")
    );
The method returns true instead of false, so the assertion fails.

If you check the Javadoc for InternetAddress.validate(), you can see:
Validate that this address conforms to the syntax rules of RFC 822.
And the Wikipedia article on email addresses describes one as:
An email address is generally recognized as having two parts joined
with an at-sign (@), although technical specification detailed in RFC
822 and subsequent RFCs are more extensive
So everything is working as expected: abc@cdf has a local part and a domain, and RFC 822 does not require the domain to contain a dot.

While foo@server is technically a legal email address, it's quite reasonable to want to restrict addresses to ones under "real domains".
The definition of a "real domain" is non-trivial and ever-changing, so I wouldn't attempt to code one myself; I would rely on existing code for that instead.
For example, Guava provides an InternetDomainName class with an isUnderRegistrySuffix() method, which would accept foo.com, foo.bar.com and example.accountant but decline foo.thisIsNotaTld or just foo.
The main class comment of that class also has a good explanation of the problem and possible solutions.

Related

UUID.fromString(var) is accepting additional characters in Java

I have a requirement where I accept a UUID and check whether it is in a valid format. The java.util package provides a UUID.fromString(var) method, which accepts a string and throws an exception when it is not in a valid format.
For some unknown reason, it accepts an additional character and filters it out instead of throwing an exception.
For example:
    String var1 = "BAAAAAAAA-AAAA-AAAA-AAAA-AAAAAAAAAAAA";
    UUID.fromString(var1);
Here there is an additional character B, but instead of throwing an exception, the method filters it out and returns "AAAAAAAA-AAAA-AAAA-AAAA-AAAAAAAAAAAA".
Can anyone help me understand why this is happening?
I am using Java 8.
Historically, it looks like Java's UUID class has been very lenient in what it accepts. This includes too short segments, and too long segments. This behaviour was tightened up somewhat between Java 8 and Java 11 (I didn't want to check Java 9 or 10 to see exactly when), as Java 11 no longer accepts your example (but still accepts too short segments).
I haven't been able to identify the exact bug in the Java bug database that changed this, but there are a lot around this behaviour (example: JDK-8216407 : java.util.UUID.fromString accepts input that does not match expected format), but most of them are either still open, closed as duplicate, or marked as "won't fix" because of backwards compatibility concerns.
I double-checked with Java 11 (javac 11.0.15), and it behaves as expected.
I guess that you are not handling the exception properly.
Here is an example of how I tested it:
    import java.util.UUID;

    class Example {
        public static void main(String[] args) {
            // Valid case
            System.out.println(isValidUUID("AAAAAAAA-AAAA-AAAA-AAAA-AAAAAAAAAAAA"));
            // Invalid cases
            System.out.println(isValidUUID("Cc21a392e-3da6-4cd4-ac9f-9d963b313e22"));
            System.out.println(isValidUUID("BAAAAAAAA-AAAA-AAAA-AAAA-AAAAAAAAAAAA"));
            System.out.println(isValidUUID("AAAA-AAAAAAAAAAAA"));
            System.out.println(isValidUUID("Hope this helps"));
        }

        /**
         * Check if a supplied UUID is valid
         */
        public static boolean isValidUUID(String uuid) {
            try {
                UUID.fromString(uuid);
            } catch (RuntimeException e) {
                // fromString rejects malformed input with an IllegalArgumentException
                return false;
            }
            return true;
        }
    }
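Since the leniency of UUID.fromString differs between Java versions, one way to get the same result everywhere is to check the canonical 8-4-4-4-12 hexadecimal layout yourself before parsing. The following is only a minimal sketch; the class and method names (StrictUuid, isStrictUuid) are made up for illustration.

```java
import java.util.UUID;
import java.util.regex.Pattern;

final class StrictUuid {
    // Canonical textual layout: 8-4-4-4-12 hexadecimal digits.
    private static final Pattern CANONICAL = Pattern.compile(
        "[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}");

    /** Returns true only if the whole string has the canonical layout and parses. */
    static boolean isStrictUuid(String candidate) {
        if (candidate == null || !CANONICAL.matcher(candidate).matches()) {
            return false;
        }
        try {
            UUID.fromString(candidate);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }
}
```

With this check, the string from the question (with the extra leading B) is rejected by the pattern before UUID.fromString is ever called, regardless of the Java version in use.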

How to handle exceptions with builders

I have a builder for a class with the following method which uses the Google libphonenumber library.
    public final Builder withPhoneNumber(String phoneNumber, String region)
            throws NumberParseException, IllegalArgumentException {
        PhoneNumberUtil phoneUtil = PhoneNumberUtil.getInstance();
        if (!phoneUtil.getSupportedRegions().contains(region)) {
            throw new IllegalArgumentException("Region code is invalid.");
        }
        PhoneNumber inputNumber = phoneUtil.parse(phoneNumber, region);
        String formattedNumber = phoneUtil.format(inputNumber, PhoneNumberFormat.E164);
        this.phoneNumbers.add(formattedNumber);
        return this;
    }
The method needs to check that the phoneNumber argument is a valid form and that the region argument is one of the allowable regions. Normally if this was just a mutator method for the class not the builder I'd just handle the exceptions internally and return false if one occurs, but since I'm limited to returning the Builder type I cannot return a boolean. Would it be better to propagate the exception to the user as I have done above (requiring the user to surround their builder call with try catch blocks) or should I just manage the exceptions internally by returning the unchanged builder (i.e. rejecting the new phone number if it's invalid)? My only concern with the second option is that the user is not notified why the method call isn't adding the phone number to the builder, but I suppose I could just put that in javadoc.
What should I do?
The code snippet that you posted seems like the right approach to me. I don't think you have any reason to be worried because:
1. The IllegalArgumentException is not a checked exception, so you're not forcing clients to surround the building code in try/catch blocks.
2. The input is usually supposed to be validated before you start building your domain objects. The validation logic should guard the UI input and not allow any transaction to begin until the input data is valid.
The second point means that inside your builder you just perform a double-check, and not the actual validation which results in error messages in the UI.
You can think about it this way: what if the input was null (for either phoneNumber or region)? Most likely, that will result in a NullPointerException, which is the correct behavior. And you don't expect clients to catch the NPE; they're just not supposed to call your method with that kind of input.
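The fail-fast approach described above can be sketched with plain JDK classes. Everything here is hypothetical: ContactBuilder, the SUPPORTED_REGIONS set and the digits-only check are stand-ins for the real builder and for libphonenumber, not its API. The point is only that invalid input raises an unchecked exception and leaves the builder unchanged.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.Set;

final class ContactBuilder {
    // Hypothetical stand-in for the library's list of supported regions.
    private static final Set<String> SUPPORTED_REGIONS = Set.of("US", "GB", "DE");

    private final List<String> phoneNumbers = new ArrayList<>();

    ContactBuilder withPhoneNumber(String phoneNumber, String region) {
        // Null arguments are programming errors, so fail with a NullPointerException.
        Objects.requireNonNull(phoneNumber, "phoneNumber");
        Objects.requireNonNull(region, "region");
        if (!SUPPORTED_REGIONS.contains(region)) {
            throw new IllegalArgumentException("Region code is invalid: " + region);
        }
        // Placeholder format check (assumption): optional '+' followed by 4 to 15 digits.
        if (!phoneNumber.matches("\\+?[0-9]{4,15}")) {
            throw new IllegalArgumentException("Phone number is invalid: " + phoneNumber);
        }
        phoneNumbers.add(phoneNumber);
        return this;
    }

    List<String> build() {
        // Return an immutable snapshot of the accepted numbers.
        return List.copyOf(phoneNumbers);
    }
}
```

Callers that have already validated their input can chain calls without try/catch; callers that have not will see the exception at the exact call that was wrong, instead of a silently ignored phone number.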

How do I test two email addresses for equality

Is there a comprehensive way to test if two email addresses are equal? I know that I can universally LOWER both. But there are some other rules that differ from server to server. For example "william.burroughs@gmail.com", "will.iam.burroughs@gmail.com", and "williamburroughs@gmail.com" are all equivalent for gmail. But I don't think this is true in all cases. So given two email addresses I need to ensure they are equivalent. Currently my code does not consider "william.burroughs@gmail.com" and "williamburroughs@gmail.com" to be the same. I can start special casing something like "gmail" so they are but I was hoping there is a better approach.
Gmail only really has two rules for customization:
1. Any periods ( . ) are ignored. This is easily overcome with a regex for gmail addresses.
2. Anything after a + is ignored. Again, a regex will fix this for gmail addresses.
The biggest challenge as I see it is that Google hosts thousands of cloud-based domains that do not end in googlemail.com or gmail.com. Your only way to recognize these would be to do a DNS lookup and see what MX record the domain points to. Here's a Python example that works:
http://eran.sandler.co.il/2011/07/17/determine-if-an-email-address-is-gmail-or-hosted-gmail-google-apps-for-your-domain/
You could do the same in any other language. Look for gmail or googlemail in the 'MX' record.
For any non-Google domains, you can do a lowercase string compare.
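The two Gmail rules and the lowercase fallback described above can be sketched as a normalization step, after which two addresses are compared as plain strings. This is a minimal sketch that assumes addresses use the usual @ separator; the names EmailNormalizer, normalize and sameMailbox are invented for illustration, and hosted Google domains (which would need the MX lookup) are not handled.

```java
import java.util.Locale;

final class EmailNormalizer {
    /** Lowercases the address and applies the Gmail dot and plus rules. */
    static String normalize(String email) {
        int at = email.lastIndexOf('@');
        if (at <= 0 || at == email.length() - 1) {
            throw new IllegalArgumentException("Not an email address: " + email);
        }
        String local = email.substring(0, at).toLowerCase(Locale.ROOT);
        String domain = email.substring(at + 1).toLowerCase(Locale.ROOT);
        if (domain.equals("gmail.com") || domain.equals("googlemail.com")) {
            // Rule 2: anything after a '+' is ignored.
            int plus = local.indexOf('+');
            if (plus >= 0) {
                local = local.substring(0, plus);
            }
            // Rule 1: periods in the local part are ignored.
            local = local.replace(".", "");
        }
        return local + "@" + domain;
    }

    static boolean sameMailbox(String first, String second) {
        return normalize(first).equals(normalize(second));
    }
}
```

Normalizing first keeps the provider-specific rules in one place; supporting another provider means adding one more branch rather than touching the comparison.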
Apart from a custom-coded set of rules for given email providers (e.g. gmail in your example), I don't think there is any other way...
If you're comparing two different email addresses just as characters, you may want to consider using a regex to split them up for comparison. A simple enough one could be "([\w\.]+)@([\w\.]+\.\w+)". You could run group 2 through a switch to compare group 1 appropriately, defaulting to a more general comparison.
    boolean emailsEquals(String email1, String email2) {
        Pattern address = Pattern.compile("([\\w\\.]+)@([\\w\\.]+\\.\\w+)");
        Matcher match1 = address.matcher(email1);
        Matcher match2 = address.matcher(email2);
        if (!match1.find() || !match2.find()) return false; // Not an e-mail address? Already false
        if (!match1.group(2).equalsIgnoreCase(match2.group(2))) return false; // Not the same server? Already false
        switch (match1.group(2).toLowerCase()) {
            case "gmail.com":
                // Gmail ignores periods in the local part
                String gmail1 = match1.group(1).replace(".", "");
                String gmail2 = match2.group(1).replace(".", "");
                return gmail1.equalsIgnoreCase(gmail2);
            default:
                return match1.group(1).equalsIgnoreCase(match2.group(1));
        }
    }
Hope it helps!
This works, but maybe I didn't understand your question:
    String email1 = "william.burroughs@gmail.com";
    String email2 = "williamburroughs@gmail.com";
    // Quote the literal so that the dot is not treated as a regex wildcard
    Pattern emailFinder = Pattern.compile(Pattern.quote("gmail.com"));
    Matcher emailmatcher = emailFinder.matcher(email1);
    Matcher emailmatcher1 = emailFinder.matcher(email2);
    if (emailmatcher.find() && emailmatcher1.find()) {
        // replace() removes literal dots; replaceAll(".", "") would delete every character
        email1 = email1.replace(".", "");
        email2 = email2.replace(".", "");
        if (email1.equals(email2)) {
            System.out.println("The values match");
        }
    }

Parse MIME sender in Java (RFC 822)

MIME message senders appear in formats such as:
"John Doe" <johndoe@gmail.com>
<johndoe@gmail.com>
I'm trying to figure out how to extract the string "johndoe@gmail.com" in the above examples, although I will also need the "johndoe" and "gmail.com" parts (per RFC I'm pretty sure splitting on @ is all that's needed from here). Obviously regex-ing up my own parser is one (not great) option.
It seemed this may be possible using javax.mail.internet.MimeMessage. All of the constructors require a Folder which I do not have (well, I sort of do, it exists in the IMAP layer), e.g.
MimeMessage(Folder folder, InputStream is, int msgnum)
Which makes me feel I'm using this class wrong. Nonetheless, if I parse this way I do get access to the getFrom() method which returns an array of Address, which itself doesn't offer methods of use to me.
Using mime4j it's easy to get this far:
    case T_FIELD: // field means header
        if ("from".equalsIgnoreCase(token.getName())) { // compare strings with equals, not ==
            // get raw string as above - unparsed
So using mime4j or using java, javax etc. utilities it should be possible to extract the "a@b.com" part of the address from there, but I haven't found a class within javax or mime4j that is responsible for this yet.
I think you need the InternetAddress class from javax.mail:
http://docs.oracle.com/javaee/6/api/javax/mail/internet/InternetAddress.html#getAddress()
Minimum working example:
    import javax.mail.internet.AddressException;
    import javax.mail.internet.InternetAddress;

    public class JavaMailExample {
        public static void main(String[] args) throws AddressException {
            String fullemail = "\"John Doe\" <johndoe@gmail.com>";
            InternetAddress addr = new InternetAddress(fullemail);
            System.out.println(addr.getPersonal()); // John Doe
            System.out.println(addr.getAddress()); // johndoe@gmail.com
        }
    }
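The question also asks for the "johndoe" and "gmail.com" parts. Once the bare address has been extracted (for example with getAddress() as above), splitting it can be sketched as follows. AddressParts is a made-up helper, and splitting on the last @ is an assumption that holds for ordinary addresses rather than a full RFC parser.

```java
final class AddressParts {
    final String localPart;
    final String domain;

    private AddressParts(String localPart, String domain) {
        this.localPart = localPart;
        this.domain = domain;
    }

    /** Splits a bare address such as johndoe@gmail.com on its last '@'. */
    static AddressParts split(String address) {
        int at = address.lastIndexOf('@');
        if (at <= 0 || at == address.length() - 1) {
            throw new IllegalArgumentException("No local part or domain in: " + address);
        }
        return new AddressParts(address.substring(0, at), address.substring(at + 1));
    }
}
```

Using the last @ rather than the first means a quoted local part that itself contains an @ still ends up entirely in localPart.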

Verify email in Java

Is there any way to verify in Java code that an e-mail address is valid? By valid, I don't just mean that it's in the correct format (someone@domain.subdomain), but that it's a real, active e-mail address.
I'm almost certain that there's no 100% reliable way to do this, because such a technique would be the stuff of spammer's dreams. But perhaps there's some technique that gives some useful indication about whether an address is 'real' or not.
Here is what I have around. To check that the address is in a valid format, here is a regex that verifies that it's nearly rfc2822 (it doesn't catch some weird corner cases). I found it on the 'net last year.
    private static final Pattern rfc2822 = Pattern.compile(
        "^[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?$"
    );

    if (!rfc2822.matcher(email).matches()) {
        throw new Exception("Invalid address");
    }
That will take care of simple syntax (for the most part). The other check I know of lets you check whether the domain has an MX record. It looks like this:
    Hashtable<String, String> env = new Hashtable<String, String>();
    env.put("java.naming.factory.initial", "com.sun.jndi.dns.DnsContextFactory");
    DirContext ictx = new InitialDirContext(env);
    Attributes attrs = ictx.getAttributes(domainName, new String[] {"MX"});
    Attribute attr = attrs.get("MX");
    if (attr == null) {
        // No MX record
    } else {
        // If attr.size() > 0, there is an MX record
    }
This, I also found on the 'net. It came from this link.
If both of these pass, you have a good chance of having a valid address. As for whether the address itself (not just the domain) exists, whether the mailbox is full, and so on, you really can't check that.
Note that the second check is time intensive. It can take anywhere from milliseconds to >30 seconds (if the DNS does not respond and times out). It's not something to try and run real-time for large numbers of people.
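One way to limit the worst-case delay mentioned above is to set the JNDI DNS provider's timeout properties before doing the lookup. The sketch below wraps the same MX check in a helper; MxLookup, domainOf and hasMxRecord are illustrative names, and the timeout values (2000 ms initial, 1 retry) are arbitrary assumptions to tune for your environment.

```java
import java.util.Hashtable;
import javax.naming.NamingException;
import javax.naming.directory.Attribute;
import javax.naming.directory.DirContext;
import javax.naming.directory.InitialDirContext;

final class MxLookup {
    /** Returns the part after the last '@', or throws if there is none. */
    static String domainOf(String email) {
        int at = email.lastIndexOf('@');
        if (at < 0 || at == email.length() - 1) {
            throw new IllegalArgumentException("No domain in: " + email);
        }
        return email.substring(at + 1);
    }

    /** Returns true if the DNS answers with at least one MX record for the domain. */
    static boolean hasMxRecord(String domain) {
        Hashtable<String, String> env = new Hashtable<>();
        env.put("java.naming.factory.initial", "com.sun.jndi.dns.DnsContextFactory");
        // Assumed values: initial timeout in milliseconds and number of retries.
        env.put("com.sun.jndi.dns.timeout.initial", "2000");
        env.put("com.sun.jndi.dns.timeout.retries", "1");
        DirContext ctx = null;
        try {
            ctx = new InitialDirContext(env);
            Attribute mx = ctx.getAttributes(domain, new String[] {"MX"}).get("MX");
            return mx != null && mx.size() > 0;
        } catch (NamingException e) {
            // Unknown domain, timeout, or other lookup failure: treat as "no MX".
            return false;
        } finally {
            if (ctx != null) {
                try { ctx.close(); } catch (NamingException ignored) { }
            }
        }
    }
}
```

A call such as MxLookup.hasMxRecord(MxLookup.domainOf(email)) still needs network access, so for large batches it is probably better run in the background than inside a request.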
Hope this helps.
EDIT
I'd like to point out that there are better ways than the regex to check basic validity. Don and Michael point out that Apache Commons has something, and I recently found out you can call .validate() on InternetAddress to have Java check that the address really conforms to RFC 822, which is certainly more accurate than my regex.
You cannot really verify that an email exists, see my answer to a very similar question here: Email SMTP validator
Without sending an email, it could be hard to get 100%, but if you do a DNS lookup on the host that should at least tell you that it is a viable destination system.
Apache Commons Validator provides an EmailValidator class too, which you can use. Simply pass your email address as an argument to its isValid method.
Do a DNS lookup on the hostname to see if that exists. You could theoretically also initiate a connection to the mailserver and see if it tells you whether the recipient exists, but I think many servers pretend they know an address, then reject the email anyway.
The only way you can be certain is by actually sending a mail and have it read.
Let your registration process have a step that requires responding to information found only in the email. This is what others do.
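The confirmation step described above usually comes down to mailing the user an unguessable token and checking it when they respond. Here is a minimal sketch of the token part only, using JDK classes; ConfirmationTokens and its method names are hypothetical, and sending the email and storing the token are left out.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Base64;

final class ConfirmationTokens {
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Creates a URL-safe random token from 32 random bytes. */
    static String newToken() {
        byte[] bytes = new byte[32];
        RANDOM.nextBytes(bytes);
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
    }

    /** Compares the stored token with the one the user presented. */
    static boolean matches(String expected, String presented) {
        if (expected == null || presented == null) {
            return false;
        }
        // MessageDigest.isEqual compares the two byte arrays for equality.
        return MessageDigest.isEqual(
            expected.getBytes(StandardCharsets.UTF_8),
            presented.getBytes(StandardCharsets.UTF_8));
    }
}
```

The address counts as verified only once the user presents the token that was sent to it, which is the one check in this thread that proves someone actually reads the mailbox.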
I'm not 100% sure, but isn't it possible to send an RCPT SMTP command to a mail server to determine if the recipient is valid? It would be even more expensive than the suggestion above to check for a valid MX host, but it would also be the most accurate.
If you're using GWT, you can't use InternetAddress, and the pattern supplied by MBCook is pretty scary.
Here is a less scary regex (might not be as accurate):
    public static boolean isValidEmail(String emailAddress) {
        return !emailAddress.contains(" ") && emailAddress.matches(".+@.+\\.[a-z]+");
    }
