Verify email in Java - java

Is there any way to verify in Java code that an e-mail address is valid? By valid, I don't just mean that it's in the correct format (someone@domain.subdomain), but that it's a real, active e-mail address.
I'm almost certain that there's no 100% reliable way to do this, because such a technique would be the stuff of spammer's dreams. But perhaps there's some technique that gives some useful indication about whether an address is 'real' or not.

Here is what I have around. To check that the address is a valid format, here is a regex that verifies that it's nearly rfc2822 (it doesn't catch some weird corner cases). I found it on the 'net last year.
private static final Pattern rfc2822 = Pattern.compile(
"^[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?$"
);
if (!rfc2822.matcher(email).matches()) {
throw new Exception("Invalid address");
}
That will take care of simple syntax (for the most part). The other check I know of will let you check if the domain has an MX record. It looks like this:
Hashtable<String, String> env = new Hashtable<String, String>();
env.put("java.naming.factory.initial", "com.sun.jndi.dns.DnsContextFactory");
DirContext ictx = new InitialDirContext(env);
Attributes attrs = ictx.getAttributes(domainName, new String[] {"MX"});
Attribute attr = attrs.get("MX");
if (attr == null || attr.size() == 0) {
    // No MX record
} else {
    // There is at least one MX record
}
This, I also found on the 'net. It came from this link.
If both checks pass, you have a good chance of having a valid address. As for the address itself (not just the domain), whether the mailbox exists, whether it's full, etc., you really can't check that.
Note that the second check is time intensive. It can take anywhere from milliseconds to >30 seconds (if the DNS does not respond and times out). It's not something to try and run real-time for large numbers of people.
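Since the lookup can hang on an unresponsive DNS server, one option is to cap the wait. The sketch below is only an illustration: the class and method names (MxCheck, domainOf, hasMxRecord) are made up for this example, and the two timeout properties are the ones documented for the JDK's JNDI DNS provider, with values (2 seconds, 1 retry) chosen arbitrarily.

```java
import java.util.Hashtable;
import javax.naming.NamingException;
import javax.naming.directory.Attribute;
import javax.naming.directory.DirContext;
import javax.naming.directory.InitialDirContext;

// Hypothetical helper, not part of any library.
final class MxCheck {

    /** Returns the part after the last '@'; rejects strings with no usable domain. */
    static String domainOf(String email) {
        int at = email.lastIndexOf('@');
        if (at <= 0 || at == email.length() - 1) {
            throw new IllegalArgumentException("Not an address: " + email);
        }
        return email.substring(at + 1);
    }

    /** True if the domain has at least one MX record; false on any lookup failure or timeout. */
    static boolean hasMxRecord(String domain) {
        Hashtable<String, String> env = new Hashtable<>();
        env.put("java.naming.factory.initial", "com.sun.jndi.dns.DnsContextFactory");
        // Assumed values: wait 2000 ms for the first attempt and try only once.
        env.put("com.sun.jndi.dns.timeout.initial", "2000");
        env.put("com.sun.jndi.dns.timeout.retries", "1");
        DirContext ctx = null;
        try {
            ctx = new InitialDirContext(env);
            Attribute mx = ctx.getAttributes(domain, new String[] {"MX"}).get("MX");
            return mx != null && mx.size() > 0;
        } catch (NamingException e) {
            // Covers unknown domains as well as timeouts.
            return false;
        } finally {
            if (ctx != null) {
                try { ctx.close(); } catch (NamingException ignored) { }
            }
        }
    }
}
```

Usage would be something like MxCheck.hasMxRecord(MxCheck.domainOf(email)). Treating a timeout as "no MX record" is a simplification; depending on the use case you may prefer to report it as "unknown".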
Hope this helps.
EDIT
I'd like to point out that, at least for the format check, there are better ways than the regex to check basic validity. Don and Michael point out that Apache Commons has something, and I recently found out you can use .validate() on InternetAddress to have Java check that the address really conforms to RFC 822, which is certainly more accurate than my regex.

You cannot really verify that an email exists, see my answer to a very similar question here: Email SMTP validator

Without sending an email, it could be hard to get 100%, but if you do a DNS lookup on the host that should at least tell you that it is a viable destination system.

Apache commons provides an email validator class too, which you can use. Simply pass your email address as an argument to isValid method.

Do a DNS lookup on the hostname to see if that exists. You could theoretically also initiate a connection to the mailserver and see if it tells you whether the recipient exists, but I think many servers pretend they know an address, then reject the email anyway.

The only way you can be certain is by actually sending a mail and have it read.
Let your registration process have a step that requires responding to information found only in the email. This is what others do.
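A minimal sketch of such a confirmation step, assuming an in-memory store and made-up names (EmailConfirmation, issueToken, confirm); a real registration flow would persist the tokens, expire them, and actually send the mail:

```java
import java.security.SecureRandom;
import java.util.Base64;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper: issues a one-time token to embed in the confirmation mail.
final class EmailConfirmation {
    private final SecureRandom random = new SecureRandom();
    private final Map<String, String> pendingByToken = new ConcurrentHashMap<>();

    /** Creates an unguessable token for the address; put it in the link you mail out. */
    String issueToken(String email) {
        byte[] bytes = new byte[24];
        random.nextBytes(bytes);
        String token = Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
        pendingByToken.put(token, email);
        return token;
    }

    /** Returns the confirmed address, or null if the token is unknown or already used. */
    String confirm(String token) {
        return token == null ? null : pendingByToken.remove(token);
    }
}
```

The address counts as verified only once confirm() returns it, that is, once someone with access to the mailbox has presented the token.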

I'm not 100% sure, but isn't it possible to send an RCPT SMTP command to a mail server to determine if the recipient is valid? It would be even more expensive than the suggestion above to check for a valid MX host, but it would also be the most accurate.

If you're using GWT, you can't use InternetAddress, and the pattern supplied by MBCook is pretty scary.
Here is a less scary regex (might not be as accurate):
public static boolean isValidEmail(String emailAddress) {
    return !emailAddress.contains(" ") && emailAddress.matches(".+@.+\\.[a-z]+");
}


Related

Java InternetAddress not validating email properly

I have a method to validate emailAddress.
public boolean isValidEmailAddress(String email) {
boolean result = true;
try {
InternetAddress emailAddr = new InternetAddress(java.net.IDN.toASCII(email));
emailAddr.validate();
} catch (AddressException ex) {
result = false;
}
return result;
}
I wrote a unit test, and it accepts abc@cdf; it is not looking for .com or anything like that.
Assert.assertEquals(
false,
new CustomRuleEmailService().isValidEmailAddress("abc@cdf")
);
The call returns true instead of false.
When you check the JavaDoc for InternetAddress.validate(), you can see:
Validate that this address conforms to the syntax rules of RFC 822.
And the wiki entry for email address describes it as:
An email address is generally recognized as having two parts joined
with an at-sign (@), although technical specification detailed in RFC
822 and subsequent RFCs are more extensive
So everything is working as expected.
While foo@server is technically a legal email address, it's quite reasonable to want to restrict addresses to ones under "real domains".
Now the definition of "real domain" is non-trivial and ever-changing, so I wouldn't attempt to code one myself; I'd rely on existing code for that instead.
For example, Guava provides an InternetDomainName class with an isUnderRegistrySuffix() method, which would accept foo.com, foo.bar.com and example.accountant but reject foo.thisIsNotaTld or just foo.
The main class comment of that class also has a good explanation of the problem and likely solutions.

How do I test two email addresses for equality

Is there a comprehensive way to test if two email addresses are equal? I know that I can universally LOWER both. But there are some other rules that differ from server to server. For example "william.burroughs@gmail.com", "will.iam.burroughs@gmail.com", and "williamburroughs@gmail.com" are all equivalent for gmail. But I don't think this is true in all cases. So given two email addresses I need to ensure they are equivalent. Currently my code does not consider "william.burroughs@gmail.com" and "williamburroughs@gmail.com" to be the same. I can start special casing something like "gmail" so they are but I was hoping there is a better approach.
Gmail only really has two rules for customization
Any periods ( . ) are ignored. This is easily overcome with a regex for gmail addresses.
Anything after a + is ignored. Again. a regex will fix this for gmail addresses.
The biggest challenge as I see it is that Google hosts thousands of cloud based domains that do not end in googlemail.com or gmail.com. Your only way to recognize these would be to do a DNS lookup and see what MX record the domain points to. Here's a python example that works:
http://eran.sandler.co.il/2011/07/17/determine-if-an-email-address-is-gmail-or-hosted-gmail-google-apps-for-your-domain/
You could do the same in any other language. Look at the 'MX' record for gmail or googlemail.
For any non-google domains, you can do a lowercase string compare.
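The rules above could be sketched roughly as follows. The names (EmailCanonicalizer, canonicalize, sameMailbox) are invented for this illustration, treating googlemail.com as an alias of gmail.com is an assumption, and hosted Google domains are not detected because that would need the MX lookup described above.

```java
import java.util.Locale;

// Hypothetical helper implementing the two Gmail rules plus a lowercase compare.
final class EmailCanonicalizer {

    /** Reduces an address to a comparable form; Gmail dots and "+suffix" are dropped. */
    static String canonicalize(String email) {
        int at = email.lastIndexOf('@');
        if (at <= 0 || at == email.length() - 1) {
            throw new IllegalArgumentException("Not an address: " + email);
        }
        String local = email.substring(0, at).toLowerCase(Locale.ROOT);
        String domain = email.substring(at + 1).toLowerCase(Locale.ROOT);
        // Assumption: both domains deliver to the same Gmail mailbox.
        if (domain.equals("gmail.com") || domain.equals("googlemail.com")) {
            int plus = local.indexOf('+');
            if (plus >= 0) {
                local = local.substring(0, plus); // anything after '+' is ignored
            }
            local = local.replace(".", "");       // periods are ignored
            domain = "gmail.com";
        }
        return local + "@" + domain;
    }

    static boolean sameMailbox(String a, String b) {
        return canonicalize(a).equals(canonicalize(b));
    }
}
```

With this, sameMailbox("william.burroughs@gmail.com", "will.iam.burroughs@gmail.com") is true, while addresses on other domains are only compared case-insensitively.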
Apart from a custom-coded set of rules for given email providers (e.g. Gmail in your example), I don't think there is any other way...
If you're comparing two different e-mail addresses just as characters, you may want to consider using a regex to split them up for comparison. A simple enough one is "([\w\.]+)@([\w\.]+\.\w+)". You could run group 2 through a switch to compare group 1 appropriately, defaulting to a more general comparison.
boolean emailsEquals(String email1, String email2) {
    Pattern address = Pattern.compile("([\\w\\.]+)@([\\w\\.]+\\.\\w+)");
    Matcher match1 = address.matcher(email1);
    Matcher match2 = address.matcher(email2);
    if (!match1.find() || !match2.find()) return false; // Not an e-mail address? Already false
    if (!match1.group(2).equalsIgnoreCase(match2.group(2))) return false; // Not the same server? Already false
    switch (match1.group(2).toLowerCase()) {
        case "gmail.com":
            // Gmail ignores periods in the local part
            String gmail1 = match1.group(1).replace(".", "");
            String gmail2 = match2.group(1).replace(".", "");
            return gmail1.equalsIgnoreCase(gmail2);
        default:
            return match1.group(1).equalsIgnoreCase(match2.group(1));
    }
}
Hope it helps!
This works, but maybe I didn't understand your question:
String email1 = "william.burroughs@gmail.com";
String email2 = "williamburroughs@gmail.com";
Pattern emailFinder = Pattern.compile("gmail\\.com");
Matcher emailmatcher = emailFinder.matcher(email1);
Matcher emailmatcher1 = emailFinder.matcher(email2);
if (emailmatcher.find() && emailmatcher1.find()) {
    // Use replace(), not replaceAll(): replaceAll(".", "") treats "." as a regex and deletes every character
    email1 = email1.replace(".", "");
    email2 = email2.replace(".", "");
    if (email1.equals(email2)) {
        System.out.println("The values match");
    }
}

Get all hostnames for an IP address in the network

I have a requirement wherein an IP address can have multiple hostnames mapped to it. I tried looking into InetAddress.getAllByName("10.33.28.55") however I did not get the desired result, it returned just one entry. nslookup on the IP address returns all DNS entries. How do I retrieve all the hostnames associated with this IP address in Java?
Looking at the source code for InetAddress.getAllByName() you find that it doesn't actually do a DNS query if the provided String is a textual representation of an IP address. It simply returns an array containing a single InetAddress object containing the IP. They even put a handy comment right in the method:
// if host is an IP address, we won't do further lookup
(See: http://javasourcecode.org/html/open-source/jdk/jdk-6u23/java.net/InetAddress.java.html)
If only the JavaDoc was so clear. It states "If a literal IP address is supplied, only the validity of the address format is checked." ... I would argue that doesn't tell you that it isn't going to be looked up.
Thinking about it, however ... it makes sense in the context of InetAddress - the class encapsulates an IP address of which ... you only have one. It really needs getHostNames() and getAllCanonicalNames() (note the plurality) methods that would do what you are asking. I'm thinking of opening an issue / submitting a patch.
That said, it would appear currently there's no built in method of doing a RDNS query where multiple PTR records are supported. All the other lookup methods simply lop off the first record returned and that's what you get.
You're going to have to look into 3rd party DNS libraries for java (sorry, I don't have experience with using any of them).
Edit to add: I like figuring things out. I do not have an IP handy that has multiple PTR records to test this against, but it should do the trick.
import java.io.IOException;
import java.util.Properties;
import javax.naming.Context;
import javax.naming.NamingException;
import javax.naming.directory.Attribute;
import javax.naming.directory.Attributes;
import javax.naming.directory.InitialDirContext;
public class App
{
public static void main(String[] args) throws IOException, NamingException
{
Properties env = new Properties();
env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.dns.DnsContextFactory");
InitialDirContext idc = new InitialDirContext(env);
String ipAddr = "74.125.225.196";
// Turn the IP into an in-addr.arpa name
// 196.225.125.74.in-addr.arpa.
String[] quads = ipAddr.split("\\.");
StringBuilder sb = new StringBuilder();
for (int i = quads.length - 1; i >= 0; i--)
{
sb.append(quads[i]).append(".");
}
sb.append("in-addr.arpa.");
ipAddr = sb.toString();
Attributes attrs = idc.getAttributes(ipAddr, new String[] {"PTR"});
Attribute attr = attrs.get("PTR");
if (attr != null)
{
for (int i = 0; i < attr.size(); i++)
{
System.out.println((String)attr.get(i));
}
}
}
}
Well, there is only one good way: call nslookup or dig or whatever from the Java process.
With Runtime.getRuntime().exec(..)
or better with ProcessBuilder...
This answer might be helpful: https://stackoverflow.com/a/24205035/8026752
Using the lookupAllHostAddr method of DNSNameService works for me, and returns all IP addresses by hostname. Maybe it will also help with finding all hostnames by IP address, but it seems this depends on DNS server configuration. In my case I even couldn't find all hostnames using nslookup, so I couldn't test it, so I'm not sure about this solution.
One note: lookupAllHostAddr is not a static method, so you should use it like this:
InetAddress[] ipAddress = new DNSNameService().lookupAllHostAddr("hostname");
Also, from my perspective, this link could be interesting (it's also information from the same answer thread mentioned by me above, I just summarize it a bit):
https://docs.oracle.com/javase/8/docs/technotes/guides/net/properties.html
On the linked page you can find properties to disable lookups caching:
sun.net.inetaddr.ttl - you should add it to JVM start command line like this: -Dsun.net.inetaddr.ttl=0, 0 here means that hostname will be cached for 0 seconds
networkaddress.cache.ttl - you should add the needed value to the java.security file located at %JRE%\lib\security
A bit more info can be found here also:
http://www.rgagnon.com/javadetails/java-0445.html
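As an alternative to editing the java.security file, the same security property can be set from code. This is only a sketch with a made-up class name (DnsCacheConfig); it assumes the call runs before the first name lookup in the JVM, since the cache policy is typically read once.

```java
import java.security.Security;

// Hypothetical helper: turns off InetAddress lookup caching for this JVM.
final class DnsCacheConfig {

    static void disableCaching() {
        // 0 = do not cache successful lookups
        Security.setProperty("networkaddress.cache.ttl", "0");
        // 0 = do not cache failed lookups either
        Security.setProperty("networkaddress.cache.negative.ttl", "0");
    }
}
```

Disabling the cache means every lookup goes to the resolver, so this is best kept to diagnostics or to cases where stale entries are a real problem.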

Regex submitting with empty string

I have the following REGEX that I'm serving up to java via an xml file.
[a-zA-Z -\(\) \-]+
This regex is used to validate server side and client side (via javascript) and works pretty well at allowing only alphabetic content and a few other characters...
My problem is that it will also allow zero-length / empty strings through.
Does anyone have a simple and yet elegant solution to this?
I already tried...
[a-zA-Z -\(\) \-]{1,}+
but that didn't seem to work.
Cheers!
UPDATE FOLLOWING INVESTIGATION
It appears the code I provided does in fact work...
String inputStr = " ";
String pattern = "[a-zA-Z -\\(\\) \\-]+";
boolean patternMatched = java.util.regex.Pattern.matches(pattern, inputStr);
if ( patternMatched ){
out.println("Pattern MATCHED");
}else{
out.println("NOT MATCHED");
}
After looking at this more closely I think the problem may well be within the logic of some of my java bean coding... It appears the regex is dropped out at the point where the string parse should take place, thereby allowing empty strings to be submitted... And also any other string... EEJIT that I am...
Cheers for the help in peer reviewing my initial stupid thought....!
Have you tried this:
[a-zA-Z -\(\) \-]+
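For anyone checking this themselves, here is a small sketch (class and constant names are invented for illustration). It shows that the + quantifier already rejects the empty string, and also that the " -\(" part of the character class is read as a range from space to "(", so characters such as "!" get through. The TIGHTENED pattern is only a guess at what was intended: letters, space, parentheses and hyphen.

```java
import java.util.regex.Pattern;

// Hypothetical demo class for the pattern discussed in this thread.
final class NameRegexDemo {
    // The pattern from the question, written as a Java string literal.
    static final String ORIGINAL = "[a-zA-Z -\\(\\) \\-]+";
    // Assumed intent: the hyphen is placed last so it is taken literally.
    static final String TIGHTENED = "[a-zA-Z() -]+";

    /** Whole-string match, the same call the question uses. */
    static boolean matches(String pattern, String input) {
        return Pattern.matches(pattern, input);
    }

    public static void main(String[] args) {
        System.out.println(matches(ORIGINAL, ""));                  // false: + needs at least one character
        System.out.println(matches(ORIGINAL, "!"));                 // true: '!' falls inside the space-to-( range
        System.out.println(matches(TIGHTENED, "!"));                // false
        System.out.println(matches(TIGHTENED, "Mary-Ann (Smith)")); // true
    }
}
```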

Detecting all available network's broadcast addresses in Java

For my project I wanted a list of all available broadcast addresses, so that I could broadcast a request and have my other application, running on another computer somewhere in the network, respond. To get the list I came up with this (now a slightly modified version, with a contribution from Mike):
private ArrayList<InetAddress> getBroadcastAddresses() {
ArrayList<InetAddress> listOfBroadcasts = new ArrayList();
Enumeration list;
try {
list = NetworkInterface.getNetworkInterfaces();
while(list.hasMoreElements()) {
NetworkInterface iface = (NetworkInterface) list.nextElement();
if(iface == null) continue;
if(!iface.isLoopback() && iface.isUp()) {
System.out.println("Found non-loopback, up interface:" + iface);
Iterator it = iface.getInterfaceAddresses().iterator();
while (it.hasNext()) {
InterfaceAddress address = (InterfaceAddress) it.next();
System.out.println("Found address: " + address);
if(address == null) continue;
InetAddress broadcast = address.getBroadcast();
if(broadcast != null) listOfBroadcasts.add(broadcast);
}
}
}
} catch (SocketException ex) {
return new ArrayList<InetAddress>();
}
return listOfBroadcasts;
}
It works quite well for a regular LAN. On the WiFi LAN, however, it leaves the second while loop after one step because address is null, even though System.out.println(interfaceItem), which I added just to see which interfaces are visited, printed the wireless LAN's name and my IP address on that network.
EDIT 1:
This is the output where 172.16.1.104 is my IP in the wireless network. The problem appears ONLY on my notebook with Wifi. The output is from my notebook where I mostly use wireless and sometimes I use UTP to connect with my friend. There is also one network interface of VirtualBox on my notebook.
Could you tell me what's wrong with it? Thank you!
Note: So it turns out that this might be problem for my notebook in particular and the code works for everybody else in general, I love this kind of problem :-) Seems like a dead end to me but thank for help anyway :-)Still love you! ;-)
I think you'll need to iterate across all the addresses, and additionally check if the broadcast address is null as well.
Consider that you might have addresses that you aren't expecting assigned to the interface as well. On my Linux system, with your code the first address I see is an IPv6 address, with a null broadcast (since there is no such thing as an IPv6 broadcast - though you can use multicast to achieve the same effect).
You need to completely remove the 1st way section of code. When you continue; there you'll go to the next interface instead of considering the possibility that there are two addresses.
The other reason why you always want to iterate all the addresses that can have broadcasts is because you need to consider that you might have addresses on two networks assigned to an interface. For example, you might have an interface with both 192.168.0.1/24 and 172.16.0.1/24 assigned.
Also, consider using a Set to store the broadcast addresses to protect against the case where you might have two addresses on the same subnet assigned.
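If getBroadcast() gives you nothing useful for an IPv4 address, the broadcast address can also be derived from the address and its prefix length (InterfaceAddress exposes the latter via getNetworkPrefixLength()). The following is just a sketch with an invented class name (BroadcastMath); it works on dotted-quad strings to keep the example free of lookups, and collects results in a Set as suggested above.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper: derives the IPv4 broadcast address from address + prefix length.
final class BroadcastMath {

    static String broadcastOf(String ipv4, int prefixLength) {
        String[] quads = ipv4.split("\\.");
        if (quads.length != 4 || prefixLength < 0 || prefixLength > 32) {
            throw new IllegalArgumentException("Expected a dotted quad and a prefix length of 0..32");
        }
        int ip = 0;
        for (String quad : quads) {
            int octet = Integer.parseInt(quad);
            if (octet < 0 || octet > 255) {
                throw new IllegalArgumentException("Bad octet: " + quad);
            }
            ip = (ip << 8) | octet;
        }
        // All host bits set to 1; a shift by 32 would be a no-op in Java, hence the special case.
        int hostMask = prefixLength == 32 ? 0 : (0xFFFFFFFF >>> prefixLength);
        int bc = ip | hostMask;
        return ((bc >>> 24) & 0xFF) + "." + ((bc >>> 16) & 0xFF) + "."
                + ((bc >>> 8) & 0xFF) + "." + (bc & 0xFF);
    }

    public static void main(String[] args) {
        Set<String> broadcasts = new LinkedHashSet<>();
        broadcasts.add(broadcastOf("192.168.0.1", 24));
        broadcasts.add(broadcastOf("172.16.0.1", 24));
        broadcasts.add(broadcastOf("192.168.0.7", 24)); // same subnet, so the Set keeps one entry
        System.out.println(broadcasts);
    }
}
```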
Finally, since using broadcast addresses will restrict you to talking only to hosts that have an IP address in the same subnet, you might miss hosts that are not configured properly with the same subnet/netmask. So you might want to consider using multicast for this; you could use the IPv4 (or IPv6) all-nodes multicast addresses to reach all hosts on the link, regardless of the configured address (224.0.0.1 and FF02::1, respectively; FF01::1 is only interface-local and never leaves the host).
Edit: You also have a bug on the 2nd way, related to your use of the iterator. Since you're getting a new .iterator() each time around the for loop, you're lucky there isn't an infinite loop here. I changed your code to this, and it works for me:
$ cat Broadcasts.java
import java.net.*;
import java.util.*;
public class Broadcasts
{
public static void main(String[] args)
{
HashSet<InetAddress> listOfBroadcasts = new HashSet<InetAddress>();
Enumeration list;
try {
list = NetworkInterface.getNetworkInterfaces();
while(list.hasMoreElements()) {
NetworkInterface iface = (NetworkInterface) list.nextElement();
if(iface == null) continue;
if(!iface.isLoopback() && iface.isUp()) {
//System.out.println("Found non-loopback, up interface:" + iface);
Iterator it = iface.getInterfaceAddresses().iterator();
while (it.hasNext()) {
InterfaceAddress address = (InterfaceAddress) it.next();
//System.out.println("Found address: " + address);
if(address == null) continue;
InetAddress broadcast = address.getBroadcast();
if(broadcast != null)
{
System.out.println("Found broadcast: " + broadcast);
listOfBroadcasts.add(broadcast);
}
}
}
}
} catch (SocketException ex) {
System.err.println("Error while getting network interfaces");
ex.printStackTrace();
}
// return listOfBroadcasts;
}
}
Another problem you may run into is the try/catch around basically the entire function, which would cause this code to stop if it hit something unexpected. It would be better to surround possible failure points with a try/catch and do something sane (like skip the interface or address), but I didn't look at which methods can throw exceptions.
Edit 2: I misread your code; your iterator was fine. ;-) The problem (which I pointed out earlier) was that your 1st way is short-circuiting your 2nd way; since it hits the continue; statement, if the first address is null you don't even try to loop through them all.
In any case, run with those println statements and post the results if you're still having trouble.
Edit 3: OK, I give up. ;-) Based on the output you posted, it looks like you are running into a bug in the NetworkInterface class.
I don't know if it would help to turn off the preferIPv4Stack option, but you should test that. I searched around a little bit for bug reports that describe this behavior and could not find any.
Since you're on Linux, you could always take the fallback approach of shelling out and calling something like:
/sbin/ip addr | perl -ne 'print "$1\n" if $_ =~ /inet.* brd ([0-9\.]*)/'
... which should return you a list of broadcast addresses.
Edit 4: I just noticed in the JavaDoc for NetworkInterface there is a getSubInterfaces() call. Maybe you need to call this in order to make sure you get all the addresses? (it might help to post the output of /sbin/ip addr and /sbin/ifconfig)
Edit 5: Regarding the just-added bounty. (This question is over a year old!) Could someone please run the code in my answer above (edited to make it easy to copy/paste/run) and tell me if it works? If it doesn't, please edit the question and note the exact errors/problems.
