Friday, May 22, 2009

Normalizing Email Addresses

I was going to write a post here describing this function, why I wrote it, and what it does. But then I re-read the comments that I'd added to the function and the examples I'd included in the code and said "Hey, this is perfectly documented and doesn't need any further explanation."

/// <summary>
/// Takes an email address and converts it to a normalized base form.
/// This is needed for some email addresses such as gmail where you can
/// add dots anywhere in the name as well as a + sign and tag anything
/// extra on to the email address.
/// Example email addresses to normalize:
/// [email protected]
/// [email protected]
/// [email protected]
/// These should all normalize to:
/// [email protected]
/// </summary>
/// <param name="emailAddress">Any email address</param>
/// <returns>A modified/normalized email address or the original string</returns>
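public static string NormalizeEmailAddress(string emailAddress)
{
    // So far we only know about gmail.com that does this.
    // Match the domain at the end of the address and ignore case, so that
    // "@GMAIL.COM" is handled and "@gmail.com.example.org" is left alone.
    // (StringComparison lives in the System namespace.)
    if (emailAddress != null &&
        emailAddress.EndsWith("@gmail.com", StringComparison.OrdinalIgnoreCase))
    {
        string[] parts = emailAddress.Split('@');
        // If there is more than one @ in the address then this is an
        // invalid email address so don't try and do anything to it.
        if (parts.Length == 2)
        {
            // Drop everything from the first + onwards, then remove the dots.
            string[] beforeAtParts = parts[0].Split('+');
            return beforeAtParts[0].Replace(".", "") + "@" + parts[1];
        }
    }

    return emailAddress;
}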

Ironically, I love this Gmail feature, I use it all the time, and I think that most people use it for good. Unfortunately, a small element of the population will use it to register multiple times on a site so that they can troll and spam, so I am doing this conversion as a preventive measure. Note that on the sites where I've implemented this, I still allow users to register with any address they want. However, when a site does not allow multiple registrations with the same email address, comparing the normalized form of each new address against the normalized forms of the addresses already registered will prevent multiple registrations by the same person.
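As a rough illustration of that comparison, here is a minimal sketch. It assumes an in-memory `HashSet` standing in for whatever store a real site uses. The `RegistrationCheck` class, the `TryRegister` method, and the sample addresses are all made up for this example, and the class carries its own condensed variant of `NormalizeEmailAddress` so that it compiles on its own.

```csharp
using System;
using System.Collections.Generic;

public static class RegistrationCheck
{
    // Condensed variant of the normalization function from this post.
    public static string NormalizeEmailAddress(string emailAddress)
    {
        if (emailAddress != null &&
            emailAddress.EndsWith("@gmail.com", StringComparison.OrdinalIgnoreCase))
        {
            string[] parts = emailAddress.Split('@');
            if (parts.Length == 2)
            {
                string[] beforeAtParts = parts[0].Split('+');
                return beforeAtParts[0].Replace(".", "") + "@" + parts[1];
            }
        }
        return emailAddress;
    }

    // Hypothetical stand-in for a user table: holds the normalized form of
    // every registered address. Comparing case-insensitively is an
    // assumption of this sketch, not something the post specifies.
    private static readonly HashSet<string> registeredKeys =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase);

    // Returns true if the address was accepted, false if its normalized
    // form is already registered. The user's original spelling is not
    // altered; only the comparison key is normalized.
    public static bool TryRegister(string emailAddress)
    {
        string key = NormalizeEmailAddress(emailAddress);
        return registeredKeys.Add(key);
    }

    public static void Main()
    {
        Console.WriteLine(TryRegister("jane.doe@gmail.com"));       // True
        Console.WriteLine(TryRegister("janedoe+forum@gmail.com"));  // False
        Console.WriteLine(TryRegister("jane.doe@example.com"));     // True
    }
}
```

In a real site the normalized key would more likely be stored in its own column next to the address the user typed, so that mail still goes to the address they chose.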

7 comments:

  1. Thanks, Guy. It seems like there's a lot more you have to do to normalize an email address (especially if it isn't Gmail).
    Also, how did you find out that gmail does this?

  2. A friend of mine (Bill Brown) blogged about it. Very few people know about this feature of gmail.

  3. This is a really stupid feature. An alias on a user's domain is fine
    [email protected]
    but a partial alias on an ISP/mail domain is asking for trouble. It's irresponsible.

  4. Thanks for highlighting this!

  5. Mike - I disagree. I think that this is a great feature of Google's. It allows me to use a unique email address at each site on the web that I register at.
    If I suddenly start receiving spam from some unknown source, then the address it was sent to will tell me which site sold my email address or was a front for spam address collection.
    These unique email addresses also allow me to set up filters in my email client to delete or categorize mail from each source.

  6. Hi, just stumbled across your post (because I'm going to implement something similar).
    So here are just some thoughts:
    There are quite a few providers that have the plus (or similar) logic, letting you create variations of the same email address. Here's an overview: http://en.wikipedia.org/wiki/Email_address#Address_tags

    Additionally, all the Gmail logic extends to googlemail.com addresses (i.e. [email protected] = [email protected]).
    Similarly, iCloud.com = me.com.
    Hotmail uses the + syntax.

    sorry if this is just a bother or a bore...

    best regards
    joerg

  7. Hi Joerg - thank you for that update. I was not aware that the others that you mentioned had implemented that. I'll improve my algorithm to handle those as well. Good luck with your implementation. Feel free to reply with how you did it.
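
One way the extension discussed in comments 6 and 7 might look is a small table of per-domain rules. This is only a sketch under stated assumptions: the `MultiProviderNormalizer` class and its `Rule` type are hypothetical, the provider behaviors are taken from comment 6 rather than verified against each provider, and choosing icloud.com (rather than me.com) as the canonical domain is arbitrary. The sample addresses are made up.

```csharp
using System;
using System.Collections.Generic;

public static class MultiProviderNormalizer
{
    // Hypothetical rule record: how to treat the part before the @ and
    // which domain to emit for a given provider.
    private sealed class Rule
    {
        public string CanonicalDomain;
        public bool StripDots;
        public bool StripPlusTag;
    }

    // Behaviors as described in the comments above, not independently verified.
    private static readonly Dictionary<string, Rule> Rules =
        new Dictionary<string, Rule>(StringComparer.OrdinalIgnoreCase)
        {
            { "gmail.com",      new Rule { CanonicalDomain = "gmail.com",   StripDots = true,  StripPlusTag = true } },
            { "googlemail.com", new Rule { CanonicalDomain = "gmail.com",   StripDots = true,  StripPlusTag = true } },
            { "hotmail.com",    new Rule { CanonicalDomain = "hotmail.com", StripDots = false, StripPlusTag = true } },
            { "icloud.com",     new Rule { CanonicalDomain = "icloud.com",  StripDots = false, StripPlusTag = false } },
            { "me.com",         new Rule { CanonicalDomain = "icloud.com",  StripDots = false, StripPlusTag = false } },
        };

    public static string Normalize(string emailAddress)
    {
        if (emailAddress == null) return null;

        // Leave anything that does not have exactly one @ with text on
        // both sides untouched, as the original function does.
        int at = emailAddress.IndexOf('@');
        if (at <= 0 || at != emailAddress.LastIndexOf('@') || at == emailAddress.Length - 1)
            return emailAddress;

        string local = emailAddress.Substring(0, at);
        string domain = emailAddress.Substring(at + 1);

        Rule rule;
        if (!Rules.TryGetValue(domain, out rule)) return emailAddress;

        if (rule.StripPlusTag)
        {
            int plus = local.IndexOf('+');
            if (plus >= 0) local = local.Substring(0, plus);
        }
        if (rule.StripDots) local = local.Replace(".", "");

        return local + "@" + rule.CanonicalDomain;
    }

    public static void Main()
    {
        Console.WriteLine(Normalize("jane.doe+news@googlemail.com")); // janedoe@gmail.com
        Console.WriteLine(Normalize("jane.doe+news@hotmail.com"));    // jane.doe@hotmail.com
        Console.WriteLine(Normalize("jane.doe@me.com"));              // jane.doe@icloud.com
        Console.WriteLine(Normalize("jane.doe@example.com"));         // jane.doe@example.com
    }
}
```

Keeping the rules in a table means that adding another provider is a one-line change, and a domain that is not listed passes through unchanged.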
