Regular Expressions: Finding Email Addresses

Recently I fixed the Sendmail configuration on one of our boxes, and I’m now inundated daily with over 1700 return mails from old and expired email addresses. This is the second day so I decided to combat it:

  1. Create a procmail script to redirect all the bounced mails into a file.
  2. Grab all the email addresses from that file and create SQL statements to disable sending mail to those users of our site.

With Google’s help, I came up with the following procmail recipe, and stuffed it into my .procmailrc:

* ^From: .*MAILER-DAEMON.*
* ^Subject:.*(Undeliverable|failure notice|Returned mail:|Delivery (Status )?Notification|Mail System Error|Delivery fail|Nondeliverable mail|Message status – undeliverable|Mail Delivery Problem|Notification d’état de la distribution).*

That should catch almost all the returns sent to my inbox.
I used the following code to extract the emails from the RETURNS file. It can probably be done better, but this works well enough.

awk -F “< " '// {print $2}’ ” ‘{print $1}’| sort|uniq

I then grep out bogus lines such as the ones smtp servers add, opened the file in vi and added SQL statements around each email address. I expect a lot less email in my inbox tomorrow..

Here’s a handy online regex tester if you want to test a regular expression easily.

Leave a Reply

%d bloggers like this:

By continuing to use the site, you agree to the use of cookies. more information

The cookie settings on this website are set to "allow cookies" to give you the best browsing experience possible. If you continue to use this website without changing your cookie settings or you click "Accept" below then you are consenting to this.