Regular Expressions: Finding Email Addresses

Recently I fixed the Sendmail configuration on one of our boxes, and I’m now inundated daily with over 1700 bounced mails for old and expired email addresses. This is the second day of it, so I decided to fight back:

  1. Create a procmail script to redirect all the bounced mails into a file.
  2. Grab all the email addresses from that file and create SQL statements to disable sending mail to those users of our site.

With Google’s help, I came up with the following procmail recipe, and stuffed it into my .procmailrc:

:0
* ^From: .*MAILER-DAEMON.*
* ^Subject:.*(Undeliverable|failure notice|Returned mail:|Delivery (Status )?Notification|Mail System Error|Delivery fail|Nondeliverable mail|Message status - undeliverable|Mail Delivery Problem|Notification d'état de la distribution).*
RETURNS

That should catch almost all the returns sent to my inbox.
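To see which headers the two conditions would accept, here is a rough Python approximation of the recipe. It is only a sketch: procmail's regex dialect is not identical to Python's, the sample headers are made up, and the subject pattern assumes a plain hyphen and a straight apostrophe. The names `FROM_RE`, `SUBJECT_RE` and `is_bounce` are mine, not anything procmail provides.

```python
import re

# Approximations of the two recipe conditions. procmail matches
# case-insensitively by default, hence re.IGNORECASE.
FROM_RE = re.compile(r"^From: .*MAILER-DAEMON", re.IGNORECASE | re.MULTILINE)
SUBJECT_RE = re.compile(
    r"^Subject:.*(Undeliverable|failure notice|Returned mail:"
    r"|Delivery (Status )?Notification|Mail System Error|Delivery fail"
    r"|Nondeliverable mail|Message status - undeliverable"
    r"|Mail Delivery Problem|Notification d'état de la distribution)",
    re.IGNORECASE | re.MULTILINE,
)


def is_bounce(headers: str) -> bool:
    """Return True only when both conditions match, as in the recipe."""
    return bool(FROM_RE.search(headers)) and bool(SUBJECT_RE.search(headers))


# Made-up sample headers, for illustration only.
bounce = (
    "From: Mail Delivery Subsystem <MAILER-DAEMON@example.com>\n"
    "Subject: Returned mail: see transcript for details\n"
)
normal = "From: alice@example.com\nSubject: Lunch on Friday?\n"

print(is_bounce(bounce), is_bounce(normal))
```

Both conditions have to match before a mail is filed, so a message from MAILER-DAEMON with an unrecognised subject still lands in the inbox.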
I used the following code to extract the emails from the RETURNS file. It can probably be done better, but this works well enough.

awk -F "<" '// {print $2}' RETURNS | awk -F ">" '{print $1}' | sort | uniq
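The same extraction can be sketched in Python with a single regular expression. This is a loose approximation rather than a validating parser: it assumes the addresses of interest appear inside angle brackets, and the names `ANGLE_ADDR_RE`, `extract_addresses` and `extract_from_file` are made up for this example.

```python
import re
from typing import List

# Matches <local@domain> and captures the part between the brackets.
# Deliberately loose: this is not a full RFC 5322 address parser.
ANGLE_ADDR_RE = re.compile(r"<([^<>\s@]+@[^<>\s@]+)>")


def extract_addresses(text: str) -> List[str]:
    """Return the unique bracketed addresses in text, sorted."""
    return sorted(set(ANGLE_ADDR_RE.findall(text)))


def extract_from_file(path: str) -> List[str]:
    """Read a mailbox file such as RETURNS and extract its addresses."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return extract_addresses(handle.read())


# Made-up bounce text, for illustration only.
sample = (
    "<bob@example.org>: host not found\n"
    "To: <alice@example.com>\n"
    "<bob@example.org>... User unknown\n"
    "no address on this line\n"
)
print(extract_addresses(sample))
```

Building a set and sorting it does the job of `sort | uniq`. Anything in angle brackets that contains an @ is picked up, so Message-ID values come along too and still need weeding out.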

I then grepped out bogus lines, such as the ones SMTP servers add, opened the file in vi and added SQL statements around each email address. I expect a lot less email in my inbox tomorrow.
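The vi step could also be scripted. The sketch below is hypothetical throughout: the `users` table, the `send_mail` and `email` columns, and the list of bogus prefixes are placeholders, not the real schema or the filter I actually used.

```python
from typing import Iterable, List

# Assumed examples of addresses that belong to mail servers, not users.
BOGUS_PREFIXES = ("mailer-daemon@", "postmaster@")


def disable_statement(address: str) -> str:
    """Build one UPDATE statement for a hypothetical users table."""
    # Doubling single quotes is standard SQL quoting. For anything beyond
    # a one-off script, parameterized queries are the safer choice.
    escaped = address.replace("'", "''")
    return f"UPDATE users SET send_mail = 0 WHERE email = '{escaped}';"


def build_statements(addresses: Iterable[str]) -> List[str]:
    """Deduplicate, drop server addresses, and emit one statement each."""
    statements = []
    for address in sorted(set(addresses)):
        if address.lower().startswith(BOGUS_PREFIXES):
            continue
        statements.append(disable_statement(address))
    return statements


# Made-up addresses, for illustration only.
found = [
    "bob@example.org",
    "MAILER-DAEMON@mx.example.net",
    "bob@example.org",
    "o'brien@example.com",
]
for line in build_statements(found):
    print(line)
```

The output can be reviewed by eye before it is fed to the database, which is what the manual pass in vi amounted to.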

Here’s a handy online regex tester if you want to test a regular expression easily.

Sendmail – Masquerading And Relaying

Ah! The joys of Sendmail cf configuration. How arcane can the following get?

R$+ $@ $>93 $1

That line tells Sendmail to masquerade the headers of an email as another domain. Luckily I just generated another sendmail.cf with my install-sendmail script and did a vimdiff of both files. Comments are helpful too, and take some of the bite out of the .cf file.

Jeremy Zawodny's blog: The Facts… Sort of.

I guess Jeremy never said it explicitly, but I (and probably lots of others) assumed Yahoo Finance used MySQL in the delivery of data. It’s only used in the back office, which is a lot different!
Tradesignals.com, where I work, uses MySQL to store our market data, and obviously to deliver it too. Then again, the futures market is a lot smaller than the range of data Yahoo offers. While I can’t reveal how much data we go through, American futures trading generates a lot of information!

Linux: Improving Interactivity

Kernel Trap reports on recent activity to make Linux more responsive. It’s looking good!

“I’d handwavingly describe both your patch and sched-2.5.64-a5 as 80% solutions, and the combo 95%.” Robert Love agreed, “This is great for me, too. I played around with some mp3 playing and did the akpm-window-wiggle test. It is definitely the smoothest.”