OK, this may sound daft, but just use a robots.txt file to control them. Then the ones that ignore robots.txt can be blocked. Read the following article:
http://www.pixel2life.com/publish/tutorials/472/log_and_block_bad_bots_that_disregard_robots_txt/

I have used a similar method for a long time and have blocked ALL bad bots. The block list updates itself.
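
For anyone who wants the general idea in code, here is a rough sketch in Python. It is not the script from the article, just my reading of the approach: robots.txt disallows a trap path, well-behaved bots stay out, and any client that requests the path anyway gets added to a block list. The names TRAP_PATH, robots_txt and BotTrap are made up for illustration, and the block list is only held in memory.

```python
# Sketch only: TRAP_PATH, robots_txt and BotTrap are hypothetical names,
# not taken from the linked article.
TRAP_PATH = "/bot-trap/"


def robots_txt(trap_path=TRAP_PATH):
    """Return a robots.txt body telling every crawler to stay out of the trap."""
    return "User-agent: *\nDisallow: " + trap_path + "\n"


class BotTrap:
    """Blocks any client IP that requests the disallowed trap path."""

    def __init__(self, trap_path=TRAP_PATH):
        self.trap_path = trap_path
        self.blocked = set()  # in-memory only; a real setup would persist this

    def allow(self, ip, path):
        """Return True if the request should be served, False if refused."""
        if ip in self.blocked:
            return False  # already caught earlier
        if path.startswith(self.trap_path):
            # The client ignored robots.txt, so log it and block it from now on.
            self.blocked.add(ip)
            return False
        return True


if __name__ == "__main__":
    trap = BotTrap()
    print(robots_txt())
    print(trap.allow("203.0.113.5", "/index.html"))  # normal request
    print(trap.allow("203.0.113.5", "/bot-trap/"))   # walks into the trap
    print(trap.allow("203.0.113.5", "/index.html"))  # same IP, now refused
```

In a real setup you would probably write the caught IPs to a file or to your server's deny rules instead of a Python set, and put a hidden link to the trap path on your pages so that misbehaving crawlers actually find it. Both of those are assumptions on my part, so adjust to your own server.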