Tag Archives: harverster

Spam Poison

I’ve never understood the rationale behind the various web scripts that claim that they are ‘poisoning’ spammers lists.

They claim that, by putting a bunch of garbage email addresses on a web page and encouraging spammers to harvest   the page, they are poisoning a spammers email list and thus making the list useless so that they will discard the list.

Where did they ever get the idea that spammers CARE about the good & bad addresses that they harvest.   Considering how much it costs them to send spam, they wouldn’t are one lick if 90% of the addresses on their list is valid or not.   Especially since most spam sent these days is done by infected personal computers controlled by ‘bot herders‘.

I often watch the log file on my mail server … and observe the servers trying to deliver mail to mangled and non-existent addresses.   It’s clear the sending system is just blasting names at my server without any consideration if the address exists or not.   My mail server software (sendmail) had an option to throttle the connection of any system that gets more than a certian number of bad addresses.   I also have a script that monitors for that behavior and adds a firewall rule to completely block such systems.

My advice to those who create such scripts … focus your energies elsewhere … right now you’re just wasting your time.

Oh, by the way, for those of you who are trying to use “User-Agent” lists to block harvesters … don’t waste your time on that either.   Harvesters, like spammers, will never clearly identify themselves as such … and will use a completely legitimate user agent when spidering your website.   Additionally, they will absolutely ignore any robots.txt file that you have in your site.

There are, however, systems that use a somewhat similar technique to stop spammers … but are much more effective.   Instead of just trying to poison a spammers list … they use traceable email addresses.   This means that, when a harvester visits a page with this traceable email address, they log the IP the harvester is using and the email address that is harvested.   This way, when spam is sent to the address, they know when and how the address was harvested and where it was sent from.   Theoretically the spammer is reported both to the ISP that the mail was sent through AND the ISP that was used to harvest the address.   Project Honeypot is one such anti-harvester organization.