Fog Creek Software
Discussion Board




Hiding website email address from spammers?

Dear Forum

On my website http://www.lingolanguage.com I use mailto: tags followed by my email address and a subject line. I realize this format makes it harvestable by spammers but I hoped the unique subject line would enable the adaptive spam filter to pass through the genuine enquiries and to block the spam.

This works, but the Msblaster worm has a nasty side effect. It uses harvested valid email addresses as reply-to and sender addresses in spam. Therefore I get auto-replies saying my email has been bounced because it contained a virus, even though I never sent the email in the first place.

What's the best way for a webmeister to allow email and avoid spam? I thought of:

1. Use escape notation for the mailto link (%41%42%43 instead of ABC). But I think this would have no effect.

2. Use form-based email. Then they can't harvest it.

3. Use a GIF to show the email address. This is a bit unfair on the visually impaired.

4. Use Javascript to dynamically generate the email address. But then people not running javascript can't contact me.

Any ideas?

Bill Rayer
Tuesday, August 26, 2003

My*NO*Email@Ser*SPAM*ver.com

Don't know if it works, but I don't have any spam yet.  Make the user type it in to email you.  They can figure out what to filter.

Elephant
Tuesday, August 26, 2003

Put a robots.txt telling the spamhaus robots to leave your site alone.

They surely honour that, right?  I mean spammers are the epitome of honorable.

Dennis Forbes
Tuesday, August 26, 2003

(1) I've seen this done on some sites and heard people testify that it worked for them. However, if it becomes popular enough, the spammers will start harvesting those too.

(2) If you can do forms based, it's a pretty good solution. The drawbacks are that (a) you have to trust users to give you a real return address and (b) not to flood you, and (c) users can't save a carbon copy as easily as with email.

(3) Well, if you break it into two images and use alt text, visually impaired users should have about the same impaired usability as everybody else would on this one. Elephant's human but not program readable text seems like a better route.

(4) Like you say, it only works for JS clients. If you can generate addresses, perhaps a new address every week? (although, if users put you in their address book, the most you could do is tighten your filters on your older addresses)

GersonK
Tuesday, August 26, 2003

Use JavaScript. Add a noscript tag saying "Since we're not in a perfect world, the email address cannot be shown without JavaScript. You may contact us at info @ our domain name."

You can change the wording to something else..
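As a rough sketch of what the JavaScript half could look like: the address is assembled from separate pieces at run time, so it never appears whole in the page source. The function name, the "info" / "example.com" parts and the subject line are made up for illustration; the noscript fallback would sit next to the script in the page.

```javascript
// Hypothetical helper: names and address parts are illustrative only.
function buildMailtoLink(user, domain, subject) {
  // The full address only exists after the script has run.
  const address = user + "@" + domain;
  const href = "mailto:" + address + "?subject=" + encodeURIComponent(subject);
  return '<a href="' + href + '">' + address + "</a>";
}

// In a page the result would be written into the document, with a
// <noscript> block beside it carrying the "info @ our domain name" wording.
const link = buildMailtoLink("info", "example.com", "Website enquiry");
console.log(link);
```

Anything fancier (character codes, reversed strings) follows the same pattern; only the way the pieces are stored changes.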

NoScript
Tuesday, August 26, 2003

Give users 2 options.  Form mail or use CGI to dynamically generate a new email address for each user.  I don't right now, but you might consider taking advantage of http://www.spamgourmet.com/ to automate the process.

You probably want to make sure that people are properly hiding your email addy.  I'm more likely to give a URL than an email address at this point not out of a sense of self-promotion, but more because I can be sure that nobody's getting addresses for spamming off of the URL.

Anything javascript and/or client based, if widely used, will eventually be cracked by spammers and other undesirables.

The biggest problem for me is that once a can of worms has been opened, you ain't getting them back into the can.  So I've been able to reduce my spam intake, but given that I've had the same mail address for at least 8 years, your address has wide circulation.  Most of the SoBig mailings I got were from the unaugmented version of my email address, not from the version on my webpage.

w.h.
Tuesday, August 26, 2003

If you set up addresses like info@lingolanguage.com or webmaster@, hiding your address on your site is pointless because spammers will try out those addresses anyway since your site is connected.

Johnny Bravo
Tuesday, August 26, 2003

I can say, as somebody who has tracked the spam that has made it to my accounts, that some spammers do obey robots.txt simply because there's software out there to detect when they do that.

But not all of them.

w.h.
Tuesday, August 26, 2003

I use form-based mail on my websites, and automatically CC a copy back to the sender.  I've had the same valid email address for a few years and get very little spam.  Curiously, I did actually get spam submitted to me via my contact form for the first time last week, but I honestly get so little spam that it was more amusing than anything else.  (=

Sam Livingston-Gray
Tuesday, August 26, 2003

I use a fairly simple Javascript.  I do a pattern replace on "abuse@localhost" ;)  I'm sure a harvester could code something to decipher it, but there's so many exposed addresses, I doubt it would be worthwhile. 
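A minimal sketch of that kind of pattern replace, assuming the page markup carries the decoy "abuse@localhost" wherever the real address belongs. The real address parts below are invented, and in a browser the input would be the relevant element's markup rather than a string literal.

```javascript
// Assumption: the served HTML contains the decoy address; REAL_PARTS is a
// made-up example and is kept split so the address is not in the source.
const DECOY = /abuse@localhost/g;
const REAL_PARTS = ["sales", "example", "com"];

function revealAddress(html) {
  const real = REAL_PARTS[0] + "@" + REAL_PARTS[1] + "." + REAL_PARTS[2];
  // Swap every occurrence of the decoy for the assembled address.
  return html.replace(DECOY, real);
}

const page = '<a href="mailto:abuse@localhost">abuse@localhost</a>';
console.log(revealAddress(page));
```

A harvester reading the raw HTML only ever sees the decoy, which is the whole point.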

I also have a webform for non-Javascripters. 

I've been doing this for a year and had no complaints.

Making users type in images they see in a .gif is great for technical audiences, but ours would have a fit.

Lee
Tuesday, August 26, 2003

If you're running on Apache you can modify .htaccess to block a majority of known spambots.  Mark Pilgrim has a good how-to on his site (although he blocks a lot more than spam harvesters so don't just copy and paste).  The site is http://www.diveintomark.org/; you'll have to dig around in the site to find it, but it shouldn't be that hard.

Escaped notation isn't quite perfect, but you can use the encoding you're sending your text in already (you do specify the encoding, don't you?) to mask the text.  It's essentially the same thing.  There are a few utilities out there, Zeldman has mentioned a few of them ( http://www.zeldman.com/ ) but I think his links were all Mac specific (as is the one I use).

You're right that robots.txt may be ignored, but .htaccess can't be (although certain things can be modified to avoid it, hence Mark's tricks of banning entire IP ranges and whatnot).  And harvesters may learn to understand encodings, but I'm not aware of any that do, so it's a safe alternative for now.

IIS implementations can look at Mark's site for some links to others who performed similar magic on their sites.

And do get rid of webmaster, info, mail and yourdomainname emails, they just draw default traffic. 

We had a real spam problem which I finally addressed last month by installing Spaminator and sending all spam to a spam catcher (an email address specifically for dealing with spam).  I have a bayesian filter there on that email which sorts out the crap from the few (none so far) real emails which got mislabeled.  I've found the combination of all these methods to be highly effective. 

For the three email addresses I monitor I've received only 4 pieces of spam in the last week. YMMV.  Good luck.

Lou
Tuesday, August 26, 2003

Thanks for the feedback.

I think solutions where you require the sender to manually edit the email address only work if your senders are 100% technical.

Thinking more about javascript, I've always assumed a spammer would download the raw HTML and parse it and grab all the email addys. Escapes probably wouldn't work because converting them back to ASCII is trivial. But what's to stop a spammer executing the javascript in the downloaded HTML to see what it expands into? Javascript's pretty standard. Then no javascript system could work, surely?
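To make the "converting them back is trivial" point concrete, here is a sketch of what a harvester would need to do with a percent-escaped mailto link. The function name and sample address are hypothetical, and the pattern is a deliberately loose address matcher rather than a full RFC-grade one.

```javascript
// Hypothetical harvester step: undo percent-escapes, then scan the result.
function harvest(html) {
  let text;
  try {
    text = decodeURIComponent(html);
  } catch (e) {
    text = html; // malformed escapes: fall back to the raw text
  }
  // Loose pattern for illustration only.
  return text.match(/[A-Za-z0-9._+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g) || [];
}

const escaped =
  '<a href="mailto:%62%69%6C%6C%40%65%78%61%6D%70%6C%65%2E%63%6F%6D">mail me</a>';
console.log(harvest(escaped)); // [ 'bill@example.com' ]
```

One standard library call undoes the escaping, so it only helps against bots that never bother to decode.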

I also read if your address includes the word 'spam' or 'no-spam' you don't get much because the harvesters exclude these terms. Eg if I registered www.spamfree-lingolanguage.com then email probably wouldn't get sent there.

Sounds like forms are the only reasonable system. Guess that's what the nice Mr Joel is using :)

Bill Rayer
Tuesday, August 26, 2003

I should add that I use a hosting service, so modifications to the way the web server works are unlikely. Also I'm not a hosting expert, HTML and javascript are my limits!

Bill Rayer
Tuesday, August 26, 2003

"Sounds like forms are the only reasonable system. Guess that's what the nice Mr Joel is using :)"

He started out using escaping, and is now using forms. It's likely that the spammers see through that, and it was causing complaints.

Brad Wilson (dotnetguy.techieswithcats.com)
Tuesday, August 26, 2003

Bill Rayer:
"Thinking more about javascript, I've always assumed a spammer would download the raw HTML and parse it and grab all the email addys."

You need to make the distinction between a spammer and a bot.  Nothing is preventing a spammer from coming directly to your site and viewing your address, even if you encode it.  The same goes with making the email a gif file.  Right? The whole point is to prevent bots from crawling around and picking up your address to add to their lists.  So, you bring up a good point that (in the case of JavaScript) the code could just be extracted and evaluated.  While true, this is very impractical.  The bot would A) need to know the name of the JavaScript function in the first place, and B) would have to parse and evaluate EVERY line of JavaScript on all the pages it crawls in hope that something will look like an email address when it gets evaluated.  I don't think this would be practical for a bot in search of high volumes of addresses.

Beer to anyone who can explain (with exact detail) how the following JavaScript code does its magic!

Paste the following code into an HTML document:

############BEGIN CODE#############

<script type="text/javascript">
//<![CDATA[
<!--
var x="function f(x){var i,o=\"\",l=x.length;for(i=l-1;i>=0;i--) {try{o+=x.c" +
"harAt(i);}catch(e){}}return o;}f(\")\\\"function f(x,y){var i,o=\\\"\\\\\\\""+
"\\\\,l=x.length;for(i=0;i<l;i++){if(i<109)y++;y%=127;o+=String.fromCharCode" +
"(x.charCodeAt(i)^(y++));}return o;}f(\\\"\\\\\\\\\\\\n\\\\\\\\037\\\\\\\\02" +
"1\\\\\\\\001\\\\\\\\033\\\\\\\\035\\\\\\\\024\\\\\\\\010Pvqlslgc'3/t7qixy\\" +
"\\\\\\034\\\\\\\\177\\\\\\\\007JHBA[^\\\\\\\\tRXV_bU.!\\\\\\\\005%</)6\\\\\\"+
"\\1770::\\\\\\\\005y}+\\\\\\\\010\\\\\\\\027\\\\\\\\t\\\\\\\\002T7O6\\\\\\\\"+
"036\\\\\\\\006R\\\\\\\\005\\\\\\\\034[\\\\\\\\034 avgkai|0s`6p\\\\\\\\177}l" +
"D\\\\\\\\002\\\\\\\\036\\\\\\\\013\\\\\\\\001v\\\\\\\\016\\\\\\\\020i]A\\\\" +
"\\\\026\\\\\\\\\\\\\\\\SX\\\\\\\\036)6ezg(tiev~t\\\"\\\\,109)\\\"(f};)lo,0(" +
"rtsbus.o nruter};)i(tArahc.x=+o{)--i;0=>i;1-l=i(rof}}{)e(hctac};l=+l;x=+x{y" +
"rt{)05=!)31/l(tAedoCrahc.x(elihw;lo=l,htgnel.x=lo,\\\"\\\"=o,i rav{)x(f noi" +
"tcnuf\")"                                                                    ;
while(x=eval(x));
//-->
//]]>
</script>

############END CODE#############

Good Luck, you're gonna need it :-)
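Not a full explanation, but two things are visible in the code above: the outer string defines a function that reverses its argument, and the `while(x=eval(x));` loop keeps evaluating whatever each layer returns until a layer returns nothing. An inner layer also appears to XOR character codes against a running counter. A toy version of just the layering idea, with an invented payload and names, might look like this:

```javascript
// Toy version of the layering idea only; payload and names are invented.
function reverse(s) {
  return s.split("").reverse().join("");
}

// Innermost payload: an expression that yields the address. In a page it
// would more likely call document.write(...) and return nothing.
const payload = '"bill" + "@" + "example.com"';

// Outer layer: source text that reverses a stored string, so neither the
// payload nor the address appears in reading order.
const wrapped =
  "(function(s){return s.split('').reverse().join('');})(" +
  JSON.stringify(reverse(payload)) +
  ")";

const layer1 = eval(wrapped); // first pass yields the payload source text
const address = eval(layer1); // second pass yields the address itself
console.log(address);
```

Each extra layer costs the author nothing and forces a bot to actually run the script rather than pattern-match the source.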

Eddy
Tuesday, August 26, 2003

There are a bunch of javascript scramblers available -- for yet another one, check out AddressScrambler http://www.sourceforge.net/projects/spamgourmet which is a side project on http://www.spamgourmet.com  This one has a helper page that will generate the necessary javascript for you so you can paste it in place.  This can save a lot of time, and make things easier to maintain.

I think obfuscation through encoding/scrambling will work reasonably fine for as long as we want it to -- as has been said, it's quite impractical for spammers to get past this.  The fact that there are so many techniques and variants should immunize users from any concerted attempt to harvest such obfuscated addresses.

namewithheldbyrequest
Tuesday, August 26, 2003

Also note that the anti-spambot software is like spam filters..  It's an arms race.

Flamebait Sr.
Tuesday, August 26, 2003

Your best bet would be something like:


bill (at) lingo-stripthis-language.com

Then tell your users to replace (at) with @ and remove -stripthis-.
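For anyone generating these by script, a small sketch of the munging and its inverse. The function names and the insertion offset are assumptions for illustration; the marker and the "(at)" wording follow the example above.

```javascript
// Hypothetical helpers: insert a marker into the domain and spell out "@".
function munge(address, marker, offset) {
  const [user, domain] = address.split("@");
  return user + " (at) " + domain.slice(0, offset) + marker + domain.slice(offset);
}

// What the reader is asked to do by hand: put "@" back, drop the marker.
function unmunge(text, marker) {
  return text.replace(" (at) ", "@").replace(marker, "");
}

const shown = munge("bill@lingolanguage.com", "-stripthis-", 5);
console.log(shown);                          // bill (at) lingo-stripthis-language.com
console.log(unmunge(shown, "-stripthis-")); // bill@lingolanguage.com
```

The inverse is two string replacements, so this holds up only while bots don't special-case the marker you picked.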

Mickey Petersen
Tuesday, August 26, 2003

Why did a few bad apples have to spoil it for the bunch?

Guy Incognito
Tuesday, August 26, 2003

Tragedy of the commons, man, tragedy of the commons...

Grumpy Old-Timer
Tuesday, August 26, 2003

Form is good. Otherwise change the email address every hour/day/week/month/whatever and invalidate the previous one after, say, another 24 hours. This could easily be automated.
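A sketch of how the rotation could be automated, assuming weekly periods, a 24 hour grace window for the previous address, and a made-up "contact-" prefix. It also assumes the mail server can be told which local parts to accept, which depends entirely on the hosting setup. A plain counter like this is guessable, so a real setup would probably want something less predictable.

```javascript
// Assumptions: weekly rotation, 24h grace, hypothetical "contact-" prefix.
const PERIOD_MS = 7 * 24 * 60 * 60 * 1000;
const GRACE_MS = 24 * 60 * 60 * 1000;

// Number of whole periods since the Unix epoch.
function periodIndex(date) {
  return Math.floor(date.getTime() / PERIOD_MS);
}

// Address published on the website during the period containing `date`.
function addressFor(date, domain) {
  return "contact-" + periodIndex(date).toString(36) + "@" + domain;
}

// Addresses the mail server should accept at time `now`: the current one,
// plus the previous one while still inside the grace window.
function acceptedAddresses(now, domain) {
  const accepted = [addressFor(now, domain)];
  if (now.getTime() % PERIOD_MS < GRACE_MS) {
    accepted.push(addressFor(new Date(now.getTime() - PERIOD_MS), domain));
  }
  return accepted;
}

console.log(acceptedAddresses(new Date(), "example.com"));
```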


Wednesday, August 27, 2003


Scramblers are useless against the plethora of robots based on IE's Webbrowser ActiveX control.

Leonardo Herrera
Wednesday, August 27, 2003

Encoding my e-mail address as html entities did cut down my spam flow, but doubtless that won't continue to work forever.  The bayesian filter keeps most of it out of my mailbox though.

I found it to be a useful enough trick that I built a little tool that scrambles addresses for me.  You can download it at http://www.lazarusid.com/encoder.shtml
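For anyone curious what such an encoder boils down to, here is a minimal sketch (not the tool linked above, just the general idea): every character becomes a numeric HTML entity, which a browser renders normally but which doesn't look like an address in the raw source. The function names are illustrative.

```javascript
// Encode each character as a decimal numeric character reference.
function toEntities(text) {
  return Array.from(text, (ch) => "&#" + ch.codePointAt(0) + ";").join("");
}

// The inverse, to show how little work a harvester needs to undo it.
function fromEntities(text) {
  return text.replace(/&#(\d+);/g, (_, n) => String.fromCodePoint(Number(n)));
}

const encoded = toEntities("a@b.c");
console.log(encoded);               // &#97;&#64;&#98;&#46;&#99;
console.log(fromEntities(encoded)); // a@b.c
```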

Clay Dowling
Wednesday, August 27, 2003

Eddy:

Thanks for the indecipherable javascript! I appreciate your point about the distinction between a spammer and a bot. A spammer can visit sites and read addresses in the same way as a legitimate user, and as you say this can't be stopped. I should probably have said 'hiding addresses from spambots' instead of from spammers.

Mickey Petersen:

Although 'bill at lingolanguage dot com' would be safe against bots, I prefer it when visitors can email using a link or a button. It's hard enough getting visitors without adding extra layers.

Other thoughts:

The point about the 'bad apples' is so true. It's frustrating having to spend time and money just to 'fix' something that was never broken.

Encoding addresses using Javascript:

The more I think about this, the more I wonder if it works. The spambot grabs the HTML and parses it looking for email addys. In this case javascript would successfully hide the addy. But... what if the spambot submits the HTML to a browser control and then harvests it *after* the javascript has been interpreted? Then it doesn't matter how clever the javascript is, since it's been expanded into the proper address.

If I was writing spambots, I *would* interpret the javascript. It could be done automatically, you wouldn't have to write an interpreter, and the javascript that generates the address would be standard javascript that would run pretty much anywhere (it would have to be, otherwise some browsers couldn't render the address).

If this scenario is correct, then using javascript to hide addresses is of limited use.

Conclusions:

1. Use forms, which gives people the one-button option.

2. Also show the address in a GIF, which makes it available to the casual reader.

Or...

3. Register a second domain, eg lingolanguage-nospam.com and openly use this in mailto: links.

Bye for now

Bill Rayer
Wednesday, August 27, 2003

mailto:info[AT]yourserver[DOT].com

Shane_from_Dominos
Thursday, August 28, 2003

Bill: "using javascript to hide addresses is of limited use"

I would just like to add one more point.  Since there are no "real" solutions yet, the whole issue IMO, is just staying ahead of the game.  Right now, using JavaScript does just that.  I'd bet that 90% or more of the spambots are just scanning for addys.  Of the other remaining 10%, I'd bet that only half of those are actually able to successfully retrieve the addys when hidden within JavaScript code.  So until the bots "wise up" so to speak, using the JavaScript method will hopefully keep us ahead of the game until there is a "real" solution to this whole mess.  Or we'll just have to find yet another trick to move ahead of the majority of the bots out there again.

Eddy
Thursday, August 28, 2003

You could also try this or one of the many alternatives:
http://www.jracademy.com/~jtucek/email/download.html

Nepherim
Monday, September 01, 2003

Address Encoder that looked good, but that I've not yet used:

http://www.wbwip.com/wbw/emailencoder.html

Entrepreneur
Tuesday, September 09, 2003
