Google "hacker" honeypots explained


Slashdot did it again:  they're reporting on an alleged Google-related security issue by linking to a Web site so uninformative, it requires multiple careful readings to figure out just what the Hell they're blathering about; and once figured out, the issue turns out to be much less exciting than it sounds.  Bonus points for abusing the word "hackers" to mean not even regular script kiddies, but some pathogenic life-form even more unsophisticated.  Here's my take on what the actual issue is.

So, there are a lot of popular software packages used to run Web sites, right?  And some of those are insecure, right?  The attack is...  wait for it...  if you're running an insecure package on your Web site...  someone could find that out by searching for the package name, or some phrase that typically appears in its output, with Google!  For instance, if BackPage Server Extension version 1.2.3 is known to be insecure, someone could type "BackPage Server Extension 1.2.3" and find sites that are using it, to point their scripts at.  Such people are called (by the article) "Google hackers", and such searches are called (by the article) "attacks" - the searches themselves, before the "hackers" even actually do anything.  The article is about setting up a honeypot for such "attacks":  you put up a Web site that claims to be running an insecure software package, see how many people are searching for it in Google, and then feel very self-satisfied, though why I'm not sure.

Something I must admit I have noticed in my referrer logs is the large number of people who are searching for the phrase "index of" alongside keywords typical of porn.  These people appear to be looking for auto-generated server directory listings, so that they can view what they consider to be the good stuff without having to look at the crummy page design of the people who put up the Web site.  Some of the searches even show attempts (usually with invalid query syntax) to exclude HTML files from the results.
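For anyone who wants to check their own logs for the same pattern, here is a minimal sketch in Python. It assumes you have already pulled the referrer URLs out of your log file into a list, and that the search engine passes the query in a "q" parameter (true of Google, though other engines may differ). The function names are made up for illustration.

```python
from collections import Counter
from urllib.parse import urlsplit, parse_qs


def search_query(referrer):
    """Return the lowercased search query from a referrer URL, or None."""
    parts = urlsplit(referrer)
    # Assumption: the query travels in a "q" parameter, as with Google.
    values = parse_qs(parts.query).get("q")
    return values[0].lower() if values else None


def count_index_of_searches(referrers):
    """Tally the search queries that contain the phrase "index of"."""
    counts = Counter()
    for ref in referrers:
        query = search_query(ref)
        if query and "index of" in query:
            counts[query] += 1
    return counts
```

Calling count_index_of_searches(referrers).most_common(10) would then show the ten most frequent such queries.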

I have mixed feelings on that.  On the one hand, the people who do it seem to be doing something sneaky.  On the other hand, I also am annoyed by the gratuitous "This Virtual Directory does not allow its contents to be listed" messages I get from IIS sites when I attempt to bypass broken HTML myself; sites that do that impress me as being run by jerks.  Part of the issue may be the reasons involved:  wholesale use of Google to attempt to pre-emptively bypass site design makes you look like a scumsucking freeloader, whereas I only go hunting for directory listings on specific sites, after giving the sites' own navigation a chance to work for me instead of against me.  After I've given them a chance and they've failed, I think they should at least let me make it work properly for myself.  I guess what it comes down to is that I'd like to see more honour on both sides.  Sites should permit harmless exploration like directory listings, but users shouldn't reach for such methods as a first resort.

I find the "index of" searchers annoying, and I consider myself fully justified in catching them by referrer header and redirecting them.  You can see an example of where I send them here; that's what you'd see (instead of the page you thought you were getting) if you tried to get to a page on my site by searching for "\"index of\" nude illegal".  It always is something like "nude illegal" they're looking for - that or unauthorized MP3s.  I don't really have anything against nude undocumented immigrants, nor MP3s, but the attempt to take such material from Web sites while skipping over the site authors' intended presentation seems sleazy and antisocial to me.  If someone is going to be generous enough to share their contraband with you, and the entire world, you should be neighbourly about how you exploit the offer.  But I hesitate to call searching for "index of" an "attack" and I wouldn't dream of calling the people who do it "hackers".  Those are strong words, and inapplicable.
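The general shape of a referrer-based redirect can be sketched in a few lines of Python. This is an illustration, not the code this site actually runs: the function name and the redirect target are hypothetical, and it assumes the search query arrives in a "q" parameter of the Referer URL.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical destination page; the real one is not given in the post.
REDIRECT_TARGET = "/no-directory-listings.html"


def redirect_for(referrer, requested_path):
    """Pick the path to serve, given the Referer header value (may be None)."""
    # Assumption: the search query is carried in a "q" parameter.
    params = parse_qs(urlsplit(referrer or "").query)
    query = params.get("q", [""])[0].lower()
    if "index of" in query:
        return REDIRECT_TARGET
    return requested_path
```

A request with no Referer header, or with one from an ordinary search, gets the page it asked for. Since the header is supplied by the browser, this only catches visitors who don't bother to suppress it.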

If you wanted to make money on the Web, you could make a lot of it by setting up fake pages purporting to be Apache auto-generated indexes full of porn, and then redirecting readers into a maze of twisty little ad-links pages, all different, once you had them hooked.  That would be a much more interesting - and profitable - honeypot than the one in the article, and I think it would be morally justifiable, because all your victims would be people who were trying to cheat you in the social contract first.

Incidentally, I also get a great many hits from Google China (or whatever it's called) - and only from Google China, even though the query also hits me on other search engines - looking for the phrase "18 girl".  I'm not sure what's up with that.  The page that it hits has now gone cookies-forbidden, in the hope that I can at least inflict some education.  Most of these people probably can't read English well enough for it to sink in, though.  That may be why they're bothering to click through, and why all the hits are from China:  an English-literate person might know from the search result summary that my page isn't porn, and the Great Firewall (blocking other sites that attract non-Chinese traffic, but evidently not blocking mine) might be a factor as well.


Comments

george of the jungle from 213.84.157.209 at Mon, 02 Apr 2007 11:19:32 +0000:
Get over it man. Any professional website has their directory listings made invisible and that's the way it should be, directory listings are something that belong to the pre-2K internet. Websites that haven't done this are being "attacked" by leechers and it's their own damn fault.

Matthew Skala from 67.158.75.233 at Mon, 02 Apr 2007 12:30:56 +0000:
The correct thing to do is make those transactions return something useful - not block them with a meaningless error message.

Guy from 71.61.13.35 at Sun, 17 Aug 2008 00:02:29 +0000:
"...you could make a lot of it by setting up fake pages..." Haha, this has been done a lot. Search Google for "index of mp3" and then a song title or something. You'll get tons of pages that are fake directories that have every song ever made listed but when you click they redirect to some page where you can buy it. "skipping over the site authors' intended presentation, seems sleazy and antisocial to me. If someone is going to be generous enough to share their contraband with you" The thing is, lots of people will have, say, personal pages that just have a few links and pictures on them. But they will make directories on their sites without linking to them so they can have a personal backup of their mp3s. You can actually find mp3s with Google if you skip over the fake sites [usually .ru or have "mp3" in the domain] and most of those sites are completely unrelated to piracy or what not. Just normal sites with authors thinking they'll make a private page nobody can get to but them.

ep thorn from 67.84.176.35 at Sat, 13 Jun 2009 05:00:39 +0000:
"The correct thing to do is make those transactions return something useful - not block them with a meaningless error message."

Well as someone pointed out, LOTS of crappy MP3 sites are doing something like that. Typically these are sites that rip off artists and recording labels, selling songs for pennies on the dollar. MP3 Fiesta and crap like that. But they don't "add" anything useful, except internet pollution. Your suggestion is rubbish, except from the perspective of a marketing department that is too stupid to figure anything better out, and too selfish to care. These sites are almost always virus-ridden, spam-inducing domains that take money and keep it all to themselves, managing to give us the worst of both worlds, while making it impossible for users to find anything ending with an Mp3 extension, even if it's not under copyright and is sitting in an index that is purposely open.

Matt from 216.59.249.202 at Sat, 13 Jun 2009 05:22:16 +0000:
I think you're confusing a couple of different points, "ep thorn." When I said that those transactions (the URLs pointing at directories instead of the files within) should do "something useful," I meant useful to the reader, not useful to the marketers. Such URLs should point to an index of what's in the directory; if the Web site people don't want it to be auto-generated, they're welcome to make their own, but it really does have to be an index of what is in *that directory*, not just a redirect to the front page of the site or an opaque error message. The existing kneejerk habit of treating directory listings as a security threat has to stop.

I also wrote about the idea of people using redirects on these URLs for marketing purposes, for instance as search engine bait, but I thought it was already abundantly clear that I'm not on the side of the people who do such things.


Copyright © 2005, 2007 Matthew Skala