
#1 2003-09-16 00:00:02

jskier
Member
From: Minnesota, USA
Registered: 2003-07-30
Posts: 383

Running Apache, question about spiders and bots

I'm running Apache and have set up a PHP traffic analyzer. I see that bots and spiders are able to find pages and directories that are not linked from any page at all. How on earth do they figure out my directory structure? And how do I stop it? It makes me uneasy (I tried meta tags, but that only stops them from publishing the content). Any help would be appreciated. Thanks in advance,

jskier


--
JSkier


#2 2003-09-16 07:37:10

andy
Member
From: Germany
Registered: 2002-10-11
Posts: 374

Re: Running Apache, question about spiders and bots

Some robots (or spiders) simply guess typical names. But I think the well-behaved ones all adhere to the robots.txt convention described at
http://www.robotstxt.org/wc/robots.html (or simply plug robots.txt into Google, you'll find a lot).
Or was that what you meant by meta tags?
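
As a rough illustration of how a compliant robot reads robots.txt, here is a minimal Python sketch using the standard library's urllib.robotparser. The directory names (/private/, /stats/) and the bot name ExampleBot are made up for the example and are not taken from anyone's actual setup.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; the directory names are invented for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /stats/
"""


def is_allowed(path, user_agent="ExampleBot"):
    """Return True if a robot that honors robots.txt may fetch `path`."""
    parser = RobotFileParser()
    # parse() takes the file's lines; nothing is fetched over the network.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    for path in ("/index.html", "/private/notes.html", "/stats/"):
        print(path, is_allowed(path))
```

Keep in mind that robots.txt is only a request: a robot that ignores it is not stopped by it, and the file itself is publicly readable, so it tells anyone which paths you listed.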

The next thing you can do is look into the options listed at:
http://httpd.apache.org/docs/howto/auth.html
especially "access control". In case you only skim over that page ;-), here is an important snippet:

These directives may be placed in a .htaccess file in the particular directory being protected, or may go in the main server configuration file, in a <Directory> section, or other scope container.

Other than that: not linking to pages, or not showing information about them, does not protect you in any way, and the web was not designed to work that way.


#3 2003-09-17 03:02:56

jskier
Member
From: Minnesota, USA
Registered: 2003-07-30
Posts: 383

Re: Running Apache, question about spiders and bots

Thanks, the link was useful. I still don't understand how these bots are getting into the less common folder names, but oh well, at least the folders are secure now :P
This sort of thing never happened with IIS, but at least I don't have to worry about patching every other day.

jskier


--
JSkier

