[6142] in www-talk@info.cern.ch

Re: No Nasty Robots! [Was: Full-text indexing for WWW conference

daemon@ATHENA.MIT.EDU (Martijn Koster)
Thu Oct 13 05:25:52 1994

Date: Thu, 13 Oct 1994 10:22:42 +0100
Errors-To: listmaster@www0.cern.ch
Reply-To: m.koster@nexor.co.uk
From: Martijn Koster <m.koster@nexor.co.uk>
To: Multiple recipients of list <www-talk@www0.cern.ch>


Nick Arnett wrote:

| If you're interested, please send me e-mail.  Our spider is quite simple at
| the moment. 

| The spider will hit your server fairly hard.  We have a real-time indexing
| engine and a T-1...

Groan -- Of course I agree with Roy Fielding, although I appreciate that
Nick is asking for a bilateral agreement beforehand.

And Dan Connolly wrote:

> When we at HaL built our CD ROM of abstracts of 10,000 web documents
> (with links to the documents themselves, with our OLIAS browser on the
> CD-ROM.. ask jps@hal.com for details), we implemented a "spider" that
> visited the various sites in an order such that no site was visited
> more than once per minute.
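
For illustration, the visiting order Dan describes could be sketched roughly
as below. This is not HaL's actual code: the schedule() function, the
per-host queues, and the example hosts are all hypothetical, and only the
one-minute interval comes from the text above.

```python
import heapq
from collections import defaultdict, deque
from urllib.parse import urlsplit

MIN_DELAY = 60.0  # seconds between visits to one site (the "once per minute" above)


def schedule(urls, min_delay=MIN_DELAY):
    """Return (earliest_time_offset, url) pairs in visiting order.

    Hypothetical helper: no host appears twice within min_delay seconds.
    """
    # Group the URLs by host, keeping their original order per host.
    queues = defaultdict(deque)
    for url in urls:
        queues[urlsplit(url).netloc].append(url)

    # Heap entries are (next_allowed_time, tie_breaker, host).
    heap = [(0.0, i, host) for i, host in enumerate(queues)]
    heapq.heapify(heap)

    plan = []
    while heap:
        t, i, host = heapq.heappop(heap)
        plan.append((t, queues[host].popleft()))
        if queues[host]:
            # This host may not be visited again until min_delay has passed.
            heapq.heappush(heap, (t + min_delay, i, host))
    return plan
```

A real spider would sleep until each offset before fetching; the sketch only
computes the order, so that other sites fill the time while one site is
resting.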

As always, I'd like to know the details of any new spiders that go
around: what machine do they come from, what User-agent do they send,
etc. Then at least the net.folk will know what's going on.
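
As a minimal sketch of what identifying a spider might look like, assuming
Python's standard urllib: the agent string, the contact address, and the
build_request() helper are made up for illustration, and sending a From
header is an assumption that goes beyond what is asked for above.

```python
from urllib.request import Request


def build_request(url, agent="ExampleSpider/0.1",
                  contact="spider-admin@example.org"):
    """Build (but do not send) a request that says who the spider is.

    The agent and contact defaults are hypothetical placeholders.
    """
    return Request(url, headers={
        "User-Agent": agent,  # names the robot in the server's logs
        "From": contact,      # assumption: an address admins can write to
    })
```

The request can then be passed to urllib.request.urlopen(); a server admin
who sees the hits can at least tell which robot made them.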

> Vince consulted the published guidelines[1], I believe. You will not
> please the net.folk if you blatantly disregard them.
> 
> Dan
> 
> [1] "Guidelines for Robot Writers"
> 	Martijn Koster 
> 	http://web.nexor.co.uk/mak/doc/robots/guidelines.html

:-)

-- Martijn
__________
Internet: m.koster@nexor.co.uk
X-400: C=GB; A= ; P=Nexor; O=Nexor; S=koster; I=M
X-500: c=GB@o=NEXOR Ltd@cn=Martijn Koster
WWW: http://web.nexor.co.uk/mak/mak.html
