[129478] in North American Network Operators' Group

Re: yahoo crawlers hammering us

daemon@ATHENA.MIT.EDU (Harry Strongburg)
Wed Sep 8 01:55:11 2010

Date: Wed, 8 Sep 2010 05:54:55 +0000
From: Harry Strongburg <harry.nanog@harry.lu>
To: nanog@nanog.org
Mail-Followup-To: nanog@nanog.org
In-Reply-To: <20100907201958.GM24371@sizone.org>
Errors-To: nanog-bounces+nanog.discuss=bloom-picayune.mit.edu@nanog.org

On Tue, Sep 07, 2010 at 04:19:58PM -0400, Ken Chase wrote:
> This makes it look like Yahoo is actually trafficking in pirated software, but
> that's kinda too funny to expect to be true, unless some yahoo tech decided to
> use that IP/server @yahoo for his nefarious activity, but there are better sites
> than my customer's box to get his 'juarez'.

It's not uncommon at all for a web spider to find large files and 
download them. I don't think there's some conspiracy at Yahoo to find 
warez; they are just operating as a normal spider, indexing the 
Internet.

> ~500K/s (4Mbps+) for a 3 gig file is kinda... a bit harsh.

What speed would you like a spider to download at? You could rate-limit 
Yahoo's address blocks server-side if you care enough. Ideally, ask your 
customer not to put large programs on there if you're concerned about 
bandwidth. 4 Mb/s isn't abnormal at all for a spider, especially on a 
larger file.
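
For what it's worth, here is a rough sketch of the "cap by source block" 
idea in Python. Everything named in it is an assumption for illustration: 
the CIDR blocks are documentation placeholders (not Yahoo's real ranges), 
the 1 Mb/s cap is arbitrary, and cap_for() and transfer_seconds() are 
hypothetical helpers. In practice you would more likely do this in the 
web server or firewall than in application code.

```python
import ipaddress

# Placeholder CIDR blocks (RFC 5737 documentation ranges), NOT Yahoo's real ones.
CRAWLER_BLOCKS = [
    ipaddress.ip_network(net) for net in ("192.0.2.0/24", "198.51.100.0/24")
]

CRAWLER_CAP_BPS = 1_000_000  # hypothetical cap for crawler blocks: 1 Mb/s
DEFAULT_CAP_BPS = None       # None means "no cap" for everyone else


def cap_for(client_ip):
    """Return the bandwidth cap in bits/s for a client, or None if uncapped."""
    addr = ipaddress.ip_address(client_ip)
    if any(addr in net for net in CRAWLER_BLOCKS):
        return CRAWLER_CAP_BPS
    return DEFAULT_CAP_BPS


def transfer_seconds(size_bytes, rate_bps):
    """Seconds needed to move size_bytes at rate_bps (bits per second)."""
    return size_bytes * 8 / rate_bps
```

As a sanity check on the numbers in this thread: taking "3 gig" as 
3 * 10**9 bytes, transfer_seconds(3 * 10**9, 4_000_000) comes to 6000 
seconds, so about 100 minutes at 4 Mb/s. At the hypothetical 1 Mb/s cap 
the same file would take four times as long.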

> Is this expected/my own fault or what?

A little bit of both :)

