LINUX.IE, website of the Irish Linux Users' Group

[ILUG] MSN Search Beta


John McCormac jmcc at hackwatch.com
Thu Nov 11 11:12:36 GMT 2004



On Thu, 11 Nov 2004, Paul Jakma wrote:

> On Thu, 11 Nov 2004, John McCormac wrote:
>
> > here.) The msnbot is so badly written that it does not use 304s and puts
> > excessive loads on webservers. So many webmasters have complained about it
> > that Microsoft even introduced its own robots.txt entry so that webmasters
> > can use a delay between pages being fetched by its scrapers.
>
> Ah, where would that be? I followed the URL in the client id it uses
> to http://search.msn.com/webmasters/msnbot.aspx, but there's nothing
> there on how to limit it except the standard robots.txt and robot
> meta tags.

It was on the mssearch forum on http://www.webmasterworld.com, but the
syntax was something like:

User-agent: msnbot
Crawl-delay: nn
where nn is the delay between fetches in seconds.
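
For anyone who wants to check what delay a given robots.txt would hand to a
crawler, here is a minimal sketch using Python's standard urllib.robotparser.
The robots.txt body, the 10 second value, the empty Disallow line and the
"examplebot" name are all made up for illustration; only the msnbot
User-agent and the Crawl-delay directive come from the syntax above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body; the 10-second value is an example only.
# The empty Disallow line permits everything and is just part of the sample.
ROBOTS_TXT = """\
User-agent: msnbot
Crawl-delay: 10
Disallow:
"""


def crawl_delay_for(user_agent, robots_txt=ROBOTS_TXT):
    """Return the Crawl-delay in seconds for user_agent, or None if unset."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network fetch is needed.
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)


msnbot_delay = crawl_delay_for("msnbot")
other_delay = crawl_delay_for("examplebot")  # hypothetical bot with no entry
print(msnbot_delay, other_delay)
```

Whether a crawler actually waits that long between fetches is up to the
crawler; the directive is only a request.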

> I've sent a mail to their mail address asking them to rate limit, but
> received no reply - at this stage I'm considering barring the MSNBot
> altogether. It's responsible for 40% of hits to a site I maintain.

Imagine what it is like on whoisireland - it kept whacking the site every
few days because the gobshites in Microsoft couldn't build a 304-capable
robot. :)
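
For reference, "304 capable" means the robot sends If-Modified-Since with the
date of its cached copy, so the server can answer 304 Not Modified with no
body instead of sending the whole page again. A rough sketch of the server
side decision in Python follows; the function name and the dates are
hypothetical, and real servers also handle ETag / If-None-Match, which is
left out here.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def response_status(if_modified_since, last_modified):
    """Return 304 if the client's cached copy is still current, else 200.

    if_modified_since: raw header value sent by the robot, or None.
    last_modified: timezone-aware datetime of the page's last change.
    """
    if if_modified_since is None:
        return 200  # unconditional request: send the full page
    try:
        cached_at = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable date: fall back to a full response
    if cached_at.tzinfo is None:
        cached_at = cached_at.replace(tzinfo=timezone.utc)
    return 304 if last_modified <= cached_at else 200


# Hypothetical page that last changed on 1 Nov 2004.
page_changed = datetime(2004, 11, 1, 9, 0, 0, tzinfo=timezone.utc)
header = format_datetime(page_changed, usegmt=True)  # HTTP-style date string

polite_robot = response_status(header, page_changed)  # sends the header
rude_robot = response_status(None, page_changed)      # never sends it
print(polite_robot, rude_robot)
```

A robot that never sends the header gets the full page on every visit, which
is where the load comes from.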

Regards...jmcc



