[34232] in North American Network Operators' Group
Re: Scalable Mail solution with NAS
daemon@ATHENA.MIT.EDU (Adrian Chadd)
Wed Jan 31 15:28:35 2001
Date: Thu, 1 Feb 2001 04:16:54 +0800
From: Adrian Chadd <adrian@creative.net.au>
To: Matthew Zito <mzito@register.com>
Cc: nanog@merit.edu
Message-ID: <20010201041652.H36522@ewok.creative.net.au>
In-Reply-To: <0101311412541X.01062@sweeney.localdomain>; from mzito@register.com on Wed, Jan 31, 2001 at 02:12:54PM -0500
On Wed, Jan 31, 2001, Matthew Zito wrote:
> If you're looking for large scalability AND high performance, my preferred
> solution would be to have a relational database as the backend, but don't
> store any messages in it - simply pointers to their location on disk. Then
> store the messages without regard to intended username in a hashed directory
> structure. The pop3 server then gets the list of new messages from the
> database server, which could just be a list of filenames. Then, the pop3
> server simply has to open the message to return it - it doesn't have to do an
> opendir(). Also, if you use the filename as the UIDL returned, there's no
> need to even stat() the file, again saving you a whole nfs call. The
> obvious downside is that you can't do a :
>
> rm -f /users/j/o/h/n/johndoe.mbx
>
> But, with 200k mailboxes, you should have an automated way to do that anyway.
Hah. Unlink the directory, and do a background fsck every few hours? :)
The trouble with the above format is that you're ignoring any locality
that exists in the filesystem. For example, in Berkeley FFS, files in
a given directory are allocated in the same cylinder group (or at least
FFS attempts to do so), which under heavy load could actually give a
slight performance boost on an FFS that isn't full.
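
To make the locality point concrete, here is a rough sketch (not anyone's
actual implementation) comparing the two layouts. The per-user path follows
the /users/j/o/h/n/johndoe.mbx example quoted above; the /spool root, the
SHA-1 two-level split, the one-file-per-message naming and the example.com
message IDs are all assumptions made up for illustration. Counting distinct
parent directories is only a crude proxy for how many cylinder groups FFS
might end up spreading one user's mail across.

```python
import hashlib
import posixpath


def per_user_dir(user, depth=4):
    """Per-user layout: one directory level per leading character."""
    return posixpath.join("/users", *user[:depth])


def per_user_mbox(user):
    """Mailbox path in the style of the quoted example."""
    return posixpath.join(per_user_dir(user), user + ".mbx")


def hashed_message_path(msg_id, root="/spool"):
    """Hashed layout: the path depends only on the message ID, not the user.

    The root and the two-level SHA-1 split are assumptions for this sketch.
    """
    digest = hashlib.sha1(msg_id.encode("ascii")).hexdigest()
    return posixpath.join(root, digest[:2], digest[2:4], digest)


def directories_touched(paths):
    """Number of distinct parent directories among the given paths."""
    return len({posixpath.dirname(p) for p in paths})


# Fifty hypothetical messages, all belonging to the same user.
msg_ids = ["<%d@example.com>" % i for i in range(50)]

# Assumed variant of the per-user layout: one file per message,
# all kept in the user's own directory.
per_user = [
    posixpath.join(per_user_dir("johndoe"), "msg%d" % i) for i in range(50)
]

# Hashed layout: the same messages scattered by message ID.
hashed = [hashed_message_path(m) for m in msg_ids]

print(per_user_mbox("johndoe"))       # /users/j/o/h/n/johndoe.mbx
print(directories_touched(per_user))  # 1
print(directories_touched(hashed))    # many directories for a single mailbox
```

With everything in one directory, FFS at least has a chance of keeping the
mailbox in one cylinder group; in the hashed layout a single POP3 session
reads from directories all over the disk.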
I believe there was a paper covering this locality for web caches.
Ah, yes:
"Reducing the Disk I/O of Web Proxy Server Caches"
- Carlos Maltzahn and Kathy J Richardson
Compaq Computer Corporation, Network Systems Laboratory
- Dirk Grunwald
University of Colorado
.. some (not all) of the concepts included there are relevant here.
Other filesystems will have different allocation/layout policies,
and additions such as "hinting" which can substantially speed up
mail accesses.
But, this is off topic, and I digress. :-)
Adrian
--
Adrian Chadd "Sex Change: a simple job of outside
<adrian@creative.net.au> to inside plumbing."
- Some random movie