[2172] in Release_7.7_team

Re: AFS performance measurements

daemon@ATHENA.MIT.EDU (John Hawkinson)
Thu Mar 16 10:36:29 2000

Date: Thu, 16 Mar 2000 10:36:22 -0500 (EST)
Message-Id: <200003161536.KAA25979@mary-kay-commandos.mit.edu>
To: Greg Hudson <ghudson@mit.edu>
cc: release-team@mit.edu
In-reply-to: "[2168] in Release_7.7_team"
From: John Hawkinson <jhawk@MIT.EDU>

This seems to be an unfair test.

i.   Lots of the time, cluster users won't be benefiting from the AFS cache
for netscape.

Here's the cmdebug entry for netscape 4.61 on this Solaris box:

** Cache entry @ 0xf5ffb9d0 for 1.537088507.28.2856
    14395996 bytes      DV 5 refcnt 6
    callback f5f77300   expires 953226956
    2 opens     0 writers
    normal file
    states (0xffffff85), stat'd, read-only, mapped

The entry is uniquely identified by the vid+fid (537088507.28): the vid
comes from 'vos examine', the fid from ls -li and then
/mit/watchmaker/bin/calcfid.

So, a quick check to see how many Sun machines in the w20 cluster have it:

grep ': W20-575-' hstath.txt | grep SOLARIS > /tmp/s1
awk -F: '{print $3}' /tmp/s1 > /tmp/s2
sh -c 'for i in `cat /tmp/s2`; do cmdebug $i -long | grep 537088507.28 > /dev/null && echo $i yes || echo $i no; done'

The output is in /mit/jhawk/tmp/w20.cmdebug, but the quick summary:
[mary-kay-commandos!jhawk] ~/tmp> grep -c yes w20.cmdebug 
63
[mary-kay-commandos!jhawk] ~/tmp> grep -c no w20.cmdebug
27

So it's in the cache on 70% of those machines (63 of 90), and not on the
other 30%.
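
For what it's worth, the split can be computed straight from the yes/no
lines rather than with two grep -c runs. This is only a sketch: hit_rate
is a made-up helper name, and it assumes the input has the "host yes" /
"host no" format that the loop above emits.

```shell
# Hypothetical helper: reads "host yes" / "host no" lines on stdin and
# prints how many hosts have the file cached, with a rounded percentage.
hit_rate() {
    awk '$2 == "yes" { y++ }
         $2 == "yes" || $2 == "no" { n++ }
         END {
             if (n > 0)
                 printf "%d/%d cached (%d%%)\n", y, n, int(100 * y / n + 0.5)
             else
                 print "no hosts"
         }'
}
```

Run as "hit_rate < w20.cmdebug"; with the 63 yes and 27 no lines counted
above, that should report 63/90 cached (70%).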


ii. Netscape is demand-paged (and goes through the AFS chunking when run
via AFS). As such, starting it is definitely not the same as reading the
entire file.

iii. I think (meaning I haven't actually checked) Netscape reads a lot
of ancillary files (like java stuff, etc.) at startup time, which helps to
slow it down. In an unscientific check, counting the cache entries from
that volume on a few machines:

[mary-kay-commandos!jhawk] ~> cmdebug -long mkc | grep 537088507 | wc -l
      53
[mary-kay-commandos!jhawk] ~> cmdebug -long w20-575-55 | grep 537088507 | wc -l
      29
[mary-kay-commandos!jhawk] ~> cmdebug -long w20-575-56 | grep 537088507 | wc -l
      62
[mary-kay-commandos!jhawk] ~> cmdebug -long w20-575-54 | grep 537088507 | wc -l
      32

Perhaps better would be to look at bytecounts:

[mary-kay-commandos!jhawk] ~> cmdebug -long w20-575-54 | gawk '/Cache/{f=$7}; /bytes/{if (f ~ /537088507/) b+=$1} END{print b}'
20678972
[mary-kay-commandos!jhawk] ~> cmdebug -long w20-575-55 | gawk '/Cache/{f=$7}; /bytes/{if (f ~ /537088507/) b+=$1} END{print b}'
16135068
[mary-kay-commandos!jhawk] ~> cmdebug -long w20-575-56 | gawk '/Cache/{f=$7}; /bytes/{if (f ~ /537088507/) b+=$1} END{print b}'
21763952
[mary-kay-commandos!jhawk] ~> cmdebug -long mkc | gawk '/Cache/{f=$7}; /bytes/{if (f ~ /537088507/) b+=$1} END{print b}'
24278088
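
The same byte count can be wrapped up so it isn't retyped per host. A
minimal sketch, assuming cmdebug -long output looks like the entry quoted
earlier (the vid+fid string in the 7th field of the "Cache entry" line,
the byte count first on the "bytes" line); volume_bytes is a made-up
name, and it uses plain awk rather than gawk.

```shell
# Hypothetical helper: sum the cached bytes belonging to one volume.
# Reads cmdebug -long output on stdin; $1 is the volume ID to match.
volume_bytes() {
    awk -v vol="$1" '
        # Remember the ID string of the current entry, e.g. 1.537088507.28.2856
        /Cache entry/ { id = $7 }
        # Add the byte count when the current entry belongs to the volume
        / bytes/ && index(id, "." vol ".") { total += $1 }
        END { printf "%d\n", total }
    '
}
```

Usage would be along the lines of
"cmdebug -long w20-575-54 | volume_bytes 537088507". Unlike the regexp
match above, this only matches the volume ID as a whole dot-delimited
component, which shouldn't matter here but avoids accidental substring
hits.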

iv. This problem is most noticeable when the network is slow.
Presumably your test did not account for that variability.

--jhawk
