[2172] in Release_7.7_team
Re: AFS performance measurements
daemon@ATHENA.MIT.EDU (John Hawkinson)
Thu Mar 16 10:36:29 2000
Date: Thu, 16 Mar 2000 10:36:22 -0500 (EST)
Message-Id: <200003161536.KAA25979@mary-kay-commandos.mit.edu>
To: Greg Hudson <ghudson@mit.edu>
cc: release-team@mit.edu
In-reply-to: "[2168] in Release_7.7_team"
From: John Hawkinson <jhawk@MIT.EDU>
This seems to be an unfair test.
i. Lots of the time, cluster users won't be benefitting from the AFS cache
for netscape.
Here's the cmdebug entry for netscape 4.61 on this Solaris box:
** Cache entry @ 0xf5ffb9d0 for 1.537088507.28.2856
14395996 bytes DV 5 refcnt 6
callback f5f77300 expires 953226956
2 opens 0 writers
normal file
states (0xffffff85), stat'd, read-only, mapped
The entry is uniquely identified by the vid+fid (537088507.28) (vid from
'vos examine', fid from ls -li and then /mit/watchmaker/bin/calcfid).
So, a quick check to see how many Sun machines in the w20 cluster have it:
grep ': W20-575-' hstath.txt | grep SOLARIS > /tmp/s1
awk -F: '{print $3}' /tmp/s1 > /tmp/s2
sh -c 'for i in `cat /tmp/s2`; do cmdebug $i -long | grep 537088507.28 > /dev/null && echo $i yes || echo $i no; done'
The output is in /mit/jhawk/tmp/w20.cmdebug, but the quick summary:
[mary-kay-commandos!jhawk] ~/tmp> grep -c yes w20.cmdebug
63
[mary-kay-commandos!jhawk] ~/tmp> grep -c no w20.cmdebug
27
So it's in the cache on 70% of the machines (63 of 90), and not on the other 30%.
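For anyone repeating the count, the two grep -c runs and the percentage can be folded into one pass. This is only a sketch: the function name cache_summary is made up here, and it assumes the input is in the "host yes" / "host no" form that the loop above prints.

```shell
# Hypothetical helper (not part of any AFS tool): summarize a file of
# "hostname yes" / "hostname no" lines, as written by the loop above.
cache_summary() {
    awk '
        $2 == "yes" { yes++ }          # host has the file cached
        $2 == "no"  { no++ }           # host does not
        END {
            total = yes + no
            if (total > 0)
                printf "%d/%d cached (%.0f%%)\n", yes, total, 100 * yes / total
            else
                print "no hosts"
        }
    ' "$1"
}
```

Run as "cache_summary w20.cmdebug"; with the counts above it should report 63 of 90 hosts.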
ii. Netscape is demand-paged (and goes through the AFS chunkification when
run via AFS). As such, starting it is definitely not the same as reading
the entire file.
iii. I think (meaning I haven't actually checked) Netscape reads a lot
of ancillary files (like java stuff, etc.) at startup time, which helps to
slow it down. In an unscientific check of how many cache entries from that
volume each machine holds:
[mary-kay-commandos!jhawk] ~> cmdebug -long mkc | grep 537088507 | wc -l
53
[mary-kay-commandos!jhawk] ~> cmdebug -long w20-575-55 | grep 537088507 | wc -l
29
[mary-kay-commandos!jhawk] ~> cmdebug -long w20-575-56 | grep 537088507 | wc -l
62
[mary-kay-commandos!jhawk] ~> cmdebug -long w20-575-54 | grep 537088507 | wc -l
32
Perhaps better would be to look at bytecounts:
[mary-kay-commandos!jhawk] ~> cmdebug -long w20-575-54 | gawk '/Cache/{f=$7}; /bytes/{if (f ~ /537088507/) b+=$1} END{print b}'
20678972
[mary-kay-commandos!jhawk] ~> cmdebug -long w20-575-55 | gawk '/Cache/{f=$7}; /bytes/{if (f ~ /537088507/) b+=$1} END{print b}'
16135068
[mary-kay-commandos!jhawk] ~> cmdebug -long w20-575-56 | gawk '/Cache/{f=$7}; /bytes/{if (f ~ /537088507/) b+=$1} END{print b}'
21763952
[mary-kay-commandos!jhawk] ~> cmdebug -long mkc | gawk '/Cache/{f=$7}; /bytes/{if (f ~ /537088507/) b+=$1} END{print b}'
24278088
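The gawk one-liner above is terse, so here is roughly the same thing spelled out. It is a sketch under assumptions: the name sum_volume_bytes is invented, it reads cmdebug -long output on stdin, and it assumes the layout shown in the entry earlier in this message (fid in the 7th field of the "Cache entry" line, byte count in the 1st field of the "bytes" line). Unlike the one-liner, it matches the volume ID only as a whole dotted component of the fid.

```shell
# Hypothetical helper: sum the cached bytes for one volume ID from
# "cmdebug -long" output on stdin. Assumes the entry layout quoted earlier:
#   ** Cache entry @ 0x... for <cell>.<volume>.<vnode>.<unique>
#       <N> bytes DV ...
sum_volume_bytes() {
    awk -v vol="$1" '
        $2 == "Cache" && $3 == "entry" { fid = $7 }   # remember current fid
        $2 == "bytes" {
            # count the entry only if the volume ID is a whole component
            if (index(fid, "." vol ".") > 0)
                total += $1
        }
        END { printf "%.0f\n", total }                # prints 0 if no match
    '
}
```

Usage would be something like "cmdebug -long w20-575-54 | sum_volume_bytes 537088507".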
iv. This problem is most noticeable when the network is being slow.
Presumably your test did not account for that variability.
--jhawk