[960] in Moira


Slow mail and broken mailhub clocks.

daemon@ATHENA.MIT.EDU (John Hawkinson)
Fri Mar 22 09:07:17 1996

Date: Fri, 22 Mar 1996 09:04:50 -0500 (EST)
To: postmaster@MIT.EDU
Cc: moiradev@MIT.EDU
From: John Hawkinson <jhawk@MIT.EDU>


I just sent mail to sipb-staff <199603221254.HAA01537@lola-granola.MIT.EDU>,
and it took 3 minutes, 5 seconds for my copy to get to me.
This is disappointingly slow (though certainly not fatally so), and I'm
curious why (both machines report only one BOBO).

One possible explanation is that apparently the mailhub DCM fails to
reorder mailing lists such that local (.LOCAL) recipients are listed
first. Since in general there is a fair likelihood that mail to the po
servers will be more reliably delivered than to J. Random Internet
site, this seems to be a flaw. Please consider applying the patch I've
attached.

A lot more disturbing is that the clocks on the mailhubs and po
servers are blatantly wrong. I extracted the times from the Received
headers, and then determined offsets with "rdate -p HOSTNAME; date"
(on granola, which is running xntpd), and computed the "adjusted time"
column. Do the mailhubs not run xntpd normally (!?) or is something
broken?

	supposed time		adjusted time

granola	7:54:13			7:54:13
pca	7:53:25			7:54:14
po9	7:56:28			7:57:18

BTW, I started to look at the external addresses, and sure enough, my
SMTP connection to lehman.com (probe is above me on sipb-staff) hasn't
returned a greeting in the past two or three minutes, suggesting
that PCA would have timed it out, and that might have resulted
in the delay. Poor Richard.

In case you care, here are the headers of the message:

Received: from PACIFIC-CARRIER-ANNEX.MIT.EDU by po9.MIT.EDU (5.61/4.7)
        id AA24973; Fri, 22 Mar 96 07:56:28 EST
Received: from LOLA-GRANOLA.MIT.EDU by MIT.EDU with SMTP
        id AA00802; Fri, 22 Mar 96 07:53:25 EST
Received: (from jhawk@localhost) by lola-granola.MIT.EDU (8.7.4/8.6.11)
        id HAA01537; Fri, 22 Mar 1996 07:54:13 -0500 (EST)
Date: Fri, 22 Mar 1996 07:54:13 -0500 (EST)

--jhawk

ps: I wouldn't exactly call this a high-confidence patch. While I'm
reasonably certain that it will compile and not make things any worse
than they are, it might not quite do the right thing, either.  I
haven't spent enough time staring at the silly SQL code and the list
insertion stuff to be sure, and I don't have access to SOS. I also
assume it won't DTRT on lists within lists, but it will certainly be
no worse than the current state. [Actually, assuming it works, it would
have the effect of placing the heavy and hard-to-deliver stuff all in
one place, so that a typical mailing list would no longer have that load
well distributed over the time it takes to deliver the list, but
instead concentrated at the end. Given that there are multiple lists
running at any given time, however, I don't think this change in work
distribution will be a problem.]

===================================================================
RCS file: src/gen/mailhub.dc,v
retrieving revision 1.11
diff -c -r1.11 src/gen/mailhub.dc
*** src/gen/mailhub.dc	1996/03/22 13:22:36	1.11
--- src/gen/mailhub.dc	1996/03/22 13:54:16
***************
*** 114,119 ****
--- 114,120 ----
      char *last;
      char mi;
      char *pobox;
+     int localuser;
  };
  struct member {
      struct member *next;
***************
*** 234,241 ****
--- 235,244 ----
  	if (type[0] == 'P' && (s = hash_lookup(machines, pid))) {
  	    sprintf(buf, "%s@%s", u->login, s);
  	    u->pobox = pstrsave(buf);
+ 	    u->localuser = TRUE;
  	} else if (type[0] ==  'S') {
  	    u->pobox = hash_lookup(strings, bid);
+ 	    u->localuser = 0;
  	} else
  	  u->pobox = (char *) NULL;
  	if (hash_store(users, id, u) < 0) {
***************
*** 354,365 ****
  }
  
  
! insert_login(id, u, dummy)
  int id;
  struct user *u;
! int dummy;
  {
!     if (u->pobox && u->login[0] != '#')
        insert_name(u->login, id, TRUE, FALSE);
  }
  
--- 357,368 ----
  }
  
  
! insert_login(id, u, dolocal)
  int id;
  struct user *u;
! int dolocal;
  {
!     if (u->pobox && u->login[0] != '#' && u->localuser == dolocal)
        insert_name(u->login, id, TRUE, FALSE);
  }
  
***************
*** 493,499 ****
  sort_info()
  {
      names = create_hash(20001);
!     hash_step(users, insert_login, NULL);
      incount = 0;
      fprintf(out, "\n%s\n# Mailing lists\n%s\n", divide, divide);
      hash_step(lists, save_mlist, FALSE);
--- 496,503 ----
  sort_info()
  {
      names = create_hash(20001);
!     hash_step(users, insert_login, TRUE);
!     hash_step(users, insert_login, FALSE);
      incount = 0;
      fprintf(out, "\n%s\n# Mailing lists\n%s\n", divide, divide);
      hash_step(lists, save_mlist, FALSE);

