[801] in Release_Engineering


timed

daemon@ATHENA.MIT.EDU (epeisach@ATHENA.MIT.EDU)
Thu Feb 23 12:28:45 1989

From: <epeisach@ATHENA.MIT.EDU>
Date: Thu, 23 Feb 89 12:28:14 EST
To: rel-eng@ATHENA.MIT.EDU

I have built timed and timedc on vax and rt platforms. They have both
been installed on the respective /srvd's. 

I am currently running timed with the tracing option on crash and burn
and e40-342-18 (logging in /usr/adm). Unless a timed has been started as
a master, no time synchronization is performed, so I started e40-342-18
as a master.

The master (started with -M) is the program which sends the time
adjustments to the slaves. Apparently from the code, if there is more
than one master, there are periodic elections to decide who should act
as master. I believe that if a new master comes on line, it will become
a slave of the existing one.
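
A rough sketch of the behaviour described above, for illustration only.
It is not taken from the timed source; the names Daemon and join are
made up here, and the periodic elections are not modelled. It only
shows the belief that a newly started master defers to one that is
already acting:

```python
from dataclasses import dataclass

@dataclass
class Daemon:
    name: str
    started_with_M: bool      # hypothetical flag: launched with -M
    role: str = "slave"

def join(subnet, newcomer):
    """Add newcomer to the subnet and return the role it ends up with."""
    acting = [d for d in subnet if d.role == "master"]
    if newcomer.started_with_M and not acting:
        # No master is acting yet, so the newcomer takes over.
        newcomer.role = "master"
    else:
        # Either not master-capable, or a master is already acting.
        newcomer.role = "slave"
    subnet.append(newcomer)
    return newcomer.role
```

With no daemon started with -M, every host stays a slave, which matches
the observation that nothing is synchronized until a master is started.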

Regarding man pages: Good news: They exist and have been on the packs
for months. Bad news: The timedc man page looks out of date.

Good news: timed will throw out totally bogus time settings.
Bad news: "bogus" is relative. A ten-minute change is within its capabilities.

The master server is not logging the times of the other machines (which
is an option) as I did not compile with MEASURE defined. I don't believe
this is critical.

Bugs:
-----
1) It is feasible for a slave to royally screw up the times on all
the other machines. By using the date command, support currently exists
(and is in use) to send the info to the master, who interprets this and
treats it as if god has set the time. All clients quickly make
adjustments (in 12 second jumps). Unfortunately, gettime does not follow
the same conventions as date, so it does not forward its knowledge on to
timed. Timed does figure it out every 5 minutes or so and then starts
adjusting everyone's clocks accordingly. The program is blind, however,
to setting the time on the master with gettime, as it does not propagate
the differences. It is unclear if the slaves determine the absolute
time.

2) timedc: if no masters are running and you use the command msite, it
will try to find masters. It returns "communication error", probably
because the response times out.
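
To put the 12 second jumps from bug 1 in perspective, here is a small
sketch of how many adjustments a given error would take. The only
figure taken from above is the 12 second step; the function name and
the assumption that each jump is simply capped at that size are mine,
not something read out of the timed code:

```python
def adjustment_steps(offset_seconds, max_step=12.0):
    """Split a clock error into successive jumps of at most max_step seconds."""
    steps = []
    remaining = float(offset_seconds)
    while abs(remaining) > 1e-9:
        # Clamp each jump to the range [-max_step, +max_step].
        step = max(-max_step, min(max_step, remaining))
        steps.append(step)
        remaining -= step
    return steps
```

Under that assumption a bad date of ten minutes (600 seconds) is spread
to every client in len(adjustment_steps(600)) jumps.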


What needs to be done:
----------------------
While I do not support timed, as the master's time can be skewed
rather easily and propagated, the following will still need to be done
for timed to be used by machines:

1) Modify /etc/rc to start it. I imagine rc should only check for the
existence of /etc/timed before starting. This should be after the
gettime to prevent really weird time skews from occurring every time a
machine reboots.

2) Modify the subscription lists to track this program over.

3) Designate a machine in each cluster as a master. I don't think timed
looks beyond the current subnet (although there are options to tell it
to look elsewhere). This is an option possibly to keep in mind.

4) I would suggest more testing to see how well it works.

5) I would modify the code such that a master will never set its time
based on a slave. If the masters eventually will be using ntpd, they
should rely on this to set the time, not the judgement of the slaves.
This includes modifying the code such that if more than one master is on
a subnet, it will not set its time when running as a slave. (Otherwise,
we can play musical master screwing if one of the masters is really
wrong).

6) Eventually gettime should have the same hooks as the date program to
send out the correct date. (Again the same problems with masters exist).
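
The policy proposed in item 5 could look roughly like the following.
This is a sketch of the rule only, not a patch to timed; Host,
started_as_master and apply_correction are hypothetical names:

```python
from dataclasses import dataclass

@dataclass
class Host:
    name: str
    started_as_master: bool   # hypothetical flag: daemon was launched with -M
    clock: float              # local idea of the time, in seconds

def apply_correction(host, offset):
    """Apply a correction received via timed; return True if the clock moved."""
    if host.started_as_master:
        # Proposed rule: a master-capable host trusts its own source
        # (eventually ntpd), even while it is currently acting as a slave.
        return False
    host.clock += offset
    return True
```

Keying the check on how the daemon was started, rather than on its
current role, is what keeps a demoted master from picking up a bad time
from whichever master won the election.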


	Ezra


