[15185] in Athena Bugs


Sun4 8.0J: lertstop has serious problems

daemon@ATHENA.MIT.EDU (Matt)
Mon Jun 9 01:42:03 1997

Date: Mon, 9 Jun 1997 01:42:02 -0400
From: Matt <matt@MIT.EDU>
To: mbarker@MIT.EDU
Cc: bugs@MIT.EDU, ops@MIT.EDU


System name:            minos
Type and version:       SPARC/5 8.0J 
Display type:           cgthree
 
What were you trying to do?
	remove the 'stale' entries from the lert db, since many of the users 
	lerted in 1996 have been deactivated or purged by now
 
What's wrong:

	minos# lertstop a 
	(and it spins...I control-c'd it after 76 minutes of CPU time)  

	...and it trashes the db 

Before: -rw-rw-r--   1 root     other     157668 Jun  8 23:58 /tmp/lertdb
After:	-rw-rw-r--   1 root     other    15360000 Jun  9 01:30 /tmp/lertdb2


What should have happened:
	It should have removed category a from the DB and given me back my
	prompt
 
What other relevant things do you want to tell us: 

	Glad you asked...for starters, the lert db is *big*
	
	minos# wc /tmp/lertdb 
	    5552   22208  157668 /tmp/lertdb
	(/tmp/lertdb is the output of lertdump) 

	so it is in fact possible that it is not spinning, but working 
	very hard to go through the whole db

What else is wrong: 

	Even if it had finished in reasonable time, it would have removed
	some users it should not have.  It seems that if a user is
	in multiple categories and one of those categories gets lertstop'd,
	then the user will disappear entirely (from the lertdb) instead of
	just being removed from the category that was lertstop'd.  It also
	seems that it introduces corruption into the db under
	certain similar circumstances (this is very bad!).
	  For example:

(mhbraun@forever) /var/ops/lert/% ./lertload a < /tmp/users1
(mhbraun@forever) /var/ops/lert/% ./lertload b < /tmp/users2
(mhbraun@forever) /var/ops/lert/% ./lertload c < /tmp/users3
(mhbraun@forever) /var/ops/lert/% ./lertdump
name: mwhitson  categories: ac			(this output is in fact
name: nathanw  categories: bc			correct for what I loaded)
name: mhbraun  categories: a
name: kretch  categories: ac
name: jweiss  categories: a
name: joanna  categories: bc
name: ted  categories: b
name: dkk  categories: b
name: cat  categories: bc			
(mhbraun@forever) /var/ops/lert/% ./lertstop a 
(mhbraun@forever) /var/ops/lert/% ./lertdump
name: mwhbccat  categories: c		<------	??? and notice kretch is 
name: nathanw  categories: bc			no longer in the db when 
name: joanna  categories: bc			he should still be in c
name: ted  categories: b
name: dkk  categories: b
name: cat  categories: bc
(mhbraun@forever) /var/ops/lert/% 
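
To make the expected behavior concrete, here is a minimal sketch in
Python.  It is not the lert source and does not use the real db format:
the dict-of-sets model and the name stop_category are hypothetical, and
dropping users who are left with no categories is an assumption based on
the dump above.

```python
# Hypothetical in-memory model of the lert db: user name -> set of
# category letters.  This is NOT the real lert storage format.

def stop_category(db, category):
    """Return a new db with `category` removed from every user.

    Users who still belong to another category are kept; users left
    with no categories are dropped (an assumption, see the lead-in).
    """
    result = {}
    for name, cats in db.items():
        remaining = set(cats) - {category}
        if remaining:
            result[name] = remaining
    return result


# The entries loaded in the transcript above.
before = {
    "mwhitson": {"a", "c"},
    "nathanw": {"b", "c"},
    "mhbraun": {"a"},
    "kretch": {"a", "c"},
    "jweiss": {"a"},
    "joanna": {"b", "c"},
    "ted": {"b"},
    "dkk": {"b"},
    "cat": {"b", "c"},
}

after = stop_category(before, "a")
for name, cats in after.items():
    print(f"name: {name}  categories: {''.join(sorted(cats))}")
```

Under this model mwhitson and kretch both remain in the db with
category c only, while mhbraun and jweiss (who were only in a) are
dropped, and no other entry changes.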

Where is the source for the binaries that produced this: 

	/mit/ops/src/lert/src

What does this mean in the great scheme of things: 

	We are up to category 's', which means we have 7 categories left
	(8 if you count 'j', which we skipped for some reason).  We seem to
	use from 1 to 4 a month, depending on the number of failures (and on
	whether different users from the same failure need to get different
	notices), and accounts will likely be lerting people for deactivation
	in the near future.  We really need this fixed by the fall.
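
The arithmetic behind that deadline can be checked with a short Python
sketch.  It assumes categories are single lowercase letters a through z
handed out in order, which the report implies but does not state.

```python
import string

# Assumption: categories are the letters a-z, used in alphabetical order.
current = "s"
remaining = [c for c in string.ascii_lowercase if c > current]
print(len(remaining), "".join(remaining))

# 'j' was skipped earlier, so it could be reclaimed as one extra category.
with_j = len(remaining) + 1

# At the reported usage of 1 to 4 categories a month:
for rate in (1, 4):
    print(f"{rate}/month: {len(remaining) / rate:.2f} to "
          f"{with_j / rate:.2f} months of categories left")
```

At the high end of that usage rate the remaining letters last about two
months, which is why a fix is needed by the fall.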

Please describe any relevant documentation references:

	/mit/ops/src/lert/doc/lert.dvi
	The documentation does not give any indication of what will
	happen if I control-c the lertstop, i.e., it would be nice to know
	whether that will corrupt the db.

