[27596] in Athena Bugs
Re: debathena-zephyr-config if-up being run too early
daemon@ATHENA.MIT.EDU (Jonathan Reed)
Mon Sep 3 14:05:59 2012
Mime-Version: 1.0 (Apple Message framework v1084)
Content-Type: text/plain; charset=us-ascii
From: Jonathan Reed <jdreed@mit.edu>
In-Reply-To: <20120903133556.y4z6polzgkosogkk@webmail.mit.edu>
Date: Mon, 3 Sep 2012 14:05:52 -0400
Message-Id: <BD9B9ACE-2A52-4860-9DB0-9CE1086176C9@mit.edu>
To: ezyang <ezyang@mit.edu>
Content-Transfer-Encoding: 8bit
Cc: bugs@mit.edu
Reply-To: debathena@mit.edu
Errors-To: bugs-bounces@mit.edu
What Ubuntu version does this occur on? And what hardware? If if-up.d is actually running before an interface is up, that's arguably an upstream bug. But I've also seen poor interactions with some chipsets (primarily Broadcom) where the interface is brought up but isn't actually usable yet, because the chipset takes a few seconds to test for a physical link and do other stupid stuff. And if this is wireless, all bets are off.
Arguably, this is related to http://debathena.mit.edu/trac/ticket/133, and I think it is firmly in the "patches welcome" state. I don't have a good answer here, though I'm intrigued by your assertion that zhm doesn't need to care when it is moved across networks.
I suspect that there are two potential solutions here:
- Patch zhm to keep retrying if it can't find a server, so that we can just start it and forget about it
- Upstart-ify the zhm init script
But I'm mostly just guessing, and I think this needs further discussion, which should happen on the debathena list.
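As a rough illustration of the first option (keep retrying instead of dying), here is a minimal Python sketch of a retry loop with capped exponential backoff. It is not zhm code: `start_with_retry`, `try_start`, and the attempt and delay parameters are hypothetical names and values chosen for this example, and `try_start` stands in for whatever "contact the server" check the real daemon would perform.

```python
import time
from typing import Callable


def start_with_retry(
    try_start: Callable[[], bool],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call try_start() until it succeeds or max_attempts is reached.

    try_start is a hypothetical callable that returns True once the
    server has been contacted. The wait between attempts doubles each
    time, capped at max_delay. sleep is injectable so the loop can be
    exercised without actually waiting.
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        if try_start():
            return True
        # Only wait if another attempt is still coming.
        if attempt < max_attempts:
            sleep(delay)
            delay = min(delay * 2, max_delay)
    return False
```

With something like this, a hook that fires slightly too early would cost a few seconds of waiting rather than leaving the daemon dead. Whether a bounded or unbounded number of attempts is the right choice is part of the discussion that still needs to happen.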
-Jon
On Sep 3, 2012, at 1:35 PM, ezyang wrote:
> This may be a nonreproducible Ubuntu bug (I haven't checked on another Ubuntu
> instance yet), but it appears that under some circumstances, if-up.d can get
> run before the network has been fully established. In this case, zhm will
> attempt to restart, fail to contact the server, and die. In fact, it may be
> better for us not to restart zhm at all, since it can handle moving around
> (you just won't keep your subs).
>
> Steps to reproduce:
> 1. zstat localhost # check that zhm is running
> 2. Roam to a different network
> 3. zstat localhost
>
> Expected Behavior: zhm is running
>
> Actual Behavior: zhm is not running.
>
> Edward