[2681] in linux-net channel archive
Re: Multiport help
daemon@ATHENA.MIT.EDU (RHS Linux User)
Sat Apr 27 03:56:58 1996
Date: Thu, 25 Apr 1996 11:30:21 -0500 (CDT)
From: RHS Linux User <zap@kraken.port-aransas.k12.tx.us>
To: Jon Lewis <jlewis@inorganic5.fdt.net>
cc: moss@kraken.port-aransas.k12.tx.us, ub@kraken.port-aransas.k12.tx.us,
johns@hydra.port-aransas.k12.tx.us,
Linux Net Mailing List <linux-net@vger.rutgers.edu>
In-Reply-To: <Pine.LNX.3.91.960424135953.27145s-100000@inorganic5.chem.ufl.edu>
Jon,
We've got a DX4-100, a Cyclades serial port rig (only 16 ports), and
kernel -1.2.8- (Slackware 3.0) running on it. Now I know that's an old
kernel, but I don't think it's part of the problem.
When we first got this machine, it had no modem, and we had no
Internet connection, so it had VERY light traffic, and was HIGHLY
stable. I was practically the only user on it in fact.
When we got our 56K, this machine became a gateway. It began to hang
about once every week, still under very light load. As the user load
increased on the network, it began to hang a little more frequently. After
a recompile of the kernel with only those drivers used on the system
(buslogic busmastering Fast SCSI, ne2000, 16 channels slip/cslip/ppp),
the system got very stable again.
We have since added the 16 port multiserial, and under a normal load (for
us) of about 7-12 users, with some competition for bandwidth, the system
will lock maybe two or three times in a day!
Usually when it hangs, there are no problems reported on any of the logs,
and the screensaver will have obscured anything that may have been logged
on the console.
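
Since a hard hang leaves nothing in the logs, one way to at least pin
down when the box died is a heartbeat file: a small process appends a
timestamp every few seconds and forces it to disk, so the last line
written before the hang survives the reboot. What follows is only a
minimal sketch of that idea. The function name write_heartbeat, its
parameters, and any path passed to it are hypothetical and are not part
of the setup described in this message.

```python
import os
import time


def write_heartbeat(path, interval_s=10.0, beats=None):
    """Append a timestamp line to `path` every `interval_s` seconds.

    Each line is flushed and fsync'ed so that it reaches the disk even
    if the machine locks up right afterwards. `beats` limits the number
    of lines written; None means run until the process is killed.
    Returns the number of lines written.
    """
    written = 0
    with open(path, "a") as f:
        while beats is None or written < beats:
            # Same timestamp layout syslog uses, for easy comparison.
            f.write(time.strftime("%b %d %H:%M:%S") + " heartbeat\n")
            f.flush()
            os.fsync(f.fileno())
            written += 1
            # Sleep only if another beat is still to come.
            if beats is None or written < beats:
                time.sleep(interval_s)
    return written
```

Left running in the background with beats=None, the last timestamp in
the file after a reboot gives the time of the hang to within one
interval. That can then be lined up against the dialup logins.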
One thing of note is that this problem has really gotten critical since
the addition of the cyclades multiserial.
However, the addition of the multiserial facilitated the use of our site
on an unprecedented level; CSLIP connections have become common, and it
is not unusual to have 5 or 6 such connections active (14.4) at the time of a
crash.
There are some obvious factors that might be responsible:
o Steadily increasing workload
o The advent of the dialup SLIP user
o The addition of the multiserial and driver
When it works, it works really well; I'm twinsocked in from home, running
XTERM remotely on a M$ Windows-based X Server; it's quite usable ;) and
very discouraging when it suddenly dies :(
We are about to replace the 56K with a T1; it will be interesting to see
if the problem is affected by available bandwidth.
I am about to move gateway & webserver duty from this machine to a
Sun netra-i we have lying around idle (thanks to all who encouraged me to
tackle SunOS ! :), both to lighten the load on the DX4, and to get the
gateway function off the machine that keeps hanging (it brings down the
whole freakin' net every time!)
I'll be posting re: changes in stability, etc. in the near future.
-Z-
On Wed, 24 Apr 1996, Jon Lewis did studiously inscribe:
> I've got 2 64-port terminal servers...ewok and endor. Ewok was recently a
> 486-100 with 2 32 port RocketPort cards. It would kernel panic often
> (like just about daily)...even after applying about every 1.2.13 bug-fix I
> could find. I started to blame things on rocket.o (RocketPort driver)
> race conditions, heard that a faster CPU might do the trick, and so I
> upgraded ewok to a P100. Since then it's been up 26 days.
>
> In the interest of science and stable terminal servers, I decided to put
> 64 ports of Cyclades gear into ewok's old 486-100 board. It ran for 5
> days while I configured things, then I put close to 40 modems on it. I
> call this one endor. Endor then started locking up every 24 hours. On a
> tip from Cyclades, I turned off swap (it has plenty of RAM for what it
> does) and it ran a few days. I've kept swap disabled, and now instead of
> locking up, it kernel panics every few days.
>
> Endor just panic'd again...this time under very light load:
>
> Wed Apr 24 13:30:02 EDT 1996
> 1:30pm up 3 days, 11:34, 9 users, load average: 0.00, 0.02, 0.04
> total used free shared buffers
> Mem: 31168 30472 696 7172 19988
> -/+ buffers: 10484 20684
> Swap: 0 0 0
>
> Apr 24 13:32:08 endor pppd[28341]: remote IP address 205.229.51.140
> Apr 24 13:32:49 endor pppd[29774]: pppd 2.2.0 started by topherjc, uid 333
> Apr 24 13:36:31 endor syslogd: restart
> Apr 24 13:36:32 endor kernel: Kernel logging (proc) started.
> Apr 24 13:36:32 endor kernel: kswap 2.2.1.3 (Exp 1995/06/03 04:10:43)
>
> Again, it panic'd and then rebooted itself with the reset_on_panic
> patch. Nothing about the Oops got logged...it rarely does...but the
> interesting thing is that this mode of crash, panic right as a PPP
> session starts up, is exactly what the RocketPort based system used to
> do. It used to do this under pppd 2.1.2d, 2.2.0e, and 2.2.0f...so it
> would seem unlikely that the PPP code is at fault, but it seems very much
> to me that the problem is independent of the RocketPort and Cyclades
> drivers, and must be elsewhere in the 1.2.13 kernel.
>
> It should be noted that I'm not using a standard (known to be quite
> buggy) 1.2.13...but a heavily bugfix patched one. I'm using all of the
> bugfixes at http://trishul.sci.gu.edu.au/~tony/linux/patches.html, and
> have been for some time.
>
> Endor's P100 parts and case just came in today...so I'll start building
> that this afternoon. I suspect endor will magically stabilize as ewok
> did once it's on a P100 board.
>
>
> ------------------------------------------------------------------
> Jon Lewis | Mime attachments are OK
> jlewis@inorganic5.fdt.net | But please ask before sending
> http://inorganic5.fdt.net | unsolicited huge files.
> ________Finger jlewis@inorganic5.fdt.net for PGP public key_______
>
>
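
The syslog excerpt in the quoted message brackets the crash: the last
entry before the panic is a pppd startup, and the next entry is the
syslogd restart after the reboot. A rough sketch of pulling that
window out of a log automatically is below. The helper names
parse_stamp and crash_windows are made up for illustration, the year
is supplied by hand because syslog lines do not carry one, and the
sample lines are copied from the quoted log with the "> " prefix
removed. Month-name parsing assumes an English/C locale.

```python
from datetime import datetime

LOG = """\
Apr 24 13:32:08 endor pppd[28341]: remote IP address 205.229.51.140
Apr 24 13:32:49 endor pppd[29774]: pppd 2.2.0 started by topherjc, uid 333
Apr 24 13:36:31 endor syslogd: restart
Apr 24 13:36:32 endor kernel: Kernel logging (proc) started.
"""


def parse_stamp(line, year=1996):
    """Parse the 15-character syslog timestamp at the start of a line."""
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


def crash_windows(text, year=1996):
    """Return (last_seen, restarted, last_line) for every syslogd restart.

    `last_line` is the final entry logged before the restart, i.e. the
    last thing the machine is known to have been doing.
    """
    windows = []
    prev = None  # (timestamp, line) of the previous entry
    for line in text.splitlines():
        if not line.strip():
            continue
        stamp = parse_stamp(line, year)
        if "syslogd: restart" in line and prev is not None:
            windows.append((prev[0], stamp, prev[1]))
        prev = (stamp, line)
    return windows


for last_seen, restarted, last_line in crash_windows(LOG):
    gap = int((restarted - last_seen).total_seconds())
    print(f"down within a {gap} s window; last entry: {last_line}")
```

Run over a longer log, this makes it easy to check whether every
restart is preceded by a pppd startup, which is the pattern described
in the quoted message.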