[189758] in North American Network Operators' Group


RE: Monitoring system recommendation

daemon@ATHENA.MIT.EDU (Raymond Burkholder)
Mon Jun 6 12:17:55 2016

X-Original-To: nanog@nanog.org
X-OneUnified-MailScanner-From: ray@oneunified.net
From: "Raymond Burkholder" <ray@oneunified.net>
To: "'Manuel Marín'" <mmg@transtelco.net>,
 "'NANOG'" <nanog@nanog.org>
In-Reply-To: <CAD0TWZ8i-Y9cqWZ9irM15BH2QrMRpBAhFOe39D5eFPhhpy3NSw@mail.gmail.com>
Date: Mon, 6 Jun 2016 13:17:51 -0300
Errors-To: nanog-bounces@nanog.org

> We are currently planning to upgrade our monitoring system (Opsview) due
> to scalability issues and I was wondering what do you recommend for
> monitoring
> 5000 hosts and 35000 services. We would like to use a monitoring system that

Another option to consider is check_mk, which we use in our shop.  The
check_mk people wrapped a bunch of Python around the Nagios notification
engine.  You no longer need to worry about the tedium of Nagios config
files: they are all built automatically, either from commands in a GUI or
from a single configuration file.
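
As a rough illustration of the single-configuration-file approach: in the
check_mk releases of this era, the main configuration file (main.mk) uses
Python syntax and lists hosts in an all_hosts variable, with optional tags
appended after "|".  The sketch below generates such a fragment from an
inventory.  The helper name render_all_hosts, the host names, and the tag
names are all hypothetical, and the exact file layout should be checked
against the documentation for the version in use.

```python
def render_all_hosts(hosts):
    """Render a main.mk-style all_hosts fragment.

    hosts maps a host name to a list of tag strings (hypothetical tags).
    Each entry is written as "name|tag1|tag2", the form assumed here for
    the all_hosts list.
    """
    lines = ["all_hosts = ["]
    for name in sorted(hosts):
        # Join the host name and its tags with "|" into one entry.
        entry = "|".join([name] + list(hosts[name]))
        lines.append('    "{}",'.format(entry))
    lines.append("]")
    return "\n".join(lines)


# Hypothetical inventory: host name -> tags.
example_inventory = {
    "web01": [],
    "core-rtr1": ["snmp"],
}

fragment = render_all_hosts(example_inventory)
print(fragment)
```

Generating the fragment from an inventory, rather than editing it by hand,
is one way to keep a few thousand hosts manageable.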

Check_mk has a benchmarking page showing that it scales to more hosts than
you specified:
https://mathias-kettner.de/checkmk_checkmk_benchmarks.html

For an architecture diagram of how they use Nagios for alerting and Python
for scanning:
http://mathias-kettner.com/check_mk.html

If an included agent isn't available, new ones can be written.
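
One lightweight way to extend the agent is a "local check": a script the
agent runs that prints one line per service in the form
"<status> <service_name> <perfdata> <text>", where status is 0 (OK),
1 (WARN), 2 (CRIT) or 3 (UNKNOWN).  The following is a minimal sketch of
such a script, not an official example.  The service name Spool_Files, the
thresholds, and the directory /var/spool/example are assumptions made up
for illustration.

```python
#!/usr/bin/env python3
"""Hypothetical check_mk local check: count files in a spool directory."""
import os
import sys

WARN, CRIT = 100, 500  # assumed thresholds, chosen for illustration only


def spool_check(count, warn=WARN, crit=CRIT):
    """Return one local-check line for the given file count."""
    if count >= crit:
        status, label = 2, "CRIT"
    elif count >= warn:
        status, label = 1, "WARN"
    else:
        status, label = 0, "OK"
    # Performance data in the usual name=value;warn;crit form.
    perfdata = "files={};{};{}".format(count, warn, crit)
    return "{} Spool_Files {} {} - {} files queued".format(
        status, perfdata, label, count)


if __name__ == "__main__":
    # The directory is a placeholder; pass a real one as the first argument.
    spool_dir = sys.argv[1] if len(sys.argv) > 1 else "/var/spool/example"
    try:
        print(spool_check(len(os.listdir(spool_dir))))
    except OSError:
        # "-" means no performance data; status 3 is UNKNOWN.
        print("3 Spool_Files - UNKNOWN - cannot read {}".format(spool_dir))
```

The script would be placed in the agent's local-checks directory on the
monitored host; the exact path depends on the installation.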

We are quite happy with the solution.  We've replaced Cricket, Cacti,
Nagios, Observium, and a little bit of SmokePing with this almost
all-in-one tool.





