
Starting the sysadmin tools renaissance: Flapjack + cucumber-nagios

Formal Metadata

Title
Starting the sysadmin tools renaissance: Flapjack + cucumber-nagios
Title of Series
Number of Parts
97
Author
License
CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
Monitoring software is ripe for a renaissance. Now is the time for building new tools and rethinking our problems. Leading the charge are two projects: cucumber-nagios and Flapjack.
A systems administrator's role in today's technology landscape has never been so important. It's our responsibility to manage the provisioning and maintenance of massive infrastructures, to anticipate when capacity must be grown or shrunk, and increasingly, to make sure our applications scale. While developer tools have improved tremendously, we sysadmins are still living in the dark ages, apart from a few shining beacons of hope such as Puppet. We're still trying to make Nagios scale. We're still writing the same old monitoring checks. Getting statistics out of our applications is tedious and difficult, but increasingly important to scaling.
cucumber-nagios lets you describe how a website should work in natural language, and reports whether it does in the Nagios plugin format. It includes a standard library of website interactions, so you don't have to rewrite the same Nagios checks over and over. cucumber-nagios can also be used to check SSH logins, filesystem interactions, mail delivery, and Asterisk dialplans. By lowering the barrier to entry for writing fully featured checks, it leaves no reason not to start testing all of your infrastructure. But as you add more checks to your monitoring system, you're going to notice slowdowns and reliability problems. Enter Flapjack.
Flapjack is a scalable and distributed monitoring system. It natively talks the Nagios plugin format (so you can use all your existing Nagios checks), and can easily be scaled from 1 server to 1000. Flapjack breaks the monitoring lifecycle into several distinct chunks: workers that execute checks, notifiers that notify when checks fail, and an admin interface to manage checks and events.
By breaking the monitoring lifecycle up, it becomes incredibly easy to scale your monitoring system with your infrastructure. Need to monitor more servers? Just add another server to the pool of workers. Need to take down your workers for maintenance? Just spin up another pool, and turn off the old one.
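
The split between workers and notifiers described above can be illustrated with a minimal Python sketch. It is not Flapjack's code or API: the names `run_check`, `worker`, `notifier`, and `run_pool` are hypothetical, in-process threads and queues stand in for separate worker servers and a real message queue, and the admin interface is left out. The only convention taken from Nagios itself is the plugin status codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
import queue
import threading
from dataclasses import dataclass

# Nagios plugin status codes (the standard plugin convention).
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


@dataclass
class Result:
    check_name: str
    status: int
    output: str


def run_check(name, check_fn):
    """Run one check; a check that raises is reported as UNKNOWN."""
    try:
        status, output = check_fn()
    except Exception as exc:
        status, output = UNKNOWN, f"UNKNOWN - {exc}"
    return Result(name, status, output)


def worker(checks, results):
    """Pull checks off the shared queue until a None sentinel arrives."""
    while True:
        item = checks.get()
        if item is None:
            break
        name, check_fn = item
        results.put(run_check(name, check_fn))


def notifier(results, expected, alerts):
    """Read every result and record an alert for each non-OK one."""
    for _ in range(expected):
        result = results.get()
        if result.status != OK:
            alerts.append(f"{result.check_name}: {result.output}")


def run_pool(check_list, num_workers):
    """Hypothetical driver: the pool size is the only scaling knob."""
    checks, results, alerts = queue.Queue(), queue.Queue(), []
    threads = [
        threading.Thread(target=worker, args=(checks, results))
        for _ in range(num_workers)
    ]
    for thread in threads:
        thread.start()
    for item in check_list:
        checks.put(item)
    for _ in threads:
        checks.put(None)  # one stop sentinel per worker
    notifier(results, len(check_list), alerts)
    for thread in threads:
        thread.join()
    return sorted(alerts)
```

Because workers only share the two queues, calling `run_pool(check_list, 1)` and `run_pool(check_list, 4)` yields the same alerts; growing the pool changes throughput, not the checks or the notification logic.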