We're sorry but this page doesn't work properly without JavaScript enabled. Please enable it to continue.
Feedback

Collectl: A system monitoring tool like no other

Formale Metadaten

Titel
Collectl: A system monitoring tool like no other
Serientitel
Anzahl der Teile
57
Autor
Mitwirkende
Lizenz
CC-Namensnennung 3.0 Unported:
Sie dürfen das Werk bzw. den Inhalt zu jedem legalen Zweck nutzen, verändern und in unveränderter oder veränderter Form vervielfältigen, verbreiten und öffentlich zugänglich machen, sofern Sie den Namen des Autors/Rechteinhabers in der von ihm festgelegten Weise nennen.
Identifikatoren
Herausgeber
Erscheinungsjahr
Sprache

Inhaltliche Metadaten

Fachgebiet
Genre
Abstract
Collectl is a comprehensive, fine-grained monitoring tool that collects a vast quantity of system metrics Collectl was developed over a dozen years ago as a very lightweight yet highly detailed system monitoring tool, capable of collecting hundreds system performance metrics as frequently as every second. Its companion tool colplot, provides an easy to use web-based plotting package capable of displaying detailed statistics for multiple systems at the same time. Think of colmux, a third tool, as top-anything across a number of machines. It is capable of displaying anything collectl can collect in top-format, sorted by any column of your choice. For example, say you have a 100 node cluster, with colmux you can look at the disk wait times across thousands of disks sorted sorted from slowest to fasted, allowing you to easily identify bad or hot drives, OR look at memory consumption for leaks. How about a bad NIC consuming a CPU by interrupts? Or how about which process is doing the more disk reads, or write, or page faults? And remember, this is across a cluster. The focus of collectl has always been highly efficient metrics collection and display via a CLI, no pretty pictures and no databases to slow it down. However what it does have is an API to allow it to pass those metrics onto whatever high level tools one may wish to communicate with. For example it has native support for ganglia or graphite over a socket. If you have some other favorite tool it can usually be adapted to communicate with it as well. Unfortunately most centralized tools are easily overwhelmed with fine-grained metrics and can only deal with them at granularities in the 1-min range. Not to worry, collectl has the ability to record and save metrics to local disk at one rate and send simultaneously send them to a central tool at a different rate, making it possible to get a coarser-grained centralized view and if there is a problem, still have access to finer-grained data. Collectl has been used for monitoring some of the largest computing clusters in the world and in the last several years has been enhanced for monitoring Open Stack Clouds. It is currently packaged as part of OpenSUSE.