GrimoireLab a Python toolset for software development analytics

Formal Metadata

Title
GrimoireLab a Python toolset for software development analytics
Series title
Number of parts
611
Author
License
CC Attribution 2.0 Belgium:
You may use and modify the work or its content for any legal purpose, and reproduce, distribute, and make it publicly available in unchanged or modified form, provided that you credit the author/rights holder in the manner they specify.
Identifiers
Publisher
Release year
Language
Production year
2017

Content Metadata

Subject area
Genre
Abstract
The talk will explain how to analyze software development repositories of common use in the free software community with the GrimoireLab tools, a toolset for software development analytics written in Python. It will start by explaining how to retrieve data from git, Bugzilla, GitHub, mailing lists, StackOverflow, Gerrit, and many other repositories, and how to organize it in a database. The talk will later explain how this database can be exploited with several components of the toolset, for different purposes. In this context, special attention will be given to how to extract useful information from it using Python/Pandas and iPython/Jupyter Notebooks, and how to use ElasticSearch/Kibana to deploy actionable dashboards that show data in all its glory.

Many free / open source software (FOSS) projects feature an open development model, with public software development repositories which anyone can browse. These repositories are normally used to find specific information, such as a certain commit or a particular bug report. But they can also be mined to extract all relevant data, so that it can be analyzed later to learn about any specific or general aspect of the project. This talk will explain the GrimoireLab method for doing that, which is based on organizing all that information in a database, which can be analyzed later. This approach allows for minimal impact on the project infrastructure, since data is retrieved only once, even if it is later analyzed many times. It also allows for efficiency and comfort when mining data for an analysis, since the results are readily available, databases can be shared and replicated at will, and querying them with any kind of tool is easy.

The tools that retrieve information from the repositories are grouped in the GrimoireLab toolset. It includes mature, widely tested programs capable of extracting information from most repositories used by FOSS projects of any scale. Many of them are agnostic with respect to the database used, although currently ElasticSearch is the best supported.

The produced databases can be exploited in several ways, of which two will be explained during the talk: using Python/Pandas to produce iPython/Jupyter Notebooks which analyze some aspect of the project; and using Python to feed an ElasticSearch cluster, with a Kibana front-end for visualizing the data in a flexible, powerful dashboard.

All these approaches can be used to understand general aspects of the project, such as how efficient the code review or bug fixing processes are, how diverse contributions to the git repository are, or how conversations in mailing lists or StackOverflow are shaped. But they can also be used to drill down and analyze the contributions by a certain developer, the longer code review processes, or the contents of the most lively email and Q&A threads.

The talk will explain the whole process from data retrieval to visualization, and will show some specific cases of real-world use, such as the dashboards produced for Eclipse, OPNFV, MediaWiki and many others. Some of the contents of the talk are described in detail in the online book GrimoireLab Training.
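
The "retrieve once, analyze many times" approach described in the abstract can be illustrated with a small sketch. This is not GrimoireLab code: it assumes a handful of hypothetical commit records (the `SAMPLE_COMMITS` list) in place of data retrieved from a real repository, and it swaps in Python's built-in `sqlite3` module as a stand-in for the database, whereas the abstract names ElasticSearch as the best supported option. The function names are illustrative only.

```python
import sqlite3

# Hypothetical commit records, standing in for items that a retrieval
# tool would have fetched from a git repository.
SAMPLE_COMMITS = [
    {"hash": "a1", "author": "alice", "date": "2017-01-10"},
    {"hash": "b2", "author": "bob", "date": "2017-01-11"},
    {"hash": "c3", "author": "alice", "date": "2017-02-01"},
]


def store_commits(conn, commits):
    """Store commits in the database; items already present are skipped."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS commits "
        "(hash TEXT PRIMARY KEY, author TEXT, date TEXT)"
    )
    # INSERT OR IGNORE keeps the first copy of each hash, so storing the
    # same items again does not duplicate them.
    conn.executemany(
        "INSERT OR IGNORE INTO commits VALUES (:hash, :author, :date)",
        commits,
    )
    conn.commit()


def commits_per_author(conn):
    """First analysis: number of commits by each author."""
    rows = conn.execute(
        "SELECT author, COUNT(*) FROM commits GROUP BY author ORDER BY author"
    )
    return dict(rows.fetchall())


def commits_per_month(conn):
    """Second analysis over the same stored data: commits per month."""
    rows = conn.execute(
        "SELECT substr(date, 1, 7) AS month, COUNT(*) FROM commits "
        "GROUP BY month ORDER BY month"
    )
    return dict(rows.fetchall())


conn = sqlite3.connect(":memory:")
store_commits(conn, SAMPLE_COMMITS)
store_commits(conn, SAMPLE_COMMITS)  # storing again adds nothing

print(commits_per_author(conn))
print(commits_per_month(conn))
```

Both analyses read from the stored table rather than from the original repository, which is the property the abstract highlights: the project infrastructure is contacted once, and the database can then be queried as often as needed.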