PyAutoFit: A Classy Probabilistic Programming Language For Data Science
Formal Metadata

Title | PyAutoFit: A Classy Probabilistic Programming Language For Data Science
License | CC Attribution - NonCommercial - ShareAlike 4.0 International: You may use, change and reproduce the work or its content for any legal, non-commercial purpose, and distribute and make it publicly available in unchanged or changed form, provided you credit the author/rights holder in the manner they specify and pass the work or content on, including in changed form, only under the terms of this license.
Identifiers | 10.5446/58768 (DOI)
Transcript: English (automatically generated)
00:06
Great, so now we have with us James Nightingale, who's going to talk about PyAutoFit, a classy probabilistic programming language for data science. Welcome, James. He is an observational cosmologist and postdoctoral researcher at Durham University,
00:25
where he focuses on strong gravitational lensing, devising new ways to use it to study dark matter and the distant universe. So over to you, James. Can I just first check, is my microphone working? Yes. Yes, it's working fine. Is my screen displaying correctly?
00:41
Yep. Brilliant. It's in full screen. Great. Excellent. Okay. So good afternoon, everyone. I'm James Nightingale. Thank you for the introduction. I'm a cosmologist at Durham University, and my research typically focuses on studying galaxies and trying to understand the nature of dark matter. In the past couple of years, we've found that the statistical methods and techniques that we use to do that have far-reaching applications in the data science domain.
01:06
So we've ended up developing this open source software called PyAutoFit to try and basically allow people to use these techniques in a far more generalized setting. So for this talk, I'm going to give you a run-through of model fitting and of what PyAutoFit and probabilistic programming are, and give you a sense of what this software does.
01:23
And then I'm going to describe sort of how we ended up here, starting with our cosmological use case on strong gravitational lensing. And so I want to begin by making sure that we're all kind of in the same place in understanding what I mean when we're talking about model fitting, probabilistic programming languages, and so on.
01:41
So I've sort of got this initial slide just to say that when we're talking about model fitting, these are the types of things, you know, I've got some data, I've got these data points, and I'm going to fit them with a curve. I want to understand what model corresponding to this red line gives the best fit of the data. I've got another image here showing the results or a pictorial representation of a Markov chain Monte Carlo analysis, or MCMC, so if you're familiar with that type of stuff, this
02:04
is the domain we're talking about. And I've also got Bayes' theorem, because all of these model fitting tools, all of the things you can do with PyAutoFit, can be done in the Bayesian context, following Bayes' theorem, using Bayes' equation, and so on. And so to really drive home what we're talking about, I'm going to very quickly go through
02:23
the simplest model fitting example one could conceive of. So here I have some data: it's a 1D dataset that clearly contains a signal, a Gaussian, and the data has noise. My task as a model fitting expert is to find out what Gaussian best fits this data, what Gaussian corresponds to this signal.
02:45
And so the way I would approach the model fitting is as follows. I would first compose my model as a Gaussian, which has three parameters: a centre, an intensity, and a sigma width value. I would then draw a set of parameters via my model fitting algorithm.
03:00
So here you can see we've got a centre of 60, and so on. I would then use these parameters to create a model Gaussian; this is the Gaussian that corresponds to this set of parameters, and you can see it's centred on a value of 60. The next step in my model fitting process is to use this Gaussian to fit my dataset. I would compare my model to the dataset, and you can see here that this Gaussian
03:21
isn't really representative of my data. I would subtract the two from one another to get residuals and define some sort of figure of merit or likelihood function that quantifies how well this model Gaussian fitted my data. You can see here it's not done a very good job: the residuals and chi-squared values are very large. I would then repeat this process using some
03:44
model fitting algorithm, or what's called a non-linear search. And this non-linear search would guess lots of values of parameters, and eventually it would find a solution that gives us the highest likelihood and indeed tells us what the Gaussian parameters of this dataset correspond to.
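To make the figure-of-merit step concrete, here is a minimal sketch of the residual and chi-squared calculation just described, using NumPy; the function names and the exact likelihood normalization are illustrative, not taken from the talk:

```python
import numpy as np

def model_gaussian(xvalues, centre, intensity, sigma):
    # Model Gaussian evaluated at the data's x coordinates.
    return intensity * np.exp(-0.5 * ((xvalues - centre) / sigma) ** 2.0)

def log_likelihood(parameters, xvalues, data, noise_map):
    centre, intensity, sigma = parameters
    model_data = model_gaussian(xvalues, centre, intensity, sigma)
    residual_map = data - model_data                      # residuals
    chi_squared = np.sum((residual_map / noise_map) ** 2.0)
    # Higher likelihood means smaller residuals relative to the noise.
    return -0.5 * chi_squared
```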
04:01
And so this is all very simple, and hopefully you're following, but I wanted to really start at the beginning so we're all on the same page about what probabilistic programming languages do, and therefore what PyAutoFit does. So what is a probabilistic programming language, or PPL for short? Well, these are basically software packages, frameworks or statistical inference
04:21
libraries that make it straightforward to compose a probabilistic model, i.e. the Gaussian I just showed you, and fit it to data, i.e. perform inference automatically. So people who are familiar with this type of software will know of many PPLs, some of the most popular are PyMC3, Stan, there's many that focus on more sort of machine learning,
04:46
deep learning techniques like Pyro, and each of these probabilistic programming languages, they're all suited to different problems, they have different core features, so they have strengths and weaknesses. So there is a question here, why have we ended up developing our own PPL to do astronomy when there already exists some of the biggest open source
05:03
projects on earth? And the reason we think we've done this is because we've actually found that the type of statistical inference problems, the type of model fitting challenges, that we face in astronomy and cosmology, and, as we're learning, in a wider data science setting, were problems that existing PPLs weren't really suited to. So I've listed a couple of examples here. In astronomy we have these large homogeneous data sets, you know, images of thousands of galaxies, and all we want to do is fit those images one by one in an identical, homogeneous fashion, and we just want tools that make doing that straightforward, that make it straightforward to
05:42
process large libraries of results and then do our science, do our study. Another example: in astronomy we often have these very expensive likelihood evaluations, and our model fits can take days if not months to run, whereas with a lot of PPLs a fit typically takes, you know, a minute to run, and the challenges you face are very different. And so AutoFit has lots of tools
06:05
for customizing how the model fit is performed, as well as doing this in the context of massively parallel computing. We also have a need to fit each data set with many different models and to streamline model comparison, using Bayesian inference to determine what the best models are,
06:22
which again isn't something that's typical of a lot of PPLs. So the way we concisely describe this is that PyAutoFit is highly customizable model fitting software for big data challenges in the many-model regime, and I'll explain how PyAutoFit can be used in a second. First, just to get the
06:42
links and whatnot out of the way: PyAutoFit is obviously an open source project, we're developing it to do cosmology, and we want as many people to use this as possible. We have a GitHub; there are all the things you'd expect if you're interested in this type of stuff, so check it out, they're listed on the schedule page. And to really drive home how we want as many people using this as possible, we have written a Jupyter Notebook
07:04
lecture series, which we use to teach our students at Durham about statistics and model fitting, but these are publicly available, so anyone who wants to get into this domain and learn how to do this type of model fitting should absolutely check them out. They're on the Read the Docs, they can be done on
07:21
Binder, and, you know, we're getting very good feedback that these are a great introductory way for someone to get into Bayesian inference, statistics, model fitting and so on. Okay, so now let's look at how we would actually use PyAutoFit, using what we call our classy interface. And so to demonstrate how one would approach model fitting in PyAutoFit,
07:43
I'm going to use the same example problem I just showed, that is, fitting data that contains a Gaussian with a Gaussian. How would we get PyAutoFit to find the parameters that correspond to this red curve? In order to set up a model fit with PyAutoFit, you basically have to undertake three steps. It requires you to basically write three Python classes,
08:06
and so the reason we call this classy probabilistic programming is that it's heavily built on the Python class data structure. So first of all, we need to write our model as a class. This is an example of what we'd write; we'd call the class Gaussian because
08:20
this is the model component we're going to fit, and the crucial thing to understand is that the parameters of the Gaussian that we're going to fit, the centre, the intensity, the sigma we saw previously, are written as the input parameters of our constructor, our init function. So PyAutoFit, when we do the fitting, will read this init constructor, it will recognize these parameters, and it will compose a model
08:43
and a non-linear parameter space of these dimensions. So that's the first thing to understand about the API. The other nice thing about using Python classes is, of course, that you can extend this Gaussian class with functions and tools that do the things you need it to do. So this function will allow us to create the model Gaussian that we compare
09:04
to the data, and we're about to use this function to perform our model fit. So this is the first of the three classes we need: our model.
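As a sketch of what such a model class looks like (following the pattern in the PyAutoFit docs; the method name and default values here are illustrative):

```python
import numpy as np

class Gaussian:
    def __init__(self, centre=0.0, intensity=0.1, sigma=0.01):
        # PyAutoFit reads this constructor and treats each input
        # as a free parameter of the model.
        self.centre = centre
        self.intensity = intensity
        self.sigma = sigma

    def profile_from_xvalues(self, xvalues):
        # Extra method used to create the model Gaussian that we
        # compare to the data during the fit.
        return self.intensity * np.exp(
            -0.5 * ((xvalues - self.centre) / self.sigma) ** 2.0
        )
```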
09:24
The second thing we want to write is an analysis class. This is where our data meets our model, and we fit the data with the model in order to get our figure of merit, our likelihood. The analysis class has two inputs. It's got another init constructor, which is where you put your data, your noise map, anything you need to do the model fit, and alongside that you define your log likelihood function. This is the function that takes an instance of the model, fits it to the data and returns the likelihood. It tells AutoFit how
09:42
well that model fits the data. So the black magic, the crucial thing that AutoFit is doing, is that this instance that comes in, as we can see, is an instance of our Gaussian class, and the values of its input parameters, which we saw here, have been set by our model fitting algorithm, our non-linear search. So if we're doing a model fit that has priors,
10:05
AutoFit will take care of all of that behind the scenes, and you just focus on writing how the actual likelihood is computed.
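A sketch of such an analysis class, again following the pattern in the PyAutoFit docs (`af.Analysis` and `log_likelihood_function` are the real hooks; the chi-squared arithmetic mirrors the earlier example):

```python
import autofit as af
import numpy as np

class Analysis(af.Analysis):
    def __init__(self, data, noise_map):
        super().__init__()
        # Store everything needed to perform the model fit.
        self.data = data
        self.noise_map = noise_map

    def log_likelihood_function(self, instance):
        # `instance` is a Gaussian whose parameter values were chosen
        # by the non-linear search; priors are handled behind the scenes.
        xvalues = np.arange(self.data.shape[0])
        model_data = instance.profile_from_xvalues(xvalues=xvalues)
        residual_map = self.data - model_data
        chi_squared = np.sum((residual_map / self.noise_map) ** 2.0)
        return -0.5 * chi_squared
```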
10:20
So we have our model and we have our analysis class; the final thing we need to do is put all of these together to perform a fit. We compose our model: we say AutoFit, create a model of a Gaussian. We create our analysis and pass it the data, which we've already loaded. We now choose our non-linear search; I've chosen a Markov chain Monte Carlo (MCMC) fitting algorithm called emcee. It's very popular in cosmology, but we of course have
10:40
many of the SciPy maximum likelihood estimators, and we have Bayesian inference tools like nested sampling, if you've heard of that. And by passing the model and analysis to this emcee search, we perform the fit and get the result, which corresponds to the red curve. It gives us the Gaussian that fits the data.
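Putting it together looks roughly like this; a sketch assuming a recent PyAutoFit API, where `af.Model` composes the model and `af.Emcee` wraps the emcee sampler (older releases used `af.PriorModel`), and reusing the `Gaussian` and `Analysis` classes sketched above with `data` and `noise_map` already loaded:

```python
import autofit as af

model = af.Model(Gaussian)                # compose the model to fit
analysis = Analysis(data=data, noise_map=noise_map)

search = af.Emcee()                       # MCMC via the emcee library
result = search.fit(model=model, analysis=analysis)

# The result contains the best-fit model and the full set of samples.
print(result.max_log_likelihood_instance.centre)
samples = result.samples
```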
11:00
And just to really emphasize, this result object that you get from AutoFit has everything you would need to interpret and inspect how well the model fits the data. So it has the best-fit red curve, but it also has tools for error analysis, it has all of the parameter samples of your non-linear search, and it also has visualization tools for creating these sorts of probability distributions. So you can see here that the value of centre was 50; the input value corresponds to high probability
11:23
when we visualize it in this way. Okay, so that's nice, you know, it's straightforward to compose and fit a model in AutoFit, but at the moment it's not clear what this library is allowing one to do that you couldn't do with another PPL, or indeed just kind of write the Python code to do it yourself. It makes the process easier, but there's nothing overly
11:43
compelling about this yet. So now let's start to look at how AutoFit makes it straightforward to customize different aspects of your model fitting, and this is where the use of Python classes really starts to come into its own. This has downsides, of course: Python classes are a bit less concise an interface, and it requires a basic understanding of Python classes
12:03
and object-oriented programming, but as I said, it really allows us to build a far more customizable model fitting experience for the user. And so here's an example of how one would customize the model. In this example I don't want to fit one Gaussian to my 1D dataset,
12:21
I want to fit two Gaussians, which you can see I've instantiated here, and in order to do this I just create the Gaussians and then at the end I combine them in an AutoFit collection object. So this is the beginning of how we're going to start building models of more complexity in AutoFit by combining individual model components. But along the way I take a number
12:41
of steps to customize the model that I ultimately fit. If I'm doing Bayesian inference, I can manually set the priors on each parameter, which is shown here. I might know that the sigma value of a Gaussian is 0.5; let's pretend I knew that. I can just set that to a float, and this will then automatically have PyAutoFit reduce the dimensionality of parameter space
13:02
by one, and all of my model fits will have this value for this Gaussian. Along the same lines, I could link two parameters in the model; here I make my two Gaussians centrally aligned with one another, again reducing parameter space's dimensionality by one. And I can do other things like make assertions; we have lots of tools to customize the model fit to what you specifically need for your problem.
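In code, those customizations look something like the following sketch (the prior and collection names follow the PyAutoFit docs; treat the exact values as illustrative):

```python
import autofit as af

gaussian_0 = af.Model(Gaussian)
gaussian_1 = af.Model(Gaussian)

# Manually set a prior on a parameter.
gaussian_0.centre = af.UniformPrior(lower_limit=0.0, upper_limit=100.0)

# Fix a parameter to a value, removing it from the non-linear search.
gaussian_0.sigma = 0.5

# Link the two centres so the Gaussians are centrally aligned,
# again reducing the dimensionality of parameter space by one.
gaussian_1.centre = gaussian_0.centre

# Combine the components into a single model to fit.
model = af.Collection(gaussian_0=gaussian_0, gaussian_1=gaussian_1)
```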
13:21
The other nice thing about the model API is that, having now seen that we use Python classes to define our model components, you can imagine that if you've got a problem where there are lots of slightly different models you want to fit and compare, the API naturally allows you to do this. So here I've got two more examples
13:42
of 1D profiles: there's a Gaussian kurtosis class, where I've just added an extra parameter to the Gaussian, and you could also imagine maybe I've got a one-dimensional exponential. So if you've got these problems where the model can be broken down into these different pieces, and you often want to compose a model by building them up together like Lego, this is the sort of API that this software really tries to facilitate.
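For example, extra profile classes can be added with the same constructor pattern (the parameter names here are illustrative):

```python
class GaussianKurtosis:
    def __init__(self, centre=0.0, intensity=0.1, sigma=0.01, kurtosis=0.0):
        # A Gaussian with one extra free parameter.
        self.centre = centre
        self.intensity = intensity
        self.sigma = sigma
        self.kurtosis = kurtosis

class Exponential:
    def __init__(self, centre=0.0, intensity=0.1, rate=0.01):
        # A one-dimensional exponential profile.
        self.centre = centre
        self.intensity = intensity
        self.rate = rate
```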
14:01
We also have a lot of customization on the analysis. I'm only going to show one example here, but it's pretty cool. If in your analysis class you add a visualize function, this will allow you to have PyAutoFit output the current results of the best-fit model on the fly. So I've told this visualize function to output the
14:24
images I've been showing you, that is, via matplotlib, output a one-dimensional schematic of how well the model fits the data. When I'm fitting models for cosmology that could take months on the supercomputer, having this tell me after a couple of days whether the model is working can save me months, because I can stop the run by getting this immediate feedback.
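A sketch of such a visualize hook, assuming the `visualize(self, paths, instance, during_analysis)` signature from the PyAutoFit docs; the output path attribute is an assumption:

```python
import autofit as af
import matplotlib.pyplot as plt
import numpy as np

class Analysis(af.Analysis):
    # __init__ and log_likelihood_function as in the earlier sketch.

    def visualize(self, paths, instance, during_analysis):
        # Called during the fit with the current best-fit instance,
        # so images are output on the fly while the search runs.
        xvalues = np.arange(self.data.shape[0])
        model_data = instance.profile_from_xvalues(xvalues=xvalues)
        plt.plot(xvalues, self.data, "k.", label="data")
        plt.plot(xvalues, model_data, "r-", label="model")
        plt.legend()
        plt.savefig(f"{paths.image_path}/fit.png")  # path attribute assumed
        plt.close()
```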
14:44
And on this line of customization, there's lots of customization for the actual non-linear search itself. For all of the libraries we support, you can customize their input parameters, and we also add functionality on top of these libraries; for example, for those familiar with Markov chain Monte Carlo, we have inbuilt tools for autocorrelation analysis. We're just trying to
15:03
basically add value to these libraries if one adopts PyAutoFit to undertake their model fitting. And so I'm about to talk about the astronomy, but I just want to quickly list some of the advanced features that are a bit too technical, a bit too detailed, to discuss in a talk like this, but at least allude to the sort of things one can do with AutoFit if you
15:23
really start to go into it. And so one that I really want to highlight is our support for outputting results into a database. You can output the results to hard disk, and it will create an ordered folder structure that you can navigate with your mouse, which is all quite nice, but that only works if you've got tens of
15:45
data sets, a very small number of data sets; yes, that will work. But we are fitting thousands of galaxies with hundreds of models, and our results correspond to hundreds of thousands of entries, so we had to develop an SQLite relational database that basically streamlined the management and interpretation of these results. The basic model is as follows: as you do your model fits with AutoFit, all of the results
16:04
are automatically written to an SQLite database. You can then load this via a Jupyter notebook, query the database for the results you care about, and from there begin to investigate how your model fits went and what the results are telling you. And all of our result API, the visualizations I showed earlier, are built around this database and Jupyter notebook API.
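A sketch of that workflow, assuming the database API described in the PyAutoFit docs; the exact query and loading syntax may differ between versions:

```python
import autofit as af

# Connect to the database the searches wrote their results into.
agg = af.Aggregator.from_database("database.sqlite")

# Query for only the fits we care about (syntax is a sketch; this
# assumes a model component named `gaussian` in a Collection) ...
agg_query = agg.query(agg.model.gaussian.sigma < 3.0)

# ... and iterate over their results in a Jupyter notebook.
for samples in agg_query.values("samples"):
    print(samples.max_log_likelihood_instance)
```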
16:25
There's also lots of advanced model fitting techniques I'm not going to cover, but basically these are very bespoke statistical methods that in certain problems could be really, really useful. The one I'm going to briefly mention: we had a problem where a non-linear parameter space was so complex, so difficult to fit, that we could not do it
16:45
efficiently. So we built a grid search of non-linear searches, which basically carved up the parameter space over a couple of dimensions and then fitted those reduced parameter spaces in a massively parallel fashion. It's a bit of a weird thing, and you probably don't want to do
17:01
it too often, but if you do want to do it, it's a really powerful tool to overcome the sorts of problems you often face with these types of fits. We have other tools, and I don't have time to talk about them; they're fully described on the Read the Docs under the features section. Okay, so that's AutoFit, that's the type of model fitting that we're trying to do, and hopefully you've got a sense of what this library is about. I now want to
17:23
describe how we got to AutoFit from our initial cosmology use case, so you get a sense of the actual application that drove this; it might ring some bells with the sorts of things that you do. And so to do this, I obviously first need to explain to everyone what strong gravitational lensing is. So most astronomers, when they study galaxies,
17:46
they look at things like this. This is a galaxy in the Milky Way. The typical astronomer, you know, you get your favorite telescope, you point it at a galaxy, you get an image like this, and then you would perform model fitting on this image to study your particular scientific interest.
18:03
Strong gravitational lenses are a very unique phenomenon where, instead of observing one galaxy, you observe two galaxies perfectly down our line of sight. So this red galaxy is the foreground galaxy that's closer to us; you can see it's emitting red light, but it also
18:20
has mass, and that mass curves space-time in on itself, such that the light from the background source galaxy doesn't travel straight into our telescope, but bends around space-time and therefore becomes stretched, sheared and distorted into this ring-like appearance. This is called an Einstein ring, after Einstein. So these are examples of the sort of problem that
18:43
drove the development of AutoFit. This is a two-dimensional schematic, just to make sure we're really on the same page about what a strong gravitational lens is. We've got a foreground galaxy here; it's curving space-time such that the red light of this background source travels on this curved traversal around into our telescope, and this actually means the background source
19:02
appears multiple times, as seen here. So, if you're interested in astronomy, there are galaxies that we genuinely observe more than once due to this phenomenon; we see the same object in the universe multiple times. So I'm going to talk about how this informed PyAutoFit, but just for the machine learning aficionados in the audience, people who are into
19:22
machine learning: this phenomenon has a growing literature of machine learning studies, and you should check it out if you're interested in this sort of stuff. It's a great use case because you can generate large training data sets very cheaply, and there are these large astronomical instruments that are going to find thousands of these systems. It really is your sort
19:41
of stereotypical needle-in-a-haystack machine learning, data science, big data problem. So if you're interested in astronomy, you should definitely just do a Google search on this and have a read of what's out there. I'll obviously say that I develop an open source library that does this analysis, PyAutoLens, so Google PyAutoLens if you're interested. This is what ultimately led to us developing AutoFit, and again we have
20:04
all these Jupyter lectures, so if you want to get into this, check them out. Okay, that's enough shameless plugging of my other software; let's get back to the science. What drove the development? What was it about these strong gravitational lens systems that pushed us in the direction of making this open source statistics library? It's basically how this phenomenon
20:25
makes one think about model composition, in particular multi-level model composition, something that I haven't yet touched on in this talk. So when I look at a strong gravitational lens now, I don't look at a pretty picture and think, wow, that's an awesome thing we see in the universe; I think about how
20:43
it decomposes into distinct model components that I then want to fit. So I have a strong gravitational lens here, and my scientist brain says: well, there's a foreground galaxy, which has a set of model parameters associated with it that I want to learn, and there's a background source galaxy, which has another set of parameters associated with it that I want to learn.
21:04
But then I break them down further. I say: well, this lens galaxy doesn't just have one model component. It has a model describing its emission, or a light model, and it also has a separate model that describes its mass, and this is what defines how the light in the universe is curved. Conversely, my background source only has a light model; I don't
21:23
need to know anything about its mass. So this type of problem forces you to break the models you imagine into these distinct components that have multiple levels, and it's with AutoFit that we're now going to construct a multi-level model according to this schematic. And so for the lower levels of this model, we can use the exact same
21:44
API that I showed previously for the Gaussian: we write model components. This is an example of a light profile; these are the input parameters that would describe the light of a galaxy, and we can attach functions that we use to fit the data, which is shown here. And again, with
22:01
AutoFit, as I alluded to before, we can write other Python classes with a very similar format, a very similar API, describing the mass of galaxies and the light of galaxies. This is where we got to with the Gaussian before, but we have a slightly different problem now, because we want a multi-level model that doesn't just have light and mass profiles, but has distinct model components describing these galaxies, which may have their own parameters. We need to basically
22:24
use Python classes to construct a multi-level model; in particular, we need to use hierarchies of Python classes to build models that go up and down, and this is where I think the real compelling aspect of the PyAutoFit API comes in. So this is another Python class, where we've
22:45
now written a galaxy object, and the crucial thing to understand is that the inputs of its init constructor are themselves lists of the model components I just showed you. So PyAutoFit will see an object like this galaxy, it will understand that its init constructor contains other PyAutoFit
23:05
model objects, and it will use this hierarchy of Python classes to construct a multi-level model. You can also add additional parameters to these objects, so you again have this very nice level of customization of the model that you build.
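A sketch of that hierarchy; the class and attribute names here are illustrative, and the pattern is what matters: model components appear as constructor inputs of a higher-level class:

```python
class LightProfile:
    def __init__(self, centre=(0.0, 0.0), intensity=0.1, effective_radius=0.6):
        self.centre = centre
        self.intensity = intensity
        self.effective_radius = effective_radius

class MassProfile:
    def __init__(self, centre=(0.0, 0.0), einstein_radius=1.6):
        self.centre = centre
        self.einstein_radius = einstein_radius

class Galaxy:
    def __init__(self, redshift, light_profiles=None, mass_profiles=None):
        # The constructor inputs are themselves (lists of) model
        # components, so PyAutoFit builds a multi-level model from
        # this hierarchy of Python classes.
        self.redshift = redshift
        self.light_profiles = light_profiles
        self.mass_profiles = mass_profiles
```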
23:20
I've also got functions here that we use to fit the likelihood; I'm not going to go into the details of how they fit, because I'm trying to sell the composition of these types of models. So we're trying to construct a multi-level model like this using this galaxy class. This is the Python code we've been writing: the same tools that we saw before, but instead of just passing a Gaussian to this model, we are now passing it a galaxy, and with that galaxy
23:46
we're filling in the light and mass profiles that you would use. So if you've got a model fitting problem where you can really break the model down into these distinct components, PyAutoFit has an API that allows you to use those components to build models of arbitrary dimensionality, arbitrary complexity, and so on.
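Composing the multi-level model then reuses the same API as before; a sketch assuming the illustrative classes above, where passing `af.Model` instances (or lists of them) as constructor inputs builds the hierarchy:

```python
import autofit as af

lens = af.Model(
    Galaxy,
    redshift=0.5,
    light_profiles=[af.Model(LightProfile)],
    mass_profiles=[af.Model(MassProfile)],
)
source = af.Model(Galaxy, redshift=1.0, light_profiles=[af.Model(LightProfile)])

# Same Collection API as the two Gaussians, now with galaxies; this
# model is passed to a non-linear search exactly as before.
model = af.Collection(lens=lens, source=source)
```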
24:05
And so in this example, this model has 16 free parameters, but you could easily make this model have hundreds, just by adding more galaxies and more light profiles. This is the analysis class; there's not a lot to say here. The key point is that this instance, which previously only contained the Gaussian, now contains multiple levels: there's a galaxy here, the galaxy
24:25
has a light profile, the light profile has a parameter. So the multi-level model will be constructed by PyAutoFit in the non-linear parameter space and come into this likelihood function in the most convenient, usable way you could imagine. And just to
24:42
wrap up now, and to try and sell why this is so compelling in certain problems: we then had a data set with an object like this, which is called a galaxy cluster. This has hundreds of galaxies, it has hundreds of background source galaxies, and these galaxies can have multiple mass profiles, multiple light profiles,
25:03
but because we designed the composition of models in the way that I just described, we could compose and fit a model to this without having to write any more source code. The PyAutoFit API was extensible such that we could fit models of any nature given this new data set. So that's really what we're going for: we're trying to create this model fitting library
25:23
that compels one to design their model fitting problem in the most object-oriented way possible, so that you can then build and fit these models in a fully extensible way. So this is the summary; I've timed myself at about 25 minutes. I'll quickly mention that we have a whole other use case, to do with a study in cancer, that I've not had time to talk about today.
25:43
Yeah, absolutely check it out if you're interested. Thanks for listening; I'll stop there. Thank you, James, for that very amazing talk. We do have some time to take a couple of questions. I'll just show them right here and then we can take
26:05
them. For someone who is new to probabilistic programming, what does intensity signify, in addition to mean and standard deviation? Okay, so that's my normalization. Let's go back to the slide. In this example, intensity is a parameter of the Gaussian;
26:25
in this sense, the intensity is basically the normalization of the Gaussian. It's one of my three model parameters that, in the way I've chosen to parameterize a Gaussian, defines how high up this red curve goes. So if I doubled the value of intensity here, the model Gaussian I'd create would be twice as high
26:44
as pictured here. So it doesn't signify, it's not like, a mean or a sigma or a full width at half maximum; it doesn't really signify anything meaningful in the context of a Gaussian. It was just how I chose to parameterize this Gaussian in this particular setting, so sorry for the confusion there. All right, thank you for answering that. Let's get to the next question:
27:08
how does PyAutoFit compare to PyMC3, and when should one be used over the other? Yes, this is a great question; it's something I've thought long and hard about, and I tried to allude to it earlier. There are a lot of use cases
27:25
in which you should absolutely use PyMC3 and there'd be no point using AutoFit, but if you've got a very data-driven use case, if you're fitting large data sets and you need this high level of customization of your model fit, AutoFit can be a lot more useful. So I'm going to try and answer this a bit more concisely. My experience with PyMC3 was: if you're trying to
27:45
integrate a function but you're not really using it to fit data, that's what the API for PyMC3 was geared towards, understanding the integral of a function, doing a thermodynamic analysis or something. It was never clear with PyMC3 how one would feed data and a noise map through the analysis, and you had to use specific tools, whereas with AutoFit
28:04
the API that greets you says: this is where your data goes, this is how you fit your data. So I'd say it's an extremely hard question to answer succinctly, but the notion of having data and fitting that data with a model is something that strikes you in the face immediately with AutoFit, whereas the other PPLs often deal with statistical problems
28:25
in a slightly different context. But it's really hard to give a straightforward, simple answer about when you should use one PPL over the other. Well, thank you for that answer anyway.
28:40
Okay, the audience would love to connect with you in the breakout room, and thank you for this amazing talk. Thank you very much.