
Caching for Jupyter Notebooks

Formal Metadata

Title
Caching for Jupyter Notebooks
Series Title
Number of Parts
131
Author
Contributors
License
CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You are free to use, adapt, copy, distribute and make publicly available the work or content, in unchanged or adapted form, for any legal and non-commercial purpose, provided that you credit the author/rights holder in the manner they specify and that you pass on the work or content, including in adapted form, only under the conditions of this license.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
Caching data and calculation results in Jupyter notebooks is a great way to speed up development by making expensive cells faster to re-run. Data scientists and developers who use notebooks on a daily basis can improve their workflow with low-effort changes to the notebook code, cutting the time spent waiting and reducing context switches. This talk targets developers and data scientists of all experience levels and will cover:

Why caching in notebooks? Setting up the context in which developers and data scientists use notebooks for exploratory work, and how caching is relevant to it.

What is caching? A quick definition of caching, introducing the different types of persistence (in-memory, on disk, database, object storage, …) and cache invalidation strategies (parameters, code changes, TTL, …), with some cautionary comments about data security when caching protected data.

Caching techniques: Going through readily available options from the Python standard library and how to use them in notebooks. Introducing a few off-the-shelf options such as IPython % magics and cachetools. Showcasing how to build your own mini caching framework that fits a specific use case, using pandas and Spark as the example. Explaining when to stop trying to cache and how to keep the caching framework mini: what are the signs that caching has gone overboard?
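
To make the techniques named in the abstract more concrete, here is a minimal sketch of two of them: in-memory caching with `functools.lru_cache` from the standard library, and a small hand-rolled on-disk cache with parameter-based, version-based and TTL-based invalidation. It is not the speaker's code. The names `disk_cache`, `CACHE_DIR`, `in_memory_square` and the `version` argument are hypothetical, and plain picklable Python objects stand in for the pandas and Spark data mentioned in the abstract.

```python
import functools
import hashlib
import pickle
import tempfile
import time
from pathlib import Path

# Hypothetical cache location; any writable directory would do.
CACHE_DIR = Path(tempfile.gettempdir()) / "notebook_cache_demo"


@functools.lru_cache(maxsize=None)
def in_memory_square(n):
    """In-memory caching: results live only as long as the kernel does."""
    time.sleep(0.1)  # stand-in for an expensive computation
    return n * n


def disk_cache(ttl_seconds=None, version="1"):
    """On-disk caching that survives kernel restarts.

    The cache key combines the function name, a manual version string
    (bump it when the code changes) and the call arguments. Entries older
    than ttl_seconds are recomputed; None means they never expire.
    """

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            CACHE_DIR.mkdir(parents=True, exist_ok=True)
            raw_key = pickle.dumps(
                (func.__qualname__, version, args, sorted(kwargs.items()))
            )
            path = CACHE_DIR / (hashlib.sha256(raw_key).hexdigest() + ".pkl")

            if path.exists():
                age = time.time() - path.stat().st_mtime
                if ttl_seconds is None or age < ttl_seconds:
                    with path.open("rb") as fh:
                        return pickle.load(fh)

            # Cache miss or expired entry: compute and store the result.
            result = func(*args, **kwargs)
            with path.open("wb") as fh:
                pickle.dump(result, fh)
            return result

        return wrapper

    return decorator
```

In a notebook, an expensive cell would wrap its work in a function decorated with `@disk_cache(ttl_seconds=3600)` and call it as usual. Two cautions apply to this sketch: the cached files are unencrypted copies of the data, which matters for protected data, and `pickle` should only be used to load files you wrote yourself.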