Nbval - Testing your notebooks

Formal Metadata

Title
Nbval - Testing your notebooks
Title of Series
Number of Parts
43
Author
Contributors
License
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language
Production Place
Erlangen, Germany

Content Metadata

Subject Area
Genre
Abstract
Many scientific computing projects involve code in notebooks, either to produce results or to demonstrate and explain the use of a software package. But how do you ensure that your notebooks still work as the libraries they import and call change? nbval is a plugin for the pytest testing framework which runs your notebooks as part of your test suite, so that unexpected errors will show up as test failures. You can also use nbval to check that key pieces of the produced outputs match those saved in the notebook file, to be sure that the code is still doing the same thing. This ability to automatically test notebooks with nbval enables notebooks to be a part of verifying a reproducible scientific publication.
Transcript: English (auto-generated)
Next up is Thomas Kluyver with his presentation on testing with nbval.
Thank you. So if you've got a repository with notebooks, you may know that it's not very easy to test them conventionally. This isn't just about the technology; it's also because the code that you write in notebooks is different to the kind of code that you write in a library. You're not so much writing functions and classes that can be isolated and reused in different places, and that's how our conventional testing frameworks work. Notebooks are much more about telling a story with code, and having a narrative, from getting your data in to producing your results and your plots, that somebody can follow through.
So nbval is a tool for testing notebooks, and it uses one of the features of the notebook format, which is that the output is saved inside the notebook along with the code. So the basis of nbval is that you run the code and you check: does this output look right? This is a project that's been developed by a number of people under the OpenDreamKit project, which is an EU-funded project. Vidar, who spoke this morning about the 3D stuff in IPython and Jupyter notebooks, has done quite a lot of the work on it, along with a collection of people at Southampton who I'm here to represent; David, Oliver and our boss Hans have all done more of the work than I have on this.
You can install it, as with so many things, by doing pip install nbval. It's probably also on conda-forge or anaconda.org somewhere; I can't remember what channel you have to look under for that. I have some cells with some code here, and you can see that they're producing some output. In order to run it, you run a command that looks like this: py.test --nbval-lax. nbval is a plugin for the pytest framework, which is one of the very popular frameworks for testing Python code. If I switch into a terminal and run py.test --nbval-lax, you'll see it runs these code cells, so each dot, which would normally represent one test, here represents one notebook code cell. You can see that in this case these cells printing output have worked, but in this cell, where I'm trying to demonstrate an error case in this function, nbval has gone: this code didn't run right, there's an error in this. So for the moment I'll comment that out, rerun that cell and save it, and then go back and run this again. We'll come back in a bit to how we can deal with that case where you want to demonstrate an error as part of the notebook, but you can see that now all of the tests on this one are passing.
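Editor's sketch: the idea behind the lax run above (execute each code cell in order, one result per cell, fail only on an exception) can be illustrated with the standard library alone. This is not how nbval itself is implemented; nbval runs cells in a Jupyter kernel through pytest. The notebook contents and the helper name run_lax are made up for illustration.

```python
# Hypothetical three-cell notebook, in the JSON layout that .ipynb files use.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 2,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": "# Demo"},
        {"cell_type": "code", "metadata": {}, "execution_count": None,
         "outputs": [], "source": "x = 2 + 2"},
        {"cell_type": "code", "metadata": {}, "execution_count": None,
         "outputs": [], "source": "1 / 0"},
    ],
}


def run_lax(nb):
    """Run every code cell in order and return one result per code cell."""
    namespace = {}  # shared between cells, like the state of a single kernel
    results = []
    for index, cell in enumerate(nb["cells"]):
        if cell["cell_type"] != "code":
            continue  # markdown cells are never executed
        source = cell["source"]
        if isinstance(source, list):  # the file format also allows a list of lines
            source = "".join(source)
        try:
            exec(source, namespace)
            results.append((index, "passed", None))
        except Exception as error:  # any exception fails that cell's "test"
            results.append((index, "failed", type(error).__name__))
    return results


results = run_lax(notebook)
for index, status, error in results:
    print(f"cell {index}: {status}" + (f" ({error})" if error else ""))
```

The saved outputs are deliberately ignored here, which is the point of the lax mode: only exceptions count as failures.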
I mentioned before, though, that it uses the output, and we've got three different cases here with different kinds of output. 2 plus 2 will, fingers crossed, always be the same; if it isn't, then we've probably got bigger problems than our code not working. The date.today() is going to be predictable but not always the same: I ran that a couple of days ago while I was preparing this talk, and that's got the 27th of August as the date. And the last one is doing something that's randomised, so it's going to be different every time I run this code.
So the crudest way of using the output checking is to use the original flag that was defined for this, which is py.test --nbval. If I do that, you'll see it runs those cells again, but now both the one with the date and the one with the random output have failed. You can see that nbval gives you back a nice view of what's failed here: you can see the code that ran, and underneath it what the output should have been, or what we were expecting it to be from the file, and what the output is when we ran it just now. If you're evaluating any kind of testing framework, everybody loves it when your tests are passing and the thing goes green, but that's not really when you need the power of a testing framework. The valuable thing about a testing framework is how well it copes when stuff goes wrong and how much useful information it gives you to work out what the problem is. So hopefully we've done a fairly good job of that with this.
So pretty much all of the rest of the complexity of nbval then comes from trying to deal with these variable outputs, like which bits
do we want to compare and which bits do we want to say it's okay for this to vary. There are two different ways that you can approach this. The flag that I showed first, --nbval-lax, starts from: we don't check any of the output, we just run the code and check for exceptions, and then you can build up by asking it to check specific cells' output. The other way is you can work down from checking all of the output and tell it to ignore specific cells' output.
There are a couple of different ways you can do that. You can do it with comments in the code. Starting with --nbval-lax, so starting without checking output, you can put a comment in the code saying NBVAL_CHECK_OUTPUT, and that tells it that this cell's output should be consistent, you should be able to trust it, so check this and fail the test if it has changed. Going the other way, NBVAL_IGNORE_OUTPUT tells it this output is not reliable, don't fail the tests if this changes.
And then I said we'd come back to this thing about raising an exception. If you're expecting a bit of code to raise an exception and you want that to be shown in the notebook, then you can use this NBVAL_RAISES_EXCEPTION, and that will say this exception is okay. In this case it will check the error that you're getting, so if I raise a different error here before it gets to that one, then... okay, that wasn't what I expected. Did I save this? Maybe I forgot to save it. No. Okay, possibly it doesn't check that at the moment. Better start debugging. I thought it checked that it was giving you the same error that you'd got before, but that
doesn't seem to be working at the moment. That might be a bug, or that might be me misremembering the features of the library.
You can do all of the same things with cell tags, which are a new feature that was added in notebook version 5.0. If you go into the View menu, you can select the cell tagging toolbar, and then you can give cells these little tags on the top to control particular bits of behaviour. That means that you don't have to change the code in order to control nbval, and it also means that this can be language agnostic: these tags work the same way regardless of what your language's comments look like. The same possibilities as with comments work with tags. The names now are lower case and with dashes instead of underscores, but again you have nbval-check-output to tell it this should be reliable, I want you to check this; nbval-ignore-output to tell it I don't mind if this thing changes; and nbval-raises-exception to tell it this cell is going to raise an exception, don't complain about that.
And then there's another way that you can more selectively allow
for bits of variation in the output, using regexes. There's a famous programmer joke about regexes, which is: I had a problem and I thought I'd use regexes; now I have two problems. They're famously tricky to get right, especially for more complicated things, and they're one of those things that looks like you can do an awful lot with it, so people go, oh yeah, I'll parse this HTML page using regexes. They're more limited than they appear, but there are use cases for them. Here I've got a regex pattern for the date format that you saw up at the top of the file, and we're going to write this little config file. We say this is the regex pattern, and every time we see this in the output we want to replace it with the words date stamp. You write that to a little file, and then you pass it on the command line with this --sanitize-with flag.
If I just go back up to the start of this notebook, I will convert that to a markdown cell, which won't be executed, and run that to create the file. Now if I do --nbval again, you should see one failure with that date comparison, as we saw before. And if I now add --sanitize-with, what did I call the file, nbval sanitize... I'm concerned by the fact that it's not tab completing here. Let's see if that's going to work. Yes, it is. Okay. So now, although that bit of output differs, because they both match that same regex pattern it's replaced them both with the words date stamp, and so it no longer sees the difference. That's just another way to control what bits of output it's actually checking.
And then finally, Vidar integrated it with his other project, nbdime, which does notebook diffs and merges. If you install nbdime as well, you can use the two together, so instead of reporting the output in the terminal, it can report the output in HTML format using nbdime. So now we've got that one failure again, and here it's going to say: this is the cell, this is what changed. In this case that was fairly easy to see even in the terminal, but if you had a plot or something that was changing, nbdime could show you that much more easily than you can see it in a terminal.
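Editor's sketch: the regex sanitising step described earlier in the talk (replace every match of a pattern in both the saved and the fresh output, then compare) can be approximated with the standard library. The file layout below, with regex: and replace: entries under a section header, follows the format nbval's documentation describes as far as I recall, so treat it as an assumption; the section name, the ISO date pattern, the DATE-STAMP text and the two sample dates are illustrative, and load_rules and sanitize are hypothetical helpers, not nbval functions.

```python
import configparser
import re

# Assumed contents of the little config file; names and pattern are illustrative.
SANITIZE_FILE = r"""
[date]
regex: \d{4}-\d{2}-\d{2}
replace: DATE-STAMP
"""


def load_rules(text):
    """Parse the config text into a list of (compiled pattern, replacement)."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return [
        (re.compile(parser[section]["regex"]), parser[section]["replace"])
        for section in parser.sections()
    ]


def sanitize(output, rules):
    """Apply every substitution rule to one piece of text output."""
    for pattern, replacement in rules:
        output = pattern.sub(replacement, output)
    return output


rules = load_rules(SANITIZE_FILE)
saved = "2017-08-27"  # output stored in the notebook file (made-up value)
fresh = "2017-08-29"  # output from re-running the cell (made-up value)

print(saved == fresh)                                    # raw outputs differ
print(sanitize(saved, rules) == sanitize(fresh, rules))  # sanitised outputs match
```

Because both strings are rewritten before the comparison, only the parts of the output that the pattern does not cover can still cause a failure.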
Thank you very much.
Thank you for the interesting talk. Do you have any questions?
Thank you for the talk. Would it make sense, instead of having nbval check whether cells have an error, that you execute the cell, show the error, save it, and then have nbval check that exactly the same error is indeed raised?
You mean with the demonstrating an error? Yeah, possibly. That's something that we should think about. I can't, off the top of my head, think of a good reason why we haven't made it work like that, but I suspect there might be one, or there might be a bad reason. I don't know.
Thanks, this was very interesting. I have a two-part question. First, can you use the pytest utilities, for example the context manager for handling exceptions? And the other thing is, can you use the fixtures, like the pytest utility for parameterization, and directly use it from the notebook? Thanks.
I didn't understand the second part of the question properly. The first part: I don't think you can, because the aim isn't that you're writing tests in your notebook so much; the aim is to run the notebook and see if it's still doing what you expect it to. So we're not really designing it around people importing pytest and writing test code inside the notebook.
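Editor's sketch: to make the answer above concrete, the cells stay ordinary notebook code, and the only nbval-specific parts are the comment markers and cell tags named in the talk. The snippet below builds such a notebook as plain JSON-compatible data and collects the markers from each cell. The notebook contents and the helper nbval_markers are hypothetical, and the helper only mimics the naming convention (upper case with underscores in comments, lower case with dashes in tags); it is not nbval's own parsing code.

```python
import json

# Hypothetical notebook: ordinary code cells, marked up for nbval in two ways.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 2,
    "metadata": {},
    "cells": [
        {"cell_type": "code", "metadata": {}, "execution_count": None,
         "outputs": [],
         "source": "# NBVAL_IGNORE_OUTPUT\nimport random\nprint(random.random())"},
        {"cell_type": "code",
         "metadata": {"tags": ["nbval-raises-exception"]},
         "execution_count": None, "outputs": [], "source": "1 / 0"},
        {"cell_type": "code", "metadata": {}, "execution_count": None,
         "outputs": [], "source": "2 + 2"},
    ],
}


def nbval_markers(cell):
    """Collect markers from a cell's comments and tags, normalised to tag style."""
    found = set()
    for line in cell["source"].splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            text = stripped.lstrip("#").strip()
            if text.startswith("NBVAL_"):
                # NBVAL_IGNORE_OUTPUT -> nbval-ignore-output
                found.add(text.lower().replace("_", "-"))
    for tag in cell["metadata"].get("tags", []):
        if tag.startswith("nbval-"):
            found.add(tag)
    return found


markers = [nbval_markers(cell) for cell in notebook["cells"]]
for index, cell_markers in enumerate(markers):
    print(index, sorted(cell_markers))

# The whole structure is plain JSON, which is what gets saved as an .ipynb file.
serialised = json.dumps(notebook, indent=1)
```

Nothing in the cells imports pytest; the test behaviour is attached to the cells from the outside.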
A great presentation, Thomas. I was curious: it's still a very new tool, but do you have opinions now on the best practices as to how this should be used? Like, as much as possible stuff should still be tested in source files, and this is just for long-running notebooks? Do you have any opinions on this?
Yeah, so one of the cases that we've particularly designed this for is where you have notebooks as examples for a Python library, as we have for many of our Python libraries in the Jupyter project. So this isn't a substitute for having real tests for your code, though obviously if you're lazy about writing tests then it is possible to use it that way. It gives you some assurance that your code is not so completely broken that it can't even import. But yeah, I would definitely say it is good to test your library properly and then think of this as checking that your examples haven't been
I've tried to use it already, and I get warnings, or errors, when I use it with different Python versions. I test my notebook examples against Python 2 and Python 3, and it warns me: no such kernel named python2. Is that...?
Yeah, come and have a chat with us afterwards, but I think the answer is probably going to be that there's a flag that you can add, --current-env I think, which tells it to use IPython in this environment. When you save the notebook, it saves the notebook with either Python 2 or Python 3 in the metadata, and then it will try to use that same version when it runs it again. That's part of the way Jupyter works, and for the way we want nbval to work that's a bit problematic, so we have a flag that tells it to use the same Python that the test system is already running on.
Oh, all right. Yeah, I'll hope to find it in the documentation. Do you plan a sprint on Friday?
We haven't planned one, I don't think, but if there are people interested then I'm certainly around on Friday.
All right, cool, thank you.
Lovely talk, lovely tools. A quick one about nbdime: if I'm correct, I've read that it can be used locally. Is there any scope, any plans, any way to use nbdime on PRs on GitHub?
Vidar may be able to answer this more. I don't think we've specifically talked to them about it, but we have worked with GitHub before to integrate viewing notebooks directly in GitHub, so it's not beyond the realms of possibility that that could happen.
Sort of two questions as one: does it actually run in a notebook environment, and how does it handle plugins and things like plots in your output?
So I think the way it handles plots currently is, if you're using nbdime, it compares them and shows you if they differ. I don't know if that's super sensitive; any byte changes, we're going to assume the whole plot has changed. Again, that's something that Vidar can probably answer better. If you're not using the nbdime reporting, then I think it just ignores differences in images and only considers the text output.
Are there any more questions? Okay, then thank you very much.