Nbval - Testing your notebooks
Formal Metadata
Number of Parts | 43
License | CC Attribution 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/38195 (DOI)
Production Place | Erlangen, Germany
Transcript: English (auto-generated)
00:05
Next up is Thomas Kluyver with his presentation on testing with nbval. Thank you. So if you've got a repository with notebooks, you may know that it's not very easy to
00:23
test them conventionally. This isn't just about the technology; it's also because the code that you write in notebooks is different from the kind of code that you write in a library. You're not so much writing functions and classes that can be isolated and reused in different places, and that's kind of how our conventional
00:45
testing frameworks work. Notebooks are much more about telling a story with code, and having a narrative, from getting your data in to producing your results and your plots, that somebody can follow through.
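The approach described in the next segment relies on the notebook file keeping each code cell's saved output next to its source. Here is a minimal sketch of that layout, assuming a hand-written notebook in the nbformat 4 JSON structure; the helper name `stored_text_outputs` is hypothetical and is not part of nbval.

```python
import json

# A hand-written, minimal notebook document (nbformat 4 layout).
# The single code cell carries its saved output alongside its source.
notebook_json = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 2,
    "metadata": {},
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 1,
            "metadata": {},
            "source": ["2 + 2"],
            "outputs": [
                {
                    "output_type": "execute_result",
                    "execution_count": 1,
                    "metadata": {},
                    "data": {"text/plain": ["4"]},
                }
            ],
        }
    ],
})


def stored_text_outputs(nb_text):
    """Return (source, saved plain-text output) pairs for every code cell."""
    nb = json.loads(nb_text)
    pairs = []
    for cell in nb["cells"]:
        if cell["cell_type"] != "code":
            continue
        source = "".join(cell["source"])
        texts = []
        for out in cell["outputs"]:
            if out["output_type"] == "execute_result":
                texts.append("".join(out["data"].get("text/plain", [])))
            elif out["output_type"] == "stream":
                texts.append("".join(out["text"]))
        pairs.append((source, "".join(texts)))
    return pairs


print(stored_text_outputs(notebook_json))
```

A checker in this style would re-run each source and compare the fresh output with the saved one; the sketch only covers reading the saved half.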
01:01
So nbval is a tool for testing notebooks, and it uses one of the features of the notebook format, which is that the output is saved inside the notebook along with the code. So the basis of nbval is that you're running the code and you're checking: does
01:22
this output look right? This is a project that's been developed by a number of people under the OpenDreamKit project, which is an EU-funded project. Vidar, who spoke this morning about the 3D stuff in IPython and Jupyter notebooks, has done quite a lot
01:42
of the work on it, along with a collection of people at Southampton who I'm kind of here to represent; David, Oliver and our boss Hans have all done more of the work than I have on this. You can install it, as with so many things, by doing pip install nbval; it's probably
02:05
also on conda-forge or anaconda.org somewhere, I can't remember which channel you have to look under for that. So I have some cells with some code here, and you can see that they're producing some output, and in order to run it, you run a command that looks
02:27
like this: py.test --nbval-lax. nbval is a plugin for the pytest framework, which is one of the very popular frameworks for testing Python code, and if I switch
02:44
into a terminal and run py.test --nbval-lax, then you'll see it runs these code cells. Each dot, which would normally represent one test, here represents one notebook code cell, and you can see that in this case, these cells printing output have worked,
03:05
but in this cell where I'm trying to demonstrate an error, there's an error case in this function demonstrating it, and nbval has said this code didn't run right, there's an error in it. So for the moment I'll comment that out, rerun that cell and
03:25
save it, and then go back and run this again. We'll come back in a bit to how we can deal with that case where you want to demonstrate an error as part of the notebook, but you can see that now all of the tests on this one are passing. So
03:45
I mentioned before, though, that it uses the output, and we've got three different cases here with different kinds of output. 2 + 2 will, fingers crossed, always be the same; if it isn't, then we've probably got bigger problems than our code
04:03
not working. date.today() is going to be predictable but not always the same: I ran that a couple of days ago while I was preparing this talk, and that's got the 27th
04:21
of August as the date. And the last one is doing something that's randomised, so it's going to be different every time I run this code. So the crudest way of using the output checking is to use the original flag that was defined
04:42
for this, which is py.test --nbval, and if I do that then you'll see it runs those cells again, but now both the one with the date and the one with the random output have failed. You can see that nbval gives you back a nice view of what's failed
05:05
here: you can see the code that ran, and then underneath it what the output should have been, that is, what we were expecting from the file, and what the output is when we ran it just now. If you're evaluating any kind of testing
05:23
framework, everybody loves it when your tests are passing and the thing goes green, but that's not really when you need the power of a testing framework. The valuable thing about a testing framework is how well it copes when stuff goes wrong and how much useful information it gives you to work out what the problem is. So hopefully we've
05:44
done a fairly good job of that with this. Pretty much all of the rest of the complexity of nbval then comes from trying to deal with these variable outputs: which bits
06:00
do we want to compare, and which bits do we want to say it's okay for this to vary? There are two different ways that you can approach this. The flag that I showed first, --nbval-lax, starts from: we don't check any of the output, we just run the code and
06:25
check for exceptions, and then you can build up by asking it to check specific cells' output. The other way is you can work down from checking all of the output and tell it to ignore specific cells' output. So there's a couple of different ways you
06:45
can do that. You can do it with comments in the code. Starting with --nbval-lax, that is, starting without checking output, you can put a comment in the code saying NBVAL_CHECK_OUTPUT, and that tells it: for this cell the output should be consistent, you should
07:06
be able to trust that, so check this and fail the test if it has changed. Going the other way, NBVAL_IGNORE_OUTPUT tells it this output is not reliable, don't fail the tests if this changes. And then I said we'd come back to this thing about raising
07:26
an exception. If you're expecting a bit of code to raise an exception and you want that to be shown in the notebook, then you can use NBVAL_RAISES_EXCEPTION, and that will say this exception is okay. In this case it will check the error that
07:46
you're getting, so if I raise a different error here before it gets to that one, then...
08:00
Okay, that wasn't what I expected. Did I save this? Maybe I forgot to save it. No, okay, possibly it doesn't check that at the moment. Better start debugging. I thought it checked that it was giving you the same error that you'd got before, but that
08:24
doesn't seem to be working at the moment; that might be a bug, or that might be me misremembering the features of the library. You can do all of the same things with cell tags, which are a new feature that was added in notebook version 5.0. If you go into the View
08:43
menu then you can select the cell tagging toolbar, and then you can give cells these little tags on the top to control particular bits of behaviour. That means that you don't have to change the code in order to control nbval, and it also means
09:06
that this can be language agnostic: these tags work the same way regardless of what your language's comments look like. So the same possibilities as with comments work
09:20
with tags. The names now are lower case and with dashes instead of underscores, but again you have nbval-check-output to tell it this should be reliable, I want you to check this; nbval-ignore-output to tell it I don't mind if this thing changes;
09:40
and nbval-raises-exception to tell it this cell is going to raise an exception, don't complain about that. And then there's another way that you can more selectively allow
10:01
for bits of variation in the output, using regexes. There's a famous programmer joke about regexes, which is: I had a problem and I thought I'd use regexes; now I have two problems. They're famously tricky to get right, especially for more complicated things, and they're one of those things where it looks like you can do an awful
10:25
lot with them, so people go, oh yeah, I'll parse this HTML page using regexes. They're more limited than they appear, but there are use cases for them. So here I've got a regex pattern
10:47
for the date format that you saw up at the top of the file. We're going to write this little config file, so we say this is the regex pattern
11:06
we want: every time we see this in the output, we want to replace it with the words date stamp. You write that to a little file and then you pass it on the command
11:21
line with this --sanitize-with flag. So I just go back up to the start of this notebook, convert that to a Markdown cell, which won't be executed, and run that to create the file. Now if I do --nbval again, then you should see one
11:51
failure with that date comparison, as we saw before. And if I now add --sanitize-with, what did I call the file, nbval sanitize... I'm concerned by the fact that it's not tab
12:08
completing here. Let's see if that's going to work. Yes, it is. Okay, so now, although that bit of output differs, because they both match that same regex pattern it's replaced them both
12:22
with the words date stamp, and so it no longer sees the difference. That's just another way to control which bits of output it's actually checking. And then finally, Vidar integrated it with his other project, nbdime. If you install nbdime, which does notebook diffs
12:49
and merges, then you can use the two together: instead of reporting the output in the terminal, it can report the output in HTML format using nbdime. So now we've got that one failure again,
13:05
and here it's going to say this is the cell, this is what changed. In this case that was fairly easy to see even in the terminal, but if you had a plot or something that was changing, then nbdime could show you that much more easily than you can see it in a terminal.
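The regex sanitising step described earlier in the talk can be sketched in plain Python. This is only an illustration of the idea, not nbval's own code. The section and key layout of the config text is an assumption based on the file described in the talk, and the date pattern, the two sample outputs and the helper names `load_rules` and `sanitize` are all made up for the example.

```python
import configparser
import re

# Assumed layout of a sanitising file: one section per rule, with a
# regex to look for and the text to put in its place.
SANITIZE_CFG = r"""
[date_stamp]
regex: \d{4}-\d{2}-\d{2}
replace: DATE-STAMP
"""


def load_rules(text):
    """Parse the config text into (compiled pattern, replacement) pairs."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return [
        (re.compile(parser[section]["regex"]), parser[section]["replace"])
        for section in parser.sections()
    ]


def sanitize(output, rules):
    """Apply every rule in turn to one piece of output text."""
    for pattern, replacement in rules:
        output = pattern.sub(replacement, output)
    return output


rules = load_rules(SANITIZE_CFG)

saved_output = "2017-08-27"  # hypothetical output stored in the notebook
fresh_output = "2017-08-29"  # hypothetical output from re-running the cell

print(saved_output == fresh_output)
print(sanitize(saved_output, rules) == sanitize(fresh_output, rules))
```

Both sides are sanitised before comparison, so two different dates collapse to the same placeholder, while any other difference in the output would still be reported.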
13:24
Thank you very much. Thank you for the interesting talk. Do you have any questions? Thank you for the talk. Would it make sense, instead of having nbval check whether cells
13:45
have an error, that you execute the cell, show the error, save it, and then have nbval check that exactly the same error is indeed raised? You mean with demonstrating an error? Yeah. Yeah, possibly; that's something that we should think about. I can't, off the
14:08
top of my head, think of a good reason why we haven't made it work like that, but I suspect there might be one, or there might be a bad reason, I don't know. Yeah, thanks. This was very
14:30
interesting. I have a two-part question. First, can you use the pytest utilities, for example the context manager for handling exceptions? And the other
14:49
thing is, can you use the fixtures, like the pytest utility for parameterization, and use it directly from the notebook? Thanks. So I didn't understand the second
15:04
part of the question properly. The first part, I don't think you can, because the aim isn't that you're writing tests in your notebook so much; the aim is to run the notebook and see if it's still doing what you expect it to. So we're not
15:26
really designing it around people importing pytest and writing test code inside the notebook.
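Instead of test code, the cells carry the comment markers or cell tags described in the talk. The sketch below shows one way such markers could decide whether a cell's output is compared. The marker spellings follow the talk (upper case with underscores for comments, lower case with dashes for tags), but the functions `cell_directives` and `should_compare_output` are hypothetical and are not nbval's implementation.

```python
# Marker spellings follow the talk: upper-case comments with underscores,
# lower-case cell tags with dashes.
COMMENT_MARKERS = {
    "# NBVAL_CHECK_OUTPUT": "check",
    "# NBVAL_IGNORE_OUTPUT": "ignore",
    "# NBVAL_RAISES_EXCEPTION": "raises",
}
TAG_MARKERS = {
    "nbval-check-output": "check",
    "nbval-ignore-output": "ignore",
    "nbval-raises-exception": "raises",
}


def cell_directives(cell):
    """Collect the directives a code cell carries, from comments or tags."""
    found = set()
    for line in cell.get("source", "").splitlines():
        directive = COMMENT_MARKERS.get(line.strip())
        if directive:
            found.add(directive)
    for tag in cell.get("metadata", {}).get("tags", []):
        directive = TAG_MARKERS.get(tag)
        if directive:
            found.add(directive)
    return found


def should_compare_output(cell, lax):
    """Hypothetical policy: lax mode compares only opted-in cells,
    strict mode compares everything except opted-out cells."""
    directives = cell_directives(cell)
    if "ignore" in directives:
        return False
    if "check" in directives:
        return True
    return not lax


stable = {"source": "# NBVAL_CHECK_OUTPUT\n2 + 2", "metadata": {}}
random_cell = {
    "source": "import random\nrandom.random()",
    "metadata": {"tags": ["nbval-ignore-output"]},
}
plain = {"source": "print('hello')", "metadata": {}}

print(should_compare_output(stable, lax=True))        # True
print(should_compare_output(random_cell, lax=False))  # False
print(should_compare_output(plain, lax=True))         # False
print(should_compare_output(plain, lax=False))        # True
```

The `lax` argument stands in for the choice between the --nbval-lax and --nbval flags: the first builds up from checking nothing, the second works down from checking everything.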
15:40
A great presentation, Thomas. I was curious: it's still a very new tool, but do you have opinions now on the best practices as to how this should be used? Like, as much as possible should still be tested in source files, and this is just, you know, for long-running notebooks? Do you have any opinions on this? Yeah, so one of the cases that
16:03
we've particularly designed this for is where you have notebooks as examples for a Python library, as we have for many of our Python libraries in the Jupyter project. And so this isn't a substitute for having real tests for your code, though obviously if you're
16:23
lazy about writing tests then it is possible to use it that way; it gives you some assurance that your code is not so completely broken that it can't even import. But yeah, I would definitely say it is good to test your
16:41
library properly and then think of this as checking that your examples haven't been...
17:04
I've tried to use it actually already, and I get warnings, or errors, when I use it with different Python versions. I test my notebook examples against Python 2 and Python 3, and it warns me: no such kernel named python2.
17:24
Is that... Yeah, come and have a chat with us afterwards, but I think the answer is probably going to be there's a flag that you can add, that's --current-env I think, which tells it
17:41
to use IPython in this environment, because when you save the notebook it will save it with either Python 2 or Python 3 in the metadata, and then it will try to use that same version when it runs it again. That's part of the way Jupyter works, and
18:01
for the way we want nbval to work that's a bit problematic, so we have a flag that tells it to use the same Python that the test system is already running on. Oh, all right, I'll hope to find it in the documentation. Do you plan a sprint on Friday?
18:23
We haven't planned one, I don't think, but if there are people interested then I'm certainly around on Friday. All right, cool, thank you. Lovely,
18:45
lovely tools. A quick one about nbdime: if I'm correct, I've read that it can be used locally. Is there any scope, any plans, any way to use nbdime on PRs on GitHub? Vidar may be able
19:03
to answer this more. I don't think we've specifically talked to them about it, but we have worked with GitHub before to integrate viewing notebooks directly in GitHub, so it's not beyond the realms of possibility that that could happen.
19:27
Sort of two questions as one: does it actually run in a notebook environment, and how does it handle plugins and things like plots in your output? So I think the way
19:44
it handles plots currently is, if you're using nbdime, then it compares them and shows you if they differ. I don't know if that's super sensitive, so that with any byte changes we're going to
20:00
assume the whole plot has changed. Again, that's something that Vidar can probably answer better. If you're not using the nbdime reporting, then I think it just ignores differences in images and only considers the text output. Are there any more questions? Okay, then thank you
20:30
very much.
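The "no such kernel" question in the discussion comes down to the kernel name that a notebook records in its metadata when it is saved. Here is a minimal sketch of reading that name, assuming a hand-written notebook in the nbformat 4 layout; `saved_kernel_name` is a hypothetical helper, not an nbval function.

```python
import json
import sys

# Hand-written notebook whose metadata records the kernel it was saved with.
notebook_text = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 2,
    "cells": [],
    "metadata": {
        "kernelspec": {
            "name": "python2",
            "display_name": "Python 2",
            "language": "python",
        }
    },
})


def saved_kernel_name(nb_text):
    """Return the kernel name stored in the notebook, or None if absent."""
    metadata = json.loads(nb_text).get("metadata", {})
    return metadata.get("kernelspec", {}).get("name")


print("notebook asks for kernel:", saved_kernel_name(notebook_text))
print("interpreter running this script: Python", sys.version_info.major)
```

When the two disagree, the flag mentioned in the answer (--current-env, as the speaker recalls it) is meant to make the run use the Python that the tests are already running on rather than the saved kernel name.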