
Design Your Tests


Formal Metadata

Title
Design Your Tests
Series title
Part
111
Number of parts
119
Author
License
CC Attribution 3.0 Unported:
You are free to use, adapt, and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Publication year
Language
Production place
Berlin

Content Metadata

Subject area
Genre
Abstract
Julian Berman - Design Your Tests

While getting started testing often provides noticeable immediate improvement for any developer, it's often not until the realization that tests are things that need design to provide maximal benefit that developers begin to appreciate or even enjoy them. We'll investigate how building shallow, transparent layers for your tests makes for better failures, clearer tests, and quicker diagnoses.

-----

* Life span of a test
  * 5 minute - why does this fail?
  * 5 day - what is this missing?
  * 5 week - do I have coverage for this?
  * 5 month - what's *not* causing this bug?
* Transparent simplicity
  * one or two "iceberg" layers for meaning
* Higher-order assertions - build collections of state that have meaning for the domain in the tests
  * bulk of the details are in the code itself
  * show an example
* Grouping for organization
  * Mixins
  * show an example
* unittest issues
  * assertion/mixin clutter
  * setUp/tearDown tie grouping to the class layer or to inheritance via super
  * addCleanup
  * weak association / lookup-ability between code and its tests
    * package layout
    * other conventions
* Alternative approaches
  * testtools' matchers
  * py.test `assert` magic

Keywords
Transcript: English (automatically generated)
Hi, everybody. Thanks for having me here. I hope everybody can hear okay. So: test design. This is me on the internet; you can find me on GitHub, and my email is on the last slide, so you can come talk to me afterwards. I work for Magnetic. We're in advertising; we do real-time bidding on online advertising. We use Python, and PyPy, in production. Lots of fun stuff. You can talk to me about that too. Okay.
So, luckily, this talk is fairly simple. So simple that the core ideas fit on a slide and a half. Basically: we know a lot of things, or think we know a lot of things, about what makes software design good, and the nice thing about testing is that most of those things translate quite easily into what makes test suites good. Tests are just code like anything else. All the principles we think help us out when we write software, the ones you already think about as you're writing it, carry over pretty well to tests. Principles like: make sure your objects only do one thing; keep things simple by separating concerns and composing objects together. Most of those translate pretty directly into testing, so try to keep unit tests and integration tests of any sort down to testing one specific thing, and try to make sure your tests are both simple and transparent, because you're not testing them. All of these principles we have for regular software design apply pretty well to testing.
Getting down slightly to specifics, there's a three-step process that typically gets drilled into our heads when writing tests: first you set up some stuff, then you exercise whatever it is you want to test, and then you verify that what you expected to happen is what actually ended up happening. This applies fairly uniformly across all the different types of tests you'll end up writing, and if you actually think through it as you're writing, your tests end up clearer and more self-documenting, all the nice things we like out of our test suites. One particular thing people sometimes say about this three-step process is: make sure your verification is only one assertion. You write a test; make sure it has only one assertion. It's kind of a peculiar thing the first time you hear it. Most people's first thought is, how do I actually make that happen? They remember back to times when they've written tests with a long list of assertions. But even more than that, when you hear someone tell you this, your first thought is: what's the actual benefit? What am I going for by keeping my tests down to a single assertion?
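[Editor's note: the slides aren't captured in the transcript. A reconstruction of the kind of before-and-after pair being described, with invented names: a function that merges dictionaries by summing the values of shared keys.]

    import unittest

    def merge(*dicts):
        """Merge dictionaries, summing the values of any shared keys."""
        merged = {}
        for each in dicts:
            for key, value in each.items():
                merged[key] = merged.get(key, 0) + value
        return merged

    class TestMerge(unittest.TestCase):
        def test_merge_with_many_assertions(self):
            result = merge({"cats": 1}, {"cats": 2, "dogs": 1})
            self.assertIn("cats", result)        # each failure here reports
            self.assertIn("dogs", result)        # only one small fact
            self.assertEqual(result["cats"], 3)
            self.assertEqual(result["dogs"], 1)

        def test_merge_with_one_assertion(self):
            result = merge({"cats": 1}, {"cats": 2, "dogs": 1})
            # One assertion: a failure shows a diff of the whole dictionary.
            self.assertEqual(result, {"cats": 3, "dogs": 1})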
You stare at an example like this, a fairly simple function that takes a bunch of dictionaries and adds them together, summing the values of shared keys. Then you have the alternative: it's green, it's better. What is the actual difference between the two? Why is the second any better? Of course this is a simplified example, but it's the first representation of the idea of keeping your tests down to one assertion. The most obvious benefit, to answer that straight off, is that the most important thing about tests is their failures, because tests are destined to fail but meant to pass, so the failure is the first thing you'll see. And the difference between those two versions is basically how much context you get when the test fails. The main thing we're aiming for with this idea is more context when our tests fail. Rather than seeing something like "well, this isn't this", which is what you get from assertions written the first way, the larger assertion gives you extra context. That's useful for a lot of reasons. The first version only tells you that what you got is not what you expected; the second tells you that, plus possibly extra information about the way your actual implementation differs from what you expected. For example, maybe you're swapping values for keys somewhere. In this example that's pretty unlikely from an implementation point of view, but in real-world code it's quite common for things like that to happen, and if all you see is "this isn't this", you have less ability to just look at the failure and say: that doesn't look right in multiple places, and the combination of the places where it doesn't look right tells me what I did wrong. This applies particularly well with unittest, which has a type-specific equality protocol that gets you all sorts of nice context like that.
So, moving on a bit. Now we like the idea of having one assertion in a test, for the sake of this extra context, but sometimes it turns out not to be possible. Often now we're shifting from unit tests to integration tests, and what happens is that there are two worlds of assertions you want to make. Sometimes you want assertEqual, or some assertion of that sort, which is basically a data comparison assertion: I have some values and some expected values, and I want to make sure they match up with each other. But a lot of times when you're writing applications, what you actually want to assert at some point in your test suite is more like: I want to assert that some collection of things is true, that the state of my object, application, whatever, is true. And unittest won't have assertions for those, because unittest doesn't know about them; they're part of your application. It's basically the difference between making assertions about some data and making assertions about the meaning of some application-specific thing.
To take a specific example: you want to compare some strings? Cool, unittest can help you; you're just making an equality assertion. But if in your application those strings are actually HTML, unittest probably isn't going to help you, because while the standard library has a bunch of parsers for things, it doesn't have assertions for them. If you want to assert that two pieces of HTML are equal as HTML, not necessarily as string literals, you're out of luck. That's unfortunate, because it's useful to do. In the test suites we write it's what we're doing a lot of the time, and things like changes in whitespace are just annoying. If you can compare HTML as HTML, it's way more useful and makes the test way less brittle, but it's not something you'll find out of the box.
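[Editor's note: there's no code for this in the talk. A rough, standard-library-only sketch of the kind of assertion being described, with invented names; Django's test utilities ship a real assertHTMLEqual if you want one off the shelf.]

    from html.parser import HTMLParser

    class _Normalizer(HTMLParser):
        """Reduce HTML to a whitespace-insensitive stream of parse events."""
        def __init__(self):
            super().__init__()
            self.events = []

        def handle_starttag(self, tag, attrs):
            self.events.append(("start", tag, sorted(attrs)))

        def handle_endtag(self, tag):
            self.events.append(("end", tag))

        def handle_data(self, data):
            if data.strip():  # ignore purely-formatting whitespace
                self.events.append(("data", " ".join(data.split())))

    def _normalized(html):
        parser = _Normalizer()
        parser.feed(html)
        return parser.events

    class HTMLAssertionsMixin:
        def assertHTMLEqual(self, first, second):
            """Assert two documents are equivalent as HTML, not as strings."""
            self.assertEqual(_normalized(first), _normalized(second))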
So here's a pretty specific example of that. I have a fake, but very much like a real, example of a test for an ad server: some web application where you give it an ID and it shows you the associated piece of media. And we have our three steps here. We do a little setup, with some hand waving that there's an in-memory database we're adding this advertisement to. Then we hit the URL in our application that's supposed to be showing it, and then we make assertions that the three things I expect actually happened: I got the right status code, I'm properly setting the content type header, and the body of my response is what it should be. You stare at this for a moment and try to apply the rule from before, one assertion per test, and it's not obvious how to make that happen here, because all three of these assertions are useful; together they're asking, does my response actually work correctly? The mindset from before leads some people to split this up into three tests with the same setup and exercise, which is not ideal for reasons I'll skip over at the moment. Instead, skip straight to something like this, where I have nothing highlighted because it's way more code on a slide than I expect anyone to read. What is it? It's the conversion of those three assertions into one assertion that actually encapsulates some meaning. The meaning it's trying to encapsulate is: responses have content; I sometimes want to assert against that content; there are a couple of things that need to be true when I make that assertion; and I want the assertion to take care of all of them. So we're checking all of the same things, with the addition of content length, inside the assertion method. And you end up with a test that looks more like this. Obviously we've cut down the number of lines in the test, and I think we've also gained some clarity: the assertion's name tells you directly what you're trying to assert against.
We also get the same benefit we get any time we take a bunch of code and refactor it into one place: any time you come back and improve that one assertion, the improvement immediately affects all of the tests using it, in some positive way. You obviously have to be careful with that, because you can silently break tests the same way. But if you are careful, it means that if someone notices, say, that the assertion fails first on content length when seeing the body first would be a more helpful failure message, they can go in and reorder the assertions, and that improvement, which is probably the right call across your whole test suite, now gets you better failure messages everywhere, just because one person noticed and made it. It's the same benefit we get whenever we take some particular operation we're performing over and over and factor it out into one place we can concentrate on. So what happens if you do this, if you take up the proposal of building assertions on top of the data comparison assertions? You end up with a sort of hierarchy of assertions.
At the bottom you have your data comparison assertions, because at the end of the day you are just comparing two values, so that has to happen somewhere. But on top of that you can add layers of meaning. Rather than just comparing strings, now you have an assertion built on top of that which is really about comparing HTTP responses. At the end of the day it's still just comparing a bunch of values, but it becomes a much more powerful assertion, able to add all the nice messages you might want once it knows it's actually dealing with HTTP responses. And on top of that you can build even more interesting things. Some of these, in our case, exist only in theory, much as I would like them to exist. You can layer on top of the HTTP response assertions and start asserting: is this valid HTML? Or layer on top of that: how do these things render in the browser, under whatever conditions I feel like placing around them?
We're actually doing well on time, so I'll spend a bit more time on this slide: where do we go from here? Assuming we've all agreed on using assertions to build meaning on top of the data comparison assertions, what comes out of that? It's kind of interesting. After a while of doing this, what comes out of it is, lovingly, mixin hell. By which I mean: in our case we build up all of these assertions in a bunch of mixins, and that's great, it gets you a ton of benefit, and I'll tell you specifically what some of them have before I tell you why there's a nicer way of doing all this. For example, we deal with GDBM. It's a dbm-style key-value store; the databases are files, there's no object layer on top of them, and there's no built-in way of comparing GDBM databases. So we have a mixin that will take a GDBM database, compare it to a dictionary, and tell you if the two are equal.
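[Editor's note: a minimal sketch of such a mixin, assuming the standard library's dbm.gnu module is available on the platform; this is not the actual code from the talk.]

    import dbm.gnu

    class GDBMAssertionsMixin:
        def assertGDBMEqual(self, path, expected):
            """Assert the GDBM file at `path` holds exactly `expected`.

            GDBM stores bytes, so `expected` should map bytes to bytes.
            """
            database = dbm.gnu.open(path, "r")
            try:
                contents = {key: database[key] for key in database.keys()}
            finally:
                database.close()
            self.assertEqual(contents, expected)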
We have a mixin for logging. This is something I think everyone has written at least once; there are some packages that try to make it easier for you. It attaches a handler to the root logger of the standard library logging module and then lets you make assertions about things that have been logged. That assertion is quite often written and rewritten by people. We have a mixin with assertions about our own proprietary log format, which is quite crazy, so we parse it into something more sane and make assertions against that. We have the response content mixin I mentioned, with the content assertions: something has content, doesn't have content, has content that looks a certain way. And we use StatsD with Datadog. StatsD is a thing you send metrics to, and Datadog puts a nice UI on top of it and shows you graphs, so you want assertions about a metric having been incremented, all those sorts of things.
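[Editor's note: a sketch of the root-logger handler trick being described, for use alongside unittest.TestCase. Since Python 3.4, unittest itself ships assertLogs, which covers much of this.]

    import logging

    class _RecordingHandler(logging.Handler):
        def __init__(self):
            super().__init__()
            self.records = []

        def emit(self, record):
            self.records.append(record)

    class LoggingAssertionsMixin:
        def setUp(self):
            super().setUp()
            self._log_handler = _RecordingHandler()
            logging.getLogger().addHandler(self._log_handler)
            self.addCleanup(logging.getLogger().removeHandler,
                            self._log_handler)

        def assertLogged(self, fragment):
            """Assert some record logged during the test contains `fragment`."""
            messages = [r.getMessage() for r in self._log_handler.records]
            if not any(fragment in message for message in messages):
                self.fail("No log record contained {!r}; got {!r}".format(
                    fragment, messages))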
So what happens is that you end up growing this companion to your test suite: you have all the things you want to test, and then you have all these useful assertions that either don't exist anywhere else yet or have meaning specific to your application, and you use them all over the place. I called it mixin hell because it's sort of annoying: the coupling of inheritance to the act of adding these assertions to your tests is annoying. So I'll mention that there is something else, up on this slide, that encapsulates this idea of having a collection of things I call an assertion that I want to be able to use all over the place: testtools' matchers. They claim to solve this problem, so they're worth checking out if you're convinced.
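[Editor's note: written from memory of the testtools matcher protocol, so treat the details as approximate and check the testtools documentation. The point is that a matcher is just an object with a match method, so it composes and gets reused without any mixin inheritance.]

    from testtools import TestCase
    from testtools.matchers import Mismatch

    class IsPalindrome:
        """A custom matcher: match() returns None on success, a Mismatch on failure."""
        def __str__(self):
            return "IsPalindrome()"

        def match(self, actual):
            if actual != actual[::-1]:
                return Mismatch("{!r} is not a palindrome".format(actual))
            return None

    class TestWords(TestCase):
        def test_palindrome(self):
            # assertThat accepts any matcher object, no inheritance needed.
            self.assertThat("racecar", IsPalindrome())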
The last thing I want to say is a call to action, in part to myself. I think a lot of people are writing these sorts of assertions, and they're useful, right? If you want to compare some HTML, where are you going to go look for the assertion that actually does that, in a way similar to what I described? I think we need to start sharing these things we're writing. When was the last time you downloaded and installed a package whose job was to add a bunch of assertions? It's a bit more common to do that for testtools matchers, and there possibly are some packages that do it, though I'm not sure how widespread they are; I'm sure someone will tell me. Regardless, I think there's a whole layer of things that can be built on top of the simple comparison assertions we have, and it would be nice if we got a bit better at sharing them, so they only have to be written once and the benefit is distributed across everyone. That's what I've got. Thanks, everybody.
Thank you, Julian. We have at least five minutes for questions, and there's one microphone here. Is there another one in the room? No. Okay, has anyone got any questions? I'll bring the microphone to you, or as close to you as I can. One question here.

I'd like to know how strict you are with this one-assertEqual rule. If you go back to your first example: what if your data structure, say a dictionary, is more complex than that, but in this test specifically you only want to test for cats and dogs? If you do a direct comparison you'll get a big diff which isn't telling you anything, right? So you'd need two assert statements that look into that dict. What would you do?

Sorry, I didn't mean to cut you off. So, I gave this talk last week in London and it took 40 minutes, not 20, which is why I thought this would be close, and someone asked that same question; I think he's actually here. There's no general answer. We're all going to cave at some point, sometimes. The ideal answer, I think, is that I try to use those cases as prompts to look at the code again and see whether there's a way to split apart whatever is producing that output. If you have one thing being output and a bunch of assertions on separate pieces of it, then possibly what you really have, in code land, is something that should be spitting out a whole bunch of different results, with something later combining them. So I try to be fairly strict, because that's turned out well: it sometimes tells me useful things about the actual code I'm writing and how to break it up differently. But sometimes it happens.
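[Editor's note: one common middle ground for the questioner's case is to slice out just the keys under test and still make a single comparison, so the failure diff covers only the relevant part of the structure. A tiny sketch, with a hypothetical `animals` dictionary:]

    expected = {"cats": 2, "dogs": 1}
    # Compare only the slice of the big dictionary we care about,
    # still in one assertEqual, so the failure diff stays readable.
    self.assertEqual({key: animals[key] for key in expected}, expected)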
If I understand you correctly, it's fewer assertions, but more meaningful ones, by using custom assertions. That works well, I guess, if you have generic things you test for, like HTML output, and you can even share those. But I guess in a lot of cases they'll be specific to your application. I had something recently with a pytest fixture where I realized: jeez, I'm putting so much logic into this fixture, I need a test for it. So I ended up writing a test for my fixture, and then I suddenly noticed the afternoon was over, and I went home and thought: is that really the way forward? My tests are smarter and certainly more readable, but now I have to test the test assertions. Do you see any way out of that? What's your experience?

Yeah, it's the same as with anything in testing: it's always a balance. That's obviously not good; I don't want to be writing tests for my tests, and I don't think anybody does. The nice thing is that the assertions we've written usually come out of noticing: I have a bunch of places where I want to make this same set of assertions, so that's a good place to factor something out. We try not to start from the other side, writing the assertion method first and then finding places to use it, because then it often does turn out like you describe: you end up shoving a lot of things into the assertion, like flags that make it do different things. Somebody tried that, and it actually got merged, I think. This is embarrassing, and it can be on video: there's a flag somewhere in our test suite for doing comparisons on HTML that switches whether it does an assertIn or an assertEqual. It happens, unfortunately. I don't have anything that helps with that other than: use your best judgement as it's happening.
Any other questions? Hold up your hands if you have a question. No? Well, since we've got some time, I've got a question. We both work on the Twisted project, and I know I've been told off umpteen times for having multiple assertions in my tests. But when you've just moved all your assertions into a custom assertion wrapper, you still run into what JP always preaches to us: if the test fails, you have to run it multiple times to collect all the failures, because it stops at the first failing assertion inside the wrapper. So would it be worth, instead of having those multiple assertions in the wrapper, collecting all of the information and putting it in one single container?

Yeah. Yeah. You couldn't have gone with the softball question?
So I agree with him, and with that: sometimes it's nicer to just shove everything into a container. I think it's easy to be lazy there, because it's a non-obvious move. Until someone tells you to do it, I don't think it's a solution people are likely to come up with; they try to apply the one-assertion rule, think "okay, I'll just shove it all in a tuple and make tuple comparisons", and decide that seems ridiculous. To be perfectly honest, I think the actual solution for that problem, which I understand, is that testtools has this other nice thing: failures that don't stop execution of the test.

Right, that's what he always talks about. I haven't actually seen it in action.

Yeah, they're pretty great. If you have a bunch of assertions and you want to execute all of them and still get all of the context, I think something like that is the actual way to go.

Okay, great. Any final questions? Oh, there is one more. Have we got time? Yeah, I think so.

Did you ever look into pytest?
Because I think it solves some of the problems you described.

I'll be perfectly honest: I don't know the surrounding things that pytest adds, so I believe you. Other than eliminating the TestCase class, I know it has things like fixtures; I don't know its other components.

I mean, you don't need a matcher library, because pytest shows much more of the context.

Yeah, I know it shows nice things: it'll show locals and frames when tests fail, things like that. But I can't imagine it provides quite the same thing. It's not going to give you layered assertions, because those are things you're defining in your application. I know pytest gives you some nice things, but I haven't used them.

Okay, thanks Julian. That was a great talk, very informative.