But how do you know your mock is valid? Verified fakes of web services

Video in TIB AV-Portal: But how do you know your mock is valid? Verified fakes of web services

Formal Metadata

Title: But how do you know your mock is valid? Verified fakes of web services
License: CC Attribution - NonCommercial - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.

Content Metadata

Subject Area
But how do you know your mock is valid? Verified fakes of web services [EuroPython 2017 - Talk - 2017-07-10 - Arengo] [Rimini, Italy] If your code calls a third party service then you may want to test that your code works but you don't want to call the service in your tests. It may be expensive, slow or impossible to call that service. For example, if you are making a Slack bot, you want to create tests which don't make calls across the network to Slack. One approach is to create a mock of that service. Our tests can now run quickly, cheaply and reliably. But if we copy the service incorrectly, or if the service changes, our tests will pass while our code does not work. Verified fakes solve this problem. You can write tests which confirm that your mock is an accurate representation of the service being mocked. Those tests can be a small subset of your test suite and they can be run periodically, to verify the validity of the many tests which use the mock. This talk will follow the example of VWS-Python, a verified fake for a proprietary web service. It will discuss the practicalities of creating such a fake and it will focus on the trade-offs, tooling and approaches involved. By the end of this talk the audience will understand how to tie together pytest, Travis CI, requests and Responses to create a verified fake. The talk is aimed at people who have an interest in writing correct software. It is assumed that the audience is familiar with basic testing techniques
Thank you, everyone, for coming. My name's Adam Dangoor and I work at a company called Mesosphere, where we're building an operating system for data centers. But last year I was working on something quite different: I was working on the back end of
an app where, as a user, you take a photo of a wine label with your phone and it tells you all kinds of details about that wine, or something like that; I'm going to stick to my NDA here today. Our API was a Flask app. If you don't know Flask, it's a really simple web framework, and it looks something like this. The really cool thing about Flask is that it provides a test client. What that means is that you can make requests against an in-memory application and get response objects which you can inspect. So if we look at this test here, it looks like we've made an HTTP request, but actually everything is being done in memory. For our wine recognition we also used something called Vuforia Web Services. Basically, Vuforia Web Services is a tool that lets you upload a whole bunch of images, in our case images of wine labels. Then, when the user uploaded a photo to us, we could send that image to Vuforia, and Vuforia would tell us which one of our previously uploaded images that photo most closely matched. Then we could fetch details about the wine from our database and tell the user details about the wine: how much it should cost, how it was rated, exactly where it's from, that kind of thing. But when we built a prototype, we kept finding that we had problems with bugs, and in particular those bugs came from assumptions that we'd made about Vuforia which weren't quite right. Often they came from reading the documentation and trusting that it was truthful and complete, and we can't always make those assumptions. So what we wanted to do was add tests for our matching workflow. That matching workflow of course used Vuforia, and we wanted those tests to be in our existing test suite. Now, Vuforia was accessed over HTTP, and that's what I'm going to focus on today, but the general idea really isn't specific to
HTTP, because you might want to test code which, let's say, uses a database for local storage, or you might want to test a deployment workflow which uses Docker, or maybe you even want to test code which uses Amazon S3 or some other cloud storage back end.
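For readers who have not used Flask, the in-memory test client mentioned earlier can be sketched roughly as follows. This is not the code from the slides: the `/wines/<id>` route and its payload are made up for illustration.

```python
from flask import Flask, jsonify

app = Flask(__name__)


@app.route("/wines/<int:wine_id>")
def wine_details(wine_id):
    # Hypothetical endpoint and payload, for illustration only.
    return jsonify({"id": wine_id, "name": "Example Rioja"})


def test_wine_details():
    # The test client calls the application in-process; no socket is opened.
    client = app.test_client()
    response = client.get("/wines/1")
    assert response.status_code == 200
    assert response.get_json() == {"id": 1, "name": "Example Rioja"}
```

The test reads like an HTTP round trip, but the request never leaves the Python process, which is what makes this style of test fast and reliable.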
Now, we were lucky: we had a very clear idea of what we wanted our first test to be. There's quite a lot of code on the slide, but simply, what we wanted to test was that if the user uploaded a photo of a wine label which matched a photo that we had already added, then they would get details about that wine. So I wrote a test a little bit like this. I have two wines here, and I add one to, let's say, the database, but I also upload it to Vuforia, and then I check that I get the right one back when I query the match function. That match function uses Vuforia on the back end. Now, with some third-party tools, maybe even some of the ones I mentioned, like Docker, it might be totally cool to call the real tool in your test suite. But when
we called Vuforia in our tests, we actually hit some problems. First of all, we were at the mercy of the network. What that meant is that when our system had a little network glitch, our whole test suite would fail, and because our tests made HTTP requests over the internet, we couldn't know whether a failure was because of the network or because there was some kind of flakiness in our code. We were also at the mercy of Vuforia: similarly, when Vuforia went down, even temporarily, our test suite would fail. It really does slow down development if you're constantly wondering whether you've made a mistake or whether it's them. Now, say you're using a real service like S3. S3 might be pretty stable, probably even more stable than your own software, so you might not have to worry too much about flakiness. But S3 charges you per megabyte, so if you want to use it in your test suite it might actually become really expensive to run the tests. Another problem you might run into is resource limits, and that's definitely something that I've hit. A lot of services have resource limits, such as a certain number of requests that your account can make. So if you call something in your test suite very heavily, especially if, let's say, you're doing performance benchmarking and making a load of calls, then you might hit those resource limits, and then you can't run tests any more and you're blocked on development. And even when those things weren't problems, everything was really slow. Vuforia is quite advanced software and does a lot of processing magic so that it can do the image matching, and that means that after you've uploaded an image it takes a few minutes until that image can be matched. That's reasonable, and I couldn't really expect it to be there instantly, but in our test suite we didn't really want to have to wait a few minutes to know whether our get-match code worked. So we called these tests, like the one that I showed you before, integration
tests, because they tested the integration of our software with Vuforia. I know that people have views about the terminology; some people call these things acceptance tests or end-to-end tests, but hey, we can agree that they're high-level tests. They were really useful, and they really did help us track down some bugs. But we also wanted unit tests, because unit tests give us a lot of benefits over integration tests. In particular, they tell us whether our code calls Vuforia correctly, in this case even when Vuforia is down. Unit tests are also small in scope, and what that means is that when one fails, not all the time, but often, you know exactly which part of your code is at fault. And if you change your code to make a unit test pass, that can be a small, isolated change. When you have unit tests that run quickly and are small, you can even use some tools, maybe like Hypothesis, to generate a whole bunch of unit tests. So we wanted to turn our code, which could currently be tested only by integration tests, into code which could also be tested with unit tests. One way that some people achieve this is by using mocks. Roughly, a mock is some code which provides the same interface as something that your code calls, but it reduces or removes some cost. In this case the main costs that we cared about, like I mentioned, were time, because of the slow tests, and flakiness. Again, you might want to avoid financial costs, resource limits and all kinds of other costs that can come into your test suite. So my goal was that where the code under test made a request to Vuforia, at least in our unit tests, that HTTP request would actually be handled by a mock function rather than going over the web. Now, we were very fortunate: we were using the requests library, which I'm sure some of you will be familiar with, and there are a few ways in Python to get requests which are made with the requests library to point to some mock. The one I chose is
called requests-mock. There's also another one, by the folks who make Sentry, called responses, and there's also something called HTTPretty, if you're on Python 2 and maybe you're not using the requests library. Now, the simple requests-mock example is this one. What you can say here is: when I make a GET request to test.com, return a string that says "data". That's pretty simple. At the same time as using requests-mock (and, sorry, the slides will be online afterwards), we were also using pytest. What pytest is, is a test runner which gives you a really neat way to do setup and teardown for test requirements. That feature is called fixtures, and we have a fixture here. What this one says is: if I use this fixture, then requests in this test will be handled by mock code. You can see that we yield while we're in the context manager. I'm sure that if you're using a more traditional test framework you can just use the normal setup and teardown for this. Now, I didn't just want to return the string "data" or something like that; I wanted some quite advanced features in my mock. In particular, I wanted to have a stateful mock, which would allow me to give different responses based on previous requests, so I could give a different match response if someone had already uploaded a picture of a matching label to the mock. So I used a requests-mock feature which let me use a callable instead of a predefined response. That callable takes a request-like object, which gives me the details of the request. So we created a whole bunch of small mock functions, one for every endpoint we used, and at this point we'd pretty much achieved our goal: we could test our code without touching the real Vuforia.
But then we had some more problems when we were using that mock, and I actually think that these are problems that a lot of mocks face. Sometimes we found that we'd copied the interface incorrectly. It can be pretty hard; there are lots of edge cases. What if the image is too big: do we give the right error back? That kind of thing. Humans make mistakes, even with code review, and so we found that we'd copied a lot of things incorrectly. But then, even when we were extra careful, we found that the mock quickly became outdated whenever Vuforia changed. If they sent out a really nice changelog we could change our mock to match it, but that's not always the case, especially for very minor things, and this isn't a Python library where you can inspect the code changes; this is a web service. Now, when you have an outdated mock you have quite a serious problem, or at least it was serious for us, which is that our tests pass but our software is actually failing in production. When you've got that, you can have a really difficult time tracking down exactly where your code is broken, because everything looks like it should be working. If you find that, oh, actually my mock is wrong, then where is it wrong? Trying to remake those requests manually to check your mock is very tedious. Then that contract ended, and I felt like I had built an OK solution that was working all right for the client, but I really felt that the problem could be tackled in a better way, and that I could have provided a better solution if I'd had more time, in particular because we kept hitting those issues of Vuforia changing and of human error. At the same time I really believed, and I still do, that Vuforia could be a genuinely useful tool for a lot of people, and it could be especially useful if it was easy to develop against. So I set out to make VWS-Python, which is basically an open-source library for using Vuforia Web Services with Python. It's in
progress, hopefully coming very soon to PyPI. But I also had another goal. I started testing it with a mock that was part of the library, but I realized that the mock itself is very useful whether or not you're using my library, and I wanted to ship that mock to people, so that if they were writing code which used Vuforia then they could have the mock for their own tests. So I wrote integration tests for the library and unit tests for the library which used the mock, and I put the tests on Travis CI, because it's what I use at work and because it's free for open-source projects. One really cool feature of Travis, and of a lot of other CI systems, is that I can give it the credentials for Vuforia without having those credentials shared in the codebase, where someone could abuse them, and without having them shown in the logs. So I could use the real service even from a CI system, and every time I made a change to the library the tests ran, and the integration tests ran against the real Vuforia. But if you remember the goal I set, I wanted people to be able to use my mock to test their code whether or not they were using my library. There is a cool way to let even people who use different programming languages, not just Python, use your mock while still keeping the interface really nice and pleasant for Python users: a pytest fixture or, if you're not using pytest, just a context manager or a decorator. You want to keep that for Python users, but you want to let other people use your code as well. The way that I did this was to build the mock in a way that meant it could be run as a standalone server. That meant ditching the requests-mock functions that we had before.
I won't get into this too deeply, because it's maybe a bit of a hairy hack, but it let me rewrite the mock as a Flask app and keep using it with requests-mock. That means that I've got a Flask app that I can just run as a standalone server, but if I use this code it ties it into requests-mock. What it does is translate those request objects from requests-mock into something that can be used by the Flask test client (that test client again), and then it also translates responses from the test client into something that requests-mock can use.
If you're not using Python, what you can do is start the Flask app, let's say in a Docker container, for every test, and then route your requests to that container using whatever kind of requests-mock alternative your language has. That can be particularly useful even if you're on an old Python version that doesn't support my mock. So I'd say, if you're in an organization and you're writing a mock, and you want that mock to be used across your organization, even by people who use different languages, this is a really cool way to do it. So I was writing the mock this time around, and the mock was, as I said, kind of my product, so I didn't want to just do it in an ad hoc manner: I wanted to test it thoroughly, and I wanted to write tests which confirmed that it was doing what I wanted. If you think about it, at this point I'm probably duplicating a lot of the work that the people at Vuforia did, right? I'm rewriting a bit of their service and I'm also thinking about edge cases. What I'm doing is very manual: I'm making requests to their servers with the kind of edge cases that I'm thinking about, then I'm noting the responses down in tests, and then I'm making sure that each test passes for my mock. I especially test things that aren't mentioned in the documentation. One example: the service takes a width for the image, in cm. Well, what happens if you give it a negative width? I tried it, I found that it gave an error, and I copied that exact error into my mock. Then the library, which is kind of the main product, handles that error and raises an appropriate, Pythonic exception. So at this point I have three sets of tests. I have a few integration tests, which test the library with the real Vuforia. I have a whole bunch of unit tests for the library, maybe many hundreds or thousands if you count those which are generated by Hypothesis, and those use the mock. And then I have some unit tests for the mock itself. But I'm still vulnerable to those problems that I
mentioned earlier: copying incorrectly, and Vuforia changing, which would render my mock inaccurate and my library possibly even buggy. Turning a mock into a verified fake, which is what the title of this talk refers to, is all about avoiding those problems. Roughly, a verified fake is a fake implementation which is verified against a subset of the same test suite as the real implementation. Now, I don't have the Vuforia code and I don't have their test suite, if they've even got one. So if I wanted to make a verified fake, which I did, I needed to have my own. Turning the mock into a verified fake really meant making a test suite which ran against both the mock and the real thing. So if
you recall that simple pytest fixture from before, well, I expanded it. pytest has a really cool feature called parametrization, and you can parametrize fixtures so that tests which use those fixtures are run once with each parameter option. Here it's a simple True/False, and I map that to whether or not to use the real Vuforia. So any test which uses this fixture is run twice: once with the real Vuforia and once with the mock Vuforia. So the test
results look something like this: you can see that each test runs twice. Fortunately, I already had at least the start of a test suite for the mock, so the first thing I did was apply this to those tests, so that they ran against both the mock and the real thing. And of course I found that I'd made a whole bunch of mistakes. So now we've got a verified fake: we have a test suite which runs against both the fake implementation and the real implementation. Now, because the mock
is now a verified fake, we can actually trust that it is representative of the real Vuforia. So we have loads of confidence in those hundreds of tests that we have for the library, and we know that they don't just rely on an unrealistic mock. But we also solved another problem. If you remember, we were worried that Vuforia would change and that would make our mock inaccurate. Well, now, whenever these tests pass, I know that the mock is still a faithful representation of Vuforia. We only incur the cost of running a hundred tests against Vuforia, but we get almost the whole benefit of running thousands of tests against Vuforia, so we lessen that cost of slow tests. But at this point our tests only ran when we made a change to the code, which might not be that often, especially once it's mature, and we want to know what happens if Vuforia changes. A cool feature of Travis, and I'm sure of a lot of other build systems, is that you can set tests to run on a schedule. This is a trade-off: if you run them more often, you find out about problems quickly but you incur those costs; if you run them very rarely, it takes a long time to find out about problems. The trade-off that I chose was to run them every night, but you could run them every release or every week, whatever works for your particular situation. Now, back to that width example. In the wine application I talked about at the beginning, we really didn't care about the physical width of a wine label. It wasn't a differentiating factor, and it's actually really hard to get, so we told Vuforia all the time that the width was 0. It didn't matter to us, and it always worked. So our mock supported that, and months later, when we get to the verified fake, the verified fake also supports it and has a test that a width of 0 is OK: no error is returned.
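A minimal sketch of a parametrized fixture of the kind described above, with a width test running against both back ends, might look like this. The `FakeTargets` and `RealTargets` classes, the result strings, the `/targets` endpoint and the `REAL_SERVICE_URL` environment variable are all hypothetical; they are not taken from the talk or from VWS-Python.

```python
import json
import os
import urllib.request

import pytest


class FakeTargets:
    """Hypothetical in-memory fake of a target-upload endpoint."""

    def add_target(self, name, width):
        # Assumed behaviour: negative widths are rejected, zero is accepted.
        return "Fail" if width < 0 else "Created"


class RealTargets:
    """Hypothetical thin client for the real service."""

    def __init__(self, base_url):
        self.base_url = base_url

    def add_target(self, name, width):
        body = json.dumps({"name": name, "width": width}).encode()
        request = urllib.request.Request(
            f"{self.base_url}/targets",
            data=body,
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with urllib.request.urlopen(request) as response:
            return json.load(response)["result"]


@pytest.fixture(params=[True, False], ids=["real", "fake"])
def targets(request):
    if not request.param:
        return FakeTargets()
    base_url = os.environ.get("REAL_SERVICE_URL")  # e.g. a CI secret
    if base_url is None:
        pytest.skip("no credentials for the real service")
    return RealTargets(base_url)


def test_zero_width_is_accepted(targets):
    # Collected twice by pytest: once per fixture parameter.
    assert targets.add_target("label", width=0) == "Created"
```

If the real service later started rejecting a width of 0, the `real` run of this test would fail while the `fake` run still passed, which pinpoints exactly where the fake has drifted.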
But one morning I get an email from Travis, and it looks something like this,
and it tells me that the build failed. So I look at the logs, and I see that we actually have a very precise data point about exactly what has changed in Vuforia: the mock passes this test but the real implementation fails it, and the test is: what if I add an image with a width of 0? So I know what to do: I just change the mock function and the test so that the new behavior is represented by the mock. That's very easy. But if you remember, the library's tests themselves depend on the mock. The library expects that a width of 0 is valid, but it's now invalid, so as soon as I change the mock, the library's tests immediately start failing. So I have to change the library to give a nice Python exception when the user gives a width of 0. What this really demonstrates is that, within a few hours, Vuforia made an undocumented change which introduced an incompatibility with my library, and then this incompatibility was fixed without any real complexity. To me, that shows the value of having a verified fake to any developer who is writing code which integrates with third-party software. You can imagine that building a verified fake when you have the original source code is much simpler than when you don't, and the fake can even share code with the real implementation. Hardly any web services are open source, so it would be really valuable if you're shipping software
to people. If you're shipping software to people which they might want to call in their tests, you can add tremendous value to that software by shipping a verified fake, and it might even cause someone like me to choose your software over a competitor's. If you make a verified fake as the author of the software, it's much easier, because you can know ahead of time about any changes that would make the fake and the real thing differ, so you know when to make changes without the need for that once-per-day test run. I'm hoping that maybe in the future, having an API which is easily tested against will become table stakes. One cool thing about making a verified fake is that you don't really have to ship your secret sauce. You can ship something that does the bare minimum of your API interface. If you're making something like Vuforia, you can just have a really rubbishy kind of image matching; the real matching is the core of your business, and you don't need to show that to people. So I hope that you now have at least a rough idea of what a verified fake is, why it might be useful, and how you can start making one for yourself and for your users. Thank you very much.
We welcome some questions. Hi, great talk. It's about 80% overlapping with the one I gave two days ago, but you've got a case study, which is great; I only had the general discussion. I think your ending is exactly how the world should be: there is no justification for releasing a component without a fake. On the terminology, I am trying to keep the distinction between fakes and mocks and all those other words. One thing is that, essentially, a verified fake should be a spy; the original idea was that a fake provides an introspection API, perhaps. In this case I don't worry about that so much, because the API itself provides introspection about its state. The thing that's missing here, in my view, is the ability to make it break on demand: you don't really want to have to set fire to your CPU to check your code, so the fake should be programmable to raise errors. So, actually, I've got a response to that. It's very difficult for that part of the fake to be verified, right? Because how do you have a test that checks against the real thing? That meant that I'd have to have a test that, when you say this is going to give a 500, it gives a 500 just like when their service is down, when their service isn't down right now. But actually, if you check out the source code for VWS-Python, the mock takes a state object, and I have various states, like "on fire", but not quite. So, just as I have this verified mock fixture, or mock Vuforia context manager, you can give it a parameter which says broken, inactive or slow, and then you can see whether your tests work even when there is a five-minute delay in the matching. I hope that gives you a little bit of insight into how I dealt with that issue. I really appreciated the talk.
I was wondering: in the kind of service you were mocking, the response basically depended on the data you had put in before. How would you go about mocking a service for which you don't have the ability to specify the data? For example, if I want to know all of the events in a specific location, they change every day and they're not in my control. How can we write tests against that data? Sure. So you can imagine that event-consuming API that you're describing; let's say Eventbrite is one of those companies, or meetup.com, and they also have an API for adding events, but maybe that API might not be public to you. So what you've got to do is act as if you are the meetup.com person: your mock of the meetup.com servers has to have some ability to add an event, even if it isn't a mock of an API that looks exactly like theirs. Then you can know that, given that I've already added an event, it works with the same structure. And if you want to verify it, what you can do is have a test account that has an event in a particular location with, you know, a particular image, and then you can make your test read from that test account, having uploaded the same kind of event into your mock already. Then you can say, OK, I want to take this event and check that the response is exactly the same. I hope that roughly answers your question, but you're right, it's not a solved issue; it's not always that easy, and it is context-dependent. Thank you. Is there any way you could integrate this with fuzzing, to find out the API responses that you may not be able to think of yourself? Sure. So I mentioned Hypothesis before; that's the closest tool to fuzzing that I've personally used. If anyone doesn't know it, it's a property-based testing tool, and it generates a lot of tests, which is kind of what fuzzing is. I haven't actually done it for this, because the request limits were so low
and the requests took so long; actually, the point of doing this, for me, was so that I could add fuzzing to my own code. But you can imagine that, if those problems weren't the case, you could say: hey, Hypothesis, or my fuzzing tool, please run random requests against my mock and the real implementation and check that they are either exactly the same in response or share some properties, like having the same keys. That would be ideal, but it really wasn't suitable in this case. Thanks, that was a really nice talk. I wonder: you have libraries like VCR, or Betamax, which is ported from Ruby, right, where you can record a response and it's stored as JSON. I wonder why you wouldn't just use those for the data, and then at midnight or so just disable the cache and see whether the responses still match. So VCR-style tools are definitely something that I've used a bit. But you have a very similar case, right: the API can change, and then when you disable the cache you have to update your VCR responses, so you've got a very similar thing, but you might not have the stateful component. If I want to add an image, what do I do? It's an alternative, I guess, to the fixture system. I know that people use VCR to record other services; I've very briefly contributed to PyGithub, and what they do is record responses, VCR-style. Really, I tried to avoid it because it came with its own set of problems, and it was more painful for me to use in this system. But thank you. Thank you.