
API-schema-based testing with schemathesis


Formal Metadata

Title
API-schema-based testing with schemathesis
Subtitle
Automatically generate test-cases based on your API-schemas.
Title of Series
Number of Parts
130
Author
License
CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
The goal of this talk is to introduce the audience to property-based testing for APIs, using schemas to automatically generate test scenarios and enabling them to write more powerful tests faster. The talk covers a subset of the field of property-based testing: automatically generating properties and test strategies from the API schemas that we often already have. These tests ensure that our APIs conform to their specified schema and enable us to write a much larger number of tests in less time. I will focus on the schemathesis library, which leverages the Hypothesis library as well as the hypothesis-jsonschema extension strategies, and will in the future also support GraphQL via the hypothesis-graphql strategies. I’m a contributor to schemathesis and am currently working on the future GraphQL support with schemathesis creator Dmitry Dygalo. I will also compare it with its predecessor “swagger-conformance”, with pure property-based testing through Hypothesis, and with schema strategies from hypothesis-graphql and hypothesis-jsonschema, and discuss their advantages and disadvantages. I will also briefly talk about “QuickREST: Property-based Test Generation of OpenAPI-Described RESTful APIs” (https://arxiv.org/abs/1912.09686), the research paper that’s part of the inspiration for these tools. By focusing on property-based test generation using schemas we already have, I will show that a field like property-based testing, which can seem quite daunting at first, can actually have a low barrier to entry while yielding a large amount of value in return, and is useful for most common web projects today. The talk will show how formal schemas for APIs can and will continue to provide additional value outside the scope of documentation.
Transcript: English (auto-generated)
Our next speaker is from Sweden: it's Alexander Hultner, and he's a software consultant and founder of Hultner Technologies AB. And yes, you're going to show us something about
Schemathesis. Is this how to pronounce it, Schemathesis, right? Yeah, that's correct. So this is schema-based testing, right? Yeah, so you generate tests based on your schemas. Okay, is everything fine on your side? The internet working? Have you told your kids to not use the Wi-Fi? Yeah,
everything is great. Perfect, then I think we should begin. Please start sharing your screen and we'll see it. I think it works fine. Yes, okay, so I hope you see my slides now. Yes, so please.
Thank you. Yeah, okay, so, no, sorry. Okay, so schema-based testing, yes. Okay, sorry, I just dropped. Yeah,
so, schema-based API testing. Today I'm going to talk about a technique that allows you to automatically create tests from your API schemas using a tool called
Schemathesis. So everyone, welcome to my talk. It's interesting to have it online; I've had a couple now. Firstly I'm going to talk a little bit about myself. As you heard, I'm a freelance consultant and the founder of Hultner Technologies AB. You can find me on Twitter
at @ahultner, you can also email me at contact@hultner.se, I have a website, hultner.se, and you can also see all the slides from this talk, and more or less all my other talks, at slides.com/hultner. I'm on LinkedIn as Hultner as well,
so I should be easy to find there. Let's get on with the talk. A short outline first: I'm going to start with a short introduction to API schemas, in case you aren't familiar with
them already, and then I'm going to talk about some problems you might have encountered or may encounter. I'm going to talk briefly about property-based testing and a library called Hypothesis, and then I'm going to talk about Schemathesis, which builds on top of Hypothesis but
allows you to automatically generate tests based on schemas. I'm going to have a short demo showcasing how it works in action; I'm going to showcase both its CLI and its pytest integration mode. I'm also going to talk a little bit about stateful testing, about the
future of Schemathesis, and then some Q&A. So let's continue. API schemas, for those of you who haven't heard about them, are used to describe an API, and one of the most widespread standards
these days is the OpenAPI spec, which was previously known as Swagger. Swagger today is the UI for OpenAPI, but the name is also used to refer to the older versions, version one and two, of the OpenAPI spec, and it's based on REST and JSON.
I'm also going to talk a little bit about GraphQL, which is another technique. I know there was a talk about it earlier in this track. It's a typed query language where the schema and
data format are part of the specification, and there is some support for this as well in Schemathesis. It's work in progress, so it's getting better every day. There are a lot of Python implementations for OpenAPI or Swagger. One of them is Connexion by
Zalando, which is spec-first. Then there are a lot of code-first ones, which generate the specs. I'm not going to say that one is better than the other; depending on your use case, different styles can fit. There is FastAPI, which is based on async. I've used that a lot recently and
I really like it. It's very Flask-like but built from the ground up for APIs and async. Then there is Flask-RESTX, which was previously known as Flask-RESTPlus but has been forked,
and there is Flasgger. There is apispec, which generates specs from marshmallow, if you already use that. Django REST framework is another. All these links are clickable, so if you go to the slides you can click on these links and go to the websites for the different projects.
Yeah, so based on that we can go ahead and go into the actual subject. The problem you may have is that maybe you have inaccurate data, maybe you get unexpected user requests, or
maybe there's a mismatch between the database layer and the application layer. There can be a library defect, there can be human errors, invalid schemas, missing edge cases. There are a lot of things that can go wrong even if you have good schemas for your APIs, so a schema isn't a guarantee that it works 100 percent. And there is of course a spectrum of
defects, so not all errors are equal, even if all are bad. Maybe you have an incorrect or non-conforming schema. Maybe this isn't high severity; your application probably won't be
compromised based on this alone, but it can break client code generation, and it leads to incorrect assumptions, which in turn cost time and money and engineering time. It can break client code generation... oh, I already said that. Yeah, then you have
unhandled errors, which are lower severity. It looks bad and it's an inconvenience for both the user and whoever comes upon it; it can cause confusion, and if you're unlucky it can
lead to further escalation. Maybe you have logic errors; these are higher severity. They can lead to data corruption or incorrect behavior, maybe your application crashes, or incorrect billing, maybe even a negative number on your checkout if you have a web shop.
And security problems are of high to critical severity: denial-of-service attacks, data leaks, authentication bypasses, remote code execution and many more. So of course we want to avoid
these defects and errors and problems as much as possible, and testing is of course something we use to minimize them. Today I'm going to talk about a specific solution, or a specific way to run testing. What I'm using today is based on property-based testing
and a library called Hypothesis, which is very good at finding corner cases and generating a lot of examples and tests. It does the heavy lifting in creating exhaustive tests. Hypothesis is the de facto standard for property-based testing in Python. A property
models the behavior of a piece of code given a certain type of input, so you know the way it should work, but you specify that in your property in a generic
fashion. But I'm not going to go into all the nitty-gritty details of property-based testing in this talk. I have another talk on that, which you can find from my links. Instead we're going to talk about a specific solution.
When it comes to modeling properties, the schema is a way to define expected behavior and expected input, and this is pretty much the same thing as a property, so using this fact we can
leverage this with the Schemathesis library. Schemathesis lets you model properties and strategies from schemas, so it automatically generates test cases based on what we already know about our application and about the specs. It was created by Dmitry Dygalo in mid-2019.
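As a rough illustration of what "strategies from schemas" means, here is a minimal standard-library sketch that turns a tiny subset of JSON Schema into generated values. It is not how Schemathesis works internally (the talk says it builds on Hypothesis and hypothesis-jsonschema); the function `generate_value`, the default bounds and `TODO_SCHEMA` are assumptions made up for this example.

```python
import random


def generate_value(schema, rng):
    """Produce one value allowed by a tiny subset of JSON Schema."""
    kind = schema["type"]
    if kind == "integer":
        low = schema.get("minimum", -1000)
        high = schema.get("maximum", 1000)
        # Mix boundary values with random ones; edge cases are where bugs hide.
        edge_cases = [v for v in (low, high, -1, 0, 1) if low <= v <= high]
        if rng.random() < 0.5:
            return rng.choice(edge_cases)
        return rng.randint(low, high)
    if kind == "string":
        length = rng.randint(0, schema.get("maxLength", 8))
        return "".join(rng.choice("abc xyz") for _ in range(length))
    if kind == "object":
        properties = schema.get("properties", {})
        return {name: generate_value(sub, rng) for name, sub in properties.items()}
    raise ValueError(f"unsupported schema type: {kind}")


# Hypothetical schema for a to-do item; note that "id" has no minimum.
TODO_SCHEMA = {
    "type": "object",
    "properties": {
        "id": {"type": "integer"},
        "task": {"type": "string", "maxLength": 20},
    },
}

rng = random.Random(0)  # seeded so a failing case can be reproduced
cases = [generate_value(TODO_SCHEMA, rng) for _ in range(50)]
```

Because the schema only says `id` is an integer, zero and negative ids are legitimate inputs for the generator, which is the kind of case a hand-written test tends to miss.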
It's very actively developed. It supports both older Swagger specs and newer OpenAPI schemas, but it also supports GraphQL. The GraphQL support is still basic; the Python runner is
in the works. It's work in progress, but it's making some really good progress and it's working at the moment, so try it out if you are interested and look up the effort being made to get it to work. And briefly, some history, influences and related work. So
Schemathesis is its own library, made from scratch on top of Hypothesis, but there was another library prior to Schemathesis called swagger-conformance, which was developed from
2017 to mid-2018. It never reached a fully stable version, but it showed that this kind of generation was possible, and to my knowledge it was the first really known example of this type
of testing. But it's not actively developed anymore, and thus Schemathesis was created. There's also a research paper called QuickREST, and some of the features in Schemathesis are inspired by this research paper. Their findings are interesting and
well aligned with what Schemathesis does, so if you're interested in going deeper you could read that. And then let's go into the more detailed stuff. So now we want to model some
errors, and to do that we need to think about how the application should work, and how it should work in a more generic way. Some things we know: the application should probably respond, the server shouldn't crash, you shouldn't get unexpected exceptions,
you should get a status code which is one of the defined status responses, it shouldn't be 500 or above, and you can have stateful links and then make sure that those behave in the expected way.
So if, for instance, you create a resource and then you query for the same resource, you want to make sure that that works, and if you update it that should work, and if you delete it that should work, and so on. So quickly, as some pseudocode, we could define it like this: you have a response, you have a status code, you want to know that it's under 500 and that it's in the
allowed response status codes, you want to know that the content type is in the allowed set of content types, and that the content actually matches the schema spec.
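The slide itself is not part of the transcript, so the following is only a sketch of the checks just listed, written against the standard library. The `operation` structure is a heavily simplified, hypothetical stand-in for an OpenAPI operation, and `check_response` is a name invented for this example.

```python
def check_response(status_code, content_type, body, operation):
    """Return a list of failure messages; an empty list means the response conforms."""
    failures = []
    if status_code >= 500:
        failures.append(f"server error: status {status_code}")
    documented = operation["responses"].get(str(status_code))
    if documented is None:
        failures.append(f"undocumented status code: {status_code}")
        return failures  # nothing further to compare against
    if content_type not in documented["content_types"]:
        failures.append(f"undocumented content type: {content_type}")
    for field, expected_type in documented["fields"].items():
        if not isinstance(body.get(field), expected_type):
            failures.append(f"field {field!r} is missing or has the wrong type")
    return failures


# Hypothetical, heavily simplified description of one operation.
GET_TODO = {
    "responses": {
        "200": {"content_types": ["application/json"], "fields": {"id": int, "task": str}},
        "404": {"content_types": ["application/json"], "fields": {"message": str}},
    }
}
```

None of these checks mention a particular endpoint, which is why the same function can be applied to every request generated from the schema.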
So these things are true for all requests in our schema, and therefore we can actually use this as a kind of base to generate tests for all our endpoints. So let's pray to the demo gods and see if this works now. I'm going to showcase
quickly how it works. Here I have some example code. This is pulled pretty much directly from the Flask-RESTX documentation. I've just made some
small modifications, but it's basically exactly the same. This is a to-do API: it can create, delete, update and read the to-dos, and we want to test this API. So
then we want to use Schemathesis. I've made a Makefile, and here we have how we can run it over HTTP. So let's try this out. I'm going to run the server
and I'm going to run the tests over HTTP towards the server. Now, when I run this, you're going to see that a lot of requests will start to be logged here, because it starts to test them.
And here you can see that it's really hitting it with a lot of data, and you can see that it actually went through and it tested the GET, the POST, the GET for a specific one,
the PUT for a specific one and the DELETE for a specific one, and it passed. So now we know it works for something that works. Let's do a small modification to our code and see what happens then. If we look here, we have some business logic which we want to
activate, where we want to invert the id when we get a to-do item. So let's see what happens if we save this and we run the tests again. And it's running, and it's running,
and now we can see that we actually got some failures. What we saw is that if the id is zero, because we hadn't specified that it had to be a positive integer, only that it was an integer,
it fails, and the PUT, the GET and the DELETE, which all use the id, get the same problem. So now we have a way to reproduce this failure, and based on this we can actually find out why it's happening and we can make sure that this doesn't happen again.
Instead we can do something like: if id is zero then... or, if id isn't zero, we want to run that; otherwise let's just return zero for now.
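The demo code is not shown in the transcript, so this is only a hypothetical reconstruction of the guard just described; the function name `inverse_id` is an assumption.

```python
def inverse_id(todo_id):
    """Return 1 / todo_id, falling back to 0 when the id is zero."""
    if todo_id != 0:
        return 1 / todo_id
    return 0


# An "integer" with no minimum allows zero and negatives, so try those too.
for candidate in (-4, -1, 0, 1, 2):
    inverse_id(candidate)  # no ZeroDivisionError for any of these
```

An alternative fix would be to tighten the schema so the id must be a positive integer, in which case zero would no longer be a valid generated input.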
And if we run this again, hopefully we shouldn't get a division-by-zero error anymore. And it works. So this is a very quick showcase. I'm going to show you one other
mode as well, because this library can also import your application directly if it uses ASGI or WSGI. So if we run the imported test, we can see that it runs here, and here I've added some extra logging to really show that it outputs some
random data, basically, for every endpoint, and it's successful. If we look at how that looks, basically we just run the app with the import path
when we run Schemathesis. So that's the demo. Great: we could see that it generated a lot of requests, we could see that it could find a failure, that we could fix the failure,
and that it worked afterwards. So, a quick feature overview: we have a CLI, we have GraphQL testing, built-in WSGI and ASGI support, and an HTTP interface, so you can use it with any language you want;
it's agnostic. We have a pytest interface and stateful testing. There are fixups, I'm going to talk a bit about those: built-in fixups for libraries like FastAPI which have some small non-conformance to OpenAPI. There are some hooks, global, test-based and schema-based, so you can customize the behavior.
There's something called targeted property-based testing, which lets you search toward a desired goal and find results quicker, but reduces the randomness of the tests. There's VCR recording, so you can record all your tests in VCR cassettes and replay them later, or
use them with any software you may already have that supports the VCR cassette format. You saw the CLI; basically you can run schemathesis help to get all the documentation for it as well. As a minimal example, you can see that you run schemathesis
run towards the endpoint, and then it will run the tests and you get some results, so it's very easy to use. And the WSGI and ASGI interface I showed you as well. Here's an example using a Flask WSGI app, and you run Schemathesis with the app path
imported, and you point to the endpoint for the schema. Then we have a pytest interface. So let's make an example, or let's look at what we could
do. We already have built-in checks for no server errors, status code conformance, content type conformance and response schema conformance, but maybe we want to extend that with
some complex business rules. Maybe we have some response time or SLA, or maybe we want to write some properties for our authentication and make sure it's not a 401, or that it works as we expect. To do this we can use the pytest interface. The pytest interface,
this is the example from the documentation, can be used to customize the entire Schemathesis testing: you can generate the tests and the data based on the schema, but you can write your own logic for how to validate that it's correct.
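To make the idea of "your own logic" concrete, here is a small sketch of custom checks covering the examples mentioned above (no server error, no 401, a response-time limit). It deliberately avoids calling Schemathesis so that it runs without a server: `StubResponse`, `check_business_rules` and the 0.5 second limit are all assumptions for illustration. The attribute names `status_code` and `elapsed` mirror those on a `requests.Response` object.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class StubResponse:
    """Stand-in exposing the two attributes used below."""
    status_code: int
    elapsed: timedelta


def check_business_rules(response, max_seconds=0.5):
    """Custom assertions to run on the response for every generated case."""
    assert response.status_code < 500, "server error"
    assert response.status_code != 401, "valid credentials were rejected"
    assert response.elapsed.total_seconds() <= max_seconds, "response too slow"


check_business_rules(StubResponse(200, timedelta(milliseconds=120)))
```

In a real suite, a function like this would be called from the schema-parametrized pytest test on the response obtained for each generated case, alongside the built-in checks.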
So in this example you can see we test if the response code is under 500. This is already built in, but it's just to showcase how it could work. Next up we have stateful testing. It's a very good way to enhance detection of certain defects.
They also talk about this in the QuickREST research and see that it can find defects much faster, and it was recently added to Schemathesis.
It reuses data from a previous request and response, resulting in easier ways to find defects faster, and it reaches further into your code base. It requires links between your objects, so it will work with OpenAPI 3, but in OpenAPI 2 or Swagger you need to use the x-links
extension to make this work. Basically it can look something like this: you make a POST request, and then you make a GET request with the same user id, you PATCH it with the same user id, you can GET the user again, PATCH it again and so on, so you can see that it works as it should.
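The request chain described here can be sketched with an in-memory stand-in for the API, so the example runs without a server. Everything below (`InMemoryUsers`, `run_stateful_sequence`, the status codes chosen) is hypothetical; the point is only that each step reuses the id returned by the earlier POST.

```python
import itertools


class InMemoryUsers:
    """Tiny fake user API; each method returns (status_code, body)."""

    def __init__(self):
        self._users = {}
        self._ids = itertools.count(1)

    def post(self, name):
        user_id = next(self._ids)
        self._users[user_id] = {"id": user_id, "name": name}
        return 201, dict(self._users[user_id])

    def get(self, user_id):
        if user_id not in self._users:
            return 404, None
        return 200, dict(self._users[user_id])

    def patch(self, user_id, name):
        if user_id not in self._users:
            return 404, None
        self._users[user_id]["name"] = name
        return 200, dict(self._users[user_id])


def run_stateful_sequence(api):
    """POST a user, then GET and PATCH it using the id from the POST response."""
    status, created = api.post("alice")
    assert status == 201
    user_id = created["id"]  # data carried over from the previous response

    status, fetched = api.get(user_id)
    assert status == 200 and fetched == created

    status, patched = api.patch(user_id, "bob")
    assert status == 200 and patched["name"] == "bob"

    status, fetched = api.get(user_id)
    assert status == 200 and fetched["name"] == "bob"
    return user_id
```

A purely stateless test would call GET or PATCH with a random id and mostly hit the 404 branch; following the link from the POST response is what lets the test reach the code paths for existing resources.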
So, the future of Schemathesis: we need your help to grow. GraphQL support is being developed right now and a lot of progress is being made, but a lot more can be done as well, so hopefully
it will be as good for GraphQL in the future as it is for OpenAPI today. It supports OpenAPI versions two and three today; OpenAPI 3.1 is in the works,
so it will support that when that is out, and the idea is to have it be agnostic of the schema standard and be able to work with a lot of different schemas. They are working on faster test generation, growing the community and of course improving
the documentation, so it's easier to adopt for new users, and more. So, concluding: we want you to spend less time writing tests but cover more, and let your computer do the lifting, so you can gain deeper confidence in your services. More things are coming and you should
try it out. It's very easy to get started, and it's a great way to get much more testing into your application. If you have some questions and you don't come up with them now, you can contact me. Here are some links to my GitHub,
to this talk's GitHub page, and all the slides, as I said before, are at slides.com. I'm making a course on Hypothesis right now and you can sign up in this Google form in case
you want to get notified when it's ready. I have a previous Hypothesis talk on YouTube from PyCon Sweden last year where you can see more details about property-based testing. And of course on the Discord, go to the talk's testing-with-Schemathesis channel if you want to ask more.
And I'm available for training, workshops and freelance consulting. So, any questions? First, thank you very much for the talk. We have a few questions, and one of the questions is: can the tests be customized for specific cases?
So, if I understand the question correctly, basically what you're thinking about is this pytest interface, which lets you customize tests. Basically you can
make tests based on a specific endpoint or you can add tests for the entire suite. In this case I just showed the parametrize for the entire schema, but you could as well add some custom pytest-based tests for a specific endpoint, and if you go to the
documentation you can read more about it. But yeah, I think that should be what you're after. Yeah, thank you very much. There's the other question: will this work with unittest?
I mean, I actually haven't tried. I guess you're talking about using the built-in unittest in Python. I've only used the pytest interface and I'm not sure how well Hypothesis in general plays with unittest, but if you're using the CLI you
don't really have to care about how it's implemented behind the scenes. If you want to extend it, I would recommend at least using pytest. Maybe you can get it to work with
unittest, but it's not something that's officially supported as far as I know. Okay, thanks very much. And there was one question unrelated to your talk: what's your editor and command line setup, because it's so good looking? Yeah, so I'm using vi as my... actually I'm using Neovim these days,
but yeah, so this is vi, and I'm using tmux as my multiplexer, so I have all my sessions and all my windows in the multiplexer and I can go around between them. And yeah, actually I have my dotfiles on GitHub, so if you go to my GitHub you can replicate my setup,
or if you can't, you should send me a tweet or something and I will help you. Okay, thank you very much. By the way, the discussion can continue in the talk-specific Discord channel, and somebody just posted
that it's available for unittest, so that works. Thank you very much for