
System Testing with pytest and docker-py

Video in TIB AV-Portal: System Testing with pytest and docker-py

Formal Metadata

System Testing with pytest and docker-py
CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.

Content Metadata

Subject Area
Christie Wilson/Michael Tom-Wing - System Testing with pytest and docker-py
System tests are an invaluable tool for verifying correctness of large-scale online services. This talk will discuss best practices and tooling (pytest and docker-py) for writing maintainable system tests. Demonware has used system tests to verify online services for some of the biggest AAA video game launches as well as internal operational tools. Many folks who write software are familiar with unit testing, but far fewer with system testing.
-----
System testing a microservice architecture is challenging. As we move away from monolithic architectures, system testing becomes more important but also more complicated. In the video game industry, if a game doesn’t work properly immediately after launch, it will heavily impact the game's success. We have found system testing to be an important tool for pre-launch testing of game services and operational tools, to guarantee the quality of these services at launch. We want to share with you best practices for system testing: when to write system tests, what to test and what not to, and common pitfalls to avoid. Using Python’s pytest tool and docker-py for setting up services and their dependencies has made it easier than ever to write complex but maintainable system tests, and we’ll share with you how we’ve made use of them. Developers (senior and junior) and ops folks can walk away from this talk with practical tips they can use to apply system testing to their software.
So, welcome Christie and Michael.
Hi everybody, I'm Christie Wilson. I'm from Canada, and I'm an engineer at Demonware, where I lead the test tools team. And I'm Michael Tom-Wing, also a developer at Demonware, where I focus mainly on test automation and general quality work. I'm also from Canada, by the way. First, a bit about Demonware: we work in the video game industry, and we build online services for games. You will learn more about our testing efforts there today. We're going to take you on a journey. It is one of the many tales of the commander's quest for quality. Today's tale: Once Upon a System Test.
There are many stops along the commander's journey. The tale starts in 2011 at Demonware, where testing needed improvement. Before the commander could improve testing, the commander had to learn what testing was. With their newfound understanding, the commander could decipher some tomes containing best practices for system testing, and the commander picked up a couple of allies along the way: pytest and docker-py. Before we finish, we'll go over some practical takeaways that hopefully you can use right away, tailored specifically for people who do more development work or more operations work.
So, on with the story. We'll start the commander's journey by going over the current state of testing. Back in 2011, we had a gigantic monolithic application. I mentioned that we run online services for games; what I didn't mention was that it was actually one gigantic service.
In order to test features in a gigantic service, we did things like test in production or test manually in a local development environment. We even tried to ease the burden of spinning up a test environment by writing really complicated bash scripts, which worked but were really unmaintainable.
It's 2016, and Demonware has caught up with the microservice craze. Instead of having one monolith, we have a whole bunch of microservices with complicated dependencies between them. It turns out it's actually easier to test the monolith than it is to test the microservices.
To deal with this additional complexity, we now have a team dedicated to test tooling, and instead of relying on just unit tests, we have unit, integration, and system tests.
Continuing along the commander's journey: next, they had to find out what testing actually was. We found it very difficult to define testing, so we'll focus on why we test instead. So why do we test? First and foremost, we test in order to increase our own confidence that the software we write does what we expect it to. When we write tests, we are codifying the intended behavior of our applications in code, so we can go back to the tests as our software evolves and see how it's supposed to work. And as we continue to run these tests, we can easily catch bugs that may be reintroduced as time goes on.
We also want to clear up some common misconceptions about testing. Some people think that we test in order to find all the bugs in the software, but that's pretty much impossible. No matter what you do, there are going to be some bugs that you don't find; you're only going to find the ones you already know how to look for. To illustrate this, you can see a pink bug with hearts behind the commander. The commander did not catch it, simply because when we write our tests we have other bugs in mind, and as you can see, a bug can be sitting right behind the commander.
Another misconception about testing is that testing improves the quality of the software. Tests themselves don't actually improve the quality of the software: by the time you run the tests, the software already has whatever bugs it is going to have. If you want higher-quality software, the place to do that is during requirements gathering or design. What testing gives you is information about the quality of your software, and software that is untested is usually viewed as lower quality because there is less information available about it.
Time for an example. Suppose we have a simple service, say a cat matchmaking service. All the service does is help cats find other cats to play video games with. The cats themselves talk to the service using a client library, and the service itself stores its state in a database. So how do we test this?
As I mentioned before, we have these three types of tests: unit tests, integration tests, and system tests. First, unit tests. We're going to use unit tests to provide almost 100 percent coverage of the client library and the service itself. Unit tests are the fastest tests to run, the easiest to write, and the easiest to maintain, so we're going to cover pretty much everything with unit tests. When we shoot for 100 percent coverage, we're probably not going to make it; that's not really reasonable, but we're still going to go for it. We also have some integration tests. In this case, they test the interaction between the service and the database. Up until now, we can test each of the components of our service in isolation. And now the system tests, which are the main thing we're here to talk about today. System tests test the entire system from the perspective of the end user. They are the most valuable tests because they actually use your system the way the user does, and they are the most likely to find bugs. On the other hand, they are the hardest to write, the most complicated to run, and the slowest, because of all the setup that's required. So for our cat matchmaking service, we're going to rely mostly on unit and integration tests for coverage: we test all of our well-factored components with unit tests, we cover the gap between the service and the database with integration tests, and on top we add a sprinkling of system tests: a couple of happy-path tests and maybe a few edge cases.
With the commander's newfound understanding of testing, they were finally able to decipher those ancient tomes that were just lying around. The tomes were rife with phrases like "ship" and "Docker". The commander was taken aback by this arcane terminology, but our intrepid hero carried on anyway and in the process uncovered standards and best practices for system tests.
Best practices. The first one is that you should give each test a fresh test environment whenever you can. This helps avoid test interdependencies: it should not matter what any individual test does, because you get a new environment for each test. As we'll see in a bit, using Docker to create the environments makes it even easier to achieve this ideal of having a fresh test environment. It's also important to make sure the tests can easily run both on your build server and locally. You want them on a build server so that continuous integration makes sure they keep working over time, but if there's a problem and somebody needs to debug something, they should be able to run the tests locally really easily as well. Another important detail is to restrict the environments you support. If you start using Docker, you might be under the impression that Docker runs the same way everywhere, but it actually behaves quite differently on Ubuntu than on something like the Docker beta for Windows. If you allow people to use all those environments, you end up supporting a lot of obscure problems that are specific to their environments. More best practices: as I mentioned, your tests should start from a fresh state, but it's also important that they clean up after themselves, because anything a test leaves running after it completes puts an extra burden on the person doing the testing and will probably make them less likely to run the tests in the future. Additionally, your tests should fail fast and informatively, in order to reduce the time it takes to identify a problem and react to it, tightening the overall feedback cycle.
Quick note about glue code. If you're writing really well-factored, tiny bits of functionality that do one thing and do it well, at some point you're going to have to bring all those things together somewhere. We often refer to this as glue code. In this example, the code is using a bunch of other modules and calling into them. If you've written unit tests before, you know that to unit test this you have to create a whole bunch of mock objects and model the complicated dependencies between them. The test that results is often very hard to write, really hard to maintain, and doesn't really add anything. So for this kind of code, we recommend skipping unit tests altogether and covering it with system tests instead.
As the commander wandered along the journey, they came across two allies that promised to help make system testing a lot easier. The first ally was pytest.
For those of you who don't know, pytest is an alternative Python testing library, in contrast to the built-in unittest library in the Python standard library. When you write tests with pytest, you'll find there is less boilerplate, and pytest also comes with a lot more features built in by default. They are optional, though, so you can use them or not as you like. The main thing we'll talk about today is pytest fixtures. A pytest fixture is simply a nice way of defining some setup and teardown logic for some state that your test requires. pytest ensures that setup and teardown are called in that order for each of your tests, which is very important because, as we'll see, when you write system tests you generally need to set up a lot of state.
Time for an example. On the left we have two green boxes: these are the setup and teardown for the fixture, and on the right is a test. As I mentioned, pytest makes sure that setup is run before your test and teardown after it. By default, pytest does this for every single test you write, which makes it very easy to achieve that clean-state ideal. You can also change when pytest calls setup and teardown. In this example with the yellow boxes, the fixture's setup isn't called for each test; it is called once before any test runs, and teardown is called once after all your tests have run. Fixtures can also be combined to create more complicated setups as your application requires. And now a little bit about Docker.
At Demonware, services are fairly hard to set up and run. To make it easier, we put them in Docker containers. When we started to write tests that used these containers, we at first wrote complicated bash scripts that did the setup and teardown, which wasn't very maintainable.
The commander found the next ally, docker-py. docker-py is a Python library for using the Docker remote API. However, it has a one-to-one mapping with the REST interface, so it's a little bit clunky. To demonstrate what that looks like, we're going to show some code that you would use with docker-py to create a client object, pull an image, create a container, start the container, and then remove the container. If you use the Docker command line at all, you know that steps two through four are usually just the docker run command. You don't get the same convenience with docker-py; you have to be more explicit. First, we're going to create the Docker client object.
Also, if you're interested in using any of this code, there's a link at the bottom of all of our slides to a GitHub repo that has all the example code, so when the examples get a bit longer and you want to look at them in more detail, just go to that URL. In this example code, right off the bat, you can see that this would only work on a system that has Unix sockets, so restricting the environments you support becomes pretty important. The other thing to note is the flag to automatically detect the version of the server, so that we don't have to keep the client and server in sync. Next, we're going to pull the image that we want to run.
On the Docker command line, if you don't specify a tag, it pulls latest. docker-py does not do this; instead it pulls the entire repository, so you have to be explicit about the tag you want.
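
The client and pull steps described above might look roughly like the sketch below. It assumes the current Docker SDK for Python, where the low-level client is called `docker.APIClient`; at the time of the talk the same class was named `docker.Client`. The `make_client` and `pull_image` helper names are hypothetical, and newer SDK releases may default a missing tag differently from what the talk describes, so the sketch always passes one.

```python
def make_client():
    """Create a low-level Docker client (assumes the `docker` package is installed)."""
    # Imported lazily so this module can be imported without the SDK present.
    import docker

    return docker.APIClient(
        base_url="unix://var/run/docker.sock",  # only works where Unix sockets exist
        version="auto",                         # detect the server's API version
    )


def pull_image(client, repository, tag):
    """Pull exactly one tag and return the full image name."""
    if not tag:
        # Be explicit: never rely on the library's behaviour for a missing tag.
        raise ValueError("pass an explicit tag")
    client.pull(repository, tag=tag)
    return f"{repository}:{tag}"
```

For example, `pull_image(make_client(), "redis", "latest")` would pull only `redis:latest`.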
Another caveat is that docker-py will often not raise exceptions in cases where you think it would. You would think that if it fails to pull the image it would raise an exception, but actually you have to parse the response yourself, so that is something to be aware of. Then we're going to create a container. The important detail here is that we're adding a special label to it. In our tests, we add the same label to all the containers that we start, and then we can do some fancy things like dumping the logs from all the containers after the tests are over. Then we start the container, and when we're done, we remove it.
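
A hedged sketch of the remaining steps: checking the pull response for errors, then creating, starting, and removing a labelled container. It assumes the low-level `docker.APIClient` interface of the current Docker SDK for Python, passed in as `client`. The label name and the helper names are hypothetical, and depending on the SDK version and the failure mode, a failed pull may raise an exception on its own instead of reporting an error in the stream.

```python
TEST_LABEL = "system-test"  # hypothetical label shared by every test container


def pull_or_raise(client, repository, tag):
    """Pull an image and turn error entries in the response into an exception."""
    # With stream=True and decode=True, each progress event arrives as a dict.
    for event in client.pull(repository, tag=tag, stream=True, decode=True):
        if "error" in event:
            raise RuntimeError(f"pull of {repository}:{tag} failed: {event['error']}")


def start_labelled_container(client, image):
    """Create and start a container that carries the shared test label."""
    container = client.create_container(image=image, labels={TEST_LABEL: "true"})
    client.start(container["Id"])
    return container["Id"]


def remove_container(client, container_id):
    """Remove the container, stopping it first if it is still running."""
    client.remove_container(container_id, force=True)
```

Because every container carries the same label, later steps can find all test containers with a single label filter.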
Now an example of docker-py and pytest fixtures working together. We might replace the generic setup and teardown in the green boxes with "start container" and "remove container". For a more concrete example, suppose that we have a web service, represented by the yellow box here, that has no state by itself; it stores its state in a database somewhere. We only need to start that service once, since it has no state of its own, but it does store state inside two database containers, Redis and MySQL. So we're going to have a second set of fixtures, which is set up and torn down for each individual test, and that will give us new databases each time.
Here is an example of a simple pytest fixture. It does what Christie described previously: it creates a Docker client, starts a container, and removes it again at the end. The main thing to note here, if you can see it, is the yield. In the yield we are actually returning the IP address of the newly started container. It may not be apparent here, but we are able to use that IP address inside any test that uses this fixture.
pytest also has a very elaborate hook system, which lets you modify the default behavior of pytest, and we actually use this to dump the logs of all of our containers at the end of the test run. This particular hook is the log report hook, which is executed whenever pytest wants to write a test report somewhere. In the event that the test has failed, we go through each of the containers that has our special label and dump out all of its logs. That's the sample, and we were pretty impressed.
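
A sketch of the log-dumping hook described above, meant to live in a `conftest.py`. It assumes the hook in question is pytest's `pytest_runtest_logreport` and that containers were started with a shared label; the label name and the `dump_container_logs` helper are hypothetical, and the client is the low-level `docker.APIClient` of the current Docker SDK for Python.

```python
import sys

TEST_LABEL = "system-test"  # hypothetical label applied to every test container


def dump_container_logs(client, label=TEST_LABEL, out=None):
    """Write the logs of every container carrying `label` to `out`."""
    out = out if out is not None else sys.stderr
    # all=True also lists containers that have already exited.
    for container in client.containers(all=True, filters={"label": label}):
        logs = client.logs(container["Id"])  # returned as bytes
        out.write(f"----- logs for container {container['Id'][:12]} -----\n")
        out.write(logs.decode("utf-8", errors="replace"))
        out.write("\n")


def pytest_runtest_logreport(report):
    """pytest hook: called with the report of each test phase."""
    if report.failed:
        import docker  # imported lazily; requires the Docker SDK

        client = docker.APIClient(
            base_url="unix://var/run/docker.sock", version="auto"
        )
        dump_container_logs(client)
```

pytest picks the hook up by name, so no registration is needed beyond placing it in `conftest.py`.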
You might be wondering about Docker Compose: could I use Docker Compose instead? It seems to have very similar functionality to what we're doing. Yes, you can, and it works really well, especially if you want to use exactly the same setup for every single test. If the cluster of services you're running for each test is the same, Docker Compose makes a lot of sense. If it's dynamic, if you're doing things like changing the volumes that are mounted, changing the port mappings, or doing anything more complicated, then something like docker-py makes a bit more sense. If you do decide to use Docker Compose, it still fits in really well with pytest fixtures: you can have a fixture that does docker-compose up and then docker-compose down, and you can also use docker-py to inspect some of the containers and get information out of them.
Here is an example of what that would look like: a fixture that does docker-compose up, yields the IP address of one of the containers that started, and then tears everything down again. The example code is in the repo if you're interested in using it.
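
A rough sketch of such a Compose-based fixture, assuming the `docker-compose` command-line tool is installed and the low-level `docker.APIClient` is used for inspection. The compose file path and the container name `myproject_web_1` are hypothetical placeholders, and the `compose_up` helper is not from the talk's repository.

```python
import contextlib
import subprocess

import pytest

COMPOSE_FILE = "docker-compose.yml"  # hypothetical path
WEB_CONTAINER = "myproject_web_1"    # hypothetical container name


@contextlib.contextmanager
def compose_up(compose_file, run=subprocess.check_call):
    """Run `docker-compose up -d`, and always run `docker-compose down` after."""
    base = ["docker-compose", "-f", compose_file]
    run(base + ["up", "-d"])
    try:
        yield
    finally:
        run(base + ["down"])


@pytest.fixture(scope="session")
def web_ip():
    import docker  # imported lazily; requires the Docker SDK and a running daemon

    client = docker.APIClient(base_url="unix://var/run/docker.sock", version="auto")
    with compose_up(COMPOSE_FILE):
        info = client.inspect_container(WEB_CONTAINER)
        # Compose attaches containers to a project network, so the address
        # is listed per network rather than at the top level.
        networks = info["NetworkSettings"]["Networks"]
        yield next(iter(networks.values()))["IPAddress"]
```

The `run` parameter exists only so the up/down ordering can be checked without Docker; in normal use the default `subprocess.check_call` is left in place.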
A few important gotchas came up along the way. One of them is that Docker has no notion of a service actually being able to receive requests, so sometimes tests will fail because the service in the container is still starting. You can get around this by having an executable inside your containers that you can call from outside and that says whether the service is ready for requests. You can then poll it with backoff; there are a couple of Python libraries, backoff and retrying, which make this really easy. It's also important to make sure that your containers start up as quickly as possible, something that we learned the hard way. The slower the containers start, the slower the tests will run, the slower people get feedback on their code, and the slower development will be, and this will lower the overall quality of the software. And so the commander has had a long and arduous journey but has gained a lot of knowledge along the way, and would next like to share with you some of the takeaways from both the dev and ops perspectives on testing.
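
The readiness gotcha described above can be sketched as a polling loop. To stay dependency-free, this sketch uses a hand-written exponential backoff from the standard library rather than the backoff or retrying packages mentioned in the talk. The helper names and the idea of passing the in-container command as `cmd` are assumptions; `exec_create`, `exec_start`, and `exec_inspect` are methods of the low-level `docker.APIClient`.

```python
import time


def wait_until_ready(check, timeout=30.0, initial_delay=0.1, max_delay=2.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Call `check()` until it returns True, doubling the wait between tries."""
    deadline = clock() + timeout
    delay = initial_delay
    while True:
        if check():
            return
        if clock() >= deadline:
            raise TimeoutError(f"service not ready after {timeout} seconds")
        sleep(delay)
        delay = min(delay * 2, max_delay)  # exponential backoff with a cap


def container_is_ready(client, container_id, cmd):
    """Run a readiness executable inside the container; exit code 0 means ready."""
    exec_id = client.exec_create(container_id, cmd)["Id"]
    client.exec_start(exec_id)  # blocks until the command has finished
    return client.exec_inspect(exec_id)["ExitCode"] == 0
```

A fixture could then call `wait_until_ready(lambda: container_is_ready(client, container_id, ["/is_ready"]))` before yielding, where `/is_ready` is a hypothetical executable baked into the image.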
Now, this is all kind of cool, but what do I do with it? We're hoping that we can give you some specific things that you could try when you're back at work.
If you do more development work, it's really cool to know how to write tests, and that is great, but sometimes it's even more important to know when not to write tests. If you're going to use system tests, use them sparingly. That being said, the next time you have a feature to develop, try test-driven development, and try starting with the system test.
If you don't have any system tests for the software that you're working on, try introducing one for each piece of software that doesn't have one. Make sure that it can run with as little setup as possible and that it runs as quickly as possible, and then add it to some kind of continuous integration system. If you already have tests, take a critical look at them. Do you actually need all the tests that you have? Are some of them retesting functionality that the unit and integration tests already cover, and can you remove them? And can you make them any faster?
The same things apply to ops-minded folks as well. First of all, you need to know when not to write system tests. For example, you probably do not need them for one-off scripts, because by their very nature you do not care whether one-off scripts keep working in the future. In contrast, if you have tooling and other scripts that do need to work in the future, then yes, you should write system tests for them. Start by having just one system test which exercises enough functionality in your tool to prove to yourself that it works. As with developers, you should also run it regularly, so that you get value from it.
For ops tests, there are two categories. The first is tests that involve services that you can run yourself. For example, earlier we had the fixture that started a MySQL container, which is something we can run locally, and for that we recommend using something like what the commander described, with pytest and docker-py. Now, for tests that require things you cannot run yourself, like services in Amazon Web Services, you can still use pytest, but there are some questions you should ask yourself first. For one, is it feasible to have a short-lived test setup in this external environment, or does it cost a lot of money? And is it easy enough to clean up after yourself in the external environment, so that you don't rack up excess charges? If you're comfortable with the answers to these questions, then yes, you should write and run tests for these types of tools, but use them sparingly.
In conclusion: system tests are great. Definitely write system tests, but don't write too many system tests. If you have services that you can run in containers, try checking out pytest fixtures with docker-py and/or Docker Compose; it works really well. And so, as the commander's tale comes to a close, they are content with all the knowledge they gained across the journey, and they're looking forward to bringing that knowledge back with them. Thanks for listening.
Thank you, I found the talk very interesting. I have a lot of questions, but I'll try to ask just one. Where do you run these tests? For continuous integration, do you have dedicated environments, and do you run system tests on developers' machines or not?
First, we want you to run the tests as much as possible. While you're developing the feature you're working on, you run the tests. Ideally you would catch any failures in the unit tests, since they are faster than the system tests, but we want everyone to run these tests all the time. I run them on my machine all the time, for example, and ideally all the developers are running them too. We also have a team dedicated to our build infrastructure. We've been using mostly Bamboo, and we run all the system tests in Bamboo on Bamboo agents; we're slowly migrating over to Jenkins now, running in the cloud. And we're trying to make sure that these tests will run on developers' machines as well.
I'm interested in your ratio between unit, integration, and end-to-end tests; that's the first part of the question. The second part is: why not use only end-to-end tests instead?
The ratio, I would say, depends. This is more of an ideal that we're going forward with for new software; we have a lot of legacy software, and there the tests are a mix. Going forward, we have hundreds or even many thousands of unit tests to a handful of system tests, like ten, or fewer than 50. That basically means we aim for 100 percent coverage with unit tests and test some of the client-facing endpoints with system tests. As for the second part of the question, which I believe was why not just run system tests: as you've seen, standing up the entire software stack is very expensive and takes a lot of time, and it slows down the turnaround for someone who is working on something and wants to write a test and make sure it works. That's why you want to use them sparingly, although you are right that they give the most benefit, because they use your software the way it will actually be used. The bottom line really is how much speed you want to sacrifice. Usually we have more unit tests in order to catch things as early as possible, because they're really fast. The other thing is that it depends on how many paths there are through your software. If you have a lot of branches, and those have branches, then covering all of that with system tests is basically impossible because of the number of cases you'd have to cover. With unit tests you can measure that all of that is covered, whereas it might be completely infeasible with system tests. But some software is better suited to system tests: we also write some software specifically for automating things like testing and deployment, and for some of that we have pretty much only system tests and no unit tests at all. So it really depends.
Thanks for your talk. One question regarding data fixtures: you start the containers, but how do you manage getting the data into those, Postgres or whatever you use for data stores, so that you are able to test the flow?
Usually we have one fixture which stands up the database container and a second fixture, which depends on that fixture, that actually inserts the data into it. With pytest, a fixture can depend on other fixtures, so you can use that to insert the data before you actually run your test. In the example fixture we showed, it is yielding the IP address; before that yield you could insert the data. If that is not fast enough, in some other cases we build a base image, and then we regularly build images on top of it that contain the data we need for the tests, so the test just starts a container that already has the data it needs.
Have you tried this approach with something other than containers, like virtual machines, without Docker?
Yes, we have. There's no reason why it wouldn't work; it is just more expensive to stand up a whole new virtual machine than a Docker container. But you could plug in something that manages virtual machines in place of docker-py in our examples.
I have a system that's pretty similar to that, and I have this small technical problem. My fixtures automatically download the images they need, and the problem is that you run the tests and then nothing happens for like 30 seconds. I'd like to write a log message during my fixture setup, but I'm not able to just print a message because pytest captures the output. Did you do something about that?
We also have that problem, and I don't think we have a great answer for you. In a lot of cases we have logic so that if the image was already pulled, we don't pull it again. Our build agents have a previous step that always pulls the latest image, and we assume that when people run locally they take care of pulling themselves if they want the latest image. That's not really an answer, but I think there is a lot of opportunity for somebody to write a really good library for using Docker with pytest fixtures; something like that would be great. I would recommend looking at the hooks: there might be something you can add there to emit output, since the fixture itself isn't going to output anything. Any more questions?
Thanks for the talk, it was interesting. My question is how you would approach a situation where you have the kind of system you described at the start of the presentation, a monolith with very poor coverage, where there are only a few system tests and the tests themselves are fragile. The tests are fragile because they basically contain hard-coded information and IDs.
For that, I would start with system tests, because you can use them to verify that the thing is still working. That doesn't really address the flakiness; that's an ongoing issue in the whole testing space. But with system tests you can test things like contractual customer requirements, and then, as you start to refactor and improve the rest of your codebase, you can start writing unit tests and integration tests for those parts. But start with a system test so that you know your software still works overall. For the monolith that we mentioned at the beginning of the presentation, we still have all the legacy tests, which are a weird mix that reaches straight into the internals of the system and calls things. What we're trying to do is isolate all those old tests, and for new things write unit tests, write integration tests, write system tests, and slowly transition over and delete the old ones as we go.
I'm interested in how you handle flaky tests, because a lot of system tests are extremely flaky. Do you just repeat the test a few times when it fails, or run with a different ordering, or something like that?
The strategy you mentioned, rerunning the test when it fails, actually causes a lot of problems. Several years ago we started doing something like that, and because of it the problems with the tests kept piling up over time, so at the moment we're at kind of a crisis point where we have to do something serious about it, because we've been ignoring flaky tests for so long. So I would recommend trying as hard as you can to remove the flakiness: make the test as deterministic as possible. You can often achieve that by going to unit tests: try to figure out how you can write unit tests that remove whatever is flaky. It might be a random element, something on the file system, or something time-related. Use unit tests to control that part and remove it from the equation. I'd recommend that rather than rerunning the tests. There is a pytest plugin for rerunning tests automatically, so you could do that, but try this approach first.
Hi, thank you for the talk. I'm interested in the organizational side of testing: how many testers do you have compared to the developers working on a project, and how do they interact with each other?
In general, we try to have the developer who writes the new feature write the tests as well, since they are the expert on the feature. In some cases we have tried something like pair programming, where someone else writes the system test, because that is less dependent on the internals and more about the general feature requirements. But usually we try to have the same person write both. We have about 120 developers, or engineers in general, in the whole company, and Michael is the only one who is explicitly a software engineer in test. I run the test tools team, and there are four of us altogether; we work on the tooling specifically, and Michael helps other teams that have broader testing concerns. But in general we're trying to encourage everyone to be skilled in writing tests, so they can write their own tests.