Mutants, tests and zombies


Formal Metadata

Mutants, tests and zombies
aka mutation testing with Python and a pinch of Ruby
CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Mutation testing is a technique in which the software under test is modified in a controlled manner to produce a mutant. Then test cases are executed against each mutant. This helps answer the question "How good is our test suite?". For example, if we have made the following change:

- if A and B:
+ if A or B:

and our test suite reports a PASS result, that means we are not doing a good job at detecting possible errors. I have been using mutation testing for production grade software in both Python and Ruby and I'm also the most active contributor to Cosmic Ray, the mutation testing tool for Python. In this talk I will explain how mutation testing works and what it can be used for. I will give practical examples of code which wasn't tested and how to test it, and also examples of bugs that I've found. I will also mention some differences between the Python and Ruby tools.
Hello everyone! So, my name is Alex. I have been testing open source software for the last 10 years. This is how to reach me, if you like, and this is how I look now, pretty close. So today I'm going to talk about mutants and zombies, but not the ones from the movies, that's slightly different, and I have some code examples. First, let me get an idea: how many of you are familiar with Python, at least to some degree? OK, good. And how many of you are familiar with Ruby? OK, great. So don't worry, the examples are not very hard to understand.
So let's take this piece of code. It's a simple class describing one model in a Rails application. There's only one field that goes into the database, and this is the only method, a really trivial method: it just returns an uppercase version of that field. As you can see from the screenshot, this class has 100% code coverage.

And the question that I'm asking myself as a tester, and the question that I expect everybody should be asking themselves here, is: is my test suite good enough? Good enough in the sense that, whenever there's some change in the software under test, is my test suite able to detect if this change will break something? Or does the change go undetected, possibly go into production, and something bad really happens? Mutation testing can help you answer this question, but first I need to explain what a mutation is.

A mutation is usually a very small change in the software which somehow changes the behavior of the software. Mutations can come from comparison operators in if statements. If we take the example on the screen, the mutation is replacing the greater-than operator with a less-than operator, and if you apply this mutation to your code and read it, it becomes "if age is less than 18, then buy beer". Now imagine that you are an online store and suddenly you start selling beer to small children. This is not good.
Then another possible source of mutations is constant values. You can replace these values with something else; in this example we replace true with false. And because it's very early in the morning, if we hadn't had our coffees we would be looking like this.
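A constant mutation can be sketched the same way (the names here are made up for illustration only):

```python
def make_coffee(machine_is_on=True):
    # A mutation tool would flip the boolean constant: True -> False.
    return "coffee" if machine_is_on else "no coffee"

def test_coffee_by_default():
    # Kills the constant mutant: with the default flipped to False,
    # this assertion fails and the mutant dies.
    assert make_coffee() == "coffee"

test_coffee_by_default()
```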
Another possible source of mutations is loops. We may, for example, modify the loop condition, or, as in this example, change break to continue, and we get an endless loop. And when you have an endless loop, this is what happens. This is again an example from production; all my examples today are taken from production environments.

By definition, mutation testing is a technique, and also the tools for mutation testing, that modify your software under test. They don't modify your test suite: you keep the test suite separate, and for each mutant that is produced in your software under test you execute the test suite again and again and again. This gives you a pretty good idea, at the end, of how good your test suite is. It can tell you places where you have tests, but they don't do a very good job of finding all possible things that may change. It can also tell you places where you have missing tests, but you already know this, because we use coverage for that. So mutation testing is really good for telling you where you need to make better tests.

The idea behind this is that some of the mutations which you see, and which the tools support, try to mimic errors which developers may make while writing code: for example, off-by-one errors, deleting something by accident and committing it to source control, stuff like that. Other mutations which you might see are purely artificial, but somehow they help us validate the test conditions, the test environment which we run our tests within, and help us expose something that's missing.

The algorithm for mutation testing is actually very simple: we run three loops, one after the other. First we go through the mutation operators that our tool in particular supports; these are the things which the tool knows how to replace with something else. Then, for each operator, we find the places in the source code where that operator is used and replace it with something else. Most mutation operators, at most locations, lead to only one other type of mutation, but sometimes you can produce several different mutations for a single place in the code, like with the comparison operators. And then, of course, you execute the test suite.

There are three simple rules to kill mutants which you must remember. First, when you execute the test suite against the non-modified version of the program, everything should pass. This is a hard requirement, you cannot go without it: if something fails, obviously your software doesn't work or your test suite is broken, and you need to take measures. And if you have flaky tests, which sometimes pass and sometimes fail and you have no idea why, then obviously your mutation testing results will also be unreliable, and you need to take measures to fix your flaky tests. The second rule is: when you run the test suite against a mutant, that is, a modified version of your program, you expect the result to be a failure, and that's a good thing. When the test suite fails we say that the mutant was killed, or the mutant died, and this means we had at least one assertion or one condition in the test suite which wasn't met, so the test suite failed. That means we have at least one test which is able to detect that the software was modified. And the last thing, which you don't want to happen, is when you run your test suite against the mutant and the result is a pass. This is a bad thing. Then we say we have a zombie, or that the mutant survived. As you know from the movies, zombies are the dead that walk around and try to eat you. Now imagine you make some change in the code, run the test suite, it passes, but it doesn't really detect that anything changed, and now this change goes into production and suddenly it becomes a bug and tries to eat you. So these are the three rules: testing against the non-modified version should always pass; testing against the mutant should fail. This is how a mutation testing tool figures out that you killed the mutant: when something fails in the test suite, the tool says you killed the mutant.
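The three loops can be sketched in a few lines of Python using the ast module. This is a toy illustration under my own assumptions, not how Cosmic Ray or the Ruby tools are actually implemented: it mutates one comparison operator in a tiny module and runs a miniature "test suite" against every mutant:

```python
import ast
import copy

SOURCE = "def can_buy_beer(age):\n    return age >= 18\n"

# Loop 1: the mutation operators the tool supports (here: replacements for >=).
ALTERNATIVES = [ast.Lt, ast.Gt, ast.LtE, ast.Eq, ast.NotEq]

def make_mutants(tree):
    """Loop 2: for each location where the operator occurs, yield a mutant."""
    for alt in ALTERNATIVES:
        mutant = copy.deepcopy(tree)
        for node in ast.walk(mutant):
            if isinstance(node, ast.Compare) and isinstance(node.ops[0], ast.GtE):
                node.ops[0] = alt()
        yield mutant

def run_tests(tree):
    """Loop 3: execute the 'test suite' against this version of the code."""
    ns = {}
    exec(compile(tree, "<code>", "exec"), ns)
    try:
        assert ns["can_buy_beer"](30) is True   # adults can buy beer
        assert ns["can_buy_beer"](10) is False  # children cannot
        return "PASS"
    except AssertionError:
        return "FAIL"

tree = ast.parse(SOURCE)
assert run_tests(tree) == "PASS"  # rule 1: the unmodified program must pass

results = [run_tests(m) for m in make_mutants(tree)]
print(f"{results.count('FAIL')}/{len(results)} mutants killed")  # 4/5
```

Note that the > mutant survives as a zombie: neither miniature test exercises the boundary age == 18, which is exactly the kind of missing test this technique points out.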
Okay, now let's play a little game. I am going to show you possible mutations, and because you all know about testing, you're going to tell me what test cases I need to write to kill the mutants. I am using the code from earlier, the one with the 100% coverage, and we'll take just this method. This is a string variable, and we return the uppercase version of the string; that's what it does, very simple.

The first possible mutation is: instead of returning the uppercase string, my method under test returns nil. So what test can you propose so I can kill this? Yes, exactly: obviously not nil. So we execute the method under test and expect the result to obviously not be nil. If we go back, apply the mutation and run the test, the method will return nil, the expectation will fail, and we kill the mutant.

OK, next possible mutation: instead of returning the uppercase string, I'm returning self; because this method is part of the class, I have access to the self object, so I can do this. What test do I need here to kill this mutation? OK, check that the type is String, correct. So I must be checking what the type of the return value is, and if I expect only a String then it must be only a String and nothing else.

Another possible mutation is: instead of returning the uppercase value, I just return the value of this variable as is. So how do we kill this? OK, so I start with a value which is in lowercase, then I execute the method under test and expect the result to be in uppercase, and I do this with just a constant because it's easier. Now this one, for example, will also kill the first and the second zombies, and if I start with nil it would also kill the first one. So sometimes one test is enough to kill several mutants. Actually, this is the order in which I discovered them and developed the tests for them, and I didn't go back to check whether I have some test which is not needed.

OK, so the last example from the game is replacing this ampersand-dot, the new safe navigation operator in Ruby, with a plain dot. For the folks who don't work with Ruby, the safe navigation operator works this way: if the object on the left is nil, then the result of the operation is nil and nothing else is executed; if the object on the left is not nil, the execution continues to the right. When we replace this with a dot, then if the language code is nil we just get a runtime exception, because a nil object doesn't have a method called upcase. That's the difference. So we apply this mutation, and what tests do I need to kill it? "Should not raise", OK, and how do you assert that? Well, at least I don't know how to assert that there was no exception, but I can set this variable to nil, execute the method under test, and if there is an exception the test will fail anyway, so I don't have to assert there wasn't an exception; instead I can assert that the result was nil. If the framework can assert that an exception was not raised, then of course that's a valid answer, but I don't know how to do this with Ruby, sorry. Ah, OK: if you read the examples exactly as they are on the slides, that's a problem; what I haven't shown is that this variable has a non-nil value by default. Sorry, so that works then; yeah, good catch.

Now let's try and find some bugs with mutation testing. One bug that I was able to discover just by using mutation testing in a project, without actively looking for bugs, is this. We have a class called Network which represents the networking settings on your Linux computer, and there's an attribute called device which is the name of the network interface. This method for equality is obviously wrong, as highlighted here, and we'll see why. If device is not None and not an empty string, which means it has some meaningful value, that first piece of the boolean expression always evaluates to false, and so the entire return value is always false; it doesn't matter what the other object you're comparing to is, this method always returns false. And if device is None or an empty string, that first piece evaluates to true, and the boolean expression depends entirely on the second part. That's actually the fix: just remove the first part of the boolean expression, and there we go. The reason this stayed undetected for, I guess, about seven or eight years, is that there wasn't a single test in the test suite which tried to exercise this equality method when one of the attributes was an empty string or None; they were always testing with valid values, so this went undetected for many, many years. Also, in the software under test, under normal conditions this attribute always has some value, so nothing obviously bad happened.

Another bug which I was able to find is shown here. I have two classes; the second class inherits from the first one, and you see both classes have a parameter called speed_limit in the __init__ method, and both parameters have default values. Now, this is perfectly valid Python code; there's nothing wrong with the way it's written. I run this through mutation testing, and I get a surviving mutant. The reason is a constant change: the tool for Python adds plus one to any integer constant to see what happens. And I immediately know the reason for the zombie: I don't have any test which creates an object from this class and asserts what the default value of this attribute should be. So I create my test like this, just create an object from the class under test and assert what the default value should be, and the test immediately fails, of course. The reason, if you look closely, is that this parameter is not being sent to the __init__ method of the parent class; it must go here, after self. When I'm not doing this, the parent class uses its own default value, so it uses 50 instead of 90. That's why my test failed at first, and this is also very subtle: it stayed undetected for many, many years. The reason it wasn't detected is that the attribute this value is assigned to is never used in the software under test; it was meant to be used by external clients of the software. The software under test in this case is a library, it's used by other tools, and apparently nobody bothered to check whether or not the default value is what it should be. Another
possible bug, very close to the previous one: again I have two classes, the second class inherits from the first one, and again I have a parameter with a default value. This time, notice, I am actually sending it to the __init__ method of the parent class, so everything must be fine. I run this through my mutation testing tool and I get the same surviving mutant. So again I write the same type of test: create an object and assert what the default value of the attribute should be. This time I get another type of exception: an AttributeError, telling me that an object from this class does not have an attribute called speed_limit. At this point I start wondering why this is. I look here, everything looks cool; I look there, everything looks OK; and I traverse back to the parent class, and immediately I notice that, first, there's no parameter called speed_limit, and something starts to look not right. Then I look in the body of the parent __init__ method, and nobody cares about whether or not there's a parameter called speed_limit; nobody sets an attribute called speed_limit. So that's the problem. One possible fix, if you really need this, is to just take care of it in the class under test and set the attribute.
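Both default-value bugs can be sketched like this (class names are hypothetical, mirroring the speed_limit example from the slides); the test that mutation testing pushed for, "create an object and assert the default", exposes each of them:

```python
class RoadA:
    def __init__(self, speed_limit=50):
        self.speed_limit = speed_limit

class MotorwayA(RoadA):
    def __init__(self, speed_limit=90):
        # Bug 1: speed_limit is not forwarded, so the parent's
        # default of 50 silently wins over 90.
        super().__init__()

class RoadB:
    def __init__(self, **kwargs):
        # Bug 2: **kwargs silently swallows speed_limit, and the
        # parent body never sets the attribute.
        pass

class MotorwayB(RoadB):
    def __init__(self, speed_limit=90):
        super().__init__(speed_limit=speed_limit)  # looks fine, but...

assert MotorwayA().speed_limit == 50            # surprising: not 90
assert not hasattr(MotorwayB(), "speed_limit")  # AttributeError on access
```

The fix for the first bug is to forward the value: super().__init__(speed_limit).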
Or another possible fix is to delete everything related to this parameter and not bother with it, and that was actually the fix in production.

Another thing which mutation testing is really good at is forcing you to look at your source code and refactor it. The places where mutation testing really shines are where you have if statements, comparisons, lots of boolean expressions, stuff like that. The reason is that we get many mutations in places like this: in the example on the screen we have around a hundred different mutations. Every comparison operator can lead to almost 10 different mutations, so equals here can be replaced with not-equals, with less-than or greater-than, less-than-or-equals, greater-than-or-equals, and in Python also with is and is not, and with the in and not in operators. Then the boolean and can be replaced with a boolean or, and we can negate entire boolean expressions. Also, the tool for Ruby can replace a boolean expression with the true or false constant, and I think it can also change only parts of it, so you can replace one clause with true or false and leave the rest as is. The Python tool doesn't do this at the moment, but it's fairly easy to add.

So this code goes through mutation testing and there are surviving mutants, and when you start looking at it you notice the pattern highlighted in red. So my thought is, I can delete this and pull the second block of if statements out one level, and it becomes a little bit clearer. Then I notice another pattern: whatever the value is, I'm looking up an attribute on self with the same name and doing something with it. So I can refine this even further and use the getattr function, and it becomes like this. In reality this fits in four lines instead of ten, and it's much easier to test and, actually, much easier to read, and that's a good thing.

So we've seen what mutation testing can do. It forces you to write better asserts, and in
my opinion, when you have complex software under test, we should not only assert what the return values of our methods are, but also assert what intermediate state or side effects are produced by the functions under test. You will all agree that no matter how much we try to write clean software, we always have these methods which do more than one thing at a time: they do some calculations, return some value, and, oh, by the way, they also set this attribute on the side, just so you know. That's what mutation testing is really helpful with: it helps you discover these things which you are not testing, by mutating them, so you can write your tests better. We saw we can find some interesting bugs, and we saw that we can find places which we can refactor.

The question that stands is: is my test suite good enough? And the other side of this question is: how do I measure how good my test suite is? Which metric do I use to tell whether or not my test suite is good? Now, metrics are a fairly controversial topic. I just want to mention some research that's been going on in the last few years. In 2015 at GTAC there was a lightning talk which said: coverage is not a good metric, because it doesn't give you a lot of information, so go for mutation score; use mutation testing and measure how many of the mutants are killed, and if you kill 100% of the mutants, then you're good. Then last year at GTAC there was another researcher who basically said: well, the guys from last year didn't do really good research, they didn't look at enough software, so we did better research, and we claim that coverage metrics like line coverage and branch coverage are still the best metrics in practice. They say the problem with mutation testing is that it is very expensive to compute, and in their research it gave only an additional 4% of information compared to what they already knew from coverage. So they say: use coverage, don't use mutation testing. And I decided to
do a small experiment and see which one of these researchers is right. My software under test is called pelican-ab, and it is a very small library, a plugin for Pelican. Pelican is a static HTML generator for Python which you can use to run your blog or your site on, and pelican-ab gives you one additional tag for the templating markup which you can use to do A/B experiments on your website: for example, you can change the color of links or the colors of buttons, stuff like that. The way to use this software is to define the A/B experiment variable in the shell and run the make command; if you run make github, then the site gets rendered and everything is published directly to GitHub. The way to render several versions of your website is to first start with the control version, then name each experiment, and run make in a sequence like shown on the screen. If you run these three commands, you get the control version of your website and an experiment which is called one-two-three, and everything about this experiment goes into a directory with the same name; the URL structure is updated, and you can point your users to only that experiment and see how they respond, stuff like that.

In the version under test we have 100% branch and line coverage of this software, and we also have a bug. There is a setting called DELETE_OUTPUT_DIRECTORY which is set to true by default. This setting isn't something the software under test controls; it lives in an external file where your website configuration goes, so stuff like your website name and your GitHub handle goes into that file as well. The bug is: when that setting is set to true, Pelican will go and delete the output directory, delete all the HTML files, and then start rendering into a clean directory. So the result of this command sequence, with the setting set to true, is that you have deleted your entire website and are left with only the last experiment. And imagine: you delete everything, type make github, and everything goes live. That's a pretty good way to destroy your website. So I didn't have a test which would fail if the setting is set to true, and I decided to integrate mutation testing into the project; I wrote a few more tests to achieve 100% mutation coverage, and mind you, this is a very small library, a very small plugin. And the bug was still present. This means I didn't have any test which fails when the setting is set to true, and I'm asking: OK, why do I have this many tests while this bug is still present? The answer is: you cannot discover this type of bug without looking at the external environment, and that's why we need integration tests. So I added an integration test which simulates the external environment with the settings, simulates the make command, then tries to verify what content has been rendered and whether or not it's correct. And that immediately failed, of course. So then I fixed the bug, which in reality was just changing this setting to false, and I also added a check: if it is set to true, just raise an exception in my software. And now we have 100% mutation coverage, 100% branch coverage, and at least one integration test, and I'm thinking, OK, I must be good then: if I have so many tests and I'm using all these techniques, then possibly my software is bug free. And of course this is not true. I added pylint to the project, and pylint was immediately able to discover this bug. This is a problem with how we call the super method: instead of passing the class name, I was using a shortcut, which is self.__class__. This works perfectly fine when you use the software in its intended environment, because we have only one class and self.__class__ evaluates to that particular class, and everything's fine. This becomes a problem the moment you try to inherit from the class under test, create a new class, and do something
different with it. Then, when you call the class under test's __init__ method, Python goes into an infinite loop between the parent and the inherited class's __init__ methods. If you want to learn more, just head on to this pull request; there's a very detailed description of why this is a problem. I am guilty of using this shortcut in Python, but I've seen it in many, many projects online on GitHub, and I've seen it in popular software which is used by many people. So this tells me people don't have a very good understanding of what self.__class__ actually means and how it works, and that's why we keep using it in the wrong way.

To conclude the mutation versus coverage topic: this is a link to my blog describing my experiment in more detail, and there are links to the actual git commits, so you can see what code was changed and how it was changed. I think when we first start doing testing, what we look at is how much coverage we have, so we strive to write more tests to cover as much as possible of the software under test, and that's a good thing, until at some point we get to 100% coverage and we don't get any more information out of this metric, so it doesn't do us any good. Then we start looking at mutation testing, and mutation testing tells us: OK, now you have some coverage and you're testing some stuff, but there's a lot more you can test, and this is what you need to do. So we do mutation testing and get to 100% mutation coverage, and when we get to 100% we don't get any more useful information out of that either. And we also need integration tests, because of the external environment. As you see from the examples, we have different types of environments: one type is the regular environment which our users will be using, and another possible type is a developer just taking our software and trying to do something else with it, or build on top of it. And we
We have no idea what those environments will be or how people will want to use our software, which is why we need different types of tests, and possibly different types of tools, to detect the problems and deal with them. I do need more examples on this topic, and I will continue exploring it throughout the year, so if you have examples you can share with me, or publish something to GitHub, please send me an email.

Now to something more practical: speed of execution. As you can imagine, mutation testing is very slow, and to measure how slow, I took a real-world project from Fedora called pykickstart. It is a text parser library used by the Fedora installation program: a medium-sized project with a little over 100 files, all of them Python modules, without many dependencies between them, which is good. Each module does some checking with a few if statements, there are hardly any loops in the code, so it is very easy to understand: it parses some text and writes some text as output. It is a library meant to be used by other programs, so it doesn't really do anything useful on its own. The project has a fairly good test suite, over 90% coverage with a lot of tests, and the other good thing about it is that the files under the source directory map almost one-to-one with the files under the test directory: they have the same names.

The first thing I did was take Cosmic Ray, the mutation testing tool for Python, and tell it: here is the source directory, load into memory every module you can find under it and produce all the possible mutations; and here is the test directory, use the test runner to discover every test case you can and start running. On my computer that took over six days. Then I became smarter and wrote a small script that goes through the source directory, takes only one file, and loads just that into Cosmic Ray, so mutants are produced only for that module and nothing else, together with the single matching file from the test directory, so only those tests are used and nothing else. That was faster. I also added an option called fail fast, which means that whenever one of the tests fails, we know the mutant has been killed, so don't bother executing the rest of the test cases; this is an option for the test runner. And I did some refactorings: things like `if len(string) > 0` I replaced with just `if string`, which saved about a thousand mutations on very obvious things. I let this run, and the execution time was now a little over six hours. That is a 20x improvement in speed, but still quite slow for any practical purpose.

So, the way to use mutation testing at the moment, in my opinion: if you are testing a very small library or a very small project, you can go all in with CI, configure the command line that schedules your mutation testing jobs, and let it run for 10, 15, 20 minutes; that should be fine. If your project is anything bigger than about 200 lines, this is not going to work very well. What you can do instead is create a commit hook or pull request hook which examines the payload. The first thing you can do there is schedule mutation testing only against the files which have been modified by that particular commit or pull request, which should be faster. The next thing, depending on the tooling, is to go down from the module level to the class or method which has been modified. The Python tool doesn't know anything about classes and methods, it only knows about modules, so regardless of whether you have one class or a thousand classes in a module, it loads the entire module and starts mutating everything in it.
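The refactoring mentioned a moment ago, replacing explicit length checks with truthiness, is worth a quick sketch, because it shows why it removes so many mutants:

```python
# Before: several mutation targets on one line. A mutation tool can mutate
# the len() call, the ">" operator and the constant 0, producing multiple
# mutants that all need to be executed and killed.
def has_content_verbose(s):
    return len(s) > 0


# After: the idiomatic truthiness check behaves identically for strings
# and leaves far fewer mutation points.
def has_content(s):
    return bool(s)


assert has_content_verbose("kickstart") == has_content("kickstart")
assert has_content_verbose("") == has_content("")
```

Multiply a handful of saved mutants per check by hundreds of such checks across a code base, and you get the roughly one thousand mutations mentioned above.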
The Ruby tool, on the other hand, knows about classes and methods, I think, and you can tell it to mutate only that particular piece. You can try to go even further: because this is a pull request or commit hook, you have access to the actual diff, so assuming you have tested everything before and it was fine, you can apply the diff to the source under test and schedule mutations only against the lines which were changed. That is not impractical to do. Another thing you can do is go parallel, again depending on the tools. The Python tool is built around Celery, so you can very easily hook it up to a messaging back end like RabbitMQ, schedule hundreds or thousands of messages, let your infrastructure deploy containers or virtual machines in the cloud, run your tests in parallel, and just collect the results. You can do this if you have a lot of money, of course.

Here is a list of some mutation testing tools. I have been using only the first two: I mostly use Python and contribute to Cosmic Ray, and I have used mutant for Ruby, though not very actively at the moment. The names on the right are the GitHub repositories. As far as I know, the two tools at the top are based on the abstract syntax tree of the language and are language specific: whatever they do, they work on the AST and modify its nodes. One thing I like to do is look at the other tools, especially in terms of mutation operators, and try to bring their ideas to Python. So if you want to actively use mutation testing, I really advise you to look at the tools for other languages and see how they work, because many of these tools are very new and not very mature. On the bottom there is another mutation testing tool, the Mull project, which Alex Denisov will be presenting later today.
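Going back to the commit-hook idea, a minimal sketch might look like this. The git invocation is standard; the mutation command itself is left as a placeholder, since the exact invocation depends on your tool and its version:

```python
import subprocess


def python_files(names):
    """Keep only the Python modules from a list of changed paths."""
    return [n for n in names if n.endswith(".py")]


def changed_files(base="HEAD~1", head="HEAD"):
    """List the files modified between two git revisions."""
    out = subprocess.run(
        ["git", "diff", "--name-only", base, head],
        capture_output=True, text=True, check=True,
    ).stdout
    return out.splitlines()


def schedule_mutation_jobs(files):
    # Placeholder: invoke your mutation tool here, one job per changed
    # module, instead of mutating the whole source tree on every push.
    for f in files:
        print(f"would schedule mutation testing for {f}")


# Typical hook usage (needs a git checkout):
#   schedule_mutation_jobs(python_files(changed_files()))
```

A pull-request hook works the same way, except the two revisions come from the webhook payload instead of `HEAD~1..HEAD`.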
I will definitely be checking that out. It is an LLVM-based tool, so it should work for several different languages; if you are into mutation testing, check that one out as well. Now, before I go further, I can take some questions from the audience.

Okay, so the question is: if we have a surviving mutant but we have 100% code coverage, doesn't that mean the coverage was wrong? Well, no, it doesn't mean the coverage was wrong; the measurement itself was probably correct. There are other problems with coverage, though. For example, if you have one line with a long boolean expression, that line gets counted as covered regardless of how much of the expression is actually evaluated. You may be evaluating only the first part of the expression and still cover the line, while the next ten parts of the boolean expression are never evaluated. As for how to account for these cases, you should take that up with the authors of the coverage tools, but there are many, many publications online about the problems with coverage and why it is not really a good metric. For me, coverage is a vanity metric: it tells you how many of your lines you have executed, but nothing more. And if a mutant survives on a line which was covered, it simply means you executed that line but probably didn't assert on some condition, or you asserted on one condition when you needed to assert on two.
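The point about boolean expressions is easy to demonstrate. In this sketch (the names are mine), a test reaches the line and coverage marks it covered, yet half of the expression is never evaluated:

```python
evaluated = []


def is_public_check():
    # Record that the second clause was actually evaluated.
    evaluated.append("is_public")
    return True


def allowed(is_admin):
    # One line, one expression: line coverage marks this line as covered
    # even when short-circuiting never reaches the second clause.
    return is_admin or is_public_check()


assert allowed(True) is True   # the line is "covered"...
assert evaluated == []         # ...but is_public_check() never ran
```

Mutation testing flags exactly this gap: a mutant that removes or alters the second clause survives an admin-only test suite, telling you that clause is untested.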
Okay, next question. The question is: what if we don't use primitives but some library for the business logic, how does mutation testing come into play? Well, that is very dependent on the tools you use. The Python tool, for example, until recently was very poor on mutation operators, because it is new and not many people use it. The Ruby tool called mutant, on the other hand, is in my opinion one of the best tools currently in existence: it has tons of mutation operators and understands tons of conditions, because it is used by people who work on commercial software, who get paid to write test suites for that software, and they support the tool for that reason. About JavaScript I really have no idea; I don't use JavaScript. But the takeaway here is that you need to know your tools very well and your software very well, and then you can decide whether or not this is going to be useful. Maybe in your case you might write a plug-in or an extension to the tool and produce mutations based on the functions in that particular library.

Okay, question here: equivalent mutants, can you say something about them, how often do you see them in practice? So the question was about equivalent mutants. Equivalent mutants are mutations which change the code but don't really change the behavior in any practical way. For example, we may have a less-than operator which in practice is equivalent to less-than-or-equal, depending on the values we accept in the application. I don't have any concrete measurements from practice, but I think about ten or fifteen percent of the mutants I see are equivalent.
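A classic sketch of such an equivalent mutant:

```python
def total(values):
    # With i starting at 0 and incrementing by exactly 1, the conditions
    # "i < n" and "i != n" are indistinguishable for every reachable
    # input: a mutant that swaps one for the other changes the syntax
    # but not the behavior, so no test can ever kill it.
    i, acc, n = 0, 0, len(values)
    while i < n:          # an "i != n" mutant here is equivalent
        acc += values[i]
        i += 1
    return acc


assert total([1, 2, 3]) == 6
assert total([]) == 0
```

Any assertion you write passes for both the original and the mutant, which is why such mutants end up counted against a tolerance threshold rather than fixed.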
Because the syntax changes but the behavior doesn't, the test suite doesn't fail, we cannot kill these mutants, and they just stay in the project as they are. In the projects where I use mutation testing, I usually set a threshold, around ten percent or something like that: if the mutation score is above a certain line, we are good and the CI system goes green; if we drop under that line, we go red and inspect what is happening. That threshold is based on how many equivalent mutants I have, so I try to get some idea about that number and go from there. I have talked to other people who do mutation testing and have been doing it for a long, long time, and this is a known problem, but the benefits you get from asserting on all those different conditions, and from understanding your code base much better as a result of mutation testing, are greater than the cost of having to deal with equivalent mutants.

Okay, more questions, or no questions? We have two minutes. If there are no more questions, I will just show you this very quickly. I have started to document my findings about mutation testing, first of all so that I don't forget them, and because some of them make really good examples. This is available on Read the Docs, and also on GitHub if you would like to contribute, and since last week we also have a Chinese translation. I will skip the trivial examples and show you the last one. In Python it is fairly common to have methods for equality which compare two objects by checking that all of their attributes are equal. In this example we have a class called Sandwich: you can modify the meat of your sandwich and the bread type of your sandwich, and we say that two sandwiches are equal if their meat and bread attributes are equal, and that's about it. We have a safety check: if the other object is None, the method returns False, and the not-equals method is just a negation of equals. The way to test this under mutation testing is like so. We create two test objects, sandwich one and sandwich two; by default all of their attributes have the value of an empty string, which is not shown on the screen. The first thing we do is test for equality: they should be equal, because everything inside these objects is an empty string, and we also assert that comparing the two objects for non-equality returns False. That takes care of half of the testing. Then we test the safety check: compare to None and expect the objects not to be equal. Then comes the fun part: the mutation tool starts changing the equality checks into different comparison operators, and the way to test for that is to modify only one of the attributes of one of the test objects, leave the rest of the attributes and the other object unchanged, and do this for every single attribute. When all the attributes are of the same type you can just do this in a loop, and if they are not of the same type you can adapt that block accordingly. And with that, the time is up, so thank you very much. [Applause]
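A reconstruction of that last example from the description above; the exact code on the slide may differ, but the shape is this:

```python
class Sandwich:
    def __init__(self, meat="", bread=""):
        self.meat = meat
        self.bread = bread

    def __eq__(self, other):
        if other is None:          # the safety check
            return False
        return self.meat == other.meat and self.bread == other.bread

    def __ne__(self, other):       # just a negation of equals
        return not self.__eq__(other)


# Two objects whose attributes are all empty strings by default.
sandwich1, sandwich2 = Sandwich(), Sandwich()

# Half of the testing: equal objects compare equal, and the negated
# comparison returns False.
assert sandwich1 == sandwich2
assert not (sandwich1 != sandwich2)

# The safety check: comparing to None must report "not equal".
assert sandwich1 != None  # noqa: E711 - the explicit check is the point

# The fun part: kill the mutants that swap comparison operators by
# flipping exactly one attribute at a time, leaving everything else
# unchanged, once per attribute.
for attr in ("meat", "bread"):
    modified = Sandwich()
    setattr(modified, attr, "different")
    assert sandwich1 != modified
    assert not (sandwich1 == modified)
```

The `setattr` loop is the "all attributes of the same type" shortcut mentioned in the talk; with mixed attribute types you would build each modified object by hand instead.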

