
BDD - how to make it work?


Formal Metadata

Title
BDD - how to make it work?
Title of Series
Number of Parts
141
Author
Contributors
License
CC Attribution - NonCommercial - ShareAlike 4.0 International:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
Behaviour-driven development promises evergreen documentation or human-readable executable specification - sounds great. However, adopting it takes much more than simply installing behave or pytest-bdd and writing Gherkin. This talk will show what.
Transcript: English (auto-generated)
Okay, thank you very much for introducing me and the actual talk. So please be aware that this session was marked as advanced; the red color was a warning. It's 4 p.m., so great respect from my side that you made it.
But just to be clear, I'll keep the introduction to BDD itself to a minimum. So I assume that you have either heard about it, seen something, or practiced it. So it's that kind of talk. Out of curiosity, who uses BDD or has seen it in the past?
Well, actually I guess, if my math is right, around 30%. So it's not that bad. And for the rest of you, well, I won't say I'm sorry, but the second half of the talk should be practical for you even if you don't use BDD. Okay, but also for you, a very short introduction.
BDD is a technique that comes with a lot of libraries that let you describe systems in a language called Gherkin, where at the top you name the feature and then you have scenarios. We'll be talking about examples from a subscription management system; Netflix is sold in that model, as are Amazon Prime and many other
products that are very popular these days, because it makes for perfect customers who pay and don't use the service, because you always forget to cancel the subscription. Yeah, so you have a scenario, and it's described in several steps,
like: first, given a user with expired free trial, when the user purchases monthly subscription, then purchase is confirmed, and the account is upgraded to monthly subscription. That way you describe how the system should behave, so to speak. And that's later mapped to code; that's why we need this fancy language, because it aims to be not only readable by humans but also
parseable by computers, so it's kind of serviceable. And since we're in the Python world, there are a couple of libraries to do that: pytest-bdd, its fork, or behave. But I must admit something to you: I tricked you a bit, because this talk was submitted to the testing track, and BDD is also often sold,
so to speak, as a testing technique, while the reality is that it's not about testing at all. Automated verification is only a byproduct that you get; BDD has a completely different goal, which I will shortly uncover.
But you know, that's how marketing in general works, and then comes the complexity of reality. And this is the lovely city of Łódź in Poland, which I came to you from. Yeah, so my name is Sebastian Buczyński. I work as a trainer and consultant at Bottega IT Minds. I'm also a tech lead at Sauce Labs, so as not to lose touch with reality.
Everything I tell you today is something I use, not something I invented, you know, between one training and another. Treat it mostly as lessons learned. And I have a little mission of evangelizing software engineering in Python, because, you know, a lot of the stuff that we talk about or rediscover in Python has already been solved in other programming languages.
Maybe we just need to adapt. We don't really have to go through all the bad experiences. And in my spare time, I experiment with coffee a bit. So going back to BDD, let's break it down. B stands for behavior, and we first need to define what that actually is.
And the simple definition I have is: whatever is visible on the outside of the system, or of whatever scope you're testing. (If you guys are looking for empty seats, there are some on my left, your right, at the front.) An example of a behavior would be simply stating something that the user can see,
maybe on another screen, maybe directly after the action, maybe in some report or whatever. So a classic example would be: how do we spec a registration feature with username and password? Some people test it via the database. This is not the behavior, because the user cannot look into the database,
unless they can, but that's a different problem. So a visible behavior, something a user can do after registration, is to log in with that username and password. That's how we specify behavior: what the user can do, what they can see. Other auxiliary questions are: how does it affect other features,
is there some new action available to the user, or maybe there's something they cannot do anymore. So in the case of a banning feature, let's say a login pattern is banned. The scenario: given bus asterisk is banned, when a user registers as bus astral, which kind of fits the pattern,
then the registration attempt is rejected, so they can see that, hey, this failed. And as a result, bus asterisk cannot log in. So everything is visible from the outside of the system. And what's not a behavior? Definitely not assertions about records stored in the database. It may be helpful to check this under some circumstances, but it's definitely not behavior.
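To make the distinction concrete, here is a minimal sketch of the two registration checks written against visible behavior only. `Accounts`, `BannedLogin`, and the login `buster` are hypothetical stand-ins invented for this illustration; they are not from the talk's slides.

```python
from fnmatch import fnmatchcase


class BannedLogin(Exception):
    """Raised when a username matches a banned pattern."""


class Accounts:
    """Tiny in-memory stand-in for the system under test (hypothetical)."""

    def __init__(self):
        self._passwords = {}  # internal storage: tests should not peek here
        self._banned = []

    def ban(self, pattern):
        self._banned.append(pattern)

    def register(self, username, password):
        if any(fnmatchcase(username, pattern) for pattern in self._banned):
            raise BannedLogin(username)
        self._passwords[username] = password

    def log_in(self, username, password):
        return username in self._passwords and self._passwords[username] == password


def test_registered_user_can_log_in():
    accounts = Accounts()
    accounts.register("foo", "s3cret")
    # Behavior: the user can log in. No assertion on accounts._passwords.
    assert accounts.log_in("foo", "s3cret")


def test_banned_login_is_rejected():
    accounts = Accounts()
    accounts.ban("bus*")
    try:
        accounts.register("buster", "s3cret")
    except BannedLogin:
        pass  # the rejection is something the user can see
    else:
        raise AssertionError("registration should have been rejected")
    # Follow-up behavior: the banned login cannot be used to log in.
    assert not accounts.log_in("buster", "s3cret")
```

Both checks would keep passing if the storage behind `Accounts` changed, which is the practical benefit of asserting on behavior rather than on stored records.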
And another bad habit that happens when we try to test behaviors, not implementation, is so-called over-specification. I will show you an example without explaining it a lot. It will be easier. So at the top, let's say you fetch your users from an external system.
And I assume that you use mocks or have used them in the past. So at the top, you set whatever should be returned from this external system mock. At the bottom, let's say you have two assertions. The first one is about the response, and let's say you can deduce from it that the user data is there. And then you want to be extra secure, extra sure, and double-check the mock.
You don't really have to. One check is enough. But yeah, this is simplified code; it definitely wouldn't run. Yeah, so those are two things not to do when checking behavior.
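The slide itself is not part of the transcript, so the following is only a sketch of what such an over-specified test might look like, using the standard library's `unittest.mock`. The function `list_usernames` and the `fetch_users` method are hypothetical names.

```python
from unittest.mock import Mock


def list_usernames(directory):
    """Code under test: returns usernames fetched from an external system."""
    return [user["name"] for user in directory.fetch_users()]


def test_overspecified():
    directory = Mock()
    directory.fetch_users.return_value = [{"name": "foo"}]

    result = list_usernames(directory)

    # First assertion: the response. It already proves the data came from the mock.
    assert result == ["foo"]
    # Second assertion: redundant. It pins down *how* the result was obtained.
    directory.fetch_users.assert_called_once_with()


def test_behavior_only():
    directory = Mock()
    directory.fetch_users.return_value = [{"name": "foo"}]

    # One assertion on the visible result is enough.
    assert list_usernames(directory) == ["foo"]
```

The second test survives a refactoring that, say, calls the external system twice or with extra arguments, while the first one breaks even though the behavior is unchanged.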
The same goes for asserting on anything that's not part of the, so to speak, public API, because behavior can be defined not only at the user interface, but also at the level of the API. And whenever we specify behavior, we should focus on what, not how. I will show you some more examples later. So this puts the first bullet point on our list of how to do BDD,
namely the expected behavior, something that is visible on the outside. Think about it like a black box metaphor: we're not interested in what's inside, we're interested in what's visible on the output. Yeah, so going to Gherkin. Like I said, this language is a bit tricky, because it has two purposes.
One of them is that humans should be able to read it, and this will be the main goal; we'll focus mostly on that. The second is that it should be parseable by computers, and as a result we get something we call an executable specification. So you can read it, and it feels natural. I mean, I'm not a native English speaker,
so for me it's not that awkward, but I suppose that for native speakers it might be a bit. But it's a trade-off. Nonetheless, if nobody's going to read the BDD specs you write, that kind of defeats the purpose of having them. Maybe you need another approach.
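To illustrate what "parseable by computers" means in practice, here is a library-free sketch of how a runner could map the free-trial scenario from the introduction onto Python functions. Libraries such as pytest-bdd or behave provide their own decorators and parsers for this; the `step` decorator, `run_scenario`, and the context dictionary below are simplified assumptions, not their APIs.

```python
import re

STEPS = []  # (compiled pattern, function) pairs, filled by the decorator


def step(pattern):
    """Register a function as the implementation of a Gherkin step."""
    def register(func):
        STEPS.append((re.compile(pattern), func))
        return func
    return register


@step(r"a user with expired free trial")
def user_with_expired_trial(ctx):
    ctx["account"] = {"plan": "free trial", "expired": True}


@step(r"the user purchases (\w+) subscription")
def user_purchases(ctx, plan):
    ctx["account"] = {"plan": plan, "expired": False}
    ctx["purchase_confirmed"] = True


@step(r"purchase is confirmed")
def purchase_is_confirmed(ctx):
    assert ctx["purchase_confirmed"]


@step(r"the account is upgraded to (\w+) subscription")
def account_is_upgraded(ctx, plan):
    assert ctx["account"]["plan"] == plan


def run_scenario(text):
    """Execute each line by finding the step whose pattern matches it."""
    ctx = {}
    for line in text.strip().splitlines():
        # Drop the leading keyword (Given/When/Then/And).
        sentence = line.strip().split(" ", 1)[1]
        for pattern, func in STEPS:
            match = pattern.fullmatch(sentence)
            if match:
                func(ctx, *match.groups())
                break
        else:
            raise LookupError(f"no step matches: {sentence!r}")
    return ctx


SCENARIO = """
Given a user with expired free trial
When the user purchases monthly subscription
Then purchase is confirmed
And the account is upgraded to monthly subscription
"""
```

Calling `run_scenario(SCENARIO)` executes the four steps in order; the same text stays readable for a person who never opens the Python file.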
So if we compare these two scenarios, it's clear that one of them is more readable than the other. I won't tell you which one; I guess you already know. The second one, on your right, actually has more problems, but I will get to them in a second. Yeah, so we want our scenarios to be nice and sweet,
like the one on the left, which can be grasped, you know, without much effort, because otherwise, again, it defeats the purpose: nobody is going to read them. But before we jump into the details of how to write such specs, let's talk about the development process.
So maybe let's start from recipes that actually work. It's very nice if we can do it in a collaborative way. There is a lightweight workshop called Three Amigos, where we have, for example, a software developer, a tester, and a project manager, or someone else from the business, or a stakeholder, and we work together on these examples, maybe using auxiliary techniques like example mapping and so on and so forth.
But the bottom line is that we don't do it solo. Another thing that can work is pairing: for example, a software developer working on a task pairs with a tester, who helps them work on the BDD scenarios.
And if, for example, your stakeholders, your business, are not directly available to you, you can also review those specs with them later. That should also be fine, because they agree or disagree with you, you improve, and so on. Why do we do that? I'll get to it in a second. What definitely doesn't work is, for example,
testers or QA departments working separately, being the only ones responsible for maintaining the specs, with nobody reviewing them. That's not going to work, because the specs will just drift away from the system. And, for example, developers don't have any incentive to make life easier for the testers. On the same note, business analysts, for example,
working ahead of the team and just handing these scenarios over also doesn't work that well, because, well, they have their understanding, but they may fail to convey everything to the software developers. And this is because of the main goal of BDD,
and if there is only one thing that you take away from this presentation, let it be this: the goal is building shared understanding between stakeholders and developers, or, for that matter, any person that works or interacts with the system. If we lose that shared understanding, we actually have a pretty big problem,
because a gap appears between how the system actually works and how business stakeholders or project managers think it works. One symptom is that something should take little time according to project managers, but it actually takes months.
That's because this gap occurs. So when this happens, you don't need BDD, you actually need this one. Yeah, or work hard on building that shared understanding. This is sometimes also called ubiquitous language, meaning omnipresent language: we are all on the same page,
we use the same names, and they are present in code, in specs, and in day-to-day conversations. So that's the actual goal of BDD. As for writing Gherkin, the first piece of advice I can give you is to use realistic, common scenarios. It's also worthwhile to test some edge cases,
but don't overdo it. Maybe BDD specs are not the best place to put them. Actually, I worked for some time with a pretty smart guy, Chris, and once he got his hands on BDD, he went, whoa, I can now automate away all my manual testing work. But the result was that he wrote a lot of tests
that took time to run but actually brought no benefit. So we deleted some of them. And testers do have the appropriate skills to choose, because when I approached him and asked, hey, Chris, when you are testing the system for regressions before a release,
are you actually executing all these scenarios? He said: no, of course not, I'm smarter than that, I choose the ones I execute. You should do the same with BDD, because it takes effort and time to write and maintain those tests. The other thing is to be specific in your Gherkin:
avoid expressions like "less than or equal to 10" or something like this. This is an executable specification, and there is the word "specific" in "specification" for a reason; if someone reads it and doesn't quite get how exactly it works, that also defeats the purpose.
And last, but I guess most important, and what always gets people into trouble when they start with BDD: put into the spec only what matters. We're tempted to put in a lot of details, as if we were writing code, but many times they are not relevant. So for example, let's say we have two specs.
The first one is for authentication based on username and password, and the second one is about banning users. You've kind of already seen this one. So in the first case, when we spec the behavior, we use both username and password. These two pieces of information are relevant to understanding the authentication feature.
But when we are specifying banning users, we actually don't need the password to describe the behavior. It's irrelevant. Of course, we need to have some password under the hood, but we don't have to put it in the spec, because it's unnecessary. And what not to do?
Definitely don't write scripts in Gherkin. This is a script. It tells you exactly what to do step by step, but it's hard to deduce how the system actually works. You'd rather be descriptive and declarative: describe what the user can see, what they can do, what's possible and what's not, and don't explain like a robot what to do step by step,
because it's not really helpful. So whenever you're tempted to do so, just don't. You are describing the feature using behavior, not trying to tell step by step what happens. Another piece of advice: don't get lost in UI details.
Whether automating BDD on the UI is a good idea is a slightly different topic. We'll get to it in a second. Also, don't try to make individual Gherkin steps as generic and reusable as possible. That's the case with this username and password. If you already have a step that says there's a user registered with this username and this password,
and in another scenario, like the banning one, you don't need the password bit, just write another step. You don't have to worry about having a large number of them. The goal is readability.
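A minimal sketch of "just write another step": two step functions with different wording share one helper, and the helper supplies the password that the banning scenario does not care about. The function names and `DEFAULT_PASSWORD` are hypothetical, and the dictionary stands in for the real system; with pytest-bdd or behave, each function would carry the library's own step decorator.

```python
DEFAULT_PASSWORD = "irrelevant-but-valid"  # hidden from specs that don't need it


def register_user(system, username, password=DEFAULT_PASSWORD):
    """Shared helper in the automation layer; `system` is a stand-in dict."""
    system[username] = password


# Step text: Given a user registered with username "<username>" and password "<password>"
def given_user_with_password(system, username, password):
    register_user(system, username, password)


# Step text: Given a user registered with username "<username>"
def given_user(system, username):
    register_user(system, username)
```

The duplication lives in the Gherkin wording, where it helps readability; the code behind the steps stays deduplicated.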
On the topic of reuse, there was, or there still is, a feature in the behave library that lets you execute steps by just passing them as a string: you name those steps and behave executes them. Back when I was working with Chris, the same guy, we thought this was a pretty good idea. But the reality is that Gherkin is not meant to be reusable.
So this was pretty dumb. We got a lot of maintainability issues from it. For example, when you have Gherkin and you use PyCharm, it has an integration that lets you jump from the step definitions to the Gherkin files and back, both ways.
In this case, that's not even supported. Whenever we changed something, a lot of tests that were doing execute steps crashed. So also, don't try to reuse Gherkin, because that's not how it was meant to be used. This puts a second point on our list of how to do BDD.
Once we know the expected behavior, we need to capture it using Gherkin. It's best to do this collaboratively, or at least show it to another person so they can tell you whether they understand it the same way you do, because it's meant to be read later. And if someone tries to read it but doesn't get the information,
then what was the point of the entire exercise? And we want to make specs short, simple, specific, and to the point. So you might have the impression that the advice so far was basically: keep it simple and everything will be okay. But getting to simple is not easy, especially if the system you have under the hood is complex. We can describe it simply,
but that complexity is still lurking somewhere, so we need to manage it. From my experience: one guy I talked to at a training said, okay, yeah, we tried BDD, but those scenarios quickly became too long and too hard to maintain.
Yes, exactly, because perhaps you lacked some tools, or your architecture wasn't supporting it. I will now show you some examples of how to actually take them from complex to simple. Let's say we're building this software-as-a-service platform for subscriptions.
Within that, we can identify different clusters of features. For example, we have something around user access management: registration, password reminders, and so on. Plans management, where you, for example, define what can be bought: there is a monthly plan that is priced at $10,
and there is an annual plan that's priced at $100, and what benefits they bring, and so on. Then you have subscriptions, which actually let you subscribe to concrete plans, and payments. We could pretty much go on identifying those sub-areas until we get bored or tired.
But it's very important to do so from the standpoint of another technique called domain-driven design, which means: first look at the problem in its entirety, all the sub-problems you have, and then use this information to make decisions about modularity,
the scope of your tests, and so on. You will see examples in a second. In domain-driven design, these sub-areas are called subdomains, and the entirety of the knowledge required to work on this project, or in the company that's solving the problem, is called the domain. But that's just extra.
Let's go back to BDD. If we want to make our BDD specs simple, we need to limit their scope, and we can only do so if we can distinguish different areas. This is a general piece of advice for designing systems: if you want to make them simpler and manageable,
you need to limit the amount of information. And there are lots and lots of problems that you get rid of when you get the modularization of the system right. So if writing BDD specs, or testing in general, hurts, if it's not easy, then perhaps your architecture is to blame.
Because in Python, you know, we can go to great lengths, we can monkey-patch everything we want to, but maybe the effort should be put elsewhere. So going back to our system, let's assume we have these four areas of features, or subdomains: users management, plans management, subscriptions, and payments.
And we have the following scenario, which uses all of them. I'm doing this on purpose, because we will shortly split it into smaller, manageable pieces. So it starts with a user registered with username foo. Okay, there's no password; that's some progress. Then let's assume we have a plan called monthly
that has a specific price and a specific benefit. And whenever a user subscribes to this monthly plan, they should be subscribed to the plan, they should be charged $10 minus $0.01, and they should have fast support in their benefits. And the user's next bill date should be a month from now.
So this is pretty huge. And when I say small, nice specs, I mean like three, four steps at maximum. So definitely something to work on because we have like seven. But to simplify, we need to first limit the scope because it appears that we are trying to do everything at once
in this one little spec. We're trying to describe all the behavior and that may or may not be manageable in the longer term. So the first thing we can actually do is to decide which things we need to focus on.
So there is a lot of stuff intertwined here. Let's say I will make this scenario specifically about cash: how much I pay, when the next bill date is, and so on. So I can limit the amount of detail. The first little refactoring of this scenario I can do is to move one of those steps to the background.
This is per single feature: the background is reused across all scenarios in it. Of course, we always need something to subscribe to, but it's not important to have it in this particular scenario, because we're doing nothing special with it. And actually, we don't care about the benefits themselves, because I'm only interested in the focus I mentioned,
the price I pay. So we can get rid of one step and simplify the other one. It's already better. And then, since the focus is on the money, we can omit the first then step, "user should be subscribed to plan monthly". So yeah, we get rid of another step.
So we're at four. So this is one approach: we remove steps that are not necessary, because we simply decide not to check them at all. And we also remove some details from other steps that may be necessary, like this background given plan.
We don't care about benefits, so we also simplify the individual steps, as it was with username and password for banning, when we didn't care about the password. And another thing: we can make some things implicit, because in such a system there's always a user.
Maybe I don't need to spec it. So I will simply remove this given user, because I don't really need it. So in this case, simplifying means limiting the number of details, either because I don't need them specifically or because they're not relevant for this scenario. So this puts another point on our how-to-BDD list.
Once we have the expected behavior, we capture it using Gherkin, using some collaborative process so as not to do it solo. We keep our specs short and simple. And afterwards, we need to think about architecture and boundaries, so that our scenarios, you know, are not too wide, not trying to test everything at once,
because that's not going to work in the longer term. Nobody's going to read it, and we'll have a bad time updating it later. Okay, but if we make the Gherkin simple, so it's easy to digest and easy to read, and the underlying system is too complex, we've just shifted the problem elsewhere, right?
So, okay, there are no more dragons in Gherkin, but they do live somewhere, right? Yes, of course. And that place is called the automation layer, or, simply put, all the code that sits right behind these steps. In this place, we'll focus on making the solution maintainable, the code deduplicated, and so on.
But before we jump into implementation, let's first talk about which layer we should implement our BDD specs on. What I see in examples over and over again is starting with the UI, maybe with Selenium or some similar solution. I don't think that's the right approach in most cases,
especially since single-page applications, you know, exploded and became very popular. For example, when I open Facebook on my computer, on my profile, it sends 140 HTTP requests to the backend. And I assume that at least half of them actually do something, apart from tracking me, of course.
But it also means that, you know, they probably read some data from the database, maybe some do calculations, and so on. And I need to set up my system for that. So in general, the user interface will have more dependencies, which means more setup, slower tests that are harder to maintain, and so on.
So, from experience, I've learned that the API level should be considered first for automating BDD. But of course, if it makes sense in your case, you can go up or down. It's just my default: I start with the API. So let's now do some coding, right? Because so far I was showing you Gherkin.
You are at a Python conference, and I'm also a Python developer. So where's the meat, so to speak? Where's the actual code? So let's automate this scenario that we simplified: a user subscribes to the monthly plan. The first thing is that, if we do it on the API,
we are going to have a lot of HTTP requests using the test client of our framework, for example FastAPI in this case, and we'll be sending various requests. And the first thing we do to make this code more manageable is to abstract away the protocol. We can simply grab all those distinct requests,
the distinct methods of different endpoints, and move them to another class (or a bunch of functions in a module, which works the same way), which we'll call a client. It will just accept some parameters, and we no longer have to write the entire query.
This may not be a big issue for HTTP, because it's a pretty concise protocol. But once upon a time I worked with GraphQL, which was also used as the main API, and it was really hard for us to test and manage this, because building a GraphQL query requires a few more lines of code,
it's harder to format, parameterize, and so on, so it was much better once we introduced such a pattern. And of course, we pack it into a pytest fixture, since pytest-bdd is the library I use in this presentation. And eventually, our step definition looks like this, so it's much simpler and much easier.
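The slides are not in the transcript, so here is only a sketch of the app client idea under stated assumptions: the endpoints, the `send` callable, and `FakeTransport` are invented for illustration. In a real project, `send` would wrap the web framework's HTTP test client, and the client would be exposed to steps through a fixture.

```python
class AppClient:
    """Names the actions the system supports and hides the wire protocol."""

    def __init__(self, send):
        # `send` is any callable(method, path, json) -> (status, body).
        self._send = send

    def subscribe(self, plan):
        return self._send("POST", "/subscriptions", {"plan": plan})

    def get_subscription(self):
        return self._send("GET", "/subscriptions/current", None)


class FakeTransport:
    """Stands in for the framework's HTTP test client in this sketch."""

    def __init__(self):
        self.plan = None

    def __call__(self, method, path, json):
        if (method, path) == ("POST", "/subscriptions"):
            self.plan = json["plan"]
            return 201, {"plan": self.plan}
        if (method, path) == ("GET", "/subscriptions/current"):
            return 200, {"plan": self.plan}
        return 404, {}


# Step definitions shrink to one call plus a check on the visible outcome.
def when_user_subscribes(app_client, plan):
    status, _body = app_client.subscribe(plan)
    assert status == 201


def then_user_is_subscribed(app_client, plan):
    status, body = app_client.get_subscription()
    assert status == 200 and body["plan"] == plan
```

If the API later moved from plain HTTP to GraphQL, only `AppClient` would change; the step functions and the Gherkin would stay as they are.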
So the first pattern we can use is the app client, to abstract away the protocol. We name the methods after the actions that we can do on the system, so our steps can be much, much smaller. But when we are dealing with an API,
we're usually making requests on behalf of some user. And like I said, I decided to make this implicit, so I don't explicitly name which user the requests are made on behalf of. With HTTP and, for example,
a standard method like an Authorization header with a bearer token, this is pretty simple, because we can simply parameterize our app client with this user token, provided as another fixture, keep it inside, and just reuse it. So whenever we make a request using the app client,
we don't have to deal explicitly with authentication or with who sends the request, because it's part of the app client. And if, for example, we have a test scenario that needs a few users, then we can have a few app clients, a separate one for each user. And now about this user token.
Of course, we can do it the hard way, the usual way: register the user, then log in, then grab the token and return it from the fixture. But in many systems that, for example, use microservices and are correctly decoupled, what we actually need is a token. Maybe sometimes the user doesn't even have to exist.
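As an illustration of the token idea, here is a sketch that builds a signed token directly instead of registering and logging in. It assumes the service accepts HS256-signed JWT-style tokens and that the test suite knows the signing secret; the secret, the claim names and both helper functions are hypothetical and not taken from the talk.

```python
import base64
import hashlib
import hmac
import json
from typing import Dict

# Assumption: a test-only signing secret shared with the service under test.
TEST_SECRET = b"test-only-secret"


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT segments are encoded."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_user_token(user_id: int, secret: bytes = TEST_SECRET) -> str:
    """Build a signed token directly, skipping registration and login."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps({"sub": str(user_id)}).encode())
    signing_input = f"{header}.{payload}".encode("ascii")
    signature = hmac.new(secret, signing_input, hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url(signature)}"


def auth_headers(token: str) -> Dict[str, str]:
    """Headers an app client would attach to every request it sends."""
    return {"Authorization": f"Bearer {token}"}


token = make_user_token(user_id=42)
headers = auth_headers(token)
```

An app client constructed with such a token can attach `auth_headers(token)` to each request, and a scenario with several users simply creates one client per token.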
And in these cases, it may be very handy to just take the shortcut and generate the token another way. We used a similar approach in our tests that were using Selenium on the UI. So in every test, you have to go through the login, right? No, you don't. You just need a cookie
that's valid. So instead of logging in over and over again in our scenarios, we just generated the cookie and added it to subsequent requests. Right, so to recap, we have our Gherkin that is optimized for readability for humans. Then we have the step definitions in code.
And one of the design patterns we can use is the app client, with which we abstract away the protocol used. And in our case, app clients will always be associated with some user credentials, because all requests are done on behalf of some user. OK, so let's now focus on the second step: the user should be charged some amount of dollars.
And it's natural that we will not be implementing our own integration with banks and payment systems, but we'll use something off the shelf that's ready to use, like Stripe or Adyen or whatever. And what I often see, and I perceive it
as a regrettable practice, is that we go straight away to monkey patching the client library. Don't do this. This is wrong. This is very wrong. First of all, there is a general piece of advice for testing: don't mock what you don't own. And what it means is: don't simply mock code
you haven't written or have no control over. Because whenever, for example, you upgrade the library and you don't quite catch a subtle change, you will simply fail to notice it. You mocked one behavior, but the library, for example, can now behave a different way. Also, for example, Adyen requires quite a lot
of arguments to call. So why burden your tests and put a mess in them when you can make it a bit simpler? And the second reason is that the concrete payment provider actually belongs to payments, not subscriptions. So therefore, it could be hidden from the outside world.
And we don't have to put this information in, in BDD especially. And what we can do about that is use another design pattern called facade. And for each of our modules, like user management, payments, subscriptions, and plans management, we can have facades that will expose the API
in the form of a single class. It's, of course, one of the ways. So we'll, for example, have a method like create recurring payment. And it can, of course, use Adyen inside. But in BDD tests, we don't have to go that far down to actually make sure that this will happen. Speaking of type hinting and type checkers,
of course, we need those to actually ensure that this will be working later. And a facade doesn't have to be a class, obviously. It can be just a bunch of functions or a bunch of classes. The point is to just have one entry point to any given module or component of your application.
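A rough sketch of such a facade for the payments module might look as follows. The `PaymentProvider` protocol, its `charge_recurring` method and the `FakeProvider` are invented for this example and do not mirror any real provider SDK; the only name taken from the talk is the "create recurring payment" method.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import List, Protocol, Tuple


class PaymentProvider(Protocol):
    """Hypothetical port for whichever provider the payments module uses."""

    def charge_recurring(self, customer_ref: str, amount_minor: int, currency: str) -> str:
        ...


@dataclass
class PaymentsFacade:
    """Single entry point to the payments module."""

    provider: PaymentProvider

    def create_recurring_payment(self, user_id: int, amount: Decimal) -> str:
        # Provider-specific details (minor units, currency, references)
        # stay behind the facade and never leak into other modules.
        return self.provider.charge_recurring(
            customer_ref=f"user-{user_id}",
            amount_minor=int(amount * 100),
            currency="USD",
        )


class FakeProvider:
    """In-memory stand-in used here only to make the sketch runnable."""

    def __init__(self) -> None:
        self.charges: List[Tuple[str, int, str]] = []

    def charge_recurring(self, customer_ref: str, amount_minor: int, currency: str) -> str:
        self.charges.append((customer_ref, amount_minor, currency))
        return f"payment-{len(self.charges)}"


provider = FakeProvider()
payments = PaymentsFacade(provider)
payment_id = payments.create_recurring_payment(user_id=5, amount=Decimal("9.99"))
```

Callers such as the subscriptions module see only `create_recurring_payment`; swapping the provider changes nothing outside the payments module.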
So you know where to stop when you, for example, implement BDD. And then, since we have our facade, which we said is an API, so by definition it will be more stable and also owned by you, we can resort to using our standard monkey patching
and, for example, assert that the exact amount of money expected has been charged. And last but not least, there is also a step with the monthly plan. So, of course, we could naturally use the plans facade inside our tests to just create the plan
and store it in the database, so that it will be checked later. But we can also just mock the method that will be used by subscriptions, in this case getPlan. So we limit the scope of the BDD; you know, the actual plans module is not exercised here.
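To make the mocking step concrete, here is a sketch under a few assumptions: the three facade classes below are simplified stand-ins, the plan is a plain dict, and the method is spelled `get_plan` in Python style. It uses `unittest.mock.create_autospec` with constructor injection rather than patching in place; `mock.patch.object` on the facade methods would serve the same purpose.

```python
from decimal import Decimal
from unittest import mock


class PaymentsFacade:
    def create_recurring_payment(self, user_id: int, amount: Decimal) -> str:
        raise NotImplementedError("would talk to the real payment provider")


class PlansFacade:
    def get_plan(self, plan_id: int) -> dict:
        raise NotImplementedError("would read the plan from the database")


class SubscriptionsFacade:
    def __init__(self, payments: PaymentsFacade, plans: PlansFacade) -> None:
        self._payments = payments
        self._plans = plans

    def subscribe(self, user_id: int, plan_id: int) -> str:
        plan = self._plans.get_plan(plan_id)
        return self._payments.create_recurring_payment(user_id, plan["price"])


# Mocks follow the facades' real signatures, so a renamed or re-ordered
# argument makes the test fail instead of silently passing.
payments = mock.create_autospec(PaymentsFacade, instance=True)
plans = mock.create_autospec(PlansFacade, instance=True)
plans.get_plan.return_value = {"id": 1, "name": "monthly", "price": Decimal("9.99")}
payments.create_recurring_payment.return_value = "payment-1"

subscriptions = SubscriptionsFacade(payments, plans)
result = subscriptions.subscribe(user_id=5, plan_id=1)

# "User should be charged" becomes an assertion on our own, stable interface.
payments.create_recurring_payment.assert_called_once_with(5, Decimal("9.99"))
```

The plans module is never exercised, and the payment provider's client library is never touched by the test.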
We have only mocked this getPlan part. And regarding the implementation, it's much nicer right now because we have this mock on a stable interface. But there's one thing I still don't like. I deliberately, you know, removed the benefits part
from the BDD spec because I said it wasn't relevant. So why the heck do I have to repeat it, put it into the step definition? Turns out I don't really have to. And for that, okay, sorry. And for that purpose, we can use, for example,
or just, you know, write a custom function with default arguments. So we have something called sensible defaults. So in the general case, we can have, for example, a plan without benefits, because maybe they are optional. Maybe we manage them another way. And we are close to the end of the presentation.
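The "custom function with default arguments" mentioned above could be sketched like this. The `Plan` fields and the default values are assumptions chosen for illustration; the point is only that a step definition passes what the scenario actually mentions and nothing else.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import List, Optional


@dataclass
class Plan:
    name: str
    price: Decimal
    period_days: int
    benefits: List[str] = field(default_factory=list)


def make_plan(
    name: str = "monthly",
    price: Decimal = Decimal("9.99"),
    period_days: int = 30,
    benefits: Optional[List[str]] = None,
) -> Plan:
    """Build a plan with sensible defaults; override only what matters."""
    return Plan(
        name=name,
        price=price,
        period_days=period_days,
        benefits=list(benefits or []),  # copy, so callers never share a list
    )


# The scenario only cares about the price, so the step passes just that.
plan = make_plan(price=Decimal("20"))
```

Benefits stay out of both the Gherkin and the step definition until a scenario actually needs them.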
And after that, I put the last, the final point on how to BDD. You can't really do BDD in any non-trivial system that lacks modularization. You need to take care of, you know, architecture first, and then, to make the code reusable and maintainable,
use various patterns in the automation layer, such as app clients, facades and so on. So that will be the simplest, and factories, of course, too. So apart from the most important takeaway for you, that the goal of BDD is not testing but rather building shared understanding,
the other one is to verify behavior. This is also a good piece of advice in general for tests. Yeah, testing implementation details makes your code difficult to refactor later. When you verify behavior, I mean when you get closer to it (of course, not always on the UI), then it will be just easier for you to evolve later.
And also, sometimes you don't have to write that many tests, because maybe two scenarios on the top level will be enough to cover everything. Yeah, and I can't stress this enough: building shared understanding is really the point of BDD. It's not tests, it's not writing Gherkin itself,
it's to build shared understanding, which also means that when that understanding changes (because, for example, some new requirements come, or maybe you hire another expert who says it's all wrong and should be done differently), then you also need to be able to refactor those specs. It's not something you write once and forget about;
it should, you know, evolve along with your understanding. And regarding further reading, if you are more interested in the topic, the first position, which I recommend the most, is the Specification by Example book by Gojko Adzic, I think. That's how it's pronounced.
And it has been a goldmine. It's pretty exhaustive because it has three or four case studies of actual companies, so there's a lot of knowledge in there, a lot of practical knowledge, a lot to choose from. The second one is BDD in Action,
which you can read optionally, and there are also the next two articles, which are very helpful. "Whose domain is it anyway?" is from the author of BDD, and it explicitly talks about the problem when we try to put, you know, too many things at once in a single BDD spec. So that's the second read recommended from my side.
And if you, well, are not doing BDD explicitly, but would just like some general advice on unit testing, then you can start from this blog post, People Behavior and Unit Testing by Vladimir Khorikov, and I also recommend other posts on his site. Yeah, and that will be it.
Thank you for bearing with me. Thanks for the interesting presentation. I have one question on the example that you showed with the monthly subscription plan.
So I was looking at what is implicit, what is explicit, and it seemed to me that the monthly plan had an implicit 30-day expiration in the Gherkin. Was that intentional or is that, because if we look at the Gherkin, it never says like monthly is 30 days. Yeah, that's a very good question.
So yeah, this is a simplification, really. This shouldn't be hidden, especially, I mean, for this scenario, yeah, when, okay, when we look at the last step, the next billing date, this should be made explicit in this case, yeah, definitely. Okay, thanks.
Hey, great talk, first of all. I would like to ask, do you have any tips for dealing with eventual consistency on the system that's underneath? So this is a very general question about eventual consistency.
So usually we start from explicit waits done in a loop. So for example, we know that some endpoint will start returning a different answer, but I don't know when; yeah, eventual consistency, right? So we keep on polling it, meaning
querying repeatedly at some small intervals, like 50 milliseconds or so. That's the first approach, yeah. But if you have any specific case in mind, maybe you could elaborate a bit. No, it's just that on this level it seems like everything happens instantly,
and is that a problem to deal with on the spec level or rather the automation level? Okay, so that depends, as always. But in general, I'm not, you know, dodging the question. We are generally doing this on the automation level, yeah, layer.
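The polling approach described in this answer can be sketched as a small helper in the automation layer. The function name `wait_until`, its parameters and the simulated endpoint are assumptions for illustration; the 50-millisecond default interval follows the figure mentioned in the answer.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def wait_until(
    probe: Callable[[], T],
    predicate: Callable[[T], bool],
    timeout: float = 5.0,
    interval: float = 0.05,
) -> T:
    """Call probe() repeatedly until predicate(value) holds or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        value = probe()
        if predicate(value):
            return value
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"condition not met within {timeout}s; last value: {value!r}"
            )
        time.sleep(interval)


# Simulated endpoint that becomes consistent on the third query.
answers = iter(["pending", "pending", "active"])
status = wait_until(
    lambda: next(answers),
    lambda s: s == "active",
    timeout=1.0,
    interval=0.01,
)
```

A "then" step can wrap its query in `wait_until`, so the Gherkin stays free of timing details while the timeout still bounds how long a failing scenario takes.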
We do these loops with waits and we have some timeouts. But if we were to use Gherkin, you know, to specifically specify this eventual consistency, then, you know, we would put it there, yeah. But then, you know, I still find it hard to imagine;
for example, I would specify that this message comes in less than five seconds, and yeah, it kind of seems awkward then. So I would deal with it on the automation layer. Thank you for your talk. Really excited to dive in. I had a similar feeling when I first started using TDD
to kind of not just test the units, but to inform how I design a function. It seems like BDD is really nice for informing the design of your service layer. My question is: was the reason that we're dealing with the API client that this was a microservices thing? If I'm not dealing with microservices, can I just use the service layer?
Yeah, absolutely. Absolutely you can. You know, I'll just tell you why that's my approach. So even if you have a smaller monolith, not a large one but a smaller one that's still manageable, I would still do it on the API, simply because I can more easily get 100% code coverage
that way, yeah. Because if I, you know, do it on the service layer, which is perfectly reasonable, I still have to, you know, somehow cover these views, yeah. And you know, they don't slow the tests down that much, and I kill two birds with one stone, so to speak. Sure, thanks. Thank you for the talk.
So I have a question of how generally applicable do you think BDD is? Because I always see it in the context of, you know, websites, other API scenarios, but let's say I have a data pipeline which like runs on a batch job, which every midnight extracts some data from the database,
transforms it, loads it into a different database, and then there is another user who consumes the results and we need to have a shared understanding with him. Would you say BDD is a good tool for this, to agree on a specification, or would you advise using something more formal here? Okay, so the question is about the limits of applying BDD.
So in your case, I would rather not choose BDD per se, because I feel, and the team I'm currently working with also learned the hard way, that putting technical details in Gherkin is a bit awkward; it doesn't really help much, yes? So we just decided, you know, that for that part we just use plain Python and write it very technically,
because that worked better for us. And the other part of your question, which you didn't ask explicitly, is about data-driven systems, yes? Because I'm working mostly with product companies, and I deliberately choose them because I like it, and in this area BDD works better,
but what if we have something more data-driven? I cannot answer that, but Gojko Adzic in his book Specification by Example has some case studies about that explicitly, yeah? So if any one of you works with a more data-driven approach, I'm not an expert on that, but yeah, Gojko is. Thanks. Thank you.
I have a second question. Okay, I would like to know what you think about BDD as a part of the concept of tests as living documentation. I think it's a pretty utopian idea. We, you know, I'm pretty disappointed
that, you know, there aren't many successful products or consultancies built around the idea. You know, there are only a few. So I really like the idea. I'm actually, let's say, experimenting to make it happen, but the big blocker, for example, is, you know, making these specs accessible to people who are not necessarily writing code.
So for example, one person who's not writing code in my team goes to, you know, Git and can read them, but I guess, you know, for other people this is, you know, a barrier not impossible to cross, yeah. So yeah, I really, I admire the idea. I'm really surprised by the lack of success stories in the field
and don't understand it at the moment. Maybe you know something I don't know. Okay, but I will try anyway. So I work with AI, and the last six months have been crazy, and when looking at GPT and language model capabilities,
my first instinct was to ask whether it is possible to use automatic generation, either for turning natural language into Gherkin or for basically writing Gherkin steps automatically with things like Copilot, because it requires only a limited context window.
Have you tried it, or have you heard about anyone trying that? Yeah, so I can tell you that, for example, Copilot is pretty bad at suggesting Gherkin. Actually, the bad example that I showed you, the one that is a script, was generated by Copilot,
but Copilot, okay, oh, I don't have this open anymore. Because there is also this closed beta of Copilot X which, you know, ships for Visual Studio Code Insiders, which is an early build, and I asked it to provide
some examples and they were better, yeah. So perhaps if they put some more attention into it, it may be better, yeah. Awesome, thank you. Right, first things first, thank you for the talk today. You argued against the use of behavior-driven design on the UI level but, I mean, you left unsaid
or probably you just hinted at the idea of using BDD at the model level. Would you argue for or against using BDD at that very internal layer and, if so, is there any example that comes to mind? So personally, I have no objection,
because if I understand correctly, you're asking if I would be against using BDD lower, yeah, on the model. I see no problem with that personally, but I haven't done this. I think it might work. It should be good. Thank you again. Hello, thank you very much for the talk.
We are currently thinking about using BDD in our project and I'm wondering whether we should use behave or pytest-bdd, and what are the features we should consider for our decision. Right, so I worked with behave in 2018, I believe, and as far as I checked before this talk,
they haven't released any version since then, which is kind of interesting. But we also had a couple of issues with behave; for example, it wasn't really running tests in parallel, and I remember there was a merge request supposed to provide that feature. I don't know if it was merged.
We're now with pytest-bdd and I'm really fond of it. I think it does the job. It's not ideal because, for example, it doesn't support all the new syntax of Gherkin, but it's more than enough for us, so to speak. There's this fork, pytest-bdd-ng,
which I noticed had some update four months ago, and it has those features, so I would just run a comparison between those two. pytest-bdd, if you already have experience with pytest, should have a less steep learning curve, so that would be my recommendation. Thank you.
Hi, thank you for your talk. It was super interesting. I have a question around, you mentioned code coverage just before but you also said that it's not really about coverage. I wonder what success looks like and what are the cues that tell you that you have enough scenarios or that your tests are broad enough for you to consider it a good code base or a healthy one?
I don't think there is a clear success scenario because the thing is that, like with testing itself, yeah, we can have code coverage at 100%, but it's only a metric of how much code we actually executed. With test scenarios in general,
you can only write those that you figure out or Copilot suggests. So that's it. I wouldn't tell about it. We can use another metric, but it's not easy to calculate or enforce: are people actually using it, and do they find it helpful?
So that's different. For example, if these scenarios are not too big that would be my success metric around it. But really, really tough question. Thank you for this one. I will be thinking about it all night. Yeah, so we'll thank everyone and thanks to Sebastian for the talk.