The Butler and the Snake - Continuous Integration for Python
Formal Metadata
Part Number | 20
Number of Parts | 173
License | CC Attribution - NonCommercial - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers | 10.5446/20213 (DOI)
Production Place | Bilbao, Euskadi, Spain
EuroPython 2015, 20 / 173
Transcript: English (auto-generated)
00:01
Thank you, and hello everybody. My talk is, as Matt said, about continuous integration. A few words about me: I have been a Python web developer for more than 10 years, and I spend most of my professional and free time on a project called Plone. It's an open source enterprise content management system written in Python.
00:22
We have around 340 core developers worldwide, and Plone powers websites like NASA, CIA, NSA, Oxfam, the Brazilian government and many more. For around four years I have been the leader of the Plone continuous integration and testing team.
00:41
So we make sure that our continuous integration systems work and that our testing is in good shape. So what's continuous integration? I guess everybody has heard that term at some point. In contrast to what many people think, that it's a piece of software that you can just install and then you do continuous integration,
01:02
it's actually a software development practice, like test-driven development for instance. This software development practice is about team members who integrate their work into a main branch of a version control system frequently, and each of these integrations, or commits, or pushes, or whatever, is verified by an automated build and test process, and
01:24
this automated build and test process makes sure that code violations, test failures or bugs are detected as early as possible and also reported as early as possible to your developers. We all know those statistics about bugs, right: the later you detect bugs, the more
01:43
cost they cause, right? So if we detect bugs early, they're easier to track down and easier to fix. One of the other big advantages of a continuous integration system is that if you run your build and your tests automatically on a continuous basis,
02:04
then you know that your development and also your deployment environment is in a working state. So, as I said, there are three important parts in continuous integration. The first one is that you
02:20
integrate frequently into your version control system, the second that you have an automated build and test system, and the third that you report. Keep those three items in mind, because I will come back to them. Our first approach to continuous integration in the Plone community was actually Buildbot. Who here knows what Buildbot is? Oh,
02:42
quite a few people. So Buildbot is a continuous integration framework written in Python. We had it set up, but it's quite complex and, as I said, it's more a framework than an out-of-the-box solution. So you can't just install Buildbot and it will do everything that you want.
03:02
You have to really know what you're doing, and, yeah, it's hard to set up. So we barely used it. I mean, a few hardcore developers used it, but it wasn't really run on a continuous basis. It wasn't really integrated into our version control setup, and as a regular developer
03:21
you did not even notice or know about it. Around four years ago, in 2011, we introduced Hudson, which is now called Jenkins, into our process, and one of the developers who started to play around with Jenkins wrote that it's like Buildbot, but with a butler. So in comparison to Buildbot,
03:45
Jenkins is really an out-of-the-box solution. You just install it, you have to configure it a bit, but then it basically works. So that was really nice. Jenkins also comes with a nice user interface, so everybody can just go there and check the status and stuff like that.
04:04
The downside is that it's written in Java, and as a Python developer you of course always prefer to use beautiful Python software, right? But Java is a decent language and Jenkins is a very good software product, in my opinion. It has a huge open source community
04:23
around it, with many plugins. It's backed by a company called CloudBees, which offers commercial services on top of it. And we're really, really happy with it. So during my talk I will give you examples of what we do with Jenkins,
04:41
but it's not very specific to Jenkins. As I said, continuous integration is a software development practice. So it's about the practice and the rules that you have, right? It's not about the software that you actually choose. So when we moved from Buildbot to Hudson, things looked a bit better.
05:02
But we used nightly builds. I guess a lot of people do that, because your tests take quite a while and you don't want to run them on every commit, for whatever reason, so you run them on a nightly basis. Right, when everybody sleeps you can just run them for a couple of hours or whatever it takes, and next morning
05:20
you get a report to your mailing list saying: this is the list of commits, and now the build is either broken or it's fine. The problem with that is that you don't run your build for each integration. If you recall the definition of continuous integration that I gave you up front, the important part is that you run your build and test process for each commit, because that's the only way
05:46
to figure out which commit or which code change actually caused a regression, right? If you have 20 commits from different people and next morning you see, hey, the build is red, then somebody needs to clean that up, and usually the person who cleans it up is not the person who caused the violation.
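To illustrate the cost described here: when one nightly build covers many commits, somebody has to search for the commit that introduced the regression. The sketch below shows that search as a binary search over the day's commits, which is the idea behind `git bisect`. It is only an illustration under assumptions: there is a single regression that stays broken in all later commits, and `is_good` is a hypothetical callback standing for one full build and test run.

```python
def first_bad_commit(commits, is_good):
    """Return (culprit, builds_run) for a chronologically ordered commit list.

    Assumes the last commit is known to be bad and that every commit
    after the culprit is bad as well (a single regression).
    """
    low, high = 0, len(commits) - 1
    builds_run = 0
    while low < high:
        mid = (low + high) // 2
        builds_run += 1  # each check stands for one full build and test run
        if is_good(commits[mid]):
            low = mid + 1  # regression was introduced after mid
        else:
            high = mid  # mid is already broken; culprit is mid or earlier
    return commits[low], builds_run
```

With one build per commit none of these extra runs are needed, because the first red build already names its commit.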
06:05
So it's costly to do that, right, and nobody does it. If you are in a company, you can force somebody, like a poor guy or girl, to fix the stuff for other people. But in open source communities it is even harder, because there are 20 commits and people say: hey, it wasn't me, right, my commit was really clean and perfect.
06:23
So if you run your builds on a nightly basis, your build is broken 99% of the time. That's at least my experience. So our software development and release process in the Plone community was like this: the build was broken 99% of the time, and then before a release our release manager said: hey guys,
06:41
I want to make a release. And then two or three of the 340 developers, the really hardcore guys, started to fix tests for everybody else. Sometimes we had 400 or 500 test failures. We have around 9,000 tests in Plone. So we really sat together for a day or two and fixed a couple of hundred bugs
07:04
before we could even make a release. Then we started to make the releases of our 300 packages, and then our release manager could make the actual release, right? So that's what it took when we had those nightly builds. So how could we solve that nightly build problem? You can solve that by
07:22
following the rule that you have one build and test run per commit. So how do you do that? By default, Jenkins uses polling to poll the version control system. You can set it to every 30 seconds or so, and it polls, and if a new commit is there it creates a build. The problem with that is that you will not get one build per
07:46
commit, because it could be that two people commit at the same time. Then you have two commits in one build and, believe me, those two people will say it was the other one. Always. So you have one commit, and you make sure that you have one build for that commit. With today's
08:04
version control systems that's really easy, because GitHub has post-commit hooks. Whether you host on GitHub or Bitbucket or you have your own Git repository, you can just create a Git post-commit hook that triggers your Jenkins instance or your CI instance, and then you can have one build
08:23
per commit, so you can trace the person or the commit that was responsible. So it's really easy to figure out what goes wrong. In Plone it's a bit more complex than that, because we have those 300 packages and one checkout does not mean we have the exact same checkout of all packages, but I will come to that later.
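As a rough sketch of the hook idea, the function below builds the request URL that a Git hook script could call to start exactly one Jenkins build for one commit. It assumes a Jenkins job that has "Trigger builds remotely" enabled with an authentication token and that is parameterized; the parameter names `COMMIT` and `AUTHOR`, the job name and the host are hypothetical and not taken from the Plone setup.

```python
from urllib.parse import urlencode


def build_trigger_url(jenkins_url, job, token, commit, author):
    """Return the URL that asks Jenkins to start one parameterized build.

    The commit hash and author travel with the request, so later steps
    (reports, notifications) know exactly which commit was built.
    """
    # COMMIT and AUTHOR are hypothetical build parameter names.
    params = urlencode({"token": token, "COMMIT": commit, "AUTHOR": author})
    return f"{jenkins_url.rstrip('/')}/job/{job}/buildWithParameters?{params}"


if __name__ == "__main__":
    # Hypothetical values; a real hook would read them from Git.
    print(build_trigger_url(
        "https://jenkins.example.org", "demo-job", "secret", "abc123", "timo"))
```

A hook script would then request this URL once per pushed commit, for example with `urllib.request.urlopen`, instead of waiting for the next polling interval.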
08:43
And then what's important is that you preserve this commit information through your continuous integration pipeline. So you pass it through the builds, so that at the end you can notify people, right, via email or anything else. So we have those three steps, commit, build, notify, and
09:04
in order to be able to automatically build and test your software, you need an automated build. We have tools for that in the Python community, right? In Plone we use Buildout. It's not widely used outside the Zope community, though some Pyramid folks use it. But most people use pip or easy_install, which are also fine. You probably need to wrap them into
09:29
bash files or anything like that, but you can automate your build, right? If you do that, you can use tox, for instance, to configure what's run on the CI system, and on the Jenkins machine you can
09:46
for instance use tools like ShiningPanda. That's a Jenkins plugin that allows you to create virtualenvs or buildouts and install things via pip automatically. So it's just a convenience tool. We are not using it in the Plone community, because a bash script is enough.
10:03
But if you want to do stuff with Python and you want a nice wrapper, then ShiningPanda is the right tool for the job. So if you do your build automatically, you of course want to run your tests, right, because you want to make sure that your software actually works.
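A CI server usually learns about test results from a report file rather than from console output. As a minimal sketch of what such a report contains, the function below reads a JUnit-style XML report (the format that pytest can write with its `--junitxml` option) and lists the failing tests. The report layout shown in the comment and the function name are illustrative assumptions, and this is not how Jenkins itself parses reports.

```python
import xml.etree.ElementTree as ET


def summarize_junit(xml_text):
    """Return (number_of_tests, names_of_failed_tests) for a JUnit-style report.

    Expects <testcase classname="..." name="..."> elements that contain a
    <failure> or <error> child when the test did not pass.
    """
    root = ET.fromstring(xml_text)
    total = 0
    failed = []
    # iter() finds <testcase> whether the root is <testsuite> or <testsuites>.
    for case in root.iter("testcase"):
        total += 1
        if case.find("failure") is not None or case.find("error") is not None:
            failed.append(f"{case.get('classname')}.{case.get('name')}")
    return total, failed
```

The same numbers are what a CI dashboard turns into its passing and failing test statistics.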
10:21
If you use pytest, you're lucky, because you can just configure pytest to output files that Jenkins can read out of the box. Jenkins is Java software, so of course it has an XML interface. But with pytest it's really easy. I'm not sure about other Python test frameworks. We have collective.xmltestreport, which is the Plone
10:46
wrapper around the Zope test runner. I won't bother you with that. And then you can present those nice statistics about your failing or passing tests, and the same is true for test coverage, so you can use
11:02
the coverage package and the Jenkins Cobertura plugin to actually show that to users. So you have a nice interface that you can also show to your project manager, so he or she can track your performance and see if the build is broken. In order to make sure that your software is not only in a working state
11:21
but also does what it is supposed to do, you usually need acceptance tests, right? I'm a web developer, so what you usually do is write Selenium tests, and we used that in the Plone community for a long time. But around five years ago we started with Robot Framework, and that really gave us a boost when it comes to acceptance testing. Robot Framework is a generic test framework
11:43
with multiple plugins, and one of those plugins is, sorry, Selenium, Selenium 2. So you can write tests in this nice BDD syntax, human readable, not only by programmers, and
12:01
Robot Framework and Selenium will run those tests, and you have all the necessary integration in Jenkins as well. So you have a Robot Framework plugin in Jenkins, or a Selenium 2 plugin, that shows you all the nice outputs of Robot Framework or Selenium. The cool thing about Robot Framework is that it gives you a full traceback if your test fails.
12:22
It goes step by step through it and it takes an automatic screenshot of the point where the test actually failed, and you have all that in a nice output that you can access and see what failed, right? And we are also using Sauce Labs, which is a software service that you can use to actually run your
12:42
Robot Framework or Selenium tests on different versions. They offer you all the versions that you could imagine, because you don't want to set up your own Windows machine. We tried it, don't do it. Those services are cheap. Sorry for the advertising, or use any other service, but use a service, don't do that yourself. We tried it.
13:04
Then, one thing that is especially important for Python, because it's a dynamically typed language, is static code analysis, so you're able to track down possible bugs early. I guess you're all familiar with the tools pep8, pyflakes and pylint. We created a wrapper around those tools in the Plone community, called plone.recipe.codeanalysis, to make our best practices testable.
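To make the idea of static analysis concrete, here is a toy check written with Python's standard `ast` module. It reports imported names that are never used, one of the problems that tools such as pyflakes also flag. This is only a simplified sketch for illustration, not how pyflakes, pylint or plone.recipe.codeanalysis are implemented, and it ignores cases such as `__all__` or names used only inside strings.

```python
import ast


def unused_imports(source):
    """Return a sorted list of imported names that the module never references.

    The source is only parsed, never executed, which is what makes the
    analysis static.
    """
    tree = ast.parse(source)
    imported = set()
    used = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # "import a.b" binds the top-level name "a".
                imported.add(alias.asname or alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                imported.add(alias.asname or alias.name)
        elif isinstance(node, ast.Name):
            used.add(node.id)
    return sorted(imported - used)
```

Run on every commit, a check like this turns a style agreement into something the build can verify.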
13:27
You can use that without Plone, but only within Buildout. And if you run those code analysis scripts, you can present the results with the Jenkins Violations plugin, and it also gives you nice statistics about all your violations,
13:45
not only for Python, but also for JSLint, CSSLint and all the modern stuff. It's all pluggable into the Violations plugin. So you can really easily present all the information that you have to your developers or to your project managers or
14:01
everybody involved. Then, one of the things that is really important is notifications, because people need to be informed as quickly as possible about regressions. There are many different ways you can do that in Jenkins. The best way, or the way that is most widely used, is via email, and there's an
14:21
extended email plugin for Jenkins that allows you to define rules for which people you want to notify. So you can say: if the build breaks, then I want to notify this mailing list, and if the build is still failing, then I want to do this and that.
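The kind of rule described here can be sketched as a small decision function over the previous and the current build result. The policy below (who is mailed when the build breaks, keeps failing or is fixed) is an assumed example, not the Plone team's actual configuration, and in Jenkins such rules are configured in the email plugin rather than written in Python.

```python
def recipients(previous, current, committers, mailing_list):
    """Decide who is notified after a build.

    previous and current are build results such as "SUCCESS" or "FAILURE";
    committers are the addresses of the people whose commits were built.
    """
    if current == "FAILURE" and previous != "FAILURE":
        # The build just broke: tell the committers and the list.
        return sorted(set(committers) | {mailing_list})
    if current == "FAILURE":
        # Still failing: remind only the list.
        return [mailing_list]
    if current == "SUCCESS" and previous == "FAILURE":
        # Back to green: let everybody know it is safe to commit again.
        return sorted(set(committers) | {mailing_list})
    # Green before and green now: nothing to report.
    return []
```

Keeping the commit author attached to each build is what makes the first rule possible at all.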
14:42
So you can really define all the rules that you want. Usually, if you have a larger organization, you want to hook Jenkins up with LDAP. It also comes with a plugin for GitHub, for instance, or Bitbucket, so you can use their authentication. That's really nice.
15:00
That's the cool thing about Jenkins: it has such a huge community that you have plugins for everything. And you also want to show the current status to users, so you can use the Jenkins dashboard plugin to have a nice dashboard, or you can even build your custom front ends. It's all there. You just have to choose.
15:21
So in the Plone community we set up everything that I just presented to you, and we still ended up with this. So why is that? I mean, we put lots of effort into that, with a lot of people, and we built it all by the book, and the build was still broken. Why is that, right? I mean, there are two reasons, actually. One of the reasons is that for Plone it is hard to have this one commit,
15:47
one build thingy, because we have those 300 packages, and if you do a checkout, then it checks out up to 300 packages and you can't be sure that this all happens in a time frame before somebody else comes along, right?
16:02
So that's pretty specific, so I won't go into that detail. But that's a problem: as soon as you have two people who could be responsible for something, they will point at each other and say it was the other one, right? Always the case. And then the continuous integration and testing team needs to clean up and figure out what went wrong. And after that you can point at those
16:21
persons and say: hey, it was you, but I had to clean up your stuff anyway. But the second thing, which is not specific to Plone, is that people break the build and they just don't care. I mean, it's not because they're evil. Sometimes you just want to do a quick fix or something, or you do a commit and you think that can't possibly break
16:41
anything, right? I just did that two days ago, and it then took a good friend of mine two or three hours to fix my stuff, because it wasn't obvious, because the commit really looked perfect. And then he wrote in a GitHub commit message that he wants to kill me. It was all my fault, because I was tired and I just went to bed instead of waiting for Jenkins to
17:06
pass. So it's not bad people, but sometimes those things happen, right? You break the build, maybe you don't check your emails or anything. Our build still takes around 40 minutes. So people break the build. So how do you prevent that? As I said a couple of times before, continuous integration is a development
17:26
practice. So what's maybe even more essential than good software that helps you with that is that you actually practice it, and that you have agreement on the team. And I think we gained a lot of experience with that, because we have those 340 core developers.
17:43
That's actually from our last year's conference in Brazil. We have over half a million lines of code. We have over 300 core packages. So we have quite complex software and a huge team of developers. It's not like a company where you can tell somebody to do things, right?
18:02
So we need some agreement on the team about how to keep a green build. Fortunately, some smart people already thought about that and came up with a few continuous integration rules or best practices that allow you to keep a green build. The most important one is: do not check in on a broken build.
18:20
The most important one is not "do not break the build", because that will not happen. People will break the build, and it's okay to break the build. It's just important that you don't check in on a broken build, because if the build is broken and somebody else comes in and checks in, then things get complex. You get more test failures and you can't figure out which commit was responsible.
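The rule "do not check in on a broken build" can be supported by a small check that a developer runs before pushing. The sketch below asks Jenkins for its last completed build through the JSON remote API and reports whether the result was `SUCCESS`. The host and job name are hypothetical, authentication is left out, and the talk describes the rule as a team agreement rather than as an enforced script.

```python
import json
from urllib.request import urlopen


def last_build_url(jenkins_url, job):
    """Return the JSON API URL of the job's last completed build."""
    return f"{jenkins_url.rstrip('/')}/job/{job}/lastCompletedBuild/api/json"


def is_green(payload):
    """Return True if the build described by the JSON payload succeeded."""
    return json.loads(payload).get("result") == "SUCCESS"


def safe_to_check_in(jenkins_url, job):
    """Fetch the last completed build and report whether it is green."""
    with urlopen(last_build_url(jenkins_url, job)) as response:
        return is_green(response.read())
```

A developer could call `safe_to_check_in("https://jenkins.example.org", "demo-job")` from a pre-push script and hold back the push while the build is red.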
18:40
And then people will point at each other and say it was that guy and it wasn't me, right? And then things will become complex. So what should you do if you break the build? The team should stop, the entire team should stop and start fixing the build, because you have a real regression, right? Your software is not in a working state and nobody can commit if they take this first rule
19:03
seriously. So the team should stop and work on that. Sometimes that does not happen. If that's not working, then it's also fine to just revert your commit. Sometimes it's obvious what you can do to fix it and you can just fix it, but there should be a time frame within which you should fix the bug, right, because otherwise you will block the build. But if you do that, if you
19:27
stick with those rules, you can actually get a green build most of the time. Not 100% of the time, because people will still break the build. This is what CI is for, right? Our tests take quite a long time to run. If you run them
19:42
all not in parallel, like we do on the CI system, but sequentially, then it takes more than one and a half hours to run our tests, and you can't expect everybody to run all those tests, right? So people should use the CI system to break it, but not for long. So if we go with the continuous integration rules and have our
20:04
setup, we have proof that our software is in a working state all the time. That is pretty cool for our developers, because if developers do a checkout, they know that the software works, right? Before that, they checked it out, wanted to fix something and had a broken build, so they had to fix something else. That's frustrating.
20:20
We could make faster releases, because our release manager did not have to ask the two or three hardcore developers to fix all those bugs for a day. He can just make releases, because our build is green, right? So you can deploy at any time. Just a few remarks about additional things that you could do. Scalability is important.
20:40
If you have a larger project, you should definitely consider using a server/node setup for Jenkins, which Jenkins allows you to do. Otherwise, if you have a lot of jobs running on your Jenkins machine, your UI will freeze because the server is busy. So run the jobs on the nodes. Use provisioning. There's nothing worse than a CI system that does not work reliably and
21:05
behaves differently on the nodes. And you can use the Jenkins Port Allocator plugin to run things in parallel, because this is what you want to do. Then, if you have your CI system in place, the next step would be continuous delivery, not just continuous integration. With continuous integration you automate your testing
21:23
process and your integration process. With continuous deployment you automate your deployment. The idea is that for a deployment you more or less just have to push a button and it will deploy automatically, right? A lot of companies do that these days. And Jenkins grew from a CI system into a system that can do CD as well.
21:43
And we also started to work on that. We're using zest.releaser, for instance, to do Python egg releases. It's an awesome package. If you do releases by hand, stop and use zest.releaser. It's perfect. It's a really great piece of software. You can use it, for instance, to make egg releases or wheel releases
22:01
to then actually test your deployment. And on the Jenkins side there's a new plugin, out for about half a year, called the Jenkins Workflow plugin. It's really a game changer in CI, in my opinion. It allows you to create really sophisticated workflows within Jenkins, to run certain steps in parallel or sequentially and notify people, and it's incredibly flexible.
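The shape of such a workflow (a sequential build step, several checks in parallel, then a notification) can be sketched in plain Python. This only illustrates the idea; it is not the Workflow plugin's own scripting language, and the step functions passed in are hypothetical placeholders.

```python
from concurrent.futures import ThreadPoolExecutor


def run_pipeline(build, parallel_checks, notify):
    """Run build first, then all checks in parallel, then notify.

    build() returns an artifact, each check takes the artifact and returns
    True or False, and notify() receives the overall result.
    """
    artifact = build()  # sequential step: nothing else can start before it
    with ThreadPoolExecutor() as pool:
        # map() returns results in input order once all checks have finished.
        results = list(pool.map(lambda check: check(artifact), parallel_checks))
    return notify(all(results))
```

Unit tests, acceptance tests and code analysis are natural candidates for the parallel stage, since none of them depends on the others.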
22:24
I already played around with it, and we definitely plan to move to it. So if you start with Jenkins, I would definitely check it out. It's really, really awesome. So, to summarize: if you have a CI system and you integrate frequently, you have an automated build and test system for each integration,
22:41
and you report as soon as possible, you can get a green build most of the time, which gives you proof that you have software in a working state that you can deploy at any time. You can ship software faster and better. It's more fun for developers, not frustrating for them because they run into failing tests. And Jenkins in the last four years has been great. You have plugins for everything.
23:06
It's a great piece of software, even though it's written in Java. So yeah, use it. If you want to know more about continuous integration, I highly recommend the book on the left side, called Continuous Delivery, by Jez Humble and David Farley.
23:21
They came up with those continuous integration rules. There's another book called Continuous Integration from the same publisher. I would recommend buying this book, because it has everything, and the continuous integration chapter in that book is really great. I bought both of the books. Buy this one, you don't need the other one.
23:41
And on the right side, this is a blog post, the URL is below, that I wrote about our CI setup, with all the plugins that we used and all the approaches, so that's more in detail. If you have any questions, feel free to ask me on Twitter, on IRC,
24:02
on my blog. The slides are there. Thank you. Thanks, Timo. We have time for two questions. If there's anyone who has a question for Timo, put your hand up. Thanks.
24:27
First, I wanted to say that with nosetests you can also output XML, which can be interpreted by Jenkins and displayed in the web UI. And my question is: what do you do with flaky tests, tests that sometimes fail and that you can't prevent? Yeah,
24:45
I mean, that's hard to do. What we usually try to do is to make them work reliably, and if they don't work reliably we remove them, because personally I don't think it makes sense to have a test that fails randomly, because that
25:00
doesn't give you any information. If a test fails randomly, it's of no use, because if it fails it gives you no information, and if it passes it gives you no information. So we try really hard to make them work reliably. That's especially important for Selenium tests, because the underlying technology is fragile. But you can make it work reliably, and Jenkins helps you a lot with that, because if you run things in parallel
25:24
then you will see all kinds of effects that you don't see on your local machine. When you run Selenium tests, you have to make sure that everything is there, because tests can run slow and fast, and it's not easy to do, but in my opinion
25:43
it's worth the effort to have reliable tests. My question was: could you quickly comment on how often developers step on each other's toes when you have so many repositories and developers? Does it happen often?
26:02
Do you regret having split them out instead of having them in one Git repository, or do you use Git submodules? Could you please comment on these things? That's the big question that we always ask. I mean, we had a big monolithic software block and we split it into multiple packages,
26:23
multiple repositories, and it's really great if, as a developer, you can pick things and improve certain packages without having to download everything, right? So that's a great thing and we don't want to lose that. On the other hand, we see the amount of work that is necessary to release and keep track of all those multiple repositories, and we haven't really solved the problem of having one commit and
26:47
one build. We are close, but we don't have it. So it's a trade-off in the end. It's hard to say. I don't think that we will go back to a one-repository approach,
27:00
but I can see the advantages that you have. Um, yeah, that's possible. But then you still do a checkout and then you can't be sure. We're actually using mr.developer, which is a tool that checks out all the packages for you and makes sure that you have the right branches.
27:24
It's pretty sophisticated, pretty cool, but it's complex, and we tried to store known good sets of this. So for all our 300 packages we stored the version numbers or the commit hashes and stuff like that, and we tried to make that reproducible, but it was just too complex.
27:41
We failed at that, that just did not work. Great, thank you very much, Timo. Great presentation.