Continuous Delivery - the missing parts
Formal Metadata
Number of Parts | 163
License | CC Attribution - NonCommercial - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers | 10.5446/50191 (DOI)
NDC Oslo 2015 (84 / 163)
Transcript: English(auto-generated)
00:01
Okay, let's get started. You'll have to excuse me, I'm a little bit cold-y. So I sound a little bit croaky. I think two days of drinking has done that to me. But we'll see. Thank you all for coming to my talk. This talk is called Continuous Delivery - the missing parts. This is a talk focused around how, as a developer... Is everyone a developer?
00:25
Anyone not developers? A couple, maybe one, maybe two. So this talk is about how to bring some operational thoughts and some testing thoughts and QA thoughts into your development pipeline and how it should potentially make you a better developer.
00:42
I have some contact details on the bottom left. If you disagree with anything or think it's a load of absolute rubbish, then please, please, please get in contact because I would be extremely interested to know why. Otherwise, if you think it's very interesting and want to talk further about it, also get in contact with me as well. My name's Paul Stack. I am an infrastructure engineer now for a very small, cool startup in London,
01:04
and we work on a visual search tool, which has been quite fun to work on. I'm a reformed ASP.NET and C# developer. I mean that in the most light-hearted possible way. I spent a huge amount of time working with ASP.NET and C#, but I also had a lot of pain working with Windows
01:21
because in the operations space, Windows can be very interesting. Very interesting. I'm a DevOps extremist. I love it. I breathe this stuff. I teach it. I do workshops on it and try and help people understand what it's about and why it's good for your business. And lastly, I'm a conference junkie. I love going and meeting lots of different people at conferences.
01:42
It's so much fun. The background to this talk, I have heard an indescribable amount of people telling me or tweeting about it or reading about it on the Internet to say that they're doing continuous delivery.
02:01
I'm not doing continuous delivery. No, I'm not saying they're not doing continuous delivery. What I'm saying is that they're doing pieces of continuous delivery and they're probably not thinking about the whole picture. Continuous integration and continuous delivery have been a huge part of everything that I do at work for the past nine years. Even from my very first job,
02:23
my first role in my first job was to try and maintain an old system and it was an absolute piece of rubbish. It sat on a central server under the manager's desk and wasn't in source control. It had no builds. C# files were going onto the server and it was compiled on the fly.
02:40
So from then, it's played such a massive, massive part of everything I do. And I hope that as we go through this talk, it will introduce you to some of the areas that I never really thought about as continuous delivery and about how DevOps comes in and brings those together. So just a very simple recap.
03:02
Continuous delivery is a set of practices and principles aimed at building, testing, and releasing software faster and more frequently. This is not groundbreaking stuff. You've probably heard a hundred different people or read blog posts or read books on this and that's pretty much the de facto standard of what everyone says.
03:21
It gives us eight principles. The principles, I'm not going to rehash them, but more importantly, it gives us four practices. These slides are all available at the end, I promise you. Now the practices are the most important part, in my opinion, of continuous delivery because you do simple things well. Firstly, you build your binaries once.
03:41
Please, please, please, please, please do not rebuild your binaries after you have moved from a QA environment and you want to move to production, because factors may change your code. The code that you then release, if you rebuild your binary, is potentially not the code that you've tested and verified.
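One way to picture "build your binaries once" is to record a digest of the artifact when QA signs it off and refuse to promote anything that does not match. The following is only a minimal sketch under assumptions: the function names and the artifact bytes are hypothetical and are not from the talk or from any particular deployment tool.

```python
import hashlib


def artifact_digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a build artifact's bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_before_promotion(candidate: bytes, tested_digest: str) -> None:
    """Refuse to promote anything that is not byte-for-byte what QA tested."""
    if artifact_digest(candidate) != tested_digest:
        raise ValueError("artifact differs from the one verified in QA")


# Hypothetical artifact contents standing in for a real build output.
qa_artifact = b"app-build-output"
tested_digest = artifact_digest(qa_artifact)

# Promoting the same bytes passes; a rebuilt binary that differs is rejected.
verify_before_promotion(qa_artifact, tested_digest)
try:
    verify_before_promotion(b"app-build-output-rebuilt", tested_digest)
    rebuilt_rejected = False
except ValueError:
    rebuilt_rejected = True
```

The point of the check is that the thing going to production is provably the thing that was tested, rather than something that merely came from the same source code.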
04:02
If you take one thing away from this talk, please take that. I hope you take many more, to be totally honest. Two, use the same mechanism to deploy to every environment. There are so many deployment tools out there that you can use and that you can make better.
04:21
Octopus Deploy is one of them in the .NET space. You can use TeamCity, you can use Jenkins, you can use all of these things. The same mechanism to every environment. Three, smoke test your deployment. How do you know it works if you don't check it? And four, if anything fails, stop everything immediately and fix it.
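Practices two to four can be sketched together: one deploy function used for every environment, a smoke test after each deploy, and an immediate stop on the first failure. This is a hedged illustration only; `deploy`, `run_pipeline`, `PipelineStopped`, the environment names and the artifact name are all assumptions made for the example, not the API of Octopus Deploy, TeamCity or Jenkins.

```python
from typing import Callable, Dict, List


class PipelineStopped(Exception):
    """Raised when a stage fails, so that nothing further is deployed."""


def deploy(artifact: str, environment: str, log: List[str]) -> None:
    """The one deployment mechanism; only the target environment varies."""
    log.append(f"deploy {artifact} -> {environment}")


def run_pipeline(
    artifact: str,
    environments: List[str],
    smoke_tests: Dict[str, Callable[[], bool]],
    log: List[str],
) -> None:
    """Deploy to each environment in turn, smoke testing after every deploy."""
    for environment in environments:
        deploy(artifact, environment, log)
        if not smoke_tests[environment]():
            # Stop everything immediately; later environments are never touched.
            raise PipelineStopped(f"smoke test failed in {environment}")


log: List[str] = []
smoke_tests = {
    "qa": lambda: True,
    "staging": lambda: False,  # simulated failure for the example
    "production": lambda: True,
}
try:
    run_pipeline("app-1.0.zip", ["qa", "staging", "production"], smoke_tests, log)
    stopped_in = None
except PipelineStopped as error:
    stopped_in = str(error)
```

In this run the simulated staging failure stops the pipeline, so production is never deployed to.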
04:43
It's really important. Has anyone read the book? Has anyone not read the book? Please buy a copy. Please, please, please. You'll see other book recommendations during the talk. There's only a couple of them. It's an excellent book and it will give you a real in-depth way
05:02
of reading all of these pieces. It is the seminal documentation on continuous delivery, software delivery. It was written by Jez Humble and Dave Farley and it was first published in 2010. There's a funny story actually behind it. It wasn't just Jez and Dave that actually wrote the book.
05:24
They're actually the ones that published it, but it started as a group of consultants who were friends, all born out of ThoughtWorks, actually discussing some of the disaster scenarios they've actually seen and had to go in and help companies with. It sort of was a real collection of war stories
05:43
and they wanted to sit down and think, how do we give people instructions in order to make it better? And that's where the book spawned from. I said it was published in July 2010, so are we saying that continuous delivery is only five years old?
06:01
No. The Agile Manifesto. History lesson time. In February 2001, 17 people, who are now very famous software developers, came together in Utah in the United States and they wanted to discuss software development
06:21
and the ways in which to make it better. Just to create some guidelines and some principles on how to make life suck less. And it included people like Martin Fowler, Kent Beck, Ron Jeffries, Uncle Bob Martin, people who we now look at as the founders of Agile, Modern Agile.
06:42
These are really influential people now. And in 2001, they actually said, principle number one, our highest priority is to satisfy the customer through early and continuous delivery of valuable software. So this is not a new thing. But there's a few very key points in there.
07:03
Firstly, early. You must get your software to your customers fast. You must be able to get feedback on your software. And the most important other word on that slide is valuable. Value is everything that we do.
07:22
If we are not creating value from the software we're making or the software our companies are doing, then how are we going to make a business? How is our business going to get better? There are some common misconceptions of continuous delivery. And these are not directed at anyone.
07:41
These are just some funnies that I've seen over the past six or eight months. And I thought I would put some out there. There's only five that I thought are appropriate. They're ranked in no particular order. But as they go through, they get worse. Number one, continuous delivery is something that only startups can achieve.
08:06
Continuous delivery can be achieved by anyone. It is not a checklist exercise. You can't sit down as an organization and go, here are 12 points. When we satisfy them, we are continuous delivering. It's not how it works.
08:21
It's always about continual improvement and always making how you do things better and always delivering value. Number two, continuous delivery only works for Node.js or Ruby or Go developers. It's not true either.
08:43
Anyone can do it. My last company, when I originally started there, had a very legacy ASP.NET application. And we put a lot of very good practices in place. And we went from releasing software once every four weeks
09:00
to actually releasing on demand. Because we had put such a good pipeline in place, and it wasn't just me. It was the entire team working together to do that. So anyone can do it. I've heard of people even investigating continuous integration, continuous delivery for... There's all sorts of talks here. There's a talk next by Alex on continuous delivery for databases.
09:20
There was a talk this morning for continuous delivery in embedded systems. This is the de facto way that we're going to start delivering our systems. I apologize for this one in advance. We can hire a consultant to help us implement continuous delivery. This is nothing against consultancies.
09:41
Continuous delivery, as I said, cannot be a checkbox exercise. It requires understanding the system. It requires understanding the flow of how software is created in your company. And consultants can go and they can help you identify areas in which to improve your flow, and they can help you implement tooling in order to do it,
10:01
but they will never be able to go at the end of a contract and go, continuous delivery, check. It evolves over time. This one, I actually rephrased this one. Right-click and deploy in Visual Studio is continuous delivery. I've seen it happen so many times, I really have.
10:21
I used to do that type of thing. It's not wrong. This is not a case of you're wrong if you're doing this. This is a case of you're not giving yourself the full potential pipeline in order to identify areas of weakness and how do you make it better. And number five, continuous delivery is as simple as hooking GitHub to our Azure account.
10:44
You're not running tests, you're not smoke testing it, you're not automating your deploy to the same environment everywhere. This is not a case of if you're doing it, you're doing it wrong. I hate that. This is a case of look at potential other ways of doing it. The reason I put number five in
11:01
was because Dominick Baier sent me a tweet last month that said, "I just deployed from GitHub to Azure just to annoy @stack72." So I thought that I would put it in there and say no, thank you very much. So with these common misconceptions, everything there stems back to
11:24
delivering code fast. And delivering code fast, as I said, is part of it. A guy called Chris Read, he used to be a ThoughtWorker. He now works for a company in Chicago. And in 2011 at a continuous integration meetup
11:40
in London, he said that until your pretty code is in production, making money or doing whatever it does, you've just wasted your time. This quote has stuck with me for the last four years. I absolutely find it incredible because now I question everything that I'm doing and whether I'm delivering value for the company and also delivering value in a good time.
12:04
I used to be a huge fan of TDD and craftsmanship and now I focus less on writing very pretty code but writing functional, better code that actually works and getting it into production faster.
12:21
That's no disrespect to any craftsmanship community. That's not what I'm saying. What I'm saying is that you can write code that's very nice and well-constructed and well-architected, but it doesn't have to take weeks and weeks and weeks and weeks and weeks. You can deliver small amounts of code very fast in a very nice way.
12:41
Haven't I just contradicted myself? I don't believe I do because writing the code is just one part of it. Deploying the code is another part of it. But what happens when the code is in production? Is it a mythical black box that as developers we don't really think about?
13:00
Hands up who checks back on their code in production a week after it goes into production? A minority, right? Because that's not how we're used to working. That's not the environments that we work in. If we saw other people doing that
13:22
we would take notice of that and we would go and do it as well. So there is a traditional thought process of what the technical side of a company looks like. And you usually have departments like developers and QAs and sysadmins and network and helpdesk and infosec
13:41
and lots and lots and lots more. And each department is in its own little pod of desks that don't really talk to each other and there may as well be walls between them. I have worked in this environment. I used to be this environment. I used to be the "they're the sysadmins, I don't need to talk to them" person.
14:00
Or QAs? Not too fussed about QAs. You end up with an organizational structure that looks like this. I actually pulled this from Google. That is a real organizational structure within a company. I'm not going to name the company. But has anyone ever heard the term Conway's law?
14:21
Yeah. Aren't we supposed to be one team? Yes, we are supposed to be one team. By creating these layers or silos or organizational disjoints we're effectively stopping ourselves from being as successful as we can
14:40
within our organization. So Mary Poppendieck is one of my very favorite Agilistas to read. She's produced some fantastic stuff throughout the years and I had the pleasure of actually listening to a keynote from her in Budapest a couple of months ago.
15:01
And one of the famous quotes from Mary's book is actually how long would it take your organization to deploy a change that involved just a single line of code? And can you do it on a reliable and repeatable basis? Now I have asked this question a lot.
15:22
Usually one-on-one sessions or small group sessions or workshops. And I've usually got, "Well, it depends." And they'll go, if it's a normal change then we can possibly release the software, some people say in minutes, other people say hours, other people say weeks.
15:43
Somebody told me, because they actually have a mainframe, that it takes them like a year in order to make their code change. That's okay. But one thing that I actually found in nearly every single session was you always had somebody in there who went, "But if it's a bug fix I can get that to production in minutes."
16:01
And the question I would always ask them back is, how does a bug fix differ from your normal deployment pipeline or your normal development pipeline? And they were like, "Oh, but you know, if it's a bug fix it's costing the company money." This is actually what people would respond. And you know,
16:20
their thought process was, if there's an error in production then I should fix that error right now. Great. Absolutely brilliant way to think. But they were thinking that with normal development, I can just, you know, chill out, I can do a couple of lines of code here and there, I can have a read of Reddit, you know, do bits and pieces, and then I'll push it up eventually.
16:42
So I challenge them and I always ask people to treat bug fixes the same as you would treat normal development. Fast delivery of valuable software. We call this cycle time. Cycle time is defined as the total time from beginning to end of your process.
17:02
And measuring cycle time is extremely useful. So when you measure cycle time you can do it in a value stream map and you can plot it and it can look something like this. This is a company I've worked for. I'm not saying which company it is. But this is really useful
17:22
because when you plot your flow and your cycle time on a graph like this and you show it to a development manager or your CTO or whoever your superior is the first thing they'll say is why are we waiting so long between work centres? Why are we waiting
17:40
so long in order to get our software out? In this specific use case it is 19.5 days. That's one man month or one person month of work time wasted waiting. Now, if you eliminated that waste
18:01
you could deliver that software that specific software in just a few hours. And isn't that a better thing to do? When you go back to your companies on Monday sit down with a couple of your team plot your value stream. This is enlightening if you do this.
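The arithmetic behind a value stream map is simple enough to sketch: for each work centre, record the hands-on time and the time the change sat waiting, then sum. The stage names and per-stage numbers below are invented for illustration; only the overall shape (19.5 days of waiting against a few hours of actual work) follows the figures mentioned in the talk.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Stage:
    name: str
    work_hours: float  # time someone is actively working on the change
    wait_hours: float  # time the change sits idle before this stage starts


# Hypothetical stages; the waits are chosen to add up to 19.5 days.
stages: List[Stage] = [
    Stage("develop", 2.0, 0.0),
    Stage("code review", 1.0, 120.0),
    Stage("QA", 1.0, 180.0),
    Stage("release approval", 0.25, 96.0),
    Stage("deploy", 0.25, 72.0),
]

total_work = sum(stage.work_hours for stage in stages)
total_wait = sum(stage.wait_hours for stage in stages)
cycle_time = total_work + total_wait   # beginning to end of the process
wait_days = total_wait / 24            # waste expressed in days
flow_efficiency = total_work / cycle_time

print(f"cycle time: {cycle_time} h, of which waiting: {wait_days} days")
print(f"hands-on work: {total_work} h, flow efficiency: {flow_efficiency:.1%}")
```

Plotting the same per-stage numbers as a bar chart gives the picture that makes a manager ask why the work waits so long between work centres.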
18:20
Honestly, this is amazing if you do it. And if you work on the operations side, plot your value stream for the operations side, because I can guarantee it will be drastically worse. At my last company it used to take us eight weeks to provision a VM. We had two runbooks for our main application:
18:41
the first one was 56 pages that was just application configuration and then we had a 28 page security configuration document. Why they weren't in the same document I have no idea. It shows the separation of work. But through a lot of hard work
19:00
throughout a couple of years we were actually able to get that down to six minutes to provision a Windows box. If you just make things a tiny bit better every time. And how we actually sat down is we sat down with the operations people and we said how do you provision these boxes? And they said
19:22
"We open up a command-line window and we just run shell scripts." I was like, okay, so I sat down and watched them run a shell script, and he did step 15, then step 24, then step 16, and I'm like, "Whoa, why are you doing that?" He goes, "Well, if we don't do that then I know that step 18 or 19 will fail
19:41
because it requires a secure or a VB setting from step 26." I'm like, "So why don't we just update the document?" He's like, "Ah, but if we do that, you know, then we'll have other issues." And this is what happens. We as people are rubbish at performing repetitive tasks.
20:02
We're really not very good at it. We're not designed for it. We make mistakes. A lot of mistakes. It's life. But it's not all doom and gloom. It's really not all doom and gloom. There are people out there who are incredible and have worked really hard in our industry in order to make our lives better.
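The runbook story, where a later step's setting has to exist before an earlier step can succeed, is a dependency-ordering problem that a machine can solve more reliably than a person following a document. A minimal sketch, assuming hypothetical step names and Python 3.9+ for the standard library `graphlib` module:

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Hypothetical provisioning steps, each mapped to the steps that must run first.
dependencies: Dict[str, Set[str]] = {
    "install_os_updates": set(),
    "apply_security_settings": {"install_os_updates"},
    "install_application": {"apply_security_settings"},
    "configure_application": {"install_application", "apply_security_settings"},
}


def provisioning_order(deps: Dict[str, Set[str]]) -> List[str]:
    """Return the steps in an order that satisfies every declared dependency."""
    return list(TopologicalSorter(deps).static_order())


order = provisioning_order(dependencies)
for number, step in enumerate(order, start=1):
    print(number, step)
```

Once the dependencies are written down as data, the order no longer lives in one operator's head, and a cyclic or missing dependency shows up as an error instead of a failed step halfway through.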
20:22
The rise of DevOps. Who knows what DevOps is? Who thinks they know what DevOps is? Who has their own opinion on what DevOps is? Because that's what it is. Whatever you want it to be within your company. So where DevOps came from is in 2009
20:41
at the Velocity Conference in Santa Clara you had two guys who came and gave a talk called 10+ Deploys Per Day: Dev and Ops Cooperation at Flickr. It was John Allspaw, who now works for Etsy. I think he's the Senior VP of Operations. I think this is his official title. And Paul Hammond, who works at Slack HQ
21:00
the last time I checked. Which was like a couple of months ago. I don't know if he's there. Now, in 2009 I could not think of delivering software 10 times a day. I know people who can't even think of delivering software 10 times a day now. Because that's not the environment that they've worked in
21:21
and that's not how they're set up. So this was a huge talk to give in 2009. They had the sophisticated systems in place in order to make that happen. But there was an even more important piece of information on this. You had two people who came together on a stage who were from the different sides
21:40
of the IT organization. You had a developer or somebody from development who was Paul Hammond and you had somebody from operations who was John Allspaw. So immediately in order to demonstrate how good they were at deploying software they removed the myth that they were two separate teams and that they were actually working together
22:01
even to give this presentation. And there was a huge amount of discussion on Twitter by a few very well-known people now. Paul Nasrat, who works at Google, Patrick Debois, who we would class as the founder of DevOps, and lots and lots of other people. And they got together and they were like, let's make a conference like this in Europe.
22:23
So on October 30th and 31st in 2009 in Ghent in Belgium they had developers and system administrators, ops, for two days. Can you see where this goes? And it was called DevOpsDays. Wow, it's groundbreaking.
22:41
And there were people who flew in from all over the world, and the people came from as far as Australia to Ghent. Any Belgians here? I've never been to Ghent. Oh, I know you're a Belgian. I've never been to Ghent but I believe it's a nice place. And they have three very good things: chips or frites, chocolate and beer
23:01
is what I'm told. So you had people flying in from all over the world and enjoying the Belgian hospitality, and more conversation continued on Twitter afterwards, and they were hashtagging DevOpsDays, and then somebody was like, why don't we just drop the days, and the hashtag DevOps was born. That is actually the very short history of DevOps, where the term came from.
23:22
And this actually helped us spawn a whole new environment and a whole new ecosystem of tools. Okay? These are a very small few and I'm sure some of them you recognize, some of them you probably won't because some are newer than others, but you have Puppet, Chef, Sensu, Graphite, Jenkins, Logstash, Capistrano, Rundeck,
23:41
Vagrant, Terraform, Grafana. So many. So many. But this discussion between people actually was a great way of sharing how people worked and how processes work within their organizations because it was about sharing and learning.
24:02
And of course when you have new tools, people love shinies, right? Nobody wants to go and work in that horrible looking tool that's very difficult to maintain. They want to use the nice new shiny application that makes their life a little bit easier. So there was a whole ecosystem of tooling spawned up. Thank you tool vendors. And of course you can buy many, many, many more today.
24:23
But what DevOps was about was to try and demystify this. You used to have a wall of confusion as I said between departments where developers want the change, operations want the stability. In some organizations operations people were actually paid for the uptime of their system.
24:42
They would get bonuses if their system was up 100% of the month or whatever. And of course when you have that, they were just blocking developers from releasing their systems into production. Or you would have a lot of red tape in order to get your system into production and you would have to look at
25:01
SLA time or support contracts. They would just make it very difficult for you as developers to get systems into production. I have been part of this. I have done this to developers. I don't get paid for uptime though. I get paged when it goes down. And if you're confused about the difference between developers
25:21
and operations then DevOps Borat is going to help you out. If you're praised for website success you're a dev and if you're blamed when the website goes down you're an ops. It's very true. And what generally happened was developers would create this nice little package of code and they would throw it across the wall and the operations guys would catch it and do something with it and that was the end of life
25:42
for that piece of code for the developer. It worked fine in the development environment. It worked on my machine. I don't know what's up with your environment in production. I'm sure we've all done this because this is just
26:01
the human factor. We make mistakes. We're not going to get things right every single time. So, how does DevOps help with this type of thing? DevOps is underpinned by three principles. They're called the three ways. So, the first one is
26:20
systems thinking. Then you have amplifying the feedback loops and then you have a culture of continual experimentation and learning. And there's lots and lots of documentation and there's actually books written about this which I'll show you in a second. But the first way is systems thinking. The flow of information
26:40
or the flow of software from the business to the customer. Developers would act on behalf of the business because they would write the code. Operations would act as the customer because they were the ones that were getting the code into production. They were actually facilitating the use of code by customers. Think of it as an idea
27:02
to a fully working system in production or a fully working piece of code or a fully working feature in production. It's one-way flow. And that led us to believe that there are four types of work. There's business projects. So like UI improvements
27:21
or search improvements. Internal projects. So architecture. Refactoring the microservices. Changes. Changes happen everywhere. So deployment, schema updates, et cetera, et cetera. And then the last one which is the worst one of the lot. Unplanned work. So when you have a bug,
27:41
when you're asked to investigate something, unplanned work will stop us from delivering our systems. Have any of you ever walked up to your colleague's desk and just started asking them a question while they're working? That's a distraction. That can potentially help them introduce a bug into software.
28:02
So then we have the second way. Amplify that feedback loop. Make that one-way flow of information circle back. Get the operations people to pass very useful, constructive feedback back to the development team. It's very important.
28:23
For example: with your software, when we released that yesterday, we actually created a 5% performance degradation on the site. Can we investigate that the next time? That's useful, constructive feedback. You're giving them an idea that they've increased the server utilization. You're not turning around and saying to them,
28:40
Tom, that software you created yesterday was rubbish and it took down our website. That feedback is very important and it will help them form the next piece of work that they're going to do. It's about providing feedback and visibility. Not just for feedback's sake, but in order to achieve a higher goal of developing better systems.
29:03
Devs need feedback of issues and if there are problems, invite them to things like postmortems. If you run outage postmortems, I don't know what people here would call them, I know that when a website goes down, people get together to discuss root cause analysis or bits and pieces like that
29:21
and you would invite the developers who were involved to that. And Etsy, they're like the amazing company that everyone in the DevOps world wants to be, actually run blameless postmortems. They have a process where they don't blame each other when systems fail, which is incredible because it's human nature
29:40
that you are defensive. It's just the way we are. And the outcome of this is, if we give this feedback, more work will get done. So Patrick Lightbody, who was the CEO of BrowserMob, actually said of this feedback: we find that when we woke developers
30:00
up at 2 a.m., defects got fixed faster. It's true. Nobody wants to be woken up. The third way: creating a culture of continual experimentation and learning. Has anyone ever done multivariate testing on their website?
30:24
Multivariate testing is where you give one variation of how the website looks or how a feature works to a very small percentage of people, while the rest of your users act as a control and get the normal look and feel, and then you measure how that impacts
30:41
conversion or whatever the success metric for your website is. There are many companies that do this. Facebook does this. Facebook actually rolls out its features to very small percentages of the population first. I am told, and I don't know how true this is because I can never prove it,
31:01
I'm told by someone at Facebook that no two people in a room will actually have exactly the same layout on a Facebook page. There will be a tiny, very small difference and they're able to measure how that works. Booking.com run thousands of multivariate tests on their website. Etsy run thousands
31:20
of multivariate tests on their website, and it can even be as simple as changing the color of the checkout button, or repositioning it on the page, in order to measure the impact that has on conversion. When we achieve
31:41
these first two ways of the flow of work and the feedback coming back from operations in order to enrich that work, we have a very good setup that will allow us to be able to experiment more. There are two fantastic books on this.
32:00
The first is The Goal by Dr. Eliyahu Goldratt, and this was written in the 80s about a manufacturing plant in the US and about understanding the flow of data within the manufacturing plant and understanding how work actually gets done, and then that was actually recreated
32:20
in 2012 by Gene Kim, George Spafford and Kevin Behr as The Phoenix Project, and it was actually written about an IT organization and it's a fantastic read. They both tell a very similar story, just in two different environments. If you want to understand
32:41
more about flow and about how things happen in your system, pick one up and read it. And you can adopt DevOps through four things. It's called CAMS: culture, automation, measurement and sharing. This was an acronym coined by
33:03
John Willis, who is botchagalupe on Twitter, and Damon Edwards, who is Damon Edwards on Twitter, and they've contributed a huge amount of information in the DevOps world and they really have given us four key things that we can learn from. Culture: it's people and process first.
33:21
If you don't have a culture then all automation attempts will be fruitless because you won't have the culture of people wanting to maintain them and make them better. Automation, once you understand the people that you work with and the culture that you're building then you can actually
33:40
start to stitch tools together to create the fabric of how everything happens in your pipeline. This is things like release management, provisioning, configuration management, monitoring, control, orchestration, etc. We'll have a look at some of these individual parts, and these for me are the missing parts. These are the parts that, as developers, we don't really
34:01
always think about. Measurement, if you can't measure it you can't improve. It's as simple as that. If you don't understand that your website has a three second response time then you don't see the frustration
34:20
and the need to make it better. If you can see that and it's put right in your face, then instinctively you'll want to make that better. Sharing: feedback. Metrics are great, but unless you can give constructive feedback and share
34:40
how things have gone wrong then you cannot create a culture of building better systems. We're all part of the same team. It's the second time I've said that. It's very true. Metrics and automation are key.
35:01
Very true. And when you get into this thought process and this culture and this way of sharing information, everyone in your IT organization will be able to be involved in all sorts of pieces of the puzzle.
35:20
This for me is a great way of showing all the pieces required in order to deliver a system. It was a graphic presented at the London Continuous Delivery meetup in October 2014 by John Turner of a company called paddypower.com. John basically was able to say
35:41
that they're very effective at delivering systems because they don't have real defined lines between roles. He's not saying that QA only stick to the grey and that ops people only stick to the green and the red and developers only stick to the blue. He's just showing that there are different
36:00
pieces of the fabric that everyone must be aware of. And this then leads us into the ops side of continuous delivery. This is the real key. These are the missing pieces. Configuration management. Anyone use a CM tool? The ones here are Ansible, SaltStack, Chef,
36:20
Puppet, CFEngine. These are a way of being able to define what level of configuration a system needs to be at. If you're on Windows you can say I need to have IIS installed. I need to have four web servers and a virtual
36:40
application running. I need to have a folder structure of C:\websites, whatever, whatever, whatever. And I would need to have these applications installed in order to make everything work. And you would take that and you would deploy that onto a server and your configuration management tool would bring that server to that correct level and it would maintain that. So if people
37:00
with prying hands want to RDP onto a web server and they want to change them or change IIS, the configuration management tool would run again and go that's not how we do it. We're managed through a tool and that tool will revert your changes. It will take it back to the level in which you've told it to be.
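The run, detect drift, revert cycle described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Puppet, Chef or Ansible are implemented; the setting names in `DESIRED_STATE`, the `converge` function and the `web01` host are all hypothetical.

```python
# Hypothetical desired state, loosely modelled on the IIS example above.
# The key names are illustrative and do not belong to any real tool.
DESIRED_STATE = {
    "feature:IIS": "installed",
    "iis:site_count": 4,
    "directory:C:\\websites": "present",
}


def converge(actual, desired):
    """Bring `actual` in line with `desired` and return the changes made.

    Each change is a (key, found, wanted) tuple. Keys that are not listed
    in `desired` are left alone, so only managed settings get reverted.
    """
    changes = []
    for key, wanted in desired.items():
        found = actual.get(key)
        if found != wanted:
            changes.append((key, found, wanted))
            actual[key] = wanted
    return changes


server = {"hostname": "web01"}                # a freshly built, unconfigured box
first_run = converge(server, DESIRED_STATE)   # applies every managed setting
second_run = converge(server, DESIRED_STATE)  # nothing left to do

server["iis:site_count"] = 1                  # someone changes IIS by hand over RDP
third_run = converge(server, DESIRED_STATE)   # the manual change is reverted
```

The second run making no changes is the property that matters: the tool can be run on a schedule, and it only acts when a server has drifted from what was declared.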
37:21
These do work on Windows. I used to think there was an excuse that I couldn't use these tools because I worked in Windows. And they will work on Windows because the
37:41
rest of its competitors work on Windows. If you're not interested in managing a server farm you can have what's called immutable infrastructure. Immutable infrastructure was a term coined by
38:00
Chad Fowler in 2013, and he basically wrote an article called "Trash Your Servers and Burn Your Code: Immutable Infrastructure and Disposable Components". He effectively says long-running servers are terrible. Long-running servers mean that you cannot guarantee, if you have two servers
38:20
in a web farm, that both servers will be configured exactly the same unless you either use configuration management or you continually destroy your servers and rebuild them. Anyone got RDP access for production? Anyone ever RDP'd into production and
38:40
made a manual change? We do. Because when the pressure goes on and we have to fix a bug what do we do? We take the easiest way to fix it because we want to make that fix happen. Not a bad thing. But by doing that you have effectively introduced risk into your system.
39:03
Since then within the last couple of months people have said maybe immutable infrastructure is not the right word for it because it sort of integrates too much into the functional programming community and we don't want to annoy the functional programmers because they are doing some awesome things there. And what they have actually
39:20
suggested is that maybe we should call it disposable infrastructure. Maybe immutable isn't the word to suggest that it gets continually destroyed and recreated. So why don't we create infrastructure that we know we can dispose at any point and
39:40
recreate. Logging. Who logs? Who has a central logging system? Not trying to big up my other talk, but my other talk is about this. And it's basically about creating a central logging system and
40:01
the usefulness of a central logging system. I'm not going to hash this out too much, but we used to use the Elasticsearch, Logstash and Kibana stack. It's called ELK. And we introduced another layer in there called Kafka. And effectively we created logs on everything. Request logs, error logs.
40:21
Every time a deployment was made we wanted to see it, because we were able to create graphs that looked like that for our developers. Now this was for a specific team, and that team ran 12 services, and we knew exactly how many requests each of those services took and we knew how many non-200 response
40:40
codes each of those services had. And we could tell performance changes. So you probably won't see it but on this graph here over in the front third there are tiny little black marks and those are actually deployment markers that every time a deployment was made we could tell exactly how many requests or average
41:00
response time or CPU time was after that deployment. We could see it, and we gave feedback: we put it in front of everyone and it was open, so that they would be able to understand how their systems were acting in production. So not only were we collecting logs, but we were giving feedback and visibility into the system.
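As a rough sketch of what a deployment marker lets you do, the Python below compares request logs in a window before and after a deployment. The tuple layout, the `compare_around_deploy` name and the sample numbers are assumptions made up for the illustration; they are not taken from the ELK setup described in the talk.

```python
from statistics import mean


def summarise(rows):
    """Summarise a list of (timestamp_s, response_ms, status) request rows."""
    if not rows:
        return None
    return {
        "requests": len(rows),
        "avg_ms": mean(row[1] for row in rows),
        "non_200": sum(1 for row in rows if row[2] != 200),
    }


def compare_around_deploy(requests, deploy_time, window):
    """Compare traffic in the `window` seconds before and after a deployment."""
    before = [r for r in requests if deploy_time - window <= r[0] < deploy_time]
    after = [r for r in requests if deploy_time <= r[0] < deploy_time + window]
    return {"before": summarise(before), "after": summarise(after)}


# Made-up sample data: (timestamp in seconds, response time in ms, HTTP status)
sample_requests = [
    (100, 120, 200),
    (110, 130, 200),
    (125, 180, 200),
    (130, 220, 500),
]
report = compare_around_deploy(sample_requests, deploy_time=120, window=30)
```

With a marker at second 120, the report shows both the change in average response time and the new non-200 response, which is the kind of feedback the graphs put in front of the team.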
41:20
It was great. Because we wanted to understand metrics around network traffic in and out of a server, we wanted to understand how the CPU was, levels of RAM fluctuation, disk space
41:40
and there's a tool called Grafana. You pick up data using Graphite, which is a time series database, and Grafana is a UI that sits across the top of it and allows you to query the data in it. And we were able to create other graphs. These are actually not the graphs. The graphs didn't come out very well in the slides and I no longer work at the
42:00
company so therefore I shouldn't show any graphs. And effectively it allowed people there and then to see: not only did we give the developers the view for their system, but we gave the operations people visibility into their systems and
42:22
monitoring. Anyone use monitoring tools? Does anyone not know if they use a monitoring tool? Good. I'm presuming that means you all use them. Monitoring tools are great. Monitoring tools will tell you about a specific
42:40
piece of the system. So for example, if IIS must be listening on port 443, you can deploy a check into a monitoring tool and it will tell you if IIS is listening on port 443, because if it's not and you lose it, then effectively you could lose traffic, you could be serving 404s, you could be serving 500s, whatever.
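A check like the port 443 example can be very small. The sketch below assumes the conventional Nagios-style plugin exit codes (0 for OK, 2 for CRITICAL); the `check_port` function is hypothetical and is not one of the checks from the speaker's repository.

```python
import socket

# Conventional Nagios-style plugin exit codes.
OK = 0
CRITICAL = 2


def check_port(host, port, timeout=3.0):
    """Return (status, message) depending on whether a TCP connect succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return OK, f"OK - {host}:{port} is accepting connections"
    except OSError as exc:
        return CRITICAL, f"CRITICAL - {host}:{port} is not reachable ({exc})"
```

To run it as a plugin, a small wrapper would print the message and exit with the returned status, for example by calling `check_port("localhost", 443)` and passing the status to `sys.exit`.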
43:01
Just serving errors to your client. And at my last company I used Nagios, which is a great tool. It's a very old school tool. It's not the new and shiny, so therefore some people actually talk badly of it. I loved it. Absolutely loved it. And we created checks for Logstash
43:21
and for Redis and for Elasticsearch split brain and for checking the nodes in the Elasticsearch cluster. There are all sorts of checks. There are more in my GitHub repo if you want to go and get any. But in my current company we use Sensu. Sensu is a new and shiny one. I believe that's what I should be using. I'm just joking.
43:40
I have much more of a changeable architecture. It's in EC2, so nodes are continually spun up and destroyed, and Sensu is a pub/sub for monitoring so it fits my need better. And we just write checks in YAML that look like that. So checking processes and checking CPU levels and checking RAM levels, and
44:00
based on their tags, the machines that have those tags would then subscribe to the specific checks. If you do not know if you have a monitoring tool or not, please go and sit with your ops people. Go and give them a hug. They will love it if you show some enthusiasm for the monitoring tool. If anything, a great success story was
44:21
in the last company, where we actually put all the configuration in GitHub and developers started sending pull requests themselves to add their own monitors. It was amazing. Orchestration. Orchestration is amazing. Let's see if I
44:40
can show you an example. Can you see my screen? I don't even
45:00
know what I have just done. Let's raise the size. We have a number of actions that we want to run on a regular basis. One of the big ones is I very much get into my security vulnerability scanning. I thought it
45:21
would be really good if we had a project called CVE testing and that ran on a specified schedule. I just restored this box five minutes ago there for the nevers. It actually runs, if I show you the activity. You can see that they run regularly and that we get a green status or a
45:41
red status. There is actually red in here somewhere. We are actually able to tell if some of our systems are prone to some of the more famous security vulnerabilities. When I fix the system it will run every night
46:00
between midnight and 1 a.m. and when I come in in the morning I will receive an email saying if any of our systems are vulnerable to any of the SSL vulnerabilities. An orchestration tool is taking care of that for me. It is a job that I want to happen again and again and again and
46:30
again. It is a little bit of bash; it doesn't really do anything groundbreaking. If you are interested and you want code that looks like this, I will send it to you. It is not closed source.
46:42
Have it, take it back, test the system. We use orchestration for another piece of our system as well. We actually use it for running our Ansible playbooks. For bringing our systems, when a new system comes online, that new system checks into our orchestration tool and says,
47:01
hey, I am a machine of this type. Configure me. Bring me to my desired state. And once it's configured and brought to its desired state, that's it. It's left. It's no more touched. We do not do anything else with it. And we can actually go and you can see that out of 408 executions, we've had a success rate and we can see average duration.
47:25
We're measuring everything. This tool is free. It's called RunDeck. It's amazing. Check it out. It runs on the JVM. If you're afraid of the JVM, then people will help you. We will help you through the JVM, I promise. But having that ability to orchestrate tasks
47:44
and just configure them once and let them run again and again and again is just brilliant. Use an orchestration tool. Okay. Go on. I often get asked the question of what's
48:03
the difference between using RunDeck and a tool like Jenkins or TeamCity. Very, very fair question. They actually complement each other but they're quite different. Jenkins and TeamCity are development tools and they're designed for automating builds and deploys. Okay. RunDeck is in essence an operations tool and it is designed for executing operational
48:25
style tasks. But obviously they can be mixed. That's not a problem. You can use any tool. The point of saying use an orchestration tool is to use one because it will take care of a lot of all this hassle for you. One of the big last areas I want to talk about
48:44
is data center as code. Okay. I have traditionally worked in companies that rack and stack servers. Okay. They'll buy VM heads, they'll rack and stack them, and they will use a tool like VMware and a UI in order to spin things up: go into the
49:02
UI and right click and deploy instances. Very recently there is a new tool created by a company called HashiCorp. They're the guys who made Vagrant and Packer and all these other awesome tools and it's called Terraform. Okay. And Terraform allows us to effectively manage our entire data center. Data center being virtual because I now use
49:25
EC2 and I can actually create it and destroy it. So let's have a look. This is actually
49:42
my entire VPC. This is our entire application stack and I don't know how you make that any bigger. But effectively we have a lot of application server sets. So we have RunDeck and demo site and Docker registry and our internal API and some other systems.
50:01
And then we have a VPC. We manage key pairs. We manage our NAT server which allows our internal servers outbound access to the Internet. We manage our subnets and security groups, et cetera, et cetera. Now all of this is actually managed in code and it's checked into GitHub. Therefore we can track any changes that were made that were
50:24
problematic. And if we have a look we can start to see if I make that any bigger
50:59
to be totally honest with you. But we declare resources, we declare inbound rules
51:05
and outbound rules. We declare tags that are in EC2 so we can easily identify things. We can declare security groups, CIDR blocks and everything. And at the end of it what I can do is I can go to my repository, my terminal or whatever system I use and I
51:26
have a directory called terraform and I can say tf plan. I have a bash alias for terraform to be tf. And if I have a network connection, it will compare the state on my local machine against what's currently in EC2 and it will tell me if there
51:40
are any changes to be made. It actually says at the bottom there are no changes, your infrastructure is up to date. But say somebody, not with bad intentions, wants to make some changes to the infrastructure and hasn't actually run it past anybody in the team, and they added port 5000 and changed a tag on a box from NAT to be NDC Oslo security
52:10
group. When I go back here I can say show me the status of our infrastructure and it will go oh there are now some changes that need to
52:21
happen here. Okay, the changes are as follows. The aws_security_group.nat, which is a box, has four rules now. It has an extra rule of port 5000 on ingress. It also has a changed tag; it's told me about the name change in the tag.
52:41
So we're actually managing our entire infrastructure. If I really want to and I want to be really destructive I can go, I'm really hoping nobody's using this infrastructure right now that would actually be quite funny, 200. If I actually want to change an entire CIDR block and change the IP allocations which I know would break everything I can then go and I can actually check it
53:03
as long as I don't run the apply command. If anybody sees me run that just stop me. Okay because that would be really bad and it'll actually tell me here oh look your subnet has changed as well and it will force a new subnet which will destroy all my machines. So we're actually managing everything. I'm not going on the GitHub UI or on the AWS UI at all here.
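Conceptually, the plan step demonstrated above is a read-only diff between what the code declares and what currently exists. The Python below is a toy model of that idea only; it is not how Terraform is implemented, and the resource names, attribute names and the ports other than 5000 are hypothetical.

```python
def plan(declared, actual):
    """Report what would change to make `actual` match `declared`.

    Nothing is modified here. Applying the changes would be a separate,
    explicit step, which is why running a plan is safe.
    """
    to_create = sorted(declared.keys() - actual.keys())
    to_destroy = sorted(actual.keys() - declared.keys())
    to_update = {}
    for name in sorted(declared.keys() & actual.keys()):
        diffs = {
            attr: (actual[name].get(attr), wanted)
            for attr, wanted in declared[name].items()
            if actual[name].get(attr) != wanted
        }
        if diffs:
            to_update[name] = diffs
    return {"create": to_create, "update": to_update, "destroy": to_destroy}


# What is running today (hypothetical values).
running = {
    "security_group.nat": {"ingress_ports": [22, 443], "name_tag": "NAT"},
    "subnet.private": {"cidr_block": "10.0.1.0/24"},
}

# What the code says after an unreviewed edit: an extra port and a renamed tag.
in_code = {
    "security_group.nat": {"ingress_ports": [22, 443, 5000], "name_tag": "NDC Oslo"},
    "subnet.private": {"cidr_block": "10.0.1.0/24"},
}

unchanged = plan(running, running)
pending = plan(in_code, running)
```

Here `unchanged` reports nothing to do, while `pending` lists the extra ingress port and the tag rename on the security group, mirroring the two changes called out in the demo.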
53:23
Everything is managed via code because we're all engineers and we like code. These tools are cool. They will make your life a little bit easier. I don't guarantee it. And we can start to do other things. We can manage our DNS.
53:46
Managing DNS means no potential spelling mistakes because you get a check in advance and you can always send a pull request and get somebody else to check your work. You can graph your system. Okay now all of these things
54:00
coming together and automating all these processes means that we can actually have a disaster recovery strategy. Who knows their disaster recovery strategy? Who has a disaster recovery strategy? A couple, not many. Your company will have them you just won't be made aware of them. Netflix
54:24
are the most famous company in the world for disaster recovery strategy. Their Simian Army is incredible. They have got an open source repository called the Simian Army and it has tools in there that you can take and deploy into your environment. They have three notable ones. You have Chaos
54:43
Monkey, okay, which will randomly destroy a machine, and hopefully that machine will self-heal and bring itself back up. Okay, they have Chaos Gorilla. Amazon is based on regions, and inside a region you have availability zones, and inside availability zones is where you would
55:00
have the machine. So Chaos Gorilla is the next step up and it will simulate an entire availability zone disappearing. Okay, so it'll go: eu-west-1a no longer exists. Can your system cope? And then Chaos Kong will come along and go: EU West doesn't exist. Can your system fail over to another region? And they practice this regularly, and by practicing this regularly it means
55:23
they're very good at what they do. They can deploy systems faster. They can develop better features. Your uptime as an end user is very, very good. On InfoQ, and I encourage you to go and look at this, the link's on the next slide,
55:43
there is a very good blog post called The Continuous Delivery Maturity Model, okay, and it talks about all these different areas and how basically if you tend towards the right you're getting better. Okay, I'm not going to rehash the article now but it's a very, very, very good article and it was by Andreas Rehn,
56:04
Tobias Palmborg and Patrik Boström, and basically the important thing it says is that it takes time and improvement to get towards the expert level. You can't just click your fingers. Okay, and I believe from a
56:21
survey that was taken in 2014 by a combination of IT Revolution, Puppet Labs and ThoughtWorks that the benefits of continuous delivery and DevOps mean that you are a higher performing IT team and are more agile. Apparently, according to some metrics, you can deliver code 30 times faster with 50% fewer failures. I love metrics like that. I don't know how; I read the
56:45
report. It's actually a very in-depth report. Lead time for changes can be a lot faster, a huge amount faster. Apparently, as a company that has implemented continuous delivery and DevOps, you can recover faster. We've already shown you the Netflix way of doing it.
57:06
They practice weekly. And then you can experiment more because if you're firefighting less and you're actually delivering value to your customers better then you can spend time experimenting. You have all of these
57:23
systems in place so you can start to go, I think maybe as an idea we could try this on the website, and you could run multivariate testing for a very small portion of your traffic. Continuous delivery means better products for your customers. Internal customers and external customers. People internally in
57:48
your company use your systems. They are just as important. They may not be the ones that are paying for your systems but they are still your users. Does anybody have any questions? Shoot. The question is how does he make a
58:18
give him a hug. No I'm kidding. I'm joking. I'm joking. In actual seriousness if you
58:24
sit with the sysadmin and you actually show an interest in the types of work that they're doing and start to, I'm not saying question or undermine what they're doing, but start to suggest: could I help you with that? If you put that in a GitHub repo or an SVN repo, could I send a
58:42
pull request to that? And if they see that you genuinely care and want to make the place better and make the application better, they will start to relinquish control. It happened to me. I was one of those annoying people who used to go and ask the sysadmins, can you put that in GitHub so that I
59:02
can actually start looking at the code? And Tom is an ex-colleague of mine and Tom will tell you and it becomes infectious because if somebody on another team is having a problem getting a feature out but they see that I can submit my own pull request in order to get my operational needs out faster then they'll start submitting pull requests as well. And it really
59:22
grows and grows and grows. Don't start too big. Don't go for, I want RDP access to our production servers. Because they're immediately going to clam up and they're going to just want to be protective of their system, or they'll feel that you're going to start poking holes in their system. So start
59:43
really small start with something that currently gives you pain today and pair program with them.
01:00:00
So, what is the best orchestration system or configuration management system for a predominantly Windows-style environment? In my honest opinion, it purely depends on the type of people you have working there. For example, if you have people that work in your company who love Ruby, then Chef will be very good for them, because Chef is written in Ruby.
01:00:23
If you have people who don't want to see the internals of the system but want a nice DSL, then maybe Puppet. If you want somebody who wants something that will run on Python because they love Python and they want to debug Python, then Ansible will work for them. It's honestly a matter of trying each tool and seeing how it fits within your company.
01:00:45
I cannot honestly say that one tool is better than the other, because next week there's probably going to be another tool, right? There probably will, and then that may be better. So, it's about trying them out, seeing how it works within your organization and the
01:01:00
people that you have. A success for us, at my last company: we actually had DBAs sending us pull requests in order to Puppetize SQL Server. When I got that and saw that pull request, I actually gave a random person in a bar a hug, and that was quite embarrassing. Any other questions?
01:01:23
Awesome. I'm like one minute over, so thank you all for coming. Please tweet me or send me any messages if you want to see more or more code samples, or if you're interested in more of my opinions that I actually can't say on video, then let me know. Thanks, guys. Have a good one.