Keeping Secrets - A Practical Approach to Managing Credentials
Formal Metadata
Title: Keeping Secrets - A Practical Approach to Managing Credentials
Series: ChefConf 2017 (Part 19 of 45)
License: CC Attribution - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.
Identifiers: 10.5446/34589 (DOI)
Transcript: English (auto-generated)
00:05
All right, cool. So the talk's called Keeping Secrets, a Practical Approach to Managing Credentials. My name's Chris Antonesi. As was said, I'm a software engineer at Socotra. I've been working in software since about 1999.
00:22
And in my early days, I was a Unix admin. And I worked on a wide range of systems, Linux to HP-UX and all across the board. I had pretty much stuck with large organizations until about 2012, and that's when I went to my first startup and started using slightly more modern patterns for deployments.
00:43
And in 2014, I went to Socotra, and I was one of the first three engineers hired to build a modern SaaS platform to help modernize insurance operations. So starting at a true DevOps shop from day one was amazing.
01:01
There was no code the day that I started. I got to help inform all the decisions, like how we were going to manage credentials and what we were going to do for our deployment strategy. And I'm going to be showing off all the tech and methodologies that we use at Socotra for deployment and keeping secrets today.
01:20
Also, just a plug: we're hiring. Check out socotra.com/careers for more information. So the problem. The bunny is the problem. This bunny wants to be fast, right? He's got that look in his eyes.
01:40
But he wants to be safe, so he stole the tortoise shell. And he feels like he can be both at the same time, but he has a really hard time doing it. And that's what we're faced with all the time when it comes to continuous delivery and managing secrets. We want the secrets to be safe. We want them to be per environment. We want the storage to be encrypted.
02:02
But we want to be able to deploy very quickly. And all through the late 90s and the 2000s, I was doing things like deploying software and then cluster SSHing to like 500 servers and setting a password for an environment.
02:22
Has anybody done this kind of thing? OK, at least one person. That's good, two people. And then things got slightly better. Around 2008, there were awesome tools like Chef introduced and encrypted data bags. And there was a little bit better way of handling it. But most recently, I've been very, very happy with Vault.
02:44
And that's what I'm going to show you today: the patterns that we use at Socotra for continuous deployment and for injecting secrets per environment into the deployments, into the configuration.
03:00
So the purpose of the talk, it's to define workflows for managing secrets for organizations that value continuous delivery. I plan to show how to use HashiCorp Vault as a system of record for secrets. And I plan on defining workflows for how those secrets can get injected into the appropriate places so that your applications can consume them.
03:22
I'm going to show you methods and workflows from development all the way through to production. So when we talk about these things, I'll be talking about what we do in development all the way through to production. And spoiler alert, it's the same thing. We use all the same methodologies from local development all the way through to production.
03:42
And I'm going to show you how that works. So I'm going to spend about 10 minutes going over some important points about Vault. There's really a ton of information about how to set up Vault and how to configure it and all the best practices around that. So I'm not going to go into that. What I'm going to go into are some
04:00
of the really important points for treating it as a system of record. This is going to be the source of truth of your secrets in your organization. And that's really, really important. So there's a handful of things that you don't really get from the Getting Started guides that I plan to show you. And so I'm going to spend some time doing that. Then I'm going to spend a little bit of time just talking
04:22
about a continuous delivery methodology that we use. And then I'm going to spend the remainder of the time talking about how secrets are injected into the configuration all the way from development to production. So I developed these methodologies
04:41
and some of these libraries I'm going to show you, at Socotra. And actually, when I was preparing for this talk, I was planning on showing you all the stuff that we were doing at Socotra. But I did something that I think is slightly more interesting: I took an open source project called Bazel that I've been working on with a friend.
05:03
It's a CMS tool. And we employed all of those methodologies and libraries that were developed at Socotra to deploy this as well. So it's proof to myself, and it's good open source material for the talk, so that you can look at examples and GitHub repos
05:21
when we're done. So HashiCorp Vault is a service that stores secrets in an encrypted data store and makes them available via requests to the server. So the end result of introducing Vault in your infrastructure is that you have this system of record for your secrets.
05:41
I'm going to touch upon the operational details of using Vault for this purpose. And we're going to go from there. So Vault is a system of record for secrets. This is a really important piece of infrastructure for your organization.
06:02
I recommend going to vaultproject.io, checking it out and playing with things in a dev environment. But when you get ready to actually use this for something real, there's things that have to be done. Vault has to be reliable. It has to be highly available and fault tolerant. It has to have auditing capabilities. Those auditing capabilities might be for compliance
06:23
or they might be just to figure out what happened if something goes wrong. But you have to have that. You have to have reasonable network security. You have to have backups. You have to have a restore procedure. And not only do you have to have a restore procedure, but you have to have a restore procedure that's
06:41
practiced regularly. And you have to have a reasonable approach to access controls for the different roles in your org. And finally, you have to have defined and documented procedures for interacting with secrets, for reading secrets, writing secrets, and removing secrets. And I'm going to show you some methodologies for that.
07:02
And I'm also going to show you a library that's available to do that kind of stuff for building deployment purposes. So reliability. That doesn't look super reliable. But your Vault cluster will be. There must be no single point of failure. In order to eliminate a single point of failure
07:22
from a systems perspective, you have to choose a storage mechanism that makes sense. And Vault has the concept of storage backends, which again, there's extensive documentation about. There's four that support high availability: DynamoDB, etcd, ZooKeeper, and Consul. I'm going to recommend Consul for the following reasons.
07:41
First and foremost, there's automated provisioning using Chef. It just makes it super easy. And it makes it possible to provision this the same way that you provision everything else, provided that you're using Chef for that. It has mechanisms for service discovery and failover, and that's what Consul's good at.
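As a rough sketch, assuming a recent Vault version (older releases spelled the stanza "backend" rather than "storage"), the server config pointing Vault at a local Consul agent might look like this; the addresses and file paths here are illustrative:

    storage "consul" {
      address = "127.0.0.1:8500"   # local Consul agent
      path    = "vault/"           # KV prefix where Consul stores Vault's data
    }

    listener "tcp" {
      address       = "0.0.0.0:8200"
      tls_cert_file = "/etc/vault/tls/vault.crt"   # TLS everywhere, per the network security section
      tls_key_file  = "/etc/vault/tls/vault.key"
    }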
08:02
So you get that right out of the box. Whereas if you do something with DynamoDB or etcd or ZooKeeper, you're likely going to have to build that stuff yourself. And there's also a well-defined method for backup and recovery. This stuff is well-documented, and there's patterns for doing this kind of stuff. So here I show you an example architecture.
08:24
And that URL is ridiculous. If you just search for deploying Consul in AWS, you would find this, but also I can make this available as well. But here you see that you have multiple nodes spread out
08:40
across multiple availability zones, and you're using Consul to manage this cluster. And by following this guide, you will get this kind of architecture for your Vault cluster. So auditing. It's important to see what happened over time.
09:00
So you should enable auditing on every action. The way that you do this is with vault audit enable. And then you can choose the different kinds of logging capabilities that it has. You can log to a file, syslog destination, or socket. So whatever makes sense in your organization, that's what you should use there.
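For example, sending an audit record for every request and response to a file looks something like this on a recent CLI (older versions spelled it vault audit-enable):

    # log every request and response, with secret values hashed, to a local file
    vault audit enable file file_path=/var/log/vault_audit.log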
09:23
So network security. Server communication should all happen over TLS, as we want the transmission of this data to be encrypted between build nodes and developer workstations and the vault infrastructure. Network access should really only be granted to systems that need it.
09:41
And the ideal situation is that your build nodes and your workstations are on some kind of network where you can communicate with this thing over an internal IP. That's the best practice. That's the best case scenario. However, people use tools like Circle and Travis.
10:04
People are using certain technologies that don't lend themselves to being able to do that. So if you absolutely have to, you should really tighten up network access control via firewall security groups or whatever that mechanism is for your organization.
10:21
And so if it has to happen over the internet, at a minimum, you should do that. And you should make sure that everything's communicating over TLS. And again, the steps for doing this kind of configuration are well documented in the vault documentation. So you have to have a backup and restore procedure.
10:41
You have to have this road to recovery here. You need to back up regularly. And this is whatever makes sense for your organization. If you're a small startup where you have a handful of engineers and secrets are changing infrequently, then it could be totally fine to do a backup once a week. If you're in a larger organization or a midsize
11:01
organization and secrets are changing all the time, then you need to do backups more frequently. At Socotra, we practice restoring everything once a week, which sounds crazy unless you have everything automated, and then it's not that bad. But I highly, highly recommend that if you introduce
11:21
this kind of system, that you have not just a procedure for restoring your secrets, but that you also practice doing it, that you have an actual drill to go through this with some regularity. Now again, it might not make sense for your organization to do this once a week.
11:40
But you need to do it with some frequency to ensure that it works as intended in case there is a disaster. And most importantly, document a procedure to validate that the restore worked. This could mean dumping all of your secrets out of both places into some kind of common format and doing a diff. That would be the simplest way to do it.
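A minimal sketch of that kind of check, assuming two reachable clusters and the secret paths used later in this talk; the hostnames are illustrative, and in practice you would loop over everything vault list returns:

    # dump the same field from the live cluster and the restored one, then diff
    vault read -address=https://vault-live.example.com:8200 \
      -field=password secret/production/mlab/db > live.txt
    vault read -address=https://vault-restored.example.com:8200 \
      -field=password secret/production/mlab/db > restored.txt
    diff live.txt restored.txt && echo "restore validated for this path"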
12:01
But come up with a validation process to ensure that there's consistency. Because again, this is a system of record. It holds a certain kind of importance. Even if it's not compliance, it's your sanity on the line. So the next point is about access controls. Here we have an example of a very simple access control scheme.
12:20
We have an engineer access control, a build access control, and an admin access control. And again, this is very, very simple. But you can see here that all the secrets for production are stored under secret/production, and all the secrets for development are under secret/development. Here we're allowing engineers access to read from development
12:42
but denying them access to production. For the build node, that's the middle one, we're allowing read for both production and development as we're going to need secrets for production and development to actually deploy both of those environments. And for admins, for the people who are actually going to be interacting with the secrets, we allow write to both secret paths.
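As a sketch, the engineer policy from that scheme could be written like this with the modern CLI (older versions used vault policy-write and a slightly different ACL syntax):

    vault policy write engineer - <<'EOF'
    # engineers can read development secrets...
    path "secret/development/*" {
      capabilities = ["read"]
    }
    # ...but are explicitly denied production
    path "secret/production/*" {
      capabilities = ["deny"]
    }
    EOF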
13:03
This is using the generic secrets backend, again, which is very well documented with Vault. So now that you have policies for each type of user that you have, you need to have a token per user. Because again, when it comes down to auditing and compliance and figuring out what happened,
13:21
you can't have shared accounts. It does not make sense to do that. So you will create a token for each engineer with vault token create, passing the policy that you want to assign to that user. And then that token would be provided to the user.
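That step is a one-liner, with the policy name from the step above:

    # mint a personal token bound to the engineer policy
    vault token create -policy=engineer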
13:42
All right, so now that we have all this, we need to define procedures for interacting with secrets. You want to ensure that you have a well-planned-out approach for inserting secrets. And you want to map out which secrets will be available at which path. And you want to keep the documentation up to date as much as possible. So what we ended up doing, at Socotra and on Bazel as well,
14:03
was we documented out all the secrets that we had first. And we came up with a pathing structure that made sense. So again, using the generic secret backend, we have secret/<environment>/aws/access_key, or secret/<environment>/mlab/db/password.
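With the generic backend, inserting and retrieving a secret at those paths looks roughly like this (the value is obviously a placeholder):

    # an admin writes the per-environment secret...
    vault write secret/development/mlab/db password=s3cr3t
    # ...and an engineer or build node reads it back
    vault read -field=password secret/development/mlab/db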
14:24
So we mapped out all of our secrets and documented them before we started inserting any of them. And I'm going to show you the document in just a second
14:42
with how we actually document what the secret is. So the administrators will be entering secrets. And you want a way to automate the installation of Vault so you can control stuff like the binary and the server address. Again, we do this with Chef. And it works out marvelously. You should include enough information and documentation
15:01
for a new engineer or administrator to understand exactly what their role is in interacting with secrets. Again, I can't emphasize this part enough. I think that you have to have a well-planned-out strategy for this before you just start dumping anything in there. Because this is a very, very important system
15:23
that every engineer and every build node is going to interact with. And you have to have a well planned out strategy for how these things are going to go. So I recommend doing this kind of documentation before you inject one secret. So then you want to document your secret path. And the path and the names may be intuitive.
15:42
It makes sense. What's the mLab DB password? But you want to actually say something about it so that people know what it is. They know that this is the per-environment password for mLab. They know that it'll be assigned as an environment variable at runtime. And they know that if they want to change the secret,
16:02
that they have to go to this place before they do the injection. So at this point, you have a successful Vault. You have a highly available, fault-tolerant cluster using the high-availability Consul backend. You have the ability to audit so that you know what's going on.
16:22
You have a solid backup strategy. You have planned recovery drills to ensure that your backups are actually valid. You've created policies and processes for assigning users. And you've documented procedures for interacting with your secrets, including what secrets are at what paths.
16:40
So that's a little bit about Vault and kind of the setup there, the stuff that you won't really get from the guides. Now I'm going to go on to talk about a continuous delivery model. So I'm going to explain how Bazel is continuously deployed. I'll start with an explanation of the system architecture. And then I'll move on to the actions that engineers do to trigger these automated deployments.
17:07
So Bazel has a very simple architecture. It's deployed in AWS. It's using ECS, the Elastic Container Service.
17:20
The nodes themselves, the application nodes, are in an auto scaling group across two different availability zones. There's an ELB and a Route 53 record associated with every deployment. I'm going to refer to this group of resources as a Bazel deployment. What happens with the Bazel deployment
17:42
is that every time there's an action against the repo, it creates a new deployment. And it's treated as either a development or a production deployment. And I'm going to go through the mechanics of how we handle changing DNS records and dealing with that kind of stuff
18:02
after we create new deployments. So here's a development environment workflow. Bazel uses Gitflow for branching and merging. So there's two long-lived branches in Bazel, master and develop. Engineers create branches off of develop, and they create pull requests where the destination branch is
18:23
develop. So whenever a pull request is created, there's a temporary merge of the branch with develop, and the build pipeline is executed. So the build pipeline will detect that the build is a pull request, and it will create this set of resources. It'll create the ELB, the auto scaling group, the nodes
18:43
within the auto scaling group. And it will create a DNS record that points to uniqueid.bazel.tech. At that point, there's a validation process. It's a series of read requests against uniqueid.bazel.tech.
19:03
And it'll go through and validate that the application is working as intended, and then it'll tear all those resources down. This is to ensure that every change that's made is deployable and works as intended.
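A sketch of what that validation could look like, assuming the deployment answers plain HTTP GETs; the endpoints here are hypothetical:

    # poll the fresh deployment with a series of read requests
    for path in / /api/health /api/pages; do
      curl --fail --silent "https://uniqueid.bazel.tech${path}" > /dev/null \
        || { echo "validation failed on ${path}"; exit 1; }
    done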
19:23
So once we've merged to develop, so we have our user-friendly-forms branch, and it was validated in a pull request build where it created all these resources, validated them, and then destroyed them, now we actually want to merge this guy to develop.
19:41
When we merge this to develop, the exact same process runs again. We run the exact same things that do the deployment. We create the resources in the exact same way that we did before, except this time, there's just one difference. Instead of tearing all these resources down, we change the CNAME of develop.bazel.tech
20:03
to the new deployment ELB. And now we're able to access the most recent change on develop. And this could be a time where a handful of feature branches are coming together to create a specific release. And you can actually go to develop.bazel.tech
20:21
to validate that the things that you expect to work, work as intended before actually pushing this guy to production. All right, so when all of the feature requests are in develop, when all the bugs are fixed,
20:41
when all the enhancements are there, when develop looks the way that you want production to look, we create a pull request from develop to master. And when we create that pull request, the CI server does exactly what you would expect it to do. It does the exact same thing again. Except this time, it detects that the branch,
21:02
the destination branch, is master, and it calls this a production deploy. And again, the one distinct step that's different from everything else is that the CNAME for bazel.tech is modified at that point to be the name of the ELB for the new deployment.
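The record swap itself can be a single Route 53 UPSERT; this sketch uses a hypothetical hosted zone ID and ELB name, and is shown against the develop record because a zone apex like bazel.tech would normally use a Route 53 alias record rather than a literal CNAME:

    aws route53 change-resource-record-sets \
      --hosted-zone-id Z123EXAMPLE \
      --change-batch '{"Changes":[{"Action":"UPSERT","ResourceRecordSet":{
        "Name":"develop.bazel.tech","Type":"CNAME","TTL":60,
        "ResourceRecords":[{"Value":"new-deployment-elb.us-east-1.elb.amazonaws.com"}]}}]}'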
21:23
And then we destroy all of the old resources that were comprising bazel.tech before. All right, so now to the part about secrets. We have three different parts of our build orchestration. There's prepare, image, and deploy.
21:43
Running a Bazel build is the aggregate of many different commands, right? Like we run npm to actually build the application. We run Docker to package it all up. We run Docker to push it into ECR. We run Terraform to create the deployment resources
22:03
that I've been talking about. And of course, we have to run the vault command as well. So I've chosen Rake to codify the build and deployment tasks. So there's three different phases. There's prepare, image, and deploy. And I'm gonna talk about prepare first
22:20
and I'm gonna show a little code snippet here. So when we prepare for the build, we do a handful of things that are important. The first thing we do is we require a gem called sbuild. It's a public Ruby gem, and sbuild has three functions. It has get_secret.
22:41
It has system_safe and system_retry, which are all kind of what you would expect. system_safe has some safeguards around the Ruby system call, and system_retry will retry commands that could have transient failures
23:02
due to internet dependencies. I'm sure we've all seen this kind of stuff happen. So sbuild is available on RubyGems, and the source code's available at my GitHub account. So we do a couple of things right off the bat. We make determinations about what this build actually is. Is it production?
23:21
Is it develop? What's the checksum? And then we do the part that this talk is all about. We make an environment determination, again, based on the branch. And then we get the secret based on the environment
23:41
for where we're deploying. And I showed you those paths to the secrets before. Oh, and one other thing that I forgot to mention is that all of this is overridable by setting these environment variables. So if you have these guys set when you're running your rake command, it'll get your secrets locally. It'll get the secrets from your environment variables.
24:02
So if you wanna change secrets to see what happens, or let's say you have your own AWS account and you wanna change your access key or something like that and not get it directly from Vault, you can just override it like that.
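A minimal Ruby sketch of that prepare logic; the sbuild function names come from the talk, but the exact signatures, secret paths, and branch variable are assumptions:

    require 'sbuild'  # public gem: get_secret, system_safe, system_retry

    # environment determination, based on the branch being built (variable name assumed)
    env = ENV['GIT_BRANCH'] == 'master' ? 'production' : 'development'

    # environment variables win, so a developer can override secrets locally;
    # otherwise Vault, the system of record, is consulted
    aws_access_key = ENV['AWS_ACCESS_KEY_ID'] || get_secret("secret/#{env}/aws/access_key")
    db_password    = ENV['MLAB_DB_PASSWORD']  || get_secret("secret/#{env}/mlab/db/password")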
24:20
Yeah, so it'll get the secret for the appropriate environment and it'll store it in memory. And it'll make that secret available down the line. And I'm gonna show you that in just a minute. So the next part is about building the image. This is about how I'm building Docker image for this guy. And as you can see here,
24:41
I'm not using any secrets in the Docker image. I don't wanna store Docker images with secrets in them and put them somewhere that people can have them. I want these Docker images to be agnostic. I want them to be environment independent. So I want to create something that is environment independent
25:02
and that can take parameters, environmental parameters. I'm gonna show you how that works in the next slide. However, I do need to use these access keys to access ECR so you can see that I inject those access keys there before I run this AWS command.
25:23
And yeah, so secrets are available from that first part for the environment that you're interacting with. That kind of stuff's available to you down the line. And now comes the actual deployment part. So I'm using Terraform for deployment.
25:41
And you can see here that all of the secrets, all of the stuff that came from prepare are parameters in the deployment, in the execution of the deployment, the creation of these resources. So you can see that I have these,
26:00
that I'm passing in the access keys. I'm passing in some Google auth keys. I'm passing in the DB password. And I'm running Terraform with this stuff and they are getting templatized into the ECS task definition. And they will be available as environment variables
26:22
inside the running Docker container. So at this point you can do what you want with these secrets, right? Like they're environment variables in the Docker container. It's running. You can do something with Chef to create templatized config files.
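A sketch of the parameterized Terraform run described a moment ago; the variable names are illustrative, and each value lands in the ECS task definition rather than in any file on disk:

    # every secret from the prepare phase is passed as a Terraform variable
    terraform apply \
      -var "aws_access_key=${aws_access_key}" \
      -var "google_auth_key=${google_auth_key}" \
      -var "db_password=${db_password}"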
26:42
What we did for Bazel was we made it so that Bazel would read from specified environment variables at execution time. So there was no reason to do anything else at that point. We inject these things as environment variables. Bazel starts up.
27:00
It has a list of things that it expects to be there and those things are there and those things are per environment because of the determinations that we made in the prepare phase and because the deployment itself is parameterized.
27:22
So this is the ECS task definition. This is what the AWS API expects. These things, again, get passed through the system
27:41
as variables in the rake file and then they get passed to the terraform command which populates these and writes them to the AWS API to actually have these things injected into the running Docker container. So when the cluster's up and running,
28:02
these environment variables, they'll contain all the passwords and Bazel will just read them and start up the way that it starts up. So I think that the thing that I wanted to pay the most attention to when doing this was thinking about
28:21
where these secrets would end up, right? Like, we've already said that we want vault as the system of record. This is the source of truth for your passwords. But it feels very important to know where these guys are gonna end up at the end of the day. And for me, I didn't want them written to any disk
28:41
in any kind of public service or something like that. So you have to consider your risk vectors when it comes to where these things end up. And the thing that made the most sense for us was to store these things in memory on the CI server
29:02
and to inject them into the actual running container at start time so that they were never written to disk in a place where someone could find them or that they weren't written to logs or something like that. So I think that when you're doing this stuff,
29:23
what actually happens to that secret when it's decrypted, and where it ends up, and where it's made available to the resources that are gonna read from it, is a really important thing to consider. And for us, the thing that made the most sense, both with the Socotra project and with Bazel, was to inject these things at runtime so that they would be available like that.
29:44
So yeah, this was born out of years of managing secrets in weird ways, in KeePass wallets, and passing them around. And actually at Socotra, what we end up doing
30:00
is we have pretty extensive configuration, many tens of lines of configuration files for a Java REST API service. What we end up doing is we pass all this stuff in as environment variables to the container and then we do a Chef run. The Chef run is actually the entry point for the container.
30:24
So it will take in all these environment variables, and this recipe will apply all of these environment-specific variables and create config files using templates, and it works out pretty nicely.
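A minimal sketch of that pattern: a Chef recipe, run as the container entry point, that renders environment variables into a config file (the file names and variable names are assumptions):

    # recipes/configure.rb -- runs at container start, not at image build time
    template '/opt/app/config/application.conf' do
      source 'application.conf.erb'
      variables(
        db_password: ENV['MLAB_DB_PASSWORD'],  # injected via the ECS task definition
        aws_key:     ENV['AWS_ACCESS_KEY_ID']
      )
      sensitive true  # keep the rendered secrets out of Chef's logs
    end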
30:41
So Bazel is a case where we do everything with environment variables. Socotra does everything with environment variables too, but it takes it one step further and actually creates config files using Chef at execution time. So that concludes my talk.
31:00
I wanted to leave some time for some Q and A. I'm not sure if people have questions about how this stuff works, but I'm more than happy to answer them. Any particular reason for not going from the recipes into Vault or Consul and grabbing the parameters that way instead of environment variables? Seems like the cleaner way of doing it, or?
31:24
Well, the reason that I didn't do that is because if the servers are compromised, I don't want the application server to have access to Vault. So I talked about risk vectors, but I didn't mention this: there's only two things that have access to Vault
31:41
in my infrastructure. It's engineer machines which we have tight control over and build nodes which we have tight control over too. And so when thinking about it from a network access and network security perspective I wanted to go that route with it. Having said that, I am tempted to do a proof of concept
32:03
the other way and try and create really tight access controls around the application nodes themselves and seeing how that would work and seeing what we can do with pen testing and if it actually makes a difference. So I have a question about renewing the Vault token.
32:25
Yes. So I remember the token can last a maximum of about 32 days? Yeah, yep. So what kind of approach do you recommend? Can we put code in the cookbook
32:43
to renew the token? Yeah, so we actually have a manual procedure for doing this. And I know that it feels a little bit heavy handed, but there's certain things around secrets
33:01
that just can't be automated, right? Like for example, if we wanted to store all of these secrets in a file, that would be great. We could auto generate all this stuff but that's exactly what we're trying to avoid, right? We don't want to do that kind of stuff. So when it comes to, I'm kind of taking a little segue to get to my point. When it comes to doing certain things with vault
33:23
we just have manual procedures that we do with some regularity and they're part of our planning and they're part of the projection so that we can ensure that we're taking them into account when trying to accomplish our goals. But when it comes to the token renewal
33:40
and how to propagate that, right now that process for us is manual. Okay, thank you. Which token are you talking about? So there's a root token for Vault. Well actually, there's a handful of different times where you need renewal.
34:01
For example, the root token is valid by default for a certain amount of time. There's a renewal process for that token that you have to do and there's places that token has to be. So when you do that renewal you have to ensure that you can, that you have the procedures in place
34:20
to make sure that it ends up where it needs to be after the renewal. Having said that, though, there is a way to disable the expiration if it's an issue in your environment, or not just disable it, there's also ways to change it. So again, when it comes to mapping out what you're gonna do with your secrets,
34:42
I think that it makes sense to account for this kind of thing in your plan.
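For reference, the renewal itself is a single command on a recent CLI (older versions spelled it vault token-renew); what has to stay manual is making sure the renewed token propagates to every place it lives:

    # renew the calling token; a specific token can also be passed as an argument
    vault token renew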
35:02
Have you considered using Vault's ability to dynamically generate passwords? Like, for example, here the DB password is something that I would do dynamic generation around, for each container uniquely, and basically that password will only be valid for the lifetime of the container and then it just disappears, and thus, if anybody got access, it's gone.
35:20
So we actually use the AWS backend for Socotra. So there's a bunch of different secret backends. I've been talking about the generic one where you store secrets at a path and you're able to retrieve them at that path, but there's another one called the AWS backend and that does exactly what you're talking about. You associate a policy with that
35:42
and it'll create keys dynamically every time and then you can take those keys and inject them and the keys will only live for the lifetime of the deployment or the container. That's exactly what we do at Socotra. We have a prepare phase, much like we're using for Bazel and it will generate those keys
36:01
and they'll propagate all the way down and when it comes to the destroy, when there's like a promotion of a deployment to production, it'll take those keys and it'll destroy them so they're no longer valid. Having said that, this does take some finesse getting it right, but it is a very, very nice pattern
36:24
that we followed and that I really like and again, speaking to compromise, if there is ever a compromise, you're good because you can tear down that, you create a new deployment, whole new set of keys, you tear down the other deployment, as part of that, it takes away those keys
36:42
and now you don't have to worry about it anymore. Yeah, good point, thanks for bringing it up.
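A sketch of that dynamic-credentials flow with the AWS secrets backend; the role name and lease ID are illustrative:

    # each read mints a fresh, short-lived IAM key pair for the named role
    vault read aws/creds/deploy
    # revoking the lease, e.g. when a deployment is torn down, invalidates the keys
    vault lease revoke aws/creds/deploy/<lease_id>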
37:01
So it looks like you guys are using just tokens to authenticate your build system. Have you looked at using the AppRole authentication, where you have to have two unique bits of information, so that you can actually store one with the code and have one be actually on the build system? Yeah, so I've been talking about what we're doing for Bazel, but for Socotra, we do that, yeah. Yeah, thanks for bringing that up too. I'm behind the pillar, just so you know.
37:20
I just see the pillar. Yeah, so you had talked about things being well-documented and standard practices; where are those well-documented? Yeah, thanks for asking that. For the small project, we're using Google Docs, but for the bigger,
37:45
for Socotra, which is a big project with a lot of engineers, we're using Confluence. I think that it would actually be really interesting to do some kind of feature for Vault where you could auto-generate documentation. At Socotra, we auto-generate documentation for everything and it's just like part of the plan,
38:02
but because this doesn't exist in Vault right now, this is a fairly manual process. 3:40 on the dot. Thank you, Chris. Thanks. Thanks.