Python Applications with Habitat
Formal Metadata
Number of Parts | 45
License | CC Attribution - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers | 10.5446/34602 (DOI)
ChefConf 2017, 6 / 45
Transcript: English (auto-generated)
00:05
Good afternoon, everyone. Welcome to Deploying Python Applications with Habitat. One thing I want to get out of the way is the basic assumption that you've done the intro to Habitat, learned the basics, and understand them.
00:24
That's the groundwork you need for this talk. If you haven't, go check out some of the videos that may already be up there explaining this and then come back. I'll be waiting here for you. But if you haven't done that, the too-long-didn't-read version is
00:40
Habitat allows you to build, package, and deploy. This includes application configuration and management, and continuous integration and delivery, and it's platform agnostic. One of the things I want to lay down here is the typical Python project.
01:01
We're going to dissect it so you can break it down and see all the things you've either been doing wrong, or that Google tells you you've been doing wrong, for all these years. One of the places we always start is the distribution's version of Python. And it's always old and outdated
01:21
and that's a problem. Not to mention the system carries along its own set of libraries that invite themselves into your project. So we use things like virtualenv to create our environments, then we pip install our requirements, and then we run our app. That's just how life has been
01:41
for the longest time in the Python community. Now, under the hood, this gets a little more complicated. As I mentioned, under this philosophy we're restricted to the distribution's version of Python. We're isolating ourselves from the system site-packages, which is why we use virtualenv. And we're also using manylinux binary wheels.
02:04
If you're not familiar with these, they are precompiled binary builds of Python packages. So if you're using something like Pillow or lxml, these are pre-built binaries built against an old Linux distribution, CentOS 5. Habitat doesn't work well with older versions of libc
02:23
because we're trying to be on the edge, on the latest stable version of everything. So we have to ignore those wheels. The other thing is that with Habitat, because of the way we do our packaging, you can actually skip virtualenvs. And so you can get yourself
02:41
out of this whole workflow, because the way we're shipping Python, we strip away all site-packages and you get only what your application needs. There's no interference from outside sources. Now, there are some things you can do in existing environments to work around these problems.
03:01
One way is to build Python from source or use a third-party package. Okay, that's all right. You can also build your binary modules from source, which, if you've done it, is painful. It's not fun. Being able to have these packaged up in a more meaningful way,
03:21
similar to how manylinux wheels work, and making it easier, would be a big boon to Python development itself. Another fundamental to look at is the whole idea of 12-factor apps. If you haven't read this book or seen it, I definitely recommend reading it.
03:41
I have the link to it in the reference section of the slides. It highlights the fundamentals you should be following in your application development pipelines, and I'll cover how Habitat applies to them. I'm going to break this down. We have the first five here. The first one is one codebase, one application.
04:03
What does that mean? Essentially, one Git repository for your application. That repository can include multiple processes: a web service, a worker service, a cron service, and things of that nature. It does not mean multiple apps.
04:21
You shouldn't be packaging your iOS app and your web app in the same Git repository. That is forbidden under this rule. The next item is API first. This focuses on documenting your APIs first, using tools like Swagger, Apiary, RAML, or anything similar. It's a blueprint
04:41
to help you communicate your whole app to your teams. It's starred simply because it's an addition to the original 12 factors. The next one is dependency management, which is going to be covered very heavily in this talk.
05:02
It's one of the big areas where Habitat excels. The next one is design, build, release, and run, which is another thing Habitat can help you with. Configuration, credentials, and code: same thing, Habitat can help you there. The next ones are logging, disposability,
05:22
backing services, environment parity, and administrative processes. These are all things Habitat can help you with. Logging should always go to standard out. The Habitat supervisor handles that, and it also lets you go grab the log folder or log file.
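As a minimal sketch of the "log to standard out" factor on the application side, the snippet below configures Python's standard `logging` module to write to stdout and leaves collection to whatever supervises the process. The function name `build_stdout_logger` and the log format are illustrative assumptions, not something shown in the talk.

```python
import logging
import sys


def build_stdout_logger(name: str = "app") -> logging.Logger:
    """Return a logger that writes every record to standard out.

    Hypothetical helper for illustration; the name and format are assumptions.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # Avoid stacking duplicate handlers if this is called more than once.
    if not logger.handlers:
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
        )
        logger.addHandler(handler)
    # Keep records from also being emitted by the root logger.
    logger.propagate = False
    return logger


if __name__ == "__main__":
    log = build_stdout_logger()
    log.info("service started")
```

The application never opens a log file itself; the process supervisor decides where the stream ends up.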
05:40
Habitat allows your apps to be disposable and immutable, and it allows you to bind your backing services into your code. Backing services means things like MySQL, Redis, or anything of that nature. Habitat also helps with maintaining environment parity. As you've already seen in some of the earlier talks, Habitat has a production and a development
06:00
or unstable environment, and it lets you maintain that parity through your entire pipeline. It also lets you create administrative processes, which are those one-off jobs, like going in and migrating the database to update your schema. The next one is port binding.
06:21
Habitat allows you to explicitly declare your port bindings. That's also part of the process when we export to Docker containers: the declared ports are carried over, so Docker knows what to expose. Next, stateless processes. Part of this is making sure you ship your application state out of your application
06:41
to a proper backing service. That gives you the ability to handle the chaos monkey and make sure you can recover from it, which brings us to the next topic, concurrency. Concurrency is being able to run multiple
07:01
concurrent instances (not versions) of your application simultaneously. Telemetry is another one, with an asterisk as well, because it's a new addition to the 12 factors. Basically, telemetry means grab everything, and as I go on, I'll demo
07:20
one of those telemetry tools you should probably be using in your codebase, by the name of Sentry. The last one is authentication and authorization, which is making sure you secure your applications from unauthorized users, using things like OAuth2, single sign-on, and so forth.
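To make the port-binding and backing-service factors from this list concrete, here is a small sketch of an application reading both from its environment instead of hard-coding them. The variable names `APP_PORT` and `APP_REDIS_URL` and the defaults are hypothetical, chosen only for illustration; they are not Habitat settings.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """Runtime settings injected from outside the process."""
    port: int
    redis_url: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build settings from environment variables.

    APP_PORT and APP_REDIS_URL are assumed names for this sketch.
    """
    port = int(env.get("APP_PORT", "8000"))
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    # The backing service is just a URL; swapping it needs no code change.
    redis_url = env.get("APP_REDIS_URL", "redis://localhost:6379/0")
    return Settings(port=port, redis_url=redis_url)


if __name__ == "__main__":
    print(load_settings())
```

Because state lives in the backing service and the port is declared from outside, any number of identical instances can be started or thrown away.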
07:42
Now, the first thing I want to get into is dependency management. We have to ask ourselves: how reproducible is our environment? How can we ensure our modules integrate with each other? If we take a look at what an existing Python pipeline looks like, we need to define a requirements.txt,
08:02
or anything of that nature. I shouldn't stand there, but you basically get this little dependency graph. You start out at Python down here, and then you work your way up. This is a basic example of the dependencies of Django, as if you were just running a basic Django application.
08:21
Now, one thing you cannot do is control what lies beyond Python and ensure absolute reproducibility. When you're deploying to an existing Linux distro, like Ubuntu or CentOS, you're at the mercy of what the OS provides, even if you've built your own Python, so things can potentially be inconsistent.
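One rough way to ask "how reproducible is our environment?" at the Python layer is to check whether every entry in requirements.txt is pinned to an exact version. The sketch below does only that; the function name and the deliberately simple pattern are assumptions for illustration, and it says nothing about the C libraries underneath, which is exactly the gap described above.

```python
import re
from typing import List

# Matches "name==version" or "name[extra]==version"; intentionally simple.
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]+\])?==[^=\s]+")


def unpinned_requirements(text: str) -> List[str]:
    """Return requirement lines that are not pinned with '=='."""
    loose = []
    for raw in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        # Skip blanks and pip options such as "-r other.txt".
        if not line or line.startswith("-"):
            continue
        if not PINNED.match(line):
            loose.append(line)
    return loose


if __name__ == "__main__":
    sample = "Django==1.11\npytz\nsix>=1.10  # floating lower bound\n"
    print(unpinned_requirements(sample))
```

Even a fully pinned file only fixes the Python packages; the interpreter build and system libraries can still differ between hosts.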
08:43
Now, when we apply the Habitat model to this, we get the whole graph. Here's our application up here again (sorry for it being so small), but now we get all of our C libraries, all the way down to the Linux kernel right here, and this gives us a lot more reproducibility
09:02
in our application. Now we're controlling that entire stack, which brings me to the next thing: what does it look like when you make a Habitat plan for Python packages? I'm going to run over that.
09:20
You have two ways to do it. You can map one Habitat plan to one Python library, or one Habitat plan to multiple Python libraries, and each has advantages and disadvantages. One of the big advantages of one-to-one packaging is that you can leverage the Habitat Builder system
09:42
to update a single Python package and have your entire chain of dependencies updated, going back to that graph I was pointing out earlier. The other thing is that all of these are namespaced into their own Habitat package plans, so you get the ability to make sure
10:00
your binary builds are fully reproducible as well. And if you just want something simpler, the old traditional pip install -r requirements.txt, you can use the one-to-many approach. Part of this is some of the work I've been doing
10:21
on Habitat to make Python packaging easier. The things to look at are the package-env-separator and add-path-env. These are pieces of functionality I've written to allow decoupled Python packaging, and you can add these specific items to your plan.
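The idea of decoupled packages contributing to one shared Python path can be sketched roughly as follows: each package exposes its own path entry, and the entries are joined with a separator into a single value. This is only a model of the concept; `join_env_paths` and the example directories are hypothetical and are not Habitat's implementation or real package paths.

```python
import os
from typing import Iterable, List


def join_env_paths(package_paths: Iterable[str], separator: str = os.pathsep) -> str:
    """Join per-package path entries into one value, keeping first-seen order."""
    ordered: List[str] = []
    seen = set()
    for path in package_paths:
        # Skip empty entries and duplicates so the result stays stable.
        if path and path not in seen:
            seen.add(path)
            ordered.append(path)
    return separator.join(ordered)


if __name__ == "__main__":
    # Illustrative directories only.
    entries = [
        "/pkgs/example/ipaddress/site-packages",
        "/pkgs/example/six/site-packages",
        "/pkgs/example/ipaddress/site-packages",
    ]
    print(join_env_paths(entries, ":"))
```

Setting the joined value as `PYTHONPATH` gives an interpreter that sees every package without a virtualenv having been created.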
10:41
When you install these packages together and run them with a supervisor, it takes them all like Lego bricks and plugs them into each other, and you have an instant virtual environment without having to compose and install one. Let's walk through the basic idea of a plan. Here's one for a Python library called ipaddress,
11:03
with just basic metadata here, nothing out of the ordinary. In the next stage, when we define our dependencies, we declare our dependency on Python, and we use setuptools to install this package. Because not every Python package is necessary at runtime, in the case of setuptools
11:23
for this package, you can put it in the build deps, so it's removed from the environment once your application is packaged. The same goes for pip. Why ship a tool that lets people install libraries into an application once it's already packaged up? It's just unnecessary. Then we specify the Python path separator,
11:41
which lets us take the Python path we defined in this package and join it with those of the other packages in our Python ecosystem. Then we run python setup.py build. This is an easy way to build the package into the folder without installing it, so we can run tests against it,
12:00
but that's not illustrated here. Then we have do_install, where we attach the Python path to the package and install it into our package prefix. A couple of things to note here. The --prefix option tells Python where to install it, so it's relative to the Habitat package namespace, and the --no-compile option
12:22
is important as well, because we don't want compiled .pyc or .pyo files in there. If you run Python with a different optimization level via the -O flag, those files get recompiled, and then you're breaking the laws of immutability
12:42
that Habitat is trying to adhere to. To explore the package environment a little more, we can look at how it's resolved at runtime. The only thing that gets pulled in at runtime is the package environment, and entries are processed in the order in which your packages are defined in your package dependencies.
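On the --no-compile point above: the bytecode file Python writes depends on the optimization level, so a package directory can gain new files at runtime. The sketch below uses the standard library's `importlib.util.cache_from_source` to show the distinct cache paths per level on Python 3 (where `.pyo` files were replaced by `opt-` tagged `.pyc` files); on Python 2 the same effect showed up as `.pyc` versus `.pyo` next to the source. The helper name `bytecode_targets` is an assumption for illustration.

```python
import importlib.util
from typing import Dict


def bytecode_targets(source_path: str) -> Dict[str, str]:
    """Map each optimization flag to the bytecode path Python would use."""
    return {
        # No -O flag: plain cache file.
        "default": importlib.util.cache_from_source(source_path, optimization=""),
        # python -O
        "-O": importlib.util.cache_from_source(source_path, optimization=1),
        # python -OO
        "-OO": importlib.util.cache_from_source(source_path, optimization=2),
    }


if __name__ == "__main__":
    for flag, path in bytecode_targets("/pkgs/example/lib/module.py").items():
        print(flag, path)
```

Three different files for one source module is why shipping no bytecode, and not writing any later, keeps the package contents fixed.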
13:05
We start at your root package, which would be your Python package. Then we go to your dependencies, and then to your transitive dependencies. This builds up your entire Python environment stack. When you're running,
13:21
if there are any collisions, it becomes a no-op, and we issue warnings. This changes a little at build time, though. When you go to hab build, we look at both the package environment and the package build environment. But when it comes to the dependencies, we do it in reverse.
13:40
This is so that we can catch it early and tell you that you're going to have collisions between these packages. You have the option to force an override if you really want to. So we start with transitive dependencies, go to dependencies, and then go to our root package. And, like I said, overwrites will fail,
14:01
and need to be declared explicitly. The other one that's not as obvious is the package environment separator. This is key if you want your packages to play with each other: if you're joining paths, you need to be able to define it, and this lets you do that.
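The two resolution orders described here can be modelled in a few lines. This is a sketch of the behaviour as narrated in the talk (runtime: root first, collisions are a no-op with a warning; build: reverse order, overwrites fail unless explicitly allowed), not Habitat's actual code. The function names and the `allow_override` parameter are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

# Each layer is (package name, environment entries), ordered root first.
EnvLayer = Tuple[str, Dict[str, str]]


def resolve_runtime(layers: List[EnvLayer]) -> Tuple[Dict[str, str], List[str]]:
    """Walk root -> deps -> transitive deps; a collision is skipped with a warning."""
    env: Dict[str, str] = {}
    warnings: List[str] = []
    for package, values in layers:
        for key, value in values.items():
            if key in env:
                warnings.append(f"{package}: {key} already set, ignoring")
                continue
            env[key] = value
    return env, warnings


def resolve_build(
    layers: List[EnvLayer], allow_override: FrozenSet[str] = frozenset()
) -> Dict[str, str]:
    """Walk transitive deps -> deps -> root; overwriting fails unless allowed."""
    env: Dict[str, str] = {}
    for package, values in reversed(layers):
        for key, value in values.items():
            if key in env and key not in allow_override:
                raise ValueError(f"{package}: {key} collides; declare the override explicitly")
            env[key] = value
    return env


if __name__ == "__main__":
    stack = [("app", {"LANG": "C"}), ("python", {"LANG": "en_US", "PYTHONHOME": "/py"})]
    print(resolve_runtime(stack))
    print(resolve_build(stack, allow_override=frozenset({"LANG"})))
```

In both orders the root package's value ends up winning; the difference is whether a collision is tolerated quietly or stops the build.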
14:22
The main highlight of what I wanted to cover is Sentry. Sentry is not a simple application. It's probably one of the best examples of a real-world application you'll find in Python. I chose it because I didn't want to do
14:43
any more hello-world examples, because I'm sure everybody's sick of those by now. If you're unfamiliar with Sentry, it looks like this. It helps you track all the exceptions in your application. The way it does that is through its own plugin, called Raven. You install it in your application,
15:01
and lots of languages are supported. If you look underneath the application, we have several services. As I talked about earlier, we have our web, our worker, our cron, and an SMTP mailer. Each service should get its own Habitat plan, which has to do with how the hooks and bindings work.
15:23
To facilitate that, you depend on the Sentry package and then create additional packages underneath it that add the hooks needed to support the individual processes of this application. Your plans would look like this.
15:41
They basically all inherit from Sentry. Now, the one thing is that Sentry itself has what I like to call a bit of dependency overload. This is something I ran into early on, with versions of Habitat from before they introduced the continuous build system.
16:00
One of the things that was problematic was juggling version numbers and build numbers while trying to bring in so many packages at once. That's been solved now, but it's been a bit of a painful path. The Sentry package itself
16:21
has 109 packages included in it, and there are 338 dependency nodes connecting to each other. If you go on GitHub, you can pull up the diagram and inspect it a little more, but it's just massive. The other thing I also wanted to talk about
16:41
is where Habitat packaging is going in the future. We have Python scaffolding, which we've taken multiple stabs at. We haven't had a full candidate we were happy with, but we're going to keep working on it. So the plan syntax I showed you will change a little, and hopefully be automated by the scaffolding.
17:01
Another thing is dependency integration testing. With large, complex graphs such as this, if you have a vulnerability, let's say in OpenSSL, you rebuild that package. Anything that depends on it, such as Python and any subsequent Python packages, gets rebuilt up the entire graph. On top of that,
17:22
we can also run the tests included with every single package. This is one of the big highlights of one-to-one packaging: you can test all the packages to make sure they coexist with each other before you even deploy. We all know that when you build and deploy a single package, you get only a limited amount of integration testing.
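The "rebuilt up the entire graph" behaviour amounts to finding every package that depends, directly or transitively, on the one that changed. The sketch below computes that set with a breadth-first walk over an inverted dependency map. The graph contents and the function name `packages_to_rebuild` are illustrative assumptions and do not describe how the Habitat build service is implemented.

```python
from collections import deque
from typing import Dict, List, Set


def packages_to_rebuild(depends_on: Dict[str, Set[str]], changed: str) -> List[str]:
    """Return every package that transitively depends on `changed`.

    The result is in discovery order, which is not necessarily a valid build order.
    """
    # Invert "package -> its dependencies" into "dependency -> its dependents".
    dependents: Dict[str, Set[str]] = {}
    for package, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(package)

    order: List[str] = []
    seen = {changed}
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        # Sort for deterministic output.
        for parent in sorted(dependents.get(current, ())):
            if parent not in seen:
                seen.add(parent)
                order.append(parent)
                queue.append(parent)
    return order


if __name__ == "__main__":
    graph = {
        "python": {"openssl"},
        "django": {"python"},
        "sentry": {"django", "python"},
        "redis": set(),
    }
    print(packages_to_rebuild(graph, "openssl"))
```

Each package in the returned list is also a candidate for running its own bundled tests, which is where the integration coverage comes from.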
17:41
So it's nice to be able to get that full soak and go down the entire path. There are also going to be more additions to the package environments. Now, these are the references, so I'll just run through them. The Twelve-Factor App (12factor.net) is the first iteration.
18:03
Then there's the 12-factor book. Python Packaging Without Complication was a talk given at PyCon just last week. If you're unfamiliar with how to package your Python applications, I'd totally recommend checking it out. Then, obviously, the Habitat documentation. And all the plans that I've been using
18:20
for building up Sentry are in my Python plans Git repo, so you can follow along there. Here's the URL. And I'm going to demo in a minute. Cool, so here's Sentry running on top of Habitat. To give you an example of how this all works, let's start an example project.
18:40
Let's call it Habitat, and let's pick our language. Unfortunately, Rust is not here, so let's just stick with Python for the sake of this conversation. And basically, we would be able to tie this
19:02
into our whole application. Thank you, guys.