
cargo deny

Formal Metadata

Title
cargo deny
Subtitle
Fearlessly update your dependencies
Title of Series
Number of Parts
490
Author
License
CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
A talk about cargo-deny, why we created it, and how it helps us manage our dependencies in the long term. The slides are available on Github, it uses reveal-md to present the slides, or you can just browse the contents of all the slides in their text form in slides.md
Transcript: English (auto-generated)
Yeah, so this talk is about cargo-deny, which is a tool that we have made to help us manage our dependencies. So we're going to talk a little about the situation and kind of why we made this tool and what the idea of it is,
and then go through kind of what it currently offers and possible features for it. A little bit about me. I've been in game development for about 13 years. Been at smaller companies, larger companies, and now very small companies.
And right now I'm at Embark. We're a game development studio, and we're writing an engine and platform in Rust, as well as other stuff not in Rust. So yeah, so here's the situation about why we created the tool, and it is totally our fault.
So kind of give a context of where most of us are coming from. We've been in game dev for quite a while, and generally game development is in large, monorepository style code bases.
And there's basically very, very few external dependencies. And all the external dependencies that you do have are vendored typically into the code base where they rot. And basically, as a kind of consequence of vendoring, there's extremely little interaction with game development in general and open source software.
So even though some software is used, like a typical example would be Zlib or other compression libraries, they get pulled down, they get vendored, and basically that's it. There's no more interaction. There's no bug reports. There's no PRs. There's no kind of, hey, could you add this feature that's important to us?
Usually it's added internally, never shared with anyone else. And this is kind of like, this would be what I would call classical game dev. Not everyone operates this way, but this is kind of the modus operandi.
And so if you turn to the Rust ecosystem, the Rust ecosystem is basically complete and utter opposite of this, right? Pretty much everything is shared and public. And if it's not public at the beginning, it eventually becomes public.
Or at least people can talk openly about what they're working on, even if it's not actual public code. And obviously, everyone knows it's quite large. I think the screenshot was taken two days ago or something. And yeah, being a huge package repository, well, not compared to NPM, but a large package repository with a lot of crates,
there's a huge number of axes that have different levels of quality and commitment from maintainers. And yeah, there's just a huge amount of options available across the ecosystem.
So going back to game development, when we started the company, obviously most game development is done in C++, so going to Rust was a big kind of thing that we wanted to do.
And obviously, one of the motivating factors for this was using this amazing crate ecosystem. So we use quite a lot of external dependencies on our main primary project, right now about 400 plus.
And we tend to keep all of them up to date. We kind of tend to live ahead. We update sometimes several times a day, but typically on a weekly cadence. And we actually have never vendored any dependencies.
We always fork, use a patch to the Git repository until our PR is merged, and then return back to the original project. And so we're kind of working quite differently from how we used to work. So yeah, so the problem that comes with this, the crate ecosystem is moving quite quickly.
We're updating quite frequently, and you can't just look at the Cargo.lock file or cargo tree or license hound or all of the stuff all the time. It's just extremely tedious.
And while cargo gives some tools, it doesn't actually give you a complete picture of the crates, because it doesn't really have that capability because people have different use cases and different requirements. And it's kind of up to you. And basically, we have this cadence now of updating quite frequently.
And the Rust ecosystem is obviously moving quite fast and moving faster in terms of new crates, and some crates get updated multiple times a day. And we basically want to keep this cadence for now. Maybe we'll change it in the future, but for now, we want to be updating quickly,
getting new features, fixing bugs, all that kind of stuff. So that's where cargo-deny comes in. So basically, our high concept of the tool is that it's a linter for your crate graph.
So the idea is to treat your crate graph as code and basically do what Clippy does, which is look for things that you've configured for and warn you or error, and basically make sure that as your crate graph changes, your expectations are met every time.
And so what we're checking for currently are licenses, bans, or basically saying I don't want particular crate or crates in my dependency graph, duplicate versions of crates, security advisories, and sources of your crates.
Licenses. So yeah, we just had a little bit of information about licenses. So again, going over this, crates usually specify their license terms in the Cargo metadata,
and this is kind of a pretty typical one, MIT OR Apache-2.0. And then they also have the ability to give a relative path to a license file, but you have to manually inspect that because Cargo doesn't care about it.
And so basically, the question we want to ask is are all the crates that we're using using licenses that we find acceptable, and then making sure that that holds true over time. So if we add new crates, it has licenses that we find acceptable, and then it's also possible for crates to change licenses when they make a version change,
sometimes even in patch versions. And so basically, this is a short snippet of a configuration that you could have. So basically you can configure what happens with unlicensed crates, crates that have copyleft licenses, and then a set of licenses that you explicitly allow,
as well as kind of exceptions for cases where you maybe don't want to blanket allow a particular license across all the crates possible, but on a particular crate. And then you can have other things like FSF Free or OSI-approved as well.
Basically what it does is evaluate the SPDX expression that's gathered from the crate. And so in this case, because we explicitly allow both MIT and Apache-2.0, it evaluates to true because both sides of the expression are true.
And this works for basically all SPDX expressions, including ones that actually aren't representable in Cargo, because the parsing in Cargo for the license field is actually not correct.
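The evaluation described above can be sketched in a few lines of Rust. This is a minimal illustration, not how cargo-deny is implemented: the function name `license_allowed` is hypothetical, and it assumes flat expressions that use only `AND` and `OR`, with no parentheses or `WITH` exceptions, which a real SPDX evaluator would have to handle.

```rust
/// Returns true if the license expression `expr` is satisfied by the
/// identifiers in `allowed`.
///
/// Assumption: `expr` is a flat expression such as "MIT OR Apache-2.0".
/// Parentheses and `WITH` exceptions are not supported in this sketch.
fn license_allowed(expr: &str, allowed: &[&str]) -> bool {
    // AND binds tighter than OR, so split into OR alternatives first.
    expr.split(" OR ").any(|alternative| {
        // Every term of an AND group must be explicitly allowed.
        alternative
            .split(" AND ")
            .all(|id| allowed.iter().any(|a| *a == id.trim()))
    })
}
```

With an allow list of MIT and Apache-2.0, `license_allowed("MIT OR Apache-2.0", &["MIT", "Apache-2.0"])` is true, while an expression such as "MIT AND GPL-3.0" is rejected because one required term is not allowed.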
And so as we saw earlier, there are some caveats to this. Right now, there's only kind of two sources of input, which is the license field itself, as well as any license files that are in the crate root. And so we basically scan the license files to determine the license that is in the file.
And then we basically combine them all together with an and expression to be maximally covered. But in practice, that's not really accurate, because especially for C dependencies,
people tend to include C code into the Rust crate, link it in with everything else, and then kind of completely ignore that the C dependency has a completely different license than the rest of the crate. And yeah, so we have another tool called cargo-about, which is about kind of doing a similar thing to the notice thing that we heard about earlier.
But the basic idea is that it does do full source code scanning of everything, and then finds all licenses and makes sure they comply with what the crate says, and then otherwise it'll force you to specify the licenses that you found
and what the expression is for the crate as a whole. But yeah, cargo-deny's job is to do that very, very quickly, so this takes, you know, a millisecond or so. So yeah, there are crates that we don't want.
And this is totally fine. Not all crates match the requirements that you have for your project, and there's a lot of crates that have different philosophies about how they update, or what features they provide versus other crates that kind of operate in the same niche. And so sometimes we find them and we say,
yeah, we don't want this, and we want to keep it out for all time. So a particular example that was kind of the motivating reason for creating this tool in the first place was OpenSSL. We despise OpenSSL. And unfortunately, basically, any time you do TLS in the Rust ecosystem,
it's almost always the case that OpenSSL is the default implementation for that, even if they provide a feature to use, for example, rustls or something. And kind of the reason that we find this annoying is it does have system dependencies, so if you have different systems, they have different versions of OpenSSL,
and then particularly for Windows, we have some Windows users who aren't necessarily programmers, and it's another dependency that they have to install and keep up to date, and it's just tedious. So we have, yeah, this is a very simple example. The reason it specifies the name is you can also specify particular versions that you deny,
instead of just all versions of the crate. And we do a quick change here, so that's reqwest, and we just turn on default features by deleting the default-features = false, and by default, reqwest uses OpenSSL,
and we see that there's an error now. And then basically every time cargo-deny finds anything wrong, and it puts out a warning or error that pertains to a particular crate, it optionally will produce the inverse dependency graph,
basically how the crate gets pulled into your crate graph. So the next check would be duplicates, which is a kind of interesting case in Rust. So if you're not aware, dependency resolution is hard,
and it's actually an NP-hard problem, and so some package managers will say, you can only have one version of a particular dependency in your project, and if you have conflicting versions, you have to figure out how to manually go down to one version.
Cargo, however, does not. It introduces a trade-off. So here's a really simple case. So we have a yours crate that depends on both theirs, some other crate, as well as log,
and then the theirs crate also depends on log, but fortunately they both resolve to the same version, and so everything's fine, right? You just have one version of log, and everything is great. The much more common case in the Rust ecosystem is that you depend on one version of log, you have another dependency that has a different version of log,
and so in this case, in the classical dependency resolution, this is unsatisfiable, and you have to choose one or the other, and you have to somehow get them to both work. But Cargo just says, why not both?
And this is great. By saying we can have multiple versions of the same dependency, you can automatically kind of resolve dependency quite easily and also fast. This is one of the great introductions to Rust.
When you're coming, especially from C++ or something, and you're adding dependencies and getting functionality, and everything just works and is kind of magical, and it also allows, most importantly for the ecosystem as a whole, to evolve at differing paces, right?
So one crate can decide, I want to use this bleeding edge version of some crate, and then the rest of the ecosystem is saying, okay, well, actually, that one's kind of risky. Maybe I'll wait until it's kind of stabilized or something. But they can both use that, and you can use all of the crates that use any of the versions,
and it's totally fine. The cons, of course, are not great. So if you have more versions of more crates, then you download more to compile, and then if you compile more, you link more, and if you link more, you have larger outputs, and that's both the actual final binaries that you ship
as well as your local target directories, which can get quite large, and then the fun, you expected type X and got type X. Thanks, cargo, or Rust. So the duplicate handling basically gives you
the way to see, look at your crate graph, give a concise kind of inclusion graph for that, and then allow you to kind of manage how you deal with duplicates. So basically say, okay, we're gonna deny multiple versions, and then we're gonna skip a few,
and then this is kind of what it looks like. So if we had two versions of base64, it'll basically give you the inclusion graph for all of the versions if you have more than two, though two is the typical case, and then kind of, yeah, highlight where they're coming in.
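The core of the duplicate check can be sketched as grouping the resolved packages by name and reporting any name with more than one version, minus the ones on a skip list. This is only an illustrative sketch: `find_duplicates` and its flat `(name, version)` input are assumptions made for the example, and the inclusion graph that the real tool prints is left out.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Finds crates that appear with more than one version.
///
/// `packages` is a hypothetical flat list of (name, version) pairs taken
/// from a resolved crate graph. `skip` lists crate names whose duplicates
/// are tolerated.
fn find_duplicates<'a>(
    packages: &[(&'a str, &'a str)],
    skip: &[&str],
) -> BTreeMap<&'a str, BTreeSet<&'a str>> {
    // Collect the distinct versions seen for each crate name.
    let mut versions: BTreeMap<&'a str, BTreeSet<&'a str>> = BTreeMap::new();
    for &(name, version) in packages {
        versions.entry(name).or_default().insert(version);
    }
    // Keep only names with several versions that are not skipped.
    versions
        .into_iter()
        .filter(|(name, found)| found.len() > 1 && !skip.iter().any(|s| s == name))
        .collect()
}
```

Called with two entries for base64 at different versions and an empty skip list, the result contains a single entry for base64 holding both versions. Adding "base64" to `skip` makes the result empty.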
But it also has a different graph output as well, optionally, and the idea is that the blue lines show the path to the lowest version, which is typically the one that you're going to want to get rid of, and then the red path is the path
to the one with the fewest number of edges, which is typically going to be the one that's the easiest to remove. Obviously, this is a contrived case. This is kind of much more typically what the graph will look like. I think this was like X Wayland in winit or something, but yeah, there's multiple duplicate versions there
that all go to this one, and yeah, it can be a mess. But the idea is that, yeah, it gives you the notice, like hey, something is here. And then once you find duplicates, then yeah, we have to decide what we want to do with them. Sometimes you don't care, and that's totally fine,
so then you just skip it, and then often you'll want to maybe open a PR to bump a version that needs bumping, or change your version to point to the same version that a dependency is using to just get rid of the duplicate. So it's basically whatever you want to do. There's a lot of options.
And to reiterate the point, we don't think duplicates are bad, because like I said, it does have a lot of positives. It lets you just do your work without getting in the way. But the duplicate detection is there so that you can notice it and actually make a decision about how you want to deal with it.
Next we have advisories. So the advisories are built on top of RustSec, so if anyone's used cargo-audit, this is the same core crate that cargo-audit uses to download, deserialize,
and inspect your crates with advisories. And this is cool, because it allows for a centralized knowledge base of advisories for all different kinds of things that people can contribute to and kind of help each other with.
And it's not just for vulnerabilities. There's also notices for unmaintained crates, so crates that the author is either completely unresponsive or has explicitly said, I'm not working on this anymore, or for crates that served a purpose, but for example may now be in the standard library itself
or have been supplanted by a superior crate. It also can detect yanked versions. This isn't really a problem in practice that I've noticed, but it's there if you want it, as well as obviously the possibility of more advisories in the future.
So you can basically say how you want to deal with vulnerabilities, how you want to deal with unmaintained crates, how you want to deal with yanked crates, and then the ability to ignore specific advisories that maybe you don't care about. In this case, spin is unmaintained, but lazy_static uses it, and everything uses lazy_static,
so it's kind of up to lazy_static to remove it at some point. The output kind of gives you the information that's stored in the database for each advisory, and that will give you links to alternatives in the case of unmaintained crates
or what versions you should update to to get rid of the security vulnerability. And the last thing that we have is the source check. So Cargo has multiple sources that it can use for crates, so there's the local source from a file path,
crates.io, other registries, but then most importantly, Git. So this is the blog post that kind of motivated the addition of this check, and it's basically talking about NPM lock files, but there is some relation to Cargo lock files.
So we have a typical kind of PR from my boss, updated dependencies, and as is typical with GitHub, it'll just hide the diff. Looks fine. But if you look in the lock file, you would see that actually the version changed, but also the source changed, right? So instead of going to crates.io,
you'd go to bit.com, definitely not mining bitcoins. So basically the configuration for this is just whether you allow or deny unknown registries and unknown Git sources, and then basically if you deny
unknown Git sources or registries, you kind of opt in to them instead of the typical case of just opting to get anything from anywhere. And that's kind of the output of it, yep, it just says it found a source that wasn't explicitly allowed and then points to the actual thing.
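The opt-in idea can be sketched as a check over the source strings recorded in a Cargo.lock file, which are prefixed with `registry+` or `git+`. The names `SourceCheck` and `check_source` and the two allow lists are hypothetical, chosen for the example; this is not the tool's actual code.

```rust
/// Outcome of checking one `source` string from a lock file.
#[derive(Debug, PartialEq)]
enum SourceCheck {
    Allowed,
    UnknownRegistry,
    UnknownGit,
    /// Any source kind this sketch does not understand.
    Unrecognized,
}

/// Checks a lock file `source` value against explicitly allowed URLs.
/// Both allow lists are assumptions of this sketch.
fn check_source(source: &str, allowed_registries: &[&str], allowed_git: &[&str]) -> SourceCheck {
    if let Some(url) = source.strip_prefix("registry+") {
        if allowed_registries.iter().any(|r| *r == url) {
            SourceCheck::Allowed
        } else {
            SourceCheck::UnknownRegistry
        }
    } else if let Some(rest) = source.strip_prefix("git+") {
        // Drop the `?rev=...` style query and the `#<commit>` fragment.
        let url = rest
            .split(|c: char| c == '?' || c == '#')
            .next()
            .unwrap_or(rest);
        if allowed_git.iter().any(|g| *g == url) {
            SourceCheck::Allowed
        } else {
            SourceCheck::UnknownGit
        }
    } else {
        SourceCheck::Unrecognized
    }
}
```

A dependency that quietly switches from the crates.io index to a Git URL nobody approved then shows up as `UnknownGit` instead of slipping through a collapsed lock file diff.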
So future, we probably want to add more checks. So one example would maybe be unused dependencies. So there is cargo-udeps, but that does actually hook into rustc itself. We think it could be faster
to just check for unused dependencies just by doing magnetic systems and so forth. We're also thinking about doing maybe proc macro and build.rs kind of stuff as well because while proc macros and build.rs are great, they're also huge security holes and we kind of want to be at least aware
when we add a dependency on something that uses a procedural macro or build.rs file, for example, just so that we can kind of say, should we be adding this? And yeah, if anyone else has any ideas of other things that could be added to it, it would be cool. And then also there's this issue for Cargo,
but the basic idea is that Cargo could expose some way to hook into the dependency resolution of Cargo itself. And this would be really cool because right now cargo-deny just looks at the resolved dependencies after Cargo is done with it,
and if you had this, you could, at dependency resolution time, stop something from getting into your graph in the first place. This would be particularly useful for things like security vulnerabilities, like I don't want to add a dependency and then have my CI fail. I just want to not have it in there in the first place. And we do have a GitHub action that you can use and here's a few of the users.
Most of them are crates, but there's tonic. For the tool itself, there's spdx, an expression parser and evaluator. krates is this simple thing that basically takes cargo metadata and marries it to petgraph. There's cfg-expr,
which evaluates cfg expressions, and there's cargo-audit and rustsec, and then cargo-about is kind of a license attribution tool as well. So yeah, that's it.
Time for about one question. Yeah, one question. No one, okay. Do we have any others? What does cargo-about do? So cargo-about, like, it deep scans every single file in the crate source to detect any licenses
and then basically makes sure that they match the declared licenses and if they don't, basically gives an error and then you can configure it to say like okay, actually there's an additional license in this location, but basically it just takes them all together,
puts all the attributions for each crate together and then you can take that information and pass it to a Handlebars template and then the Handlebars template can be whatever you want. We use like an HTML one and the idea is to create something like the Firefox attribution page so that you can list every single crate that's being used, every single license that's used
and then the license text