
The 7 key ingredients of a great SBOM


Formal Metadata

Title
The 7 key ingredients of a great SBOM
Subtitle
Ensuring your SBOM includes enough data to be actionable
Series title
Number of parts
542
Author
License
CC Attribution 2.0 Belgium:
You may use and modify the work or its content for any legal purpose, and reproduce, distribute, and make it publicly available in unchanged or modified form, provided that you credit the author/rights holder in the manner they specify.
Identifiers
Publisher
Year of publication
Language

Content Metadata

Subject area
Genre
Abstract
SBOMs vary wildly in the data they offer to consumers, and to make them truly useful we need to consider seven important points in their contents. Let's immerse ourselves in real-world software bill of materials data to look for the required features all great SBOMs ought to have. As a record of components, SBOMs can vary wildly in how they describe software. Some SBOMs lean toward security and some toward licensing. Some do a good job in their own niche, while others do not offer enough information to even understand what it is they are talking about. In this talk, we will visit the 7 key data points (syntactic correctness, dependencies, licensing, semantic structure, software identifiers, supplier data, and software integrity info) required to make sure your SBOM is useful to the widest possible audience. We will take an inside look at real-world SBOMs using the Kubernetes bom outliner. We will inspect how they are structured and the data they offer, looking for clues on how we could improve them, with the goal of learning what a great Software Bill of Materials looks like.
Transcript: English (automatically generated)
Yeah, all right. So first of all, thank you for staying. We have been going through all of those cool use cases and really complete tools to generate SBOMs, and I was thinking that I wanted to do [unclear].
As you have been hearing from folks, people working on SBOMs right now are starting to get concerned about what's actually in those documents. I think when Thomas opened the devroom today, the first thing he said was: well, those dependencies that you're getting, they may not be correct, right?
So I thought that, as we move to the last part of the conference, it would be cool if we could get a few talking points just to seed the conversation that's about to happen. My name is García. I am part of the SPDX community: I am a contributor to SPDX and some of the tools. I maintain a bunch of open-source tools that generate and consume SBOMs and that help visualize them. I am also part of the Kubernetes project. I am part of Kubernetes SIG Release, and I work there mostly on the supply chain security of the project.
I like riding my bike. I'm based in Mexico City. I'm a staff engineer with Chainguard, which is a company devoted to supply chain security.
As you heard from probably every speaker today, the goal is getting a document which you can actually use for something. There are many concerns about SBOMs flying around in the world today, because there are particular use cases, and some people will argue that SBOMs are not necessarily incomplete, they are just done for one purpose or the other. This is true, but instead of picturing ourselves generating an SBOM from the position of a large company or whatever, I felt that it was more appropriate to discuss something else today.
I'm assuming a lot of people here are maintainers of open-source projects, sometimes very small projects, like one-maintainer small. I think it's important to start considering that when those large companies use your project, your library, import that module or whatever, the SBOM that you give them can really make a difference in several areas. First, you can make their life easier, because you're handing them more complete information which they can act on. The other one is that we, as the open-source community, become better citizens of the supply chain: generating the information that pertains to us is a much more responsible thing to do.
So what happens when you open an SBOM? Well, today you can get all sorts of surprises.
Sometimes there's nothing in there: you open the SBOM and it's empty. Sometimes you have absolutely no information that lets you determine what the SBOM is describing, so it's simply pointing to the same black box that you can look at from the outside. The other is: are you sure that the SBOM is really describing what you expect it to, and that you are not getting conned by someone? That information needs to be in the SBOM in order to ensure that it's actually describing the piece of software that you're distributing.
So I'm going to give you a few examples. I'm not trying to name names, and that's why I chose projects that I'm involved with, both good and bad.
This is the first one. Our company has a Linux distribution which is already shipping with SBOMs built in; we generate those SBOMs at build time for all of the packages. You can see here the structure of one of the SBOMs. This is a visualization of the SBOM using the Kubernetes bom tool, which lets you ingest SPDX documents and see how they're structured inside.
As you can see, in the Linux distro we try to add as much detail to the SBOM as we can, to guide whoever is using those SBOMs to make smart decisions with the information they have in them. This is a fragment of the SBOM. Some information is there, some is not. For example, the license concluded fields are marked as NOASSERTION (you can omit those if you want), but we have the license from the project, from the actual operating system package, we have some identifiers, things like that. So it's pretty complete. It's obviously not perfect, but we try to add as much information as we can.
But then let me show you another SBOM from another popular open source project. This is part of the Kubernetes SBOM: a little fragment of the structure of the SBOM that we generate when we put out a new Kubernetes release. It describes, for example, the tarballs which we put out with every release: one of the tarballs of the kube-apiserver, the list of files. So we also try to add information there. We put out two SBOMs with Kubernetes, one with the artifacts and one with the source code, which are linked to each other. So we think those are fairly complete SBOMs as well.
But now, I opened an SBOM from a popular open source project and tried to generate the structure like this. I'm not going to say which project this is; it's just one I'm involved with, and we should be doing a better job there. You can guess many reasons why this is showing zero things, but we can go over this.
So, as you can see, you can really enrich an SBOM with a lot of information, and some of it can be more important than other things. But I've been thinking: what are the most important details that you can add to the SBOM? By the way, most of this you already heard throughout the day if you've been sitting in most of the talks. So we're going to go one by one.
The first one is syntactic correctness. You would expect that most tools generating SPDX or CycloneDX SBOMs do the basic job of just making a compliant document. Well, the reality is that they don't. I picture the guy from Apollo 13 who tries to fit the square peg in the round hole, or the other way around. Because if you cannot ingest an SBOM, what's the point, right? And even if you try to somehow hack at the document or ingest it somehow, the reality is that most tools that consume SBOMs today do not have a clear strategy for degrading when a document is broken: it is not clear and, most importantly, not predictable.
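One way for a consumer to stay predictable is to reject a malformed document up front instead of guessing. The following is only a minimal sketch for SPDX in JSON form, not something shown in the talk: the field list is an assumption about the top-level fields SPDX 2.x requires, `basic_syntax_problems` is a hypothetical helper name, and a real check would validate against the official schema.

```python
import json

# Top-level fields assumed to be required in an SPDX 2.x JSON document.
REQUIRED_FIELDS = ("spdxVersion", "dataLicense", "SPDXID", "name",
                   "documentNamespace", "creationInfo")


def basic_syntax_problems(text):
    """Return a list of problems; an empty list means the basic checks passed."""
    try:
        doc = json.loads(text)
    except json.JSONDecodeError as err:
        # Fail loudly rather than trying to repair the document.
        return [f"not valid JSON: {err.msg}"]
    if not isinstance(doc, dict):
        return ["top-level value is not a JSON object"]
    return [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in doc]
```

A producer could run the same check before publishing, so that a broken document never leaves the build.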
If a tool tries to somehow ignore errors or whatever, the behavior may not be consistent. So ensure that any SBOM that you're producing or requesting at least complies with the syntactic rules of the standard you're using. The second one is dependency data, and this is a little bit related to the first one.
Since I work with a lot of open source tools and my job also has to do with SBOMs, I've seen a lot of tools producing SBOMs. One variant of the bad SBOM is a tool that will just list, like, a tarball, and that's your SBOM, nothing else. Or the obvious case of an SBOM that contains one thing, an RPM: no dependencies, nothing. We often use the analogy of the SBOM being the nutritional label of software, but without the dependency list, well, it's really worthless.
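The "one tarball, nothing else" case described above can be detected mechanically. This is a hedged sketch, assuming an SPDX 2.x JSON document already loaded into a Python dict; the set of relationship types treated as dependency-like is a simplifying assumption, and both function names are hypothetical.

```python
# Relationship types treated here as "dependency-like". This is an assumption
# for illustration; the SPDX specification defines many more types.
DEPENDENCY_TYPES = {"DEPENDS_ON", "DEPENDENCY_OF", "CONTAINS"}


def dependency_summary(doc):
    """Count packages and dependency-like relationships in an SPDX-style dict."""
    packages = doc.get("packages", [])
    edges = [rel for rel in doc.get("relationships", [])
             if rel.get("relationshipType") in DEPENDENCY_TYPES]
    return {"packages": len(packages), "dependency_edges": len(edges)}


def looks_like_bare_artifact(doc):
    """True when the SBOM names at most one package and no dependencies."""
    summary = dependency_summary(doc)
    return summary["packages"] <= 1 and summary["dependency_edges"] == 0


# An SBOM that lists a single RPM and nothing else is flagged.
rpm_only = {"packages": [{"name": "example.rpm", "SPDXID": "SPDXRef-rpm"}]}
print(looks_like_bare_artifact(rpm_only))
```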
You can still use your SBOM like the old checksums.txt if you want, but it does not provide a lot more value than that.
Then the third one: licensing information. We've heard a ton of talks today about licensing and why it may be important. The truth is, if you are publishing software, you are the most qualified person to do the assessment of what license your software should be using. This applies to the dependencies that you're pulling in as well: if you are redistributing that information, ensure that the information about the licensing is going down the stream, because the tools that we've been seeing today try to do a good job of helping people understand their licensing situation. So I picture it as checking the passport.
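As a rough illustration of checking that licensing information actually travels with the document, the sketch below lists packages that carry no usable license statement. It assumes SPDX 2.x JSON field names (`licenseDeclared`, `licenseConcluded`) and treats `NOASSERTION` as "no statement"; the helper name is hypothetical.

```python
# Values treated as "no license statement was made".
UNASSERTED = (None, "", "NOASSERTION")


def packages_missing_license(doc):
    """Return names of packages with neither a declared nor a concluded license."""
    missing = []
    for pkg in doc.get("packages", []):
        declared = pkg.get("licenseDeclared")
        concluded = pkg.get("licenseConcluded")
        if declared in UNASSERTED and concluded in UNASSERTED:
            missing.append(pkg.get("name", "<unnamed>"))
    return missing
```

A package whose concluded license is `NOASSERTION` but whose declared license is filled in, as in the distribution SBOM shown earlier, is not reported.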
The passport is my example for the license. The next one is semantic structure in the SBOM, and this one also came up during the discussion today. There are folks who think that SBOMs can be just a list of dependencies, and it may be true, but then you start losing context on where those things fit, especially if they're not related to an artifact at the top of the SBOM. The SBOM can be this beautiful graph of one node that spreads out to lots of dependency relationships and nodes. But sometimes you'll see SBOMs that only have the list of dependencies and don't talk about where those dependencies fit, or whether they're describing a container image, a binary, nothing.
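The "one node that spreads out" picture suggests a simple structural test: every package should be reachable from the element the document says it describes. Below is a minimal sketch of that idea, assuming SPDX 2.x JSON field names and following only `CONTAINS` and `DEPENDS_ON` edges in the forward direction (reverse types such as `DEPENDENCY_OF` are ignored to keep it short). The function name is hypothetical.

```python
from collections import defaultdict, deque

# Only these relationship types are followed, as a simplifying assumption.
FORWARD_TYPES = {"CONTAINS", "DEPENDS_ON"}


def unreachable_packages(doc):
    """Return SPDX IDs of packages not reachable from the described elements."""
    edges = defaultdict(set)
    roots = set(doc.get("documentDescribes", []))
    for rel in doc.get("relationships", []):
        kind = rel.get("relationshipType")
        if kind == "DESCRIBES":
            roots.add(rel.get("relatedSpdxElement"))
        elif kind in FORWARD_TYPES:
            edges[rel.get("spdxElementId")].add(rel.get("relatedSpdxElement"))

    # Breadth-first walk from the top-level elements.
    seen, queue = set(roots), deque(roots)
    while queue:
        for nxt in edges[queue.popleft()]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)

    return sorted(pkg["SPDXID"] for pkg in doc.get("packages", [])
                  if pkg["SPDXID"] not in seen)
```

For a flat list of dependencies with no relationships at all, every package comes back as unreachable, which is exactly the loss of context described above.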
So if you try to do something more sophisticated with the data, you simply can't. If you remember the SBOM that I showed in the beginning, the one we build with the Linux distribution, this is how we structure the container images built from our Linux distribution. You have the container, the layers, and the packages, like the OS packages, and then all of the files, each in its proper place. This information actually comes from smaller SBOMs that get composed when we build the Linux distribution: each of the APKs of the distro has its own SBOM describing that package, and when we build an image we take all of those SBOMs and give you one single SBOM with all of that information composed where it's supposed to be. Without structure you simply cannot do this.
And this is one image; if you make it more complex you can start thinking about multi-arch images, and those need to have this information for each of the images, so the relationships start to become more and more complex.
The way I try to picture it is this: they give you a box of Legos without any instructions or anything. You can use your imagination; probably you're going to build something really beautiful, but most likely not, and especially not the thing pictured on the box, right? So these are some of the reasons I was thinking about: if you have structure, it's a guarantee that the tool at least is looking at how the thing is composed and where the information is flowing from, and it lets you do more complex use cases with the documents.
Now the next one, which has also come up two or three times today: software identifiers.
SBOMs need to define and name the piece of software as closely as possible, and software identifiers are one of the schemes that you need in the document in order to ensure that the piece of software the SBOM is describing is clearly identified. All of them have their problems; CPE especially is really complex to get right. But the idea is that there's going to be a tool down the stream that benefits from that information, so if you can add it, you're making sure that the SBOM can work well with those tools.
And this is kind of the idea: how many packages in the world are named "love", right? So, okay, love, but what's love? There are thousands in every language: operating system packages, libraries named love. So if you have a properly specified purl, CPE, or both, that clearly defines the piece of software the SBOM is talking about, then it can be better referenced and used by tools down the stream.
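To make the "what's love?" problem concrete, a package URL carries the ecosystem, namespace, and version alongside the name. The parser below is a deliberately simplified sketch of the `pkg:type/namespace/name@version?qualifiers#subpath` layout: it does no percent-decoding and skips most of the rules in the purl specification, `parse_purl` is a hypothetical helper, and the two "love" purls are made-up illustrations rather than references to real packages. A maintained purl library would be the better choice in practice.

```python
def parse_purl(purl):
    """Split a package URL into its main parts (simplified, no percent-decoding)."""
    scheme, _, rest = purl.partition(":")
    if scheme != "pkg" or not rest:
        raise ValueError(f"not a package URL: {purl!r}")
    rest, _, subpath = rest.partition("#")
    rest, _, qualifiers = rest.partition("?")
    if "@" in rest:
        # The version follows the last "@" in the remaining string.
        path, _, version = rest.rpartition("@")
    else:
        path, version = rest, ""
    parts = path.strip("/").split("/")
    if len(parts) < 2:
        raise ValueError(f"package URL needs a type and a name: {purl!r}")
    return {
        "type": parts[0].lower(),
        "namespace": "/".join(parts[1:-1]) or None,
        "name": parts[-1],
        "version": version or None,
        "qualifiers": dict(q.split("=", 1) for q in qualifiers.split("&") if "=" in q),
        "subpath": subpath or None,
    }


# Same name, different ecosystems: the type tells them apart.
a = parse_purl("pkg:npm/love@1.0.0")
b = parse_purl("pkg:pypi/love@1.0.0")
print(a["name"] == b["name"], a["type"] == b["type"])
```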
Now, the supplier data. This is a contentious one. The reason why I added the supplier data is that, as software authors, sometimes we don't think it's an important field. In most large open source projects, we just list something like "copyright the project authors", et cetera. But the reality is that if you jump into any of the SBOM meetings that go on regularly, you will hear all of the compliance folks say, like, "I need a name to sue." I don't know, it's a different mentality than ours, but people need it. In fact, it's one of the requirements from NTIA in the minimum elements of an SBOM.
This is a weird field, because if you deal more with the security of the documents that should be generated during the supply chain and the software life cycle, this information is not really very useful: it can be forged and you cannot trust it. So just having a name and an email, well, it serves compliance folks, but to us it's kind of worthless for security purposes, right?
But then you start thinking: what's a supplier? Is it the author, the copyright holder? Is it the tool that compiled the thing, the people who are distributing it? So, well, at least ensure that you're providing some kind of information. The idea is: know who's selling you your things, right?
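For readers who want to see what "some kind of information" could look like in practice, here is a small sketch that reads a supplier value in the `Person: Name (email)` / `Organization: Name (email)` style that SPDX 2.x tag values use. The exact format handling is an assumption, the helper name is hypothetical, and, as noted above, a parsed name proves nothing about who really supplied the software.

```python
import re

# "Person: Jane Doe (jane@example.org)" or "Organization: Example Corp".
SUPPLIER_RE = re.compile(r"^(Person|Organization):\s*(.+?)\s*(?:\(([^()]*)\))?$")


def parse_supplier(value):
    """Return kind, name, and email for a supplier string, or None if absent."""
    if value is None or value == "NOASSERTION":
        return None
    match = SUPPLIER_RE.match(value)
    if match is None:
        return None
    kind, name, email = match.groups()
    return {"kind": kind, "name": name, "email": email or None}
```

Returning `None` for both a missing value and an unparseable one keeps the caller's check simple: either there is a usable supplier entry or there is not.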
Buy candy from that guy? Probably not. Well, yeah, exactly: come get them from us. Supplier, yeah.
Okay, I messed up this slide. This one was supposed to be integrity data: verify the data to prevent this kind of thing. As you heard today also, SBOMs should be properly hashed: hash as much as you can inside of a document, when possible, when it makes sense, and especially when it can be verified.
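Verification is the part that makes a recorded hash useful. The sketch below recomputes a file's SHA-256 with the standard library and compares it with the checksums recorded for a package. It assumes the SPDX 2.x JSON layout for `checksums` (`algorithm` and `checksumValue`); both function names are hypothetical.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sbom(path, package):
    """True if the file's SHA-256 equals a SHA256 checksum recorded for the package."""
    recorded = {entry.get("checksumValue", "").lower()
                for entry in package.get("checksums", [])
                if entry.get("algorithm") == "SHA256"}
    return sha256_of(path) in recorded
```

A package entry with no checksums at all simply fails the check, which is the conservative outcome when there is nothing to verify against.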
So the idea is: is this piece of software that I'm naming in the SBOM the real deal? Has it been corrupted or not? But more importantly, having hashes lets you deal with the problem of "latest", right? Sometimes you will not have a version, but you can still reference that software artifact inside of the SBOM, and in other documents like VEX for example, via the hashes. So you can think about the versioning system and the software identifiers as links to external systems outside of the SBOM, like vulnerability databases or package repositories, but internally everything should be addressed via the hash if possible. If I'm telling you this is the vulnerability document for a piece of software, it should match with the hashes somehow. In the end, once you start content-addressing the piece of software in the SBOM, you cannot go wrong.
Well, that's basically what I have. I just wanted to leave this open to kick off the conversations that are about to happen about this kind of thing inside of the documents. If there are any questions or whatever, I'm happy to take them, and if not, you can reach me as puerco on most systems, Twitter, whatever. So, thank you.
Well, I would like to hear opinions on the supplier data [unclear]. Yeah.
So the question is basically: what's the role of the supplier data, what's the use of that field? And should the field hold a person, an entity, or a tool? I was going to get to that question before, but if anybody has insights about how supplier data is used in their organizations, this is the time to discuss it.
All right. So, the first one is how the supplier is used. The way I've seen it required is mostly from procurement people asking for that information, and lawyers. Those are the two that I've seen asking for the information the most. I'm coming more from the security side of SBOMs, so compliance is not my strong side, which is why I'm just saying it.
Yeah, as one data point: the way that we are using supplier data is actually recording who supplied it. So not who wrote it, not who created it. If we got it from an upstream distribution repo, we put the [unclear] record, but we know that.
And the other question is whether the integrity point also considers signing of the SBOM. Yes, but not in this case. Signing of the SBOM is mostly done outside of the SBOM, and that touches on trusting the SBOM, which is a whole other can of worms. So it is part of integrity, but not in the contents of the documents.
Is there something like benchmarks, where I give it a score of 8.0 out of 10: that's a good SBOM, that's a bad SBOM?
Well, there are tools. Yeah, I'll repeat the question, sorry, I didn't get a lot of sleep: how can I make sure that the SBOM really complies with these things here? There are a couple of tools that do validation of the SBOM, or try to do scoring. eBay has a tool called SBOM Scorecard, then there's the NTIA compliance checker from SPDX. I'm not sure, are the [unclear] folks here still? Okay, I seem to remember that they were handling some of that as well. But there are a couple of tools out there.
It's more like a remark, but I'm a bit surprised we didn't mention OpenChain that much. The goal of OpenChain is to build trust with the suppliers, so you can trust the SBOMs from the supplier more than anything.
So yeah, what Nico said is that OpenChain touches on the idea of trusting the SBOM and the supplier [unclear].
An observation, having looked at Python: the metadata that goes with Python packaging is really inconsistent. How do you spell Apache 2? The number of different ways of writing the Apache 2 license is amazing. And between releases, information changes or disappears. So this is really a message for the ones in the ecosystems: put as much data in the ecosystem as you can, because it's going to support [unclear].
Right, exactly. The comment is that in Python, sometimes between releases information changes or disappears or whatever. This is actually one of the things that some of us would like to see happening: people working on packaging systems and language ecosystems, if not having SBOM generation straight away in their tooling, should at least expose the information so that we SBOM tool makers can go in and extract it from more trustworthy sources.
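The "how do you spell Apache 2" problem from the comment above is one an SBOM tool has to absorb when package metadata is free text. A minimal sketch of that normalization follows; the alias table is a tiny illustrative assumption, not a complete mapping, and `normalize_license` is a hypothetical helper. Only `Apache-2.0` is taken from the SPDX license list.

```python
import re

# Illustrative alias table: normalized spelling -> SPDX license identifier.
ALIASES = {
    "apache 2": "Apache-2.0",
    "apache 2.0": "Apache-2.0",
    "apache2": "Apache-2.0",
}

# Filler words dropped before the lookup.
NOISE = re.compile(r"\b(?:the|license|licence|software|version)\b")


def normalize_license(raw):
    """Map a free-text license string to an SPDX identifier, or None if unknown."""
    key = re.sub(r"[^a-z0-9.]+", " ", raw.lower())
    key = " ".join(NOISE.sub(" ", key).split())
    return ALIASES.get(key)


for spelling in ("Apache License, Version 2.0", "Apache-2.0", "apache2", "Apache Software License"):
    print(spelling, "->", normalize_license(spelling))
```

Returning `None` for a spelling that does not name a version is intentional in this sketch: guessing would put an unverified claim into the SBOM.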
Okay. Can you repeat it?
In the case of the distro, or...? Well, yeah, the question is how you deal with patched software, when you apply a patch. But I mean, you still have that hash, right? Or is the question about naming? If you're describing a patched artifact, simply hash the thing and you can use that down the stream. The problem comes when you're trying to say: well, I'm using curl, but I applied a few custom patches myself. How do you name that? That becomes a more complex question.
Internally, as I was saying with the integrity point, you can still reference everything with the hashes: I'm talking about this binary, this hash, all the way down the stream. But when you want to express it externally, I guess that falls into the naming problem, and you have to think about where that thing is going to be used. If it is going to be a package that is part of a distribution that you're making, you may define your own set of package URLs, for example. If it's not, you can make up the [unclear], but it falls more into the use case of what you're trying to do when distributing that patched software.
So that's it. All right, thank you.