The 7 key ingredients of a great SBOM
Formal metadata
Number of parts | 542
License | CC Attribution 2.0 Belgium: You may use and modify the work or its content for any legal purpose, and reproduce, distribute, and make it publicly available in unchanged or modified form, provided that you credit the author/rights holder in the manner they specify.
Identifiers | 10.5446/61944 (DOI)
Transcript: English (automatically generated)
00:05
Yeah, all right. So first of all, thank you for staying. We have been going through all of those cool use cases and complex, really complete
00:30
tools to generate SBOMs, and I was thinking that I
00:44
wanted to do, like, my kind of people as we are. And so, as you have been hearing from folks
01:01
right now, people working on SBOMs are starting to get concerned about what's actually in those documents. And I think when Thomas opened the devroom today, the first thing he said was: well, those dependencies that you're getting, they may not be correct, right?
01:22
So I thought that, as we move to the last part of the conference, it would be cool if we could get a few talking points just to seed the conversation that's about to happen. So, my name is Adolfo García, and I am, well, part of the SPDX community. I am a contributor to
01:41
SPDX and some of the tools. I maintain a bunch of open-source tools that generate and consume SBOMs and that help visualize them. I am also part of the Kubernetes project: I am part of Kubernetes SIG Release, and I work there mostly on the supply chain security of the project. And,
02:01
yeah, I like riding my bike. I'm based in Mexico City. I'm a staff engineer with Chainguard, which is a company devoted to supply chain security. And so, as you heard from probably every speaker today, the goal of having an SBOM is
02:21
getting a document which you can actually use for something. And there are many concerns about SBOMs flying around in the world today, because there are particular use cases, and some people will argue that SBOMs may be
02:40
incomplete, or fall short for one use case or the other, and this is true. But instead of trying to picture ourselves generating an SBOM from the position of a large company or whatever, I felt that it was more appropriate to discuss today how... I mean, I'm assuming a lot of people here are
03:03
maintainers of open-source projects, and sometimes very small projects, like one-maintainer small. And I think it's important to start considering that, when those large companies are going to use your project, your library, import that module
03:23
that you wrote, the SBOM that you give them can really make a difference in several areas. First, you can make their life easier, because you're handing them more complete information, which they can act on. And the other one is that we, as the open-source community,
03:42
become better citizens of the supply chain. Generating the information that pertains to us is a much more responsible thing to do. So, what happens when you open an SBOM? Well, today you can get all sorts of surprises.
04:01
Sometimes there's nothing in there: you open the SBOM and it's empty. Sometimes you don't have absolutely any information that lets you determine what that SBOM is describing, so it's simply pointing to the same black box that you can look at from the outside. Or the other is:
04:21
are you sure that the SBOM is really describing what you expect it to, and that you are not getting conned by someone? Well, that information needs to be in the SBOM in order to ensure that it's actually describing the piece of software that you're distributing.
04:47
So I'm going to give you a few examples. I'm not trying to name names, and that's why I chose projects that I'm involved with, both good and bad. So,
05:02
this is the first one. Our company has a Linux distribution which is already shipping with SBOMs built in, and we generate those SBOMs at build time for all of the packages. You can see the structure here of one of the SBOMs. This is a visualization of the SBOM using the Kubernetes bom tool, which lets you ingest SPDX documents and see how they're
05:28
structured inside. As you can see, in the Linux distro we try to add a lot of detail to the SBOM, as much as we can, to guide whoever is using those SBOMs
05:41
to make smart decisions with the information they have in them. This is a fragment of the SBOM, and, I mean, some information is there and some is not. For example, the licenses: the license concluded fields are marked as NOASSERTION.
06:03
You can omit those if you want. But we have the license from the project, from the actual operating system package, we have some identifiers, things like that. So it's pretty complete. It's obviously not perfect, but we try to add as much information as we can.
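To make the NOASSERTION point concrete, here is a minimal Python sketch that lists which license fields of each package carry no real information. It assumes the SBOM is an SPDX 2.x JSON document already loaded into a dict; the function name and the two sample packages are made up for illustration and do not come from the distribution shown in the talk.

```python
def license_gaps(doc: dict) -> dict[str, list[str]]:
    """Map package name -> license fields that are missing or NOASSERTION."""
    fields = ("licenseConcluded", "licenseDeclared")
    gaps = {}
    for pkg in doc.get("packages", []):
        # A missing field is treated the same as an explicit NOASSERTION.
        empty = [f for f in fields if pkg.get(f, "NOASSERTION") == "NOASSERTION"]
        if empty:
            gaps[pkg.get("name", "<unnamed>")] = empty
    return gaps


# Hypothetical sample data, not taken from a real SBOM.
sample_doc = {
    "packages": [
        {"name": "busybox", "licenseConcluded": "NOASSERTION",
         "licenseDeclared": "GPL-2.0-only"},
        {"name": "mystery"},
    ]
}
print(license_gaps(sample_doc))
```

A report like this does not say the SBOM is wrong; it only shows where a consumer would have to go looking for licensing information elsewhere.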
06:22
But then, let me show you another SBOM from another popular open source project. This is part of the Kubernetes SBOM, a little fragment of the structure of the SBOM
06:41
that we generate when we put out a new Kubernetes release. This is describing, for example, the tarballs which we put out with every release: one of the tarballs of the kube-apiserver, the list of files. So we also try to add information. We put out two SBOMs with Kubernetes, one with the artifacts and one with the source code, which are linked to each other. And
07:08
so we also think that those are fairly complete SBOMs. But now, I opened an SBOM in a popular open source project and tried to generate the structure like this. I'm not going to say which project this is;
07:25
it's just one I'm involved with, and we should be doing a better job there. You can guess many reasons why this is showing zero things, but we can go over this.
07:40
So, as you can see, you can really enrich an SBOM with a lot of information, and some of it can be more important than other things. But I've been thinking: well, what are the most important details that you can add to the SBOM? So the first one is... and by the way, most of this
08:00
you already heard throughout the day if you've been sitting in for most of the conference. So we're going to go one by one. The first one is syntactic correctness. You would expect that most tools generating SPDX or CycloneDX SBOMs do the basic job of just making a compliant document. Well,
08:20
the reality is that they're not. I picture this guy from Apollo 13 that tries to fit the square peg in the round hole, or the other way around. Because if you cannot ingest an SBOM, what's the point, right? And even if you
08:40
try to somehow hack at the document or ingest it somehow, the reality is that most tools that consume SBOMs today do not have a clear strategy for degrading on such documents. It's not clear and, most importantly, also not predictable. So
09:00
if a tool tries to somehow ignore errors or whatever, the behavior may not be consistent. So ensure that any SBOM that you're producing or requesting at least complies with the syntactic rules of the standard you're using. The second one is dependency data, and this is a little bit related to the first one.
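Going back to syntactic correctness for a moment, the cheapest possible check can be sketched in a few lines of Python. The sketch assumes an SPDX 2.x document in JSON form and only looks for the required top-level fields; the function name is hypothetical, and a real validator from the SPDX or CycloneDX tooling checks far more than this.

```python
import json

# Top-level fields that an SPDX 2.x JSON document is expected to carry.
REQUIRED_FIELDS = ("spdxVersion", "dataLicense", "SPDXID", "name",
                   "documentNamespace", "creationInfo")


def missing_document_fields(raw: str) -> list[str]:
    """Return the required top-level fields absent from an SPDX JSON string.

    Raises json.JSONDecodeError if the text is not JSON at all.
    """
    doc = json.loads(raw)
    if not isinstance(doc, dict):
        # Valid JSON, but not an object: every required field is missing.
        return list(REQUIRED_FIELDS)
    return [field for field in REQUIRED_FIELDS if field not in doc]


# Hypothetical, deliberately incomplete document.
sample = json.dumps({"spdxVersion": "SPDX-2.3", "SPDXID": "SPDXRef-DOCUMENT",
                     "name": "example"})
print(missing_document_fields(sample))
```

Failing loudly on a document like this is more predictable than silently ingesting part of it, which is the inconsistency the talk warns about.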
09:23
Since I work with a lot of open source tools and my job also has to do with SBOMs, I've seen a lot of tools producing SBOMs. So, for example, one variant of the bad SBOM is one that will just list, like, a tarball, and
09:41
that's your SBOM, nothing else. Or the obvious case of: this SBOM contains one thing, an RPM. No dependencies, nothing. We often use the analogy of the SBOM being the nutritional label of software, but without the dependency list, well, it's really worthless.
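The "one RPM, no dependencies" case is easy to detect mechanically. The following Python sketch, assuming SPDX 2.x JSON loaded into a dict, collects the dependency edges expressed through the DEPENDS_ON and DEPENDENCY_OF relationship types; the function name and the sample identifiers are invented for illustration.

```python
def dependency_edges(doc: dict) -> list[tuple[str, str]]:
    """Return (dependent, dependency) pairs found in an SPDX-style dict."""
    edges = []
    for rel in doc.get("relationships", []):
        kind = rel.get("relationshipType")
        src = rel.get("spdxElementId")
        dst = rel.get("relatedSpdxElement")
        if kind == "DEPENDS_ON":
            edges.append((src, dst))
        elif kind == "DEPENDENCY_OF":
            # "A DEPENDENCY_OF B" means that B depends on A.
            edges.append((dst, src))
    return edges


# Hypothetical documents: one with dependency data, one without.
app_sbom = {"relationships": [
    {"spdxElementId": "SPDXRef-app", "relationshipType": "DEPENDS_ON",
     "relatedSpdxElement": "SPDXRef-libfoo"},
    {"spdxElementId": "SPDXRef-libbar", "relationshipType": "DEPENDENCY_OF",
     "relatedSpdxElement": "SPDXRef-app"},
]}
lonely_rpm = {"packages": [{"SPDXID": "SPDXRef-rpm", "name": "example.rpm"}]}

print(dependency_edges(app_sbom))
print(dependency_edges(lonely_rpm))  # an empty list is the warning sign
```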
10:05
You can still use your SBOM as the old checksums.txt if you want, but it's not going to provide a lot more value than that. Then the third one: licensing information. We've heard a ton of
10:21
talks today about licensing and why it may be important. The truth is, if you are publishing software, you are the most qualified person to do the assessment of what license your software should be using. And this applies both to the
10:40
dependencies that you're pulling in and, if you are redistributing that information, ensure that the information about the licensing is going down the stream, because the tools that we've been seeing today try to do a good job of helping people understand their licensing situation. So I picture checking the passport
11:03
as an example of the license. The next one is semantic structure in the SBOM. This one also came up during the discussion today. There are folks that think that SBOMs can be just a list of dependencies, and it may be true,
11:25
but then you start losing context on where those things fit. For example, if you have just a list of dependencies, and especially if they're not related to an artifact at the top of the SBOM... The SBOM can be like this beautiful graph of one node that spreads out to
11:46
lots of relationships and nodes. So sometimes you'll see SBOMs that only have the list of dependencies, and they don't talk about where those dependencies fit, whether they're describing a container image, a binary, nothing.
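One way to test whether an SBOM is "a graph spreading out from one node" rather than a flat list is to walk the relationships from the document element and see which packages are never reached. This is a Python sketch under simplifying assumptions: SPDX 2.x JSON in a dict, every relationship treated as a forward edge regardless of its type, and invented identifiers in the sample.

```python
from collections import deque


def unreachable_packages(doc: dict) -> set[str]:
    """SPDX IDs of packages not reachable from SPDXRef-DOCUMENT."""
    graph: dict[str, set[str]] = {}
    for rel in doc.get("relationships", []):
        # Simplification: ignore the relationship type, follow every edge.
        graph.setdefault(rel["spdxElementId"], set()).add(rel["relatedSpdxElement"])

    seen: set[str] = set()
    queue = deque(["SPDXRef-DOCUMENT"])
    while queue:  # plain breadth-first traversal
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        queue.extend(graph.get(node, ()))

    return {pkg["SPDXID"] for pkg in doc.get("packages", [])} - seen


# Hypothetical image SBOM: image -> layer -> OS package, plus one stray entry.
image_sbom = {
    "packages": [{"SPDXID": "SPDXRef-image"}, {"SPDXID": "SPDXRef-layer"},
                 {"SPDXID": "SPDXRef-apk"}, {"SPDXID": "SPDXRef-stray"}],
    "relationships": [
        {"spdxElementId": "SPDXRef-DOCUMENT", "relationshipType": "DESCRIBES",
         "relatedSpdxElement": "SPDXRef-image"},
        {"spdxElementId": "SPDXRef-image", "relationshipType": "CONTAINS",
         "relatedSpdxElement": "SPDXRef-layer"},
        {"spdxElementId": "SPDXRef-layer", "relationshipType": "CONTAINS",
         "relatedSpdxElement": "SPDXRef-apk"},
    ],
}
print(unreachable_packages(image_sbom))
```

In a flat list of dependencies with no relationships at all, every package would show up as unreachable, which is exactly the loss of context described above.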
12:01
So if you try to do something more sophisticated with the data, you simply can't. If you remember the SBOM that I showed in the beginning, that we build with the Linux distribution, this is how we structure the container images built from our Linux distribution.
12:22
So you have the container, the layers, and the packages, like the OS packages, and then all of the files in their proper place. And this information is actually coming from smaller SBOMs that get compiled when we build the Linux distribution. So each of the APKs of the distro
12:43
has its own SBOM describing that package, and then when we build an image we take all of those SBOMs and give you one single SBOM with all of that information composed where it's supposed to be. Without structure you simply cannot do this. And this is one image, but then if you go and make it more complex you can start thinking about
13:04
multi-arch images, right? And those need to have this information for each of the images, so the relationships start to become more and more complex. And the way I try to picture it is this, right? They give you a box of Legos without any instructions or anything. You can use your imagination.
13:26
Probably you're going to build something really beautiful. Most likely not, and especially not the thing pictured on the box, right? And so these are some of the reasons that I was thinking: if you have structure, then
13:42
it's a guarantee, at least, that the tool is looking at how the thing is composed and where the information is flowing from, and it lets you do more complex use cases with the documents. Now the next one, and this also has come up two or three times today: the software identifiers.
14:06
SBOMs need to define and name the piece of software as closely as possible, and software identifiers are one of the schemes that you need
14:20
in the document in order to ensure that the piece of software that the SBOM is describing is clearly identified. All of them have their problems. CPE especially, for example, is really complex to get right. But the idea is, there's going to be a
14:42
tool down the stream that's going to benefit from that information. So if you can add it, you're making sure that the SBOM can work well with those tools. And this is kind of the idea of that. So how many packages in the world are named love, right?
15:01
So, okay, love, but what's love? There are thousands in every language: operating system packages, libraries named love. So if you can have a properly specified purl, CPE, or both, that clearly defines the piece of software that the SBOM is talking about, then it can be better referenced and used by tools down the stream.
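The "love" problem is what the type and namespace parts of a package URL are for. Below is a deliberately small Python sketch of splitting a purl into its main components. It ignores percent-encoding, qualifiers, and subpaths, so it is not a conforming parser; the function name and the sample purls and versions are hypothetical, and a dedicated purl library would be the better choice in practice.

```python
def parse_purl(purl: str) -> dict[str, str]:
    """Split 'pkg:type/namespace/name@version' into its main components."""
    scheme, sep, rest = purl.partition(":")
    if scheme != "pkg" or not sep:
        raise ValueError(f"not a package URL: {purl!r}")

    rest = rest.partition("#")[0]  # drop the subpath, if any
    rest = rest.partition("?")[0]  # drop the qualifiers, if any
    if "@" in rest:
        path, _, version = rest.rpartition("@")
    else:
        path, version = rest, ""

    parts = path.strip("/").split("/")
    if len(parts) < 2 or not all(parts):
        raise ValueError(f"purl needs at least a type and a name: {purl!r}")
    return {
        "type": parts[0].lower(),
        "namespace": "/".join(parts[1:-1]),
        "name": parts[-1],
        "version": version,
    }


# Three different things that are all "named love" (made-up examples).
for candidate in ("pkg:pypi/love@0.1.0", "pkg:npm/love",
                  "pkg:golang/github.com/example/love@v1.2.3"):
    print(parse_purl(candidate))
```

The name is identical in all three results; only the type and namespace tell a downstream tool which ecosystem to query.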
15:28
Now, the supplier data. This is a contentious one. The reason why I added the supplier data is because, as software authors,
15:41
sometimes we don't think that it's an important field. I mean, in most large open source projects, we just list something like "copyright the project authors", right? But the reality is that if you jump into any of the SBOM meetings that go
16:00
on regularly, you're going to hear all of the compliance folks go, like, "I need a name to sue." I don't know, it's a different mentality than ours, but people need it. And in fact, it's one of the requirements from NTIA, among the minimum elements of an SBOM. And
16:22
this is a weird field, because if you deal more with the security of the documents that should be generated along the supply chain and the software life cycle, this information is kind of, I don't know,
16:42
not really very useful, because it can be forged and you cannot trust it. So just having a name and an email, well, it serves compliance folks, but to us it's kind of worthless, really, for security purposes, right? But then
17:02
you start thinking about: well, what's a supplier? Is it the author? The copyright holder? Is it the tool that compiled the thing? The people who are distributing it? So, well, at least ensure that you're providing some kind of information. And the idea is: know who's selling you your things, right?
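For reference, SPDX 2.x writes the supplier as a single string such as "Organization: Name (email)" or "Person: Name (email)", or NOASSERTION when it is unknown. The following Python sketch parses that shape; the function name and sample values are hypothetical, and, as noted above, a well-formed supplier string says nothing about whether the claim can be trusted.

```python
import re

# "Person: Jane Doe (jane@example.com)" or "Organization: Example Corp"
SUPPLIER_RE = re.compile(
    r"^(?P<kind>Person|Organization): (?P<name>[^()]+?)(?: \((?P<email>[^()]*)\))?$"
)


def parse_supplier(value):
    """Return kind/name/email for an SPDX-style supplier string, or None if unknown."""
    if value is None or value == "NOASSERTION":
        return None
    match = SUPPLIER_RE.match(value)
    if match is None:
        raise ValueError(f"unrecognized supplier value: {value!r}")
    return match.groupdict()


# Hypothetical values for illustration.
print(parse_supplier("Organization: Example Corp (security@example.com)"))
print(parse_supplier("Person: Jane Doe"))
print(parse_supplier("NOASSERTION"))
```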
17:24
Buy candy from that guy? Probably not. Well, yeah, exactly, come get it from us. Supplier, yeah. Okay, I messed up this slide. So this one was supposed to be integrity data,
17:44
the data to prevent this kind of thing. As you heard today also, SBOMs should be properly hashed: hash as much as you can inside of a document, when possible, when it makes sense, and especially when it can be verified.
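"Can be verified" means a consumer can recompute the digest and compare it with what the SBOM records. A minimal Python sketch of that comparison follows, assuming an SPDX 2.x JSON package entry with a `checksums` list; the function name and the sample artifact bytes are made up.

```python
import hashlib


def sha256_matches(data: bytes, package: dict) -> bool:
    """True if data hashes to the SHA256 checksum recorded for the package."""
    recorded = {c["algorithm"]: c["checksumValue"].lower()
                for c in package.get("checksums", [])}
    expected = recorded.get("SHA256")
    if expected is None:
        return False  # nothing recorded, so nothing can be verified
    return hashlib.sha256(data).hexdigest() == expected


# Hypothetical artifact and the package entry that describes it.
artifact = b"pretend this is a release tarball"
package = {
    "name": "example",
    "checksums": [{"algorithm": "SHA256",
                   "checksumValue": hashlib.sha256(artifact).hexdigest()}],
}

print(sha256_matches(artifact, package))                # the real deal
print(sha256_matches(artifact + b" tampered", package)) # corrupted or swapped
```

Because the digest is derived from the content itself, the same value can serve as the internal reference to the artifact even when no version string is available.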
18:02
So the idea is: is this piece of software that I'm naming in the SBOM the real deal? Has it been corrupted or not? But more importantly, having hashes lets you deal with the problem of "latest", right? Sometimes you will not have a version,
18:22
but you can still reference that software artifact inside of the SBOM and other documents, like VEX for example, via the hashes. So you can think about the versioning system and the software identifiers as links to external
18:42
systems outside of the SBOM, like vulnerability databases or, for example, package repositories, but internally everything should be addressed via the hash if possible. So if I'm telling you this is the vulnerability
19:01
document for a piece of software, it should match with the hashes somehow. And the idea is that once you start content-addressing the piece of software in the SBOM, you cannot go wrong. And, well, that's basically what I have.
19:23
So I just wanted to leave this open, you know, to kick off the conversations that are about to happen about this kind of thing inside of the documents. If there are any questions or whatever, happy to take them, and if not, you can reach me as puerco on most systems, Twitter, whatever. So, thank you.
20:18
Well, I would like to hear opinions on the supplier data. Yeah.
20:24
So basically, what's the role of the supplier data? What's the use of that field? Yeah, so should that field be a person, or an entity, or a tool? Yep, or...
20:55
Not really. No, no. Yeah, I was going to go to that question before, but if anybody has
21:02
insights about how supplier data is used in their organizations, this is the time to discuss it. All right. Yeah. No, so the way I've seen it required is... No, no, this is
21:21
exactly the first one. So the first one is: how is the supplier used? And the way I've seen it is mostly procurement people asking for that information, and lawyers. So those are the two that I've seen asking for the information the most. I'm
21:41
coming more from the security side of SBOMs, so compliance is not my strong side, but that's why I'm just saying it. And yeah. Yeah, as one data point, the way that we are using supplier data is actually recording who supplied it,
22:07
so not who wrote it, not who created it. Right. If we got it from an upstream distribution repo, we put the
22:22
again, record, but we know that. And the other question is: does the integrity point also consider signing of the SBOM? Yes, but not in this case. Signing of the SBOM is mostly done outside of the SBOM. So
22:46
that touches on trusting the SBOM, which is a whole other can of worms. I mean, it is related, but not in the contents of the documents. Is there something like benchmarks, where I give it a score of 8.0 out of 10: that's a good SBOM, that's a bad SBOM?
23:14
Well, there are, yeah, there are tools and... yeah, yeah, I'll repeat the question. So, how can I know... sorry, I didn't get a lot of sleep.
23:23
How can I make sure that the SBOM really complies with these things here? So there are a couple of tools that do validation of the SBOM, like scoring. So eBay has a tool called
23:42
SBOM Scorecard. Then there's the NTIA compliance checker from SPDX. I'm not sure, and there's... I don't know, are the end of folks here still?
24:01
Okay, so I seem to remember that they were handling some of that as well, but there are a couple of tools out there. It's more like a remark, but I'm a bit surprised we didn't mention OpenChain that much. With OpenChain, the goal is to build trust in the suppliers, so you can trust the SBOMs from the supplier more than anything.
24:24
So yeah, what Nico said is that OpenChain touches on the idea of trusting the SBOM and the supplier. An observation, having looked at Python:
24:42
the metadata that goes with Python packaging is really inconsistent. How do you spell Apache 2? How many different ways there are of putting the Apache 2 license is amazing. Exactly. And between releases, information changes or disappears. Yeah, so
25:01
this is really a message for the ones in the ecosystems: put as much data in the ecosystem as you can, because it's going to support... Yeah, exactly. Yeah, the comment is... right, exactly.
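The "how do you spell Apache 2" remark describes a normalization problem that SBOM tools end up solving with lookup tables. Here is a small Python sketch of the idea: free-text license strings are folded to a canonical key and mapped to the SPDX identifier Apache-2.0. The alias table is hypothetical and far from complete, and anything not in it is reported as unknown rather than guessed.

```python
import re

# Hypothetical alias table; real metadata contains many more spellings.
APACHE_ALIASES = {
    "apache 2",
    "apache 2.0",
    "apache2",
    "apache license 2.0",
    "apache license version 2.0",
}


def normalize_license(raw: str):
    """Return 'Apache-2.0' for known spellings, or None when unsure."""
    # Fold case and treat whitespace, commas, underscores, hyphens alike.
    key = re.sub(r"[\s,_-]+", " ", raw.strip().lower())
    return "Apache-2.0" if key in APACHE_ALIASES else None


for spelling in ("Apache 2", "Apache License, Version 2.0", "apache-2.0",
                 "Something Else"):
    print(repr(spelling), "->", normalize_license(spelling))
```

Returning None for unknown spellings keeps the tool from asserting a license it cannot back up, which is where a NOASSERTION value would come from.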
25:27
Yeah, the comment is that in Python, sometimes between releases, information changes or disappears or whatever. So this is actually one of the things that some of us would like to see happening: people working on packaging systems, on
25:44
language ecosystems, to start, if not having SBOM generation straight away in their tooling, at least exposing the information, so that we SBOM tool makers can go in and extract it from more trustworthy sources.
26:08
Okay. Can you repeat it?
26:23
In the case of the... yeah, but just in the case of the distro, or...? Well, yeah, the question is how do you deal with patched software, right, when you apply a patch. So,
26:41
I mean, you still have the hash, right? Or was the question about naming? Yeah, so if you're describing a patched artifact, I mean, for the hash, simply hash the thing and you can use that down the stream. The problem comes when you're trying to define: well, I'm using curl, but I applied a few custom patches myself.
27:06
How do you name that? That becomes a more complex question. So internally, as I was saying with the integrity thing, you can still reference everything with the hashes, right? Like: I'm talking about the binary with this hash, all the way down the stream.
27:22
But when you want to express it externally, well, I guess that falls into the naming problem, and you have to think about where that thing is going to be used. So if that is going to be a package that is part of a distribution that you're doing, you may define your own set of package URLs, for example. Or if it's not going to be, you can make up the license, but it falls more into the use case of
27:46
what you're trying to do with distributing that patched software. So that's it. All right. Thank you.