Building Strong Foundations for a More Secure Future
Formal metadata
Number of parts: 542
License: CC Attribution 2.0 Belgium: You may use, change, and reproduce the work or its content in changed or unchanged form for any legal purpose, and distribute and make it publicly available, provided you credit the author/rights holder in the manner they specify.
Identifiers: 10.5446/61654 (DOI)
Transcript: English (auto-generated)
00:05
Hi, I'm Brian Behlendorf, I'm the general manager for the Open Source Security Foundation, which is a project hosted at the Linux Foundation, but has its own rather large membership and set of activities and the like. And I thought I'd take the time to talk to you this morning about some of the things
00:22
that we learned coming out of the Log4Shell incident, and in general what we're doing at the OpenSSF to try to improve the state of security across all of open source software. And I do apologize for using the term software supply chain, I know folks are sometimes very sensitive to thinking of themselves as suppliers.
00:40
You're all developers, you're all building components, you're all handing things off to the next person, right? And I just want to be sensitive to that and recognize a lot of us pursue this not just to write code for our companies or for other people to use, but because we love it, because it's like a form of literature. And so I come to this very much from an expansive view of what software is.
01:03
But if you'll indulge me with the term software supply chain, lots has been made about the fact that today, 2023, open source software is incredibly pervasive, because the further upstream you go in most software supply chains, even if the end result is proprietary software, the further upstream, the much more likely it is that your dependencies,
01:24
your components, are open source code. Something like 78%, according to a study by Synopsys last year: 78% of code in a typical product code base, whether that's a container image, software in a phone, or software in a car, is on average preexisting open source code.
01:44
That last 22% is the part that the company put its name on and whatever. And 97% of code bases somewhere contain open source software. And 85% of code bases contain open source that is more than four years out of date.
02:00
The comparable moment for this in Log4Shell, by the way, was the number of companies who claimed we're not vulnerable to the Log4j problem because we're still on version 1.x rather than 2.x, don't worry, when 1.x had been out of support, out of any updates, for five years. So this is kind of a disaster. But fixing this requires thinking systematically about what the software supply chain looks like.
02:23
And this is highly simplified, and in a way this is only what happens at one node of a chain, right? But within a given software lifecycle, you've got the developer writing code from their head or in partnership with Copilot now, I guess, into an IDE that then goes into
02:41
a build system and pulls in dependencies and then creates packages and pushes them out to a consumer of sorts, right, who could be another developer who then repeats that process and just uses that input as dependencies. And there's at least eight, and there's probably a lot more, but at least eight kind of major opportunities to take advantage of some default biases and assumptions and frankly
03:06
just things we forgot to close up in the course of this development process. Everything from bypassing code review to compromising the source control system to modifying code after it's come through source code and into build to compromising the build platform
03:20
to using a bad dependency, to bypassing CI/CD entirely, which we all know happens, to compromising the package repo, to using a bad package as a consumer. So each of those has had examples of compromise in the last few years that has caused major breaches and data loss out there.
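To make the "bad dependency" and "bad package" hops concrete: one common mitigation is pinning every dependency to a cryptographic digest recorded when it was first vetted, so a swapped artifact fails closed. A minimal sketch of the idea; the digest and path here are placeholders, and for Python projects pip's `--require-hashes` mode provides this behavior natively:

```python
import hashlib

# Placeholder digest, recorded when the dependency was first reviewed.
PINNED_SHA256 = "0000000000000000000000000000000000000000000000000000000000000000"

def verify_artifact(path: str, expected_sha256: str) -> None:
    """Refuse to use a downloaded dependency whose digest has changed."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            h.update(chunk)
    if h.hexdigest() != expected_sha256:
        raise RuntimeError(f"digest mismatch for {path}: possible tampering")

verify_artifact("vendor/somelib-1.2.3.tar.gz", PINNED_SHA256)  # hypothetical path
```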
03:40
And of course we all, I don't know if any of you were on the front lines of fighting this fire over the winter holiday of 2021 into 2022, but it ruined a lot of people's holidays when Log4Shell hit. And by the way, I want to refer to the vulnerability and the breach and the remediation
04:01
as the Log4Shell problem, not the Log4j problem, because the Log4j developers don't deserve to have the brand of their project turned into exhibit A in what's broken about open source. It's actually great software, they're all professional developers, let's give them some credit; there's a bunch of contributing factors we'll walk through, but it was really the Log4Shell breach. And what happened in the course of about six weeks is you went from a researcher for
04:25
Alibaba in China finding a vulnerability out of the ordinary course of due diligence work that he was doing, reporting it appropriately through the Apache Software Foundation processes, and that leading to the very first CVE, starting from November 24th all the way to
04:45
January 4th and January 10th, so about six weeks, where you have governments like the UK government warning people of this major systematic issue, right? And three more CVEs being discovered of various degrees of intensity, each of them leading
05:03
to a subsequent patch, to a subsequent remediation by exhausted IT teams. I mean, if you talk to any of the Log4j developers, I don't believe any of them are here, but they would talk about things like getting these demand letters from corporate legal departments, asking them to fax back a signed attestation that they had fixed the holes in Log4j in
05:24
that company's use, when there had been no relationship; that company was a free rider on top of their code. So I'm not going to read through each of these steps, I apologize for this, but there was this incredibly compressed timeline where people were intensely stressed, where really the goodwill that we show as open source developers by putting our code out
05:42
there, and the fair warning that we give people to use it at your own risk, was substantially attacked, right? All these misconceptions that companies have about the underlying open source code, and the degree to which they take advantage of it, kind of came to bear. And so it raised a bunch of questions amongst folks who perhaps hadn't thought about this
06:02
before. Is open source software's generally good reputation for security well-deserved? Does this demonstrate deep and pervasive issues, technical issues, across how we consume and develop open source code? Do these issues extend to the sustainability model for open source itself? Can we really depend upon so many "volunteers", right?
06:22
I mean, think about it, we don't depend upon volunteers to build bridges and highways to maintain our electrical grid, right? How do we depend upon volunteers to maintain our critical infrastructure that everything runs on, right? And of course, it wasn't just like us as technologists asking these questions of each other, it was like compliance and risk officers.
06:43
It was the cybersecurity insurance industry. It was the European Union and the White House and UK's NCSC and other government agencies kind of all challenging us, do we know what we're doing? And one interesting report, and it is worth your time to read, it's about 49 pages, came out about six months after the fact.
07:03
What happened was the US government convened a group of experts from across industry and had them go and talk to the Log4j developers, talk to other open source experts, talk to lots and lots of open source foundations, and try to ask what went on, what contributed to this? And it was modeled after, you know, when a plane crashes and the government will convene
07:24
for many of them like a study group to answer, well, why did this plane crash other than, of course, sudden loss of altitude? What were the underlying root causes to this plane crash? And so this was modeled after the very same thing. And it's a great report to read, I think, because it also comes up with some recommendations
07:42
for how potentially to prevent the next one. And I'll walk through a couple of the conclusions. They said, you know, a focused review of the Log4j code could have identified the unintended functionality that led to the problem. Understand, the original big bug was in a portion of code that had been contributed
08:00
to Log4j years and years earlier by a company that wanted to support LDAP lookups in real time during the logging process, which seems like an extraordinarily bad idea to me. But okay, I'm not in enterprise IT. So they'd added this functionality and then kind of left. They didn't stick around to maintain it, and they didn't really do due diligence on the security of their own code.
08:24
And the other Log4j developers just kind of kept it around because they didn't get many bug reports on it, it wasn't really a problem. So it was kind of this forgotten portion of the code. So part of it was if they'd had security resources to look at every line of code that
08:41
they were shipping out, rather than just the core stuff, which they were pretty diligent about, they might have discovered this bug. It might also have been discovered if the developers themselves had developed, had adopted certain secure coding practices consistent with how certain other organizations kind of define how those processes should work.
09:01
If they'd had design reviews that focused on security and on reducing the attack surface for problems. If they'd used threat models to understand, hey, we really should try to make sure we're better protected against attacks coming in through user-generated input, right? When you're a logging engine, you're dealing with a ton of user-generated input.
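To make that concrete, the dangerous pattern is letting untrusted input become the template that gets interpreted, rather than treating it as inert data. A rough Python analogue of the same class of bug, with purely illustrative names:

```python
class User:
    def __init__(self, name: str, api_key: str):
        self.name = name
        self.api_key = api_key  # sensitive field an attacker wants

user = User("alice", "sk-secret-123")

# Anti-pattern: attacker-supplied text is used as the template itself,
# so its formatting syntax gets evaluated, much like Log4j's ${...} lookups.
attacker_input = "hello {u.api_key}"
print(attacker_input.format(u=user))           # prints: hello sk-secret-123

# Safer: the template is a fixed literal; untrusted text is only a value.
print("user said: {}".format(attacker_input))  # prints the braces verbatim
```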
09:23
If you're parsing that for things like format strings, which is what they were doing here, that's potentially very dangerous, right? That's something we've known kind of since some of the earliest CVEs out there. And then finally, if they'd had proper security audits. And so in answering that and trying to generalize from it, and it's always dangerous to generalize
09:41
from a single example, but they found that the only way to reduce the likelihood of risk to the entire ecosystem caused by these kinds of vulnerabilities in widely used open source code was to ensure that as much code as possible is developed pursuant to standardized secure coding practices. Now that kind of sounds like, well, if only the pilots had had better training or if
10:02
only the planes had been maintained better, then we wouldn't have had these problems, right? It seems a little bit like hindsight is 20-20, right? But they did acknowledge that the volunteer-based model of open source would need many more resources than they have to be able to make this possible, on average.
10:21
I mean, you've all heard the aphorism, which is called Linus's Law, but it was coined by Eric Raymond, and Linus has actually disavowed it, that says, with enough eyeballs, all bugs are shallow, right? What was missing from that quote was eyeballs per line of code. And I would argue that even in some of the best open source projects,
10:41
we don't have enough eyeballs per line of code. And anything we do that divides that pool, such as forks, for example, only takes us further away from having enough eyeballs to review code. There's a lot else I could go into; it's not really just a supply chain security story. It was kind of a supply chain story because of just how pervasive
11:01
Log4j was. It was a bug that affected everything from the iTunes store, to people badging in with security badges, to all sorts of embedded systems and the like; Log4j was kind of everywhere. And because it was everywhere, it was in a whole lot of places that people didn't have the tools to even go and discover.
11:21
Often it was compiled into JAR files, so you couldn't even just do a directory listing and grep through it to find Log4j.jar. You actually had to interrogate the development process. And without things like SBOMs, which appropriately the US government has focused on kind of saying this is an important thing to have, without a tool
11:41
like that a lot of enterprises were left scrambling to figure out if they were vulnerable, which versions they were running, and then as I mentioned, making ludicrous claims like we're not vulnerable because we're way on an old version of Log4j, or asking completely disinterested third parties like the Log4j developers themselves to attest that they're not vulnerable, right? It was kind of crazy.
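For what it's worth, the workaround most responders converged on was exactly that interrogation: walking the filesystem and looking inside every Java archive for the vulnerable class. A simplified sketch of the idea; real scanners also recurse into JARs nested inside fat JARs, and the search root here is hypothetical:

```python
import os
import zipfile

def scan_for_log4j(root: str) -> None:
    """Flag Java archives that bundle Log4j's JNDI lookup class."""
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if not name.endswith((".jar", ".war", ".ear")):
                continue
            path = os.path.join(dirpath, name)
            try:
                with zipfile.ZipFile(path) as zf:
                    if any(n.endswith("JndiLookup.class") for n in zf.namelist()):
                        print("possibly vulnerable:", path)
            except (zipfile.BadZipFile, OSError):
                pass  # unreadable or not really a zip; skip it

scan_for_log4j("/opt")  # hypothetical search root
```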
12:03
Moving on, part of this as well is trying to understand the motivations of developers on an open source project, which again, they took a look at, but I think they could have pursued this even a little bit further. When you work on an open source project, your primary motivation, well, okay, first off, you probably all start as a user of the software.
12:22
Probably your first step into an open source project was not to write it from scratch. Maybe it was, but in either case, you have some utility, something you want to use it for. So your first interest is to get it running, and get it running correctly. And so you're going to be fixing bugs. You're going to be adding features here and there, right? But adding things that help make the software more secure,
12:43
very rarely do they turn into immediate benefit for you. It's often a thing that's hard to convince your manager is worth doing, right, because it doesn't necessarily affect what the manager sees in terms of like the feature set and the code or hey, it's now fit for purpose or whatever. So there's a lot of sympathy that we can have for positions taken by like
13:02
folks like Ben, who wrote the Cyber Resilience Act and who was up here yesterday kind of defending what was written in the act, to say, well, maybe there are other forces that need to come into play to help support those kinds of outcomes, right, to act as a forcing function for it if it wouldn't otherwise be there.
13:21
But it's really hard to measure the return on that benefit independently. And if you can't measure the ROI, it tends to get disincentivized. So as a way to illustrate this, particularly with Log4j, I've had conversations with Amir Montazery, who some of you might know, who's with OSTIF, who's here. We've kind of asked just, hey,
13:42
what would it have taken to do a proper third-party code review for security of the Log4j code base, right? Just as an independent thing, looking at the number of lines of code in there. And the estimate we came back with was $50,000 to $100,000, depending on how deep you wanted to get. Let's say one of those would have found all four of those CVEs, possibly more. And with a little bit more money, generously,
14:04
let's say another $50,000 to $100,000, you could have funded the fixes for those bugs and coordinated a disclosure process such that everybody, or at least a lot of people, would get updated, and then you publish the release. And it wouldn't have been this mad scramble taking place between Christmas and New Year's for a lot of folks.
14:20
So $200,000, which is beyond what I think any of the Log4j developers had in their back pocket. It's beyond what I think even the eight or ten of them together would have individually been able to put together or convinced their employers to put in as a chunk of cash. But it was far less than the negative impact that that breach had on society, right?
14:44
I mean, it's hard, no one's actually sat and tried to calculate how much. And when they've started, they've come back with billions of dollars in lost productivity, in breaches, and other things. So trying to play that back with the benefit of hindsight, that kind of thing: could we have discovered this and fixed it?
15:03
Could we find the next one, spend $200,000, and keep it from being likely to happen? I don't think I could give you the one project that it's likely to be. But what if I could give you a list of 200 projects, each of which probably has a greater than 1% chance, based on their criticality, as you can measure from how often these
15:24
code bases are depended upon by other packages. There's lots of data sources for that, and we at the OpenSSF have developed something called the Criticality Score that'll find that. What if we could find this list of 200 projects based on criticality, based on how well they score by some objective measure of risk?
15:43
And I'll get into this in a little bit. Could I give you that list of 200, and could that likely, I mean with more than 50% chance, prevent the next Log4j? I would wager yes. And that $40 million, again, is more than any open source foundation has to be able to spend on this kind of work. Even all the foundations together collectively probably couldn't spend that.
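Running his numbers back, the arithmetic behind that wager looks roughly like this; all inputs are the assumed figures from the talk:

```python
projects = 200
cost_per_project = 200_000   # USD: audit, plus fixes, plus coordinated disclosure
p_critical_bug = 0.01        # assumed >1% chance each harbors a Log4Shell-class bug

total_cost = projects * cost_per_project            # the $40 million figure
p_at_least_one = 1 - (1 - p_critical_bug) ** projects

print(f"total cost: ${total_cost:,}")                            # $40,000,000
print(f"chance of catching at least one: {p_at_least_one:.0%}")  # about 87%
```

Under those assumptions the chance of the sweep catching at least one Log4Shell-class bug comes out well above the 50% he wagers.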
16:02
And this is the kind of thing you probably have to do each year in order to have that kind of impact. But $40 million is, I don't mean to sound blasé about it, but frankly, pocket change for a lot of governments, especially if we got governments to work together on this, or the insurance industry to work together, or
16:20
say many of the sectors who use this software without lifting a finger to contribute back, right? If we pooled these kinds of funds, we could have an impact like that. Some people just wanna watch the world burn. I had to throw in the obligatory slide, of course. But I wanna kind of push forward this theory of change, because it's not just about spending money on specific interventions like that.
16:43
I'll come back in a little bit to how we might rally those kinds of funds and focus on that kind of work, and what we're doing at the OpenSSF to enable it. But I wanna also put forward that it's actually not just a matter of spending money. It's not just a matter of a mandate from a government
17:02
to get open source software to be more secure, to get our processes and the supply chain to be more secure. There's a culture change that has to happen as well. People are often very resistant to change. When your CI system is running and you're able to put out a new release and turn the crank, and a few hours after accepting a pull request, you've got a binary.
17:24
accepting a pull request, you've got a binary. You kinda don't wanna mess it up, you don't wanna change it. And especially the older we get, the more resistant we are to having to learn a new system or change, especially if there seems to be no benefit. But this is not unlike other times over the last 30 years that we've taken
17:41
an insecure paradigm and made it more secure. And my general theory is something called carrots, defaults, and sticks. And the best example I can come up with for this is how we went from a completely unencrypted web, where browsers and servers talked clear text HTTP that could be sniffed by anybody between browser and server.
18:03
And got to the point where today, I mean, somebody might know the number, it was like 95% of web traffic is HTTPS. It's actually probably 99% now, based on what's happening recently with browsers. But it didn't start with the browser maker saying, right, on April 1st, we're gonna cut off or send these warning signs about unencrypted access
18:21
at the beginning of the TLS era. It started with incentives. It started with carrots. It started by having that little green key that would show up in the location bar on a browser. It might start by certain folks saying, well, this is something that should be used for banking websites, or for e-commerce websites, or hosted email sites, or that kind of thing.
18:42
And that got about 15% of the web traffic getting encrypted out there, but it started to flatline. And then a bunch of people got together and realized, it doesn't have to be this hard to get a TLS certificate and install it in the right place. We can automate this. We can automate demonstrating that you have control over this domain name.
19:04
And if you do, then to give you a short-lived TLS certificate that can automatically be installed in the right place in the web server. And that service was called Let's Encrypt. And it is now, I mean, for the last ten years, it's been at the point where you can automatically, Apache, when you install a web server and
19:23
tell it to set up the TLS version of a site or a TLS profile, it'll automatically set up a fetch to Let's Encrypt for the domain name you give it. And it's automatable out of the box, right? And that is what got us from 15% of the web being encrypted to about 75%.
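The protocol behind that automation is ACME, and the heart of its HTTP-01 challenge is almost embarrassingly simple: the CA hands you a token, and you prove control of the domain by serving it back at a well-known path. A stripped-down sketch with made-up token values; in practice a client like certbot handles all of this:

```python
import http.server

# Values the CA would issue during a real ACME HTTP-01 challenge (made up here).
TOKEN = "example-token"
KEY_AUTHORIZATION = "example-token.example-account-thumbprint"

class ChallengeHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        # The CA fetches http://<your-domain>/.well-known/acme-challenge/<token>;
        # a matching response proves control, and a short-lived cert is issued.
        if self.path == f"/.well-known/acme-challenge/{TOKEN}":
            body = KEY_AUTHORIZATION.encode()
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

# Port 80 needs privileges; real ACME clients manage this and the renewal loop.
http.server.HTTPServer(("", 80), ChallengeHandler).serve_forever()
```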
19:44
And at that point, about five, six years ago, is when the web browser makers said, right, it's time for us to bring up the tail. And that's where I talk about sticks. And to finally get the laggards, the legacy sites, the folks who probably don't care about it, who probably even haven't updated their web server in five years or ten or
20:03
whatever, to finally get off the duff and use Let's Encrypt or some other technique. And they did that by making it progressively harder and harder for you to access a non-encrypted website through Firefox, through Chrome, through MSIE, through other browsers. And they kind of talked amongst themselves how to do that.
20:22
They tried not to piss people off, but you kind of have to piss some people off to do that. And as long as you just kind of progressively roll through, you can kind of bring people along. And there's some who will just forever be pissed off. But that's the tail end of an adoption curve, right, this kind of concept of sticks. We need to think about the same thing when it comes to things like SBOMs or
20:43
signing artifacts in the supply chain or software attestation levels, which I'll get into a bit. But when we think about how to get adoption of some of these security paradigms, it's gotta be through this three step kind of process. We can't just jump directly to sticks,
21:01
which is kind of what the European Union Cyber Resilience Act attempts to do. And I will say, I think the CRA is a backlash to the Silicon Valley move fast and break things kind of paradigm, this concept that open source software is some sort of reflection of that or connected to that, and that we're just as reckless. But none of you all are Mark Zuckerberg, thankfully.
21:20
None of you all, I think, take that degree of recklessness as a badge of honor. I think we're all just completely strapped for the amount of time that we'd really like to spend on making the software as secure as possible. And we need help, we need defaults. And we need, by the way, what's always worked in open source software as a do-ocracy, which is people showing up and
21:41
doing the work if that's what their primary interest is about, right? If somebody can sit on stage here and say it's absolutely essential that this French nuclear power plant only run open source software that has been certified against a whole bunch of cybersecurity requirements, it's on them to do that work, not on the Log4j developers or others.
22:01
So this is where OpenSSF comes in. We were started in 2020, kind of as a result of a small kind of gathering that had been hosted on the West Coast, people working on software projects that had to do with enhancing the software development processes in the open source community to be a bit more secure.
22:21
It was a mix of a bunch of different pieces of software, suggestions of protocols, building on some of the SBOM work that had been actually championed first by the licensing community, by the software licensing community. This is in particular a standard called SPDX for SBOMs. But they kind of realized that collectively what they were doing was
22:43
building tools that would help try to measure risk in open source. And what does that mean? It means measuring the likelihood that there will be a new undiscovered vulnerability in this component and the impact that that would have downstream, right? Measurement is essential. If we can't measure whether we're improving the overall risk in that chain
23:02
and the collective risk in our use of that software, we're not gonna know whether the interventions that we're trying are actually meaningful. So you've gotta measure it. You've gotta then think about this sequence of carrots and defaults and sticks to eventually get this stuff adopted, if it's any good, right? And then finally, as part of this culture change,
23:21
are there things that we should be learning as open source developers, things that we should be thinking about as a professional type of operation, like a duty of care, as engineers? With that term, you used to have to go and take a certification exam to call yourself an engineer, right?
23:40
And that instilled a sense of professionalism in that industry that led to bridges that didn't fall down when you hired an engineer to design it. We need a little bit of the same professionalism in software development across the board, not just open source. And here are a set of resources that might help us, from a security point of view, be better developers. So collectively, we wanna put these pieces together.
24:02
And we've got all sorts of projects in working groups, projects organized by thematically related working groups, a working group on best practices and documenting those and advocating for those. A working group on identifying security threats, understanding, relatively speaking, what are the areas to really worry about
24:23
and the areas that might represent low threat. How do we think about supply chain integrity, like that chart I showed? How do you get those pieces, those opportunities for bugs to be inserted, to just be locked down and hardened? And then there's the CVE system, which is not great, frankly.
24:43
It's nowhere near perfect. It's not great for trying to automate and understand, given this collection of software I use, where are the known vulnerabilities and how easy is it to remediate them? Are there known vulnerabilities that just don't matter because I'm not using them? And so there's an entire working group focused on vulnerability disclosures
25:01
and on the vulnerability system that has a bunch of new ideas for this, but also developed content to try to help developers simply be better at coordinating vulnerability disclosures and updates. We've got another working group focused on, once you've identified those critical projects, well, securing and identifying critical projects, and that's where we've defined the criticality score.
25:22
We've done some work with Harvard Business School to understand quantitatively how things are being used by enterprises and where the next Log4j might be lurking, so to speak. And then one of the most important things we've got here is we've pulled together the architects and the people responsible for product
25:41
at many of the major package repositories, npm, PyPI, Maven Central, because if we're going to get anything improved throughout the chain, you need to involve the last couple hops of each of the nodes in that chain, which are the distribution points, and there are things you can do there to encourage more secure alternatives and eventually have the stick to say, well, no, we're not going to accept,
26:04
you know, things like you might need to enforce two-factor auth for the more popular packages, right, which has been controversial to say the least, but is one of those things where it's like somewhere in that adoption curve we need to start nudging people into a more secure direction. But all of these pieces work together.
26:22
And if you are an open-source software maintainer, what I'm going to walk through now is a set of specific things coming out of the OpenSSF that I'd love you to adopt. I'm not going to be able to talk about all the features of each, just given time. But the very best starting point, the very first thing you can consume,
26:40
the very first thing that you can get from the OpenSSF, is two concise guides. One we've developed for evaluating open-source code, for when you're out there looking at packages and trying to figure out: is this community likely to have good processes, and is there likely to be an undiscovered vulnerability lurking in this code that I'm about to use, right?
27:01
Is this a well-engineered, well-maintained active community that has adopted the right practices and the like, or is this a one-off that was developed by one person in a hurry, thrown up on a repo and not well-maintained? We've got some ad hoc cues that we can use that most of us have, but how many of you use GitHub stars as your basis
27:23
for deciding whether something is probably secure enough or not? I'm going to guess probably too many of you. So there's a bunch of subjective criteria that you can use. The flip side of that is if you are a maintainer and you are pushing code out, here are the signals you can send to your consumers of that software
27:44
that show that you're taking this stuff seriously, right? And it's a bunch of best practices, adopting multi-factor auth, taking the courses on secure software development, using a specific combination of tools in the CI pipeline, thinking about how do you get to the point of doing rapid updates
28:01
without throwing curveballs to your users because you change APIs all the time. That's the number one reason people don't update: they assume that, even in minor point releases, something is going to break, because somebody took an API and not only marked it deprecated but removed it, or changed a field in a way that sends things sideways, or changed behavior in a way they thought was compatible but was not.
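One low-cost habit that addresses this is to never remove or repurpose an API in a minor release, and instead deprecate it loudly for a full cycle first, so updates stay boring. A sketch of the pattern in Python, with hypothetical function names:

```python
import warnings

def fetch(url: str, timeout: float = 30.0) -> bytes:
    """The new, preferred API."""
    ...  # implementation elided

def get(url: str) -> bytes:
    """Old API, kept working for a deprecation window instead of vanishing."""
    warnings.warn(
        "get() is deprecated, use fetch(); removal is planned for the next major release",
        DeprecationWarning,
        stacklevel=2,  # point the warning at the caller's code
    )
    return fetch(url)
```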
28:23
How do you get to the point where you can have more rapid updates and make it easier for your end users to pick those up? Now, some of these ideas were elaborated upon in a course that we built within OpenSSF and have offered for free now through the Linux Foundation's training department called Secure Software Development Fundamentals.
28:42
This has been translated to Japanese and a bunch of other languages: Chinese, Arabic, and Hebrew. And it's 14 to 18 hours worth of content that primarily talks about anti-patterns, right? What does it mean to not trust user-contributed input, right? What are some of the other common gotchas
29:01
that have led to security vulnerabilities and breaches? That, as most software developers are self-taught, some people take courses, but even most university, you know, undergraduate level courses on computer science don't really teach about vulnerabilities and about common mistakes as well as they could. So this is something we think anybody who is writing code
29:21
for a living, or even for a hobby and giving it to somebody else to run, should probably take. And the flip side of this is you might want to look and see whether the developers who are working on a thing you really care about have taken this course. You can get a badge that certifies you've taken the course and answered a basic quiz.
29:40
It's not onerous, and it's free, but we hope it's something that helps substantiate that somebody is a bit more knowledgeable about this than they otherwise would be. Another part of this is something called the Best Practices Badge. This is a fairly extensive checklist of the things that open source projects can do to show the steps they take. They have a security team.
30:02
They distribute things over HTTPS. Some of these things seem pretty basic, and no individual one is a guarantee of security-hole-free code, but collectively they can represent that this is a project that takes security more seriously. And studies have shown that the projects
30:20
that have better scores tend to have fewer CVEs over the subsequent months and years. Now, there's an automated tool for those projects because the best practices badge is something that requires the maintainers to fill out a questionnaire, a checklist. There's some automation to that, but it's really just used to check the answers that the maintainers give.
30:41
There's a different approach, which is much more of a scanning kind of approach, called OpenSSF Security Scorecards, that automatically goes and scans repositories. It's gone and done a first wave scan of a million different repos, and you can trigger it to do an updated scan if you've made changes to your repo, but it takes dozens of different heuristics,
31:03
things like, do you have binary artifacts you've checked into your GitHub repo? That's probably not a great thing. Storing binaries in repos, I don't know how many of you might disagree, but for reproducibility, maybe, but use your package manager to get your binary packages.
31:21
Checking it into source code control opens the door to things that are not scrutinizable inside your source code system. Branch protection, CI tests, do you have the Best Practices Badge? Have you done code reviews before code is merged, or does everybody just have commit privs, right? Do you have contributors from more than one organization?
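Results for repos that have already been scanned are also published through a public REST API, so you can fold them into your own dependency reviews. A sketch; the endpoint and response fields are as I understand them, so verify against the current Scorecard documentation:

```python
import json
import urllib.request

def get_scorecard(repo: str) -> dict:
    """Fetch published Scorecard results, e.g. repo='github.com/curl/curl'."""
    url = f"https://api.securityscorecards.dev/projects/{repo}"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)

result = get_scorecard("github.com/curl/curl")
print("aggregate score:", result["score"])  # the 0-10 "credit score" he mentions
for check in result["checks"]:
    print(check["name"], check["score"])    # per-heuristic results
```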
31:42
So some of this adopts the CHAOSS metrics, which look at community health, but some of this as well are things like: do your automated tests call fuzzing routines, fuzzing libraries? Now, you can game a lot of these tests. Again, none of this is proof that your code is defect-free, but collectively, what this can do,
32:03
along with these other kinds of measures of risk, is essentially develop a credit score for a project. And the scores in Scorecard actually do correlate to lower CVEs. There was a study done by Sonatype who looked at the projects that had been scored
32:21
and discovered that, after receiving a score, there's a couple of different categories within the Security Scorecards, so they were really interested in which of those correlate most strongly to a lower number of CVEs. That was an interesting outcome, and it's going to be used to refine the Security Scorecards continuously over time
32:41
to have them reflect kind of the changing landscape of some of the better run projects. But that's a big deal, and it's something that some projects have picked up as a leaderboard tool. The Cloud Native Computing Foundation ran a kind of competition recently, called CLOMonitor, kind of on the sidelines for a month
33:01
of their main KubeCon event, where they got the maintainers of the different projects to commit to a floor on the score of, I think it was six or seven out of 10, for all of their projects, right? And they had kind of a competition between them, with rewards for the maintainers who got their projects highest on the scorecard,
33:23
so really cool to see. So, all of that was about measurement. Now, the next set of things are about tools that help actually harden the software supply chain, so to speak. And one that you've no doubt heard talked about before, so I won't dwell on it too much, is something called Sigstore.
33:40
And Sigstore is a software signing service, it's a set of software that are clients into that service, and it's a protocol, a certain way of signing that's an alternative to GPG signing of code. And it's a recognition that we haven't really signed artifacts throughout the supply chain pervasively, except at the very end.
34:00
You know, when you do an apt-get install, it checks the GPG signatures on each package, and okay, that is helpful even above and beyond the fact that you're sending stuff over TLS, right? But for the rest of upstream, so often people are just pulling off of NPM, pulling off of package hosting, other packages just stored on bare websites,
34:21
where even validating the hash of that and fetching it over HTTPS doesn't prove the connection between the developers behind that code and the binary you have. There have been examples of people registering npm packages that are named the same thing as a GitHub repo, kind of a typosquatting attack, as a way to try to cause you to inadvertently
34:41
pick up the wrong piece of code. Obviously, even sites like GitHub can be hacked, could be compromised, and you don't want your enterprise to be compromised if GitHub's compromised, frankly. So this is a tool to try to prevent that kind of thing, and it logs all of this to essentially
35:00
a distributed ledger, a public database, using short-lived keys, so you don't have to worry, the way PGP requires, about private keys that you have to guard and keep private for a long time. Just like Let's Encrypt, this is based on short-lived keys and an easy way to re-provision those keys. I won't go into much more depth on that.
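This isn't Sigstore's actual protocol, but the core idea can be sketched in a few lines with the `cryptography` package: sign with a key that exists only for the duration of the release, publish the evidence to an append-only log (the role Rekor plays in Sigstore), and throw the key away:

```python
import hashlib
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import ec

artifact = b"the bytes of the package you are releasing"

# 1. Generate a short-lived keypair just for this signing event.
key = ec.generate_private_key(ec.SECP256R1())

# 2. Sign the artifact.
signature = key.sign(artifact, ec.ECDSA(hashes.SHA256()))

# 3. Publish {artifact digest, signature, public key} to an append-only,
#    publicly auditable log, which is what makes later tampering detectable.
log_entry = {
    "digest": hashlib.sha256(artifact).hexdigest(),
    "signature": signature.hex(),
    "public_key": key.public_key().public_bytes(
        serialization.Encoding.PEM,
        serialization.PublicFormat.SubjectPublicKeyInfo,
    ).decode(),
}

# 4. Discard the private key: nothing long-lived to steal, and verifiers
#    check the artifact against the transparency-log entry instead.
del key
```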
35:21
There's a few other things I'll point you to, something called SLSA, Supply-chain Levels for Software Artifacts, basically a way to distinguish between those things in your chain that are built to a higher degree of rigor than other things that are not. We've got another thing, back in the best practices working group, which is a guide to coordinated vulnerability disclosure.
35:42
So the next time there's a team, say they're not associated with Apache, say they're not associated with a major foundation who discovered that they've got a pretty nasty bug and their code is used pretty widely, well who do they turn to to understand how to manage a coordinated disclosure process,
36:01
how to not get in trouble for keeping something a little bit quiet while they've come up with the right fix, and then how do you evaluate who potentially to notify ahead of time so that you're upgrading enough of the internet before something becomes more widely known. And by the way, even if that sounds controversial, there's some people who say, including the CRA, like if you know about a bug,
36:20
you should tell everybody immediately. Well, does anyone remember the hack that almost brought down the internet? The DNS cache poisoning bug in BIND, which, if it had become widely known before the root name servers had been updated, would have set us back so far that we would have had to revert to /etc/hosts files to be able to connect on the internet again
36:41
and get started. So the need for coordinated vulnerability disclosure might be somewhat controversial, but it has become much more widely accepted today than ever before. And one thing we're going to do in the OpenSSF is pull this all together into kind of a single dashboard to understand that risk, understand how open source projects
37:02
compare apples to apples, and really as a tool to help the maintainers in those projects get better at what they do. But also, frankly, if we can help enterprises understand where the risk lies in their use of code, if they can start making more choices based more on the security of that code than necessarily what is the most features
37:21
and the most users, then we can start to, I think, bend industry in a better direction, somewhere between the carrots and the defaults kind of step of getting folks to adopt stuff. I do also want to throw out one of the biggest efforts that we have under the OpenSSF is this thing called the Alpha Omega Project, which is independently funded
37:43
and staffed, and it's got two pieces to it. The Alpha side is going and helping the largest open source foundations out there with the most critical needs basically develop better security practices, develop security teams, go and do some proactive third-party audits, but develop this muscle,
38:01
this capability that hopefully persists. Even, you know, we'll go and we'll fund some of these projects for a couple years, and our hope is that at some point the stakeholders in that community take on that funding themselves, right? The companies who are depending upon that see the value of this kind of proactive investment and continue it forward so we can move on to the next set of projects. The Omega side of that
38:20
is trying to set up scanning infrastructure for the most important 10,000 open source code bases to look for new kinds of vulnerabilities, to ask, you know: this JNDI LDAP bug in Log4j that led to this thing, is it novel? And if it's novel, can we systematically scan for other projects that might be vulnerable to the same thing? And in some cases,
38:40
could we even submit proactive pull requests to go and close those? And an example of another organization that has done this recently, I don't know if folks saw this announcement by Trellix last week, where they went and discovered the 60,000, 61,000 open source projects that used the Python tarfile module
39:01
in an insecure way. This is actually an old CVE from 2007 that the CPython devs have refused to fix because of a claim that the only way to fix it would be to break POSIX, so we can have that debate some other time. They went and found 61,000 projects that have a vulnerability because they used it unsafely. They didn't sanitize inputs to it.
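The unsafe pattern is calling `extractall()` on an untrusted archive, since member names like `../../etc/cron.d/evil` escape the destination directory. A minimal guard for older Pythons; newer releases also ship extraction filters, e.g. `tar.extractall(dest, filter="data")`:

```python
import os
import tarfile

def safe_extract(archive_path: str, dest: str) -> None:
    """Extract a tar archive, rejecting members that escape the destination."""
    dest_root = os.path.realpath(dest)
    with tarfile.open(archive_path) as tar:
        for member in tar.getmembers():
            target = os.path.realpath(os.path.join(dest_root, member.name))
            if target != dest_root and not target.startswith(dest_root + os.sep):
                raise ValueError(f"blocked path traversal: {member.name!r}")
        tar.extractall(dest_root)

# Roughly the pattern the 61,000 flagged projects used:
#   tarfile.open(path).extractall(dest)   # trusts "../" in member names
```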
39:21
They went and proactively issued 61,000 pull requests on those projects to fix this code. Doing this at scale is tremendously hard, but they did it. So far, after a month and a half of having those pull requests up, do you want to guess how many projects have actually accepted that pull request,
39:40
literally pushed the button to make their project more secure? 1,000. And so we still have like a culture problem here. We still have an incentives problem, even when you've given them this gift, you know, it might not happen. So we're kind of really towards the end of time. I just want to say, you know, we've recognized that,
40:01
as the point I made earlier, you've got to show up. If you're an organization that cares about increasing the security of code, you've got to be prepared to invest time and ultimately money to make that happen. You cannot demand that open source developers simply be better. You've got to go and help them do that and put that investment in. And so one of the things we've done
40:21
at the OpenSSF is over the last year, we developed a kind of an overarching plan to go and address 10 systematic weaknesses in open source and put together essentially business plans for each of them that would call for some funding to pay for kind of like the core of a project on the presumption that we could leverage volunteers around the periphery to try and have some of this impact.
40:42
And I won't go into too much detail on what that is. We call it the Security Mobilization Plan. It includes things like 40 million dollars to go and close security holes, and some other things. It includes setting up an emergency response team for development teams that are kind of understaffed who find a vulnerability,
41:01
to going and doing new scanning, to driving adoption of Sigstore and other standards. So it's pretty comprehensive. 150 million dollars is the number we came up with after asking what we could do that would be lean, kind of low-hanging fruit, but actually have a big impact; we came up with this two-year number of 150.
41:20
It might sound like a lot of money. It's certainly more than the Linux Foundation has. It's more than any of the other major open source foundations, or even frankly Google and Microsoft, have to spend on this, arguably. But there's a larger number out there that I want to focus on, which is 700 million dollars, the fine that the US Federal Trade Commission levied on Equifax for the 2017 data breach
41:42
caused in part by their use of unpatched open source software Apache Struts. So to the industry, making the case that we collectively could pool our funds to go do this should be easy, right? And we've been out there trying to do it and have these conversations doing it at a time when the economic headwinds have not been in our favor.
42:01
But still the kinds of conversations we're having are very positive and things like seeing the Sovereign Tech Fund in Germany pop up and be able to fund not just some improvements in open source code but security enhancements and the like has been really positive to see. And it should be a model for other countries to go and do this. But frankly insurance companies
42:20
as well as banks, as well as all these other industries that have benefited from open source software and haven't really put anything in. And again, we're kind of out of time, so I won't go too much into depth. I do want to emphasize that tooling around the SBOM space is an important part of this as well. And being able to paint this overarching picture about how SBOMs, signed artifacts,
42:42
software-level attestations, and all these other things could have a positive impact. But we've got to show up and not just tell projects to adopt this, but weave it into the development tools as defaults so we can bring industry along. We launched this with the U.S. government back in May. We had a similar meeting in Japan in July.
43:01
We've had conversations in Singapore and with people in other countries and hopefully we'll see something here in Europe along the same lines. But let me just wrap up by saying there are attacks on the integrity of open source software that are increasingly disruptive and require us to work together
43:21
to do things a little bit differently than we have before. It's not that we were reckless or careless or didn't care about security. Most of us, at least; there are some open source projects that definitely have been problems. But there's simply more and more that people who care about this, and are starting to use open source software in French nuclear power plants, can do to help this space be better.
43:44
There's some specific steps as I've talked about and that's kind of why we're here at the OpenSSF and we're here to help but we also need your help as developers. It'd be very easy to see what we do as being kind of the enterprises trying to make open source boring and make it all about checklists
44:00
and I'll concede that, right? But we're also here to try to say if we're going to shift the landscape how do we work with the open source community to do that and because we are part of the open source community how we collectively take some action to make it less likely that our next winter holidays will be ruined. And with that, thank you.
44:28
I think I left about 22 seconds for questions.
44:45
So we've got about five minutes for questions. So let me start by reading one question online. There was a question whether OpenSSF allows for anonymous reporting, that is, without disclosing the reporter, because that would be useful for companies
45:01
fearing backlash or high expectations regarding what they would have to do. So why bother reporting if it's just going to get you into trouble? Why not just keep it quiet and not release anything? So making it possible to report anonymously would possibly improve that.
45:20
and would that be helpful? Well, I will submit open source software has benefited tremendously by the fact that you aren't badging in to GitHub with your national ID, right? You are able to be pseudonymous and you can use Satoshi Nakamoto as a famous example of that, whatever, right? But most of us or many of us have probably created login IDs
45:40
that have nothing to do with our real name, right? And open source software is one of the last remaining places where you can actually productively collaborate with people who aren't fully identifying who they are, right? You're basing it on the quality of their contribution, and that's a really essential thing to try to preserve. And whether it's in reporting bugs or even collaborating on code,
46:00
we should fight to preserve the right to be pseudonymous, or even anonymous if you want to call it that, in the development of code and in the reporting of bugs and the fixing of bugs. So I'm committed to trying to make sure we don't get know-your-developer kinds of rules like some countries have started to call for. Thanks for a really interesting talk.
46:21
I have a question about the economics of your proposed solution. It's not that you can pay developers out of the blue to do these tasks. You have to pull them away from other work so you have to pay them actually more to let all the other work rest and focus on security.
46:40
So this requires a major shift. Have you factored this into your proposal? No, I haven't thought about having to pay not just for the work that we're doing but paying for work that wouldn't be done because we're paying people to do the work. I think most of the time developers aren't able to work on open source code because they have to work on proprietary code to pay the bills. And so I think there's a lot of capacity out there
47:02
for us if we do have funds to be able to pay for the kind of work that needs to be done. I also don't think we're talking about taking away from other software development work that's about adding features or fixing bugs. This is about bringing new types of organizations like the software auditor community to look at, to find new bugs.
47:22
So it's not a big deal. And frankly, 150 million dollars even if that were all spent on software developers would be a drop in the bucket compared to the total amount that is spent on developer salaries out there. So that doesn't worry me. But it's a good question. Thank you for that. Do you have more questions from the public?
47:46
There was one question in the chat about how is the OpenSSF collaborating with OWASP? Is that a collaboration there or not? So there's a question in chat on how is the OpenSSF collaborating with OWASP?
48:02
So Andrew van der Stock, who's the executive director of OWASP, is on our board. There's a lot that OWASP does in terms of education and certification and community building that absolutely is essential. And so we look for ways to work together, and we'd like to avoid overlap with what they do. But yeah, that's about it.
48:21
We're very complementary to their efforts and I think they think the same. Okay, anybody else?
48:43
Hello, we are in Europe here. You have spoken about different conferences all over the world in Singapore. But do you have something in Europe? We've had lots of conversations in Europe. There are many OSPOs starting in Europe who are interested in this.
49:01
And I do think OSPOs are an interesting lever point in being able to get standards adopted around security, being able to present measures of risk, to be thinking about all these kinds of things. So have had interesting conversations that way. Have not had the kind of full-throated engagement around this that we've seen with the United States
49:20
and with some other countries. So I would like to see more. But frankly, even those other countries haven't yet put money into this. We're kind of waiting for certain political cycles to make their way through. But I know we've inspired some action. Again, I do want to cite the Sovereign Tech Fund out of Germany as not specifically a security fund, but the right kind of thing for these countries to be doing.
49:41
So anyways, thank you for the question. Anything else? That's exactly how much time we've had. So thank you again, Brian. Thanks all.