
Handling Security Flaws in an Open Source Project

Formal Metadata

Title: Handling Security Flaws in an Open Source Project
License: CC Attribution 2.0 Belgium. You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Transcript: English (auto-generated)
So as you may have noticed, this is not the talk that was originally billed in this time slot. So in the best tradition, this is not the talk you're looking for.
You can go about your business. Move along. So unfortunately, the previous speaker wasn't able to be here on time. And so I was on the booth outside 15 minutes ago when someone in an orange jacket frantically ran up to me and said, we need a talk. Do you have a talk?
And I was like, what? I've got something for tomorrow, but I don't want to give it twice. So, in the best blast-from-the-past tradition, here's a talk that I gave at the Linux Foundation Open Source Leadership event in Napa last year.
So first of all, I should probably introduce myself. My name is Jeremy Allison, and I work for Google's open source programs office. One thing I really need to be very, very clear about for this talk, just in case there's anyone from the press here who's going to give me any trouble, this is not a Google-badged talk.
And you'll know why later on when you see it. No Google lawyers have reviewed this text. Google lawyers would run screaming from the room. There are some Google lawyers here at FOSDEM. I'm hoping none of them are in the room. Google lawyers would run screaming from the room if they knew I was giving the talk. So please do not ascribe anything in here other than
the deranged lunatic ravings of a madman on stage. This has nothing to do with Google. So Google pay me to work on free software full time, which is very nice of them. I'm very happy about that. And my main project that I helped create back in 1992 is
Samba, which is the open source, free software, Windows interoperability software suite. And we've been going for a very long time. We actually predate the Linux kernel, I think, or started around the same time. And we've had some interesting security issues, some of which
I will talk about in this presentation. So in the best tradition of tell them what you're going to tell them, tell them, and then tell
them what you told them, here's what the presentation is about. It's actually three case studies of three specific flaws in Samba, how we manage them. They're all kind of interesting because they're all actually different kinds of problems.
Security is a very vast and complicated subject. But these are three rather interesting case studies of things that went wrong, and I'll go into why they went wrong, how we dealt with them, what you do about it, and hopefully how you can avoid such things happening in
your projects, and how you can do better than we did, which to be honest, wouldn't be that hard, really. Oh, one more thing. If anyone would like to ask any questions within the talk, please feel free, put your hands up. Somebody can run to you with a microphone. And I'm really happy to take questions in the middle of the talk.
I don't want to be standing up here on stage droning on for 40 minutes. So feel free to stick your hands up. I'll repeat the question for the live stream, et cetera, et cetera. So how was security handled back in the good old days?
So this was great. I actually remember this happening. This was wonderful. Somebody on the Samba mailing list said, I can exploit Samba, and reported the bug to Tridge, Andrew Tridgell, who is my partner in crime, living in Australia. And because the mailing list processing for Samba was running on his personal machine, he just stopped the mail daemon when he got the message. No one else was working on it. He fixed it. It took him about 30 minutes to find and fix the flaw, and he made damn sure that the very next email that came out on the list contained the patch. So that's the kind of service we used to provide back
in the good old days. The first thing that anyone knew about the bug was when the patch appeared, and we were very smug about how well we handled security and how safe we were. How little we knew. And then he restarted the mailing list processing. Things are a little more complicated these days, as you
can imagine. Excuse me, with free software used in rather a few more places. So how should you handle these kind of flaws? So the first thing you really have to put in place is process, process, process, process.
You have to have a process. It doesn't have to be a good process, but it has to be something that everybody knows that you can follow. If you don't have this, you essentially have no security management. Now, you can start by just having an alias on your
project. We have security at samba.org. Everything that comes into that will end up in my inbox and the inboxes of various other people on the project. And we guarantee a response within 24 hours. We're a distributed team around the world. We can normally get back to someone within 24 hours. We can't necessarily fix it, but we can acknowledge the
problem and say, yes, we're going to look at it. You really, really do need to have that, because otherwise people are sending messages into the void. I'm reminded of a story, I'm not sure how true it was, that the original security alias at Microsoft went to the people who handled physical security on the Microsoft campus. So they were getting all these bug reports about critical flaws in Windows, and they were like, well, it's not my door, and deleting the messages. So you have to have coverage. It's hard if you're volunteers, like the Samba community is, but you have to have at least some semblance of a professional response.
You must have this. The other thing: how many people here have heard of CVE numbers? OK, most people. So CVE stands for Common Vulnerabilities and Exposures. They're basically numbers assigned by an organization, I think it's MITRE in the US, or a CERT, a Computer Emergency Response Team, and essentially that allows people who are using your software to track the bug throughout its stages, which versions it's fixed in, et cetera. So the thing to do, what we normally do in Samba, is we ask our friends in the Linux distributions who ship our software, we say, hey, can you guys assign us a CVE? We'll get a number. And then everything to do with that bug has that CVE number attached to it. The patches have the number attached to it. The release notes have the CVE number attached to it.
The bug report has the CVE attached to it. So it's a way of searching and tracking for anything related to that specific bug. As I say there, it doesn't have to be a perfect process. No process is. But it has to be consistent. People have to know what to expect. You can't sort of respond drastically and massively to one bug and then leave another one in the queue for three
months and not do anything about it. You must at least have consistency in the way you respond to these things. So the old myth: well, the code's open. People are going to find the security flaws. They're just going to look at the code. And magically, all the bugs will be solved, right? Well, no, not anymore.
Maybe that was true once. Now people want money for doing this stuff, strangely enough. So here's where your automated tools come in. How many people work on open source projects that run under Valgrind and make sure they're Valgrind clean? OK. So that's not so many.
This is a very, very important tool to use. Fuzzers, static analysis. We're very lucky in that Coverity, which is a large American company that basically sell a web service, they decided to use Samba as one of their test cases for how they could advertise their product in open source code. So we came to kind of an uneasy relationship where they
would find security bugs. And I would give them a quote saying how wonderful their product was, which kind of worked for us. But you won't always be able to get that. But if you can, run these. There are some nice open source static analyzers that are coming out now. But try and get as much code coverage as you can. And treat the reports seriously.
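To make the fuzzing point a little more concrete, here is a minimal sketch of what a libFuzzer-style harness can look like. This is a generic illustration, not Samba's actual fuzzing setup, and parse_packet() is a hypothetical stand-in for whatever entry point in your project consumes untrusted input.

```c
/* Minimal libFuzzer-style harness (illustrative sketch).
 * Build roughly with: clang -g -fsanitize=fuzzer,address harness.c yourlib.a
 */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical parser under test: takes an untrusted buffer and its length. */
extern int parse_packet(const uint8_t *data, size_t len);

int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
        /* Feed raw, attacker-controlled bytes straight into the parser;
         * the sanitizers catch the memory errors your unit tests miss. */
        (void)parse_packet(data, size);
        return 0;
}
```

The same kind of harness can usually be reused for coverage runs under Valgrind or other fuzzers, which is one reason it pays to keep parsing entry points callable in isolation.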
And hopefully, code reviews from people who have security experience will start to catch the worst mistakes that you might make. And trust me, if you don't have any security experience, once you start shipping code in a Linux distro, you'll get some. Because people will find bugs in your code, especially if, like Samba, you end up running as a root daemon. So how many people here have GPG email set up? That's amazing. I know I'm speaking at FOSDEM when more than two people put their hands up, and I'm one of them.
Yeah, so this is great. Most people don't know how to use GPG. Most people can't use GPG. But at least it's the baseline in the security world. So whenever we get a GPG encrypted vulnerability report coming in, we know it's serious at that point. Oh, OK, they could figure out how to use GPG. They probably know what they're doing.
So at that point, you can actually unpack it, take a look at it. Keep as much transparency as possible with the bug reporter, even if they appear crazy. They're a security person, after all; so many of them are crazy.
So don't try and hide anything. Don't say, well, that's not really a problem because if it really is a problem, then say so. Admit your vulnerabilities. Be honest with them. If it really isn't a problem and you're speaking with a crazy person, then say so. Say, hey, this is not a problem. Prove to me it's a real vulnerability.
Send me something that could actually exploit this. There are a bunch of, I don't want to use bad language in the conference, but a bunch of people who I would describe using, let's say, some strong epithets who sell security vulnerabilities for a living, and
those people are lower than low. There are people who will actually sell a vulnerability, and as part of the contract under which they sell it, they will say: you must not inform the open source project whose code it is. There's nothing you can do about those people. They are the same people who will break into your car and
steal your radio. They're just acting on a more sophisticated scale, so you have to ignore those people. So your internal and external time frames can differ. Obviously, internally, we try and be much more aggressive about fixing serious security vulnerabilities.
Externally, we try and promise less, because no security bugs are as simple as a one-line change. Actually, that's not true. In fact, one of the case studies I'm going to tell you about is a one-line change, but it took months to get the code out there. So try and stick to a schedule, and the main thing
that you try and provide to the users of your code and distributors of your code is reliability and predictability. So you want to get the reputation that those guys, or women, or whoever's in your group, take security seriously, they are reliable. You can depend on them to get the fix done, and they are predictable. If they say they're going to do a fix in two months, you can guarantee that the fix will be there in two months, so people can stick to schedules. And if you can do that, then obviously having bugs in your code is terrible, but you can usually survive. Your reputation will survive if you can do those things.
If you get somebody saying, I have a vulnerability in your code, ha ha, I think it's somewhere around here, don't let them play games with you. Insist on reproducibility. If they have an exploit, get the exploit. You don't have to publish it.
You can stick it in the bug report. You can set the exploit hidden, but you really need to know and understand exactly what it is. It's very tempting, especially in the beginning, to race for the easy fix. Say, oh, I can fix that with a quick refactoring of this one-line change here. What? We're done. Yay.
Let's ship it. We used to have that attitude a long time ago. These days, what we do is, yes, we'll still do the, I'll refactor here, one-line fix. We're done. Yaha. Then we go looking in all the rest of the code for all the places where we use the same pattern, where we might have the same issues. And we try and do a full regression analysis.
Git makes this easy, obviously, because you've got Git blame, of how the code went in, where it went wrong, why it went wrong, and maybe how you can, in the future, avoid making the same mistakes. Look for it everywhere. The other thing that we used to do that we don't do
anymore is to only fix security bugs in a security release. If you're putting out a version of your code, and you say this is x.y.z, a security release, don't fix any other bugs. It doesn't matter how tempting it is. It doesn't matter if it's, oh, but we're having a crash here.
We need to fix that as well. Don't do it. Only fix, even if you have an unusable release, just have the unusable release with the security fix on top. Let's keep it unusable. You only fix security bugs in a security release. That way, people who are backporting, who are looking for bugs, who are associating CVE numbers with particular
changes, they have a much more manageable and isolated job. And that's usually for the Linux vendors who have to support versions, maybe versions of your code that are very, very old that you don't even support anymore. We have this with Samba.
We have the resources on the team to support the latest release and the two previous ones: the previous one fully, and then one further back in security-fixes-only mode. We don't have the resources to go back all the way to the beginning of time, even though there are people out there running very old versions of Samba. What we'll do is we'll try and give them best-effort help, but we can't promise anything. We can't promise that we're going to fix every single thing. And if partners are willing to do that, Red Hat are really good here, SUSE are really good here, you can accept their help, and you can be the central coordination hub for all of the issues.
So case study number one, which is a hideous security disaster. Does anyone remember a bug called Badlock that was horribly named? Yeah, probably not. Good. It was my fault. So Badlock happened while I was up in Redmond working with Microsoft. Things have changed greatly. So we were working together. Somebody was running a fuzzer on the Samba code. They got a crash in the remote procedure call, DCE RPC, code. And so I sat next to a Samba team engineer. I looked at the bug. It was an obvious null pointer dereference. I went, oh, denial of service.
Just check not null, fix it, commit. Got my peer review. Oh, OK, bug fix. We're done. That night at 3 AM, I woke up in the hotel room thinking, wait a minute. That pointer can never be null. That's just impossible. How did that get there? So I went into work the next day and kicked over a rock
and just kept on digging. So eventually, I was calling a guy in Germany discussing things on the phone. A day later, he emailed me back and he said, oh, it's worse than that. And he produced a Python script that was a thing of
beauty which we had still never published. What it allowed you to do was to pretend you were a domain controller, man in the middle any connection to the domain controller, create a dummy machine account, suck down the entire password database, and
because he's German and very polite and efficient, delete the dummy account it had created, and then continue. It did that in about two minutes. You would never even know that code had been on your network and it had sucked down your entire password database. It was actually used later on, I believe against an Italian security company, whose entire email database was compromised, and the company was basically annihilated. That was essentially Badlock in action. Now, this was just after Heartbleed. So I made the massively stupid mistake of chatting with the marketing people of the company I was working with
saying, wow, all the cool bugs have names these days. And the people who were listening to me didn't speak English that well and didn't understand the bug. Very, very few people understood the bug. And so we ended up with a website and a logo and you name it.
My theory, of course, is that this didn't gain traction because we didn't have the right theme song. All good bugs should have their own theme song with YouTube video associated. Anyway, that was a disaster. And it was worse because it was a protocol level vulnerability. I don't know if people understand what that means.
Essentially what that means is it's baked into the protocol itself. It's non-fixable inside the protocol. You have to do a new revision of the protocol. Remember, this is the protocol that is used for all Windows to Windows RPC communication. So we had to start telling people about it and I had to
tell everybody. I had to tell NetApp; Microsoft, obviously, were there, they knew; Apple, EMC, Isilon, every storage vendor who did any SMB, which is everybody. And so the fact that there was a bug started to leak.
We started getting some very strange activity on our bugzilla of people trying to break in and find out the details on this bug. And at one point I started getting really weird phone calls of people asking me to discuss this bug. And so I got so paranoid that eventually I just stopped
answering the phone unless I recognized the number that was coming in as someone I knew in the storage industry. And so I had to pull in a load of personal contacts to actually get people to realize and know they were going to have to do a revision of their storage software for this. It took about seven months to get this thing fixed and
released and coordinated across the entire industry. And 90 days, and I know certain vendors who shall remain nameless are big on, oh, we told you about it. You have 90 days to fix it or you're all dead. That would have killed us. I mean, that would have just wiped everybody out.
You would have had password databases stolen left, right, and center. It would have been a disaster. So the press just completely failed miserably. They didn't understand it. A lot of security people didn't understand it.
They didn't see why it was a big issue. The worst case scenario, it was a thankless fix, misunderstood by users. Why do I need this? And anyone who didn't actually work with the protocol didn't really understand why it was a problem and why it
should get fixed. So it was hard for them to get management support. And like I said, that's why I ended up using a lot of personal contacts. And it was a disaster because of the catchy logo and name, which was completely inappropriate. It was nothing to do with locking. It was bad, but it wasn't to do with locking. Lots of erroneous and completely incorrect press reports were created about it at the time. If you actually want to look, there was decent coverage in Ars Technica. That was, I think, the only publication that did a fairly decent write-up of the whole thing. OK. So case study number two, that was an interesting one.
Protocol level bug, not really a bug in our code. So number two was named SambaCry. And the best comment on this was from a wonderful tweet. It was after the EternalBlue exploit from the NSA had been released against Microsoft.
That was known as the worst disaster week for Microsoft. SMB was horribly insecure. And then somebody discovered SambaCry. And this guy said, Microsoft SMB, wow, what a week. And the Samba team, ha, hold my beer. We had something that was probably worse.
So this was a beautiful bug. I was so paranoid about this bug that I actually went back and found out who put the code in. It's funny. Interestingly enough, I was chatting to a Microsoft engineer about EternalBlue, the SMB bug that caused them all that trouble. And he confessed to me that he had privately tracked down the committer, because the bug was so beautiful and easy to exploit and so devastating that he was wondering whether it had been put in there deliberately. But actually, after tracking it down and looking at the code, he decided that it was actually just a confluence
of horrible mistakes. And SambaCry was the same. It was such an easy bug to exploit. And it was so beautiful that I actually was paranoid that I went back and checked how it had happened just in case someone had done it to us deliberately. And then it turned out that I had reviewed the code change.
So what had happened? What was it? Samba exports, like Windows does, what's called a named pipe subsystem. This is actually an inter-process communication mechanism. You can say, hey, I'm going to open a name called slash slash pipe slash pipe name.
What that will do is, at least on Windows, is running services can register that pipe name when you connect over SMB to that pipe name. It essentially puts you in touch with the running service. So somebody came up with a great idea. We had a restricted list of names that we would actually support in Samba.
Somebody came up with a great idea of, well, we don't know all the services that people might want to run. Why don't we make this dynamic? So if we have a name coming in that we don't recognize, why don't we look for a shared library on disk that has that name? If it has it, load it in, run it, and connect them. Sounds great, right? And I'm like, yeah, that's a great idea.
And then we also had, hey, we have a secure module loading subsystem that was written for loading modules into Samba. Why don't we reuse that code? Hey, software reuse is good, right? You don't want to rewrite the entire thing. We have a nice secure mechanism for loading modules.
Why don't we connect the two together? So that was what the patch did. It was great, except for the fact that the patch that put the two together forgot to check for slashes for directory separators in the path name.
If you think about what that means, what it means is if you come out with a name slash pipe slash and then an arbitrary path name to anywhere on disk slash user slash share slash Samba slash data slash whatever, it'll treat that as a pipe name and load any shared library
that's found at that path name. Now if you choose your path name right, you can point it at a writable area of the disk that the client can access. So what that means is you can write a Linux shared library, you can use SMB copy to put it in the share, and then you can connect via the pipe mechanism and have
Samba load and run it as root. Hey, that's convenient. It sat in there for seven years before somebody found it and reported it externally. We don't know if it had been exploited. So what was the flaw here?
The only thing that would have caught this was a security review and paranoia about loading module code, which we are now much, much more paranoid about. It was a logic error. This was not a buffer overrun. This was not an off by one error.
This was not any coding mistake. This was blowing your head off by design, right? This was carefully taking aim at your foot. Tests, the tests that people might have written around this, they wouldn't have helped. You would have tested, hey, can I load a module?
Yes, you can. Can I connect to this pipe? Yes, you can. Because the concept that caused the problem, which was not checking for an arbitrary path name, wasn't being checked and hadn't been considered. So all it would have shown was that the name pipe code was working as designed. The problem was the design was flawed.
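For context, the kind of check that was missing is tiny. Here is a hedged sketch of the sort of validation that would have stopped a client-supplied pipe name from being treated as a filesystem path; this is an illustration, not the actual Samba fix, and pipe_name_is_safe() is a hypothetical helper.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical check on an incoming named-pipe name: never let a
 * client-supplied name smuggle directory separators into the path that
 * gets looked up on disk. */
static bool pipe_name_is_safe(const char *name)
{
        if (name == NULL || name[0] == '\0') {
                return false;
        }
        /* Reject anything that looks like a path rather than a bare name. */
        if (strchr(name, '/') != NULL || strchr(name, '\\') != NULL) {
                return false;
        }
        /* Reject relative-path tricks as well. */
        if (strcmp(name, ".") == 0 || strcmp(name, "..") == 0) {
                return false;
        }
        return true;
}
```

The point is not that such a check is hard to write; it is that nobody realised, when the two components were glued together, that it was needed at all.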
And the worst effect of it was basically routers and other non-upgradable devices that are essentially fixed and have old, unfixable versions. And when I say, as an industry, we must do better, what I really mean is, as Samba, we suck, and we need to get better at it. So the third one was, has anyone heard of Meltdown and Spectre? Well, before Meltdown and Spectre, Jann Horn, the guy who found the Meltdown and Spectre attacks, was a new Google employee. And he thought, oh, I don't know. I've heard of this Samba thing. Let me have a go at that. So it's really interesting. We got the email coming in through Google.
He didn't even know I was a Google employee. So it's not like Google goes kindly on its own projects or projects it contributes to. Oh no, we got the same treatment as everybody else.
So it was what I consider a borderline exploit, in that this one was really hard, because what it was was a race condition in path name processing. And he could cause Samba to process a path name.
And while it was processing the path name, he could change one of the components of it to a symlink that would point outside the share. And therefore, Samba would escape outside the share. Now, he could never reproduce this without running the server under strace to slow it down. It's the only way he could reproduce it. So it wasn't a practical real-life exploit. You'd have to load the server down so heavily that you could never guarantee that you could reproduce this. But it only has to happen once. So I couldn't really argue that it wasn't a bug. I mean, it genuinely was. And he exposed essentially a generic design flaw in user space server code: processing path names in one chunk. I don't know whether Ganesha has this problem. I don't know whether Ceph or any of the other user space servers have this problem. I'm frightened to look, to be honest. I know we had the problem.
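To show the general shape of this class of bug, here is a generic check-then-use sketch; this is illustrative only, not Samba's code, and open_in_share_racy() is a hypothetical function.

```c
#include <fcntl.h>
#include <sys/stat.h>

/* Generic sketch of the race: check-then-open on a full path.  Between
 * the lstat() and the open(), an attacker with write access to the share
 * can swap a directory component for a symlink pointing outside it. */
int open_in_share_racy(const char *full_path)
{
        struct stat st;

        if (lstat(full_path, &st) != 0 || S_ISLNK(st.st_mode)) {
                return -1;      /* looks safe at this instant... */
        }
        /* ...but the path can be changed right here, before the open(). */
        return open(full_path, O_RDONLY);
}
```

Any server that resolves a whole client-supplied path in one go has a window like this somewhere.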
So what it required was a redesign of all the path name processing. Now, the natural way to do this is to take it one component at a time and check that it's not a symlink. Turns out, that's patented. Thank you, software patents.
So we decided on a different way to fix it. It actually came out from some code that Tridge had written many, many years ago, which is essentially to change directory right into the last component of the path name, to pin the directory, and then only operate on the last component.
And that's the way we work. So rather than having to walk the path name a component at a time, you change directory right down to the last component, you make sure you're under the share, and because you've pinned the directory in place, you can't be attacked by a symlink race. That immediate fix took me about a week.
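A rough sketch of that directory-pinning idea is below. It uses openat() relative to a pinned directory descriptor rather than a literal chdir(), and it is a simplified, hypothetical illustration, not Samba's actual implementation.

```c
#include <fcntl.h>
#include <unistd.h>

/* Simplified sketch: pin the parent directory of the last path component,
 * verify it, then operate on the last component relative to the pinned
 * descriptor so a symlink swap can no longer redirect us. */
int open_pinned(int share_dirfd, const char *parent_rel, const char *last)
{
        int parent_fd, fd;

        /* Pin the parent directory; O_NOFOLLOW refuses a symlink here. */
        parent_fd = openat(share_dirfd, parent_rel,
                           O_RDONLY | O_DIRECTORY | O_NOFOLLOW);
        if (parent_fd == -1) {
                return -1;
        }
        /* (Real code would also verify that parent_fd still lies under
         * the share, and would handle intermediate components safely.) */

        /* Operate only on the last component, relative to the pinned dir. */
        fd = openat(parent_fd, last, O_RDONLY | O_NOFOLLOW);
        close(parent_fd);
        return fd;
}
```

Pinning a directory and touching only the final component is what closes the race window without having to re-check the whole path on every operation.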
Then, while I was doing the testing, I realized that snapshot processing, volume shadow copy services if you know the Microsoft term, was completely broken by this fix. No snapshots worked anymore. And then I realized that the module that was broken was written for and donated by the lovely bunch of people who had filed the software patent that caused us not to
do it that way. So I was tempted to leave it broken, but couldn't. So ultimately, we had to ask for a 14-day extension. And it took the full 90 days plus 14-day extension to get the thing fixed, back ported.
I think Red Hat put a sterling amount of work in here to actually get this fixed and pushed back to the versions that they supported. It was miserable and hard. And essentially, they did the work for everybody, which I really appreciate. And so security work under time pressure is when you screw things up.
And so that's why I'm ambivalent on deadlines. For some vendors, you have to have deadlines. Otherwise, they will sweep it under the rug, and they will keep sweeping until the rug is a large hump in the middle of the room. But for others, if they say they need time, then they genuinely might need time. So I do have sympathy with people saying, look, 90 days is
all very well. But for some bugs, that's just not possible. And basically, the lessons from this were, essentially, you have to design better. You have to think about any combination of design
decisions for how robust this is going to be. Now, essentially, this is impossible. I know this. The design we used in Samba had been happily used by most file servers going back to the 1980s, via the original NFS. Nobody ever really thought about it. Interestingly enough, this is one of the reasons that Windows now uses handles for everything on the wire, which actually gets around this problem. And so you just have to code as failsafe as you possibly can. So these kinds of design flaws are the hardest to fix, really. You have to go back and revisit all your assumptions.
And I originally wanted to push back and say, hey, come on, how serious is this? You've got to run it under strace or whatever. And I realized I was just weaseling and backpedaling, and I wasn't going to get away with it. So even if you think you're right and the security researcher is crazy, if they go public, it's still going to make you look like an idiot who does sloppy
work. So work with the people reporting your vulnerabilities and get them to understand what your mitigation strategy is going to be. And I got him essentially to sign off on how we were fixing it before we fixed it and then helped us test it. And don't be embarrassed to beg and grovel.
If it gets you more time, hey, it's only your self-respect. If you're working in free software, you almost certainly already lost that. So what have you got to lose? So doing security work is like clearing toilets.
Nobody notices until it's broken and it's overflowing onto the toilet floor. Nobody rates security. Nobody buys stuff. Oh, I'm buying this product rather than that one, because it's more secure. Nobody cares. They only rate security when they don't have it, when
there's a massive breach or things have gone wrong. And even then, they don't care that much. The other thing is the press. The press does not, I mean, the press doesn't really understand technology. And they don't understand the security part of technology at all. At all. I mean, some of these flaws are complex to explain, even
for people who are writing the code. So to try and expect a reasonable coverage in the press is just, ah, good luck. So my favorite report started out, a flaw in Microsoft's implementation of the Samba protocol. That was just so delightful.
And that shows you the level of expertise that you are dealing with when you're talking to the press. So anyway. Yes, if you're working on open source software, you already know this: volunteer developers will get blamed and called malevolent idiots and fools. Yes, but there's nothing new there. Personal contacts. If you know the people you're working with, you can get more leeway. And it's all about what PGP tried to do, the web of trust. You can actually get a web of trust that you can work with; you can do a better job that way. And nobody, as I said, it's like fixing the sewers.
Nobody notices until you fail. So if you are writing open source software, essentially prepare for failure. You will have big security bugs. It will be your fault. And there will have been things that you could have done about it but didn't and you
missed. That's just going to happen. This is a very hard topic. Until AI in the cloud is writing all our software for us and we're sipping back on the beach, this is going to continue to be a miserable problem. But if you prepare and have a process in place, at least
when it happens, you have a plan. So the other thing is accept all reports and respond to everything, even if you think the people reporting are completely insane. Keep them talking to you. Try and figure it out. Figure out what it is that they're upset about that's
not working for them. Now eventually, you may decide that they really are insane and you don't need to deal with it. But there's usually something there, even if it's covered up by some strange things. Untested code is broken code. That's a mantra that we started to live by very much inside Samba.
And the interesting thing about this, if you take away one thing, is that there is no magic bullet. So for all of the flaws that I described in this talk, how many people think that rewriting in Go or Rust would have saved you? Zero. None.
Rust or Go could have and would have implemented every single one of these bugs. None of these were C buffer overruns. None of these were C integer wrap problems. We have those too. Don't get me wrong. It doesn't make it easier because you're writing in C. But these were logic flaws.
There were logic flaws in design. There were logic flaws in protocol design. There were logic flaws in software design. There were logic flaws in OS path name processing design. POSIX is actually very bad at handling path names because of symlinks. If you come to my talk tomorrow on SMB3 Unix
extensions, I'm going to explain why I am going to excise the ghost of symlinks from our problems. But that's a talk for tomorrow. So yeah, logic errors can happen. There is no magic language. Or you could have written it in Python. You would have still been bitten by exactly the same errors.
Nothing would have saved you here. Yeah, that was the Open Source Leadership Summit. So I'm going to, can I go back? Yes, there we go. All right, we'll finish on that one. I'm jra@samba.org. So with that, I'm kind of done. Does anyone have any comments or questions? Nobody said anything during the talk?
Oh, great, yeah. Does it work? Well, all the way at the beginning, you had an email address at security at samba.org or whatever. But Samba is a large project. So multiple people received the mail.
Now, you just assume that everyone in the Samba project is trustworthy as well. Well, there's always a chance that someone is a black hat, and he will immediately write an exploit and publish it, and you have six billion people in the world. So let me rephrase your question. The question is, how can you trust the people on your security alias? There may be black hats. Well, if they're black hats, why are they on your security alias? Remember, this is security at samba.org. This is not a generic mailing list. This is an alias that goes to a specific, trusted list of people. So you don't put just anyone on security at samba.org. In fact, the only people on security at samba.org are people who already have direct commit access to the project. There are no outsiders on that mailing list. This is the core of your core team.
These are the people who you know you can call at three a.m. in the morning and say, we have a security problem, you have to do something about it, who you know are gonna get out of bed and start looking at code. You don't put anyone else on that list. That has to be your trusted response core, right? So if you misunderstood me about the mailing list aspect,
that's not really a mailing list. It is an alias of about five or six people. You have to, essentially, yes, you have to know these people, you have to trust these people. These are the people that you have to trust to write the security code. So you have to have some level of trust
because they're writing the code that fixes the bugs anyway. So yes, it is a problem. Most projects don't have so many contributors that they can't trust them. So anyway, yes, other question, yeah. Thank you, Jeremy. I was wondering if you have any guidance for us around the commit message that you would use
to fix these security bugs. I know some of the projects like to obfuscate their message a little bit just to not reveal the bug. How do you feel about those practices? So the question is, do you hide the fact that you're fixing a security bug in a commit message?
So for the actual commit that fixes the security bug, the one that says this is it, this is the fix, we hide nothing. We say this is a security bug fix, here's the CVE number, here's the bug ID. At that point, once that goes public into the external repositories, then it's public.
That bug at least is opened, at least to the Samba vendors who are shipping it, maybe not to the public, but we hide nothing because black hats aren't stupid. That's one of the reasons they're black hats, right? They're smart enough to make a living at this stuff. So you're not going to be able to hide that stuff.
You really aren't. If you think you're cleverer than they are, you're fooling yourselves. The people who are creating the exploits are the people you wish you could get to work on your product, because they're really smart usually. That's how they found the bugs. Now, having said that, going back to the
Jann Horn symlink race bug, the fixes for the snapshot volume shadow copy ones, there were a series of about 25 patches. I essentially had to rewrite the path name processing there. I trickled those in with completely innocuous messages
as a cleanup fix a month before the security fix went in, because none of those fixes fixed the security bug. They were just prerequisite cleanups to the code that meant that the code would keep working once the security bug fix went in.
So that is as much obfuscation as I think is reasonable to do. So if you have a bunch of prerequisite fixes that you need for a security fix to work, but they're not related and they're not revealing anything, just put them in as ordinary fixes. Hey, I'm refactoring this thing, I'm making it work, making it more robust path name.
You know, all of the messages that went in there basically said, fixing the snapshot code to be robust against changing directories. Now, we weren't changing directories, so why would you need to do that? Theoretically you could have put two and two together at that point. But it was a good cleanup, the code was nice,
it was a genuinely good cleanup that would have stood on its own without the security bug fix. So, yes, got another question. Hello? Oh, sorry. You were there first, sorry. A question on the general handling of project structure: you recommend making a mail address, security at your project. How do you filter out spam? So, the question is, how do you filter out spam from your security alias? Well, you use good spam filters, and hopefully, you know, the people on that list
are gonna have to get used to getting crap, sorry. It's the price you pay for actually being able to have decent security. Yeah, there's, I mean, that's like, how do you fix spam on the internet? Good luck with that. Any more? Yes, over here.
Oh, sorry, yeah. So, as someone who has a handful of CVEs to your name and so on, how do you recommend organizing as an outsider who just stumbles on a bug and suddenly you have a remote problem with many vendors and need to coordinate them?
Sorry, can you? As someone from the outside who's reporting a problem that is cross-domain, as in cross-vendor, cross-implementation, usually specification bugs or things like that, how should we organize it because? So, the question is, how do you, if you find a protocol-level bug,
how would you coordinate that? So, I mean, I'm speaking from the project point of view. So, hopefully, you would be reporting this to the open source project that is implementing that protocol or that thing that you're reporting the bug on.
Now, at that point, you've essentially handed over the ownership of that. I mean, you can stay involved, but hopefully if that project has the contacts in the industry, they will then start, if they're handling it responsibly, they will have to start coordinating with the proprietary vendors. You may have no relationship whatsoever.
I mean, you have a relationship with the open source project because that's where you reported the bug to, but you might not necessarily have relationships with proprietary vendors, but the open source project probably will if they're of any size. So, does that answer your question? Okay, maybe follow up offline.
Yeah, there was, ah, yeah. Thank you very much for the presentation. I want to, could you give a recommendation? I want to read more about the subjects, books, webcast, YouTube channels, and so on and so on. Sorry, you would like recommendations for? Yeah, books, do you have, for example, any Bible of these subjects?
That's a really good question, thank you. The question is, why do I learn more about this stuff? There are some really good books on secure coding. They're mostly to do with flaws that you get in C and C++. One of the things I try and do
is I try and stay up to date with, whenever you get a disaster level bug like Heartbleed or the Bash one or whatever, I always read the postmortems. Postmortems are where I learn most of the, a postmortem is an analysis of what went wrong and how it happened. So, reading about the Heartbleed problem
and how they messed things up there, the Apple one. So, I try and learn from other people's mistakes. So, there are many people saying, I'm a security expert. Well, other than Bruce Schneier, of course, who can factor RSA numbers in his head. You should obviously read Schneier on Security. He's the god on this. But other than him and a few people like him, there are lots of security experts. There are too many opinions around that. So, what I try and do is I try and learn from specific bugs. So, disaster-level bugs that are reported industry-wide,
you can normally eventually drill down to the actual code changes that went in. Take a look at those code changes, look at how they fixed them, how they mitigated them. This is one of the problems with proprietary software. So, when Microsoft fixes something or Google fixes something or whatever, you can't see the changes they made.
Whereas, something like when we screw up or when the Linux kernel screws up, you can go in and you can look at what they did to fix Meltdown and Spectre and why. That's how you learn: by looking at other people's code, looking at the bugs that they introduced and how they mitigated them. So, yes, another question?
I was wondering, if you have downstream projects, like someone has forked your code and you don't really know these people, how would you handle that case? Would you give them a head start or would you just dump the fix? So, I'm sorry, you've got downstream who are using your code and you have no relationship with them? Like if someone forks Samba
to make Samba something else? So, we've been lucky in that we've had no successful forks with Samba. The main project is kept together. So, if you, hopefully you have at least some of the community members in common. If you really don't and they've just taken your code
and walked away and you have no relationship with them, then there's not a lot you can do. They're on their own, really. But hopefully you have some relationship with them. The big problem would be if you didn't trust them to manage the bug responsibly. So, what we have is two levels of disclosure.
So, we have the internal core security, Samba team disclosure where everybody knows everything, everyone knows what they're working on. And then we have an external layer called vendor sec, which is basically a bunch of vendors who we know are shipping Samba, who we mostly kind of trust, not completely.
We will give them a few weeks' heads up; we'll say, hey, we've got a security fix coming down the line. We will open up the bug to them before we open it up to the general public. We'll let them take a look at the bug fixes, let them test them out in their products. And then, yeah, so that's kind of how we handle it.
There's actually some very interesting discussions we've had internally on, we've actually had national security agencies wanting to join that vendor sec list. And we've had some people push back, say, no, I don't trust those people, I don't want them on the list. Personally, I'm actually happy to get the attention
from those guys because they are full of really good people. Yes, there are essentially black hat versions of those national security agencies, but they're usually not the ones who are asking to be on the list. The ones who are asking to be on the list are normally the defensive side of government. And so, it's hard, because some people trust some governments
more than others, it's about the web of trust, really. It's about who do you trust and how much do you trust them. And only you can really decide that for your project. So. You mentioned the time aspect a lot of times in your presentation, but I don't really see how this is a problem.
If a bug is hard to fix, most likely it has been there for a long, long amount of time. So, well, if it's been there for years, who cares that it's there for a couple of months? So the comment is, ah, if it's been there for years, who cares? Well, if it's a critical exploit that nobody noticed before,
like Heartbleed, for instance, then people really, really do care, and they really should care. And if it's gonna take a year to fix it and someone says, you have 90 days and then you're screwed, we're gonna release it to the world, then essentially you've kind of damaged the entire ecosystem. Don't do that. Having said that, this is the attitude, this is the official position of my employer in some cases.
So. Again, in some ways 90 days is too long, in other ways 90 days is way too short. You have to pick a number. Somebody picked 90 days. But time really, really is critical. If you have a bad bug,
eventually when people start working on it, knowledge of it somehow kind of organically leaks. There's something going on in this area. And if it's really critical and you are a root level service, bad things can happen. And you really don't want those bad things to happen to your customers. So time is much more important than you think.
Any more questions or? I think also the time is more or less over, I don't know. Oh, okay. So thank you very much for the spontaneous talk, for jumping in. I wish we could do applause. Hey.