How To Get MUMPS: Thirty Years Later - or hacking the government with FOIAd code
Formal metadata
Number of parts | 85
License | CC Attribution 3.0 Unported: You may use, adapt, copy, distribute, and make the work or its content publicly available, in unchanged or adapted form, for any legal purpose, provided you credit the author/rights holder in the manner they specify.
Identifiers | 10.5446/62265 (DOI)
DEF CON 30, part 31 / 85
Transcript: English (automatically generated)
00:00
So this is Zachary Minneker. He is going to talk about How To Get MUMPS: Thirty Years Later, with some more stuff added on that you can read. Yeah. All right, good to go. Whoo! All right.
00:21
Let me just set my timer here real quick and then we will get going. Okay. What's up? I think I already do. I recognize you from the office. Yeah, all right, yeah, yeah. All right, cool. Yeah, thanks for coming everybody. This is how to get mumps 30 years later.
00:40
My name's Zach Minneker. Yeah, so to just start, I'm gonna just give you a little bit of an agenda of what we're gonna talk about here. First I'm just gonna talk about like the history of the thing we're talking about here. I'm gonna talk about like how to break MUMPS, how to break an EMR that's called VistA. And then I'm gonna talk about like the future
01:01
and the past of these things. Before I start, I just wanna say, just get some definitions going. First, I'm gonna say EMR throughout, by which I mean electronic medical records, which is just like software that is used to maintain patient records, stuff like that. I'm also gonna say VA throughout, by which I mean the Department of Veterans Affairs,
01:21
which handles concerns for veterans in the United States, chiefly for what we care about, their medical care after they're out of the military. And then I'm also gonna talk about FOIA, which is the Freedom of Information Act, which basically you can submit a form to the government that says, I want information about this specific thing.
01:42
And as long as it's not, as long as they don't have a reason to say no, they give it to you, basically. Yeah, cool. So who am I? Like I said, my name's Zach Minneker. I work for a company called Security Innovation. I don't work for the government. I, you know, I break stuff for a living. I also used to work in healthcare, kinda, sorta,
02:02
and now I just like work on breaking healthcare stuff for fun. And I care a lot about like the history of a lot of the software that we use, you know, like I'm a real big fan of the PDP-11, for example. Yeah, also, I'm speaking on behalf of myself and not of my employer. And also, I made these slides
02:20
in an ANSI text editor called Moebius. And they should be available on the media server under the name, like the title of the talk, .info. They're not up there right now, but I will get those up there. Yeah. So anyway, so what's this talk about? In healthcare, there's a language called MUMPS that effectively undergirds a ton of modern,
02:44
like, healthcare infrastructure, right? You've probably seen this XKCD. You've got on the left there these towers that say all of digital infrastructure. And then there's a little block that says, you know, random project maintained by somebody in Nebraska since 2003.
03:01
This talk is sort of about hitting that block with a hammer. It's basically what we're trying to do here. So, like I said, we're gonna talk a little bit about like the history of the things involved here. So, where the hell did Mumps come from? In 1966, there was a group of engineers that were working for a Dr. Octo Barnett
03:22
at Mass General Hospital in Boston, including Neil Pappalardo. They started working on this new language that they called the Massachusetts General Hospital Utility Multiprogramming System, which, of course, shortens to Mumps. There is some movement now to call it the M language.
03:41
I'm gonna refer to it as MUMPS throughout because I think it's a more fun name. So, MUMPS was specifically imagined as a language for healthcare environments. It was originally written, allegedly, for PDP-9s, and then Digital Equipment Corporation grabbed it and they turned it into like a standalone OS
04:02
for PDP-11s that they called MUMPS-11. Some records say that, like, originally it was deployed on PDP-7s. But yeah, it also has a lot of influence from a language called TELCOMP, from BBN, which I say mostly just to say that, like,
04:21
this is a pre-UNIX language. This is a pre-C language. This is, like, this is before the standard concept of how programming works as you and I know it, right? And so originally it wasn't, like, a language. It was an environment. It was everything you needed to make software
04:41
for healthcare, like, uses. So, as just an example of, like, one of the little oddities here, there wasn't a UNIX epoch, right? There was no, you know, number of seconds since 1970, right? So instead, they have this variable that's called $HOROLOG that has two numbers separated by a comma,
05:01
and the first number is the number of days since January 1st, 1841. And the whole reason for that is because their assumption was that the oldest living veteran that they would have to give healthcare to fought in the Civil War, and that was their, like, measure that they went for.
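To make the two-number format concrete, here is a minimal Python sketch that converts a $HOROLOG-style value into a date. It assumes the commonly documented convention that day 0 is December 31, 1840 (so day 1 is January 1, 1841) and that the second number is seconds since midnight. The function name is made up for this example.

```python
from datetime import datetime, timedelta

# Assumption: day 0 of $HOROLOG is 1840-12-31, so day 1 is 1841-01-01.
HOROLOG_EPOCH = datetime(1840, 12, 31)


def horolog_to_datetime(value: str) -> datetime:
    """Convert a 'days,seconds' string into a datetime (hypothetical helper)."""
    days_text, seconds_text = value.split(",")
    return HOROLOG_EPOCH + timedelta(days=int(days_text), seconds=int(seconds_text))


if __name__ == "__main__":
    print(horolog_to_datetime("1,0"))      # first day of the count
    print(horolog_to_datetime("47117,0"))  # the day the UNIX epoch starts
```

Under that assumption, "47117,0" lands on midnight of January 1, 1970, which is a handy sanity check when comparing against UNIX timestamps.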
05:21
So, yeah, also the maximum date it can support is, like, December 31st, 9999. So, like, Y2K compliant, which, you know, good job. Really thinking ahead there. And, yeah, so originally, like, MUMPS wasn't super well standardized. It was this idea that they were playing around with. It showed up in PDPs. It showed up all over the place. But eventually, the VA hired two engineers and said,
05:42
hey, there's this new thing, like, let's look into doing something with this. And they eventually, they started working on this suite of utilities over at the VA that eventually sort of coalesced into this single total EMR that they called VistA, capital V, capital A, real cute name.
06:02
And, like, the people who were working on this was this group that was called the Hardhats. And just as an example of, like, the people we're talking about, they're still around and this is their website as of 2021. Absolutely incredible. Like, these folks are really going for it.
06:23
So, like, VISTA, this EMR that they were working on, VISTA, it sort of grew throughout the years. It got bigger and bigger and bigger. And it became, like, really well loved and well respected. On the one hand, this is, like, you have people that have a need for software and they're making it themselves, right?
06:41
This is, like, for doctors by doctors effectively, right? On the other hand, it's effectively, like, shadow IT as a development strategy. So there's a lot of, you know, there was a lot of work that needed to be done to get stuff to, you know, fit together, basically. MUMPS the language was and is extremely fast.
07:06
It's NoSQL partially because it beat SQL to market. Also partially because, like, it fits every definition of NoSQL. And it's, like, it's perfect for, like, any time you've got data that needs a lot of writes. Banks, sciences, hospitals, right?
07:23
Like, nowadays, VistA is still, like, widely deployed at basically every VA hospital. Doctors really love it. Like, people who have interacted with it love it. Hospitals outside of the VA system, like, use VistA for certain things. Inside of the VA, there's this effort to, like, modernize their EMR,
07:44
which means that they're, like, trying to move, they're trying to get rid of VISTA, but it's still deployed all over the place. And yeah, like, Mumps is, like, still widely used, even outside of healthcare. There's some of the biggest EMRs in the world use it.
08:00
Core banking systems use it. The European Space Agency has deployed it, like, within the last decade, I believe, which, like, people are still finding uses for it. If you wanna join me on this adventure, you can install MUMPS on Ubuntu or Debian by running sudo apt install -y fis-gtm,
08:23
which will install FIS's GT.M. You need that hyphen y, because if you even think about installing MUMPS, you must. So, like, you have to have apt install it for you. And VistA, you can actually just get. At some point, it got FOIA'd, is my understanding.
08:42
This appears to have started somewhere in, like, September of 2004, but, like, it just, somebody at the VA just uploads it to an FTP server every month, and, like, whatever the most modern version of VISTA is, it's just every year, or every month, you can just grab a new version.
09:01
So we're gonna talk about, like, a couple of different, you know, we're gonna talk about, like, Mumps a little bit. I'm not gonna talk a whole lot about how VISTA works, but I just kinda wanted to ask, like, has anyone in here, like, written any Mumps? I know there's at least one person. Okay, so we got two people, three people. All right, that's, yeah, oh, oh, wait, okay.
09:20
So we got, like, you know, less than 10. Makes sense, all right, yeah. Look, MUMPS is a cool language. I'm gonna demonstrate to you that it's a cool language. First off, in MUMPS, three plus six times two is 18. So if we think about, like, our order of operations here, six times two is 12, plus three is 15, right? So that doesn't make a whole lot of sense.
09:41
The reason why this is happening is because all math is strictly evaluated from left to right. Yeah, it gets weirder from here. We're gonna keep going. In general, like, I find it to be a pretty readable language but it's from a time where, like, size for computers was at, like, a really high premium, right?
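The left-to-right arithmetic mentioned a moment ago is easy to mimic. The following is a small Python sketch, not MUMPS itself: it folds a flat arithmetic expression strictly from left to right with no operator precedence, which is how three plus six times two comes out as 18. It only handles non-negative integers and the four basic operators, and the function name is invented for illustration.

```python
import operator
import re

# Only the four basic binary operators, applied with no precedence at all.
OPERATORS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def eval_left_to_right(expression: str):
    """Evaluate 'a op b op c ...' strictly left to right (illustrative helper)."""
    tokens = re.findall(r"\d+|[-+*/]", expression)
    result = int(tokens[0])
    # Pair each operator with the operand that follows it.
    for symbol, operand in zip(tokens[1::2], tokens[2::2]):
        result = OPERATORS[symbol](result, int(operand))
    return result


if __name__ == "__main__":
    print(eval_left_to_right("3+6*2"))  # (3+6)*2
    print(3 + 6 * 2)                    # conventional precedence, for contrast
```

The first print gives 18 and the second gives 15, which is exactly the mismatch described in the talk.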
10:02
So a lot of code isn't commented. It just isn't, because they didn't wanna have to store it. There's some, on some implementations, there's a performance cost to actually having comments in the code. And then also, a lot of the keywords in the language can be shortened down to single characters, which, you know, gets kinda wild. So, like, here's just an example of some mumps.
10:24
I don't believe this runs, but it, like, you know, it looks okay. And if you notice, I just wanna point out at the bottom, the third to last line there, you have period, space, the word else, and then space, space, the word do. And then there's a semicolon for the comment.
10:41
Those two spaces after the else are important because, like, white space is significant in the language. It's, you know, you can do some stuff with it, but it's significant. But like I said, you can shorten a lot of keywords down to single characters, so we can go from that to that, right?
11:00
Extremely, like, honestly, still kinda readable. As long as, like, everything stays just sorta short, per line, like, everything's cool, right? But we can go smaller, right? There is no reason to stop at only, like, if you think about how code is written, usually, it flows from top to bottom, right?
11:22
This is an invention that does not need to exist. What if, instead of writing your code vertically, you wrote it horizontally, right? So, that same code can be turned into that, right? This is basically code golf. Like, this is enterprise code golf.
11:40
And in fact, if you go on, like, some code golf forums, like, people are using this language to do code golf. Like, it's just, yeah, it's a great language for it. And I'm not, like, I am not cherry picking here. Like, this is actual source from Vista. And, like, this is how readable it is, right? If you look at that first line, you have, like, N, space, and then a couple variable names.
12:01
And then the line below it, you have set D sub equals zero. That's just, like, setting a variable to a certain value. Space, for, space, space, set D sub equal to, and then, like, you know, a bunch of stuff. Like, that is how this code gets written, is, like, entire for loops on one line, you know?
12:21
Here's, like, another example. In, for some reason in the Vista source code, there doesn't seem to be, like, a strong coding style that was enforced. There's no linters for this language, right? So, like, in this case, you have if written literally as I F, you know, like, they're literally using if. But then, like, new and set are just single characters,
12:41
right? Yeah, it's, like, this language gets rough to look at. But, like, on modern implementations, now that we've talked about, like, writing the code, right? We need to, like, run the code. Generally, MUMPS is described as both, like, a, both an interpreted and also a compiled language.
13:00
So, on the one hand, you can, you know, write your code and then tell MUMPS, like, hey, run this code. And what it does is compile it, store that as a shared object, or at least GT.M and YottaDB do, I should be clear. It compiles it, stores that as a shared object, and then it loads that shared object into its memory space
13:21
and then jumps into that code that you just wrote, right? And so what that means is, like, you can deploy Vista code as just the source code, which is kind of small, and then compile on site, which is a pretty, you know, useful feature to have. And yeah, it's just, it's, like, this is how it works in the modern era.
13:42
So, like, like I said earlier, like, just to be real clear about what we're talking about, that's the language MUMPS, right? Vista is written entirely in MUMPS. The way that I got it for this research is just by downloading it from that FTP server I mentioned. And a lot of this is based on a certain flavor
14:01
of the FOIA version of MUMPS. There's some modifications that get done to Vista, sort of after the fact, that get, you know, packaged into different distributions for different uses and stuff. The, if you are using a version of Vista that, like, is deployed using GT.M, for example,
14:21
usually you're storing your routines in a folder that's traditionally just called r/. So if you follow me on this and you, like, you know, use Vista that is deployed with GT.M using, like, Docker or something, look for that routines folder, because it's going to have all of your source code in it. All right, so, that's all of our history.
14:40
So then I show up, right? So how did I actually get involved in this? So at Security Innovation, the company I work at, we have, like, research time where we, you know, look at interesting stuff, learn how to break new things, whatever. And I have been using mine to kind of, like, systematically go through a bunch of different, you know, healthcare protocols, look at different EMRs, kind of, you know, do whatever. And, like, I had heard about Vista
15:01
maybe, like, five or six years ago, and I didn't realize, like, I didn't realize it was MUMPS. I didn't realize how foundational MUMPS was to, like, a lot of healthcare stuff. I'll talk more about, like, where it's being used later. And, like, places I had worked at in hospitals had always used Java-based EMRs.
15:20
So, like, I just never, you know, never got a whole lot of exposure to mumps. And on top of that, I desperately want to be cool. And, like, I think hacking weird code is cool, and I think I've demonstrated that Vista is weird, and, like, mumps is weird. So, like, you know, thus we can play. So, yeah.
15:41
So, let me just talk about, like, what a deployment of Vista sort of looks like. It's basically this. So, you have some hardware, some, like, x86, probably, machine that you're running it on, that you have an operating system that's running on, right? On top of that, you're running some sort of mumps implementation. For my use, this is either GT.M or YottaDB.
16:01
There's also a Windows implementation that's called Caché, that's pretty common. And then on top of that is Vista. So, in Vista, when you, like, you know, make a new string or whatever, you're interacting, you're asking the MUMPS implementation to give you, like, memory to use as a string, and it goes to the operating system to get that memory. Yeah, yeah, yeah, yeah, yeah.
16:21
So, the way that Vista gets used in an actual hospital is that you have clients that talk to it using this RPC method that's called XWB. A really common client is CPRS, which I'll talk about in a sec here. But yeah, that's our general map of what this thing looks like, right?
16:41
So, I go out and say, like, I wanna attack this as an attacker, or I wanna attack this as a client. I wanna just be able to show up at, like, a VA hospital, plug into a wall, and go, right? So, like, I wanna start with their client and then, you know, start exploring what I can do here, right? So, I go and grab the most common Vista client,
17:03
which is called CPRS. CPRS is really widely available. I think it's up on GitHub now. It's written in Delphi, so it's more readable than mumps. And so, yeah, so I install CPRS. I run a version of Vista that's deployed without TLS, which isn't hard. That'll come back later.
17:20
And then I start capturing packets, right? And so, I get a lot of RPC traffic that looks like this. And, like, you know, ignoring, like, the normal, like, you know, TCP stuff at the bottom, we've got all this, like, ASCII at the bottom that I just don't really understand. If you look on, like, the third line,
17:41
you can kinda make out that there's 127.0.0.1 that's on there. That makes sense. I'm running the server on localhost, but, like, I have no idea what's going on. And I can't really turn to the source code because at this point, this is, you know, like, a year into this research, and I don't know mumps. So, like, I'm gonna do this in the dumbest way possible
18:00
and just start looking for keywords in the source code. And so, when I'm, like, dragging through the source at some point, I start finding this code. And if you look here, you'll notice there's a line that says type equals XR equals, and then in quotes, square bracket, XWB, close quotes, right?
18:21
That's our code. That is, like, consuming this RPC traffic. So, we've got, like, we've got a way in, right? So, like I said, don't wanna learn MUMPS. And so, I turned to old reliable here, which is boofuzz. So, boofuzz, if you haven't used it, is a Python library for basically making, like,
18:41
network fuzzers where you don't really have to, you can just say, like, here's what the network traffic looks like, go fuzz this thing, here's, like, the address. But to do that, I need to capture a lot of traffic and then turn that traffic into this boofuzz script. And if you use Vista with CPRS for, like, you know, I don't know, 15 minutes or so,
19:00
like, you'll generate hundreds of RPC calls. So, I start writing these, like, notating these by hand into a boofuzz script, and I get through about 20 before I'm like, this is dumb. And I write a script that'll just create the boofuzz script for me, and then I end up with an 18,000 line boofuzz script that gets me nothing.
19:21
I ran that for a couple of months, absolutely nothing. So, I switched tack. I think to myself, like, I don't necessarily, like, the network is slow, like, let's see if we can cut the network out. So, I learn enough MUMPS to write a harness that will take input from standard in instead of from that, like, instead of from a socket, right?
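Once a target reads its input from standard input, an uninstrumented fuzz loop boils down to roughly the following. This is a stand-in sketch in plain Python, not the harness or the fuzzer used in the talk: the target command is a placeholder you would replace with whatever launches your own stdin harness, the mutation is simple random byte replacement, and a crash is assumed to show up as a negative return code, which is how POSIX systems report death by signal through `subprocess`.

```python
import random
import subprocess
import sys


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Overwrite a few random bytes of the seed input."""
    buffer = bytearray(data)
    if not buffer:
        return bytes(buffer)
    for _ in range(rng.randint(1, 4)):
        buffer[rng.randrange(len(buffer))] = rng.randrange(256)
    return bytes(buffer)


def run_once(command, payload: bytes, timeout: float = 5.0) -> int:
    """Feed one payload to the target on stdin and return its exit status."""
    completed = subprocess.run(
        command,
        input=payload,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        timeout=timeout,
    )
    return completed.returncode


def is_crash(returncode: int) -> bool:
    """On POSIX, a negative return code means the process died from a signal."""
    return returncode < 0


def fuzz(command, seed: bytes, iterations: int, rng_seed: int = 0):
    """Return every mutated input that made the target die from a signal."""
    rng = random.Random(rng_seed)
    crashes = []
    for _ in range(iterations):
        payload = mutate(seed, rng)
        try:
            if is_crash(run_once(command, payload)):
                crashes.append(payload)
        except subprocess.TimeoutExpired:
            continue  # hangs are skipped in this sketch
    return crashes


if __name__ == "__main__":
    # Placeholder target: a child Python process that just drains stdin.
    target = [sys.executable, "-c", "import sys; sys.stdin.buffer.read()"]
    print(fuzz(target, b"example seed input", iterations=5))
```

A real fuzzer adds corpus management and, when instrumented, coverage feedback. The point here is only that cutting out the network turns each test case into one process launch.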
19:42
And it will still hit that RPC code. So, after doing that, I can now use AFL++ in dumb mode, not have to worry about, like, instrumentation, just kind of, like, feed input into this thing and see if it dies, right? And that also doesn't seem to be working. So, I think to myself, like, what if I just, like,
20:03
instrument it, and then I can see if code is actually getting hit, right? And so, like, we need to talk a little bit about, like, instrumenting some mumps implementations here. So, there's two mumps implementations that I kind of care about for this research. The first one is GT.M, the other one is YottaDB.
20:21
Both of them are open source, YottaDB is based on GT.M because of some, like, historical reasons. The Vista deployment that I was working on was based on GT.M, so I have, like, already have, like, a stood-up GT.M instance. YottaDB is very easy to get going if you want to get it going. And both YottaDB and GT.M are written by, like,
20:40
C wizards who are, like, way cooler than me. And, like, they do everything they possibly can to, like, make C even faster. Big parts of GT.M and YottaDB are written in assembly, which is fascinating. But all I have to do is make, like, three changes to the code to get, like, AFL to work,
21:01
which is good for me. Yeah, and so since I'm down here anyway and, like, instrumenting this underlying mumps implementation I figured, like, I might as well just fuzz, like, the mumps implementation anyway. You know, like, fuzz how it handles source code input. And so to do that, YottaDB has all of these
21:23
test-driven development, like, source code examples that are all, like, they all explore weird states, right? And so, like, that's a really perfect corpus for this. And in general, like, that's my advice. If you're fuzzing something that you don't understand but they have code tests, just steal their code tests.
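Borrowing a project's own test inputs as a fuzzing corpus can be as simple as the sketch below. It assumes the test sources are files ending in `.m` somewhere under a test directory, copies each distinct file into a flat corpus directory, and names the copies by content hash so duplicates collapse. The directory names, the extension, and the function name are all assumptions for illustration, and the corpus directory should live outside the test directory.

```python
import hashlib
import shutil
from pathlib import Path


def build_corpus(test_dir, corpus_dir, pattern: str = "*.m") -> int:
    """Copy every distinct file matching pattern into corpus_dir; return the count."""
    corpus = Path(corpus_dir)
    corpus.mkdir(parents=True, exist_ok=True)
    seen_digests = set()
    copied = 0
    for path in sorted(Path(test_dir).rglob(pattern)):
        if not path.is_file():
            continue
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        if digest in seen_digests:
            continue  # identical content was already copied
        seen_digests.add(digest)
        shutil.copyfile(path, corpus / f"{digest[:16]}{path.suffix}")
        copied += 1
    return copied


if __name__ == "__main__":
    # Hypothetical paths; point these at a real test tree and an empty directory.
    print(build_corpus("tests", "corpus"))
```

Deduplicating by hash keeps the starting corpus small, which matters because every seed gets mutated many times.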
21:40
You know, just steal their test inputs and just use those. Like, it works a lot. And then, yeah, at this point I've written enough mumps that I can finally, like, read mumps. So now I can actually go through and, like, you know, read the source code and, you know, make some sense of it. So I start looking through the authentication, I start looking at the input handling, I start looking at how it interacts
22:01
with the underlying system, mostly just looking for, like, quick wins and stuff. So we've got, like, three pathways of attack here. First, we're fuzzing the Vista RPC mechanisms using, like, a mumps harness and AFL++. Second, I'm just fuzzing how YottaDB and GT.M handle source code input using, like, YottaDB's tests.
22:23
And then third, I'm just looking through the code by hand, looking for anything weird that I can see, right? So what'd we find? So first off, the RPC fuzzing got us just nothing. There's a really boring technical reason for this that I'm not gonna get into, but yeah, just absolutely nothing.
22:40
Fuzzing YottaDB and GT.M got us 30 CVEs. All of those are memory corruption bugs. It's everything from, like, buffer overflows to use-after-frees to null pointer dereferences to everything you can possibly imagine. And I wanna be really clear about, like, what the attack surface for those looks like. I'm talking about modifying source code
23:01
that gets fed to the interpreter, right? So you have to be in a really specific spot to exploit these. I think it's easier than you would expect to get there, but yeah. So cool, these CVEs are CVE-2021-44481 through CVE-2021-44510. And, like, these bugs are weird.
23:22
Like I said, like, this was written by, like, C greybeards using every possible trick you can imagine. And so there's all of these weird states that, like, ended up getting explored doing all of this. So let's, like, take one of those bugs and, like, talk about it, right? So we're gonna look at 44486. So what I'm gonna do is I'm gonna show you the input,
23:42
I'm gonna show you the crash. We're gonna talk about, like, why this crash is happening. Then I'm gonna show you the crash again from, like, a different angle and show you, like, what actually is causing the memory corruption here. One sec. All right, so here we go.
24:02
So, first I'm just gonna open that input and just kind of show this to you. This is the input that is gonna cause the crash. This is just, like, a non-minified input that the fuzzer found. If you look at this line here, you can see this write command, which is actually what causes the crash to happen.
24:24
So if I bail out of VIM real quick, I'm just gonna run YottaDB in GDB. It is configured to just read that input and, like, try to create a source code or a shared object from that. And we get this segfault, right? If we take a look at the state of the registers,
24:43
what we will see here is that RIP is at this, the instruction pointer is at, like, 555C6950. So if we look at, like, the instructions around that location, there's just sort of a bunch of garbage there. There's that instruction at the bottom that, like, GDB can't really make sense of.
25:02
And I'm gonna talk about this later, but that's somewhere in the heap. Just trust me that that's in the heap. So if we look at, like, the line of source code that caused this crash, it's inside of op_write, and there's this call that uses io_curr_device.out and then arrow dispatch pointer.
25:21
If I print that, you can see that there's, like, some memory addresses in here that don't make a lot of sense to me, but there's also that write function pointer is at 555C6950, right? So what's actually happening here, right? There's a specific order of strings being created
25:42
and, like, attempts to compile the code that is corrupting some data structure in memory that contains a function pointer. So then later in the source code file that's being parsed, there's this call to write where the function pointer gets corrupted and we just jump out into the middle of absolutely nowhere. So, like, in this case,
26:01
we're jumping to somewhere in the middle of the heap, but, like, that's just purely chance in this case. So the thing that's actually being corrupted is this io_curr_device.out. io_curr_device.out is, like, the current input output device. It handles, like, taking input from the user and also, like, printing and emitting source code and stuff like that.
26:21
It has a dispatch table that's called disp_ptr and that dispatch table is just a bunch of function pointers that point to different functions that you can, like, rewrite on the fly if you need to change what the MUMPS implementation is doing. And then we are trying to perform the write function
26:42
that's in that dispatch pointer using some input, right? So once the corruption happens, we end up with this, where that io_curr_device.out just gets corrupted. So it's completely kind of destroyed. That dispatch pointer, excuse me, points to just somewhere randomly,
27:01
which means that that write function call is completely random. You know, it's just some other, you know, it's just some area of memory, basically. So, but, like, why, like, why does that happen, right? And, like, what does this corruption actually look like? Basically, what we end up having is, like,
27:21
these two objects in memory that are at the same, like, memory locations. We're overlapping two chunks, basically. I'll explain this more later. But, like, in other contexts, like, if you're just doing, like, normal heap exploitation things, you can kind of get into a similar state using, like, a use-after-free or a double-free.
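As a rough illustration of the dispatch-table idea described above, here is a minimal Python sketch. It is a toy model, not GT.M or YottaDB code: the `Device` and `DispatchTable` classes, the `CODE` map, and the addresses 0x1000 and 0x2000 are all made up for the example. Only the names `op_write` and `dsp_ptr` and the value 555C6950 come from the talk.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class DispatchTable:
    """Hypothetical stand-in for the table of function pointers."""
    write: int  # "address" of the routine that performs a write


@dataclass
class Device:
    """Hypothetical stand-in for the current output device."""
    dsp_ptr: DispatchTable


# Toy address space: maps made-up addresses to callable "code".
CODE: Dict[int, Callable[[str], str]] = {
    0x1000: lambda s: f"terminal:{s}",
    0x2000: lambda s: f"file:{s}",
}


def op_write(dev: Device, text: str) -> str:
    """Call the write routine through the device's dispatch table."""
    target = dev.dsp_ptr.write
    if target not in CODE:
        # A real process would simply jump here and most likely crash.
        raise RuntimeError(f"jump to unmapped address {target:#x}")
    return CODE[target](text)


dev = Device(DispatchTable(write=0x1000))
print(op_write(dev, "hello"))       # goes through the terminal writer

dev.dsp_ptr.write = 0x2000          # legitimate on-the-fly rewrite
print(op_write(dev, "hello"))       # now goes through the file writer

dev.dsp_ptr.write = 0x555C6950      # clobbered value, as seen in the crash
try:
    op_write(dev, "hello")
except RuntimeError as exc:
    print(exc)
```

The point of the sketch is that the caller never checks where the pointer leads, so whatever value ends up in the `write` slot decides where execution goes.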
27:40
And, like, let me demonstrate that for you. Like, I'm gonna look at that crash again, but we're gonna take a slightly different look and look at the way specifically that malloc is being called here. So, let me restart the program, and then we're gonna run it again with that same input, just, and see what happens, right?
28:01
And if we take a look here at, this is, we are inside of op write, and now we're breaking at this, this incr link function call. Before a call to malloc, right, dispatch pointer looks fine. Like, this is, the symbols are being,
28:21
like, this is correct, right? And if we look at, like, some strings around that area, where that dispatch pointer is, or where curdevice.out is, there's nothing really reasonable. After a call to malloc, there's this macro that gets called that uses the output that it gets from malloc, and if we check IO curdevice after that,
28:42
now all of a sudden there's a string written there, right? So, we're overwriting some data that's in that curdevice, or IO curdevice, right? And the dispatch pointer now is just completely clobbered, like it is just nonsense, and if you look at the rest of this, like, all of these have been just completely destroyed.
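To make the "string written over the device" step concrete, here is a small Python sketch that models it at the byte level. The three-pointer layout, the field names other than `dsp_ptr`, the offsets, and the pointer values are all assumptions for illustration; the real structure is larger and laid out differently.

```python
import struct

# Hypothetical layout: three 8-byte little-endian pointers.
LAYOUT = "<QQQ"
FIELDS = ("dsp_ptr", "name_ptr", "state_ptr")


def read_fields(memory: bytearray, offset: int) -> dict:
    """Interpret the bytes at `offset` as the toy device structure."""
    values = struct.unpack_from(LAYOUT, memory, offset)
    return dict(zip(FIELDS, values))


memory = bytearray(64)      # toy slice of the heap
DEVICE_OFFSET = 16          # the device lives 16 bytes into the slice

# Well-formed device: every field holds a plausible (made-up) pointer.
struct.pack_into(LAYOUT, memory, DEVICE_OFFSET,
                 0x555555570000, 0x555555571000, 0x555555572000)
before = read_fields(memory, DEVICE_OFFSET)

# Another part of the program believes it owns the bytes starting at
# offset 0 and copies a 40-byte string there.
payload = b"A" * 40
memory[0:len(payload)] = payload
after = read_fields(memory, DEVICE_OFFSET)

print({name: hex(value) for name, value in before.items()})
print({name: hex(value) for name, value in after.items()})
```

After the copy, every field of the toy device decodes to string bytes rather than an address, which is the same kind of nonsense the debugger shows for the dispatch pointer.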
29:01
So, let me rerun that again, and this time we're gonna step into that call to malloc to, like, figure out, like, what the malloc is actually doing, right? So, here's our completely normal call to malloc. We step in, and we are not in malloc. This is GTM malloc. They wrote their own malloc, and replaced the system malloc with it.
29:22
So, if I break at another macro later inside of this, like, custom malloc, there's this call to, like, get queued element that gets some piece of memory that starts around, like, E200, ignore, like, you know, 55555, somewhere on the heap, E200, right?
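One way to picture a second allocator handing out memory that is already in use is the toy Python model below. Everything in it is hypothetical: the class, the sizes, and the stale free-queue entry are invented to show the shape of the problem, and it does not reproduce the actual bug. The addresses use only the low bytes quoted in the talk, with 0xE210 being where the talk goes on to place the device.

```python
class ToyArenaAllocator:
    """Toy allocator that carves chunks out of one big block.

    It trusts its free queue completely and never checks whether a
    queued chunk overlaps an object that is still live.
    """

    def __init__(self, base: int, size: int) -> None:
        self.base = base
        self.end = base + size
        self.next_free = base
        self.free_queue = []            # list of (address, size) pairs

    def alloc(self, size: int) -> int:
        # Reuse a queued chunk first, loosely like a "get queued element" step.
        for i, (addr, chunk_size) in enumerate(self.free_queue):
            if chunk_size >= size:
                del self.free_queue[i]
                return addr
        addr = self.next_free
        if addr + size > self.end:
            raise MemoryError("arena exhausted")
        self.next_free += size
        return addr


def overlaps(a: int, a_size: int, b: int, b_size: int) -> bool:
    """True if the ranges [a, a+a_size) and [b, b+b_size) intersect."""
    return a < b + b_size and b < a + a_size


DEVICE_SIZE = 0x40                      # assumed size of the device object

arena = ToyArenaAllocator(base=0xE000, size=0x1000)
arena.alloc(0x200)                      # unrelated earlier allocations
header = arena.alloc(0x10)              # small chunk at 0xE200
device = arena.alloc(DEVICE_SIZE)       # device lands right after it

# Assumed bug: the small chunk is queued for reuse with a size that is
# far too large, so the queue now "owns" bytes that belong to the device.
arena.free_queue.append((header, 0x80))

chunk = arena.alloc(0x60)               # reuses the stale queue entry
print(hex(chunk), hex(device), device - chunk)
print(overlaps(chunk, 0x60, device, DEVICE_SIZE))
```

In this model the new chunk starts 16 bytes before the device and extends over it, so writing into the chunk rewrites the device without either piece of code knowing.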
29:41
And if I look at where curdevice.out is, it is at E210. So, there's 16 bytes between those two, right? So, before that crashing call, before that call to, yeah, before, like, the crash that happens in op write, IO curdevice is well-formed,
30:00
and it's at a memory address that ends with E210, right? There's this call to malloc that goes to GTM malloc instead of glibc malloc, and, like, eventually returns this memory address that ends in E200, right? The devs have made their own memory allocator inside of the heap that, like, manages,
30:22
there is the heap memory allocator, and then there is their memory allocator managing the same locations in memory, right? There's at least two memory allocators in use on this application, which is just super wild. And by, like, a little bit of some magic, you can get that second memory allocator
30:41
to return overlapping chunks, basically. So, just to do this a little bit visually, on the far left here, we have the way that, like, you know, process memory is laid out. You've got, you know, the text at low addresses, you've got the memory, and then there's, like, the heap, right? The heap is made up of chunks, like, memory that is either allocated,
31:02
in which case it's labeled chunk, or it's freed, like, and the memory allocator can, you know, do whatever it wants to it, right? If we take a look at one of those chunks, we've just got some memory that we can use for whatever. During initialization, GT.M and YottaDB allocate a chunk, a really big chunk,
31:21
and then they just say, like, this is the memory that we are going to use for any mumps program that's written, right? So then, when GTM runs, it initializes this iocur device somewhere in that same memory space, and then that GTM malloc returns
31:43
a similar looking memory space, and they overwrite that iocur device. So you have, in one part of the code, like, the code thinking that we're looking at the input and output device, and in a different part of the code, they think it's a routine header, right? Which lets us get that, like,
32:01
overlapping chunks type confusion thing. So, basically, we have these two, like, mechanisms that are managing memory, malloc and GTM malloc, and then we get, like, a type confusion bug in the way that GTM malloc, specifically, is handling that memory. So, like, this is a heap bug inside of a memory manager that is managing memory
32:23
managed by a different memory manager. So, like, yeah, the address there is not completely random, but it's not really in our control, and I just, I really wanted to talk about this bug in particular, because it's just so fucking weird. It's so weird. So, yeah, so yeah, that's what we got
32:41
looking at, like, the, you know, looking at the MUMPS implementations themselves, right? So what about that source review, where we were just looking at Vista, right? This next slide I have to read really carefully. The source code review just looked at, like, just was looking for quick wins, and only looked at, like, the auth mechanisms, input handling, like, how it interacts
33:00
with the underlying system. So that RPC mechanism I was talking about that the clients use is gated by an encryption mechanism that uses roll-your-own encryption from the 90s, right? So if you deploy Vista without TLS, creds are poorly encrypted and transmitted in a way
33:21
that attackers can trivially decrypt them or simply replay the packets. There also appear to be hard-coded creds in the source, but because of some, like, particulars, I'm not super sure that they can be used. And I would absolutely love to explain to you how this works, but we had some problems disclosing this.
33:44
So let me show you my disclosure timeline real fast. So on January 3rd, we sent an email to the VA following their disclosure policy. We received an automated email that said, somebody will email you back. Nobody emailed us back. So then, like, on the 10th, I sent them another email
34:00
that said, hey, there's some problems. I really wanna talk to you about these. They sent another automated email, no follow-up. We never got another email from the VA after this. And then on the 10th, I sent another email and got nothing. Right, so, like, I assumed that they, you know, either something changed on their backend or they, like, just blocked my email address, so cool.
34:22
So then I reached out to somebody I know works at the DHS. They did not respond. I then reached out to CISA directly. They also did not respond. And then I called CISA on the phone. This is a thing that you can do. Their phone number is on the internet. You can find them and call them. And somehow, I think because of a phone tree thing,
34:42
that call just got disconnected, like, before I could ever speak to somebody and, like, explain what the hell was going on. So then I called CISA again, and I was told that any information that I give them is just not gonna be provided to the VA. Like, they're not gonna, like, give it to the people to try to fix the bug, right? They said they were gonna give it to their threat hunting teams.
35:01
So then I reached out to CMU CERT, and I received an email that was like, hey, give me more details. And then they didn't respond to responses to that email. This says they never responded. Within the last week, they have started responding. But I don't believe they, yeah,
35:21
I'm not super sure what's going on there. I still don't think the VA has been told about this problem. And I wanna be clear, like, this is an EMR that is deployed in, like, it is at VA hospitals right now. It is also at civilian hospitals in the United States. So yeah, also, we, like, disclosed a lot of bugs
35:41
to a bunch of mumps distributions. For YottaDB, we sent them an email that was like, hey, like, we found some bugs. They sent us an email back that was like, cool, do you mind teaching us how to do this? And I said, hell yeah. And then we explained to them, like, hey, here's how you fuzz, here's, like, the changes you need to make, like, you know. And then they started fuzzing, and they've got, they've found tons and tons and tons of bugs.
36:02
And then, you know, we disclosed to them in November. By February, a new version was out that had all the fixes in it. GT.M, we sent them an email in December. By March, they had, like, fixed all of the bugs. So yeah, that's the, that's our disclosure there. Thank you.
36:22
So what does this all mean, right? When I first, like, started looking at this research, like, I've done a lot of fuzzing projects and never found anything. You know, I always kind of figured that, like, oh, you know, all the fun memory corruption bugs are dead, right? Nope. There's still big stuff out there that has real, real kind of obvious memory corruption bugs.
36:43
Mumps isn't really gonna go anywhere. I think at this point, we're sorta, we're stuck with it. When Mumps first got, like, off the ground and people started using it for stuff, it was faster than everything. It was cutting edge, it was innovative, it was everything you want. And, like, a lot of companies jumped on this bandwagon
37:01
and are still there. Core banking uses Mumps. You know, a bunch of healthcare stuff uses Mumps outside of just Vista. Like I said, the ESA uses Mumps. Based on some numbers that I've seen, more than 50% of healthcare records in the United States pass through, like, some application that's written in this language at some point. And yeah, there's still more weird machines
37:22
out there to break. You know, there's still, like, more stuff to find, right? So what should you do? Like, if you're working on Vista or a Vista derived product or something like that, make sure you're deploying TLS everywhere. Deploying TLS is not, like, difficult
37:42
on a lot of Mumps implementations, but it's not trivial. And, like, just make sure, even internally, do not trust that you're just behind, you're inside of, like, your VLANs and everything's fine. Like, make sure you've got TLS everywhere. If you deploy Mumps or a Mumps-based product,
38:01
you need to update. If you're using GT.M from apt, like I said to do earlier, you're four versions behind, and you're two versions behind the patch that has all of the fixes to the bugs that were disclosed. So, you know, update, basically. Probably build from source. And if you're a hacker looking for research, like, I can't think of, like, look at healthcare stuff.
38:21
Like, healthcare stuff is still not getting the eyes on it that it needs. Like, look at healthcare stuff. Also, if you work at the VA, send me an email. Like, I don't, yeah, this shouldn't be this hard, you know? Yeah. I just wanna, like, I've just trashed this language
38:42
and this product for, like, you know, at this point, like, 40 minutes. And I just wanna say that, like, everybody who worked on this is a hero to me. Like, I am not kidding about this. Like, looking at some of this code, you see these names that are, like, these people that are wizards, right? Mumps was this incredible idea that, like, was just,
39:03
like, hey, we have computers now. Like, what can we do for healthcare stuff? And they made this incredible thing that is still in use that you can still play around with and still learn. It's not exactly an esoteric language, but if you're looking for a new Esolang, look at mumps. It's neat.
39:22
Vista, as, like, the EMR is super well respected. It's really flexible. There's a story about, like, the VA has just been in constant scandal forever and ever. And there's a story about how the, like, during some bad times, they basically, like,
39:42
there was this congressional testimony that was like, oh, you know, everything over there is broken, except for that EMR. That EMR is the best EMR I have ever seen. And a bunch of doctors have said this about mumps. And I, like, just as, like, a fun little side thing, it was, so Vista got named Vista in, like, 1994,
40:02
which is, like, almost 30 years ago now. And the reason why they called it Vista is because previously they were calling it DHCP, which then, you know, turned out to be a problem. I think it stands for Distributed Healthcare Program or something like that. And they renamed it to VistA, capital V, capital A.
40:20
And I'm sure at the time they were like, this is the greatest name. No one will ever scoop us on this name. Why the hell would you call something Vista? You know, and then, you know, a decade later, here comes Bill Gates's microsoft.com. So yeah, that's slide 63 of 64. Here's slide 63 and a half.
40:42
This is my greet slide. Thank you for having me. Like, I, this really means a lot to me, this community and, like, everybody who has stood on this stage and all of the research that everyone who has ever gone to Def Con has ever done has been a great, like, inspiration to me. Yeah, thanks to everybody at SI.
41:00
Thanks to, you know, just everybody who helped out on this. Yeah. Thank you so much for your time. That's kind of, that's everything I got. You can yell at me at Twitter. That's it, yeah. You're my hero. Thank you, you're my hero.
41:23
All right. Oh yeah, also my Twitter account is the word binaries backwards. Which, yeah, it's just, doesn't look like it, doesn't look like it, but it is. Cool. Yeah. Are we, do we have time for questions? Yeah. Yeah, cool.
41:41
Yeah. Yeah, yeah, cool, yeah. Yeah, any questions? Anybody have any questions? All right. I have, I have absolutely killed the crowd. Hell yeah. Yeah, like I said, go look at mumps. It's a fun language, it's real weird. There is, there's a lot of multi-billion dollar companies
42:03
that deploy stuff that's written in mumps. So like, there's some fun stuff out there. What's up? There is not, the question is, is there a Wireshark decode for the RPC? There is not. There is a long dead, I think there is a git commit,
42:20
I can't remember what project it's on, for a JS file that I think is called RPC snoop, which you're not, that's gonna be difficult to find, but like look for RPC snoop in relation to Vista and hopefully you'll find it. It's a Node.js file that was floating around. Yeah. Cool.
42:43
Any other questions? All right. Thank you very much. Thank you.