Hidden Attack Surface of OEM IoT devices
Formal Metadata
Number of Parts | 85
License | CC Attribution 3.0 Unported: You may use, modify, copy, distribute and make the work or content publicly available, in unchanged or modified form, for any legal purpose, provided that you credit the author/rights holder in the manner they specify.
Identifiers | 10.5446/62236 (DOI)
DEF CON 30 (4 / 85)
Transcript: English (automatically generated)
00:00
So our next talk is called Exploring the Hidden Attack Surface of OEM IoT Devices: Pwning Thousands of Routers with a Vulnerability in Realtek's SDK for eCos OS. We all know about Realtek, so. But give a big hand to these guys, this is their first time at DEF CON. Give them a big cheer.
00:20
And hopefully you'll learn something and probably patch your router at home. All right, bye. Well, welcome. Do you hear me? Yeah. Welcome to Exploring the Hidden Attack Surface of OEM IoT Devices. Today we'll be sharing with you a vulnerability we found in Realtek's SDK for eCos OS.
00:43
With this vulnerability, we managed to pwn multiple router models from many different vendors. So we'll be starting with how we picked our initial target for this research. And then we'll move on to the initial reconnaissance phase. I will talk a little bit about eCos, which is the operating system that these devices run.
01:01
After that, we'll talk about how we analyzed the firmware and how we found the vulnerability in question. Then we'll discuss exploitation and post-exploitation strategies on these kinds of devices. Then we'll talk about automating firmware analysis to detect the presence of the vulnerability in other router models. And finally, we'll be closing with some takeaways.
01:23
But first, let us introduce ourselves. My name is Octavio Gianatiempo and I'm a security researcher at Faraday. And here with me is Octavio Galland, who was also a security researcher at Faraday at the time of this project, and now he's a research intern at the Max Planck Institute in Germany. And also Emilio Couto and Javier Aguinaga
01:41
are part of this team and contributed to the research that we'll be sharing with you today. Emilio couldn't come, but Javier is over there. Well, Octavio and I were the main researchers on this project and we are computer science students at the University of Buenos Aires in Argentina. Anyone from Argentina in the audience?
02:04
So, I am also a biologist, but that's a long story, maybe for another time. And we are CTF players with our team, Net Injection. CTF players will understand the pun in the name. And we mainly focus on the reverse engineering and pwn categories. And the most important thing is that
02:21
when we started tackling this project, we had no prior hardware hacking experience. So, our motivation for choosing an IoT device was their reputation for being insecure. And we thought it would be a great opportunity to put our reverse engineering skills to the test and hopefully, if we got lucky, our exploitation skills too.
02:40
So, how did we pick our initial target? Well, for us, a router was an obvious choice because if you manage to pwn a router, you get access to a local network. And in this era of working from home, this may also be an opportunity to pivot into an enterprise network. And we decided to choose a popular target to maximize the impact of our findings.
03:00
And also, a relatively cheap one, because we thought that for a vendor that is designing a cheap router, maybe security is not a priority. And keeping these things in mind, we looked for the top-selling router on a local e-commerce site. And we settled on this one. It's called the Nexxt Nebula 300 Plus. The brand is Nexxt and Nebula 300 Plus is the model.
03:22
And it's a pretty standard 300 megabit Wi-Fi router that's based on a Realtek SoC, the RTL8196E, which has a 32-bit MIPS processor that can also handle 16-bit instructions to reduce program size. And it's configured in big-endian mode.
03:42
And as you can see here, at the time of making this slide, this router had almost 40K sales in Argentina alone, in this marketplace. Well, here it says that it's the second top-selling router, but actually the first one was a repeater, so this one is the top-selling router.
04:01
And it's even recommended by the e-commerce site itself, because it has pretty good reviews from customers. But well, the typical customer doesn't have the tools or the skills to reverse engineer the firmware of the router and find a vulnerability. So what do they know about security, right? So we bought this router and downloaded the firmware
04:21
from the vendor's website. And the first thing we noticed was that it had a bootloader and a compressed kernel image. We ran that through Binwalk, and we managed to decompress this kernel image, but we couldn't guess the loading address to start the reverse engineering process.
04:42
So we decided to crack open the device to hook up to the UART interface, but there were no UART pins or pads on the board. But luckily this SoC from Realtek, the RTL8196E, has UART capabilities and has some pins assigned to UART.
05:01
So since we were working from home and on a budget, we designed this little contraption with a cork, because Argentina, and if you don't know, wine is very good in Argentina, and two thin wires, and this device hovered just over the SoC, barely touching the pins. And with that, we managed to get our first UART output
05:22
from this device, and luckily for us, it had a lot of addresses printed on the screen, and among them, there was this start address, which is the address of the first function that gets called within the kernel code. And with that, we guessed the loading address,
05:42
and we could start our reverse engineering process. So the first thing we noticed was that this giant binary, which is the kernel image, is composed of software from many different origins. It has a real-time operating system called eCos, a libc implementation, a web server called GoAhead,
06:02
and a lot of custom code, mainly written by the vendor, but it also has code from other sources. So let's talk a little bit about eCos. This is an open-source, real-time operating system. It's POSIX-compatible, and it's designed to be lightweight and customizable.
06:20
The idea is that the developer can choose which modules and packages of the kernel to include during the build process, and bundle that up with their own code to achieve a tailored solution that can run on an embedded device that has limited hardware specs. So another key characteristic to achieve this lightweightness
06:46
is that eCos has only a single process, but to be able to achieve concurrency, this process can spawn multiple threads. And these threads can access the whole memory space. There is no virtual memory, there are no privileges,
07:01
and every time a thread crashes, an exception handler gets called. And for this device that we were looking at, this exception handler just reboots the device. So once we knew the approximate composition of this image, we started reverse engineering the custom functionalities of this router.
07:22
So since many parts of the software stack were open-source, that was good news. We had, well, the operating system, the libc, everything was open-source. We wanted to build those components ourselves with debug symbols enabled, and then generate function signatures
07:41
and apply them to the binary we already had so that the reversing process would be easier. Unfortunately, the vendor did not provide a source release for this device, so we couldn't download a zip package and run make on it and have it work. So we looked for the compiler being used within the firmware image. There was a string indicating the version, of course,
08:02
and we tried to use that to build the firmware with debug symbols, but without the exact build configuration, such as the compiler options, optimizations enabled, and so on, we couldn't generate matching function signatures. In this slide, there's a footnote
08:20
with how that approach would look, had it been possible. So we had to do without function signatures. However, as I just said, we had the code for the operating system, the web server, which was GoAhead, and the libc; all of this is open source. And there were many parts which were not actually open source,
08:41
but the code was leaked online, so that helped a little bit with the reversing process. So we basically went about the reversing process as usual, but reading the source code as a reference. And now that we had the source code and the code on the device,
09:02
we noticed that there were a few functionalities in the device which were not available in the upstream code. One of these functionalities was a shell that was exposed through UART and Telnet. Since this device runs eCos, it's not a Linux shell, and it doesn't have many things that one would expect from a shell, but it basically allows us to inspect
09:22
the configuration of the device, inspect the threads, the networking options, and so on. And it also provided us with a great starting point for the reversing process, because we could just look up within the image the strings relating to those commands and build our understanding of the device from there.
09:42
One of those commands, or rather a group of these commands, was particularly interesting because they allowed us to read and write memory. And this sounds pretty basic, but this was a very low-level primitive: it was a command into which you could just plug an address and it would try to read from that address
10:02
or write to it without any checks. So we could use that to modify the code that was running on the device, or we could make it crash if we tried to access an invalid address, and this will be very useful throughout the talk. So when we moved on to trying to inspect
10:22
the threads running on the device, we also noticed that this is kind of a design decision by eCos. Basically, every functionality resides on its own thread. As Octavio said before, we only have one process, so we cannot have multiple processes or services. Everything has its own thread.
10:41
And this is really interesting, because as you can see, for instance, the DHCP server is a thread just as privileged as the network support thread, which implements all the network stack. So basically, everything lives together in the same space, and there are no privileges whatsoever. In order to communicate among themselves,
11:02
these threads can exchange messages, the messages being just C strings, and there are a few API calls that any thread can make using an ID and the message that they want to exchange. And on the slide, you can see an example.
11:20
For instance, when the reset button is pressed for long enough, a message gets sent to a thread which restores the factory settings. And one last thing that we tried to do during the initial reversing stage was to debug the firmware, because that would have been really useful when trying to build up knowledge
11:40
about how this thing worked. But, unluckily, there was no JTAG interface on the board. But when we looked at the SoC's documentation, we noticed that there were a few pins that provided JTAG functionality, but they were used for GPIO on this specific device.
12:00
They had the two functions, and when we tried to switch them over to JTAG mode, the device crashed, so we had to do without JTAG, which was somewhat hard. Now, this is what happens when the device crashes. We got, well, a dump indicating which thread caused the crash, the type,
12:21
like the reason for the crash or the exception, and a dump of all the states, the contents of all the registers, a stack trace, and also, we didn't include it here, but there's also the contents of the top of the stack. So, even though we cannot debug the firmware properly using JTAG and attach to it via GDB or anything,
12:43
if you think about it, getting such a dump is kind of the functionality that one would expect from a debugger when the execution hits a breakpoint. When you use a debugger and you hit a breakpoint, you can usually inspect the state of the program and the processor, and well, a real debugger
13:02
will also allow you to modify those values and resume execution. That was not possible in this case, but it was good enough, and more importantly, it was the only thing we had. So, in order to set these breakpoints of sorts, what we did was we overwrote the desired address
13:22
where we wanted the execution to stop with an invalid instruction, and when the execution hits that address, the device will crash, we will get the dump, and then after a reboot, we are back to a clean firmware, so we can use that as a sort of rudimentary debugging mechanism.
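As an illustration of this crash-as-breakpoint trick, here is a minimal Python model. It only simulates memory as a byte array; the placeholder word `0xFFFFFFFF` and the helper names are assumptions made for this sketch, not values or commands taken from the device.

```python
import struct

# Assumed placeholder for a word the CPU would reject as an instruction.
INVALID_WORD = 0xFFFFFFFF

def set_crash_breakpoint(memory: bytearray, offset: int) -> bytes:
    """Overwrite one 32-bit big-endian word and return the original bytes."""
    original = bytes(memory[offset:offset + 4])
    memory[offset:offset + 4] = struct.pack(">I", INVALID_WORD)
    return original

def clear_crash_breakpoint(memory: bytearray, offset: int, original: bytes) -> None:
    """Put the saved word back (on the router, a reboot had the same effect)."""
    memory[offset:offset + 4] = original

# Toy "code" region: four MIPS nop instructions (all-zero words).
code = bytearray(16)
saved = set_crash_breakpoint(code, 8)
patched = bytes(code)          # snapshot while the "breakpoint" is armed
clear_crash_breakpoint(code, 8, saved)
```

On the real device, the write would go through the shell's memory-write command, and the crash dump printed over UART plays the role of the debugger's register view.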
13:44
Well, with that out of the way, we were able to build an initial understanding and move on to trying to find a vulnerability. So, during this reversing effort, we identified a lot of libc functions. Remember that we had to do this manually,
14:01
and as you might know, many of these functions are dangerous or potentially dangerous, so we decided to write a script to search for calls to strcpy, memcpy, and similar functions with the destination argument located on the stack and a source argument that was not hard-coded, and sifting through that list of results,
14:21
we found this piece of code that is very interesting because it uses strchr to search for a space two times in an input line, and then, as you can see, it uses strcpy to copy from there onwards to the stack without checking its size, so this is a classic stack buffer overflow, as you might see in a CTF, but before we can understand
14:43
what this function does in the context of the router, we first must talk about Voice over IP and the SIP and SDP protocols. So, every time a Voice over IP call is made, first a session must be established using the SIP protocol, and alongside this, the Session Description Protocol,
15:01
or SDP, is used to negotiate network metrics and media types that will be used when the actual call takes place over another protocol, such as RTP, and both SIP and SDP protocols are application layer and are text-based, so here you can see an example SIP message, and it has two parts,
15:20
a SIP header that resembles HTTP, and it can have SDP data alongside, and the important thing for us is that it has IP addresses and ports, even though this works on layer seven, and the IP addresses and ports on the SIP header will be used, for example, in this case, by the callee to respond to this message,
15:41
and the ones on the SDP data will be used to establish a media session, and in this case, it is an audio session, as described in the field that starts with m=audio, and the IP in the c field will be used to make that connection. So what happens when a device like this
16:01
is in a local network behind a router that does network address translation? Well, these IP addresses and ports in the SIP message will be local ones, and as this message traverses the router, the router has to change them to the external WAN IP address of the router and an external port to ensure that the callee can respond.
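To make the rewriting step concrete, here is a small Python sketch of what an application layer gateway does to the SDP body. It is a simplified illustration, not the router's code: the function name, the example addresses, and the choice to touch only the `c=` and `m=audio` lines are assumptions. A real implementation would also rewrite addresses in the SIP header and update the Content-Length.

```python
import re

def rewrite_sdp(message: str, external_ip: str, external_port: int) -> str:
    """Replace the local address and audio port in an SDP body."""
    out = []
    for line in message.split("\r\n"):
        if line.startswith("c=IN IP4 "):
            # Connection line: swap the private address for the WAN one.
            line = f"c=IN IP4 {external_ip}"
        elif line.startswith("m=audio "):
            # m=<media> <port> <proto> <fmt ...>: swap only the port.
            line = re.sub(r"^(m=audio )\d+", rf"\g<1>{external_port}", line)
        out.append(line)
    return "\r\n".join(out)

sdp = "v=0\r\nc=IN IP4 192.168.0.10\r\nm=audio 5004 RTP/AVP 0\r\n"
rewritten = rewrite_sdp(sdp, "203.0.113.5", 40000)
```

Note that the text after the port in the `m=` line (protocol and format) is carried over unchanged, which is the part of the field the talk comes back to.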
16:21
When this fails, the call might not ring, or one of the ends might not have audio in this case, for example. So here you can see the same message before and after this functionality rewrites it, and this functionality is called SIP-ALG, or Application Layer Gateway. So now we can go back to the vulnerable code
16:42
and understand it better. The code starts reading lines from the SDP part of this message, and it will use sscanf to try to match the media description field. And from there, it will try to extract the port in an attempt to rebuild this media field
17:01
and replace this with an external port. It will search for the two spaces, and then it will copy the rest of the information, which includes the protocol and the format, to the stack. So this function that is part of the SIP ALG feature of the router and rewrites SDP data in SIP messages
17:21
has a stack buffer overflow. And the router should crash if we send a message that has, for example, a lot of A's after the media port in this media description field. And since the functionality has to rewrite both incoming and outgoing packets, we might crash the router with an incoming packet too. So we sent a UDP packet crafted like this
17:44
with a lot of A's after the port, to a random port on the router, using the router's WAN IP address, the external IP. And when we looked at the UART interface, we saw that the router had crashed with a lot of A's on the stack and with control over the program counter.
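The parsing flaw can be modelled in a few lines of Python. This is only a sketch of the logic described above: the buffer size of 32 bytes is an assumption for illustration (the real size is not stated in the talk), and the model copies the text after the second space, which may differ by one byte from the original code.

```python
STACK_BUF_SIZE = 32  # assumed size, for illustration only

def rest_after_second_space(line: str) -> str:
    """Mimic two strchr(' ') calls and return everything after the second space."""
    first = line.index(" ")
    second = line.index(" ", first + 1)
    return line[second + 1:]

def unchecked_copy_overflows(line: str, buf_size: int = STACK_BUF_SIZE) -> bool:
    """True if an strcpy-style copy (data plus NUL terminator) would overrun the buffer."""
    return len(rest_after_second_space(line)) + 1 > buf_size

normal = "m=audio 5004 RTP/AVP 0"
oversized = "m=audio 5004 " + "A" * 200
```

A bounded copy, or a length check like the one in `unchecked_copy_overflows` performed before copying, is the kind of guard the vulnerable function lacks.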
18:00
So this means that no open ports are required to trigger this vulnerability and that it can be triggered from the WAN. And more importantly, this is a hidden attack surface because there's no place in the documentation of this router that mentions that it has the SIP ALG functionality. And it cannot be disabled via the router's web interface.
18:21
We found that it can only be disabled via the command line that is available through Telnet and UART, but there's no way to persist such a configuration, and every time the router resets, it will become vulnerable again. And also, port scanning wouldn't have revealed the presence of this feature. So once we knew that the router had this hidden attack surface
18:41
and that it was triggerable from the WAN, we decided to try to exploit it. Okay, so the upside of trying to write an exploit for an eCos device was that, at least on this particular device, there was no ASLR nor any kind of prevention from executing writable memory or the other way around.
19:03
And well, that implied that all the addresses were deterministic. For instance, we knew where our shellcode would land: everything we sent in the packet would arrive at a specific address. So we could just go with the usual approach that will be familiar to a lot of people
19:21
of just writing shellcode on the stack and then using the overflow to overwrite the return address to make it point to our shellcode. The two caveats are that the shellcode cannot contain null bytes because it will be copied over using strcpy, and that in this architecture, we have two separate caches, one for data and one for instructions. So we cannot write self-modifying code,
19:41
which was our first approach to try to avoid using null bytes, so we couldn't do that because it leads to cache coherency issues. So what we do is we send an otherwise completely normal packet, except that after the audio port we include some padding, our shellcode, and, as you'd expect, the address
20:02
of where the shellcode will land. So when we send this payload, our shellcode executes. Within the shellcode, we enable telnet and send a message to the firewall service in order to turn it off, and then we continue execution normally. And it's very important that we do this, continuing execution, because if we fail
20:22
to resume execution after the exploit is done and we crash a thread, not only will the thread crash, but the whole device will go down and the exploit will not work. And after that, we connect to telnet using a backdoor password, although that is not strictly necessary because at this stage, we have full control of the device and we could set the password if there was no backdoor.
20:45
So that was it for the exploitation. At this point, we have a shell, which isn't strictly necessary. We could do everything with shellcode, but it's easier this way. And we cannot use a second-stage binary, like wget a binary and run it, because there is no file system.
21:03
This is not a Linux system. So this time, we resorted back to the memory modification command that we talked about earlier. So if we look at how commands are handled in eCos, we notice that there is a global array which has one entry for each possible command.
21:22
Each entry consists of a pointer to the command name and a pointer to the function responsible for handling invocations of that command. So what we do is we look for an unused memory region and we inject our custom code in there. And then we modify the global array with the commands
21:40
to make one of the handlers point to our code. Again, there is one more caveat here. The code we are injecting here is not a binary. It's just raw machine code. So it has to be self-contained or otherwise only depend on functions available within the firmware, provided, of course, that we know the addresses.
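The command-table idea can be sketched with a toy Python model. This is an analogy only: the table here is a Python list of `(name, handler)` pairs rather than an array of pointers in firmware memory, and the command names and handlers are hypothetical.

```python
from typing import Callable, List, Tuple

Handler = Callable[[List[str]], str]
CommandTable = List[Tuple[str, Handler]]

def cmd_help(args: List[str]) -> str:
    return "available commands: help, ifconfig"

def cmd_ifconfig(args: List[str]) -> str:
    return "interface listing"

def dispatch(table: CommandTable, line: str) -> str:
    """Look the command name up in the table and call its handler."""
    parts = line.split()
    if not parts:
        return "unknown command"
    for name, handler in table:
        if name == parts[0]:
            return handler(parts[1:])
    return "unknown command"

def hook_command(table: CommandTable, name: str, new_handler: Handler) -> None:
    """Repoint one existing entry at a different handler, keeping its name."""
    for i, (entry_name, _) in enumerate(table):
        if entry_name == name:
            table[i] = (entry_name, new_handler)
            return
    raise KeyError(name)

def cmd_portscan(args: List[str]) -> str:
    return "scanning " + " ".join(args)

table: CommandTable = [("help", cmd_help), ("ifconfig", cmd_ifconfig)]
before = dispatch(table, "ifconfig 192.168.0.1")
hook_command(table, "ifconfig", cmd_portscan)
after = dispatch(table, "ifconfig 192.168.0.1")
```

On the device, the equivalent of `hook_command` is a single memory write that replaces the handler pointer of one entry with the address of the injected code.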
22:00
So within this code, we have access to basically everything that's available on the device. What we used for our second stage for the PoC was the eCos API, which includes thread management functions, and the libc. And using this, we implemented a multi-threaded TCP connect port scanner. A SYN port scanner would have been better,
22:23
but eCos didn't provide support for raw TCP sockets for doing that. So we had to do a TCP connect scanner and we used the multi-threading that eCos provided to reduce scan times. All of this needs to be built statically in a self-contained binary.
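For readers unfamiliar with the technique, a multi-threaded TCP connect scan looks roughly like the following Python sketch. The speakers' second stage was built for eCos, so this is only an illustration of the same idea using the Python standard library; the function names, worker count, and timeout are assumptions.

```python
import socket
from concurrent.futures import ThreadPoolExecutor
from typing import Iterable, List

def is_open(host: str, port: int, timeout: float = 0.5) -> bool:
    """A port counts as open if a full TCP connection can be established."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        return s.connect_ex((host, port)) == 0

def connect_scan(host: str, ports: Iterable[int],
                 workers: int = 16, timeout: float = 0.5) -> List[int]:
    """Probe ports in parallel threads and return the open ones in input order."""
    port_list = list(ports)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda p: is_open(host, p, timeout), port_list)
        return [p for p, ok in zip(port_list, results) if ok]
```

A call such as `connect_scan("192.168.0.1", range(1, 1025))` would check the first 1024 ports of a host. Because each probe completes the TCP handshake, no raw sockets are needed, which matches the constraint mentioned in the talk.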
22:40
We used a custom linker script in order to be able to specify the loading address so that all the jumps will make sense in the context of the router, using a compiler which is compatible with the one used to build the image on the original device, and we kind of fake the library calls with the addresses that we already know. We can upload this using telnet
23:01
with the command for writing memory, and from there we can just execute it. All of this is open source and will be uploaded to a repository shortly. And one thing we didn't go into but is interesting is gaining persistence, and there's a footnote which you can check to see about that. So I think it's time for the demo of the full payload.
23:27
So as you can see, we start by entering the admin panel of the router. Here is the WAN IP address. And if we go to the administration part, we can check that telnet is not enabled by default.
23:45
However, we can try to use telnet, but obviously it will fail. And we can also check if the telnet port is open.
24:03
But it's not. So now we run our exploit, which will begin by building the second stage, and then we'll send the SIP message to enable telnet. Now the telnet port is open on the device, and we can use the shell to upload the second stage.
24:32
So this is rewriting the command handler, and when it finishes, we'll have a new command on the router. And as Octavio said before, this command is a port scanner,
24:47
so we can use it to scan the router itself, which now has telnet enabled, alongside this web interface.
25:01
And we can choose another device on the network and scan it with nmap as a ground truth. So this device has some open ports, and we can replicate this scan using the router this time.
25:21
And it works. So once we managed to pwn this router model, we decided to try to pwn other models using the same vulnerability. The first thing that caught our attention was that among the commands that were available in the command line that was available
25:42
through telnet and UART, there was a subset called Tenda commands. As you recall, this router's maker is Nexxt, and Tenda is another manufacturer.
26:00
So this was interesting, and we decided to search for the hardware specs of Nexxt and Tenda devices, and we found that many of them are based on SoCs from Realtek from the same family, the RTL819X. Here you can see on the left the device that we were doing our research on, and on the right, the Tenda AC5
26:22
also has an SoC from this family. So we downloaded the firmware images, and we found that they run eCos too. And we managed to find another vendor that uses these SoCs and runs eCos on its devices, and when we looked at the user interface
26:40
to configure the router from the browser, we found that they were very similar, differing only in the branding. Moreover, many of these devices are very similar physically, even in their packaging, so all of this suggests that these are OEM-manufactured devices, maybe manufactured by one or two companies.
27:02
So all these routers are built alike, and we wanted to know if they could be pwned alike, so we manually searched for the presence of the vulnerability in these firmwares, and it was in many of them. But before we moved on to pwning all the routers, we decided to disclose this vulnerability. And we reflected on the fact that this vulnerability
27:21
was shared by many different vendors, but this feature, the SIP ALG functionality, is kind of low-level; it's part of the network stack. So we thought that it was unlikely to have been written by one of the vendors, and we decided to contact Realtek directly. And they quickly confirmed that the vulnerability was part of their SDK for eCos-based routers,
27:42
access points, and repeaters, and this meant that all vendors that use this SDK and run eCos on their devices might have this vulnerability if they don't review the code that Realtek provides. So this motivated us to automate firmware analysis to try to detect more vulnerable devices.
28:04
So if we take a look at the vulnerable snippet again, we can see that it has a pretty recognizable structure. There are basically two calls to strchr looking for spaces in a given input, and then there is a strcpy, which copies everything after the second space to a buffer on the stack. And we thought it may be possible
28:22
to create a signature for this. So if we think in terms of the pattern that we want to detect, we basically want to detect calls to strcpy, but again, given a raw firmware image, we don't know which function is strcpy, so we just want to detect calls to any function
28:41
which takes two arguments, the first one being a buffer on the stack. And we can check whether the first argument of a call is a stack buffer using the intermediate representation API. And from there, we can check that the second argument comes from a call to a different function, again with two arguments,
29:01
the second one being a constant. We repeat this last step, and we check that the first argument to that previous call also comes from a call to the same function, which we hope will be strchr. And lastly, we check that these constant values equal 0x20, which is ASCII for space.
29:22
And if we find such a code pattern, we basically assume that F corresponds to strchr, G corresponds to strcpy, and that the firmware is indeed vulnerable. So we ended up detecting this pattern using a Ghidra script, and basically scanning the whole code for this pattern would be very time consuming
29:44
on top of the Ghidra analysis that needs to run first. So in order to narrow down the search space, we only look for this pattern within the functions that reference SIP-related strings, such as m=audio, or INVITE, or any of those.
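A minimal sketch of the signature described above, written against a toy expression tree rather than Ghidra's real intermediate representation. The node classes (`Const`, `StackBuf`, `Unknown`, `Add`, `Call`) and the function addresses are hypothetical and exist only for this example; the actual script works on Ghidra's IR and is not reproduced here.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

SPACE = 0x20  # ASCII code for the space character


@dataclass(frozen=True)
class Const:
    value: int


@dataclass(frozen=True)
class StackBuf:
    offset: int  # offset of the buffer relative to the stack pointer


@dataclass(frozen=True)
class Unknown:
    name: str  # any value the toy model knows nothing about


@dataclass(frozen=True)
class Add:
    base: object  # pointer expression
    delta: int    # constant added to it, e.g. to skip the space just found


@dataclass(frozen=True)
class Call:
    target: int  # address of the called function
    args: tuple


def strip_add(expr):
    """Ignore constant pointer adjustments between the chained calls."""
    while isinstance(expr, Add):
        expr = expr.base
    return expr


def is_space_search(expr) -> bool:
    """True for a two-argument call whose second argument is the constant 0x20."""
    return (
        isinstance(expr, Call)
        and len(expr.args) == 2
        and isinstance(expr.args[1], Const)
        and expr.args[1].value == SPACE
    )


def match_signature(call) -> Optional[Tuple[int, int]]:
    """Return (strcpy candidate, strchr candidate) addresses, or None."""
    if not isinstance(call, Call) or len(call.args) != 2:
        return None
    dst, src = call.args
    if not isinstance(dst, StackBuf):
        return None  # first argument must be a buffer on the stack
    outer = strip_add(src)
    if not is_space_search(outer) or outer.target == call.target:
        return None  # second argument must come from a different function
    inner = strip_add(outer.args[0])
    if not is_space_search(inner) or inner.target != outer.target:
        return None  # that call's first argument must come from the same function
    return call.target, outer.target
```

For example, a tree equivalent to `g(stack_buf, f(f(input, 0x20) + 1, 0x20))` matches, and the two returned addresses are the candidates for strcpy and strchr.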
30:01
But there's a big problem that needs to be sorted out first, and that's that in order to be able to get the string references right, we need to be able to calculate the loading address for the kernel. In our case, when we manually reversed this device, we got the loading address from the UART output,
30:21
but if we want to automate this, we must do it statically. We cannot go out and buy every device that we want to scan. So if we look at the UART output once again, we can see that at some stage in the boot process, the kernel needs to be decompressed, and someone is responsible for both decompressing the kernel
30:42
and deciding where the kernel will be loaded in memory. So we reverse engineered the boot loader, and we found this piece of code. Again, the names were added by ourselves. And we can see that there's a function which we have called decompress kernel, which takes the kernel loading address,
31:01
and it gets called right in between the calls to printf, which prints the debug messages we were seeing earlier. So once again, if we try to detect this code pattern, we can make use of the fact that there are several calls to printf, and that we know the offsets between these strings
31:22
that are being referenced. So we want to detect a code pattern that looks like this, several calls to the same function using those strings as arguments, and in between the first two calls, a call to a different function, which takes at least one argument.
31:42
But because we don't know the loading address for the boot loader either, we cannot get the string references right. So we need to rely on the offsets that I just said we already know. So we want to detect this code pattern, and we must make sure that the difference in offset between the calls to the function f
32:02
matches the difference in offsets between the strings that it should be printing. And if we find a matching piece of code, then we assume that the first argument to the second call is the kernel loading address. And in order to do this, we used Capstone, which basically works on disassembled instructions.
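The offset-matching idea can be sketched as follows. This is a simplified model, not the speakers' Capstone script: it assumes an earlier disassembly pass has already produced a list of call sites with the value of their first argument, and the `CallSite` type, the addresses, and the string offsets in the usage note are made up for illustration.

```python
from typing import NamedTuple, Optional, Sequence


class CallSite(NamedTuple):
    target: int     # address of the called function
    first_arg: int  # value loaded into the first argument register


def find_kernel_load_address(
    calls: Sequence[CallSite], string_offsets: Sequence[int]
) -> Optional[int]:
    """Look for f(s0), g(addr), f(s1), f(s2), ... and return addr.

    string_offsets are the file offsets of the debug strings, in printing
    order. Only the differences between them are used, so the bootloader's
    own loading address does not need to be known.
    """
    expected = [b - a for a, b in zip(string_offsets, string_offsets[1:])]
    size = len(string_offsets) + 1  # all print calls plus the call in between
    for i in range(len(calls) - size + 1):
        window = calls[i:i + size]
        first, middle = window[0], window[1]
        prints = [first] + list(window[2:])
        if middle.target == first.target:
            continue  # the call in between must be to a different function
        if any(c.target != first.target for c in prints):
            continue  # every other call must be to the same function f
        deltas = [b.first_arg - a.first_arg for a, b in zip(prints, prints[1:])]
        if deltas == expected:
            return middle.first_arg
    return None
```

With strings at file offsets `0x1200`, `0x1218`, and `0x1230`, a sequence such as `f(0x80001200)`, `g(0x80500000)`, `f(0x80001218)`, `f(0x80001230)` yields `0x80500000` as the candidate kernel loading address.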
32:20
It's much lower level than Ghidra's IR API, but it was good enough, because the analysis that we were conducting was rather primitive. And by the way, there's also an alternative approach for figuring out the kernel loading address statically, which is in the footnote, but we tried that and it didn't work on our device. So we automated all of this,
32:42
and then coded a higher level script, which invokes the Capstone script first to detect the loading address, loads the binary into Ghidra, and then runs the second script, which detects the vulnerable function call. And all of that is open source, and again, will be available in the repo shortly after this talk.
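A rough sketch of how the second half of such a pipeline could be driven from Python, building a command line for Ghidra's headless analyzer. The project name, script name, and file paths are placeholders, and the MIPS big-endian language ID is an assumption about the target. The flags are the ones documented for `analyzeHeadless` and should be checked against the installed Ghidra version.

```python
from typing import List


def build_ghidra_command(
    firmware: str,
    load_addr: int,
    project_dir: str,
    script: str,
    language: str = "MIPS:BE:32:default",  # assumption: 32-bit big-endian MIPS
) -> List[str]:
    """Build an analyzeHeadless invocation that imports a raw image at load_addr."""
    return [
        "analyzeHeadless", project_dir, "scan_project",  # placeholder project name
        "-import", firmware,
        "-loader", "BinaryLoader",           # treat the file as a raw binary
        "-loader-baseAddr", hex(load_addr),  # address found by the first stage
        "-processor", language,
        "-postScript", script,               # detection script, run after analysis
        "-deleteProject",                    # do not keep the temporary project
    ]


# Placeholder paths and script name, for illustration only.
EXAMPLE = build_ghidra_command("kernel.bin", 0x80500000, "/tmp/ghidra", "detect_pattern.py")
```

The resulting list can be passed to `subprocess.run` once the loading address has been recovered from the bootloader.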
33:03
So we ran this scan against the models that we found by basically Googling for devices using this chip or devices using this OS. We identified four vendors. We ran the script against all models from these four vendors. We identified 13 vulnerable models,
33:23
which, at the time of this project, amounted to over 100,000 sales in Latin America alone, on a single e-commerce site. Not only that, but they were actively being sold, because a few months in, 30K more devices had been sold.
33:42
But then, the guys at Faraday, with the help of Daniel Delfino and Federca, who's here in the audience, basically figured out a way to detect more potentially vulnerable devices, provided that these devices expose the HTTP interface through WAN. So this gave us 63,000 more devices to look at,
34:06
that is, more devices in the world to look at, not different models. But again, these are only the devices that are exposing the HTTP interface through WAN, so this is very much a lower bound. We started digging through those models individually,
34:23
and again, we noticed that there were many brands, many devices from those brands, and they all looked alike, they all used the same chip and everything. More so, they physically resembled one another. And after running the script on more of these devices,
34:41
we managed to identify 31 models from 19 vendors, including, well, Tenda, of course, D-Link, Zyxel, and a few more. In case you have a device that looks like that, or if the web interface looks like that, you can download the firmware from the vendor's website,
35:00
or hit that endpoint, which these devices provide, to dump the firmware, and run it through our tool, and please let us know if you find more vulnerable devices. So with that being said, we can move on to the takeaways of this talk. So as a recap, we started researching on a router
35:22
that was top selling in Argentina, and we found the vulnerability in an undocumented functionality. This vulnerability can allow an attacker to achieve remote code execution on this router without user intervention, through the WAN interface, and it can't be disabled via the router's web interface; it can only be disabled via the command line,
35:42
which is kind of difficult for a normal user, and even in that case, this configuration does not persist, and when the router resets, it becomes vulnerable again. So why does this matter? Well, because it was a hidden attack surface: there was no place in the documentation that mentioned this feature, and the fact that it ended up being in Realtek's SDK
36:03
meant that it affected various models from many different vendors, and it also sheds light on the fact that vendors don't do source code review, because the majority of the devices that use this SDK ended up being vulnerable. So you might be wondering, well, you found this stack buffer overflow on a cheap router,
36:21
but expensive routers should be more hardened, right? Well, at least for these vendors, and especially for Tenda, which is the one that has the highest number of devices affected, the expensive router models might offer the users more functionality, such as configuring your router with your phone, using the cloud, and things like that,
36:40
but they are also based on this SDK and have the vulnerability. And you might also be wondering, well, enterprise routers should be more hardened, right, because these are all home-grade routers, and for that, we refer you to the latest talk by the Flashback Team, where they found a vulnerability in the VPN functionality of a Cisco router, and it's a stack buffer overflow, pretty similar to the one we found,
37:02
and they discuss other similar vulnerabilities, but in enterprise routers. So although the security of internet-connected devices has improved recently, buffer overflows can still be found in 2022. And you might be wondering, well, why hasn't this been reported yet, despite being a classic stack buffer overflow?
37:21
Well, these are our thoughts. From a manufacturer's point of view, they don't have a security mindset. In fact, when we reported this vulnerability to Realtek, they thought that the only thing an attacker could achieve by exploiting it was to reset the router, and that it would be very hard for an attacker to achieve code execution using this vulnerability.
37:41
From a vendor's point of view, it is clear that they don't review the source code provided by Realtek. From a researcher's point of view, we think that the fact that this binary image is a giant blob composed of software from many different origins and in which applying function signatures is difficult might be a little bit daunting.
38:00
And from the user's point of view, well, they don't even know that their routers have this feature. So after we reported this vulnerability, it was assigned CVE-2022-27255, and Realtek patched this vulnerability on March 25th. But to the best of our knowledge and up to this date, no vendor has released patched versions of their firmware.
38:22
And even after that, users will still have to update their devices to fix this issue. So we think this vulnerability will be around for some time. So to conclude, IoT devices can have vulnerabilities in undocumented functionality, and this makes it harder to audit them. And code introduced in the supply chain might never get reviewed by the vendors.
38:42
And when these devices are OEM manufactured, well, they end up sharing code, and this means that they also share vulnerabilities. And from an attacker's point of view, this is a perfect scenario because they can find high-impact bugs with little prior knowledge and with little investment. So here are some references,
39:00
if you want to dig a little bit deeper into the topics we covered in the talk. And, well, thank you very much. And if anyone has a question... Thank you.