Practical Cache Attacks from the Network and Bad Cat Puns
Formal metadata

Number of parts: 254
License: CC Attribution 4.0 International: You may use, modify, and reproduce, distribute, and make the work or its content publicly available in unchanged or modified form for any legal purpose, provided you credit the author/rights holder in the manner specified by them.
Identifiers: 10.5446/53142 (DOI)
Transcript: English (automatically generated)
00:22
cache attacks from the network, and the speaker, Michael Kurth, is the person who discovered the attack, and it's the first attack of its type. So he's the first author of the paper, and this talk is going to be amazing.
00:44
We've also been promised a lot of bad cat puns, so I'm going to hold you to that. A round of applause for Michael Kurth. Hey, everyone, and thank you so much for making it to my talk tonight.
01:04
My name is Michael, and I want to share with you the research that I was able to conduct at the amazing VUSec group during my master's thesis. Briefly about myself: I pursued my master's degree in computer science at ETH Zurich and did my master's thesis in Amsterdam.
01:22
Nowadays, I work as a security analyst at InfoGuard. So what you see here are the people that actually made this research possible. These are my supervisors and research colleagues who supported me all the way and put so much time and effort into the research.
01:43
So these are the two rock stars behind this research. But let's start with cache attacks. Cache attacks were previously known to be local code execution attacks. For example, in the cloud setting here on the left-hand side, we have two VMs that
02:02
basically share the hardware, so they time-share the CPU and the cache, and therefore an attacker that controls VM 2 can actually attack VM 1 via a cache attack. Similarly with JavaScript. So a malicious JavaScript gets served to your browser, which then executes it, and because
02:25
it shares resources on your computer, it can also attack other processes. Well, this JavaScript thing gives you the feeling of remoteness, right? But still it requires this JavaScript to be executed on your machine to be actually
02:42
effective. So we wanted to really push this further and have a true network cache attack. So we have this basic setting where a client does SSH to a server, and we have a third machine that is controlled by the attacker.
03:00
And as I will show you today, we can break the confidentiality of this SSH session from the third machine without any malicious software running either on the client or the server. Furthermore, the CPU on the server is not even involved in any of these cache attacks.
03:22
So it's just there and not even noticing that we actually leak secrets. So let's look a bit more closely. So we have this nice cat doing an SSH session to the server, and every time the cat presses a key, one packet gets sent to the server.
03:44
So this is always true for interactive SSH sessions, because, as it's said in the name, it gives you this feeling of interactiveness. When we look a bit more under the hood what's happening on the server, we see that these
04:01
packets are actually activating the last level cache. More on that later in the talk. Now, the attacker, at the same time, launches a remote cache attack on the last level cache by just sending network packets, and, by this, we can actually leak arrival times
04:20
of individual SSH packets. Now, you might ask yourself, well, how would arrival times of SSH packets break the confidentiality of my SSH session? Well, humans have distinct typing patterns.
04:40
And here we see an example of a user typing the word because. And you see that typing E right after B is faster than, for example, typing C after E. And this can be generalised, and we can use this to launch a statistical analysis. So here on the orange dots, if we are able to reconstruct these arrival times correctly,
05:06
so what correctly means is we can reconstruct the exact times of when the user was typing, we can then launch this statistical analysis on the inter-arrival timings.
05:21
And therefore, we can leak what you were typing in your private SSH session. Sounds very scary and futuristic, but I will demystify this during my talk. So, all right, there is something I want to bring up right here at the beginning. As per tradition and the ease of writing, you give a name to your paper.
05:45
And if you're following InfoSec Twitter closely, you probably already know what I'm talking about because, in our case, we named our paper NetCAT. Well, of course, it was a pun. In our case, NetCAT stands for Network Cache ATtack, and, as it is with humour,
06:06
it can backfire sometimes, and, in our case, it backfired massively. And, with that, we caused a small Twitter drama this September.
06:21
One of the most liked tweets about this research was the one from Jake. And, yes, these talks are great because you can put a face to such tweets, and, yes, I'm this idiot. So let's fix this. Intel acknowledged it with a bounty and also a CVE number,
06:44
so from now on, we can just refer to it by the CVE number. Or, if that is inconvenient to you: during that Twitter drama, somebody sent us a nice alternative name, including a logo, which I actually quite like.
07:02
It's called Neocat. Anyway, lessons learned on that whole naming thing, and so let's move on. Let's get back to the actual interesting bits and pieces of our research. So a quick outline. I'm firstly going to talk about the background, so general cache attacks,
07:24
then DDIO and RDMA, which are the key technologies that we were abusing for our remote cache attack, then about the attack itself, how we reverse-engineered DDIO, the end-to-end attack, and, of course, a small demo.
07:42
So cache attacks are all about observing a micro-architectural state which should be hidden from software. And we do this by leveraging shared resources to leak information. An analogy here is safe-cracking with a stethoscope, where the shared resource is actually the air that transmits
08:03
the sounds from the lock for the different inputs that you're doing. And it actually works quite similarly in computers, but here the shared resource is the cache. So caches solve the problem that the latency of loads from memory
08:24
is really bad, right? And loads make up roughly a quarter of all instructions. And with caches, we can reuse specific data and also use spatial locality in programs. Modern CPUs usually have this three-layer cache hierarchy,
08:41
L1, which is split between data and instruction cache, L2, and then L3, which is shared amongst the cores. If data that you access is already in the cache, that results in a cache hit. And if it has to be fetched from main memory, that's considered a cache miss.
09:02
So how do we actually know if a cache access hits or misses? Because we cannot actually read data directly from the caches. We can do this, for example, with prime and probe. It's a well-known technique that we also used in the network setting. So I want to quickly go through what actually happens.
09:22
So the first step of prime and probe is that the attacker brings the cache to a known state, basically priming the cache. So it fills it with its own data, and then the attacker waits until the victim accesses it.
09:41
The last step is then probing, which is basically doing priming again, but this time timing the access times. So fast accesses, cache hits, mean that the cache was not touched in between. And cache misses mean that we now know
10:02
that the victim actually accessed one of the cache lines in the time between prime and probe. So what can we do with these cache hits and misses now? Well, we can analyze them. And this timing information tells us a lot about the behavior of programs and users.
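The three prime-and-probe steps can be sketched as a toy simulation. This is plain Python against a made-up LRU cache-set model (the way count, tags, and victim are all invented for illustration); a real attack times actual memory accesses instead of querying a model:

```python
CACHE_WAYS = 4  # made-up number of ways in one toy cache set

class ToyCacheSet:
    """A tiny LRU cache set standing in for one last-level-cache set."""
    def __init__(self, ways):
        self.ways = ways
        self.lines = []  # most recently used at the end

    def access(self, tag):
        hit = tag in self.lines
        if hit:
            self.lines.remove(tag)
        elif len(self.lines) == self.ways:
            self.lines.pop(0)  # evict the least recently used line
        self.lines.append(tag)
        return hit  # True = cache hit (fast), False = miss (slow)

def prime_probe(victim_accesses):
    cache = ToyCacheSet(CACHE_WAYS)
    attacker = [f"attacker-{i}" for i in range(CACHE_WAYS)]
    # 1. Prime: fill the set with attacker data (known state).
    for tag in attacker:
        cache.access(tag)
    # 2. Wait: the victim may or may not touch the set.
    if victim_accesses:
        cache.access("victim")
    # 3. Probe: re-access own data while timing; any miss means
    #    the victim evicted one of our lines in between.
    misses = sum(not cache.access(tag) for tag in attacker)
    return misses > 0  # True = victim activity detected

print(prime_probe(victim_accesses=True))   # victim touched the set
print(prime_probe(victim_accesses=False))  # set left untouched
```

In the real attack, "fast" and "slow" are measured access latencies rather than a boolean from a model, but the detection logic is the same.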
10:22
And based on cache hits and misses alone, researchers were able to leak crypto keys, guess visited websites, or even leak memory content, as with Spectre and Meltdown. So let's see how we can actually launch
10:41
such an attack over the network. So one of the key technologies is DDIO. But first I want to talk about DMA, because it's the predecessor to it. So DMA is basically a technology that allows a PCIe device, for example the network card, to interact directly with main memory by itself
11:03
without interrupting the CPU. So for example, if a packet is received, the PCIe device can just put it in main memory. And then when the program or the application wants to work on that data, it can fetch it from main memory.
11:21
Now with DDIO, this is a bit different. With DDIO, the PCIe device can directly put data into the last level cache. And that's great, because now the application, when working on the data, just doesn't have to go through the costly main memory walk,
11:40
and can just fetch the data directly from the last level cache and work on it. So DDIO stands for Data Direct I/O Technology. And it's enabled on all Intel server-grade processors since 2012. It's enabled by default and transparent to drivers and operating systems.
12:03
So I guess most people didn't even notice that something changed under the hood. And it changed something quite drastically. But why is DDIO actually needed? Well, it's for performance reasons. So here we have a nice study from Intel,
12:20
which shows on the bottom different numbers of NICs. So we have settings with two NICs, four NICs, six, and eight NICs, and the throughput for each. And as you can see with the dark blue, without DDIO it basically stops scaling after four NICs.
12:41
With the dark blue and with the light blue, you then see that it still scales up when you add more network cards to it. So DDIO is specifically built to scale network applications. The other technology that we were abusing is RDMA. So it stands for Remote Direct Memory Access.
13:03
And it basically offloads transport layer tasks to silicon. It's basically a kernel bypass, and there's also no CPU involvement. So an application can access remote memory without consuming any CPU time on the remote server.
13:23
So I brought here a little illustration to showcase the RDMA. So on the left, we have the initiator. And on the right, we have the target server. A memory region gets allocated on startup of the server. And from now on, applications can perform data transfer
13:41
without the involvement of the network software stack. So you omit the TCP/IP stack completely. With one-sided RDMA operations, you even allow the initiator to read and write to arbitrary offsets within that allocated space on the target.
14:01
I quote here a statement of the market leader of one of these high-performance NICs. Moreover, the caches of the remote CPU will not be filled with the accessed memory content. Well, that's not true anymore with DDIO. And that's exactly what we attacked on.
14:25
So you might ask yourself, where is this RDMA used? And I can tell you that RDMA is one of these technologies that you don't hear about often but that is actually extensively used in the back ends of the big data centers and cloud infrastructures.
14:41
So you can get your own RDMA-enabled infrastructures from public clouds like Azure, Oracle Cloud, Huawei, or Alibaba. Also, file protocols like SMB and NFS can support RDMA. And other applications are high-performance computing,
15:01
big data, machine learning, data centers, clouds, and so on. But let's get a bit into detail about the research and how we abused the two technologies. So we know now that we have a shared resource exposed to the network via DDIO.
15:21
And RDMA gives us the necessary read and write primitives to launch such a cache attack over the network. But first, we needed to clarify some things. So of course, we did many experiments and extensively tested the DDIO part to understand the inner workings.
15:41
But here, I brought with me two major questions which we had to answer. The first one is, of course: can we distinguish a cache hit from a miss over the network? We still have network latency and packet queuing and so on. So would it be possible to actually get the timing right?
16:04
Which is an absolute must for launching a side channel. The second question is then: can we actually access the full last level cache? This corresponds to the attack surface that we actually have. So the first question we can answer
16:21
with this very simple experiment. So we have on the left a very small code snippet. We have a timed RDMA read to a certain offset. Then we write to that offset and we read again from the offset. So what you can see is that when doing this
16:42
like 50,000 times over many different offsets, you can clearly distinguish the two distributions. So the blue one corresponds to data that was fetched from main memory and the orange one to data that was fetched from the last level cache over the network.
17:00
You can also see the effects of the network. For example, you can see the long tails, which correspond to some packets that were slowed down in the network or were queued. As a side note for all the side channel experts: we really need that write, because with DDIO, reads
17:23
do not allocate anything in the last level cache. So basically this is the building block to launch a prime and probe attack over the network. However, we still need a target that we can actually profile. So let's see what kind of an attack surface we have.
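The timed read, write, read-again experiment can be sketched as a simulation. All latencies here are made-up numbers, and `rdma_read`/`rdma_write` are hypothetical stand-ins for one-sided RDMA verbs; a real run needs an RDMA-capable NIC:

```python
import random
import statistics

cached = set()  # offsets resident in the (simulated) last level cache

def rdma_write(offset):
    # With DDIO, a write from the NIC allocates the line in the LLC.
    cached.add(offset)

def rdma_read(offset):
    # Made-up latencies: ~1500 ns from DRAM, ~1200 ns from the LLC,
    # plus Gaussian network jitter.
    base = 1200 if offset in cached else 1500
    return base + random.gauss(0, 30)

def measure(offset):
    t_miss = rdma_read(offset)  # first read: data comes from main memory
    rdma_write(offset)          # write pulls the line into the LLC via DDIO
    t_hit = rdma_read(offset)   # second read: data comes from the LLC
    return t_miss, t_hit

samples = [measure(off) for off in range(1000)]
miss_avg = statistics.mean(m for m, _ in samples)
hit_avg = statistics.mean(h for _, h in samples)
print(f"avg miss: {miss_avg:.0f} ns, avg hit: {hit_avg:.0f} ns")
```

Over many offsets the two latency distributions separate cleanly, which is exactly the network-visible hit/miss signal; queued packets show up as the long tails mentioned above.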
17:44
Which brings us to the question, can we access the full last level cache? And unfortunately, this is not the case. So DDIO has this allocation limitation of two ways. Here in the example out of 20 ways, so roughly 10%.
18:01
It's not a dedicated way, so the CPU still uses it. But we would only have access to roughly 10% of the CPU's cache activity in the last level cache. So that was not working so well for a first attack.
18:21
But the good news is that other PCI devices, let's say a second network card, will also use the same two cache ways. And with that, we have a 100% visibility of what other PCI devices are doing in the cache.
18:43
So let's look at the end-to-end attack. So as I told you before, we have this basic setup of a client and a server. And we have the machine that is controlled by us, the attackers. So the client just sends its packets
19:02
over a normal ethernet NIC. And there is a second NIC attached to the server, which allows the attacker to launch RDMA operations. So we also know now that all the keystrokes that the user is typing are sent in individual packets,
19:22
which are allocated in the last level cache through DDIO. But how can we actually get these packet arrival times? Because that's what we're interested in. So now we have to look a bit more closely at how the arrival of network packets actually works.
19:44
So the IP stack has a ring buffer, which is basically there to allow asynchronous operation between the hardware, so the NIC, and the CPU. So if a packet arrives, it will be allocated
20:01
in the first ring buffer position. On the right-hand side, you see the view of the attacker, which can just profile the cache activity. And we see that the cache line at position one lights up, so we see activity there. It could also be cache line two; we don't know on which cache line
20:23
this will actually pop up, but what is important is what happens with the second packet. Because the second packet will also light up a cache line, but this time a different one, and it's actually the next cache line after the previous packet's. And if we do this for three and four packets,
20:43
we can see that we suddenly have this nice staircase pattern. So now we have a predictable pattern that we can exploit to get information when packets were received. And this is just because the ring buffer
21:01
is allocated in a way that it doesn't evict itself. If packet two arrives, it doesn't evict the cache content of packet one, which is great for us as attackers because we can profile it well. Well, let's look at a real-life example.
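In toy form, the ring-buffer staircase looks like this. The ring size and timestamps are made up, and the real attacker observes cache-line activity rather than calling a function; the point is only the wrap-around pattern and how it yields arrival times:

```python
RING_SIZE = 8  # made-up descriptor ring size; real NIC rings are larger

def ring_slot(packet_index):
    # Consecutive packets use consecutive descriptor slots, wrapping
    # around, which is why consecutive arrivals light up consecutive
    # cache lines: the staircase pattern.
    return packet_index % RING_SIZE

# The attacker's view: (timestamp, lit cache line) per arriving packet.
observed = [(t, ring_slot(i))
            for i, t in enumerate([0.00, 0.12, 0.31, 0.35, 0.60])]

def arrival_times(observations):
    """Recover packet arrival times: every step to the next expected
    ring slot is one packet."""
    times, expected = [], None
    for t, slot in observations:
        if expected is None or slot == expected:
            times.append(t)
            expected = (slot + 1) % RING_SIZE
    return times

print(arrival_times(observed))  # recovers all five timestamps
```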
21:22
So this is the cache activity when the server receives constant pings. You can see this nice staircase pattern, and you can also see that the ring buffer reuses locations as it is a circular buffer. Here it's important to know that the ring buffer
21:40
doesn't hold the data content, just the descriptor to the data. So this is reused. Unfortunately, when the user types over SSH, the pattern is not as nice as this one here, otherwise we would already have a done deal and could just work on this.
22:01
Because when a user types, you will have more delays between packets, and generally you also don't know when the user is typing, so you have to profile all the time to get the timings right. Therefore, we needed to build a bit more of a sophisticated pipeline.
22:21
So it basically is a two-stage pipeline which consists of an online tracker that is observing a bunch of cache lines all the time. And when it sees that certain cache lines were activated, it moves its window forward to the next position
22:43
that it believes an activation will happen at. The reason is that we need a speed advantage: we need to profile much faster than the network packets of the SSH session are arriving. And what you can see here on the left-hand side is a visual output of what the online tracker does.
23:02
So it just profiles this window, which you can see in red, and if you look very closely, you can also see more lit-up lines in the middle, which correspond to arrived network packets. You can also see that there is plenty of noise involved,
23:22
so therefore we are not able to directly get the packet arrival times from it. That's why we need a second stage, the offline extractor. And the offline extractor is in charge of computing the most likely occurrences of client SSH network packets.
23:43
It uses the information from the online tracker and the predictable pattern of the ring buffer to do so. And then it outputs the inter-packet arrival times for different words, as shown here on the right. Great, so now we are again at the point
24:01
where we have just packet arrival times, but no words, which we need for breaking the confidentiality of your private SSH session. So as I told you before, users or generally humans have distinctive typing patterns.
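The inter-arrival timings that the extractor outputs are just the gaps between consecutive recovered packet timestamps; a minimal sketch (all timestamps made up):

```python
def inter_arrival(times):
    # Gaps between consecutive SSH packets, i.e. the inter-keystroke
    # timings that the statistical attack works on.
    return [b - a for a, b in zip(times, times[1:])]

# Made-up arrival times (in seconds) for five keystrokes of one word.
arrivals = [0.000, 0.110, 0.260, 0.480, 0.540]
gaps = inter_arrival(arrivals)
print([round(g, 3) for g in gaps])  # four gaps between five keystrokes
```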
24:21
And with that, we were able to launch a statistical attack. More precisely, we learn a mapping between user typing behavior and actual words, so that in the end, we can output the words that you were typing in your SSH session.
24:44
So we used 20 subjects that were typing free and transcribed text, which resulted in a total of 4,500 unique words. And each represented as a point in a multidimensional space.
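Matching a measured timing vector to a word in that multidimensional space can be sketched with a tiny nearest-neighbor lookup. The corpus and every timing value below are made up for illustration; the real attack trained on the 4,500-word corpus:

```python
import math

# Hypothetical corpus: word -> typical inter-keystroke gaps (milliseconds).
corpus = {
    "because": [110, 95, 140, 120, 100, 130],
    "network": [130, 105, 125, 150, 115, 90],
    "science": [100, 120, 110, 105, 140, 125],
}

def guesses(observed, n=1):
    """Rank corpus words by Euclidean distance to the observed timing
    vector; a top-n list like this is what the top-10 accuracy refers to."""
    return sorted(corpus, key=lambda w: math.dist(observed, corpus[w]))[:n]

# A noisy "remote" measurement of someone typing "because" (made up).
print(guesses([115, 90, 150, 118, 96, 135], n=1))  # → ['because']
```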
25:00
And we used really simple machine learning techniques like the k-nearest neighbor algorithm, which basically categorizes the measurements by Euclidean distance to other words. The reason why we just used a very basic machine learning algorithm is that we just wanted to prove that the signal that we were extracting
25:21
from the remote cache is actually strong enough to launch such an attack. So we didn't want to improve in general like this kind of mapping between users and their typing behavior. So let's look how this worked out. So firstly, on the left hand side,
25:40
you see we used our classifier on raw keyboard data. That means we just used the signal that was emitted during typing, when they were typing on their local keyboard, which gives us perfect and precise timing data. And we can see that this is already quite challenging to mount.
26:01
So we have an accuracy of roughly 35%, but look at the top 10 accuracy, which basically means the attacker can guess 10 words, and if the correct word was amongst these 10 words, then that's considered to be accurate. And with the top 10 guesses,
26:20
we have an accuracy of 58%. That's just on the raw keyboard data. And then we used the same data and also the same classifier on the remote signal. And of course, this is less precise because we have noise factors and we could even add or miss keystrokes.
26:43
And the accuracy is roughly 11% less and the top 10 accuracy is roughly 60%. So as we used a very basic machine learning algorithm, many subjects and a relatively large word corpus,
27:03
we believe that we can showcase that the signal is strong enough to launch such attacks. So of course, now we want to see this whole thing working, right? As I'm a bit nervous here on stage, I'm not gonna do a live demo
27:20
because it would involve me doing some typing, which probably would confuse myself and of course also the machine learning model. Therefore, I brought a video with me. So here on the right-hand side, you see the victim. So it will shortly begin with doing an SSH session.
27:42
And then on the left-hand side, you see the attacker. So mainly on the bottom, you see this online tracker and on top, you see the extractor and hopefully the predicted words. So now the victim starts this SSH session to the server called father
28:00
and the attacker, which is on the machine sum, launches now this attack. So you saw we profiled the ring buffer location and now the victim starts to type. And as this pipeline takes a bit to process these words and to predict the right thing,
28:21
you will shortly see slowly the words popping up in the correct, hopefully the correct order. And as you can see, we can correctly guess
28:42
the right words over the network by just sending network packets to the same server and, with that, getting out the crucial information of when such SSH packets arrived.
29:06
So now you might ask yourself, how do you mitigate against these things? Well, luckily it's just server-grade processors, so no clients and so on. But then from our viewpoint, the only true mitigation at the moment
29:20
is to either disable DDIO or not use RDMA. Both come with quite a performance impact. With DDIO, you're talking roughly 10 to 18% less performance, depending of course on your application. And if you decide to just not use RDMA,
29:40
you probably have to rewrite your whole application. Intel's publication on disclosure day sounded a bit different, but read it for yourself. I mean, the meaning of 'untrusted network' can,
30:00
I guess, be quite debatable, and yeah, it is what it is. So I'm very proud that we got accepted at Security and Privacy 2020. Also Intel acknowledged our findings, public disclosure was in September,
30:21
and we also got the bug bounty payment. Increased peripheral performance has forced Intel to place the last level cache on the fast IO path in its processors. And by this, it exposed even more
30:40
shared microarchitectural components, which we know by now have a direct security impact. Our research is the first DDIO side channel vulnerability, but we still believe that we just scratched the surface with it. Remember, there are more PCIe devices attached.
31:02
So there could be storage devices. So you could profile the cache activity of storage devices and so on. There's even such a thing as GPUDirect, which gives you access to the GPU's cache,
31:20
but that's a whole other story. So yeah, I think there is much more to discover on that side and stay tuned. With that, all is left to say is a massive thank you to you and of course to all the volunteers here at the conference. Thank you.
31:47
Thank you, Michael. We have time for questions. So you can line up behind the microphones and I can see someone at microphone seven. So thank you for your talk.
32:01
I had a question about when I'm working on a remote machine using SSH, I'm usually not typing nice words like you've shown. Usually it's weird bash things like dollar signs and dashes and I don't know. Have you looked into that as well? Well, I think, so I mean, of course, what we wanted to showcase is that we can leak passwords.
32:22
If you would do sudo or whatsoever. The thing with passwords is that they have their own dynamics. So you type passwords differently than you type normal words. And then it gets a bit difficult, because when you want to do a large study
32:41
of how users would type passwords, you either ask them for their real password, which is not so ethical anymore, or you have them train on different passwords. And that's also difficult because they might adopt a different style of typing these passwords than if it were their real password.
33:02
And of course, the same would go for the command line in general, and we just didn't have the word corpus to launch such an attack. Thank you. Microphone one. Hi, thanks for the talk. I would like to ask about the original SSH timing attack paper.
33:25
It's from 2001 or something like that. Yeah, exactly, exactly. Do you have some idea why there are no countermeasures on the side of SSH clients to add some padding or some random delays or something like that? Do you have some idea why nothing is happening there?
33:43
Is it some technical reason, or what's the deal? Well, we were also afraid that between 2001 and today they had added some kind of delay or batching or whatsoever. I'm not sure if it's just a trade-off against the interactiveness of your SSH session
34:03
or if there is a true reason behind it. But what I do know is that it's oftentimes quite difficult to add these artificial packets in between, because if it's not random at all, you could even filter out the additional packets that just get inserted by SSH.
34:23
But other than that, I'm not familiar with any reason why they didn't adapt or why this wasn't on their radar. Thank you. Microphone four. How much do you rely on the skill of the typist? I'm thinking of a user that has to search for each letter
34:44
on the keyboard, or someone that is distracted while typing, so not having a real pattern behind the typing. Are you actually relying on the pattern being reproducible?
35:01
As I said, we're just using a very simple machine learning algorithm that looks at the Euclidean distance between previous words that you were typing and the new arrival times that we are observing. So if that is completely different, then the accuracy would drop.
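The approach described here could be sketched roughly as follows. This is an illustrative sketch only: the word list, timing values, and pairing of one inter-arrival time per keystroke gap are made-up assumptions, not the classifier or data set from the actual research.

```python
# Rough sketch of the idea above: guess a word by comparing the observed
# keystroke inter-arrival times against previously recorded timing
# profiles, using Euclidean distance. All values are illustrative.
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def guess_word(observed, profiles):
    """profiles: word -> list of inter-arrival times in ms (one per keystroke gap)."""
    # Only words with the same number of keystroke gaps are comparable.
    candidates = {w: t for w, t in profiles.items() if len(t) == len(observed)}
    return min(candidates, key=lambda w: euclidean(observed, candidates[w]))

profiles = {
    "password": [120, 95, 140, 110, 100, 130, 90],
    "keyboard": [60, 200, 75, 180, 65, 190, 70],
}
print(guess_word([125, 90, 135, 115, 105, 125, 95], profiles))  # prints "password"
```

A timing trace close to a known profile is classified as that word; as the answer notes, an erratic typist produces traces far from every profile, and the accuracy drops.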
35:21
Thank you. Microphone eight. As a follow-up to what was said before, wouldn't this make it a targeted attack, since you would need to train the machine learning algorithm exactly for the person that you want to extract the data from? So yeah, the goal of our research
35:41
was not to do next-level, let's say, machine-learning-type recognition of your typing behavior. So we actually used the information on which user was typing to profile that correctly. But still, I think you could maybe generalize.
36:03
So there is other research showing that you can categorize users into different types of typists. And if I remember correctly, they found that you can categorize each person into, let's say, seven different typing categories.
36:20
And I also know that some online trackers are using your typing behavior to re-identify you, just to serve you personalized ads and so on. But still, we didn't want to go into that depth of improving the state of this whole thing.
36:44
Thank you. And we'll take a question from the internet next. Did you ever try this with a high-latency network like the internet? So of course we rely on, let's say, a constant latency, because otherwise it would basically
37:01
screw up our timing attack. And as we are talking about RDMA, which is usually found in data centers, we also tested it in data-center-like topologies. A high-latency network would make it, I guess, quite hard, which means that you would have to do a lot of repetition,
37:20
which is actually bad, because you cannot tell the users, please retype what you just did because I have to profile it again, right? So yeah, the answer is no. Thank you. Mic one, please. If the victim pastes something into the SSH session,
37:41
would you be able to carry out the attack successfully? No. If you paste stuff, it is just sent out as a batch when you press enter. Okay, thanks. Thank you. The angels tell me there is a person behind mic six whom I am completely unable to see
38:01
because of all the lights. So as far as I understood, the attacker can only see that some packet arrived on their NIC. So if there is a second SSH session running simultaneously on the machine under attack, would it already interfere with this attack? Yeah, absolutely. Even distinguishing SSH packets
38:23
from normal network packets is challenging. So we use kind of a heuristic here, because the thing with SSH is that it always sends two packets right after each other. So not only one, but two. I omitted this part for simplicity in this talk.
38:43
But we also rely on these kinds of heuristics to even filter out SSH packets. And if you had a second SSH session, I can imagine that this would completely break it, because we cannot distinguish which SSH session did what. Thank you.
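The two-packet heuristic mentioned in this answer might be sketched like this. The pairing window is an assumed illustrative value, not the one used in the actual attack, and the function names are hypothetical.

```python
# Illustrative sketch of the heuristic above: interactive SSH sends two
# packets in quick succession per keystroke, so we keep the timestamp of
# each close pair and drop lone packets. The window below is an assumption.
PAIR_WINDOW_US = 200  # assumed maximum gap (microseconds) within a pair

def keystroke_times(arrivals_us):
    """Return the first timestamp of each two-packet burst."""
    keystrokes = []
    i = 0
    while i < len(arrivals_us) - 1:
        if arrivals_us[i + 1] - arrivals_us[i] <= PAIR_WINDOW_US:
            keystrokes.append(arrivals_us[i])
            i += 2  # consume both packets of the pair
        else:
            i += 1  # lone packet, likely not an SSH keystroke
    return keystrokes

print(keystroke_times([0, 100, 50000, 50150, 120000]))  # [0, 50000]
```

As the answer notes, a second simultaneous SSH session would produce interleaved pairs that such a filter cannot attribute to either session.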
39:01
Mic seven again. You said you were using two connectors, or, how are they called, NICs? Yes, exactly. Does it have to be two different ones, or can it be the same? How does it work? So in our setting, we used one NIC
39:21
that has the capability of doing RDMA. In our case, this was over a fabric, so InfiniBand. And the other was just a normal Ethernet connection. But could it be the same? Could both be InfiniBand, for example? Yes, I mean, the thing with InfiniBand is
39:41
it doesn't use the ring buffer. So we would have to come up with a different kind of tracking ability to get this, which could get even a bit more complicated, because it does this kernel bypass. But if there is a predictable pattern, we could potentially also do this. Okay, thank you.
40:02
Thank you. Mic one. Yeah, hello again. I would like to ask, I know it was not the main focus of your study, but do you have some estimation of how practical this timing attack can be? Like, in a real-world scenario, not the prepared one,
40:20
how big a problem can it really be? What do you think? What's the state of the art in this field, and how do you feel about the risk? You're just referring to the typing attack, right? The timing attack, the SSH timing attack, not necessarily the cache version. So the original research has been out there since 2001. And since then, many researchers have shown
40:43
that it's possible to launch such typing attacks in different scenarios. For example, JavaScript is another one. And it's always a bit difficult to judge, because most of the researchers are using different data sets. So it's difficult to compare.
41:00
But I think in general, I mean, we used quite a large word corpus and it still worked. Not super precisely, but it still worked. So yeah, I do believe it's possible, but to make it a real-world attack where an attacker wants high accuracy, he would probably need a lot of data
41:22
and even more sophisticated techniques, which do exist. So there are a couple of other machine learning techniques that you could use, which have their own pros and cons. Thank you. Ladies and gentlemen, the man who named an attack NetCAT,
41:41
Michael Kurth. Give him a round of applause, please. Thank you so much.