Contributor analysis
This is a modal window.
Das Video konnte nicht geladen werden, da entweder ein Server- oder Netzwerkfehler auftrat oder das Format nicht unterstützt wird.
Formale Metadaten
Titel |
| |
Serientitel | ||
Teil | 8 | |
Anzahl der Teile | 20 | |
Autor | ||
Lizenz | CC-Namensnennung 4.0 International: Sie dürfen das Werk bzw. den Inhalt zu jedem legalen Zweck nutzen, verändern und in unveränderter oder veränderter Form vervielfältigen, verbreiten und öffentlich zugänglich machen, sofern Sie den Namen des Autors/Rechteinhabers in der von ihm festgelegten Weise nennen. | |
Identifikatoren | 10.5446/32383 (DOI) | |
Herausgeber | ||
Erscheinungsjahr | ||
Sprache | ||
Produktionsort | Brüssel |
Inhaltliche Metadaten
Fachgebiet | ||
Genre | ||
Abstract |
|
3
4
7
8
17
18
00:00
ComputeranimationJSON
00:21
ComputersicherheitNebenbedingungTechnische InformatikVorlesung/Konferenz
00:42
MultiplikationsoperatorForcingBitEinfache GenauigkeitEinsSchaltnetzFunktion <Mathematik>MereologieEin-AusgabeAdditionParametersystemZahlenbereichDemo <Programm>ImplementierungDisjunktion <Logik>InformationKryptologieOrdnung <Mathematik>AlgorithmusPropagatorDigitaltechnikSubstitutionMapping <Computergraphik>Güte der AnpassungSchlüsselverwaltungSoftwaretestInternetworkingMailing-ListeDigitalisierungOrtsoperatorSoftware EngineeringStandardabweichungLeistung <Physik>QuaderRestklasseChiffrierungNichtlinearer OperatorAnalysisRechter WinkelDifferenteCASE <Informatik>EnthalpieVerschiebungsoperatorKreisbewegungKonstanteSchreiben <Datenverarbeitung>SoftwareentwicklerWissensbasisInelastischer StoßSoundverarbeitungZellularer AutomatWeb SiteStichprobenumfangQuick-SortGruppenoperationKoeffizientDatenstrukturSchwebungSurjektivitätEndliche ModelltheorieDigitale PhotographiePhysikalisches SystemData MiningMAPVersionsverwaltungSprachsyntheseAggregatzustandCodierungGammafunktionPoisson-KlammerGesetz <Physik>InformationsspeicherungTotal <Mathematik>t-TestCompilerAnalogieschlussAntwortfunktionTeilbarkeitSichtenkonzeptComputeranimation
08:43
ImplementierungMaterialisation <Physik>ProgrammbibliothekRechenwerkPlastikkarteSoftwareentwicklerComputeranimation
09:18
Mailing-ListeEin-AusgabeKlasse <Mathematik>DechiffrierungAggregatzustandQuick-SortBitSchwebungPolstelleChiffrierungComputersicherheitDisjunktion <Logik>ResultanteCodierungJSON
11:15
ResultanteCASE <Informatik>ZahlenbereichDisjunktion <Logik>BitUnrundheitChiffrierungEin-AusgabeAdditionFormation <Mathematik>SchwebungProzess <Informatik>DechiffrierungÄhnlichkeitsgeometrieProgramm/Quellcode
13:04
Demo <Programm>Programm/Quellcode
13:24
Ordnung <Mathematik>BitEinsZahlenbereichMereologieDreiAggregatzustandProgramm/Quellcode
13:51
Familie <Mathematik>Produkt <Mathematik>AdditionProgramm/QuellcodeComputeranimation
14:17
VersionsverwaltungImplementierungSchnittmengeCASE <Informatik>FunktionalanalysisZahlenbereichAggregatzustandLoopQuick-SortZustandsgrößeAuswahlverfahrenDisjunktion <Logik>Mixed RealityProgramm/Quellcode
15:04
ImplementierungQuick-SortIdeal <Mathematik>Kategorie <Mathematik>MAPWort <Informatik>ZahlenbereichBitSubstitutionOptimierungsproblemAggregatzustandCliquenweiteCASE <Informatik>DechiffrierungMultiplikationsoperatorMixed RealityLastUnrundheitDisjunktion <Logik>Mailing-ListeResultanteBootenOrdnung <Mathematik>Algorithmus
16:54
Leistung <Physik>Twitter <Softwareplattform>Gewicht <Ausgleichsrechnung>InformationBitSchwebungEinfache GenauigkeitAggregatzustandMixed RealityDechiffrierungSchnittmengeCASE <Informatik>Formale BegriffsanalyseEin-AusgabeDefaultPaarvergleichFunktionalanalysisResultanteFunktion <Mathematik>SkriptspracheCoxeter-GruppeProgrammfehlerGüte der AnpassungChiffrierungAnalysisProgramm/QuellcodeComputeranimation
19:44
Demo <Programm>Coxeter-GruppeDigitaltechnikAlgebraisches ModellGrundraumNeuroinformatikGruppenoperationFormale GrammatikSchnittmengePaarvergleichEnergiedichteSelbstrepräsentationDatenflussAnalysisKryptoanalyseSchwebungElektronische PublikationWasserdampftafelEinsZahlenbereichDechiffrierungStichprobenumfangEin-AusgabeBitFunktion <Mathematik>Güte der AnpassungChiffrierungUnrundheitMultiplikationsoperatorNegative ZahlOrdnung <Mathematik>SoftwaretestSchlüsselverwaltungComputeranimation
23:20
Ordnung <Mathematik>BitSoftwaretestDisjunktion <Logik>Algebraisches ModellChiffrierungAnalysisEinsWort <Informatik>Negative ZahlAutomatische HandlungsplanungCASE <Informatik>Inhalt <Mathematik>SchwebungQuick-SortVorlesung/Konferenz
24:58
CodierungMultiplikationsoperatorComputeranimation
Transkript: Englisch(automatisch erzeugt)
00:10
Brussels 2017.
00:22
OK, so my name is Francisco Olazizquierdo Riera. It's a really boring name, at least if you live in Spain. So please call me Klondike. Well, I have been working with security for 11 years by now. I started with 17.
00:41
I'm a computer engineer and blah, blah, blah. That's like things that you put on the wall and you forget about them. I'm a software engineer and a developer. And I really enjoy cryptography. And well, now I'm working as a pen tester. So well, enough about me. Now we are going to introduce you to a crypt analysis. I mean, not the internet.
01:03
So let's explain what's confusion and what's diffusion. The both concepts were introduced by Cloud Sanon in 1945. And you might not realize it, but they are basically the base below almost all cryptographic algorithms that are used nowadays.
01:22
For example, eyes is designed to have both good confusion and good diffusion, and so was this. Confusion is basically the ability that a cipher has to hide out the relationship between plaintext and cleartext based on the key that you are providing. And diffusion on the other side, which is what we are going to focus on,
01:40
is that each single input bit that you provide to the cipher should affect each single output bit of the ciphers. Because otherwise, you can try to focus on two speeds that are affected by less inputs. And then you can try to break two speeds and use the information you gain from two speeds in order to attack the other ones.
02:01
Any questions until now? No? Well, even if you had, I could not be able to see it because there is really, really strong light coming my way. Okay, so what is the idea? We want to reproduce the cryptographic algorithm for example, our implementation of eyes, or salsa20, whatever.
02:21
But instead of running all the operations, we are going to try to see how the operations try to mix together the different inputs that we are providing. So therefore, we are focusing on diffusion. So how can we map the operations in order to make that to happen? Well, first, we have the concept
02:40
of a black substitution box, or black s-box. It's an s-box, or substitution box, on which we don't know which is the internal mapping. This is probably something that sounds familiar to people that has been doing different circuit analysis. So basically, we have like n bits as input, and n bits as output. And we will assume that all bits of the input
03:02
propagate to all bits of the outputs. So for every single output bit, for every single output bit, we will make the union of the list of every single input bit. We also have the bitwise node, which is, well, totally trivial.
03:20
I mean, we are just flipping every single bit. There is no cross-dependency. So the output list of a node is the same as the input list. And our XOR, we only will interact between pairs of bits. So we just make the union of both contributor lists. Sifts are a lot more complicated
03:41
if you don't have a constant on the sift parameter. So what we will do is just spread the dependencies of the first input to all the outputs, and that will do it, because we know that the first input will be in one of the possible positions spreading the bits. For additions, we can spread from LSBs
04:03
to MSBs. I'm not sure if you are familiar with what we call the schoolbook addition. It's the typical algorithm that you use at the schools when you have to add really large numbers. So you start adding the first digits that are on the right, and then you might have a carry that goes to the second group of digits,
04:21
then you add them. So if you understand this algorithm, you can understand why we first will take the union of lists of the first two digits, and then we propagate that as carry to the second group of digits, then we propagate that contribution list to the third, and so on. Substructions are pretty much the same, because when we are speaking of integers,
04:42
we will do a complement of two, so it's a node of the operand, and we also add one at the end, so that's an addition, and then we do another addition. So we will just do the same as with addition, and propagate from right to left. Multiplications, still the same. We propagate from right to left,
05:02
as we used to do on the school. Modulos are a lot harder to map, so for now we have to use a black S-Box. So our divisions, fortunately you will not see many modulos and many divisions, and most of the modulos you will see are modulos by powers of two, which you can convert instead to a byte-wise ANT.
05:26
And we have also a white S-Box, which is the standard substitution box you can find, for example, on nice. And in this case, what we can try to do is try to find out if there is some bits that don't contribute to a specific output bit.
05:41
This is usually the case on S-Boxes like the ones that are used for Tiger, because Tiger S-Boxes have 256 bits of input, but they have six, sorry, they have eight bits of input, and they have 16 bits of output. So let's see if we can optimize everything.
06:02
I mean, this so far is a bit confusing, but now it's hopefully going to be easier. So if we do a bit-wise ANT by a constant, it's really simple. We use clear all the input lists where the bit is zero, because we know that this bit is going to be zero always.
06:24
With a bit-wise OR is still the same. We will clear the bits where the bits are one. Bit-wise XOR, we use copy the lists. We ignore the constant. Bit-wise shifts and rotates. Basically, we just need to shift or rotate the inputs,
06:42
which makes things a lot easier compared to our other approach. Arithmetic write shifts, we just have to move to the right and then copy the most significant bits, contribute or list to every single position that we have moved. Shifts and rotates of a constant are a lot harder,
07:02
so for now, we will use the union of all the inputs for the second parameter. Multiplications, well, they can be replaced by shifts and additions. And as you can probably figure out, there is still a shitload of work left to do. I mean, this I started developing on April,
07:23
and well, I also have work full-time, so I don't have that much time for this stuff. So let's go now for the cool part. I mean, if we want to attack the cipher, once we know that there is at least one output bit that has less contributors than input bits exist on the cipher,
07:42
we will just try to brute force the contributors of that single bit and try to filter out which of the combinations of contributors are actually the right ones. From a cryptographic approach, this means that you have broken the cipher because you can figure out which are the correct inputs
08:02
using less than two to the number of input bits. In reality, of course, if we have a bit that has 127 contributors of 128, we still cannot solve the problem, but hey, it's broken. But in other cases, when the cipher is awfully broken,
08:21
you will find that maybe one bit has one or two bits that are contributing to its output, and those are the ones that are easy to attack. So well, now is when we go for a simple demo. Yeah, now is when things start going wrong. So let's go for our eight-bit XOR.
08:41
So hopefully this will work, as I expect. Let's move this one here. And let's make this awfully large. So I developed a small Python library that implements most of these algorithms,
09:01
which you can find along with the other dark materials. So first, we need to import the implementation of the algorithms, which I'm going to do the old way. Let's see, copy.
09:22
And paste. So for example, here you can see that the XOR is basically the contribution of the concatenation of the list of every single bit, and rotate is the decision, and well, it's not something especially complicated.
09:42
So now we have, for example, the first cipher that we are going to implement, which is a simple XOR cipher, which you guys are probably familiar with, I hope. So as you can see, the only thing that we are doing is, first we generate the list of inputs,
10:03
and here you can see that we use the XOR of both inputs, and then we return the result of the XOR. So if we go and generate the state, and then we print the code for that state, and you can see how basically we can find
10:22
that bit zero has two contributors, which are inputs 00 and 10, bit one has two contributors that are 01 and 11, bit two has 02 and 12, bit three has 03 and 13. So by now, I mean, even if you don't know cryptography,
10:42
and you think that using an XOR cipher is secure, you should be able to figure out that it's not, actually the security that that thing is providing you, it's only two bits for every single output bit. So let's go for something slightly more advanced, so you can understand that this is an algorithm.
11:06
And yeah, I forgot to define the cipher. So this is the XOR cipher, which you guys are probably also familiar with. It's basically adding a constant to,
11:24
and now we use the XOR cipher. So copy and paste.
11:44
Okay, so as you can see here, we can see that the less significant bits, which in our case are bit number seven, only has one contributor. Well, contributions from bit seven of input zero and one. Bit number six actually has contributors
12:01
from bits six and seven from input zero and one, and it keeps increasing as we go to the most significant bit. So in this case, if we try to approach the attack of the cipher, we could first try to break out bit number seven, and then with the results we get from bit number seven that are considered valid inputs, we'll go to bit number six,
12:22
and then with the results from that to bit number five, and so on. Or well, since it's a simple addition cipher, you just try to find out which is the constant by having to plain text. But so far I'm trying to make it simple, so you can understand what is going on.
12:43
So now we are going to do for addition rotate XOR cipher with a single round. So you can see that that actually can be used with some slightly more real-world ciphers.
13:02
Okay, so as you can see now it's a lot more complicated to follow this, and when I was doing the demo with smaller screen, it was a lot easier. But, oh, fuck, okay.
13:24
So as you can see, basically we have a lot of contributors by the late members of the cipher, but if we go to state number one, we can see that the number of contributors gets reduced significantly. I mean, we have some parts where we have nine, seven, five, even three contributors,
13:41
which are on bit number three. So to start the bits on which we will start focusing the TAC in order to continue going for the more complicated ones. And now we are going to go for, well, the real hard stuff. I mean, I have shown you how if you have one rabbit
14:01
and you have another rabbit, then you have two rabbits. That's why one plus one is two. That's simple addition. So now I'm going to try to show you the other one. I think you should be able to do it by now. I mean, you have learned everything you need. So in this case, I have an implementation
14:21
of the first version of Petya cipher, which is the one that motivated me to start doing hard disk research. So as you can see, it's not very complicated. It's pretty much like Salsa. Basically, we have a loop on which
14:42
we will change the state number four, which will be the Petya mix function of states four, zero, 12, using constant number seven. And as you can see, it's basically an XOR of a cast to 16 bits, rotate of 32, cast to 32 at 16. I mean, this is pretty much hard to follow,
15:02
but if you use this implementation replacing XOR 16, cast 16, etcetera by the proper implementation of these primitives, you will get exactly the same results as if you ran Petya, which I hope you never have to.
15:21
So for those of you that don't know, Petya is a ransomware that runs from the boot sector. And well, it's very interesting, I guess, at least for me. So well, as you can see here, we are not showing the list of contributors
15:41
because in some cases it's awfully huge. I mean, we have like 228, 235. So it would be really hard to understand, but here we can see that, for example, for state number 14, bit number 15, we only have one contributor, and here we only have two, three, five, six, and even seven.
16:03
That is actually why the original attack that you set the evolutionary algorithms in order to try to find the optimal solution did work because some of the bits were so little influenced by the other bits that they were pretty much plain text.
16:22
So you could use the XOR of the plain text that you already known with the result of the cipher, which was basically the key without being mixed with anything. So hopefully this can show you that well, this kind of stuff can be used for something that is real world. I mean, I do have two more examples,
16:42
which I think we still have some time to analyze, which is SASA-2, which is basically two rounds of SASA. So let's load SASA-2. And let's run SASA-2.
17:06
And yeah, well, please ignore the bug. That's what happens when you do presentations late at night. Anyways, you can see that not all the input bits
17:21
are contributing to all of the output bits. I mean, some of them get 255 contributors, which is less than the full algorithm, while some of them get 228, which is still less than it should be. But instead, if we pick SASA-20, we will see that this actually does the mixing
17:40
the way it should. And all of the bits have 512 bits of input. That's supposing I have not messed up the algorithm.
18:06
So no, there is something that is going awfully wrong here, and I'm not sure what it is. Ah, yeah, I forgot to define the mixing function.
18:22
Okay, so I'm sorry, this is the problem of doing a live demo, as you probably know, so okay. So now we have the mixing function. So SASA-2 should also work the way it should now.
18:41
Yeah, so as you can see, most of the inputs only get 480 bits of contribution, which is less than the 512 bits that the state contains. And in some cases, we have almost 300 bits of contribution. I mean, of course, we cannot attack the cipher with this information, but we already know that it has flaws, and that if we use
19:02
monadvance script analysis techniques, we might be able to break it. In the case of SASA-20, we will instead find that all the input bits are contributing to all the output bits, and therefore, it cannot be broken easily. Yeah, as you can see, it takes a little bit more, but now you can see that all the input bits
19:23
are contributing to all the output bits, because all the results that we are getting are 512. And because of that, it means that every single bit of output gets at least a tiny contribution from every single bit of input, and therefore, that means that it has a good diffusion,
19:43
and it's hard to attack, at least using a diffusion-based approach. So, well, that's basically the hard demos. So now we are going to use, too, a small comparison of both approaches. We have on one side the algebraic approach. So, well, we will try to break the cipher for all keys
20:04
by finding out the simplification of the algebraic representation using adhesions, rotates, exos, and we might map them to real logic. Well, a question I'm receiving, it's quite hard. Please raise your hands if you really enjoy it,
20:22
the algebra courses on your high school or on your university. Yeah, I think that, no, nobody raised their hands, so you can understand why nobody does this kind of algebraic approaches, good. And, well, simplification, at least, if you want to do it with a computer, tends to be painful. It's usually an NP problem.
20:42
So instead, with contribution analysis, we have something that is simpler to reason with. I mean, I think that everybody here can understand the concept of, okay, we have some inputs, we have some outputs, and we are going to see how these inputs affect the outputs. It's also successful if we have a really broken cipher,
21:05
and, well, we don't care about how the contributions are made, we don't care if it's an AND, if it's an XOR, we just care about this input bit is affecting in some way this output bit. And because of that, it's really fast to run, because we only need to run the cipher,
21:20
and we will only need to do M operations, where M is the number of bits that the input of the cipher has. So it has some problems, of course. It's awfully less precise than algebra, and it's also less precise than rotational cryptanalysis. If I had run, for example, SARSA2 with a rotational cryptanalysis approach,
21:42
I could have found much bigger flaws than I have found with this approach. And it will only find blatantly broken ciphers. I mean, not expected to find the next flaw on ERC4, because it will not. It also has a lot more false negatives. That means that it will find a lot of ciphers
22:02
that you will think are secure when they are not. So this should be used mostly for pen testers, by pen testers and people that is doing reverse engineering of malware, in order to try to understand which are the flaws of the ciphers they are analyzing. But if you work as a cryptanalyst at a really big government company,
22:20
you cannot say, yeah, my cipher is totally secure against contributor analysis, because it most likely will not be secure against something else. So, well, that makes the presentation. There is a lot of people to thank here. My mom and dad, which are not watching this. The people that is making recon possible
22:41
that you have over there, they really deserve a round of applause. Please, come on. And, well, there has been a lot of people that has supported me whilst I was doing this, and while my company actually allows me to take leave of absence, I don't get paid for doing research, but I can do research because I don't have to work
23:00
whilst I'm doing research. But the most important, thanks to you for your attention. So, I think we have time for like, for like one or two questions, I guess. If anybody wants to ask something, because I cannot see if you are raising your hands or anything.
23:21
Let's see. Oh, now I can see. Cool. Have you tried searching for, for example, you have two contributors, bit one and bit two, but if you XOR them, then these are out of the contributor list.
23:45
Sometimes the XOR of two bits doesn't affect if you have two ones or two zeros, it's the same. Yeah, you are talking about when bits
24:00
what's it called, negate each other. For example, if I do an XOR by a specific bit, and then XOR it again, true? This cannot be detected by contributor analysis. You will need to use an algebraic approach if you want to detect that, because it's really hard to follow up how the different bits are contributing to the algebraic. So, what this will do is that,
24:23
in this case, you will find bits that you think that contribute to the cipher when they don't. So, that's why I said that you should use this in order to find flaws and cipher, but not in order to test that ciphers are secure. Not sure if I am explaining myself.
24:40
Okay, any other questions? I want to believe that this is because I'm really good at explaining, otherwise here you have a duck that will do the explanation a lot better than I do. Not because you didn't understand a single word that I said. Anyways, yeah, as I said, I hope you are found by now. That's your code if you want to find
25:01
all the talk material and that URL. I saw somebody that was taking pictures, so please, now is the right time to take the picture, you know? Okay, so thanks a lot for your time, and well, I hope you enjoy the rest of the conference.