The Remote Metamorphic Engine
Formal Metadata
Number of Parts | 93
License | CC Attribution 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/36213 (DOI)
Transcript: English (auto-generated)
00:00
We have a super interesting talk for you up next by Amro Abdelgawad. He's a security researcher with Immunity. Sorry, my speech is getting a little slurred today, and I'm not even drunk. So, without further ado, I'm going to let him kick off his talk. Let's get a big round of applause for him. So when I
00:32
started working on the remote metamorphic engine research, the last thing I was thinking about back then was metamorphism. Alright, is it ok now? So, the last thing I was
00:55
thinking about when I started working on the remote metamorphic engine research was metamorphism. I wasn't thinking about metamorphism at all. I was mainly trying to
01:06
create unbreakable code, a piece of code that cannot be reverse engineered, cannot be tampered with, cannot be analyzed. So, anyway, I'm going to talk about reverse engineering. Anyone who has even limited experience with reverse engineering will know
01:21
that it's actually not possible to create unbreakable code. It's possible to resist reverse engineering, but it's really impossible to create a piece of code with resistance so high that it cannot be analyzed. So I researched the subject, I read a lot of papers, I learned about a lot of
01:41
obfuscation techniques, amazing obfuscation techniques, and I applied a lot of techniques, but unfortunately I failed. The sequence of failed attempts kept going up until the moment that I decided to simplify the problem and treat the problem as a security
02:01
problem. Well, at first, that actually turned out to complicate the problem rather than simplify it, because unfortunately we don't know what security is. We know what security is not, and we learn about weaknesses based on knowledge of vulnerabilities, and we learn about strengths in security based on weaknesses and defined
02:23
weaknesses, but we really don't know what security really is. If you had asked me 15 years ago, when I started my career, what security is, you would have lost your whole day receiving my response to your question. Today, if you ask me the same question, you would really get me thinking. Something amazing about security is that the more we learn
02:45
about it, the less capable we are of defining it. But that's no wonder, as at a certain point in my research I came to a very satisfying view of security: that security is actually meant to be undefined, that security is all about undefined
03:03
expressions, undefined expressions that aim to take probabilities out of the equation that you're trying to secure. So the remote metamorphic engine research on resisting reverse engineering started by defining security as an undefined expression, taking the
03:23
lessons learned from there and applying them to the binary protection problem, and that resulted in flux binary mutation, and only then, with flux binary mutation, did I start to have satisfying results for resisting reverse engineering. But resisting reverse
03:40
engineering, or resisting a reverse engineer, turned out to be not enough. Once you have high rates of satisfying results in resisting reverse engineering, you will then realize that the problem is much bigger than that. You also need to account for automated tools, you need to account for AI tools or machine learning, and then the problem is not
04:01
only about the code, the problem is also about data: how you're gonna secure the data, input, output, even the data while it's being processed inside the code. So that resulted in the end in adding techniques to secure the data and the code, which resulted in a sort of artificial immunity. So that's the outline of the presentation
04:22
today, and the remote metamorphic engine is actually the name that I've given to the approach, which is a new approach for resisting reverse engineering, but it's more like a new approach rather than an actual engine, and I decided to name the
04:41
approach like that because it's not possible without metamorphism or mutation, and it's not possible without having the engine, or the morphic engine, isolated remotely, away from the reverse engineering environment. And I'm applying all of this using very simple techniques that I'm gonna go through with you today,
05:04
within the coming few minutes. So I define security as an undefined expression following a very simple analysis flow. We don't know what security is, but we know about a lot of successful security solutions, so if we analyze these successful security solutions enough, we
05:25
can then find patterns that keep repeating themselves everywhere, and by defining these patterns, probably that will help us to better understand security and to better approach security. So that's exactly what I've done. If you ever do
05:45
that, you will find something pretty interesting: that randomization and isolation are two major patterns that we rely on in everything, to the degree that if you take these two patterns out of the equation, or out of any security solution, security is not gonna be possible. Randomization, let me give you an example, like stack buffer
06:04
overflow protection, where we insert a random number and then we check on the random number when the function returns; address space layout randomization to disable jumping into hard-coded locations; encryption, asymmetric and symmetric
06:22
encryption; almost everywhere you will find random numbers, and without random numbers we cannot apply a lot of security solutions. And on the other hand, a much stronger security pattern is isolation, and if you just treat these patterns based on their meaning, randomization and isolation, you wouldn't really gain much from their meaning. But analyzing
06:46
them in abstract mathematics, you can then isolate the patterns away from their meaning, and then you would be able to extend their strength and find their ultimate strength, and then you would be able to move freely with these patterns to apply them to other
07:02
problems. So I've done just that, and I arrived at defining randomization, in its ultimate security pattern, as division by infinity, in an inverse relation to probabilities, where you increase the random number as much as possible to reduce probabilities as much as possible toward zero. And on the other hand, I
07:27
defined isolation as a division by zero, which is an undefined mathematical expression, and by then you can actually take probabilities totally out of the equation. So only then, when I returned to solve the binary protection problem, only then I started to see
07:44
another dimension to the problem. From that perspective, all the research that I read about and learned about, and all the failed attempts that I went through, were all going toward infinity, in an attempt to increase the time and effort needed, in order to disable
08:02
reverse engineering or to resist reverse engineering. So how about going toward zero? In this case, instead of increasing the time and effort needed for reverse engineering, we will just reduce the time as much as possible toward zero. So allow just a few
08:22
milliseconds for the code to be executed. And by that, there will be no way for the code to be reverse engineered. Imagine that you are just generating code that will be valid only for six milliseconds: there will be no way that it will be reverse engineered unless it is saved and then analyzed, and then a reverse engineer will
08:42
come back to try to attack the system based on knowledge of the previous execution. But then, if you do that, the code has expired, the code is not gonna be used anymore, so you need to generate new code. And hence we need to have a metamorphic engine. But not a metamorphic engine from the perspective of viruses or malware, where they use
09:03
metamorphism just to change the way the code looks, or to change the pattern, or to change the signature of the code. We actually need to have something more like mutation engines than just morphic engines. So I went on and defined the unbreakable
09:23
code as unpredictable code, code that cannot be determined, that keeps on changing, and that cannot be anticipated before it gets executed. And while trying to apply the randomization and isolation techniques that we talked about, I
09:45
found that the major weakness is actually static code, dynamic data, which is the model that we are used to everywhere; this is the way we learn how to program, this is the way we learn how to develop our software. The code is static and the data is dynamic,
10:02
and that enables all sorts of reverse engineering and also enables all sorts of replicable software exploits. So I tried to change that model into dynamic code, dynamic data, where the code will keep on changing and keep on evolving while it's being executed, and the code will not remain the same, will not remain
10:23
static. So if you look at the code now and try to analyze it, it should look totally different from the way it looked just a minute ago or a few seconds ago. So: locating the code, analyzing the code, and then breaking the code. In order to have very
10:41
high rates of RE resistance, we need to resist all these stages. We need to make the code unlocatable, so we're gonna do that by storing the code remotely and performing remote execution. And we're gonna disable analyzing the code by using flux binary mutation and by setting the lifetime of the code to a few milliseconds. And then
11:06
we'll make the code unbreakable by allowing the code to know more about itself, so that it would detect any tampering attempt, if any happened, while it's being executed. So the remote metamorphic engine architecture looks as you see here: we will divide the
11:27
engine into two separate areas, a trusted area and an untrusted area. The trusted zone here is the area where the reverse engineer has no access. And the untrusted zone is where the reverse engineer would have access. We'll have the mutation engine stored in the
11:46
trusted zone, where it will keep generating code and keep mutating the code, and then push the code to be executed in the untrusted zone by using a challenge-response metamorphic protocol. So why remote? Why do we have to store the engine, the
12:03
metamorphic engine, remotely, away from the reverse engineering environment? Well, if you keep the engine in the reverse engineering environment, a reverse engineer will just simply go and reverse the engine itself and will break the engine. So in order to secure the engine itself, the engine has to be stored in a secure area and
12:24
an area where the reverse engineer doesn't have any access. But you don't really have to do that. It all comes down to what you're trying to defend against. If you are trying to make the code keep morphing itself and keep changing because you're trying to defend against malware or external intrusion, you can,
12:44
you can then have the engine in the same trusted area. But today the presentation is mainly focused on resisting reverse engineering rather than using metamorphism to secure the trusted environment. So,
13:06
the remote metamorphic engine is based on a challenge-response communication protocol that is made of morphed machine code rather than data. And the protocol is pretty simple: the trusted area will push 4 bytes of
13:25
code size and then will push the morphed code, and then the untrusted area will just receive the code, execute the code, and then respond back to the engine. And the trusted area and untrusted area here can be client
13:43
and server, it can be kernel and user mode, it can be guest and host machine, it can even be something like an oil reader in an oil field where you need to check its integrity to make sure that it's secure and not tampered with. And on the other hand we
14:01
should also count the offensive approach, where, in this case, malware can use the trusted area as a command and control server and the untrusted area as the infected machine, which is untrusted because reverse engineers will have access to it. So by
14:22
doing that we would actually split the execution flow into two different areas: an area that the reverse engineer has access to and another area that the reverse engineer doesn't have access to. And that on its own will create a lot of challenges for any reverse engineer, because the reverse engineer will never see the whole picture, will see
14:40
just, you know, parts of the code being executed, and wouldn't even be able to determine what decision is gonna be made based on the responses or return values. So here is a list of samples of the challenges that can be pushed through that protocol, here mainly focused on challenges that will check on the integrity of the environment, such
15:04
as in-memory code integrity checks, execution environment integrity checks, detecting hooks, or trying to determine whether the execution environment is real, emulated, or instrumented; clock synchronization, where clock
15:22
synchronized challenges are actually empty challenges with no functionality at all: you just create a challenge that has to be executed to be solved and then push it to the untrusted area to make sure that it's not being analyzed; and then detecting virtual machines, or collecting hardware IDs to check on the integrity of the
15:44
hardware. So once you start to work on these challenges you will find that not all the challenges have the same strength: some challenges will be easy to break and some challenges will be much harder to break from a reverse engineering perspective.
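As a rough illustration of the exchange described above (4 bytes of code size followed by the morphed code, with a clock-synchronized deadline on the reply), here is a minimal Python sketch. It is not the speaker's implementation: the function names, the little-endian byte order, and the 6 ms budget in the usage note are assumptions, and the challenge body is treated as opaque bytes that are never executed.

```python
import struct


def frame_challenge(code: bytes) -> bytes:
    """Prefix the morphed code with its size as a 4-byte integer.

    Little-endian is an assumption; the talk only says "4 bytes of code size".
    """
    return struct.pack("<I", len(code)) + code


def parse_challenge(frame: bytes) -> bytes:
    """Untrusted side: read the 4-byte size, then return exactly that many bytes."""
    (size,) = struct.unpack_from("<I", frame, 0)
    body = frame[4:4 + size]
    if len(body) != size:
        raise ValueError("truncated challenge")
    return body


def response_in_time(sent_at: float, received_at: float, budget_ms: float) -> bool:
    """Trusted side: accept a response only if it arrived within the time budget.

    Timestamps are in seconds; both are taken on the trusted side's clock.
    """
    return (received_at - sent_at) * 1000.0 <= budget_ms
```

For example, `parse_challenge(frame_challenge(b"\x90\x90\xc3"))` returns the original three bytes, and `response_in_time(0.0, 0.007, 6.0)` is false because the reply arrived after a hypothetical 6 ms budget. Measuring the deadline on the trusted side only is a design choice of this sketch: a clock on the untrusted side could itself be tampered with.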
16:03
The more your challenges rely on the CPU, the execution, and the process itself, so that they only execute inside the process, the more solid they will be. But if you're gonna create challenges that communicate with the operating system or resolve or communicate
16:23
with APIs in the system, these are weak challenges that can be fooled easily. So that makes the approach, if it's gonna be used by any malware, weak enough to be analyzed. The only trick will be just to know that the malware is analyzing the
16:42
execution environment while you're analyzing the malware, and to make sure that you don't slow down the malware while it's being executed, so it will reveal all its functionality. So in order to
17:00
secure the challenges, we will use morphing techniques: we take the function that we need to execute and then mutate the function in a manner such that the challenge cannot be solved unless the code is executed. And the way we will do that, we will have to rely on other morphic
17:24
techniques, not just the malware morphic techniques. We need to use mutation techniques that will change the functionality of the code, not just the code structure or the code semantics. So here's how we
17:41
can, here's how I'm creating these challenges: applying the morphic techniques to the function, changing the function structure totally, and then adding a head and a tail to the function. The head is just unused instructions and the tail
18:01
is where we will perform response mutation. So every time the function gets executed, it will return a different response. Here is a sample of the code being executed. Challenges are being generated and sent to be executed in the untrusted area. And as you can see, with the encrypted response, every
18:24
time the challenge is executed, it will return a different return value, and then the morphing engine will receive the response and decrypt it back to its original value. And the main reason why you need to do that is to make sure that no one can fool the responses or can hook into the response and
18:44
then just send fake responses. So, in order to perform mutation for every single challenge, I'm mainly using reversible
19:02
instructions, and what the reversible instructions do is that, before the function returns, a set of instructions will take the return value and mutate it. So the remote morphic engine would generate a mutation key and then use that key to create a dynamic encryption
19:25
routine, and then insert the encryption routine at the end of the function to encrypt the response, and then once the response is returned to the remote morphic engine, it will use the same mutation key to generate a
19:42
decryptor that will decrypt the response and return it to its original value. So here, as you can see, these are samples of the mutations that we will use to mutate the response, and as you can see here, we're using a set of
20:04
reversible instructions like addition, subtraction, XOR, and bit rotation to the left or to the right, and then once the response comes back, you apply the opposite instruction to reverse the response back to its original value.
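The response mutation described above can be modelled in a few lines of Python. This is only a sketch under stated assumptions: in the talk the reversible operations are machine instructions appended to the tail of the morphed function, whereas here they are applied to a plain 32-bit integer, and the names `make_mutation_key`, `mutate_response`, and `restore_response`, along with the key length of 6, are hypothetical.

```python
import random

MASK = 0xFFFFFFFF  # model the return value as a 32-bit register (assumption)


def rol(value: int, count: int) -> int:
    """Rotate a 32-bit value left by count bits."""
    count %= 32
    return ((value << count) | (value >> (32 - count))) & MASK


def ror(value: int, count: int) -> int:
    """Rotate a 32-bit value right by count bits."""
    count %= 32
    return ((value >> count) | (value << (32 - count))) & MASK


# Each operation is paired with its inverse: name -> (forward, inverse).
OPS = {
    "add": (lambda v, k: (v + k) & MASK, lambda v, k: (v - k) & MASK),
    "sub": (lambda v, k: (v - k) & MASK, lambda v, k: (v + k) & MASK),
    "xor": (lambda v, k: v ^ k, lambda v, k: v ^ k),
    "rol": (rol, ror),
    "ror": (ror, rol),
}


def make_mutation_key(rng: random.Random, length: int = 6):
    """Trusted side: pick a random sequence of (operation, operand) pairs."""
    names = sorted(OPS)
    return [(rng.choice(names), rng.getrandbits(32)) for _ in range(length)]


def mutate_response(value: int, key) -> int:
    """What the function tail would do to the return value before returning."""
    for name, operand in key:
        value = OPS[name][0](value, operand)
    return value


def restore_response(value: int, key) -> int:
    """Trusted side: apply the inverse operations in reverse order."""
    for name, operand in reversed(key):
        value = OPS[name][1](value, operand)
    return value
```

With a fresh key per challenge, the same underlying return value produces a different response on the wire each time, and only the side holding the key can map it back: `restore_response(mutate_response(v, key), key) == v` for any 32-bit `v`.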
20:27
So once you start to use these instructions, and as you can see here, if you use these instructions only in the tail, any AI or reverse engineer trying to analyze the code will be able to detect that this set of instructions is actually in the tail of the function. So in
20:46
order to prevent that, we have to use the same exact instructions in the body of the function, we'll use them in the morphing techniques that we apply to the function, and we'll also use the same instructions in the head, where we
21:00
will insert useless instructions. This is how anyone analyzing the function will not be able to determine the beginning, the end, or the body of the function. So here, after performing the morphing techniques that I'm gonna show you now, what you need to do is to stop any reverse engineer or AI, or any
21:22
automated tools trying to analyze the code; you need to stop them from determining the beginning, the end, or the middle of the function by mixing the same instruction sets everywhere in the function. So, the mutation techniques that we will need
21:42
to use to resist reverse engineering are totally different from the mutation techniques or the morphic techniques that malware would use. Malware uses polymorphic techniques to evade antiviruses, simply by encrypting the code, and then when the code executes it will unfold in memory
22:03
and then decrypt and be in its original form. On the other hand, they use metamorphic techniques to make the code operate in the same exact way but look totally different and use totally different instruction sets, so no encryption is used in metamorphism. Here are some of the techniques that are
22:24
used by malware, metamorphic techniques which are all aimed at changing the structure of the code or evading signatures. But what we're trying to do here is not really to evade signatures; we actually need to resist reverse engineering. If you make
22:41
a few changes to the code to make it look different, a reverse engineer can still easily determine what's going on. But more importantly, because we need to evade artificial intelligence or any automated tools, we need to use morphing techniques that will make it expensive for automated tools in terms of time, rather than just
23:08
changing the signature. So the flux mutation goals that we're gonna have are to extend the trust, where the notion is that if you have a few milliseconds of trusted execution, we
23:24
need to be able to extend that trust from a few milliseconds to cover the whole untrusted area by checking on the integrity of the execution area; and to ensure trusted execution, where we need to make sure that the challenges that we're creating will not be
23:41
solved unless the code gets executed. And while doing that we also need to prevent the code from being emulated or instrumented. And we also need to detect and evade reverse engineering while the code is being executed. So in order
24:03
to make it expensive for any automated tool or reverse engineer trying to reverse engineer the code, we need to use morphing techniques that will require time for any automated tool to analyze the code. So I'm using here mainly structure obfuscation. Every time the code is morphed, the structure of
24:26
the code will be totally different. And I'm actually changing the structure of the code not by making the structure look different, but by making the structure look the same. So all the functions, while they are being morphed, though they are
24:43
different functions, at the end will all look the same. And that's how you can make it harder for any automated tool to determine the difference between the functions, and so harder for any automated tool to attack the code while it's being executed. So these are actually basic blocks: we take a
25:02
function, a simple function, and then we morph it into thousands of basic blocks that all look the same. And these basic blocks will be totally disconnected; there are no edges connecting these basic blocks with one another. And then we'll
25:21
have to use self-modifying techniques, where every single basic block, when it is executed, will modify itself and connect to the next block only after it gets executed. So here, as you can see, the reverse engineer will just receive the code as
25:41
totally disconnected basic blocks, and only when these blocks start to get executed will they start to connect to one another. And that's how we can make it more expensive for any reverse engineer or automated tool to analyze the code. In order for any automated tool to analyze that code, it has to emulate the execution of
26:04
these basic blocks or analyze them, so it will be expensive at the end of the day in terms of time. And then at the end you wanna make sure that the challenges will only be solved if the code gets
26:20
executed natively, or even on an emulator, but not on an instrumentation tool or any tool that tries to understand the code or break the code while it's being executed. So here is a sample of these basic blocks. Every basic block here is
26:46
actually just one instruction. And the morphing technique that I'm using is actually taking every single instruction, as you can see up here, this is the original instruction, and then transforming every single instruction into a basic
27:04
block. And that basic block will encrypt the instruction, and then the only way that the real instruction will appear is by executing the basic block. So the morphing techniques
27:25
that we can use to resist reverse engineering in the context of clock synchronization, where you're allowing the code to execute only for a few milliseconds, are totally different from the morphing techniques that are used by malware to change the structure of the code
27:42
or to evade signatures. So here is the list of the techniques that I found to be very helpful. We need to use metamorphic techniques plus polymorphic techniques. You cannot really rely on metamorphism only; you need to use polymorphic techniques. And why do you need to use polymorphic techniques? Because you have to do self-modifying code,
28:02
you have to generate self-modifying code to make it more expensive in terms of time for any automated tool to reverse engineer the code. And you need to do code structure obfuscation so that no AI could determine which function is which, because not all the
28:22
challenges will have the same strength; some challenges will be weak and some challenges will be strong, so you need to prevent any reverse engineer from determining which function is which, and the way we'll do that is to make all the functions look the same and have the same structure, as I showed you. And then challenge
28:40
response mutation, which we talked about in the beginning: you're generating code that will expire in a few milliseconds, so you need to actually make the code function differently, because if it didn't function differently it could easily be faked. So every time we execute the function it should function in a different way and return a different return value, so it will transform into a real challenge. And slices
29:10
permutation is where, if you have 100 functions and you send these functions to be executed in the same sequence, that will be a weakness as well; you have to morph the
29:20
sequence of the functions, so every time you send functions to be executed you will have to rearrange the sequence of these functions. And code size magnification is also very important here, because if you're expiring the code in a few milliseconds and you're trying to determine if the code is being executed
29:44
natively or if the code is being analyzed, if you're sending just a few instructions it will be so hard for you to determine the difference. So you have to magnify the code; you have to magnify it enough that it will enable you to determine, or you will have a larger
30:02
difference between the code being executed natively and the code being emulated, and by saying emulated here I don't really mean an emulated CPU but rather instrumentation, where the code is being instrumented while it's being executed and then a reverse engineer or an AI would patch or tamper with the code while it's being
30:23
executed. So here's a sample; this is just a very simple function that you can define in the remote metamorphic engine. This function is just checking if a debugger is connected to the process, and I've chosen this function because it's just very short and
30:41
small enough to fit on the screen. So we'll start the morphing techniques by inserting useless instructions, unused instructions, and then after inserting the unused instructions and randomizing, every time you would insert a different set of
31:03
instructions; this is how you can change the structure of the code a little bit in the first stage. And then here we can also use expansion, so you can replace one instruction that does a memory operation with a different set of instructions that perform the exact same operation but in a different way and using
31:22
different instructions. So after the first morphing stage you will reach that code, and then in the second morphing stage, because we need to make it harder for any reverse engineer to analyze the code, we actually need to change the structure of the code and make it
31:50
much harder for any automated tool to hook into any part of the code, so every single instruction should look totally different and should be moved into a totally
32:02
different position inside the function or inside the binary code. So what I'm doing here is actually taking that code and then inserting a label at every single instruction and then inserting a jump after every single instruction, and by that I'm
32:22
totally free to move any instruction anywhere, because the sequence of execution will always be the same. So this code here that you see is exactly the same as that one. By inserting labels and inserting jumps after every instruction to the next one, you are free to move any instruction anywhere, or you are free to move any of these basic blocks
32:44
anywhere. These two are exactly the same. Just doing transposition here. So after doing that and changing the structure of the code and doing
33:01
transposition, moving every single instruction to a different location, we will take every single basic block and then use polymorphic techniques to transform this basic block into self-modifying code. So that basic block, we're gonna take it and
33:22
transform it into that morphed basic block, which is a self-modifying basic block. As you can see, this is a randomly generated mutation key that I'm using to create random encryption and decryption routines. And here, as you can see, this is randomly generated
33:43
self-modifying code, so every single basic block, which is actually every single instruction, will be transformed into a self-modifying basic block on its own. And here is a helper function that just helps to determine the location of the code in
34:00
memory. And as you can see, this part will actually be transformed into self-modifying code, and it will modify itself while it's being executed to unfold into the original instruction or the original basic block. And also the jump that you see here, the jump
34:26
that is after the instruction, is also gonna be morphed, so any AI or reverse engineer trying to understand which instruction will be next wouldn't be able to know, and any automated tool trying to analyze the code wouldn't be able to know which
34:41
instruction will be next until the code is executed and the code self-modifies and decrypts itself and then jumps to the next basic block, and then the next basic block will self-modify and decrypt itself in memory and then reveal the next instruction, and so on. And we need to do that because we only have a few
35:02
milliseconds for the code to be executed, and we wanna make it very expensive for any automated tool to try to solve the code within that allowed time frame. These techniques might not be that interesting if you just give the reverse engineer as much time as he wants to reverse engineer the code, but in the context where the code has to be
35:24
executed in a few milliseconds these techniques are very helpful, and for sure it's gonna be helpful if you add multiple layers of self-modification. So at the end the code will be morphed into these structures here; as you can see, these are
35:43
like three basic blocks, so every single instruction will be transformed into these randomly generated basic blocks. Here is a sample of the code being executed and self-modifying,
36:00
just to give you a feel: all the basic blocks will end up looking exactly the same, and while being executed they will change and they will connect and they will unfold in memory; as you can see here, this code will change now and reveal that
36:40
instruction. Sorry. So these morphing techniques, if they're used normally, you know,
37:16
just to morph a piece of code, wouldn't be helpful at all, but the point is that the
37:21
code will have only a few milliseconds to be executed and to respond back to the remote metamorphic engine with the right response. So here, as you can see, these are four different generations, four different samples of the same function being morphed, and every
37:46
time the function is morphed it will respond back with a totally different return value, and then the engine will receive the return value and decrypt it and return it
38:03
back to its original value, and as you can see here the response time is six milliseconds. Here I was connecting the trusted and untrusted areas, the remote metamorphic engine and the client, on the same machine, on localhost; if
38:20
you're gonna connect it remotely it might be much longer than that. And here, as you can see, every time the code is morphed it will result in a totally different code size, so the code size will be different, the structure will be
38:41
different, and the sets of instructions that are used will be totally different. As you can see, the original code, if you just assemble it, will be around maybe 30 or 40 bytes, and here these 30 or 40 bytes of the original code are being transformed
39:05
into 15,000 bytes, or around 5,000 or 6,000 instructions. Now in order
39:30
to determine if the code is really being executed, or if the challenge has really been solved without being tampered with or without being reverse engineered or without being
39:43
emulated or instrumented: if someone just hooked into the challenge response protocol and then tried to just set the return value to any value, trying to fool the
40:02
protocol, the remote metamorphic engine can easily detect that. So any immune system in nature is actually based on knowing the self, so the remote metamorphic engine actually tries to learn and know about the self by comparing the responses or the
40:24
challenges that are returned, comparing them to the previous executions. So we have the same function, we would execute it maybe 5 times, and every time the code will look different, the structure will look different, and the function will actually include totally different functionality because it's mutated, and every time the function gets
40:43
morphed and sent to be executed it will return a different return value, but then the engine will use the randomly generated decryption or mutation key to solve the challenge and return it back to the original value. So as you can see here, if anyone
41:02
tries to tamper with the code while it's being executed, or tries to tamper with the response while the response is being returned, by simply hooking into the response and then sending a fake response, the engine can easily determine that by comparing the
41:21
responses with the previously returned responses. So as you can see here, these are 7 different mutations of the same exact function; every time the function is sent to be executed it will return back a totally different value, and then all these values
41:44
should decrypt back to the same exact value, so the remote metamorphic engine can determine that there was a tampering attempt if the decrypted value results in a different value than all the previously generated code. On the
42:04
other hand, the engine is able to determine if the code has been executed natively or in a healthy way, compared to being instrumented or analyzed, based on time. So in these challenges, for example, we're allowing 500 milliseconds for the
42:24
code to be executed and to return back, and then the engine can determine if the code is being analyzed or emulated or instrumented if the response returns back in a higher time frame than the allowed time frame. So the way I actually set
42:47
the allowed time frame, I'm actually doing it manually by executing the code natively and then measuring the execution time of the code, and then I would allow, because every
43:02
time you send the code to be executed it will return back in a different time frame, so you have to allow a time frame because the code will keep on fluctuating, so I'm mainly allowing the average of the execution time multiplied by a factor of 3 to 5, and this way is
43:25
based on the assumption that if the code is being analyzed there must be at least 2 or 3 instructions being inserted to analyze the code, or there must be at least comparison instructions. So as you can see here, these are all the same function, these
43:45
are all 7 different generations of the same function; every time it's executed it will return back in a slightly different time frame. The first time it returned back in 45 milliseconds, the second one in 65 milliseconds, and then in this sample I'm
44:03
allowing only 500 milliseconds for the code to be executed, and then once the return value returns back in a higher time frame the engine can determine that the code is being analyzed, and then it can act in a different way. In the case of malware using these
44:21
techniques, perhaps that would be the most tricky part of it. You know, malware wouldn't really be able to change its behavior, because the malware will still have to communicate with APIs, will still have to communicate with the operating system, so you can still determine or signature the malware based on behavior analysis, but the
44:41
tricky part is that the malware can be analyzing the reverse engineer while the reverse engineer is analyzing the code, or the malware can analyze the execution or instrumentation environment while the instrumentation environment is analyzing the malware, so that's just the tricky part, you know: once you
45:03
know about it it's gonna be so easy to bypass it, but if you don't know about it, malware can evade the analysis and maybe act in a totally different way, and in this case the functionality is not even stored in the executable file that you're analyzing; all the functionality is stored remotely, and then you wouldn't be able to go
45:25
to the next step in analyzing the code. So it's just this trick, you know, that malware can use to evade reverse engineering, so the technique wouldn't really add much to evasive malware, as much as it can be
45:45
used for defensive approaches, trying to prevent the code from being reverse engineered, or pushing integrity checking functions to check the integrity of the code or the integrity of the environment you're executing the code in, and this is actually the
46:03
most challenging part: how can you really trust any nodes that you are connecting to? How can you trust the integrity of things in
46:20
your network? How can you trust the integrity of the processes that are running in your network when you are relying on static code? If you're relying on static code, that static code can be hooked, can be patched, can be tampered with in memory, and then you wouldn't really be able to determine or trust its input or output. So the remote
46:46
metamorphic engine can be used to check the integrity of connected things and to ensure that things cannot be reverse engineered, or that it will be much harder for them to be reverse
47:00
engineered or tampered with. On the other hand, with execution time you can also determine that the code is being analyzed or faked if the response returned back in a lower time frame than the allowed time frame. In this case maybe
47:23
someone is trying to fake the responses and executing it on a much faster CPU, or performing any tampering attempts that would result in the response being returned very fast. So these two variants would help a lot to determine if the code is being executed
47:44
natively or it's being tampered with or reverse engineered. So I'm running out of
48:01
time, so if you have any questions you can feel free to email me; I'm gonna be using this email address for the coming couple of weeks, and with that, it's been an honor to be part of your day today. Thank you for joining me. Have a good day. Thank you.
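
The two checks described in the talk (every mutated copy of a function must decode back to the same value, and the response must arrive inside an allowed time window) can be sketched roughly as follows. This is only an illustrative sketch, not the speaker's implementation: the XOR-based "mutation key", the function names, and the size of the lower bound are assumptions made for the example. Only the idea of an upper bound set at 3 to 5 times the average native execution time, and of also rejecting responses that come back too fast, comes from the talk.

```python
from statistics import mean


def decode(response: int, mutation_key: int) -> int:
    """Undo the per-generation mutation. XOR is a stand-in chosen for
    this sketch; the real encoding is not described in the talk."""
    return response ^ mutation_key


def allowed_window(native_times_ms, factor=3.0, min_fraction=0.25):
    """Derive the accepted response-time window from native baseline runs.

    Upper bound: average native time * factor (the talk mentions 3 to 5).
    Lower bound: average native time * min_fraction (hypothetical value).
    """
    avg = mean(native_times_ms)
    return avg * min_fraction, avg * factor


def verify(response, mutation_key, elapsed_ms, expected, window):
    """Classify one challenge response as seen by the engine."""
    low, high = window
    if decode(response, mutation_key) != expected:
        return "tampered"   # does not decode to the known value
    if elapsed_ms > high:
        return "too slow"   # likely instrumented or analyzed
    if elapsed_ms < low:
        return "too fast"   # likely faked or precomputed
    return "ok"


# Baseline measured by running the code natively (made-up sample values).
window = allowed_window([45, 65, 50, 55, 60])

expected_value = 0x1234   # value every generation should decode to
key = 0xBEEF              # random key for this generation (made up)
reply = expected_value ^ key

print(verify(reply, key, 50, expected_value, window))
print(verify(reply, key, 500, expected_value, window))
print(verify(reply, key, 5, expected_value, window))
print(verify(0, key, 50, expected_value, window))
```

With the made-up baseline above, the average is 55 ms, so the window runs from 13.75 ms to 165 ms. Keeping a fresh key per generation is what makes a replayed or hard-coded return value decode to the wrong result.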