Fun with Mind Reading: Using EEG and Machine Learning to Perform Lie Detection
Formal Metadata

Title: Fun with Mind Reading: Using EEG and Machine Learning to Perform Lie Detection
Series: NDC Oslo 2016 (talk 42 of 96)
Author: Jennifer Marsman
License: CC Attribution - NonCommercial - ShareAlike 3.0 Unported: You may use, modify, and copy the work or its content for any legal, non-commercial purpose, distribute it, and make it publicly available in unaltered or altered form, provided you credit the author/rights holder in the manner they specify and pass the work or content on, even in altered form, only under the terms of this license.
Identifiers: 10.5446/51715 (DOI)
Language: English
Transcript: English (automatically generated)
00:04
Hello. My name is Jennifer Marsman. This crazy contraption you see on my head is made by a company called Emotiv. It's called the EPOC+. And I have been coveting this headset for a really long time.
00:22
I think it's amazingly cool because what it does is it actually reads EEG, or your brain waves. So I was fortunate enough to get one of these headsets. I had a whole bunch of crazy ideas of things I wanted to do with it. I won't go through all of them now because I still might try to implement some of them. But I have just a bunch of ideas.
00:42
But for the first thing, I'm really into machine learning. And so the first thing I decided to do with this headset was I took it and I put it on my husband. And I asked him a series of questions. And first, I had him tell me the truth. And then I had him lie to me. And what that did was gave me a labeled data set of here's what your EEG looks like when you're telling the truth.
01:05
Here's what it looks like when you're lying. And so then that enabled me to use machine learning to build a classifier to perform lie detection. So that's what we're going to talk about today. My name is Jennifer Marsman. I do work for Microsoft. I am not on the Azure machine learning team, although I have collaborated with them very closely.
01:26
I do have a blog, and I'm on Twitter. So if you have questions, or things that don't get answered today, feel free to reach out to me; I'm happy to answer additional questions. I know some people might have to leave since it's the last session of the day, so feel free to reach out afterwards.
01:41
All right. Oh, and I do want to mention, too, I'm not a neuroscientist at all. My background: I do have a master's degree in machine learning. It's kind of fun. Is The Big Bang Theory popular here? Yes, yes? Okay, good. I'm seeing some nods. All right. That's funny. So my husband has a Ph.D. And in my little group of friends, a lot of people have Ph.D.s.
02:00
And so I'm kind of the Howard Wolowitz of my group of friends with only a master's degree. So, yeah, you guys can make fun of me, too. So here's what we're going to do and what we could kind of figure out from what I did. But I don't want to make any claims to being a neuroscientist. I am not. All I have is a lowly master's in machine learning. All right. So what we're going to do. Let me start by just kind of level setting.
02:22
Like, what is EEG? What are we even talking about here? Essentially, it's a non-invasive way of measuring your brain activity, those electrochemical signals that make up your brain activity. And so what I'm wearing, for those of you who were here a little early and saw me frantically getting set up,
02:41
how this headset works, it actually has 14 little touch points resting against my scalp, and it reads 14 channels of data through little felt pads. And this is a wet sensor, so I don't need gel or anything like that. But I do use saline solution, the same thing I use for my contact lenses,
03:00
and put those on each of the felt tabs so there is a wet, moist sensor on each of these things. And so what it can do is actually then pick up that activity. And I do want to be completely clear here: it is not mind reading, just to be really clear. I once submitted this to a conference with the title being "Fun with Mind Reading," blah, blah, something like that.
03:24
And it was, I think it was a conference where they didn't have much of a sense of humor, because they sent me back a response that said, you know, that is not mind reading. And it's like, duh. But just in case it wasn't clear, I am not mind reading anyone. So what, I don't know what you're thinking if I put the headset on you.
03:41
I can't tell what you're thinking, which is probably a good thing. But what I can do is I can create mappings between patterns of EEG and an action, either in the digital world or in the real world. And so I'll show you more of that in one second. And I can also use patterns of EEG and apply machine learning.
04:03
And all machine learning really is, or what all supervised machine learning is under the covers, is being able to find the correlations between a set of features, so something that influences whatever it is that you want to predict, and then something that you want to predict, so a label. So supervised machine learning helps you find the correlations between, you know,
04:22
these features and how they affect this label. All right? So I can do that as well with this. So I can make predictions based on patterns of EEG data I've collected. All right. So before we jump into my work, let me just give you a quick demo so I can kind of show you why I fell in love with this device
04:41
and why I just had to have it. So I'm going to show you. This is a piece of software I did not write; it came with the headset. It's called the EPOC Control Panel. And ooh, I've got a couple of things going. There's a display that shows how good each sensor's contact is. And you can see I went yellow on a couple of them here.
05:02
So let me see; sometimes it helps if you stick a dab more solution on. I think it's this one right here. We're real-time debugging, guys. Real-time debugging the cyborgs. Nice. One, two, three, four. Let's see if I can make this one go. All right. We'll see if we can actually do this demo not operating at full.
05:22
Oh, there we go. Yes, yellow, yellow. Come on, just go yellow. I don't need green. Just give me yellow. You can do it. Come on, baby. There are so many pictures of me on the internet with bad hair.
05:42
All right. So we're going to call that good. All right. So let me show you a couple of things. And if it doesn't work absolutely perfectly, please forgive me. So what I have here, the first thing I want to show you is the Expressiv suite. This headset actually has APIs, a developer SDK, which picks up eye movements and facial movements.
06:05
So if I smile, my regular big smile, he should smile, too. And then if I look, you have to tell me if this works. I'm going to look this way. Is it working? Is he looking that way? No? Okay. Yeah? All right. Let me do some blinks.
06:21
Blink, look this way. Look that way. Blink, blink. Look this way. Raise your eyebrows. Is it kind of working? Is it kind of, all right, yeah, yeah. So you guys are seeing it now? Awesome. So this should tell you a couple of different things. It should tell you that EEG is affected by your facial movements, right?
06:41
Because if it's sensitive enough to pick up my facial movements, the sensors are able to pick up those sorts of things as well. So just keep that in mind for later. Now, this is what really got me excited. We also have the ability to use pattern matching, right? To use patterns of EEG in order to drive actions. So what I'm going to do is train this cube to move with the power of my mind.
07:06
So what we're going to do is, first of all, I need to take a neutral brain state. So get a sense for what my brain is like at rest. Because there may be different levels of brain activity. Like, my brain activity might spike at, like, 24. And my husband's might spike at, like, 3.
07:23
So we need to be able to account for things like this. So first of all, let me go ahead and I'm going to hit train. And I'm just going to go to my happy place. It takes eight seconds to train. So there's going to be an awkward silence. And we're all going to be okay with it, all right? And I'm just going to go all zen and just try to relax with a million people staring at me.
07:42
And then we'll get a sense of what my brain's neutral activity is like, okay? Here we go, eight seconds. I'm also going to close my eyes. So will somebody just yell when the eight seconds is up? I'm not kidding. Please, somebody yell. Just be like, it's done. And then I'll be done. Okay, here we go. Happy place. And eight seconds, go.
08:11
Thank you very much. All right. So it updated that. So now what I'm going to do is choose some movement. Now, there's different types of movements you can do. I'm actually going to do pull because I think that one's the most visually interesting, like seeing the cube move towards you.
08:24
So I'm going to envision, now I'm going to think, for the next eight seconds, I'm going to think pull with my mind and be all like Jean Grey, all right? And so I'm going to think pull with my mind. And then what that'll do is it's mapping what the pattern of EEG looks like in my brain when I'm thinking pull.
08:40
And, of course, I don't really have to be thinking pull. I could be thinking hamster dance. But whatever I'm thinking, as long as I think that consistently every time I want the cube to pull, it should work. All right? So let's give that a try. I'm now going to think pull. We're on pull. Again, it's going to take eight seconds. Don't make me laugh. Here we go.
09:09
Okay, I'm not sure how I did on that, but let's see. All right. So we're going to try that. And now the cube is live. So now I'm going to try to repeat what I was thinking.
09:31
Yeah, let me retrain this. One second. Hold on. We're going to do pull again. I'm going to retrain because I actually was a little more nervous than I usually am. Ah, stop. Abort. Abort. I wasn't ready.
09:41
All right. One more time. Here we go. Okay. So I was thinking it. So obviously you saw me do it. All right. So one, two, three.
10:01
All right. Yay. So let me do it again. Prove to you that it really works. That's cool, isn't it? All right. So you can see how after that I was like, I have to have this device. This must be mine. So let me show you. I actually first found out about this from an awesome TED Talk that was done by Tan Le, who is the founder.
10:25
Woo for a female founder! Of this company, Emotiv. So she was the one who created this headset. So here's just a little clip from the TED Talk that I want to play for you, just showing some of the cool things that people are doing with this device. So I'd like to show you a few examples because there are many possible applications for this new interface.
10:46
In games and virtual worlds, for example, your facial expressions can naturally and intuitively be used to control an avatar or virtual character. Obviously, you can experience the fantasy of magic and control the world with your mind.
11:00
And also, colors, lighting, sound, and effects can dynamically respond to your emotional state to heighten the experience that you're having in real time. Can you imagine that? Like, imagine making a video game and having that real-time feedback from the user. And so when they start to get bored or check out, just have another zombie pop out.
11:23
Like, that would be so cool. Those are honestly really neat scenarios. And moving on to some applications developed by developers and researchers around the world. With robots and simple machines, for example. In this case, flying a toy helicopter simply by thinking lift with your mind. The technology can also be applied to real-world applications.
11:44
In this example, a smart home. You know, from the user interface of the control system to opening curtains or closing curtains.
12:02
And of course, also to the lighting. Turning them on. Right here. And finally, to real life-changing applications such as being able to control an electric wheelchair. In this example, facial expressions are mapped to the movement command.
12:51
So don't, like, make somebody laugh when they're about to go up a cliff or something. You have to kind of be careful there. Isn't that amazing? Like, I love the idea of, so what really gets me excited is being able to use technology to make people's lives better.
13:06
And the idea of, like, being able to help a paraplegic, like someone paralyzed from the neck down. And then being able to control things and interact with the world or move or whatever. Using the power of facial movements and the power of their mind is just so, like, exciting to me.
13:20
So I'm super excited about this. So we're thinking about all these mind-changing ways we can help humanity. And I finally get one of these headsets. And what does Jennifer do with it? Lie detection on my husband. Okay. So let me tell you a little bit about this. So I started thinking. Well, so when you think about lie detection, okay, I don't, again, I don't have a neuroscience degree.
13:46
I'm not a brain expert. But I did take, like, one or two classes on the brain and cognitive science in grad school. And I remember that when you're telling the truth, that typically activates the recall centers in your brain, right?
14:02
And then when you're lying, that typically activates the creative centers in your brain. So I started thinking, if I have this headset that has these, you know, all these different sensors on different sectors of my brain, could I tell the difference? Like, it seems like that would work, right? It seems like that wouldn't be that hard. So I was like, let me just go ahead and try.
14:20
I'm just going to see if I can do this. So, and then there's a lot of problems with how we do lie detection today. I'm sure you guys have watched all the same crime dramas on TV that I have. And you know that, you know, the hero can always beat the lie detector by putting, like, a tack in your shoe or something. And so when you're lying, like, your stress, anxiety level goes up because you're telling a lie. But if you step on a tack or something when they're asking you
14:42
the questions that you're answering truthfully to, that's pain, and pain also causes anxiety. So you can flatline at that level, and then it looks like you're always telling the truth because it's consistent. So there's ways to beat polygraphs that people are kind of, that are kind of well-known ways and, you know, government agents and such are trained in these methods. And they measure, polygraphs also measure things like, you know, heart rate and
15:03
galvanic skin response, your sweat levels, how fast you're breathing, things like that. But really, at the end of the day, these things measure your emotional state, right, how wound up you are. And if you're really nervous, that can send triggers too. So I started thinking, you know, could there be a better way? What if I could use EEG data to predict lies?
15:24
So that was kind of where this all came from. I did, in good little former grad student fashion, I laid out here all the tools that I used to complete this work. Okay, so let me start by telling you, here's my initial experiment procedure. This has actually changed. We've evolved. We're on, like, the third or fourth experiment procedure at this point.
15:44
But this was my first one. This is the early work that was all done on my poor sainted husband. So I put him in a chair. I made, I had him sit down. I had him close his eyes. And actually, that was for two reasons. Number one, I was actually sending markers in the data for when I started asking the question and when he was answering.
16:03
And I wanted to be able to zero in as closely as possible on when his brain was actually thinking about the answer, because I have all this brainwave data coming in, right? I wanted to zero in and be able to parse out: here's the chunk where he's actually responding to the question, so I could pinpoint where he was lying and telling the truth. And I didn't want him reading over my shoulder, because he was sitting right next to me.
16:25
If his eyes were open, he would definitely be reading the questions, because he would. And then he might be anticipating, and then I might not get that area right. So that was one reason I made him close his eyes. And then the second reason is, I just wanted his face to be relaxed. You saw how there's facial movement, like those facial detection things.
16:42
That actually changes the waves, right? I'll show you in just a second. It does. And so I didn't want him to be scratching his face or doing anything. I wanted as smooth a face as possible, because that will help give me the cleanest data for my prototypes. So he did that.
17:01
And then I ran this tool called Emotiv TestBench. I'm actually just going to show it to you. And I asked him a series of questions. The questions are in the little right-hand sidebar, but they're all very simple yes or no questions. You know, is your name Eric Marsman? Do you have five children? Do you have a dog? And I purposefully made some of them have the correct answer yes, and some of them the correct answer no.
17:27
And the reason for that is I wanted to have a good confusion matrix with lots of examples between yes and no and true and false, right? And so you needed ones in all four quadrants. So there are some that are true and some that are false. Yes, he does have a PhD. No, he doesn't have red hair.
17:41
No, we don't have five children, or at least I don't have five children with him. So if there are five children somewhere, then we've got a problem. So those were the questions I asked. And I did ask him multiple times as well, because I didn't want to, since you saw EEG is such noisy data,
18:01
the recommended way to use it is to actually ask multiple times and then take averages to smooth some of these curves. So he did that as well. And then I had him reply yes or no. So again, eyes closed, minimal facial movement. So let me actually just kind of show you some of this. I want to jump off and show you the tool, because this is kind of cool. This is Emotiv TestBench, the one I used. And you can see I'm in orange now. You can see my brainwaves are pretty boring right now.
18:22
So these are all 14 of the channels. And if I... you get a little activity there? So you can do that. It's also really fun to scare someone when they're wearing it. There's usually a big woo when they get scared. It's fun. And there's all kinds of fun things you can do, if you're looking for some fun on a Friday night.
18:42
So there are all kinds of cool things. And you can actually, there's filters down here, so you can turn them on or off if you don't want to watch all of the channels of data. You can remove some and just zero in on others. So this is how I actually did it, is I used this tool. And I was using, here, let me show you the markers. So I sent some markers manually.
19:03
So let me show you my little... where would I put this? Jennifer, Emotiv, under my data. Let's do this. Okay. This was data run two, but same difference. So you can manually send these markers. So I'd be asking him the questions, like right here. And when I asked a question, I would send a "question asked" marker.
19:25
So that way, so you can see this little one is now moving across too. So that's annotated at that point in the data. And then when he answered, I could hit two. So now I know here's where he was processing and thinking that I could extract those waveforms out and throw away the rest. All right. So that's what that looks like.
19:42
And let me see. So we can close that down. And then what happens is that this tool actually stores the data in a proprietary format. But they have a tool to convert it to CSV, which is kind of crazy why they don't just store it in CSV. But right here, you can launch this proprietary format and convert it to CSV.
20:00
So you can convert it to CSV. And then let me show you that as well. I have one example here. So this, let me zoom in a little so you can actually read it a bit. I'll zoom in. Sorry, I know this is a long room. So what I have here is a couple different things. So first of all, this headset is taking readings across all of the channels 128 times per second.
20:24
So if you do the math, that works out to roughly every eight milliseconds; it takes a reading about every eight milliseconds. Now, this counter isn't actually very interesting: it counts up to 128 and then resets, and then counts up to 128 again. So you can use that if you choose.
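As an aside, that rolling counter can still be put to work: unwrap it and you get a monotonically increasing sample index, and at 128 Hz each sample is 1000/128 = 7.8125 milliseconds apart. A minimal pandas sketch, where the COUNTER column name is an assumption about the converted CSV:

```python
import pandas as pd

# Hypothetical file and column name; the Emotiv CSV export may differ.
df = pd.read_csv("user21_truth.csv")

# The counter rolls over after 128 samples (one second at 128 Hz),
# so count the rollovers and add 128 for each to get a global index.
rollovers = (df["COUNTER"].diff() < 0).cumsum()
df["sample_index"] = df["COUNTER"] + 128 * rollovers

# At 128 Hz, each sample is 1000/128 = 7.8125 ms apart.
df["elapsed_ms"] = df["sample_index"] * (1000.0 / 128.0)
```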
20:41
These are the 14 channels of data that we're getting. And they correlate to positions; you see some funny names, F7, F3, you know, O1, O2, those are actually the ones in the back here. But I'll show you a diagram in just a second of where these guys map to places on the head. So we have all of our data right here. Then there's actually a gyroscope in here, and that's kind of interesting too.
21:02
So in my initial work on this, I wasn't using the gyroscope information, but it could actually be telling, right? Because I don't know if you guys have ever watched like the World Series of Poker. And you know how sometimes they have like a little tell where they like tilt their head a little bit or something like that? Usually if you do have a tell like that, you probably wouldn't make it to the World Series of Poker.
21:20
But you know how some people do that. So I thought there actually might be some signal in that data. I don't know. But anyway, I've thrown it out for now, but that's something that might be interesting. So this is the marker column. When I send those markers, remember I was sending a 1 for question asked and a 2 for, to close it out. So I can, what I do then is just parse this data, find the 1, and then find the 2 that follows it, and then just grab that data.
21:42
And then I throw out kind of the surrounding stuff. All right? So that's the marker column. That's not interesting. Here's the timestamp columns. So this is timestamp seconds and timestamp milliseconds. They actually, the way they generate it is they actually have two different columns. That makes it a little bit hard for me to do math with.
22:01
So I actually created one column on the end that is, you know, seconds times a thousand plus milliseconds so that I could get it just in one value to make the math easier. And then we have here all of these, what we call contact qualities. So this is what reads how well the signal strength is, right?
22:21
How well this contact is against my scalp, and how strong of a signal I can pick up. So all of these zeros right here, you see these kind of at the beginning where there's a bunch of zeros. But then once you put it on, they all kind of go to fours. Four should be good. Yeah, so by this time they're all fours. And I'll show you in just a second a chart for what those correlate to. But essentially four is the green dots that you were seeing before, which is full strength.
22:42
And zero is black, which means it's not reading anything at all. And then there's progressive levels of degradation in between. And those are kind of represented by an enumeration here. So there's one of those for each of the contacts against my scalp. So we'll do that. And then up to here is all the data that is provided by the headset, by the Emotiv tool and what it generates.
23:04
And then I add these on the end. Remember, for machine learning I need to have a labeled data set, right, for it to train on to be able to make good predictions. So this is my label column, "is truth." And I annotate it with a one if it is true, and a zero if he's lying. So this particular file was user 21 telling the truth.
23:23
So all of the "is truth" values here are one. And then I had a user ID. This is from a little bit later work, where I've expanded beyond just my husband, and I'm doing this to lots of random people now. So there's user IDs, user 21. And then there's a time column right here, which is just milliseconds plus seconds times a thousand.
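Putting the columns just described together, here is a rough pandas sketch of that parsing step: merge the two timestamp columns, pull out each answer window between a 1 marker and the 2 that follows, and keep only full-quality samples. Every column name here is an assumption; the real export may label things differently:

```python
import pandas as pd

# Column names are assumptions based on the description above.
df = pd.read_csv("user21_truth.csv")

# One combined time column, as described: seconds * 1000 + milliseconds.
df["time_ms"] = df["TIMESTAMP_S"] * 1000 + df["TIMESTAMP_MS"]

# Pull out the answer windows: each chunk between a marker 1
# ("question asked") and the marker 2 that closes it.
windows, start = [], None
for i, marker in df["MARKER"].items():
    if marker == 1:
        start = i
    elif marker == 2 and start is not None:
        windows.append(df.loc[start:i])
        start = None

# Garbage in, garbage out: keep only samples where every
# contact-quality column reads 4 (green, full signal strength).
cq_cols = [c for c in df.columns if c.startswith("CQ_")]
clean = [w[(w[cq_cols] == 4).all(axis=1)] for w in windows]
```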
23:45
All right? So the time is just in one column. All right. So that's what my data looks like. Now let's jump back here for a second. I just threw this slide in here for a couple of reasons. Number one, it's a lot easier.
24:00
He's somewhat more follicly challenged, so it's a lot easier to see the headset on his head than it is on mine. That way you can visualize it a little better. I have to say it's really funny: whenever the creators of Emotiv demo this device, they always do it on a bald guy. Always on a bald guy, every single time. And it's so easy.
24:20
Like, you put it on a bald guy and it just works. It's awesome. You put it on me, and I have to shift chunks of hair around to make sure it rests right, and it still looks crazy. It's much harder. I was considering shaving my head a while back just to make it easier, and that didn't end up happening. But anyway, yes. So it works so much better on him.
24:40
But anyway, so there it is, documented for posterity on my blog. My poor, sainted husband: "No, Jennifer, I don't have an Ashley Madison account. Can I be done now?" We're very happy. Seriously, we are. Okay, sensor quality.
25:01
So the next thing is that level of contact quality. This is really useful for data cleaning, right? I want to make sure I'm using the best possible data when I'm training my model. Because garbage in, garbage out, right? So in this case, here's the enumeration that it uses. I showed you those contact quality columns in the spreadsheet a second ago.
25:24
And we have green is four. So that's the highest level. So all these greens that you see here would come up as four. And then there's levels of degradation down, yellow, orange, red, and then black. So that's that. Here's where the positions actually map to places on the head.
25:41
So you can see O1 and O2 are the ones that are down here. So the way this diagram is shown is actually it's a top down view where you're looking at the top of the head. And that little bump there is the nose. So basically you're looking at it like this, right? Okay. So this is AF3 and AF4 right here. And then the T7 and T8 right over my ears, that sort of thing.
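For reference, those mappings as plain data. The 14 channel names follow the standard 10-20 electrode positions; this particular list is the commonly documented EPOC set, so treat it as illustrative rather than definitive:

```python
# The 14 channels mentioned above, named by the 10-20 electrode
# positions they rest on (O1/O2 at the back, T7/T8 over the ears).
CHANNELS = ["AF3", "F7", "F3", "FC5", "T7", "P7", "O1",
            "O2", "P8", "T8", "FC6", "F4", "F8", "AF4"]

# Contact-quality enumeration as described in the talk:
# 4 is green (full strength) down to 0, which is black (no reading).
CONTACT_QUALITY = {4: "green", 3: "yellow", 2: "orange", 1: "red", 0: "black"}

def all_sensors_green(qualities):
    """True when every sensor reports full contact quality (4)."""
    return all(q == 4 for q in qualities)
```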
26:03
All right. So you can see kind of where they correlate. Okay. So then the next thing I did was, again, like a good little grad student, I wanted to see what work had already been done in this field. Because, you know, usually when you have an awesome idea, somebody else has already thought of it. So I just wanted to see, okay, what existing work is out there?
26:22
Has anybody else kind of looked at this and what did they find? And I did find something called the P300 ERP. ERP stands for event-related potential. It's actually a way of working with EEG that kind of makes it a little easier. It averages stuff a little to make it cancel out some of that noise. So but what the P300 ERP actually measures.
26:42
So this is what it is. This is what it looks like. It's this little dip that you see right here. Whoop. Right there. And you know how when you're walking through a conference, let's say, you're walking through NDC Oslo and you see a face and you recognize that face. You get that flash of recognition. And you know them from somewhere, but you can't think of where, but you just that spark of
27:01
ooh, you know, your brain is pattern matching and it goes bam, I know that person from somewhere. That's the P300 ERP, that flash of recognition that you got. And that's something that is a little harder to fake, right? So this is what it looks like on the actual waveform: there's a little dip right here. And it happens roughly 250 to 500 milliseconds after the stimulus, the thing that you see.
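That timing translates directly into sample counts at the headset's 128 Hz rate. A small sketch; the epoch layout (index 0 at stimulus onset) is a hypothetical convention, not something from the talk:

```python
SAMPLE_RATE_HZ = 128  # the headset's sampling rate

# The P300 shows up roughly 250-500 ms after the stimulus, so at
# 128 Hz that window covers about samples 32 through 64.
start_sample = int(0.250 * SAMPLE_RATE_HZ)  # 32
end_sample = int(0.500 * SAMPLE_RATE_HZ)    # 64

def p300_window(epoch):
    """Slice out the post-stimulus samples where a P300 would appear.

    `epoch` is assumed to be a per-channel sequence of samples aligned
    so that index 0 is the stimulus onset.
    """
    return epoch[start_sample:end_sample]
```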
27:27
So that is, that's why they call it the P300 because it's about 300 milliseconds afterwards. And how this is used is like some government agencies have been experimenting with this by, when they're bringing someone in to see if
27:42
they're lying or not, instead of asking them yes or no questions, they've done some work to put images in front of them, okay? So imagine, you know, showing a bunch of images, and some of them might be neutral images of just, you know, home or whatever. But then you stick in there a picture of the murder weapon or a picture of the crime scene or something like that.
28:03
And so if that person has that little flash where they recognize that they've seen it before, then you know that they've been there or they have some recognition or some knowledge of the crime, they've at least seen it before. So that's something that they're looking at. And then these charts basically say that in a fancy graphical way, but it's pretty obvious, right?
28:20
If it correlates to someone what the suspect should know, data that the suspect should not know, then there's a problem. If they know something that they shouldn't know, then you have a problem. All right? So kind of cool stuff. So it's not completely the same as what I'm doing, because I'm still asking yes or no questions. But what this work actually taught me, or something I got from this work, was kind of
28:42
what level of time are we talking about here for these responses in the brain to happen. One of my concerns is, remember I said it's taking readings, 128 readings per second, and that's roughly every 8 milliseconds. And I wasn't sure, is that enough time? Is that frequent enough? Like, how fast are these things happening in your mind?
29:01
And so to know that that instant flash of recognition is happening, you know, about 300 milliseconds, that made me think, okay, cool, I'm good. I don't have an issue with every 8 milliseconds should be sufficient for this. So that was helpful. All right. So now let me take a step back. I apologize for those of you. How many people saw my talk on Wednesday with the Titanic stuff? Okay, cool.
29:24
So this is going to be just like two or three slides of repeat, so you guys can doze off for a second. But the first tool I turn to to do this is Azure Machine Learning. And the reason I use that is because it's a great tool for like rapid application development. It's a browser-based thing. You can just drag and drop in algorithms and modules and try things out really fast, and it's just great for prototyping.
29:44
So it has 25 different machine learning algorithms. It has all these modules that you can link up to create a data flow to build your models. They contain a lot of the data cleaning things that you would want already. You have the ability to include your own Python and your own R snippets in there as well.
30:04
So if you're kind of doing data science already, you're probably working in one of those languages. So you have the ability to kind of pull some of that in there as well. And then you can very, very easily train and test your models. And then it is literally a button press to deploy out, and it generates sample code for you, and it's just so much win there.
30:20
So I thought it was a really good tool to just start experimenting with. So this is what I went with to begin. So what I'm doing is something called supervised machine learning. And so here's a template of what supervised machine learning could look like in Azure Machine Learning. So first of all, you'd want some data set. So I could upload those data sets like the CSV that I just showed you.
30:41
Uploaded that up into Azure Machine Learning. Then I applied some data cleaning stuff here. And then in supervised machine learning, there's a couple different ways to do it. You can use a cross-validation technique using K-folds and some other ways. But here's one simple way to do it. You can also split your data set.
31:00
So the reason we split our data set is because we want to be able to validate that our model is actually performing as well as we think it is. And so I take, let's say, 70 percent. Let's say you do a 70-30 split. So maybe I take just 70 percent of my training data and build a model based on that using some algorithm. And then I hold back 30 percent.
31:21
And the reason for that is if I tell the algorithm, okay, for these specific features, this is the answer. This is what the right answer is, and it trains based on that. And then I want to test it, and I give it those exact same features as input. Of course it knows the answer, right? Because I already trained it with that. So what you really want to test is how well does your model perform on new data that it's never seen before.
31:44
So that's why when we have, you know, this much data, you hold back a little of it. So you build your model with this much, and then let me just use this remaining, and I'll just pass it in. So what this does right here is this train model builds a model based on some algorithm. And the algorithm defines what kind of math you use to find the correlations between your features and your label.
32:03
And then we have the data that we passed in here, which is EEG and truth and lie and all the stuff you just saw in the Excel spreadsheet. And then we build a model, and then the train model comes out here, and then we apply that 30 percent of the test data. And so it passes in those EEG patterns. The model gives it a result, and then we have the known right answers of what that, you know, whether those were really truth or lie.
32:26
So then we can use that to score the model and see how well it was actually performing. And then I can do the whole same thing over here and tweak, try different algorithms, try different initial parameters, things like that. And then you can evaluate different models against each other. So that was kind of the process I went through initially.
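That same flow maps onto a few lines of scikit-learn if you want to reproduce it outside Azure ML. A minimal sketch with stand-in random data; the gradient-boosted classifier here is just one plausible choice among the algorithms mentioned:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score

# Stand-in data so the sketch runs: 200 examples of unrolled EEG
# features and a 0/1 is-truth label. Real features come from the CSVs.
X = np.random.rand(200, 14 * 64)
y = np.random.randint(0, 2, 200)

# The 70/30 split described above: train on 70%, hold back 30%.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=42)

model = GradientBoostingClassifier()  # "train model" with some algorithm
model.fit(X_train, y_train)

# "Score model": predict on data the model has never seen before.
predictions = model.predict(X_test)
print("accuracy:", accuracy_score(y_test, predictions))
```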
32:42
And the one thing that you may be wondering, though, is, okay, I have this algorithm that you're passing in. You said, Azure Machine Learning has 25 different algorithms. How do I know which one to use? And so there's a great tool called the Azure Machine Learning Algorithm Cheat Sheet. So this is great because it teaches, it kind of gives you declaratively what the algorithms do and what each one is good at.
33:01
So you start at this little yellow node in the beginning. And you can do, you have the ability to find unusual data points using some anomaly detection algorithms. You can discover like things and group them together using a clustering algorithm. And then you can predict either values, like numbers, values on a continuum. So numbers over here. And then you can predict categories.
33:21
So what I'm trying to do is predict truth or lie. So those are two distinct categories. So we'd go here. And then there are two categories in my world. Things are either truth or lie. There are no half, I didn't ask any half truth questions. Like, does this dress make me look fat?
33:40
That may be like, eh. So they were all very clear yes or no. So we have a two-class classifier in here. Although, hmm. All right. And then what you can see, if you get back to this point, is that it actually shows you, okay, for a two-class classifier, here's what each of these things is good at, right? SVMs, or support vector machines, are really good if you have a lot of features.
34:01
Linear models train really fast; they're great like that. All these on the right-hand side are nonlinear models or algorithms, so they work really well when you have nonlinear data. A boosted decision tree is highly accurate, but it's also got a very large memory footprint. All of those types of things. So it's really nice to give you a general sense of, okay, let me try maybe these couple first
34:22
and then you can iterate on that. All right. So if you want to download that yourself, there is a download link at aka.ms.wac Azure Machine Learning cheat sheet. And then there's also a blog post I did for, on kind of getting started with Azure Machine Learning. It's got a lot of cool data sets that you can play with in there and other neat stuff.
34:42
So you're welcome to check that out as well. I did write it for a student audience, so the title is actually "How to Win a Hackathon with Azure Machine Learning." But ignore the first paragraph, and the rest of it is just getting started with Azure Machine Learning. Okay. So the next thing I want to show you is kind of an early demo of what it looked like.
35:05
So let me jump over to Azure ML for a second. And let me be really clear here: this is not performing lie detection. What this is doing: I had all the data, because I had asked all the truth questions and then all the lie questions in a row, right? So it's kind of like one session, sitting down, asking both of them.
35:20
And then I wanted to see, in this one session where his mental state was probably very similar, right, because I did them all in a row, first truth, then lie, so he hadn't changed mental state significantly between those two things: can I even differentiate at all between the truth and the lie? Is this even possible?
35:41
So this was kind of my first experiment. So what I did was I dropped some of the data; I did a little bit of data cleaning up here. So here's a bunch of his truth data, and here's the lie data. I did a little bit of work here. This one isn't even doing time-series processing.
36:01
It's just asking: can I even differentiate between these two sessions at all, this truth state of mind and this lie state of mind? And what I can show you from this is, if we visualize the results, you can see that one of them did not do well.
36:24
But if you look right here at the decision, this was an ensemble of decision trees. This one actually did really, really well. So let me show you how to read this graph. This is actually graphing the true positive rate against the false positive rate. And ideally you want a false positive rate of zero, right?
36:43
And a true positive rate of one, 100%. So perfection is whoop all the way up the side and then whoop all the way over. That would be like perfection. And you can see this one actually did really well. If you look down here at the data, I selected the red curve, which is the one that's doing better. By the way, if you look at the diagonal line across the middle, that's essentially 50-50 random chance, like a monkey flipping a coin.
37:07
So you want to do better than that. And then this one did better, but this is like, okay, that's about like 75%, it looks. And then this one is actually, if we jump down here to the data, we can see it gives us our true positives and our true negatives.
37:24
You want those numbers to be big. And our false positives and our false negatives, you want those numbers to be low. And you can see we have an accuracy of 93%, which is pretty good. Precision of 91, recall of 94, and then about 93 for the F1 score, which is a combination of recall and precision.
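All of those scores come from the four confusion-matrix counts. A quick check with made-up counts, chosen only so they roughly reproduce the numbers quoted:

```python
# Illustrative counts only, picked to roughly match the reported scores.
tp, fp, fn, tn = 94, 9, 6, 91

accuracy = (tp + tn) / (tp + fp + fn + tn)          # ~0.93
precision = tp / (tp + fp)                          # ~0.91
recall = tp / (tp + fn)                             # ~0.94
f1 = 2 * precision * recall / (precision + recall)  # ~0.93, harmonic mean
print(accuracy, precision, recall, f1)
```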
37:42
So overall, about 98% area under the curve. So that's pretty good. So I started thinking, okay, maybe there is something to this. Maybe I can actually do this thing. So on to my next steps. Let me go back here. So I got really excited.
38:01
I was asked by the Azure Machine Learning team to come out and demo this to like the vice president of machine learning at Microsoft and the GM of the team and all that stuff. It kept going up and up. So it was really very cool. And the other thing that was really fun is that about two weeks after I did this initial work with my husband, my team had an offsite.
38:23
So I work on a distributed team. So I live in Michigan in the United States right now, but I have teammates all over the U.S. And so like once a year, we all get together for a big thing where it's like a rah-rah, kick off the beginning of the fiscal year and talk about our goals for the year and that sort of thing. So we all get together at this team offsite.
38:40
And we happened to have a really nice camera crew there the last time. The reason for that is we all do Channel 9 videos; that's like one of our commitments and stuff. So one of the marketing team had actually brought out this camera crew in case anybody wanted to do videos while we were there. So I grabbed my manager, and I was like, hey, remember I told you I was kind of working on this new machine learning project?
39:03
Can I borrow you for a sec? And so I grabbed him, and I grabbed the cameraman and said, record this, please. And so I actually did put this headset on my boss for the second trial run. And so first of all, I had to get a labeled data set of his brain waves, right?
39:23
So first I asked him to tell me the truth. Are you a female? No. Are you a male? Yes. Have you ever worked for Microsoft? Yes. Okay, so you guys get the idea, right?
39:41
Okay. So then I asked him to lie to me. Do you have a PhD? Yes. No. Do you currently have a pet? Yes. Nope. Do you have blue eyes? Yes. Okay, so you get it.
40:01
No. What would you guys do if you had access to your manager's brain waves? Do you believe Microsoft is the best company in the world to work for? Yes. Am I going to get a promotion this year?
40:23
Yes. You guys all heard that awkward little giggle from my manager. Okay, so what did that mean? Did it mean that he has some good news that he can't share with me yet?
40:41
Or did that mean that I really suck and he doesn't know how to tell me? Let's find out. So what I did that night, of course, I did what any girl would do, and I ran back to my room and built a classifier with his brain waves. So I then wrote a really horrible Windows 10 app because I am not a UI person.
41:02
If any of you guys are UI people and would like to pair program with me for a while and make this better, you're welcome to. Or I should put it on GitHub or something. But I just wrote a really, really ugly Windows 10 app, and I grabbed a control. What I pictured in my head was a cool gauge control. So I found one from Telerik, and so that's a Telerik control,
41:21
and then I have really simple buttons and really ugly text. But, you know, maybe some background color or something. I don't know. I'm not a UI person. I'm an algorithms person. But if anyone wants to work with me, you're welcome to. All right, so here are the questions just in case you forgot them. Do you believe Microsoft is the best company in the world to work for? My manager said yes. Now, what this is, this code actually calls the web service.
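Calling a published Azure ML classic web service boils down to one authenticated HTTP POST. A sketch with a placeholder endpoint, key, and feature columns; Azure ML generates real sample code for you when you publish:

```python
import requests  # third-party HTTP library

# Placeholders: Azure ML Studio gives you the real scoring URL and
# API key when you publish the web service.
URL = "https://ussouthcentral.services.azureml.net/workspaces/.../score"
API_KEY = "<your-api-key>"

# Classic Azure ML Studio request-response payload shape. The column
# names and values are stand-ins for the real EEG feature columns.
payload = {
    "Inputs": {
        "input1": {
            "ColumnNames": ["AF3", "F7", "F3"],    # ...and the rest
            "Values": [[4200.1, 4190.7, 4210.3]],  # one sample row
        }
    },
    "GlobalParameters": {},
}

response = requests.post(
    URL,
    json=payload,
    headers={"Authorization": "Bearer " + API_KEY},
)
print(response.json())  # contains the truth/lie prediction and score
```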
41:42
So I created a classifier: built a model, published it, stood up a web service, so the app can call the web service and get the results back. So this is calling that Azure web service. So, do you believe Microsoft is the best company to work for? He said yes. Survey says... he just got an iPhone.
42:10
He's really into design. I don't know. We might have a closet Apple fan on our hands. All right. Then the question that really matters here, am I going to get a promotion this year?
42:22
Let's find out in like a month if that works out for me. I think our fiscal year ends at the end of June, so we should find out a little bit after that. So we'll see. I'll tweet.
42:41
I'll let you know. All right. So that is, we've tortured the manager. We've tortured the husband. What do we do? Where do we go from here? So the next thing, so I've refined the process. So there's some flaws, if you guys haven't caught them, in what I was doing a couple times.
43:01
There's a couple of things that I wanted to fix and make better. I don't expect you to read all this because that's crazy talk, but let me just pull out some of it. I started doing additional data runs to collect more data, because this works really well if I have that person's own data, but everyone's brain waves look a little different. So I wanted to be able to normalize it so that I could put it on anyone
43:21
and just know if they're telling the truth or lying. So the next step: in the initial work with my husband, I only took one marker, for when I asked the question, and then just took a certain time period after that. I changed it to actually take multiple markers on either side: here's when I start asking the question, and here's when they finish answering, so we can pull that out.
43:43
I also took a neutral brain state recording. So I think that can help with the normalization, right, if you see what your mind is like at rest. You can either take deltas or use that to normalize or that sort of thing. And remember, too, when I was doing that pulling the cube with my mind thing, it needed a neutral state, right, to be able to correlate my brain waves and make that work properly.
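One plausible way to implement that baseline idea, along with the trial averaging mentioned earlier, in NumPy. This is a sketch of the idea, not necessarily the method actually used:

```python
import numpy as np

def normalize_to_baseline(session, baseline):
    """Express a recording as deltas from the person's resting state.

    session, baseline: arrays of shape (channels, samples). This is one
    plausible normalization (per-channel z-score against the neutral
    recording), chosen for illustration.
    """
    mu = baseline.mean(axis=1, keepdims=True)
    sigma = baseline.std(axis=1, keepdims=True) + 1e-9  # avoid divide-by-zero
    return (session - mu) / sigma

def average_trials(trials):
    """Average repeated askings of the same question to damp noise,
    ERP-style. `trials` is a list of equal-shape (channels, samples) arrays."""
    return np.mean(np.stack(trials), axis=0)
```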
44:02
So I want to use that and implement it in there as well; that's one of the things I'm working on. I added more questions. It was a very short list in the initial work with my husband, so I've since added many more. Ooh, I also started switching the order. I started thinking about the fatigue effect, right. So think about it: if I sit you down and I put the headset on you
44:24
and start asking you questions, at first it's, ooh, kind of fun. There's this random woman asking me stupid questions. But then it kind of gets boring after a while, you know; your attention kind of fades or whatever. And at first I was always asking truth first and then lie. So when people's attention starts to wane,
44:43
I didn't want that always to be in the lie data, right, because that would unfairly mess it up. So I switched the order. For some people I ask them to lie first and then tell the truth, and some people tell the truth first and then lie. So that evens things out and reduces the fatigue effect.
45:00
The next thing is I'm keeping additional data about the test subject. That gives me even more information; I'll show you that in just a second. But one of the things I collected that I thought would be kind of interesting is whether they're right- or left-handed. I know that sounds kind of random, but remember how handedness correlates to the dominant side of your brain.
45:21
So I don't know if there's anything to that, but I can throw it into the classifier and see if it finds any correlation. And then one of the things I'm worried about is some of the confounding variables here. With the same user in one session, was it really detecting truth or lie, or was it just detecting session state, what state your mind is in?
45:43
You can reduce that by taking multiple readings at different times from the same person: get a reading from my husband first thing in the morning before he's had his coffee, again in the evening when he's really tired, in the middle of the day when he's fresh, one when he's drunk, all these different conditions, to see how to normalize that.
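As a quick sketch of how that per-subject metadata can be turned into classifier features (a hypothetical table; pandas' get_dummies does the one-hot encoding):

    import pandas as pd

    # Hypothetical subject table, one row per recording session.
    subjects = pd.DataFrame({
        "age":        [34, 29, 41],
        "handedness": ["right", "left", "right"],
        "gender":     ["f", "m", "f"],
        "session":    ["morning", "evening", "midday"],
    })

    # One-hot encode the categorical columns so the classifier can hunt for
    # correlations with handedness, gender, and time of day.
    features = pd.get_dummies(subjects,
                              columns=["handedness", "gender", "session"])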
46:06
And then I've added some additional questions, and I've tried to make all of them clear-cut yes-or-no questions. The gender ones, I know, are a little difficult. I haven't had anyone in the study yet for whom it wasn't a clear yes-or-no answer,
46:20
but that is a concern, something I would take very seriously if I were working with someone like that, and I would accommodate it. Maybe those questions should be thrown out, but for the rest I tried to write very clear yes-or-no questions. And here's some additional data I'm collecting about each person, just to see if any of these factors correlate.
46:42
Age: I know there's a brain study show that's on in the U.S. that has shown some information about how our brains change with age, so I thought that might be relevant. Then the right- or left-handed thing, and I threw gender in there as well. All right, so let's talk a little bit about some of the things I'm doing,
47:02
like which machine learning techniques you can use to play with this, because obviously you don't want to just compare raw sessions. I'm honestly trying a lot of different things right now, and one of them is unrolling the waveform. Remember what the signals look like: all these waves coming through on each channel. So one approach is to line every trial up at the same starting point
47:25
and then, for these 14 channels, lay all the data out one channel after another, so you basically unroll the waveform into one long feature vector and feed that in. That's one idea. It works if you can pinpoint the exact time each question started and match those up, though it can be a little hard to normalize.
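A minimal sketch of the unrolling, again assuming a hypothetical (n_samples, 14) epoch array:

    import numpy as np

    # Minimal sketch: turn one fixed-length epoch of shape (n_samples, 14)
    # into a single long feature vector, channel after channel.
    def unroll(epoch):
        return epoch.T.flatten()  # all of channel 1, then channel 2, ...

    # Every epoch must be trimmed to the same length before stacking:
    # X = np.vstack([unroll(e[:256]) for e in epochs])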
47:42
Another thing I'm using is wavelets, and wavelets are actually a really interesting thing. What they do is divide a continuous-time signal into frequency bands. And there was actually a relevant Kaggle challenge. If any of you are familiar with Kaggle... does anyone know Kaggle?
48:02
Nice, okay. Let me tell you, Kaggle is awesome; you should check it out. How much time do I have? Let me show you Kaggle really quick. Kaggle is a website that's really fun if you want to learn machine learning, because it gives you, number one, a clearly defined problem to solve,
48:21
and number two, the data to solve it with. So, awesome, right? It's spelled K-A-G-G-L-E. You go to the competition site, and there's all this really cool stuff; if you look here, you'll see some competitions have cash prizes and such, so there's a lot of cool stuff there.
48:41
If you click on this one, the State Farm Distracted Driver Detection (State Farm is an insurance company in the U.S.), the question they're trying to solve is: can computer vision spot distracted drivers? They give you sample images of people driving with phones and such, so you can use those to try to solve the problem of using computer vision to figure out whether a driver is distracted or not.
49:01
And they give you all the data to do that; that's step two here, so you can grab their data and then make a submission. It's really cool because big companies are putting some of their hardest machine learning problems out there to crowdsource them, and the winner can win money or sometimes a job. There's a Facebook competition to win a job at Facebook right now, and Yelp has done the same for a Yelp data mining engineer position.
49:21
So anyway, it's really cool. Kaggle runs these competitions all the time; that was the short version before I went off track. There was a Kaggle EEG challenge a while back, and the challenge was to differentiate, using EEG, between grasp and lift. Think about someone with a prosthetic arm. If you could use your brain to differentiate between "I want to close my fingers"
49:45
versus "I want to bend my wrist," and build that into the prosthetic, that would be amazing, right? So that's what they were doing: can you use EEG to differentiate between grasp and lift? And they provided a great data set for it. I actually didn't find out about this until after the contest ended,
50:02
but the winners usually get a blog post where they can discuss the techniques they used, and it turned out that power in a low frequency band of the time signal was one of the most telling features for differentiating the EEG. So that was cool information as well. I've been doing some wavelet work too, and using that.
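As a rough sketch of the wavelet idea using PyWavelets (not the talk's code; the 128 Hz sampling rate is an assumption about the headset):

    import numpy as np
    import pywt  # PyWavelets

    # Minimal sketch: decompose one EEG channel into frequency bands.
    # Assuming a 128 Hz sampling rate, a 4-level decomposition gives roughly:
    # cA4 ~ 0-4 Hz, cD4 ~ 4-8 Hz, cD3 ~ 8-16 Hz, cD2 ~ 16-32 Hz, cD1 ~ 32-64 Hz.
    def wavelet_band_energies(channel):
        coeffs = pywt.wavedec(channel, "db4", level=4)
        return [float(np.sum(c ** 2)) for c in coeffs]  # energy per band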
50:20
Next, I've also been using the PyEEG library that's online. Thank you, Forrest Sheng Bao; it has been very helpful to me. It extracts certain interesting and relevant features from EEG, so I've been using that Python library to grab some of those features as well.
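A sketch of the kind of features PyEEG pulls out; the function signatures here are from memory of the library, so treat them as approximate rather than authoritative:

    import pyeeg  # Forrest Sheng Bao's PyEEG library

    # Minimal sketch: a few PyEEG features for one EEG channel.
    def pyeeg_features(channel, fs=128):
        bands = [0.5, 4, 8, 13, 30]                   # delta/theta/alpha/beta edges
        power, ratio = pyeeg.bin_power(channel, bands, fs)
        mobility, complexity = pyeeg.hjorth(channel)  # Hjorth parameters
        return list(power) + list(ratio) + [mobility, complexity,
                                            pyeeg.pfd(channel)]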
50:41
I've also thought about state machines, or a hidden Markov model. Markov chains, the way they work, are essentially like little mini state machines. And when you think about it, the problem of lie detection is somewhat like a state machine, because this process has to happen: I have to hear the question, process and understand it, cognitively translate those words into meanings in my head,
51:00
decide whether I'm going to tell the truth or lie, formulate what I'm going to say, and then actually do the work to vocalize it. So there's this little state machine of stuff that has to happen, and I thought that might be interesting, because if we could further subdivide each recording into those little stages, that might actually be relevant. So I was looking at those as well.
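A minimal sketch of that staging idea with hmmlearn (a swap-in library, not the talk's implementation), where the hidden states stand for stages like hear, comprehend, decide, formulate, speak:

    import numpy as np
    from hmmlearn import hmm

    # Minimal sketch: X is one answer epoch's features, (n_samples, n_features).
    def fit_stage_model(X, n_stages=5):
        model = hmm.GaussianHMM(n_components=n_stages,
                                covariance_type="diag", n_iter=100)
        model.fit(X)             # learn the hidden stages from the data
        return model.predict(X)  # most likely stage label per sample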
51:20
Comparing to a neutral brain state: I wasn't doing that in the initial work, and I think it's highly relevant for normalizing between people. In the initial work I was just focused on one person; now I'm trying to generalize it to everyone, and the neutral brain state is what I need for data normalization across different types of brains. Using ERP instead of raw EEG, which I already mentioned, is also a good thing to do, just because it filters out a lot of that extraneous noise.
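For context, an ERP (event-related potential) is just the average of many epochs time-locked to the same event, so noise that isn't locked to the stimulus cancels out; a minimal sketch:

    import numpy as np

    # Minimal sketch: epochs is (n_trials, n_samples, n_channels), each trial
    # time-locked to question onset. Averaging across trials yields the ERP.
    def erp(epochs):
        return np.asarray(epochs).mean(axis=0)  # (n_samples, n_channels)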
51:45
And then ensembling is another great technique. If you look at who's been winning these Kaggle challenges, it's almost always somebody using an ensemble. What that means is that instead of just using one algorithm and one single model, you use multiple models in conjunction,
52:01
and they can feed into each other and create an even better model. That's what ensembling is, and I'm looking at that as well.
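A minimal sketch of a simple ensemble with scikit-learn (my example, not the talk's code), where three different models vote on truth versus lie:

    from sklearn.ensemble import RandomForestClassifier, VotingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.svm import SVC

    # Minimal sketch: soft voting averages each model's predicted
    # probabilities, which is one simple form of ensembling.
    ensemble = VotingClassifier(
        estimators=[
            ("lr", LogisticRegression(max_iter=1000)),
            ("rf", RandomForestClassifier(n_estimators=200)),
            ("svm", SVC(probability=True)),  # probabilities needed for soft voting
        ],
        voting="soft",
    )
    # ensemble.fit(X_train, y_train); ensemble.predict(X_test)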
52:22
And the other thing, which I don't know why I forgot to put on the slide, is deep learning. I think deep learning is made for this problem, because when you think about it, this problem is almost identical to speech recognition. If you've ever processed a recording, you've seen that our speech patterns look like waves too, and we process them in the frequency domain. We might all be saying the same words, but we each have our individual accents; some of us have higher voices and some of us have lower voices, yet we're all saying the same words.
52:41
So it's fundamentally the same problem, and deep learning is what solved speech recognition. That's something I really, really want to do, but the catch is that deep learning requires a lot of data to be effective, and it's just me, in my spare time, going out and collecting data.
53:00
So I just don't have enough data yet for a convincing model, but I'm working on it. Microsoft has a deep learning toolkit called CNTK, so that's probably where I'll start, and I'll try some other things too. So anyway, that's another thing; a lot, a lot of fun stuff there.
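To show the shape of the idea without CNTK specifics, here is a stand-in sketch using scikit-learn's small multilayer perceptron; with enough data, the same features could feed a real deep network in CNTK or another toolkit:

    from sklearn.neural_network import MLPClassifier

    # Stand-in sketch, not CNTK: a small neural network over the same
    # truth/lie feature matrix. Deep learning would add depth and data.
    net = MLPClassifier(hidden_layer_sizes=(128, 64), max_iter=500)
    # net.fit(X_train, y_train); net.predict(X_test)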
53:23
I'm collecting data from lots of people now. I'm trying to balance between genders, trying to get different, just all kinds of diverse people. I want to be great to get some children. You know what I'm saying, just a lot of ages, a lot of genders, that sort of thing to try to get the, build the best possible model. I want to, I'm experimenting with different features to improve the accuracy.
53:44
So, all of the stuff you saw on the previous slide: different methodologies within machine learning, different things to try. I'm running a ton of experiments on that, in progress. And then the last thing is implementing a real-time feedback loop. That's actually not even hard; I just need to block out some time and get it done.
54:02
When I stick the headset on somebody, it would be cool to ask them questions in pseudo real time, because that would be such a fun demo, right? Stick it on someone like Satya, ask him questions in real time, and have the little gauge actually swing to truth or lie as he answers.
54:23
And that actually is not that hard to do, because I just need to automate the process of converting the data in the CSV, calling the web service, and then displaying the result in the Windows 10 app. That's all doable; I've just got to do it. Maybe I'll do that this weekend. So anyway, that's the last thing.
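A minimal sketch of the web-service call in that loop, assuming an Azure ML (Studio) request/response endpoint; the URL, key, and column names are placeholders, and the exact payload shape depends on how the service was published:

    import json
    import requests

    # Minimal sketch: send one unrolled feature row to the scoring endpoint.
    def classify_epoch(feature_row, url, api_key, column_names):
        payload = {
            "Inputs": {"input1": {"ColumnNames": column_names,
                                  "Values": [feature_row]}},
            "GlobalParameters": {},
        }
        headers = {"Content-Type": "application/json",
                   "Authorization": "Bearer " + api_key}
        resp = requests.post(url, headers=headers, data=json.dumps(payload))
        return resp.json()  # the scored truth/lie label is in the response body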
54:41
I do want to say thank you to a number of people who helped me out after I did this initial work. As I said, the GM of Azure Machine Learning invited me out to share this stuff, and some of the data scientists there were really excited about it and offered to help. They've contributed various things; Jun Zhang in particular wrote some Python scripts,
55:00
which were very helpful, and I've talked with a bunch of other people too. So I just wanted to call them out and thank them for their great support. And then, just in summary: this headset is so cool, and I have like twelve million more ideas of things I want to do with machine learning and brain waves. Azure Machine Learning is a phenomenal tool.
55:20
If you are just playing around and experimenting with crazy ideas, it's a great way to get a machine learning model up and running very quickly. It's a great tool for rapid application development, for trying things out, experimenting, and creating an amazing model, using the algorithms that are used by Bing and Xbox
55:41
and developed by Microsoft Research. So really, really good stuff there. So, husbands and managers, beware: mind control is next. Thank you, guys.
56:06
Thank you for coming. Thank you.