Bayes is BAE
Video in TIB AVPortal:
Bayes is BAE
Formal Metadata
Title 
Bayes is BAE

Title of Series  
Part Number 
35

Number of Parts 
86

Author 

License 
CC Attribution - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and noncommercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license. 
Identifiers 

Publisher 

Release Date 
2017

Language 
English

Content Metadata
Subject Area  
Abstract 
Before programming, before formal probability, there was Bayes. He introduced the notion that multiple uncertain estimates which are related could be combined to form a more certain estimate. It turns out that this extremely simple idea has a profound impact on how we write programs and how we can think about life. The applications range from machine learning and robotics to determining cancer treatments. In this talk we'll take an in-depth look at Bayes' rule and how it can be applied to solve problems in programming and beyond.

00:13
What if you could predict the future? What if we all could? I'm here today to tell you that you can. We all can. We have the power to predict the future. The bad news is that we're not very good at it. The good news is that even a bad prediction can tell us something about the future. Today [inaudible] you will discover [inaudible].
01:20
And introducing our
01:24
protagonist. This is Thomas
01:27
Bayes. Thomas was born in 1701, maybe; we don't exactly know. He was born in a place called Hertfordshire, possibly; we can't know for certain. We don't actually even know what he looked like. What we do know is that Bayes was a Presbyterian minister and a statistician. We also know that his most famous work, the paper that gave us this rule, was not published until after his death. Before this he published two other papers: "Divine Benevolence, or an Attempt to
02:09
Prove That the Principal End of the Divine Providence and Government Is the Happiness of His Creatures" (yes, that is one title), as well as
02:18
"An Introduction to the Doctrine of Fluxions, and a Defence of the Mathematicians Against the Objections of the Author of The Analyst". You know, I like my titles a little bit shorter, but every
02:31
age has different preferences. So why do we
02:34
care about this? Well, Bayes contributed significantly to probability with the formulation of Bayes' rule, again, even though it wasn't published until after his death. Let's travel back and put our minds in that of a commoner of the era. The year is 1720. Sweden and Prussia just signed the Treaty of Stockholm. Maria Mozart, the mother of the person who wrote the Requiem that we just enjoyed, Wolfgang Amadeus Mozart (so not Mozart, but his mother), was born in 1720. And statistics is all the rage, as well as probability. At the time we can do things like
03:22
say: given that we know the number of winning tickets in a raffle, what is the probability that any one given ticket will be a winner? In the 1720s,
03:33
Gulliver's Travels was published. This is 45 years before the American Revolution, 45 years before, you know, we went to battle with Britain and gained our independence. Also in the 1720s, Easter
03:46
Island is discovered. Well, people knew it was there before, but the Dutch didn't. I don't know if you've seen this, but there's actually a
04:01
lot more to the statues; there's a lot more underneath the surface, which is also very true of probabilities as well. See, what
04:09
we knew how to do was get the probability of a winning ticket. What we did not know how to do was the inverse. Inverse probability asks: OK, what if we draw 100 tickets and we find that 10 of them are winners? What does that say about the probability of drawing a winner? Well, in this case it's pretty simple: 10 winners out of 100 tickets drawn is about 10%. But what if we have fewer samples? What if we have one sample: we drew one ticket, and it was a winner? Does that mean that 100% of tickets are winners? Is that what you would guess? The answer is no. We
04:50
would guess that that's not the case. Well, maybe it's a really weird raffle, but I have not found any raffles that are like that. And the reason why you were able to correctly answer that is
05:03
because you can predict the future. Even if the prediction is wrong, not dead on, it's still better than making no prediction at all. This was Bayes'
05:18
insight: that we can take two probability distributions that are related, and even if they're both inaccurate, the result will be more accurate.
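As a rough illustration of the single-ticket question above, the sketch below compares the raw fraction of winners with an estimate that also folds in a weak prior belief. The uniform prior (one imagined winner and one imagined loser) and both function names are assumptions made for this example; the talk does not specify how the prior guess should be formed.

```python
def naive_estimate(winners, drawn):
    """Fraction of winners seen so far, with no prior belief at all."""
    return winners / drawn


def estimate_with_prior(winners, drawn, prior_winners=1, prior_losers=1):
    """Posterior mean of the win probability under a Beta prior.

    prior_winners and prior_losers are hypothetical pseudo-counts;
    1 and 1 correspond to a uniform prior. They are not from the talk.
    """
    return (winners + prior_winners) / (drawn + prior_winners + prior_losers)


# 10 winners in 100 draws, then 1 winner in a single draw.
for winners, drawn in [(10, 100), (1, 1)]:
    print(winners, drawn,
          naive_estimate(winners, drawn),
          round(estimate_with_prior(winners, drawn), 3))
```

With 100 draws the two estimates nearly agree; with a single draw the naive estimate jumps to 100% while the estimate that includes a prior stays well below it.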
05:26
We can also do things with this, such as machine learning and artificial intelligence.
05:31
We'll be focusing on artificial intelligence in this talk, but I want to take a second and introduce myself. My name is Schneems. It is pronounced like "schnapps" [inaudible]. I maintain Sprockets,
05:47
poorly. I have commit to Rails [inaudible], and I'm also
05:52
taking a master's in CS at Georgia Tech with the online program. I went there for my bachelor's, for a mechanical engineering degree. Absolutely hated it; it was brutal, not very much fun. But they're only charging
06:08
me 7 grand for the entire program, so it's pretty cheap, not a bad deal. I work full time for a time-share company. It's time-share
06:26
for computers; that's what we do. Hopefully some of you already know what Heroku is, so instead of pitching it I want to explain some new features you might not have heard of. We have a thing called automatic certificate management. This will provision a Let's Encrypt SSL cert for your app and automatically rotate it every 90 days, which is pretty sweet. We also have SSL for free on all paid and hobby dynos. The SSL we offer for free is what's known as SNI SSL. Have any of you heard about the legislation that went through Congress that was like, "FCC, you cannot protect people's privacy"? Anybody heard about that? Yeah. So adding SSL on your server is going to help your clients get a little bit of protection. The free version of SSL that we have, which is SNI, does leak the host names to your ISP. But we also have an NSA-grade SSL, which is an add-on that you have to add, and then you also have to provision and maintain your certificate. We have Heroku CI, which is continuous
07:40
integration. It's in beta; you can give that a shot. And Review Apps, which I absolutely, positively love. Try these if you haven't. Every time you make a pull request, Heroku will automatically deploy a staging server just for that pull request. So when someone says, "Oh, I fixed the CSS bug," and you're like, "Did you really?", the person reviewing can click through, see an actual live deployed app, and verify that.
08:07
So that's it for the company I work for. Typically this would be the time I do a little bit of self-promotion. Typically I would do something like
08:13
promote a service that I run called CodeTriage, which is the best place to get started contributing to open source. But I'm not going to be talking about CodeTriage. Instead, what I want to talk about is the biggest problem our country faces, especially where I come from, the state of Texas: gerrymandering, which is awful. Unlike CodeTriage, gerrymandering is very bad. Anyway, so this is
08:45
gerrymandering. Basically, given a population, you could represent it perfectly and say, OK, there are more blue squares than there are red squares, so we should have more blue districts than red districts. But if you look all the way over on the side, you can create the districts in such a way that, magically, now there are more red districts. This is where I live, and this is the district in Texas that stretches from San Antonio to Austin. I don't know if you know, but that's really far away. I mean, just look at it.
09:24
Seriously. So yeah, gerrymandering can take away your voice; it diminishes the power of your vote. I think we need countrywide redistricting reforms. And it's not just me who thinks this: my district was actually ruled illegal by the judicial branch. Unfortunately, an illegal district will not deter the people in charge of redistricting in Texas, and they're refusing to hear any of the bills on the issue. You might
09:54
say, "Wow, that's a really important issue. OK, but what can I do?" I highly recommend looking up your state representatives: you have a House representative and a Senate representative. Find them. Mine are Kirk Watson and Eddie Rodriguez; I have their phone numbers in my phone. Then call them and let them know: "Hey, I care about redistricting, I care about gerrymandering, and I want this to be an issue that we push." You might say, "Is there more that I can do?" Well, there are local organizations. For example,
10:25
in Texas there's Degerrymander Texas, which is a really long Twitter handle. They give guides and talk about current legislation and those types of things. I guess I just think that gerrymandering is very unpatriotic, un-Texan (it can be un-Arizonan too, you know, no bias), and it really takes away the freedom to elect people who represent us. So, OK, back to
10:53
Bayes. So, on to
10:58
artificial intelligence. For this talk, I want to talk about some examples from the grad course I'm taking at Georgia Tech, where we've been using Bayes' rule for artificial intelligence with robotics. If you're not familiar, this is a world-class [inaudible].
11:13
[video clip; audio unintelligible]
11:16
[video clip; audio unintelligible]
11:23
OK, can we get the audio up just a little bit? So when we have a robot and we need to get that robot somewhere, we need two things: we need to know where the robot is, and we also need to have a plan for how to get there. Robots don't see the world the same way we see
11:41
it. They see it through sensors, and the sensors are unfortunately
11:46
noisy, so they don't see the world perfectly clearly. Take the case that we have a robot
11:51
and it's really simple: it can move just right and left. If we take a measurement, it will tell us about where it is. We can represent this by putting it on a graph, and this is a normal distribution. So here we have a robot. It's at around position 0, but we don't know for sure that it's at position 0. It could be further away; it could be all the way over at 6, but this is a lot less likely; it's not very probable. The more accurate the measurement, the steeper our curve will be. Now, at this point, it's almost impossible that it would be at 6, and it's much more likely that it would be a lot closer to 0. A robot is
12:35
an example of a low-information-state system. We could take hundreds or thousands of measurements of that robot as it's just sitting there and average them together, but, you know, our world is changing, or there are other things impacting our sensors, or the robot needs to move and do things. So one of the things that we can do is use Bayes' rule: we can make a prediction, and with that prediction increase the accuracy of the estimate of where the robot is.
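One common way to combine a prediction with a measurement when both are normal distributions is to multiply the two curves together and read off the mean and variance of the result. The sketch below assumes that approach; the function name `fuse` and the variances of 4.0 are hypothetical values picked for illustration, with a prediction centered on 10 and a measurement centered on 5.

```python
def fuse(mean_a, var_a, mean_b, var_b):
    """Mean and variance of the (renormalized) product of two Gaussians."""
    mean = (var_b * mean_a + var_a * mean_b) / (var_a + var_b)
    var = (var_a * var_b) / (var_a + var_b)
    return mean, var


prediction = (10.0, 4.0)   # (mean, variance): where we think we drove to
measurement = (5.0, 4.0)   # (mean, variance): where the sensor says we are

mean, var = fuse(*prediction, *measurement)
print(mean, var)  # prints 7.5 2.0
```

The fused mean lands between the two guesses, and the fused variance is smaller than either input variance, which is the sense in which the combined estimate is more certain than its parts.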
13:10
Previously we thought we were at position 0, plus or minus some error. Well, then we can predict what the world would look like if we were to drive forward by 10 feet. If we did that, it would look something like this: we were at 0, now we're at 10. But we aren't sure, so we take a measurement, and it says, oops, we're not at 10; it's showing that we're at 5. So what do we do? Our measurement and our prediction disagree. Tough. A probably good guess might be somewhere right in between the two. We can take our measurement and our prediction and make a convolution, which is a really fancy way of saying the product of two functions. Now the result is actually more accurate than either of our guesses individually. So even though our measurement was noisy (we don't actually know that we're at 5) and our prediction was noisy (we're not actually dead on 10), the end result is more reliable.
14:13
This gives us a Kalman filter. A Kalman filter can be used any time you have a model of motion and some noisy data and you want to produce a more accurate prediction. So what does a Kalman filter look like, you might ask?
14:28
This is an example of a homework assignment that was given to us. The green represents an actual robot's path, whereas all the little red dots are the noisy measurements. It's so noisy that if you just take two subsequent measurements, you can't tell which direction the robot is moving, because the second point might actually be way behind the first point. So it's incredibly, incredibly noisy.
14:54
This is part of the class, but you can actually go to Udacity and take the course for free, and this is the final thing that they do in the course. If you end up going to Georgia Tech, it's a little bit more involved. To make things even more interesting, not only do you have to figure out where the robot is; you have your own robot that moves slightly slower than the one you're trying to find, and you have to chase it. So you have to predict where it'll be a time step or two into the future and then be there. And sorry for anybody who's colorblind: they picked the colors, not me. So what does this look like? Well, we can apply a Kalman filter, and we end up with something kind of like this. Before, our red dots were virtually unusable; as I mentioned, given two points we can't even determine the direction. But with this correctly implemented, we can see our chaser robot getting closer and closer. So, I like a little bit of audience participation. Who here likes money?
16:03
Please. OK, I think some people didn't raise their hands. It's OK. Before we look at what a Kalman filter looks like, let's look at some cold hard cash. This is a
16:16
1913 Liberty Head nickel. It was produced without the approval of the US Mint, and as a result they only made five of them; only five of these guys made it into circulation. As a result it's incredibly, incredibly rare, and if you find one, it's worth 3.7 million dollars. So yeah, I'd say that's a pretty penny. I'll be here all week, folks. But this is not the Liberty Head nickel. This is a trick coin that, for some reason, your weird coin-collecting friend happened to have, one that has two heads, instead of being the actual Liberty Head nickel. This coin-collecting friend also has a 3.7-million-dollar coin, and for some strange reason
17:06
they put the two coins into a bag, shake it up, and draw one. So we have one fair coin and one trick coin in a bag. They say, "Hey, do you want to play a game and maybe make 3.7 million dollars?" So they take a coin out, they flip it, and
17:27
they say, "OK, it landed on heads." From here they might try to make some sort of a wager, like, "OK, if it's the 3.7-million-dollar coin, you can keep it, but otherwise you have to, I don't know, mow my lawn or something." I mean, fairly equivalent, right? But would that be a good bet or not? In order to know, we have to know: what is the probability, given that it landed on heads, that we have the fair coin? To do this we can
17:58
use Bayes' rule. This is what it looks like. To explain a little bit of the
18:03
syntax: P stands for probability, and we're asking for the probability of A given B. So this is the probability that we have the 3.7-million-dollar coin, given that we know it was heads; that's the information, that's all we know. In order to do this, we can flesh this out piece by piece. So, the probability of heads: what is the probability of
18:31
heads? We have three total chances of getting heads and one chance of getting tails, so we have a 3 out of 4, or 75%, chance of getting heads. Another way that we can do this is to say: there's a 50% chance that we get our fair coin, and if we get that fair coin there's a 50% chance that it's heads; we add that to a 50% chance of getting our trick coin, and if we get the trick coin there's a 100% chance that we're going to get heads. When you do that, you end up with the exact same result. This is just the more "mathy" way of achieving that same intuition, because later on, well, I tried to teach my program intuition and it didn't work out too well. Also, this is a talk on artificial intelligence, and I have to admit I don't know a whole lot about artificial intelligence, or I would've written an artificial intelligence to write my talk.
19:29
Thank you. Again, we're going to add this to our equation and keep moving. So now I want to know: what is the probability of A, the probability of getting that 3.7-million-dollar coin? Well, we know we have two different cases, and they're equally probable. We have a 50% chance of getting that, and we can add this back to our equation. The last piece is the probability of heads given that we have the fair coin, given that we have this 3.7-million-dollar coin. So in that case, assuming
20:06
that we have the fair coin and we flip it, there's only a 1 in 2 chance that we get heads. So that's 50%; we can add that
20:14
here. We put all that together and we end up with 1 in 3, or 0.33, a 33% chance of owning a multimillion-dollar 1913 Liberty Head nickel. One in three: it's not great, but it's not nothing.
20:41
This is what we can do with Bayes' rule. Given two related probabilities, in this case the probability that we'll get heads and the probability that we'll draw our money coin, we can accurately predict that relationship. Khan Academy has a really good resource on Bayes' rule, and they use another way to teach this. This is the very "mathy" way; one other way to look at this is with trees. So here's essentially that.
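The three pieces gathered above (the probability of heads given the fair coin, the probability of drawing the fair coin, and the overall probability of heads) can be plugged into Bayes' rule directly. This is a minimal sketch; the function and variable names are chosen here for illustration, and exact fractions are used so the result is not blurred by floating-point rounding.

```python
from fractions import Fraction


def bayes(p_b_given_a, p_a, p_b):
    """Bayes' rule: P(A | B) = P(B | A) * P(A) / P(B)."""
    return p_b_given_a * p_a / p_b


p_fair = Fraction(1, 2)              # two coins in the bag, one is fair
p_heads_given_fair = Fraction(1, 2)  # a fair coin lands heads half the time
p_heads = Fraction(3, 4)             # three of the four equally likely faces are heads

p_fair_given_heads = bayes(p_heads_given_fair, p_fair, p_heads)
print(p_fair_given_heads)  # prints 1/3
```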
21:11
To answer this question we need only rewind and grow a tree. The first event: he picks one of two coins, so our tree grows two branches, leading to two equally likely outcomes, fair or unfair. The next event: he flips the coin. We grow again. If he had the fair coin, we know this flip can result in two equally likely outcomes, heads and tails, while the unfair coin results in two outcomes, both heads. Our tree is finished, and we see it has four leaves, representing four equally likely outcomes. The final step: new evidence. He says, "Heads." Whenever we gain evidence, we must trim our tree. We cut any branch leading to tails, because we know tails did not occur. And that is it. So the probability that he chose the fair coin is the one fair outcome leading to heads divided by the three possible outcomes leading to heads, or one-third.
22:16
Yeah, right. So whether we use trees or we use Bayes' rule, we get the same outcome. I'm not an expert in probability, but that's probably a good thing. One element I mentioned but didn't dwell on was total probability.
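The tree argument can be sketched by listing every leaf and then trimming the ones that contradict the evidence. The dictionary layout and variable names below are assumptions made for this example; the counting itself follows the four equally likely leaves described in the clip.

```python
from fractions import Fraction

# Each coin contributes two equally likely faces, so every leaf has the same weight.
coins = {"fair": ["H", "T"], "trick": ["H", "H"]}

# Grow the tree: one leaf per (coin, face) pair.
leaves = [(coin, face) for coin, faces in coins.items() for face in faces]

# New evidence: the flip came up heads, so trim every leaf that ends in tails.
kept = [leaf for leaf in leaves if leaf[1] == "H"]

# Of the remaining leaves, count the ones that came from the fair coin.
fair_heads = [leaf for leaf in kept if leaf[0] == "fair"]
p_fair = Fraction(len(fair_heads), len(kept))
print(len(leaves), len(kept), p_fair)  # prints 4 3 1/3
```

Counting leaves only works here because all four leaves are equally likely; with unequal branch weights each leaf would need its own probability.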
22:28
And also, I'm very terribly sorry: I lied about Bayes' rule. That isn't
22:35
all of Bayes' rule. It actually looks a little bit more like this. So this is the expanded form, and you can see both side by side. This just expands the total probability of B on the bottom.
22:49
So what exactly is total probability?
22:52
If we're going to look at our problem another way, we can say: all right, we have a 50% chance of either our actual coin or the trick coin. And in this problem space, if we land on heads,
23:06
heads is going to completely take up the trick coin case: if we have the trick coin, there's a 100% chance of heads. However, it only takes up half of the 3.7-million-dollar coin. If we land on tails, tails falls entirely inside of the 3.7-million-dollar coin, and there's a 100% chance that that is the coin. Now what we actually want to know is this section. So what we want to know is the total probability of getting heads. In order to do that, we can calculate it by adding up this section along with this section, and that will give us the total probability. To write it out long form: the probability of heads given that we have the fair coin, times the probability of the fair coin, plus the probability of heads given the trick coin, multiplied by the probability of getting that trick coin.
24:14
So it's just the summation. We did this previously, but I wanted to explain exactly why we did it, or where we're getting that math from.
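The long-form sum for the probability of heads, and the expanded form of Bayes' rule that uses it as the denominator, can be sketched as follows. Representing each case as a (likelihood, prior) pair is a choice made for this example, as are the function names.

```python
from fractions import Fraction


def total_probability(cases):
    """Sum of P(B | A_i) * P(A_i) over every case A_i.

    cases is a list of (P(B | A_i), P(A_i)) pairs that together
    cover all possibilities.
    """
    return sum(likelihood * prior for likelihood, prior in cases)


def bayes_expanded(index, cases):
    """P(A_index | B) with the total probability written out on the bottom."""
    likelihood, prior = cases[index]
    return likelihood * prior / total_probability(cases)


cases = [
    (Fraction(1, 2), Fraction(1, 2)),  # fair coin: heads half the time, drawn half the time
    (Fraction(1), Fraction(1, 2)),     # trick coin: always heads, drawn half the time
]

print(total_probability(cases))   # prints 3/4
print(bayes_expanded(0, cases))   # prints 1/3
```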
24:21
So that's where that came from. But we can make this a little bit tougher:
24:28
what if we flip two coins, or rather, what if we flip a coin twice and it lands on heads both times?
24:36
In order to do that, it actually makes it a little simpler if we use the expanded form. I'm not going to dwell on exactly where we got all the numbers from as much. Here the suffix i indicates each of the different cases: we could have a coin that is the fair coin, or we could have a coin that is not the fair coin. So the probability of landing on heads twice given our fair coin is going to be: you flip it, and it's a 50% chance of heads; you flip it again, and it's a 50% chance of heads; multiply those two together. The probability of getting the fair coin hasn't changed, and it never will: there's always a 50% chance of getting one out of two coins. Then we can flesh out the summation at the bottom. Again, it's 0.25 times a half, plus, if we have the trick coin, it's a 100% probability of heads, so it's 1 times the probability of getting the trick coin, which is a half. You with me? OK, all right. So if you put all this together, you end up with one-fifth, which is 0.2. Now, Bayes' rule doesn't claim certainty. Our values are going down; it's more and more likely that we do not have the fair coin, but it never actually reaches 0. That's a really important part, because if it did reach 0 and then we flipped again and it turned out to be tails, well, the way Bayes' rule is written, it would never recover from that; mathematically it would never recover. So that was a lot of math. Is anybody ready for a break from math?
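The two-heads calculation generalizes to any number of heads in a row, which makes it easy to watch the probability of the fair coin shrink without ever reaching zero. The function below is a sketch under the same setup (one fair coin, one two-headed coin); its name and the `prior_fair` parameter are introduced here for illustration.

```python
def p_fair_given_heads(n_heads, prior_fair=0.5):
    """P(fair coin | n_heads heads in a row), using the expanded form of Bayes' rule."""
    fair_term = (0.5 ** n_heads) * prior_fair   # P(heads each time | fair) * P(fair)
    trick_term = 1.0 * (1.0 - prior_fair)       # P(heads each time | trick) * P(trick)
    return fair_term / (fair_term + trick_term)


for n in (1, 2, 5, 10):
    print(n, p_fair_given_heads(n))

# A prior of exactly 0 can never be revived by evidence: the numerator stays 0.
print(p_fair_given_heads(3, prior_fair=0.0))
```

Two heads in a row give 0.2, matching the worked numbers above, and each extra head pushes the value lower while keeping it strictly positive.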
26:21
So we'll take a break from math, but with some more math, right?
26:29
For that, I'll put on my math jacket. I do appreciate you all bearing with me. So if we look back at Bayes' rule
26:44
again, one way to represent it would be splitting the equation out. This is exactly what we had before, but on one side we basically have a constant: the probability of getting our fair coin was exactly the same every single time. This we're going to call our prior. Without any information at all in this system, we can say that would be the probability of getting the coin. This other section is after we have information, so it's the posterior: post-information. And even if our prior is 0.5, in the case where we got tails our posterior is so large that it actually pulls the 0.5 all the way up to 100% and says we definitively have the fair coin.
27:39
A Kalman filter is a recursive Bayesian estimation, and I can guarantee you that all of these are words. Previously we looked at a graph and we had a
27:51
prediction, and that is actually our prior. We also had a measurement, and that's our posterior: this is the thing that updated after we got new information. And our convolution was somewhere in between; we don't exactly know where. That's where actually implementing a Kalman filter comes in. The next example comes from Simon D. Levy. I have a link to this resource; it goes through step by step and really explains the math. I know your heads might be hurting a little bit, and I'm barely skimming the surface, but it's really interesting. He also has a fairly unique and fairly simple example that I'm going to walk through, the heart of how to implement a Kalman filter. So let's say we've got a
28:44
plane, and this plane is really simple. All it can do is land, apparently, and the way you control it is by multiplying your current altitude by some other value; in this case it's 0.75. This gives us a nice, steady landing; toward the end it's moving in smaller and smaller increments until eventually it touches down. Unfortunately our measurements are really, really, really noisy. So this is that line, but with 20% noise. We're actually going below the ground here; we're getting negative measurements. So according to our measurements we're repeatedly slamming into the ground. I know, visually, mentally, you're like, "Oh yeah, there's a nice line in there." But if you are writing a system that depends on those measurements, we need it to be a nice, smooth line instead of this jagged thing that sometimes indicates we're below the ground. So we're actually going to
29:43
So to program this in a Kalman filter, we're going to start off with our rate of descent, just 0.75, our initial position, and our measurement error. We're then going to just make a guess: let's just assume you were at the very first position that you were measured at. We also introduce a new thing called P, which is our estimation error. This is our prediction error. It's going to be a value between 0 and 1 that we're going to use to decide how we adjust, whether our robot leans closer to the prediction or closer to the measurement. To get started we pull a measurement off of a measurement array. I do apologize, this is in Python. I assume everybody is a polyglot, and luckily all the code is identical to what it would be in Ruby except for the very top line, the "for k in range(10)".
So we start off with a guess. We multiply where we currently were by our constant of 0.75, and that's now where we think we are. We then want to build in some way where, if we move just a teeny tiny bit, our prediction's probably pretty accurate, but if we move a whole lot, our prediction's not that accurate. So we're going to multiply our motion by our prediction error, and the reason we do this twice is that prediction error is actually represented as sigma squared, so it's error squared. You don't really need to know that, just multiply it twice. That's the prediction phase.
Then after we've predicted we have to update it with our measurements. I'm going to skip this gain line and go straight to the actual update. We have our guess of where we currently are, and we add to it a mysterious gain number times the current measurement minus the previous guess. The way that we can think about this gain is that it's a ratio involving our measurement error and our prediction error. If our prediction error is really, really low, then our gain is really, really low, and if it's so low that it gets pretty close to 0 we can approximate it as 0. When that happens we can actually eliminate this entire term, and that means we should just ignore our noisy measurements altogether. Our last prediction was so good we don't even need our new measurements, or our measurements were so bad that they're not helping us in any way, shape, or form. If the prediction error is high, then our gain is really high, and when that happens we end up approaching 1. When we do this we have an x guess and also a negative x guess, those two terms cancel each other out, and we end up just guessing whatever our measurement is. This means that we throw away our previous prediction and just use the measurement. You might want to do this in a case where it turns out that your sensors are really, really accurate but your prediction model is not. So a way to visualize that is if our prediction
33:07
is less certain, or less accurate, it's kind of a little bit more flat, and our robot would be leaning toward the measurement. Or if our prediction is more certain, it's a little bit more peaky, and then our robot is going to be leaning more toward the prediction.
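The slide code itself is not in the transcript, so what follows is only a rough sketch of the predict and update steps as described, not the speaker's actual listing. The function name, the starting values for r and p, and the made-up measurements are all assumptions for illustration; the 0.75 constant and the idea of starting from the first measurement come from the talk.

```python
def kalman_1d(measurements, a=0.75, r=1.0, p=1.0):
    """Scalar Kalman filter sketch.

    a: constant multiplied by the previous position (from the talk)
    r: measurement error (variance); the default is an assumed value
    p: prediction error (variance); the default is an assumed value
    """
    xhat = measurements[0]  # first guess: the first measured position
    estimates, gains = [], []
    for z in measurements[1:]:
        # Predict: move the estimate, and scale the error twice (it is squared)
        xhat = a * xhat
        p = a * p * a
        # Update: the gain decides how far to lean toward the measurement
        g = p / (p + r)
        xhat = xhat + g * (z - xhat)
        p = (1 - g) * p  # recursively update the prediction error
        estimates.append(xhat)
        gains.append(g)
    return estimates, gains


# Made-up data: an altitude that shrinks by 0.75 each step, plus fixed offsets
truth = [1000 * 0.75 ** k for k in range(10)]
noise = [0, 35, -40, 25, -30, 20, -15, 10, -12, 8]
measurements = [t + n for t, n in zip(truth, noise)]

estimates, gains = kalman_1d(measurements)
for z, est, g in zip(measurements[1:], estimates, gains):
    print(f"measured {z:8.2f}  estimated {est:8.2f}  gain {g:.3f}")
```

With a gain near 0 the update term drops out and the estimate is just the prediction; with a gain near 1 the two guess terms cancel and the estimate is just the measurement, which matches the two extremes described above.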
33:23
You put all this together, you recursively update your prediction error, and you end up with a graph
33:30
that looks a little bit like this. The jagged line represents our very noisy measurements, the blue line represents the actual value of the plane, and the little green squares are what we're predicting. Now, it's not dead on. Again, we're not perfect at predicting the future, but we're pretty close. We're a lot better than what we had previously, and given this, hopefully our plane won't crash into the ground repeatedly. So that's pretty
34:09
much the simplest case of a Kalman filter. We can get a lot, lot deeper; there are a lot more scenarios and situations. One of the more common things is having a Kalman filter in a matrix form. For example, in this case we only had altitude, but what if we also had engine speed, and maybe the barometric pressure, and the angle of our flaps, and the angle at which the pilot is pulling back on the controls? If we put all this
34:45
together, if they are related, then instead of individually writing a Kalman filter for each of them we put them in one Kalman filter, and it actually ends up being much, much more accurate for the entire system. This looks pretty similar, but there's a little bit more going on that we don't really have time to get into. The other case
35:06
where a Kalman filter gets into trouble is motion that isn't linear. Previously, yes, we had a nice gentle curve, but each step itself was linear: each step was just based on a constant multiplied by the previous step. But there are cases where we have sinusoidal motion, or logarithmic, or just, you know, not linear. When that happens we end up having two different probability distributions, and in order to add two probability distributions together they have to be on the same plane. Here we end up making a bad estimation. Granted, this is still likely better than just taking the noisy measurements, but I would recommend not doing this. Instead there are other ways: there's an extended Kalman filter, there's an
35:58
unscented Kalman filter. The way I think of extended Kalman filters is that they rotate the plane of our probability distribution so that it approximates the curve. It still has to be linear, and both of them still have to be on the same line, but we can approximate our curve by rotating. Right, OK.
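The talk does not show any extended Kalman filter code, so this is only a minimal sketch of the idea of approximating a curve with a line at the current estimate. The motion model f, its derivative f_slope, the function name ekf_step, and the q parameter are all hypothetical choices for illustration, not anything from the slides.

```python
import math


def f(x):
    """Hypothetical nonlinear motion model (an assumption, not from the talk)."""
    return x + 0.5 * math.sin(x)


def f_slope(x):
    """Derivative of f: the slope of the straight line that approximates f near x."""
    return 1.0 + 0.5 * math.cos(x)


def ekf_step(xhat, p, z, r=1.0, q=0.0):
    """One predict/update cycle of a scalar extended Kalman filter.

    r is the measurement error and q an optional process noise; both are
    assumed values. The measurement is taken to observe the state directly.
    """
    slope = f_slope(xhat)         # linearize around the current estimate
    xhat = f(xhat)                # predict with the full nonlinear model
    p = slope * p * slope + q     # push the error through the straight-line approximation
    g = p / (p + r)               # gain, same form as in the linear filter
    xhat = xhat + g * (z - xhat)  # lean toward the measurement by the gain
    p = (1 - g) * p
    return xhat, p, g


# A few steps with made-up measurements
xhat, p = 1.0, 1.0
for z in [1.5, 1.9, 2.4, 2.6]:
    xhat, p, g = ekf_step(xhat, p, z)
    print(f"measured {z:.2f}  estimated {xhat:.3f}  gain {g:.3f}")
```

The only change from the linear filter is that the constant multiplier is replaced by the model's local slope when updating the error, while the position itself is predicted with the full nonlinear function.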
36:25
So that's it for Bayes' rule... er, that's right, that's it for Kalman filters. I did want to go back a little bit to Bayes' rule and touch on the two most
36:35
important parts. So, the prediction:
36:43
if we never predict the future, then we can't know if we're right or wrong. This is why scientists start with a hypothesis: if our hypothesis is wrong, we're forced to reevaluate our underlying assumptions. And then whenever we get new information we have to update, we have
37:01
to update our own set of beliefs. The interesting thing about this is we can never be too sure of ourselves. No matter how many times we get heads, we can never be 100 percent sure that it is a trick coin unless we actually investigate it. That's why this is a probability. As soon as it dips so far that you end up going all the way to 0, or if you just make that claim, if you say there's a zero percent chance this could ever happen, Bayes' rule will not help you. Your system can never recover. I already gave the example previously: even if you get tails, heads, sorry, Bayes' rule tells you there's a 0% chance, and you cannot recover. So no matter how sure of yourself you are, you always need to remain a little bit
37:49
skeptical. You might think that there's a hundred percent chance of the sun coming up tomorrow. That'd be a pretty good bet, and for most days you'd be right. But if it turns out that tomorrow is the day that our sun turns into a red giant and consumes the earth, hopefully your millennia of prior experience with the sun coming up every day doesn't cause you to accidentally die. On that note, it always pays to have good information and educated guesses. We don't have to wait until our sun explodes; we can actually take a look at other stars and see what happens to them, and compare our situation to theirs. Maybe it's not exactly the same, but it'll give us a better prediction than we would have otherwise. And so the more data we have and the more predictions we make, the better our outcomes will be.
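The point about never claiming 0 percent can be checked with a few lines of arithmetic. This is a small sketch, not code from the talk: the function names are made up, and it assumes a trick coin that always lands heads, a fair coin that lands heads half the time, and a 1 percent starting belief in the trick coin.

```python
from fractions import Fraction


def bayes_update(prior, p_obs_if_true, p_obs_if_false):
    """Bayes' rule for one observation and a yes/no hypothesis."""
    numerator = p_obs_if_true * prior
    evidence = numerator + p_obs_if_false * (1 - prior)
    return numerator / evidence


def belief_after_heads(prior, flips):
    """Belief that the coin is a trick coin after seeing `flips` heads in a row."""
    belief = prior
    for _ in range(flips):
        # Assumed likelihoods: trick coin always heads, fair coin heads half the time
        belief = bayes_update(belief, Fraction(1), Fraction(1, 2))
    return belief


skeptic = belief_after_heads(Fraction(1, 100), 10)  # started out 1% sure
stubborn = belief_after_heads(Fraction(0), 10)      # started out at 0%

print(float(skeptic))   # climbs toward 1 but never reaches it
print(float(stubborn))  # stays at 0 no matter how many heads come up
```

Exact fractions are used here so that the "never quite 100 percent" claim is not blurred by floating-point rounding.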
38:55
So I highly recommend a book called
39:01
Algorithms to Live By. I think it's a book every programmer should read. It's very narrative and it has an entire chapter on Bayes' rule. It's very easy to read, and it doesn't get into the math nitty-gritty like I did. I see some people taking photos, so I'll leave it up here and delay the next slide. OK, good. I also highly
39:24
recommend The Signal and the Noise. This is a book written by Nate Silver about probability. Nate Silver runs FiveThirtyEight. He successfully predicted that our 45th president had a chance of winning and would likely lose the popular vote, though he did not predict the magnitude by which he would lose the popular vote. And to cite the audio I used: it's the Requiem in D minor. Previously, the Kalman
40:00
tutorial: that's at the link ending in kalman-dash-tutorial. This is Simon D. Levy's resource. And then also, if you're really into Kalman filters and want to see a lot of the unscented Kalman filters and extended Kalman filters, this is a great resource: it's a Kalman and Bayesian filters book. Unfortunately all this is also in Python, but, I mean, if you know Ruby it's pretty easy to read.
40:26
You can also check out Udacity and Georgia Tech. And if you didn't know, BAE is not short for "babe"; it's from African American Vernacular and stands for "before anyone else". So Copernicus built on top of Bayes' theory and developed special cases for when we truly have no prior estimate: what should we do? Laplace took Bayes' work, and actually much of what we know as Bayes' rule, Bayes' theorem, the nice and pleasant polished version, actually comes from Laplace. So before there was Copernicus, before there was Laplace, this was very much
41:18
a factor or if
41:24
things and